High Volume Imaging: OmniDocs

OmniDocs is an Enterprise Image and Document Management (EDM) engine that captures, digitizes, stores and extracts information through an image repository.

OmniDocs speeds up the storage and retrieval of documents throughout the entire document lifecycle. It also supports moving documents from online to offline storage, data caching and replication.

OmniDocs Image Server

The image server is responsible for storing and retrieving documents and for managing the entire document lifecycle, including moving documents from online to offline storage, data caching and replication. It is implemented in server-side Java and is available on Windows, Linux, Solaris and other Unix platforms.

The image server is highly scalable, and can support billions of documents and terabytes of data. The image server is designed to support LAN, WAN and internet environments, where image storage is distributed across multiple locations. The index information for the archived images is maintained in a centralized database server, and the actual images are stored in physical entities called sites, which can be deployed at the same or remote locations.

Image Server Architecture and Organization

The image server consists of a Centralized Index Database (Image Server Database) and multiple Storage Management Servers (SMS) located at multiple sites.

Storage Management Server (SMS)

The Storage Management Server manages the actual storage and retrieval of image data. Multiple SMS instances can be deployed for a single image server, typically for scalability or availability, or at multiple locations for bandwidth management.

The SMS uses labels, which are created for the different storage media attached to it. A label is a logical reference to an absolute path on the media.
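
The label idea can be illustrated with a minimal sketch. The label names, the paths and the `resolve` helper below are hypothetical and are not part of any OmniDocs API; the sketch only shows how a logical label might be mapped to an absolute path on a storage medium.

```python
from pathlib import PurePosixPath

# Hypothetical label table: logical label -> absolute path on the attached media.
LABELS = {
    "online_disk": PurePosixPath("/mnt/raid0/images"),
    "offline_archive": PurePosixPath("/mnt/archive/images"),
}

def resolve(label: str, relative_name: str) -> PurePosixPath:
    """Turn a (label, file name) pair into an absolute path on the media."""
    return LABELS[label] / relative_name
```

With such a mapping, relocating the media would only require changing the label's path, not every stored reference.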

Image Volumes and Volume-Blocks

Image storage at a physical location is divided into logical storage units called image volumes. An image volume in turn consists of multiple image volume blocks. Each volume block is a physical file that holds a group of images. The image server builds this image data file for the volume block, and the file provides the actual physical storage.

An image volume can be replicated across multiple sites, and multiple image volumes can be defined for an SMS. Volume blocks are typically configured as files of 50-100 MB for ease of administration and manageability. There is no limit on the physical storage of an image volume, on the number of volume blocks per volume, or on the number of documents stored in the image server.
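
The relationship between volumes and volume blocks can be sketched as follows. This is an illustration only: the class names, the 100 MB limit (taken from the upper end of the typical range above) and the rule that a new block is opened when the current one would overflow are assumptions, not the documented OmniDocs behavior.

```python
from dataclasses import dataclass, field

# Assumed block size limit: upper end of the typical 50-100 MB range.
BLOCK_LIMIT_BYTES = 100 * 1024 * 1024

@dataclass
class VolumeBlock:
    block_id: int
    used_bytes: int = 0
    doc_ids: list = field(default_factory=list)

@dataclass
class ImageVolume:
    volume_id: int
    blocks: list = field(default_factory=list)

    def add_document(self, doc_id: int, size_bytes: int) -> int:
        """Store a document in the current block, opening a new block
        when the current one would exceed the limit. Returns the block id."""
        if not self.blocks or self.blocks[-1].used_bytes + size_bytes > BLOCK_LIMIT_BYTES:
            self.blocks.append(VolumeBlock(block_id=len(self.blocks) + 1))
        block = self.blocks[-1]
        block.used_bytes += size_bytes
        block.doc_ids.append(doc_id)
        return block.block_id
```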

Site, Home Site, Preferred Site and Remote Site

A site is a physical location where images are stored. Each image archive is logically divided into one or more image volumes, which can be replicated across various sites. This replication can be either automatic or manual.

  • Home Site: Each image volume has a home site. This is where the documents of that image volume are added by default
  • Preferred Site: A preferred site is a site from which the user wishes to retrieve documents. Multiple preferred sites can be set, selected by priority, round-robin etc.
  • Replica Site: All sites other than the home site or preferred site are referred to as replica sites

Image Server Database

The Image Server Database holds index information about sites, volumes, volume-blocks and the documents stored within each volume-block. Index information about a document includes the volume-id and the document-id. This is a centralized database, and its index information may pertain to documents available at local or remote sites. Typically, in OmniDocs, there is a single image server database for each document database.
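
A centralized index of this kind could be sketched as below. The table and column names are hypothetical and the real OmniDocs schema is not described in this document; the sketch uses an in-memory SQLite database purely to show how a document-id can be resolved to its volume, volume-block and home site.

```python
import sqlite3

# In-memory database standing in for the centralized index (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE volume (
    volume_id INTEGER PRIMARY KEY,
    home_site TEXT NOT NULL
);
CREATE TABLE volume_block (
    block_id  INTEGER PRIMARY KEY,
    volume_id INTEGER NOT NULL REFERENCES volume(volume_id)
);
CREATE TABLE document (
    document_id INTEGER PRIMARY KEY,
    block_id    INTEGER NOT NULL REFERENCES volume_block(block_id)
);
INSERT INTO volume VALUES (1, 'central');
INSERT INTO volume_block VALUES (10, 1);
INSERT INTO document VALUES (5001, 10);
""")

def locate(document_id):
    """Return (volume_id, block_id, home_site) for a document, or None."""
    return conn.execute(
        "SELECT v.volume_id, b.block_id, v.home_site "
        "FROM document d "
        "JOIN volume_block b ON d.block_id = b.block_id "
        "JOIN volume v ON b.volume_id = v.volume_id "
        "WHERE d.document_id = ?",
        (document_id,),
    ).fetchone()
```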

Caching

The SMS supports caching of both read and write requests. For the write cache, volumes can be created on online media and their volume-blocks moved to offline media later. For the read cache, documents that reside on offline or near-online media are copied to online media when they are retrieved, so that future access is served from the online copy.
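
The read-cache behavior can be sketched in a few lines. The `ReadCache` class and its dictionary-backed stores are hypothetical stand-ins for offline and online media; the sketch only shows that the slow medium is touched once per document and later reads are served from the online copy.

```python
class ReadCache:
    """Minimal read-through cache: offline media is read only on a miss."""

    def __init__(self, offline_store):
        self.offline_store = offline_store  # doc_id -> bytes on slow media
        self.online = {}                    # doc_id -> bytes cached on online media
        self.offline_reads = 0              # counts trips to the slow media

    def read(self, doc_id):
        if doc_id not in self.online:
            # Cache miss: fetch from offline media and keep an online copy.
            self.offline_reads += 1
            self.online[doc_id] = self.offline_store[doc_id]
        return self.online[doc_id]
```

A production cache would also need an eviction policy and a size limit, which this sketch leaves out.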

Replication and Pre-fetching

Replication copies documents from a home site to one or more replica sites. The Storage Management Server is responsible for replication. Image volumes can be configured for either immediate or delayed replication. Through replication, images from the home site are pre-fetched onto the remote site for subsequent access. If the images have not been pre-fetched, they can still be accessed from the central (home) server.

A replica site is an alternate site to which the documents of the home site are replicated. A volume can have multiple replica sites but only one home site.
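
The difference between immediate and delayed replication can be sketched as follows. The `ReplicatedVolume` class, its `immediate` flag and the `run_scheduled` method are hypothetical names chosen for illustration; they do not reflect the actual SMS implementation.

```python
from collections import deque

class ReplicatedVolume:
    """One home site, any number of replica sites (assumed model)."""

    def __init__(self, home, replicas, immediate):
        self.home = home
        self.replicas = list(replicas)
        self.immediate = immediate
        # Which document ids each site currently holds.
        self.stored = {site: set() for site in [home, *self.replicas]}
        self.pending = deque()  # documents waiting for delayed replication

    def add(self, doc_id):
        # Documents are always added at the home site first.
        self.stored[self.home].add(doc_id)
        if self.immediate:
            self._replicate(doc_id)
        else:
            self.pending.append(doc_id)

    def run_scheduled(self):
        """Replicate everything queued since the last scheduled run."""
        while self.pending:
            self._replicate(self.pending.popleft())

    def _replicate(self, doc_id):
        for site in self.replicas:
            self.stored[site].add(doc_id)
```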

Multi-location Deployment

Image server deployment depends on how the user decides to organize documents in OmniDocs, especially where clients and servers span different geographical locations. In a multi-location scenario, the central server resides at one location, and users from various geographical locations access the central OmniDocs Server.

In such cases, the administrator can either configure the central site as the home site and replicate it at the local site, or configure the local site as the home site and replicate it at the central site. Typically, the decision depends on how often documents are likely to be accessed from a particular location, how many documents are likely to be added from a particular site, and the connectivity between the sites.

Central Site configured as Home Site
  • This configuration is recommended when remote sites are used mostly to retrieve documents, i.e. large numbers of documents are not added from remote sites on a periodic basis
  • Documents are always added to the central site first and then replicated to remote sites
  • Replication can be done on an immediate or a scheduled basis
  • Replication between the local and central storage servers requires direct TCP/IP connectivity
  • At retrieval time, if the data is available locally, it is fetched from the local site. This ensures optimized usage of bandwidth
  • If the document has not yet been replicated to the local preferred site, it is fetched from the central server, so the user is guaranteed to get the document at retrieval time
  • Users at the central site get documents locally from the central site
  • Users at other remote locations get documents either locally or from the central site, depending on the replication status of the document

Local Site configured as Home Site
  • This configuration is recommended when remote sites (branches) upload large numbers of documents on a periodic basis
  • Documents are added to the local site first and then replicated to the central site
  • Replication can be done on an immediate or a scheduled basis
  • Replication between the local and central storage servers requires direct TCP/IP connectivity
  • Retrieval from the site where the document was added is always done locally to optimize bandwidth
  • Retrieval from the central site and other remote sites depends on the replication status of the document, i.e. on whether the document is already available at the requesting site
  • If the document has not yet been replicated to the central site, it is fetched from the remote server where it was added (subject to connectivity), so the user is guaranteed to get the document at retrieval time
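
The retrieval rule shared by both configurations can be sketched as a small function: try the preferred sites in priority order, and fall back to the home site when none of them holds a replica yet. The function name and the `stored` mapping are hypothetical; the sketch also assumes the home site always holds the document, since documents are added there first.

```python
def choose_source(doc_id, preferred_sites, home_site, stored):
    """Pick the site to fetch a document from.

    preferred_sites: site names in priority order (assumed ordering policy).
    stored: mapping of site name -> set of document ids replicated there.
    """
    for site in preferred_sites:
        if doc_id in stored.get(site, set()):
            return site  # a replica exists at a preferred site
    # Not replicated to any preferred site yet: fall back to the home site.
    return home_site
```

With the central site as home site, a branch user's preferred site would be the local branch; with a local site as home site, the roles are reversed and the central site falls back to the branch.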
