Posted January 29, 2025 in Digital Access
It can be easy to think of digitization as a one-and-done project in which material is simply scanned, processed, and made available through our online finding aid. In reality, it is an ongoing process that requires recurring maintenance and, on occasion, changes to improve the material’s quality and usability for staff and researchers. A good example is the photo album from which the photographs above are taken, which was originally scanned several years ago and is currently being updated.
The photographs in this album were taken between 1895 and 1901 and document the activities of Moravian missionaries in northern India. The two landscapes on the page shown above both include the 17th-century palace in the city of Leh and are accompanied by handwritten text reading “A favorite bit of scenery outside of Leh, looking towards the north.” This ancient city has long been a trading hub and political seat and is currently the capital of Ladakh, part of the disputed Kashmir region.
Much as we take measures to preserve the physical items in our holdings, such as letters, journals, and photographs, by transferring them to acid-free boxes, having damaged items professionally conserved, and storing them in a climate-controlled vault, digital items such as scanned images require ongoing attention to ensure that these valuable resources remain available far into the future. Unfortunately, no form of electronic media lasts indefinitely; all become gradually less stable and more prone to failure with time. In light of this, we maintain three copies of each scanned image we produce: two on external hard drives at the archives, one of which is kept in the vault, and a third on off-site servers. Each year we check every one of these copies using checksums[1] to confirm its condition, replacing any corrupted files from one of the other copies and eventually replacing the hard drives.
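As a rough illustration of how such a yearly check might work, the Python sketch below computes a checksum for each of the three copies of a file and flags any copy that disagrees with the majority. This is not the archives’ actual tooling: the SHA-256 algorithm, the majority-vote rule, and the function names `sha256_of` and `find_corrupted` are all assumptions made for the example.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(copies: list[Path]) -> list[Path]:
    """Return the copies whose checksum differs from the majority value.

    Assumption for this sketch: more than half of the copies are intact.
    If no checksum is shared by a majority, the copies need manual review.
    """
    digests = {path: sha256_of(path) for path in copies}
    majority_value, count = Counter(digests.values()).most_common(1)[0]
    if count * 2 <= len(copies):
        raise ValueError("no majority checksum; manual review needed")
    return [path for path, value in digests.items() if value != majority_value]
```

Any file returned by `find_corrupted` would then be replaced with one of the copies that still matches. In practice, an archive might instead compare each copy against a checksum recorded when the scan was first made.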
A less common but still important aspect of maintaining our digital holdings is improving how they are described and accessed. Sometimes staff determine that the description of an item in our online finding aid, be it physical or born-digital, would benefit from being updated. This can be to correct previous errors, to bring it more in line with current archival standards, or simply to add information. In the case of the aforementioned photograph album, the descriptions were made a few years ago and currently serve their purpose very well, but it was recently decided that the file names used for the digitized copy could be improved. When the album was initially scanned, the files were simply numbered in order from the front to the back of the volume. Over time it has become clear that this is inconvenient for staff and researchers, as there is no easy way to relate the name of a scanned image to the equivalent page in the physical volume or to that page’s description in our online finding aid. As such, we will be renaming the scanned images for this and a number of other items so that each file name clearly identifies both the item and the page within it. Someone working with the files can then easily see that a given file is, for example, part of an index or shows pages 32-33 without having to open it. This also ensures that if a file becomes separated from its folder, its identity will still be clear.
Through ongoing processes such as confirming file integrity with checksums and updating descriptions and file names as appropriate, we are confident that the material in the archives’ holdings will remain both preserved and accessible for years to come.
[1] Checksums work via software that produces an alphanumeric sequence from the contents of a specified file. If the file changes in any way, this sequence will also change, allowing errors to be easily detected.
Image: Leh Ladakh photo album, PhotAlbums 5 page 34, Moravian Archives Bethlehem