The following high-level summary is provided in case it is of use to others considering similar projects of their own.
- Digital images of original documents were named so as to preserve the original archive and source reference, together with the page sequence, and were burned to CD for distribution to volunteers. In several cases paper photocopies of images were purchased from record offices. A spreadsheet log was kept of who had which collections.
- An MS Word template was created for transcribers to use, so that for each item within a collection we could consistently identify the source reference, the writer (and, in the case of letters, the recipient), the creation date, and any notes that the transcriber wished to add to the transcript of the original document.
- Transcripts, typically containing batches of letters, were uploaded to Dropbox as Word files and later aggregated into larger files making up the contents of a whole collection. These were given a high-level review for consistency (a full check against the document images being unrealistic across such a large body of material).
- Completed transcripts were saved as plain text files, put through a text processing tool created by our website developer, Digital Acorn, and output as comma-separated values (CSV) spreadsheets. With file type, correspondents, dates, notes and text arranged in columns, it was straightforward to review the contents and ensure consistency in personal names and dates. These are important index fields within the database, interrogated by the advanced search options.
- The completed CSV files were imported into the database through customised upload tools provided by Digital Acorn within the website, which is built on the WordPress platform. This also allowed each batch of uploads to be associated with an original archive source document, another important feature for researchers who wish to follow a transcript to its source and consult the original material themselves.
- A facility was also provided to re-export a batch of uploads in text form, with the index fields reformatted into a simple line showing ‘date’, ‘from’ and ‘to’ correspondents to act as individual item headers within the downloadable version of the full transcript of an original volume or collection. This saved a great deal of manual reformatting effort in converting the export text files into PDFs.
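For anyone wanting to try the text-to-CSV step themselves, the idea can be sketched roughly as below. This is not Digital Acorn's tool, whose internals are not described here. It is a minimal illustration that assumes a hypothetical plain-text layout: labelled lines (`Reference:`, `From:`, `To:`, `Date:`, `Notes:`), a blank line, then the transcript text, with items separated by a line of four hyphens. The field labels, the separator, the function names and the sample data are all invented for the example.

```python
import csv
import io
import re

# Hypothetical field labels and item separator; the project's real template
# and processing tool may use a different layout entirely.
FIELDS = ["reference", "from", "to", "date", "notes"]
SEPARATOR = re.compile(r"^----\s*$", flags=re.MULTILINE)

# Invented sample data, for illustration only.
SAMPLE = """Reference: EX/1/1
From: A. Writer
To: B. Recipient
Date: 1 May 1780
Notes: Seal torn

Dear Sir, I write in haste.
----
Reference: EX/1/2
From: B. Recipient
To: A. Writer
Date: 9 May 1780
Notes:

Sir, yours received.
"""


def parse_transcript(text):
    """Split a plain-text transcript into one dict per item."""
    items = []
    for block in SEPARATOR.split(text):
        block = block.strip()
        if not block:
            continue
        # Labelled lines come first; the body follows the first blank line.
        header, _, body = block.partition("\n\n")
        item = {name: "" for name in FIELDS}
        for line in header.splitlines():
            label, _, value = line.partition(":")
            key = label.strip().lower()
            if key in item:
                item[key] = value.strip()
        item["text"] = body.strip()
        items.append(item)
    return items


def to_csv(items):
    """Return the items as CSV text, one row per item, one column per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS + ["text"])
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


def item_header(item):
    """Build a one-line 'date', 'from' and 'to' header for a text export."""
    return f"{item['date']}: {item['from']} to {item['to']}"


if __name__ == "__main__":
    parsed = parse_transcript(SAMPLE)
    print(to_csv(parsed))
    for entry in parsed:
        print(item_header(entry))
```

Once the items are in columns like this, inconsistent spellings of a correspondent's name or oddly formatted dates are easy to spot and correct in a spreadsheet before import, and the same rows can be turned back into headed text for a downloadable transcript.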
Document downloads: the project’s ‘getting started guide’ and document template with completion instructions