My Rules for Transcriptions

  • Semantic preservation: I always preserve the semantics of what is written/typed. If the text is factually incorrect, I nevertheless remain faithful to the author's words.
  • Brackets: I use brackets (i.e., [ ... ]) to denote something that is not in the original text. I have very rarely found brackets in older handwritten text, so a bracketed insertion is unlikely to be mistaken for the author's own words.
    • Errors: I use [sic] to denote when an error is the author's, not mine. Sometimes, I will include the correction when it isn't entirely obvious (e.g., pson [sic: person]).
    • Dates: Dates are written in a variety of ways, including with numbers written as words. I strive to always type the date in a standardized Gregorian calendar format of dd Mmm yyyy immediately after the date as written (e.g., the Tenth Day of aprill 1684 [10 Apr 1684]). This makes the date easier for search engines to recognize.
    • Numbers: When quantities are written as words, I'll give the numeric equivalent in brackets (e.g., one hundred [100]).
    • Names: When a name is abbreviated, or given only as initials, I will provide a complete name if it is known (e.g., Wm. [William]). This helps improve search engine results.
    • Illegible/Confusing:
      • If a word is illegible, I'll typically use a bracketed sequence of underscore characters in place of the unknown word (e.g., [____]).
      • Sometimes I'll take a reasonable guess (based on context and/or similarity to other text on the page) and include a question mark (e.g., [twenty?]).
      • Occasionally, if the word appears legible but seems incorrect in context, I'll just add a bracketed question mark after the transcription (e.g., “…Quingsby[?] Swamp…”).
    • Money: I will indicate money using modern notation wherever possible (e.g., ten pounds [£10]). If old style notation is used, I will convert to modern notation (e.g., 4/ [4 shillings]; or, 4/ [4s]).
  • No new content: I never introduce new content into the transcribed text (except when contained within brackets, as described above).
  • “Long” s: The long s (i.e., the ſ character) is an archaic form of the lowercase letter s, which was used both in some early printing typefaces and in handwriting. I simply replace it with a lowercase s. Some transcribers use the lowercase f, but that is incorrect: the long s is a form of s, not f, so the substitution misspells the word.
  • Existing line breaks: As a general rule, I do not preserve line breaks from the original text when they occur solely because the text reached the edge of the paper. This rule is driven primarily by two factors:
    • Such line breaks are not necessary for preserving the semantics of the transcription.
    • My transcriptions are posted online and, as such, the text needs to “flow” based on screen size. Even so, some line breaks make perfect sense, such as when it's obvious that a new line is appropriate (lists, tabular data, etc.).
  • Inserted line breaks: I may insert a line break when it makes the transcription more readable or more comprehensible. An example would be to begin a new line for each “Item” in a will. Another example is when the original text is written in one long, continuous paragraph; in such a case I may break it into sections as content dictates.
  • Punctuation: No punctuation is added or removed, even if doing so would improve readability. Old documents (and images thereof) often have smudges or marks that are mistaken for punctuation. I'll ignore these when/if the context reasonably indicates they are not punctuation marks.
  • Last modified: 2019-12-26 10:41:02
  • by Ken Norman
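
A few of the rules above are mechanical enough to sketch in code. The Python below is a minimal illustration only, not a tool used to produce these transcriptions. The function names (replace_long_s, bracket_date, annotate_shillings) are hypothetical, and the shilling pattern is an assumption that covers only the simple "4/" form. Reading the date itself remains a human judgment, so the day, month, and year are supplied by the transcriber.

```python
import re

# Three-letter month abbreviations for the dd Mmm yyyy format.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def replace_long_s(text: str) -> str:
    """Replace the long s (U+017F) with a lowercase s, never with f."""
    return text.replace("\u017f", "s")


def bracket_date(original: str, day: int, month: int, year: int) -> str:
    """Keep the date as written and append it in brackets as dd Mmm yyyy."""
    return f"{original} [{day:02d} {MONTHS[month - 1]} {year}]"


def annotate_shillings(text: str) -> str:
    """Append a bracketed modern form after old-style shillings such as '4/'.

    Assumption: only a number followed by a slash and then whitespace
    (or the end of the text) is treated as shillings.
    """
    return re.sub(r"\b(\d+)/(?=\s|$)", r"\1/ [\1s]", text)


if __name__ == "__main__":
    print(replace_long_s("Ma\u017fsachu\u017fetts"))
    print(bracket_date("the Tenth Day of aprill 1684", 10, 4, 1684))
    print(annotate_shillings("paid 4/ for corn"))
```

In each case the original wording is left untouched and anything new goes inside brackets, in keeping with the "No new content" rule.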