Origin and coverage

Trismegistos Abbreviations is a spinoff of the Named Entity Recognition (NER) work done for the Text Irregularities database. For that project we extracted editorial corrections from the full text of papyri and ostraca in the Papyrological Navigator. We thought it would be equally useful to extract all abbreviations from Latin inscriptions and to create an interface that improves upon the very useful tables created by Tom Elliott in 1998, now available at a new address, still under the aegis of the American Society of Greek and Latin Epigraphy.

The Elliott lists were based on the Année Épigraphique corpus of 1888-1993. TM used the Clauss-Slaby database (EDCS), which contains almost all Latin inscriptions. Our work was done in late 2013, and we are planning an update for newly added inscriptions.

What is an abbreviation? Methodology

Abbreviations are words or clusters of words (see below) that have not been written out in full. In present-day Latin epigraphy this is indicated in the transcription by putting the omitted part of the word between round brackets: e.g. D(is) indicates that only a D is written instead of the full form Dis. Abbreviations include contractions, where the abbreviated form includes an element from the middle of the word in addition to the initial letter(s), e.g. BF abbreviating beneficiarius.

Following Elliott's procedure, the NER extracted every word containing an opening round bracket "(" to create this database. These words were added to one database table, and an additional entry was created in another table for every cluster of abbreviations (what Elliott calls meta-abbreviations).
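The extraction step described above can be sketched roughly as follows. This is only an illustration, not the actual Trismegistos code: the function names are hypothetical, the text is assumed to be split on whitespace, a cluster is assumed to be a run of adjacent abbreviated words, and the abbreviated form is put in capitals to match the examples on this page.

```python
import re

def split_abbreviation(word):
    """Return (abbreviated form, full form) for a word such as 'D(is)'."""
    # Abbreviated form: drop everything between round brackets.
    written = re.sub(r"\([^)]*\)", "", word)
    # Full form: keep the bracketed letters, drop only the brackets.
    full = word.replace("(", "").replace(")", "")
    return written.upper(), full

def extract_abbreviations(text):
    """Collect abbreviated words and clusters of adjacent abbreviated words."""
    words, clusters, run = [], [], []
    for token in text.split():
        if "(" in token:
            pair = split_abbreviation(token)
            words.append(pair)
            run.append(pair)
        elif run:
            # A word written in full ends the current cluster.
            clusters.append(run)
            run = []
    if run:
        clusters.append(run)
    # Join each cluster into one (abbreviation, full form) entry.
    joined = [(" ".join(a for a, _ in c), " ".join(f for _, f in c))
              for c in clusters]
    return words, joined
```

Under these assumptions a single abbreviated word standing between two fully written words also forms a cluster of its own, which agrees with the database structure described further down.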

EDCS not only uses round brackets for abbreviations, but also for what many would consider non-standard orthographies. Examples include:

These forms have been included in the lists. Users who want to exclude them can set a minimum number of omitted characters, e.g. 2. Trismegistos intends to include these standardizations in TM Text Irregularities, together with those marked by < = >, e.g. opt<i=U>mae, by { }, e.g. po{c}suit, and by (!), e.g. suom(!).
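A filter on the number of omitted characters could look like the sketch below. The function names are hypothetical, and the sample word c(h)arissimae is invented for illustration rather than quoted from EDCS. The sketch counts the letters between round brackets and does not count the (!) marker as an omission.

```python
import re

# Bracketed letters, skipping the editorial marker "(!)".
OMITTED = re.compile(r"\(([^)!]*)\)")

def omitted_length(word):
    """Number of characters supplied between round brackets."""
    return sum(len(part) for part in OMITTED.findall(word))

def filter_by_omitted(words, minimum=2):
    """Keep only words in which at least `minimum` characters are omitted."""
    return [w for w in words if omitted_length(w) >= minimum]
```

With a minimum of 2, D(is) and M(anibus) are kept, while a form with a single bracketed letter is dropped.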

Furthermore, the corrections that EDCS proposes using angle brackets have been ignored in the abbreviated form but adopted in the full form, which is an interpretation anyway. Thus e.g. the abbreviation <F=P>ort(unae) appears in the list as PORT abbreviating Fortunae.
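The treatment of angle brackets can be illustrated with a short sketch. It assumes, on the basis of the examples on this page, that in a correction such as <F=P> the first letter is the corrected reading and the second is what is actually written. The function names are hypothetical.

```python
import re

CORRECTION = re.compile(r"<([^=>]*)=([^>]*)>")   # <corrected=written>
EXPANSION = re.compile(r"\(([^)]*)\)")           # (omitted letters)

def abbreviated_form(word):
    """What is written: ignore the correction, drop the expansion."""
    written = CORRECTION.sub(r"\2", word)
    return EXPANSION.sub("", written).upper()

def full_form(word):
    """The interpretation: apply the correction, keep the expansion."""
    corrected = CORRECTION.sub(r"\1", word)
    return EXPANSION.sub(r"\1", corrected)
```

For <F=P>ort(unae) this gives PORT as the abbreviated form and Fortunae as the full form.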

Database structure

Trismegistos Abbreviations has a complex structure.

There are two tables of attestations of abbreviations: ABBWORDREF, which has more than 1,500,000 attestations of abbreviated individual words, e.g. one for D abbreviating Dis and one for M abbreviating Manibus; and ABBCLUSTREF, which has over 700,000 attestations of abbreviation clusters, e.g. one for D M abbreviating Dis Manibus. These two tables are connected: each cluster in ABBCLUSTREF consists of individual abbreviations in ABBWORDREF, and an abbreviation in ABBWORDREF is always part of a cluster in ABBCLUSTREF, even if that cluster consists of a single abbreviation.
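The link between the two attestation tables can be sketched with an in-memory SQLite database. Only the table names come from the description above. The column names (cluster_id, position, abbreviation, full_form) are assumptions made for this illustration and need not match the real Trismegistos schema.

```python
import sqlite3

def build_demo():
    """Create a tiny in-memory model of the two attestation tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE abbclustref (
            cluster_id   INTEGER PRIMARY KEY,
            abbreviation TEXT NOT NULL,
            full_form    TEXT NOT NULL
        );
        -- Every word attestation points to exactly one cluster (assumed columns).
        CREATE TABLE abbwordref (
            word_id      INTEGER PRIMARY KEY,
            cluster_id   INTEGER NOT NULL REFERENCES abbclustref(cluster_id),
            position     INTEGER NOT NULL,
            abbreviation TEXT NOT NULL,
            full_form    TEXT NOT NULL
        );
    """)
    conn.execute("INSERT INTO abbclustref VALUES (1, 'D M', 'Dis Manibus')")
    conn.executemany(
        "INSERT INTO abbwordref VALUES (?, ?, ?, ?, ?)",
        [(1, 1, 1, "D", "Dis"), (2, 1, 2, "M", "Manibus")],
    )
    return conn

def words_in_cluster(conn, cluster_id):
    """Return the (abbreviation, full form) pairs of a cluster, in order."""
    rows = conn.execute(
        "SELECT abbreviation, full_form FROM abbwordref "
        "WHERE cluster_id = ? ORDER BY position",
        (cluster_id,),
    )
    return rows.fetchall()
```

In this model a cluster made up of a single abbreviation is simply a row in abbclustref with one matching row in abbwordref.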

On top of that there are two sets of three tables with unique abbreviations and abbreviation clusters, for the abbreviations themselves (e.g. D or D M), for the abbreviated words or clusters (e.g. Dis or Dis Manibus), and for the combinations of these two (e.g. D abbreviating Dis or D M abbreviating Dis Manibus).
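How one such set of three tables follows from a list of attestations can be shown with a small sketch. It assumes that each attestation is an (abbreviation, full form) pair. The function name is hypothetical, and the counts are added only to show how many attestations stand behind each unique entry.

```python
from collections import Counter

def unique_tables(attestations):
    """Derive the three unique tables from (abbreviation, full form) pairs."""
    abbreviations = Counter(abbr for abbr, _ in attestations)   # e.g. D
    full_forms = Counter(full for _, full in attestations)      # e.g. Dis
    combinations = Counter(attestations)                        # e.g. D = Dis
    return abbreviations, full_forms, combinations
```

The same function can be applied once to the word attestations and once to the cluster attestations, which yields the two sets of three tables.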

The structure is reflected in the following figure: