Recent years have seen a growing interest in research aimed at building new linguistic resources and Natural Language Processing (NLP) tools for derivational morphology. For decades, research in computational morphology was mainly focussed on its inflectional aspects and, specifically, on PoS tagging. The current increased interest in both the theoretical and applied aspects of word formation is closely connected to the strong need for automatic semantic processing of linguistic data. Indeed, close relations hold between derivational morphology and semantics: words that share the same formative elements or the same formative process also tend to share basic semantic features, which can in turn be induced automatically from those of their lexical bases.
Several lexical resources for derivational morphology have been made available for a number of languages. Among them are the lexical network for Czech DeriNet (Ševčíková and Žabokrtský [23]), the derivational lexicon for German DERIVBASE (Zeller et al. [26]) and that for Italian derivaTario (Talamo et al. [24]). Furthermore, stemming is a technique widely used for detecting word formation processes (Goldsmith [9]), and language-independent probabilistic NLP tools have been developed to extract derivational information from lexical data (Baranes and Sagot [3]; Virpioja et al. [25]).
Over the last decade, much effort has been invested in the creation of advanced language resources and tools for ancient languages, notably the linguistic annotation of Latin and Ancient Greek textual data through treebanks (Bamman et al. [2]; Bamman & Crane [1]; Haug & Jøhndal [11]; Korkiakangas & Lassila [13]; Passarotti [19]). Numerous computational lexical resources for these languages have also been developed (McGillivray [16]; McGillivray & Passarotti [15]; Minozzi [17]; Passarotti et al. [21]).
What was still missing at that time was a derivational lexicon and NLP tool for Latin. When we decided in 2014 to write a project proposal for a Marie Curie Individual Fellowship, we felt that the time was ripe to address this challenge. In our earlier research, we had contributed to building a powerful morphological analyser for Latin (Lemlat: Passarotti et al. [22]) and, for more than a decade, to running the Index Thomisticus Treebank (Passarotti [19]), currently the largest Latin treebank available.
Derivational morphology was the missing link between inflectional morphology and syntax, so it seemed the natural next step to address.