Description of ARIES Natural Language Tools
The ARIES Natural Language Tools form a lexical platform for the
Spanish language that can be integrated into NLP applications. They
include a large Spanish lexicon, lexical maintenance and access tools, and
a morphological analyser/generator.
Non-exclusive, non-transferable licenses are available for
the following components:
- The Prolog GRAMPAL analyser/generator
- A public domain demonstration system, written in Prolog, of our
morphological treatment and lexicon. It includes a small demo lexicon, a DCG
grammar for word formation and some predicates to test both analysis and
generation. It runs under SICStus Prolog 2.1.9.
- The Prolog GRAMPAL dictionary
- A collection of Prolog predicates suitable for use with the public domain
GRAMPAL DCG grammar. It can generate and recognize well-formed
inflected forms of verbs, nouns and adjectives. It contains no adverbs,
determiners, conjunctions, prepositions, etc., and it handles neither clitic
pronoun attachment nor derived forms.
- The expanded ARIES dictionary
- A collection of expanded entries (allomorphs) with morphological
information. It contains a full set of morphemes dealing with clitic pronoun
attachment (but verbs are not marked for correct attachment). It includes
information about some derivational processes (inflected
adjectives from past participles and adverbs ending in "-mente" from
adjectives).
- The source ARIES lexical base
- A collection of inflectional models, rules for off-line computation of
allomorphs, unexpanded lemma entries and lexicalized irregular words. It is the
most complete source of information we have available and the most useful for
dictionary maintenance. A tool for expanding the source dictionary into the
expanded dictionary is also provided. The current size of this lexicon is
38,500 lemma entries (21,000 nouns, 10,000 adjectives, 7,500 verbs and 500
auxiliary words) plus more than 600 inflectional morphemes.
- Access tools
- The C/C++ programming interface for lexical access to the ARIES
dictionary: a set of tools and libraries for building trie indexes over the
allomorph dictionary and for retrieving its entries from an application.
- Morphological analyser
- The C/C++ morphological analyser, which uses the lexical interface
described above. This improves efficiency by integrating word
segmentation with lexical access. At present, it is a (pseudo-)unification
chart-based parser for context-free morphological grammars.
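To illustrate how a trie index over allomorphs lets word segmentation and lexical access happen in one pass, here is a minimal C++ sketch. It is not the ARIES interface: the names AllomorphTrie, Entry, Analysis and analyse are hypothetical, the tag strings are invented for the example, and a plain category comparison stands in for the (pseudo-)unification that the real analyser performs in its chart parser.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// One dictionary entry attached to an allomorph (hypothetical layout).
struct Entry {
    std::string category;  // e.g. "v" or "n"
    std::string info;      // lemma for stems, features for endings
};

// Hypothetical trie index: one node per character, entries stored at the
// node where an allomorph ends.
class AllomorphTrie {
public:
    void insert(const std::string& form, Entry entry) {
        assert(!form.empty());  // the root never carries entries
        Node* node = &root_;
        for (char c : form) {
            std::unique_ptr<Node>& child = node->children[c];
            if (!child) child.reset(new Node());
            node = child.get();
        }
        node->entries.push_back(std::move(entry));
    }

    // Entries stored for exactly `form`, or nullptr if there are none.
    const std::vector<Entry>* lookup(const std::string& form) const {
        const Node* node = &root_;
        for (char c : form) {
            auto it = node->children.find(c);
            if (it == node->children.end()) return nullptr;
            node = it->second.get();
        }
        return node->entries.empty() ? nullptr : &node->entries;
    }

    // Every allomorph that is a prefix of `word`, as (length, entries)
    // pairs. A single walk down the trie yields all candidate split points.
    std::vector<std::pair<std::size_t, const std::vector<Entry>*>>
    prefixes(const std::string& word) const {
        std::vector<std::pair<std::size_t, const std::vector<Entry>*>> out;
        const Node* node = &root_;
        for (std::size_t i = 0; i < word.size(); ++i) {
            auto it = node->children.find(word[i]);
            if (it == node->children.end()) break;
            node = it->second.get();
            if (!node->entries.empty()) out.emplace_back(i + 1, &node->entries);
        }
        return out;
    }

private:
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        std::vector<Entry> entries;
    };
    Node root_;
};

struct Analysis {
    std::string stem, ending, lemma, category, features;
};

// Split `word` into stem + ending. The stem walk proposes the split
// points, so segmentation falls out of the lexical lookup itself.
// ASSUMPTION: category equality replaces real feature unification.
std::vector<Analysis> analyse(const AllomorphTrie& stems,
                              const AllomorphTrie& endings,
                              const std::string& word) {
    std::vector<Analysis> results;
    for (const auto& match : stems.prefixes(word)) {
        const std::string ending = word.substr(match.first);
        const std::vector<Entry>* endingEntries = endings.lookup(ending);
        if (!endingEntries) continue;
        for (const Entry& s : *match.second)
            for (const Entry& e : *endingEntries)
                if (s.category == e.category)
                    results.push_back({word.substr(0, match.first), ending,
                                       s.info, s.category, e.info});
    }
    return results;
}
```

With the stem "cant" entered both as a verb (lemma "cantar") and as a noun (lemma "canto"), and the ending "o" entered with both verbal and nominal features, analyse returns two readings for "canto", while a mismatched combination such as a noun stem followed by a verb ending is rejected.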
NOTES:
Each license includes the right to receive any update of the
package released within one year of signing the license
agreement. Limited e-mail support is also provided.
Supported platforms are UNIX and DOS with the GNU gcc/g++ compilers
(djgpp for DOS). The tools have been tested on MS-DOS, HP-UX 9.05, SunOS
4.1.3 and Solaris 2.4.
No specific documentation is available, but several papers
describe the formalism, resources and tools.