On 2020-12-16 11:43, Jorge Gracia wrote:
Dear Apertium community,

As part of the European H2020 Prêt-à-LLOD project [1], we at
University of Zaragoza and Goethe University Frankfurt have been
working on a new version of the Apertium family of dictionaries in
RDF, with the final aim of making the Apertium data available as a
unified graph [2] of linked dictionaries on the Web, making it easily
re-usable for other tasks beyond machine translation.

To address the heterogeneity of MORPHOSYNTACTIC TAGS used in Apertium,
we have created an initial manual MAPPING OF THE APERTIUM TAGS TO THE
LEXINFO CATEGORY REGISTRY [3] AND TO THE UNIVERSAL DEPENDENCIES POS
TAGSET [4]. Such a mapping will not only ease the linking of the
lexical content of Apertium to external resources, but will also make
it possible to query Apertium lexica homogeneously even though they
originally used diverse tagsets. This mapping is compliant with
Apertium's List of symbols [5] and includes tags that are absent from
that list but were identified during tag extraction.

Since many of the tags in the source data were specific to the
language, and not all of them were accompanied by a proper
description, the mappings may contain errors. We are thus turning to
the APERTIUM COMMUNITY to share our work and to jointly CURATE SUCH A
MAPPING TABLE. Our final goal is to build a table of mappings as
semantically fine-grained as possible. We are sure that this can be
beneficial for the Apertium community for other tasks beyond the RDF
conversion.

The mapping is available at
https://github.com/sid-unizar/apertium-lexinfo-mapping, where you can
find the mapping files, downloadable as CSV [6] and TSV [7], details
of the methodology followed, as well as additional documentation.
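
As a rough illustration of how such a mapping file could be used to
normalise tags, here is a minimal Python sketch. The column names
(apertium_tag, lexinfo, ud_upos) and the three sample rows are
assumptions made for the example and are not copied from the
repository; the real CSV/TSV headers should be checked before use.

```python
import csv
import io

# Hypothetical excerpt of a mapping file. The column names and rows are
# illustrative assumptions, not the actual contents of the repository.
SAMPLE_TSV = (
    "apertium_tag\tlexinfo\tud_upos\n"
    "n\tlexinfo:noun\tNOUN\n"
    "vblex\tlexinfo:verb\tVERB\n"
    "adj\tlexinfo:adjective\tADJ\n"
)


def load_mapping(handle):
    """Read a tab-separated mapping and index its rows by Apertium tag."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["apertium_tag"]: row for row in reader}


def to_upos(tags, mapping):
    """Translate Apertium tags to UD UPOS; unmapped tags become None."""
    return [mapping[t]["ud_upos"] if t in mapping else None for t in tags]


mapping = load_mapping(io.StringIO(SAMPLE_TSV))
print(to_upos(["n", "vblex", "xyz"], mapping))
```

Returning None for unmapped tags keeps the gaps visible, which is
useful when the goal is to find tags that still need curation.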

If you are interested in helping us identify errors, or want to
provide your expertise and insights in the interpretation of these
tags, please do not hesitate to contact our colleague Julia Bosque-Gil
(in cc) or to open a new issue in the GitHub repository mentioned
above.

Best regards,

Some of the languages have .udx files which specify mappings from
Apertium tags to UD UPOS+Feats:

https://github.com/apertium/apertium-kaz/blob/master/apertium-kaz.kaz.udx

And likewise:

apertium/languages/apertium-usp/apertium-usp.usp.udx
apertium/languages/apertium-kmr/apertium-kmr.kmr.udx
apertium/languages/apertium-fao/apertium-fao.fao.udx
apertium/languages/apertium-chv/apertium-chv.chv.udx
apertium/languages/apertium-cak/apertium-cak.cak.udx
apertium/languages/apertium-bre/apertium-bre.bre.udx
apertium/languages/apertium-uig/apertium-uig.uig.udx
apertium/languages/apertium-quc/apertium-quc.quc.udx
apertium/languages/apertium-cym/apertium-cym.cym.udx

Ideally every language package would have such a file, but we haven't
got around to producing them yet.
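
A quick way to see which packages still lack one, assuming a local
checkout laid out like the paths above (a languages directory with one
apertium-* subdirectory per package), could look like the sketch below.
The function name and the directory argument are hypothetical.

```python
from pathlib import Path


def packages_missing_udx(languages_dir):
    """Return the names of apertium-* packages that contain no .udx file."""
    missing = []
    for pkg in sorted(Path(languages_dir).glob("apertium-*")):
        # Only look at package directories, and only at their top level.
        if pkg.is_dir() and not any(pkg.glob("*.udx")):
            missing.append(pkg.name)
    return missing


if __name__ == "__main__":
    # Assumed location of the checkout; adjust to the local layout.
    for name in packages_missing_udx("apertium/languages"):
        print(name)
```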

Best regards,

Francis M. Tyers


_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
