[Apertium-stuff] apertium-lint: A Linter for Apertium Files
Greetings Apertiumers!

I have yet another announcement for you all. Tino Didriksen and I have at long last vanquished one of the Python package managers (actually a messy amalgamation of 2 of them, but it's best not to dwell on that point) to bring you apertium-lint!

To install apertium-lint:

    pip3 install apertium-lint

To run apertium-lint:

    apertium-lint

This will analyze all Apertium files in the current directory and produce output such as the following (abbreviated from the output on apertium-kir):

    /home/daniel/apertium/apertium-data/apertium-kir $ apertium-lint
    ./modes.xml
    Error (install-deps) on line 8: Debug modes using files in .deps/ should not be installed.
    ./paper/paper.tex
    Warning (unnorm) on line 113: Line contains non-normalized characters.
    ./apertium-kir.kir.rlx
    Warning (unuse-set) on line 23: Set Pron-Pers defined but not used.
    Error (redef-set) on line 63: Redefinition of set Pl.
    Errors: 10
    Warnings: 11
    Suggestions: 0
    Nitpicks: 0

I hope its analysis proves useful (the check that identifies which lines have non-breaking spaces has already helped me a few times) and I'm happy to add more things for it to check.

Daniel
___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
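For readers curious what a check like the non-breaking-space or "unnorm" one might involve, here is a minimal sketch in Python. It is not taken from apertium-lint: the function name check_lines, the "nbsp" code, and the choice of NFC as the normalization form are all assumptions made for illustration. Only the "unnorm" code and its message come from the output shown in the announcement.

```python
import unicodedata

NBSP = "\u00a0"  # NO-BREAK SPACE


def check_lines(text):
    """Return (line_number, code, message) tuples for suspicious lines.

    Hypothetical helper: "nbsp" is an illustrative code, and NFC is an
    assumed normalization form, not necessarily what apertium-lint uses.
    """
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if NBSP in line:
            findings.append(
                (number, "nbsp", "Line contains a non-breaking space."))
        # is_normalized() is available in Python 3.8 and later.
        if not unicodedata.is_normalized("NFC", line):
            findings.append(
                (number, "unnorm", "Line contains non-normalized characters."))
    return findings


if __name__ == "__main__":
    # Line 2 spells "é" as "e" plus a combining accent; line 3 hides an NBSP.
    sample = "caf\u00e9\ncafe\u0301\na\u00a0b\n"
    for number, code, message in check_lines(sample):
        print(f"Warning ({code}) on line {number}: {message}")
```

Both problems are invisible in most editors, which is presumably why having a tool report the line number is handy.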
Re: [Apertium-stuff] Capitalization Handling
Greetings Apertiumers!

For anyone testing this, I've now also added -w/--dictionary-case to apertium-{transfer,interchunk,postchunk}, which makes the capitalization instructions simply do nothing, so that we don't have two conflicting sets of rules trying to solve the same problem in opposite ways.

Daniel

On Tue, Dec 27, 2022 at 6:47 AM Marc Riera Irigoyen wrote:
>
> Thanks for the great work! I'll make sure to test it with apertium-eng-cat,
> which has generation errors due to capitalization.
>
> Happy holidays!
>
> Marc Riera
>
> On Sat, Dec 24, 2022 at 14:12, Hèctor Alòs i Font wrote:
>>
>> Looks very good, Daniel. Thanks in advance. I'll try to test in the next
>> days in the pairs I maintain.
>> Merry Christmas/Hanukkah/New Year/*.
>> Hèctor
>>
>> On Fri, Dec 23, 2022 at 0:41, Daniel Swanson wrote:
>>>
>>> Greetings Apertiumers!
>>>
>>> I have two updates to report:
>>>
>>> First, I have rewritten the postgenerator (again), this time as part
>>> of apertium-separable (and so not breaking the old one, unlike last
>>> time), and in such a way that postgenerator rules can both match on
>>> lemma and tags in addition to surface forms and iteratively apply to
>>> their own output.
>>>
>>> This is available as part of apertium-separable 0.7.0 and is
>>> documented at https://wiki.apertium.org/wiki/Postgenerator
>>>
>>> Second, I just added a pair of modules which move capitalization
>>> information into word-bound blanks at the beginning of the pipeline
>>> and then reapply them according to LRX-like rules at the end of the
>>> pipeline, allowing all intermediate modules to operate solely on
>>> dictionary case.
>>>
>>> This should be available after the next nightly build (i.e. tomorrow)
>>> in apertium 3.9.0, and is documented at
>>> https://wiki.apertium.org/wiki/Capitalization_restoration
>>>
>>> If anyone has questions or would like help trying this out for a
>>> language pair or if I missed something in the documentation, let me
>>> know.
>>>
>>> Thanks to Kevin Unhammer and Marc Riera for helping me figure out what
>>> the design of the capitalization module should be.
>>>
>>> Merry Christmas,
>>> Daniel
>>>
>>> P.S. To anyone not interested in either of these developments: your
>>> Christmas gift is that I accidentally made lexical selection quite a
>>> bit faster while I was working on these.
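The capitalization scheme described in this thread (record the case up front, let every intermediate module work on dictionary case, reapply the case at the end) can be pictured with a toy sketch. This is not the Apertium implementation and does not use the real word-bound blank syntax or any LRX-like rules. The function names, the "AA"/"Aa"/"aa" pattern labels, and the simplification that dictionary case means lowercase are all assumptions made for illustration.

```python
def extract_case(word):
    """Split a surface word into (case_pattern, lowercased_word).

    Hypothetical patterns: "AA" all caps, "Aa" initial capital,
    "aa" anything else. The pattern stands in for the information
    that the real modules carry in a word-bound blank.
    """
    if len(word) > 1 and word.isupper():
        pattern = "AA"
    elif word[:1].isupper():
        pattern = "Aa"
    else:
        pattern = "aa"
    return pattern, word.lower()


def restore_case(pattern, word):
    """Reapply a recorded case pattern to a (possibly different) word."""
    if pattern == "AA":
        return word.upper()
    if pattern == "Aa":
        return word[:1].upper() + word[1:]
    return word


def translate_with_case(words, translate):
    """Run `translate` on case-stripped words, then restore the case.

    `translate` stands in for the whole intermediate pipeline, which
    only ever sees the lowercased form in this sketch.
    """
    output = []
    for word in words:
        pattern, base = extract_case(word)
        output.append(restore_case(pattern, translate(base)))
    return output


if __name__ == "__main__":
    table = {"hello": "hola", "world": "món"}  # made-up toy dictionary
    print(translate_with_case(["Hello", "WORLD"], lambda w: table[w]))
```

The point of the design is visible even in the toy: the stand-in for the translation step never has to know about capitalization at all, which is what a flag like -w/--dictionary-case relies on.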
Re: [Apertium-stuff] Capitalization Handling
Thanks for the great work! I'll make sure to test it with apertium-eng-cat, which has generation errors due to capitalization.

Happy holidays!

Marc Riera