[Apertium-stuff] apertium-lint: A Linter for Apertium Files

2022-12-27 Thread Daniel Swanson
Greetings Apertiumers!

I have yet another announcement for you all.

Tino Didriksen and I have at long last vanquished one of the Python
package managers (actually a messy amalgamation of 2 of them, but it's
best not to dwell on that point) to bring you apertium-lint!

To install apertium-lint:
pip3 install apertium-lint

To run apertium-lint:
apertium-lint

This will analyze all Apertium files in the current directory and
produce output such as the following (abbreviated from the output on
apertium-kir):

/home/daniel/apertium/apertium-data/apertium-kir $ apertium-lint
./modes.xml
Error (install-deps) on line 8: Debug modes using files in .deps/ should not be installed.
./paper/paper.tex
Warning (unnorm) on line 113: Line contains non-normalized characters.
./apertium-kir.kir.rlx
Warning (unuse-set) on line 23: Set Pron-Pers defined but not used.
Error (redef-set) on line 63: Redefinition of set Pl.
Errors: 10 Warnings: 11 Suggestions: 0 Nitpicks: 0
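
For anyone who wants to post-process this output (say, to count findings per file in CI), here is a minimal Python sketch. It is not part of apertium-lint: the function name parse_lint_output is made up, the line format is inferred only from the sample above, and the singular level names "Suggestion" and "Nitpick" are guessed from the summary line. It also assumes each finding occupies a single line in the real output.

```python
import re
from collections import Counter

# Shape of a finding line, inferred from the sample output, e.g.
# "Warning (unnorm) on line 113: Line contains non-normalized characters."
# "Suggestion" and "Nitpick" are assumed spellings, not confirmed ones.
FINDING = re.compile(
    r"^(Error|Warning|Suggestion|Nitpick) \(([\w-]+)\) on line (\d+): (.*)$"
)

def parse_lint_output(text):
    """Group findings by file: {path: [(level, check, line, message), ...]}."""
    results = {}
    current = None  # path of the file whose findings we are reading
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Errors:"):
            continue  # skip blank lines and the final summary line
        match = FINDING.match(line)
        if match:
            level, check, lineno, message = match.groups()
            results.setdefault(current, []).append(
                (level, check, int(lineno), message)
            )
        else:
            # Assumption: any other line is a file path header.
            current = line
            results.setdefault(current, [])
    return results

SAMPLE = """\
./modes.xml
Error (install-deps) on line 8: Debug modes using files in .deps/ should not be installed.
./paper/paper.tex
Warning (unnorm) on line 113: Line contains non-normalized characters.
./apertium-kir.kir.rlx
Warning (unuse-set) on line 23: Set Pron-Pers defined but not used.
Error (redef-set) on line 63: Redefinition of set Pl.
Errors: 10 Warnings: 11 Suggestions: 0 Nitpicks: 0
"""

findings = parse_lint_output(SAMPLE)
counts = Counter(level for items in findings.values() for level, *_ in items)
for path, items in findings.items():
    print(path, len(items))
```

The counts here cover only the abbreviated sample, so they will not match the totals on the summary line.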

I hope its analysis proves useful (the check that identifies which
lines have non-breaking spaces has already helped me a few times) and
I'm happy to add more things for it to check.

Daniel


___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] Capitalization Handling

2022-12-27 Thread Daniel Swanson
Greetings Apertiumers!

For anyone testing this, I've now also added -w/--dictionary-case to
apertium-{transfer,interchunk,postchunk} which makes the
capitalization instructions simply do nothing so we don't have two
conflicting sets of rules trying to solve the same problem in opposite
ways.
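
To illustrate the general idea behind the capitalization modules described in the quoted announcement below, here is a toy Python sketch. It is not the actual implementation: the function names and the "AA"/"Aa"/"aa" tags are hypothetical, plain tuples stand in for word-bound blanks, and the restore step simply reapplies the recorded pattern instead of using LRX-like rules. It also lowercases everything, whereas real dictionary case keeps capitals on entries such as proper nouns.

```python
def extract_caps(token):
    """Return (case_tag, lowercased token). The tags are hypothetical."""
    if len(token) > 1 and token.isupper():
        return "AA", token.lower()                 # all caps
    if token[:1].isupper():
        return "Aa", token[0].lower() + token[1:]  # initial capital
    return "aa", token                             # leave as-is

def restore_caps(tag, token):
    """Reapply the recorded case pattern to the final output token."""
    if tag == "AA":
        return token.upper()
    if tag == "Aa":
        return token[:1].upper() + token[1:]
    return token

def run_pipeline(tokens, middle):
    """Extract case, run the case-agnostic middle step, then restore case."""
    tagged = [extract_caps(t) for t in tokens]
    processed = [(tag, middle(word)) for tag, word in tagged]
    return [restore_caps(tag, word) for tag, word in processed]

# A stand-in for all intermediate modules: a tiny lowercase-only lookup.
toy_dictionary = {"the": "el", "house": "casa"}
output = run_pipeline(["The", "HOUSE"], lambda w: toy_dictionary.get(w, w))
print(output)
```

The point of the design is that the middle step never sees or sets case, which is why the older capitalization instructions in transfer can be switched off without losing anything.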

Daniel

On Tue, Dec 27, 2022 at 6:47 AM Marc Riera Irigoyen wrote:
>
> Thanks for the great work! I'll make sure to test it with apertium-eng-cat, 
> which has generation errors due to capitalization.
>
> Happy holidays!
>
> Marc Riera
>
>
> Message from Hèctor Alòs i Font on Sat, 24 Dec 2022 at 14:12:
>>
>> Looks very good, Daniel. Thanks in advance. I'll try to test it over the
>> next few days on the pairs I maintain.
>> Merry Christmas/Hanukkah/New Year/*.
>> Hèctor
>>
>> Message from Daniel Swanson on Fri, 23 Dec 2022 at 0:41:
>>>
>>> Greetings Apertiumers!
>>>
>>> I have two updates to report:
>>>
>>> First, I have rewritten the postgenerator (again), this time as part
>>> of apertium-separable (so it does not break the old one, unlike last
>>> time). Postgenerator rules can now match on lemmas and tags in
>>> addition to surface forms, and can apply iteratively to their own
>>> output.
>>>
>>> This is available as part of apertium-separable 0.7.0 and is
>>> documented at https://wiki.apertium.org/wiki/Postgenerator
>>>
>>> Second, I just added a pair of modules which move capitalization
>>> information into word-bound blanks at the beginning of the pipeline
>>> and then reapply them according to LRX-like rules at the end of the
>>> pipeline, allowing all intermediate modules to operate solely on
>>> dictionary case.
>>>
>>> This should be available after the next nightly build (i.e. tomorrow)
>>> in apertium 3.9.0, and is documented at
>>> https://wiki.apertium.org/wiki/Capitalization_restoration
>>>
>>> If anyone has questions or would like help trying this out for a
>>> language pair or if I missed something in the documentation, let me
>>> know.
>>>
>>> Thanks to Kevin Unhammer and Marc Riera for helping me figure out what
>>> the design of the capitalization module should be.
>>>
>>> Merry Christmas,
>>> Daniel
>>>
>>> P.S. To anyone not interested in either of these developments: your
>>> Christmas gift is that I accidentally made lexical selection quite a
>>> bit faster while I was working on these.
>>>


Re: [Apertium-stuff] Capitalization Handling

2022-12-27 Thread Marc Riera Irigoyen
Thanks for the great work! I'll make sure to test it with apertium-eng-cat,
which has generation errors due to capitalization.

Happy holidays!

*Marc Riera*