The move from traditional to recursive transfer is inspiring. I'm too lazy
to redo tons of code, but I'm glad to confirm that it's more compact and
powerful. Congrats!
Hèctor

Message from Kevin Brubeck Unhammer <unham...@fsfe.org> on Mon, 19 Dec
2022 at 15:22:

> Goddag,
>
> I've just tagged new releases of swe-nor and dan-nor.
>
> The work on swe-nor is partially funded by the Norwegian News Agency,
> and dan-nor by Store norske leksikon.
>
> For both pairs, all directions now use apertium-separable (lsx) and
> recursive transfer (rtx), with testing by apertium-regtest.
>
> Most of the work has been focused on the nob→{swe,dan} direction, but
> all directions have of course improved vocabulary and seem to have
> improved quality. The directions into Nynorsk are also usable with style
> preferences (though these haven't been added to the UI yet in this
> release).
>
> Some stats:
>
> dan-nor:
> - Over 22.000 new non-name bidix entries
> - Over 300 new lexical selection rules
> - ~60 separable/mwe entries, including comma insertion rules for
>   generating Danish
>
> swe-nor:
> - Over 20.000 new non-name bidix entries
> - Over 300 new lexical selection rules manually added
> - Nearly 7000 new lexical selection rules based on corpus frequencies
> - ~30 separable/mwe entries
>
> and the newer monolingual dependencies mean much better bokmål
> disambiguation (and some improvements there for the other languages as
> well) as well as much better compound epenthetic choices and tweaks all
> round.
>
> Moving from chunking transfer to recursive transfer for these pairs was
> a joy. I have spent very little time on the rules, but they already
> cover more than the old rules did, in far fewer lines of code. Counting
> comments and everything, dan-nor has ~1011 lines of rtx in one file per
> direction, compared with 8347 lines of t?x in three files per direction.
> Each direction has about 20 rtx rules (where a rule is NP→n|ncmp n|…),
> or 50 if you count alternatives. There's a lot less redundancy than
> before, and the recursion means we can have e.g. compounds of arbitrary
> length.
>
> -Kevin
>
>
>
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
