On 2018-08-06 23:43, Xavi Ivars wrote:
Hi,
We've recently seen some issues with apertium-tagger and recent
nightlies.
This is what happens with the nightly from a few days ago
(Version: 3.5.1+g697~a60e0bc0-1~stretch1):
$ echo '^cumbre/cumbre<n><f><sg>$ ^en/en<pr>$ ^Madrid/Madrid<np><ant>/Madrid<np><loc>$^./.<sent>$' |
    apertium-tagger -gdm /src/apertium-spa-cat/spa-cat.prob
^cumbre<n><f><sg>$ ^en<pr>$ ^=Madrid<np><ant>$^.<sent>$
On the other hand, this is what happens with Version:
3.5.1+g702~9d8ad1a2-1~bionic1
$ echo '^cumbre/cumbre<n><f><sg>$ ^en/en<pr>$ ^Madrid/Madrid<np><ant>/Madrid<np><loc>$^./.<sent>$' |
    apertium-tagger -gdm /src/apertium-spa-cat/spa-cat.prob
^cumbre<n><f><sg>$ ^en<pr>$ ^=Madrid<np><loc>$^.<sent>$
In both cases, the .prob file is the same
$ md5sum ./spa-cat.prob
aaf24085338f39f9133b65ca73de71f9 ./spa-cat.prob
We've managed to reproduce the instability in different environments,
but we cannot understand why the tagger is not consistent.
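One way to narrow this down might be to check whether the output also varies between repeated runs on a single build, rather than only between builds. The following is a minimal sketch under that assumption: the helper name count_distinct_outputs is hypothetical (not part of Apertium), and it simply feeds the same input to a command several times and counts the distinct output lines.

```shell
# Hypothetical helper (not part of Apertium): feed the same stdin to a
# command N times and report each distinct output line with its count.
# Usage: ... | count_distinct_outputs N COMMAND [ARGS...]
count_distinct_outputs() {
  n=$1
  shift
  input=$(cat)            # read the test input once
  i=0
  while [ "$i" -lt "$n" ]; do
    printf '%s\n' "$input" | "$@"
    i=$((i + 1))
  done | sort | uniq -c   # one row per distinct output line
}
```

With the tagger invocation shown above, this could be used as:

echo '^cumbre/cumbre<n><f><sg>$ ^en/en<pr>$ ^Madrid/Madrid<np><ant>/Madrid<np><loc>$^./.<sent>$' |
    count_distinct_outputs 20 apertium-tagger -gdm /src/apertium-spa-cat/spa-cat.prob

A single row in the result would suggest that a given build is deterministic and that the difference lies between builds; more than one row would point to run-to-run variation.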
Is it possible that the recent changes related to weighted FSTs have
any impact on it?
I'd say that's feasible, as it will have changed the topology of some
transducers. In principle, though, if the tagger is doing its job as
expected, that shouldn't happen; it should only happen if there is
ambiguity left.
Techievena: What do you think?
Fran