On 2018-08-06 23:43, Xavi Ivars wrote:
Hi,

We've recently seen some issues with apertium-tagger in recent
nightlies.

This is what happens with the nightly
version 3.5.1+g697~a60e0bc0-1~stretch1 (a few days ago):

$ echo '^cumbre/cumbre<n><f><sg>$ ^en/en<pr>$
^Madrid/Madrid<np><ant>/Madrid<np><loc>$^./.<sent>$' |
apertium-tagger -gdm /src/apertium-spa-cat/spa-cat.prob
^cumbre<n><f><sg>$ ^en<pr>$ ^=Madrid<np><ant>$^.<sent>$

On the other hand, this is what happens with version
3.5.1+g702~9d8ad1a2-1~bionic1:

$ echo '^cumbre/cumbre<n><f><sg>$ ^en/en<pr>$
^Madrid/Madrid<np><ant>/Madrid<np><loc>$^./.<sent>$' |
apertium-tagger -gdm /src/apertium-spa-cat/spa-cat.prob
^cumbre<n><f><sg>$ ^en<pr>$ ^=Madrid<np><loc>$^.<sent>$

In both cases, the .prob file is the same:

$ md5sum ./spa-cat.prob
aaf24085338f39f9133b65ca73de71f9  ./spa-cat.prob

We've managed to reproduce the instability in different environments,
but we cannot understand why the tagger is not consistent.

Is it possible that the recent changes related to weighted FSTs have
had an impact on it?


I'd say that's feasible, as those changes will have altered the
topology of some transducers. In principle, though, if the tagger is
doing its job as expected, that shouldn't happen: the output should
only differ if there is ambiguity left.
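
As a rough way to see which lexical units the tagger actually has to
choose between, something like the following could be run on the
tagger input. It is only a sketch: ambiguous_units is a hypothetical
helper name, and it assumes the stream contains no escaped "/" or "$"
characters, which would throw off the counting.

```shell
# Print every lexical unit (^...$) that carries more than one analysis.
# A unit looks like ^surface/reading1/reading2$, so splitting on "/"
# yields more than two fields when at least two readings are present.
# Assumption: no escaped "\/" or "\$" appears inside the units.
ambiguous_units() {
    grep -o '\^[^$]*\$' | awk -F'/' 'NF > 2'
}

printf '%s\n' '^cumbre/cumbre<n><f><sg>$ ^en/en<pr>$ ^Madrid/Madrid<np><ant>/Madrid<np><loc>$^./.<sent>$' |
    ambiguous_units
```

On the example above, the only unit reported is the one for Madrid,
which is the one whose tag differs between the two versions.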

Techievena: What do you think?

Fran

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
