[Apertium-stuff] Questions about lexical selection

Grzegorz Kulik Mon, 20 Dec 2021 10:40:15 -0800

Hello all,

I haven't been in touch for some time; I hope you are all well. I
developed the pol-szl pair, and although the translation is quite
reasonable, I decided to improve it by working on lexical
selection. I've been reading the documentation and managed to write
several rules for forms that need disambiguation and share the same part
of speech. However, I cannot find any information about what to do when
a form can mean two completely different things with different parts of
speech.
Example in Polish:


tacy (such) = taki<det><dem><mp><pl><nom>
tacy (of a tray) =
taca<n><f><sg><gen>/taca<n><f><sg><dat>/taca<n><f><sg><loc>

The first meaning is obviously much more frequent but the translator
chooses the second one, which is less than desirable.
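
To make the ambiguity concrete, the analyses above can be grouped by
lemma and part of speech. This is only a minimal Python sketch and not
part of Apertium: the names parse_analysis and group_by_pos are
hypothetical, and it assumes the first tag of each analysis is the part
of speech. The analysis strings are copied from the example.

```python
import re
from collections import defaultdict

# Analyses for the surface form "tacy", copied from the example above.
ANALYSES = (
    "taki<det><dem><mp><pl><nom>/"
    "taca<n><f><sg><gen>/taca<n><f><sg><dat>/taca<n><f><sg><loc>"
)


def parse_analysis(analysis):
    """Split one analysis such as 'taca<n><f><sg><gen>' into (lemma, tags)."""
    lemma = analysis.split("<", 1)[0]
    tags = re.findall(r"<([^>]+)>", analysis)
    return lemma, tags


def group_by_pos(analyses):
    """Group slash-separated analyses by (lemma, part of speech)."""
    groups = defaultdict(list)
    for analysis in analyses.split("/"):
        lemma, tags = parse_analysis(analysis)
        # Assumption: the first tag is the part of speech.
        groups[(lemma, tags[0])].append(tags[1:])
    return dict(groups)


if __name__ == "__main__":
    for (lemma, pos), readings in group_by_pos(ANALYSES).items():
        print(lemma, pos, readings)
```

The result has two groups that differ in both lemma and part of speech:
one determiner reading of taki and three noun readings of taca.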

What can I do to remedy this? Can I write rules for that manually?
Should I train the tagger? If so, which method would be best?
There are several training methods and I don't know which one to choose
for my pair. Could you recommend the best approach?

Thank you in advance
Greg

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
