On 2017-07-02 18:53, Jaume Ortolà i Font wrote:
The Spanish word "mango" (n. masc.) has two main meanings:

1) handle. E.g. El mango de la sartén. / The handle of the frying
pan.
2) mango (fruit or tree). E.g. He comido mango. / I have eaten mango.

Finding the right translation is not difficult. "Handle" is the
default translation, but if certain words appear in the sentence,
"mango" should be the preferred translation.

I implemented a rule in a Catalan grammar checker to find wrong
translations of mango. With a list of about 100 lemmas, the results
are good enough. These lemmas come from a Catalan corpus (where the
two meanings are different words: mànec/mango).

How can this approach be best implemented in Apertium? I translated
the list of 100 words into Spanish and wrote some rules in the
Constraint-based lexical selection module. But I need to write a
different rule for each distance from the ambiguous word.[1] Is there
a way to indicate "any place in the sentence"? If implemented, I think
it could be very productive. The same list of Spanish words can be
used by other language pairs.


There is no way to say "anywhere in the sentence" in the lexical selection
module. This is for efficiency reasons.

There are two possibilities:

1) you could mark them in CG, e.g.

SUBSTITUTE ("mango") ("mango¹") ("mango") (0* MangoWords);
SUBSTITUTE ("mango") ("mango²") ("mango") (0* ManecWords);

Then in the bidix you would have:

<e><p><l>mango¹<s n="n"/></l><r>mango<s n="n"/></r></p></e>
<e><p><l>mango²<s n="n"/></l><r>mànec<s n="n"/></r></p></e>

2) You could just use the constraint-based module and only look
at 5-gram windows. I guess you would get 90-95% of cases that way.
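
For option 2, the per-distance rules do not have to be written by hand.
What follows is only a rough sketch, not part of Apertium: a small Python
script that expands a word list into one lexical selection rule per
trigger word, side and distance. The function name window_rules, the
trigger lemmas "comer" and "fruta", and the default max_distance=2 are
assumptions made for illustration. The <rule>/<match>/<select> layout, and
the empty <match/> standing for "any word", should be checked against the
module's documentation before use.

```python
from xml.sax.saxutils import quoteattr


def window_rules(source_lemma, target_lemma, trigger_lemmas,
                 max_distance=2, tags="n.*"):
    """Return lexical selection rules as strings: one per trigger lemma,
    per side (left/right) and per distance up to max_distance."""
    # The ambiguous word, with the translation to select when a rule fires.
    target = (
        f"<match lemma={quoteattr(source_lemma)} tags={quoteattr(tags)}>"
        f"<select lemma={quoteattr(target_lemma)} tags={quoteattr(tags)}/>"
        "</match>"
    )
    rules = []
    for trigger in trigger_lemmas:
        context = f"<match lemma={quoteattr(trigger)}/>"
        for distance in range(1, max_distance + 1):
            # Empty <match/> elements stand for the words in between
            # (assumed to match any word).
            gap = ["<match/>"] * (distance - 1)
            # Trigger word to the left of the ambiguous word.
            rules.append("<rule>" + "".join([context, *gap, target]) + "</rule>")
            # Trigger word to the right of the ambiguous word.
            rules.append("<rule>" + "".join([target, *gap, context]) + "</rule>")
    return rules


if __name__ == "__main__":
    # Hypothetical trigger lemmas; the real list would be the ~100 words.
    print("<rules>")
    for rule in window_rules("mango", "mango", ["comer", "fruta"]):
        print("  " + rule)
    print("</rules>")
```

With max_distance=2 the ambiguous word sits in the middle of a 5-word
window; raising it to 4 also covers windows where it is at the edge, at
the cost of twice as many rules.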



Fran