Hi,

I have the following situation:

This group age
is sometimes translated as:
ce groupe d'âge (correct)
ce groupe d" âge (incorrect)
ce groupe d "âge (incorrect)

I am wondering if this is more a detokenizer issue or a corpus issue, or 
both.

Technically in French, there shouldn't be any space before or after the 
apostrophe.
In the Europarl Corpus, as well as in the News2014 one, there are some 
instances with a space before or after.
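
To get an idea of how often this happens in the raw (untokenized) corpus, 
something like the following could be used. This is only a rough sketch: the 
names ELISION, fix_elision_spacing and count_affected are made up for 
illustration, the list of elided forms is my own assumption and is not 
exhaustive, and the pattern may misfire on text that uses single quotes as 
quotation marks.

```python
import re

# Hypothetical helper, not part of Moses.
# Matches a common French elided form (l, d, j, qu, ...) followed by an
# apostrophe with optional whitespace on either side, before a word character.
# The list of elided forms is an assumption and is not exhaustive.
ELISION = re.compile(
    r"\b(l|d|j|m|n|s|t|c|qu|jusqu|lorsqu|puisqu|quoiqu)\s*(['’])\s*(?=\w)",
    re.IGNORECASE,
)


def fix_elision_spacing(line):
    """Remove whitespace around the apostrophe of an elided form."""
    return ELISION.sub(r"\1\2", line)


def count_affected(lines):
    """Count lines that contain at least one badly spaced elision."""
    return sum(1 for line in lines if fix_elision_spacing(line) != line)


if __name__ == "__main__":
    sample = ["ce groupe d'âge", "ce groupe d' âge", "ce groupe d 'âge"]
    print(count_affected(sample))
    for line in sample:
        print(fix_elision_spacing(line))
```

Running the count on Europarl and News2014 before training would show whether 
the corpus side is big enough to matter; the fix function could then be applied 
as a cleaning step before tokenization.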

My feeling is that the decoder outputs a ' with surrounding spaces, 
which leads the detokenizer to turn it into a "
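
One way to check this would be to look at the decoder output before 
detokenization and flag apostrophes that are not attached to the end of the 
preceding word. The sketch below assumes that a correctly tokenized French 
elision looks like "d' âge" (apostrophe kept on the left token) and that the 
apostrophe may appear escaped as &apos;. Both are assumptions about the 
pipeline, and the function name is hypothetical.

```python
# Hypothetical diagnostic, not part of Moses.
# Assumption: in well-formed tokenized French the apostrophe stays attached
# to the end of the elided word ("d'" or "d&apos;"), so a standalone
# apostrophe token or a token starting with one is suspicious.
APOSTROPHES = ("'", "’", "&apos;")


def suspicious_apostrophe_tokens(tokenized_line):
    """Return tokens where the apostrophe is detached or leads the token."""
    flagged = []
    for token in tokenized_line.split():
        if token in APOSTROPHES:
            flagged.append(token)  # standalone, as in "d ' âge"
        elif token.startswith(APOSTROPHES):
            flagged.append(token)  # attached to the right, as in "d 'âge"
    return flagged


if __name__ == "__main__":
    for line in ["ce groupe d' âge", "ce groupe d ' âge", "ce groupe d 'âge"]:
        print(line, "->", suspicious_apostrophe_tokens(line))
```

If the flagged tokens already show up in the decoder output, the problem 
probably comes from the training data rather than from the detokenizer alone.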

Has anyone run into a similar issue?

Thanks.
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support