Hi everyone,

I would like to ask for some details on how Moses handles unknown words.
According to the tutorial, unknown words are copied verbatim to the
output. However, it is not clear how the distortion limit is handled
when unknown words are copied to the output.

The situation is as follows:

With the default settings of Moses (phrase-based, distance-based
reordering, ...), the handling of unknown words is postponed until the
very end, because they are penalized severely (-100).

Therefore, it is likely that some unknown words are dropped and do not
appear in the output, because of the reordering limit constraint
(default = 6).

==> In other words, the copying of unknown words is postponed until the
end, and by the time the decoder gets to them, the reordering limit
constraint blocks the jump back. As a result, we do not get a complete
translation.
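
To make the failure concrete, here is a toy sketch of what I think is
going on. It is a greedy, single-hypothesis decoder over source word
positions, not a beam search and not Moses code. All names in it
(distortion, can_extend, decode_postponing) are made up for
illustration. The lookahead flag is a hypothetical guard that rejects an
extension if the leftmost uncovered word could no longer be reached
afterwards; I have not confirmed that Moses does exactly this.

```python
def distortion(prev_end, next_start):
    """Distance-based jump size; 0 for a monotone continuation."""
    return abs(next_start - prev_end - 1)


def can_extend(covered, prev_end, start, end, limit, lookahead):
    """Return True if translating source span [start, end] is allowed."""
    if any(covered[start:end + 1]):
        return False
    if distortion(prev_end, start) > limit:
        return False
    if not lookahead:
        return True
    # Hypothetical guard: after this extension, the leftmost gap that
    # was left behind must still be reachable within the limit.
    after = list(covered)
    for i in range(start, end + 1):
        after[i] = True
    if all(after):
        return True
    first_gap = after.index(False)
    if first_gap > end:
        return True  # nothing was left behind to the left
    return distortion(end, first_gap) <= limit


def decode_postponing(n, unknown, limit, lookahead):
    """Greedy toy decoder that prefers known words and postpones
    unknown ones. Returns (source positions in output order, complete)."""
    covered = [False] * n
    prev_end = -1
    order = []
    while not all(covered):
        candidates = [i for i in range(n) if not covered[i]]
        # Known words first (left to right), unknown words last.
        candidates.sort(key=lambda i: (i in unknown, i))
        for i in candidates:
            if can_extend(covered, prev_end, i, i, limit, lookahead):
                covered[i] = True
                prev_end = i
                order.append(i)
                break
        else:
            return order, False  # dead end: no legal extension left
    return order, True


if __name__ == "__main__":
    # 10 source words, the first one is unknown, reordering limit 6.
    print(decode_postponing(10, {0}, 6, lookahead=False))
    print(decode_postponing(10, {0}, 6, lookahead=True))
```

Without the guard, the toy decoder covers positions 1 to 9 and then
cannot jump back to position 0, so the hypothesis is incomplete. With
the guard, it is forced to pick up the unknown word before moving too
far to the right.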

This is what happened with my re-implementation of Moses, and it hurt
the translation quality considerably (1 or 2 BLEU points behind).

However, this does not seem to happen with Moses itself (i.e., Moses
always finds a complete translation).

I hope that someone can help me clarify it as I cannot find the relevant
information anywhere.

Thanks.

Hoang Trong Nghia
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
