Re: [Moses-support] Handling unknown words in Moses

2010-08-09 Thread John Burger
Philipp Koehn wrote:
> this is not correct - LM cost is in the future cost estimate.
> Obviously, this is a rather low probability, depending
> on if the language model was trained with open or
> closed vocabulary.

And also whether the word is unknown to the LM or not, yes? Typically there are

[Moses-support] Handling unknown words in Moses

2010-08-09 Thread Trong Nghia Hoang
Hi everyone, I would like to ask for some details on how Moses handles unknown words. As I read in the tutorial, unknown words are copied verbatim to the output. However, it is not clear how the distortion limit is handled while copying unknown words to the output. The si

Re: [Moses-support] Handling unknown words in Moses

2010-08-09 Thread Philipp Koehn
Hi, this is not correct - LM cost is in the future cost estimate. Obviously, this is a rather low probability, depending on if the language model was trained with open or closed vocabulary. Unknown words often do cause some strange reordering, due to the fact that an unknown w

Re: [Moses-support] Handling unknown words in Moses

2010-08-09 Thread Alexander Fraser
It seems like even if this is correctly implemented, unknown words would be delayed until the edge of the window they are in, due to trying to avoid paying the high LM cost until the last minute. LM cost is not in the future cost, so hypotheses paying this LM cost should lose to hypotheses delaying

Re: [Moses-support] Handling unknown words in Moses

2010-08-09 Thread Hieu Hoang
Make sure the score of the unknown word (-100) is included in your calculation of future cost.

On 09/08/2010 08:02, nghi...@comp.nus.edu.sg wrote:
> Hi everyone,
>
> I would like to ask for some details on how Moses deals with the handling
> of unknown words. As I read from the tutorial, unkno
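To make the point about future cost concrete, here is a minimal sketch of a span-based future cost table in which an untranslatable single word is charged the -100 score mentioned above. It is not Moses code: the function name `future_cost_table`, the `option_score` dictionary, and the toy scores are assumptions made for illustration, and the table only uses the standard "best option or best split" recurrence over log-scores.

```python
import math

# Score charged for a word with no translation option; the value -100 is the
# one quoted in this thread, everything else here is a hypothetical sketch.
UNKNOWN_WORD_SCORE = -100.0


def future_cost_table(words, option_score):
    """Return cost[i][j], the best estimated log-score for covering words[i:j].

    option_score maps a tuple of source words to the best score of any
    translation option for that exact span (assumed precomputed).
    """
    n = len(words)
    cost = [[-math.inf] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            best = option_score.get(tuple(words[i:j]), -math.inf)
            # An unknown single word still has to be covered, so it
            # contributes its penalty to the estimate instead of nothing.
            if length == 1 and best == -math.inf:
                best = UNKNOWN_WORD_SCORE
            # Otherwise take the best way of splitting the span in two.
            for k in range(i + 1, j):
                best = max(best, cost[i][k] + cost[k][j])
            cost[i][j] = best
    return cost


if __name__ == "__main__":
    table = future_cost_table(
        ["das", "xyzzy", "haus"],
        {("das",): -1.0, ("haus",): -2.0},
    )
    print(table[0][3])
```

With the penalty in the table, a hypothesis that has not yet covered the unknown word already carries its cost in the estimate, so postponing the word gains nothing in the comparison between hypotheses.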

Re: [Moses-support] Handling unknown words in Moses

2010-08-09 Thread Barry Haddow
> With the default setting of Moses (phrase-based, distance-based reordering
> ...), the handling of unknown words will be postponed (as we penalize them
> severely: -100) until the very end.
>
> Therefore, it is likely that some unknown words are dropped and won't
> appear in the output (due to

[Moses-support] Handling unknown words in Moses

2010-08-09 Thread nghiaht
Hi everyone, I would like to ask for some details on how Moses handles unknown words. As I read in the tutorial, unknown words are copied verbatim to the output. However, it is not clear how the distortion limit is handled while copying unknown words to the output. The s