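Hi,

Any words beyond the first N-1 of a phrase have full context and are already included in the phrase's score. So the total is hypothesis score + target phrase score + adjustments, where the adjustments cover the first N-1 words of the phrase, whose context comes from the preceding hypothesis. The routine you cite is the one computing those adjustments.
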
Kenneth

On 04/19/16 10:50, Evgeny Matusov wrote:
> Hi,
>
> my colleagues and I noticed the following in the KenLM code when a
> Hypo is evaluated with the LM:
>
> https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/Ken.cpp#L203
>
> Do we understand it correctly that because of this line, for phrases
> longer than the LM order N only the first N words are scored with the
> LM, the subsequent words are not scored? At least I don't see a call
> to add their scores anywhere, they are just passed on to update the LM
> state in lines 222-225.
>
> Please clarify. It seems like a phrase should be scored by the LM
> completely, otherwise longer phrases which start with frequent
> n-grams but have unlikely word sequences afterwards are wrongly
> preferred. Also, longer phrases are preferred in general with such
> scoring.
>
> Thanks,
>
> Evgeny.
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
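
The decomposition described in the reply at the top of this thread (hypothesis + target phrase + adjustments) can be illustrated with a small sketch. This is not the Moses or KenLM code: the toy scoring function `logprob`, the order `ORDER = 3`, and the helper names are all assumptions made up for illustration. The sketch also assumes that the phrase score covers only the words with full in-phrase context and that the boundary step scores the first N-1 words against the hypothesis; a real decoder may split the bookkeeping differently. It only shows that the three parts together score every word of the phrase exactly once.

```python
import math

ORDER = 3  # toy LM order N (assumption for illustration)


def logprob(context, word):
    """Hypothetical toy LM: score `word` given at most ORDER-1 words of context."""
    context = tuple(context)[-(ORDER - 1):]
    # Arbitrary deterministic formula; it only needs to depend on the context.
    return -(len(word) + 0.1 * sum(len(w) for w in context)) / (1 + len(context))


def score_sequence(words, context=()):
    """Score every word left to right, each with its preceding words as context."""
    total, history = 0.0, list(context)
    for w in words:
        total += logprob(history, w)
        history.append(w)
    return total


def score_phrase_internal(phrase):
    """Words at index >= ORDER-1 have full context inside the phrase itself,
    so they can be scored once, before the phrase is attached to a hypothesis."""
    return sum(
        logprob(phrase[i - (ORDER - 1):i], phrase[i])
        for i in range(ORDER - 1, len(phrase))
    )


def score_boundary(hyp_words, phrase):
    """The first ORDER-1 words need context from the hypothesis; this is the
    part computed when the phrase is applied (the 'adjustments' in this sketch)."""
    total, history = 0.0, list(hyp_words)
    for w in phrase[:ORDER - 1]:
        total += logprob(history, w)
        history.append(w)
    return total


hyp = ["the", "cat"]
phrase = ["sat", "on", "the", "mat", "today"]  # longer than ORDER

combined = score_sequence(hyp) + score_phrase_internal(phrase) + score_boundary(hyp, phrase)
direct = score_sequence(hyp + phrase)
print(math.isclose(combined, direct))
```

In this sketch the words after the first N-1 are never touched when the phrase is applied because their contribution was already added in `score_phrase_internal`; they are only needed afterwards to form the new LM state.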