Hi, LM stuff again!

I've created a language model with IRSTLM (release 5.70.04):

    tlm -tr=toy.sent_start_end.en -lm=msb -n=5 -o=toy.en.n5.lm

When I specify type 1 (IRSTLM) in moses.ini, it loads fine. But if I try to load it with KenLM, I get:

    The context of every 4-gram should appear as a 3-gram
    Byte: 471440
    File: /global/markov/raybauds/DATA/TOY/toy.en.n5.lm

Byte 471440 seems to be the '\n' between the following two lines:

    -1.16894 to support them . -0.0679314
    -0.836008 to deal with hamas

As a matter of fact, "to support them" does not appear as a trigram in the model. If I remove this 4-gram, the same problem arises with another one whose 3-gram prefix is also missing, so I think the missing prefixes are the problem.

If I change the smoothing method to "sb" instead of "msb", I get a usable LM.

Is this normal behavior? Do you think it's a KenLM or an IRSTLM related problem?

cheers,

-- 
Sylvain Raybaud
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
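
To list every n-gram affected by the missing-prefix condition at once, rather than removing them one by one, a small script along these lines could help. This is only a sketch and not KenLM's actual check: it assumes the file is a plain-text ARPA model whose `\N-grams:` sections appear in increasing order, and the names `find_missing_contexts` and `SAMPLE` are made up for illustration.

```python
def find_missing_contexts(lines):
    """Return (order, ngram) pairs whose first order-1 words are not
    listed as an n-gram of the previous order.

    Assumes a text ARPA file with sections in increasing order.
    """
    seen = {}     # order -> set of n-gram tuples read so far
    order = 0     # 0 means "not inside an N-grams section"
    missing = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            if line.endswith("-grams:"):
                # Section header such as "\3-grams:"
                order = int(line[1:-len("-grams:")])
                seen[order] = set()
            else:
                # "\data\" or "\end\"
                order = 0
            continue
        if order == 0:
            continue  # header counts such as "ngram 1=3"
        fields = line.split()
        # fields[0] is the log probability; an optional backoff weight
        # may follow the words, so take exactly `order` tokens.
        ngram = tuple(fields[1:1 + order])
        seen[order].add(ngram)
        if order > 1 and ngram[:-1] not in seen.get(order - 1, set()):
            missing.append((order, ngram))
    return missing


# Tiny hypothetical model: the bigram "to support" is deliberately absent.
SAMPLE = [
    "\\data\\",
    "ngram 1=3",
    "ngram 2=1",
    "ngram 3=1",
    "",
    "\\1-grams:",
    "-1.0 to -0.5",
    "-1.0 support -0.5",
    "-1.0 them -0.5",
    "",
    "\\2-grams:",
    "-0.5 support them -0.3",
    "",
    "\\3-grams:",
    "-0.2 to support them",
    "",
    "\\end\\",
]

for order, ngram in find_missing_contexts(SAMPLE):
    print(order, " ".join(ngram))
```

To run it on the real model, pass an open file instead of `SAMPLE`, for example `find_missing_contexts(open("toy.en.n5.lm", encoding="utf-8"))`; the encoding here is an assumption.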