Hi Lars,

The instructions you're looking for are here:
http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel

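Alternatively, build a KenLM binary file from your ARPA file and point
the KENLM line in the [feature] section of your moses.ini at it.  The
KENLM feature expects ARPA text or KenLM's own binary format, not SRILM's
binary format, which is why the .bin file written by ngram fails to load.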

$ kenlm/build_binary filename.arpa filename.binary
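
If you are unsure which format a given LM file is in, a rough check is to
look at its first non-empty line, which is what the error message below
reports.  The following is only a sketch: lm_format is a hypothetical helper
name of my own, not part of Moses, KenLM or SRILM, and it assumes that ARPA
files begin with \data\ and SRILM binaries with SRILM_BINARY_NGRAM, as your
log suggests.

```shell
# Hypothetical helper (not shipped with Moses, KenLM or SRILM):
# guess the format of an LM file from its first non-empty line.
lm_format() {
    # -a: treat binary data as text; -m 1: stop at the first match;
    # -v with the blank-line pattern: skip empty lines.
    first=$(grep -a -m 1 -v '^[[:space:]]*$' "$1")
    case "$first" in
        '\data\'*)           echo "arpa" ;;          # ARPA text model
        SRILM_BINARY_NGRAM*) echo "srilm-binary" ;;  # written by ngram -write-bin-lm
        *)                   echo "unknown" ;;       # anything else
    esac
}
```

Calling lm_format on the file named in path= should tell you whether the
KENLM line is being pointed at an SRILM binary.  If it is, point path= at
the ARPA file or at a binary produced by build_binary instead.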

Cheers,
Matthias


On Tue, 2014-05-27 at 11:57 +0200, Lars Bungum wrote:
> Hi,
> 
> I am a bit confused about how to configure the LM features correctly.
> 
> In my moses.ini this feature line is provided from running the script 
> train-model.perl with the lm parameters 0:3:$LMPATH (otherwise standard 
> parameters from the baseline system instructions).  I built the LM with 
> srilm.  With the text model I receive the following error message when 
> decoding:
> 
>      The ARPA file is missing <unk>.  Substituting log10 probability 
> -100.000
> 
> but it otherwise works.  However, when I compiled the LM with the command:
> 
>      ngram -order 3 -lm en-de.kn5.lm -write-bin-lm en-de.kn5.lm.bin
> 
> I receive the error message:
> 
>       Reading $PATH/en-de.kn5.lm.bin
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> ****************************************************************************************************
>       Exception: lm/read_arpa.cc:65 in void 
> lm::ReadARPACounts(util::FilePiece&, std::vector<long unsigned int>&) 
> threw FormatLoadException'.
>       first non-empty line was "SRILM_BINARY_NGRAM_002" not \data\. Byte: 23
> 
> This led me to try to figure out why, so I looked in my moses.ini 
> file.  Here the LM is configured with this line:
> 
>       KENLM lazyken=0 name=LM0 factor=0 path=$PATH/en-de.kn5.lm.bin order=3
> 
> and this is where I couldn't find out why.  Why is this feature named 
> KENLM?  And how do I know how to configure it?  Did I make a mistake in 
> running train-model somehow?  I guess intuitively I should configure it 
> with a line that is called SRILM that knows how to read this binary 
> format, but I was not able to find out how to do that.
> 
> Thanks
> //LB
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
> 



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

