First I'd check: does the file exist? It shouldn't be calling the ARPA reader (ArpaLmReader in your trace). That's for loading plain-text files. ".berkeleylm" files have been compiled into a special binary format that is more compact and can be read quickly. There is logic for determining which type of file it is, and I wonder if it is going astray. Or maybe the file is not what it says it is (can you "head" it)?
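If "head" isn't handy (for example because the decoder runs inside a Java application), the same check can be sketched in a few lines. This is only a rough heuristic under stated assumptions, not the detection logic Joshua or BerkeleyLM actually uses: the class name LmFileCheck, its methods, and the 4096-byte peek size are all made up for illustration, and it does not handle compressed files. It simply reports whether the file exists and whether the "\1-grams:" marker that the ARPA reader looks for shows up near the start.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of Joshua or BerkeleyLM. */
class LmFileCheck {

    /** How many leading bytes to inspect; an arbitrary choice for this sketch. */
    private static final int PEEK_BYTES = 4096;

    /**
     * Returns true if the start of the file contains the "\1-grams:" marker
     * that a plain-text ARPA file is expected to have right after its header.
     */
    static boolean looksLikeArpa(Path file) throws IOException {
        byte[] buf = new byte[PEEK_BYTES];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or EOF.
            int n;
            while (filled < buf.length
                    && (n = in.read(buf, filled, buf.length - filled)) != -1) {
                filled += n;
            }
        }
        // ISO-8859-1 maps every byte to a char, so binary input cannot fail to decode.
        String head = new String(buf, 0, filled, StandardCharsets.ISO_8859_1);
        return head.contains("\\1-grams:");
    }

    /** Summarizes what the file appears to be: missing, empty, ARPA text, or something else. */
    static String describe(Path file) throws IOException {
        if (!Files.isRegularFile(file)) {
            return "missing";
        }
        if (Files.size(file) == 0) {
            return "empty";
        }
        return looksLikeArpa(file) ? "arpa-text" : "not-arpa (possibly binary)";
    }
}
```

Calling LmFileCheck.describe(Paths.get("model/lm.berkeleylm")) before constructing the Decoder would then show whether the path resolves at all from the application's working directory, and whether the file looks like text or binary.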
matt

> On Oct 16, 2017, at 7:08 PM, kellen sunderland <kellen.sunderl...@gmail.com> wrote:
>
> The feature function initialization message is just a general-purpose
> exception handler. I’ve seen this quite often when language models fail to
> load. The most interesting part of the log to me is:
>
>> Caused by: java.lang.RuntimeException: Something wrong with I/O.
>>     at edu.berkeley.nlp.lm.io.ArpaLmReader.parseHeader(ArpaLmReader.java:114)
>>     at edu.berkeley.nlp.lm.io.ArpaLmReader.parse(ArpaLmReader.java:76)
>
> To me it looks like it could only be caused by the lack of the text
> "\\1-grams:" in the file you’re opening. Reference this function:
> https://github.com/smilli/berkeleylm/blob/master/src/edu/berkeley/nlp/lm/io/ArpaLmReader.java#L105
>
> Are you trying to load a binary LM with an ARPA reader by any chance? Do you
> have the quoted text in your text-based LM?
>
> -Kellen
>
> From: Tommaso Teofili
> Sent: Monday, October 16, 2017 4:09 PM
> To: dev@joshua.incubator.apache.org
> Subject: Re: problems with LM loading
>
> p.s.:
> I've tried with other LPs (e.g. sd-en) and I get the same ...
>
> On Mon, Oct 16, 2017 at 15:06, Tommaso Teofili <tommaso.teof...@gmail.com> wrote:
>
>> Hi all,
>>
>> I am trying to use the ES-EN language pack from our "Language Packs" page
>> with Joshua 6.1, but when I get to load the two language models I get an IO
>> exception.
>> The config looks like:
>>
>> feature-function = LanguageModel -lm_type berkeleylm -lm_order 4 -lm_file model/lm.berkeleylm
>> feature-function = Distortion
>> feature-function = LanguageModel -lm_type berkeleylm -lm_order 4 -lm_file model/en.giga.twopercent.4.lm.berkeleylm
>> feature-function = PhrasePenalty
>>
>> and I get the following:
>>
>> java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate feature function 'LanguageModel -lm_type berkeleylm -lm_order 4 -lm_file model/lm.berkeleylm'!
>>     ...
>> Caused by: java.lang.RuntimeException: Unable to instantiate feature function 'LanguageModel -lm_type berkeleylm -lm_order 4 -lm_file model/lm.berkeleylm'!
>>     at org.apache.joshua.decoder.Decoder.initializeFeatureFunctions(Decoder.java:642)
>>     at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:394)
>>     at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
>> Caused by: java.lang.reflect.InvocationTargetException: null
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>>     at org.apache.joshua.decoder.Decoder.initializeFeatureFunctions(Decoder.java:638)
>>     ... 58 common frames omitted
>> Caused by: java.lang.RuntimeException: Something wrong with I/O.
>>     at edu.berkeley.nlp.lm.io.ArpaLmReader.parseHeader(ArpaLmReader.java:114)
>>     at edu.berkeley.nlp.lm.io.ArpaLmReader.parse(ArpaLmReader.java:76)
>>     at edu.berkeley.nlp.lm.io.ArpaLmReader.parse(ArpaLmReader.java:18)
>>     at edu.berkeley.nlp.lm.io.LmReaders.firstPassCommon(LmReaders.java:549)
>>     at edu.berkeley.nlp.lm.io.LmReaders.firstPassArpa(LmReaders.java:526)
>>     at edu.berkeley.nlp.lm.io.LmReaders.readArrayEncodedLmFromArpa(LmReaders.java:171)
>>     at edu.berkeley.nlp.lm.io.LmReaders.readArrayEncodedLmFromArpa(LmReaders.java:151)
>>     at org.apache.joshua.decoder.ff.lm.berkeley_lm.LMGrammarBerkeley.<init>(LMGrammarBerkeley.java:94)
>>     at org.apache.joshua.decoder.ff.lm.LanguageModelFF.initializeLM(LanguageModelFF.java:158)
>>     at org.apache.joshua.decoder.ff.lm.LanguageModelFF.<init>(LanguageModelFF.java:132)
>>
>> Any hints on what I could be doing wrong? Encoding?
>> Did anyone else experience such an issue?
>>
>> BTW I am running this from within a Java application; Decoder is
>> initialized as follows:
>>
>> JoshuaConfiguration configuration = new JoshuaConfiguration();
>> configuration.readConfigFile(pathToJoshuaConfig);
>> configuration.use_structured_output = true;
>> Decoder decoder = new Decoder(configuration, pathToJoshuaConfig);
>>
>> Regards,
>> Tommaso