The tool at http://kheafield.com/code/filter.html filters ARPA files. It is a separate project from KenLM. I think most LM toolkits neither check nor require that the ARPA file is normalized. At least KenLM, IRSTLM, and SRILM work fine with filtered models.
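To make the idea concrete, here is a minimal Python sketch of vocabulary-based ARPA filtering. It is not the code of the filter linked above, and it is only an illustration under stated assumptions: the function name `filter_arpa` is hypothetical, the input is assumed to be a plain-text ARPA file already split into lines, and `<s>`, `</s>`, and `<unk>` are always kept. Log probabilities and backoff weights are copied unchanged, so nothing is renormalized; only the counts in the `\data\` header are recomputed.

```python
import re

# Hypothetical helper for illustration; not part of KenLM, SRILM, or IRSTLM.
def filter_arpa(lines, vocab):
    """Keep only n-grams whose words are all in vocab.

    Probabilities and backoff weights are copied as-is (no renormalization);
    the n-gram counts in the header are recomputed from what survives.
    """
    # Assumption: sentence markers and the unknown word are always retained.
    keep = set(vocab) | {"<s>", "</s>", "<unk>"}
    sections = {}  # n-gram order -> list of kept lines
    order = None
    for raw in lines:
        line = raw.rstrip("\n")
        stripped = line.strip()
        m = re.fullmatch(r"\\(\d+)-grams:", stripped)
        if m:
            order = int(m.group(1))
            sections[order] = []
            continue
        if stripped == "\\end\\":
            break
        if order is None or not stripped:
            continue  # skip the original header and blank lines
        fields = stripped.split()
        # Layout: log10 probability, `order` words, optional backoff weight.
        words = fields[1:1 + order]
        if all(w in keep for w in words):
            sections[order].append(line)

    out = ["\\data\\"]
    out += [f"ngram {n}={len(sections[n])}" for n in sorted(sections)]
    for n in sorted(sections):
        out += ["", f"\\{n}-grams:"] + sections[n]
    out += ["", "\\end\\"]
    return out
```

Because an n-gram is kept only when every word in it is in the vocabulary, any shorter n-gram contained in a kept n-gram is kept as well, so the backoff entries that the surviving n-grams rely on stay in the file.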
On 11/24/11 13:07, Miles Osborne wrote:
> this can be done, but it tends to not save much space. also it does
> not help deal with OOVs, which the language model can still score even
> though they are not in the parallel set.
>
> if you are worried about saving space then you should either look at
> KenLM or RandLM
>
> Miles
>
> On 24 November 2011 12:58, Thomas Schoenemann
> <thomas_schoenem...@yahoo.de> wrote:
>> Dear all,
>> I hope that this is not too stupid a question, and that it hasn't been
>> asked recently.
>> In the MOSES EMS, when running experiments the phrase table is automatically
>> reduced to only those phrases that actually occur in the respective dev/test
>> set. Obviously this saves a lot of memory without changing the resulting
>> translations.
>>
>> Now, I was wondering if something similar can be done/is done with the
>> language model. That is, can one reduce the ARPA-file to only those words
>> that occur on the target side in the (filtered) phrase table? The objective
>> would of course be to maintain the translation result. Would the LM-software
>> renormalize internally if some of the original entries are removed? Then the
>> results would differ.
>> This may even depend on what language model you use to load (rather than
>> train) the ARPA file. I am using SRILM in my own translation programs, but
>> would also be interested in other toolkits in case they behave more
>> suitably.
>>
>> Can anyone point me to anything?
>> Many thanks!
>> Thomas Schoenemann (currently University of Pisa)

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support