Hi,

On 28.04.2015 at 16:06, Hieu Hoang wrote:

People have also been using the Common Crawl corpus to build huge backoff LMs. They're very difficult to use, as they consume a lot of memory.

That's what I added pruning to KenLM for :) Also, if you combine that with some domain filtering, you get nice models from the Common Crawl data. You might need a couple of TB of free disk space, though.
Best,
Marcin
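
For anyone wanting to try the pruning mentioned above, here is a minimal sketch of how an lmplz call with count-based pruning could be assembled. The helper name lmplz_command and its default values are hypothetical, and the model order and thresholds are only illustrative; the flags -o, -S, -T and --prune are the ones lmplz documents. lmplz reads the training text on stdin and writes an ARPA file to stdout.

```python
import shlex


def lmplz_command(order, prune_thresholds, memory="50%", tmp_dir="/tmp"):
    """Build an argument list for KenLM's lmplz with count-based pruning.

    prune_thresholds holds one value per n-gram order, starting with unigrams.
    An n-gram is dropped when its count is less than or equal to the threshold
    for its order; the last value also applies to any remaining higher orders.
    """
    thresholds = list(prune_thresholds)
    if len(thresholds) > order:
        raise ValueError("more pruning thresholds than n-gram orders")
    # lmplz expects the thresholds to be non-decreasing.
    if thresholds != sorted(thresholds):
        raise ValueError("pruning thresholds must be non-decreasing")
    # Assumption: lmplz does not prune unigrams, so the first value stays 0.
    if thresholds and thresholds[0] != 0:
        raise ValueError("the unigram threshold must be 0")

    # -S limits sorting memory, -T sets the directory for temporary files.
    cmd = ["lmplz", "-o", str(order), "-S", memory, "-T", tmp_dir]
    if thresholds:
        cmd += ["--prune"] + [str(t) for t in thresholds]
    return cmd


if __name__ == "__main__":
    # Prune singleton trigrams and above in a 5-gram model.
    print(shlex.join(lmplz_command(5, [0, 0, 1])))
```

The printed command would then be run with the corpus redirected to stdin and the ARPA output redirected to a file. The temporary directory is where the large amount of free disk space is needed.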

On 25/04/2015 20:24, Alla Rozovskaya wrote:
Hello,

I have built an interpolated count-based LM on the Google Web N-gram corpus using the SRILM toolkit, as specified here: http://www.speech.sri.com/projects/srilm/manpages/srilm-faq.7.html

Is it possible to use it in Moses? In particular, since this model uses count files and a file specifying weights, what is the right way to specify the path in moses.ini?

Thank you,

Alla



_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu

