[Moses-support] KenLM memory usage
Hi there,

I want to run 4 MT servers at the same time on a machine with limited memory. KenLM seems to reserve the amount of memory that the language model would take if it were fully loaded into memory, so I don't have enough memory to run all these servers, and the machine grinds to a halt if I try. Is there any flag I could use to limit the amount of memory reserved?

Lexi

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] KenLM memory usage
I have found the answer on the KenLM web page and it seems to be working:

Full or lazy loading

KenLM supports lazy loading via mmap. This allows you to further reduce memory usage, especially with trie, which has good memory locality. In Moses, this is controlled by the language model number in moses.ini. Using language model number 8 will load the full model into memory (MAP_POPULATE on Linux and read() on other OSes). Language model number 9 will lazily load the model using mmap. I recommend fully loading if you have the RAM for it; it actually takes less time to load the full model and use it because the disk does not have to seek during decoding. Lazy loading works best with local disk and is not recommended for networked filesystems.

On Thu, Mar 20, 2014 at 2:32 PM, Alexandra Birch lexi.bi...@gmail.com wrote:
> Is there any flag I could use which would limit the amount of memory reserved?
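The difference between full and lazy loading described above can be illustrated outside of KenLM. The following is only a minimal sketch in Python, not KenLM's own code: the temporary file is a hypothetical stand-in for a binary model, `load_full` mimics the read() path, and `load_lazy` maps the file so that pages are fetched from disk only when they are first touched. MAP_POPULATE is not used here.

```python
import mmap
import os
import tempfile


def load_full(path):
    """Eager load: read the whole file into process memory up front."""
    with open(path, "rb") as f:
        return f.read()


def load_lazy(path):
    """Lazy load: map the file; pages are read from disk on first access."""
    with open(path, "rb") as f:
        # Python duplicates the descriptor, so the mapping outlives `f`.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


# Hypothetical stand-in for a binary language model file (1 MiB of bytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1 << 20))
    model_path = tmp.name

full = load_full(model_path)   # all bytes are resident immediately
lazy = load_lazy(model_path)   # nothing is read until a page is touched

offset = 123456
# Both views return the same bytes; only the time of disk access differs.
same_bytes = full[offset:offset + 16] == lazy[offset:offset + 16]
full_size = len(full)
print(same_bytes, full_size)

lazy.close()
os.remove(model_path)
```

With the lazy mapping, each lookup that touches a page not yet in memory may trigger a disk read, which is why random access during decoding can be slow on spinning or networked disks.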
Re: [Moses-support] KenLM memory usage
Hi,

since KenLM uses shared memory, four instances should take up the same amount of memory as a single instance (yesterday I ran 8 instances with 8 threads each, with a 99 GB LM on a 128 GB machine). If the model fits into memory for a single instance, it should work as long as you have enough memory left for all the phrase tables and the translation process itself (I guess this is actually the problem).

Lazy loading was unbearably slow for me with the above-mentioned configuration, but I was using 64 threads in total, so there was a lot of concurrent disk access happening; no wonder.

Best,
Marcin

On 20.03.2014 14:35, Alexandra Birch wrote:
> I have found the answer on the kenlm web page and it seems to be working: [...]
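The memory argument above comes down to simple arithmetic: a model file loaded by several processes is counted once, while distinct models add up. The sketch below is a rough back-of-the-envelope helper under that assumption. The function name, the file names, and the 32 GB / 10 GB figures in the second case are hypothetical; only the 99 GB model on a 128 GB machine comes from the thread.

```python
def ram_headroom_gb(total_ram_gb, model_per_process, size_gb):
    """RAM left after loading LMs, counting each distinct model file once.

    model_per_process lists the model file each process loads; size_gb maps
    a model file to its size in GB. Phrase tables and decoder state are not
    included and must fit in the returned headroom.
    """
    distinct_models = set(model_per_process)
    lm_total = sum(size_gb[m] for m in distinct_models)
    return total_ram_gb - lm_total


# Case from the thread: 8 processes share one 99 GB model on a 128 GB machine.
shared = ram_headroom_gb(128, ["lm.binlm"] * 8, {"lm.binlm": 99})

# Hypothetical case: 4 processes, each with its own 10 GB model, 32 GB of RAM.
sizes = {"de.binlm": 10, "fr.binlm": 10, "es.binlm": 10, "cs.binlm": 10}
separate = ram_headroom_gb(32, list(sizes), sizes)

print(shared, separate)
```

A negative headroom means the models alone exceed physical memory, before phrase tables or the translation process are accounted for.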
Re: [Moses-support] KenLM memory usage
Hi Lexi,

I take it that these models are different, not the same model loaded into each process (in which case they would have been shared). I'd really recommend trying to compress things more (e.g. trie -a 64 -q 8) before going to lazy loading.

Kenneth

On 03/20/14 08:13, Marcin Junczys-Dowmunt wrote:
> since KenLM uses shared memory, four instances should take up the same amount of memory as only one instance [...]
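For reference, the compression settings suggested above could be passed to KenLM's build_binary tool roughly as follows. This is a sketch that only assembles the command line and does not run it. The file names are hypothetical, the helper is not part of KenLM, and placing the options before the model type is an assumption; check the usage message printed by your own build_binary for the accepted order and the exact meaning of each flag.

```python
import shlex


def build_binary_argv(arpa_path, binary_path, array_bits=64, quant_bits=8):
    """Assemble a build_binary command for a compressed trie model.

    Assumption: options precede the model type, as in the tool's usage text.
    """
    return [
        "build_binary",
        "-a", str(array_bits),  # maximum bits for array pointer compression
        "-q", str(quant_bits),  # number of quantization bits
        "trie",                 # trie data structure instead of probing
        arpa_path,
        binary_path,
    ]


# Hypothetical input and output file names.
argv = build_binary_argv("model.arpa", "model.binlm")
command_line = shlex.join(argv)
print(command_line)
```

The resulting list can be handed to `subprocess.run(argv, check=True)` once the paths point at real files.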
Re: [Moses-support] KenLM memory usage
Hi Ken,

Yes, different models, different languages. Thanks! And yes, lazy loading is absolutely dead slow.

Lexi

On Thu, Mar 20, 2014 at 4:53 PM, Kenneth Heafield mo...@kheafield.com wrote:
> I take it that these models are different, not the same model loaded into each process [...]