Hi, great - that will be very useful.
Since you just ran the comparison - do you have any numbers on "still allowed everything to fit into memory", i.e., how much more memory is used by running parallel instances?

-phi

On Mon, Oct 5, 2015 at 10:15 AM, Michael Denkowski <michael.j.denkow...@gmail.com> wrote:

> Hi all,
>
> Like some other Moses users, I noticed diminishing returns from running
> Moses with several threads. To work around this, I added a script to run
> multiple single-threaded instances of moses instead of one multi-threaded
> instance. In practice, this sped things up by about 2.5x for 16 cpus and
> using memory mapped models still allowed everything to fit into memory.
>
> If anyone else is interested in using this, you can prefix a moses command
> with scripts/generic/multi_moses.py. To use multiple instances in
> mert-moses.pl, specify --multi-moses and control the number of parallel
> instances with --decoder-flags='-threads N'.
>
> Below is a benchmark on WMT fr-en data (2M training sentences, 400M words
> mono, suffix array PT, compact reordering, 5-gram KenLM) testing default
> stack decoding vs cube pruning without and with the parallelization script
> (+multi):
>
> ---
> 1cpu    sent/sec
> stack       1.04
> cube        2.10
> ---
> 16cpu   sent/sec
> stack       7.63
> +multi     12.20
> cube        7.63
> +multi     18.18
> ---
>
> --Michael
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
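
For anyone reading along, the idea in the quoted message (several single-threaded instances instead of one multi-threaded instance) can be sketched roughly as below. This is not the code in scripts/generic/multi_moses.py, which I have not reproduced here; it is a minimal illustration that assumes the decoder reads one sentence per line on stdin and writes exactly one line per sentence on stdout. The function names (split_round_robin, run_shard, run_parallel) are made up for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_round_robin(lines, n):
    """Deal lines into n shards, keeping each line's original index."""
    shards = [[] for _ in range(n)]
    for i, line in enumerate(lines):
        shards[i % n].append((i, line))
    return shards


def run_shard(cmd, shard):
    """Pipe one shard through a single process and pair outputs with indices.

    Assumption: the command emits exactly one output line per input line.
    """
    if not shard:
        return []
    # Normalise line endings so sentences never run together.
    text = "".join(line.rstrip("\n") + "\n" for _, line in shard)
    done = subprocess.run(cmd, input=text, capture_output=True,
                          text=True, check=True)
    out = done.stdout.splitlines(keepends=True)
    if len(out) != len(shard):
        raise RuntimeError("expected one output line per input line")
    return [(i, o) for (i, _), o in zip(shard, out)]


def run_parallel(cmd, lines, instances):
    """Run `instances` copies of cmd side by side; return output in input order."""
    shards = split_round_robin(lines, instances)
    # Threads only wait on child processes here; the real work happens
    # in the separate single-threaded processes.
    with ThreadPoolExecutor(max_workers=instances) as pool:
        parts = list(pool.map(lambda s: run_shard(cmd, s), shards))
    merged = sorted(pair for part in parts for pair in part)
    return [o for _, o in merged]
```

With a real decoder, cmd would be the moses command line as a list with -threads 1, and lines the source sentences; any config path you pass is your own. Each instance is a separate process, so total memory use depends on how much of the model data the instances can share, which is what the question above is asking about.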