Hi, with regard to pruning --- the example EMS config files have

[TRAINING]
score-settings = "--GoodTuring --MinScore 2:0.0001"

which carries out threshold pruning during phrase table construction, going a
good way towards avoiding too many translation options per phrase.

-phi

On Mon, Oct 5, 2015 at 11:08 AM, Barry Haddow <bhad...@inf.ed.ac.uk> wrote:

> Hi Hieu
>
> That's exactly why I took to pre-pruning the phrase table, as I mentioned
> on Friday. I had something like 750,000 translations of the most common
> word, and it took half-an-hour to get the first sentence translated.
>
> cheers - Barry
>
> On 05/10/15 15:48, Hieu Hoang wrote:
>
> what pt implementation did you use, and had it been pre-pruned so that
> there's a limit on how many target phrase for a particular source phrase?
> ie. don't have 10,000 entries for 'the' .
>
> I've been digging around multithreading in the last few weeks. I've
> noticed that the compact pt is VERY bad at handling unpruned pt.
>
> Cores            1    5   10   15   20   25
> Unpruned
>   compact pt   143   42   32   38   52   62
>   probing pt   245   58   33   25   24   21
> Pruned
>   compact pt   119   24   15   10   10   10
>   probing pt   117   25   25   10   10   10
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 5 October 2015 at 15:15, Michael Denkowski <
> michael.j.denkow...@gmail.com> wrote:
>
>> Hi all,
>>
>> Like some other Moses users, I noticed diminishing returns from running
>> Moses with several threads. To work around this, I added a script to run
>> multiple single-threaded instances of moses instead of one multi-threaded
>> instance. In practice, this sped things up by about 2.5x for 16 cpus and
>> using memory mapped models still allowed everything to fit into memory.
>>
>> If anyone else is interested in using this, you can prefix a moses
>> command with scripts/generic/multi_moses.py. To use multiple instances in
>> mert-moses.pl, specify --multi-moses and control the number of parallel
>> instances with --decoder-flags='-threads N'.
>>
>> Below is a benchmark on WMT fr-en data (2M training sentences, 400M words
>> mono, suffix array PT, compact reordering, 5-gram KenLM) testing default
>> stack decoding vs cube pruning without and with the parallelization script
>> (+multi):
>>
>> ---
>> 1cpu    sent/sec
>> stack       1.04
>> cube        2.10
>> ---
>> 16cpu   sent/sec
>> stack       7.63
>> +multi     12.20
>> cube        7.63
>> +multi     18.18
>> ---
>>
>> --Michael
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
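
For anyone who wants to try the kind of pre-pruning discussed in this thread (a cap on the number of target phrases per source phrase), here is a minimal Python sketch. It is not part of Moses and not what anyone above actually ran. It assumes a plain-text phrase table with " ||| "-separated fields, entries for the same source phrase on adjacent lines, and the direct translation probability in the third score column (index 2); check those assumptions against your own table. The function name, the limit of 20 and the score index are all made up for illustration.

```python
import heapq
from itertools import groupby


def parse_line(line):
    """Split one phrase-table line into (source phrase, list of scores).

    Assumes the text format 'source ||| target ||| scores ||| ...'.
    """
    fields = line.rstrip("\n").split(" ||| ")
    source = fields[0]
    scores = [float(s) for s in fields[2].split()]
    return source, scores


def prune_phrase_table(lines, limit=20, score_index=2):
    """Yield at most `limit` entries per source phrase, best score first.

    Relies on all entries for one source phrase being adjacent in `lines`
    (assumed here; sort the table first if that does not hold).
    `score_index` selects the score column used for ranking; index 2 is
    an assumption, not something read from the table.
    """
    parsed = ((parse_line(line), line) for line in lines)
    for _, group in groupby(parsed, key=lambda item: item[0][0]):
        best = heapq.nlargest(
            limit, group, key=lambda item: item[0][1][score_index]
        )
        for _, line in best:
            yield line
```

To use it on a real table, open the (decompressed) file, pass the file object as `lines`, and write the yielded lines to a new file. Only one source phrase's entries are held in memory at a time, which matters when a single word has hundreds of thousands of translations.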
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support