After many tests, as mentioned before, I made these changes in EMS:
score-settings = "--GoodTuring --MinScore 2:0.001"
and
cube pruning pop limit set to 400 (instead of 5000 in EMS!)
Speed is much higher, without any impact on translation.
On 05/10/2015 17:20, Philipp Koehn wrote:
Hi,
with regard to pruning ---
the example EMS config files have
[TRAINING]
score-settings = "--GoodTuring --MinScore 2:0.0001"
which carries out threshold pruning during phrase table construction,
going a good way towards avoiding too many translation options per phrase.
-phi
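To make the threshold pruning described above concrete, here is a minimal Python sketch of the idea: drop every phrase pair whose score in a given field falls below a threshold. It is not the Moses scorer. The function names, the toy table, and the reading of the field number as a 0-based index into the score column are all assumptions for illustration.

```python
def parse_min_score(spec):
    """Parse a 'FIELD:THRESHOLD' spec such as '2:0.0001'."""
    field, threshold = spec.split(":")
    return int(field), float(threshold)


def threshold_prune(lines, spec):
    """Yield only the phrase pairs whose chosen score reaches the threshold.

    Assumes 'source ||| target ||| scores' lines and a 0-based field index
    into the score column (an assumption, not taken from the Moses code).
    """
    field, threshold = parse_min_score(spec)
    for line in lines:
        columns = [c.strip() for c in line.split("|||")]
        scores = [float(s) for s in columns[2].split()]
        if scores[field] >= threshold:
            yield line


# Toy phrase table (made-up numbers, for illustration only).
table = [
    "the ||| le ||| 0.5 0.4 0.45 0.3",
    "the ||| la ||| 0.3 0.2 0.35 0.2",
    "the ||| chat ||| 0.001 0.001 0.00005 0.001",
]
kept = list(threshold_prune(table, "2:0.0001"))
```

With this toy table, the third entry is dropped because its score in field 2 is below 0.0001, while the other two are kept.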
On Mon, Oct 5, 2015 at 11:08 AM, Barry Haddow <bhad...@inf.ed.ac.uk> wrote:
Hi Hieu
That's exactly why I took to pre-pruning the phrase table, as I
mentioned on Friday. I had something like 750,000 translations of
the most common word, and it took half-an-hour to get the first
sentence translated.
cheers - Barry
On 05/10/15 15:48, Hieu Hoang wrote:
What pt implementation did you use, and had it been pre-pruned so
that there's a limit on how many target phrases there are for a
particular source phrase? I.e., don't have 10,000 entries for 'the'.
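As a rough illustration of that kind of pre-pruning, the Python sketch below keeps only the best N target phrases for each source phrase. It is an assumed, simplified version, not the tool anyone in this thread used. It expects 'source ||| target ||| scores' lines already grouped by source phrase, and the ranking field (index 2) and function names are hypothetical.

```python
import heapq
from itertools import groupby


def source_of(line):
    """Return the source phrase of a 'source ||| target ||| scores' line."""
    return line.split("|||")[0].strip()


def score_of(line, field=2):
    """Return one score from the score column (0-based field, an assumption)."""
    return float(line.split("|||")[2].split()[field])


def limit_translations(lines, max_per_source, field=2):
    """Keep at most max_per_source best-scoring targets per source phrase.

    Assumes entries for the same source phrase are adjacent in the input.
    """
    for _, group in groupby(lines, key=source_of):
        yield from heapq.nlargest(
            max_per_source, group, key=lambda line: score_of(line, field)
        )


# Toy phrase table (made-up numbers, for illustration only).
table = [
    "cat ||| chat ||| 0.9 0.8 0.9 0.8",
    "the ||| le ||| 0.5 0.4 0.45 0.3",
    "the ||| la ||| 0.3 0.2 0.35 0.2",
    "the ||| les ||| 0.1 0.1 0.15 0.1",
]
pruned = list(limit_translations(table, max_per_source=2))
```

Here 'the' keeps its two best entries and 'cat' keeps its single entry, so a very frequent source word can no longer contribute thousands of translation options.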
I've been digging around multithreading in the last few weeks.
I've noticed that the compact pt is VERY bad at handling unpruned
pt.
                        Cores
                        1     5     10    15    20    25
Unpruned  compact pt    143   42    32    38    52    62
          probing pt    245   58    33    25    24    21
Pruned    compact pt    119   24    15    10    10    10
          probing pt    117   25    25    10    10    10
Hieu Hoang
http://www.hoang.co.uk/hieu
On 5 October 2015 at 15:15, Michael Denkowski
<michael.j.denkow...@gmail.com> wrote:
Hi all,
Like some other Moses users, I noticed diminishing returns
from running Moses with several threads. To work around
this, I added a script to run multiple single-threaded
instances of moses instead of one multi-threaded instance. In
practice, this sped things up by about 2.5x for 16 cpus and
using memory mapped models still allowed everything to fit
into memory.
If anyone else is interested in using this, you can prefix a
moses command with scripts/generic/multi_moses.py. To use
multiple instances in mert-moses.pl,
specify --multi-moses and control the number of parallel
instances with --decoder-flags='-threads N'.
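The general idea of running several single-threaded instances can be sketched in Python as follows. This is not the code of multi_moses.py, only a minimal illustration under stated assumptions: the command reads one sentence per line on stdin and writes exactly one output line per input line. A stand-in command that upper-cases its input replaces the real decoder, and all function names are hypothetical.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_instance(command, sentences):
    """Feed one chunk to one process on stdin and return its output lines."""
    if not sentences:
        return []
    result = subprocess.run(
        command,
        input="\n".join(sentences) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def run_parallel(command, sentences, instances):
    """Split the input round-robin over several processes and merge in order."""
    chunks = [sentences[i::instances] for i in range(instances)]
    with ThreadPoolExecutor(max_workers=instances) as pool:
        outputs = list(pool.map(lambda chunk: run_instance(command, chunk), chunks))
    merged = [None] * len(sentences)
    for i, out in enumerate(outputs):
        # Relies on one output line per input line (assumption).
        merged[i::instances] = out
    return merged


# Stand-in for a single-threaded decoder: upper-cases each input line.
STAND_IN = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin: print(line.strip().upper())",
]
translated = run_parallel(STAND_IN, ["a b", "c d", "e f", "g h", "i j"], instances=2)
```

Each process only sees its own chunk, so the instances share no locks with each other. Reassembling by position keeps the output in the original sentence order.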
Below is a benchmark on WMT fr-en data (2M training
sentences, 400M words mono, suffix array PT, compact
reordering, 5-gram KenLM) testing default stack decoding vs
cube pruning without and with the parallelization script
(+multi):
---
1cpu sent/sec
stack 1.04
cube 2.10
---
16cpu sent/sec
stack 7.63
+multi 12.20
cube 7.63
+multi 18.18
---
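For reference, the ratios implied by these figures can be worked out with a few lines of Python. The numbers are copied from the table above. The dictionary keys and the "efficiency" measure (16-cpu throughput divided by 16 times the 1-cpu throughput) are my own framing, not something stated in the original message.

```python
# Sentences per second, copied from the benchmark above.
sent_per_sec = {
    ("stack", 1): 1.04,
    ("cube", 1): 2.10,
    ("stack", 16): 7.63,
    ("stack+multi", 16): 12.20,
    ("cube", 16): 7.63,
    ("cube+multi", 16): 18.18,
}


def speedup(new, old):
    """Throughput ratio between two benchmark configurations."""
    return sent_per_sec[new] / sent_per_sec[old]


# Gain from the parallelization script at 16 cpus.
multi_gain_stack = speedup(("stack+multi", 16), ("stack", 16))
multi_gain_cube = speedup(("cube+multi", 16), ("cube", 16))

# Parallel efficiency: share of ideal 16x scaling actually achieved.
efficiency_cube_threads = sent_per_sec[("cube", 16)] / (16 * sent_per_sec[("cube", 1)])
efficiency_cube_multi = sent_per_sec[("cube+multi", 16)] / (16 * sent_per_sec[("cube", 1)])

print(f"stack +multi gain: {multi_gain_stack:.2f}x")
print(f"cube  +multi gain: {multi_gain_cube:.2f}x")
print(f"cube efficiency, threads: {efficiency_cube_threads:.0%}")
print(f"cube efficiency, multi:   {efficiency_cube_multi:.0%}")
```

On these figures the script gives roughly 1.6x for stack decoding and roughly 2.4x for cube pruning at 16 cpus, the latter being in line with the "about 2.5x" quoted above.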
--Michael
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu <mailto:Moses-support@mit.edu>
http://mailman.mit.edu/mailman/listinfo/moses-support
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.