Hi Mark & John,
I wrote a script to parallelize the extract (step 5):
scripts/generic/extract-parallel.perl
I haven't integrated it into train-model.perl due to lack of time.
Step 6 can also be parallelized, but you need to split the data a little
more carefully. Attached is the script that
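
To illustrate what a more careful split for step 6 might look like, here is a rough sketch in Python. It is not the attached script and not part of train-model.perl. It assumes the extract file is already sorted and that each line begins with a phrase followed by " ||| ", so that all lines sharing that phrase have to land in the same chunk for the scores to be computed over complete groups. The function name and the max_chunk_lines parameter are made up for illustration.

```python
from itertools import groupby


def split_sorted_extract(lines, max_chunk_lines):
    """Split sorted extract lines into chunks for parallel scoring.

    Assumption: lines are sorted and look like "phrase ||| ...".
    Lines sharing the same leading phrase are never separated, so a
    chunk may exceed max_chunk_lines if a single phrase has more
    occurrences than that.
    """
    chunks = []
    current = []
    # groupby yields runs of consecutive lines with the same leading phrase
    for _, group in groupby(lines, key=lambda line: line.split(" ||| ", 1)[0]):
        group = list(group)
        # start a new chunk rather than cutting a phrase group in half
        if current and len(current) + len(group) > max_chunk_lines:
            chunks.append(current)
            current = []
        current.extend(group)
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be scored by a separate process and the outputs concatenated in order, assuming the scorer only needs to see all occurrences of a phrase together.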
Mark Fishel wrote:
> the "--parallel" switch of the train-model.perl script is only
> effective during the first 2 steps -- is there a good reason not to
> make the phrase scoring (step 6) parallel? Currently it contains a
> 'for my $direction ("f2e","e2f")...', and on a large corpus the
> scoring
Dear list,
the "--parallel" switch of the train-model.perl script is only
effective during the first 2 steps -- is there a good reason not to
make the phrase scoring (step 6) parallel? Currently it contains a
'for my $direction ("f2e","e2f")...', and on a large corpus the
scoring can take quite lo