Phrase-table size in GB (.gz) and BLEU score for each threshold:
0.0001 => 18.6 GB => BLEU 29.59
0.001 => 16.4 GB => BLEU 29.67
0.01 => 13.5 GB => BLEU 29.38
I have not been able to test the speed yet, because my reference system
with the first table is in Compact mode and it is taking forever to
binarize the other two.
Given these sizes, I doubt there will be a major gain.
On 31/08/2015 16:44, Philipp Koehn wrote:
Hi,
0.0001 should have no impact on translation quality,
0.001 will have some impact
0.01 is probably a bit too drastic.
But that's the range you should explore.
-phi
On Mon, Aug 31, 2015 at 10:33 AM, Vincent Nguyen <vngu...@neuf.fr> wrote:
Is there any benchmark showing which value has what impact?
What should I start with as a test, 0.001?
The standard value of 0.0001 seems really low to me.
Maybe I am not getting what exactly this probability refers to.
where FIELDn is the position of the score (typically 2 for the
direct phrase probability p(e|f), or 0 for the indirect phrase
probability p(f|e)) and THRESHOLD is the minimum probability
allowed. A good setting is 2:0.0001, which removes all rules
where the direct phrase translation probability is below 0.0001.
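To make the FIELD:THRESHOLD semantics concrete, here is a minimal Python sketch of threshold pruning on a text phrase table. It is not a port of scripts/training/threshold-filter.perl; the function name and the sample entries are made up for illustration, and it assumes the usual "source ||| target ||| scores" layout with four scores, where index 2 is p(e|f).

```python
from typing import Iterable, Iterator


def filter_phrase_table(
    lines: Iterable[str], field: int = 2, threshold: float = 0.0001
) -> Iterator[str]:
    """Yield only the phrase-table lines whose score at `field` is >= threshold.

    Assumption: columns are separated by " ||| " and the third column holds
    the space-separated feature scores.
    """
    for line in lines:
        columns = line.rstrip("\n").split(" ||| ")
        if len(columns) < 3:
            continue  # skip lines that do not look like phrase-table entries
        scores = [float(value) for value in columns[2].split()]
        if scores[field] >= threshold:
            yield line


# Hypothetical entries, not taken from a real model.
sample = [
    "das haus ||| the house ||| 0.8 0.7 0.9 0.6\n",
    "das haus ||| that shack ||| 0.001 0.002 0.00005 0.001\n",
]

kept = list(filter_phrase_table(sample, field=2, threshold=0.0001))
for entry in kept:
    print(entry, end="")
```

With 2:0.0001 the second entry is dropped because its p(e|f) of 0.00005 is below the threshold; a larger threshold such as 0.01 only removes more entries of the same kind.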
On 31/08/2015 16:14, Philipp Koehn wrote:
Hi,
I would suspect that with beam sizes <500 the bulk of the time is
spent on translation option collection, not decoding. You could speed
that up with tighter threshold pruning of the phrase table.
See the script scripts/training/threshold-filter.perl or the setting
score-settings = "--MinScore 2:0.0001"
in EMS.
-phi
On Mon, Aug 31, 2015 at 3:03 AM, Vincent Nguyen <vngu...@neuf.fr> wrote:
Hi,
Here are some results for several values of the cube pruning pop limit
(pop limit / decoding time for 3000 sentences / BLEU score):
5000 - 15m45 - 29.59
1000 - 4m27 - 29.59
500 - 3m35 - 29.59
200 - 3m15 - 29.51
100 - 3m00 - 29.40
I therefore settled on 400 - 3m19 - 29.58
If I am not mistaken, the default value for Moses is 1000 [read in the
doc], but in the EMS it is currently 5000, which makes the experiments
take so long.
I suggest changing the EMS default value.
Is there a way to also use a cube pruning limit in the decoder at
tuning time?
Now with this optimized setting I get a rate of 15 segments per second
on average.
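As a quick check of that figure, the timings above can be converted into throughput. This is only a sketch: the helper names are made up, and the durations are assumed to be wall-clock times in minutes and seconds for the whole 3000-sentence test set.

```python
SENTENCES = 3000

# pop limit -> decoding time, copied from the results above
runs = {
    5000: "15m45",
    1000: "4m27",
    500: "3m35",
    400: "3m19",
    200: "3m15",
    100: "3m00",
}


def to_seconds(duration: str) -> int:
    """Convert a duration written like '3m19' into seconds."""
    minutes, seconds = duration.split("m")
    return int(minutes) * 60 + int(seconds)


def segments_per_second(duration: str, sentences: int = SENTENCES) -> float:
    """Average number of sentences decoded per second."""
    return sentences / to_seconds(duration)


for pop_limit, duration in runs.items():
    rate = segments_per_second(duration)
    print(f"{pop_limit:>5}  {duration:>6}  {rate:5.1f} seg/s")
```

For the pop limit of 400, 3m19 is 199 seconds, so 3000 / 199 gives about 15.1 segments per second, which agrees with the figure quoted above. Going from 5000 to 400 cuts the time from 945 to 199 seconds at a cost of 0.01 BLEU.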
What is the reason online tools like Google / Bing are so much faster?
It's not a machine issue, is it?
Cheers
Vincent
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu <mailto:Moses-support@mit.edu>
http://mailman.mit.edu/mailman/listinfo/moses-support