Phrase-table size in GB (.gz) and BLEU for each pruning threshold:
0.0001 => 18.6 GB => BLEU 29.59
0.001 => 16.4 GB => BLEU 29.67
0.01 => 13.5 GB => BLEU 29.38

I have not been able to test the speed yet, because my reference system (the first one) is in Compact mode and it takes forever to binarize the other two.
Given these sizes, I doubt there will be a major gain.


On 31/08/2015 16:44, Philipp Koehn wrote:
Hi,

0.0001 should have no impact on translation quality,
0.001 will have some impact
0.01 is probably a bit too drastic.

But that's the range you should explore.

-phi

On Mon, Aug 31, 2015 at 10:33 AM, Vincent Nguyen <vngu...@neuf.fr> wrote:

    Is there any benchmark showing which value has what impact?
    What should I start with as a test, 0.001?

    The standard value of 0.0001 seems really low to me.
    Maybe I am not understanding exactly what this probability refers to.



    where FIELDn is the position of the score (typically 2 for the
    direct phrase probability p(e|f), or 0 for the indirect phrase
    probability p(f|e)) and THRESHOLD the maximum probability
    allowed. A good setting is 2:0.0001, which removes all rules,
    where the direct phrase translation probability is below 0.0001.
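
A minimal sketch of what such a threshold filter does, to make the FIELDn:THRESHOLD setting concrete. This is not the code of threshold-filter.perl; it assumes the usual plain-text phrase-table layout (source ||| target ||| scores ...), and the function names and the two sample rules are invented for illustration.

```python
def keep_rule(line, field=2, threshold=0.0001):
    """Return True if the score at position `field` is at least `threshold`.

    Assumes fields are separated by '|||' and that the third field holds
    the space-separated scores (position 2 being p(e|f), 0 being p(f|e)).
    """
    parts = [p.strip() for p in line.split("|||")]
    scores = parts[2].split()
    return float(scores[field]) >= threshold


def filter_table(lines, field=2, threshold=0.0001):
    """Keep only the rules that pass the threshold test."""
    return [ln for ln in lines if keep_rule(ln, field, threshold)]


# Two made-up rules: the second has p(e|f) = 0.00005, below 0.0001.
sample = [
    "das haus ||| the house ||| 0.8 0.7 0.6 0.5",
    "das haus ||| the home ||| 0.01 0.02 0.00005 0.03",
]

kept = filter_table(sample)
for rule in kept:
    print(rule)
```

With field 2 and threshold 0.0001 only the first rule survives; filtering on field 0 instead would keep both, since both indirect probabilities are above the threshold.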



    On 31/08/2015 16:14, Philipp Koehn wrote:
    Hi,

    I would suspect that with beam sizes <500 the bulk of the time is
    spent on translation option collection, not decoding. You could speed
    that up with tighter threshold pruning of the phrase table.

    See the script scripts/training/threshold-filter.perl or the setting
    score-settings = "--MinScore 2:0.0001"
    in EMS.

    -phi

    On Mon, Aug 31, 2015 at 3:03 AM, Vincent Nguyen <vngu...@neuf.fr> wrote:

        Hi,

        Here are some results for several values of the cube pruning
        pop limit:

        (pop limit / decoding time for 3000 sentences / BLEU score)

        5000 - 15m45 - 29.59
        1000 - 4m27 - 29.59
        500 - 3m35 - 29.59
        200 - 3m15 - 29.51
        100 - 3m00 - 29.40

        Therefore I took 400 - 3m19 - 29.58

        If I am not mistaken, the default value in Moses is 1000
        (according to the documentation), but in the EMS it is
        currently 5000, which makes experiments take so long.
        I suggest changing the EMS default value.

        Is there a way to also use a cube pruning limit in the
        decoder at tuning time?

        With this optimized setting I now get a rate of 15 segments
        per second on average.
        Why are online tools like Google / Bing so much faster?
        It's not a machine issue, is it?
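
The figure of about 15 segments per second can be checked against the timings listed above (3000 sentences per run). The sketch below only redoes that arithmetic; the helper names are made up for illustration.

```python
SENTENCES = 3000  # test-set size stated in the message

# (pop limit, decoding time, BLEU) as reported above
runs = [
    (5000, "15m45", 29.59),
    (1000, "4m27", 29.59),
    (500, "3m35", 29.59),
    (400, "3m19", 29.58),
    (200, "3m15", 29.51),
    (100, "3m00", 29.40),
]


def to_seconds(mmss):
    """Convert a string such as '3m19' into a number of seconds."""
    minutes, seconds = mmss.split("m")
    return int(minutes) * 60 + int(seconds)


def throughput(mmss, sentences=SENTENCES):
    """Segments decoded per second for one run."""
    return sentences / to_seconds(mmss)


for pop, elapsed, bleu in runs:
    print(f"{pop:>5}  {elapsed:>6}  {throughput(elapsed):5.1f} seg/s  BLEU {bleu:.2f}")
```

For the chosen pop limit of 400, 3m19 is 199 seconds, so 3000 / 199 gives roughly 15 segments per second, consistent with the number quoted.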


        Cheers
        Vincent

        _______________________________________________
        Moses-support mailing list
        Moses-support@mit.edu <mailto:Moses-support@mit.edu>
        http://mailman.mit.edu/mailman/listinfo/moses-support




