Hi - I trained a phrase-based system from a low-resource language to
English, and the moses.ini generated at the end of tuning reports a BLEU
score of *13.6633*. However, when I tested on the same dev set and computed
BLEU against the English side of that dev set, I only got *3.69*. I then
did a manual grid search over the parameter space in that moses.ini and got
a BLEU of *3.77* at best. Both recasing and tokenization were applied to
the dev set I computed BLEU on.
I'm wondering why the BLEU score reported in moses.ini, which is derived
from the dev set, doesn't match the one I computed on the same dev set.

Thanks.

- Angli