Hi - I trained a phrase-based system from a low-resource language to
English, and the moses.ini generated at the end of tuning reports a BLEU
score of *13.6633*. However, when I tested on the same dev set and computed
BLEU against the English side of that dev set, I only got *3.69*. I then
did a manual grid search over the parameter space in that moses.ini and
got a BLEU of *3.77* at best. Both recasing and tokenization were applied
to the dev set I computed BLEU on.
I'm wondering what could cause the BLEU score reported in moses.ini, which
is derived from the dev set, to differ so much from the one I computed on
the same dev set.
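
To make the comparison concrete, here is a minimal sketch of corpus-level
BLEU in Python. It is not the scoring script shipped with Moses; it assumes
a single reference, whitespace-split tokens, no smoothing, and made-up
example sentences. It only illustrates how a casing or tokenization
mismatch between output and reference lowers the score by itself.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU as a percentage (single reference, no smoothing)."""
    matched = [0] * max_n
    total = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipped matches: each hypothesis n-gram counts at most as
            # often as it occurs in the reference.
            matched[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            total[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matched) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty for output shorter than the reference.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_precision)


# Hypothetical one-sentence "corpus": tokenized, true-cased reference.
refs = ["The cat sat on the mat ."]
identical = corpus_bleu(["The cat sat on the mat ."], refs)
lowercased = corpus_bleu(["the cat sat on the mat ."], refs)
untokenized = corpus_bleu(["The cat sat on the mat."], refs)
print(identical, lowercased, untokenized)
```

The same words score lower as soon as the casing or the tokenization of the
output differs from that of the reference, so both sides need identical
preprocessing for the two BLEU numbers to be comparable.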


- Angli
Moses-support mailing list
