Subject: Re: [Moses-support] BLEU score difference about 0.13 for one
dataset is normal?
Yes. Each tuning with the same test set will give you small variations
in the final BLEU. Yours looks like they're in a normal range.
Date: Sun, 11 Oct 2015 04:23:56 +
From: Davood Mohammadifar
: tah...@precisiontranslationtools.com
> Date: Sun, 11 Oct 2015 12:53:37 +0700
> To: moses-support@mit.edu
> Subject: Re: [Moses-support] BLEU score difference about 0.13 for one
> dataset is normal?
>
>
> Yes. Each tuning with the same test set will give you small variations
Date: Sun, 11 Oct 2015 04:23:56 +
From: Davood Mohammadifar
Subject: [Moses-support] BLEU score difference about 0.13 for one
dataset is normal?
Hello everyone,
I noticed different BLEU scores for the same dataset. The difference is small,
about 0.13.
I trained on my dataset and tuned on the development set for Persian-English
translation. After testing, the score was 21.95. I then repeated the same process
and obtained 21.82.
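
To put the two scores above in perspective, here is a minimal Python sketch that summarizes BLEU across repeated tuning runs. The variable names are only illustrative, and the list holds just the two scores reported in this message. With only two runs the standard deviation is a very rough estimate, so more runs would be needed before drawing conclusions.

```python
from statistics import mean, stdev

# BLEU scores from the two train/tune/test runs reported above.
bleu_runs = [21.95, 21.82]

# Gap between the best and worst run.
diff = max(bleu_runs) - min(bleu_runs)

# Average score across runs.
avg = mean(bleu_runs)

# Sample standard deviation (n - 1 denominator); needs at least two runs.
spread = stdev(bleu_runs)

print(f"range: {diff:.2f}")
print(f"mean:  {avg:.3f}")
print(f"stdev: {spread:.3f}")
```

Scores from further runs can be appended to bleu_runs to get a steadier estimate of the run-to-run variation.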
Hi Davood,
Optimizers like MERT will give you a slightly different result every time
you run them, leading to variance in BLEU score. It's generally a good
idea to use multiple optimizer runs, especially when comparing two
systems. There's a good paper on hypothesis testing for MT that goes into