> From: tah...@precisiontranslationtools.com
> Date: Sun, 11 Oct 2015 12:53:37 +0700
> To: moses-support@mit.edu
> Subject: Re: [Moses-support] BLEU score difference about 0.13 for one
> dataset is normal?
>
>
> Yes. Each tuning run with the same test set will give you small variations
> in the final BLEU. Yours look like they're in a normal range.
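To put a number on the run-to-run variation mentioned above, one option is to repeat tuning a few times and summarize the resulting test-set BLEU scores. This is only a minimal sketch, assuming the scores have already been collected by hand; `summarize_runs` is a hypothetical helper, and every score except 21.95 (the one reported in this thread) is made up for illustration.

```python
from statistics import mean, stdev


def summarize_runs(bleu_scores):
    """Return (mean, sample standard deviation, max - min) for BLEU
    scores obtained from repeated tuning runs on the same data."""
    if len(bleu_scores) < 2:
        raise ValueError("need at least two runs to estimate variation")
    spread = max(bleu_scores) - min(bleu_scores)
    return mean(bleu_scores), stdev(bleu_scores), spread


# 21.95 comes from the thread; the other two values are hypothetical.
runs = [21.95, 22.08, 21.90]
avg, sd, spread = summarize_runs(runs)
print(f"mean={avg:.2f} stdev={sd:.2f} range={spread:.2f}")
```

If the gap between two systems is no larger than the spread seen across repeated runs of a single system, the gap is probably not meaningful on its own.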
Date: Sun, 11 Oct 2015 04:23:56 +
From: Davood Mohammadifar
My dataset for Persian to English includes:
Training: about 24 sentences
Tune: 1000 sentences
Test: 1000 sentences
Date: Sun, 11 Oct 2015 04:23:56 +
From: Davood Mohammadifar <davood...@hotmail.com>
Subject: [Moses-support] BLEU score difference about 0.13 for one
dataset is normal?
Hello everyone,
I noticed different BLEU scores for the same dataset. The difference is small,
about 0.13.
I trained on my dataset and tuned on the development set for Persian-English
translation. After testing, the score was 21.95. The second time, I ran the
same process and obtained
Hi Davood,
Optimizers like MERT will give you a slightly different result every time
you run them, leading to variance in BLEU score. It's generally a good
idea to use multiple optimizer runs, especially when comparing two
systems. There's a good paper on hypothesis testing for MT that goes