Re: [Moses-support] BLEU score difference about 0.13 for one dataset is normal?

2015-10-14 Thread Michael Denkowski
> From: tah...@precisiontranslationtools.com > Date: Sun, 11 Oct 2015 12:53:37 +0700 > To: moses-support@mit.edu > Subject: Re: [Moses-support] BLEU score difference about 0.13 for one dataset is normal? > > Yes. Each tuning with the same test set will give you

Re: [Moses-support] BLEU score difference about 0.13 for one dataset is normal?

2015-10-14 Thread Tom Hoar
Yes. Each tuning with the same test set will give you small variations in the final BLEU. Yours look like they're in a normal range. Date: Sun, 11 Oct 2015 04:23:56 + From: Davood Mohammadifar

Re: [Moses-support] BLEU score difference about 0.13 for one dataset is normal?

2015-10-13 Thread Davood Mohammadifar
My dataset for Persian to English includes: Training: about 24 sentences; Tune: 1000 sentences; Test: 1000 sentences. From: tah...@precisiontranslationtools.com Date: Sun, 11 Oct 2015 12:53:37 +0700 To: moses-support@mit.edu Subject: Re: [Moses-support] BLEU score difference about 0.13 for one

Re: [Moses-support] BLEU score difference about 0.13 for one dataset is normal?

2015-10-10 Thread Tom Hoar
Yes. Each tuning with the same test set will give you small variations in the final BLEU. Yours look like they're in a normal range. Date: Sun, 11 Oct 2015 04:23:56 + From: Davood Mohammadifar <davood...@hotmail.com> Subject: [Moses-support] BLEU score difference about 0.13 f

[Moses-support] BLEU score difference about 0.13 for one dataset is normal?

2015-10-10 Thread Davood Mohammadifar
Hello everyone, I noticed different BLEU scores for the same dataset. The difference is not large, about 0.13. I trained on my dataset and tuned on the development set for Persian-English translation. After testing, the score was 21.95. The second time, I repeated the same process and obtained

Re: [Moses-support] BLEU score difference about 0.13 for one dataset is normal?

2015-10-10 Thread Michael Denkowski
Hi Davood, Optimizers like MERT will give you a slightly different result every time you run them, leading to variance in BLEU score. It's generally a good idea to use multiple optimizer runs, especially when comparing two systems. There's a good paper on hypothesis testing for MT that goes
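A minimal sketch of the multiple-runs advice above, assuming you have already collected the test-set BLEU from several independent tuning runs of the same system. The function name `summarize_runs` and the scores are hypothetical and only for illustration; they are not taken from this thread.

```python
from statistics import mean, stdev


def summarize_runs(bleu_scores):
    """Return (mean, sample standard deviation) of BLEU over optimizer runs."""
    if len(bleu_scores) < 2:
        raise ValueError("need at least two optimizer runs to estimate variance")
    return mean(bleu_scores), stdev(bleu_scores)


# Hypothetical BLEU scores from three tuning runs of the same system
# (illustrative values, not results reported on the list).
runs = [21.8, 22.0, 22.2]
avg, spread = summarize_runs(runs)
print(f"BLEU: {avg:.2f} +/- {spread:.2f} over {len(runs)} runs")
```

Reporting the mean together with the spread makes it easier to judge whether a gap between two systems is larger than the run-to-run noise of the optimizer. This is only a rough check, not a substitute for a proper significance test.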