An improvement of 37 BLEU points over the default behaviour was not enough to 
show that there are problems with the default?


James


________________________________
From: Raphael Payen <raphael.pa...@gmail.com>
Sent: Sunday, June 21, 2015 5:29 PM
To: Read, James C
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] Major bug found in Moses

James, did you try the modifications Philipp suggested (removing the word 
penalty and lowering p(f|e))?
(I doubt it will be enough to get a best paper award, but it would probably 
improve your BLEU, that's always a good start :) )



On Friday, June 19, 2015, Read, James C 
<jcr...@essex.ac.uk> wrote:

So, all I did was filter out the less likely phrase pairs and the BLEU score 
shot up. Was that such a stroke of genius? Was that not blindingly obvious?


You're telling me that redesigning the search algorithm to prefer higher-scoring 
phrase pairs is all we need to do to get a best paper at ACL?


James


________________________________
From: Lane Schwartz <dowob...@gmail.com>
Sent: Friday, June 19, 2015 7:40 PM
To: Read, James C
Cc: Philipp Koehn; Burger, John D.; moses-support@mit.edu
Subject: Re: [Moses-support] Major bug found in Moses

On Fri, Jun 19, 2015 at 11:28 AM, Read, James C <jcr...@essex.ac.uk> wrote:

What I take issue with is the en masse denial that there is a problem with the 
system if it behaves in such a way with no LM + no pruning and/or tuning.

There is no mass denial taking place.

Regardless of whether or not you tune, the decoder will do its best to find 
translations with the highest model score. That is the expected behavior.

What I have tried to tell you, and what other people have tried to tell you, is 
that translations with high model scores are not necessarily good translations.

We all want our models to be such that high model scores correspond to good 
translations, and low model scores correspond to bad translations. But 
unfortunately, our models do not innately have this characteristic. We all know 
this. We also know a good way to deal with this shortcoming, namely tuning. 
Tuning is the process by which we attempt to ensure that high model scores 
correspond to high quality translations, and that low model scores correspond 
to low quality translations.
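(The point can be sketched with a toy example; the feature values and weights 
below are hypothetical, not Moses output. The decoder ranks hypotheses by a 
weighted sum of log feature scores, and tuning adjusts the weights so that the 
highest-scoring hypothesis is also the better translation.)

```python
# Toy illustration with made-up log feature scores for two hypotheses.
# Features: language model (lm), phrase translation model (tm), word penalty (wp).
hypotheses = {
    "good translation": {"lm": -2.0, "tm": -8.0, "wp": -4.0},
    "bad translation":  {"lm": -9.0, "tm": -1.0, "wp": -3.0},
}

def model_score(feats, weights):
    """Linear model: weighted sum of log feature scores."""
    return sum(weights[k] * v for k, v in feats.items())

# With uniform (untuned) weights, the hypothesis that the translation
# model loves wins, even though its LM score says it is disfluent.
untuned = {"lm": 1.0, "tm": 1.0, "wp": 1.0}
best_untuned = max(hypotheses, key=lambda h: model_score(hypotheses[h], untuned))
print(best_untuned)   # "bad translation" (-13.0 beats -14.0)

# Hypothetical tuned weights: down-weight p(f|e) and the word penalty,
# so high model score lines up with translation quality.
tuned = {"lm": 1.0, "tm": 0.2, "wp": 0.5}
best_tuned = max(hypotheses, key=lambda h: model_score(hypotheses[h], tuned))
print(best_tuned)     # "good translation" (-5.6 beats -10.7)
```

In both cases the decoder is doing exactly its job, maximising model score; 
only the weights changed.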

If you can design models that naturally correspond with translation quality 
without tuning, that's great. If you can do that, you've got a great shot at 
winning a Best Paper award at ACL.

In the meantime, you may want to consider an apology for your rude behavior and 
unprofessional attitude.

Goodbye.
Lane

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
