Was an improvement of 37 BLEU points over the default behaviour not enough to show that there are problems with the default?

James

________________________________
From: Raphael Payen <raphael.pa...@gmail.com>
Sent: Sunday, June 21, 2015 5:29 PM
To: Read, James C
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] Major bug found in Moses

James, did you try the modifications Philip suggested (removing the word penalty and lowering p(f|e))?
(I doubt it will be enough to get a best paper award, but it would probably improve your bleu, that's always a good start :) )

On Friday, June 19, 2015, Read, James C <jcr...@essex.ac.uk> wrote:

So, all I did was filter out the less likely phrase pairs and the BLEU score shot up. Was that such a stroke of genius? Was that not blindingly obvious?

You're telling me that redesigning the search algorithm to prefer higher scoring phrase pairs is all we need to do to get a best paper at ACL?

James

________________________________
From: Lane Schwartz <dowob...@gmail.com>
Sent: Friday, June 19, 2015 7:40 PM
To: Read, James C
Cc: Philipp Koehn; Burger, John D.; moses-support@mit.edu
Subject: Re: [Moses-support] Major bug found in Moses

On Fri, Jun 19, 2015 at 11:28 AM, Read, James C <jcr...@essex.ac.uk> wrote:

What I take issue with is the en-masse denial that there is a problem with the system if it behaves in such a way with no LM + no pruning and/or tuning.

There is no mass denial taking place.

Regardless of whether or not you tune, the decoder will do its best to find translations with the highest model score. That is the expected behavior.

What I have tried to tell you, and what other people have tried to tell you, is that translations with high model scores are not necessarily good translations.

We all want our models to be such that high model scores correspond to good translations, and that low model scores correspond with bad translations. But unfortunately, our models do not innately have this characteristic. We all know this. We also know a good way to deal with this shortcoming, namely tuning.
Tuning is the process by which we attempt to ensure that high model scores correspond to high quality translations, and that low model scores correspond to low quality translations.

If you can design models that naturally correspond with translation quality without tuning, that's great. If you can do that, you've got a great shot at winning a Best Paper award at ACL.

In the meantime, you may want to consider an apology for your rude behavior and unprofessional attitude.

Goodbye.
Lane
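
The point about model scores and tuning can be illustrated with a minimal sketch. It assumes a log-linear model with only two features, a translation-model score and a language-model score; the candidate names, probabilities, and weights below are invented for illustration and are not taken from Moses or from any experiment in this thread. The decoder's job is simply to pick the candidate with the highest weighted score, so which candidate wins depends entirely on the weights.

```python
import math

# Hypothetical feature values (log-probabilities) for two candidate
# translations of the same source sentence. All numbers are made up.
CANDIDATES = {
    "fluent": {"tm": math.log(0.20), "lm": math.log(0.10)},
    "garbled": {"tm": math.log(0.40), "lm": math.log(0.001)},
}


def model_score(features, weights):
    """Log-linear model score: weighted sum of the feature values."""
    return sum(weights[name] * value for name, value in features.items())


def best_candidate(candidates, weights):
    """Return the name of the candidate with the highest model score."""
    return max(candidates, key=lambda name: model_score(candidates[name], weights))


# Assumed weight settings: one that ignores the language model entirely,
# and one that gives both features equal weight (standing in for tuned weights).
untuned = {"tm": 1.0, "lm": 0.0}
tuned = {"tm": 1.0, "lm": 1.0}

print(best_candidate(CANDIDATES, untuned))
print(best_candidate(CANDIDATES, tuned))
```

With the language-model weight at zero, the highest-scoring candidate is the one the translation model alone prefers, whether or not it reads well. Changing the weights changes which candidate scores highest without changing the search itself, which is the role tuning plays in the argument above.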

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support