Hi,

Not sure whether this was mentioned in the vast number of replies:

I'd like to stress that simple histogram pruning of the phrase table is
implemented in Moses and in every other SMT system I'm aware of.
(Better pruning techniques are known, though:
http://anthology.aclweb.org/D/D12/D12-1089.pdf )

If you deactivate all non-local features (the LM, the lexicalized
reordering model, the distance-based jump cost), run monotonic decoding,
and score phrase pairs for pruning with the same features and scaling
factors the decoder uses, then it shouldn't matter how much you prune:
as long as you keep at least the best translation option per distinct
source side, the decoder should always output the very same Viterbi path.

A simple toy example should be sufficient to verify that the decoder
implements the argmax operation. 
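
To make this concrete, here is a minimal Python sketch of such a toy
check. It is not Moses code: the phrase table, feature values, weights,
and the function names (local_score, prune, decode) are all made up for
illustration. It assumes only local phrase-pair features and monotonic
decoding, and compares the Viterbi path on the full table with the path
on histogram-pruned copies.

```python
# Hypothetical toy phrase table: source phrase -> [(target, feature values)].
# All entries and numbers are invented for illustration.
PHRASE_TABLE = {
    ("das",): [("the", (-0.2, -0.3)), ("that", (-1.0, -0.9)), ("this", (-1.5, -1.2))],
    ("haus",): [("house", (-0.1, -0.2)), ("home", (-0.8, -1.1))],
    ("das", "haus"): [("the house", (-0.3, -0.3)), ("the home", (-1.6, -1.0))],
    ("ist",): [("is", (-0.1, -0.1)), ("was", (-2.0, -1.5))],
    ("klein",): [("small", (-0.3, -0.2)), ("little", (-0.5, -0.9))],
}
WEIGHTS = (1.0, 0.5)  # assumed scaling factors, shared by pruning and decoding
SOURCE = "das haus ist klein".split()


def local_score(features, weights=WEIGHTS):
    """Weighted sum of the local (phrase-pair) features."""
    return sum(w * f for w, f in zip(weights, features))


def prune(table, keep):
    """Histogram pruning: keep the `keep` best options per source side,
    ranked by the same score the decoder uses."""
    return {
        src: sorted(opts, key=lambda o: local_score(o[1]), reverse=True)[:keep]
        for src, opts in table.items()
    }


def decode(source, table, max_len=3):
    """Monotonic Viterbi search without non-local features.
    best[i] holds (score, target phrases) for the prefix source[:i]."""
    best = [(0.0, [])] + [None] * len(source)
    for end in range(1, len(source) + 1):
        for start in range(max(0, end - max_len), end):
            if best[start] is None:
                continue
            src = tuple(source[start:end])
            for target, feats in table.get(src, []):
                cand = (best[start][0] + local_score(feats),
                        best[start][1] + [target])
                if best[end] is None or cand[0] > best[end][0]:
                    best[end] = cand
    return best[len(source)]


if __name__ == "__main__":
    print("full table:", decode(SOURCE, PHRASE_TABLE))
    for keep in (1, 2, 3):
        print("keep", keep, ":", decode(SOURCE, prune(PHRASE_TABLE, keep)))
```

Since every score here is a sum of independent per-phrase terms, the
best option for each source side dominates all others, so the path
should be identical for every pruning level. The toy values avoid ties
on purpose; with ties, equally scored paths could legitimately differ.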

We frequently run a couple of basic regression tests:
http://statmt.org/moses/cruise/ 
I'm pretty sure we would have noticed quickly if a major bug had been
introduced recently.

The decoder maximizes model score, not BLEU. Tuning is required to
achieve a correlation of model score with BLEU (or the quality metric of
your choice).
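
For reference, this is the usual log-linear formulation in LaTeX. It is
standard SMT background rather than something quoted from this thread,
and the symbols (f for the source sentence, e for a candidate
translation, h_m for feature functions, \lambda_m for scaling factors)
are my own notation:

```latex
\hat{e} \;=\; \operatorname*{arg\,max}_{e} \; \sum_{m=1}^{M} \lambda_m \, h_m(e, f)
```

The search only ever sees this weighted sum. BLEU enters solely through
tuning, which picks the \lambda_m so that high model score tends to
coincide with high BLEU on a development set.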

Cheers,
Matthias




On Thu, 2015-06-18 at 07:50 +0700, Tom Hoar wrote:
> Amittai, I understand your point about sounding "almost belligerently 
> confrontational." I also admire James's passion and the Moses team's
> patience to walk through his logic. As a non-scientific reader, this is 
> the most educational exchange I've seen on this list for years. I'm 
> learning a lot. Thank you everyone.
> 
> James, as a non-scientific reader, let me say that Hieu's head bashing 
> to solve the same puzzle shows you're in good company. Yet, the Moses 
> "system" is defined, designed and works with two functionally different 
> pieces, i.e. the front-end and back-end. The front-end creates an
> (often wild) array of candidate hypotheses -- by design. Why is this
> piece designed this way? Because the system design includes a back-end 
> that selects a final choice from amongst the candidates. The two halves 
> share a symbiotic relationship. Together, the pieces form a system with 
> a balance that can only be achieved by working together. In this 
> context, this is not a "bug" (major or minor) and the "system" is not 
> broken.
> 
> I submit, as others have suggested, that you have conceived and are 
> working with a new and different "system" that consists of two different 
> halves. Your front-end reduces the table to a focused set. Your back-end
> works much like today's translation table to select from the focused 
> set. Major advances sometimes come by challenging the status quo. We 
> have seen evidence here of both the challenge and the status quo.
> 
> So, although I cannot "admit the system is broke," I encourage you to
> advance your new system without trying to fix one that's not broken.
> 
> Tom
> 
> 
> > Date: Wed, 17 Jun 2015 15:48:14 +0000
> > From: "Read, James C"<jcr...@essex.ac.uk>
> > Subject: Re: [Moses-support] Major bug found in Moses
> > To: Marcin Junczys-Dowmunt<junc...@amu.edu.pl>
> > Cc:"moses-support@mit.edu"  <moses-support@mit.edu>, "Arnold,       
> > Doug"<d...@essex.ac.uk>
> > Message-ID:<db3pr06mb0713adf9af14ee5d93ec5bc485...@db3pr06mb0713.eurprd06.prod.outlook.com>
> > Content-Type: text/plain; charset="iso-8859-2"
> >
> > 1) So if I've understood you correctly you are saying we have a system that 
> > is purposefully designed to perform poorly with a disabled LM and this is 
> > the proof that the LM is the most fundamental part. Any attempt to prove 
> > otherwise by, e.g. filtering the phrase table to help the dysfunctional
> > search algorithm, does not constitute proof that the TM is the most 
> > fundamental component of the system and if designed correctly can perform 
> > just fine on its own but rather only evidence that the researcher is not 
> > using the system as intended (the intention being to break the TM to 
> > support the idea that the LM is the most fundamental part).
> >
> > 2) If you still feel that the LM is the most fundamental component I 
> > challenge you to disable the TM and perform LM only translations and see 
> > what kind of BLEU scores you get.
> >
> > In conclusion, I do hope that you don't feel that potential investors in MT 
> > systems lack the intelligence to see through these logical fallacies. Can 
> > we now just admit that the system is broke and get around to fixing it?
> >
> > James
> 
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
> 



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
