Hello Nat,
for NMT ensembles, you just average the probability distributions of the
different models at each time step before selecting the next hypothesis
(or hypotheses, in beam search). If you're familiar with Moses, this is
similar to what happens when we combine different feature functions in
the global log-linear model. That's also why I don't think the
comparison of a neural network ensemble to Moses is unfair in principle
- both combine several models to obtain the final translation
probabilities - the Moses phrase table alone contributes (at least) four
feature scores.
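To make the averaging concrete, here is a minimal sketch of one decoding
step. The model objects and their predict_next() method are placeholders
I made up for illustration, not the API of any particular toolkit:

    import numpy as np

    def ensemble_step(models, history):
        # Each model returns a probability distribution over the
        # target vocabulary, given the same partial hypothesis.
        probs = [m.predict_next(history) for m in models]  # hypothetical API
        # Arithmetic mean of the distributions; some systems average
        # log-probabilities (geometric mean) instead.
        return np.mean(probs, axis=0)

    # Greedy decoding would then pick np.argmax(avg_probs);
    # beam search keeps the k best extensions of each hypothesis.
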
Our official submissions to WMT16 are ensembles, but even our single
systems outperform non-neural submissions for EN->DE, EN->RO, EN->CS and
DE->EN (in terms of BLEU).
best wishes,
Rico
On 03/11/16 02:05, Nat Gillin wrote:
Dear Moses Community,
In recent papers, many BLEU scores have been reported for ensembles
of neural machine translation systems. I would like to ask whether
anyone knows how these ensembles are created.
Is it some sort of averaged pooling layer at the end? Is it some sort
of voting among multiple systems at every time step during decoding?
Any pointers to papers describing this magical ensemble would be great =)
Most papers just say "we ensemble, we beat Moses". Are there cases
where a single model beats Moses on a standard translation task without
ensembling?
Regards,
Nat
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support