thanks for the feedback. A couple of questions:

1. Do you have results from cdec/joshua/jane with the same data too? You can tell us even if we're worse, we're big boys now :)

2. I noticed that the translation rules don't have the constant phrase penalty (i.e. the 2.718 found as the last score when Moses creates phrase tables). Do you know if other decoders have the phrase penalty as a built-in feature function?
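For context, the 2.718 is just e: Moses stores phrase-table scores as probabilities and log-transforms them at load time, so a constant last score of e contributes exactly 1 per applied phrase, which makes the feature a phrase count the tuner can weight. A minimal sketch (the helper function name is mine, not Moses code):

```python
import math

# Moses phrase-table scores are stored as probabilities and
# log-transformed when loaded. A constant last score of e (~2.718)
# therefore contributes log(e) = 1 per phrase, turning the feature
# into a simple phrase count.
PHRASE_PENALTY = math.e  # the "2.718" seen as the last phrase-table score

def phrase_penalty_score(num_phrases):
    """Total log-domain contribution of the phrase penalty (hypothetical helper)."""
    return num_phrases * math.log(PHRASE_PENALTY)

# Each applied phrase contributes 1, so a 5-phrase derivation scores 5.
print(phrase_penalty_score(5))
```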
Some years ago, there was a belief that the phrase penalty doesn't help much, but I've never seen the evidence. If you want to verify this, I've created a phrase-penalty feature that you can use:
https://github.com/moses-smt/mosesdecoder/commit/e15a4fc882952be13efcdecc8284d19560229785

On 24 June 2013 22:16, Wilker Aziz <will.a...@gmail.com> wrote:
> Hello everybody,
>
> I would like to share with you the results of a de-en hierarchical model
> trained using this year's WMT constrained data. This model was trained
> using Adam Lopez's hierarchical suffix arrays.
> I patched some wrappers, so hopefully anyone will be able to train such a
> model using EMS now (see
> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc44).
> Some code needed to be refactored in Moses as well (thanks for that, Hieu!).
>
> About the training:
> * fast_align for word alignments
> * lmplz for LM estimation (and KenLM for decoding)
> * suffix-array implementation from cdec (via pycdec) with features:
>   EgivenFCoherent SampleCountF CountEF MaxLexFgivenE MaxLexEgivenF
>   IsSingletonF IsSingletonFE
> Great tools, by the way!
>
> So, altogether (europarl, nc and commoncrawl) there were 4.5M parallel
> segments for training and 20M monolingual segments for language modelling
> (europarl-mono, nc-v8, news2012). This was a 3-gram LM; for LM interpolation
> and MERT I used newstest2010. MERT was quite tedious, especially because the
> SA code in Moses is not thread-safe. It took 2.5 days to complete 19
> iterations and reached 24.19 BLEU (dev set).
>
> testset       BLEU   BLEU-c
> newstest2011  22.29  21.08
> newstest2012  23.12  21.88
> newstest2013  25.80  24.53
>
> These results seem comparable with last year's findings. However, there is
> probably more data here, so this model might be a little behind a standard
> hiero model.
>
> Cheers,
> Wilker
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu