On 19/06/15 19:21, Marcin Junczys-Dowmunt wrote:
> So, if anything, Moses is just a very flexible text-rewriting tool.
> Tuning (and data) turns it into a translator, GEC tool, POS-tagger,
> Chunker, Semantic Tagger etc.
That's a good point, and the basis of some criticism that can be
levelled at the Moses community: because Moses is so flexible, the
responsibility is on the user to find the right configuration for a
task. I think it is getting harder to find out about all of the
settings/models necessary to reproduce a state-of-the-art system,
especially outside of an established SMT research group. The result is
a high barrier to entry, and frustration on all sides when somebody
performs experiments with default settings.

To stay with the example of phrase table pruning: this is widely used, 
and I used count-based pruning, threshold pruning based on p(e|f), and 
histogram pruning based on the model score in my WMT submission. Can and 
should we make a wider effort to facilitate the reproduction of systems 
by disseminating settings or configuration files? This dissemination is 
partially done by system description papers, but they cannot cover all 
settings [this would make for a very boring paper]. I put some effort 
into documenting my WMT submission by releasing EMS configuration files 
( https://github.com/rsennrich/wmt2014-scripts/tree/master/example ), 
and I would be happy to see this done more often.
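
For readers unfamiliar with the three kinds of pruning mentioned above, here
is a minimal Python sketch of how they can be combined. It is not the Moses
implementation and not the setup from my WMT submission: the PhrasePair
record, the prune() function, and all threshold values are hypothetical and
chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class PhrasePair(NamedTuple):
    """Hypothetical phrase table entry, for illustration only."""
    source: str
    target: str
    p_e_given_f: float  # direct translation probability p(e|f)
    count: int          # how often the pair was extracted
    model_score: float  # assumed weighted model score, higher is better


def prune(pairs, min_count=2, min_prob=0.001, max_per_source=20):
    """Apply count-based, threshold, and histogram pruning in that order.

    All default values are illustrative assumptions.
    """
    by_source = defaultdict(list)
    for pair in pairs:
        # Count-based pruning: drop rarely observed pairs.
        if pair.count < min_count:
            continue
        # Threshold pruning: drop pairs with a very low p(e|f).
        if pair.p_e_given_f < min_prob:
            continue
        by_source[pair.source].append(pair)

    pruned = []
    for candidates in by_source.values():
        # Histogram pruning: keep only the best-scoring targets per source.
        ranked = sorted(candidates, key=lambda p: p.model_score, reverse=True)
        pruned.extend(ranked[:max_per_source])
    return pruned
```

Calling prune() on a list of PhrasePair entries returns the surviving entries
grouped by source phrase, best-scoring first. Real settings depend on the
language pair and data size, which is exactly why sharing configuration files
helps.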