On 19/06/15 19:21, Marcin Junczys-Dowmunt wrote:
> So, if anything, Moses is just a very flexible text-rewriting tool.
> Tuning (and data) turns into a translator, GEC tool, POS-tagger,
> Chunker, Semantic Tagger etc.

That's a good point, and the basis of some criticism that can be levelled at the Moses community: because Moses is so flexible, the responsibility is on the user to find the right configuration for a task. I think it is getting harder to find out about all of the settings/models necessary to reproduce a state-of-the-art system, especially outside of an established SMT research group. The result is a high barrier to entry, and frustration on all sides when somebody performs experiments with default settings.

To stay with the example of phrase table pruning: this is widely used, and in my WMT submission I used count-based pruning, threshold pruning based on p(e|f), and histogram pruning based on the model score. Can and should we make a wider effort to facilitate the reproduction of systems by disseminating settings or configuration files? This dissemination is partially done by system description papers, but they cannot cover all settings (this would make for a very boring paper). I put some effort into documenting my WMT submission by releasing EMS configuration files (https://github.com/rsennrich/wmt2014-scripts/tree/master/example), and I would be happy to see this done more often.
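For readers unfamiliar with the three pruning criteria mentioned above, here is a minimal sketch of how they can be combined. It is an illustration only, not the scripts or settings from the WMT submission: the `PhrasePair` fields, the `prune` function, the order in which the filters are applied, and all threshold values are assumptions made up for this example, and it works on an in-memory list rather than on a real Moses phrase table file.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PhrasePair:
    # Hypothetical in-memory representation of one phrase table entry.
    source: str
    target: str
    p_e_given_f: float  # direct translation probability p(e|f)
    joint_count: int    # how often the pair was extracted from the training data
    model_score: float  # weighted model score used for ranking (higher is better)


def prune(pairs, min_count=2, min_prob=0.001, max_per_source=20):
    """Apply count-based, threshold, and histogram pruning (illustrative defaults)."""
    by_source = defaultdict(list)
    for pair in pairs:
        # Count-based pruning: drop pairs that were extracted too rarely.
        if pair.joint_count < min_count:
            continue
        # Threshold pruning: drop pairs whose p(e|f) is below a fixed cutoff.
        if pair.p_e_given_f < min_prob:
            continue
        by_source[pair.source].append(pair)

    pruned = []
    for candidates in by_source.values():
        # Histogram pruning: keep only the best-scoring options per source phrase.
        candidates.sort(key=lambda p: p.model_score, reverse=True)
        pruned.extend(candidates[:max_per_source])
    return pruned


# Toy data, invented for this example.
table = [
    PhrasePair("das haus", "the house", 0.7, 50, -1.0),
    PhrasePair("das haus", "the home", 0.2, 10, -2.5),
    PhrasePair("das haus", "house", 0.05, 3, -4.0),
    PhrasePair("das haus", "the building", 0.0005, 2, -6.0),  # below min_prob
    PhrasePair("das haus", "that house", 0.04, 1, -3.0),      # below min_count
]
kept = prune(table, min_count=2, min_prob=0.001, max_per_source=2)
print([p.target for p in kept])
```

Even in this toy form there are three free parameters, each of which changes the resulting table, which is the kind of setting that rarely fits into a system description paper.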