Dear Marcin,

I have uploaded my EMS files for WMT'16:
https://github.com/bicici/ParFDAWMT16

Text processing steps can be language-dependent, may require domain
knowledge and expertise, and can distinguish you from others by improving
your results.
I suggest reading the relevant sections of the WMT participants' papers to
get a feel for the computational requirements, which are not necessarily
made obvious, such as the use of unsupervised learning of classes in
language models and alignment. Text processing lets the datasets take
the form you want them to have, even if you consider it evil. If removing
punctuation from some dataset helps, then this may be found ingenious as
well.
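
To illustrate the punctuation example, here is a minimal sketch of a
switchable text processing step in Python. The function names
(strip_punctuation, preprocess) and their options are hypothetical and are
not taken from the ParFDA EMS files or from Moses; it only shows how such a
step can be turned on or off per dataset.

```python
import unicodedata


def strip_punctuation(line: str) -> str:
    """Drop characters whose Unicode category starts with 'P' (punctuation).

    Symbols such as '$' (category 'Sc') are kept; whether that is desirable
    depends on the language and the domain.
    """
    kept = "".join(
        ch for ch in line if not unicodedata.category(ch).startswith("P")
    )
    # Removing punctuation can leave double spaces behind; collapse them.
    return " ".join(kept.split())


def preprocess(line: str, lowercase: bool = True, remove_punct: bool = False) -> str:
    """Hypothetical per-line preprocessing with optional steps."""
    # Normalize to NFC so that equivalent character sequences compare equal.
    line = unicodedata.normalize("NFC", line)
    if lowercase:
        line = line.lower()
    if remove_punct:
        line = strip_punctuation(line)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(line.split())


if __name__ == "__main__":
    print(preprocess("Hello,  World!", remove_punct=True))
```

Whether a step like remove_punct helps is an empirical question for each
language pair and dataset, so it is left as an option rather than a default.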

Barry Haddow has prepared preprocessed WMT'17 datasets:
http://data.statmt.org/wmt17/translation-task/preprocessed/
http://www.statmt.org/wmt17/translation-task.html


Regards,
Ergun


On Sun, Nov 26, 2017 at 12:41 PM, Marcin Junczys-Dowmunt
<junc...@amu.edu.pl> wrote:

> Hi list,
>
> I am preparing a couple of usage examples for my NMT toolkit and got hung
> up on all the preprocessing and other evil stuff. I am wondering is
> there now anything decent around for doing preprocessing, running
> experiments and evaluation? Or is the best thing still GNU make (isn't
> that embarrassing)?
>
> Best,
>
> Marcin
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
