On 19/07/2015 23:06, Vincent Nguyen wrote:
> I finally went through the whole baseline process with the KenLM model.
>
> Results are mixed, so from here, what would be the best practices?
>
> 1) I saw online a bunch of corpora available from the European Union.
> Should these be used to train the translation system AND the language
> model, or just one of the two?
You can use the data to create both the language model and the 
translation model. The only thing you have to make sure of is that your 
training data does not overlap with your tuning or test data.
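A minimal sketch of that overlap check (the function name and the exact-match criterion are my own assumptions; in practice you would compare normalized/tokenized lines):

```python
# Hypothetical sketch: drop any training sentence that also appears
# verbatim in the tuning or test sets, so held-out data never leaks
# into LM/TM training.

def filter_overlap(train_sentences, held_out_sentences):
    """Return training sentences that do not occur in the held-out sets."""
    held_out = {s.strip() for s in held_out_sentences}
    return [s for s in train_sentences if s.strip() not in held_out]

train = ["the cat sat .", "hello world .", "a b c ."]
tuning = ["hello world ."]
clean = filter_overlap(train, tuning)
# clean == ["the cat sat .", "a b c ."]
```

This is exact string matching only; near-duplicates (casing, tokenization differences) would need normalization first.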
>
> 2) Is there a benchmark between the different models (KenLM, IRSTLM, ...)?
> I.e., is there a big difference in the observed results?
> Is it worth trying several of them?
Try them yourself and tell us the results.
>
> 3) I read an article mentioning that the results after tuning were
> not as good as before...
> Does this make any sense?
If you report BLEU scores without tuning first, you will be crucified, 
see this thread:
   https://www.mail-archive.com/moses-support@mit.edu/msg12593.html
You MUST tune. Tuning can sometimes be difficult. See this post on how 
to pick a good tuning set:
   https://www.mail-archive.com/moses-support@mit.edu/msg12594.html
>
> Thanks.
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

-- 
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu
