Re: [Moses-support] Randomisation by MGIZA and tuning result is worse than no tuning

Jelita Asian Tue, 22 Nov 2011 04:39:29 -0800

Hi Miles,

Thanks for your reply.


> --in general, Machine Translation training is non-convex.  this means
> that there are multiple solutions and each time you run a full
> training job, you will get different results.  in particular, you will
> see different results when running Giza++ (any flavour) and MERT.
>
>
Is there no way to stop the variation in Giza++? I looked at the code but
have no idea where it occurs.


> --the best way to deal with this (and most expensive) would be to run
> the full pipe-line, from scratch and multiple times.  this will give
> you a feel for variance --differences in results.  in general,
> variance arising from Giza++ is less damaging than variance from MERT.
>
How many runs are enough for this? As you say, it would be very expensive
to do so.
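
To make the question concrete, here is a minimal sketch of how I imagine
comparing the runs, assuming each full run ends with one BLEU score on the
same test set. The function name and the scores are made up for
illustration; they are not real results.

```python
import statistics


def summarise_runs(bleu_scores):
    """Return (mean, sample standard deviation) of BLEU scores from repeated runs."""
    if len(bleu_scores) < 2:
        raise ValueError("need at least two runs to estimate variance")
    return statistics.mean(bleu_scores), statistics.stdev(bleu_scores)


# Placeholder scores for illustration only, not measured results.
example_scores = [20.0, 21.0, 22.0]
mean, spread = summarise_runs(example_scores)
print(f"mean BLEU = {mean:.2f}, std dev = {spread:.2f} over {len(example_scores)} runs")
```

If the spread is large compared with the difference between two systems,
I suppose more runs would be needed before drawing a conclusion.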


> --to reduce variance it is best to use as much data as possible at
> each stage.  (100 sentences for tuning is far too low;  you should be
> using at least 1000 sentences).  it is possible to reduce this
> variability by using better machine learning, but in general it will
> always be there.
>
What do you mean by better machine learning? Isn't the 500,000-word
corpus enough? For the 1,000 sentences for tuning, can I use the same
sentences as used in the training, or should they be a separate set of
sentences?
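
In case they have to be separate, this is a rough sketch of how I would
hold out a tuning set, assuming the parallel corpus is a list of
(source, target) sentence pairs. The function name, the fixed seed and
the toy data are my own assumptions, not anything from Moses.

```python
import random


def hold_out_tuning_set(pairs, tune_size, seed=0):
    """Split sentence pairs into disjoint (train, tune) lists."""
    if not 0 < tune_size < len(pairs):
        raise ValueError("tune_size must be between 1 and len(pairs) - 1")
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    indices = list(range(len(pairs)))
    rng.shuffle(indices)
    tune_idx = set(indices[:tune_size])
    train = [p for i, p in enumerate(pairs) if i not in tune_idx]
    tune = [p for i, p in enumerate(pairs) if i in tune_idx]
    return train, tune


# Toy data for illustration only.
toy_pairs = [(f"source sentence {i}", f"target sentence {i}") for i in range(10)]
train, tune = hold_out_tuning_set(toy_pairs, tune_size=3)
print(len(train), "training pairs,", len(tune), "tuning pairs")
```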


> --another strategy I know about is to fix everything once you have a
> set of good weights and never rerun MERT.  should you need to change
> say the language model, you will then manually alter the associated
> weight.  this will mean stability, but at the obvious cost of
> generality.  it is also ugly.
>
Could you elaborate a bit on the part about fixing everything and never
rerunning MERT? Do you mean that after running it n times, we pick the
best set of weights (there are so many of them) and then stop running
MERT, which I understand is for tuning?

Thanks, and sorry for answering with more questions.

Cheers,

Jelita

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
