Hi Vincent :)

Here are some answers (yes, I know, I am talking to myself).
This page
http://www.statmt.org/moses/?n=Advanced.Incremental
definitely needs to be redone.

It mixes up two different things.

On one hand:
Incremental training as a lot of people have used it in the past, with:
- training a baseline model; but if you do it with EMS, unless I am
mistaken, it won't keep the VCB files ...
- using IncGiza, because mgiza isn't really incremental, it just reuses
parameters (am I wrong?)
- so if the baseline model has been trained with fast_align (IBM Model 2),
it won't work, since IncGiza supports only IBM Model 1 and HMM.
- bottom line: on huge corpora like Giga, you need patience to train your
baseline with mgiza and the hmm option.
My understanding is that EMS does not cover the baseline preparation
(the VCB files are missing) but is fine for the incremental training itself.
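
As a sketch, the initial-training setting quoted from the doc would sit in
the EMS config roughly like this. The [TRAINING] section placement and the
combined line shown in the comment are assumptions and have not been
checked against an incremental GIZA run:

```ini
[TRAINING]
# Setting quoted from the Advanced.Incremental page (initial training step).
training-options = "-final-alignment-model hmm"

# Assumption: the value is one quoted string, so other switches could share
# it, for example:
# training-options = "-final-alignment-model hmm -sort-compress gzip -cores 4"
```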


On the other hand:
Incremental training capability using MMSapt:
- EMS includes the mmsapt option to train and binarize the arrays
- EMS does NOT include the step of incrementally adding the new data in
an automated way. It has to be done manually.


Am I understanding things properly?



On 23/08/2015 09:06, Vincent Nguyen wrote:
> Hello,
>
> I have a few questions on running MMSAPT within EMS. I am referring to
> the doc here : http://www.statmt.org/moses/?n=Advanced.Incremental
> and to the sections of the config.basic file of EMS.
>
> 1) the doc says:
> initial training: run EMS as usual but use modified version of Giza++ and
> add  training-options = "-final-alignment-model hmm"
> Does this mean that we cannot use FastAlign for initial training? Or
> does FastAlign support hmm?
> Also, can the -final-alignment-model hmm parameter be on the same line
> as the other training-options? (-sort-compress -cores ....)
>
> 2) do we need to comment out the
> alignment-symmetrization-method=grow-diag-final-and line?
>
> 3) what is the difference between : ("for use with interactive
> post-editing")
>
> mmsapt = "pfwd=g pbwd=g smooth=0.01 rare=0 prov=0 sample=1000 workers=1"
> binarize-all = $moses-script-dir/training/binarize-model.perl
> and
> mmsapt = "pfwd=g pbwd=g smooth=0 rare=1 prov=1 sample=1000 workers=1"
> binarize-all = $moses-script-dir/training/binarize-model.perl
>
> 4) then there is a section in the config file for which I find no
> documentation but seems related
> use of baseline alignment model (incremental training)
> baseline=68
> then 8 lines of parameters
>
> 5) in the "Updates" section of the doc (adding new data) I see nothing
> related to EMS.
> Does this mean that at this time there is no actual incremental training
> within EMS and this part has to be done manually?
>
> Thanks for your help,
>
> Vincent
>
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support