Re: [Moses-support] Tuning for factored phrase based systems
Thank you!

On Tue, Dec 6, 2016 at 12:55 AM Sašo Kuntaric wrote:

> Hi Angli,
>
> Here is an excerpt of Hieu's answers regarding this topic when I was doing
> research in factored models, might be of some help:
> [...]

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] Tuning for factored phrase based systems
Hi Angli,

Here is an excerpt of Hieu's answers on this topic from when I was researching factored models; it might be of some help:

On 30/06/2016 21:44, Sašo Kuntaric wrote:

> Hi all,
>
> I would like to ask one more question. When you say that my reference only
> has the surface form, are you talking about the "tuning corpus", which in
> the case of my command
>
>   ~/mosesdecoder/scripts/training/mert-moses.pl \
>     ~/working/IT_corpus/TMX/txt/factored_corpus/singles/tuning_corpus.tagged.clean.en \
>     ~/working/IT_corpus/TMX/txt/factored_corpus/singles/tuning_corpus.tagged.clean.sl \
>     ~/mosesdecoder/bin/moses \
>     ~/working/IT_corpus/TMX/txt/factored_corpus/singles/test/model/moses.ini \
>     --mertdir ~/mosesdecoder/bin/ --decoder-flags="-threads all"
>
> are tuning_corpus.tagged.clean.en and tuning_corpus.tagged.clean.sl? Can
> tuning be done with files that only contain surface forms?

It's usual that the reference tuning data does not have factors, even if there are factors in the phrase table. After all, you don't care if the output surface form is correct but the other factors are wrong.

> Will the results be compatible with tuning done with a factored tuning
> corpus?

Yes

> Best regards,
>
> Sašo

2016-12-04 1:37 GMT+01:00 Hieu Hoang:

> [...]

--
Best regards,

Sašo
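Hieu's point that the tuning reference normally carries surface forms only can be tried out with a small script. The following is a minimal sketch, not part of Moses: the function names `surface_only` and `strip_factors` are hypothetical, and it assumes the usual Moses convention that factors within a token are joined by `|` with the surface form first.

```python
def surface_only(line: str, factor_sep: str = "|") -> str:
    """Keep only the first (surface) factor of every token in a line.

    Assumes tokens look like surface|pos (surface form first); tokens
    without a separator are passed through unchanged.
    """
    return " ".join(tok.split(factor_sep, 1)[0] for tok in line.split())


def strip_factors(in_path: str, out_path: str) -> None:
    """Write a surface-only copy of a factored corpus file (hypothetical helper)."""
    with open(in_path, encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(surface_only(line) + "\n")


if __name__ == "__main__":
    # Illustrative sentence, not taken from the corpus discussed in the thread.
    print(surface_only("the|DT house|NN is|VBZ small|JJ"))
```

Running `strip_factors` on a tagged reference file would give a surface-only file that could be passed to mert-moses.pl as the reference argument. Whether the input side should keep its factors depends on what the model in moses.ini expects.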
Re: [Moses-support] Tuning for factored phrase based systems
Hieu
Sent while bumping into things

On 1 Dec 2016 07:01, "Angli Liu" wrote:

> Hi, what's the major difference between the tuning process for a factored
> phrase based system (i.e., surface+pos data) and a simple baseline phrase
> based system?

Nothing; tuning just optimises the weights of the feature functions.

If you decompose your translation so that it has multiple phrase tables and generation models, then they are just extra feature functions with weights to be tuned.

> Do I need to organize the dev set the same way as the training set (i.e.,
> surface|pos)?

Yes

> Is there a tutorial on the moses website on this topic?

Maybe this:
http://www.statmt.org/moses/?n=FactoredTraining.FactoredTraining

> Thanks!
>
> -Angli
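To make the "just extra feature functions" remark concrete, here is a minimal sketch of a log-linear model score, the quantity whose weights tuning adjusts. It is an illustration only: every number is made up, each feature is simplified to a single score (real Moses features usually carry several), and the names `model_score`, `baseline`, `factored` and `weights` are hypothetical. The feature labels merely imitate the style used in moses.ini.

```python
from typing import Dict


def model_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Log-linear score: weighted sum of the feature values of one hypothesis."""
    return sum(weights[name] * value for name, value in features.items())


# Made-up log-probability feature values for one translation hypothesis.
baseline = {"LM0": -12.0, "TranslationModel0": -4.0}

# A factored setup adds more tables and a generation model: the scoring
# formula is unchanged, there are just more terms in the sum.
factored = {
    "LM0": -12.0,
    "TranslationModel0": -4.0,
    "TranslationModel1": -3.0,
    "GenerationModel0": -1.0,
}

# Made-up weights; in practice the tuner searches for values like these.
weights = {
    "LM0": 0.5,
    "TranslationModel0": 0.2,
    "TranslationModel1": 0.2,
    "GenerationModel0": 0.1,
}

if __name__ == "__main__":
    print(model_score(baseline, weights))
    print(model_score(factored, weights))
```

The sketch does not perform any tuning; it only shows that a factored system gives the tuner a longer weight vector, not a different procedure.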
[Moses-support] Tuning for factored phrase based systems
Hi, what's the major difference between the tuning process for a factored phrase based system (i.e., surface+pos data) and a simple baseline phrase based system?

Do I need to organize the dev set the same way as the training set (i.e., surface|pos)?

Is there a tutorial on the moses website on this topic?

Thanks!

-Angli