Re: [Moses-support] Incremental training without using incremental GIZA

2013-07-26 Thread Philipp Koehn
Hi, you could just run word alignment on the 50,000 lines, but you will get better performance if you somehow leverage the baseline parallel corpus for word alignment. One way is incremental GIZA++; the other is to re-run everything. You could also try some middle ground of including some of the

Re: [Moses-support] Incremental training without using incremental GIZA

2013-07-26 Thread Elliot K Meyerson
Can I use incremental GIZA++ for the new lines, even though I didn't use it for the baseline? (Does mgiza give me everything inc-giza needs?) If not, I like the idea of just running word alignment on the new lines. Would I need to update any files besides *.A3.final.gz for steps 3+ to run

Re: [Moses-support] Incremental training without using incremental GIZA

2013-07-26 Thread Philipp Koehn
Hi, you do not need incremental GIZA++ for the baseline run, but you need to run it with the HMM alignment models as the final step and store the intermediate files (which you likely have not done). Here is some information: http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc33 -phi On Sat, Jul

[Moses-support] Incremental training without using incremental GIZA

2013-07-25 Thread Elliot K Meyerson
Hello, I have a large phrase-based translation system. Alignment was done with mgiza, and took a few weeks. I now have a small amount of extremely relevant new bitext (~50,000 lines) that I would like to use to augment the model, without having to retrain everything. The new data contains many