Re: [Moses-support] Running Giza++ on subsets of data

2011-06-15 Thread Qin Gao
Yes, MGIZA isn't really "incrementally training"; it only initializes the model parameters with those trained previously, since it does not store sufficient statistics from the previous training. It will give bad performance if 1. You train only model 1 or 2. The incremental data or sub set is really

Re: [Moses-support] Running Giza++ on subsets of data

2011-06-15 Thread Miles Osborne
it is this: > Abby Levenberg, Chris Callison-Burch and Miles Osborne. Stream-based Translation Models for Statistical Machine Translation. NAACL, Los Angeles, USA, 2010. http://homepages.inf.ed.ac.uk/miles/papers/naacl10b.pdf Miles On 15 J

Re: [Moses-support] Running Giza++ on subsets of data

2011-06-15 Thread Miles Osborne
that isn't the expected answer here. i think the OP wants some kind of incremental (re)training. firstly: it is not really possible to guarantee that performance is not degraded when running from subsets up to the full set (compared with just running it on the full set). secondly, you may wish

Re: [Moses-support] Running Giza++ on subsets of data

2011-06-15 Thread Kenneth Heafield
Try using MGIZA: http://geek.kyloo.net/software/doku.php/mgiza:overview On 06/15/11 04:51, Prasanth K wrote: > Hello All, > > I am conducting a series of experiments to build translation systems > using Moses in which the corpus of the current experiment is a subset of > the corpora used in the p

Re: [Moses-support] How to change phrase representation

2011-06-15 Thread Ben Gottesman
Hi, I suggest moving the token-joining step to after the tokenization step. (You need to know where the token boundaries are before you can remove them, and once you remove token boundaries, you don't want to add new ones.) Ben On Tue, Jun 14, 2011 at 2:42 PM, wrote: > > -- Forwarded m
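To illustrate the ordering suggested above (tokenize first, join tokens afterwards), here is a minimal Python sketch. Everything in it is an assumption for illustration only: the regex `tokenize` is a stand-in for whatever tokenizer is actually used in the pipeline, and `join_tokens`, the phrase list, and the `_` joiner are hypothetical names, not part of Moses.

```python
import re

def tokenize(text):
    # Stand-in tokenizer (assumption): separates punctuation from words.
    # In a real pipeline this would be the usual tokenization step.
    return re.findall(r"\w+|[^\w\s]", text)

def join_tokens(tokens, phrases, joiner="_"):
    # Join known multi-token phrases into single tokens, longest match first.
    # Runs AFTER tokenization, so token boundaries are already known and
    # no new boundaries are introduced inside a joined token.
    phrase_set = {tuple(p) for p in phrases}
    max_len = max((len(p) for p in phrase_set), default=1)
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in phrase_set:
                out.append(joiner.join(tokens[i:i + n]))
                i += n
                break
        else:
            # No phrase starts at this position; keep the token as-is.
            out.append(tokens[i])
            i += 1
    return out

tokens = tokenize("I live in New York, USA.")
joined = join_tokens(tokens, [["New", "York"]])
print(" ".join(joined))
```

If the joining were done before tokenization instead, the tokenizer could split the joined unit again, which is the problem the ordering avoids.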

[Moses-support] Code monkey available. Will work for peanuts

2011-06-15 Thread xiaofeng wu
hi -- Xiaofeng Wu CNGL, School of Computing, Dublin City University, Glasnevin, Dublin 9. Email: xiaofen...@computing.dcu.ie Tel: +353 (0)1 700 6727

[Moses-support] OpenMaTrEx: new version and technical report

2011-06-15 Thread Mikel L. Forcada
Dear all, A new version (0.98) of the free/open-source marker-driven example-based machine translation system has been released. This version has been adapted to use more recent versions of Giza++, Moses and IRSTLM, and to compile with g++ 4.4. It may be downloaded from http://www.openmatrex.

[Moses-support] Running Giza++ on subsets of data

2011-06-15 Thread Prasanth K
Hello All, I am conducting a series of experiments to build translation systems using Moses in which the corpus of the current experiment is a subset of the corpora used in the previous experiment. I have started with the Europarl corpora and am likely to repeat this process about 20 times. Unless