Re: [Moses-support] wrong alignment
Here is the score of the Chinese-Arabic system, using mteval-v12.pl:

Evaluation of cn-to-ar translation using:
    src set "test2010" (1 docs, 1000 segs)
    ref set "test2010" (1 refs)
    tst set "test2010" (1 systems)

NIST score = 6.3938  BLEU score = 0.4120 for system "chinese-arabic"

From: mossaghu...@hotmail.com
To: moses-support@mit.edu
Date: Sat, 25 Sep 2010 02:19:00 +0800
Subject: Re: [Moses-support] wrong alignment

Thanks, Miles.

Language model: -order 5 -interpolate -kndiscount -unk
Phrase table training command: -alignment grow-diag-final -reordering msd-bidirectional-fe -mgiza -mgiza-cpus 8

Best regards

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] wrong alignment
Thanks, Miles.

Language model: -order 5 -interpolate -kndiscount -unk
Phrase table training command: -alignment grow-diag-final -reordering msd-bidirectional-fe -mgiza -mgiza-cpus 8

Best regards

> From: mi...@inf.ed.ac.uk
> Date: Fri, 24 Sep 2010 19:09:50 +0100
> Subject: Re: [Moses-support] wrong alignment
> To: mossaghu...@hotmail.com
> CC: moses-support@mit.edu
>
> It is probably more helpful to give the number of sentences you used
> for language model training (and other details, e.g. n-gram order).
>
> But at first glance that looks like a tiny amount of language model
> data; I would expect to see something closer to 2GB or so, depending
> upon representation.
>
> Miles
Re: [Moses-support] wrong alignment
It is probably more helpful to give the number of sentences you used
for language model training (and other details, e.g. n-gram order).

But at first glance that looks like a tiny amount of language model
data; I would expect to see something closer to 2GB or so, depending
upon representation.

Miles

2010/9/24 musa ghurab:
> Thanks, Burger.
>
> Here is some information:
> Language model: 45MB
> Phrase Table: 26MB
> Reordering Model: 36MB
>
> But I'm still waiting for tuning to finish.

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
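For anyone wanting to answer the question above with numbers rather than file sizes, here is a minimal sketch that counts sentences and tokens in the language model training text. It assumes one sentence per line and whitespace tokenization; the function name `corpus_stats` and the file name in the comment are hypothetical, not part of Moses or any LM toolkit.

```python
def corpus_stats(lines):
    """Count non-empty lines (sentences) and whitespace-separated tokens.

    Assumes one sentence per line, already tokenized with spaces.
    """
    sentences = 0
    tokens = 0
    for line in lines:
        words = line.split()
        if words:  # skip blank lines
            sentences += 1
            tokens += len(words)
    return sentences, tokens


# Tiny in-memory example; for a real corpus, pass an open file instead, e.g.
#   with open("lm-train.ar", encoding="utf-8") as handle:   # hypothetical path
#       print(corpus_stats(handle))
sample = ["this is a sentence\n", "\n", "another one\n"]
print(corpus_stats(sample))
```

Reporting these two counts together with the n-gram order gives a clearer picture of the language model than its size on disk, which depends on the representation.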
Re: [Moses-support] wrong alignment
Thanks, Burger.

Here is some information:
Language model: 45MB
Phrase Table: 26MB
Reordering Model: 36MB

But I'm still waiting for tuning to finish.

> From: j...@mitre.org
> To: moses-support@mit.edu
> Date: Fri, 24 Sep 2010 13:40:40 -0400
> Subject: Re: [Moses-support] wrong alignment
>
> The point of Moses is not to get good alignments, but to get good
> translation output. The target language model will help the decoder
> to pick good translations, even if the translation probabilities that
> come out of the alignment do not appear to be ideal. A great deal of
> research effort has been wasted (in my opinion) on getting better
> alignments, without actually achieving better translation.
>
> Have you run the resulting models on a test set? What was the score?
> How big is your language model? More LM data is probably the easiest
> way to make up for what might appear to be poor alignments.
>
> - John D. Burger
> MITRE
Re: [Moses-support] wrong alignment
musa ghurab wrote:

> I trained a Chinese-Arabic system, but many alignments are wrong.
> The same is true of the lexical model, where many words are wrongly
> aligned.
> Here is an example from the lexical model (lex.e2f):

The point of Moses is not to get good alignments, but to get good
translation output. The target language model will help the decoder
to pick good translations, even if the translation probabilities that
come out of the alignment do not appear to be ideal. A great deal of
research effort has been wasted (in my opinion) on getting better
alignments, without actually achieving better translation.

Have you run the resulting models on a test set? What was the score?
How big is your language model? More LM data is probably the easiest
way to make up for what might appear to be poor alignments.

- John D. Burger
MITRE
Re: [Moses-support] wrong alignment
Raphael,

Thank you for your suggestion. That will take more time to do.
Is there any other suggestion?

Best regards

> Subject: Re: [Moses-support] wrong alignment
> From: rpa...@alphacrc.com
> To: mossaghu...@hotmail.com
> Date: Fri, 24 Sep 2010 18:03:53 +0100
>
> Hi
>
> I don't speak Chinese or Arabic, but if I understand your problem
> correctly, it is not really solvable or even supposed to be solved with
> current statistical machine translation systems. Or, if you prefer, the
> current solution is: give more input data, maybe the alignments will be
> better.
>
> A statistical system is not supposed to have only one exact translation
> for each word, or for each phrase. Phrase tables always contain a lot of
> noise, but on the whole, given that bad translations should have bad
> scores, and that the language model should play its role, the
> translation should be ok. In your case, the fact that the correct
> translation doesn't even have a good score is certainly worrying.
>
> If you can find correct word alignments somewhere, you can add them to
> giza's input - there is currently some work on an option to add a
> dictionary as input, for now you can just add them as "monoword
> sentences" to the input corpus. But anyway, giza's output is not
> expected to contain only perfect alignments.
>
> Best regards,
>
> --
> Raphael
>
> On Sat, 2010-09-25 at 00:33 +0800, musa ghurab wrote:
> > Hi
> >
> > I trained a Chinese-Arabic system, but many alignments are wrong.
> > The same is true of the lexical model, where many words are wrongly
> > aligned.
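The "monoword sentences" workaround suggested above can be sketched as follows. This is only an illustration under assumptions: the parallel corpus is held as two equally long lists of lines, the dictionary is a list of (source, target) word pairs, and the function name `append_dictionary_pairs` is hypothetical. The single example pair is the one marked (**) as correct in this thread.

```python
def append_dictionary_pairs(source_lines, target_lines, dictionary):
    """Return copies of both corpus sides with each dictionary pair
    appended as a one-word "sentence" pair, keeping the sides parallel."""
    if len(source_lines) != len(target_lines):
        raise ValueError("parallel corpus sides must have the same length")
    new_source = list(source_lines)
    new_target = list(target_lines)
    for source_word, target_word in dictionary:
        new_source.append(source_word)
        new_target.append(target_word)
    return new_source, new_target


# Placeholder corpus lines stand in for real tokenized sentences.
source = ["src sentence 1", "src sentence 2"]
target = ["tgt sentence 1", "tgt sentence 2"]
dictionary = [("今天", "يوم")]  # pair marked (**) in the original post

new_source, new_target = append_dictionary_pairs(source, target, dictionary)
print(len(new_source), len(new_target))
```

The extended lists would then be written back to the two corpus files before word alignment is rerun. Whether this actually improves the lexical probabilities depends on the data, as the reply above already cautions.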
[Moses-support] wrong alignment
Hi

I trained a Chinese-Arabic system, but many alignments are wrong.
The same is true of the lexical model, where many words are wrongly
aligned.
Here is an example from the lexical model (lex.e2f):

Note: the right translation is marked with (**)

今天 وهنا 0.0009911
今天 ما 0.110
今天 الآن 0.0003424
今天 وقال 0.0001732
今天 الخط 0.0056625
今天 يزالون 0.0046512
今天 تم 0.496
今天 هذا 0.0001187
今天 سابق 0.0004292
今天 يأت 0.0094340
今天 الخاسر 0.0188679
今天 المحلولة 0.200
今天 السبت 0.0096154
今天 أكون 0.0016247
今天 نعلم 0.0003154
今天 ان 0.560
今天 ننطلق 0.020
今天 الظهر 0.0029762
今天 الصباح 0.4434348
今天 مثلما 0.0022779
今天 نفعله 0.0013316
今天 لدينا 0.264
今天 ادلى 0.017
今天 يوم 0.0029304 **
今天 عنها 0.0006026
今天 عالم 0.0075829
今天 برودي 0.0007008
今天 انها 0.819

Any suggestions for solving this problem, please?
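To see where the expected translation falls among the candidates, a listing like the one above can be sorted by probability. The sketch below assumes each line has exactly three whitespace-separated fields (two words and a probability, as shown in the post) and that the hand-added ** marker has been removed first; `parse_lex_line` and `rank_candidates` are hypothetical helper names, and the sample rows are copied from the listing.

```python
def parse_lex_line(line):
    """Split one 'word word probability' line into (word_a, word_b, prob)."""
    word_a, word_b, prob = line.split()
    return word_a, word_b, float(prob)


def rank_candidates(lines, word):
    """Return (candidate, prob) pairs whose first column equals `word`,
    sorted from highest to lowest probability."""
    entries = [parse_lex_line(line) for line in lines if line.strip()]
    matches = [(b, p) for a, b, p in entries if a == word]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Four rows taken from the post, with the ** marker stripped.
SAMPLE = [
    "今天 تم 0.496",
    "今天 يوم 0.0029304",
    "今天 الصباح 0.4434348",
    "今天 انها 0.819",
]

ranking = rank_candidates(SAMPLE, "今天")
for position, (candidate, prob) in enumerate(ranking, start=1):
    print(position, candidate, prob)
```

On these four rows the marked translation ranks last, which matches the concern raised in the post. On the full file, the same ranking shows how many candidates score above the expected one.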