Re: [Moses-support] Sentence mismatch error!

2014-10-10 Thread Philipp Koehn
Hi, these messages are common - no reason to worry. To track down the error, you should look at the first mismatch that has occurred, and check what caused it. There may be tokenization issues, problems with odd unicode characters, etc. -phi On Thu, Oct 9, 2014 at 3:06 PM, Arefeh Kazemi

Re: [Moses-support] Sentence mismatch error!

2014-10-09 Thread Matthias Huck
Hi Arefeh, Have you been able to resolve that issue? Maybe one of your GIZA alignments is flawed, for instance because the GIZA process was terminated before it finished. Did you check that both the standard and the inverse alignment files have the same number of lines? Check it like this: $
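
The command in the message above is cut off in the archive. As a rough illustration of the check it describes (not the command Matthias posted), the following Python sketch compares the line counts of two files. The function names are made up for this example, and the file paths you pass in depend on your own training directory.

```python
import gzip


def count_lines(path):
    """Count newline characters in a file, similar to `wc -l`.

    Files ending in .gz are read through gzip; everything else is
    read as a plain file. Reading in binary avoids decoding errors.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    total = 0
    with opener(path, "rb") as handle:
        # Read in 1 MiB chunks so large alignment files fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            total += chunk.count(b"\n")
    return total


def same_line_count(first_path, second_path):
    """Return (counts_match, first_count, second_count)."""
    first = count_lines(first_path)
    second = count_lines(second_path)
    return first == second, first, second
```

If the two counts differ, one of the GIZA runs most likely stopped early or wrote an incomplete file, which is the situation the message suggests ruling out.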

Re: [Moses-support] Sentence mismatch error!

2014-10-09 Thread Philipp Koehn
Hi, this may also be caused by having too long / empty / length-mismatched sentences when running GIZA. Make sure to run the clean-corpus-n.perl script first. -phi On Thu, Oct 9, 2014 at 10:49 AM, Matthias Huck mh...@inf.ed.ac.uk wrote: Hi Arefeh, Have you been able to resolve that issue?
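
To see which sentence pairs fall into the three categories mentioned above, a small Python sketch like the one below can list them by line number. It is only an illustration and not a replacement for clean-corpus-n.perl. The function name is hypothetical, the length limit of 80 is taken from the figure quoted elsewhere in this thread, and the ratio limit of 9 is an assumption you should adjust to your setup.

```python
def find_problem_pairs(source_lines, target_lines, max_len=80, max_ratio=9.0):
    """Return (line_number, reason) for sentence pairs GIZA may reject.

    Lengths are counted in whitespace-separated tokens.
    max_len and max_ratio are illustrative defaults, not Moses settings
    read from any configuration.
    """
    problems = []
    pairs = zip(source_lines, target_lines)
    for number, (src, tgt) in enumerate(pairs, start=1):
        src_len = len(src.split())
        tgt_len = len(tgt.split())
        if src_len == 0 or tgt_len == 0:
            problems.append((number, "empty"))
        elif src_len > max_len or tgt_len > max_len:
            problems.append((number, "too long"))
        elif src_len / tgt_len > max_ratio or tgt_len / src_len > max_ratio:
            problems.append((number, "length ratio"))
    return problems
```

Running this over the two sides of the corpus that is actually passed to GIZA shows whether any such pairs survived the cleaning step.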

Re: [Moses-support] Sentence mismatch error!

2014-10-09 Thread Arefeh Kazemi
Thanks Philipp. I already cleaned my data set; I'm using EMS. Hi Matthias, thank you for your reply. I only have an alignment in one direction. Why? There are many warning messages in the GIZA log file. Are they related to the problem? (The maximum sentence size is only 80.) 49 WARNING: Model2

[Moses-support] Sentence mismatch error!

2014-10-06 Thread Arefeh Kazemi
Hi, I have re-installed Moses on my system, but I have a problem with the giza - symmetrize step. It produces errors of this type: Sentence mismatch error! Line #501714 Sentence mismatch error! Line #501715 . . . Sentence mismatch error! Line #90 All of my data files are in UTF-8 format and I

[Moses-support] Sentence mismatch error

2010-11-19 Thread marco turchi
Hi, I'm training the syntactic model, but I have some problems when I run inverse giza. I cleaned the data before and after running the Collins parser, but inverse-giza has fertility problems: WARNING: The following sentence pair has source/target sentence length ration more than the maximum