Hi,
these messages are common - no reason to worry.
To track down the error, you should look at the first mismatch
that has occurred, and check what caused it. There may be
tokenization issues, problems with odd Unicode characters,
etc.
-phi
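A minimal Python sketch of that triage, for illustration only. The message pattern is copied from the error lines quoted in this thread; the function names `first_mismatch` and `odd_characters` are hypothetical helpers, not part of Moses or GIZA.

```python
import re
import unicodedata

# Pattern copied from the error lines quoted in this thread.
MISMATCH = re.compile(r"Sentence mismatch error! Line #(\d+)")


def first_mismatch(log_lines):
    """Return the line number named in the first mismatch message, or None."""
    for line in log_lines:
        match = MISMATCH.search(line)
        if match:
            return int(match.group(1))
    return None


def odd_characters(sentence):
    """List characters that often upset tokenizers and aligners.

    Flags every character in a Unicode "C" category (control, format, ...)
    and every separator other than a plain space. Strip the trailing
    newline before calling, since a newline is itself a control character.
    """
    odd = []
    for position, char in enumerate(sentence):
        category = unicodedata.category(char)
        if category.startswith("C") or (category.startswith("Z") and char != " "):
            odd.append((position, f"U+{ord(char):04X}", category))
    return odd
```

With the number returned by `first_mismatch`, you can pull that sentence pair out of the corpus and run `odd_characters` on both sides to see whether a non-breaking space or a zero-width character is hiding in it.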
On Thu, Oct 9, 2014 at 3:06 PM, Arefeh Kazemi
Hi Arefeh,
Have you been able to resolve that issue? Maybe one of your GIZA
alignments is flawed, for instance because the GIZA process was
terminated before it finished. Did you check that both the standard and
the inverse alignment files have the same number of lines?
Check it like this:
$
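The line-count comparison suggested above could also be sketched in Python as follows. This is an assumption-laden illustration: the two paths are whatever your direct and inverse alignment files are called (not filled in here), and `count_lines` and `same_length` are hypothetical helper names. Files ending in `.gz` are read through `gzip`.

```python
import gzip


def count_lines(path):
    """Count the lines in a plain or gzip-compressed file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        return sum(1 for _ in handle)


def same_length(direct_path, inverse_path):
    """Compare line counts; return (equal?, direct count, inverse count)."""
    direct = count_lines(direct_path)
    inverse = count_lines(inverse_path)
    return direct == inverse, direct, inverse
```

If the two counts differ, the shorter file most likely belongs to the GIZA run that was cut off.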
Hi,
this may also be caused by overly long, empty, or length-mismatched
sentences when running GIZA. Make sure to run the clean-corpus-n.perl
script first.
-phi
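To see which sentence pairs fall into those three categories, here is a rough Python sketch. It is not a replacement for clean-corpus-n.perl; the function name `problem_pairs` is hypothetical, the length limit of 80 is the value mentioned elsewhere in this thread, and the ratio limit of 9 is an assumption that you should adjust to your own setup.

```python
def problem_pairs(source_lines, target_lines, max_len=80, max_ratio=9.0):
    """Yield (line_number, reason) for sentence pairs likely to cause trouble.

    Lengths are counted in whitespace-separated tokens. Both thresholds
    are illustrative defaults, not values read from any Moses config.
    """
    pairs = zip(source_lines, target_lines)
    for number, (source, target) in enumerate(pairs, start=1):
        src_len = len(source.split())
        tgt_len = len(target.split())
        if src_len == 0 or tgt_len == 0:
            yield number, "empty"
        elif src_len > max_len or tgt_len > max_len:
            yield number, "too long"
        elif src_len > max_ratio * tgt_len or tgt_len > max_ratio * src_len:
            yield number, "length ratio"
```

Usage: open both sides of the corpus and pass the two file objects, e.g. `list(problem_pairs(open(src_path), open(tgt_path)))`, where `src_path` and `tgt_path` stand for your own files. An empty result suggests the corpus was already cleaned with comparable limits.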
On Thu, Oct 9, 2014 at 10:49 AM, Matthias Huck mh...@inf.ed.ac.uk wrote:
Hi Arefeh,
Have you been able to resolve that issue?
Thanks, Philippe.
I have already cleaned my data set; I'm using EMS.
Hi Matthias,
thank you for your reply.
I only have the alignment in one direction. Why?
There are many warning messages in the GIZA log file. Are they related to
the problem? (The maximum sentence length is only 80.)
49
WARNING: Model2
Hi
I have re-installed Moses on my system, but I have a problem with the
giza-symmetrize step.
It produces errors of this type:
Sentence mismatch error! Line #501714
Sentence mismatch error! Line #501715
.
.
.
Sentence mismatch error! Line #90
all of my data files are in utf8 format and I
Hi,
I'm training the syntactic model, but I have some problems when I run
inverse GIZA.
I clean the data before and after running the Collins parser, but
inverse GIZA has fertility problems:
WARNING: The following sentence pair has source/target sentence length
ration more than
the maximum