Hi,

These warning messages are common; there is no reason to worry.

To track down the error, look at the first mismatch that occurred
and check what caused it. Possible causes include tokenization
issues, problems with odd Unicode characters, etc.
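
A minimal sketch of that check, assuming the output of the symmetrize
step was saved to a log file and that the reported number corresponds to
the sentence pair's line in the training corpus (both are assumptions;
the function names and file names are made up for illustration):

```shell
# Sketch only: the log and corpus file names passed in are assumptions,
# not paths taken from this thread.

# Print the number from the first "Sentence mismatch error! Line #N" in a log.
first_mismatch() {
  awk '/Sentence mismatch error! Line #/ { sub(/.*Line #/, ""); print; exit }' "$1"
}

# Print line N of the source and target corpus files, one after the other.
show_pair() {
  sed -n "${1}p" "$2"
  sed -n "${1}p" "$3"
}

# Example (hypothetical file names):
#   n=$(first_mismatch symmetrize.log)
#   show_pair "$n" corpus.de corpus.en
```

If the first reported number is just past the end of one of the alignment
files, a truncated GIZA run is the more likely cause than the content of
that particular sentence pair.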

-phi

On Thu, Oct 9, 2014 at 3:06 PM, Arefeh Kazemi <arefeh_kaz...@yahoo.com>
wrote:

>
>
> Thanks Philipp,
> I have already cleaned my data set; I'm using EMS.
>
> Hi Matthias,
> thank you for your reply.
> I only have an alignment in one direction. Why?
> There are many warning messages in the GIZA log file. Are they related to
> the problem? (The maximum sentence length is only 80.)
>
> 490000
> WARNING: Model2 viterbi alignment has zero score.
> Here are the different elements that made this alignment probability zero
> 500000
> WARNING: already 41 iterations in hillclimb: 1.95402 1 17 64
> WARNING: already 42 iterations in hillclimb: 1.80881 2 34 8
> WARNING: already 43 iterations in hillclimb: 2.19253 2 66 8
> WARNING: already 44 iterations in hillclimb: 2.61934 1 35 66
> WARNING: already 45 iterations in hillclimb: 1.00471 1 62 64
> WARNING: already 46 iterations in hillclimb: 1.00001 0 62 64
> WARNING: already 41 iterations in hillclimb: 1.12453 2 55 2
> WARNING: already 42 iterations in hillclimb: 1.11522 2 26 2
> WARNING: already 43 iterations in hillclimb: 5.19799 2 30 2
>
>
>
>   On Thursday, October 9, 2014 4:08 PM, Philipp Koehn <pko...@inf.ed.ac.uk>
> wrote:
>
>
> Hi,
>
> This may also be caused by sentences that are too long, empty, or
> length-mismatched when running GIZA. Make sure to run the
> clean-corpus-n.perl script first.
>
> -phi
>
> On Thu, Oct 9, 2014 at 10:49 AM, Matthias Huck <mh...@inf.ed.ac.uk> wrote:
>
> Hi Arefeh,
>
> Have you been able to resolve that issue? Maybe one of your GIZA
> alignments is flawed, for instance because the GIZA process was
> terminated before it finished. Did you check that both the standard and
> the inverse alignment files have the same number of lines?
>
> Check it like this:
>
> $ zcat training/giza.1/de-en.A3.final.gz | wc -l; zcat training/giza-inverse.1/en-de.A3.final.gz | wc -l
> 900000
> 501713
>
> In that case there would be a problem and you'd have to rerun GIZA in
> the inverse direction. If you get the same number of lines and it
> matches what you expect to get from your corpus, then it's a different
> issue and you have to investigate further.
>
> Cheers,
> Matthias
>
>
> On Mon, 2014-10-06 at 03:13 -0700, Arefeh Kazemi wrote:
> > Hi
> > I have re-installed Moses on my system, but I have a problem with the
> > GIZA symmetrize step.
> > It produces errors of this type:
> > Sentence mismatch error! Line #501714
> > Sentence mismatch error! Line #501715
> > .
> > .
> > .
> > Sentence mismatch error! Line #900000
> >
> >
> > All of my data files are in UTF-8 and I have run Moses
> > successfully on these files before.
> >
> >
> > Any suggestions to fix the problem would be appreciated.
> >
> >
> > Regards
> > Arefeh
> >
> >
> >
> >
> > _______________________________________________
> > Moses-support mailing list
> > Moses-support@mit.edu
> > http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.