Re: [Moses-support] bleu-annotation / analysis.perl

2016-03-05 Thread Vincent Nguyen

Thanks Phil.
I figured out the lowercase thing, thanks.

For the short n-grams, this is not exactly what I meant.

Any 1-word sentence will score either 1 on an exact match or 0.8409 on a
mismatch.

Any 2-word sentence will score 1 on an exact match or 0.7598 when only one
word matches.


My point is that this is a twist on the BLEU+ algorithm.
It is supposed to avoid a score of 0 when some n-gram order up to 4 has no
match (without smoothing, a single zero precision would zero out the
geometric mean),

but it distorts the scores of short segments, giving them much too high a score.
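
To make this concrete, here is a small Perl sketch of the smoothing I believe
the script applies (this is my own reconstruction, not the actual code of
bleu_annotation, and all the names are mine): each n-gram precision for
n = 1..4 is smoothed as (correct+1)/(total+1), and the sentence score is the
geometric mean of the four precisions. It reproduces the 0.8409 and 0.7598
values exactly.

use strict;
use warnings;

# Reconstruction for illustration only -- not copied from analysis.perl.
# $correct->[$n] / $total->[$n] are the matched / total n-gram counts of
# one hypothesis sentence against its reference, for n = 1..4 (index 0 unused).
sub smoothed_sentence_bleu {
    my ($correct, $total) = @_;
    my $score = 1;
    for my $n (1 .. 4) {
        # add-one smoothing: a missing or unmatched n-gram order
        # never zeroes out the product
        $score *= ($correct->[$n] + 1) / ($total->[$n] + 1);
    }
    return $score ** 0.25;   # geometric mean over the four orders
}

# 1-word hypothesis vs. 1-word reference, no match:
# p1 = 1/2, p2..p4 = 1 (no such n-grams) => 0.5**0.25 = 0.8409
printf "%.4f\n", smoothed_sentence_bleu([0, 0, 0, 0, 0], [0, 1, 0, 0, 0]);

# 2-word hypothesis vs. 2-word reference, one word matching:
# p1 = 2/3, p2 = 1/2, p3 = p4 = 1 => (1/3)**0.25 = 0.7598
printf "%.4f\n", smoothed_sentence_bleu([0, 1, 0, 0, 0], [0, 2, 1, 0, 0]);

So a completely wrong single-word segment still gets 0.5**0.25 = 0.8409,
which is why I find the scores of very short segments inflated.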

By the way :
The reason I am looking into this is that I am using sentence-level BLEU
to filter some noisy corpora.
For instance, out of 6 million sentences I will keep only those with a
BLEU score > XX, to avoid keeping misaligned segments.
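
What I have in mind is roughly the loop below (a sketch only: the file names,
the threshold value and all identifiers are placeholders of mine, and
sentence_bleu() is the same kind of smoothed sentence-level score as above,
here computed directly from two tokenized strings):

use strict;
use warnings;

my $threshold = 0.5;   # the "XX" above; the value here is arbitrary

# Smoothed sentence-level BLEU between two tokenized strings, with the
# same (correct+1)/(total+1) smoothing per n-gram order.
sub sentence_bleu {
    my ($hyp, $ref) = @_;
    my @h = split /\s+/, $hyp;
    my @r = split /\s+/, $ref;
    my $score = 1;
    for my $n (1 .. 4) {
        my %ref_ngrams;
        my ($correct, $total) = (0, 0);
        # collect reference n-grams (each may be matched at most once)
        for (my $i = 0; $i + $n <= @r; $i++) {
            $ref_ngrams{ join(" ", @r[$i .. $i + $n - 1]) }++;
        }
        # count hypothesis n-grams that also occur in the reference
        for (my $i = 0; $i + $n <= @h; $i++) {
            my $ngram = join(" ", @h[$i .. $i + $n - 1]);
            $total++;
            if ($ref_ngrams{$ngram}) {
                $correct++;
                $ref_ngrams{$ngram}--;
            }
        }
        $score *= ($correct + 1) / ($total + 1);
    }
    return $score ** 0.25;
}

# corpus.hyp / corpus.ref are placeholder names for the two sides being
# compared (one side of the noisy corpus and whatever it is scored against);
# only pairs above the threshold are kept.
open(my $hyp_in, "<", "corpus.hyp") or die $!;
open(my $ref_in, "<", "corpus.ref") or die $!;
while (my $h = <$hyp_in>) {
    my $r = <$ref_in>;
    last unless defined $r;
    chomp($h, $r);
    print "$h\t$r\n" if sentence_bleu($h, $r) > $threshold;
}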




On 04/03/2016 21:59, Philipp Koehn wrote:

Hi,

this BLEU calculation happens in the function bleu_annotation in lines 
224ff in scripts/ems/support/analysis.perl


You could convert the system translation $system and the reference 
translations $REFERENCE[$i] to lowercase (lc) if you prefer that.


The code suggests that n-gram precision for sentences of length < n is 
treated as 100% - which may not be what you want, but it is a 
degenerate case, so how to treat it is somewhat undefined.


-phi

On Sat, Feb 27, 2016 at 6:20 AM, <vngu...@neuf.fr> wrote:


Ok obviously this is a modified bleu+ algorithm, similar to what
sentence-bleu does.
However I believe this is still not right for unigram sentences.





De : "Vincent Nguyen"
Date : 26 févr. 2016 22:21:59
A : moses-support@mit.edu <mailto:moses-support@mit.edu>

    Sujet : Re: [Moses-support] bleu-annotation / analysis.perl



Am I correct in saying that when the sentence length is less than or equal
to 4 tokens, the BLEU score should be 1 for an exact match and 0 when it is
not an exact match?
(by the definition in http://www1.cs.columbia.edu/nlp/sgd/bleu.pdf)


On 26/02/2016 10:02, Vincent Nguyen wrote:
> Hi,
>
> I would like to understand better the analysis.perl script that
> generates the bleu-annotation file.
>
> Is there an easy way to get the uncased BLEU score of each line instead
> of the cased calculation?
> Am I right that this script recomputes its own BLEU score without calling
> the NIST-BLEU or Multi-BLEU external scripts?
>
>
> Also, I find it strange sometimes when there are only one or two words:
>
> Translation / reference / score
> Contents / Content / 0.8409
> Ireland / Irish / 0.8409
> Issuer / Italie / 0.8409
> PT / US / 0.8409
> .
> and so on; any pair of unrelated single words always generates the same
> score of 0.8409.
>
> for 2-grams
> Very strong / Very high / 0.7598
> Public sector / Public Sector / 0.7598
> However : / But : / 0.7598
>
> So, for 2-grams, when only one word is correct, the score is always
> 0.7598.
>
>
> Thanks,
>
> Vincent
>
>




___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] bleu-annotation / analysis.perl

2016-03-04 Thread Philipp Koehn
Hi,

this BLEU calculation happens in the function bleu_annotation in lines
224ff in scripts/ems/support/analysis.perl

You could convert the system translation $system and the reference
translations $REFERENCE[$i] to lowercase (lc) if you prefer that.
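
In other words, something like this (a standalone toy sketch, not a patch
against analysis.perl; it only shows the effect of applying lc to $system
and to each $REFERENCE[$i]):

use strict;
use warnings;

# toy values standing in for one line of system output and its references
my $system    = "Public sector";
my @REFERENCE = ("Public Sector");

# lowercase both sides before any n-grams are counted
$system    = lc($system);
@REFERENCE = map { lc } @REFERENCE;   # i.e. lc on every $REFERENCE[$i]

print "$system\n";         # public sector
print "$REFERENCE[0]\n";   # public sector -- now an exact (uncased) match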

The code suggests that n-gram precision for sentences of length < n is
treated as 100% - which may not be what you want, but it is a degenerate
case, so how to treat it is somewhat undefined.

-phi

On Sat, Feb 27, 2016 at 6:20 AM, <vngu...@neuf.fr> wrote:

> Ok obviously this is a modified bleu+ algorithm, similar to what
> sentence-bleu does.
> However I believe this is still not right for unigram sentences.
>
>
>
> 
>
> De : "Vincent Nguyen"
> Date : 26 févr. 2016 22:21:59
> A : moses-support@mit.edu
>
> Sujet : Re: [Moses-support] bleu-annotation / analysis.perl
>
>
>
> Am I correct in saying that when the sentence length is less than or equal
> to 4 tokens, the BLEU score should be 1 for an exact match and 0 when it is
> not an exact match?
> (by the definition in http://www1.cs.columbia.edu/nlp/sgd/bleu.pdf)
>
>
> On 26/02/2016 10:02, Vincent Nguyen wrote:
> > Hi,
> >
> > I would like to understand better the analysis.perl script that
> > generates the bleu-annotation file.
> >
> > Is there an easy way to get the uncased BLEU score of each line instead
> > of the cased calculation?
> > Am I right that this script recomputes its own BLEU score without calling
> > the NIST-BLEU or Multi-BLEU external scripts?
> >
> >
> > Also, I find it strange sometimes when there are only one or two words:
> >
> > Translation / reference / score
> > Contents / Content / 0.8409
> > Ireland / Irish / 0.8409
> > Issuer / Italie / 0.8409
> > PT / US / 0.8409
> > .
> > and so on; any pair of unrelated single words always generates the same
> > score of 0.8409.
> >
> > for 2-grams
> > Very strong / Very high / 0.7598
> > Public sector / Public Sector / 0.7598
> > However : / But : / 0.7598
> >
> > So, for 2-grams, when only one word is correct, the score is always
> > 0.7598.
> >
> >
> > Thanks,
> >
> > Vincent
> >
> >
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] bleu-annotation / analysis.perl

2016-02-27 Thread vnguyen

Ok obviously this is a modified bleu+ algorithm, similar to what
sentence-bleu does. However I believe this is still not right for unigram
sentences.

From: "Vincent Nguyen"
Date: 26 Feb 2016 22:21:59
To: moses-support@mit.edu
Subject: Re: [Moses-support] bleu-annotation / analysis.perl

Am I correct in saying that when the sentence length is less than or equal
to 4 tokens, the BLEU score should be 1 for an exact match and 0 when it is
not an exact match?
(by the definition in http://www1.cs.columbia.edu/nlp/sgd/bleu.pdf)

On 26/02/2016 10:02, Vincent Nguyen wrote:
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] bleu-annotation / analysis.perl

2016-02-26 Thread Vincent Nguyen

Am I correct in saying that when the sentence length is less than or equal
to 4 tokens, the BLEU score should be 1 for an exact match and 0 when it is
not an exact match?
(by the definition in http://www1.cs.columbia.edu/nlp/sgd/bleu.pdf)


On 26/02/2016 10:02, Vincent Nguyen wrote:
> Hi,
>
> I would like to understand better the analysis.perl script that
> generates the bleu-annotation file.
>
> Is there an easy way to get the uncased BLEU score of each line instead
> of the cased calculation?
> Am I right that this script recomputes its own BLEU score without calling
> the NIST-BLEU or Multi-BLEU external scripts?
>
>
> Also, I find it strange sometimes when there are only one or two words:
>
> Translation / reference / score
> Contents / Content / 0.8409
> Ireland / Irish / 0.8409
> Issuer / Italie / 0.8409
> PT / US / 0.8409
> .
> and so on; any pair of unrelated single words always generates the same score of 0.8409.
>
> for 2-grams
> Very strong / Very high / 0.7598
> Public sector / Public Sector / 0.7598
> However : / But : / 0.7598
>
> So, for 2-grams, when only one word is correct, the score is always
> 0.7598.
>
>
> Thanks,
>
> Vincent
>
>
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] bleu-annotation / analysis.perl

2016-02-26 Thread Vincent Nguyen
Hi,

I would like to understand better the analysis.perl script that
generates the bleu-annotation file.

Is there an easy way to get the uncased BLEU score of each line instead
of the cased calculation?
Am I right that this script recomputes its own BLEU score without calling
the NIST-BLEU or Multi-BLEU external scripts?


Also, I find it strange sometimes when there are only one or two words:

Translation / reference / score
Contents / Content / 0.8409
Ireland / Irish / 0.8409
Issuer / Italie / 0.8409
PT / US / 0.8409
.
and so on; any pair of unrelated single words always generates the same score of 0.8409.

for 2-grams
Very strong / Very high / 0.7598
Public sector / Public Sector / 0.7598
However : / But : / 0.7598

So, for 2-grams, when only one word is correct, the score is always
0.7598.


Thanks,

Vincent


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support