Am I correct in saying that when a sentence is 4 tokens or fewer, the BLEU score should be 1 for an exact match and 0 otherwise? (by the definition in http://www1.cs.columbia.edu/nlp/sgd/bleu.pdf)
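For what it's worth, the constant scores quoted below (0.8409 = 2^(-1/4), 0.7598 = 3^(-1/4)) are exactly what you get from a sentence-level BLEU with add-one smoothing applied to every n-gram order. This is only my guess at what analysis.perl is doing internally (the helper name smoothed_bleu is mine, not from the script), but the numbers line up:

```python
import math
from collections import Counter

def smoothed_bleu(hyp, ref, max_n=4):
    """Sentence-level BLEU with (matches+1)/(total+1) smoothing on every
    n-gram order -- an assumption about analysis.perl, not its actual code."""
    hyp, ref = hyp.split(), ref.split()
    if not hyp:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # clipped n-gram matches, as in standard BLEU
        matches = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = sum(hyp_ngrams.values())
        log_prec += math.log((matches + 1) / (total + 1)) / max_n
    # brevity penalty
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return bp * math.exp(log_prec)

print(smoothed_bleu("Contents", "Content"))        # one word, no match
print(smoothed_bleu("Very strong", "Very high"))   # two words, one match
print(smoothed_bleu("Public sector", "Public Sector"))  # cased comparison
```

A one-word mismatch gives p1 = 1/2 and p2..p4 = 1/1, hence (1/2)^(1/4) = 0.8409; a two-word hypothesis with one correct word gives p1 = 2/3, p2 = 1/2, hence (1/3)^(1/4) = 0.7598. That would explain why all the unrelated short pairs collapse to the same scores, and also why exact matches still score 1.0 even below 4 tokens.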
On 26/02/2016 10:02, Vincent Nguyen wrote:
> Hi,
>
> I would like to understand better the analysis.perl script that
> generates the bleu-annotation file.
>
> Is there an easy way to get the uncased BLEU score of each line instead
> of the cased calculation?
> Am I right that this script recomputes its own BLEU score without calling
> the NIST-BLEU or Multi-BLEU external scripts?
>
> I also find it strange that sometimes, when there is only one or two words:
>
> Translation / reference / score
> Contents / Content / 0.8409
> Ireland / Irish / 0.8409
> Issuer / Italie / 0.8409
> PT / US / 0.8409
> .....
> and so on, two unrelated words will always generate the same 0.8409 score.
>
> For 2-grams:
> Very strong / Very high / 0.7598
> Public sector / Public Sector / 0.7598
> However : / But : / 0.7598
>
> So, for 2-grams, when only one word is correct it will generate a score of
> 0.7598.
>
> Thanks,
>
> Vincent
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support