Hi,
 
I’ve written a BLEU scoring tool called “sacreBLEU” that may be of use to 
people here. The goal is to get people to start reporting WMT-matrix compatible 
scores in their papers (i.e., scoring on detokenized outputs with a fixed 
reference tokenization) so that numbers can be compared directly, in the spirit 
of Rico Sennrich's multi-bleu-detok.pl. The nice part for you is that it 
auto-downloads WMT datasets and makes it so you no longer have to deal with 
SGML. You can install it via pip:
 
    pip3 install sacrebleu
 
For starters, you can use it to easily download datasets:
 
    sacrebleu -t wmt17 -l en-de --echo src > wmt17.en-de.en
    sacrebleu -t wmt17 -l en-de --echo ref > wmt17.en-de.de
 
You don’t need to download the reference, though. You can just score against it 
using sacreBLEU directly. After decoding and detokenizing, try:
 
    cat output.detok.txt | sacrebleu -t wmt17 -l en-de
 
I have tested it against Moses' mteval-v13a.pl, which is the official scoring 
script for WMT, and it produces exactly the same numbers for all 153 WMT17 
system submissions (column BLEU-cased at matrix.statmt.org). For example:
 
    $ cat newstest2017.uedin-nmt.4722.en-de | sacrebleu -t wmt17 -l en-de
    BLEU+case.mixed+lang.en-de+numrefs.1+smooth.exp+test.wmt17+tok.13a+version.1.1.4 = 28.30 59.9/34.0/21.8/14.4 (BP = 1.000 ratio = 1.026 hyp_len = 62873 ref_len = 61287)

This means numbers computed with it are directly comparable across papers. As 
you can see, in addition to the score, it outputs a version string that records 
the exact BLEU parameters used. The output string is compatible with the output 
of multi-bleu.pl, so your old code for parsing the BLEU score out of 
multi-bleu.pl should still work.
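
As a rough sketch of that kind of parsing (this is not part of sacreBLEU 
itself), the score and the version string can be pulled out of the line shown 
above with standard shell tools. It assumes the output arrives on a single 
line in exactly the format of the example; the variable names are made up 
for illustration.

```shell
# Sample output line copied from the example in this message (on one line).
line='BLEU+case.mixed+lang.en-de+numrefs.1+smooth.exp+test.wmt17+tok.13a+version.1.1.4 = 28.30 59.9/34.0/21.8/14.4 (BP = 1.000 ratio = 1.026 hyp_len = 62873 ref_len = 61287)'

# The BLEU score is the first number after the first " = ".
score=$(printf '%s\n' "$line" | sed -E 's/^[^ ]+ = ([0-9.]+) .*/\1/')

# The version string is everything before the first space.
signature=${line%% *}

# Split the version string on "+" and pick out the tokenizer setting.
tokenizer=$(printf '%s\n' "$signature" | tr '+' '\n' | sed -n 's/^tok\.//p')

echo "score=$score tokenizer=$tokenizer"
```

In practice you would replace the hard-coded line with the captured output 
of the scoring command. Checking the version string alongside the score is 
one way to confirm that two numbers were computed with the same settings.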
 
You can also use the tool in a backward-compatible mode with arbitrary 
references, in the same way:
 
    cat output.detok.txt | sacrebleu ref1 [ref2 …]
 
The official code is in sockeye (Amazon’s NMT system):

    github.com/awslabs/sockeye/tree/master/contrib/sacrebleu

I will also likely maintain a clone here:
 
    github.com/mjpost/sacreBLEU
 
matt
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
