Hi,

I tried to run MERT manually with my own decoder. It seems that
mert-moses.pl executes the following two commands for each run.

/home/leona/mosesdecoder/mert/extractor --scconfig case:true \
  --scfile run1.scores.dat --ffile run1.features.dat \
  -r /home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref0,/home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref1,/home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref2,/home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref3 \
  -n run1.best100.out.gz > extract.out 2> extract.err

/home/leona/mosesdecoder/mert/mert -d 14 --scconfig case:true -n 20 \
  --ffile run1.features.dat --scfile run1.scores.dat --ifile run1.init.opt \
  2> mert.log

where extractor outputs the score file (run*.scores.dat) and the feature
file (run*.features.dat), and mert outputs weights.txt and messages on stderr.
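
For repeated manual runs, the two command lines above can be assembled from
a run number and a feature count. The following is only a sketch: the helper
names extractor_cmd and mert_cmd are made up for illustration, the flags and
file-name pattern are copied from the commands above, and redirection of
stdout/stderr is left to the caller.

```python
import shlex

# Assumption: binaries live where the commands above show them.
BIN_DIR = "/home/leona/mosesdecoder/mert"


def extractor_cmd(run, refs, nbest, bin_dir=BIN_DIR):
    """Build the extractor argument list for one run (hypothetical helper)."""
    return [
        f"{bin_dir}/extractor",
        "--scconfig", "case:true",
        "--scfile", f"run{run}.scores.dat",
        "--ffile", f"run{run}.features.dat",
        "-r", ",".join(refs),  # references are passed as one comma-separated argument
        "-n", nbest,
    ]


def mert_cmd(run, dim, bin_dir=BIN_DIR):
    """Build the mert argument list for one run (hypothetical helper)."""
    return [
        f"{bin_dir}/mert",
        "-d", str(dim),
        "--scconfig", "case:true",
        "-n", "20",  # value taken unchanged from the command above
        "--ffile", f"run{run}.features.dat",
        "--scfile", f"run{run}.scores.dat",
        "--ifile", f"run{run}.init.opt",
    ]


if __name__ == "__main__":
    refs = [
        f"/home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref{i}"
        for i in range(4)
    ]
    # Print the commands as copy-pastable shell lines.
    print(shlex.join(extractor_cmd(1, refs, "run1.best100")))
    print(shlex.join(mert_cmd(1, 7)))
```

Building the list in one place keeps the --scfile and --ffile names identical
between the extractor call and the mert call.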

My own decoder uses 7 features as follows. 

head -n1 run1.best100
0 ||| I name is Tanaka 希洛 grams of I want in your place for one a
room . ||| phrase-f2e: -30.924257 lex-f2e: -27.824498 phrase-e2f:
-22.619237 lex-e2f: -19.450964 phrase-penalty: -89.140054 n5gram:
-99.894705 d2gram: 0.000000 ||| -389.853714627


Therefore I adjusted the argument (running mert with -d 7).
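
The feature count can be double-checked directly from an n-best line. This is
a minimal sketch, assuming the fields are separated by "|||" and that, in the
feature field, every token ending in ":" is a label and every other token is
a value, as in the entry shown above. The function name count_features is
made up for illustration.

```python
def count_features(nbest_line):
    """Return the number of numeric feature values in one n-best line."""
    fields = [f.strip() for f in nbest_line.split("|||")]
    if len(fields) < 4:
        raise ValueError("expected: id ||| hypothesis ||| features ||| score")
    # Tokens ending in ':' are treated as labels; the rest as feature values.
    values = [tok for tok in fields[2].split() if not tok.endswith(":")]
    for v in values:
        float(v)  # raises ValueError if a value is not numeric
    return len(values)


line = (
    "0 ||| I name is Tanaka 希洛 grams of I want in your place for one a room . "
    "||| phrase-f2e: -30.924257 lex-f2e: -27.824498 phrase-e2f: -22.619237 "
    "lex-e2f: -19.450964 phrase-penalty: -89.140054 n5gram: -99.894705 "
    "d2gram: 0.000000 ||| -389.853714627"
)
print(count_features(line))  # 7
```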

$ /home/leona/mosesdecoder/mert/extractor --scconfig case:true \
  --scfile run1.scores.dat --ffile run1.features.dat \
  -r /home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref0,/home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref1,/home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref2,/home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref3 \
  -n run1.best100 > extract.out 2> extract.err

Binary write mode is NOT selected
Scorer type: BLEU
Scorer config string: case:true
name: case value: true
Using scorer regularisation strategy: none
Using scorer regularisation window: 0
Using case preservation: 1
Using reference length strategy: closest
Loading reference
from /home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref0
.
Loading reference
from /home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref1
.
Loading reference
from /home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref2
.
Loading reference
from /home72/leona/IWSLT10.zh-en/tuning/reference.tok.2.ref3
.
References loaded : [0] seconds
Data::score_type BLEU
Data::Scorer type from Scorer: BLEU
BleuScorer: 9
ScoreData: number_of_scores: 9
Previous data loaded : [0] seconds
loading nbest from run1.100best
Nbest entries loaded and scored : [1] seconds
Binary write mode is NOT selected
Binary write mode is NOT selected
saving the array into run1.features.dat
saving the array into run1.scores.dat
Stopping... : [1] seconds


$ /home/leona/mosesdecoder/mert/mert -d 7 --scconfig case:true -n 20 \
  --ffile run1.features.dat --scfile run1.scores.dat --ifile run1.init.opt \
  2> mert.log

However, this gives the following error.

Seeding random numbers with system clock 
Scorer config string: case:true
name: case value: true
Using scorer regularisation strategy: none
Using scorer regularisation window: 0
Using case preservation: 1
Using reference length strategy: closest
Data::score_type BLEU
Data::Scorer type from Scorer: BLEU
BleuScorer: 9
ScoreData: number_of_scores: 9
Loading Data from: run1.score.dat and run1.features.dat
loading feature data from run1.features.dat
loading score data from run1.score.dat
Data loaded : [0] seconds
error size mismatch between FeatureData and Scorer
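
Since the message complains about a size mismatch between the feature data and
the score data, one sanity check is to compare how many per-sentence blocks
each file contains. The sketch below rests on an assumption that should be
verified against the actual files: that both are text-mode files in which each
block opens with a header line starting with FEATURES_TXT_BEGIN or
SCORES_TXT_BEGIN. The markers and the paths are parameters, so the file names
reported in the mert log can be passed in directly; count_blocks and
sizes_match are made-up helper names.

```python
import sys


def count_blocks(path, marker):
    """Count lines in `path` that start with `marker` (one per block)."""
    with open(path, encoding="utf-8") as fh:
        return sum(1 for line in fh if line.startswith(marker))


def sizes_match(feature_path, score_path,
                feature_marker="FEATURES_TXT_BEGIN",  # assumed header prefix
                score_marker="SCORES_TXT_BEGIN"):     # assumed header prefix
    """Return (equal?, feature block count, score block count)."""
    n_feat = count_blocks(feature_path, feature_marker)
    n_score = count_blocks(score_path, score_marker)
    return n_feat == n_score, n_feat, n_score


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_sizes.py <feature file> <score file>
    ok, n_feat, n_score = sizes_match(sys.argv[1], sys.argv[2])
    print(f"feature blocks: {n_feat}, score blocks: {n_score}, match: {ok}")
```

A missing file raises FileNotFoundError, which is also useful to know when the
names given to extractor and the names mert tries to load differ.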

Do you have any suggestions?

-- 
Hwidong Na <[email protected]>
KLE lab, POSTECH, KOREA

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
