Hi,

I cannot get the -xml-input flag to work. More precisely, the translation I
provide in the XML markup is ignored (and the markup is discarded).

Translating 'das ist ein <n translation='yoyo'>kleines</n> haus' with the option
-xml-input exclusive, I expected to obtain 'this is a yoyo house', but I get
'this is a small house'. (I also tried the historical 'english' XML attribute.)

Can someone tell me what I am doing wrong, or explain what is going on?

I tried with the sample models discussed in the user guide, p. 21
(http://www.statmt.org/moses/download/sample-models.tgz), and with a model of
mine as well.
I'm using the Cygwin pre-compiled version of Moses 1.0, downloaded on Jan 29th.
BTW, is there a way to have the decoder show its version?

Thank you!

JL


echo 'das ist ein <n translation='yoyo'>kleines</n> haus' | 
/c/moses10/bin/moses -f phrase-model/moses.ini -xml-input exclusive
Defined parameters (per moses.ini or switch):
        config: phrase-model/moses.ini
        input-factors: 0
        lmodel-file: 8 0 3 lm/europarl.srilm.gz
        mapping: T 0
        n-best-list: nbest.txt 100
        ttable-file: 0 0 0 1 phrase-model/phrase-table
        ttable-limit: 10
        weight-d: 1
        weight-l: 1
        weight-t: 1
        weight-w: 0
        xml-input: exclusive
/c/moses10/bin
ScoreProducer: Distortion start: 0 end: 1
ScoreProducer: WordPenalty start: 1 end: 2
ScoreProducer: !UnknownWordPenalty start: 2 end: 3
Loading lexical distortion models...have 0 models
Start loading LanguageModel lm/europarl.srilm.gz : [0.000] seconds
ScoreProducer: LM start: 3 end: 4
Loading the LM will be faster if you build a binary file.
Reading lm/europarl.srilm.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
**The ARPA file is missing <unk>.  Substituting log10 probability -100.000.
**************************************************************************************************
Finished loading LanguageModels : [1.061] seconds
Start loading PhraseTable phrase-model/phrase-table : [1.061] seconds
filePath: phrase-model/phrase-table
ScoreProducer: PhraseModel start: 4 end: 5
Finished loading phrase tables : [1.061] seconds
Start loading phrase table from phrase-model/phrase-table : [1.061] seconds
Reading phrase-model/phrase-table
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Finished loading phrase tables : [1.063] seconds
IO from STDOUT/STDIN
Created input-output object : [1.063] seconds
Translating line 0  in thread id 0x80047030
Translating: das ist ein kleines haus
Line 0: Collecting options took 0.000 seconds
Line 0: Search took 0.002 seconds
this is a small house
BEST TRANSLATION: this is a small house [11111]  [total=-28.923] 
core=(0.000,-5.000,0.000,-27.091,-1.833)
Line 0: Translation took 0.007 seconds total
user    1.045
sys     0.031
VmRSS:     34560 kB




echo 'das ist ein <n english='yoyo'>kleines</n> haus' |
/c/moses10/bin/moses -f phrase-model/moses.ini -xml-input exclusive
Defined parameters (per moses.ini or switch):
        config: phrase-model/moses.ini
        input-factors: 0
        lmodel-file: 8 0 3 lm/europarl.srilm.gz
        mapping: T 0
        n-best-list: nbest.txt 100
        ttable-file: 0 0 0 1 phrase-model/phrase-table
        ttable-limit: 10
        weight-d: 1
        weight-l: 1
        weight-t: 1
        weight-w: 0
        xml-input: exclusive
/c/moses10/bin
ScoreProducer: Distortion start: 0 end: 1
ScoreProducer: WordPenalty start: 1 end: 2
ScoreProducer: !UnknownWordPenalty start: 2 end: 3
Loading lexical distortion models...have 0 models
Start loading LanguageModel lm/europarl.srilm.gz : [0.000] seconds
ScoreProducer: LM start: 3 end: 4
Loading the LM will be faster if you build a binary file.
Reading lm/europarl.srilm.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
**The ARPA file is missing <unk>.  Substituting log10 probability -100.000.
**************************************************************************************************
Finished loading LanguageModels : [1.050] seconds
Start loading PhraseTable phrase-model/phrase-table : [1.050] seconds
filePath: phrase-model/phrase-table
ScoreProducer: PhraseModel start: 4 end: 5
Finished loading phrase tables : [1.050] seconds
Start loading phrase table from phrase-model/phrase-table : [1.051] seconds
Reading phrase-model/phrase-table
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Finished loading phrase tables : [1.052] seconds
IO from STDOUT/STDIN
Created input-output object : [1.052] seconds
Translating line 0  in thread id 0x80047030
Translating: das ist ein kleines haus
Line 0: Collecting options took 0.000 seconds
Line 0: Search took 0.002 seconds
this is a small house
BEST TRANSLATION: this is a small house [11111]  [total=-28.923] 
core=(0.000,-5.000,0.000,-27.091,-1.833)
Line 0: Translation took 0.008 seconds total
user    1.060
sys     0.015
VmRSS:     34560 kB
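
One thing that can be checked independently of Moses is what the shell actually hands to the decoder. In both commands above, the attribute value sits in single quotes inside a single-quoted shell string, so the shell consumes the inner quotes and the decoder sees an unquoted value. The sketch below only prints the two variants; the variable names are illustrative. Whether the decoder needs a quoted attribute value to pick up the translation is an assumption here, not something the logs confirm.

```shell
# Nested single quotes: the inner pair ends and restarts the shell string,
# so the quotes around yoyo never reach the program reading stdin.
nested=$(echo 'das ist ein <n translation='yoyo'>kleines</n> haus')
echo "$nested"

# Double quotes survive unchanged inside a single-quoted shell string.
quoted=$(echo 'das ist ein <n translation="yoyo">kleines</n> haus')
echo "$quoted"
```

If the second form is what the decoder expects (an assumption to verify), piping it to moses with -xml-input exclusive would be the next thing to try.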





For reference, the description of the option I am relying on:

exclusive: Only the XML-specified translation is used for the input phrase. Any
phrases from the phrase table that overlap with that span are ignored.


Jean-Luc Meunier │ Senior Research Engineer │ Xerox Research Centre Europe│ 6 
chemin de Maupertuis 38240 MEYLAN │ +33 (0)4 76 61 50 18
