Hi, yes, the factor separator must be "|".
There are some programs that allow you to specify other separators, but
this option is not consistently available throughout the training /
tuning / decoding pipeline. In case your corpus contains "|", it is
better to replace those occurrences with '&bar;'.

-phi

On Wed, Sep 5, 2012 at 2:15 AM, Tan, Jun <jun....@emc.com> wrote:
> Hi Koehn,
>
> So the factor separator must be "|"?
> I tagged all the data with another tool, whose default separator is "_".
> I also noticed that the separator in the target phrases of the phrase
> table is "|", even though I changed the separator to "_" during training.
> I changed all the separators in the phrase table from "|" to "_", and
> the decoding did work.
>
>
> -----Original Message-----
> From: phko...@gmail.com [mailto:phko...@gmail.com] On Behalf Of Philipp Koehn
> Sent: Wednesday, September 05, 2012 4:22 AM
> To: Tan, Jun
> Cc: moses-support@mit.edu
> Subject: Re: [Moses-support] Dose Moses support binarised translation table
> for factored model?
>
> Hi,
>
> this should be working.
>
> What seems odd to me is that you are using "_" as factor separator, while it
> is standard to use "|". There is no option in processPhraseTable to change
> the separator.
>
> -phi
>
> On Tue, Sep 4, 2012 at 6:15 AM, Tan, Jun <jun....@emc.com> wrote:
>> Hi all,
>>
>> I built a factored model following the guidelines on the Moses web page.
>> To speed up decoding, I’m trying to use the binarised phrase table.
>>
>> Binarisation finished, but when I try to decode with the binarised
>> phrase table, the translation fails: the input and output are the same.
>>
>> Does Moses support binarised translation tables for factored models?
>> Has anybody else run into this issue?
>>
>> Below are the outputs of the decoding process:
>>
>> 1. Decoding with binarised phrase-table:
>>
>> [root@Redhat-252 binarised-model]# echo 'the_DT' |
>> /data/moses/moses-smt-mosesdecoder/bin/moses -f moses.ini
>> Defined parameters (per moses.ini or switch):
>> config: moses.ini
>> distortion-limit: 6
>> factor-delimiter: _
>> input-factors: 0
>> lmodel-file: 0 0 3
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn 0 1 3
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn
>> mapping: 0 T 0
>> ttable-file: 1 0 0,1 5
>> /data/english-chinese_POS_tag/binarised-model/phrase-table
>> ttable-limit: 20
>> weight-d: 0.6
>> weight-l: 0.2500 0.2500
>> weight-t: 0.20 0.20 0.20 0.20 0.20
>> weight-w: -1
>> /data/moses/moses-smt-mosesdecoder/bin
>> Loading lexical distortion models...have 0 models
>> Start loading LanguageModel
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn :
>> [0.001] seconds
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn: line 125679:
>> warning: non-zero probability for <unk> in closed-vocabulary LM
>> Start loading LanguageModel
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn :
>> [7.148] seconds
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn: line 46:
>> warning: non-zero probability for <unk> in closed-vocabulary LM
>> Finished loading LanguageModels : [7.214] seconds
>> Start loading PhraseTable
>> /data/english-chinese_POS_tag/binarised-model/phrase-table : [7.214]
>> seconds
>> filePath: /data/english-chinese_POS_tag/binarised-model/phrase-table
>> Finished loading phrase tables : [7.214] seconds
>> IO from STDOUT/STDIN
>> Created input-output object : [7.214] seconds
>> Translating line 0 in thread id 140249033144064
>> Translating: the
>>
>> reading bin ttable
>> size of OFF_T 8
>> binary phrasefile loaded, default OFF_T: -1
>> Line 0: Collecting options took 0.000 seconds
>> Line 0: Search took 0.000 seconds
>> the
>> BEST TRANSLATION: the_UNK_UNK_UNK [1] [total=-111.439] <<0.000,
>> -1.000, -100.000, -23.206, -26.549, 0.000, 0.000, 0.000, 0.000,
>> 0.000>> 0-0
>> Line 0: Translation took 0.894 seconds total
>>
>> 2. Normal decoding
>>
>> [root@Redhat-252 english-chinese_POS_tag]# echo 'the_DT' |
>> /data/moses/moses-smt-mosesdecoder/bin/moses -f train/model/moses.ini
>> Defined parameters (per moses.ini or switch):
>> config: train/model/moses.ini
>> distortion-limit: 6
>> factor-delimiter: _
>> input-factors: 0
>> lmodel-file: 0 0 3
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn 0 1 3
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn
>> mapping: 0 T 0
>> ttable-file: 0 0 0,1 5
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz
>> ttable-limit: 20
>> weight-d: 0.6
>> weight-l: 0.2500 0.2500
>> weight-t: 0.20 0.20 0.20 0.20 0.20
>> weight-w: -1
>> /data/moses/moses-smt-mosesdecoder/bin
>> Loading lexical distortion models...have 0 models
>> Start loading LanguageModel
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn :
>> [0.000] seconds
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn: line 125679:
>> warning: non-zero probability for <unk> in closed-vocabulary LM
>> Start loading LanguageModel
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn :
>> [4.239] seconds
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn: line 46:
>> warning: non-zero probability for <unk> in closed-vocabulary LM
>> Finished loading LanguageModels : [4.254] seconds
>> Start loading PhraseTable
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz :
>> [4.254] seconds
>> filePath:
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz
>> Finished loading phrase tables : [4.254] seconds
>> Start loading phrase table from
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz :
>> [4.254] seconds
>> Reading
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz
>> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70
>> ---75---80---85---90---95--100
>> **********************************************************************
>> ******************************
>> Finished loading phrase tables : [422.886] seconds
>> IO from STDOUT/STDIN
>> Created input-output object : [422.895] seconds
>> Translating line 0 in thread id 139991742867200
>> Translating: the
>>
>> Line 0: Collecting options took 0.061 seconds
>> Line 0: Search took 0.185 seconds
>> 在
>> BEST TRANSLATION: 在_P [1] [total=-6.025] <<0.000, -1.000, 0.000,
>> -12.496, -9.723, -1.545, -1.590, -2.312, -2.906, 1.000>>
>> Line 0: Translation took 0.247 seconds total
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
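
For anyone who finds this thread with a corpus already tagged as word_TAG: the advice at the top (use "|" as the factor separator and replace literal "|" in the text with '&bar;') could be applied with a small preprocessing script before training. The following is only a rough sketch, not part of Moses. The function names are made up for illustration, and it assumes exactly two factors (surface form and tag) with the tag following the last "_" in each token.

```python
# Hypothetical helper, not a Moses tool: convert "word_TAG" tokens to
# "word|TAG" and escape literal "|" characters as "&bar;".

FACTOR_SEP = "|"      # separator expected throughout the pipeline
BAR_ESCAPE = "&bar;"  # replacement for literal "|" in the corpus


def escape_bar(text: str) -> str:
    """Replace literal '|' so it cannot be mistaken for a factor separator."""
    return text.replace(FACTOR_SEP, BAR_ESCAPE)


def convert_token(token: str, old_sep: str = "_") -> str:
    """Convert one token from word<old_sep>TAG to word|TAG.

    Assumption: the tag follows the LAST occurrence of old_sep, so words
    that contain old_sep themselves (e.g. "new_york_NNP") stay intact.
    """
    word, sep, tag = token.rpartition(old_sep)
    if not sep or not word or not tag:
        # No usable separator found: treat the token as untagged.
        return escape_bar(token)
    return escape_bar(word) + FACTOR_SEP + escape_bar(tag)


def convert_line(line: str, old_sep: str = "_") -> str:
    """Convert every whitespace-separated token on a line."""
    return " ".join(convert_token(tok, old_sep) for tok in line.split())


if __name__ == "__main__":
    print(convert_line("the_DT cat_NN"))  # the|DT cat|NN
```

Both sides of the parallel corpus, and any input sent to the decoder, would need the same treatment so that the factors line up consistently.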