Hi,

yes, the factor separator must be "|".

There are some programs that allow you to specify
other separators, but this option is not consistently
available throughout the training / tuning / decoding
pipeline.
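
For data that was already tagged with "_" by another tool, one way to get
back to the standard separator is to rewrite the tagged text before training.
The following is only a minimal sketch, not part of Moses: the function names
are made up for illustration, and it assumes every token has the form
word_TAG with the tag after the last "_".

```python
def retag_token(token, old_sep="_", new_sep="|"):
    """Turn 'word_TAG' into 'word|TAG', splitting on the last old_sep only.

    Assumption: the tag follows the last old_sep, so a word that itself
    contains "_" (e.g. 'a_b_NN') keeps its inner underscore.
    """
    word, sep, tag = token.rpartition(old_sep)
    if not sep or not word or not tag:
        return token  # no separator, or nothing on one side: leave as-is
    return word + new_sep + tag


def retag_line(line, old_sep="_", new_sep="|"):
    """Apply retag_token to every whitespace-separated token of a line."""
    return " ".join(retag_token(t, old_sep, new_sep) for t in line.split())


if __name__ == "__main__":
    print(retag_line("the_DT big_JJ house_NN"))
```

Any literal "|" in the words should be dealt with before this step, otherwise
it would be read as an additional factor boundary.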

In case your corpus contains "|", it is better to replace
those occurrences with '&bar;'.
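
The replacement itself can be sketched in a few lines. This is a hypothetical
helper, not a Moses script, and it assumes it is run on the raw text before
any factors are attached (afterwards the "|" separators would be escaped too).

```python
def escape_pipes(line, replacement="&bar;"):
    """Replace every literal '|' in a line of raw corpus text.

    Assumption: the line carries no factor annotation yet, so every '|'
    found here is ordinary text and not a factor separator.
    """
    return line.replace("|", replacement)


if __name__ == "__main__":
    print(escape_pipes("a | b"))
```

The same replacement has to be applied to the input at decoding time, so that
the decoder sees the same token the model was trained on.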

-phi

On Wed, Sep 5, 2012 at 2:15 AM, Tan, Jun <jun....@emc.com> wrote:
> Hi Koehn,
>
> So the factor separator must be "|"?
> I tagged all the data with another tool, whose default separator is "_".
> I also noticed that the separator in the target phrases of the phrase table is
> "|", even though I changed the separator to "_" during the training process. I
> changed all the separators in the phrase table from "|" to "_", and the
> decoding did work.
>
>
> -----Original Message-----
> From: phko...@gmail.com [mailto:phko...@gmail.com] On Behalf Of Philipp Koehn
> Sent: Wednesday, September 05, 2012 4:22 AM
> To: Tan, Jun
> Cc: moses-support@mit.edu
> Subject: Re: [Moses-support] Dose Moses support binarised translation table 
> for factored model?
>
> Hi,
>
> this should be working.
>
> What seems odd to me is that you are using "_" as factor separator, while it is 
> standard to use "|". There is no option in processPhraseTable to change the 
> separator.
>
> -phi
>
> On Tue, Sep 4, 2012 at 6:15 AM, Tan, Jun <jun....@emc.com> wrote:
>> Hi all,
>>
>>
>>
>> I built a factored model following the guidelines on the Moses web page. In
>> order to speed up decoding, I’m trying to use the binarised phrase
>> table.
>>
>> The binarisation finished, but when I try to decode with the
>> binarised phrase table, the translation fails: the input and
>> output are the same.
>>
>> Does Moses support binarised translation tables for factored models?
>> Has anybody else run into this issue?
>>
>> Below are the outputs of the decoding process:
>>
>>
>>
>> 1. Decoding with binarised phrase table:
>>
>> [root@Redhat-252 binarised-model]# echo 'the_DT' |
>> /data/moses/moses-smt-mosesdecoder/bin/moses  -f moses.ini
>>
>> Defined parameters (per moses.ini or switch):
>>
>>         config: moses.ini
>>
>>         distortion-limit: 6
>>
>>         factor-delimiter: _
>>
>>         input-factors: 0
>>
>>         lmodel-file: 0 0 3
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn 0 1 3
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn
>>
>>         mapping: 0 T 0
>>
>>         ttable-file: 1 0 0,1 5
>> /data/english-chinese_POS_tag/binarised-model/phrase-table
>>
>>         ttable-limit: 20
>>
>>         weight-d: 0.6
>>
>>         weight-l: 0.2500 0.2500
>>
>>         weight-t: 0.20 0.20 0.20 0.20 0.20
>>
>>         weight-w: -1
>>
>> /data/moses/moses-smt-mosesdecoder/bin
>>
>> Loading lexical distortion models...have 0 models
>>
>> Start loading LanguageModel
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn :
>> [0.001] seconds
>>
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn: line 125679:
>> warning: non-zero probability for <unk> in closed-vocabulary LM
>>
>> Start loading LanguageModel
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn :
>> [7.148] seconds
>>
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn: line 46:
>> warning: non-zero probability for <unk> in closed-vocabulary LM
>>
>> Finished loading LanguageModels : [7.214] seconds
>>
>> Start loading PhraseTable
>> /data/english-chinese_POS_tag/binarised-model/phrase-table : [7.214]
>> seconds
>>
>> filePath: /data/english-chinese_POS_tag/binarised-model/phrase-table
>>
>> Finished loading phrase tables : [7.214] seconds
>>
>> IO from STDOUT/STDIN
>>
>> Created input-output object : [7.214] seconds
>>
>> Translating line 0  in thread id 140249033144064
>>
>> Translating: the
>>
>>
>>
>> reading bin ttable
>>
>> size of OFF_T 8
>>
>> binary phrasefile loaded, default OFF_T: -1
>>
>> Line 0: Collecting options took 0.000 seconds
>>
>> Line 0: Search took 0.000 seconds
>>
>> the
>>
>> BEST TRANSLATION: the_UNK_UNK_UNK [1]  [total=-111.439] <<0.000,
>> -1.000, -100.000, -23.206, -26.549, 0.000, 0.000, 0.000, 0.000,
>> 0.000>> 0-0
>>
>> Line 0: Translation took 0.894 seconds total
>>
>>
>>
>> 2. Normal decoding:
>>
>>
>>
>> [root@Redhat-252 english-chinese_POS_tag]# echo 'the_DT' |
>> /data/moses/moses-smt-mosesdecoder/bin/moses -f train/model/moses.ini
>>
>> Defined parameters (per moses.ini or switch):
>>
>>         config: train/model/moses.ini
>>
>>         distortion-limit: 6
>>
>>         factor-delimiter: _
>>
>>         input-factors: 0
>>
>>         lmodel-file: 0 0 3
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn 0 1 3
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn
>>
>>         mapping: 0 T 0
>>
>>         ttable-file: 0 0 0,1 5
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz
>>
>>         ttable-limit: 20
>>
>>         weight-d: 0.6
>>
>>         weight-l: 0.2500 0.2500
>>
>>         weight-t: 0.20 0.20 0.20 0.20 0.20
>>
>>         weight-w: -1
>>
>> /data/moses/moses-smt-mosesdecoder/bin
>>
>> Loading lexical distortion models...have 0 models
>>
>> Start loading LanguageModel
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn :
>> [0.000] seconds
>>
>> /data/english-chinese_POS_tag/chinese-lm/english-chinese.lm.cn: line 125679:
>> warning: non-zero probability for <unk> in closed-vocabulary LM
>>
>> Start loading LanguageModel
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn :
>> [4.239] seconds
>>
>> /data/english-chinese_POS_tag/chinese-pos-lm/english-chinese.lm.cn: line 46:
>> warning: non-zero probability for <unk> in closed-vocabulary LM
>>
>> Finished loading LanguageModels : [4.254] seconds
>>
>> Start loading PhraseTable
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz :
>> [4.254] seconds
>>
>> filePath:
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz
>>
>> Finished loading phrase tables : [4.254] seconds
>>
>> Start loading phrase table from
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz :
>> [4.254] seconds
>>
>> Reading
>> /data/english-chinese_POS_tag/train/model/phrase-table.0-0,1.gz
>>
>> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70
>> ---75---80---85---90---95--100
>>
>> **********************************************************************
>> ******************************
>>
>> Finished loading phrase tables : [422.886] seconds
>>
>> IO from STDOUT/STDIN
>>
>> Created input-output object : [422.895] seconds
>>
>> Translating line 0  in thread id 139991742867200
>>
>> Translating: the
>>
>>
>>
>> Line 0: Collecting options took 0.061 seconds
>>
>> Line 0: Search took 0.185 seconds
>>
>> 在
>>
>> BEST TRANSLATION: 在_P [1]  [total=-6.025] <<0.000, -1.000, 0.000,
>> -12.496, -9.723, -1.545, -1.590, -2.312, -2.906, 1.000>>
>>
>> Line 0: Translation took 0.247 seconds total
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
