Hi all,

I have two source files (the same sentences, but in a different order), something like this:
File source-d1-d2:

<seg weight-setting="domain1"> 今年 前 两 月 广东 高新技术 产品 出口 37.6亿 美元 </seg>
<seg weight-setting="domain2"> 新华社 广州 3月 16日 电 ( 记者 陈冀 ) 最新 统计 数字 显示 , 今年 1 至 2月 , 广东省 高新技术 产品 出口 37.6亿 美元 , 同比 增长 34.8% , 占 全省 出口 总值 的 25.5% 。 </seg>

and source-d2-d1:

<seg weight-setting="domain2"> 新华社 广州 3月 16日 电 ( 记者 陈冀 ) 最新 统计 数字 显示 , 今年 1 至 2月 , 广东省 高新技术 产品 出口 37.6亿 美元 , 同比 增长 34.8% , 占 全省 出口 总值 的 25.5% 。 </seg>
<seg weight-setting="domain1"> 今年 前 两 月 广东 高新技术 产品 出口 37.6亿 美元 </seg>

Then I run the following commands:

moses -threads 1 -f moses-domain-weights.ini < source-d2-d1 &> translation-d2-d1.log
moses -threads 1 -f moses-domain-weights.ini < source-d1-d2 &> translation-d1-d2.log

The output in translation-d1-d2.log is:

Defined parameters (per moses.ini or switch):
alternate-weight-setting: id=domain1 LexicalReordering0= -0.00467702 0.0430348 0.0889973 0.0352209 0.000507535 0.48658 Distortion0= 0.0042172 LM0= 0.0350763 WordPenalty0= -0.149218 PhrasePenalty0= 0.0575948 TranslationModel0= 0.0122912 0.0178517 0.0438588 0.0208747 UnknownWordPenalty0= 1 id=domain2 LexicalReordering0= -0.0356997 0.0133961 0.00699043 0.760662 0.0874955 0.0278831 Distortion0= 0.00562152 LM0= 0.0105839 WordPenalty0= -0.0239658 PhrasePenalty0= 0.00725696 TranslationModel0= 0.00534787 0.00493235 0.00707 0.00309476 UnknownWordPenalty0= 1 id=domain3 LexicalReordering0= 0.0299658 0.0642291 0.132828 0.0099368 0.000757492 0.0317366 Distortion0= 0.00629415 LM0= 0.0543508 WordPenalty0= -0.340616 PhrasePenalty0= 0.0859599 TranslationModel0= 0.0183444 0.0266436 0.0654589 0.132879 UnknownWordPenalty0= 1 id=domain4 LexicalReordering0= 0.22727 0.0573986 0.465178 0.0491854 0.00318964 -0.00912402 Distortion0= 0.0350783 LM0= 0.0481925 WordPenalty0= 0.00113805 PhrasePenalty0= 0.00989932 TranslationModel0= 0.0232946 0.0316844 0.019047 -0.0203211 UnknownWordPenalty0= 1
config: moses-domain-weights.ini
distortion-limit: 6
feature: UnknownWordPenalty WordPenalty PhrasePenalty PhraseDictionaryOnDisk
name=TranslationModel0 num-features=4 path=/home/userfiltered/filtered.1/phrase-table.0-0.1.1.bin input-factor=0 output-factor=0 LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/userfiltered/filtered.1/reordering-table.1.wbe-msd-bidirectional-fe Distortion KENLM lazyken=0 name=LM0 factor=0 path=/home/userbaseline/lm/lm.lm.1.bin order=5
input-factors: 0
mapping: 0 T 0
threads: 1 distinct v: 0
weight: LexicalReordering0= 0.05546 0.0200971 0.212035 0.0738027 0.050521 0.0982874 Distortion0= 0.00893496 LM0= 0.0642775 WordPenalty0= -0.194264 PhrasePenalty0= 0.0867617 TranslationModel0= 0.0228662 0.0378634 0.0388449 0.0359839 UnknownWordPenalty0= 1
line=UnknownWordPenalty
FeatureFunction: UnknownWordPenalty0 start: 0 end: 0
line=WordPenalty
FeatureFunction: WordPenalty0 start: 1 end: 1
line=PhrasePenalty
FeatureFunction: PhrasePenalty0 start: 2 end: 2
line=PhraseDictionaryOnDisk name=TranslationModel0 num-features=4 path=/home/userfiltered/filtered.1/phrase-table.0-0.1.1.bin input-factor=0 output-factor=0
FeatureFunction: TranslationModel0 start: 3 end: 6
line=LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/userfiltered/filtered.1/reordering-table.1.wbe-msd-bidirectional-fe
FeatureFunction: LexicalReordering0 start: 7 end: 12
Initializing Lexical Reordering Feature..
line=Distortion
FeatureFunction: Distortion0 start: 13 end: 13
line=KENLM lazyken=0 name=LM0 factor=0 path=/home/userbaseline/lm/lm.lm.1.bin order=5
FeatureFunction: LM0 start: 14 end: 14
Loading UnknownWordPenalty0
Loading WordPenalty0
Loading PhrasePenalty0
Loading LexicalReordering0
binary file loaded, default OFF_T: -1
Loading Distortion0
Loading LM0
Loading TranslationModel0
alternate weight setting domain1
alternate weight setting domain2
alternate weight setting domain3
alternate weight setting domain4
Created input-output object : [0.514] seconds
Translating: 今年 前 两 月 广东 高新技术 产品 出口 37.6亿 美元
binary file loaded, default OFF_T: -1
Line 0: Initialize search took 0.529 seconds total
Line 0: Collecting options took 0.026 seconds at moses/Manager.cpp:117
Line 0: Search took 0.455 seconds
the first two months of this year guangdong ' s export of high @-@ tech products 37.6亿 us dollars
BEST TRANSLATION: the first two months of this year guangdong ' s export of high @-@ tech products 37.6亿|UNK|UNK|UNK us dollars [1111111111] [total=-101.595] core=(-100.000,-19.000,5.000,-10.157,-15.979,-14.790,-32.019,-4.474,0.000,0.000,-2.293,0.000,0.000,0.000,-83.558)
Line 0: Decision rule took 0.001 seconds total
Line 0: Additional reporting took 0.001 seconds total
Line 0: Translation took 1.011 seconds total
Translating: 新华社 广州 3月 16日 电 ( 记者 陈冀 ) 最新 统计 数字 显示 , 今年 1 至 2月 , 广东省 高新技术 产品 出口 37.6亿 美元 , 同比 增长 34.8% , 占 全省 出口 总值 的 25.5% 。
Line 1: Initialize search took 0.150 seconds total
Line 1: Collecting options took 0.082 seconds at moses/Manager.cpp:117
Line 1: Search took 1.328 seconds
xinhua news agency , guangzhou on march 16 , xinhua ( reporter chen ji ) , up @-@ to @-@ date on during the january @-@ february period . exports of high @-@ tech products in guangdong 37.6亿 and 34.8 percent of the total export value , accounting for 25.5 per cent throughout the province .
BEST TRANSLATION: xinhua news agency , guangzhou on march 16 , xinhua ( reporter chen ji ) , up @-@ to @-@ date on during the january @-@ february period . exports of high @-@ tech products in guangdong 37.6亿|UNK|UNK|UNK and 34.8 percent of the total export value , accounting for 25.5 per cent throughout the province . [1111111111111111111111111111111111111] [total=-102.559] core=(-100.000,-56.000,21.000,-48.972,-99.568,-54.978,-118.824,-10.884,-0.669,-6.607,-0.423,-0.336,-4.584,-50.000,-200.097)
Line 1: Decision rule took 0.000 seconds total
Line 1: Additional reporting took 0.000 seconds total
Line 1: Translation took 1.560 seconds total
Name:moses VmPeak:2003732 kB VmRSS:41416 kB RSSMax:1854592 kB user:2.884 sys:0.300 CPU:3.184 real:3.189

And the output in translation-d2-d1.log is:

Defined parameters (per moses.ini or switch):
alternate-weight-setting: id=domain1 LexicalReordering0= -0.00467702 0.0430348 0.0889973 0.0352209 0.000507535 0.48658 Distortion0= 0.0042172 LM0= 0.0350763 WordPenalty0= -0.149218 PhrasePenalty0= 0.0575948 TranslationModel0= 0.0122912 0.0178517 0.0438588 0.0208747 UnknownWordPenalty0= 1 id=domain2 LexicalReordering0= -0.0356997 0.0133961 0.00699043 0.760662 0.0874955 0.0278831 Distortion0= 0.00562152 LM0= 0.0105839 WordPenalty0= -0.0239658 PhrasePenalty0= 0.00725696 TranslationModel0= 0.00534787 0.00493235 0.00707 0.00309476 UnknownWordPenalty0= 1 id=domain3 LexicalReordering0= 0.0299658 0.0642291 0.132828 0.0099368 0.000757492 0.0317366 Distortion0= 0.00629415 LM0= 0.0543508 WordPenalty0= -0.340616 PhrasePenalty0= 0.0859599 TranslationModel0= 0.0183444 0.0266436 0.0654589 0.132879 UnknownWordPenalty0= 1 id=domain4 LexicalReordering0= 0.22727 0.0573986 0.465178 0.0491854 0.00318964 -0.00912402 Distortion0= 0.0350783 LM0= 0.0481925 WordPenalty0= 0.00113805 PhrasePenalty0= 0.00989932 TranslationModel0= 0.0232946 0.0316844 0.019047 -0.0203211 UnknownWordPenalty0= 1
config: moses-domain-weights.ini
distortion-limit: 6
feature: UnknownWordPenalty WordPenalty PhrasePenalty PhraseDictionaryOnDisk name=TranslationModel0 num-features=4 path=/home/userfiltered/filtered.1/phrase-table.0-0.1.1.bin input-factor=0 output-factor=0 LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/userfiltered/filtered.1/reordering-table.1.wbe-msd-bidirectional-fe Distortion KENLM lazyken=0 name=LM0 factor=0 path=/home/userbaseline/lm/lm.lm.1.bin order=5
input-factors: 0
mapping: 0 T 0
threads: 1 distinct v: 0
weight: LexicalReordering0= 0.05546 0.0200971 0.212035 0.0738027 0.050521 0.0982874 Distortion0= 0.00893496 LM0= 0.0642775 WordPenalty0= -0.194264 PhrasePenalty0= 0.0867617 TranslationModel0= 0.0228662 0.0378634 0.0388449 0.0359839 UnknownWordPenalty0= 1
line=UnknownWordPenalty
FeatureFunction: UnknownWordPenalty0 start: 0 end: 0
line=WordPenalty
FeatureFunction: WordPenalty0 start: 1 end: 1
line=PhrasePenalty
FeatureFunction: PhrasePenalty0 start: 2 end: 2
line=PhraseDictionaryOnDisk name=TranslationModel0 num-features=4 path=/home/userfiltered/filtered.1/phrase-table.0-0.1.1.bin input-factor=0 output-factor=0
FeatureFunction: TranslationModel0 start: 3 end: 6
line=LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/userfiltered/filtered.1/reordering-table.1.wbe-msd-bidirectional-fe
FeatureFunction: LexicalReordering0 start: 7 end: 12
Initializing Lexical Reordering Feature..
line=Distortion
FeatureFunction: Distortion0 start: 13 end: 13
line=KENLM lazyken=0 name=LM0 factor=0 path=/home/userbaseline/lm/lm.lm.1.bin order=5
FeatureFunction: LM0 start: 14 end: 14
Loading UnknownWordPenalty0
Loading WordPenalty0
Loading PhrasePenalty0
Loading LexicalReordering0
binary file loaded, default OFF_T: -1
Loading Distortion0
Loading LM0
Loading TranslationModel0
alternate weight setting domain1
alternate weight setting domain2
alternate weight setting domain3
alternate weight setting domain4
Created input-output object : [0.514] seconds
Translating: 新华社 广州 3月 16日 电 ( 记者 陈冀 ) 最新 统计 数字 显示 , 今年 1 至 2月 , 广东省 高新技术 产品 出口 37.6亿 美元 , 同比 增长 34.8% , 占 全省 出口 总值 的 25.5% 。
binary file loaded, default OFF_T: -1
Line 0: Initialize search took 0.276 seconds total
Line 0: Collecting options took 0.078 seconds at moses/Manager.cpp:117
Line 0: Search took 1.265 seconds
xinhua news agency , guangzhou on march 16 ( reporter chen ji ) @-@ @-@ to @-@ date on to february , the province this year exports of high @-@ tech products 37.6亿 and 34.8 percent of the total export value , accounting for 25.5 per cent throughout the province .
BEST TRANSLATION: xinhua news agency , guangzhou on march 16 ( reporter chen ji ) @-@ @-@ to @-@ date on to february , the province this year exports of high @-@ tech products 37.6亿|UNK|UNK|UNK and 34.8 percent of the total export value , accounting for 25.5 per cent throughout the province .
[1111111111111111111111111111111111111] [total=-102.552] core=(-100.000,-51.000,22.000,-62.087,-100.147,-58.165,-87.079,-12.522,-0.669,-8.377,-0.298,-0.336,-5.141,-56.000,-197.660)
Line 0: Decision rule took 0.001 seconds total
Line 0: Additional reporting took 0.001 seconds total
Line 0: Translation took 1.619 seconds total
Translating: 今年 前 两 月 广东 高新技术 产品 出口 37.6亿 美元
Line 1: Initialize search took 0.141 seconds total
Line 1: Collecting options took 0.013 seconds at moses/Manager.cpp:117
Line 1: Search took 0.237 seconds
the first two months of this year guangdong ' s export of high @-@ tech products 37.6亿 us dollars
BEST TRANSLATION: the first two months of this year guangdong ' s export of high @-@ tech products 37.6亿|UNK|UNK|UNK us dollars [1111111111] [total=-101.595] core=(-100.000,-19.000,5.000,-10.157,-15.979,-14.790,-32.019,-4.474,0.000,0.000,-2.293,0.000,0.000,0.000,-83.558)
Line 1: Decision rule took 0.000 seconds total
Line 1: Additional reporting took 0.000 seconds total
Line 1: Translation took 0.391 seconds total
Name:moses VmPeak:2003732 kB VmRSS:46228 kB RSSMax:1854384 kB user:2.329 sys:0.268 CPU:2.597 real:2.603

The weights in the ini file are defined as:

[weight]
LexicalReordering0= 0.05546 0.0200971 0.212035 0.0738027 0.050521 0.0982874
Distortion0= 0.00893496
LM0= 0.0642775
WordPenalty0= -0.194264
PhrasePenalty0= 0.0867617
TranslationModel0= 0.0228662 0.0378634 0.0388449 0.0359839
UnknownWordPenalty0= 1

[alternate-weight-setting]
id=domain1
LexicalReordering0= -0.00467702 0.0430348 0.0889973 0.0352209 0.000507535 0.48658
Distortion0= 0.0042172
LM0= 0.0350763
WordPenalty0= -0.149218
PhrasePenalty0= 0.0575948
TranslationModel0= 0.0122912 0.0178517 0.0438588 0.0208747
UnknownWordPenalty0= 1
id=domain2
LexicalReordering0= -0.0356997 0.0133961 0.00699043 0.760662 0.0874955 0.0278831
Distortion0= 0.00562152
LM0= 0.0105839
WordPenalty0= -0.0239658
PhrasePenalty0= 0.00725696
TranslationModel0= 0.00534787 0.00493235 0.00707 0.00309476
UnknownWordPenalty0= 1
id=domain3
LexicalReordering0= 0.0299658 0.0642291 0.132828 0.0099368 0.000757492 0.0317366
Distortion0= 0.00629415
LM0= 0.0543508
WordPenalty0= -0.340616
PhrasePenalty0= 0.0859599
TranslationModel0= 0.0183444 0.0266436 0.0654589 0.132879
UnknownWordPenalty0= 1
id=domain4
LexicalReordering0= 0.22727 0.0573986 0.465178 0.0491854 0.00318964 -0.00912402
Distortion0= 0.0350783
LM0= 0.0481925
WordPenalty0= 0.00113805
PhrasePenalty0= 0.00989932
TranslationModel0= 0.0232946 0.0316844 0.019047 -0.0203211
UnknownWordPenalty0= 1

But for the sentence that uses the domain2 weights, the two runs give me different translations, even though the input sentence and its weight setting are identical in both files. Did I miss anything here?

Thanks,
Jian

--
Jian Zhang
Centre for Next Generation Localisation (CNGL) <http://www.cngl.ie/index.html>
Dublin City University <http://www.dcu.ie/>
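
P.S. One sanity check that may help narrow this down: recompute each reported total from its core=(...) vector under each weight setting, and see which setting reproduces the number in the log. The sketch below is only that, a sketch. It assumes the reported total is the plain dot product of the weights with the core feature values, and that the core values follow the order given by the "FeatureFunction: ... start: N end: M" lines above. The names FEATURE_ORDER, WEIGHTS, HYPOTHESES, flatten and weighted_total are made up for this illustration and are not part of Moses.

```python
# Feature order as given by the "FeatureFunction: ... start/end" log lines
# (index 0 = UnknownWordPenalty0 ... index 14 = LM0).
FEATURE_ORDER = [
    ("UnknownWordPenalty0", 1),
    ("WordPenalty0", 1),
    ("PhrasePenalty0", 1),
    ("TranslationModel0", 4),
    ("LexicalReordering0", 6),
    ("Distortion0", 1),
    ("LM0", 1),
]

# Alternate weight settings copied from moses-domain-weights.ini.
WEIGHTS = {
    "domain1": {
        "UnknownWordPenalty0": [1.0],
        "WordPenalty0": [-0.149218],
        "PhrasePenalty0": [0.0575948],
        "TranslationModel0": [0.0122912, 0.0178517, 0.0438588, 0.0208747],
        "LexicalReordering0": [-0.00467702, 0.0430348, 0.0889973,
                               0.0352209, 0.000507535, 0.48658],
        "Distortion0": [0.0042172],
        "LM0": [0.0350763],
    },
    "domain2": {
        "UnknownWordPenalty0": [1.0],
        "WordPenalty0": [-0.0239658],
        "PhrasePenalty0": [0.00725696],
        "TranslationModel0": [0.00534787, 0.00493235, 0.00707, 0.00309476],
        "LexicalReordering0": [-0.0356997, 0.0133961, 0.00699043,
                               0.760662, 0.0874955, 0.0278831],
        "Distortion0": [0.00562152],
        "LM0": [0.0105839],
    },
}


def flatten(setting):
    """Lay out one weight setting in the same order as the core vector."""
    flat = []
    for name, size in FEATURE_ORDER:
        values = WEIGHTS[setting][name]
        assert len(values) == size, name
        flat.extend(values)
    return flat


def weighted_total(core, setting):
    """Assumed scoring rule: dot product of weights and core feature values."""
    return sum(w * x for w, x in zip(flatten(setting), core))


# (label, weight-setting tag in the input, reported total, core vector),
# all copied from the BEST TRANSLATION lines in the two logs.
HYPOTHESES = [
    ("d1-d2 log, short sentence", "domain1", -101.595,
     (-100.000, -19.000, 5.000, -10.157, -15.979, -14.790, -32.019,
      -4.474, 0.000, 0.000, -2.293, 0.000, 0.000, 0.000, -83.558)),
    ("d1-d2 log, long sentence", "domain2", -102.559,
     (-100.000, -56.000, 21.000, -48.972, -99.568, -54.978, -118.824,
      -10.884, -0.669, -6.607, -0.423, -0.336, -4.584, -50.000, -200.097)),
    ("d2-d1 log, long sentence", "domain2", -102.552,
     (-100.000, -51.000, 22.000, -62.087, -100.147, -58.165, -87.079,
      -12.522, -0.669, -8.377, -0.298, -0.336, -5.141, -56.000, -197.660)),
]

for label, tagged, reported, core in HYPOTHESES:
    for setting in WEIGHTS:
        total = weighted_total(core, setting)
        print(f"{label} (tagged {tagged}): {setting} -> {total:.3f}, "
              f"log reports {reported}")
```

If the setting named in the weight-setting attribute reproduces the reported total for both runs, then the final scoring used the intended weights in each case, and the difference between the two translations of the long sentence would have to come from somewhere earlier in the search rather than from the final score computation.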
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support