Hi Nakul,

Some of your source words are left untranslated because they have no corresponding entries in your phrase table. If moses cannot find a source word in the phrase table, it cannot translate it, and by default it passes the word straight through to the output (these are the tokens marked |UNK in your log). If you don't want unknown words in the output, use the -drop-unknown flag.
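To see in advance which tokens will fall through, the input can be normalised and compared with the source side of the phrase table. The Python below is only a rough sketch of that check, not part of moses: `simple_tokenise` is a crude stand-in for the tokenisation and lowercasing scripts shipped with moses, the function names are made up for illustration, and the toy phrase-table entries (with placeholder targets such as TGT_A) are hypothetical. It assumes the usual `source ||| target ||| scores` layout of the phrase table, and it treats a word as known if it occurs in any source phrase, which only approximates what the decoder does.

```python
import gzip
import re


def simple_tokenise(text):
    """Lowercase the text and split punctuation off words.

    Crude stand-in for the real moses preprocessing scripts.
    """
    return re.findall(r"\w+|[^\w\s]", text.lower())


def source_vocab(phrase_table_lines):
    """Collect every word seen on the source side of 'src ||| tgt ||| ...' lines."""
    vocab = set()
    for line in phrase_table_lines:
        source_phrase = line.split("|||", 1)[0]
        vocab.update(source_phrase.split())
    return vocab


def load_source_vocab(path):
    """Read a gzipped phrase table (e.g. phrase-table.gz) and return its source vocabulary."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return source_vocab(handle)


def unknown_tokens(sentence, vocab):
    """Return the normalised tokens of the sentence that never occur in the vocabulary."""
    return [tok for tok in simple_tokenise(sentence) if tok not in vocab]


if __name__ == "__main__":
    # Hypothetical toy entries; a real table would be loaded with load_source_vocab().
    toy_table = [
        "even ||| TGT_A ||| 0.5",
        "amendments ||| TGT_B ||| 0.5",
        "would ||| TGT_C ||| 0.5",
        ". ||| . ||| 1.0",
    ]
    vocab = source_vocab(toy_table)
    print(unknown_tokens("Even amendments would require recommendation.", vocab))
    # prints ['require', 'recommendation']
```

With this toy table, lowercasing lets "Even" match and tokenisation separates the full stop, but "require" and "recommendation" are still reported because the table has no entry for them at all.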
In particular, the following source tokens are unknown: "Even", "require" and "recommendation." (with the full stop still attached). Your training corpus was lowercased, so a capitalised "Even" or a word with punctuation attached cannot match anything in the phrase table. You may want to lowercase and tokenise the source before passing it to moses. Tuning will not help with this, since it only adjusts the model weights and adds no new entries to the phrase table.

Best regards,
Barry

On Wednesday 06 April 2011 10:47, nakul sharma wrote:
> Hi All,
>
> i am undertaking English to regional language translation.
> i have tuned the corpus using mert-moses.pl using following command:-
>
> ./mert-moses.pl corpus/corpus.lowercased.en corpus/corpus.lowercased.hi
> /home/nakul/moses/mosesdecoder/trunk/moses-cmd/src/moses model/moses.ini
> --working-dir corpus/tuning/mert --rootdir
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/
> --decoder-flags "-v 0" >& mert.out &
>
> the moses.ini, i got after training (at corpus/tuning/mert) has following
> contents :-
>
>
> # MERT optimized configuration
> # decoder /home/nakul/moses/mosesdecoder/trunk/moses-cmd/src/moses
> # BLEU 0.612451 -> 0.612451 on dev
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/corpus/corpus.lowercased.en
>
> # We were before running iteration 8
> # finished बुध अप्रेल 6 14:40:28 IST 2011
> ### MOSES CONFIG FILE ###
> #########################
>
> # input factors
> [input-factors]
> 0
>
> # mapping steps
> [mapping]
> 0 T 0
>
> # translation tables: source-factors, target-factors, number of scores,
> file
> [ttable-file]
> 0 0 0 5
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/model/phrase-table.gz
>
>
> # no generation models, no generation-file section
>
> # language models: type(srilm/irstlm), factors, order, file
> [lmodel-file]
> 0 0 3 /home/nakul/moses/mosesdecoder/trunk/scripts/training/lm_hin.lm
>
>
> # limit on how many phrase translations e for each phrase f are loaded
> # 0 = all elements loaded
> [ttable-limit]
> 20
>
> # distortion (reordering) files
> [distortion-file]
> 0-0 wbe-msd-bidirectional-fe-allff 6
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/model/reordering-table.wbe-msd-bidirectional-fe.gz
>
>
> # distortion (reordering) weight
> [weight-d]
> 0.009645
> 0.021906
> 0.008725
> 0.032902
> 0.019846
> 0.002462
> 0.016001
>
> # language model weights
> [weight-l]
> 0.022668
>
>
> # translation model weights
> [weight-t]
> -0.008632
> 0.019782
> 0.228404
> -0.005989
> -0.499862
>
> # no generation models, no weight-generation section
>
> # word penalty
> [weight-w]
> -0.103176
>
> [distortion-limit]
> 6
>
> [v]
> 0
>
>
> the contents of initial moses.ini is as follows:-
>
> #########################
> ### MOSES CONFIG FILE ###
> #########################
>
> # input factors
> [input-factors]
> 0
>
> # mapping steps
> [mapping]
> 0 T 0
>
> # translation tables: source-factors, target-factors, number of scores,
> file
> [ttable-file]
> 0 0 0 5
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/model/phrase-table.gz
>
>
> # no generation models, no generation-file section
>
> # language models: type(srilm/irstlm), factors, order, file
> [lmodel-file]
> 0 0 3 /home/nakul/moses/mosesdecoder/trunk/scripts/training/lm_hin.lm
>
>
> # limit on how many phrase translations e for each phrase f are loaded
> # 0 = all elements loaded
> [ttable-limit]
> 20
>
> # distortion (reordering) files
> [distortion-file]
> 0-0 wbe-msd-bidirectional-fe-allff 6
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/model/reordering-table.wbe-msd-bidirectional-fe.gz
>
>
> # distortion (reordering) weight
> [weight-d]
> 0.3
> 0.3
> 0.3
> 0.3
> 0.3
> 0.3
> 0.3
>
> # language model weights
> [weight-l]
> 0.5000
>
>
> # translation model weights
> [weight-t]
> 0.2
> 0.2
> 0.2
> 0.2
> 0.2
>
> # no generation models, no weight-generation section
>
> # word penalty
> [weight-w]
> -1
>
> [distortion-limit]
> 6
>
> upon undertaking the translation i get mix of regional lang and english
> words as follows:-
> echo "Even amendments would require recommendation." | TMP=/tmp
> /home/nakul/moses/mosesdecoder/trunk/moses-cmd/src/moses -f
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/corpus/tuning/mert/moses.ini
>
> >output.txt
>
> Defined parameters (per moses.ini or switch):
> config:
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/corpus/tuning/mert/moses.ini
>
> distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/model/reordering-table.wbe-msd-bidirectional-fe.gz
>
> distortion-limit: 6
> input-factors: 0
> lmodel-file: 0 0 3
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/lm_hin.lm
> mapping: 0 T 0
> ttable-file: 0 0 0 5
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/model/phrase-table.gz
>
> ttable-limit: 20
> v: 0
> weight-d: 0.009645 0.021906 0.008725 0.032902 0.019846 0.002462
> 0.016001
> weight-l: 0.022668
> weight-t: -0.008632 0.019782 0.228404 -0.005989 -0.499862
> weight-w: -0.103176
> Loading lexical distortion models...have 1 models
> Creating lexical reordering...
> weights: 0.022 0.009 0.033 0.020 0.002 0.016
> Loading table into memory...done.
> Start loading LanguageModel
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/lm_hin.lm : [1.000]
> seconds
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/lm_hin.lm: line 122:
> warning: non-zero probability for <unk> in closed-vocabulary LM
> Finished loading LanguageModels : [1.000] seconds
> Start loading PhraseTable
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/model/phrase-table.gz
>
> : [1.000] seconds
>
> filePath:
> /home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-scripts/scripts-20110405-1055/training/model/phrase-table.gz
>
> Finished loading phrase tables : [1.000] seconds
> IO from STDOUT/STDIN
> Created input-output object : [1.000] seconds
> Translating: Even amendments would require recommendation.
>
> Collecting options took 0.000 seconds
> Search took 0.000 seconds
> BEST TRANSLATION: संशोधनों सहित Even|UNK|UNK|UNK प्रभाव require|UNK|UNK|UNK
> recommendation.|UNK|UNK|UNK [11111] [total=-301.895] <<-4.000, -6.000,
> -300.000, 0.000, 0.000, -1.022, -0.511, -1.609, 0.000, -48.665, -1.792,
> -2.303, -1.386, -3.624, 2.000>>
> Translation took 0.000 seconds
> Finished translating
> End. : [1.000] seconds
>
>
> Please tell how to improve upon this translation, i have already tuned the
> system but problem presists.

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support