Hi, I'm new to Moses and have been trying to build a model and then use the decoder to translate some test sentences. I used the following command for training:
train-model.perl --root-dir /cygdrive/d/moses/fi-en/fienModel/ --corpus /cygdrive/d/moses/fi-en/temp/clean --f fi --e en --lm 0:3:/cygdrive/d/moses/fi-en/en.irstlm.gz:1

The process ended cleanly with the following moses.ini file:

# input factors
[input-factors]
0

# mapping steps
[mapping]
0 T 0

# translation tables: table type (hierarchical(0), textual (0), binary (1)), source-factors, target-factors, number of scores, file
# OLD FORMAT is still handled for back-compatibility
# OLD FORMAT translation tables: source-factors, target-factors, number of scores, file
# OLD FORMAT a binary table type (1) is assumed
[ttable-file]
0 0 0 5 /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz

# no generation models, no generation-file section

# language models: type(srilm/irstlm), factors, order, file
[lmodel-file]
1 0 2 /cygdrive/d/moses/fi-en/en.irstlm.gz

# limit on how many phrase translations e for each phrase f are loaded
# 0 = all elements loaded
[ttable-limit]
20

# distortion (reordering) weight
[weight-d]
0.6

# language model weights
[weight-l]
0.5000

# translation model weights
[weight-t]
0.2 0.2 0.2 0.2 0.2

# no generation models, no weight-generation section

# word penalty
[weight-w]
-1

[distortion-limit]
6

But the decoding step ends with a segfault; with -v 3 the output is:

Defined parameters (per moses.ini or switch):
        config: /cygdrive/d/moses/fi-en/fienModel/model/moses.ini
        distortion-limit: 6
        input-factors: 0
        lmodel-file: 1 0 2 /cygdrive/d/moses/fi-en/en.irstlm.gz
        mapping: 0 T 0
        ttable-file: 0 0 0 5 /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz
        ttable-limit: 20
        verbose: 100
        weight-d: 0.6
        weight-l: 0.5000
        weight-t: 0.2 0.2 0.2 0.2 0.2
        weight-w: -1
input type is: text input
Loading lexical distortion models...have 0 models
Start loading LanguageModel /cygdrive/d/moses/fi-en/en.irstlm.gz : [0.000] seconds
In LanguageModelIRST::Load: nGramOrder = 2
Loading LM file (no MAP) iARPA
loadtxt()
1-grams: reading 3195 entries
2-grams: reading 13313 entries
3-grams: reading 20399 entries
done
OOV code is 3194
OOV code is 3194
IRST: m_unknownId=3194
creating cache for storing prob, state and statesize of ngrams
Finished loading LanguageModels : [1.000] seconds
About to LoadPhraseTables
Start loading PhraseTable /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz : [1.000] seconds
filePath: /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz
using standard phrase tables
PhraseDictionaryMemory: input=FactorMask<0> output=FactorMask<0>
Finished loading phrase tables : [1.000] seconds
IO from STDOUT/STDIN
Created input-output object : [1.000] seconds
The score component vector looks like this:
Distortion WordPenalty !UnknownWordPenalty LM_2gram PhraseModel_1 PhraseModel_2 PhraseModel_3 PhraseModel_4 PhraseModel_5
Stateless: 1 Stateful: 2
The global weight vector looks like this: 0.600 -1.000 1.000 0.500 0.200 0.200 0.200 0.200 0.200
Translating: istuntokauden uudelleenavaaminen
DecodeStep():
        outputFactors=FactorMask<0>
        conflictFactors=FactorMask<>
        newOutputFactors=FactorMask<0>
Translation Option Collection
Total translation options: 2
Total translation options pruned: 0
translation options spanning from 0 to 0 is 1
translation options spanning from 0 to 1 is 0
translation options spanning from 1 to 1 is 1
translation options generated in total: 2
future cost from 0 to 0 is -100.136
future cost from 0 to 1 is -200.271
future cost from 1 to 1 is -100.136
Collecting options took 0.000 seconds
added hyp to stack, best on stack, now size 1
processing hypothesis from next stack
creating hypothesis 1 from 0 ( ... )
        base score 0.000
        covering 0-0: istuntokauden
        translated as: istuntokauden|UNK|UNK|UNK
        score -100.136 + future cost -100.136 = -200.271
        unweighted feature scores: <<0.000, -1.000, -100.000, -2.271, 0.000, 0.000, 0.000, 0.000, 0.000>>
added hyp to stack, best on stack, now size 1
Segmentation fault (core dumped)

The only suspicious thing I found above is the message 'creating hypothesis 1 from 0', but I don't know whether that is the actual problem, or why it is happening. I believe the problem is with the training step, since the sample models I downloaded from http://www.statmt.org/moses/download/sample-models.tgz work fine. Prior to this, I built an IRST LM and used clean-corpus-n.perl to clean the decoder input. Looking through the archives, the closest message I could find was http://thread.gmane.org/gmane.comp.nlp.moses.user/1478, but I don't think I'm making the same mistake as the author of that message. I'd be delighted if anybody could offer any insight into this problem; I'm happy to provide any further details.

Thanks,
--Sudip.
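P.S. In case it helps anyone reproduce or rule things out: this is roughly how I sanity-check the two model files the decoder loads, i.e. peek at the first phrase-table entries and at the n-gram counts declared in the ARPA header of the LM. (The snippet below uses tiny stand-in files under a temp dir so it runs anywhere; in my setup the real files are the gzipped paths under /cygdrive/d/moses/fi-en/ shown above.)

```shell
# Stand-ins for the real files (real paths: /cygdrive/d/moses/fi-en/...).
tmp=$(mktemp -d)
printf 'istuntokauden ||| session ||| 0.2 0.2 0.2 0.2 2.718\n' | gzip > "$tmp/phrase-table.gz"
printf '\\data\\\nngram 1=3195\nngram 2=13313\nngram 3=20399\n' | gzip > "$tmp/en.irstlm.gz"

# Peek at the first phrase-table entries (layout: source ||| target ||| 5 scores).
zcat "$tmp/phrase-table.gz" | head -3

# Show the n-gram counts/order declared in the ARPA \data\ header.
zcat "$tmp/en.irstlm.gz" | sed -n '/\\data\\/,/^$/p' | grep '^ngram'
```

If the phrase table has well-formed entries and the ARPA header lists the expected orders, that at least narrows things down to how the decoder reads them.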
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support