Hi,

I am a noob at using Moses and have been trying to build a model and then
use the decoder to translate test sentences. I used the following command
for training:

train-model.perl --root-dir /cygdrive/d/moses/fi-en/fienModel/ \
  --corpus /cygdrive/d/moses/fi-en/temp/clean --f fi --e en \
  --lm 0:3:/cygdrive/d/moses/fi-en/en.irstlm.gz:1

The process ended cleanly with the following moses.ini file:

# input factors
[input-factors]
0

# mapping steps
[mapping]
0 T 0

# translation tables: table type (hierarchical(0), textual (0), binary (1)), source-factors, target-factors, number of scores, file
# OLD FORMAT is still handled for back-compatibility
# OLD FORMAT translation tables: source-factors, target-factors, number of scores, file
# OLD FORMAT a binary table type (1) is assumed
[ttable-file]
0 0 0 5 /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz

# no generation models, no generation-file section

# language models: type(srilm/irstlm), factors, order, file
[lmodel-file]
1 0 2 /cygdrive/d/moses/fi-en/en.irstlm.gz


# limit on how many phrase translations e for each phrase f are loaded
# 0 = all elements loaded
[ttable-limit]
20

# distortion (reordering) weight
[weight-d]
0.6

# language model weights
[weight-l]
0.5000


# translation model weights
[weight-t]
0.2
0.2
0.2
0.2
0.2

# no generation models, no weight-generation section

# word penalty
[weight-w]
-1

[distortion-limit]
6
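
As a rough cross-check of the file above, the following sketch (plain Python, not part of Moses; the helper names `read_sections` and `expected_and_given_tm_weights` are made up for illustration) reads the bracketed sections and compares the number of scores declared in [ttable-file] with the number of values listed under [weight-t]. It assumes the new-format [ttable-file] line layout described in the file's own comment, where the fourth field is the number of scores.

```python
def read_sections(text):
    """Group non-blank, non-comment lines under their [section] header."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


def expected_and_given_tm_weights(sections):
    """Return (scores declared in ttable-file, values listed in weight-t)."""
    # Assumption: fields are "type source-factors target-factors num-scores file".
    expected = sum(int(entry.split()[3])
                   for entry in sections.get("ttable-file", []))
    given = len(sections.get("weight-t", []))
    return expected, given


# Excerpt of the moses.ini shown above.
SAMPLE = """\
[ttable-file]
0 0 0 5 /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz

# translation model weights
[weight-t]
0.2
0.2
0.2
0.2
0.2
"""

sections = read_sections(SAMPLE)
expected, given = expected_and_given_tm_weights(sections)
print(f"phrase-table scores declared: {expected}, weight-t values given: {given}")
```

For this file the two counts agree (5 and 5), so the translation model weights at least match the phrase table declaration.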

The decoding step, however, ends with a segfault. The output with -v 3 is:

Defined parameters (per moses.ini or switch):
        config: /cygdrive/d/moses/fi-en/fienModel/model/moses.ini
        distortion-limit: 6
        input-factors: 0
        lmodel-file: 1 0 2 /cygdrive/d/moses/fi-en/en.irstlm.gz
        mapping: 0 T 0
        ttable-file: 0 0 0 5 /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz
        ttable-limit: 20
        verbose: 100
        weight-d: 0.6
        weight-l: 0.5000
        weight-t: 0.2 0.2 0.2 0.2 0.2
        weight-w: -1
input type is: text input
Loading lexical distortion models...have 0 models
Start loading LanguageModel /cygdrive/d/moses/fi-en/en.irstlm.gz : [0.000] seconds
In LanguageModelIRST::Load: nGramOrder = 2
Loading LM file (no MAP)
iARPA
loadtxt()
1-grams: reading 3195 entries
2-grams: reading 13313 entries
3-grams: reading 20399 entries
done
OOV code is 3194
OOV code is 3194
IRST: m_unknownId=3194
creating cache for storing prob, state and statesize of ngrams
Finished loading LanguageModels : [1.000] seconds
About to LoadPhraseTables
Start loading PhraseTable /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz : [1.000] seconds
filePath: /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz
using standard phrase tables
PhraseDictionaryMemory: input=FactorMask<0>  output=FactorMask<0>
Finished loading phrase tables : [1.000] seconds
IO from STDOUT/STDIN
Created input-output object : [1.000] seconds
The score component vector looks like this:
Distortion
WordPenalty
!UnknownWordPenalty
LM_2gram
PhraseModel_1
PhraseModel_2
PhraseModel_3
PhraseModel_4
PhraseModel_5
Stateless: 1    Stateful: 2
The global weight vector looks like this: 0.600 -1.000 1.000 0.500 0.200 0.200 0.200 0.200 0.200
Translating: istuntokauden uudelleenavaaminen

DecodeStep():
        outputFactors=FactorMask<0>
        conflictFactors=FactorMask<>
        newOutputFactors=FactorMask<0>
Translation Option Collection

       Total translation options: 2
Total translation options pruned: 0
translation options spanning from  0 to 0 is 1
translation options spanning from  0 to 1 is 0
translation options spanning from  1 to 1 is 1
translation options generated in total: 2
future cost from 0 to 0 is -100.136
future cost from 0 to 1 is -200.271
future cost from 1 to 1 is -100.136
Collecting options took 0.000 seconds
added hyp to stack, best on stack, now size 1
processing hypothesis from next stack

creating hypothesis 1 from 0 ( ... )
        base score 0.000
        covering 0-0: istuntokauden
        translated as: istuntokauden|UNK|UNK|UNK
        score -100.136 + future cost -100.136 = -200.271
        unweighted feature scores: <<0.000, -1.000, -100.000, -2.271, 0.000, 0.000, 0.000, 0.000, 0.000>>
added hyp to stack, best on stack, now size 1
Segmentation fault (core dumped)
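
As a sanity check on the numbers in this log, the score of hypothesis 1 can be reproduced by hand. The sketch below is plain Python, not Moses code. It assumes that the hypothesis score is the dot product of the global weight vector with the unweighted feature scores, and that both are listed in the order of the score component vector printed above. All values are copied from the log.

```python
# Order taken from "The score component vector looks like this:" in the log.
feature_names = [
    "Distortion", "WordPenalty", "!UnknownWordPenalty", "LM_2gram",
    "PhraseModel_1", "PhraseModel_2", "PhraseModel_3",
    "PhraseModel_4", "PhraseModel_5",
]

# "The global weight vector looks like this: ..."
weights = [0.600, -1.000, 1.000, 0.500, 0.200, 0.200, 0.200, 0.200, 0.200]

# "unweighted feature scores: <<...>>" for hypothesis 1
features = [0.000, -1.000, -100.000, -2.271, 0.000, 0.000, 0.000, 0.000, 0.000]

# Assumption: the hypothesis score is the weighted sum of the feature scores.
score = sum(w * f for w, f in zip(weights, features))

# Future cost quoted on the "score ... + future cost ..." line of the log.
future_cost = -100.136
total = score + future_cost

for name, w, f in zip(feature_names, weights, features):
    print(f"{name:>20}: {w:6.3f} * {f:9.3f} = {w * f:9.4f}")
print(f"score = {score:.4f}, score + future cost = {total:.4f}")
```

Up to the three-decimal rounding used in the log, this agrees with the reported "score -100.136 + future cost -100.136 = -200.271". Nearly all of the cost comes from the unknown-word penalty (-100), with the LM score contributing about -1.14 after weighting.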

The only suspicious thing I found in the output above is the message
'creating hypothesis 1 from 0', but I do not know whether it is the actual
problem or why it appears. I believe the problem lies in the training step,
since the sample models that I downloaded from
http://www.statmt.org/moses/download/sample-models.tgz work fine.

Prior to this, I constructed an IRST LM and used clean-corpus-n.perl for
cleaning the decoder input. Looking at the archives, the closest message I
could find was http://thread.gmane.org/gmane.comp.nlp.moses.user/1478 but I
don't think I'm making the same mistake as the author of that message.

I would be grateful for any insight into this problem, and I am happy to
provide further details if needed.

Thanks,

--Sudip.