Hi all,
I am currently doing some statistical MT research and would like to
focus on factored models. After a struggle to get my corpus (a subset
of Europarl EN/NL) to work with Moses (spacing, correct factor
generation, charset, etc.), Moses is finally able to train a
language model on it. However, when decoding, moses quickly dies
with the following message:
---
Finished loading phrase tables : [14.000] seconds
IO from STDOUT/STDIN
Created input-output object : [14.000] seconds
Translating: ik|Pron|ik loop|V|lop over|Prep|over de|Art|de straat|N|strat .|Punc|.
moses: LanguageModelSRI.cpp:154: virtual float LanguageModelSRI::GetValue(const std::vector<const Word*, std::allocator<const Word*> >&, const void**, unsigned int*) const: Assertion `(*contextFactor[count-1])[factorType] != __null' failed.
Aborted
---
I am using the i386 binaries from the Ubuntu-NLP archive:
- Moses 20080525svn-1nlp3~0gutsy1
- GIZA++ 2.0.20030930gcc41-3nlp1~0gutsy1
- SRILM 1.5.6-1nlp1~0gutsy1
I trained using the following command line:
train-factored-phrase-model.perl \
--root-dir . \
--f nl --e en \
--corpus corpus/euro \
--alignment-factors 0,1,2-0,1,2 \
--translation-factors 1-1+2-2 \
--generation-factors 2,1-0 \
--lm 0:3:corpus/surface.lm:0 \
--lm 1:3:corpus/pos.lm:0 \
--lm 2:3:corpus/stem.lm:0
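
To double-check that every factor scored by a language model is
actually produced by some translation or generation step, the options
above can be read mechanically. The following is only a minimal
sketch in Python, not part of Moses: the helper names are
hypothetical, and it assumes the mapping options have the form
"source-target" steps joined by "+", and that each --lm value is
"factor:order:file:type".

```python
# Sketch only: these helpers are hypothetical and not part of Moses.

def parse_mapping(spec):
    """Parse a spec such as '1-1+2-2' into [(sources, targets), ...]."""
    steps = []
    for step in spec.split("+"):
        src, tgt = step.split("-")
        steps.append(([int(f) for f in src.split(",")],
                      [int(f) for f in tgt.split(",")]))
    return steps


def lm_factors(lm_specs):
    """Return the factor index of each 'factor:order:file:type' spec."""
    return [int(spec.split(":")[0]) for spec in lm_specs]


# Values copied from the training command line above.
translation = parse_mapping("1-1+2-2")
generation = parse_mapping("2,1-0")
lms = [
    "0:3:corpus/surface.lm:0",
    "1:3:corpus/pos.lm:0",
    "2:3:corpus/stem.lm:0",
]

# Collect every output factor that some step produces.
produced = set()
for _, targets in translation + generation:
    produced.update(targets)

# Any LM factor that no step produces would be listed here.
missing = sorted(set(lm_factors(lms)) - produced)
print("output factors produced:", sorted(produced))
print("LM factors never produced:", missing)
```

Under those assumptions the configuration looks consistent on paper:
the two translation steps produce factors 1 and 2, and the generation
step produces factor 0, so all three LM factors are covered.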
I am decoding with the moses.ini generated by the command line above,
without tweaks. The sentence to be decoded is Dutch (nl) and was
prepared by the same factor-producing chain as the nl half of the
corpus.
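
One sanity check on the decoder input is to verify that every token
carries exactly three non-empty factors separated by "|". The
following is a minimal sketch in Python; the function name and the
expected factor count of 3 are assumptions based on the input shown
in the log above, not anything Moses itself provides.

```python
# Sketch only: check_factors is a hypothetical helper, not part of Moses.

def check_factors(line, expected=3, sep="|"):
    """Return (position, token) pairs that have the wrong number of
    factors or contain an empty factor."""
    bad = []
    for pos, token in enumerate(line.split()):
        factors = token.split(sep)
        if len(factors) != expected or any(f == "" for f in factors):
            bad.append((pos, token))
    return bad


# The input sentence as it appears in the decoder log.
sentence = ("ik|Pron|ik loop|V|lop over|Prep|over de|Art|de "
            "straat|N|strat .|Punc|.")
print(check_factors(sentence))
```

An empty list means every token has the expected shape. Running the
same function over each line of the factored corpus files would flag
tokens where the factor chain dropped or emptied a factor.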
Since I am not that much of a code guru, I was hoping someone on this
list would be able to help me. Am I doing something wrong, or is this
a bug?
With kind regards,
Jorik Jonker
MSc student University of Twente, Netherlands