Hi,

I'm still having problems getting a factored model on Europarl data
to run without memory allocation errors. I have now tried a simpler
setup with three decoding steps: two translation steps and one
generation step. Translation is done on lemmas and POS tags, and the
generation step is supposed to create surface forms from target
lemmas and POS tags. Both phrase tables and the reordering table are
binarized, and the LM is also in binary format (IRSTLM). Caching is
disabled. Nevertheless, Moses still fails with a memory allocation
error after about 100 sentences with the non-tuned model.
Surprisingly, the crash happens on a very short sentence of only 8
words. There is nothing obviously unusual about the sentence either
(it starts with a hyphen, but removing it makes no difference).
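
In case it helps, the factor setup looks roughly like this (a sketch
rather than my exact command; I'm assuming factor indices 0 = word,
1 = lemma, 2 = POS, and the paths are abbreviated):

  train-model.perl ... \
    --translation-factors 1-1+2-2 \
    --generation-factors 1,2-0 \
    --decoding-steps t0,t1,g0

which ends up in moses.ini as the decoding path

  [mapping]
  T 0
  T 1
  G 0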

Now I'm wondering whether I did something wrong or whether Europarl
is simply too big for such experiments. What would a typical factored
model look like that can be run on this amount of data? I'm running
on a server with 8 GB of RAM, so memory in itself shouldn't be the
problem.


Another question: would it be possible to run the generation step
(for example lemma+POS --> word) with a second model instead of doing
it in the main decoding step, in the same way the recaser is applied
afterwards? Would that solve my memory problems? And if it is
possible, should I expect noticeably different (worse) results?
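
What I have in mind is a two-pass pipeline along these lines (only a
sketch; the ini file names are made up):

  # pass 1: translate source lemma|POS into target lemma|POS
  moses -f moses.lemma-pos.ini < input.factored > out.lemma-pos

  # pass 2: separate model that only maps lemma|POS to surface
  # forms, run monotonically like the recaser (-dl 0 disables
  # reordering)
  moses -f moses.generation.ini -dl 0 < out.lemma-pos > out.txt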


Thanks in advance for any suggestions,


Jörg

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
