Hi,
In mgiza, I noticed that HMM training is a lot slower with 8 threads
than with 4 threads, and it gets worse still with 16 threads or more.
Multi-threading is the main feature of mgiza, so this defeats its
purpose, to the extent that I would say mgiza is broken. I haven't
tested the higher models yet, but I doubt they will be much better. I
tried linking against tcmalloc, but that does not seem to help.

150,000-sentence corpus, first iteration of HMM:

 2 threads: 683s
 3 threads: 532s
 4 threads: 521s
 6 threads: 632s
 8 threads: 777s
16 threads: 858s
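
To put these numbers in perspective, here is a small sketch that computes the speedup and parallel efficiency of each run relative to the 2-thread run. There is no single-thread measurement, so using 2 threads as the baseline is an assumption, and the helper names are made up for illustration.

```python
# Timings reported above: thread count -> seconds for the first HMM iteration.
timings = {2: 683, 3: 532, 4: 521, 6: 632, 8: 777, 16: 858}

# Assumption: no 1-thread run was measured, so the 2-thread run is the baseline.
BASELINE_THREADS = 2
BASELINE_SECONDS = timings[BASELINE_THREADS]


def speedup(threads):
    """How many times faster than the baseline run (values below 1 mean slower)."""
    return BASELINE_SECONDS / timings[threads]


def efficiency(threads):
    """Measured speedup divided by the ideal speedup for that thread count."""
    ideal = threads / BASELINE_THREADS
    return speedup(threads) / ideal


for n in sorted(timings):
    print(f"{n:2d} threads: {timings[n]:4d}s  "
          f"speedup {speedup(n):.2f}x  efficiency {efficiency(n):.0%}")
```

By this measure, 4 threads are only about 1.31x faster than 2 threads, and the 16-thread run is slower than the 2-thread run (a speedup of roughly 0.80x).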

This is quite horrible, at least on my system.
I will try to look into what is going on there (I dread the code), but
if anyone has ideas, has seen the same thing, or wants to help, that
would be most welcome. Model 1 seems fine: its speed improves with more
threads, but the other models are off. Also, htop shows that mgiza
spends most of its time in kernel mode (red bars instead of green) once
the HMM iterations start.
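
For anyone who wants to put a number on the kernel-mode observation, here is a rough sketch that runs a command and reports how its CPU time splits between user and system time. It uses Python's standard `resource` module (Unix only). The default command is just a placeholder, not an mgiza invocation, and `cpu_split` is a made-up helper name.

```python
import resource
import subprocess
import sys


def cpu_split(cmd):
    """Run cmd to completion and return (user_seconds, system_seconds) it used."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(cmd, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return (after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


if __name__ == "__main__":
    # Placeholder workload; pass the real command line as arguments instead.
    cmd = sys.argv[1:] or [sys.executable, "-c", "sum(range(10**6))"]
    user, system = cpu_split(cmd)
    total = user + system
    share = system / total if total else 0.0
    print(f"user {user:.2f}s  system {system:.2f}s  "
          f"system share {share:.0%}")
```

A system share that grows with the thread count would be consistent with what htop shows, though it would not by itself say where in the code the time is going.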

Best, 

Marcin 
 
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
