On Jun 15, 2012, at 11:26 AM, Daniels, Marcus G wrote:

>> Were there any clues in /var/log/messages or dmesg?
> 
> Thanks.  I found a suggestion from Nathan Hjelm to add "options mlx4_core 
> log_mtts_per_seg=X" (where X is 5 in my case).  
> Offline suggestions (which also included that) were also to add "--mca 
> mpi_leave_pinned 0" to the mpirun line and to double-check my locked memory 
> limits.

Setting leave_pinned to 0 will likely decrease your overall registered memory 
usage, but only over time.  If you're not making it through MPI_INIT, then 
setting leave_pinned to 0 won't help.
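
If you do want to try it anyway, the mpirun line would look something like this 
(the hostfile and executable names below are just placeholders):

    mpirun --mca mpi_leave_pinned 0 -npernode 48 -hostfile myhosts ./my_app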

> The only thing I find works reliably is to use "-npernode 32" instead of 
> "-npernode 48".  Unfortunately, my system has 48 processors per node.

Well, that's a bummer.  You've somehow got a restriction on how much memory you 
can register.  You probably want to check with your IB vendor for further advice 
here.
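
Before going back to the vendor, it might be worth double-checking the two knobs 
your offline suggestions already covered.  Roughly (exact file names and values 
below are examples, not a prescription):

    # locked-memory limit as seen on the compute nodes -- should be "unlimited"
    ulimit -l

    # current MTT setting, if the mlx4_core module exposes it
    cat /sys/module/mlx4_core/parameters/log_mtts_per_seg

    # persistent setting, e.g. in /etc/modprobe.d/mlx4_core.conf; reload the driver afterwards
    options mlx4_core log_mtts_per_seg=5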

One other thing you might want to try is to change all of Open MPI's receive 
queues to SRQ (shared receive queues), as opposed to PP (per-peer).  See this FAQ item:

    http://www.open-mpi.org/faq/?category=openfabrics#ib-receive-queues

FWIW, in my regression testing, I run this set of receive queues as one of my tests:

   --mca btl_openib_receive_queues S,128,256:S,2048,256:S,12288,256:S,65536,256

You may want to tweak these values to fit your applications, etc.
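
In practice that just means adding the parameter to your existing mpirun line, 
e.g. (the executable name is a placeholder):

    mpirun -npernode 48 \
        --mca btl_openib_receive_queues S,128,256:S,2048,256:S,12288,256:S,65536,256 \
        ./my_app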

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

