On Jun 15, 2012, at 11:26 AM, Daniels, Marcus G wrote:

>> Were there any clues in /var/log/messages or dmesg?
>
> Thanks. I found a suggestion from Nathan Hjelm to add "options mlx4_core
> log_mtts_per_seg=X" (where X is 5 in my case).
>
> Offline suggestions (which also included that) were to also add "--mca
> mpi_leave_pinned 0" to the mpirun line and to double-check my locked
> memory limits.
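For concreteness, the offline suggestions quoted above might be applied roughly like this. This is only a sketch: the /etc/modprobe.d path is the usual location for module options but can differ per distro, the value 5 is taken from the quote, and the driver must be reloaded (or the node rebooted) before the option takes effect.

```shell
# 1) Module option so the mlx4 driver can register more memory.
#    This line would be appended to /etc/modprobe.d/mlx4_core.conf
#    (path assumed); shown here rather than written, since editing
#    modprobe config requires root and a driver reload.
mtt_opt="options mlx4_core log_mtts_per_seg=5"
echo "$mtt_opt"

# 2) Double-check the locked-memory limit for the launching shell;
#    InfiniBand registered memory wants "unlimited" (or a very large
#    value) here.
ulimit -l || true
```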
Setting leave_pinned to 0 will likely decrease your overall registered memory usage, but only over time. If you're not making it through MPI_INIT, then setting leave_pinned to 0 won't help.

> The only thing I find works reliably is to use "-npernode 32" instead of
> "-npernode 48". Unfortunately my system has 48 processors per node.

Well, that's a bummer. You've somehow got some restrictions on how much registered memory you can set. You probably want to check with your IB vendor for further advice here.

One other thing you might want to try is to change Open MPI's receive queues to all be SRQ (as opposed to PP). See this FAQ item:

    http://www.open-mpi.org/faq/?category=openfabrics#ib-receive-queues

FWIW, in my regression testing, I run this set of RQs as one of my tests:

    --mca btl_openib_receive_queues S,128,256:S,2048,256:S,12288,256:S,65536,256

You may want to tweak these values to fit your application, etc.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
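Putting the pieces from this thread together, a full launch line might look like the sketch below. Assumptions: "./my_app" is a placeholder executable name, "-npernode 48" reflects the poster's node size, and the queue sizes are the regression-test values above, not tuned recommendations.

```shell
# Build the mpirun invocation as a string so it can be inspected
# without an MPI installation; ./my_app is a placeholder.
launch='mpirun -npernode 48 \
  --mca mpi_leave_pinned 0 \
  --mca btl_openib_receive_queues S,128,256:S,2048,256:S,12288,256:S,65536,256 \
  ./my_app'
echo "$launch"
```

Each "S,<size>,<count>" entry requests a shared receive queue (SRQ) of buffers of that byte size, replacing the default per-peer (PP) queues; the FAQ item linked above describes the full syntax.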