Dear Gus,

Thanks for your help - your clue solved my problem!

The ultimate solution was to limit MPI communications to the local,
unrouted subnet. I made this the default behavior for all users of my
cluster by adding the following line to the bottom of my
$prefix/etc/openmpi-mca-params.conf file:

    btl_tcp_if_include = 10.0.0.0/8

Thanks again - what a relief!

Jed

On Fri, Jul 5, 2013, at 01:25 AM, Gustavo Correa wrote:
> Hi Jed
>
> You could try to select only the ethernet interfaces that match your
> nodes' IP addresses, which seems to be en2.
>
> The en1 interface seems to be an external IP.
> Not sure about en3, but it is awkward that it has a
> different IP than en2, but in the same subnet.
> I wonder if this may be the reason for the program hanging.
>
> You may need to search the ifconfig output on all nodes for a
> consistent set of interfaces/IP addresses,
> and tailor your mpiexec command line and your hostfile accordingly.
>
> Say, something like this:
>
> mpiexec -mca btl_tcp_if_include en2 -hostfile your_hostfile -np 43 ./ring_c
>
> See this FAQ (actually, all of them are very informative):
> http://www.open-mpi.org/faq/?category=tcp#tcp-selection
>
> I hope this helps,
> Gus Correa
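
A note for anyone reading this thread later: the value 10.0.0.0/8 is CIDR notation, so it covers every address whose first octet is 10. The small Python sketch below is not part of Open MPI; it only illustrates which interface addresses fall inside such a subnet. The helper name in_subnet and the sample addresses are made up for illustration, so substitute the addresses that ifconfig reports on your own nodes.

```python
import ipaddress


def in_subnet(address: str, cidr: str = "10.0.0.0/8") -> bool:
    """Return True if the IPv4/IPv6 address lies inside the CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    # Hypothetical interface addresses, NOT taken from the original thread.
    sample_addresses = {
        "en1": "192.168.1.5",
        "en2": "10.1.0.12",
        "en3": "172.16.0.9",
    }
    for name, addr in sample_addresses.items():
        status = "inside" if in_subnet(addr) else "outside"
        print(f"{name} ({addr}) is {status} 10.0.0.0/8")
```

Checking each node's addresses this way before editing the parameter file may help confirm that every node has at least one interface inside the chosen subnet.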