On Oct 4, 2007, at 3:06 PM, Jinhui Qin wrote:
sib:sharcnet$ mpirun -n 3 ~/openMPI_stuff/Hello

Process 0.1.1 is unable to reach 0.1.2 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.


This is very odd -- it looks like two of the processes don't think they can talk to each other. Can you try running with:

  mpirun -n 3 -mca btl tcp,self <app>

If that fails, the next most useful piece of information is the IP address and netmask of every node in your cluster. Our TCP communication code contains some logic that can produce surprising results on some network topologies.
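
One way to collect that information is sketched below. It is only a sketch: the helper name inet_lines is made up for illustration, and it assumes each node has either ifconfig or ip installed. Run it on every node and send the output.

```shell
# Print only the IPv4 lines (address and netmask) from an interface listing.
# Reads stdin, so it works with either `ifconfig -a` or `ip -4 addr show`.
# (inet_lines is a hypothetical helper name, not part of Open MPI.)
inet_lines() {
  grep -E '^[[:space:]]*inet '
}

# Run on every node; fall back to `ip` where ifconfig is not installed.
{ ifconfig -a 2>/dev/null || ip -4 addr show 2>/dev/null; } | inet_lines || true
```

The filter keeps the "inet" lines and drops the "inet6" ones, since only the IPv4 addresses and netmasks are needed here.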

Just to verify it's not an XGrid problem, you might want to try running with a hostfile -- I think you'll find that the results are the same, but it's always good to verify.
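
A hostfile run might look roughly like the following. The host names node01 and node02, the slot counts, and the file name my_hostfile are placeholders, so substitute the real names of your machines.

```shell
# Write a small hostfile: one line per machine, with the number of
# processes it may run. node01/node02 are placeholder names.
cat > my_hostfile <<'EOF'
node01 slots=2
node02 slots=1
EOF

# Launch the same 3-process job, this time naming the hosts explicitly.
mpirun --hostfile my_hostfile -n 3 -mca btl tcp,self ~/openMPI_stuff/Hello
```

If the same "unable to reach" message appears with the hostfile, that would suggest the problem is not specific to XGrid.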

Brian
