Thanks for the quick response ... I've been thinking about this today and tried 
a few things on my minimally connected CentOS cluster ...

To use the tcp BTL I would have to set up a bridge on A with ib0 and ib1 
participating in it; then the tcp BTL could be used as you suggest.  
Unfortunately, the obvious solution, bridge-utils on CentOS, does not support 
InfiniBand adapters.
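
For the archives, here is the sort of invocation I'm assuming would work once 
the two links are joined at the IP level (the interface names, hostnames, and 
application name below are placeholders, and btl_tcp_if_include should list 
only interfaces that actually exist on each node):

    # force the tcp BTL and point it at the IPoIB interfaces (hypothetical names)
    mpirun --mca btl tcp,self,sm \
           --mca btl_tcp_if_include ib0,ib1 \
           --host pg-A,pg-B,pg-C \
           ./my_mpi_app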

This is now straying from MPI into a networking issue ... any ideas on bridging 
at the IP-over-IB tier in a cluster would be greatly appreciated.  This must be 
a solved problem, but I'm not having much luck with Google or the list 
archives.
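
In case it helps anyone searching later: my understanding is that IPoIB 
interfaces can't be enslaved to a standard Linux bridge because they carry no 
Ethernet framing, which would explain the bridge-utils limitation.  A layer-3 
alternative is to let A route between the two links instead.  A rough sketch, 
with made-up addresses (A = 10.0.1.1 on ib0 and 10.0.2.1 on ib1, B on 
10.0.1.0/24, C on 10.0.2.0/24):

    # on A: allow forwarding between ib0 and ib1
    sysctl -w net.ipv4.ip_forward=1

    # on B: reach C's subnet through A
    ip route add 10.0.2.0/24 via 10.0.1.1 dev ib0

    # on C: reach B's subnet through A
    ip route add 10.0.1.0/24 via 10.0.2.1 dev ib0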

Paul Monday



On Nov 22, 2010, at 7:46 AM, Terry Dontje wrote:

> You're gonna have to use a protocol that can route through a machine; OFED 
> user verbs (i.e., openib) does not do this.  The only way I know of to do 
> this via OMPI is with the tcp BTL.
> 
> --td
> 
> On 11/22/2010 09:28 AM, Paul Monday (Parallel Scientific) wrote:
>> 
>> We've been using Open MPI in a switched environment with success, but we've 
>> moved to a point-to-point environment to do some work.  Some of the nodes 
>> cannot talk directly to one another, roughly like this, with computers A, B, 
>> and C, where A has two ports: 
>> 
>> A(1)(opensm)------>B 
>> A(2)(opensm)------>C 
>> 
>> B is not connected to C in any way. 
>> 
>> When we try to run our OpenMPI program we are receiving: 
>> At least one pair of MPI processes are unable to reach each other for 
>> MPI communications.  This means that no Open MPI device has indicated 
>> that it can be used to communicate between these processes.  This is 
>> an error; Open MPI requires that all MPI processes be able to reach 
>> each other.  This error can sometimes be the result of forgetting to 
>> specify the "self" BTL. 
>> 
>>   Process 1 ([[1581,1],5]) is on host: pg-B 
>>   Process 2 ([[1581,1],0]) is on host: pg-C 
>>   BTLs attempted: openib self sm 
>> 
>> Your MPI job is now going to abort; sorry. 
>> 
>> 
>> I hope I'm not being overly naive, but is there a way to join the subnets at 
>> the MPI layer?  It seems like IP over IB would be too high up the stack. 
>> 
>> Paul Monday 
> 
> 
> -- 
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
