Thanks for the quick response ... I've been thinking about this today and tried a few things on my CentOS mini connected cluster ...
To use the tcp btl I will have to set up a bridge on A with ib0 and ib1 participating in the bridge; then the tcp btl could be used as you suggest. Unfortunately, the obvious solution, bridge-utils on CentOS, does not support InfiniBand adapters. This is now straying out of MPI range into a networking issue ... any ideas on bridging at the IP over IB tier in a cluster would be greatly appreciated. This must be a solved problem, but I'm not having a lot of luck with Google and the archives.

Paul Monday

On Nov 22, 2010, at 7:46 AM, Terry Dontje wrote:

> You're gonna have to use a protocol that can route through a machine, OFED
> User Verbs (ie openib) does not do this. The only way I know of to do this
> via OMPI is with the tcp btl.
>
> --td
>
> On 11/22/2010 09:28 AM, Paul Monday (Parallel Scientific) wrote:
>>
>> We've been using OpenMPI in a switched environment with success, but we've
>> moved to a point to point environment to do some work. Some of the nodes
>> cannot talk directly to one another, sort of like this with computers A,B, C
>> with A having two ports:
>>
>> A(1)(opensm)------>B
>> A(2)(opensm)------>C
>>
>> B is not connected to C in any way.
>>
>> When we try to run our OpenMPI program we are receiving:
>> At least one pair of MPI processes are unable to reach each other for
>> MPI communications. This means that no Open MPI device has indicated
>> that it can be used to communicate between these processes. This is
>> an error; Open MPI requires that all MPI processes be able to reach
>> each other. This error can sometimes be the result of forgetting to
>> specify the "self" BTL.
>>
>> Process 1 ([[1581,1],5]) is on host: pg-B
>> Process 2 ([[1581,1],0]) is on host: pg-C
>> BTLs attempted: openib self sm
>>
>> Your MPI job is now going to abort; sorry.
>>
>>
>> I hope I'm not being overly naive but, is there a way to join the subnets at
>> the MPI layer? It seems like IP over IB would be too high up the stack.
>>
>> Paul Monday
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
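
To make the reachability problem in this thread concrete, here is a minimal Python sketch of the cabling described above (A cabled to B and to C, nothing between B and C). It is only a toy graph model: the host names come from the thread, but `LINKS`, `directly_connected`, and `reachable_via_forwarding` are hypothetical names invented for illustration, and the code says nothing about how Open MPI itself decides which peers a BTL can reach. It just shows the difference between "has a direct link" (what a non-routing transport needs, per Terry's reply) and "has a path through an intermediate host" (what a routable protocol could use if A forwarded traffic).

```python
from collections import deque

# Hypothetical model of the cabling from the thread: A has two ports,
# one cabled to B and one to C. B and C share no link.
LINKS = {("A", "B"), ("A", "C")}


def neighbors(node, links):
    """Return the set of hosts directly cabled to `node`."""
    found = set()
    for left, right in links:
        if left == node:
            found.add(right)
        elif right == node:
            found.add(left)
    return found


def directly_connected(src, dst, links):
    """True if src and dst share a cable (no intermediate host needed)."""
    return dst in neighbors(src, links)


def reachable_via_forwarding(src, dst, links):
    """True if some path exists from src to dst, assuming (as a modelling
    assumption only) that every intermediate host forwards traffic."""
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in neighbors(node, links) - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False


if __name__ == "__main__":
    for src, dst in [("A", "B"), ("A", "C"), ("B", "C")]:
        print(
            f"{src} -> {dst}: direct={directly_connected(src, dst, LINKS)}, "
            f"via forwarding={reachable_via_forwarding(src, dst, LINKS)}"
        )
```

In this model the B/C pair is the only one with no direct link, which matches the pair named in the error output (pg-B and pg-C). Whether forwarding through A can actually be set up for IP over IB is the open networking question in this message, and the sketch does not answer it.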