Ralph is right: OMPI aggressively uses all Ethernet interfaces by default.

This short FAQ entry links to two other FAQ entries that provide detailed information about reachability:

    http://www.open-mpi.org/faq/?category=tcp#tcp-multi-network

The usNIC BTL uses UDP for its wire transport and actually does a much more standards-conformant peer reachability determination (i.e., it actually checks routing tables to see if it can reach a given peer, which has all kinds of caching benefits, kernel controls if you want them, etc.). We haven't back-ported this to the TCP BTL because a) most people who use TCP for MPI still use a single L2 address space, and b) no one has asked for it. :-)

As for the round robin scheduling, there's no indication from the Linux TCP stack what the bandwidth is on a given IP interface. So unless you use the btl_tcp_bandwidth_<IP_INTERFACE_NAME> (e.g., btl_tcp_bandwidth_eth0) MCA params, OMPI will round-robin across them equally.

If you have multiple IP interfaces sharing a single physical link, there will likely be no benefit from having Open MPI use more than one of them. You should probably use btl_tcp_if_include / btl_tcp_if_exclude to select just one.

On Nov 7, 2014, at 2:53 PM, Brock Palen <bro...@umich.edu> wrote:

> I was doing a test on our IB based cluster, where I was disabling IB
>
> --mca btl ^openib --mca mtl ^mxm
>
> I was sending very large messages >1GB and I was surprised by the speed.
>
> I noticed then that of all our ethernet interfaces
>
> eth0 (1gig-e)
> ib0 (ip over ib, for lustre configuration at vendor request)
> eoib0 (ethernet over IB interface for IB -> Ethernet gateway for some
> external storage support at >1Gig speed)
>
> I saw all three were getting traffic.
>
> We use torque for our Resource Manager and use TM support, the hostnames
> given by torque match the eth0 interfaces.
>
> How does OMPI figure out that it can also talk over the others? How does it
> choose to load balance?
>
> BTW that is fine, but we will use if_exclude on one of the IB ones as ib0 and
> eoib0 are the same physical device and may screw with load balancing if
> anyone ever falls back to TCP.
>
> Brock Palen
> www.umich.edu/~brockp
> CAEN Advanced Computing
> XSEDE Campus Champion
> bro...@umich.edu
> (734)936-1985

-- 
Jeff Squyres
jsquy...@cisco.com
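
For illustration only, here is a toy Python model of the difference between spreading traffic equally across interfaces and weighting it by a per-interface bandwidth hint, as described above. It is not Open MPI's scheduling code: the function name split_bytes, the idea of splitting a message proportionally by bytes, and the bandwidth figures are all assumptions made up for this sketch.

```python
def split_bytes(total_bytes, interfaces, bandwidth=None):
    """Toy model: divide total_bytes across interfaces by weight.

    bandwidth is a hypothetical dict of per-interface weights. When it is
    None, every interface gets the same weight, which mimics an equal
    round-robin. This is an illustration, not Open MPI's algorithm.
    """
    if bandwidth is None:
        weights = [1 for _ in interfaces]
    else:
        weights = [bandwidth[name] for name in interfaces]
    total_weight = sum(weights)

    shares = {}
    assigned = 0
    for name, weight in zip(interfaces, weights):
        shares[name] = total_bytes * weight // total_weight
        assigned += shares[name]
    # Hand any integer-division remainder to the first interface.
    shares[interfaces[0]] += total_bytes - assigned
    return shares


interfaces = ["eth0", "ib0", "eoib0"]

# No hints: each interface carries the same share.
equal = split_bytes(3000, interfaces)

# Hypothetical hints (made-up numbers, in arbitrary units).
hints = {"eth0": 1000, "ib0": 10000, "eoib0": 10000}
weighted = split_bytes(2100, interfaces, hints)

print(equal)     # {'eth0': 1000, 'ib0': 1000, 'eoib0': 1000}
print(weighted)  # {'eth0': 100, 'ib0': 1000, 'eoib0': 1000}
```

In both cases ib0 and eoib0 together receive two of the three shares even though they sit on one physical device, which is why excluding one of them is the simpler fix.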