Dave,

The QDR InfiniBand uses the openib btl (by default: btl_openib_exclusivity=1024). I assume the RoCE 10 Gbps card is using the tcp btl (by default: btl_tcp_exclusivity=100). That means that, by default, when both the openib and tcp btls could be used, the tcp btl is discarded.

Could you give it a try by setting the same exclusivity value on both btls? e.g.

OMPI_MCA_btl_tcp_exclusivity=1024 mpirun ...

Assuming this is enough to get traffic on both interfaces, you might want *not* to use the eth0 interface (e.g. OMPI_MCA_btl_tcp_if_exclude=eth0 ...).

You might also have to tweak the bandwidth parameters (I assume the QDR interface should get 4 times more traffic than the 10 GbE interface). By default:

btl_openib_bandwidth=4
btl_tcp_bandwidth=100

The value is in Mbps, so the openib value should be 40960 (!), and in your case the tcp bandwidth should be 10240. You might also want to try btl_*_bandwidth=0 (auto-detect the value at run time).
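Putting these together, an untested sketch of a full command line might look like the following (host names, the per-node process count and the benchmark name are placeholders; note that overriding btl_tcp_if_exclude replaces its default exclusion list, which is why the loopback interface is excluded explicitly here):

OMPI_MCA_btl_tcp_exclusivity=1024 \
OMPI_MCA_btl_tcp_if_exclude=lo,eth0 \
OMPI_MCA_btl_openib_bandwidth=40960 \
OMPI_MCA_btl_tcp_bandwidth=10240 \
mpirun -npernode 16 -host host1,host2 ./your_benchmark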
I hope this helps,

Cheers,

Gilles

On 2015/01/29 9:45, Dave Turner wrote:
>     I ran some aggregate bandwidth tests between 2 hosts connected by
> both QDR InfiniBand and RoCE enabled 10 Gbps Mellanox cards. The tests
> measured the aggregate performance for 16 cores on one host communicating
> with 16 on the second host. I saw the same performance as with the QDR
> InfiniBand alone, so it appears that the addition of the 10 Gbps RoCE cards
> is not helping.
>
>     Should OpenMPI be using both in this case by default, or is there
> something I need to configure to allow for this? I suppose this is the same
> question as how to make use of 2 identical IB connections on each node, or
> is the system simply ignoring the 10 Gbps cards because they are the slower
> option.
>
>     Any clarification on this would be helpful. The only posts I've found
> are very old and discuss mostly channel bonding of 1 Gbps cards.
>
>             Dave Turner
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/01/26243.php