Dave,

The QDR InfiniBand link uses the openib btl (by default:
btl_openib_exclusivity=1024).
I assume the RoCE 10 Gbps card is using the tcp btl (by default:
btl_tcp_exclusivity=100).

That means that, by default, when both the openib and tcp btls could be
used, the btl with the lower exclusivity (tcp here) is discarded.
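
You can double check the default values on your build with ompi_info,
for example (the exact option syntax may differ slightly between Open
MPI versions):

ompi_info --param btl openib --level 9 | grep exclusivity
ompi_info --param btl tcp --level 9 | grep exclusivity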

Could you try setting the same exclusivity value on both btls?
e.g.
OMPI_MCA_btl_tcp_exclusivity=1024 mpirun ...
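
The same thing can also be set on the mpirun command line; a sketch
only (assuming 16 tasks per node and ./a.out as your benchmark, and
explicitly listing the btls so both openib and tcp are enabled; adjust
the list to your setup):

mpirun --mca btl openib,tcp,self,sm \
       --mca btl_tcp_exclusivity 1024 \
       -np 32 ./a.out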

Assuming this is enough to get traffic on both interfaces, you might
want *not* to use the eth0 interface
(e.g. OMPI_MCA_btl_tcp_if_exclude=eth0 ...)
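
Note that the default value of btl_tcp_if_exclude also excludes the
loopback interface, so if you override it you likely want to keep lo
(or whatever the loopback is called on your nodes) in the list, e.g.

OMPI_MCA_btl_tcp_if_exclude=lo,eth0 mpirun ...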

You might also have to tweak the bandwidth parameters (I assume the QDR
interface should get 4 times more traffic than the 10 GbE interface).
The defaults are:
btl_openib_bandwidth=4
btl_tcp_bandwidth=100
The value is in Mbps, so the openib value should really be 40960 (!),
and in your case the tcp bandwidth should be 10240.
You might also want to try btl_*_bandwidth=0 (auto-detect the value at
run time).
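
Putting it all together, here is a sketch of what the command line
could look like (the -np value and ./a.out are only placeholders for
your test):

mpirun --mca btl_tcp_exclusivity 1024 \
       --mca btl_tcp_if_exclude lo,eth0 \
       --mca btl_openib_bandwidth 40960 \
       --mca btl_tcp_bandwidth 10240 \
       -np 32 ./a.out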

I hope this helps,

Cheers,

Gilles
On 2015/01/29 9:45, Dave Turner wrote:
>      I ran some aggregate bandwidth tests between 2 hosts connected by
> both QDR InfiniBand and RoCE-enabled 10 Gbps Mellanox cards.  The tests
> measured the aggregate performance for 16 cores on one host communicating
> with 16 on the second host.  I saw the same performance as with the QDR
> InfiniBand alone, so it appears that the addition of the 10 Gbps RoCE
> cards is not helping.
>
>      Should Open MPI be using both in this case by default, or is there
> something I need to configure to allow for this?  I suppose this is the
> same question as how to make use of 2 identical IB connections on each
> node, or is the system simply ignoring the 10 Gbps cards because they
> are the slower option?
>
>      Any clarification on this would be helpful.  The only posts I've
> found are very old and discuss mostly channel bonding of 1 Gbps cards.
>
>                      Dave Turner
