On Mon, 2018-04-23 at 14:44 +0000, Jeff Squyres (jsquyres) wrote:

On Apr 20, 2018, at 1:03 PM, Marshall2, John (SSC/SPC) 
<john.marsha...@canada.ca> wrote:



I am trying to verify/determine what the proper setting is for 
btl_openib_ib_include.



I think you mean btl_openib_if_include ("if" = "interface").


Yes.





Some background:
* Open MPI 2.1.1 (and 1.6.5; yes, it is old)
* lxc containers
* SRIOV (virtual functions) being used
* dedicated IB interface (e.g., ib2) per container

Should mlx4_X:1 correspond to a specific ibY interface? E.g., for ib26, I
find mlx4_13 (hence mlx4_13:1) by:
$ ls /sys/class/net/ib26/device/infiniband
mlx4_13

Does the mlx4_X have to be determined at each location where an MPI task
would run? I suppose it would, because the ibY is likely to be different.
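
As a minimal sketch of the lookup above, generalized to every ibY interface
visible in a container: the function below walks
/sys/class/net/ib*/device/infiniband and prints one "interface device" pair
per line. The name map_ib_to_hca and its optional directory argument are
hypothetical, added only so the logic can be pointed at a test tree; the sysfs
layout is assumed to match the ib26 example.

```shell
# map_ib_to_hca [SYSFS_NET_DIR]
# Hypothetical helper: print "<ibY> <mlx4_X>" for each IB network interface.
# SYSFS_NET_DIR defaults to /sys/class/net (assumed layout, as in the ib26 case).
map_ib_to_hca() {
    net_dir="${1:-/sys/class/net}"
    for iface in "$net_dir"/ib*; do
        # Skip unmatched globs and interfaces without an infiniband subdirectory.
        [ -d "$iface/device/infiniband" ] || continue
        for hca in "$iface"/device/infiniband/*; do
            [ -e "$hca" ] || continue
            # Strip the leading directories to leave the bare names.
            printf '%s %s\n' "${iface##*/}" "${hca##*/}"
        done
    done
}
```

Called with no argument inside the ib26 container, this should print
"ib26 mlx4_13" if the layout is as assumed.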



Open MPI basically probes its environment at run time.  In your case, it will 
find all available IB interfaces (per MPI process), filter them through 
if_include / if_exclude, and then use whatever is left.



On some tests, I have found that the setting:
export OMPI_MCA_btl_openib_if_include=mlx4_0:1

provides better performance than not specifying a value or letting mpirun/orted
figure it out at runtime.
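
If the device differs from host to host, one option is to compute the value on
each host instead of hard-coding it. The sketch below builds a "device:port"
string from an interface name. The name hca_include_value, its arguments, and
the default port of 1 are assumptions for illustration; this has not been
verified against Open MPI 2.1.1 or 1.6.5.

```shell
# hca_include_value IFACE [PORT] [SYSFS_NET_DIR]
# Hypothetical helper: print "<mlx4_X>:<port>" for the given ibY interface.
# PORT defaults to 1 (an assumption); SYSFS_NET_DIR defaults to /sys/class/net.
hca_include_value() {
    iface="$1"
    port="${2:-1}"
    net_dir="${3:-/sys/class/net}"
    hca_dir="$net_dir/$iface/device/infiniband"
    # Fail if the interface has no infiniband device directory.
    [ -d "$hca_dir" ] || return 1
    # Take the first (normally the only) device listed for this interface.
    hca=$(ls "$hca_dir" | head -n 1)
    [ -n "$hca" ] || return 1
    printf '%s:%s\n' "$hca" "$port"
}
```

A per-process wrapper started by mpirun could then, for example, run
export OMPI_MCA_btl_openib_if_include="$(hca_include_value ib26)"
followed by exec "$@", so that each process picks up the value for its own
host before MPI initializes.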



That's a little surprising.

Do you have more than 1 IB interface?  If not, then Open MPI should likely be 
independently coming to the same conclusion (i.e., "mlx4_0:1").  If it's not, 
that's weird.


Only one ib interface shows up via ifconfig and at /sys/class/net/ibX.

But under /sys/class/infiniband and /sys/class/infiniband_cm, all the mlx4_Y
devices do show up. E.g.,

mlx4_0  mlx4_10  mlx4_12  mlx4_14  mlx4_16  mlx4_3  mlx4_5  mlx4_7  mlx4_9
mlx4_1  mlx4_11  mlx4_13  mlx4_15  mlx4_2   mlx4_4  mlx4_6  mlx4_8

I'm not sure if this can be avoided.

So, where is Open MPI looking for the available mlx4_Y devices? Under one of
those two directories, or at /sys/class/net/ibX/device/infiniband/mlx4_Y?

Thanks,
John






_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users