On Apr 23, 2018, at 11:00 AM, Marshall2, John (SSC/SPC) <john.marsha...@canada.ca> wrote:
> 
> Only one ib interface shows up via ifconfig and at /sys/class/net/ibX.
> 
> But under /sys/class/infiniband and /sys/class/infiniband_cm, all the mlx4_Y
> do show up. E.g.,
> mlx4_0   mlx4_10  mlx4_12  mlx4_14  mlx4_16  mlx4_3  mlx4_5  mlx4_7  mlx4_9
> mlx4_1   mlx4_11  mlx4_13  mlx4_15  mlx4_2   mlx4_4  mlx4_6  mlx4_8
> 
> I'm not sure if this can be avoided.
> 
> So, where is openmpi looking for the available mlx4_Y? Under one of those two
> directories, or whatever is at /sys/class/net/ibX/device/infiniband/mlx4_Y?

It will use whatever devices libibverbs reports back.

It's been quite a while since I've looked in the libibverbs code, but it 
*might* return all the devices...?  What does ibv_devinfo(1) return inside one 
of your containers?  That's probably the same information that is returned to 
Open MPI programmatically via the libibverbs API.

If libibverbs is returning all devices vs. just the one that is actually 
available in your container, then that might explain the performance disparity.
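
As a quick way to see the mismatch from inside a container, here is a rough 
sketch that compares the HCAs listed under /sys/class/infiniband with the ones 
that actually sit behind an ibX network interface. It only walks the sysfs 
paths mentioned above; it is *not* what libibverbs itself does, and the 
function names (list_dir, visible_devices) and the sys_root parameter are made 
up for illustration. ibv_devinfo(1) remains the authoritative check.

```python
import os


def list_dir(path):
    """Return the sorted entry names under path, or [] if it cannot be read."""
    try:
        return sorted(os.listdir(path))
    except OSError:
        return []


def visible_devices(sys_root="/sys"):
    """Return (all_hcas, backed).

    all_hcas: names found under <sys_root>/class/infiniband
    backed:   {ibX: [HCA names under <sys_root>/class/net/ibX/device/infiniband]}
    """
    all_hcas = list_dir(os.path.join(sys_root, "class", "infiniband"))
    net_dir = os.path.join(sys_root, "class", "net")
    backed = {}
    for iface in list_dir(net_dir):
        # Assumption: IPoIB interfaces are named ibX, as in the original report.
        if not iface.startswith("ib"):
            continue
        hca_dir = os.path.join(net_dir, iface, "device", "infiniband")
        backed[iface] = list_dir(hca_dir)
    return all_hcas, backed


if __name__ == "__main__":
    all_hcas, backed = visible_devices()
    print("under /sys/class/infiniband:", " ".join(all_hcas) or "(none)")
    for iface, hcas in backed.items():
        print(f"{iface}: {' '.join(hcas) or '(none)'}")
```

If the first line lists many mlx4_Y entries while only one of them appears 
behind an ibX interface, that matches the situation described in the quoted 
message, and it would be worth checking whether ibv_devinfo reports the same 
long list.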

-- 
Jeff Squyres
jsquy...@cisco.com

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
