I'm getting the message below on my cluster(s).  It seems to happen
only when I try to use more than 64 nodes (16 cores each).  The
clusters are running RHEL 6.5 with Slurm and Open MPI 1.6.5 with PSM.
I'm using the OFED versions included with RHEL for InfiniBand support.

ipath_userinit: Mismatched user minor version (12) and driver minor
version (11) while context sharing. Ensure that driver and library are
from the same release

I realize this is only a warning message and the jobs still complete.
Another user a little over a year ago had a similar issue that was
traced to mismatched OFED versions.  However, since I have a diskless
cluster, all my nodes are identical.

I'm not averse to thinking there might be something unique about
my setup, but since I have two separate machines showing this, I'm not
really sure where to look to triage the issue and see what might be
set incorrectly.

Any thoughts on where to start checking would be helpful, thanks...
