I'm getting the message below on my cluster(s). It seems to happen only when I try to use more than 64 nodes (16 cores each). The clusters are running RHEL 6.5 with Slurm and Open MPI 1.6.5 with PSM. I'm using the OFED versions included with RHEL for InfiniBand support.
ipath_userinit: Mismatched user minor version (12) and driver minor version (11) while context sharing. Ensure that driver and library are from the same release

I realize this is only a warning and that the jobs do complete. A little over a year ago, another user had a similar issue that was traced to mismatched OFED versions. Since I have a diskless cluster, all my nodes are identical. I'm not averse to thinking there might be something unique about my machine, but since I have two separate machines doing it, I'm not really sure where to look to triage the issue and see what might be set incorrectly. Any thoughts on where to start checking would be helpful. Thanks.
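
In case it helps anyone looking at this, here is a minimal sketch of how the warning could be tallied from job output to confirm that every occurrence reports the same user/driver version pair. The regular expression is built only from the message text quoted above; the function name `tally_mismatches` and the sample lines are hypothetical and not part of PSM, Open MPI, or Slurm.

```python
import re
from collections import Counter

# Pattern derived from the warning text quoted in this post.
# Assumption: other builds may word the message differently.
MISMATCH_RE = re.compile(
    r"ipath_userinit: Mismatched user minor version \((\d+)\) "
    r"and driver minor version \((\d+)\)"
)


def tally_mismatches(lines):
    """Count each (user_minor, driver_minor) pair found in the given lines."""
    counts = Counter()
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            user_minor = int(match.group(1))
            driver_minor = int(match.group(2))
            counts[(user_minor, driver_minor)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; in practice, read the job's stderr file instead.
    sample = [
        "ipath_userinit: Mismatched user minor version (12) and driver "
        "minor version (11) while context sharing. Ensure that driver "
        "and library are from the same release",
        "some unrelated job output",
    ]
    for (user_minor, driver_minor), n in tally_mismatches(sample).items():
        print(f"user={user_minor} driver={driver_minor} count={n}")
```

If more than one pair shows up, that would suggest the nodes are not as identical as the diskless image implies. A single pair would point at the library and driver in the image itself.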