On Wed, Mar 4, 2015 at 1:04 PM, Dave Goodell (dgoodell) <dgood...@cisco.com> wrote: [...]
> > libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
> > libibverbs: Warning: no userspace device-specific driver found for
> > /sys/class/infiniband_verbs/uverbs0
>
> I think that warning is printed by libibverbs itself.  Are you 100% sure
> there are no IB HCAs sitting in the head node?  If there are IB HCAs but
> you don't want them to be used, you might want to ensure that the various
> verbs kernel modules don't get loaded, which is one half of the mismatch
> which confuses libibverbs.
>
[...]

FWIW, I can confirm that these two lines are from libibverbs itself:

$ strings /usr/lib64/libibverbs.a | grep -e 'no userspace' -e 'open config directory'
libibverbs: Warning: no userspace device-specific driver found for %s
libibverbs: Warning: couldn't open config directory '%s'.

As it happens, the login node *does* have an HCA installed and the kernel
modules appear to be loaded.  However, as the "17th node" in the cluster it
was never cabled to the 16-port switch, and the package(s) that should have
created/populated /etc/libibverbs.d are *not* present (specifically, the
login node has libipathverbs-devel installed but not libipathverbs).

So, Dave, are you saying that what I describe in the previous paragraph
would be considered "misconfiguration"?  I am fine with dropping the
discussion of those first two lines if there is agreement that Open MPI
shouldn't be responsible for handling this case.

Now the ibv_fork_init() warnings are another issue entirely.  Since
btl:verbs and mtl:psm both work (at least separately) perfectly fine on the
compute nodes, I don't believe that there are any configuration issues
there.

-Paul

-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
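
For anyone wanting to check a node for the mismatch discussed above (verbs kernel modules loaded, but no userspace driver configuration), here is a minimal Python sketch. It assumes that each uverbsN entry under /sys/class/infiniband_verbs is a device exposed by the kernel modules, and that any regular file in /etc/libibverbs.d counts as a driver configuration file. The function name and the message wording are hypothetical; nothing here is part of libibverbs or Open MPI.

```python
import glob
import os


def check_verbs_mismatch(sysfs_dir="/sys/class/infiniband_verbs",
                         config_dir="/etc/libibverbs.d"):
    """Return a list of findings describing a kernel/userspace verbs mismatch.

    Hypothetical helper: it only inspects the filesystem and does not
    call into libibverbs.
    """
    findings = []

    # Kernel side: uverbsN entries appear once the verbs modules are loaded
    # and have found a device.
    devices = sorted(glob.glob(os.path.join(sysfs_dir, "uverbs*")))

    # Userspace side: collect the regular files in the config directory
    # (assumption: each one names a device-specific driver).
    driver_files = []
    if os.path.isdir(config_dir):
        for name in sorted(os.listdir(config_dir)):
            if os.path.isfile(os.path.join(config_dir, name)):
                driver_files.append(name)
    else:
        findings.append("config directory %s is missing" % config_dir)

    # The mismatch: the kernel exposes devices, but no userspace driver
    # configuration is present to match them.
    if devices and not driver_files:
        findings.append(
            "%d verbs device(s) present but no driver files in %s"
            % (len(devices), config_dir)
        )
    return findings


if __name__ == "__main__":
    for line in check_verbs_mismatch():
        print("mismatch:", line)
```

Run on a node like the login node described above, this should report both the missing directory and the unmatched device; on a node where the userspace driver package is installed it should print nothing.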