On Wed, Mar 4, 2015 at 1:04 PM, Dave Goodell (dgoodell) <dgood...@cisco.com>
wrote:
[...]

> > libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
> > libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
>
> I think that warning is printed by libibverbs itself.  Are you 100% sure
> there are no IB HCAs sitting in the head node?  If there are IB HCAs but
> you don't want them to be used, you might want to ensure that the various
> verbs kernel modules don't get loaded, which is one half of the mismatch
> which confuses libibverbs.
>
[...]

FWIW, I can confirm that these two lines are from libibverbs itself:

$ strings /usr/lib64/libibverbs.a | grep -e 'no userspace' -e 'open config directory'
libibverbs: Warning: no userspace device-specific driver found for %s
libibverbs: Warning: couldn't open config directory '%s'.


As it happens, the login node *does* have an HCA installed and the kernel
modules appear to be loaded.  However, as the "17th node" in the cluster
it was never cabled to the 16-port switch and the package(s) that should
have created/populated /etc/libibverbs.d are *not* present (specifically
the login node has libipathverbs-devel installed but not libipathverbs).
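
For anyone wanting to check a node for the same state, here is a minimal
sketch. It tests only the two conditions named above: uverbs devices
visible under sysfs, and the config directory missing. The function name
check_verbs_mismatch is hypothetical, and the two default paths are taken
from the warnings quoted earlier. It does not verify that an installed
driver actually matches the HCA.

```shell
# Hypothetical helper: report whether the kernel exposes uverbs devices
# while the userspace driver config directory is missing.
# Both paths are parameters so the check can be pointed at a test tree.
check_verbs_mismatch() {
    sysfs_dir="${1:-/sys/class/infiniband_verbs}"
    conf_dir="${2:-/etc/libibverbs.d}"
    found=0
    # List every uverbs device the kernel has registered, if any.
    for dev in "$sysfs_dir"/uverbs*; do
        [ -e "$dev" ] || continue
        found=1
        echo "kernel device: $dev"
    done
    # Kernel side present, userspace config absent: the case in this thread.
    if [ "$found" -eq 1 ] && [ ! -d "$conf_dir" ]; then
        echo "mismatch: uverbs devices present but $conf_dir is missing"
        return 1
    fi
    echo "no mismatch detected"
    return 0
}
```

Calling check_verbs_mismatch with no arguments uses the default paths; a
non-zero return status indicates the mismatch.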

So, Dave, are you saying that what I describe in the previous paragraph
would be considered "misconfiguration"?  I am fine with dropping the
discussion of those first two lines if there is agreement that Open MPI
shouldn't be responsible for handling this case.

Now the ibv_fork_init() warnings are another issue entirely.  Since
btl:verbs and mtl:psm both work (at least separately) perfectly fine on the
compute nodes, I don't believe that there are any configuration issues
there.

-Paul

-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
