Paul, judging by:

libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0

it seems that the OFED userspace library version does not match the version of the loaded OFED kernel driver.

On Thu, Mar 5, 2015 at 5:33 PM, Alina Sklarevich <ali...@dev.mellanox.co.il> wrote:

> I don't know much about PSM either but shouldn't it be called only after
> the oob:ud code?
> If so, then ibv_fork_init() is being called from oob:ud early enough so
> either there is another reason for ibv_fork_init() failure (like you said),
> or the reason why this verb failed was the same reason why these warnings
> appeared?
> libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
> libibverbs: Warning: no userspace device-specific driver found for
> /sys/class/infiniband_verbs/uverbs0
>
> Also, opal_common_verbs_want_fork_support is now set to -1 like you
> suggested so these warnings shouldn't appear anymore.
>
> On Thu, Mar 5, 2015 at 4:51 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> On Mar 5, 2015, at 6:32 AM, Alina Sklarevich <ali...@dev.mellanox.co.il>
>> wrote:
>> >
>> > If oob:ud was disabled then there was no call to ibv_fork_init()
>> > anywhere else, right? If so, then this is why the messages went away.
>>
>> Right. That's why I'm saying it doesn't seem like a PSM problem.
>>
>> (I don't know much about PSM, but I don't think it uses verbs...?)
>>
>> > The calls to ibv_fork_init() from the opal common verbs were pushed to
>> > the master. One of the places a call was set is oob:ud, but if there is a
>> > call to memory registering verbs before this place, then the call to it in
>> > oob:ud would result in a failure.
>>
>> Yes, I think that is the exact question: why are these messages showing
>> up because of oob:ud? It seems like the call sequences to ibv_fork_init()
>> are not as understood as we thought they were.
>> :-( I.e., it was
>> presupposed that oob_ud was the first entity to call any verbs code (and by
>> your commits, is supposed to be calling the common verbs code to call
>> ibv_fork_init() early enough such that it won't be a problem). But either
>> that is not the case, or ibv_fork_init() is failing for some other reason.
>>
>> These are the things that need to be figured out.
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/03/17104.php
>>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/03/17106.php
>

--

Kind Regards,

M.