I don't know much about PSM either, but shouldn't it be called only after
the oob:ud code?
If so, then ibv_fork_init() is being called from oob:ud early enough, so
either there is another reason for the ibv_fork_init() failure (as you said),
or the verb failed for the same reason that these warnings appeared:
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0

Also, opal_common_verbs_want_fork_support is now set to -1, as you
suggested, so these warnings shouldn't appear anymore.
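
To make the ordering question concrete, here is a minimal sketch in C. It
does not use libibverbs at all: mock_fork_init() and mock_register_memory()
are hypothetical stand-ins, and the rule that fork init fails once memory has
been registered is only the hypothesis being discussed in this thread, not
confirmed behaviour. The -1/0/1 handling reflects my understanding of the
want_fork_support setting (negative = try but stay quiet on failure, 0 = do
not try, positive = treat failure as an error); please treat that as an
assumption too.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in state; this is NOT libibverbs. */
static bool mock_memory_registered = false;
static bool mock_fork_enabled = false;

static void mock_reset(void)
{
    mock_memory_registered = false;
    mock_fork_enabled = false;
}

/* Stand-in for any verb that registers memory. */
static void mock_register_memory(void)
{
    mock_memory_registered = true;
}

/* Stand-in for ibv_fork_init(): 0 on success, errno-style code on failure.
 * Assumed rule: it fails if memory was registered before the first call. */
static int mock_fork_init(void)
{
    if (mock_fork_enabled) {
        return 0;               /* already initialised, nothing to do */
    }
    if (mock_memory_registered) {
        return EINVAL;          /* too late to enable fork support */
    }
    mock_fork_enabled = true;
    return 0;
}

enum fork_result { FORK_OK, FORK_SKIPPED, FORK_FAILED_QUIET, FORK_FAILED_ERROR };

/* Assumed tri-state handling of a want_fork_support value. */
static enum fork_result try_fork_support(int want_fork_support)
{
    if (want_fork_support == 0) {
        return FORK_SKIPPED;    /* caller asked not to try at all */
    }
    int rc = mock_fork_init();
    if (rc == 0) {
        return FORK_OK;
    }
    if (want_fork_support > 0) {
        /* Fork support was required, so the failure is reported. */
        fprintf(stderr, "fork support requested but unavailable (rc=%d)\n", rc);
        return FORK_FAILED_ERROR;
    }
    return FORK_FAILED_QUIET;   /* negative value: carry on without a warning */
}

static void demo(void)
{
    /* Case 1: fork init is the first call, so it succeeds. */
    mock_reset();
    assert(try_fork_support(-1) == FORK_OK);

    /* Case 2: something registered memory first; -1 fails quietly. */
    mock_reset();
    mock_register_memory();
    assert(try_fork_support(-1) == FORK_FAILED_QUIET);
}
```

In this model the second case in demo() corresponds to "some other component
touched verbs before oob:ud", which is one of the two explanations still open.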

On Thu, Mar 5, 2015 at 4:51 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
wrote:

> On Mar 5, 2015, at 6:32 AM, Alina Sklarevich <ali...@dev.mellanox.co.il>
> wrote:
> >
> > If oob:ud was disabled then there was no call to ibv_fork_init()
> > anywhere else, right? If so, then this is why the messages went away.
>
> Right.  That's why I'm saying it doesn't seem like a PSM problem.
>
> (I don't know much about PSM, but I don't think it uses verbs...?)
>
> > The calls to ibv_fork_init() from the opal common verbs were pushed to
> > the master. One of the places a call was set is oob:ud, but if there is a
> > call to memory registering verbs before this place, then the call to it in
> > oob:ud would result in a failure.
>
> Yes, I think that is the exact question: why are these messages showing up
> because of oob:ud?  It seems like the call sequences to ibv_fork_init() are
> not as understood as we thought they were.  :-(  I.e., it was presupposed
> that oob_ud was the first entity to call any verbs code (and by your
> commits, is supposed to be calling the common verbs code to call
> ibv_fork_init() early enough such that it won't be a problem).  But either
> that is not the case, or ibv_fork_init() is failing for some other reason.
>
> These are the things that need to be figured out.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/03/17104.php
>
