Ralph,

If I read between the lines of your second point correctly, Omni-Path (PSM2)
works out of the box. I am not sure that is the case, and my
extrapolation might be incorrect.

If I understood correctly, PSM2 is a new feature.
From a distro point of view, it could be handled with a new package (known
not to support PSM), an mpirun-psm2 wrapper, or a release note (e.g. use
--mca mtl ^psm or a PSM2 param file).

I still do not see how removing PSM2 makes things better
(the same result can be achieved by configuring with --without-psm2).
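
To make the alternatives concrete, here is a rough sketch rather than a
tested recipe. The parameter file name and the "mtl = ^psm" setting come from
George's suggestion quoted below; the scratch directory and ./my_app are
placeholders I made up for illustration, and a real package would write the
file into its own sysconfdir instead.

```shell
# Option 1: system-wide default in the installed openmpi-mca-params.conf.
# A scratch directory stands in for the real sysconfdir here (assumption).
conf_dir=$(mktemp -d)
conf="$conf_dir/openmpi-mca-params.conf"
printf 'mtl = ^psm\n' > "$conf"   # write ^psm2 instead to exclude PSM2
cat "$conf"

# Option 2: per-run selection on the command line (./my_app is a placeholder):
#   mpirun --mca mtl ^psm ./my_app

# Option 3: leave PSM2 out at build time:
#   ./configure --without-psm2
```

Options 1 and 2 keep both components installed and only change which one is
selected at run time, whereas option 3 drops PSM2 from the build entirely.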

Cheers,

Gilles

On Thursday, September 3, 2015, Ralph Castain <r...@open-mpi.org> wrote:

> I guess I didn’t make it clear in my prior comment, so let me try again. I
> understand about dlopen and the fix that George proposed - we had
> internally discussed this as well. However, the questions that raises are:
>
> 1. how does the distro (Michal) decide which PSM module to disable by
> default in their package?
>
> 2. how does the user “discover” that their fabric has automatically been
> disabled, especially since this has never been the case before?
>
> I’ll raise the procedural question at our next telecon. I certainly take
> no pleasure out of generating releases, so if we have a better solution,
> I’m all for it!
>
>
> > On Sep 3, 2015, at 5:55 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> >
> > I agree with what George says.
> >
> > AFAIK, Red Hat builds Open MPI with support for dlopen, so the config file
> option is probably suitable.
> >
> > However, I have to admit that I resent the fact that PSM's poor upgrade
> path design is forcing both the Open MPI and libfabric communities to have
> similar confusing conversations (e.g., see
> https://github.com/ofiwg/libfabric/issues/1258#issuecomment-137426271).
> >
> > Specifically: because of the design of PSM1/PSM2, both Open MPI and
> libfabric will have to adjust their configury and use dlopen/function
> pointer indirection to "solve" the problem of supporting both PSM1 and PSM2.
> >
> > Does that seem weird to anyone else?
> >
> > IMNSHO, if you have to have extremely confusing conversations in
> multiple software communities explaining your configury,
> function-pointer-indirection code (i.e., PR
> https://github.com/ofiwg/libfabric/pull/1259), compilation, and linking
> scheme to upgrade to a new library, you're doing it wrong.
> >
> >
> >
> >
> >> On Sep 3, 2015, at 7:19 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
> >>
> >> Hi Michal,
> >>
> >> I might have missed some context when proposing this solution. As
> Gilles suggested, if you build Open MPI without support for dlopen
> (configure option --disable-dlopen), this simple solution will not work
> because the symbol conflict is generated deep inside the constructors
> of the two libraries.
> >>
> >> Yes, the "mtl = ^psm" (or ^psm2 depending on which one you want to
> disable) should go in the openmpi-mca-params.conf that gets installed in
> $(sysconfdir).
> >>
> >> Thanks,
> >> George.
> >>
> >>
> >> On Thu, Sep 3, 2015 at 5:14 AM, Michal Schmidt <mschm...@redhat.com> wrote:
> >> [I apologize for not threading the email properly. I was not subscribed
> >> before and found the conversation in the web archive.]
> >>
> >> Hello,
> >>
> >> I am the one who discovered the PSM vs. PSM2 library conflict and
> >> proposed the temporary workaround of having two builds of the openmpi
> >> package.
> >>
> >> George Bosilca wrote:
> >>> 3. Except if the distro builds OMPI statically, I see no reason to
> >>> have 2 builds of OMPI due to conflicting symbols between two shared
> >>> libraries that OMPI MCA loads willingly. Why is a simple "mtl = ^psm" in
> >>> the OMPI system-wide configuration file not enough to solve the
> >>> issue?
> >>
> >> Thank you for this suggestion. It would go into openmpi-mca-params.conf,
> >> right? I will try it.
> >>
> >> Regards,
> >> Michal
> >> _______________________________________________
> >> devel mailing list
> >> de...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/09/17927.php
> >>
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >
>
