Gilles: what version of PSM were you using? And with which cards?

> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham <nrgraha...@gmail.com> wrote:
>
> What if we modify the mpirun script to include the --mca mtl ^psm tag if java
> is in the run string?
>
> -Nathan
>
> On Tue, Aug 25, 2015 at 9:47 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
> I'll update the java FAQ.
>
> 2015-08-25 8:36 GMT-06:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>:
> On Aug 25, 2015, at 10:00 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
> >
> > I think rather than trying workarounds of dubious robustness inside open
> > mpi we
> >
> > - document the issue on either the somewhat aged open mpi website faq or
> > add it to a wiki page on github
>
> It should probably be documented in the README and the FAQ.
>
> I'd be against adding user documentation to the wiki -- this would be a 3rd
> place for users to look for information.
>
> > - file a bug against intel psm
>
> I'd like to hear what they have to say first... :-)
>
> >
> > ----------
> >
> > sent from my smart phonr so no good type.
> >
> > Howard
> >
> > On Aug 25, 2015 6:02 AM, "Gilles Gouaillardet"
> > <gilles.gouaillar...@gmail.com> wrote:
> > i do not know if this can be runtime detected ...
> > note we should report this to intel folks and ask them to advise.
> > ideally, they would provide a way to make sure libinfinipath.so does not
> > conflict with the jvm signal handlers.
> >
> > my idea is to dlopen libinfinipath only if java bindings are not used.
> >
> > On Tuesday, August 25, 2015, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
> > wrote:
> > Is it possible to run-time detect this situation? E.g., probe the signal
> > handler, or somesuch.
> >
> > Rationale: I'd rather have something run-time disabled than not built.
> >
> > Would dlopen'ing libinfinipath actually change its signal handler
> > behavior?
> >
> > > On Aug 25, 2015, at 4:27 AM, Gilles Gouaillardet <gil...@rist.or.jp>
> > > wrote:
> > >
> > > Folks,
> > >
> > > some time ago, some crashes were reported when using java bindings.
> > > one of them was caused by mca_mtl_psm.so.
> > > the root cause is that the libinfinipath.so initializer sets its own
> > > signal handler, which conflicts with the signal handler set by the jvm.
> > > the only workaround is to disable the psm mtl
> > > (e.g. mpirun --mca mtl ^psm ...)
> > > since mpirun --mca mtl_psm_priority 0 ... does not work
> > > (libinfinipath.so is loaded, so the initializer is run and the signal
> > > handlers are set)
> > > so the psm mtl cannot be disabled by the Java MPI_Init()
> > >
> > > one option is to document this
> > > another option is not to build the psm mtl if java bindings are built
> > > and another option is to revamp mca_mtl_psm.so so it does not link with
> > > libinfinipath.so
> > > (use an intermediate component, or dlopen libinfinipath)
> > >
> > > any thoughts ?
> > >
> > > Cheers,
> > >
> > > Gilles
> > > _______________________________________________
> > > devel mailing list
> > > de...@open-mpi.org
> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > > Link to this post:
> > > http://www.open-mpi.org/community/lists/devel/2015/08/17838.php
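
For reference, Nathan's suggestion in the quoted thread (add --mca mtl ^psm when java is in the run string) could look roughly like the sketch below. This is only an illustration under assumptions: the function name add_psm_exclusion is hypothetical and not part of Open MPI, the "is this a Java launch" test is a naive heuristic, and the thread itself leans toward documenting the issue rather than shipping such a workaround.

```python
# Hypothetical helper, not part of Open MPI: rewrites an mpirun argument
# list so that the PSM MTL is excluded when the job launches a JVM.
PSM_EXCLUDE = ["--mca", "mtl", "^psm"]


def add_psm_exclusion(argv):
    """Return a copy of argv with '--mca mtl ^psm' added for Java launches."""
    args = list(argv)

    # Naive heuristic taken from the thread: "java" is in the run string.
    launches_java = any(a == "java" or a.endswith("/java") for a in args[1:])

    # Leave the command alone if the user already chose an MTL explicitly.
    has_mtl_choice = any(
        args[i] in ("--mca", "-mca") and args[i + 1] == "mtl"
        for i in range(len(args) - 1)
    )

    if launches_java and not has_mtl_choice:
        # Insert right after the launcher name, before the program to run.
        args[1:1] = PSM_EXCLUDE
    return args
```

With this sketch, ["mpirun", "-np", "2", "java", "Hello"] gains the three extra arguments right after "mpirun", while a non-Java command line, or one that already carries an explicit --mca mtl selection, is returned unchanged.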