What if we modify the mpirun script to add the --mca mtl ^psm option whenever java appears in the run string?
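
As a rough illustration of that idea (not existing Open MPI code), here is a minimal Python sketch of the argument rewriting such a wrapper might do. The function name add_psm_exclusion is hypothetical, and matching on the basename "java" is only an assumption about how "java in the run string" would be detected.

```python
import os


def add_psm_exclusion(mpirun_args):
    """Return a copy of the mpirun arguments, with '--mca mtl ^psm'
    prepended when a java launcher appears among them.

    Hypothetical helper for illustration only.
    """
    args = list(mpirun_args)

    # Assumption: "java is in the run string" means some argument's
    # basename is exactly "java" (e.g. "java" or "/usr/bin/java").
    launches_java = any(os.path.basename(arg) == "java" for arg in args)

    # Leave the command line alone if the user already chose an mtl
    # setting explicitly via "--mca mtl ..." or "-mca mtl ...".
    mtl_already_set = any(
        flag in ("--mca", "-mca") and name == "mtl"
        for flag, name in zip(args, args[1:])
    )

    if launches_java and not mtl_already_set:
        return ["--mca", "mtl", "^psm"] + args
    return args


if __name__ == "__main__":
    print(add_psm_exclusion(["-np", "2", "java", "Hello"]))
```

A check like this would miss java started indirectly (for example from a shell script passed to mpirun), so it would be a convenience at best, not a complete fix.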

-Nathan

On Tue, Aug 25, 2015 at 9:47 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
> I'll update the java FAQ.
>
> 2015-08-25 8:36 GMT-06:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>:
>
>> On Aug 25, 2015, at 10:00 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
>> >
>> > I think rather than trying workarounds of dubious robustness inside open mpi we
>> >
>> > - document the issue on either the somewhat aged open mpi website faq or add it to a wiki page on github
>>
>> It should probably be documented in the README and the FAQ.
>>
>> I'd be against adding user documentation to the wiki -- this would be a 3rd place for users to look for information.
>>
>> > - file a bug against intel psm
>>
>> I'd like to hear what they have to say first... :-)
>>
>> >
>> > ----------
>> >
>> > sent from my smart phonr so no good type.
>> >
>> > Howard
>> >
>> > On Aug 25, 2015 6:02 AM, "Gilles Gouaillardet" <gilles.gouaillar...@gmail.com> wrote:
>> > i do not know if this can be runtime detected ...
>> > note we should report this to intel folks and ask them to advise.
>> > ideally, they would provide a way to make sure libinfinipath.so does not conflict with the jvm signal handlers.
>> >
>> > my idea is to dlopen libinfinipath only if java bindings are not used.
>> >
>> > On Tuesday, August 25, 2015, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>> > Is it possible to run-time detect this situation? E.g., probe the signal handler, or somesuch.
>> >
>> > Rationale: I'd rather have something run-time disabled than not built.
>> >
>> > Would dlopen'ing libinfinipath actually change its signal handler behavior?
>> >
>> >
>> > > On Aug 25, 2015, at 4:27 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>> > >
>> > > Folks,
>> > >
>> > > some time ago, some crashes were reported when using java bindings.
>> > > one of them was caused by mca_mtl_psm.so.
>> > > the root cause is that the libinfinipath.so initializer sets its own signal handler, which
>> > > conflicts with the signal handler set by the jvm.
>> > > the only workaround is to disable the psm mtl
>> > > (e.g. mpirun --mca mtl ^psm ...)
>> > > since mpirun --mca mtl_psm_priority 0 ... does not work
>> > > (libinfinipath.so is loaded, so the initializer is run and the signal handlers are set)
>> > > so the psm mtl cannot be disabled by the Java MPI_Init()
>> > >
>> > > one option is to document this
>> > > another option is not to build the psm mtl if java bindings are built
>> > > and another option is to revamp mca_mtl_psm.so so it does not link with libinfinipath.so
>> > > (use an intermediate component, or dlopen libinfinipath)
>> > >
>> > > any thoughts ?
>> > >
>> > > Cheers,
>> > >
>> > > Gilles
>> > > _______________________________________________
>> > > devel mailing list
>> > > de...@open-mpi.org
>> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > > Link to this post: http://www.open-mpi.org/community/lists/devel/2015/08/17838.php
>> >
>> >
>> > --
>> > Jeff Squyres
>> > jsquy...@cisco.com
>> > For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com