Kewl - just wanted to double check as this had come up before, and people felt strongly about it. I didn’t see any clear way to pass the error back to MPI_Init, which is why I started this thread. Sounds like one doesn’t exist, so exit is probably the only real option.
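To make the ordering problem concrete, here is a rough sketch in plain C of the fallback being discussed. It is not Open MPI code: every name in it (error_cb_fn_t, register_error_cb, choose_fail_path, fail_component_init, recording_error_cb) is a made-up stand-in, and the stderr message stands in for the opal_show_help() call in the snippet quoted below. The idea is only this: use the error callback if the upper layer has already registered one, otherwise print the help text and exit nonzero.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real Open MPI types. */
typedef void (*error_cb_fn_t)(const char *reason);

typedef enum { FAIL_VIA_CALLBACK, FAIL_VIA_EXIT } fail_path_t;

/* Stays NULL until the upper layer (the PML, in this thread) registers it. */
static error_cb_fn_t registered_error_cb = NULL;

/* Counts callback invocations so the sketch can be exercised. */
static int error_cb_calls = 0;

/* Stand-in handler that only records the error. Per the thread, the real
 * PML handler aborts the job instead. */
void recording_error_cb(const char *reason)
{
    fprintf(stderr, "error callback: %s\n", reason);
    error_cb_calls++;
}

/* Called by the upper layer once it is ready to receive errors. */
void register_error_cb(error_cb_fn_t cb)
{
    registered_error_cb = cb;
}

/* Decide how a failure can be reported at this point in startup. */
fail_path_t choose_fail_path(void)
{
    return (NULL != registered_error_cb) ? FAIL_VIA_CALLBACK : FAIL_VIA_EXIT;
}

/* Fail from inside component init. */
void fail_component_init(const char *reason)
{
    assert(NULL != reason);
    if (FAIL_VIA_CALLBACK == choose_fail_path()) {
        registered_error_cb(reason);
        return;
    }
    /* Too early for the callback: report and exit nonzero rather than
     * calling abort(), which would dump core. */
    fprintf(stderr, "component init failed: %s\n", reason);
    exit(EXIT_FAILURE);
}
```

In the knem case the callback is still unset when the component init function runs, so only the exit branch would ever be taken there.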
Thanks
Ralph

> On Dec 4, 2014, at 12:52 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> I was not advocating calling exit. I was merely suggesting that due to
> earliness in the initialization process, and to the fact that we are lacking
> the infrastructure to abort because a specific user request cannot be
> complied to, calling exit seems like a reasonable bandaid.
>
>   George.
>
> On Fri, Dec 5, 2014 at 5:38 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Let me get this straight - you are advocating that I call “exit” directly
> from within a library?? I thought that was “verboten” - MPI_Init should just
> return an error somehow, yes?
>
>
> > On Dec 4, 2014, at 12:35 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> >
> > Oh, good catch -- thanks.
> >
> > I wouldn't call abort -- that will dump core. Just show_help() and
> > exit(nonzero), I guess.
> >
> >
> > On Dec 4, 2014, at 3:31 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> >
> >> You can't use the PML error reporting mechanism in this particular
> >> instance, it is too early in the setup process (in the BTL component init
> >> function) and the PML has not setup the error callback yet.
> >>
> >> This function is called during the MPI_Init, at a time where most of the
> >> Open MPI infrastructure is not yet setup. I guess the safest way to force
> >> the process to fail is to call exit or maybe abort.
> >>
> >>   George.
> >>
> >>
> >>
> >> On Fri, Dec 5, 2014 at 3:40 AM, Jeff Squyres (jsquyres)
> >> <jsquy...@cisco.com> wrote:
> >> You're supposed to call the PML error handler, which was passed down to
> >> the BTL during initialization.
> >>
> >> That is, the BTL registers a btl_register_error function with the PML.
> >> The PML then calls this function and passes in its error handler function
> >> pointer. The BTL can then use that error handler to tell the PML when an
> >> error occurs.
> >>
> >> Right now, the only PML error handler aborts the job. So this should be a
> >> sufficient mechanism.
> >>
> >>
> >> On Dec 3, 2014, at 12:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >>
> >>> We talked during the telecon about the user-reported issue where they
> >>> asked for knem support, it wasn’t available on the system, but we ran
> >>> anyway at a reduced performance level. The agreement we had was that OMPI
> >>> should instead fail at that point since the user had requested something
> >>> we could not do. I got tasked with implementing this.
> >>>
> >>> Here is the problem code:
> >>>
> >>>     /* If "use_knem" is positive, then it's an error if knem support
> >>>        is not available -- deactivate the sm btl. */
> >>>     if (mca_btl_sm_component.use_knem > 0) {
> >>>         opal_show_help("help-mpi-btl-sm.txt",
> >>>                        "knem requested but not available",
> >>>                        true, opal_process_info.nodename);
> >>>         return NULL;
> >>>
> >>> As you can see, we deactivate sm but do not necessarily fail. Question
> >>> for you folks: how do I cause us to safely fail from within a BTL??
> >>>
> >>> Thanks
> >>> Ralph
> >>>
> >>> _______________________________________________
> >>> devel mailing list
> >>> de...@open-mpi.org
> >>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>> Link to this post:
> >>> http://www.open-mpi.org/community/lists/devel/2014/12/16425.php
> >>
> >> --
> >> Jeff Squyres
> >> jsquy...@cisco.com
> >> For corporate legal information go to:
> >> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>
> >> Link to this post:
> >> http://www.open-mpi.org/community/lists/devel/2014/12/16435.php
> >>
> >> Link to this post:
> >> http://www.open-mpi.org/community/lists/devel/2014/12/16436.php
> >
> > Link to this post:
> > http://www.open-mpi.org/community/lists/devel/2014/12/16437.php
>
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/12/16438.php
>
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/12/16439.php