Kewl - just wanted to double check as this had come up before, and people felt 
strongly about it. I didn’t see any clear way to pass the error back to 
MPI_Init, which is why I started this thread. Sounds like one doesn’t exist, so 
exit is probably the only real option.
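
To make the trade-off concrete, here is a minimal sketch in C. It is not Open MPI code: the struct, the callback type, and every `sketch_*` name are hypothetical stand-ins for the sm BTL component and the PML error handler discussed below. It only illustrates the decision: if an error handler has already been registered, report through it; if not (the case during BTL component init), print the help message and exit with a nonzero status.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the error handler an upper layer (the PML in
   this thread) would hand down; not Open MPI's real type. */
typedef void (*sketch_error_cb_t)(const char *reason);

/* Hypothetical component state, loosely modeled on the quoted snippet. */
struct sketch_component {
    int use_knem;               /* > 0: the user explicitly requested knem */
    int knem_available;         /* nonzero if knem support was found */
    sketch_error_cb_t error_cb; /* NULL until a handler is registered */
};

enum sketch_action {
    SKETCH_PROCEED, /* nothing was requested that we cannot provide */
    SKETCH_REPORT,  /* a handler exists: report the failure through it */
    SKETCH_EXIT     /* no handler yet: print help text and exit nonzero */
};

/* Pure decision function, kept separate so it can be checked without
   terminating the process. */
static enum sketch_action sketch_decide(const struct sketch_component *c)
{
    assert(c != NULL);
    if (c->knem_available || c->use_knem <= 0) {
        return SKETCH_PROCEED;
    }
    return (c->error_cb != NULL) ? SKETCH_REPORT : SKETCH_EXIT;
}

/* Example handler that only logs; per the thread, the only real PML
   handler aborts the job. */
static void sketch_log_error(const char *reason)
{
    fprintf(stderr, "error handler called: %s\n", reason);
}

/* Acts on the decision; the SKETCH_EXIT branch does not return. */
static void sketch_component_init(const struct sketch_component *c)
{
    static const char *reason = "knem requested but not available";

    switch (sketch_decide(c)) {
    case SKETCH_PROCEED:
        break;
    case SKETCH_REPORT:
        c->error_cb(reason);
        break;
    case SKETCH_EXIT:
        fprintf(stderr, "%s\n", reason);
        exit(EXIT_FAILURE);
    }
}
```

In the situation discussed here the handler is still NULL during component init, so the sketch takes the SKETCH_EXIT branch, which mirrors the show_help() plus exit(nonzero) suggestion quoted below.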

Thanks
Ralph


> On Dec 4, 2014, at 12:52 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> 
> I was not advocating calling exit. I was merely suggesting that, because we 
> are so early in the initialization process and lack the infrastructure to 
> abort when a specific user request cannot be complied with, calling exit 
> seems like a reasonable band-aid.
> 
>   George.
> 
> On Fri, Dec 5, 2014 at 5:38 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Let me get this straight - you are advocating that I call “exit” directly 
> from within a library?? I thought that was “verboten” - MPI_Init should just 
> return an error somehow, yes?
> 
> > On Dec 4, 2014, at 12:35 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> >
> > Oh, good catch -- thanks.
> >
> > I wouldn't call abort -- that will dump core.  Just show_help() and 
> > exit(nonzero), I guess.
> >
> >
> > On Dec 4, 2014, at 3:31 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> >
> >> You can't use the PML error reporting mechanism in this particular 
> >> instance: it is too early in the setup process (in the BTL component init 
> >> function) and the PML has not set up the error callback yet.
> >>
> >> This function is called during MPI_Init, at a time when most of the 
> >> Open MPI infrastructure is not yet set up. I guess the safest way to force 
> >> the process to fail is to call exit or maybe abort.
> >>
> >> George.
> >>
> >>
> >>
> >> On Fri, Dec 5, 2014 at 3:40 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> >> You're supposed to call the PML error handler, which was passed down to 
> >> the BTL during initialization.
> >>
> >> That is, the BTL registers a btl_register_error function with the PML.  
> >> The PML then calls this function and passes in its error handler function 
> >> pointer.  The BTL can then use that error handler to tell the PML when an 
> >> error occurs.
> >>
> >> Right now, the only PML error handler aborts the job.  So this should be a 
> >> sufficient mechanism.
> >>
> >>
> >> On Dec 3, 2014, at 12:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >>
> >>> We talked during the telecon about the user-reported issue where they 
> >>> asked for knem support, it wasn’t available on the system, but we ran 
> >>> anyway at a reduced performance level. The agreement we had was that OMPI 
> >>> should instead fail at that point since the user had requested something 
> >>> we could not do. I got tasked with implementing this.
> >>>
> >>> Here is the problem code:
> >>>
> >>>   /* If "use_knem" is positive, then it's an error if knem support
> >>>      is not available -- deactivate the sm btl. */
> >>>   if (mca_btl_sm_component.use_knem > 0) {
> >>>       opal_show_help("help-mpi-btl-sm.txt",
> >>>                      "knem requested but not available",
> >>>                      true, opal_process_info.nodename);
> >>>       return NULL;
> >>>   }
> >>>
> >>> As you can see, we deactivate sm but do not necessarily fail. Question 
> >>> for you folks: how do I cause us to safely fail from within a BTL??
> >>>
> >>> Thanks
> >>> Ralph
> >>>
> >>> _______________________________________________
> >>> devel mailing list
> >>> de...@open-mpi.org <mailto:de...@open-mpi.org>
> >>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
> >>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel>
> >>> Link to this post: 
> >>> http://www.open-mpi.org/community/lists/devel/2014/12/16425.php 
> >>> <http://www.open-mpi.org/community/lists/devel/2014/12/16425.php>
> >>
> >>
> >> --
> >> Jeff Squyres
> >> jsquy...@cisco.com <mailto:jsquy...@cisco.com>
> >> For corporate legal information go to: 
> >> http://www.cisco.com/web/about/doing_business/legal/cri/ 
> >> <http://www.cisco.com/web/about/doing_business/legal/cri/>
> >>
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com <mailto:jsquy...@cisco.com>
> 
