Re: [OMPI devel] Exit status

2011-04-14 Thread N.M. Maclaren
On Apr 14 2011, Jeff Squyres wrote: I think Ralph's point is that OMPI is providing the run-time environment for the application, and it would probably behoove us to support both kinds of behaviors since there are obviously people in both camps out there. It's pretty easy to add a non-defaul

Re: [OMPI devel] Exit status

2011-04-14 Thread Jeff Squyres
I think Ralph's point is that OMPI is providing the run-time environment for the application, and it would probably behoove us to support both kinds of behaviors since there are obviously people in both camps out there. It's pretty easy to add a non-default MCA param / orterun CLI option to sa

Re: [OMPI devel] Exit status

2011-04-14 Thread Ken Lloyd
Point well made, Nick. In other words, irrespective of OS or language, are we citing the need for "application correcting code" from OpenMPI (relocate and/or retry), similar to ECC in memory? Ken On Thu, 2011-04-14 at 14:31 +0100, N.M. Maclaren wrote: > On Apr 14 2011, Ralph Castain wrote: > >> >

Re: [OMPI devel] Exit status

2011-04-14 Thread N.M. Maclaren
On Apr 14 2011, Ralph Castain wrote: ... It's hopeless, and whatever you do will be wrong for many people. ... I think that sums it up pretty well. :-) It does seem a little strange that the scenario you describe somewhat implies that one process is calling MPI_Finalize long before th

Re: [OMPI devel] Exit status

2011-04-14 Thread Jeff Squyres
On Apr 14, 2011, at 9:13 AM, Ralph Castain wrote: > I figure this last is the best option. My point was just that we abort the > job if someone calls "abort". However, if they indicate their program is > exiting with "something is wrong", we ignore it. Another option for the user is to kill(get

Re: [OMPI devel] Exit status

2011-04-14 Thread Ralph Castain
On Apr 14, 2011, at 5:33 AM, Jeff Squyres wrote: > On Apr 14, 2011, at 4:02 AM, N.M. Maclaren wrote: > >> ... It's hopeless, and whatever you do will be wrong for many >> people. ... > > I think that sums it up pretty well. :-) > > It does seem a little strange that the scenario you describ

Re: [OMPI devel] Exit status

2011-04-14 Thread Jeff Squyres
On Apr 14, 2011, at 4:02 AM, N.M. Maclaren wrote: > ... It's hopeless, and whatever you do will be wrong for many > people. ... I think that sums it up pretty well. :-) It does seem a little strange that the scenario you describe somewhat implies that one process is calling MPI_Finalize looo

Re: [OMPI devel] Exit status

2011-04-14 Thread N.M. Maclaren
On Apr 14 2011, Ralph Castain wrote: I've run across an interesting issue for which I don't have a ready answer. If an MPI process aborts, we automatically abort the entire job. If an MPI process returns a non-zero exit status, indicating that there was something abnormal about its terminatio

[OMPI devel] Exit status

2011-04-13 Thread Ralph Castain
I've run across an interesting issue for which I don't have a ready answer. If an MPI process aborts, we automatically abort the entire job. If an MPI process returns a non-zero exit status, indicating that there was something abnormal about its termination, we ignore it and let the job continu
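
The two behaviors discussed in this thread (always abort the job when a process aborts; ignore a non-zero exit status unless the user opts in) can be sketched as a small policy function. This is only an illustration of the decision being debated, not Open MPI's actual code: the names `ProcEvent`, `should_abort_job`, and the `abort_on_nonzero` switch are hypothetical stand-ins for the non-default option proposed in the replies.

```python
from dataclasses import dataclass


@dataclass
class ProcEvent:
    """Hypothetical record of how one MPI process ended."""
    rank: int
    called_abort: bool  # the process aborted (e.g. called MPI_Abort)
    exit_status: int    # status returned on otherwise normal termination


def should_abort_job(event: ProcEvent, abort_on_nonzero: bool = False) -> bool:
    """Decide whether the launcher should tear down the whole job.

    Default policy, as described in the original post: an abort kills
    the job, while a non-zero exit status is ignored. Setting the
    hypothetical abort_on_nonzero switch to True treats a non-zero
    exit status like an abort as well.
    """
    if event.called_abort:
        return True
    return abort_on_nonzero and event.exit_status != 0


if __name__ == "__main__":
    failed = ProcEvent(rank=3, called_abort=False, exit_status=1)
    print(should_abort_job(failed))                         # default policy
    print(should_abort_job(failed, abort_on_nonzero=True))  # opt-in policy
```

Keeping the opt-in behavior behind a flag reflects the point made in the replies that there are users in both camps, so neither policy can simply replace the other.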