On Apr 14 2011, Jeff Squyres wrote:
I think Ralph's point is that OMPI is providing the run-time environment
for the application, and it would probably behoove us to support both
kinds of behaviors since there are obviously people in both camps out
there.
It's pretty easy to add a non-defaul
I think Ralph's point is that OMPI is providing the run-time environment for
the application, and it would probably behoove us to support both kinds of
behaviors since there are obviously people in both camps out there.
It's pretty easy to add a non-default MCA param / orterun CLI option to sa
Point well made, Nick. In other words, irrespective of OS or language,
are we citing the need for "application correcting code" from OpenMPI,
(relocate a/o retry) similar to ECC in memory?
Ken
On Thu, 2011-04-14 at 14:31 +0100, N.M. Maclaren wrote:
> On Apr 14 2011, Ralph Castain wrote:
> >>
>
On Apr 14 2011, Ralph Castain wrote:
... It's hopeless, and whatever you do will be wrong for many
people. ...
I think that sums it up pretty well. :-)
It does seem a little strange that the scenario you describe somewhat
implies that one process is calling MPI_Finalize long before th
On Apr 14, 2011, at 9:13 AM, Ralph Castain wrote:
> I figure this last is the best option. My point was just that we abort the
> job if someone calls "abort". However, if they indicate their program is
> exiting with "something is wrong", we ignore it.
Another option for the user is to kill(get
On Apr 14, 2011, at 5:33 AM, Jeff Squyres wrote:
> On Apr 14, 2011, at 4:02 AM, N.M. Maclaren wrote:
>
>> ... It's hopeless, and whatever you do will be wrong for many
>> people. ...
>
> I think that sums it up pretty well. :-)
>
> It does seem a little strange that the scenario you describ
On Apr 14, 2011, at 4:02 AM, N.M. Maclaren wrote:
> ... It's hopeless, and whatever you do will be wrong for many
> people. ...
I think that sums it up pretty well. :-)
It does seem a little strange that the scenario you describe somewhat implies
that one process is calling MPI_Finalize looo
On Apr 14 2011, Ralph Castain wrote:
I've run across an interesting issue for which I don't have a ready answer.
If an MPI process aborts, we automatically abort the entire job.
If an MPI process returns a non-zero exit status, indicating that there
was something abnormal about its terminatio
I've run across an interesting issue for which I don't have a ready answer.
If an MPI process aborts, we automatically abort the entire job.
If an MPI process returns a non-zero exit status, indicating that there was
something abnormal about its termination, we ignore it and let the job
continu
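
For readers skimming the thread, the two behaviors under discussion can be sketched as a small launcher-side policy. This is only an illustration and not Open MPI code: the names `classify_exit`, `should_abort_job`, and the `abort_on_nonzero_exit` switch are hypothetical, and an "abort" is modeled simply as death by signal, following the return-code convention of Python's `subprocess` module on POSIX.

```python
import subprocess
import sys


def classify_exit(returncode):
    """Classify a child's return code (subprocess convention on POSIX).

    A negative value means the child was killed by a signal (for example
    SIGABRT); this stands in for "the process aborted" in this sketch.
    """
    if returncode < 0:
        return "aborted"
    if returncode != 0:
        return "nonzero-exit"
    return "ok"


def should_abort_job(returncode, abort_on_nonzero_exit=False):
    """Decide whether the launcher should tear down the whole job.

    abort_on_nonzero_exit is a hypothetical, non-default switch standing in
    for the kind of option discussed in the thread.
    """
    kind = classify_exit(returncode)
    if kind == "aborted":
        # An abort always takes the job down.
        return True
    if kind == "nonzero-exit":
        # A non-zero exit is ignored unless the user opted in.
        return abort_on_nonzero_exit
    return False


if __name__ == "__main__":
    # Launch a child that exits with status 3 and apply both policies.
    child = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(should_abort_job(child.returncode))
    print(should_abort_job(child.returncode, abort_on_nonzero_exit=True))
```

Keeping the default at "ignore" matches the behavior described at the start of the thread, while the opt-in switch serves the camp that wants a non-zero exit to end the job.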