Sounds great to me.

Aurelien

On 11 Sept. 07 at 13:03, Jeff Squyres wrote:
If you genericize the concept, I think it's compatible with FT:

1. during MPI_INIT, one of the MPI processes can request a "notify"
exit pattern for the job: a process must notify the RTE before it
actually exits (i.e., some ORTE notification during MPI_FINALIZE).
If a process exits before notifying the RTE, it's an error.

1a. The default action upon error can be to kill the entire job.
1b. If you desire plug-in-able error actions (e.g., not kill the
entire job), I'm *assuming* that our plugin frameworks can handle
that...?

2. for an FT MPI job, I assume that the MPI processes would either
not perform step 1 (i.e., the default action upon process exit is
nothing -- just like if you had run "mpirun -np 4 hostname"), or you
would select a specific action upon error/plugin for what to do when
a process exits without first notifying the RTE.
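
The two steps above can be sketched as a small model of the policy. This is
only an illustration under stated assumptions, not ORTE code: the names
`Job`, `ExitPattern`, `request_notify_pattern`, `notify_exit`,
`process_exited`, `kill_job`, and `ignore_exit` are all hypothetical, and the
pluggable error action is modeled as a plain callback rather than a real
plugin framework component.

```python
from enum import Enum


class ExitPattern(Enum):
    DEFAULT = "default"  # exits are not tracked (like "mpirun -np 4 hostname")
    NOTIFY = "notify"    # a process must notify the runtime before exiting


def kill_job(job, rank):
    """Step 1a: the default error action, which aborts the entire job."""
    job.alive.clear()
    job.aborted = True


def ignore_exit(job, rank):
    """One possible FT-style action for step 1b/2: leave survivors running."""


class Job:
    """Hypothetical runtime-side view of one job (not an ORTE structure)."""

    def __init__(self, nprocs, on_error=kill_job):
        self.alive = set(range(nprocs))
        self.notified = set()
        self.pattern = ExitPattern.DEFAULT
        self.on_error = on_error  # pluggable error action
        self.aborted = False

    def request_notify_pattern(self):
        """Step 1: any one process may request this during init."""
        self.pattern = ExitPattern.NOTIFY

    def notify_exit(self, rank):
        """Called on behalf of a process during finalize, before it exits."""
        self.notified.add(rank)

    def process_exited(self, rank):
        """Called when the runtime observes that a process has gone away."""
        if rank not in self.alive:
            return
        self.alive.discard(rank)
        # Exiting without prior notification is an error only under NOTIFY.
        if self.pattern is ExitPattern.NOTIFY and rank not in self.notified:
            self.on_error(self, rank)
```

In this model, a non-FT job calls `request_notify_pattern()` and keeps the
default `kill_job` action, while an FT job either never requests the pattern
or passes its own action (here `ignore_exit`) as `on_error`.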

Howzat?

--
Jeff Squyres
Cisco Systems

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

