Sounds great to me. Aurelien
On Sept. 11, 2007, at 13:03, Jeff Squyres wrote:
If you genericize the concept, I think it's compatible with FT:

1. During MPI_INIT, one of the MPI processes can request a "notify" exit pattern for the job: a process must notify the RTE before it actually exits (i.e., some ORTE notification during MPI_FINALIZE). If a process exits before notifying the RTE, it's an error.
   1a. The default action upon error can be to kill the entire job.
   1b. If you desire plug-in-able error actions (e.g., not kill the entire job), I'm *assuming* that our plugin frameworks can handle that...?
2. For an FT MPI job, I assume that the MPI processes would either not perform step 1 (i.e., the default action upon process exit is nothing -- just like if you had run "mpirun -np 4 hostname"), or you would select a specific action upon error/plugin for what to do when a process exits without first notifying the RTE.

Howzat?

--
Jeff Squyres
Cisco Systems
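
To make the proposal above concrete, here is a minimal toy model of the "notify" exit pattern in Python. It is only a sketch of the idea and not ORTE code: `ToyRuntime`, `request_notify_pattern`, `notify_exit`, `process_exited`, `kill_job`, and `ignore_failure` are all hypothetical names invented for illustration, and the real RTE would do this across processes rather than in one object.

```python
from typing import Callable, List, Set


def kill_job(rte: "ToyRuntime", rank: int) -> None:
    """Hypothetical default error action (step 1a): kill the entire job."""
    rte.log.append(f"rank {rank} exited without notifying; killing job")
    rte.alive.clear()


def ignore_failure(rte: "ToyRuntime", rank: int) -> None:
    """Hypothetical plug-in action (step 1b / 2): record it, keep the job alive."""
    rte.log.append(f"rank {rank} exited without notifying; job continues")


class ToyRuntime:
    """Toy stand-in for the RTE; tracks which ranks are alive and notified."""

    def __init__(self, nprocs: int,
                 on_error: Callable[["ToyRuntime", int], None] = kill_job) -> None:
        self.alive: Set[int] = set(range(nprocs))
        self.notified: Set[int] = set()
        self.notify_required = False   # default: process exit triggers nothing
        self.on_error = on_error
        self.log: List[str] = []

    def request_notify_pattern(self) -> None:
        """Step 1: what a process might request during MPI_INIT."""
        self.notify_required = True

    def notify_exit(self, rank: int) -> None:
        """What a process might send during MPI_FINALIZE."""
        self.notified.add(rank)

    def process_exited(self, rank: int) -> None:
        """Called when the runtime observes that a process has gone away."""
        if rank not in self.alive:
            return
        self.alive.discard(rank)
        # An exit without prior notification is an error only if the
        # "notify" pattern was requested for this job.
        if self.notify_required and rank not in self.notified:
            self.on_error(self, rank)
```

A short usage sketch, again under the same assumptions: a non-FT job requests the pattern and keeps the default action, so an unannounced exit takes the job down, while an FT job either skips the request or supplies its own action.

```python
rte = ToyRuntime(4)
rte.request_notify_pattern()
rte.notify_exit(0)
rte.process_exited(0)      # clean exit, nothing happens
rte.process_exited(1)      # unannounced exit, default action kills the job

ft = ToyRuntime(4, on_error=ignore_failure)
ft.request_notify_pattern()
ft.process_exited(2)       # unannounced exit, job keeps running
```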