On Mar 16, 2007, at 5:44 PM, Mohammad Huwaidi wrote:
> The following code is my trial to write a fault-tolerant
> application on OpenMPI; however, it still does not trap exceptions:

I'm not sure what your question is.
It does not seem to trap exceptions because, at least at first
glance, your program appears to be correct (i.e., no exceptions need
to be thrown).
If you have a program that generates MPI errors and you want to catch
them via MPI::ERRORS_THROW_EXCEPTIONS, you need to configure Open MPI
with --enable-cxx-exceptions when you build it. However, recall
that the MPI standard does not guarantee the state of MPI after an
error has occurred -- i.e., Open MPI does not guarantee that further
calls to MPI functions will behave correctly.
In principle, if the error is a simple MPI argument problem (e.g.,
sending NULL or some other obviously illegal value), Open MPI should
be able to continue without a problem. But if you're looking for
true fault tolerance (e.g., recovering when an MPI send fails because
of a transient error), Open MPI is not yet robust enough to handle
such scenarios,
even if you trap the C++ exception up in your application.
--
Jeff Squyres
Cisco Systems