On Mar 16, 2007, at 5:44 PM, Mohammad Huwaidi wrote:

> The following code is my trial to write a fault-tolerant application on OpenMPI; however, it still does not trap exceptions:

I'm not sure what your question is.

It does not seem to trap exceptions because, at least at first glance, your program appears to be correct (i.e., no exceptions need to be thrown).

If you have a program that generates MPI errors and you want to catch them via MPI::ERRORS_THROW_EXCEPTIONS, then you need to make sure that Open MPI was configured with --enable-cxx-exceptions. However, recall that the MPI standard does not guarantee the state of MPI after an error has occurred -- i.e., Open MPI does not guarantee that further calls to MPI functions will behave as they are supposed to.

In principle, if the error is a simple MPI argument problem (e.g., sending NULL or some other obviously illegal value), Open MPI should be able to continue without a problem. But if you're looking for true fault tolerance (e.g., surviving an MPI send that fails because of a transient error), Open MPI is not yet robust enough to handle such scenarios, even if you trap the C++ exception in your application.

--
Jeff Squyres
Cisco Systems
