On Aug 25, 2007, at 1:20 PM, Rainer Keller wrote:

     /* Protection for recursive invocation */
     if (have_been_invoked) {
         return OMPI_SUCCESS;
     }
     have_been_invoked = true;

This, IMHO, is a wrong thing to do.  The intent of ompi_mpi_abort()
was that it never returned.  But now it is?  That seems wrong to me.

If you read the rest of my e-mail, I was saying that I think it's wrong, too. :-) You did not answer my other question...

Recursive or not, if we add __opal_attribute_noreturn__, the compiler (and Coverity, as David said in a private mail) then knows, that the function
won't exit.

The real question is whether the function can be (or is) called recursively. If it can't / isn't, then we can replace the first "return OMPI_SUCCESS" with an endless sleep and we should be ok. It is *not* ok to replace it with "exit(errcode)" for the THREAD_MULTIPLE scenarios where a different thread is busy processing the actual abort. And do we need better protection for have_been_invoked, such as a lock or an atomic operation?

If this function can't be / is never invoked recursively, then we should make it noisy (in a developer / debug build) if it ever *is* invoked recursively and probably put it into either an endless sleep or invoke abort() because that will clearly be a bug.

I unfortunately do not remember whether I put that recursive protection in to fix a real problem or whether I was trying to be (incorrectly) proactive...

Comments?

--
Jeff Squyres
Cisco Systems

Reply via email to