At 06:07 PM 7/30/2003, Joe Orton wrote: >On Wed, Jul 30, 2003 at 11:21:21AM -0500, William Rowe wrote: >> >> The existing unix/thread_mutex.c code is neither thread-safe >> nor reentrant when using the 'nested' (default) thread mutexes. > >Why do you say "nested (default)"? The pools code is the only place I >can find which actually uses nested mutexes. By default they are not >nested.
You are correct - it's a platform choice, and Win32 chooses the nested as a default because critical sections are much faster. I made a misassumption on this point. >> The bug is most often expressed on SMP boxes, where they >> are so massively parallel that the subtle flaws of modifying the >> mutex object AFTER we have released the lock become very >> apparent. >> >> Examples include a number of error (45) Deadlock Avoided >> alerts using Apache2/mod_ssl on Solaris, where there are both >> internal thread locks and global locks which contain a nested >> thread lock. > >This code looks like it is making lots of dubious assumptions: > >- pthread_t can be a structure, that's why the memset is used I presume, >assinging 0 to a pthread_t definitely wins no prizes We are assigning the pthread_t to the result of apr_os_thread_current() so I presumed it's intregal enough to accept a 0/NULL reset. memset is obviously less atomic than such a reset, using many more cycles. >- who says a memset-to-zero'ed pthread_t isn't a valid pthread_t? No >reason why it couldn't be. If it is *not* a valid pthread_t, then >passing it to pthread_equal gives undefined behaviour anyway! This point *does* scare me, and it suggests some sort of requirement for an APR_THREAD_ID_NONE that's guarenteed to be not-a-thread. >- let's hope and pray that pthread_equal() is an atomic operation Good point, and point taken. >> Many more eyes on this code would be welcomed, I expect to >> discover that we repeat this mistake elsewhere in the mutex >> code... > >Like Aaron I'm not sure why the comments about reentrancy are relevant: >you can't use pthread mutexes (and hence apr mutexes :) from a signal >handler, anyone doing that is SOL. So we should spell that out as a prerequisite. >You've made lots of changes in the patch: > >1. changing some incr++ to ++incr which makes no differences The result is add-and-return, as opposed to evalute and increment. The result will be optimized away since it's unused as an rvalue, but there always is a difference. Evaluting some bytecode might be amusing. >2. re-ordered the mutex_unlock code so that the ownership changes occur >inside the critical section. This is the *critical* point of the patch, and one that I would commit first and prior to any other "legibility" or other insignificant aspects of the patch. >3. added some "protection" against races > >1 and 3 look bogus; 2 looks like the important change here. The only >effect of your (3) changes would seem to be to hide races if there still >are any? No, the next step in 3 is to actually provide a mechanism for throwing an exception when those cases are encountered. >(2) does introduce a semantic change in that the caller cannot rely on >pthread_mutex_unlock() having defined behaviour if called from a thread >which does not own the mutex lock, but I expect that's no big deal. If we are crossing ownership due to this or other races or reentrancy, this is a significant change, but it should probably become a fatal exception. >It's worth noting that most modern POSIX implementations offer >"recursive" mutexes in which pthread_mutex_lock/unlock do all this work >internally. Which should be detected with autoconf. As I invited, more eyes would be much appreciated. Bill
