(I sent this originally to Joe Orton who suggested I post it to this list instead):

I've recently been debugging an issue involving Solaris, Apache, and EDEADLK. Searching Google turned up several posts on the subject, including this fairly recent one:

   http://www.mail-archive.com/[email protected]/msg19804.html

"The default was changed to fcntl because of the potential for deadlocks
in use of cross-process pthread mutexes:

    http://marc.info/?l=apr-dev&m=108720968023158&w=2

are those issues not seen any more? Since that decision was due to a
potential OS bug (robust mutexes which aren't robust) has it been
confirmed with Sun that this fcntl/EDEADLK is definitely not an OS bug?"

I don't know whether a reply was ever received (I haven't found one in my searching). In my case at least, I can confirm from extensive DTrace debugging of Apache 2.2.8 locking behavior under Solaris 10 that no, this is not a Solaris bug. Solaris is correctly detecting the classic deadlock involving (at least) two locks: process 1 holds lock A and wants lock B, while process 2 holds lock B and wants lock A. I see exactly this pattern in my DTrace output just before the EDEADLK return.

This always involves the AcceptMutex and one other lock, usually a global mutex. It occurs because the Worker MPM is both threaded and multi-process, so two threads in one Worker MPM process can be involved with different locks at the same time: one holding the AcceptMutex and the other wanting to lock, say, the mod_rewrite RewriteLock. If another Worker MPM process then has one thread holding the RewriteLock and a second thread wanting the AcceptMutex, EDEADLK is returned to one of them, because Solaris evaluates these locks at the process level, not the thread level. If the locks were treated as thread-level, there would be no deadlock, since neither thread that holds a lock is itself waiting for one.
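
To illustrate the point, here is a small Python sketch of the scenario above. It is a toy model only, not how the Solaris kernel implements its check, and every name in it (HELD, WANTED, by_process, by_thread, deadlocked, the thread labels) is made up for the illustration. It builds a wait-for graph from the four threads and looks for a cycle, once with lock owners identified by process and once by thread:

```python
# Toy model only: NOT the Solaris kernel's algorithm. All names are hypothetical.

# (process, thread) -> lock that thread currently holds
HELD = {(1, "t1"): "AcceptMutex", (2, "t1"): "RewriteLock"}
# (process, thread) -> lock that thread is blocked waiting for
WANTED = {(1, "t2"): "RewriteLock", (2, "t2"): "AcceptMutex"}


def by_process(thread_id):
    """Lock owner as a process-level lock sees it: just the process."""
    return thread_id[0]


def by_thread(thread_id):
    """Lock owner if locks were tracked per thread."""
    return thread_id


def deadlocked(held, wanted, owner_of):
    """Return True if the wait-for graph contains a cycle.

    Simplification: each owner waits on at most one lock at a time.
    """
    holder = {lock: owner_of(t) for t, lock in held.items()}
    waiting_for = {owner_of(t): lock for t, lock in wanted.items()}
    for start in waiting_for:
        seen = set()
        owner = start
        # Walk: owner -> lock it wants -> owner holding that lock -> ...
        while owner in waiting_for:
            if owner in seen:
                return True  # came back to an owner already on this path
            seen.add(owner)
            owner = holder.get(waiting_for[owner])
    return False


print("process-level view deadlocked:", deadlocked(HELD, WANTED, by_process))
print("thread-level view deadlocked: ", deadlocked(HELD, WANTED, by_thread))
```

With owners identified by process, the walk goes process 1 -> RewriteLock -> process 2 -> AcceptMutex -> process 1, which is a cycle. With owners identified by thread, each waiting thread leads to a holding thread that is not waiting on anything, so no cycle is found.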

I've seen that, for some people, setting AcceptMutex pthread fixed a similar problem, but I'm concerned about your comment quoted above. Have you heard whether the robustness problems with cross-process pthread mutexes have been solved?

    Sincerely yours,

    Michael Durket
