Many thanks for your detailed reply. I looked further at the bug numbers you posted... a couple seem significant and happen to be patched on my test server (137111-07).
In a couple of my test runs, "AcceptMutex pthread" trailed the tests with sysvsem, but not conclusively. However, contrary to my earlier report, I still get sporadic errors:

[Wed Feb 11 14:40:32 2009] [error] (45)Deadlock situation detected/avoided: apr_global_mutex_lock(jk_log_lock) failed

I also re-ran separate tests with mod_ssl and a plain index.html and consistently observed that Apache with the mutex set to sysvsem is a fraction faster than with posixsem, which is a fraction faster than with pthread. No deadlocks, except with fcntl.

Thanks again - Fred

Rainer Jung-3 wrote:
>
> On 06.02.2009 20:40, fredk2 wrote:
>> Do I understand you correctly that when Mr. Orton said to never use
>> pthread nor posixsem mutex
>> (http://marc.info/?l=apr-dev&m=108720968023158&w=2), that is now
>> obsolete news and that Solaris has perfected pthread mutex support
>> since?
>
> Joe Orton is always very careful with his statements, precise and
> correct. My personal experience with pthread mutexes on Solaris was
> fine, but I must confess that I didn't do specialized tests to
> determine behaviour in crash situations.
>
> I now did some searching and it turns out that the implementation of
> pthread mutexes for Solaris 10 has very recently changed quite a bit. So
> all speculations about improved pthread mutex behaviour (especially for
> "robust" mutexes) in the last years might have become obsolete.
>
> The new implementation is contained in Solaris kernel patch 137137-09
> and most likely also in Solaris 10 Update 6 (10/08). I didn't check
> whether that update simply contains the kernel patch or the fix is
> included independently.
>
> Some detail is logged in Sunsolve under the bug IDs
>
> 6296770
> 2160259
> 6664275
> 6697344
> 6729759
> 6564706
>
>> You mention that mod_jk uses pthread; is that the same as the httpd
>> itself?
>
> mod_jk uses a global mutex provided by the apr libraries for access to
> the log file. It gets a default mutex, i.e.
> it lets APR decide which type of mutex to use (APR_LOCK_DEFAULT; for
> Solaris it should be fcntl). You can't configure it the way you can
> httpd's accept or SSL mutex.
>
> mod_jk uses a couple more locks, none of which are APR provided;
> instead they are directly coded to use pthreads. All of those mutexes
> are only thread mutexes, so they are used locally in each process and
> not shared between processes. They won't have a problem with crashing
> processes.
>
> They are:
>
> - one mutex for each AJP worker, synchronizing access to the connection
>   pool, which exists per process
>
> - one mutex for each lb worker
>
> - a mutex for access to the shared memory when changing or reading
>   configuration parameters. That might be a little unsafe, because it
>   actually should be a global mutex, not a process-local one, but those
>   config changes are only done due to interaction with the status
>   worker, so there's very little chance for unwanted concurrency here.
>   All dynamic runtime data are already marked as being volatile.
>
> - a mutex used during dynamic update of uriworkermap.properties to
>   prevent concurrent updates. Updates are done per process.
>
> - a mutex to prevent concurrent execution of the process-local internal
>   maintenance task
>
>> Some fellow at Covalent, back in the early Apache 2.0 days, posted a
>> white paper about his various mutex testing, but it does not appear to
>> be available anymore. It would be interesting to know how it was tested
>> and how it would play out today.
>
> Lots of the Covalent people are still around in various projects, like
> William (Bill) A. Rowe and Jim Jagielski. You could post at apr-dev,
> because Apache httpd uses the mutex implementations coming from the APR
> libraries.
>
>> Rainer Jung-3 wrote:
>>> On 06.02.2009 18:13, fredk2 wrote:
>>>> I was doing some stress tests (with apache ab, 100 users, 100K
>>>> requests) to compare an Apache prefork and worker mpm.
>>>> The test URL is a simple hello servlet on Tomcat 6.0.x via mod_jk.
>>>> On my Sparc Solaris 10 server, only with Apache set to the worker
>>>> mpm, I see the following error messages in my jk log:
>>>>
>>>> Apache/2.2.11 (Unix) with mod_jk/1.2.26 on Solaris 10.
>>>> . . .
>>>> [Thu Jan 08 11:42:28 2009] [error] (45)Deadlock situation detected/avoided: apr_global_mutex_lock(jk_log_lock) failed
>>>> . . .
>>>> [Thu Jan 08 11:42:29 2009] [emerg] (45)Deadlock situation detected/avoided: apr_proc_mutex_lock failed. Attempting to shutdown process gracefully.
>>>> [Thu Jan 08 11:42:29 2009] [error] (45)Deadlock situation detected/avoided: apr_global_mutex_lock(jk_log_lock) failed
>>>> . . .
>>>>
>>>> These errors do not appear to impact the test results, and the jk log
>>>> file seems complete.
>>>>
>>>> I can suppress the errors by choosing another mutex in the Apache
>>>> directive AcceptMutex, such as sysvsem or pthread. For Solaris 10 the
>>>> default mutex for the worker MPM is fcntl. Setting the mutex to
>>>> sysvsem (also the default on Linux) marginally improves the request
>>>> time.
>>>>
>>>> Can someone explain what exactly these errors mean? When do they
>>>> occur? I would almost have expected a "detected/avoided" to be a
>>>> [warn] instead of an [error].
>>>>
>>>> I have seen the thread http://markmail.org/message/dedqpmrrkpa224ns
>>>> but I'd like to hear updated experiences that people have with
>>>> sysvsem mutexes on Solaris 10 - what is the better mutex? sysvsem,
>>>> posixsem, pthread **?
>>>>
>>>> Any comment will be appreciated.
>>> I experienced this too a couple of times and once wrote a small C
>>> program to reproduce the problem. On Solaris the algorithm to detect a
>>> possible deadlock is very careful and returns EDEADLOCK even in
>>> situations where you can mathematically prove that a deadlock is not
>>> possible.
>>> This happens in a multi-threaded environment when more than one mutex
>>> is used.
>>>
>>> Apache httpd and mod_jk use such a mutex, and SSL does too (so you can
>>> observe the same warnings without mod_jk, using only SSL with httpd
>>> and doing stress tests).
>>>
>>> In older JK versions this could lead to a hang, but we worked around
>>> that a couple of versions ago. I generally recommend the pthread mutex
>>> for Solaris, which doesn't have the problem and seems to be robust
>>> despite warnings about pthread mutexes in very old versions of Solaris.
>>>
>>> We even once had a discussion about changing the default httpd mutex
>>> on Solaris, but I think that discussion didn't come to an end.
>>>
>>> Regards,
>>>
>>> Rainer
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>

--
View this message in context: http://www.nabble.com/Deadlock-situation-detected-avoided-with-jk_log_lock-tp21876381p21964001.html
Sent from the Tomcat - User mailing list archive at Nabble.com.
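
For anyone who wants to experiment with the fcntl case discussed in this thread, here is a minimal C sketch of a whole-file fcntl lock that hands a deadlock report (EDEADLK) back to the caller instead of treating it as fatal. It is only an illustration under stated assumptions: the helper names file_lock_op and file_lock_retry, the retry count, and the 1 ms pause are all hypothetical, and this is neither APR's fcntl mutex code nor the workaround that went into mod_jk.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not an APR or mod_jk function): take or release
 * a whole-file lock with fcntl. Returns 0 on success, else an errno. */
static int file_lock_op(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;          /* F_WRLCK to lock, F_UNLCK to unlock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;              /* 0 means "through end of file" */

    for (;;) {
        if (fcntl(fd, F_SETLKW, &fl) == 0)
            return 0;
        if (errno == EINTR)
            continue;          /* interrupted by a signal: try again */
        return errno;          /* includes EDEADLK; caller decides */
    }
}

/* Hypothetical policy: when the kernel reports EDEADLK, pause briefly
 * and retry a bounded number of times rather than giving up at once. */
static int file_lock_retry(int fd, int max_tries)
{
    const struct timespec pause = { 0, 1000000L };  /* 1 ms, assumed */
    int rc = 0;

    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        rc = file_lock_op(fd, F_WRLCK);
        if (rc != EDEADLK)
            return rc;         /* success, or a different error */
        fprintf(stderr, "lock: EDEADLK reported, attempt %d of %d\n",
                i + 1, max_tries);
        nanosleep(&pause, NULL);
    }
    return rc;                 /* still EDEADLK after max_tries */
}
```

A caller would open a lock file, call file_lock_retry(fd, 3) before the critical section, and call file_lock_op(fd, F_UNLCK) afterwards. Whether retrying is appropriate depends on the application; if the reported deadlock is real, the retries will simply fail again.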