Re: Deadlock situation detected/avoided with jk_log_lock

fredk2 Wed, 11 Feb 2009 13:07:57 -0800

Many thanks for your detailed reply.
I looked further at the bug numbers you posted... a couple seem significant
and happen to be patched on my test server (137111-07).


In a couple of my test runs, "AcceptMutex pthread" trailed the runs with
sysvsem, but not conclusively. However, contrary to my earlier report,
I still get sporadic errors:
[Wed Feb 11 14:40:32 2009] [error] (45)Deadlock situation detected/avoided: apr_global_mutex_lock(jk_log_lock) failed

I also re-ran separate tests with mod_ssl and a plain index.html and
consistently observed that Apache with the mutex set to sysvsem is a fraction
faster than with posixsem, which in turn is a fraction faster than with
pthread. No deadlock errors, except with fcntl.

Thanks again - Fred


Rainer Jung-3 wrote:
> 
> On 06.02.2009 20:40, fredk2 wrote:
>> Do I understand you correctly that when Mr. Orton said to never use pthread
>> nor posixsem mutex (http://marc.info/?l=apr-dev&m=108720968023158&w=2), that
>> is now obsolete news and that Solaris has perfected pthread mutex support
>> since?
> 
> Joe Orton is always very careful with his statements, precise and 
> correct. My personal experience with pthread mutexes on Solaris was 
> fine, but I must confess, that I didn't do specialized tests to 
> determine behaviour in crash situations.
> 
> I now did some searching and it turns out that the implementation of 
> pthread mutexes for Solaris 10 has very recently changed quite a bit. So 
> all speculations about improved pthread mutex behaviour (especially for 
> "robust" mutexes) in the last years might have become obsolete.
> 
> The new implementation is contained in Solaris kernel patch 137137-09 
> and most likely also in Solaris 10 Update 6 (10/08). I didn't check, 
> whether that update simply contains the kernel patch or the fix is 
> included independently.
> 
> Some detail is logged in Sunsolve under the bug IDs
> 
> 6296770
> 2160259
> 6664275
> 6697344
> 6729759
> 6564706
> 
>> You mention that mod_jk uses pthread is that the same as the httpd
>> itself?
> 
> mod_jk uses a global mutex provided by the apr libraries for access to 
> the log file. It gets a default mutex, i.e. it lets APR decide, which 
> type of mutex to use (APR_LOCK_DEFAULT, for Solaris it should be fcntl). 
> You can't configure it the way you can httpd's accept or SSL mutex.
> 
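For readers unfamiliar with what an fcntl-based mutex amounts to, here is a minimal sketch in C. It assumes plain POSIX record locks and is not APR's actual implementation. The helper names (file_lock_op, file_mutex_lock, file_mutex_unlock, file_mutex_demo) and the scratch path are hypothetical. POSIX allows a blocking F_SETLKW request to fail with EDEADLK when the system believes that waiting would deadlock, which is presumably where the "Deadlock situation detected/avoided" text in the log comes from.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: apply (F_WRLCK) or release (F_UNLCK) a whole-file
 * record lock. Returns 0 on success or the errno value on failure
 * (EDEADLK is one of the possible failures for F_SETLKW). */
static int file_lock_op(int fd, short type)
{
    struct flock fl = {0};
    int rc;

    assert(fd >= 0);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                       /* 0 means "until end of file" */

    do {
        rc = fcntl(fd, F_SETLKW, &fl);  /* blocks until the lock is granted */
    } while (rc == -1 && errno == EINTR);

    return rc == -1 ? errno : 0;
}

int file_mutex_lock(int fd)   { return file_lock_op(fd, F_WRLCK); }
int file_mutex_unlock(int fd) { return file_lock_op(fd, F_UNLCK); }

/* Demo: lock and unlock a scratch file; returns 0 if every step succeeds. */
int file_mutex_demo(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    int rc;

    if (fd == -1)
        return errno;

    rc = file_mutex_lock(fd);
    if (rc == 0) {
        /* critical section, e.g. appending one line to a shared log */
        rc = file_mutex_unlock(fd);
    }
    close(fd);
    unlink(path);
    return rc;
}
```

Calling file_mutex_demo() with a writable scratch path returns 0 when both steps succeed. Record locks are owned by the process, so on their own they serialise processes but not threads inside one process.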
> mod_jk uses a couple more locks, none of which are provided by APR; they 
> are coded directly against pthreads. All of those mutexes are thread 
> mutexes only, so they are used locally in each process and not shared 
> between processes. They won't have a problem with crashing processes.
> 
> They are:
> 
> - one mutex for each AJP worker, synchronizing access to the connection 
> pool, which exists per process
> 
> - one mutex for each lb worker
> 
> - a mutex for access to the shared memory when changing or reading 
> configuration parameters. That might be a little unsafe, because it 
> actually should be a global mutex, not a process local, but those config 
> changes are only done due to interaction with the status worker, so 
> there's very little chance for unwanted concurrency here. All dynamic 
> runtime data are already marked as being volatile.
> 
> - a mutex used during dynamic update of uriworkermap.properties to 
> prevent concurrent updates. Updates are done per process.
> 
> - a mutex to prevent concurrent execution of the process local internal 
> maintenance task
> 
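To illustrate the first kind of lock in the list above, here is a minimal sketch of a process-local pthread mutex guarding a per-process connection pool. Everything in it is an assumption made for illustration: the struct, the function names and the sizes are hypothetical and are not taken from the mod_jk sources.

```c
#include <assert.h>
#include <pthread.h>

#define POOL_SIZE   4
#define NUM_THREADS 8
#define ITERATIONS  1000

/* Hypothetical per-process pool, reduced to a counter of free slots. */
struct conn_pool {
    pthread_mutex_t lock;   /* thread mutex, never shared across processes */
    int free_slots;
};

/* Take one slot if available; returns 1 on success, 0 if the pool is empty. */
static int pool_acquire(struct conn_pool *p)
{
    int ok = 0;
    pthread_mutex_lock(&p->lock);
    if (p->free_slots > 0) {
        p->free_slots--;
        ok = 1;
    }
    pthread_mutex_unlock(&p->lock);
    return ok;
}

/* Give one slot back to the pool. */
static void pool_release(struct conn_pool *p)
{
    pthread_mutex_lock(&p->lock);
    p->free_slots++;
    pthread_mutex_unlock(&p->lock);
}

static void *worker(void *arg)
{
    struct conn_pool *p = arg;
    for (int i = 0; i < ITERATIONS; i++) {
        if (pool_acquire(p))
            pool_release(p);
    }
    return NULL;
}

/* Demo: hammer the pool from several threads and return the final number
 * of free slots, which should equal POOL_SIZE again. */
int pool_demo(void)
{
    struct conn_pool pool;
    pthread_t threads[NUM_THREADS];
    int started = 0;

    pthread_mutex_init(&pool.lock, NULL);
    pool.free_slots = POOL_SIZE;

    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, worker, &pool) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&pool.lock);
    return pool.free_slots;
}
```

Because the mutex lives in ordinary process memory, it vanishes together with a crashed process and cannot leave other processes blocked, which matches the point made above about crashing processes.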
>> Some fellow at Covalent, back in the early Apache 2.0 days, posted a white
>> paper about his various mutex tests, but it does not appear to be available
>> anymore. It would be interesting to know how it was tested and how it would
>> play out today.
> 
> Lots of the Covalent people are still around in various projects, like 
> William (Bill) A. Rowe and Jim Jagielski. You could post at apr-dev, 
> because Apache httpd uses the mutex implementations coming from the APR 
> libraries.
> 
>> Rainer Jung-3 wrote:
>>> On 06.02.2009 18:13, fredk2 wrote:
>>>> I was doing some stress tests (with apache ab, 100 users, 100K requests)
>>>> to compare the Apache prefork and worker MPMs. The test URL is a simple
>>>> hello servlet on Tomcat 6.0.x via mod_jk. On my Sparc Solaris 10 server,
>>>> and only with Apache set to the worker MPM, I see the following error
>>>> messages in my jk log:
>>>>
>>>> Apache/2.2.11 (Unix) with mod_jk/1.2.26 on Solaris 10.
>>>> . . .
>>>> [Thu Jan 08 11:42:28 2009] [error] (45)Deadlock situation detected/avoided: apr_global_mutex_lock(jk_log_lock) failed
>>>> . . .
>>>> [Thu Jan 08 11:42:29 2009] [emerg] (45)Deadlock situation detected/avoided: apr_proc_mutex_lock failed. Attempting to shutdown process gracefully.
>>>> [Thu Jan 08 11:42:29 2009] [error] (45)Deadlock situation detected/avoided: apr_global_mutex_lock(jk_log_lock) failed
>>>> . . .
>>>>
>>>> These errors do not appear to impact the test results, and the jk log
>>>> file seems complete.
>>>>
>>>> I can suppress the errors by choosing another mutex in the Apache
>>>> directive AcceptMutex, such as sysvsem or pthread. For Solaris 10 the
>>>> default mutex for the worker MPM is fcntl. Setting the mutex to sysvsem
>>>> (also the default on Linux) marginally improves the request time.
>>>>
>>>> Can someone explain what exactly these errors mean? When do they occur?
>>>> I would almost have expected a "detected/avoided" to be a [warn] instead
>>>> of an [error].
>>>>
>>>> I have seen the thread http://markmail.org/message/dedqpmrrkpa224ns but
>>>> I'd like to hear updated experiences that people have with sysvsem
>>>> mutexes on Solaris 10. What is the better mutex: sysvsem, posixsem, or
>>>> pthread?
>>>>
>>>> any comment will be appreciated.
>>> I experienced this too a couple of times and once wrote a small C
>>> program to reproduce the problem. On Solaris the algorithm to detect a
>>> possible deadlock is very careful and returns EDEADLOCK even in
>>> situations where you can mathematically prove that a deadlock is not
>>> possible. This happens in a multi-threaded environment when more than
>>> one mutex is used.
>>>
>>> Apache httpd and mod_jk use such a mutex, and so does SSL (so you can
>>> observe the same warnings without mod_jk, using only SSL with httpd and
>>> doing stress tests).
>>>
>>> In older JK versions this could lead to a hang, but we worked around
>>> that a couple of versions ago. I generally recommend the pthread mutex
>>> for Solaris which doesn't have the problem and seems to be robust
>>> despite warnings about pthread mutexes in very old versions of Solaris.
>>>
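The message does not say how the hang in older JK versions was worked around. Purely as an illustration of one way a caller can avoid stalling on a spurious deadlock report, the sketch below retries a lock attempt a bounded number of times when it fails with EDEADLK. This is an assumption, not mod_jk's actual fix; lock_with_retry, flaky_lock and retry_demo are hypothetical names, and flaky_lock only simulates a lock that reports EDEADLK a few times.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* A lock attempt: returns 0 on success or an errno value on failure. */
typedef int (*lock_fn)(void *ctx);

/* Retry the lock while it reports EDEADLK, at most max_attempts times.
 * Any other result (success or a different error) is returned at once. */
int lock_with_retry(lock_fn try_lock, void *ctx, int max_attempts)
{
    int rc = EDEADLK;

    assert(try_lock != NULL && max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts && rc == EDEADLK; attempt++)
        rc = try_lock(ctx);

    return rc;
}

/* Simulated lock: reports EDEADLK *ctx times, then succeeds. */
static int flaky_lock(void *ctx)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return EDEADLK;
    }
    return 0;
}

/* Demo: returns the final status after retrying the simulated lock. */
int retry_demo(int failures, int max_attempts)
{
    int remaining = failures;
    return lock_with_retry(flaky_lock, &remaining, max_attempts);
}
```

With two simulated failures and five allowed attempts the call succeeds; with more failures than attempts the caller gets EDEADLK back and can decide to log the error and continue without the lock, rather than block indefinitely.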
>>> We even once had a discussion about changing the default httpd mutex on
>>> Solaris, but I think that discussion didn't come to a conclusion.
>>>
>>> Regards,
>>>
>>> Rainer
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
> 
> 
> 


