Obligatory

http://xkcd.com/979/

;-)

On Jun 29, 2012, at 9:35 AM, Brook, James wrote:

> It's probably bad form to keep answering my own mails but no-one had anything 
> to say about this. Are there still people on the list who are familiar with 
> the adaptor internals? This problem is causing us a lot of pain in production.
> 
> Does anyone use the MPM worker module with Apache or are we all still with 
> pre-fork? I don't think we could live without the performance gains. Perhaps 
> it doesn't matter.
> 
> I haven't quite proven this but I am pretty certain that my problem is with 
> fcntl. That's what the adaptor uses to lock the shared memory file. It's 
> apparently an outdated way of doing this - APR now has better abstractions 
> for these sorts of mutexes. Even the code that does the locking is in a retry 
> loop with up to 50 attempts! I started trying to rewrite the locking stuff 
> but I am out of my depth.
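
For anyone following along, the pattern described above looks roughly like the sketch below. This is not the adaptor's lock_file_section(): lock_range() and its parameters are names made up for illustration, and I am assuming a blocking F_SETLKW request that is retried on EINTR and EDEADLK. Bear in mind that POSIX fcntl record locks belong to the process, not the thread, so two worker threads in the same httpd child never exclude each other through them.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the adaptor's code): apply an fcntl byte-range
 * lock of the given type (F_RDLCK, F_WRLCK or F_UNLCK) to
 * [start, start + len) of fd, retrying up to max_attempts times.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int lock_range(int fd, off_t start, off_t len, short type,
                      int max_attempts)
{
    struct flock fl;
    int attempt;

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        memset(&fl, 0, sizeof(fl));
        fl.l_type = type;
        fl.l_whence = SEEK_SET;
        fl.l_start = start;
        fl.l_len = len;

        /* F_SETLKW blocks until the range is free or an error occurs. */
        if (fcntl(fd, F_SETLKW, &fl) == 0)
            return 0;

        /* Retry only if interrupted or if the kernel reported a deadlock. */
        if (errno != EINTR && errno != EDEADLK)
            break;
    }
    return -1;
}
```

Calling lock_range(fd, offset, length, F_WRLCK, 50) would match the 50 attempts mentioned above. If the kernel keeps answering EDEADLK, retrying may only postpone the failure rather than avoid it.
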
> 
> It strikes me that in general this would not be a bad bit of code for the 
> community to have updated. Can anyone help me with that please?
> 
> James
> 
> ________________________________________
> From: Brook, James
> Sent: 13 June 2012 18:48
> To: <[email protected]>
> Subject: Re: Deadlock on Apache 2.2 Adaptor under high load - Solaris 10 - Worker MPM
> 
> Now I have some detailed adaptor logging from a time close to the deadlock. 
> Here is an example of an error with a lock:
> 
>   Debug: thread 37 locking WOShmem_lock from ../Adaptor/shmem.c:375
>   Debug: thread 37 unlocking WOShmem_lock from ../Adaptor/shmem.c:379
>   Error: lock_file_section(): failed to lock (1 attempts): Deadlock situation detected/avoided
>   Debug: thread 37 locking str_lock from ../Adaptor/wastring.c:93
>   Debug: thread 37 unlocking str_lock from ../Adaptor/wastring.c:100
>   Debug: thread 37 locking str_lock from ../Adaptor/wastring.c:152
>   Debug: thread 37 unlocking str_lock from ../Adaptor/wastring.c:158
>   Debug: thread 37 locking WOShmem_lock from ../Adaptor/shmem.c:391
>   Debug: thread 37 unlocking WOShmem_lock from ../Adaptor/shmem.c:394
>   Error: ac_readConfiguration: WOShmem_lock() failed. Skipping reading config.
> 
> On Jun 13, 2012, at 5:30 PM, James Brook wrote:
> 
>> We have a big problem with the Apache 2.2 WebObjects adaptor on our Solaris 
>> 10 web servers. We are using the 'worker' MPM but when the sites get busy 
>> nearly every Apache thread is waiting for a shared memory lock to call the 
>> function that reads the adaptor config. The remaining threads are in the 
>> fcntl function trying to lock a section of shared memory. See below for a 
>> couple of example thread stacks.
>> 
>> I read in several posts that fcntl on Solaris 10 causes deadlocks under high 
>> load and that the problem is worse with the 'worker' MPM. The recommended 
>> locking mechanism for Solaris seems to be to use pthreads.
>> 
>> I know that at least a few list members are running with the Solaris 
>> adaptor. My questions:
>>  * Has anyone experienced this problem and found a solution?
>>  * Is anyone using the 'worker' MPM, or do people still use pre-fork? (I 
>> don't think this is a thread safety problem.)
>>  * Any help or suggestions? Especially, any tips on rewriting to use 
>> pthreads?
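
On the pthreads question: one option is a mutex created with PTHREAD_PROCESS_SHARED and stored inside the shared mapping itself, so that every thread in every httpd child contends on the same lock. A minimal sketch, assuming the region is backed by a file descriptor you already hold. shared_region, shared_region_map(), shared_region_bump() and config_generation are hypothetical names for illustration, not adaptor code.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout: the mutex lives in the shared mapping it protects. */
typedef struct {
    pthread_mutex_t mutex;
    int config_generation;      /* stand-in for the protected data */
} shared_region;

/*
 * Size the file, map it MAP_SHARED and initialise a process-shared mutex.
 * Assumption: this runs once, in the parent, before any child is forked.
 * Returns NULL on failure.
 */
static shared_region *shared_region_map(int fd)
{
    pthread_mutexattr_t attr;
    shared_region *r;
    int rc;

    if (ftruncate(fd, (off_t)sizeof(*r)) != 0)
        return NULL;
    r = mmap(NULL, sizeof(*r), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (r == MAP_FAILED)
        return NULL;

    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(r, sizeof(*r));
        return NULL;
    }
    /* Without PROCESS_SHARED the mutex is only valid inside one process. */
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0)
        rc = pthread_mutex_init(&r->mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        munmap(r, sizeof(*r));
        return NULL;
    }
    r->config_generation = 0;
    return r;
}

/* Example critical section: increment the counter while holding the lock. */
static int shared_region_bump(shared_region *r)
{
    int value;

    if (pthread_mutex_lock(&r->mutex) != 0)
        return -1;
    value = ++r->config_generation;
    pthread_mutex_unlock(&r->mutex);
    return value;
}
```

One thing to watch: if a child dies while holding the mutex, the lock stays held unless the platform offers robust mutexes and you use them. APR's apr_global_mutex_* functions are the other route mentioned in this thread and may be less work inside an Apache module.
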
>> 
>> --
>> James
>> 
>> 
>> feec5638 fcntl    (d8, 7, 2abe588)
>> feeb8258 fcntl    (d8, 1, fefcc200, 4d6880, 1580, 20a58) + 84
>> febe8570 lock_file_section (d8, 4d6880, 14, 2abe588, 147c, 2) + 58
>> febe8e14 WOShmem_lock (2abe588, 14, 1, 4d6880, 1580, 1400) + d4
>> febef410 ac_readConfiguration (1, fffee980, 11400, fec08f74, 1d84, 1c00) + 40
>> febe71cc _runRequest (fc9fb9c4, 0, 2d77168, 2d18b40, 5, 0) + 260
>> febe6a0c tr_handleRequest (2d18b40, 27226f0, fc9fbc50, 0, 5, 2) + 30c
>> febf42a8 WebObjects_handler (2721208, 0, 10000, 0, 2d18b40, fec08f74) + 48c
>> 00041484 ap_run_handler (2721208, febf3e1c, 7b578, 6b5a10, 2, 8) + 40
>> 00041ab4 ap_invoke_handler (2721208, 0, 2721208, 0, 6b58bc, 79c00) + ec
>> 0005132c ap_process_request (2721208, 79400, 4, 1, 0, 2721208) + 54
>> 0004d9a4 ap_process_http_connection (26b61c0, 7c000, 0, 1, 79548, 5) + 78
>> 00049654 ap_process_connection (26b61c0, 26b5f10, 6b5d90, 0, 7bd98, 6b5d78) + d4
>> 00057558 worker_thread (14d888, ad7, fc9fbf98, 7c24c, 2b, 17) + 280
>> feec5238 _lwp_start (0, 0, 0, 0, 0, 0)
>> 
>> 
>> feec52d8 lwp_park (0, 0, 0)
>> feebf350 cond_wait_queue (ef50a8, ef5090, 0, 0, 1c00, 0) + 28
>> feebf874 cond_wait (ef50a8, ef5090, ef50a8, 0, fec0a8f8, 3) + 10
>> feebf8b0 pthread_cond_wait (ef50a8, ef5090, ef5090, 0, 1c00, 3a) + 8
>> febf2730 _WA_lock (ef5088, febf5974, ef50a8, 0, fec0a8f8, 3) + 90
>> febe9494 sha_lock (100, 4, fffeca64, fec08f74, ef3230, 13400) + 5c
>> febedd84 ac_findApplication (fe0fb54c, 4, fec0acfc, fec08f74, 0, fec0a474) + 70
>> febe6794 tr_handleRequest (2402c38, 30bbec0, fe0fb7d8, 798f0, ffffffff, 14400) + 94
>> febf42a8 WebObjects_handler (30baf40, 0, 10000, 0, 2402c38, fec08f74) + 48c
>> 00041484 ap_run_handler (30baf40, febf3e1c, 7b578, 6b5a10, 2, 8) + 40
>> 00041ab4 ap_invoke_handler (30baf40, 0, 2ba5f10, 2ba5348, 30baf40, 2b824d8) + ec
>> 0003f080 ap_run_sub_req (ffffffff, 30bb0e8, 20, 0, 30bc370, 30baf40) + 3c
>> fed336d8 handle_include (2ba4d20, 10800, 2ba5f10, 2ba5348, 30baf40, 2b824d8) + 334
>> fed378f8 send_parsed_content (11a8, 7c021, 2ba4d20, 2c01898, 2ba5f14, 2ba5f10) + 1080
>> 0003afb0 default_handler (0, 2c01898, 2b91e10, 2b7c748, 2b7e598, 2ba5328) + 4a8
>> 00041484 ap_run_handler (2c01898, 3ab08, 7b578, 6b5a74, 7, 8) + 40
>> 00041ab4 ap_invoke_handler (2c01898, 0, 2c01898, 2b9eb80, ffb1b6a0, 4e4960) + ec
>> 00051a58 ap_internal_redirect (0, 2c01898, fe0fbd10, fe0fbcac, 1, 2c01898) + 44
>> febab53c handler_redirect (2b9eb80, ffffffff, febbd238, 2c01560, fffefd64, 10000) + 90
>> 00041484 ap_run_handler (2b9eb80, febab4ac, 7b578, 6b5a4c, 5, 8) + 40
>> 00041ab4 ap_invoke_handler (2b9eb80, 0, 2b9eb80, 0, 6b58bc, 79c00) + ec
>> 0005132c ap_process_request (2b9eb80, 79400, 4, 1, 0, 2b9eb80) + 54
>> 0004d9a4 ap_process_http_connection (2b7c748, 7c000, 0, 1, 79548, 5) + 78
>> 00049654 ap_process_connection (2b7c748, 2b7c498, 6b5d90, 0, 7bd98, 6b5d78) + d4
>> 00057558 worker_thread (14d5a8, a00, fe0fbf98, 7c24c, 28, 0) + 280
>> feec5238 _lwp_start (0, 0, 0, 0, 0, 0)
> 
> 
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Webobjects-dev mailing list      ([email protected])
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/webobjects-dev/ramseygurley%40gmail.com
> 
> This email sent to [email protected]

