I was getting Apache segfaults at a load of around 80-100 requests per second. Moving to the prefork MPM fixed it.
Michael

On Jul 5, 2012, at 4:41 AM, "Brook, James" <[email protected]> wrote:

> Chuck, thanks for the information. The skill set is a bit of a rare one.
>
> Michael, interesting to hear about a deadlock on Linux. Do you mean you were
> using the 'worker' MPM? You don't have a stack trace by any chance? We may
> have hit the same bottleneck or deadlock. I have been wondering whether this
> is a sparc/Solaris specific issue.
>
> Not being able to use the worker MPM with WO seems like a serious scalability
> issue for anyone who isn't 'akamazed'. If I can narrow the problem down to
> something specific enough we may look to pay someone to fix this.
>
> Sent from my iPhone
>
> On 5 Jul 2012, at 00:10, "Chuck Hill" <[email protected]> wrote:
>
>> Hi James,
>>
>>
>> On 2012-06-29, at 9:35 AM, Brook, James wrote:
>>
>>> It's probably bad form to keep answering my own mails but no-one had
>>> anything to say about this. Are there still people on the list who are
>>> familiar with the adaptor internals? This problem is causing us a lot of
>>> pain in production.
>>
>> At this point in time, you are probably the world's authority on this.
>>
>>
>>> Does anyone use the MPM worker module with Apache or are we all still with
>>> pre-fork? I don't think we could live without the performance gains.
>>> Perhaps it doesn't matter.
>>
>> I would guess that very few of us are using Apache on Solaris.
>>
>>
>>> I haven't quite proven this but I am pretty certain that my problem is with
>>> fcntl. That's what the adaptor uses to lock the shared memory file. It's
>>> apparently an outdated way of doing this - APR now has better abstractions
>>> for these sorts of mutexes. Even the code that does the locking is in a
>>> retry loop with up to 50 attempts! I started trying to rewrite the locking
>>> stuff but I am out of my depth.
>>
>> There are probably a few people here with current C skills, I am not one of
>> them. And then you probably need Apache and Solaris API knowledge too.
>>
>>
>>> It strikes me that in general this would not be a bad bit of code for the
>>> community to have updated. Can anyone help me with that please?
>>
>> I would, but I can't. I was trying to help one company that had a
>> deployment problem on Solaris that sounds somewhat similar to yours. So
>> yes, it would be good to get this updated. But finding someone else with
>> the skill set is unlikely.
>>
>>
>> Chuck
>>
>>
>>> ________________________________________
>>> From: Brook, James
>>> Sent: 13 June 2012 18:48
>>> To: <[email protected]>
>>> Subject: Re: Deadlock on Apache 2.2 Adaptor under high load - Solaris 10 -
>>> Worker MPM
>>>
>>> Now I have some detailed adaptor logging from a time close to the deadlock.
>>> Here is an example of an error with a lock:
>>>
>>> Debug: thread 37 locking WOShmem_lock from ../Adaptor/shmem.c:375
>>> Debug: thread 37 unlocking WOShmem_lock from ../Adaptor/shmem.c:379
>>> Error: lock_file_section(): failed to lock (1 attempts): Deadlock situation
>>> detected/avoided
>>> Debug: thread 37 locking str_lock from ../Adaptor/wastring.c:93
>>> Debug: thread 37 unlocking str_lock from ../Adaptor/wastring.c:100
>>> Debug: thread 37 locking str_lock from ../Adaptor/wastring.c:152
>>> Debug: thread 37 unlocking str_lock from ../Adaptor/wastring.c:158
>>> Debug: thread 37 locking WOShmem_lock from ../Adaptor/shmem.c:391
>>> Debug: thread 37 unlocking WOShmem_lock from ../Adaptor/shmem.c:394
>>> Error: ac_readConfiguration: WOShmem_lock() failed. Skipping reading config.
>>>
>>> On Jun 13, 2012, at 5:30 PM, James Brook wrote:
>>>
>>>> We have a big problem with the Apache 2.2 WebObjects adaptor on our
>>>> Solaris 10 web servers. We are using the 'worker' MPM but when the sites
>>>> get busy nearly every Apache thread is waiting for a shared memory lock to
>>>> call the function that reads the adaptor config. The remaining threads are
>>>> in the fcntl function trying to lock a section of shared memory.
>>>> See below for a couple of example thread stacks.
>>>>
>>>> I read in several posts that fcntl on Solaris 10 causes deadlocks under
>>>> high load and that the problem is worse with the 'worker' MPM. The
>>>> recommended locking mechanism for Solaris seems to be to use pthreads.
>>>>
>>>> I know that at least a few list members are running with the Solaris
>>>> adaptor. My questions:
>>>> * Has anyone experienced this problem and found a solution?
>>>> * Anyone using the 'worker' MPM or do people still use pre-fork (I don't
>>>> think this is a thread safety problem).
>>>> * Any help or suggestions? Especially, any tips on rewriting to use
>>>> pthreads?
>>>>
>>>> --
>>>> James
>>>>
>>>>
>>>> feec5638 fcntl (d8, 7, 2abe588)
>>>> feeb8258 fcntl (d8, 1, fefcc200, 4d6880, 1580, 20a58) + 84
>>>> febe8570 lock_file_section (d8, 4d6880, 14, 2abe588, 147c, 2) + 58
>>>> febe8e14 WOShmem_lock (2abe588, 14, 1, 4d6880, 1580, 1400) + d4
>>>> febef410 ac_readConfiguration (1, fffee980, 11400, fec08f74, 1d84, 1c00) +
>>>> 40
>>>> febe71cc _runRequest (fc9fb9c4, 0, 2d77168, 2d18b40, 5, 0) + 260
>>>> febe6a0c tr_handleRequest (2d18b40, 27226f0, fc9fbc50, 0, 5, 2) + 30c
>>>> febf42a8 WebObjects_handler (2721208, 0, 10000, 0, 2d18b40, fec08f74) + 48c
>>>> 00041484 ap_run_handler (2721208, febf3e1c, 7b578, 6b5a10, 2, 8) + 40
>>>> 00041ab4 ap_invoke_handler (2721208, 0, 2721208, 0, 6b58bc, 79c00) + ec
>>>> 0005132c ap_process_request (2721208, 79400, 4, 1, 0, 2721208) + 54
>>>> 0004d9a4 ap_process_http_connection (26b61c0, 7c000, 0, 1, 79548, 5) + 78
>>>> 00049654 ap_process_connection (26b61c0, 26b5f10, 6b5d90, 0, 7bd98,
>>>> 6b5d78) + d4
>>>> 00057558 worker_thread (14d888, ad7, fc9fbf98, 7c24c, 2b, 17) + 280
>>>> feec5238 _lwp_start (0, 0, 0, 0, 0, 0)
>>>>
>>>>
>>>> feec52d8 lwp_park (0, 0, 0)
>>>> feebf350 cond_wait_queue (ef50a8, ef5090, 0, 0, 1c00, 0) + 28
>>>> feebf874 cond_wait (ef50a8, ef5090, ef50a8, 0, fec0a8f8, 3) + 10
>>>> feebf8b0 pthread_cond_wait (ef50a8, ef5090,
>>>> ef5090, 0, 1c00, 3a) + 8
>>>> febf2730 _WA_lock (ef5088, febf5974, ef50a8, 0, fec0a8f8, 3) + 90
>>>> febe9494 sha_lock (100, 4, fffeca64, fec08f74, ef3230, 13400) + 5c
>>>> febedd84 ac_findApplication (fe0fb54c, 4, fec0acfc, fec08f74, 0, fec0a474)
>>>> + 70
>>>> febe6794 tr_handleRequest (2402c38, 30bbec0, fe0fb7d8, 798f0, ffffffff,
>>>> 14400) + 94
>>>> febf42a8 WebObjects_handler (30baf40, 0, 10000, 0, 2402c38, fec08f74) + 48c
>>>> 00041484 ap_run_handler (30baf40, febf3e1c, 7b578, 6b5a10, 2, 8) + 40
>>>> 00041ab4 ap_invoke_handler (30baf40, 0, 2ba5f10, 2ba5348, 30baf40,
>>>> 2b824d8) + ec
>>>> 0003f080 ap_run_sub_req (ffffffff, 30bb0e8, 20, 0, 30bc370, 30baf40) + 3c
>>>> fed336d8 handle_include (2ba4d20, 10800, 2ba5f10, 2ba5348, 30baf40,
>>>> 2b824d8) + 334
>>>> fed378f8 send_parsed_content (11a8, 7c021, 2ba4d20, 2c01898, 2ba5f14,
>>>> 2ba5f10) + 1080
>>>> 0003afb0 default_handler (0, 2c01898, 2b91e10, 2b7c748, 2b7e598, 2ba5328)
>>>> + 4a8
>>>> 00041484 ap_run_handler (2c01898, 3ab08, 7b578, 6b5a74, 7, 8) + 40
>>>> 00041ab4 ap_invoke_handler (2c01898, 0, 2c01898, 2b9eb80, ffb1b6a0,
>>>> 4e4960) + ec
>>>> 00051a58 ap_internal_redirect (0, 2c01898, fe0fbd10, fe0fbcac, 1, 2c01898)
>>>> + 44
>>>> febab53c handler_redirect (2b9eb80, ffffffff, febbd238, 2c01560, fffefd64,
>>>> 10000) + 90
>>>> 00041484 ap_run_handler (2b9eb80, febab4ac, 7b578, 6b5a4c, 5, 8) + 40
>>>> 00041ab4 ap_invoke_handler (2b9eb80, 0, 2b9eb80, 0, 6b58bc, 79c00) + ec
>>>> 0005132c ap_process_request (2b9eb80, 79400, 4, 1, 0, 2b9eb80) + 54
>>>> 0004d9a4 ap_process_http_connection (2b7c748, 7c000, 0, 1, 79548, 5) + 78
>>>> 00049654 ap_process_connection (2b7c748, 2b7c498, 6b5d90, 0, 7bd98,
>>>> 6b5d78) + d4
>>>> 00057558 worker_thread (14d5a8, a00, fe0fbf98, 7c24c, 28, 0) + 280
>>>> feec5238 _lwp_start (0, 0, 0, 0, 0, 0)
>>>>
>>>
>>> _______________________________________________
>>> Do not post admin requests to the list. They will be ignored.
>>> Webobjects-dev mailing list ([email protected])
>>> Help/Unsubscribe/Update your Subscription:
>>> https://lists.apple.com/mailman/options/webobjects-dev/chill%40global-village.net
>>>
>>> This email sent to [email protected]
>>
>> --
>> Chuck Hill Senior Consultant / VP Development
>>
>> Practical WebObjects - for developers who want to increase their overall
>> knowledge of WebObjects or who are trying to solve specific problems.
>> http://www.global-village.net/gvc/practical_webobjects
