On Sun, Apr 11, 2010 at 3:43 PM, Frederik Banke <frede...@tigermedia.dk> wrote:
> Hi
>
> I have a problem with my PHP: once in a while an Apache process will hang.
> In Apache's /server-status it looks like:
>
> 11-1 11194 1/3/588 W 0.20 21349 0 179.3 0.22 27.03 xxx.xxx.xxx.xxx localhost GET xxx.php?xxx=xxx
>
> As you can see from this line, the process has been working for 21349 sec
> and the status is W (Sending Reply). I can see from top that the process
> uses 0% CPU.
>
> I have been unable to find out what causes some of the processes to behave
> like this. When I try the same request as a process that stalls, it finishes
> with no problem. The problem is that the request uses a lot of memory and it
> is never released, so when there are many stalled processes the server
> crashes. The stalled processes only occur on a few specific requests of
> large searches, so the page generation time is large for the request
> (30 sec+ range).
>
> I have tried to set up another server with the same configuration to debug,
> but I'm unable to reproduce the problem. I suspect it to be related to some
> kind of deadlock that only occurs in special cases, because of the futex
> wait mentioned below.
>
> With strace and gdb on the process I get:
>
> strace -p 11194
> futex(0x7f9fefee3a00, FUTEX_WAIT_PRIVATE, 2, NULL
>
> (gdb) where
> #0  0x00007f9fefc6a6de in ?? () from /lib/libc.so.6
> #1  0x00007f9fefbf4025 in ?? () from /lib/libc.so.6
> #2  0x00007f9fefbf026b in free () from /lib/libc.so.6
> #3  0x00007f9fefb9fb6d in setlocale () from /lib/libc.so.6
> #4  0x00007f9fe95d8e61 in zm_deactivate_basic () from /usr/lib/apache2/modules/libphp5.so
> #5  0x00007f9fe969d4ac in module_registry_cleanup () from /usr/lib/apache2/modules/libphp5.so
> #6  0x00007f9fe96a698b in zend_hash_apply () from /usr/lib/apache2/modules/libphp5.so
> #7  0x00007f9fe969bdad in zend_deactivate_modules () from /usr/lib/apache2/modules/libphp5.so
> #8  0x00007f9fe9656085 in php_request_shutdown () from /usr/lib/apache2/modules/libphp5.so
> #9  0x00007f9fe9711123 in ?? () from /usr/lib/apache2/modules/libphp5.so
> #10 0x00007f9ff09e72d3 in ap_run_handler () from /usr/sbin/apache2
> #11 0x00007f9ff09eaa6f in ap_invoke_handler () from /usr/sbin/apache2
> #12 0x00007f9ff09f8430 in ap_internal_redirect () from /usr/sbin/apache2
> #13 0x00007f9fe8c47bd5 in ?? () from /usr/lib/apache2/modules/mod_rewrite.so
> #14 0x00007f9ff09e72d3 in ap_run_handler () from /usr/sbin/apache2
> #15 0x00007f9ff09eaa6f in ap_invoke_handler () from /usr/sbin/apache2
> #16 0x00007f9ff09f860e in ap_process_request () from /usr/sbin/apache2
> #17 0x00007f9ff09f5448 in ?? () from /usr/sbin/apache2
> #18 0x00007f9ff09eeca3 in ap_run_process_connection () from /usr/sbin/apache2
> #19 0x00007f9ff09fcf76 in ?? () from /usr/sbin/apache2
> #20 0x00007f9ff09fd2ea in ?? () from /usr/sbin/apache2
> #21 0x00007f9ff09fde1a in ap_mpm_run () from /usr/sbin/apache2
> #22 0x00007f9ff09d360d in main () from /usr/sbin/apache2
>
> So it looks like PHP hangs waiting for some futex. I have tried to
> find the zm_deactivate_basic function, but I have only found a macro
> "#define ZEND_MODULE_DEACTIVATE_N(module) zm_deactivate_##module"
>
> This seems to point to the module basic, but I can't find this module
> anywhere, so I suspect it to be some kind of internal/core "module".
>
> Any ideas on how to proceed to find a solution?
>
> /Frederik Banke

I can't help you with the root cause, but in the past I have seen Apache processes get stuck on some AJAX requests. Because it was a third-party application and I didn't have the time to debug it properly, I ended up writing a simple process that killed the long-running Apache workers stuck in the W status.

Tyrael
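
For illustration only (this is not the original script), a minimal Python sketch of such a watchdog could look like the following. It assumes the extended /server-status HTML table uses the column order seen in the row quoted above (Srv, PID, Acc, M, CPU, SS, ...), that the status page is reachable at http://localhost/server-status, and that a one-hour threshold is acceptable. The names stalled_pids and kill_stalled, the URL, and the threshold are all hypothetical.

```python
import os
import signal
import urllib.request
from html.parser import HTMLParser


class _RowCollector(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self.rows.append(self._row)
        elif tag == "td" and self._row is not None:
            self._row.append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "tr"):
            self._in_cell = False
        if tag == "tr":
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data


def stalled_pids(status_html, max_seconds=3600):
    """Return PIDs of workers in state W for longer than max_seconds.

    Assumed column layout: Srv, PID, Acc, M, CPU, SS, ...
    """
    parser = _RowCollector()
    parser.feed(status_html)
    parser.close()
    pids = []
    for row in parser.rows:
        cells = [cell.strip() for cell in row]
        if len(cells) < 6:
            continue  # header row or some unrelated table
        pid, mode, seconds = cells[1], cells[3], cells[5]
        if mode == "W" and pid.isdigit() and seconds.isdigit():
            if int(seconds) > max_seconds:
                pids.append(int(pid))
    return pids


def kill_stalled(url="http://localhost/server-status", max_seconds=3600):
    """Fetch the status page (hypothetical URL) and SIGTERM stalled workers."""
    with urllib.request.urlopen(url, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    for pid in stalled_pids(html, max_seconds):
        os.kill(pid, signal.SIGTERM)
```

Something like kill_stalled() could be run from cron as a user allowed to signal the Apache children. It only works around the symptom: the memory held by a stalled worker is freed, but the underlying deadlock is not addressed.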