I have 40 or so Apache processes stuck in the "Sending Reply" state.  My
hypothesis is that MySQL had a problem, and either Apache or PHP somehow got
gummed up and isn't cleaning up for some reason.  I'm hoping the list can give
me more ideas for debugging or point me in the right direction.



Here is the output of http://localhost/server-status:

        Server uptime: 1 day 6 hours 57 minutes 9 seconds
        Total accesses: 47613 - Total Traffic: 498.2 MB
        CPU Usage: u1446.77 s548.53 cu6.26 cs0 - 1.8% CPU load
        .427 requests/sec - 4688 B/second - 10.7 kB/request
        41 requests currently being processed, 8 idle workers
        WW_WWW_WWWWW_WWWWWWWWWW_W_WWWW__WWW.WWWWWWWW_WWWWW
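For anyone who wants to tally the scoreboard line rather than count by eye,
here is a rough sketch in Python.  It assumes the usual mod_status key
(W = Sending Reply, _ = Waiting for Connection, . = Open slot with no current
process); the function name tally_scoreboard is just something I made up for
illustration.

```python
from collections import Counter

# Subset of the mod_status scoreboard key that appears in the output above.
SCOREBOARD_KEY = {
    "W": "Sending Reply",
    "_": "Waiting for Connection",
    ".": "Open slot with no current process",
}


def tally_scoreboard(scoreboard):
    """Count how many scoreboard slots are in each state."""
    counts = Counter(scoreboard.strip())
    return {
        SCOREBOARD_KEY.get(ch, "other (" + ch + ")"): n
        for ch, n in counts.items()
    }


# Scoreboard line copied from the server-status output above.
scoreboard = "WW_WWW_WWWWW_WWWWWWWWWW_W_WWWW__WWW.WWWWWWWW_WWWWW"

for state, n in sorted(tally_scoreboard(scoreboard).items()):
    print(state, n)
```

The "Sending Reply" and "Waiting for Connection" totals should agree with the
busy and idle figures that mod_status prints just above the scoreboard.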

Examining the logs confirms that the last request on each pid was quite a while
ago, and they are just hanging out doing nothing.

The server:
 - RHEL
                $uname -a
                Linux xxx 2.6.18-164.6.1.el5 #1 SMP Tue Oct 27 11:30:06 EDT 2009 i686 i686 i386 GNU/Linux
 - Apache:
                Server version: Apache/2.2.3
                Server built:   Nov 10 2009 09:06:57
 - PHP:
                $php -v
                PHP 5.1.6 (cli) (built: Feb 26 2009 07:01:10)
                Zend Engine v2.1.0
 - Runs Wordpress (not my choice)
 - Receives mostly search crawler traffic at a steady rate
 - Has a lot of "(32)Broken pipe: core_output_filter: writing data to the
     network" and "(104)Connection reset by peer: core_output_filter: writing
     data to the network" messages
 - Stopped reporting to rrdtool/cacti between 18:50 and 21:30 last night
 - Had a child process die with the error
     "/usr/sbin/httpd: free(): invalid pointer: 0x0a2044a4";
     however, this was about 20 minutes *after* the problem began
 - Had some "database error MySQL server has gone away for query" errors around
     18:50 last night
 - Is behind an F5 device that proxies all connections, so every connection to
     the server comes from the same IP address

Relevant config:

        Timeout 40
        KeepAlive On
        MaxKeepAliveRequests 200
        KeepAliveTimeout 5
        StartServers       3
        MinSpareServers    2
        MaxSpareServers   10
        ServerLimit       50
        MaxClients        50
        MaxRequestsPerChild  1000


I've only been able to find one person who had a similar problem, and his was
caused by "dodgy sql": http://marc.info/?l=tomcat-user&m=106319217331935&w=2
(His setup also involved Tomcat, which I do not have.)

The biggest issue is that the processes should time out and clean up after
themselves, right?  But they're not - instead they're just sitting consuming
RAM.  (Not entirely sure about that - in some stacktraces I see
<signal handler called> followed by "zend_timeout ()".)

I'm sure an httpd restart will clean everything up, but I wanted to debug this
as best I could.  I used gdb to grab a stack trace from 8 of the hung
processes, but the binaries are not compiled in debug mode.  The stack traces
and other relevant data are here:
http://ritter.vg/misc/apache-debug/
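
In case it helps anyone reproduce this, below is a minimal sketch of how stack
traces can be collected from several hung processes in one go.  It is not
necessarily how the traces at the URL above were taken.  It assumes a gdb new
enough to accept -p, -batch and -ex, and Python 3.7 or later; the helper names
(gdb_backtrace_cmd, collect_backtraces) and any pids passed in are
hypothetical.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a non-interactive gdb command that attaches to pid and prints 'bt'."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "bt"]


def collect_backtraces(pids, run=subprocess.run):
    """Run gdb once per pid and return {pid: captured stdout}.

    'run' is injectable so the function can be exercised without gdb installed.
    """
    traces = {}
    for pid in pids:
        result = run(gdb_backtrace_cmd(pid), capture_output=True, text=True)
        traces[pid] = result.stdout
    return traces
```

Calling collect_backtraces() with a list of the stuck httpd pids returns the
backtrace text for each one.  Attaching normally requires root or the same
user as the target process, and without debugging symbols the frames will
mostly show bare addresses and exported function names.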

If anyone can suggest further things to try to debug this, or any additional
info, I'd appreciate it.

-tom
