On Mon, Aug 22, 2011 at 8:10 PM, Jonathan Swartz <[email protected]> wrote:
> We use Apache/mod_perl 2 and occasionally get a child httpd process that
> spins out of control, either consuming ever-increasing amounts of memory or
> max cpu. Usually due to an infinite loop or other bug in a specific part of
> the site - this sort of thing happens.
>
> I would like to monitor for such httpd children every second or so, and
> when finding one, send it a USR2 signal so it can dump its current Perl
> stack to our error logs.
>
A few ideas:
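- If your requests are typically short and the memory allocation uses
enough CPU time, you could set a soft limit on CPU time (RLIMIT_CPU) and
install a $SIG{XCPU} handler to dump the stack. The limit counts CPU time
over the whole life of the process, not per request, so you would also need
to limit how many requests each child handles (e.g. MaxRequestsPerChild).
It worked for me in a quick test.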
- If the memory usage is significant, as a quick check you could look at
the total free memory available on the system, and only if it falls below a
threshold do a more complex check with Proc::ProcessTable.
- If the runaway process causes the load average to go up, you could look
at the load average, and only if it rises above a threshold do a more complex
check with Proc::ProcessTable.
- If your requests are typically short, you could create a small watchdog
server; a request would register its PID with the watchdog server, then
unregister when it finishes. If the watchdog sees a request register that
does not complete within some time limit, it could send SIGUSR2. I have
used a solution like this in the past, and it is effective, if a bit
cumbersome.
- Apache::Scoreboard <http://search.cpan.org/~mjh/Apache-Scoreboard-2.09.2/>
can get you the PIDs of just the Apache processes, and some basic state
information. You might be able to use this to make your process table scan
more efficient. Maybe you could write a URL handler to do your checking
and signaling using the scoreboard from within Apache, then load the URL
periodically to trigger the test.
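
For the watchdog idea, the bookkeeping on the watchdog side could look
roughly like the sketch below. It is not the code I used: the function names
(register_request, unregister_request, overdue_pids, signal_overdue) are made
up for illustration, and how the children talk to the watchdog (socket, pipe,
etc.) is left out entirely. It assumes the watchdog runs on the same host and
is allowed to signal the httpd children.

```perl
use strict;
use warnings;

# Hypothetical in-memory table: PID => epoch time the request registered.
my %started;

# Called when a child reports that it has begun a request.
# $now is optional and defaults to the current time.
sub register_request {
    my ($pid, $now) = @_;
    $started{$pid} = defined $now ? $now : time;
    return;
}

# Called when the child reports that the request has finished.
sub unregister_request {
    my ($pid) = @_;
    delete $started{$pid};
    return;
}

# PIDs whose current request has been running for more than $limit seconds.
sub overdue_pids {
    my ($limit, $now) = @_;
    $now = time unless defined $now;
    my @pids = sort { $a <=> $b }
               grep { $now - $started{$_} > $limit } keys %started;
    return @pids;
}

# Send USR2 to each overdue child, then forget it so that it is not
# signalled again on the next pass. Returns the PIDs actually signalled.
sub signal_overdue {
    my ($limit, $now) = @_;
    my @signalled;
    for my $pid (overdue_pids($limit, $now)) {
        # kill returns the number of processes signalled; 0 means the
        # PID no longer exists or we lack permission.
        push @signalled, $pid if kill 'USR2', $pid;
        delete $started{$pid};
    }
    return @signalled;
}
```

The watchdog's main loop would then call something like signal_overdue(30)
once a second, with the time limit chosen to suit your longest legitimate
request.
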
Hope this is helpful,
-----Scott.