Michael B Allen wrote:
On Fri, Oct 16, 2009 at 2:42 PM, Joe Lewis <j...@joe-lewis.com> wrote:
Michael B Allen wrote:
On Fri, Oct 16, 2009 at 1:10 PM, Joe Lewis <j...@joe-lewis.com> wrote:
Michael B Allen wrote:
I have a customer who very occasionally sees apache workers hang. I'm
pretty sure this is caused by an errant module but I don't know which
one.
Is there any way to determine which module is causing Apache workers to
hang?
Can I temporarily disable that SIGTERM so that I can have enough time
to attach GDB to the hanging processes?
Mike
Perhaps run it in a non-forking mode (httpd -X -k start) inside of gdb
and see what it hangs on?
If I run it in gdb like you suggest:
# gdb httpd
(gdb) run -X -k start
I cannot get httpd to run module deinitialization. That is, if I run
apachectl stop, httpd -X -k stop, or graceful-stop in another
terminal, it just kills the whole process group. Since the problem is
a hang during module deinitialization, I don't think this is going to
help me. How do I shut down httpd so that it runs the module
deinitialization routines?
Otherwise does anyone have a web-svn pointer to the code that's
calling the SIGTERM? Maybe I can find a way to disable it.
Mike
Disabling SIGTERM for Apache would be akin to leaving the landing gear of
your airplane on the ground when you take off. How are you going to
shut down Apache properly if you completely kill the SIGTERM signals?
SIGTERM should not be used to stop processes. A process should
complete gracefully and call exit(3). Normally, this is what httpd
does. However, if a child process takes too long, something is sending
a SIGTERM to *kill* the process. I assume this is Apache, since it
writes a message to error_log to that effect. This is what I want to
disable. That is, if a child process hangs, I want it to just sit
there stuck forever until an operator can log in and attach gdb to it.
If I could find that part of the code, I might find a directive that
controls how long Apache waits before it sends the SIGTERM.
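
If the window before the SIGTERM is short, one option is to capture the
backtrace non-interactively instead of typing into an attached gdb. The
following is a minimal sketch, not something from this thread: the helper
names are hypothetical, and it assumes gdb is installed and the caller is
allowed to ptrace the worker.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints a
    backtrace for every thread, and then exits (detaching the target)."""
    return [
        "gdb", "-p", str(pid),
        "-batch",                      # run the -ex commands, then exit
        "-ex", "thread apply all bt",  # backtrace of every thread
    ]


def dump_backtrace(pid, timeout=60):
    """Run gdb against a live process and return what it printed.

    Assumption: gdb is on PATH and we have permission to attach.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout
```

Calling dump_backtrace() with the PID of a stuck worker would return the
stack traces as text, which could then be written to a file for later
inspection.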
The "deinitialization": are you just not seeing the messages you would
normally see? Or did Apache simply terminate? (That is normal under gdb,
and it ends the gdb session as well.)
Right. I have an Apache module that writes to a separate log. When the
module is deinitialized, information is written to the log. Without
gdb, that information is correctly written to the log. When running in
gdb, nothing is written to the log. It seems the entire process group
is simply being killed. And thus the part of interest is not
accessible.
Mike
The SIGTERMs occur because Apache has already tried to stop a child
process gracefully and it isn't stopping. Rather than try endlessly to
"gracefully" shut down a child process, Apache presumes that the
process is just not going to respond and sends it a SIGTERM.
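
The escalation described above can be illustrated with a small
stand-alone sketch. This is not Apache's code: the function name, the
timeouts, and the use of SIGUSR1 as the "graceful" request are
assumptions made for the example. It only shows the general pattern of
asking politely, waiting, and then escalating.

```python
import signal
import subprocess
import sys


def stop_child(proc, grace_seconds=3.0, term_seconds=3.0):
    """Stop `proc` the way a supervising parent might.

    Returns the name of the last step that was needed:
    "graceful", "SIGTERM", or "SIGKILL".
    """
    # Step 1: hypothetical graceful-stop request.
    proc.send_signal(signal.SIGUSR1)
    try:
        proc.wait(timeout=grace_seconds)
        return "graceful"
    except subprocess.TimeoutExpired:
        pass

    # Step 2: the child did not exit in time, so send SIGTERM.
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=term_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        pass

    # Step 3: last resort, which the child cannot catch or ignore.
    proc.kill()
    proc.wait()
    return "SIGKILL"
```

A child that ignores the polite request and keeps running reaches the
SIGTERM step, which matches the situation in this thread. Raising
grace_seconds in such a scheme is what would leave time to attach a
debugger.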
You can always try the worker MPM rather than the prefork MPM.
As it stands, from the sound of the problem and the rarity of it (your
previous descriptions), you are going to be "hit and miss" on tracking
it down. You could potentially recompile all of the modules and apache
itself (placing debug log lines in each one), but the problems may
actually go away in that case. Especially if you switch versions.
I do know that some distributions' versions of apache exhibited behavior
similar to what you have described (specifically, SuSE), so I don't know
if compiling a new version would alleviate the customer gripe.
I only have two real suggestions: strace the processes, and hope the
hard drive is big enough to capture the output from strace until the
problems are encountered, or try upgrading the version of Apache.
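
For the strace route, a rough sketch of how the command line might be
assembled is below. The helper name and the trace file naming are
hypothetical; the strace options used (-f, -tt, -p, -o, -e trace=) are
standard ones. Restricting the traced system calls is one way to keep
the output from filling the disk.

```python
import os


def strace_command(pid, out_dir, syscalls=None):
    """Build an strace command line that attaches to a running process.

    pid      -- process to attach to
    out_dir  -- directory for the trace file (file name is an assumption)
    syscalls -- optional list of syscall names to restrict the trace to
    """
    cmd = [
        "strace",
        "-f",            # follow children forked after attaching
        "-tt",           # timestamp each line with microseconds
        "-p", str(pid),  # attach to an existing process
        "-o", os.path.join(out_dir, f"httpd.{pid}.trace"),
    ]
    if syscalls:
        # Limit output volume by tracing only the named system calls.
        cmd += ["-e", "trace=" + ",".join(syscalls)]
    return cmd
```

The resulting list can be passed to subprocess.Popen() by whoever has
permission to trace the httpd processes.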
Joe
--
Joe Lewis
Chief Nerd SILVERHAWK <http://www.silverhawk.net/>
------------------------------------------------------------------------
/With every passing hour our solar system comes forty-three thousand
miles closer to globular cluster 13 in the constellation Hercules, and
still there are some misfits who continue to insist that there is no
such thing as progress.
--Ransom K. Ferm/