Michael B Allen wrote:
On Fri, Oct 16, 2009 at 2:42 PM, Joe Lewis <j...@joe-lewis.com> wrote:
Michael B Allen wrote:
On Fri, Oct 16, 2009 at 1:10 PM, Joe Lewis <j...@joe-lewis.com> wrote:

Michael B Allen wrote:

I have a customer who very occasionally sees apache workers hang. I'm
pretty sure this is caused by an errant module but I don't know which
one.

Is there any way to determine which module is causing Apache workers to
hang?

Can I temporarily disable that SIGTERM so that I can have enough time
to attach GDB to the hanging processes?

Mike


Perhaps run it in non-forking mode (httpd -X -k start) inside gdb and
see what it hangs on?

If I run it in gdb like you suggest:

 # gdb httpd
 (gdb) run -X -k start

I cannot get httpd to run module deinitialization. If I do
apachectl stop, httpd -X -k stop, or graceful-stop in another
terminal, it just kills the whole process group. Since the problem is
a hang during module deinitialization, I don't think this is going to
help me. How do I shut down httpd so that it runs the module
deinitialization routines?

Otherwise does anyone have a web-svn pointer to the code that's
calling the SIGTERM? Maybe I can find a way to disable it.

Mike

Disabling SIGTERM for Apache would be akin to leaving the landing gear of
your airplane on the ground when you take off.  How are you going to
properly shut down Apache if you completely kill the SIGTERM signals?

SIGTERM should not be used to stop processes. A process should
complete gracefully and call exit(3). Normally, this is what httpd
does. However, if a child process takes too long, something sends
a SIGTERM to *kill* the process. I assume this is Apache, since it
writes a message to error_log to that effect. This is what I want to
disable. That is, if a child process hangs, I want it to just sit
there stuck forever until an operator can log in and attach gdb to it.

If I could find that part of the code, I might find a directive that
controls how long Apache waits before it sends the SIGTERM.
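
For reference, once the PID of a hung child is known, grabbing a
backtrace from it without an interactive session could look roughly
like the sketch below. The helper name snapshot_hung_child, the GDB
override variable and the output file name are made up for
illustration; the gdb options used (-batch, -p, -ex) are standard ones.

```shell
# Hypothetical helper: write a backtrace of every thread in one process
# to a file. Set GDB=echo to preview the command without running gdb.
snapshot_hung_child() {
    pid="$1"
    outdir="${2:-/tmp}"
    # -batch: exit once the commands have run
    # -p:     attach to an already running process
    # -ex:    run a single gdb command
    "${GDB:-gdb}" -batch -p "$pid" -ex "thread apply all bt" \
        > "$outdir/httpd-bt.$pid.txt" 2>&1
}
```

Something like "snapshot_hung_child 1234 /var/tmp" would then leave the
stack in /var/tmp/httpd-bt.1234.txt. This only helps if the process is
still alive when the command runs.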

The "deinitialization": are you just not seeing the messages you'd normally
see?  Or did Apache just terminate (which is normal under gdb, and ends the
gdb session as well)?

Right. I have an Apache module that writes to a separate log. When the
module is deinitialized, information is written to the log. Without
gdb, that information is correctly written to the log. When running in
gdb, nothing is written to the log. It seems the entire process group
is simply being killed. And thus the part of interest is not
accessible.

Mike

The SIGTERMs are occurring because Apache has already attempted to stop a process gracefully, and it isn't stopping. Rather than endlessly trying to "gracefully" shut down a child process, Apache presumes that the process is just not going to respond.
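
The message that ends up in error_log can at least tell you which children were affected and when. A rough sketch, assuming the log line reads something like "child process 1234 still did not exit, sending a SIGTERM" (check the exact wording against your own log; hung_pids is a hypothetical helper name):

```shell
# Print the unique PIDs of children that Apache reported as not exiting.
# Reads the named log files, or standard input if none are given.
# The matched phrase is an assumption about the log wording.
hung_pids() {
    sed -n 's/.*child process \([0-9][0-9]*\) still did not exit.*/\1/p' "$@" |
        sort -un
}
```

Running "hung_pids /path/to/error_log" prints one PID per line. The timestamps on the matching lines can then be compared against the access log or a module's own log.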

You can always try the worker MPM rather than the prefork MPM.

As it stands, from the sound of the problem and its rarity (per your previous descriptions), tracking it down is going to be "hit and miss". You could recompile all of the modules and Apache itself with debug log lines placed in each one, but the problem may actually go away in that case, especially if you switch versions.

I do know that some distributions' versions of apache exhibited behavior similar to what you have described (specifically, SuSE), so I don't know if compiling a new version would alleviate the customer gripe.

I only have two real suggestions: strace the processes, and hope the hard drive is big enough to hold the strace output until the problem shows up, or try upgrading the version of Apache.
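
A minimal sketch of the strace route, assuming you attach to children that are already running. The function names and the STRACE override variable are hypothetical; -f, -tt, -o and -p are standard strace options.

```shell
# Emit one "-p PID" pair for each process ID given as an argument.
strace_pid_args() {
    for pid in "$@"; do
        printf '%s %s ' -p "$pid"
    done
}

# Attach to the given PIDs and log system calls with timestamps.
# Set STRACE=echo to preview the command instead of running it.
trace_httpd() {
    out="$1"
    shift
    # Word splitting of the generated arguments is intentional here.
    ${STRACE:-strace} -f -tt -o "$out" $(strace_pid_args "$@")
}
```

For example, "trace_httpd /var/tmp/httpd.strace 1234 1235" attaches to both PIDs and writes to a single file. If disk space is a concern, adding an -e trace=... filter to the strace command line should cut the output down considerably.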

Joe
--
Joe Lewis
Chief Nerd      SILVERHAWK <http://www.silverhawk.net/>   
