David Fraser wrote:
> Trying to find relevant info on this from the Apache docs and other
> module documentation:
> http://httpd.apache.org/docs/2.2/stopping.html#gracefulstop
>   talks about advising children to exit after their current request. In
> this case it would seem the cleanup methods should get called at the end
> of the request processing, and thus shouldn't be in a signal handler
> (and there should be no other Python code executing...)
> http://www.apachetutor.org/dev/pools
>   talks about using pools to allocate/deallocate resources other than
> memory - could we provide a way to register Python objects that need
> cleanup using this mechanism?
> 
> Am I barking up the wrong tree or is this worth investigating further?
> David

If I remember correctly, it is only in the parent Apache process that doing
real work inside the signal handler is avoided. Waiting until a request is
finished only means that the parent process waits until the child process
finishes any requests it may be handling before it sends the child process
the TERM signal. The TERM signal in the child process still results in a call
to a signal handler whose action is to destroy the child process memory pool,
so the complex code associated with Python ends up being called within the
context of the signal handler.

The end result is that using a graceful restart as opposed to a plain restart
may increase your chances of that complex Python code being able to
execute successfully, as no requests should be executing at the same time,
but it is still not a guarantee that it will work.
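
To make the concern concrete, here is a minimal sketch of the usual
alternative, where the handler only records that shutdown was requested and
the expensive cleanup runs later in normal context. This is not Apache or
mod_python code. All the names (on_term, run_cleanup, worker_step,
install_term_handler) are made up for illustration, and run_cleanup() merely
stands in for destroying the pool and finalising Python.

```c
#include <signal.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not Apache or mod_python code. */

/* Set from the signal handler; sig_atomic_t is safe to write there. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Counts how many times the expensive cleanup ran. */
static int cleanup_runs = 0;

/* The handler only records the request; it runs no complex code. */
static void on_term(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Stand-in for destroying a pool and finalising an interpreter. */
static void run_cleanup(void)
{
    cleanup_runs++;
    puts("cleanup running in normal context");
}

/* One pass of a hypothetical worker loop: returns 1 once shut down. */
static int worker_step(void)
{
    if (shutdown_requested) {
        run_cleanup();
        return 1;
    }
    /* ... handle one request here ... */
    return 0;
}

/* Installs the handler; returns 0 on success, -1 on failure. */
static int install_term_handler(void)
{
    return signal(SIGTERM, on_term) == SIG_ERR ? -1 : 0;
}
```

A caller would run install_term_handler() once and then loop on
worker_step() until it returns 1. Whether the child side of an MPM could be
restructured this way is a separate question; the sketch only shows the
shape of the idea.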

I could still be wrong about parts of this. Working out how it all fits
together is hard because a single source file implements both the parent
and the child. You also have the different MPMs to consider.

Anyway, I'll look over the code again. One thing I have noticed is that although
it says (at least for worker):

        apr_signal(SIGTERM, just_die);
        child_main(slot);

        clean_child_exit(0);

where the call to just_die() is the problem, later in the code it does:

        unblock_signal(SIGTERM);
        apr_signal(SIGTERM, dummy_signal_handler);

It only does this, though, when not running as a single process. It is
entirely possible that when debugging this previously I was interpreting
things wrongly, as I was running in single process mode in order to use gdb.
That may have changed the behaviour of the process so that the SIGTERM was
doing what we don't want, when in fact it doesn't when run normally.
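
A small standalone sketch of that theory, assuming the order of calls is as
quoted above: whichever handler was installed last is the one a SIGTERM
reaches. The functions fake_just_die(), fake_dummy_handler() and
handler_reached() are hypothetical stand-ins of mine, not the real MPM
functions, and the unblock_signal() step is left out.

```c
#include <signal.h>

/* Hypothetical stand-ins; the real just_die() and dummy_signal_handler()
 * live in the Apache MPM source and do more than this. */

static volatile sig_atomic_t last_handler = 0;  /* 1 = die-like, 2 = dummy */

static void fake_just_die(int signo)
{
    (void)signo;
    last_handler = 1;   /* the real one would tear the child down here */
}

static void fake_dummy_handler(int signo)
{
    (void)signo;
    last_handler = 2;   /* does nothing beyond recording that it ran */
}

/* Mimics the order of calls described above and reports which handler
 * a SIGTERM reaches. one_process != 0 models single process mode. */
static int handler_reached(int one_process)
{
    last_handler = 0;
    signal(SIGTERM, fake_just_die);          /* installed first */
    if (!one_process) {
        signal(SIGTERM, fake_dummy_handler); /* replaces it later */
    }
    raise(SIGTERM);
    return (int)last_handler;
}
```

If the real code behaves like this, handler_reached(1) hits the just_die()
stand-in and handler_reached(0) hits the dummy one, which would match what I
saw under gdb without saying anything about a normal run.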

All we know is that something is causing crashes and hangs on shutdown
and if it wasn't the signal, it must be something else.

Graham
