Hey, me again - back with dying daemons, the bane of my existence at the moment.
Again, the problem is that daemons are mysteriously disappearing without
being restarted, under a single-threaded prefork MPM with single-threaded
daemons. I cranked up logging on Apache, and even added some log messages of
my own. It seems that wsgi_manage_process is not always called, specifically
during shutdown: the Python interpreter shuts down, but the new daemon never
starts up.
So I have done some extensive log analysis and noticed a slightly odd
pattern. The failure seems to be that apr_proc_other_child_register() is
somehow not registering, or at least that the cleanup is not getting called. I
looked at the source for apr_proc_other_child_register() and it is DEFINITELY
not threadsafe (it uses a linked list anchored in a global!), though it seems
like this code would never be called from multiple threads.
Below I've extracted the mod_wsgi log messages for each failure. Essentially I
took all the 'mod_wsgi (pid=xxx)' messages and extracted the unique sequences,
grouping the occurrences by PID. Imagine that each of the messages below begins
with 'mod_wsgi (pid=xxx'.
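A grouping like this can be sketched in Python as follows. This is only an illustration, not the script used for the analysis above. It assumes every relevant error-log line contains the text 'mod_wsgi (pid=NNN', and the names group_sequences and report are made up for the example. It also assumes a pid is not reused within the log; if it is, the sequences of the two processes would be merged.

```python
import re
from collections import defaultdict

# Matches the 'mod_wsgi (pid=NNN' prefix and captures the rest of the message.
MSG_RE = re.compile(r"mod_wsgi \(pid=(\d+)(.*)$")
# Used to mask pids that appear inside a message body.
PID_RE = re.compile(r"pid=\d+")


def group_sequences(lines):
    """Map each distinct message sequence to the list of pids that produced it."""
    per_pid = defaultdict(list)
    for line in lines:
        match = MSG_RE.search(line)
        if match is None:
            continue  # not a mod_wsgi message
        pid, rest = match.groups()
        # Mask embedded pids so otherwise identical sequences compare equal.
        per_pid[pid].append(PID_RE.sub("pid=xxx", rest.rstrip()))

    by_sequence = defaultdict(list)
    for pid, messages in per_pid.items():
        by_sequence[tuple(messages)].append(pid)
    return dict(by_sequence)


def report(lines):
    """Print each unique sequence, most common first, with its pids."""
    groups = group_sequences(lines)
    for sequence, pids in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        print(f"{len(pids)} pids with this sequence:")
        print(pids)
        for message in sequence:
            print(message)
```

Calling report(open(path)) on an error log (path being wherever your Apache error log lives) prints output in the same shape as the listings below.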
Successful daemon behavior looks like this:
586 pids with this sequence:
['9075', '10295', '9676', '10109', ...
): Starting process 'apiserver-freebase.com' with threads=1.
): Initializing Python.
): Attach interpreter ''.
): Adding '/mw/app/apiserver_94819/_install/lib/python2.6/site-packages' to path.
, process='apiserver-freebase.com', application=''): Loading WSGI script '/mw/app/apiserver_94819/_install/bin/apiserver.wsgi'.
): Maximum requests reached 'apiserver-freebase.com'.
): Shutdown requested 'apiserver-freebase.com'.
): Stopping process 'apiserver-freebase.com'.
): Destroying interpreters.
): Cleanup interpreter ''.
): Terminating Python.
): Python has shutdown.
): wsgi_manage_process(0, 'apiserver-freebase.com', 255)
): wsgi_manage_process(3, 'apiserver-freebase.com', -1)
): Process 'apiserver-freebase.com' unregistered, (APR_OC_REASON_UNREGISTER) not doing anything
): Process 'apiserver-freebase.com' has died, (APR_OC_REASON_DEATH) restarting.
): Successfully replaced with pid=xxx
Failure to restart looks like this:
4 pids with this sequence:
['10901', '9335', '10910', '10952']
): Starting process 'apiserver-freebase.com' with threads=1.
): Initializing Python.
): Attach interpreter ''.
): Adding '/mw/app/apiserver_94819/_install/lib/python2.6/site-packages' to path.
, process='apiserver-freebase.com', application=''): Loading WSGI script '/mw/app/apiserver_94819/_install/bin/apiserver.wsgi'.
): Maximum requests reached 'apiserver-freebase.com'.
): Shutdown requested 'apiserver-freebase.com'.
): Stopping process 'apiserver-freebase.com'.
): Destroying interpreters.
): Cleanup interpreter ''.
): Terminating Python.
): Python has shutdown.
Obviously the last 5 log messages in the successful sequence are all my own.
I'm attaching the patch that produced them. (Disclaimer: I discovered that the
signal handler crashes with this patch because I'm passing NULL to
ap_log_error, but that is not the cause of this problem, as the crash only
occurs during Apache shutdown.)
Attachment: mod_wsgi.patch
