On 18 May 2010 09:31, Alec Flett <[email protected]> wrote:
> Hey, me again - back with dying daemons, the bane of my existence at the 
> moment.
>
> Again, the problem is that daemons are mysteriously disappearing for some 
> reason, without restarting, in a single-threaded prefork mpm, with 
> single-threaded daemons. I cranked up logging on apache, and even added some 
> log messages of my own. It seems that wsgi_manage_process is not always 
> called, and it seems to be specifically during shutdown - the python 
> interpreter is shutting down, but the new daemon never starts up.
>
> So I have done some extensive log analysis and noticed a slightly odd 
> pattern... it seems that the failure is that apr_proc_other_child_register() 
> is somehow not registering, or at least the cleanup is not getting called. I 
> looked at the source for apr_proc_other_child_register() and it is DEFINITELY 
> not threadsafe, (uses a linked list anchored in a global!) though it seems 
> like this code would never be called from multiple threads.

No, it would never be called from a multithreaded context, as it is
only invoked in the Apache parent process, which is single threaded.

> Below I've extracted the mod_wsgi log messages for each failure - essentially 
> I took all the 'mod_wsgi (pid=xxx)' messages and extracted the unique 
> sequences, grouping the occurrence by PID. Imagine that each of the messages 
> below begins with 'mod_wsgi (pid=xxx'
>
> Successful daemon behavior looks like this:
>
> 586 pids with this sequence:
> ['9075', '10295', '9676', '10109', ...
>    ): Starting process 'apiserver-freebase.com' with threads=1.
>    ): Initializing Python.
>    ): Attach interpreter ''.
>    ): Adding '/mw/app/apiserver_94819/_install/lib/python2.6/site-packages' 
> to path.
>    , process='apiserver-freebase.com', application=''): Loading WSGI script 
> '/mw/app/apiserver_94819/_install/bin/apiserver.wsgi'.
>    ): Maximum requests reached 'apiserver-freebase.com'.
>    ): Shutdown requested 'apiserver-freebase.com'.
>    ): Stopping process 'apiserver-freebase.com'.
>    ): Destroying interpreters.
>    ): Cleanup interpreter ''.
>    ): Terminating Python.
>    ): Python has shutdown.
>    ): wsgi_manage_process(0, 'apiserver-freebase.com', 255)
>    ): wsgi_manage_process(3, 'apiserver-freebase.com', -1)
>    ): Process 'apiserver-freebase.com' unregistered, 
> (APR_OC_REASON_UNREGISTER) not doing anything
>    ): Process 'apiserver-freebase.com' has died, (APR_OC_REASON_DEATH) 
> restarting.
>    ): Successfully replaced with pid=xxx
>
> Failure to restart looks like this:
> 4 pids with this sequence:
> ['10901', '9335', '10910', '10952']
>    ): Starting process 'apiserver-freebase.com' with threads=1.
>    ): Initializing Python.
>    ): Attach interpreter ''.
>    ): Adding '/mw/app/apiserver_94819/_install/lib/python2.6/site-packages' 
> to path.
>    , process='apiserver-freebase.com', application=''): Loading WSGI script 
> '/mw/app/apiserver_94819/_install/bin/apiserver.wsgi'.
>    ): Maximum requests reached 'apiserver-freebase.com'.
>    ): Shutdown requested 'apiserver-freebase.com'.
>    ): Stopping process 'apiserver-freebase.com'.
>    ): Destroying interpreters.
>    ): Cleanup interpreter ''.
>    ): Terminating Python.
>    ): Python has shutdown.
>
> Obviously these last 5 log messages are all my own. I'm attaching the patch 
> that produced the above message. (disclaimer: I discovered that signal 
> handler is crashing with this patch because I'm passing NULL to ap_log_error, 
> but that is not the cause of this as the crash only occurs during apache 
> shutdown)
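
The grouping described in the quoted message (collect the messages for
each pid, then group pids that share an identical message sequence)
could be sketched roughly as below. This is not the script that was
actually used; the function name and the regular expression are
assumptions made for illustration, and it only assumes each log line
contains 'mod_wsgi (pid=NNN' somewhere.

```python
import re
from collections import defaultdict

# Assumed shape of a line: anything, then "mod_wsgi (pid=NNN", then the rest.
PID_RE = re.compile(r"mod_wsgi \(pid=(\d+)(.*)$")


def group_pids_by_sequence(lines):
    """Map each distinct message sequence to the list of pids that produced it."""
    per_pid = defaultdict(list)
    for line in lines:
        match = PID_RE.search(line.rstrip("\n"))
        if match:
            pid, rest = match.groups()
            # Keep messages in the order they were logged for this pid.
            per_pid[pid].append(rest)

    by_sequence = defaultdict(list)
    for pid, messages in per_pid.items():
        by_sequence[tuple(messages)].append(pid)
    return dict(by_sequence)
```

Sorting the result by the number of pids per sequence would put the
common (successful) sequence first and the rare ones after it.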

Do know I haven't totally forgotten about this; it is still all in my
inbox and I have been noting your followups.

My last thought about it, which I haven't been able to investigate
and so hadn't posted about yet, is what the implications would be if
a WSGI application's code did something like call setsid() or some
other operation which disassociated itself from the process group of
the parent process. I can't remember if this would result in the death
of a child process not being signalled back to the original parent
process.
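
One way to check the plain setsid() case outside of Apache is a small
experiment like the one below. It is only a sketch, assuming a POSIX
system; the function name is made up for illustration and it does not
involve mod_wsgi or APR at all. It forks a child, has the child call
setsid() and exit, and reports what the parent gets back from
waitpid().

```python
import os


def child_exit_status_after_setsid(exit_code=7):
    """Fork a child that calls setsid() and exits; return the status the parent sees."""
    pid = os.fork()
    if pid == 0:
        # Child: start a new session, leaving the parent's process group.
        try:
            os.setsid()
        finally:
            # Exit immediately without running the parent's cleanup handlers.
            os._exit(exit_code)

    # Parent: wait for this specific child and decode how it ended.
    reaped_pid, status = os.waitpid(pid, 0)
    if reaped_pid == pid and os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None


if __name__ == "__main__":
    print(child_exit_status_after_setsid())
```

This only covers a direct child calling setsid(). Code that forks
again and lets the intermediate process exit (the usual daemonizing
pattern) is a different case and would need its own test.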

This of course is dependent on code running in the WSGI application
that did something odd like this, but then I have seen all sorts of
strange things done before. There is various stuff out there that
assumes that it is running from a command line Python, for example,
rather than an embedded system, and does things it shouldn't, such as
with the controlling tty. I am not sure if use of the subprocess
module could also do strange things with the relationships between
processes and signalling on death.

Are you able to list any non-mainstream third-party Python packages
you use?

FWIW, you are still the only report of this sort of issue that I
recollect seeing, which still makes me suspect it is something in the
hosted web application itself that is causing the issue.

Graham
