On 18 May 2010 09:44, Graham Dumpleton <[email protected]> wrote:
> On 18 May 2010 09:31, Alec Flett <[email protected]> wrote:
>> Hey, me again - back with dying daemons, the bane of my existence at the 
>> moment.
>>
>> Again, the problem is that daemons are mysteriously disappearing for some 
>> reason, without restarting, in a single-threaded prefork mpm, with 
>> single-threaded daemons. I cranked up logging on apache, and even added some 
>> log messages of my own. It seems that wsgi_manage_process is not always 
>> called, and it seems to be specifically during shutdown - the python 
>> interpreter is shutting down, but the new daemon never starts up.
>>
>> So I have done some extensive log analysis and noticed a slightly odd 
>> pattern... it seems that the failure is that apr_proc_other_child_register() 
>> is somehow not registering, or at least the cleanup is not getting called. I 
>> looked at the source for apr_proc_other_child_register() and it is 
>> DEFINITELY not threadsafe, (uses a linked list anchored in a global!) though 
>> it seems like this code would never be called from multiple threads.
>
> No, would not ever be called from multithreaded context as only
> invoked in Apache parent process which is single threaded.
>
>> Below I've extracted the mod_wsgi log messages for each failure - 
>> essentially I took all the 'mod_wsgi (pid=xxx)' messages and extracted the 
>> unique sequences, grouping the occurrence by PID. Imagine that each of the 
>> messages below begins with 'mod_wsgi (pid=xxx'
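
The grouping described above (collect each PID's messages, then count the PIDs that share an identical sequence) could be sketched roughly as follows. This is not the script actually used; the regular expression assumes every relevant line contains the text 'mod_wsgi (pid=NNN', and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumed line shape: anything, then "mod_wsgi (pid=NNN", then the rest.
LINE_RE = re.compile(r"mod_wsgi \(pid=(\d+)(.*)$")


def group_sequences(lines):
    """Map each unique message sequence to the list of PIDs that produced it."""
    per_pid = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line.rstrip("\n"))
        if not match:
            continue  # not a mod_wsgi line
        pid, rest = match.groups()
        # Mask PIDs inside the message itself (e.g. "replaced with pid=123")
        # so that otherwise identical sequences compare equal.
        rest = re.sub(r"pid=\d+", "pid=xxx", rest)
        per_pid[pid].append(rest)

    by_sequence = defaultdict(list)
    for pid, messages in per_pid.items():
        by_sequence[tuple(messages)].append(pid)
    return dict(by_sequence)
```

Printing `len(pids)` and the sequence for each entry gives a summary of the same shape as the listings below.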
>>
>> Successful daemon behavior looks like this:
>>
>> 586 pids with this sequence:
>> ['9075', '10295', '9676', '10109', ...
>>    ): Starting process 'apiserver-freebase.com' with threads=1.
>>    ): Initializing Python.
>>    ): Attach interpreter ''.
>>    ): Adding '/mw/app/apiserver_94819/_install/lib/python2.6/site-packages' 
>> to path.
>>    , process='apiserver-freebase.com', application=''): Loading WSGI script 
>> '/mw/app/apiserver_94819/_install/bin/apiserver.wsgi'.
>>    ): Maximum requests reached 'apiserver-freebase.com'.
>>    ): Shutdown requested 'apiserver-freebase.com'.
>>    ): Stopping process 'apiserver-freebase.com'.
>>    ): Destroying interpreters.
>>    ): Cleanup interpreter ''.
>>    ): Terminating Python.
>>    ): Python has shutdown.
>>    ): wsgi_manage_process(0, 'apiserver-freebase.com', 255)
>>    ): wsgi_manage_process(3, 'apiserver-freebase.com', -1)
>>    ): Process 'apiserver-freebase.com' unregistered, 
>> (APR_OC_REASON_UNREGISTER) not doing anything
>>    ): Process 'apiserver-freebase.com' has died, (APR_OC_REASON_DEATH) 
>> restarting.
>>    ): Successfully replaced with pid=xxx
>>
>> Failure to restart looks like this:
>> 4 pids with this sequence:
>> ['10901', '9335', '10910', '10952']
>>    ): Starting process 'apiserver-freebase.com' with threads=1.
>>    ): Initializing Python.
>>    ): Attach interpreter ''.
>>    ): Adding '/mw/app/apiserver_94819/_install/lib/python2.6/site-packages' 
>> to path.
>>    , process='apiserver-freebase.com', application=''): Loading WSGI script 
>> '/mw/app/apiserver_94819/_install/bin/apiserver.wsgi'.
>>    ): Maximum requests reached 'apiserver-freebase.com'.
>>    ): Shutdown requested 'apiserver-freebase.com'.
>>    ): Stopping process 'apiserver-freebase.com'.
>>    ): Destroying interpreters.
>>    ): Cleanup interpreter ''.
>>    ): Terminating Python.
>>    ): Python has shutdown.
>>
>> Obviously these last 5 log messages are all my own. I'm attaching the patch 
>> that produced the above message. (disclaimer: I discovered that signal 
>> handler is crashing with this patch because I'm passing NULL to 
>> ap_log_error, but that is not the cause of this as the crash only occurs 
>> during apache shutdown)
>
> Do know I haven't totally forgotten about this, it is still all in my
> inbox and I have been noting your followups.
>
> My last thought about it, but which I haven't been able to investigate
> and so hadn't posted about it as yet, is what the implications would
> be if a WSGI application's code did something like call setsid() or
> some other operation which disassociated itself from process group of
> parent process. I can't remember if this would result in death of a
> child process not being signalled back to the original parent process.
>
> This of course is dependent on code running in WSGI application that
> did something odd like this, but then I have seen all sorts of strange
> things done before. There is various stuff out there that assumes that
> it is running from a command line Python for example, rather than an
> embedded system, and does various stuff it shouldn't with controlling
> tty as an example. Am not sure if use of the subprocess module could
> also do strange things with the relationships between processes and
> signalling on death.

Neither setsid() nor setpgrp() triggers the sort of thing I had in mind. :-(
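
A minimal sketch of how this can be checked, assuming a POSIX system. It is an illustration only, not the test that was actually run, and the function name is made up. The child calls setsid() and exits; the parent can still reap it with waitpid(), because the parent/child relationship does not depend on session or process group membership.

```python
import os


def child_exit_code_after_setsid(code=7):
    """Fork a child that calls setsid() and exits; return what the parent sees."""
    pid = os.fork()
    if pid == 0:
        # Child: detach from the parent's session and process group.
        try:
            os.setsid()
        except OSError:
            os._exit(99)  # setsid() itself failed; use a distinct exit code
        os._exit(code)

    # Parent: the child's death is still reported here.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None
```

If the returned value equals the code the child exited with, the death was reported to the original parent as normal.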

Graham

> Are you able to list any non-mainstream third party Python
> packages you use?
>
> FWIW, you are still the only report of this sort of issue that I
> recollect seeing which still makes me suspect it is something in the
> hosted web application itself that is causing the issue.
>
> Graham
>
