Hello, thanks for the quick response!

This Apache deployment is done automatically by AWS Elastic Beanstalk, so I 
don't really have control over the version used; I'm amazed it is using a 
five-year-old version.
I know for sure it is already running in daemon mode, since I have looked 
at the wsgi config they provide.

At the end are part of the wsgi config file and also the logs of the faulty 
restart that caused the process to stay alive. The configuration is provided 
pretty much as-is by Amazon, so one would expect it to be ideal.

Also, now that you mention keep-alive settings, is there any chance this 
issue is caused by the combination of mpm_event and a 100-second keep-alive 
timeout? Will mod_wsgi/Apache wait until the connections are closed, and 
could that be causing issues? This is really odd, because there have been 
many successful restarts even under load, and many faulty restarts while the 
servers were probably not being used.
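
Since the surviving process keeps running its background tasks, it is worth 
noting how the daemon flag on Python threads interacts with interpreter 
shutdown. A minimal sketch (illustrative only, not the actual app code):

```python
import subprocess
import sys
import textwrap
import time

# Child script: start a thread sleeping for 30 seconds, then let the main
# thread fall off the end. Because the thread is a daemon thread, the
# interpreter exits immediately instead of waiting out the sleep; with
# daemon=False it would block until the sleep finished.
script = textwrap.dedent("""
    import threading, time
    t = threading.Thread(target=time.sleep, args=(30,), daemon=True)
    t.start()
""")

start = time.time()
subprocess.run([sys.executable, "-c", script], timeout=10)
elapsed = time.time() - start
print(f"interpreter exited after {elapsed:.1f}s despite the 30s daemon thread")
```

As far as I can tell, the worker threads of multiprocessing's ThreadPool are 
created as daemon threads too, so the threads themselves should not be what 
keeps the interpreter from shutting down.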



 -- wsgi.conf (partially)
>
> LoadModule wsgi_module modules/mod_wsgi.so
> WSGIPythonHome /opt/python/run/baselinenv
> WSGISocketPrefix run/wsgi
> WSGIRestrictEmbedded On
>
> WSGIDaemonProcess wsgi processes=1 threads=10 display-name=%{GROUP} \
>   python-path=/opt/python/current/app:/opt/python/run/venv/lib64/python2.7/site-packages:/opt/python/run/venv/lib/python2.7/site-packages \
>   user=wsgi group=wsgi \
>   home=/opt/python/current/app
> WSGIProcessGroup wsgi
> </VirtualHost>
>
>
>  -- Restart logs
>
> [Fri Dec 30 18:26:46.825763 2016] [core:warn] [pid 24265:tid 140339875915840] AH00045: child process 24396 still did not exit, sending a SIGTERM
> [Fri Dec 30 18:26:48.827998 2016] [core:warn] [pid 24265:tid 140339875915840] AH00045: child process 24396 still did not exit, sending a SIGTERM
> [Fri Dec 30 18:26:50.830264 2016] [core:warn] [pid 24265:tid 140339875915840] AH00045: child process 24396 still did not exit, sending a SIGTERM
> [Fri Dec 30 18:26:52.832466 2016] [core:error] [pid 24265:tid 140339875915840] AH00046: child process 24396 still did not exit, sending a SIGKILL
> [Fri Dec 30 18:26:54.539770 2016] [suexec:notice] [pid 12669:tid 140513528571968] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
> [Fri Dec 30 18:26:54.550651 2016] [so:warn] [pid 12669:tid 140513528571968] AH01574: module expires_module is already loaded, skipping
> [Fri Dec 30 18:26:54.550700 2016] [so:warn] [pid 12669:tid 140513528571968] AH01574: module deflate_module is already loaded, skipping
> [Fri Dec 30 18:26:54.550791 2016] [so:warn] [pid 12669:tid 140513528571968] AH01574: module wsgi_module is already loaded, skipping
> [Fri Dec 30 18:26:54.552750 2016] [auth_digest:notice] [pid 12669:tid 140513528571968] AH01757: generating secret for digest authentication ...
> [Fri Dec 30 18:26:54.553328 2016] [lbmethod_heartbeat:notice] [pid 12669:tid 140513528571968] AH02282: No slotmem from mod_heartmonitor
> [Fri Dec 30 18:26:54.553663 2016] [:warn] [pid 12669:tid 140513528571968] mod_wsgi: Compiled for Python/2.7.9.
> [Fri Dec 30 18:26:54.553671 2016] [:warn] [pid 12669:tid 140513528571968] mod_wsgi: Runtime using Python/2.7.10.
> [Fri Dec 30 18:26:54.554100 2016] [core:warn] [pid 12669:tid 140513528571968] AH00098: pid file /var/run/httpd/httpd.pid overwritten -- Unclean shutdown of previous Apache run?
> [Fri Dec 30 18:26:54.555343 2016] [mpm_event:notice] [pid 12669:tid 140513528571968] AH00489: Apache/2.4.23 (Amazon) mod_wsgi/3.5 Python/2.7.10 configured -- resuming normal operations
>
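
To double-check the signal behaviour the AH00045/AH00046 lines describe, here 
is a toy sketch of mine (POSIX-only, unrelated to the mod_wsgi sources): a 
process that ignores SIGTERM survives repeated attempts, while SIGKILL cannot 
be caught or ignored.

```python
import os
import signal
import time

# Fork a child that ignores SIGTERM and then just sleeps forever.
pid = os.fork()
if pid == 0:
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    while True:
        time.sleep(1)

time.sleep(0.2)                                  # let the child install its handler
os.kill(pid, signal.SIGTERM)                     # ignored by the child
time.sleep(0.2)
alive = os.waitpid(pid, os.WNOHANG) == (0, 0)    # (0, 0) means still running
print("alive after SIGTERM:", alive)

os.kill(pid, signal.SIGKILL)                     # cannot be ignored
os.waitpid(pid, 0)                               # reap the child
print("reaped after SIGKILL")
```

So if SIGKILL was really delivered to the surviving pid, that process should 
be gone; a process apparently surviving SIGKILL usually means the signal went 
to a different pid than the one that lives on, or the process was stuck in 
uninterruptible I/O.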





On Friday, 30 December 2016 at 23:16:46 (UTC-3), Graham Dumpleton 
wrote:
>
> The version of mod_wsgi you are using is over 50 versions behind the 
> latest, and was merely a patch release of a version from over 5 years ago. 
> I can only suggest you upgrade to the latest mod_wsgi version, as yours is 
> not supported, unless you can manage to force your operating system vendor 
> to support it.
>
> There is a known orphaned-processes issue with mod_wsgi, but it is only 
> known to be a problem with certain versions of Apache 2.2 and has never 
> been seen with Apache 2.4. It also never occurred on an Apache restart, 
> only on internal daemon process restarts, albeit it was still the result of 
> some bug in Apache which seemed to have been resolved around Apache 2.2.18.
>
> I can only speculate that you aren't using daemon mode, or not always, and 
> requests are running in, or leaking into, Apache child worker processes. On 
> a graceful restart, Apache can let worker processes linger so they handle 
> keep-alive connections, so if your code was running in embedded mode 
> processes, Apache may well not be shutting them down straight away. Apache 
> could then be losing track of them as, from memory, there are cases where 
> it will give up on processes when a graceful restart occurs.
>
> I would ensure you are using daemon mode of mod_wsgi, not embedded mode. 
> Ensure you set at global Apache configuration scope:
>
>     WSGIRestrictEmbedded On
>
> so that use of embedded mode of mod_wsgi is prohibited and you will get 
> errors if a WSGI application request is wrongly delegated to an Apache 
> worker process. This will highlight that you have some issue with your 
> mod_wsgi configuration for delegating requests to daemon mode processes.
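>
> As a concrete sketch (the process/group names below are illustrative, not 
> necessarily your actual configuration), daemon mode delegation at global 
> scope looks something like:
>
>     WSGIRestrictEmbedded On
>     WSGIDaemonProcess myapp processes=1 threads=10 display-name=%{GROUP}
>     WSGIProcessGroup myapp
>     WSGIScriptAlias / /path/to/app/wsgi.py
>
> With WSGIProcessGroup set, requests for the application run in the daemon 
> processes rather than in the Apache worker processes.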
>
> Graham
>
> On 31 Dec 2016, at 12:11 PM, Cristiano Coelho <[email protected]> wrote:
>
> Sorry for bringing up such an ancient post, but this is the closest thing 
> to my issue I have found.
>
> With Apache 2.4, mod_wsgi 3.5 and Python 2.7, I am having a similar 
> issue, happening exactly on Apache restarts.
>
> Not all the time, but sometimes, the wsgi processes stay alive after an 
> Apache restart and need to be manually killed with sudo kill pid. The worst 
> part is that the process keeps running; I know this because the process, 
> which serves a Django app, starts some background threads that perform 
> tasks periodically, and when this issue happens those tasks stack up, since 
> duplicated logs appear when only 1 server and 1 process are supposed to be 
> running.
> The Apache process is restarted through AWS Elastic Beanstalk, which is a 
> managed service, but the logs show that a SIGTERM is attempted and, after 3 
> failures, a SIGKILL is sent, yet the process stays alive and keeps doing 
> tasks.
>
> Note that all background tasks are either daemon threads or ThreadPool 
> instances from the multiprocessing library.
>
>
> On Wednesday, 28 July 2010 at 10:45:04 (UTC-3), Paddy Joy wrote:
>>
>> Graham, 
>>
>> Haven't found any evidence of Apache crashing; the whole setup has 
>> been running very successfully for the last two years. I usually use 
>> force-reload when changes are made to virtual hosts. 
>>
>> The memory has definitely been increasing due to the orphaned 
>> processes, especially when I get 2 or 3 orphaned processes per 
>> application; however, this takes a few weeks to occur, and using the 
>> mod_wsgi inactivity timeout helps, as these processes appear to drop 
>> down to minimal memory consumption. 
>>
>> I have upgraded to v3 of mod_wsgi so will monitor for a few weeks and 
>> report back if I can't resolve. Thanks again for your assistance. 
>>
>> Paddy 
>>
>> On Jul 27, 9:51 am, Graham Dumpleton <[email protected]> 
>> wrote: 
>> > On 25 July 2010 22:16, Paddy Joy <[email protected]> wrote: 
>> > 
>> > 
>> > 
>> > > Graham, 
>> > 
>> > > Thank you for such a detailed response. As a first step I will 
>> > > update mod_wsgi to a more recent version! 
>> > 
>> > >> But can you confirm you are using daemon mode and what the 
>> > >> WSGIDaemonProcess configuration is? 
>> > 
>> > > WSGIDaemonProcess designcliq user=django group=django threads=25 
>> > > display-name=%{GROUP} inactivity-timeout=3600 
>> > > WSGIProcessGroup designcliq 
>> > 
>> > >> > I usually have to kill them individually to get rid of them and 
>> free 
>> > >> > up the memory. 
>> > 
>> > >> Technically you can't kill defunct processes, they are actually 
>> > >> already dead, so not sure what you are doing. 
>> > 
>> > > Late night reboot. 
>> > 
>> > > Here is a more detailed example of what I am trying to get my head 
>> > > around. 
>> > 
>> > > The following command shows some Django applications twice; for 
>> > > example, (wsgi:designcliq) appears twice, under parent IDs 10436 and 
>> > > 19648 (top of output). 
>> > 
>> > > paddy@joytech:~$ ps -feA  | grep -i wsgi 
>> > > django   19686 19648  0 19:29 ?        00:00:00 (wsgi:designcliq) -k start 
>> > > django   14118 10436  0 Jul23 ?        00:00:00 (wsgi:designcliq) -k start 
>> > > django     443 19648  0 20:43 ?        00:00:00 (wsgi:erinaheight -k start 
>> > > django     476 19648  0 20:43 ?        00:00:00 (wsgi:simplystyli -k start 
>> > > django     593 19648  0 20:44 ?        00:00:00 (wsgi:gilliantenn -k start 
>> > > django    3719 19648  0 21:00 ?        00:00:00 (wsgi:pipair)     -k start 
>> > > django    5548 19648  0 21:10 ?        00:00:00 (wsgi:keyboardkid -k start 
>> > > django    6779 10436  0 Jul23 ?        00:00:00 (wsgi:funkparty)  -k start 
>> > > django   11371 19648  0 21:42 ?        00:00:00 (wsgi:classicinte -k start 
>> > > paddy    13613  4428  0 21:55 pts/0    00:00:00 grep -i wsgi 
>> > > django   16246 10436  0 Jul24 ?        00:00:00 (wsgi:fasttraku)  -k start 
>> > > django   18161 10436  0 Jul24 ?        00:00:00 (wsgi:hostingssl) -k start 
>> > > django   19651 19648  0 19:29 ?        00:00:00 (wsgi:hostingssl) -k start 
>> > > django   19700 19648  0 19:29 ?        00:00:00 (wsgi:doorssincer -k start 
>> > > django   19769 19648  0 19:29 ?        00:00:00 (wsgi:fasttraku)  -k start 
>> > > django   19853 19648  0 19:29 ?        00:00:00 (wsgi:mariatennan -k start 
>> > > django   19913 19648  0 19:29 ?        00:00:00 (wsgi:talkoftheto -k start 
>> > > django   23082 10436  0 Jul24 ?        00:00:00 (wsgi:mariatennan -k start 
>> > > django   30964 19648  0 20:33 ?        00:00:00 (wsgi:funkparty)  -k start 
>> > 
>> > > If I then stop apache and run the same command, some applications 
>> > > still show up running under parent 10436 even though apache has been 
>> > > stopped. 
>> > 
>> > > paddy@joytech:~$ sudo /etc/init.d/apache2 stop 
>> > >  * Stopping web server apache2 
>> > 
>> > > paddy@joytech:~$ ps -feA  | grep -i wsgi 
>> > > django   14118 10436  0 Jul23 ?        00:00:00 (wsgi:designcliq) -k start 
>> > > django    6779 10436  0 Jul23 ?        00:00:00 (wsgi:funkparty)  -k start 
>> > > paddy    14014  4428  0 21:57 pts/0    00:00:00 grep -i wsgi 
>> > > django   16246 10436  0 Jul24 ?        00:00:00 (wsgi:fasttraku)  -k start 
>> > > django   18161 10436  0 Jul24 ?        00:00:00 (wsgi:hostingssl) -k start 
>> > > django   23082 10436  0 Jul24 ?        00:00:00 (wsgi:mariatennan -k start 
>> > 
>> > > Any ideas? 
>> > 
>> > Have you seen any evidence that Apache itself is crashing? 
>> > Alternatively, have you been doing anything like attaching debuggers 
>> > direct to Apache? 
>> > 
>> > Events like that can sometimes leave processes around, as can other 
>> > things. 
>> > 
>> > The operating system generally has a job to go around and clean up 
>> > zombie processes that haven't been reclaimed and which may be orphaned 
>> > in some way. 
>> > 
>> > As I pointed out, zombie processes don't actually consume memory; they 
>> > are just an entry in the process table. Thus, unless you are seeing 
>> > issues such as growing system-wide memory usage as a result, or 
>> > Apache no longer serving requests, then I wouldn't be overly 
>> > concerned. 
>> > 
>> > BTW, when you do Apache restarts, are you doing a 'restart' or a 
>> > 'graceful restart'? A graceful restart could possibly result in 
>> > processes hanging around, as in that case Apache doesn't forcibly kill 
>> > them off, and so if they don't shut down promptly themselves, and for 
>> > some reason Apache didn't clean them up properly when they do exit, 
>> > they could remain in the process table. 
>> > 
>> > Graham
>
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/modwsgi.
> For more options, visit https://groups.google.com/d/optout.
>
>
>
