Cleaned up blog version now available for the second part, on vertically partitioning Python web applications.
http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html

This corrects (hopefully) a number of misstated things.

Graham

On 20/02/2014, at 3:30 PM, Graham Dumpleton <graham.dumple...@gmail.com> wrote:

> Blog version of the issue with the number of threads seen:
>
> http://blog.dscpl.com.au/2014/02/use-of-threading-in-modwsgi-daemon-mode.html
>
> I stole your htop output. :-)
>
> Note that the blog post explains a bit more, mentioning a transient reaper thread that is created at the time of shutdown.
>
> It is possible I should create that reaper thread up front as well, making 4 extra threads. Am wondering whether delaying its creation may be the cause of a rare problem with processes hanging. This could occur if resources were exhausted and the thread could not be created. If request threads or interpreter destruction then subsequently hung, the process would never exit.
>
> This would though produce a specific log message, and I have never seen that message reported. All the same, it may be safer to create the reaper thread at the outset and have it wait on a thread condition variable to know when to activate.
>
> Graham
>
> On 20/02/2014, at 2:06 PM, Graham Dumpleton <graham.dumple...@gmail.com> wrote:
>
>> For each mod_wsgi daemon process where you have set threads=n, you will see n+3 threads.
>>
>> The n threads are obviously the configured number of threads for handling requests.
>>
>> The other three threads are as follows:
>>
>> 1. The main thread, which was left running after the daemon process forked from Apache. It is from this thread that the n request threads are created initially. It will also create the 2 additional threads described below. After it has done this, the main thread becomes a caretaker for the whole process. It waits on a special socketpair, to which a signal handler will write a character as a flag that the process should shut down. In other words, this main thread just sits there and stops the process from exiting until told to. (A rough sketch of this pattern follows the list below.)
>>
>> 2. The second thread is a monitor thread. It manages things like the activity timeout and the shutdown timeout. If either of those timeouts occurs, it sends a signal to the same process (i.e., itself) to trigger shutdown of the process.
>>
>> 3. The third thread is another monitoring thread, but one which specifically detects whether the whole Python interpreter itself has got into a complete deadlock and stopped doing anything. If this is detected, it will again send a signal to the same process to trigger a shutdown.
>>
>> So the additional threads are there to manage process shutdown and to ensure the process is still alive and doing useful work.
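>>
>> To make the socketpair trick in (1) concrete, here is a minimal Python sketch of the same "self-pipe" idea. To be clear, this is only an illustration of the pattern; mod_wsgi itself implements it in C inside Apache, so the details differ.
>>
>> import signal
>> import socket
>>
>> # Create the socketpair the main thread will wait on. The write end
>> # is non-blocking so a flood of signals can never block the handler.
>> wakeup_r, wakeup_w = socket.socketpair()
>> wakeup_w.setblocking(False)
>>
>> def handle_signal(signum, frame):
>>     # A signal handler should do almost nothing. Just flag the main
>>     # thread by writing a single byte to the socketpair.
>>     try:
>>         wakeup_w.send(b'x')
>>     except socket.error:
>>         pass
>>
>> signal.signal(signal.SIGTERM, handle_signal)
>> signal.signal(signal.SIGINT, handle_signal)
>>
>> # ... the request threads and monitor threads get created here ...
>>
>> # The main thread now blocks, keeping the process alive but idle,
>> # until a signal arrives.
>> wakeup_r.recv(1)
>>
>> # ... orderly shutdown of the request threads would start here ...
>>
>> The reason for writing a byte rather than doing any real work in the handler is that a signal handler can run at any point in the program, so it should only do the bare minimum; the main thread then performs the actual cleanup in a normal context.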
>> As to your memory issue: the problem with web application deployments which just about no one takes into consideration is that not all URLs in a web application are equal. I actually proposed a talk for PyCon US this year about this specific issue and how to deal with it, but the talk was rejected.
>>
>> In short, because your complete web application runs in the same process space, if one specific URL, or a small subset of URLs, has special resource requirements, it dictates for the complete application what resources you require, even if those URLs are infrequently used.
>>
>> As an example, the admin pages in a Django application are not frequently used, but they may have a requirement to process a lot of data. This could create a large transient memory requirement just for that request, but since memory allocated from the operating system is generally never given back, this one infrequent request will blow out memory usage for the whole application. This memory, once allocated, will be retained by the process until the process is subsequently restarted.
>>
>> Because of this, you can have the silly situation whereby a request which is only run once every fifteen minutes could, over the course of a few hours, progressively be handled by a different process each time in a multiprocess web server configuration. Your overall memory usage will thus seem to jump up for no good reason, until all the processes finally hit a plateau where each has allocated the maximum amount of memory it needs to handle the worst case transient memory usage of individual requests.
>>
>> It can get worse if you also have multithreading in use within each process. As the response time of a memory hungry URL grows, you raise the odds that two such memory hungry requests will need to be handled concurrently within the same process in different threads. What this means is that your worst case memory usage isn't just the worst case memory requirement for a specific URL, but that multiplied by the number of threads in the process.
>>
>> Further examples I have seen in the past where people have been hit by this are site maps, PDF generation, and possibly even RSS feeds where a significant amount of content is returned with each item rather than just a summary.
>>
>> The big problem in all of this is identifying which URL has the large transient memory requirement. The tools available for this aren't good, and you generally have to fall back on ad hoc solutions to try and work it out. I'll get to how you can work that out later, possibly as a separate email, as I have to go and find some code I wrote once before for someone to do just that.
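>>
>> As a rough taster of the ad hoc approach (this sketch is illustrative only, written for this reply, and is not the code referred to above): wrap the application in a WSGI middleware that logs any request during which the process's peak resident set size grew.
>>
>> import resource
>>
>> class MemoryGrowthMiddleware(object):
>>     # Log requests which push up the peak RSS (ru_maxrss).
>>     # On Linux, ru_maxrss is reported in kilobytes.
>>
>>     def __init__(self, application):
>>         self.application = application
>>
>>     def __call__(self, environ, start_response):
>>         before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
>>         try:
>>             for data in self.application(environ, start_response):
>>                 yield data
>>         finally:
>>             after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
>>             if after > before:
>>                 print('peak RSS grew %dKB during %s' % (
>>                         after - before, environ.get('PATH_INFO')))
>>
>> # application = MemoryGrowthMiddleware(application)
>>
>> Because ru_maxrss is a high water mark, only requests which set a new record get logged, which is what you want when hunting for the worst case URL. In a multithreaded process the blame can land on an innocent concurrent request, so take the results with a grain of salt.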
>> As to solving the problem once you have identified which URLs are at fault, ideally you would change how the code works to avoid the large transient memory requirement. If you cannot do that, or not straight away, then you can fall back on a number of different techniques to at least lessen the impact by configuring the web server differently.
>>
>> You have already identified two ways this can be done: the inactivity timeout and a maximum number of requests per process before a restart.
>>
>> The problem with these as a solution is that the requirement of a small set of URLs has dictated the configuration for the whole application. Using them can therefore have an impact on other parts of the application.
>>
>> In the case of setting a maximum on the number of requests handled by a process, you can introduce a significant amount of process churn if this is set too low relative to the overall throughput. That is, the processes will get restarted on a frequent basis.
>>
>> I talk about this issue of process churn in my PyCon talk from last year:
>>
>> http://lanyrd.com/2013/pycon/scdyzk/
>>
>> but you can also see what I mean in the attached application capacity analysis report picture.
>>
>> [attachment: PastedGraphic-1.png - application capacity analysis report]
>>
>> The better solution to this problem of not all URLs being equal and having different resource requirements is to vertically partition your web application and spread it across multiple processes, where each process only handles a subset of the URLs. Luckily this can easily be handled by mod_wsgi, using multiple daemon process groups and delegating URLs to different processes.
>>
>> Take for example the admin URLs in Django. If these are indeed infrequently used but can have a large transient memory requirement, what we can do is:
>>
>> WSGIDaemonProcess main processes=5 threads=5
>> WSGIDaemonProcess admin threads=3 inactivity-timeout=30 maximum-requests=20
>>
>> WSGIApplicationGroup %{GLOBAL}
>> WSGIProcessGroup main
>>
>> WSGIScriptAlias / /some/path/wsgi.py
>>
>> <Location /admin>
>> WSGIProcessGroup admin
>> </Location>
>>
>> So what we have done is create two daemon process groups and shove the admin pages into a distinct one of their own, where we can be more aggressive and use the inactivity timeout and maximum requests to combat excessive memory use. In doing this we have left things alone for the bulk of the web application.
>>
>> The end result is that we can tailor configuration settings for different parts of the application. The only requirement is that we can reasonably easily separate them out, based on the URL being matchable by a Location or LocationMatch directive in Apache.
>>
>> In this example we have done this specifically to separate out the misbehaving parts of an application, but the converse can also be done.
>>
>> If you think about it, most of the traffic for your site will often hit a small subset of URLs. The performance of the handling of these few, but very frequently visited, URLs could be impeded by having to use a more general configuration for the server.
>>
>> What may work better is to delegate the very high trafficked URLs to their own daemon process group, with a processes/threads mix tuned for that scenario. Because that daemon process group is only going to handle a small number of URLs, the actual amount of code from your application that would ever be executed within those processes would be much smaller. So long as your code base is set up such that it only lazily imports the code for specific handlers the first time they are needed, you can keep this optimised process quite lean as far as memory usage goes. (A sketch of such lazy dispatch follows the configuration below.)
>>
>> So instead of every process having to be very fat and eventually load up all parts of your application code, you can leave that to a smaller number of processes which, although they are going to serve up a greater number of different URLs, wouldn't necessarily get much traffic and so don't have to have as much capacity.
>>
>> You might therefore have the following:
>>
>> WSGIDaemonProcess main processes=1 threads=5
>> WSGIDaemonProcess volume processes=3 threads=5
>> WSGIDaemonProcess admin threads=3 inactivity-timeout=30 maximum-requests=20
>>
>> WSGIApplicationGroup %{GLOBAL}
>> WSGIProcessGroup main
>>
>> WSGIScriptAlias / /some/path/wsgi.py
>>
>> <Location /publications/article/>
>> WSGIProcessGroup volume
>> </Location>
>>
>> <Location /admin>
>> WSGIProcessGroup admin
>> </Location>
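>>
>> To illustrate the lazy importing just mentioned, here is a rough WSGI-level sketch. The URL prefixes and module names are made up for the example; in a Django application the equivalent trick is simply to avoid importing heavy modules until a view actually runs.
>>
>> import importlib
>> import threading
>>
>> # Hypothetical lazy dispatcher: a handler module is only imported the
>> # first time a request arrives for its URL prefix, so a daemon
>> # process which never sees /admin never pays for the admin code.
>>
>> _routes = {
>>     '/publications': 'mysite.handlers.publications',
>>     '/admin': 'mysite.handlers.admin',
>> }
>>
>> _handlers = {}
>> _lock = threading.Lock()
>>
>> def application(environ, start_response):
>>     path = environ.get('PATH_INFO', '')
>>     for prefix, module_name in _routes.items():
>>         if path.startswith(prefix):
>>             with _lock:
>>                 if module_name not in _handlers:
>>                     module = importlib.import_module(module_name)
>>                     _handlers[module_name] = module.application
>>             return _handlers[module_name](environ, start_response)
>>     start_response('404 Not Found', [('Content-Type', 'text/plain')])
>>     return [b'not found']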
>> In your case we are therefore shoving the one URL which accounts for almost 50% of your total traffic into a daemon process group of its own. That group should have a lower memory footprint, and so we can afford to run it across a few processes, each with a small number of threads. All other non admin traffic, for which all the remaining code of your application would be loaded, can be handled by one process.
>>
>> So by juggling things like this, handling as special cases both the worst case URLs for transient memory usage and your high traffic URLs, one can often quite dramatically reduce the amount of memory used.
>>
>> Now, what about monitoring all of this so as to be able to gauge effectiveness?
>>
>> Because server monitoring in New Relic can't separately identify the mod_wsgi daemon process groups, even when the display-name option is used, you cannot readily rely on server monitoring for things like memory tracking. Everything will be lumped together under Apache, and you cannot tell what the memory requirements of each group are.
>>
>> What you have to do in this case is rely on the memory usage charts on the main overview dashboard for the web application in New Relic.
>>
>> [attachment: PastedGraphic-3.png - web application overview dashboard]
>>
>> We have a problem at this point though, which is that everything will still report under the same existing application in the New Relic UI, so we still don't have separation.
>>
>> What we can do here is configure things so that each daemon process group reports into a separate application, while still also reporting to a combined application covering everything. This can be done from the Apache configuration file using:
>>
>> WSGIDaemonProcess main processes=1 threads=5
>> WSGIDaemonProcess volume processes=3 threads=5
>> WSGIDaemonProcess admin threads=3 inactivity-timeout=30 maximum-requests=20
>>
>> WSGIApplicationGroup %{GLOBAL}
>> WSGIProcessGroup main
>>
>> SetEnv newrelic.app_name 'My Site (main);My Site'
>>
>> WSGIScriptAlias / /some/path/wsgi.py
>>
>> <Location /publications/article/>
>> WSGIProcessGroup volume
>> SetEnv newrelic.app_name 'My Site (volume);My Site'
>> </Location>
>>
>> <Location /admin>
>> WSGIProcessGroup admin
>> SetEnv newrelic.app_name 'My Site (admin);My Site'
>> </Location>
>>
>> So we are using specialisation via the Location directive to override the application name that the New Relic Python agent reports to.
>>
>> We are also, in this case, using a semicolon separated list of names.
>>
>> The result is that each daemon process group logs under a separate application of the form 'My Site (XXX)', while at the same time they all also report to 'My Site'.
>>
>> This way you can still have a combined view, but you can also look at each daemon process group in isolation.
>>
>> The isolation is important, because you can then do each of the following separately for each daemon process group:
>>
>> - View response times.
>> - View throughput.
>> - View memory usage.
>> - View CPU usage.
>> - View the capacity analysis report.
>> - Trigger the thread profiler.
>>
>> If the groups were separated like this but all reporting only to the same application, the data presented would be all mixed up, and for the last four items could be quite confusing.
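>>
>> This works because Apache passes SetEnv values to the application in the per-request WSGI environ, which is where the New Relic Python agent picks up the newrelic.app_name override. If you want to verify what a given Location is actually passing down, a trivial (purely illustrative) wrapper around your WSGI entry point will show you:
>>
>> # Illustrative only: print which New Relic application name, if any,
>> # each request would report under, based on the SetEnv value that
>> # Apache placed into the WSGI environ for this request.
>>
>> def verify_app_name(application):
>>     def wrapper(environ, start_response):
>>         name = environ.get('newrelic.app_name', '<agent default>')
>>         print('%s -> %s' % (environ.get('PATH_INFO'), name))
>>         return application(environ, start_response)
>>     return wrapper
>>
>> # application = verify_app_name(application)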
>> Okay, so that is probably going to be a lot to digest, but it represents just a part of what I would have presented at PyCon US had my talk been accepted.
>>
>> Other things I would have talked about include dealing with the request backlog when overloaded due to increased traffic for certain URLs, dealing with the danger of malicious POST requests with large content sizes, etc.
>>
>> Am sure the above will keep you busy for a while at least though. :-)
>>
>> Now that I have done all that, I should clean it up a bit and put it up in a couple of blog posts.
>>
>> Graham
>>
>> On 20/02/2014, at 8:06 AM, scoopseven <m...@kecko.com> wrote:
>>
>>> Graham, I'm still not sure why, with processes=5 threads=2, I see 5 threads for each mod_wsgi process in htop. If you could explain that last little hanging chad it would be great. Thanks!
>>>
>>> Updated SO with a summary of the solution: http://serverfault.com/questions/576527/apache-processes-in-top-more-than-maxclients
>>>
>>> Mark
>>>
>>> On Wednesday, February 19, 2014 12:05:56 PM UTC-5, scoopseven wrote:
>>>
>>> This question started on SO: http://serverfault.com/questions/576527/apache-processes-in-top-more-than-maxclients/576600
>>>
>>> I've updated my Apache config and mod_wsgi settings, but am still experiencing memory creep. Here's my site conf and my apache2.conf:
>>>
>>> WSGIDaemonProcess mywsgi user=www-data group=www-data processes=5 threads=5 display-name=mod-wsgi python-path=/home/admin/.virtualenvs/django/lib/python2.7/site-packages
>>> WSGIPythonHome /home/admin/.virtualenvs/django
>>> WSGIRestrictEmbedded On
>>> WSGILazyInitialization On
>>>
>>> <VirtualHost 127.0.0.1:8080>
>>> ServerName www.mysite.com
>>> DocumentRoot /srv/mysite
>>>
>>> SetEnvIf X-Forwarded-Protocol https HTTPS=1
>>> WSGIScriptAlias / /srv/mysite/system/apache/django.wsgi process-group=mywsgi application-group=%{GLOBAL}
>>> RequestHeader add X-Queue-Start "%t"
>>> </VirtualHost>
>>>
>>> <IfModule mpm_worker_module>
>>> StartServers 1
>>> ThreadsPerChild 5
>>> MinSpareThreads 5
>>> MaxSpareThreads 10
>>> MaxClients 25
>>> ServerLimit 5
>>> MaxRequestsPerChild 0
>>> MaxMemFree 1024
>>> </IfModule>
>>>
>>> I'm watching apache and mod_wsgi via htop, and apache seems to be playing by the rules, never loading more than 25 threads. It usually stays around 10-15 threads. We average around 5-6 requests/second, monitored via /server-status/. The thing that's bothering me is that I'm counting 44 mod_wsgi threads in htop. I assumed that since I had processes=5 threads=5 I would only see a maximum of 30 threads below (5 processes + 25 threads).
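>>>
>>> (If you want to double-check counts like this without eyeballing htop, the following rough, Linux-only Python sketch tallies the threads of each mod-wsgi process from /proc; it is offered purely as an illustration.)
>>>
>>> import os
>>>
>>> # For each process whose command line starts with the mod_wsgi
>>> # display-name, report the Threads: field of /proc/<pid>/status.
>>> for pid in filter(str.isdigit, os.listdir('/proc')):
>>>     try:
>>>         with open('/proc/%s/cmdline' % pid) as f:
>>>             cmdline = f.read()
>>>         with open('/proc/%s/status' % pid) as f:
>>>             status = f.read()
>>>     except IOError:
>>>         continue
>>>     if cmdline.startswith('mod-wsgi'):
>>>         for line in status.splitlines():
>>>             if line.startswith('Threads:'):
>>>                 print('pid %s: %s' % (pid, line))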
>>> Partial htop dump:
>>>
>>> 2249 www-data 20 0 159M 65544 4676 S 26.0 0.8 2:09.93 mod-wsgi -k start
>>> 2248 www-data 20 0 164M 69040 5560 S 148. 0.8 2:10.72 mod-wsgi -k start
>>> 2274 www-data 20 0 159M 65544 4676 S 0.0 0.8 0:12.58 mod-wsgi -k start
>>> 2250 www-data 20 0 157M 62212 5168 S 10.0 0.7 1:50.35 mod-wsgi -k start
>>> 2291 www-data 20 0 164M 69040 5560 S 41.0 0.8 0:17.07 mod-wsgi -k start
>>> 2251 www-data 20 0 165M 69320 4676 S 0.0 0.8 1:59.48 mod-wsgi -k start
>>> 2272 www-data 20 0 159M 65544 4676 S 0.0 0.8 0:28.67 mod-wsgi -k start
>>> 2282 www-data 20 0 165M 69320 4676 S 0.0 0.8 0:33.85 mod-wsgi -k start
>>> 2292 www-data 20 0 164M 69040 5560 S 28.0 0.8 0:28.08 mod-wsgi -k start
>>> 2298 www-data 20 0 157M 62212 5168 S 0.0 0.7 0:14.93 mod-wsgi -k start
>>> 2299 www-data 20 0 157M 62212 5168 S 1.0 0.7 0:23.71 mod-wsgi -k start
>>> 2358 www-data 20 0 164M 69040 5560 S 1.0 0.8 0:02.62 mod-wsgi -k start
>>> 2252 www-data 20 0 165M 70468 4660 S 41.0 0.8 1:55.85 mod-wsgi -k start
>>> 2273 www-data 20 0 159M 65544 4676 S 10.0 0.8 0:29.03 mod-wsgi -k start
>>> 2278 www-data 20 0 159M 65544 4676 S 1.0 0.8 0:02.79 mod-wsgi -k start
>>> 2264 www-data 20 0 165M 70468 4660 S 0.0 0.8 0:07.50 mod-wsgi -k start
>>> 2266 www-data 20 0 165M 70468 4660 S 25.0 0.8 0:39.49 mod-wsgi -k start
>>> 2300 www-data 20 0 157M 62212 5168 S 6.0 0.7 0:28.78 mod-wsgi -k start
>>> 2265 www-data 20 0 165M 70468 4660 S 15.0 0.8 0:31.44 mod-wsgi -k start
>>> 2294 www-data 20 0 164M 69040 5560 R 54.0 0.8 0:34.82 mod-wsgi -k start
>>> 2279 www-data 20 0 165M 69320 4676 S 0.0 0.8 0:32.63 mod-wsgi -k start
>>> 2297 www-data 20 0 157M 62212 5168 S 3.0 0.7 0:09.68 mod-wsgi -k start
>>> 2302 www-data 20 0 157M 62212 5168 S 0.0 0.7 0:27.62 mod-wsgi -k start
>>> 2323 www-data 20 0 157M 62212 5168 S 0.0 0.7 0:02.56 mod-wsgi -k start
>>> 2280 www-data 20 0 165M 69320 4676 S 0.0 0.8 0:13.00 mod-wsgi -k start
>>> 2263 www-data 20 0 165M 70468 4660 S 0.0 0.8 0:19.35 mod-wsgi -k start
>>> 2322 www-data 20 0 165M 69320 4676 S 0.0 0.8 0:03.05 mod-wsgi -k start
>>> 2275 www-data 20 0 165M 70468 4660 S 0.0 0.8 0:02.72 mod-wsgi -k start
>>> 2285 www-data 20 0 164M 69040 5560 S 0.0 0.8 0:00.00 mod-wsgi -k start
>>> 2288 www-data 20 0 164M 69040 5560 S 0.0 0.8 0:00.11 mod-wsgi -k start
>>> 2290 www-data 20 0 164M 69040 5560 S 4.0 0.8 0:15.66 mod-wsgi -k start
>>> 2293 www-data 20 0 164M 69040 5560 S 20.0 0.8 0:29.01 mod-wsgi -k start
>>> 2268 www-data 20 0 159M 65544 4676 S 0.0 0.8 0:00.00 mod-wsgi -k start
>>> 2269 www-data 20 0 159M 65544 4676 S 0.0 0.8 0:00.11 mod-wsgi -k start
>>> 2270 www-data 20 0 159M 65544 4676 S 15.0 0.8 0:26.62 mod-wsgi -k start
>>> 2271 www-data 20 0 159M 65544 4676 S 0.0 0.8 0:26.55 mod-wsgi -k start
>>>
>>> Last night I had processes=3 threads=3 and my NR capacity report showed 100% usage (https://rpm.newrelic.com/accounts/67402/applications/1132078/optimize/capacity_analysis), so I upped it to processes=5 threads=5 and now I have 44 threads going. Despite the instance count reported by NR staying relatively stable, memory consumption continues to increase (https://rpm.newrelic.com/accounts/67402/servers/1130000/processes#id=152494639). I realize that nobody except Graham can see those NR reports, sorry.
>>>
>>> Has anyone dealt with this situation before?
>>>
>>> Mark