> On 31 Dec 2016, at 6:37 AM, Jason Garber <[email protected]> wrote:
>
> Hi Graham,
>
> Can you comment on why one would use more processes or more threads in a
> database-intensive web application?
>
> For example:
>
> 6 processes
> 10 threads each
>
> Or
>
> 1 process
> 60 threads
>
> My current understanding would say that you could use more threads until you
> started to see a performance drop due to the GIL not allowing concurrent
> execution. But I was wondering what you had to say.
Sort of. The problem is knowing when the GIL is starting to impact on things. What constitutes a performance drop is itself hard to measure. You could keep adding threads until you have way more than you need and still not see a performance drop, as the contention point could be in a place where you just can't get enough requests into the process to make use of them anyway. I have seen this many times: people allocate way too many threads, most of which are never used and just sit there idle chewing up memory.

The design of mod_wsgi daemon mode is such that it will at least not activate threads unless the capacity is needed. This only applies to the Python side of things though; the threads are still allocated on the C side, and a momentary backlog could still see them activated on the Python side and used for a moment, only to go back to not being in use. That is when memory use can start to blow out unnecessarily, and where fewer threads would have been better, living with the potential momentary backlog.

Since Python doesn't provide a way of measuring GIL contention or give any guidance, the question is what we can look at. What I have been using to try and quantify GIL impacts and tune processes/threads is the following:

* Thread capacity used - This is a measure of how much of the capacity of the specified threads is actually being used in a time period. It can tell you when you have too many threads allocated and so they are wasted.

* Per request CPU usage - This is a measure of how much CPU was used in handling a request. If this comes in as a low figure, the request is likely I/O bound. If a high figure, then CPU bound. Unfortunately the granularity of this on some systems is not great, so if you always have very short response times, under 10ms, it may not give a completely accurate picture.

* Process wide CPU usage - This is a measure of how much CPU is used by the whole process in a period of time.

* Per request queue time - This is a measure of how much time a request spent in Apache before it was handled in a daemon process group by a WSGI application. This helps in understanding whether backlogging is occurring due to a lack of capacity in the daemon processes, or to bottlenecks elsewhere.

* Rate of requests timed out - This is a measure of how many requests timed out before being handled by the daemon processes. This helps in understanding when the application got overloaded and requests started to be failed (if that is enabled) to try and discard the backlog.

With the exception of the last one, there are ways in mod_wsgi of getting this information. For the last one I could partly implement something to track it, but it would not be accurate under severe backlogging; it would still show the spike though, which is enough. What can't easily be got at is the case where backlogging in the Apache worker processes is so excessive that the socket connection to the daemon processes fails. I only thought about a way of doing something for that when I saw this email, so may start playing with it.
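For illustration only, here is a rough sketch of how a few of these figures could be approximated from inside the application itself with a simple WSGI middleware, independent of whatever mod_wsgi records internally. It assumes Python 3.7+ for time.thread_time(), and it assumes the mod_wsgi version in use exposes the request queueing timestamp in the WSGI environ under the 'mod_wsgi.queue_start' key as microseconds since the epoch; check the key and its format for your version before relying on it.

    import threading
    import time

    class MetricsMiddleware(object):

        def __init__(self, application):
            self.application = application
            self.lock = threading.Lock()
            self.active = 0        # requests currently being handled
            self.peak_active = 0   # high water mark on thread capacity used

        def __call__(self, environ, start_response):
            with self.lock:
                self.active += 1
                if self.active > self.peak_active:
                    self.peak_active = self.active

            # Per request queue time, if the environ key is available. The
            # value is assumed here to be microseconds since the epoch.
            queue_time = None
            queue_start = environ.get('mod_wsgi.queue_start')
            if queue_start:
                try:
                    queue_time = time.time() - float(queue_start) / 1e6
                except ValueError:
                    pass

            wall_start = time.time()
            cpu_start = time.thread_time()   # CPU time used by this thread only

            try:
                return self.application(environ, start_response)
            finally:
                # Only measures up to the point the iterable is handed back,
                # not until a streamed response is fully consumed.
                wall_time = time.time() - wall_start
                cpu_time = time.thread_time() - cpu_start
                with self.lock:
                    self.active -= 1
                # In practice these would be pushed into a monitoring system
                # rather than printed. A low cpu_time relative to wall_time
                # suggests an I/O bound request, a high one a CPU bound one.
                print('queue=%r wall=%.6f cpu=%.6f peak_threads=%d' %
                      (queue_time, wall_time, cpu_time, self.peak_active))

This is only a sketch of the idea, not what mod_wsgi does internally, and it would need to be wrapped around the WSGI application object in the WSGI script file to take effect.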
An issue related to all this is process memory size. In Python there is a tendency for memory in use to grow up to a plateau as all possible request handlers are visited. This means you can have a lot of memory in use that potentially isn't touched very often. I have talked previously about vertically partitioning an application across multiple daemon process groups so that different subsets of URLs are handled in their own processes. This allows you to separately tune processes/threads for each set of URLs, and for the more frequently visited URLs keep that memory hot, with less memory just sitting there unused.

Another crude way of handling growing memory use, and the extra overhead of too much memory paging, is to restart the daemon processes every so often. Right now this can only be done based on request count or inactivity though. Not for the first time, I have been looking lately at a way of simply restarting processes on a regular time interval to keep memory usage down. The danger here, which isn't simple to solve, is avoiding restarting multiple processes at the same time and causing load spikes. Using graceful timeouts may be enough to spread the restarts though, as they would be random, based on when processes are not handling requests.

Anyway, there are metrics one can use to measure things to try and understand what is going on. Do you have any monitoring system in place into which metrics can be injected? If you do, I can explain how you can get the information out and we can start looking at it.

I would really love to be able to get hold of a data dump of these metrics over a period of time for a real application, so I can try and do some data analysis of it in a Jupyter Notebook and develop some code which could be used to give guidance on tuning when you have the data. One can't get decent data for doing this when using test applications and fake data; you need a real application with a decent amount of traffic and I don't have one.
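As a companion sketch of injecting such figures into a monitoring system, the following assumes a statsd daemon listening on localhost:8125 and uses metric names made up for the example; any other collector could be substituted. It reports the process wide CPU usage figure from a background reporter thread started once per daemon process, typically from the WSGI script file.

    import os
    import socket
    import threading
    import time

    def send_gauge(name, value, addr=('localhost', 8125)):
        # statsd plain text protocol: "<name>:<value>|g" records a gauge.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.sendto(('%s:%s|g' % (name, value)).encode('ascii'), addr)
        finally:
            sock.close()

    def report_process_cpu(interval=60.0):
        # os.times() gives cumulative user/system CPU time for this process.
        last = os.times()
        last_wall = time.time()
        while True:
            time.sleep(interval)
            now = os.times()
            now_wall = time.time()
            cpu = (now.user - last.user) + (now.system - last.system)
            # Express CPU burn as a percentage of one CPU over the interval.
            percent = 100.0 * cpu / (now_wall - last_wall)
            send_gauge('wsgi.process.cpu_percent', round(percent, 2))
            last, last_wall = now, now_wall

    _reporter = threading.Thread(target=report_process_cpu, daemon=True)
    _reporter.start()

A gauge is used here purely for simplicity; a counter or timer could equally be used depending on what the monitoring system expects.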
Graham