You can add this to the end of your wsgi.py file.

import time
import threading
import os
import mod_wsgi

last_metrics = None

def monitor(*args):
    global last_metrics

    while True:
        # Sample the current per-process metrics from mod_wsgi.
        current_metrics = mod_wsgi.process_metrics()

        if last_metrics is not None:
            # Calculate deltas since the previous one second sample.
            cpu_user_time = (current_metrics['cpu_user_time'] -
                    last_metrics['cpu_user_time'])
            cpu_system_time = (current_metrics['cpu_system_time'] -
                    last_metrics['cpu_system_time'])

            request_busy_time = (current_metrics['request_busy_time'] -
                    last_metrics['request_busy_time'])

            request_threads = current_metrics['request_threads']

            # Report the data. The timestamp is in nanoseconds.

            timestamp = int(current_metrics['current_time'] * 1000000000)

            item = {}
            item['time'] = timestamp
            item['measurement'] = 'process'
            item['process_group'] = mod_wsgi.process_group
            item['process_id'] = os.getpid()

            fields = {}

            fields['cpu_user_time'] = cpu_user_time
            fields['cpu_system_time'] = cpu_system_time

            fields['request_busy_time'] = request_busy_time
            fields['request_busy_usage'] = (request_busy_time /
                    mod_wsgi.threads_per_process)

            fields['threads_per_process'] = mod_wsgi.threads_per_process
            fields['request_threads'] = request_threads

            item['fields'] = fields

            print(item)

        last_metrics = current_metrics

        # Sleep just long enough that the next sample lands on the
        # following one second boundary.
        current_time = current_metrics['current_time']
        delay = max(0, (current_time + 1.0) - time.time())
        time.sleep(delay)

# Start the monitor as a daemon thread so it doesn't block process shutdown.
thread = threading.Thread(target=monitor)
thread.daemon = True
thread.start()


This will send a message to the log every second for each process.

Change it as necessary to write to a separate file in /tmp so you can go back 
and look at it.
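As a minimal sketch of that change (the file name and the use of JSON here are 
just my own choices, nothing mod_wsgi dictates), you could replace the 
print(item) call with a small helper:

import json
import os

# Hypothetical per-process log file, so concurrent daemon processes
# don't interleave their writes.
METRICS_LOG = '/tmp/mod_wsgi-metrics-%d.log' % os.getpid()

def report(item):
    # Append one JSON document per line so the file is easy to parse later.
    with open(METRICS_LOG, 'a') as fp:
        fp.write(json.dumps(item) + '\n')

Opening the file in append mode on each sample avoids holding a file handle 
open for the life of the process.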

An example of what you will see is:

[Thu Sep 21 22:15:46.232094 2017] [wsgi:error] [pid 14468] {'time': 
1506024946231937792, 'measurement': 'process', 'process_group': 
'localhost:8000', 'process_id': 14468, 'fields': {'cpu_user_time': 0.0, 
'cpu_system_time': 0.0, 'request_busy_time': 0.0, 'request_busy_usage': 0.0, 
'threads_per_process': 5, 'request_threads': 4}}

The 'threads_per_process' field is how many threads are defined for the process.

The 'request_threads' field is how many of those threads have been used.

The important one to look at is 'request_busy_usage', which shows what 
percentage of the thread pool capacity was utilised by the process over the 
last period. If this is a low percentage, then likely way more threads are 
defined for the process than are needed. If it is a very high percentage, then 
you are using up a lot of the capacity and risk requests starting to backlog 
once all the capacity is consumed.
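As a concrete worked example (the numbers are invented): with 
threads_per_process of 5 and a one second sampling interval, the thread pool 
can deliver at most 5 thread-seconds of request handling per interval. So if 
the request_busy_time delta for an interval came out at 2.5, then:

request_busy_usage = 2.5 / 5.0   # = 0.5, i.e. the pool ran at 50% capacity

and half the available thread capacity was in use over that period.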

Graham

> On 21 Sep 2017, at 4:35 PM, Samuel Bancal <[email protected]> wrote:
> 
> Hi Graham,
> 
> Thanks for this advice. I've switched the Apache settings so that /refresh URLs 
> are handled by a separate daemon process group.
> Here is a summary of the setup:
> 
> WSGIDaemonProcess main processes=1 threads=15 
> python-home=/django_app/venv/django_py2 python-path=/django_app/fileshare 
> display-name=%{GROUP}
> WSGIDaemonProcess refresh processes=3 threads=50 
> python-home=/django_app/venv/django_py2 python-path=/django_app/fileshare 
> display-name=%{GROUP}
> WSGIScriptAlias / /django_app/fileshare/fileshare/wsgi.py
> WSGIApplicationGroup %{GLOBAL}
> WSGIProcessGroup main
> 
> <Location /refresh>
> WSGIProcessGroup refresh
> </Location>
> 
> In your answer, you mentioned "metrics" ... I could find pointers to the New 
> Relic solution, but I wonder if we can get such values from the CLI on the 
> server?
> 
> Regards,
> Samuel
> 
> 
> 
> On Thursday, 21 September 2017 07:45:01 UTC+2, Graham Dumpleton wrote:
> 
>> On 20 Sep 2017, at 9:24 AM, Samuel Bancal <[email protected]> wrote:
>> 
>> Hi,
>> 
>> We've experienced some high latency with a Django project using mod_wsgi in 
>> daemon mode. That web service does folder management ... and has a /refresh 
>> POST URL that will keep the request hanging until a) a change is made in the 
>> folder or b) 60 seconds have passed.
>> 
>> With this, having just a few clients ... already used up the 15 threads 
>> available. And any new requests were waiting for a /refresh to time out 
>> before being processed.
>> 
>> I guess such a /refresh method is not today's best practice ... and the dev 
>> (me) would be better off looking into XHR calls (which I don't know yet).
>> 
>> With this, I have 3 questions:
>> 
>> + Would XHR calls avoid consuming threads the way POST requests do?
>> + How best to size today's solution? Since all /refresh requests spend their 
>> time doing some DB requests followed by `time.sleep(1)` ... can I grow the 
>> number of threads to 256? Or shall I split it into something like 8 processes 
>> of 32 threads? (or even more?)
>> + Before that, we used mod_wsgi in embedded mode ... which never gave any 
>> latency we could measure. Would it be better, in such a situation, to switch 
>> back to embedded mode?
> 
> The reason you wouldn't have seen an issue in embedded mode is that the limit 
> on the maximum number of clients supported by the Apache child worker processes 
> would be much higher, and Apache would spawn new worker processes if need be 
> as the number of concurrent requests to be handled grew, up to that limit.
> 
> The risk in using embedded mode is the amount of memory used by Apache and 
> some performance issues around how Apache dynamically adjusts the number of 
> processes. The overhead of always starting up new processes, each with a full 
> Python interpreter in it, and reloading your code, is expensive. Unfortunately 
> the Apache algorithm suffers when doing this with heavyweight Python 
> applications.
> 
> Switching to XHR calls will not help unless you are also redesigning things 
> to periodically poll instead of wait, something which could be done without 
> using XHR anyway. Depending on the number of clients, polling can cause other 
> issues as the overall number of requests increases.
> 
> As to growing the number of threads, what you need to be careful of here is 
> that doing so in a process which can also handle CPU intensive requests means 
> you start to bog things down due to the Python global interpreter lock 
> (GIL).
> 
> If these requests are always waiting for the database and then sleeping, and 
> not doing CPU intensive tasks, the best way to handle it would be to create 
> two separate daemon process groups. In the new daemon process group you could set 
> the number of threads a lot higher in order to handle the number of 
> concurrent requests. Then in the Apache configuration, you would delegate 
> only URLs related to these requests which need to wait a long time to that 
> daemon process group.
> 
> I call this vertically partitioning the URL namespace, the separation 
> allowing you to customise daemon process groups for the specific type of work 
> being done for the requests delegated to that daemon process group.
> 
> I have blogged about this concept in:
> 
> * http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html
> 
> There are some examples in the post about it.
> 
> Still, be careful about bumping the number of threads per process too high; it 
> is better to use more processes as well, spreading the threads across them. 
> There are some metrics you can get out of mod_wsgi if need be to help size the 
> thread pool so you aren't allocating way more than you need.
> 
> The other way is to delegate the long running requests back to embedded 
> mode and so, for those at least, allow them to make use of the dynamic 
> process/thread management of Apache.
> 
> If you were doing that, you would need to look closely at the Apache MPM 
> settings and adjust them to avoid the problems with process churn Apache 
> experiences. I have talked about that problem in:
> 
> * https://www.youtube.com/watch?v=k6Erh7oHvns
> 
> To combat the memory overhead, you might perhaps want to use a more lightweight 
> WSGI application for handling those specific requests. Pulling in a large 
> Django application wouldn't necessarily be a good idea, depending on how well 
> it lazily loads code, or whether it pre-loads request handlers on startup.
> 
> I'll let you digest that and if you have questions let me know.
> 
> Graham
> 
