Sorry for the slow reply. I got distracted with other stuff.

See the attached example code for a start.

from __future__ import print_function

import time
import threading
import atexit

# Queue was renamed to queue in Python 3.
try:
    import Queue as queue
except ImportError:
    import queue

import mod_wsgi

class Monitor(object):

    def __init__(self, interval=1.0):
        self.monitor_thread = None
        self.monitor_interval = interval
        self.monitor_queue = queue.Queue()

        self.last_metrics = None

    def start(self):
        if self.monitor_thread is None:
            self.monitor_thread = threading.Thread(target=self.monitor)
            self.monitor_thread.daemon = True
            self.monitor_thread.start()

            atexit.register(self.stop)

    def stop(self):
        if self.monitor_thread is None:
            return

        try:
            self.monitor_queue.put(True)
        except Exception:
            pass

        self.monitor_thread.join()

    def monitor(self, *args):
        while True:
            # Snapshot of the cumulative per process counters.
            current_metrics = mod_wsgi.process_metrics()

            if self.last_metrics is not None:
                # The counters are cumulative, so report deltas
                # calculated against the previous sample.
                duration = (current_metrics['current_time'] -
                        self.last_metrics['current_time'])

                cpu_user_time = (current_metrics['cpu_user_time'] -
                        self.last_metrics['cpu_user_time'])
                cpu_system_time = (current_metrics['cpu_system_time'] -
                        self.last_metrics['cpu_system_time'])

                request_busy_time = (current_metrics['request_busy_time'] -
                        self.last_metrics['request_busy_time'])

                request_threads = current_metrics['request_threads']

                timestamp = current_metrics['current_time']

                fields = {}

                fields['cpu_user_time'] = cpu_user_time
                fields['cpu_system_time'] = cpu_system_time

                # Fraction of one CPU core used over the interval.
                fields['cpu_usage'] = ((cpu_user_time+cpu_system_time) /
                        duration)

                fields['request_busy_time'] = request_busy_time

                # Fraction of total request thread capacity used.
                fields['request_busy_usage'] = (request_busy_time /
                        (duration * mod_wsgi.threads_per_process))

                fields['threads_per_process'] = mod_wsgi.threads_per_process
                fields['request_threads'] = request_threads

                print('METRICS', fields)

            self.last_metrics = current_metrics

            # Align the next sample to the interval, allowing for the
            # time taken to collect and report the metrics.
            current_time = current_metrics['current_time']
            delay = max(0, (current_time + self.monitor_interval) - time.time())

            if delay == 0.0:
                delay = self.monitor_interval

            try:
                # Waiting on the queue acts as an interruptible sleep;
                # the thread exits when stop() queues an item.
                return self.monitor_queue.get(timeout=delay)

            except queue.Empty:
                pass

monitor = Monitor(1.0)
monitor.start()

def application(environ, start_response):
    status = '200 OK'
    output = b'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    time.sleep(0.1)

    return [output]

If you use that with mod_wsgi, for example as:

    mod_wsgi-express start-server example.py --log-to-terminal --threads 20

it will output something like the following each second for each process:

METRICS {'cpu_system_time': 0.01000000536441803, 'request_busy_time': 3.3496430000000004, 'cpu_usage': 0.019944821691381887, 'request_busy_usage': 0.1670199913154585, 'request_threads': 16, 'cpu_user_time': 0.01000000536441803, 'threads_per_process': 20L}

Key things to look at here are:

threads_per_process - This says how many threads the process is configured to use.

request_threads - This gives the number of threads in the process that mod_wsgi has activated and used at some point. Effectively it is the number of threads needed at peak times, when you have lots of concurrent requests, or an overlap of long running requests.

request_busy_usage - This gives the proportion of thread capacity used in the last reporting period (default 1 second), so 1.0 is 100% of capacity. If this is always running at a low utilisation, it indicates you have more threads per process than you need. What you might set the thread count to is then partly guided by what request_threads reports, but remember that request_threads represents the peak. It may be acceptable to use fewer threads than request_threads if the peak is rarely hit and a momentary backlog is fine, or if you have multiple processes which can also absorb a peak, such as one that occurs due to restarts.

cpu_usage - This gives the fraction of CPU capacity being used, where one core is 1.0. It can technically go over 1.0, indicating more than one core is in use. Due to the GIL that is hard to achieve, and would depend on C code which releases the GIL being used; hammer mod_wsgi and you can see it, as mod_wsgi releases the GIL for I/O. For normal site load I generally recommend not letting this go over 30-40%. If you are hitting 60-80% all the time, it may be better to have more processes.
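
To illustrate those rules of thumb as code, here is a minimal hypothetical helper you could call with the fields dictionary from the monitor loop. The function name and thresholds are my own illustrative choices, not anything built into mod_wsgi:

def check_utilisation(fields):
    # Hypothetical helper applying the rules of thumb above; the
    # thresholds are illustrative assumptions, not mod_wsgi values.
    warnings = []

    # CPU usage above 40% of a core; if sustained, consider more
    # processes.
    if fields['cpu_usage'] > 0.4:
        warnings.append('high cpu_usage: consider more processes')

    # Thread capacity mostly idle; consider fewer threads per process.
    if fields['request_busy_usage'] < 0.1:
        warnings.append('low request_busy_usage: consider fewer threads')

    # Every configured thread has been activated at some point, so peak
    # demand has reached capacity.
    if fields['request_threads'] >= fields['threads_per_process']:
        warnings.append('all threads activated: possible backlogging')

    return warnings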

If you can integrate this, adjusting the 1 second reporting period to be longer if need be, and report the results, we can look at them and explain anything or suggest things.
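
If you already have a statsd daemon collecting metrics, the print() call in the monitor loop could, for example, be swapped for something like the following. This is only a sketch, assuming the statsd package from PyPI and a collector on localhost:8125; adapt it to whatever system you actually use:

import statsd

# Sketch only: assumes 'pip install statsd' and a statsd daemon
# listening on localhost:8125.
statsd_client = statsd.StatsClient('localhost', 8125, prefix='modwsgi')

def report_metrics(fields):
    # Record each field as a gauge, i.e. the value for this period.
    for name, value in fields.items():
        statsd_client.gauge(name, value)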

Graham

On 31 Dec 2016, at 4:33 PM, Jason Garber <[email protected]> wrote:

Hi Graham, thank you very much for the very detailed email. It helps explain a lot of the issues to me.

I run a heavily used WSGI application for a rather large client. I would be more than happy to plug in metric gathering code so that we can analyze statistics over a period of time. If you have something specific in mind please let me know. I do not currently have any metric gathering in place. I think the application is configured with 6 processes and 15 threads per process.

But since we only use Apache for mod_wsgi requests, and let nginx buffer dynamic responses and serve all static requests, I don't think requests timing out has been a problem.

What brought this to mind recently was some back-end processes that I'm running within the WSGI daemon which process Amazon SQS messages. The volume that we were pushing through SQS was such that the background process was taking days to catch up. So I changed the code to have it spawn 40 threads to handle the SQS messages and was able to get caught up.

Please let me know how I can help.

Thanks!
Jason


On Dec 30, 2016 5:20 PM, "Graham Dumpleton" <[email protected]> wrote:

On 31 Dec 2016, at 6:37 AM, Jason Garber <[email protected]> wrote:

Hi Graham,

Can you comment on why one would use more processes or more threads in a database-intensive web application?

For example: 

6 processes
10 threads each

Or

1 process
60 threads

My current understanding would say that you could use more threads until you started to see a performance drop due to the GIL not allowing concurrent execution.  But I was wondering what you had to say.

Sort of.

The problem is knowing when the GIL is starting to impact things, and the measure of what constitutes a performance drop is hard to pin down. You could keep adding threads until you have way more than you need and still not see a performance drop, because the contention point could be somewhere that stops you getting enough requests into the process to make use of them anyway. I have seen this many times, where people allocate way too many threads and most of them are never used, sitting there idle just chewing up memory.

The design of mod_wsgi daemon mode is such that it will at least not activate threads unless the capacity is needed. This only applies to the Python side of things though; the threads are still allocated on the C side, and a momentary backlog could still see them activated on the Python side and used for a moment, then go back to not being in use. That is when memory use can start to blow out unnecessarily, and where fewer threads would have been better, living with the potential momentary backlog.

So, since Python doesn't provide a way of measuring GIL contention or giving any guidance, the question is what we can look at.

What I have been using to try and quantify GIL impact and tune processes/threads is the following.

* Thread capacity used - This is a measure of how much of the capacity of the configured threads is actually being used in a time period. It can tell you when you have more threads allocated than needed, so that some are wasted.

* Per request CPU usage - This is a measure of how much CPU was used in handling a request. If this comes in as a low figure, the request is likely I/O bound. If a high figure, then CPU bound. Unfortunately the granularity of this on some systems is not great, so if you always have very short response times, under 10ms, it may not give a completely accurate picture.

* Process wide CPU usage - This is a measure of how much CPU is used by the whole process in a period of time.

* Per request queue time - This is a measure of how much time a request spent in Apache before it was handled in a daemon process group by a WSGI application (see the sketch after this list). This helps in understanding whether backlogging is occurring due to a lack of capacity in the daemon processes, or to bottlenecks elsewhere.

* Rate of requests timed out - This is a measure of how many requests timed out before being handled by the daemon processes. This helps in understanding when the application got overloaded and requests started to be failed (where that option is enabled) to try and discard the backlog.
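
For the per request queue time, a rough sketch of the sort of thing I mean is below. It assumes a mod_wsgi version which adds a mod_wsgi.queue_start key to the WSGI environ as a string of integer microseconds since epoch; check what your installed version actually provides before relying on this:

import time

def queue_time(environ):
    # Assumption: 'mod_wsgi.queue_start' holds the time, in microseconds
    # since epoch, at which Apache queued the request for the daemon.
    queue_start = environ.get('mod_wsgi.queue_start')

    if not queue_start:
        return None

    # Seconds the request waited before the WSGI application saw it.
    return time.time() - int(queue_start) / 1000000.0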

With the exception of the last one, there are ways in mod_wsgi of getting this information. I could partly implement something to track the last one; it would not be accurate under severe backlogging, but it would still show the spike, which is enough. The case that can't easily be got at is backlogging in the Apache worker processes so excessive that the socket connection to the daemon processes fails. I only thought about a way to do something for that when I saw this email, so I may start playing with it.

An issue related to all this is process memory size. In Python there is a tendency for memory in use to grow up to a plateau as all possible request handlers are visited. This means you can have a lot of memory in use that potentially isn't touched very often. I have talked previously about vertically partitioning an application across multiple daemon process groups, so that different subsets of URLs are handled in their own processes. This allows you to separately tune processes/threads for each set of URLs, and for the more frequently visited URLs keeps that memory hot, with less memory just sitting there unused.
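
As a sketch of what that partitioning can look like in the Apache configuration (the group names and paths here are made up for illustration):

    WSGIDaemonProcess main processes=5 threads=15
    WSGIDaemonProcess admin processes=1 threads=5

    WSGIScriptAlias /admin /some/path/app.wsgi process-group=admin
    WSGIScriptAlias / /some/path/app.wsgi process-group=main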

Another crude way of handling growing memory use, and the extra overhead of too much memory paging, is to restart the daemon processes every so often. Right now this can only be done based on a request count or on inactivity though. Not for the first time, I have been looking lately at a way of simply restarting processes on a regular time interval to keep memory usage down. The danger, which isn't simple to solve, is avoiding restarting multiple processes at the same time and causing load spikes. Using graceful timeouts may though be enough to spread the restarts out, as they would occur at effectively random times, whenever processes are not handling requests.
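
For example, with the existing options, something like the following (the numbers are only illustrative) recycles each process after it has handled a set number of requests, with the graceful timeout deferring the restart until a quiet moment, or until the timeout expires:

    mod_wsgi-express start-server example.py --processes 6 --threads 15 \
        --maximum-requests 10000 --graceful-timeout 300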

Anyway, there are metrics one can use to measure things to try and understand what is going on.

Do you have any monitoring system in place into which metrics can be injected? If you do, I can explain how you can get the information out and we can start looking at it. I would really love to get hold of a data dump of these metrics over a period of time for a real application, so I can try and do some data analysis of it in a Jupyter Notebook and develop some code which could be used to give tuning guidance once you have the data. One can't get decent data for this using test applications and fake data; it needs a real application with a decent amount of traffic, and I don't have one.
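
If you don't have a monitoring system, even just dumping each sample as a line of JSON to a file would be enough for later analysis in a notebook. A minimal sketch, again replacing the print() call in the monitor loop (the file path is only an example):

import json

def log_metrics(fields, path='/tmp/modwsgi-metrics.log'):
    # Append each metrics sample as one JSON object per line, suitable
    # for later loading with, e.g., pandas.read_json(path, lines=True).
    with open(path, 'a') as fp:
        fp.write(json.dumps(fields) + '\n')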

Graham



