You can add this to the end of your wsgi.py file.
import time
import threading
import os

import mod_wsgi

last_metrics = None

def monitor(*args):
    global last_metrics

    while True:
        current_metrics = mod_wsgi.process_metrics()

        if last_metrics is not None:
            cpu_user_time = (current_metrics['cpu_user_time'] -
                    last_metrics['cpu_user_time'])
            cpu_system_time = (current_metrics['cpu_system_time'] -
                    last_metrics['cpu_system_time'])
            request_busy_time = (current_metrics['request_busy_time'] -
                    last_metrics['request_busy_time'])

            request_threads = current_metrics['request_threads']

            # report data

            timestamp = int(current_metrics['current_time'] * 1000000000)

            item = {}

            item['time'] = timestamp
            item['measurement'] = 'process'
            item['process_group'] = mod_wsgi.process_group
            item['process_id'] = os.getpid()

            fields = {}

            fields['cpu_user_time'] = cpu_user_time
            fields['cpu_system_time'] = cpu_system_time
            fields['request_busy_time'] = request_busy_time
            fields['request_busy_usage'] = (request_busy_time /
                    mod_wsgi.threads_per_process)
            fields['threads_per_process'] = mod_wsgi.threads_per_process
            fields['request_threads'] = request_threads

            item['fields'] = fields

            print(item)

        last_metrics = current_metrics

        current_time = current_metrics['current_time']
        delay = max(0, (current_time + 1.0) - time.time())

        time.sleep(delay)

thread = threading.Thread(target=monitor)
thread.setDaemon(True)
thread.start()
This will send a message to the log every second for each process.
Change it as necessary to write to a separate file in /tmp so you can go back
and look at it.
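If you do want to dump it to a file under /tmp instead of the error log, a
minimal sketch of that (the file name and the use of the json module here are
just my own assumptions for illustration) would be to add a small helper and
call it in place of print(item) in the monitor loop:

import json

def report(item):
    # Append each sample as a line of JSON to a per-process file under /tmp.
    # (os is already imported at the top of the snippet above.)
    filename = '/tmp/mod_wsgi-metrics-%d.log' % os.getpid()
    with open(filename, 'a') as fp:
        fp.write(json.dumps(item) + '\n')

Using the process ID in the file name keeps one file per daemon process, so
samples from different processes don't get interleaved.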
An example of what you will see in the error log is:
[Thu Sep 21 22:15:46.232094 2017] [wsgi:error] [pid 14468] {'time':
1506024946231937792, 'measurement': 'process', 'process_group':
'localhost:8000', 'process_id': 14468, 'fields': {'cpu_user_time': 0.0,
'cpu_system_time': 0.0, 'request_busy_time': 0.0, 'request_busy_usage': 0.0,
'threads_per_process': 5, 'request_threads': 4}}
The 'threads_per_process' field is how many threads are defined for the process.
The 'request_threads' field is how many of those threads have been used.
The important one to look at is 'request_busy_usage', which shows what
percentage of the thread pool capacity was utilised by the process over the
last period. If this is a low percentage, then you likely have way more threads
defined for the process than needed. If it is a very high percentage, then you
are using up a lot of the capacity and risk exhausting it, with requests
starting to backlog.
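As a rough worked example (the numbers are made up for illustration): with
threads_per_process of 5 and the one second reporting interval used above, the
thread pool capacity for a period is 5 thread-seconds. If the request_busy_time
delta for that period came out at 2.5, then request_busy_usage would be
2.5 / 5 = 0.5, i.e. the process was on average using about half of its thread
pool capacity during that second.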
Graham
> On 21 Sep 2017, at 4:35 PM, Samuel Bancal <[email protected]> wrote:
>
> Hi Graham,
>
> Thanks for this advice. I've switched the Apache settings so that /refresh URLs
> are handled by a separate daemon process group.
> Here is a summary of the setup:
>
> WSGIDaemonProcess main processes=1 threads=15
> python-home=/django_app/venv/django_py2 python-path=/django_app/fileshare
> display-name=%{GROUP}
> WSGIDaemonProcess refresh processes=3 threads=50
> python-home=/django_app/venv/django_py2 python-path=/django_app/fileshare
> display-name=%{GROUP}
> WSGIScriptAlias / /django_app/fileshare/fileshare/wsgi.py
> WSGIApplicationGroup %{GLOBAL}
> WSGIProcessGroup main
>
> <Location /refresh>
> WSGIProcessGroup refresh
> </Location>
>
> In your answer, you mentioned "metrics" ... I could find pointers to the New
> Relic solution, but I wonder if we can get such values from the CLI on the
> server?
>
> Regards,
> Samuel
>
>
>
> On Thursday, 21 September 2017 07:45:01 UTC+2, Graham Dumpleton wrote:
>
>> On 20 Sep 2017, at 9:24 AM, Samuel Bancal <[email protected]> wrote:
>>
>> Hi,
>>
>> We've experienced some high latency with a Django project using mod-wsgi in
>> daemon mode. That web service does folder management ... and has a /refresh
>> POST URL that will keep the request hanging until a) a change is made in the
>> folder or b) 60 seconds have passed.
>>
>> With this, having a few clients ... already used up the 15 threads available,
>> and any new requests were waiting for a /refresh to time out before being
>> processed.
>>
>> I guess such a /refresh method is not today's best practice ... and the dev
>> (me) would be better off looking into XHR calls (which I don't know yet).
>>
>> With this, I have 3 questions :
>>
>> + Would XHR calls avoid consuming threads the way POST requests do?
>> + How to best size today's solution? Since all /refresh requests spend their
>> time doing some DB requests followed by `time.sleep(1)` ... can I grow the
>> number of threads to 256? Or shall I split it into something like 8 processes
>> of 32 threads? (Or even above?)
>> + Before that, we used mod-wsgi in embedded mode ... which never gave any
>> latency we could measure. Would it be, for such a situation, better to switch
>> back to embedded mode?
>
> The reason you wouldn't have seen an issue in embedded mode is that the limit
> on the maximum number of clients supported by the Apache child worker processes
> would be much higher, and Apache would spawn new worker processes as needed,
> up to that limit, if the number of concurrent requests grew.
>
> The risk in using embedded mode is the amount of memory used by Apache and
> some performance issues around how Apache dynamically adjusts the number of
> processes. The overhead of always starting up new processes with a full
> Python interpreter in it and reloading your code is expensive. Unfortunately
> the Apache algorithm suffers when doing this with heavyweight Python
> applications.
>
> Switching to XHR calls will not help unless you are also re-designing things
> to periodically poll instead of wait, something which could be done without
> using XHR anyway. Depending on number of requests, polling can cause other
> issues as number of overall requests increases.
>
> As to growing the number of threads, what you need to be careful of here is
> that doing so in a process which can also handle CPU intensive requests can
> start to bog things down due to the Python global interpreter lock (GIL).
>
> If these requests are always waiting for the database and then sleeping, and
> not doing CPU intensive tasks, the best way to handle it would be to create two
> separate daemon process groups. In the new daemon process group you could set
> the number of threads a lot higher in order to handle the number of
> concurrent requests. Then in the Apache configuration, you would delegate
> only URLs related to these requests which need to wait a long time to that
> daemon process group.
>
> I call this vertically partitioning the URL namespace, the separation
> allowing you to tailor each daemon process group to the specific type of work
> being done for the requests delegated to that daemon process group.
>
> I have blogged about this concept in:
>
> * http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html
>
> There are some examples in the post about it.
>
> Still be careful about bumping the number of threads per process too high; it
> is better to also use multiple processes, spreading the threads across them.
> There are some metrics you can get out of mod_wsgi if need be to help size the
> thread pool so you aren't allocating way more than you need.
>
> The other way is to delegate the long running requests to run back in embedded
> mode and so, for those at least, allow them to make use of the dynamic
> process/thread management of Apache.
>
> If you were doing that, you would need to look closely at the Apache MPM
> settings and adjust them to avoid the problems with process churn Apache
> experiences. I have talked about that problem in:
>
> * https://www.youtube.com/watch?v=k6Erh7oHvns
>
> To combat memory overhead, you might perhaps want to use a more lightweight
> WSGI application for handling those specific requests. Pulling in a large
> Django application wouldn't necessarily be a good idea depending on how well
> it lazily loads code, or whether it is pre-loading request handlers on startup.
>
> I'll let you digest that and if you have questions let me know.
>
> Graham
>