Sorry for the slow reply. Been quite busy trying to finish off some stuff before a holiday.

> On 28 Mar 2019, at 7:17 pm, Stéphane Poss <stephp...@gmail.com> wrote:
>
> Hi,
> I am unable to find the correct doc about the problem I have and I hope
> you'll forgive my naivety. I'm running a django app using mod_wsgi with
> mpm_event.
>
> The env is: Server Version: Apache/2.4.25 (Debian) OpenSSL/1.0.2r
> mod_wsgi/4.6.4 Python/3.5
>
> Here are the relevant bits of the Apache VirtualHost conf:
>
> WSGIScriptAlias /
> /opt/virtualenvs/server/lib/python3.5/site-packages/server/wsgi.py
> WSGIDaemonProcess server listen-backlog=200 processes=16
> threads=30 display-name=%{GROUP}
> python-path=/opt/virtualenvs/server/lib/python3.5/site-packages
> restart-interval=3600 graceful-timeout=10

Initial impressions are:

1. Threads per daemon process is way too high. Unless your application spends most of its time waiting on I/O, I would recommend only 3-5 threads per process and relying on multiple processes instead. Having a high number of threads can mean thread resources go to waste, or requests can pile onto one process and overwhelm it because of global interpreter lock issues.

2. Increasing the listener backlog to 200 on the daemon process is probably not a good idea. Even the default of about 100 is usually way too many. Having a backlog there is a bad idea because if your system gets overloaded, you have a backlog of requests which you still end up handling, yet the user has probably gone away if the delay was significant. Better to reject requests with an error using queue-timeout than to put your application in a permanently overloaded state where it never seems to catch up.

3. You should not use python-path to refer to a site-packages directory. Use python-home to refer to the root of the virtual environment instead.

> WSGIApplicationGroup %{GLOBAL}
> WSGIProcessGroup server
>
> I have another Virtual host listening on port 443 with config:
>
> WSGIScriptAlias /
> /opt/virtualenvs/server/lib/python3.5/site-packages/tile_server/wsgi.py
> WSGIProcessGroup server
>
>
> the following is the mpm_event config:
>
> ServerLimit 32
> StartServers 3
> MinSpareThreads 75
> MaxSpareThreads 150
> ThreadLimit 64
> ThreadsPerChild 40
> MaxRequestWorkers 500
> MaxConnectionsPerChild 0

These settings are out of whack. Some general rules for setting these to avoid strange behaviour:

1. MaxRequestWorkers should be an integer multiple of ThreadsPerChild.

2. MaxRequestWorkers would normally be ThreadsPerChild * ServerLimit.

3. MinSpareThreads should be a multiple of ThreadsPerChild.

4. MaxSpareThreads should be a multiple of ThreadsPerChild.

For (3) and (4), if they aren't a multiple of ThreadsPerChild, you can trigger strange behaviour that might cause Apache to keep starting and stopping child processes.

A better config might be:

    ServerLimit 32
    StartServers 3
    MinSpareThreads 75
    MaxSpareThreads 150
    ThreadLimit 64
    ThreadsPerChild 25
    MaxRequestWorkers 800
    MaxConnectionsPerChild 0

> I'm having a hard time finding the relationship between the 'processes' and
> 'threads' of the WSGIDaemonProcess and the StartServers, ThreadsPerChild and
> MaxRequestWorkers of the mpm_event config. I have checked some of the videos
> I found on other threads (very interesting!) but was not able to find the
> means to understand how to configure the 2 together.

The MPM settings above only relate to the Apache child worker processes. These handle static file requests. For the WSGI application, all they do is act as a proxy for those requests.

So MaxRequestWorkers should be greater than processes*threads of the daemon process group, otherwise the Apache child processes would never have enough capacity to proxy requests up to the capacity of the daemon process group.

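To make the rules of thumb and the processes*threads relationship concrete, here is a rough Python sketch that checks a set of values against them. It is not part of Apache or mod_wsgi; the class and function names are made up for illustration, and it only encodes the rules as stated in this message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MpmSettings:
    """Subset of the mpm_event directives discussed above (hypothetical helper)."""
    server_limit: int
    threads_per_child: int
    min_spare_threads: int
    max_spare_threads: int
    max_request_workers: int


def mpm_warnings(mpm: MpmSettings) -> List[str]:
    """Return one message per rule of thumb (1-4) that the settings break."""
    warnings = []
    tpc = mpm.threads_per_child
    if mpm.max_request_workers % tpc != 0:
        warnings.append("MaxRequestWorkers is not a multiple of ThreadsPerChild")
    if mpm.max_request_workers != tpc * mpm.server_limit:
        warnings.append("MaxRequestWorkers != ThreadsPerChild * ServerLimit")
    if mpm.min_spare_threads % tpc != 0:
        warnings.append("MinSpareThreads is not a multiple of ThreadsPerChild")
    if mpm.max_spare_threads % tpc != 0:
        warnings.append("MaxSpareThreads is not a multiple of ThreadsPerChild")
    return warnings


def can_proxy_full_daemon_capacity(mpm: MpmSettings, processes: int, threads: int) -> bool:
    """True if the Apache workers outnumber the daemon group's request threads."""
    return mpm.max_request_workers > processes * threads


# Values taken from the posted config and from the suggested replacement.
posted = MpmSettings(32, 40, 75, 150, 500)
suggested = MpmSettings(32, 25, 75, 150, 800)

for name, mpm in (("posted", posted), ("suggested", suggested)):
    print(name, mpm_warnings(mpm))

print(can_proxy_full_daemon_capacity(posted, processes=16, threads=30))
print(can_proxy_full_daemon_capacity(suggested, processes=16, threads=5))
```

With the posted values all four rules are flagged, while the suggested values pass cleanly. Both configurations leave the Apache workers above the daemon group's processes*threads, so in this case it is the multiples, not the proxy capacity, that need fixing.
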
You would add a bit of extra capacity in the MPM settings, over what the daemon process group can handle, so Apache has capacity to handle static requests and queued up WSGI application requests.

What you set the MPM settings and daemon process group settings to depends on request throughput, and whether the WSGI application is CPU or I/O bound.

You are never going to be able to tune these properly if you don't have a way of monitoring throughput, request times, and performance of the server.

Bumping up threads in daemon process groups because you think you need to handle a huge number of concurrent requests is unnecessary and, more often than not, will just make things worse.

> My issue is that I seem not to have a high CPU usage on the host (it's a VM),
> using cached data, while not being able to serve more than 60-70 requests per
> second. I'm wondering why kind of knob I should turn to improve the server's
> usage and thus the request rate.
> Another issue I discovered this morning is the following:
>
> [Thu Mar 28 08:50:53.568321 2019] [wsgi:error] [pid 21253:tid
> 140125284001536] (2)No such file or directory: [client
> 2a02:121f:21b:0:c6cd:4394:566a:12ea:33664] mod_wsgi (pid=21253): Unable to
> connect to WSGI daemon process 'tile-elevation' on
> '/var/run/apache2/wsgi.15888.0.1.sock' as user with uid=33.
>
> Looks like the socket was rotated, but I cannot see why...

This is usually because the operating system's log rotation is forcing a restart of Apache once a day so it can rotate log files, instead of letting Apache rotate the log files itself.

You can end up seeing this error when a client was using a keep alive connection, the connection survived because a graceful restart was used, but the daemon process had been restarted.

You can avoid this problem by setting the directive:

    WSGISocketRotation Off

> Thanks in advance for the assitance, and thanks for the great tool!

Beyond that, it is hard to suggest what you should use without you having instrumentation for your WSGI application and mod_wsgi so you can monitor throughput, request times, capacity utilisation etc.

Do you have any monitoring in place? Have you ruled out your Python application code or backend databases as the bottleneck?

I would at least suggest using:

    processes=16 threads=5

and see what happens. This will eliminate potential issues with the Python GIL and piling up of requests in one process.

If however you are trying to test the setup by using a benchmarking tool at maximum throughput, you are always going to get silly results. You should never test a server setup in an overloaded state, as it tells you nothing about how to tune it and more often than not just triggers pathological conditions in the server setup. You want to aim to test at 40-60% capacity, and set systems up so they are always running at around that level for typical traffic volumes, scaling horizontally when you need more capacity.

Finally, I will mention again the importance of monitoring. If you want to do this properly, you need it.

>
> Cheers!
>
> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to modwsgi+unsubscr...@googlegroups.com
> <mailto:modwsgi+unsubscr...@googlegroups.com>.
> To post to this group, send email to modwsgi@googlegroups.com
> <mailto:modwsgi@googlegroups.com>.
> Visit this group at https://groups.google.com/group/modwsgi
> <https://groups.google.com/group/modwsgi>.
> For more options, visit https://groups.google.com/d/optout
> <https://groups.google.com/d/optout>.