Overall I think you are being a bit unrealistic about the number of
sites you can host on the one system. Although you may well be able to
manage 600+ sites on the one host with PHP, the same cannot be said
for Python, or at least not when using a distinct process for each
site and a fat Python web framework like Django.

In the case of PHP, the whole application/site is effectively thrown
away between requests. Because PHP works this way and has to load the
site fresh each time, it is optimised for quick startup. This is not
the case with Python, which is a general-purpose programming language
being used for web development rather than a purpose-built web
development language and environment. Instead, Python web
frameworks/applications rely on remaining resident in memory between
requests rather than being thrown out on each request, or even
necessarily on a periodic basis or when they become idle.

The only reason some of the daemon mode features exist is to cope
with memory constrained systems running a single site or a minimal
number of sites. The mod_wsgi project was never intended for mass
hosting, and nor are those features intended to help fit large numbers
of sites onto a single host.

To even attempt to host large numbers of site instances on one host
using Python, you would need to be using a very lightweight Python web
framework, and Django probably isn't a good choice, because its memory
overhead and startup cost are too great. If using a generic framework
as a starting point, you would have been better off starting with
something like Flask (flask.pocoo.org). Alternatively, develop a way
of allowing multiple customer sites to be hosted through the one
Django instance. That does mean multiple users' sites running as the
same user though.
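If you went the single-instance route, the idea can be sketched as a
WSGI middleware that routes on the Host header to per-customer
applications. This is only a minimal illustration, not something from
mod_wsgi or Django itself; the class and mapping names are made up.

```python
# Hypothetical sketch: route requests for many customer sites through
# one process by dispatching on the Host header. Each value in the
# mapping is an ordinary WSGI application callable.

class HostDispatcher:
    def __init__(self, apps, default=None):
        # apps: mapping of hostname -> WSGI application callable
        self.apps = apps
        self.default = default

    def __call__(self, environ, start_response):
        # Strip any port suffix and normalise case before lookup.
        host = environ.get('HTTP_HOST', '').split(':')[0].lower()
        app = self.apps.get(host, self.default)
        if app is None:
            start_response('404 Not Found',
                           [('Content-Type', 'text/plain')])
            return [b'Unknown site\n']
        return app(environ, start_response)
```

You would then point WSGIScriptAlias at a script exposing one
HostDispatcher instance as `application`, so all sites share the one
daemon process, at the cost of shared user permissions.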

That said, I'll try to make a few comments based on your
configuration, but it is going to be hard to make things much better
simply because it sounds like you are trying to do too much with too
few resources.

On 30 August 2010 20:05, virgil.balibanu <[email protected]> wrote:
> I have a server where user can connect to and create their own
> websites automatically. For this I use mod_wsgi in daemon mode with
> predefined users for processes. Each site runs in its own process.
> The configuration file looks like this:
> WSGIDaemonProcess site-1 user=site-1 group=nobody maximum-requests=300
> inactivity-timeout=300
> WSGIDaemonProcess site-2 user=site-2 group=nobody maximum-requests=300
> inactivity-timeout=300
> WSGIDaemonProcess site-3 user=site-3 group=nobody maximum-requests=300
> inactivity-timeout=300

Using the maximum-requests option is a bad idea in any production
system, because it causes process restarts at potentially the worst
times. This is especially the case when the request threshold is as
low as yours: if you get a spike in requests, churning processes based
on request count only causes them to be restarted more often at
exactly the moment you want them to stay persistent.

These process restarts are bad in themselves because the startup cost
of reloading the process causes a load spike when you least need one,
slowing down the ability of the server to handle requests. A further
problem is that if long running requests are active in the process
when the maximum number of requests is reached, mod_wsgi will wait for
them to complete, or until the shutdown timeout occurs. If you only
have a single process in the daemon process group, as you do, request
processing will pause until those requests complete and the process
has been shut down, restarted and the application reloaded. As such,
if maximum-requests must be used at all, it should be used with two or
more processes in the process group. That way, while one process is
restarting, there is a good chance the other can still handle requests
and there is no noticeable pause.
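As a sketch of what that would look like (site names here are just
placeholders following your own naming scheme, and processes=2 will
roughly double per-site memory use, which matters at your scale):

    WSGIDaemonProcess site-1 user=site-1 group=nobody \
        processes=2 maximum-requests=300 inactivity-timeout=300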

So, maximum-requests is fine while developing a system, as a way of
combating memory/resource leaks in an application, but in production
you should fix such leaks and avoid the maximum-requests option
altogether.

> ... (with a lot of lines like these ones)
> WSGIProcessGroup %{ENV:PROCESS_GROUP}
> WSGIApplicationGroup %{GLOBAL}
> I also use nginx as a front-end server for serving static files and I
> proxy other requests to apache.
> My problem is that whenever a process has been closed due to
> inactivity and it needs to start again it would take around 5s if it
> is the only one starting then, and can get to a lot more if I have 10
> or 20 processes starting at the same time. Also this time can go over
> my limits set in nginx configuration for timeout and I could get a 504
> response from nginx. I need to use inactivity-timeout because I have
> around 600 processes at the moment and the number is increasing and
> each process uses around 20 to 25 MB of memory, so I would run out of
> memory and get to use the swap if I did not use use it. I also use
> maximum-requests in order to keep the process memory lower.
> I am looking for any idea that would help me start a process faster,
> or improve memory consumption. From what I understand this is because
> each process has to load python and django thus the memory and time. I
> there any way to improve this? How does WSGIProcessGroup and
> ApplicationGroup help, are there better settings?
> Also I would like to know if there are any apache options that I need
> to play with?
> RLimitMEM 205024597
> RLimitCPU 240

These don't affect mod_wsgi.

In mod_wsgi 4.0 there is a cpu-time-limit option for daemon processes.
There are also memory-limit and virtual-memory-limit options. The
latter two will not work on some systems, as those systems don't
implement such memory limits. Which of the two is appropriate also
differs between systems, so you will need to run some tests to work
out which, if either, works for you.
See:

http://code.google.com/p/modwsgi/wiki/ChangesInVersion0400

for details of new options.

Also look at the graceful-timeout option.

Just be aware that I don't know to what degree these have been tested
or used by anyone.
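For illustration only, and assuming your platform actually enforces
the limits, such a daemon process definition might look like the
following (the specific values are arbitrary; cpu-time-limit and
graceful-timeout are in seconds, and the memory limit in bytes):

    WSGIDaemonProcess site-1 user=site-1 group=nobody \
        cpu-time-limit=120 memory-limit=52428800 graceful-timeout=15

As above though, treat this as untested; verify the behaviour on your
own system before relying on it.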

> KeepAlive On
> MaxKeepAliveRequests 100
> KeepAliveTimeout 15

Keep alive is irrelevant here, because nginx only uses HTTP/1.0
without keepalive when proxying. You may as well turn KeepAlive Off.

> Do I need to change add anything else, like # MaxSpareServers,
> MinSpareServers, StartServers

You should be using the worker MPM, and what the MPM settings should
be will depend on your expected number of concurrent requests. I can't
say what you should set them to, as only you have the information
needed to work that out.
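Purely as an example of the shape of such a block (every value below
is illustrative, not a recommendation; tune them against your own
measured concurrency):

    <IfModule mpm_worker_module>
        StartServers          2
        ServerLimit           4
        ThreadsPerChild      64
        MaxClients          256
        MaxRequestsPerChild   0
    </IfModule>

Note MaxRequestsPerChild of 0 here, for the same reasons given above
about avoiding request-count based process churn.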

Graham
