On 07/03/2008, Manlio Perillo <[EMAIL PROTECTED]> wrote:

> Is it true that Apache can spawn additional processes,
Yes, for the prefork and worker MPMs, but not winnt on Windows. See for example the details for the worker MPM in:

  http://httpd.apache.org/docs/2.2/mod/worker.html

> By the way, I know there is an event based worker in Apache.
> Have you experience with it?

No, I haven't used it. It isn't an event driven system as you might know it; it still uses threads like the worker MPM. The difference, as I understand it, is that it dedicates a single thread to managing client socket connections held open due to keep alive, rather than tying up a whole thread for each such connection. So it is just an improvement over worker and does not implement a full event driven system.

> > No matter what technology one uses there will be such trade offs and
> > they will vary depending on what you are doing. Thus it is going to be
> > very rare that one technology is always the "right" technology. Also,
> > as much as people like to focus on raw performance of the web server
> > for hosting Python web applications, in general the actual performance
> > matters very little in the greater scheme of things (unless you're
> > stupid enough to use CGI). This is because that isn't where the
> > bottlenecks are generally going to be. Thus, that one hosting solution
> > may be three times faster than another for a hello world program
> > means absolutely nothing if that ends up translating to less than 1
> > percent throughput when someone loads and runs their mega Python
> > application. This is especially the case when the volume of traffic
> > the application receives never goes anywhere near fully utilising the
> > actual resources available. For large systems, you would never even
> > depend on one machine anyway and would load balance across a cluster.
> > Thus the focus by many on raw speed in many cases is just plain
> > ridiculous as there is a lot more to it than that.
>
> There is not only the problem of raw speed.
> There is also the problem of server resource usage.
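On the resource usage point, the worker MPM's footprint is mostly governed by a handful of directives. As a minimal sketch, mirroring the example configuration shown on the documentation page linked above (the values are purely illustrative), each child process is multi-threaded and Apache spawns or reaps whole children to keep the number of spare threads within the configured band:

```apache
<IfModule mpm_worker_module>
    # Initial number of child processes.
    StartServers          2
    # Upper limit on simultaneous requests (total worker threads).
    MaxClients          150
    # Spawn another child when idle threads drop below this...
    MinSpareThreads      25
    # ...and cull children when idle threads exceed this.
    MaxSpareThreads      75
    # Threads created by each child process.
    ThreadsPerChild      25
    # 0 = never recycle a child based on request count.
    MaxRequestsPerChild   0
</IfModule>
```

Note that with keep alive enabled, each held-open connection pins one of those threads for the duration, which is exactly the waste the event MPM's dedicated connection-management thread is meant to relieve.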
> As an example, an Italian hosting company poses strict limits on
> resource usage for each client.

As would any sane web hosting company.

> They do not use Apache, since they fear that serving embedded
> applications limits their control

If they believe that embedded solutions like mod_python are the only things available for Apache, then I can understand that. There are other solutions though, such as fastcgi and mod_wsgi daemon mode, so it isn't necessarily as unmanageable as they may believe. Perhaps they just don't know what options are available, or don't understand the technology well or how to manage it. I do admit though that it is harder when it isn't your own application and you are hosting stuff written by a third party.

> Using Nginx + the wsgi module has the benefit of requiring less system
> resources than flup (as an example) and, probably, Apache.

Memory usage is also relative, just like network performance. Configure Apache correctly and don't load modules you don't need, and the base overhead of Apache can be reduced quite a lot. For a big system heavy on media, using a separate media server such as nginx or lighttpd can be sensible. One can then turn off keep alive on Apache for the dynamic Python web application, since keep alive doesn't necessarily help there and will cause the sorts of issues the event MPM attempts to solve. So, it's manageable and there are known steps one can take.

The real memory usage comes when someone loads up a Python web application which requires 80-100MB per process at the outset, before much has even happened. Just because you are using another web hosting solution, be it nginx or even a Python based web server, will not change the fact that the Python web application is chewing up that much memory.

The one area where memory usage can be a problem with Python web applications, and which is not necessarily well understood by most people, is the risk of concurrent requests causing a sudden burst in memory usage.
Imagine a specific URL which needs a large amount of transient memory, for example something which generates PDFs using reportlab and PIL. All is okay if the URL only gets hit by one request at a time, but if multiple requests hit at the same time, then your memory usage blows out considerably, as each request needs the large amount of transient memory at the same time, and once allocated it will be retained by the process. So, if one were using the worker MPM to keep down the number of overall processes, and thus memory usage, you run the risk of this sort of problem occurring.

One could stop it occurring by implementing throttling in the application, that is, putting locking on specific URLs which consume lots of transient memory so as to restrict the number of concurrent requests, but frankly I have never heard of anyone actually doing it.

The alternative is to use the prefork MPM, or a similar model, such that there can only be one active request in a process at a time. But then you need more processes to handle the same number of requests, so overall memory usage is high again. For large sites, however, which can afford lots of memory, prefork would be the better way to go, as it will at least limit the possibility of individual processes spiking memory usage unexpectedly, with memory usage being more predictable.

That all said, just because you aren't using threads and are handling concurrency with an event driven approach will not necessarily isolate you from this specific problem.

All in all it can be a tough problem. If your web application's demands are relatively simple then it may never be an issue, but people are trying to do more and more within the web application itself, rather than delegating it to separate back end systems or programs. At the same time they want to use cheap, memory constrained VPS systems. So, lots of fun.
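As a sketch of the throttling idea above: a small WSGI middleware that caps how many requests may concurrently execute against URLs known to need large amounts of transient memory. The class name, URL prefixes and limit here are illustrative assumptions, not part of any existing package.

```python
import threading

class ThrottleMiddleware:
    """Limit concurrent requests to memory-hungry URL prefixes."""

    def __init__(self, application, prefixes, max_concurrent=1):
        self.application = application
        self.prefixes = tuple(prefixes)
        # Each throttled request must acquire a slot before running.
        self.semaphore = threading.BoundedSemaphore(max_concurrent)

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '')
        if not path.startswith(self.prefixes):
            # Normal URLs pass straight through, unthrottled.
            return self.application(environ, start_response)
        with self.semaphore:
            # Consume the response inside the lock, so the transient
            # memory used while generating it is also covered.
            return list(self.application(environ, start_response))


def demo_app(environ, start_response):
    # Stand-in for the expensive PDF-generating view.
    start_response('200 OK', [('Content-Type', 'application/pdf')])
    return [b'%PDF-1.4 ...']

# Only one request at a time may enter URLs under /reports.
application = ThrottleMiddleware(demo_app, ['/reports'], max_concurrent=1)
```

Other requests to the throttled URLs simply queue on the semaphore rather than all allocating their transient memory at once, trading a little latency on that URL for a bounded peak footprint in the process.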
:-)

Graham

_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com