Hi Kristina,
On Dec 3, 2004, at 1:58 PM, Kristina Clair wrote:
> I'm not sure if this is really the right list for this, but I got no response from the "users" list, so I'm wondering if some of the developers might be able to shed some light. Sorry if it's inappropriate to mail this list with these questions.
Yeah, it's not the right forum, but no big deal.
<...>
> I'm running apache2 on redhat 7.3 servers with very heavy http traffic. I'm wondering if it might be wise for me to try to use the worker MPM rather than the default prefork. Does anyone have any real-world experience with this? Our users are allowed to run any CGI scripts they want -- could this cause a potential problem with worker?
On Linux, especially in older versions before kernel 2.6, the kernel backs each thread with its own process data structure, so switching from a forked model to a threaded model doesn't make much of a difference performance-wise. You may, however, see a difference in semaphore load (see below).
In fact, you may see some problems running CGIs with Worker on a heavily loaded server. As someone else remarked, forking CGIs from a threaded server poses logistical problems, so Worker uses a construct called cgid: a separate daemon process that receives CGI requests from the worker threads, takes care of forking the CGIs, and communicates the results back. This communication happens over a Unix domain socket.
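If you want to see where it hooks in: with Worker, it's mod_cgid rather than mod_cgi that services your ScriptAlias'd requests, and the ScriptSock directive sets where that Unix domain socket lives. A minimal httpd.conf excerpt (paths are just examples):

    # Worker builds use mod_cgid instead of mod_cgi
    LoadModule cgid_module modules/mod_cgid.so
    # Unix domain socket the worker threads use to reach the cgid daemon
    ScriptSock logs/cgisock
    ScriptAlias /cgi-bin/ "/usr/local/apache2/cgi-bin/"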
When I used to run benchmarks for a living, I frequently saw the cgid queue back up when using worker under high load. When this happens, the worker threads try a few times to contact cgid, then give up and send a 500 Internal Server Error response back to the client. Not the result you want.
A way around this would be to proxy CGI requests to a different instance of Apache, running Prefork. This would be more tunable than cgid, which (I think, from browsing the source code) has just one process servicing CGI requests from all workers, with a listen backlog of only 100.
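Something along these lines in the Worker instance's httpd.conf would do it, assuming the Prefork instance listens on port 8081 on the same box (an untested sketch; adjust to taste):

    # Worker instance: hand all CGI requests to the Prefork instance
    LoadModule proxy_module modules/mod_proxy.so
    LoadModule proxy_http_module modules/mod_proxy_http.so
    ProxyPass        /cgi-bin/ http://localhost:8081/cgi-bin/
    ProxyPassReverse /cgi-bin/ http://localhost:8081/cgi-bin/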
> Also, I recently encountered a problem with the mutex device - I'm using the default AcceptMutex setting, which is sysvsem, but it is leaving semaphores around when httpd is restarted. It seems like the two main recommendations for fixing this are to either increase the max number of semaphores in the kernel or use AcceptMutex fcntl instead. Again, has anyone had any experience with this? It's probably also important to note that the files that apache serves are nfs - does this give me a disadvantage if I try to use fcntl? And if I increase the max semaphores in the kernel - isn't it just a matter of time before they all get used up too?
Apache should clean up after itself. If you're seeing dangling semaphores in the ipcs output, you probably have crashes going on. When Apache exits cleanly, it removes its semaphore(s).
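You can check for (and clear out) leftovers by hand; something like this, assuming httpd runs as user apache:

    # list SysV semaphore sets and their owners
    ipcs -s
    # remove a leftover set by the id shown in the ipcs output
    ipcrm sem <semid>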
Apache doesn't use all that many semaphores. It uses one for the accept lock (AcceptMutex), one to serialize access to the SSL session store (if you're running SSL) and that's about it. In fact, if you have only one listener (one Listen statement in httpd.conf), it will not even use an AcceptMutex on many platforms. I don't remember if this is the case with Red Hat 7.3: run httpd -V and if you have "-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT" in the output, you'll run without AcceptMutex if you have only a single listener.
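Quick check:

    httpd -V | grep SINGLE_LISTEN
    # prints "-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT" if your build skips
    # the AcceptMutex for single-listener configurations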
If you're running with sysvsem, you probably applied some kernel tunings w.r.t. the semaphore resources. As I said, Apache doesn't use all that many semaphores. What you're running out of ("No space left on device") is undo structures. The SysV semaphore locking code in APR allocates an undo structure every time it tries to lock. This is a little extra work, but it ensures that the semaphore can be unlocked even if a process crashes while holding the lock. Now, the AcceptMutex works by having each and every httpd child try to lock the semaphore. Only one can succeed; the others camp out on the lock. This means every child (all hundreds of them) allocates an undo structure for that mutex, so your kernel tunings must be able to accommodate this.
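On Linux the SysV semaphore limits live in /proc/sys/kernel/sem, four numbers: SEMMSL (max semaphores per set), SEMMNS (system-wide max semaphores), SEMOPM (max operations per semop call) and SEMMNI (max number of sets). For example (the values below are illustrative, not a recommendation):

    # current limits: SEMMSL SEMMNS SEMOPM SEMMNI
    cat /proc/sys/kernel/sem
    # raise them (put the equivalent in /etc/sysctl.conf to survive reboots)
    echo "250 32000 100 128" > /proc/sys/kernel/sem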
When you're running Worker, you'll have one listener thread per child process, which hands the connections it accepts off to a pool of worker threads. This means there will be far fewer contenders for the AcceptMutex than in the Prefork case. However, you'll probably have to beef up your maximum number of open files per process (ulimit -n), and you may run into the cgid scalability problem I described above.
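The file descriptor limit is usually raised in the script that starts httpd (it has to happen before the server forks), and the thread counts live in httpd.conf. A sketch, with made-up numbers:

    # in the startup script, before invoking httpd:
    ulimit -n 8192

    # httpd.conf (Worker); numbers for illustration only:
    <IfModule worker.c>
        StartServers         4
        MaxClients         400
        MinSpareThreads     25
        MaxSpareThreads     75
        ThreadsPerChild     25
        MaxRequestsPerChild  0
    </IfModule>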
Is sysvsem the right mutex for you? Hard to tell. During my benchmark runs (none of which have published results, unfortunately), I have found sysvsem to be slightly faster than fcntl, but at the cost of a much higher system load. However, that was back in the days of Apache 1.3. The picture for Apache 2.0 may be different, especially in the Worker case.
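If you do try fcntl, note that the lock file (LockFile directive) needs to live on a local filesystem; fcntl locks over NFS range from slow to broken, so don't let the lock file land on your NFS-mounted document tree. Something like:

    AcceptMutex fcntl
    # keep the lock file on local disk, never on NFS
    LockFile /var/run/httpd-accept.lock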
The final answer may be that you'll have to set up the various scenarios on a staging box and point Siege at it, then see if you can make it not fall over. (:
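For example (flags per the Siege manpage; the URL and concurrency are placeholders to tune for your setup):

    # 200 concurrent simulated users, hammering for 10 minutes
    siege -c 200 -t 10M http://staging.example.com/cgi-bin/test.cgi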
Hope some of this is relevant for you. As you see, it has nothing to do with programming (unless you're interested in rewriting mod_cgid to make it pre-fork for more scalability), so it really belongs on the users list. (:
S.
-- 
[EMAIL PROTECTED]   http://www.temme.net/sander/
PGP FP: 51B4 8727 466A 0BC3 69F4 B7B8 B2BE BC40 1529 24AF
