Dear friends,

Nadav, my son, sent the enclosed message to the Apache users mailing list
and drew a blank. I am resending it here, hoping the top Apache gurus
participating in the discussions here may offer some insight. We are really
puzzled by the described behavior.

Best,

Zvi.

-- 
Dr. Zvi Har'El     mailto:[EMAIL PROTECTED]     Department of Mathematics
tel:+972-54-227607                   Technion - Israel Institute of Technology
fax:+972-4-8324654 http://www.math.technion.ac.il/~rl/     Haifa 32000, ISRAEL
"If you can't say somethin' nice, don't say nothin' at all." -- Thumper (1942)
                           Thursday, 25 Shevat 5762,  7 February 2002,  1:39PM

---------- Forwarded message ----------
Date: Sun, 20 Jan 2002 13:28:09 +0200
From: Nadav Har'El <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: unexplained phenomenon: hanging apache processes
Resent-Date: Thu, 7 Feb 2002 11:00:23 +0200
Resent-From: [EMAIL PROTECTED]
Resent-To: "Zvi Har'El" <[EMAIL PROTECTED]>

Recently I've stumbled upon a puzzling problem when trying to measure Apache's
performance using Microsoft's WAS (Web Application Stress Tool) or Radview's
WebLOAD. I was wondering if anyone has ever noticed this phenomenon, or can
suggest an explanation, or has any guess as to whether this is a bug in Apache
(or Linux), or something else.

The problem is that something in the measurement client, the server OS
(I tried Linux 2.2.16, 2.2.19 and 2.4.3), Apache (I tried 1.3.20), or
mod_ssl (I tried both with and without it) causes one or more of the httpd
processes to "hang", blocking on a read of input from the client which will
never come.

Such a blocked process will remain blocked for 300 seconds (or another
period defined by the Apache "Timeout" directive), so over time more and
more processes can get hung. In WebLOAD measurements with very high loads,
starting Apache with 250 processes, we consistently got them all blocked after
roughly 10 minutes, at which point Apache's throughput obviously dropped
to zero. But it is even easier to recreate this problem when Apache is
limited (with MaxClients) to a smaller number of processes: for example,
with 9 processes WebLOAD will hang them all in a few minutes, and with
2 processes Microsoft WAS will hang them both (if you configure it to do
SSL requests) almost immediately.
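
As a rough back-of-envelope sketch of why the pool can drain completely (a
model, not a measurement): assume stalled connections arrive at a steady rate
and each one holds a process for the full Timeout. Then the number of hung
processes at any moment is about rate x Timeout, capped at MaxClients. The
function names and the steady-rate assumption are illustrative only.

```python
def hung_processes(stall_rate, timeout, max_clients):
    """Approximate steady-state number of hung processes.

    stall_rate  -- stalled connections arriving per second (assumed constant)
    timeout     -- seconds a hung process stays blocked (Apache's Timeout)
    max_clients -- size of the process pool (Apache's MaxClients)
    """
    # Each stall occupies one process for `timeout` seconds, so on average
    # stall_rate * timeout processes are blocked, up to the pool size.
    return min(max_clients, stall_rate * timeout)


def min_rate_to_exhaust(timeout, max_clients):
    """Smallest stall rate (per second) that eventually blocks every process."""
    return max_clients / timeout


if __name__ == "__main__":
    for pool in (250, 9, 2):
        rate = min_rate_to_exhaust(300, pool)
        print(f"MaxClients={pool}: {rate:.4f} stalls/second fills the pool")
```

Under this model, with Timeout at 300 seconds, less than one stalled
connection per second is enough to tie up 250 processes, and a pool of 9
needs only about one stall every 33 seconds.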

I tried several experiments to understand what is going on, but so far without
being able to fully explain it, so I was hoping maybe someone else noticed
this problem and can shed some light on it (or just say "I've seen it too!").

For example, when I measure Apache with one process (httpd -X) with MS-WAS
and SSL requests, and run a sniffer to see what's going on, I see the first
request being handled perfectly, but already on the second request the
client (MS-WAS) suddenly stops sending the proper SSL protocol in the middle
of a session, so the server (Apache) hangs on read. There is no RST or
anything else sent by the client indicating that it wanted to close this
session... This could have been dismissed as a bug in MS-WAS or the Windows
it runs on, if it weren't for the fact that, as I said, I also see a similar
problem with Radview's WebLOAD, and that both of these tools seem (at least
as far as I have heard) to be rather respected in the industry.
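
The client behaviour seen in the sniffer (data stops mid-request, with no FIN
or RST) can be imitated without either load tool. The following is a minimal
sketch over plain HTTP rather than SSL; the function names, the partial
request, and the host and port in the usage note are all hypothetical, and it
is not a reconstruction of what MS-WAS or WebLOAD actually do.

```python
import socket
import time


def open_stalled_connection(host, port, partial_request=b"GET / HTTP/1.0\r\n"):
    """Connect, send an incomplete request, and return the still-open socket.

    The request line is sent without the terminating blank line, so the
    server keeps waiting on read for the rest of the request.
    """
    sock = socket.create_connection((host, port), timeout=10)
    sock.sendall(partial_request)
    return sock


def hold_connections(host, port, count, hold_seconds):
    """Open `count` stalled connections and keep them silent for a while."""
    socks = [open_stalled_connection(host, port) for _ in range(count)]
    try:
        # Send nothing and close nothing: the server sees neither more
        # data nor a FIN/RST during this period.
        time.sleep(hold_seconds)
    finally:
        for s in socks:
            s.close()
```

For instance, calling hold_connections("127.0.0.1", 8080, count=2,
hold_seconds=60) against a test server limited to 2 processes should leave
both of them blocked on read for the duration, if the hang is indeed caused
by a silent client.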

So, has anyone else ever noticed such a problem? Can anyone perhaps shed
some light on it?

Thanks in advance,
        Nadav.


-- 
Nadav Har'El                        |       Sunday, Jan 20 2002, 7 Shevat 5762
[EMAIL PROTECTED]             |-----------------------------------------
Phone: +972-53-245868, ICQ 13349191 |Don't be irreplaceable. If you can't be
http://nadav.harel.org.il           |replaced, you can't be promoted.
