Hiyas,

Try disabling "KeepAlive" connections on your Apache server (it is enabled by default). Simply edit your "httpd.conf" file and change the following line:

KeepAlive On

        to:

KeepAlive Off

        And see if it works.

Regards,

Domingos.

Frode E. Moe wrote:
(Sorry for a rather long email, here's an "executive summary": Windows firewall
doesn't reply with RST for TCP retransmissions on a client-closed connection,
causes apache workers to get stuck for 5 minutes)

Hello list,
lately I've been trying to track down spurious apparent freezes in an application running on Apache + PHP. In short it seems like apache (or the kernel?) in some cases fails to detect when a client closes a connection mid-download, and leaves the worker stuck for several minutes trying to write the full http response. The problem is amplified by the fact that PHP keeps its session file flock()'ed while this happens,
which means any further requests from the client never get answered
(at least until the stuck worker times out).
Please note that although this post focuses on PHP, I'm pretty sure this
problem is not specific to that scripting language.
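
On the session lock specifically, here is a minimal sketch, assuming the script only needs to read from the session. PHP's session_write_close() saves the session data and releases the lock, so calling it before a long response should let other requests from the same client proceed even while this worker is stuck. The helper name read_session_then_unlock and the 'user_id' key are made up for illustration.

```php
<?php
// Hypothetical helper (not part of the original test case): copy the needed
// values out of the session, then release the session file lock.
function read_session_then_unlock(array $keys)
{
    session_start();          // takes the lock on the session file
    $values = array();
    foreach ($keys as $key) {
        // isset() guards against keys that were never stored
        $values[$key] = isset($_SESSION[$key]) ? $_SESSION[$key] : null;
    }
    session_write_close();    // saves the session and releases the lock
    return $values;
}

// 'user_id' is an illustrative key, not something from the original scripts.
$session = read_session_then_unlock(array('user_id'));
// ... the long-running output would follow here, with the session unlocked.
```

This would not stop the worker itself from getting stuck; it would only keep a stuck worker from blocking the client's other requests. Anything written to $_SESSION after the call is not saved.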

Interestingly this occurs much more frequently if the client is running
the Windows XP SP2 built-in firewall. I'll get back to that shortly.

Here is a small test case:

index.php contains:
  <? session_start(); ?>
  <html>
    <body>
       Test case for hang
       <?
       for ($i=0; $i<100; $i++) {
         echo $i.'<img src="noimg.php?i='.$i.'"><br>';
       }
       ?>
    </body>
  </html>

noimg.php contains:
  <? session_start(); ?>
  <html>
  <body>
  <?
    for ($i=0;$i<1000;$i++) {
      echo "$i: asdf asdf asdf asdf asdfasdf asdf asdf asdf asdf asdf asdf asdf asdf\n";
    }
  ?>
  </body>
  </html>

(I know it's silly to return a text/html page to be loaded in an <img src>, but
that's necessary to cause the client to abort the connection mid-download.
This actually happened in real life due to an erroneous <img src=""> tag which caused the browser to load the *current URL* as an image)

My test setup consists of:
* Server: Apache HTTPd 2.2.3 compiled from source, running on Debian GNU/Linux stable with kernel 2.4.33.3
* Client: Firefox 2.0 on Windows XP SP2, with the Windows firewall enabled.

Server and client are placed on the same LAN.

When Firefox is pointed at the index.php given above, it starts to load the various <img> tags but aborts the connection for each image mid-download, probably because it detects the MIME type text/html. The problem is that after a couple of requests, apache fails to detect that the client has closed the connection. Firefox then tries to load the next image, but the previous "image" (PHP script) is still running and keeping the session locked, so any further requests from the client just hang.

I did a wireshark capture on the client while executing the test case,
and here is an excerpt (I could probably sanitize out passwords etc
and provide a full .pcap file, should that be necessary):

(10.0.0.43 is the server, 10.0.0.138 is the client, '>' marks the most interesting packets)

 596   9.729025   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [SYN] Seq=0 Len=0 MSS=1460
 600   9.751507    10.0.0.43 -> 10.0.0.138   TCP 80 2623 80 > 2623 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1460
 601   9.751546   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [ACK] Seq=1 Ack=1 Win=64512 Len=0
 602   9.760570   10.0.0.138 -> 10.0.0.43    HTTP 2623 80 GET /noimg.php?i=45 HTTP/1.1
 603   9.762774    10.0.0.43 -> 10.0.0.138   TCP 80 2623 80 > 2623 [ACK] Seq=1 Ack=464 Win=6432 Len=0
 604   9.803347    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP segment of a reassembled PDU]
 605   9.810978    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP segment of a reassembled PDU]
 606   9.811029   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [ACK] Seq=464 Ack=2921 Win=64512 Len=0
>607   9.813999   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [FIN, ACK] Seq=464 Ack=2921 Win=64512 Len=0
 608   9.814405   10.0.0.138 -> 10.0.0.43    TCP 2624 80 2624 > 80 [SYN] Seq=0 Len=0 MSS=1460
 609   9.819676    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP segment of a reassembled PDU]
>610   9.819725   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [RST, ACK] Seq=465 Ack=4381 Win=0 Len=0
 611   9.826157    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP segment of a reassembled PDU]
 612   9.859980    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Previous segment lost] 80 > 2623 [ACK] Seq=7301 Ack=465 Win=6432 Len=0
>613  10.072413    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>614  10.568147    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>620  11.558401    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
 623  12.785586   10.0.0.138 -> 10.0.0.43    TCP 2624 80 2624 > 80 [SYN] Seq=0 Len=0 MSS=1460
 624  12.786740    10.0.0.43 -> 10.0.0.138   TCP 80 2624 80 > 2624 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1460
 625  12.786768   10.0.0.138 -> 10.0.0.43    TCP 2624 80 2624 > 80 [ACK] Seq=1 Ack=1 Win=64512 Len=0
 626  12.789319   10.0.0.138 -> 10.0.0.43    HTTP 2624 80 GET /noimg.php?i=46 HTTP/1.1
 627  12.790579    10.0.0.43 -> 10.0.0.138   TCP 80 2624 80 > 2624 [ACK] Seq=1 Ack=464 Win=6432 Len=0
>628  13.569077    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>637  17.570187    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>638  25.575101    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>639  41.563376    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>650 137.533500    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>651 257.507283    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>652 377.499848    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]

(then things "unlock" and proceeds as normal for a while)

The interesting part to note here is that the client sends a FIN, ACK, then an RST, ACK, and then stays completely silent on port 2623.
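
To make the retransmission pattern easier to see, here is a small sketch that computes the gaps between the "[TCP Retransmission]" packets on port 2623. The timestamps are copied from the capture excerpt; the function name retransmit_gaps is made up for illustration.

```php
<?php
// Capture-relative timestamps (seconds) of the "[TCP Retransmission]"
// packets on port 2623, copied from the excerpt.
$retransmit_times = array(
    10.072413, 10.568147, 11.558401, 13.569077, 17.570187,
    25.575101, 41.563376, 137.533500, 257.507283, 377.499848,
);

// Hypothetical helper: gap in seconds between consecutive timestamps,
// rounded to one decimal place.
function retransmit_gaps(array $times)
{
    $gaps = array();
    for ($i = 1; $i < count($times); $i++) {
        $gaps[] = round($times[$i] - $times[$i - 1], 1);
    }
    return $gaps;
}

foreach (retransmit_gaps($retransmit_times) as $gap) {
    echo $gap, "\n";
}
```

The gaps come out at roughly 0.5, 1, 2, 4, 8, 16, 96, 120 and 120 seconds, which looks like exponential backoff of the server's retransmission timer with a ceiling near 120 seconds. The 96-second gap would match a 32-second and a 64-second step if one retransmission is simply missing from the excerpt, but that is a guess.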

netstat -n on the client shows:
  TCP   10.0.0.138:2624    10.0.0.43:80     ESTABLISHED

netstat -np on the server shows:
tcp        0      0 10.0.0.43:80            10.0.0.138:2624         ESTABLISHED 6080/httpd
tcp        1  10220 10.0.0.43:80            10.0.0.138:2623         CLOSE_WAIT  6075/httpd

In other words, the client has completely "forgotten" the port-2623 connection,
but the server still knows about it.
Attaching to pid 6075 with gdb and running a stacktrace shows:

#0  0x401fba18 in poll () from /lib/libc.so.6
#1  0x40093c78 in apr_wait_for_io_or_timeout (f=0x0, s=0x817f7c0, for_read=0) at support/unix/waitio.c:51
#2  0x4008efef in apr_socket_sendv (sock=0x817f7c0, vec=0xbfffbbf8, nvec=3, len=0xbfffbab8) at network_io/unix/sendrecv.c:208
#3  0x08073642 in writev_it_all (s=0x817f7c0, vec=0xbfffbbf0, nvec=4, len=8074, nbytes=0xbfffbb48) at core_filters.c:321
#4  0x08073fea in ap_core_output_filter (f=0x817fdd0, b=0x8185dd8) at core_filters.c:868
#5  0x0807ef11 in ap_pass_brigade (next=0x7531, bb=0x1) at util_filter.c:526
#6  0x0808f91a in ap_http_chunk_filter (f=0x8185f88, b=0x8185dd8) at chunk_filter.c:187
#7  0x0807ef11 in ap_pass_brigade (next=0x7531, bb=0x1) at util_filter.c:526
#8  0x0807ef11 in ap_pass_brigade (next=0x7531, bb=0x1) at util_filter.c:526
#9  0x080693ae in ap_content_length_filter (f=0x81927a8, b=0x8185dd8) at protocol.c:1338
#10 0x0807ef11 in ap_pass_brigade (next=0x7531, bb=0x1) at util_filter.c:526
#11 0x40048294 in apr_brigade_write (b=0x8185dd8, flush=0x807f050 <ap_filter_flush>, ctx=0xfffffffc,
    str=0x410e38a8 "218: asdf asdf asdf asdf asdfasdf asdf asdf asdf asdf asdf asdf asdf asdf\n", nbyte=74) at buckets/apr_brigade.c:400
#12 0x080697cd in buffer_output (r=0x8191a20, str=0x410e38a8 "218: asdf asdf asdf asdf asdfasdf asdf asdf asdf asdf asdf asdf asdf asdf\n", len=74)
    at protocol.c:1455
#13 0x080698db in ap_rwrite (buf=0xfffffffc, nbyte=74, r=0x7531) at protocol.c:1490
#14 0x40635137 in php_apache_sapi_ub_write (str=0xfffffffc <Address 0xfffffffc out of bounds>, str_length=74)
    at /devel2/x2www/src/php-5.2.0/sapi/apache2handler/sapi_apache2.c:78

(I won't bore the list with the output of "bt full", but you can get that at http://corehacker.com/~frode/apache-user/pollhang-bt-full.txt)

So, apache is stuck in a poll() apparently waiting for the client to suck down whatever apache wants to write, but the client is long gone
and we have to wait for the poll() to completely time out before the
worker is freed.
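
One knob that may be relevant here, offered as an assumption rather than a confirmed fix, is the Timeout directive in httpd.conf. According to the Apache 2.2 documentation it also covers the time between ACKs on response transmissions, and it defaults to 300 seconds, which would match a hang of about five minutes. Lowering it would presumably free a stuck worker sooner, at the cost of cutting off clients that are genuinely slow. The value 30 below is purely illustrative.

```apache
# httpd.conf -- illustrative value only, not a tested recommendation
Timeout 30
```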

Interestingly enough, if the windows firewall is disabled on the client, there is no such long hang, because the client sends RST packets
for each "[TCP Retransmission]" packet, so the socket closes down almost
immediately on the server as well.

I've reproduced exactly the same effect when running httpd 2.2.2 on FreeBSD 6.1, and also with httpd 1.3.34 on the same Linux box (although this seemed to detect the closed client socket quicker).
Strangely enough, I failed to reproduce the effect when running the Windows XP SP2 + Firefox + firewall client setup inside VMware.

Does anyone have any tips on how to mitigate this problem (besides the obvious
fix of "don't return text/html when the client Accepts: image/png")? I don't think "disable the client firewall" is a realistic answer for a public-facing web site. Anyway, I've gotten reports that disabling the firewall greatly improves things but the occasional hang still occurs.
Also, isn't this sort of a weakness that makes it fairly easy to create a
Denial of Service situation by "eating up" all workers with little effort?

--
Domingos Parra Novo
Coordenador de Projetos
Terra Networks Brasil S/A
Tel: +55(51)3284-4275

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.