On 01/19/2015 11:40 PM, Rainer Jung wrote:
> I noticed a hanging child process on our ASF server aurora.
> It currently uses 2.4.11 (plus the post tag commit) and event MPM.
> Most processes exiting due to MaxConnectionsPerChild get cleaned up after
> some time but this one doesn't. It has now been hanging
> for more than an hour. I'll let it hang. In case anyone has a good question I
> can answer with gdb let me know.
>
> It shows a strange connection view in the server status table:
>
> PID    Connections       Threads    Async connections
>        total  accepting  busy idle  writing keep-alive closing
> 93557  1      yes        0    0     0       0          0
>
> So it has 1 connection, but 0s in all other columns.
>
> The connection can be seen by lsof:
>
> FD TYPE DEVICE SIZE/OFF NODE NAME
> txt VREG 183,3400335528 36497117 275235 /x1/www/archive.apache.org/dist/cordova/cordova-3.4.0-src.zip
> 9u PIPE 0xfffffe061ecfab60 16384 ->0xfffffe061ecfacb8
> 10u PIPE 0xfffffe061ecfacb8 0 ->0xfffffe061ecfab60
> 24u KQUEUE 0xfffffe033071be00 count=0, state=0x2
> 41u IPv4 0xfffffe01316243d0 0t0 TCP 127.0.0.1:35849->127.0.0.1:8050 (CLOSE_WAIT)
> 83u IPv4 0xfffffe0255d08b70 0t0 TCP 127.0.0.1:52023->127.0.0.1:8050 (CLOSE_WAIT)
> 108u IPv4 0xfffffe09990eeb70 0t0 TCP 127.0.0.1:22532->127.0.0.1:8050 (CLOSE_WAIT)
>
> This is the established connection:
>
> 110u IPv4 0xfffffe0255ab4b70 0t0 TCP 192.87.106.229:http->179.206.174.192:65496 (ESTABLISHED)
>
> And this is likely the file being served on that connection:
>
> 126r VREG 183,3400335528 36497117 275235 /x1/www/archive.apache.org/dist/cordova/cordova-3.4.0-src.zip
> 156u IPv4 0xfffffe048d0ff3d0 0t0 TCP 127.0.0.1:26685->127.0.0.1:8050 (CLOSE_WAIT)
> 229u IPv4 0xfffffe0131d013d0 0t0 TCP 127.0.0.1:31538->127.0.0.1:8050 (CLOSE_WAIT)
>
> netstat shows:
>
> Proto Recv-Q Send-Q Local Address      Foreign Address        (state)
> tcp4  0      87650  192.87.106.229.80  179.206.174.192.65496  ESTABLISHED
>
> so there are 87650 bytes in the Send-Q. Most likely the client hasn't ACKed
> what we sent.
Isn't it weird that the connection remains in this state for an hour? I would guess the OS tries to resend what's in the buffer, and if that doesn't get ACKed it would eventually time out the TCP connection and assume the peer is gone.

Regards

Rüdiger
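To put a rough number on the retransmission guess, here is a minimal sketch of how long a sender would keep retrying unacknowledged data under exponential backoff. The initial timeout, the ceiling and the retry limit are illustrative assumptions, not values read from aurora or from any particular OS; the function name is made up for this example.

```python
def total_retransmit_time(initial_rto=1.0, max_rto=64.0, max_retries=12):
    """Return (per-attempt timeouts, total seconds) before the sender gives up.

    All three defaults are assumed, illustrative values; real stacks
    derive the initial timeout from the measured round-trip time.
    """
    timeouts = []
    rto = initial_rto
    for _ in range(max_retries):
        timeouts.append(rto)
        rto = min(rto * 2, max_rto)  # double the timeout, up to the ceiling
    return timeouts, sum(timeouts)


timeouts, total = total_retransmit_time()
print(f"{len(timeouts)} retransmissions, {total:.0f} seconds in total")
```

With these assumed values the total is 447 seconds, so plain retransmission timeouts would be expected to give up well before an hour has passed. The real limits depend on the OS and its tuning.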