Hi Jim,
don't let this observation stop you from tagging 2.4.12. I don't think it
is new behavior, and contrary to my original expectation the child
processes do not hang indefinitely but instead just serve the remaining
requests at a low data send rate.
The only strange thing - and that is likely not new - is that the
remaining connection(s) often are not counted in the busy or async
connections columns in server-status.
Regards,
Rainer
On 20.01.2015 at 12:52, Jim Jagielski wrote:
Isn't the TCP keepalive something like 2 hours?
On Jan 20, 2015, at 2:45 AM, Ruediger Pluem <rpl...@apache.org> wrote:
On 01/19/2015 11:40 PM, Rainer Jung wrote:
I noticed a hanging child process on our ASF server aurora.
It currently uses 2.4.11 (plus the post tag commit) and event MPM.
Most processes exiting due to MaxConnectionsPerChild get cleaned up after some
time, but this one doesn't. It has now been hanging for more than an hour.
I'll let it hang. In case anyone has a good question I can answer with gdb,
let me know.
It shows a strange connection view in the server status table:
PID    Connections       Threads     Async connections
       total  accepting  busy  idle  writing  keep-alive  closing
93557  1      yes        0     0     0        0           0
So it has 1 connection, but 0s in all other columns.
The connection can be seen by lsof:
FD    TYPE    DEVICE              SIZE/OFF  NODE    NAME
txt   VREG    183,3400335528      36497117  275235  /x1/www/archive.apache.org/dist/cordova/cordova-3.4.0-src.zip
9u    PIPE    0xfffffe061ecfab60  16384             ->0xfffffe061ecfacb8
10u   PIPE    0xfffffe061ecfacb8  0                 ->0xfffffe061ecfab60
24u   KQUEUE  0xfffffe033071be00                    count=0, state=0x2
41u   IPv4    0xfffffe01316243d0  0t0       TCP     127.0.0.1:35849->127.0.0.1:8050 (CLOSE_WAIT)
83u   IPv4    0xfffffe0255d08b70  0t0       TCP     127.0.0.1:52023->127.0.0.1:8050 (CLOSE_WAIT)
108u  IPv4    0xfffffe09990eeb70  0t0       TCP     127.0.0.1:22532->127.0.0.1:8050 (CLOSE_WAIT)
This is the established connection:
110u  IPv4    0xfffffe0255ab4b70  0t0       TCP     192.87.106.229:http->179.206.174.192:65496 (ESTABLISHED)
And this is likely the file being served on that connection:
126r  VREG    183,3400335528      36497117  275235  /x1/www/archive.apache.org/dist/cordova/cordova-3.4.0-src.zip
156u  IPv4    0xfffffe048d0ff3d0  0t0       TCP     127.0.0.1:26685->127.0.0.1:8050 (CLOSE_WAIT)
229u  IPv4    0xfffffe0131d013d0  0t0       TCP     127.0.0.1:31538->127.0.0.1:8050 (CLOSE_WAIT)
netstat shows:
Proto  Recv-Q  Send-Q  Local Address      Foreign Address        (state)
tcp4   0       87650   192.87.106.229.80  179.206.174.192.65496  ESTABLISHED
So there are 87650 bytes in the Send-Q. Most likely the client hasn't ACKed
what we sent.
Isn't it weird that the connection remains in this state for an hour? I would
guess the OS tries to resend what's in the buffer, and if it doesn't get
ACKed it would eventually time out the TCP connection and assume the peer is
gone.
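
To watch for this pattern without reading netstat by hand, something like the
following might do. It is only a rough sketch in Python: the helper names
parse_netstat_line and has_unacked_data are made up for illustration (they are
not part of httpd or netstat), and it assumes the six-column BSD-style line
format shown above.

```python
def parse_netstat_line(line):
    """Split one BSD-style netstat TCP line into named fields.

    Assumes exactly six whitespace-separated columns:
    Proto, Recv-Q, Send-Q, Local Address, Foreign Address, state.
    """
    proto, recv_q, send_q, local, foreign, state = line.split()
    return {
        "proto": proto,
        "recv_q": int(recv_q),
        "send_q": int(send_q),
        "local": local,
        "foreign": foreign,
        "state": state,
    }


def has_unacked_data(entry):
    """True if the connection is established and its Send-Q is not empty."""
    return entry["state"] == "ESTABLISHED" and entry["send_q"] > 0


# The line from the netstat output above.
line = "tcp4 0 87650 192.87.106.229.80 179.206.174.192.65496 ESTABLISHED"
entry = parse_netstat_line(line)
print(entry["send_q"], has_unacked_data(entry))
```

Running this repeatedly and comparing send_q between runs would show whether
the queue is draining slowly or not moving at all.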
Regards
Rüdiger