On May 29, 2008, at 5:32 PM, Matthew Dillon wrote:
Now, the connection is also in a half-closed state, which means that one direction is closed. I can't tell which direction that is, but my guess is that 1.1.1.1 (the apache server) closed the 1.1.1.1->2.2.2.2 direction and the 2.2.2.2 box has a broken TCP implementation and can't deal with it.


This is exactly what we're seeing, and it's VERY strange. I did kill off Apache, and all the FIN_WAIT_1's stuck around, so the kernel is in fact sending these probe packets every 60 seconds, which the client responds to... (most of the time).
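For reference, this is roughly how I'm counting the stuck sockets and watching the probe traffic; the interface name and client address below are placeholders, not the real ones:

   # count sockets stuck in FIN_WAIT_1
   netstat -an -p tcp | grep FIN_WAIT_1 | wc -l

   # watch the probe/ACK exchange for one stuck peer; em0 and 2.2.2.2
   # stand in for the real interface and a stuck client's IP
   tcpdump -i em0 -nn -ttt 'host 2.2.2.2 and port 80'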

I can suggest two things. First, the TCP connection is good but you still may be able to tell Apache, in the apache configuration file, to timeout after a certain period of time and clear the connection.

I don't think this helps since Apache sees the connection as long gone. As far as Apache is concerned (as far as I can tell), this connection doesn't exist. This can be shown by killing off Apache: the connections still live. And while Apache is running, I have the max clients connected most of the time... so I don't think they linger around and jam up sockets to Apache. If they did, I think Apache would spiral down quite quickly.
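One way to see that these sockets are orphaned in the kernel rather than tied up in httpd (a rough sketch with the stock tools):

   # sockets actually owned by httpd processes
   sockstat -4 | grep httpd | wc -l

   # sockets the kernel is still holding in FIN_WAIT_1; these no longer
   # belong to any process, so they shouldn't count against MaxClients
   netstat -an -p tcp | grep FIN_WAIT_1 | wc -l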

Secondly, it may be beneficial to identify exactly what the client and server were talking about which caused the client to hang with a live tcp connection. The only way to do that is to tcpdump EVERYTHING going on related to the apache server, save it to a big-ass disk partition (like 500G), and then when you see a stuck connection go back through the tcpdump log file and locate it, grep it out, and review what exactly it was talking about. You'd have to tcpdump with options to tell it to dump the TCP data payloads.


Unfortunately that's not possible for me; I don't have nearly enough space. This is a VERY busy server, spiky 20Mbps+ peaks (8-12Mbps on average) of web traffic almost constantly. The traffic is VERY static, just small data files and occasional large ones (12Mb+), but the majority are 2-5k files. (It's a clamav mirror server.)
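For the record, I take the suggestion to mean something like the rotating capture below (interface, path, and sizes are just placeholders); I simply can't afford to keep enough of it around to be useful:

   # full-payload capture, rotated across 50 x ~1GB files so it tops
   # out around 50GB instead of growing without bound
   tcpdump -i em0 -nn -s 0 -C 1000 -W 50 -w /data/clam-capture 'port 80'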

It seems likely that the client is running an applet or javascript that receives a stream over the connection, and that applet or javascript program has locked up, causing the data sent from the server to build up and for the client's buffer space to run out, and start advertising the 0 window.

98% of the clients are clamav (freshclam) clients on various platforms. Using p0f, most of them appear to be various flavors of Linux, but I can't say for sure what OS the clients are running since I'd have to look at the OS fingerprint of the SYN packets...
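If it turns out to matter, I could capture just the SYNs for p0f and look for the zero-window ACKs directly; a sketch (again, em0 is a placeholder):

   # grab only inbound SYNs so p0f can fingerprint the client OSes
   # offline later (p0f's read-from-file flag varies by version)
   tcpdump -i em0 -nn -s 0 -w /var/tmp/syns.pcap \
       'tcp dst port 80 and tcp[tcpflags] & (tcp-syn|tcp-ack) = tcp-syn'

   # spot clients advertising a zero receive window (the window field is
   # bytes 14-15 of the TCP header); RSTs excluded
   tcpdump -i em0 -nn 'tcp dst port 80 and tcp[14:2] = 0 and tcp[tcpflags] & tcp-rst = 0'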

Don't get me wrong, the server keeps up well: low CPU, lots of RAM free, lots of network available, and 99% of all HTTP connections complete just fine. I just see these FIN_WAIT_1 connections build up over time until the server runs out of socket space and then things just stop working. The only way to correct it seems to be rebooting the server... even under RELENG_7_0, so the upgrade from 4_11 did not fix the problem.
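One thing I haven't tried yet: rather than rebooting, it should be possible to clear the stuck connections by hand with tcpdrop(8), which is in 6.x/7.x. A rough, untested sketch:

   # drop every connection currently stuck in FIN_WAIT_1
   netstat -an -p tcp | awk '$6 == "FIN_WAIT_1" { print $4, $5 }' | \
   while read local foreign; do
       # netstat prints addresses as a.b.c.d.port, so split on the last dot
       tcpdrop "${local%.*}" "${local##*.}" "${foreign%.*}" "${foreign##*.}"
   done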

--
Robert Blayzor, BOFH
INOC, LLC
[EMAIL PROTECTED]
http://www.inoc.net/~rblayzor/


