Hi Gary,

I'd strongly suggest opening that support case with Sun. To give you some
additional information: your understanding of the FIN_WAIT_2 state is
correct. When the application at your end of the connection closes its side
of the socket, a FIN is sent to the remote end and the connection goes to
the FIN_WAIT_1 state. Once the ACK of that FIN arrives, it moves to the
FIN_WAIT_2 state.
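
To make that sequence concrete, here is a rough sketch in Python of the
active-close path. It is a simplified illustration only, not Solaris kernel
code: the event names are my own, and less common paths such as a
simultaneous close are left out.

```python
# Simplified sketch of the active-close side of the TCP state machine.
# Event names are made up for illustration; CLOSING and other paths are omitted.
ACTIVE_CLOSE = {
    ("ESTABLISHED", "app_close"): "FIN_WAIT_1",       # we send our FIN
    ("FIN_WAIT_1", "recv_ack_of_fin"): "FIN_WAIT_2",  # peer ACKs our FIN
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",          # peer sends its FIN, we ACK it
}


def walk(state, events):
    """Apply events in order; an event that does not apply leaves the state unchanged."""
    for event in events:
        state = ACTIVE_CLOSE.get((state, event), state)
    return state


# The case from the original post: our FIN was ACKed, the client's FIN never arrives.
print(walk("ESTABLISHED", ["app_close", "recv_ack_of_fin"]))  # FIN_WAIT_2
```

Without a "recv_fin" event (or a timer) nothing in this table moves the
connection out of FIN_WAIT_2, which matches what you are seeing.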
Now, if I'm not mistaken, if your end of the connection does not receive a
FIN from the other end while it's in the FIN_WAIT_2 state, the ndd parameter
you mentioned determines how long it stays in that state before the
connection gets flushed. This is 675 seconds in your case. Timing out of
FIN_WAIT_2 is in fact a protocol violation, but it seems to be a good idea,
since we must account for situations where the far end of the connection
simply reboots and loses all its socket state.
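
For reference, the arithmetic behind that figure, as a small Python sketch.
It assumes the ndd value is expressed in milliseconds, which is what the
675000 in your output and the 675 seconds above imply; the variable names
are mine.

```python
# tcp_fin_wait_2_flush_interval as reported by ndd (assumed to be milliseconds).
flush_interval_ms = 675000

seconds = flush_interval_ms / 1000
minutes, remainder = divmod(int(seconds), 60)
print(f"{seconds:.0f} s = {minutes} min {remainder} s")  # 675 s = 11 min 15 s
```

So a connection that this timer applies to should be gone after a little
over eleven minutes.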
The important thing about the FIN_WAIT_2 state, though, is that it is a
perfectly synchronised TCP state. The far end will transition into the
CLOSE_WAIT state once it has received the FIN from you, but it may still
want to send some data.
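
The half-close described above can be reproduced on a single machine. This
is a minimal sketch in Python over loopback, not your FTP server's code:
"ours" stands in for the side that sends the first FIN and "peer" for the
far end, and both names as well as the helper are my own.

```python
import socket


def read_until_eof(sock):
    """Read until the other side sends its FIN (recv returns b'')."""
    data = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return data
        data += chunk


# Loopback pair standing in for the two ends of the data connection.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
ours = socket.create_connection(listener.getsockname())
peer, _ = listener.accept()
listener.close()

ours.sendall(b"last bytes")
ours.shutdown(socket.SHUT_WR)    # sends our FIN; we end up in FIN_WAIT_2 once it is ACKed

received = read_until_eof(peer)  # peer sees EOF and is now in CLOSE_WAIT
peer.sendall(b"late reply")      # data can still flow from peer to us
peer.close()                     # peer finally sends its own FIN

reply = read_until_eof(ours)     # blocks until the peer's FIN arrives
ours.close()

print(received, reply)  # b'last bytes' b'late reply'
```

If the peer never closes its side, the last read_until_eof() call simply
keeps blocking, which looks a lot like your process sleeping in read().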
In my opinion you should ask Sun to have a look at it. Here's what you
should collect to get started:

- snoops from both ends of the connection:
  snoop -q -d <devicename> -o <outputfile>
- trusses of the processes on both ends responsible for the data transfer
  (the ones reading from and writing to your sockets):
  truss -eflDda -fall -rall -vall -mall -o <output_file> -p <pid>
- explorers from both systems

Good luck.

Regards,
Daniel

On Sun, 13 Dec 2009 06:23:44 PST, Gary Mills <[email protected]> wrote:
> I have an anonymous FTP server where processes occasionally
> persist with one socket in the FIN_WAIT_2 state:
> 
>    Local Address        Remote Address    Swind Send-Q Rwind Recv-Q   State
> -------------------- -------------------- ----- ------ ----- ------ -----------
> 130.179.16.34.7775   164.164.240.122.1814 59430      0 49640      0 FIN_WAIT_2
> 
> It's always for the data connection, with the process sleeping
> in read() on that socket.  I assume it's waiting for a FIN from
> the client.  Shouldn't this state time out?
> 
> # ndd /dev/tcp tcp_fin_wait_2_flush_interval
> 675000
> 
> It never does.  Is the server supposed to take some action?
> All of the timeouts are set in the ftpaccess file.  Is there a bug
> in the Solaris kernel?  I can't find one that's documented.
> 
> This is running under Solaris 10.  I can open a support case,
> but I'd like to get a little more information first.
_______________________________________________
networking-discuss mailing list
[email protected]
