I had trouble thinking of a good summary subject for this question 8-/.

Basically, I was testing NFSv4 from a Solaris client. Evidently Solaris
does not tear down the TCP connection when the client is shut down or
rebooted, leaving an orphaned established connection on the server.

The problem I'm having is when the client comes back up, it tries to
reconnect to the server, but the server believes the connection is already
in the established state. Solaris always uses port 1023 as the source port
for the initial NFS connection, so if you bring the system up and then
immediately reboot it, the server will very likely still believe it has a
connection to the client.

I have a basic stateful rule on the client allowing outbound access to the
server (something like "pass out quick proto tcp from any to server port =
2049 flags S/SA keep state"). This works fine for the initial connection.
However, after a reboot when the server thinks it already has an
established connection, instead of replying with a SYN/ACK, the server
replies with a dupACK. If ipf is disabled, everything works out okay and
they renegotiate the connection. With ipf enabled, the dupACK is blocked.
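
For reference, the unfiltered recovery appears to match the half-open
connection handling described in RFC 793: the client in SYN-SENT answers the
unexpected ACK with a RST, the server drops its stale connection, and the
next SYN succeeds. The following is only a toy model of that exchange, not
ipf's actual state engine. The function name, the parameters, and the retry
cap are hypothetical (the real Solaris client retries forever).

```python
# Toy model of half-open connection recovery (RFC 793, section 3.4).
# Not ipf's real logic; all names and the retry cap are illustrative.

def reconnect(filter_drops_bare_ack, max_syn_retries=3):
    """Return (final client state, packet log) for a rebooted client."""
    server_state = "ESTABLISHED"  # stale connection from before the reboot
    client_state = "CLOSED"
    log = []
    for _ in range(max_syn_retries):
        client_state = "SYN-SENT"
        log.append("client -> server: SYN")
        if server_state == "ESTABLISHED":
            # Server answers the unexpected SYN with a bare ACK
            # that belongs to the old connection.
            log.append("server -> client: ACK")
            if filter_drops_bare_ack:
                # State entry only admits a SYN/ACK, so the client
                # never sees the ACK and simply retransmits its SYN.
                log.append("filter: dropped ACK")
                continue
            # Client sees an unacceptable ACK and resets the peer.
            log.append("client -> server: RST")
            server_state = "LISTEN"
            continue
        # Server has no stale connection left: normal handshake.
        log.append("server -> client: SYN/ACK")
        log.append("client -> server: ACK")
        client_state = server_state = "ESTABLISHED"
        break
    return client_state, log


if __name__ == "__main__":
    for drops in (False, True):
        state, log = reconnect(filter_drops_bare_ack=drops)
        print("filter drops ACK:", drops, "-> client ends in", state)
        for line in log:
            print("   ", line)
```

In this model the filtered case never gets as far as sending the RST, which
is why the client stays in SYN-SENT until the retry cap is reached.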

Unfortunately, Solaris never times out the attempted connection; it just
sits there forever sending out SYNs, and NFS is wedged.

This is with the bundled Solaris 10U4 ipfilter with the latest patches.
After going back and forth with Sun support on the issue, their position is
that this is correct behavior: a stateful connection in SYN-SENT state
should only allow a SYN/ACK back. Their recommendation is to either not use
ipf or to use stateless rules both inbound and outbound.

Neither option seems particularly palatable.

What should happen in this case? Is Sun's position reasonable? Is complete
failure really the best outcome in this scenario? I haven't had the time to
try out the latest version of open-source ipf; would it do the same thing?
Any recommendations on resolving this problem?

The problem is further complicated because the Linux server I'm currently
using has a bug where it never closes an established NFS TCP connection no
matter how long it has been idle. As such, the client could be down for
days and still wedge upon being booted.

Thanks...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
