Hi Rick,

> Well, I have some good news and some bad news (the bad is mostly for Richard).
>
> The only message logged is:
> tcpflags 0x4<RST>; tcp_do_segment: Timestamp missing, segment processed 
> normally
>
> But...the RST battle no longer occurs. Just one RST that works and then the 
> SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>
> So, what is different?
>
> r367492 is reverted from the FreeBSD server.
> I did the revert because I think it might be what is causing otis@'s hang. 
> (In his case, the Recv-Q grows on the socket for the stuck Linux client, 
> while others work.)
>
> Why does reverting fix this?
> My only guess is that the krpc gets the upcall right away and sees an EPIPE 
> when it does soreceive(), which results in soshutdown(SHUT_WR).

With r367492 you don't get the upcall with the same error state? Or you don't 
get an error on a write() call, when there should be one?
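
To make the write-side question concrete, here is a minimal userland sketch of 
the error I would expect a writer to see once its send side has been shut down. 
This is only an analogy using an AF_UNIX socketpair in Python; it does not 
exercise the krpc, soreceive() or the upcall path, and the function name is 
made up for illustration.

```python
import errno
import socket


def write_after_shutdown():
    """Return the errno seen when sending after shutdown(SHUT_WR), or 0."""
    a, b = socket.socketpair()
    try:
        # Userland stand-in for the send side being shut down;
        # this is not the kernel soshutdown() path itself.
        a.shutdown(socket.SHUT_WR)
        try:
            a.send(b"x")
        except OSError as e:
            return e.errno
        return 0
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    err = write_after_shutdown()
    # Print the symbolic errno name if there is one.
    print(errno.errorcode.get(err, err))
```

If the kernel consumer does not see an equivalent error on its write path with 
r367492 applied, that would be the difference worth tracking down.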

From what you describe, this is on writes, isn't it? (I'm asking, as the 
original problem that was fixed with r367492 occurs in the read path: 
draining of the so_rcv buffer in the upcall right away, which subsequently 
influences the ACK sent by the stack.)

I only added the so_snd buffer part after some discussion about whether 
WAKESOR shouldn't have a symmetric equivalent in WAKESOW...

Thus a partial backout (leaving the WAKESOR part in, but reverting the 
WAKESOW part) would still fix my initial problem with erroneous DSACKs (which 
can also lead to extremely poor performance with Linux clients), but possibly 
address this issue...

Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for the 
revert only on the so_snd upcall?

If this doesn't help, some major surgery will be necessary to prevent NFS 
sessions with SACK enabled from transmitting DSACKs...


> I know from a printf that this happened, but whether it caused the RST battle 
> to not happen, I don't know.
> 
> I can put r367492 back in and do more testing if you'd like, but I think it 
> probably needs to be reverted?

Yes, please. I don't quite understand why the exact timing of the upcall would 
be that critical here...

A comparison of the soxxx calls and their errors between the "good" and the 
"bad" case would be perfect. I don't know if this is easy to do though, as 
these calls appear to be scattered all around the RPC / NFS source paths.
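
Something along these lines is what I have in mind. It is a rough Python 
sketch, assuming each run has already been reduced to one "call errno" line per 
socket call. The log format, the function name and the sample entries are all 
made up for illustration; they are not real traces.

```python
import difflib


def diff_traces(good, bad):
    """Return unified-diff lines between two lists of 'call errno' strings."""
    return list(
        difflib.unified_diff(good, bad, fromfile="good", tofile="bad", lineterm="")
    )


# Hypothetical placeholder data, NOT taken from any real run.
good = ["call_a 0", "call_b EPIPE", "call_c 0"]
bad = ["call_a 0", "call_c 0"]

for line in diff_traces(good, bad):
    print(line)
```

Even a handful of printf lines per run, compared this way, should show where 
the two cases start to diverge.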

> This does not explain the original hung Linux client problem, but does shed 
> light on the RST war I could create by doing a network partitioning.
>
> rick
