tue...@freebsd.org wrote:
>Rick wrote:
[stuff snipped]
>>> With r367492 you don't get the upcall with the same error state? Or you 
>>> don't get an error on a write() call, when there should be one?
> If Send-Q is 0 when the network is partitioned, after healing, the krpc sees
> no activity on the socket (until it acquires/processes an RPC it will not do
> a sosend()).
> Without the 6 minute timeout, the RST battle goes on "forever" (I've never
> actually waited more than 30 minutes, which is close enough to "forever" for
> me).
> --> With the 6 minute timeout, the "battle" stops after 6 minutes, when the
>      timeout causes a soshutdown(..SHUT_WR) on the socket.
>      (Since the soshutdown() patch is not yet in "main" (I got comments, but
>       no "reviewed" on it), the 6 minute timer won't help if enabled in main.
>       The soclose() won't happen for TCP connections with the back channel
>       enabled, such as Linux NFSv4.1/4.2 ones.)
>I'm confused. So you are saying that if the Send-Q is empty when you partition
>the network, and the peer starts to send SYNs after the healing, FreeBSD
>responds with a challenge ACK which triggers the sending of a RST by Linux.
>This RST is ignored multiple times.
>Is that true? Even with my patch for the bug I introduced?
Yes and yes.
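
For anyone following along, here is a rough sketch of how a receiver in a
synchronized state is expected to treat SYN and RST segments under RFC 5961
(sections 3.2 and 4.2). It is not FreeBSD's code and it does not claim to
explain the ignored RSTs in the trace; on_segment() and its return strings are
made-up names for illustration only.

```python
MOD = 1 << 32  # TCP sequence numbers wrap at 2**32


def on_segment(flags, seq, rcv_nxt, rcv_wnd):
    """Decide what an established connection does with an incoming segment.

    flags is a set of flag names, seq is the segment's sequence number,
    rcv_nxt/rcv_wnd describe the receiver's window (hypothetical interface).
    """
    if "RST" in flags:
        if seq == rcv_nxt:
            return "reset"            # exact match: tear the connection down
        if (seq - rcv_nxt) % MOD < rcv_wnd:
            return "challenge-ack"    # in window but not exact: ask the peer to confirm
        return "drop"                 # outside the window: silently ignored
    if "SYN" in flags:
        return "challenge-ack"        # SYN on a synchronized connection, any seq
    return "process"                  # everything else goes down the normal path
```

The point of the sketch is only that a RST is acted upon immediately just when
its sequence number equals RCV.NXT; anything else is either challenged or
dropped.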
Go take another look at linuxtofreenfs.pcap
("fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap" if you don't
  already have it.)
Look at packets #1949->2069. I use wireshark, but you'll have your favourite.
You'll see the "RST battle" that ends after 6 minutes at packet #2069. If the
6 minute timeout is not enabled in the server side krpc, then the battle just
continues (I once let it run for about 30 minutes before giving up). The
6 minute timeout is not currently enabled in main, etc.
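
If you would rather count the RSTs than eyeball them, something along these
lines should work. It is a minimal sketch that assumes the capture is a
classic (non-pcapng, microsecond) pcap file with Ethernet framing and IPv4; I
have not checked that against the file above. count_rsts() and tcp_flags() are
names invented for this example.

```python
import struct

TCP_RST = 0x04


def tcp_flags(frame: bytes):
    """Return the TCP flags byte of an Ethernet/IPv4/TCP frame, else None."""
    if len(frame) < 14 + 20 or frame[12:14] != b"\x08\x00":
        return None                      # not IPv4 over Ethernet
    ihl = (frame[14] & 0x0F) * 4         # IPv4 header length in bytes
    if frame[14 + 9] != 6:               # IP protocol 6 is TCP
        return None
    tcp = 14 + ihl
    if len(frame) < tcp + 14:
        return None                      # truncated TCP header
    return frame[tcp + 13]               # flags are byte 13 of the TCP header


def count_rsts(pcap: bytes, first: int, last: int) -> int:
    """Count RST segments among packets first..last (1-based, as in wireshark)."""
    magic = pcap[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset, number, rsts = 24, 0, 0      # skip the 24-byte global header
    while offset + 16 <= len(pcap):
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", pcap, offset)
        offset += 16
        frame = pcap[offset:offset + incl_len]
        offset += incl_len
        number += 1
        if first <= number <= last:
            flags = tcp_flags(frame)
            if flags is not None and flags & TCP_RST:
                rsts += 1
    return rsts
```

Usage would be count_rsts(open("linuxtofreenfs.pcap", "rb").read(), 1949, 2069).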

>What version of the kernel are you using?
"main" dated Dec. 23, 2020 + your bugfix + assorted NFS patches that
are not relevant + 2 small krpc related patches.
--> The two small krpc related patches enable the 6 minute timeout and
       add a soshutdown(..SHUT_WR) call when the 6 minute timeout is
       triggered. These have no effect until the 6 minutes are up and, without
       them, the "RST battle" goes on forever.
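
To make the idea concrete, here is a user-space analogue of what the two
patches do together: remember when the connection was last active and
half-close it once it has been quiet for 6 minutes. This is only a sketch, not
the krpc code; IdleWatch, touch(), poll() and the injectable clock are all
invented for the example.

```python
import socket
import time

IDLE_TIMEOUT = 6 * 60  # seconds; mirrors the 6 minute timer discussed above


class IdleWatch:
    """Track activity on one connection and half-close it after a quiet period."""

    def __init__(self, sock, timeout=IDLE_TIMEOUT, clock=time.monotonic):
        self.sock = sock
        self.timeout = timeout
        self.clock = clock               # injectable so the timer can be tested
        self.last_activity = clock()
        self.shut = False

    def touch(self):
        """Call whenever a request is received or a reply is sent."""
        self.last_activity = self.clock()

    def poll(self):
        """Half-close the connection if it has been idle for too long.

        Returns True once the write side has been shut down.
        """
        if not self.shut and self.clock() - self.last_activity >= self.timeout:
            # user-space counterpart of soshutdown(so, SHUT_WR)
            self.sock.shutdown(socket.SHUT_WR)
            self.shut = True
        return self.shut
```

Nothing happens until the timer expires, which matches the behaviour described
above: the shutdown is a backstop, not a fix for whatever keeps the RSTs from
being honoured.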

Add to the above a revert of r367492 and the RST battle goes away and things
behave as expected. The recovery happens quickly after the network is
unpartitioned, with either 0 or 1 RSTs.

rick
ps: Once the irrelevant NFS patches make it into "main", I will upgrade to
     main bits du jour for testing.

>Best regards
>Michael
>
> If Send-Q is non-empty when the network is partitioned, the battle will not 
> happen.
>
>>
>> My understanding is that he needs this error indication when calling 
>> shutdown().
> There are several ways the krpc notices that a TCP connection is no longer
> functional.
> - An error return like EPIPE from either sosend() or soreceive().
> - A return of 0 from soreceive() with no data (normal EOF from other end).
> - A 6 minute timeout on the server end, when no activity has occurred on the
>  connection. This timer is currently disabled for NFSv4.1/4.2 mounts in
>  "main", but I enabled it for this testing, to stop the "RST battle goes on
>  forever" during testing. I am thinking of enabling it on "main", but this
>  crude band-aid shouldn't be thought of as a "fix for the RST battle".
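
The three cases in the list above can be mimicked in user space roughly as
follows. This is a sketch using ordinary non-blocking sockets rather than
soreceive(), so the details differ from the kernel; receive_status(), its
return tuples and the idle_seconds argument are hypothetical.

```python
import socket


def receive_status(sock, idle_seconds, idle_limit=6 * 60):
    """Classify a non-blocking socket after one receive attempt.

    Returns ("error", errno), ("eof", None), ("idle-timeout", None),
    ("data", bytes) or ("ok", None).
    """
    try:
        data = sock.recv(4096)
    except BlockingIOError:
        # Nothing to read right now; only the idle timer can condemn the connection.
        if idle_seconds >= idle_limit:
            return ("idle-timeout", None)
        return ("ok", None)
    except OSError as exc:
        return ("error", exc.errno)      # e.g. EPIPE or ECONNRESET
    if data == b"":
        return ("eof", None)             # orderly close by the other end
    return ("data", data)
```

Note that the first two cases need the socket layer to report something. If
neither an error nor an EOF is ever delivered, only the timer is left, which
is why it matters for the partition test.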
>
>>>
>>> From what you describe, this is on writes, isn't it? (I'm asking, as the
>>> original problem that was fixed with r367492 occurs in the read path:
>>> draining of the so_rcv buffer in the upcall right away, which subsequently
>>> influences the ACK sent by the stack.)
>>>
>>> I only added the so_snd buffer part after some discussion about whether
>>> WAKESOR shouldn't have a symmetric equivalent for WAKESOW....
>>>
>>> Thus a partial backout (leaving the WAKESOR part in, but reverting the
>>> WAKESOW part) would still fix my initial problem with erroneous DSACKs
>>> (which can also lead to extremely poor performance with Linux clients), but
>>> possibly address this issue...
>>>
>>> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for 
>>> the revert only on the so_snd upcall?
> Since the krpc only uses receive upcalls, I don't see how reverting the send
> side would have any effect?
>
>> Since the release of 13.0 is almost done, can we try to fix the issue 
>> instead of reverting the commit?
> I think it has already shipped broken.
> I don't know if an errata is possible, or if it will be broken until 13.1.
>
> --> I am much more concerned with the otis@ stuck client problem than this
>       RST battle that only occurs after a network partitioning, especially if
>       it is 13.0 specific.
>       I did this testing to try to reproduce Jason's stuck client (with
>       connection in CLOSE_WAIT) problem, which I failed to reproduce.
>
> rick
>
> Rs: Agreed, a good understanding of where the interaction between the stack,
> the socket and the in-kernel TCP user breaks is needed.
>
>>
>> If this doesn't help, some major surgery will be necessary to prevent NFS
>> sessions with SACK enabled from transmitting DSACKs...
>
> My understanding is that the problem is related to getting a local error
> indication after receiving a RST segment too late or not at all.
>
> Rs: But the move of the upcall should not materially change that; I don't
> have a PC here to see if any upcall actually happens on RST...
>
> Best regards
> Michael
>>
>>
>>> I know from a printf that this happened, but whether it caused the RST 
>>> battle to not happen, I don't know.
>>>
>>> I can put r367492 back in and do more testing if you'd like, but I think it 
>>> probably needs to be reverted?
>>
>> Please, I don't quite understand why the exact timing of the upcall would be 
>> that critical here...
>>
>> A comparison of the soxxx calls and errors between the "good" and the "bad" 
>> would be perfect. I don't know if this is easy to do though, as these calls 
>> appear to be scattered all around the RPC / NFS source paths.
>>
>>> This does not explain the original hung Linux client problem, but does shed 
>>> light on the RST war I could create by doing a network partitioning.
>>>
>>> rick
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
