On 04/06/2018 03:05 AM, Michal Kubecek wrote:
> Hello,
>
> I encountered a strange behaviour of some (non-linux) TCP stack which
> I believe is incorrect but support engineers from the company producing
> it claim is OK.
>
> Assume a client (sender, Linux 4.4 kernel) sends a stream of MSS sized
> segments but segments 2, 4 and 6 do not reach the server (receiver):
>
>        ACK             SAK             SAK             SAK
>     +-------+-------+-------+-------+-------+-------+-------+
>     |   1   |   2   |   3   |   4   |   5   |   6   |   7   |
>     +-------+-------+-------+-------+-------+-------+-------+
>   34273   35701   37129   38557   39985   41413   42841   44269
>
> When segment 2 is retransmitted after RTO timeout, normal response would
> be ACK-ing segment 3 (38557) with SACK for 5 and 7 (39985-41413 and
> 42841-44269).
>
> However, this server stack responds with two separate ACKs:
>
>   - ACK 37129, SACK 37129-38557 39985-41413 42841-44269
>   - ACK 38557, SACK 39985-41413 42841-44269
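
For reference, here is a minimal sketch of what a receiver would be expected to send once the retransmitted segment 2 arrives. It is not taken from any real stack: the function name `expected_ack` and the helper `segment` are hypothetical, and SACK blocks are simply listed in sequence order (a real receiver orders them by recency, per RFC 2018).

```python
# Segment boundaries taken from the diagram in the quoted message.
SEGMENT_EDGES = [34273, 35701, 37129, 38557, 39985, 41413, 42841, 44269]


def segment(n):
    """Return the (left, right) byte range of segment n (1-based)."""
    return (SEGMENT_EDGES[n - 1], SEGMENT_EDGES[n])


def expected_ack(received, initial_seq):
    """Compute one cumulative ACK plus SACK blocks for the data held so far.

    `received` is an iterable of (left, right) byte ranges, right-exclusive.
    Hypothetical helper for illustration only.
    """
    # Merge adjacent or overlapping ranges into maximal contiguous runs.
    merged = []
    for left, right in sorted(received):
        if merged and left <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], right)
        else:
            merged.append([left, right])

    ack = initial_seq
    sacks = []
    for left, right in merged:
        if left <= ack:
            # Contiguous with what is already acknowledged: advance the ACK.
            ack = max(ack, right)
        else:
            # Data above a hole: report it as a SACK block.
            sacks.append((left, right))
    return ack, sacks


# Before the retransmission: segments 1, 3, 5 and 7 have arrived.
before = [segment(n) for n in (1, 3, 5, 7)]
print(expected_ack(before, 34273))
# (35701, [(37129, 38557), (39985, 41413), (42841, 44269)])

# After the retransmitted segment 2 arrives: a single ACK covers 1 through 3.
after = before + [segment(2)]
print(expected_ack(after, 34273))
# (38557, [(39985, 41413), (42841, 44269)])
```

Note that no block ever starts at the cumulative ACK point, which is why the 37129-38557 block in the first ACK above looks wrong.
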

Hmmm... Yes, this seems very, very wrong and lazy.

Have you verified the behavior of more recent Linux kernels against such threats?

A packetdrill test would be relatively easy to write.

Regardless of this broken alien stack, we might be able to work around this
faster than the vendor is able to fix and deploy a new stack.

( https://en.wikipedia.org/wiki/Robustness_principle )
Be conservative in what you do, be liberal in what you accept from others...

>
> There is no payload from server, no window update and it happens even if
> there is no other packet received by server between those two. The
> result is that as segment 3 was never retransmitted, second ACK is
> interpreted as acking a newly arrived segment by 4.4 kernel so that the
> whole interval between first transmission of segment 3 and this second
> ACK is used for RTT estimator; even worse, when the same happens again
> for segment 5, both timeouts (for 2 and 4) are counted into its RTT.
> The result is RTO growing exponentially until it reaches the maximum
> (120 seconds) and the connection is effectively stalled.
>
> In my opinion, server behaviour violates the last paragraph of RFC 5681,
> section 4.2:
>
>   A TCP receiver MUST NOT generate more than one ACK for every incoming
>   segment, other than to update the offered window as the receiving
>   application consumes new data (see [RFC813] and page 42 of [RFC793]).
>
> Server vendor claims that their behaviour is correct as first ACK is
> sent in response to segment 2 and second ACK in response to segment 3
> (which has just been delayed in the out of order queue).
>
> Note that SACK doesn't really help here. First SACK block in first ACK
> (37129-38557) is actually invalid as it violates the "the bytes just
> below the block ... have not been received" condition from RFC 2018
> section 3.
> Therefore Linux 4.4 stack ignores this SACK block, detects
> (spurious) SACK reneging and unmarks the "previously sacked" flag of
> segment 3 so that when second ACK arrives, there is no trace of it
> having been sacked before. They already admitted this SACK block is
> incorrect but there is still disagreement about the "one-by-one acking"
> behaviour in general.
>
> My question is: is my interpretation correct? If so, is there an even
> less ambiguous statement somewhere that receiver is supposed to send one
> ACK for "everything they got so far" rather than acking the segments one
> by one? While reading the RFCs, I always considered this obvious but
> apparently some people may think otherwise.
>
> Thanks in advance,
> Michal Kubecek
>
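
To illustrate the RTO inflation described in the quoted message, here is a rough sketch of an RFC 6298 style estimator fed with polluted samples. This is not the Linux implementation: the class `RtoEstimator`, the function `simulate`, the 200 ms floor and the 50 ms path RTT are assumptions made for the example, and clock granularity is ignored. Only the 120 second cap comes from the report. The pollution model is also simplified: each sample is one timeout plus one real RTT, whereas the report describes samples that can include more than one timeout.

```python
class RtoEstimator:
    """Smoothed RTT / RTO estimator following the RFC 6298 update rules."""

    def __init__(self, min_rto=0.2, max_rto=120.0):
        self.srtt = None      # smoothed RTT, seconds
        self.rttvar = None    # RTT variation, seconds
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.rto = 1.0        # initial RTO before any measurement

    def sample(self, rtt):
        """Feed one RTT measurement (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt
        raw = self.srtt + 4 * self.rttvar
        self.rto = min(self.max_rto, max(self.min_rto, raw))
        return self.rto


def simulate(episodes, true_rtt=0.05):
    """Return the RTO after each loss episode with a polluted RTT sample."""
    est = RtoEstimator()
    # Warm up with clean samples so the estimator settles on the real RTT.
    for _ in range(20):
        est.sample(true_rtt)
    history = [est.rto]
    for _ in range(episodes):
        # The late ACK for a never-retransmitted segment makes the sender
        # measure roughly "one timeout plus one real RTT" as the sample.
        polluted = est.rto + true_rtt
        history.append(est.sample(polluted))
    return history


if __name__ == "__main__":
    for episode, rto in enumerate(simulate(40)):
        print(f"episode {episode:2d}: RTO = {rto:8.3f} s")
```

Because every polluted sample is larger than the RTO that produced it, each episode pushes the RTO higher until it hits the cap, which matches the stall described in the report.
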