Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat

Richard Scheffenegger Sun, 08 May 2011 05:37:09 -0700


Hi Fred,

Goodput can really only be measured at the sender; by definition, any retransmitted packet will reduce goodput vs. throughput. In your example, where each segment is retransmitted once, goodput would be - at most - 0.5, not 1.0... IMHO, defining the data volume after the bottleneck by itself as goodput is also a bit short-sighted, because a good fraction of that data may ultimately still be discarded by TCP for numerous reasons (e.g., legacy go-back-N RTO recovery by the sender)...
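
As a rough illustration of the sender-side definition above, the sketch below computes goodput as the share of transmitted bytes that was new data. The function name, the segment size, and the segment count are assumptions made for the example only; they are not taken from any trace.

```python
def goodput_fraction(unique_bytes_delivered: int, total_bytes_sent: int) -> float:
    """Share of the sender's transmitted bytes that was new (non-retransmitted) data."""
    if total_bytes_sent <= 0:
        raise ValueError("total_bytes_sent must be positive")
    return unique_bytes_delivered / total_bytes_sent


# Hypothetical transfer: 100 segments of 1448 bytes, each one sent twice
# (original transmission plus exactly one retransmission).
segment_size = 1448
segments = 100
unique = segments * segment_size
sent = 2 * unique

print(goodput_fraction(unique, sent))  # 0.5
```

Any further retransmission (for example a lost retransmission recovered by RTO) only increases the denominator, which is why 0.5 is an upper bound in that example.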

Measuring at the receiver (or in-path network) side, on a SACK-enabled session, will miss all the instances where the last segment (or a number of segments running up to and including the last) was lost, or where a retransmitted segment was lost twice.

The former can be approximated by checking the RTOs (which would already require some heuristic to come up with a good approximation of what the sender's RTO timeout is likely to be - the 1 sec minRTO prescribed by the IETF RFC is virtually never used). In the latter case, where retransmitted segments are also lost, you can only infer the sender's behavior indirectly from a receiver-side (or in-path) trace, again because lost retransmission detection is done by one stack (Linux) but not by the others, and RTOs can again not be avoided under all circumstances.

But back to my original question: When looking at modern TCP stacks with TSO, if bufferbloat allows the sender's cwnd to grow beyond the thresholds which allow the aggressive use of TSO (64 kB or even 256 kB of data allowed in the sender's cwnd), the effective sending rate of such a burst will be wirespeed (no interleaving segments of other sessions). As pointed out in other mails in this thread, if the bottleneck then has 1/10th the capacity of the sender's wire (and is potentially shared among multiple senders), at least 90% of all the sent data of such a TSO segment train will be dropped in a single burst of loss... With proper AQM, and some (single segment) loss earlier, cwnd may never grow to trigger TSO in that way, and the goodput (1 segment lost out of 64 kB of data, vs. 58 kB lost out of 64 kB of data) is obviously shifted extremely in favor of the scenario with AQM...
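
The 90% figure can be reproduced with a back-of-the-envelope model. The sketch below assumes a tail-drop bottleneck whose queue is already full when the burst arrives, and a burst that arrives back-to-back at the sender's wire speed; the function name, its parameters, and the 1 Gbit/s and 100 Mbit/s rates are illustrative assumptions, not measurements.

```python
def burst_drop_fraction(burst_bytes: float,
                        sender_rate_bps: float,
                        bottleneck_rate_bps: float,
                        free_queue_bytes: float = 0.0) -> float:
    """Fraction of a back-to-back burst dropped at a tail-drop bottleneck.

    Assumption: the bottleneck can only accept what it forwards while the
    burst is arriving, plus whatever buffer space is still free.
    """
    if bottleneck_rate_bps >= sender_rate_bps:
        return 0.0  # the bottleneck keeps up with the burst
    # Bytes the bottleneck forwards during the burst's arrival time.
    drained = burst_bytes * bottleneck_rate_bps / sender_rate_bps
    accepted = min(burst_bytes, drained + free_queue_bytes)
    return (burst_bytes - accepted) / burst_bytes


# 64 kB TSO burst, sender at 1 Gbit/s, bottleneck at 1/10th of that, queue full.
fraction = burst_drop_fraction(64 * 1024, 1_000_000_000, 100_000_000)
print(f"{fraction:.2f}")  # 0.90
```

With other senders sharing the bottleneck, the share forwarded for this flow during the burst is smaller still, so 90% is a lower bound under these assumptions.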

So, qualitatively, an ISP with proper AQM should be able to achieve better goodput (downloads from upstream or uploads to the upstream ISP). However, pricing is typically based on the data volume exchanged - if goodput is lower, an inversely higher volume is necessary to achieve the same "real" data exchange.

However, the next question becomes how to quantify this on a large scale - if the monetary difference is, say, in the vicinity of 2-3% saved (average internet loss ratio), that accumulates to huge sums for small / medium ISPs (which get charged more per volume than large ISPs).

If the quantitative difference is only 0.02-0.05%, say, then the incentive to enable AQM in small ISPs is not really there in monetary terms (and these ISPs would have to be motivated by other, typically much weaker incentives).
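
To put rough numbers on the two cases, the sketch below converts a loss ratio into the extra billed volume needed for the same "real" data exchange. It assumes, as a simplification, that the goodput fraction equals 1 minus the loss ratio and that all lost packets were already counted by the upstream volume meter; the function name is hypothetical.

```python
def extra_volume_fraction(loss_ratio: float) -> float:
    """Extra billed volume, relative to the real data, for a given loss ratio.

    Assumption: goodput fraction = 1 - loss_ratio, so the volume needed
    to move the same real data scales with 1 / goodput.
    """
    if not 0.0 <= loss_ratio < 1.0:
        raise ValueError("loss_ratio must be in [0, 1)")
    goodput = 1.0 - loss_ratio
    return 1.0 / goodput - 1.0


# The two ranges discussed above: 2-3% and 0.02-0.05%.
for loss in (0.02, 0.03, 0.0002, 0.0005):
    print(f"loss {loss:.2%}: extra volume {extra_volume_fraction(loss):.3%}")
```

For small loss ratios the extra volume is nearly equal to the loss ratio itself, so the monetary question reduces to which of the two ranges an ISP actually sees.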


Best regards,
  Richard

----- Original Message -----
From: "Fred Baker" <fredbaker...@gmail.com>
To: "Jim Gettys" <j...@freedesktop.org>
Cc: <bloat@lists.bufferbloat.net>
Sent: Friday, May 06, 2011 6:18 AM
Subject: Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat

There are a couple of ways to approach this, and they depend on your network model.
In general, if you assume that there is one bottleneck, losses occur in the queue at the bottleneck, and are each retransmitted exactly once (not necessary, but helps), goodput should approximate 100% regardless of the queue depth. Why? Because every packet transits the bottleneck once - if it is dropped at the bottleneck, the retransmission transits the bottleneck. So you are using exactly the capacity of the bottleneck.
The value of a shallow queue is to reduce RTT, not to increase or decrease goodput. cwnd can become too small, however; if it is possible to set cwnd to N without increasing queuing delay, and cwnd is less than N, you're not maximizing throughput. When cwnd grows above N, it merely increases queuing delay, and therefore bufferbloat.
If there are two bottlenecks in series, you have some probability that a packet transits one bottleneck and doesn't transit the other. In that case, there is probably an analytical way to describe the behavior, but it depends on a lot of factors including distributions of competing traffic. There are a number of other possibilities; imagine that you drop a packet, there is a SACK, you retransmit it, the ACK is lost, and meanwhile there is another loss. You could easily retransmit the retransmission unnecessarily, which reduces goodput. The list of silly possibilities goes on for a while, and we have to assume that each has some probability of happening in the wild.
On May 5, 2011, at 9:01 AM, Jim Gettys wrote:
On 04/30/2011 03:18 PM, Richard Scheffenegger wrote:
I'm curious, has anyone done some simulations to check if the following qualitative statement holds true, and if so, what the quantitative effect is:
With bufferbloat, the TCP congestion control reaction is unduly delayed. When it finally happens, the TCP stream is likely facing a "burst loss" event - multiple consecutive packets get dropped. Worse yet, the sender with the lowest RTT across the bottleneck will likely start to retransmit while the (tail-drop) queue is still overflowing.
And a lost retransmission means a major setback in bandwidth (except for Linux with bulk transfers and SACK enabled), as the standard (RFC documented) behaviour asks for an RTO (1 sec nominally, 200-500 ms typically) to recover such a lost retransmission...
The second part (actually more important as an incentive to the ISPs): how does the fraction of goodput vs. throughput change when AQM schemes are deployed and TCP CC reacts in a timely manner? Small ISPs have to pay for their upstream volume, regardless of whether that is "real" work (goodput) or unnecessary retransmissions.
When I was at a small cable ISP in Switzerland last week, sure enough bufferbloat was readily observable (17 ms -> 220 ms after 30 sec of a bulk transfer), but at first they had the "not our problem" view, until I started discussing burst loss / retransmissions / goodput vs. throughput - with the last point being a real commercial incentive to them. (They promised to check if AQM would be available in the CPE / CMTS, and to put latency bounds in their tenders going forward).
I wish I had a good answer to your very good questions. Simulation would be interesting, though real data is more convincing.
I haven't looked in detail at all that many traces to try to get a feel for how much bandwidth waste there actually is, and more formal studies like Netalyzr, SamKnows, or the Bismark project would be needed to quantify the loss on the network as a whole.
I did spend some time last fall with the traces I've taken. In those, I've typically been seeing 1-3% packet loss in the main TCP transfers. On the wireless trace I took, I saw 9% loss, but whether that is bufferbloat-induced loss or not, I don't know (the data is out there for those who might want to dig). And as you note, the losses are concentrated in bursts (probably due to the details of Cubic, so I'm told).
I've had anecdotal reports (and some first-hand experience) of much higher loss rates, for example from Nick Weaver at ICSI; but I believe in playing things conservatively with any numbers I quote, and I've not gotten consistent results when I've tried, so I just report what's in the packet captures I did take.
A phenomenon that could be occurring is that during congestion avoidance (until TCP loses its cookies entirely and probes for a higher operating point) TCP is carefully timing its packets to keep the buffers almost exactly full, so that competing flows (in my case, simple pings) are likely to arrive just when there is no buffer space to accept them, and therefore you see higher losses on them than you would on the single flow I've been tracing and getting loss statistics from.
People who want to look into this further would be a great help.
               - Jim


_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat
