Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.

David Lang Sat, 26 Jul 2014 17:50:26 -0700

On Sun, 27 Jul 2014, Sebastian Moeller wrote:

On Jul 27, 2014, at 00:53 , David Lang <da...@lang.hm> wrote:
On Sun, 27 Jul 2014, Sebastian Moeller wrote:
Hi David,

On Jul 26, 2014, at 23:45 , David Lang <da...@lang.hm> wrote:
On Sat, 26 Jul 2014, Sebastian Moeller wrote:
On Jul 26, 2014, at 22:39 , David Lang <da...@lang.hm> wrote:
by how much tuning is required, I didn't mean how frequently to tune, but how close default settings can come to the performance of an expertly tuned setup.
        Good question.
Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
the question I'm talking about below is how much do you lose compared to the ideal if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on it as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
As data talks, I just did a quick experiment with my ADSL2+ line at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of the link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead, into account). The broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual test is netperf-wrapper's RRUL (4 TCP streams up, 4 TCP streams down while measuring latency with ping and UDP probes). As you can see from the plot, just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased by one codel target of 5ms each, resulting in a modest latency increase of ~10ms with proper shaping for a total of ~65ms. With improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short, I would estimate that a slight misjudgment (15%) results in an almost 80% increase of latency under load. In other words, getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
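To make the ATM overhead concrete, here is a rough sketch of how much a packet grows on an ATM link. It relies only on the fixed ATM cell layout (53-byte cells carrying 48 bytes of payload, with the last cell padded). The function names and the `per_packet_overhead` parameter are illustrative assumptions, not part of SQM; the real per-packet overhead depends on the encapsulation in use and is not given in this thread.

```python
import math

ATM_CELL_BYTES = 53     # size of one ATM cell on the wire
ATM_PAYLOAD_BYTES = 48  # payload bytes carried by each cell


def atm_wire_bytes(ip_packet_bytes, per_packet_overhead=0):
    """Bytes sent on an ATM link for one IP packet.

    per_packet_overhead (assumed, link specific) is added before the
    packet is cut into cells; the last cell is padded to full size.
    """
    payload = ip_packet_bytes + per_packet_overhead
    cells = math.ceil(payload / ATM_PAYLOAD_BYTES)
    return cells * ATM_CELL_BYTES


def wire_expansion(ip_packet_bytes, per_packet_overhead=0):
    """Ratio of on-the-wire size to IP packet size."""
    return atm_wire_bytes(ip_packet_bytes, per_packet_overhead) / ip_packet_bytes


if __name__ == "__main__":
    for size in (64, 576, 1500):
        print(size, atm_wire_bytes(size), round(wire_expansion(size), 3))
```

Even with zero extra overhead a 1500 byte packet needs 32 cells, so a shaper that counts only IP bytes underestimates what the link actually carries, and the error is larger for small packets.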
what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
So I just disabled SQM and the plot looks almost exactly like the broken line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for > 1000ms, just as with the broken line; with proper shaping even extreme pings stay < 100ms). But as I said before, I need to run through my ISP supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree. Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
just so I understand this completely

you have

debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?
        Well more like:
Macbook with dubious bloat-state -> wifi to de-bloated cerowrt box that shapes the traffic -> ISP router -> ADSL -> internet -> server
I assume that Dave debloated these servers well, but it should not really matter, as the problem is the buffers on both ends of the bottleneck ADSL link.

right, I was forgetting that unless you are the bottleneck, you aren't buffering anything and so debloating makes no difference. In a case like yours where you can't debloat the actual bottleneck, the best that you can do is to artificially become the bottleneck by shaping the traffic. but on the download side it's much harder.

What are we aiming for? something that will show the problem clearly so that fixes can be put in the right place? or a work-around to use in the meantime?

I think both need to be pursued, but we need to be clear on what is being done for each one.

If having BQL+fq_codel with defaults would solve the problem if it was on the right routers, we need to show that.

Then, because we can't get the fixes on the right routers and need to work around the problem by artificially becoming the bottleneck, we need to show that the 95% that we shape to is throwing away 5% of your capacity and make that clear to the users.

otherwise we will risk getting to the point where it will never get fixed because the ISPs will look at their routers and say that bufferbloat can't possibly be a problem as they never have large queues (because we are doing the workarounds).

and are you measuring the latency impact when uploading or downloading?
No, I measure the latency impact of saturating both up- and downlink, pretty much the worst case scenario.


I think we need to test this in each direction independently.

Cerowrt can do a pretty good job of keeping the uplink from being saturated, but it can't do a lot for the downlink.

I think a lot of people would be happy with 95ms average pings on a loaded connection, even with occasional outliers.
No, that is too low an aim, this still is not usable for real time applications; we should aim for base RTT plus 10ms. (For very slow links we need to cut some slack, but for > 3Mbps 10ms should be achievable.)


perfect is the enemy of good enough.

There's achievable if every router is tuned to exactly the right conditions and there's achievable for coarse settings that can be widely deployed. Get the second out while continuing to work on making the first easier.

residential connections only come in a smallish number of sizes, it shouldn't be too hard to do a few probes and guess which size is in use, then set the bandwidth to 90% of that standard size and you should be pretty good without further tuning.
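A minimal sketch of that idea, assuming a throughput probe has already produced a measurement. The tier list and the function names are hypothetical placeholders for illustration; real ISP tiers differ, and how to probe is not specified here.

```python
# Hypothetical advertised rates in kbit/s; replace with the sizes
# actually sold in a given market.
STANDARD_TIERS_KBPS = [1000, 2000, 6000, 16000, 25000, 50000, 100000]


def guess_tier(measured_kbps, tiers=STANDARD_TIERS_KBPS):
    """Return the standard size closest to the measured throughput."""
    return min(tiers, key=lambda tier: abs(tier - measured_kbps))


def shaper_rate_kbps(measured_kbps, fraction=0.90, tiers=STANDARD_TIERS_KBPS):
    """Shaper setting: a fixed fraction of the guessed standard size."""
    return round(guess_tier(measured_kbps, tiers) * fraction)


if __name__ == "__main__":
    # A probe that measured 14.5 Mbit/s maps to the 16 Mbit/s size.
    print(guess_tier(14500), shaper_rate_kbps(14500))
```

Picking the nearest size is the simplest possible rule; a measurement taken on a busy or ATM-framed link will read low, so a real version would probably need to bias the guess upward.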

It's far better than sustained multi-second ping times which is what I've seen with stock setups.
True, but compared to multi seconds even <1000ms would be a really great improvement, but also not enough.
but if no estimate is this bad, how bad is it if you use as your estimate the 'rated' speed of your DSL (i.e. what the ISP claims they are providing you) instead of the fully accurate speed that includes accounting for ATM encapsulation?
Well ~95ms with outliers > 1000ms, just as bad as no estimate. I shaped 5% below rated speed as reported by the DSL modem, so disabling the ATM link layer adjustments (as shown in the broken lines in the plot) basically increased the effective shaped rate by ~13%, or to effectively 107% of line rate. Your proposal would be line rate and no link layer adjustments, or effectively 110% of line rate. (I do not feel like repeating this experiment right now, as I think the data so far shows that even with less misjudgment the bloat effect is fully visible.) Not accounting for ATM framing carries a ~10% cost in link speed, as ATM packet size on the wire increases by >= ~10%.

so what if you shape to 90% of rated speed (no allowance for ATM vs other transports)?

It's also worth figuring out if this problem would remain in place if you didn't have to go through the ISP router and were running fq_codel on that router.
If the DSL modem were debloated, at least on upstream no shaping would be required any more; but that does not fix the need for downstream shaping (and bandwidth estimation) until the head end gear is debloated.


right, I was forgetting this earlier.

As long as fixing bufferbloat involves esoteric measurements and tuning, it's not going to be solved, but if it could be solved by people flashing openwrt onto their DSL router and then using the defaults, it could gain traction fairly quickly.
But as there are only very few DSL modems with open sources (especially of the DSL chips) this is just as esoteric ;) Really, if equipment manufacturers could be convinced to take these issues seriously and actually fix their gear, that would be best. But this does not look like it is happening on the fast track. (Even DOCSIS developer CableLabs punted on requiring codel or fq_codel in DOCSIS modems since they think that the required timestamps are too “expensive” on the device class they want to use for modems. They opted for PIE, much better than what we have right now but far away from my latency under load increase of 10ms...)
I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are 'close'
Well, if we include outliers (and we should, as enough outliers will degrade the FPS and VoIP suitability of an otherwise responsive system quickly) stock and improper shaping are in the >1000ms worst case range, while proper SQM bounds this to 100ms.
assuming that latency under load without the improvements got >1000ms

fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000

        The sign seems off as fast < slow? I like this best ;)
yep, I reversed fast/slow in all of these
fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19
        But Fast < Slow and hence this ratio should be <0?
1 not 0, but yes, this is really slow/fast
slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05
        and this >0?
and this is really fast/slow
What about taking the latency difference and relating it to a reference time, like say the time a photon would take to travel once around the equator, or the earth’s diameter?

how about latency difference scaled by the time to send one 1500 byte packet at the measured throughput?

This would factor out the data rate and would not be affected by long distance links.
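A small sketch of that metric, under stated assumptions: the function names are made up for illustration, and the 12 Mbit/s throughput in the example is an arbitrary placeholder, not a number measured in this thread. The RTT values are the ~52ms idle and ~95ms loaded figures quoted above.

```python
def packet_time_ms(throughput_bps, packet_bytes=1500):
    """Time in ms to serialize one packet at the given throughput."""
    return packet_bytes * 8 * 1000.0 / throughput_bps


def latency_in_packet_times(loaded_rtt_ms, idle_rtt_ms, throughput_bps,
                            packet_bytes=1500):
    """Latency increase under load, in units of one packet time."""
    added_ms = loaded_rtt_ms - idle_rtt_ms
    return added_ms / packet_time_ms(throughput_bps, packet_bytes)


if __name__ == "__main__":
    # Placeholder throughput of 12 Mbit/s: one 1500 byte packet takes 1 ms.
    print(latency_in_packet_times(95, 52, 12_000_000))
```

Because the idle RTT is subtracted first, the path length drops out, and dividing by the packet time expresses the remaining delay as roughly how many full-size packets of queue sit ahead of a probe.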


David Lang

