Hi,

A few answers below:
> On Nov 27, 2018, at 9:10 PM, Dave Taht <dave.t...@gmail.com> wrote:
>
> On Mon, Nov 26, 2018 at 1:56 PM Michael Welzl <mich...@ifi.uio.no> wrote:
>>
>> Hi folks,
>>
>> That “Michael” dude was me :)
>>
>> About the stuff below, a few comments. First, an impressive effort to
>> dig all of this up - I also thought that this was an interesting
>> conversation to have!
>>
>> However, I would like to point out that thesis defense conversations
>> are meant to be provocative, by design - when I said that CoDel doesn’t
>> usually help and long queues would be the right thing for all
>> applications, I certainly didn’t REALLY REALLY mean that. The idea was
>> just to be thought-provoking - and indeed I found this interesting:
>> e.g., if you think about a short HTTP/1 connection, a large buffer just
>> gives it a greater chance to get all packets across, and the perceived
>> latency from the reduced round-trips after not dropping anything may in
>> fact be less than with a smaller (or CoDel’ed) buffer.
>
> I really did want Toke to have a hard time. Thanks for putting his
> back against the wall!
>
> And I'd rather this be a discussion of Toke's views... I do tend to
> think he thinks FQ solves more than it does... and I wish we had a
> sound analysis as to why 1024 queues work so much better for us than 64
> or fewer on the workloads we have. I tend to think in part it's because
> that acts as a 1000x1 rate-shifter - but should it scale up? Or down?
> Is what we did with cake (1024 set-associative) useful? Or excessive?
> I'm regularly seeing 64,000 queues on 10Gig and up hardware due to 64
> hardware queues and fq_codel on each, on that sort of gear. I think
> that's too much and renders the AQM ineffective, but I lack data...
>
> but, to rant a bit...
>
> While I tend to believe FQ solves 97% of the problem, AQM 2.9% and
> ECN .09%.

I think the sparse flow optimization bit plays a major role in FQ_CoDel.
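To make that concrete, here is a toy sketch of the new/old flow lists - just an illustration of the scheduling idea, not the kernel code; it ignores the DRR byte quantum and CoDel itself, and all names are mine:

```python
from collections import deque

NUM_QUEUES = 1024  # fq_codel's default number of flow queues


class FqSketch:
    """Toy model of fq_codel's DRR++ scheduling: flows that never build a
    backlog ("sparse" flows) sit in new_flows and are served before the
    bulk flows waiting in old_flows."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_QUEUES)]
        self.new_flows = deque()  # sparse/new flows, served first
        self.old_flows = deque()  # backlogged (bulk) flows

    def enqueue(self, flow_hash, packet):
        idx = flow_hash % NUM_QUEUES
        q = self.queues[idx]
        if not q and idx not in self.new_flows and idx not in self.old_flows:
            # flow was idle -> it enters the fast lane
            self.new_flows.append(idx)
        q.append(packet)

    def dequeue(self):
        for flows in (self.new_flows, self.old_flows):
            while flows:
                idx = flows.popleft()
                q = self.queues[idx]
                if not q:
                    continue  # flow drained earlier; forget it
                pkt = q.popleft()
                if q:
                    # still backlogged -> demoted to the back of old_flows,
                    # so sparse flows keep overtaking bulk ones
                    self.old_flows.append(idx)
                return pkt
        return None
```

Even in the toy the effect is visible: a one-packet flow arriving behind a bulk flow waits for at most one bulk packet, not the whole backlog.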
> BUT: Amdahl's law says once you reduce one part of the problem to 0,
> everything else takes 100%. :)
>
> It often seems like me, being the sole and very lonely FQ advocate
> here in 2011, have reversed the situation (in this group!), and I'm
> oft the AQM advocate *here* now.

Well, I’m with you - I do agree that an AQM is useful! It’s just that
there are not SO many cases where a single flow builds a standing queue
only for itself and this really matters for that particular application.
But these cases absolutely do exist! (and several examples were
mentioned - also the VPN case etc.)

> It's sort of like all the people quoting the e2e argument still, back
> at me when Dave Reed (at least, and perhaps the other co-authors now)
> have bought into this level of network interference between the
> endpoints, and had no religion - or the "RED in a different light"
> paper being rejected because it attempted to overturn other religion -
> and I'll be damned if I'll let fq_codel, sch_fq, pie, l4s, scream, nada,
>
> I admit to getting kind of crusty and set in my ways, but so long as
> people put code in front of me along with the paper, I still think,
> when the facts change, so do my opinions.
>
> Pacing is *really impressive* and I'd like to see that enter
> everything, not just in packet processing - I've been thinking hard
> about the impact of CPU bursts (like resizing a hash table), and other
> forms of work that we currently do on computers that have a
> "dragster-like" peak performance, and a great average, but horrible
> pathologies - and I think the world would be better off if we built
> more

+1

> Anyway...
>
> Once you have FQ and a sound outer limit on buffer size (100ms),
> depredations like Comcast's 680ms buffers no longer matter. There's
> still plenty of room to innovate. BBR works brilliantly vs fq_codel
> (and you can even turn ECN on, which it doesn't respect, and still get
> a great result).
> LoLa would probably work well also, 'cept that the git tree was busted
> when I last tried it and it hasn't been tested much in the 1Mbit-1Gbit
> range.
>
>> But corner cases aside, in fact I very much agree with the answers to
>> my question Pete gives below, and also with the points others have
>> made in answering this thread. Jonathan Morton even mentioned ECN -
>> after Dave’s recent over-reaction to ECN I made a point of not
>> bringing up ECN *yet* again
>
> Not going to go into it (much) today! We ended up starting another
> project on ECN that operates under my core ground rule - "show me the
> code" - and life over there and on that mailing list has been
> pleasantly quiet. https://www.bufferbloat.net/projects/ecn-sane/wiki/
>
> I did get back on the tsvwg mailing list recently because of some
> ludicrously inaccurate misstatements about fq_codel. I also made a
> strong appeal to the L4S people to, in general, "stop thanking me" in
> their documents. To me that reads as an endorsement, where all I did
> was participate in the process until I gave up and hit my "show me the
> code" moment - which was about 5 years ago, and it hasn't moved the
> needle since except in mutating standards documents.
>
> The other document I didn't like was an arbitrary attempt to just set
> the ECN backoff figure to 0.8 when the sanest thing, given the
> deployment, and pacing... was to aim for a right number - anyway.....
> in that case I just wanted off the "thank you" list.

So let’s draw a line between L4S and “the other document you didn’t
like”, which was our ABE. L4S is a more drastic attempt at getting
things right. I haven’t been contributing to it much; I like it for what
it’s trying to achieve, but I don’t have a strong opinion on it. Myself,
I thought that much smaller changes might have a better chance of
getting the incentives right, to support ECN deployment - which was the
change to 0.8.
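For reference, the ABE change really is tiny - roughly this (a Python sketch with the draft's suggested factors; real stacks of course apply this to ssthresh/cwnd in segments, and the exact beta_ecn is the part under discussion):

```python
def abe_backoff(cwnd, signal):
    """Multiplicative decrease per ABE
    (draft-ietf-tcpm-alternativebackoff-ecn): an ECN mark means an AQM
    signalled early, before the queue filled, so the sender can afford
    to back off less than it does for loss."""
    BETA_LOSS = 0.5  # classic NewReno multiplicative decrease on loss
    BETA_ECN = 0.8   # ABE's gentler decrease on an ECN-CE mark
    if signal == "loss":
        return max(2, cwnd * BETA_LOSS)
    if signal == "ecn-ce":
        return max(2, cwnd * BETA_ECN)
    return cwnd
```

The incentive argument is just that: with a shallow-target AQM like CoDel doing the marking, the gentler backoff keeps utilization up, so enabling ECN actually pays off for the sender.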
Looking at our own document again, I am surprised to see that you are
indeed in our acknowledgement list:
https://tools.ietf.org/html/draft-ietf-tcpm-alternativebackoff-ecn-12
We added everyone who we thought made useful suggestions - it wasn’t
meant as a sign of endorsement. But, before RFC publication, there is
still an opportunity to remove your name. => I apologize and will remove
you.

> I like to think the more or less RFC 3168-compliant deployment of ECN
> is thus far going brilliantly, but I lack data. Certainly would like a
> hostile reviewer's evaluation of cake's ECN method and, for that
> matter, PIE's, honestly - from real traffic! There's an RFC-compliant
> version of PIE being pushed into the kernel after it gets through some
> of Stephen's nits.
>
> And I'd really prefer all future discussions of "ecn benefits" to come
> with code and data and be discussed over on the ecn-sane mailing list,
> or *not discussed here* if no code is available.

You keep complaining about lack of code. At least for ABE:
- I think the code is in FreeBSD now
- There is a slightly older Linux patch
I agree it would be nice to continue with this code… I don’t have
someone doing this right now. Anyway, all code, along with measurement
results, is available from:
http://heim.ifi.uio.no/michawe/research/projects/abe/index.html

>> , but… yes indeed, being able to use ECN to tell an application to
>> back off instead of requiring to drop a packet is also one of the
>> benefits.
>
> One thus far misunderstood and under-analyzed aspect of our work is
> the switch to head dropping.
>
> To me, the switch to head dropping essentially killed the tail-loss
> RTO problem and eliminated most of the need for ECN.
I doubt that: TCP will need to retransmit that packet at the head, and
that takes an RTT - all the packets after it will need to wait in the
receiver buffer before the application gets them. But I don’t have
measurements to prove my point, so I’m just hand-waving...

> Forward progress and prompt signalling always happens. That otherwise
> wonderful piece Stuart Cheshire did at Apple elided the actual
> dropping-mode version of fq_codel, which as best as I recall was about
> 12? 15ms? long and totally invisible to the application.
>
>> (I think people easily miss the latency benefit of not dropping a
>> packet, and thereby eliminating head-of-line blocking - packet drops
>> require an extra RTT for retransmission, which can be quite a long
>> time. This is about measuring latency at the right layer...)
>
> See above. And yea, perversely, I agree with your last statement.

Perversely? Come on :)

> A Slashdot web page download takes 78 separate flows and 2.2 seconds
> to complete. Worst-case completion time - if you had *tail* loss -
> would be about 80ms longer than that, on a tiny fraction of loads. The
> rest of it is absorbed into those 2.2 seconds.

Yes - and these separate flows get their own buckets in FQ_CoDel. Which
is great - just not much effect from CoDel there. But I’m NOT arguing
that per-flow AQM is a bad thing, absolutely not!

> EVEN with HTTP/2.0 I would be extremely surprised to learn that many
> websites fit it all into one TCP transaction.
>
> There are very few other examples of TCP traffic requiring a
> low-latency response. I happen to be very happy with the ECN support
> in mosh, btw, not that anybody's ever looked at it since we did it.
>
> And I'd really prefer all future discussions of "ecn benefits" to come
> with code and data and be discussed over on the ecn-sane mailing list,
> or not discussed here if no code is available.
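Since I admitted to hand-waving, here is the hand-waving in code form - a toy back-of-envelope model (my assumptions, no measurements: SACK/dupacks trigger fast retransmit, 200 ms minimum RTO) of how long it takes the sender to even learn about a drop:

```python
def time_to_congestion_signal(rtt_ms, queue_ms, drop_at, more_data):
    """Toy estimate of when the sender learns a packet was dropped.
    drop_at: "head" (CoDel-style) or "tail" (classic full-buffer drop).
    more_data: whether packets follow the dropped one (if not, it's
    tail loss and only the retransmission timer fires)."""
    MIN_RTO_MS = 200  # assumed RTO floor, as in Linux
    if not more_data:
        # nothing behind the drop to generate dupacks: tail-loss RTO,
        # plus the queue the dropped packet sat behind if tail-dropped
        return MIN_RTO_MS + (queue_ms if drop_at == "tail" else 0)
    if drop_at == "head":
        # packets already queued behind the dropped one leave
        # immediately and trigger dupacks: roughly one path RTT
        return rtt_ms
    # tail drop: the dupack-generating packet first has to wait out
    # the standing queue
    return rtt_ms + queue_ms
```

On a 20 ms path with a bloated 600 ms queue, the model says head drop signals in about 20 ms while tail drop takes about 620 ms - which is both your point (head drop signals promptly) and mine (the retransmission itself still costs an RTT on top of that).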
>> BTW, Anna Brunstrom was also very quick to give me the HTTP/2.0
>> example in the break after the defense. Also, TCP will generally not
>> work very well when queues get very long… the RTT estimate gets way
>> off.
>
> I like to think that the SYN/ACK and SSL negotiation handshake under
> fq_codel gives a much more accurate estimate of actual RTT than we
> ever had before.

Another good point - this is indeed useful!

>> All in all, I think this is a fun thought to consider for a bit, but
>> not really something worth spending people’s time on, IMO: big
>> buffers are bad, period. All else are corner cases.
>
> I've said it elsewhere, and perhaps we should resume, but an RFC
> merely stating the obvious about maximal buffer limits and getting
> ISPs to do that would be a boon.
>
>> I’ll use the opportunity to tell folks that I was also pretty
>> impressed with Toke’s thesis as well as his performance at the
>> defense. Among the many cool things he’s developed (or contributed
>> to), my personal favorite is the airtime fairness scheduler. But
>> there were many more. Really good stuff.
>
> I so wish the world had about 1000 more Tokes in training. How can we
> make that happen?

I don’t know… in academia, the mix of really contributing to the kernel
on the one side and getting academic results on the other is a rare
thing. Not that we advisors (at least the people I consider friends)
would be against that! But it's not easy to find someone who can pull
this off.

Cheers,
Michael
_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat