Re: [Bloat] [Codel] The "Some Congestion Experienced" ECN codepoint - a new internet draft -

2019-03-11 Thread Dave Taht
On Mon, Mar 11, 2019 at 7:14 AM Bob Briscoe  wrote:
>
> Dave,
>
> L4S is far from dead. It's merely been working differently from how you're 
> used to. Those working on an L4S AQM (at least those in the cable industry) 
> had to have a private WG for the last ~18 months, but now we're allowed to 
> publish and talk openly again. Similarly, there's work under the covers on an 
> L4S AQM in switch hardware.

After traffic essentially vanished from the various ietf mailing
lists, and specs kept dropping, I assumed there was work being done
somewhere. I'm very glad that it's now more open.

We announced the ecn-sane project and its goals last August. If you
and/or your group cannot participate under those house rules please
suggest changes. And/or resurrect an appropriate list within the ietf.

https://www.bufferbloat.net/projects/ecn-sane/wiki/

>And I see external signs of work under covers on DSL access equipment (covers 
>that I am not under any longer).

It has certainly been my hope that the DSL folk would at least wake up
and implement BQL in their lowest level firmware for about 8 years
now. Free.fr basically did that in 2012 when they shipped fq_codel on
their revolution series of modems.

>
> Nonetheless, I think you will see updated Linux code for an L4S DualQ Coupled 
> AQM built against the mainline tree appear on netdev list today.

I am beyond delighted to finally have a chance to evaluate this. Have
you run any flent related tests through it yet?

Regrettably since this code posting is so close to netdevconf it's
going to be very difficult for me to do a comprehensive evaluation in
time to form an objective opinion. I'm busy on something else.

> ==In summary==
>
> The problem that the SCE draft identifies with TCP's sharp multiplicative 
> decrease is also the primary problem that L4S identified.

Yes, we have long been in agreement that some congestion signal should
be an earlier notification than drop.

> Like L4S, SCE requires changes to network, sender and receiver (see comment 
> later about the rcv-window approach hinted at in the SCE draft). But SCE is 
> just starting on its journey. Having to change end systems and network 
> together is really tough and takes many years.

Not really, the AQM portion of SCE could roll out very quickly across
all of the linux and freebsd universe, if consensus is achieved that
it's a good idea.

>
> It seems you're trying to do the same thing as L4S, but by slightly different 
> means. Before splitting the people involved in this into two factions, can 
> you say what you didn't like about the L4S approach in the first place? We've 
> been very careful to specify L4S broadly enough so that it can encompass many 
> different approaches within it.

I've read the 100+ pages of spec now multiple times over the years (and all
your work on ecn in general), and I hope that, once we get a bit of
time, we can do a detailed comparison of the two approaches.

But, honestly, based on the total inactivity on the tcp prague mailing
list, I had assumed that it had died, until recently.

>
> The only thing stated against L4S I can find is that it's taking a long time. 
> Starting an identically difficult approach now is going to restart the clock, 
> and take a lot lot longer.

SCE and the modifications to the relevant already IETF approved AQMs
are extremely straightforward, backward compatible approaches to
extending RFC3168 for all existing transports.

It's not an identically difficult approach at all.

Aside from some more detailed analysis of transport effects and the
inevitable debate in netdevconf and ietf, rolling out SCE, at least in
openwrt and linux, could happen by about June. Particularly
with the now more readily available source code to compare for the two
approaches, independent experts should be able to leap in and provide
feedback.

> ==2 output values vs. 2 input values.==
>
> We considered schemes where the AQM can use a second marking as a lower 
> strength /output/ (like VCP, my own QV and now SCE). But we worked out that 
> there were a wider range of advantages and much more significant performance 
> improvements from the sender using a second marking to distinguish how it 
> will behave (i.e. a second /input/ to the classifier in front of the AQM).
>
> Don't get me wrong. It's useful that you guys are putting the work in on SCE. 
> Then everyone can compare the two approaches (again), as a check on whether 
> that decision was correct. That's important, cos ECT(1) is the last useful 
> codepoint in the IP header. See: "Notification of Less Severe Congestion than 
> CE" at 
> https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-05#appendix-C.2 where 
> we've written:

Yep. I am glad that both of our cards are on the table.

>
>Before assigning ECT(1) as an identifer for L4S, we must carefully
>consider whether it might be better to hold ECT(1) in reserve for
>future standardisation of rapid flow acceleration, which is an
>important and 

2019-03-11 Thread Bob Briscoe

Dave,

L4S is far from dead. It's merely been working differently from how 
you're used to. Those working on an L4S AQM (at least those in the cable 
industry) had to have a private WG for the last ~18 months, but now 
we're allowed to publish and talk openly again. Similarly, there's work 
under the covers on an L4S AQM in switch hardware. And I see external 
signs of work under covers on DSL access equipment (covers that I am not 
under any longer).


Nonetheless, I think you will see updated Linux code for an L4S DualQ 
Coupled AQM built against the mainline tree appear on netdev list today.


==In summary==

The problem that the SCE draft identifies with TCP's sharp 
multiplicative decrease is also the primary problem that L4S identified.


Like L4S, SCE requires changes to network, sender and receiver (see 
comment later about the rcv-window approach hinted at in the SCE draft). 
But SCE is just starting on its journey. Having to change end systems 
and network together is really tough and takes many years.


It seems you're trying to do the same thing as L4S, but by slightly 
different means. Before splitting the people involved in this into two 
factions, can you say what you didn't like about the L4S approach in the 
first place? We've been very careful to specify L4S broadly enough so 
that it can encompass many different approaches within it.


The only thing stated against L4S I can find is that it's taking a long 
time. Starting an identically difficult approach now is going to restart 
the clock, and take a lot lot longer.


==2 output values vs. 2 input values.==

We considered schemes where the AQM can use a second marking as a lower 
strength /output/ (like VCP, my own QV and now SCE). But we worked out 
that there were a wider range of advantages and much more significant 
performance improvements from the sender using a second marking to 
distinguish how it will behave (i.e. a second /input/ to the classifier 
in front of the AQM).


Don't get me wrong. It's useful that you guys are putting the work in on 
SCE. Then everyone can compare the two approaches (again), as a check on 
whether that decision was correct. That's important, cos ECT(1) is the 
last useful codepoint in the IP header. See: "Notification of Less 
Severe Congestion than CE" at 
https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-05#appendix-C.2 
where we've written:


   Before assigning ECT(1) as an identifer for L4S, we must carefully
   consider whether it might be better to hold ECT(1) in reserve for
   future standardisation of rapid flow acceleration, which is an
   important and enduring problem [RFC6077].


==FQ-only vs. FQ or DualQ==

One of the problems with the 2 outputs approach (SCE etc.) is that it is 
only possible with per-flow queuing. I doubt you'll get the last useful 
codepoint in the IP header for just that. It's sort-of obvious that, if 
you try to implement SCE in a FIFO, you can only have one queue length 
for all the flows. Then legacy TCP flows that don't understand SCE would 
push the queue deeper to the CE threshold, ruining it for the flows that 
support SCE.
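
The FIFO problem described above can be sketched in a few lines. This is only an illustration, not code from the SCE draft: it assumes SCE is signalled by rewriting ECT(0) to the ECT(1) codepoint at a lower threshold than CE, and both threshold values are hypothetical.

```python
# RFC 3168 codepoints; SCE reuses the ECT(1) bit pattern (assumption stated above).
NOT_ECT, SCE, ECT0, CE = 0b00, 0b01, 0b10, 0b11

SCE_THRESHOLD_S = 0.001  # hypothetical lower threshold, in seconds
CE_THRESHOLD_S = 0.005   # hypothetical upper threshold, in seconds


def mark(ecn: int, queue_delay_s: float) -> int:
    """Return the outgoing ECN codepoint for a packet that saw queue_delay_s."""
    if ecn == NOT_ECT:
        return ecn  # a real AQM would drop instead; not modelled here
    if queue_delay_s > CE_THRESHOLD_S:
        return CE   # strong signal
    if queue_delay_s > SCE_THRESHOLD_S and ecn == ECT0:
        return SCE  # weaker, earlier signal
    return ecn


# Per-flow queue: an SCE-aware flow holding its own queue at 2 ms sees SCE.
own_queue_mark = mark(ECT0, 0.002)

# Shared FIFO: a legacy flow has pushed the single queue to 6 ms, so the
# SCE-aware flow's packets get CE and never see the gentler SCE signal.
shared_fifo_mark = mark(ECT0, 0.006)
```

With one queue there is only one delay value to compare against both thresholds, which is why the flow that ignores SCE decides which mark everyone receives.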


We worked out that the 2 inputs approach (L4S) is more generic - ie. it 
can be used with FQ or DualQ (multiple or just 2 queues).


For instance, you can modify fq_CoDel for senders that use ECT(1) to 
indicate that they support a small multiplicative decrease (L4S 
senders). You only need the following: Include the last bit of the ECN 
field with the flow ID when you do the classification for sfq. Then in 
the queues with ECN==X1, you instantiate a shallow threshold ECN AQM. 
This could be CoDel with a shallow 'target', but you also want it to 
respond immediately (zero 'interval'), so even a simple step at about 
1ms will work, but a random RED-like ramp on the /instantaneous/ queue 
is much better.
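
A minimal sketch of the modification just described, in Python rather than kernel C. It is not the fq_codel source: the bucket count, the hash, and the ramp endpoints are hypothetical, and only the 1 ms step comes from the text.

```python
import random
import zlib

# RFC 3168 ECN codepoints (two low-order bits of the TOS / Traffic Class byte).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

NUM_BUCKETS = 1024        # hypothetical number of hashed flow queues
STEP_THRESHOLD_S = 0.001  # the "simple step at about 1ms"


def is_x1(ecn: int) -> bool:
    """True when the last bit of the ECN field is 1 (ECT(1) or CE)."""
    return (ecn & 0b01) == 1


def classify(flow_id: tuple, ecn: int) -> tuple:
    """Pick a queue from the flow ID plus the last bit of the ECN field."""
    bucket = zlib.crc32(repr(flow_id).encode()) % NUM_BUCKETS
    return (bucket, is_x1(ecn))


def step_mark(ecn: int, sojourn_s: float) -> int:
    """Shallow step AQM for an ECN==X1 queue: mark CE as soon as the step is passed."""
    if is_x1(ecn) and sojourn_s > STEP_THRESHOLD_S:
        return CE
    return ecn


def ramp_probability(sojourn_s: float, min_s: float = 0.0005, max_s: float = 0.002) -> float:
    """RED-like linear ramp on the instantaneous delay (endpoints are hypothetical)."""
    if sojourn_s <= min_s:
        return 0.0
    if sojourn_s >= max_s:
        return 1.0
    return (sojourn_s - min_s) / (max_s - min_s)


def ramp_mark(ecn: int, sojourn_s: float) -> int:
    """Mark CE at random with the ramp probability instead of at a hard step."""
    if is_x1(ecn) and random.random() < ramp_probability(sojourn_s):
        return CE
    return ecn
```

Note that CE also has its last bit set, so a packet already marked CE upstream lands in the X1 queue whichever kind of sender it came from.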


==Re-purposing the Receive Window?==

Receiver congestion control using the receive window may seem like a 
useful stop-gap, but it runs counter to how nearly all of today's transport 
protocols are intended to work (except, I know of a LEDBAT-like example 
from Microsoft Research). So you will have your work cut out proving 
that it is stable and that the two ends don't fight, etc. If you think 
L4S is taking years, you will find that takes longer. There is current 
research on this that I can point you to, if you want.


That's why we chose an approach that had a pre-existing widely deployed 
existence proof (DCTCP) to start from.
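
To make the contrast between a sharp and a small multiplicative decrease concrete, here is a rough sketch of the two sender responses. The DCTCP formulas are written from my reading of RFC 8257 (including the gain g = 1/16); treat the function names and the example numbers as illustrative assumptions.

```python
def classic_response(cwnd: float) -> float:
    """Classic response to a CE mark: halve the window, as for a loss."""
    return cwnd / 2


def dctcp_update_alpha(alpha: float, marked: int, total: int, g: float = 1 / 16) -> float:
    """Moving average of the fraction of CE-marked packets per window."""
    fraction = marked / total if total else 0.0
    return (1 - g) * alpha + g * fraction


def dctcp_response(cwnd: float, alpha: float) -> float:
    """DCTCP-style response: reduce in proportion to the extent of marking."""
    return cwnd * (1 - alpha / 2)


# With 10% of packets marked (alpha = 0.1), a 100-packet window shrinks to
# about 95 packets rather than 50; with alpha = 1.0 the two responses coincide.
gentle = dctcp_response(100.0, 0.1)
sharp = classic_response(100.0)
```

The smaller the reduction per mark, the more often the AQM has to mark, which is why this style of sender goes together with a shallow marking threshold.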


IETF groups like rmcat explicitly decided early on to require the 
approach where the receiver is a dumb reflector, then new sender 
congestion control algorithms can be deployed unilaterally. The argument 
was that the feedback function can be thought of as a sub-layer below 
the congestion control function. The ongoing addition of accurate ECN 
feedback to TCP and to QUIC also takes the dumb reflector approach. And 
RTCP already does it that way.
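
The dumb-reflector split can be sketched as two small classes. This is an illustration of the layering only: the class names and the counter-pair feedback are hypothetical and do not reproduce the AccECN, QUIC or RTCP wire formats.

```python
CE = 0b11  # RFC 3168 Congestion Experienced codepoint


class Reflector:
    """Receiver side: counts what arrived and echoes it, with no policy of its own."""

    def __init__(self) -> None:
        self.packets = 0
        self.ce_packets = 0

    def on_packet(self, ecn: int) -> None:
        self.packets += 1
        if ecn == CE:
            self.ce_packets += 1

    def feedback(self) -> tuple:
        """Cumulative counters, so a lost feedback message loses no information."""
        return (self.packets, self.ce_packets)


class Sender:
    """Sender side: all congestion control logic lives here and can change unilaterally."""

    def __init__(self) -> None:
        self.last = (0, 0)

    def on_feedback(self, fb: tuple) -> float:
        """Return the fraction of packets CE-marked since the previous feedback."""
        new_packets = fb[0] - self.last[0]
        new_ce = fb[1] - self.last[1]
        self.last = fb
        return new_ce / new_packets if new_packets else 0.0
```

Because the receiver only reports counts, a new sender algorithm can interpret the same feedback differently without any change at the far end.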


==ECN 

2019-03-11 Thread Dave Taht
Everybody, calm down. I put this out merely to get comment before we
submitted the first of several drafts. That draft is now submitted and
we've asked for a talk slot in the tsvwg for it. I cc'd the world to
get quick initial feedback, and I want to shut this overbroad
conversation down and move it to just the ecn-sane mailing list.

The l4s mailing list is dead, and the debates on the AQM mailing list and here
have been unhelpful - for decades. So, back in August I started a new working
group here, under house rules that I thought would be more productive,
and asked that people who wanted to debate ecn more sanely join. Few
did.

And Jon and I have been working for months (and largely not on the
list) to try and create a compromise proposal of which y'all just saw
the first output. There's more in the bufferbloat-rfcs repo.

The rules for joining the ecn-sane list are simple - take the time to
step back and write a short position paper, and join (or
create) a team. You needn't do that immediately. If you disagree with
the rules of operation of the ecn-sane working group, submit a pull
request or file a bug on the web site, where we can discuss it.

Ironically our ssl cert just expired and I don't remember how to fix it.

Please join the ecn-sane mailing list for discussing this stuff and
stop cc-ing the whole bufferbloat.net world on it, please.