Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

2019-03-24 Thread Bob Briscoe

Roland,

On 22/03/2019 13:53, Bless, Roland (TM) wrote:

Hi Bob,

see inline.

Am 21.03.19 um 14:24 schrieb Bob Briscoe:

On 21/03/2019 08:49, Bless, Roland (TM) wrote:

Hi,

Am 21.03.19 um 09:02 schrieb Bob Briscoe:

Just to rapidly reply,


On 21/03/2019 07:46, Jonathan Morton wrote:

The ECN field was never intended to be used as a classifier, except to
distinguish Not-ECT flows from ECT flows (which a middlebox does need
to know, to choose between mark and drop behaviours).  It was intended
to be used to convey congestion information from the network to the
receiver.  SCE adheres to that ideal.

Each PHB has a forwarding behaviour, a DSCP re-marking behaviour and an
ECN marking behaviour. The ECN field is the classifier for the ECN
marking behaviour.

That's exactly the reason why using ECT(1) as a classifier for L4S
behavior is not the right choice. L4S should use a DSCP for
classification, because it is actually defining a PHB.

1/ First Terminology
The definition of 'PHB' includes the drop or ECN-marking behaviour. For
instance, you see this in WRED or in PCN (Pre-Congestion Notification).
If you want to solely talk about scheduling, pls say the scheduling PHB.

I thought that I'm well versed with Diffserv terminology, but I'm not
aware that a Diffserv PHB requires the definition of an ECN marking
behavior.
Ah well, I do not think that any living human has visited all the dark 
corners of the Diffserv world.


I didn't mean (or say) that a Diffserv PHB /requires/ an ECN marking 
behaviour, just that if you are talking about a Diffserv PHB that 
includes an ECN marking behaviour, it helps to say when you are solely 
talking about the scheduling part of the PHB.


In many cases when there is no ECN marking behaviour, it makes no 
difference if you omit the word scheduling, cos that is all there is to 
the behaviour.



In fact ECN is orthogonal to Diffserv as both RFCs 2474 and
2475 do not even mention ECN. RFC 2475:
"A per-hop behavior (PHB) is a description of the externally
observable forwarding behavior of a DS node applied to a particular
DS behavior aggregate." and "Useful behavioral distinctions
are mainly observed when multiple behavior aggregates compete for
buffer and bandwidth resources on a node."
Even the original experimental ECN spec RFC2481 was published just after 
2474 & 2475. So you wouldn't expect the original Diffserv specs to 
mention something that didn't exist then.




Usually, there are different mechanisms for implementing a PHB,
e.g., for EF one could use a tail drop queue and Simple Priority
Queueing, Weighted Fair Queueing, or Deficit Round Robin and so
on. Consequently, queueing and scheduling behavior are used to
_implement_ a PHB, i.e., IMHO it makes sense to distinguish between
the PHB as externally observable behavior and a specific _PHB
implementation_ as also pointed out in RFC2475:
PHBs are implemented in nodes by means of some buffer management and
packet scheduling mechanisms.  PHBs are defined in terms of behavior
characteristics relevant to service provisioning policies, and not in
terms of particular implementation mechanisms.


So some of the Diffserv PHBs do _not_ require using an AQM,
which is often the basis for ECN marking, e.g., for EF
tail drop should be sufficient. For other PHBs it may be
useful to say something about ECN usage (as I did for LE).

RFC 2475:

PHBs may be specified in terms of their resource (e.g., buffer,
bandwidth) priority relative to other PHBs, or in terms of their
relative observable traffic characteristics (e.g., delay, loss).
Since RFC2474 & 2475, AQM behaviour and/or ECN marking behaviour has 
become part of some Diffserv PHBs. E.g. WRED in AF. See any of the 
tables in RFC4594 that have an AQM column.


The need for ECN marking behaviour (rather than just AQM behaviour in 
general) as part of a PHB arose during the definition of PCN. 
Jo Babiarz and Kwok Ho Chan were both authors of 4594 and of many of the 
PCN specs, and proposed the term 'marking behaviour' as part of the PHB. 
You will find ECN marking behaviours are central to, for instance, RFC5670.


As I've pointed out already, the transitions used by SCE were already in 
the PCN baseline encoding spec [RFC6660], except only defined if 
accompanied by a specific DSCP (which was subsequently standardized as 
EF-ADMIT).





I think that L4S therefore specifies such a PHB as it is defined
in relation to the default PHB (as in the L4S arch draft
"Classic service").
No. The L4S use of ECN is orthogonal to Diffserv, and can be associated 
with more than one scheduling PHB in a queuing hierarchy. See 
draft-briscoe-l4s-diffserv (which I would welcome you to review - Brian 
Carpenter is also currently reviewing it).


However, you are certainly right in thinking that L4S associated with 
the default PHB is by far and away the most important use-case for L4S. 
All the other possible schemes 

Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

2019-03-24 Thread Bob Briscoe

Alex, inline...

On 24/03/2019 21:15, alex.b...@ealdwulf.org.uk wrote:


Hi Bob,


I note that all the non-dependent claims of US20170019343A1 (claims 1,14,22) 
seem to assume use of the proportional-integral controller (Note, I am not a 
lawyer, and especially not a patent lawyer).
Yes, as I understand it, Nokia's intention with this filing was to cover 
use of the PI controller in particular, in combination with various 
other ideas.



In Appendix B of draft-briscoe-tsvwg-aqm-dualq-coupled, an alternate algorithm 
'Curvy RED' seems to replace PI, but it is noted that 'the Curvy RED algorithm 
has not been maintained to the same degree as the DualPI2 algorithm '.

Can you comment on whether the Curvy RED algorithm could form a 
non-patent-encumbered dualq? In particular:
  - Why wasn't Curvy RED further developed? Was it found to contain some 
deficiency? Are you intending to present it as an alternative?
We just didn't develop it further, cos we were getting better results 
with PI2. However, I am aware of a hardware implementation based on 
Curvy RED going on at the moment, and you will see there have recently 
been review comments on that Curvy RED appendix on the list.


So, even tho PI might be better, Curvy RED (or another AQM) might be 
preferable for reasons other than performance (e.g. ease of 
implementation, or similarity to an existing hardware implementation).


And indeed, there's nothing to stop anyone using other AQMs, either to 
work round the IPR, or because they're preferable in their own right - 
the DualQ Coupled AQM is intentionally a framework into which you drop 2 
AQMs.



  - Does Curvy RED actually completely replace PI?

Yes.

  - Can we have reasonable assurance that no patents will surface covering 
Curvy RED?
Well, I developed the idea of Curvy RED and I / my employer (BT) did not 
file any IPR at the time. I got approval to publish a tech report 
jointly with Al-Lu. http://bobbriscoe.net/pubs.html#CRED-insights


That was May 2015, so given nothing has surfaced by now, there can't be 
anything from that time from us (where us = Al-Lu & BT).


Of course, I cannot guarantee that there is not another patent in the 
system from some other random company that my searches haven't found. 
There are large numbers of AQM patents. Also, I cannot guarantee that an 
implementer working now isn't filing patents around their 
implementation. All we can do is publish as much as possible as early as 
possible to try to keep some areas of the field open.



Bob


Thanks,
Alex


On Wednesday, March 20, 2019, 11:29:38 PM GMT, Bob Briscoe wrote:






1/ In 2016, I arranged for the hire of a patent attorney to undertake the 
unusual step of filing a third party observation with the European Patent 
Office. This went through Al-Lu's patent application claim by claim pointing to 
prior art and giving the patent examiner all the arguments to reject each 
claim. However, the examiner chose to take very little note of it, which was 
disappointing and costly for us. The main prior art is:
     Gibbens, R.J. & Kelly, F.P., "On Packet Marking at Priority Queues," IEEE 
Transactions on Automatic Control 47(6):1016--1020 (June 2002)
The guys named as inventors in Al-Lu's filing published a paper on PI2 with me, 
in which we included a citation of this Gibbens paper as inspiration for the 
coupling. The Gibbens paper was already cited as background by other patents, 
so the EPO has it in their citation index.

The coupling was also based on my prior research with Mirja before I started 
working with the guys from Al-Lu in the RITE European Collaborative project. We 
had to go through a few rejections, but Mirja and I finally got this work 
published in 2014  - still before the priority date of the Al-Lu patent 
application:
     Kühlewind, M., Wagner, D.P., Espinosa, J.M.R. & Briscoe, B., "Using Data Center 
TCP (DCTCP) in the Internet," In: Proc. Third IEEE Globecom Workshop on 
Telecommunications Standards: From Research to Standards pp.583-588 (December 2014)

2/ The only claim that I could not find prior art for (in the original EU 
filing) was a very specific claim about using a square root for the coupling. 
The Linux implementation runs this the other way round so that it only has to 
do a squaring. So I figured we were safe from that.

However, until just now, I had not noticed that Al-Lu has retrospectively 
re-written the claims in the US patent and in the EU patent application to 
claim this the other way round - as a squaring. And to claim the two random 
number trick. Both restructuring to use a squaring and the two random number 
trick were definitely my ideas (while working for BT in a collaboration with 
Al-Lu). I have emails to prove this (from memory they were actually both in the 
same email). This is important, because a patent has to be about mechanism, not 
algorithm.
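
For anyone who hasn't seen these tricks, here is a minimal sketch of the two
ideas mentioned above (the squaring restructure and the two-random-number
trick). This is only an illustration of the arithmetic in Python, not
anyone's real code; p_prime stands for the base probability computed by the AQM:

import random

def classic_prob_by_squaring(p_prime):
    # Compare one random draw against p' squared; no square root is needed,
    # because the coupled L4S probability is kept linear in p' instead.
    return random.random() < p_prime ** 2

def classic_prob_by_two_randoms(p_prime):
    # The "two random number" trick: two independent comparisons against p'
    # both succeed with probability p'^2, so even the multiply disappears.
    return (random.random() < p_prime) and (random.random() < p_prime)

Either form gives the classic queue an effective probability of p'^2, which is
the squaring-instead-of-square-root restructuring referred to above.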

3/ This is a positive development. It means this patent is on very shaky l

Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

2019-03-21 Thread Bob Briscoe

Roland,

On 21/03/2019 08:49, Bless, Roland (TM) wrote:

Hi,

Am 21.03.19 um 09:02 schrieb Bob Briscoe:

Just to rapidly reply,


On 21/03/2019 07:46, Jonathan Morton wrote:

The ECN field was never intended to be used as a classifier, except to
distinguish Not-ECT flows from ECT flows (which a middlebox does need
to know, to choose between mark and drop behaviours).  It was intended
to be used to convey congestion information from the network to the
receiver.  SCE adheres to that ideal.

Each PHB has a forwarding behaviour, a DSCP re-marking behaviour and an
ECN marking behaviour. The ECN field is the classifier for the ECN
marking behaviour.

That's exactly the reason why using ECT(1) as a classifier for L4S
behavior is not the right choice. L4S should use a DSCP for
classification, because it is actually defining a PHB.


1/ First Terminology
The definition of 'PHB' includes the drop or ECN-marking behaviour. For 
instance, you see this in WRED or in PCN (Pre-Congestion Notification).  
If you want to solely talk about scheduling, pls say the scheduling PHB.


2/ The architectural intent of the ECN field

For many years (long before we thought of L4S) I have been making sure 
that ECN propagation through the layers supports the duality of ECN 
behaviours as both a classifier (on the way down from L7/L4 to L3/2) and 
as a return value (on the way back up).


The architecture of ECN is determined by the valid codepoint 
transitions. They are:

1. 00->11
2. 10->11
3. 01->11
4. 10->01

The first three were in RFC3168, but it did not preclude the fourth.
The fourth was first standardized in RFC6660 (which I co-authored). This 
had to be isolated from the e2e use of ECN by inclusion of a DSCP as well.


The relatively late addition of the fourth transition means that a mark 
applied using the SCE approach (10->01) is more likely to be reversed 
when the outer header is decapsulated, if the decapsulator hasn't been 
updated to the RFC that caters for this fourth transition (RFC6040, 
also co-authored by me).
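
To illustrate why, here is a hedged sketch of the two decapsulation
behaviours (my simplified reading of RFC3168 tunnelling and RFC6040; the
codepoints are written as the two ECN bits, and the CE-over-Not-ECT case
that RFC6040 says to drop is omitted):

NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

def decap_legacy_rfc3168(inner, outer):
    # Pre-RFC6040 rule: only an outer CE is propagated to the inner header.
    if outer == CE and inner != NOT_ECT:
        return CE
    return inner   # an outer ECT(1) over an inner ECT(0) is simply discarded

def decap_rfc6040(inner, outer):
    # RFC6040 treats ECT(1) in the outer as a stronger signal than ECT(0)
    # in the inner, so a 10->01 mark applied to the outer survives decap.
    if outer == CE and inner != NOT_ECT:
        return CE
    if outer == ECT1 and inner == ECT0:
        return ECT1
    return inner

# An SCE-style mark on the outer header survives an RFC6040 decapsulator
# but is erased by a legacy one:
assert decap_rfc6040(inner=ECT0, outer=ECT1) == ECT1
assert decap_legacy_rfc3168(inner=ECT0, outer=ECT1) == ECT0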


L4S follows the original RFC3168 approach; SCE uses the fourth.

So, SCE proposes to use /a/ correct approach, but it might not work.
Whereas L4S uses the original correct approach.

3a/ DualQ L4S AQMs
With the DualQ, the difference between the two queues is both in their 
ECN marking behaviour and in their forwarding/scheduling behaviour. 
However, whenever there's traffic in the classic queue the coupling 
between the AQMs overrides the network scheduler. The coupling is solely 
ECN behaviour not scheduling behaviour. So the primary difference 
between the queues is in their ECN-marking behaviour.


What do I mean by "the coupling overrides the network scheduler"? The 
network scheduler certainly does give priority to L4S packets whenever 
they arrive, but the coupling makes the L4S sources control how often 
packets arrive. It's tough to reason about, because we haven't had a 
mechanism like this before.
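
To put a little flesh on that, here is a rough outline of the coupled
control structure as I would sketch it in Python (the gains, target and
coupling factor below are purely illustrative placeholders, not
recommended values; see the dualq-coupled draft for the real thing):

def pi_update(classic_qdelay, prev_qdelay, p_prime,
              target=0.015, alpha=0.16, beta=3.2):
    # The base probability p' is driven by the *classic* queue's delay.
    p_prime += alpha * (classic_qdelay - target) + beta * (classic_qdelay - prev_qdelay)
    return min(max(p_prime, 0.0), 1.0)

def marking_probabilities(p_prime, p_l4s_native, k=2.0):
    p_classic = p_prime ** 2                     # classic drop/mark probability
    # The L4S queue marks at whichever is higher: its own shallow-target
    # marking, or the coupled probability k * p'.  So although the scheduler
    # gives L4S packets priority, L4S senders see marks in proportion to
    # classic congestion and back off, leaving capacity for the classic queue.
    p_l4s = min(max(p_l4s_native, k * p_prime), 1.0)
    return p_classic, p_l4s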


3b/ FQ L4S AQMs
If the AQM is implemented with per flow queues, the picture is clearer. 
The only difference between the queues is in the ECN marking behaviour 
of the different AQMs.




Bob


Regards
  Roland


--
________
Bob Briscoe   http://bobbriscoe.net/



Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

2019-03-21 Thread Bob Briscoe

Just to rapidly reply,


On 21/03/2019 07:46, Jonathan Morton wrote:

The ECN field was never intended to be used as a classifier, except to 
distinguish Not-ECT flows from ECT flows (which a middlebox does need to know, 
to choose between mark and drop behaviours).  It was intended to be used to 
convey congestion information from the network to the receiver.  SCE adheres to 
that ideal.


Each PHB has a forwarding behaviour, a DSCP re-marking behaviour and an 
ECN marking behaviour. The ECN field is the classifier for the ECN 
marking behaviour.



Bob

--

Bob Briscoe   http://bobbriscoe.net/



Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

2019-03-21 Thread Bob Briscoe

Jonathan,

On 20/03/2019 23:51, Jonathan Morton wrote:

On 21 Mar, 2019, at 1:29 am, Bob Briscoe  wrote:


But more importantly, the L4S usage couples the minimized latency use
case to any possibility of getting a high fidelity explicit congestion
signal, so the "maximize throughput" use case can't ever get it.

Eh? There's definitely a misunderstanding or a difference in terminology 
between us here. The whole point of using a congestion controller like DCTCP is 
so that flow rate can scale indefinitely with capacity. Van Jacobson actually 
noted that the original TCP was unscalable in a footnote to the tech report 
version of the SIGCOMM paper.

The high fidelity congestion signal of what we call scalable congestion 
controllers (like DCTCP) is inversely proportional to the window. So as window 
scales up, the congestion signal scales down, so that their product remains 
constant. That means the number of ECN marks per RTT is scale-invariant. So the 
control signal remains just as tight at any scale.

If you'll indulge me for a moment, I'd like to lay out a compromise scenario 
where a lot of L4S' stated goals are still met.

There is no dualQ.  There is an AQM at the bottleneck link, of unspecified 
type, which implements SCE.  Assume that it produces CE marks like a 
conventional AQM, and also produces SCE marks like an L4S AQM produces CE.

A sender implements DCTCP-SCE, which is essentially Paced NewReno modified to 
subtract half of all acked data that was SCE-marked from its cwnd.  (This is 
equivalent to the DCTCP algorithm with g=1 and an arbitrarily small measurement 
window, but acting on SCE instead of CE.)  Any SCE mark also kicks it out of 
slow-start.
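
(For concreteness, the description above translates into roughly the
following per-ACK update; this is just a paraphrase in Python of the words
above, not code from any actual implementation, and mss and the other
variable names are placeholders:)

def on_ack(cwnd, ssthresh, acked_bytes, sce_marked_bytes, mss, in_slow_start):
    # Any SCE mark kicks the sender out of slow-start.
    if sce_marked_bytes > 0 and in_slow_start:
        in_slow_start = False
        ssthresh = cwnd
    if in_slow_start:
        cwnd += acked_bytes                  # exponential growth
    else:
        cwnd += mss * acked_bytes / cwnd     # ~1 MSS per RTT additive increase
    # Subtract half of all acked data that was SCE-marked.
    cwnd -= sce_marked_bytes / 2.0
    cwnd = max(cwnd, 2 * mss)
    return cwnd, ssthresh, in_slow_start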

The means by which SCE information gets back to the sender is left vague for 
now; it's an orthogonal problem with several viable solutions.

What is missing from this scenario, from L4S' point of view?  And why have I 
been able to describe it so succinctly?
My goal is also to tighten the EWMA parameter, g, in DCTCP to 1 (or 2). 
That is why we have recommended a queuing-time-based ramp AQM for the 
Low Latency queue, which so far works equivalently to the step with g 
set to its current default of 1/16. We have been doing experiments on 
this for some time. But it is important to assess each change one at a 
time.
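
For concreteness, a queuing-time-based ramp of the kind referred to above
can be sketched like this (the thresholds here are illustrative
placeholders only, not the values we recommend or use in our experiments):

import random

def l4s_ramp_mark(sojourn_time_us, min_th_us=500, max_th_us=1000):
    # Mark probability rises linearly from 0 at min_th to 1 at max_th of
    # per-packet queuing delay, instead of the single step an unmodified
    # DCTCP AQM uses.
    if sojourn_time_us <= min_th_us:
        p = 0.0
    elif sojourn_time_us >= max_th_us:
        p = 1.0
    else:
        p = (sojourn_time_us - min_th_us) / (max_th_us - min_th_us)
    return random.random() < p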


Congestion controls are tricky to get stable in all situations. So it is 
important to separate ideas and research from engineering of more mature 
approaches that are ready for more widespread experimentation on the 
public Internet. Our goal with L4S was to use proven algorithms, and put 
in place mechanism to allow those algorithms to evolve.


As regards the desire to use SCE instead of the L4S approach of using a 
classifier, please answer all the reasons I gave for why that won't 
work, which I sent in response to your draft some days ago. The main one 
is incremental deployment: the source does not identify its packets as 
distinct from others, so the source needs the network to use some other 
identifier if it wants the network to put it in a queue with latency 
that is isolated from packets not using the scheme. The only way I can 
see to do this would be to use per-flow-queuing. I think that is an 
unstated assumption of SCE.


In contrast, L4S works with either per-flow queuing or dualQ, so it is 
more appropriate for a wider spread of scenarios. Again, in that same 
unanswered email, I described a way L4S can use per-flow queuing, and 
Greg has since given pseudocode. There are other problems with doing the 
codepoints the SCE way round - pls see that email.


There has been a general statement that the SCE way round is purer. 
However, that concept is in the eye of the beholder. The SCE way round 
does not allow the ECN field to be used as a classifier, so you don't 
get the benefit above about support for per-flow-queueing and dual 
queue. You also don't get the benefit of being able to relax 
resequencing in the network, because the network has no classifier to 
look at. For these, the SCE codepoint would need to be combined with a 
DSCP, and I assume you don't want to do that.


The L4S way round signifies an alternative meaning of the ECN field, 
which is exactly what it is. The problem of having to guess which type 
of packet a CE used to be has been roundly discussed at the IETF in TCPM 
and TSVWG WGs and it has been decided it is a non-problem if it is 
assumed to have been ECT(1) even if it was not - as written up in 
draft-ietf-ecn-l4s-id. And that discussion assumed TCP with 3DupACK 
reordering tolerance, not the more liberal use of RACK (or a RACK-like 
approach in other transports). With a RACK-like approach, it becomes 
even less of a problem.



Bob


  - Jonathan Morton



--
________
Bob Briscoe   http://bobbriscoe.net/


Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

2019-03-20 Thread Bob Briscoe
l except minimize cost, but minimize 
cost can be built around the system. This was the subject of my PhD. I 
haven't described L4S in these terms, because most people are only 
interested in the latency. But this is the underlying reason for my 
obsession with ECN.


Frank Kelly predicted that queuing delay would be removed from the 
optimization as it was minimized. With L4S we've got very close to that.


ECN removes all congestion loss.

And the use of an inverse linear congestion controller gives the scalable 
throughput. I shall be touching on this in my talk for netdev tomorrow, 
but it's not really a subject for an implementation conference.


Minimize cost is something you do by combining the congestion signals 
across a network. So any AQM is part of that. And congestion controllers 
are the other part - they implicitly optimize cost, using the congestion 
signals as shadow prices. The square root in classic TCP distorts this, 
but DCTCP's inverse linear controller gives proportional fairness 
directly. Without a weight term in the congestion controller, there is 
not really an economic optimization, but that can be built onto a 
proportionally fair system and competition will gradually cause that to 
happen (or regulation as a proxy for competition). These are very long 
term processes though.



But more importantly, the L4S usage couples the minimized latency use

case to any possibility of getting a high fidelity explicit congestion

signal, so the "maximize throughput" use case can't ever get it.

Eh? There's definitely a misunderstanding or a difference in terminology 
between us here. The whole point of using a congestion controller like 
DCTCP is so that flow rate can scale indefinitely with capacity. Van 
Jacobson actually noted that the original TCP was unscalable in a 
footnote to the tech report version of the SIGCOMM paper.


The high fidelity congestion signal of what we call scalable congestion 
controllers (like DCTCP) is inversely proportional to the window. So as 
window scales up, the congestion signal scales down, so that their 
product remains constant. That means the number of ECN marks per RTT is 
scale-invariant. So the control signal remains just as tight at any scale.
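
As a back-of-envelope illustration (the constants below are rough and for
illustration only):

def marks_per_rtt_scalable(w, c=2.0):
    # A DCTCP-like scalable CC sees a marking probability of roughly c / W,
    # so marks per RTT = p * W stays constant however large the window W gets.
    p = min(c / w, 1.0)
    return p * w

def marks_per_rtt_classic(w, c=1.5):
    # A Reno-style CC sees p of roughly c / W^2, so marks (or losses) per RTT
    # = p * W shrinks as 1/W: the control signal gets sparser at scale.
    p = min(c / (w * w), 1.0)
    return p * w

for w in (20, 200, 2000, 20000):
    print(w, marks_per_rtt_scalable(w), marks_per_rtt_classic(w))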


Cheers



Bob


Regards,

Jake

PS:

If you get a chance, I'm still interested in seeing answers to my

questions about deployment mitigations on the tsvwg list:

https://mailarchive.ietf.org/arch/msg/tsvwg/TWOVpI-SvVsYVy0_U6K8R04eq3A

I'm not surprised if it slipped by unnoticed, there have been a lot of

emails on this.  But good answers to those questions would go a long way

toward easing my concerns about the urgency on this discussion.

PPS:

These issues are a bit sideways to my technical reasons for preferring

the SCE formulation of ECT(1), which have more to do with the confusing

semantics and proliferation of corner cases it creates for CE and ECE.

But I thought I’d ask about them since it seemed like maybe there was a

disconnect here.

From: Bob Briscoe
Date: 2019-03-18 at 18:07
To: "David P. Reed" , Vint Cerf
Cc: tsvwg IETF list , bloat

Subject: Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] 
Implementation and experimentation of TCP Prague/L4S hackaton at IETF104


David,

On 17/03/2019 18:07, David P. Reed wrote:

Vint -

BBR is the end-to-end control logic that adjusts the source rate
to match the share of the bottleneck link it should use.

It depends on getting reliable current congestion information via
packet drops and/or ECN.

So the proposal by these guys (not the cable guys) is an attempt
to improve the quality of the congestion signal inserted by the
router with the bottleneck outbound link.

What do you mean 'not the cable guys'?
This thread was reasonably civil until this intervention.


The cable guys are trying to get a "private" field in the IP
header for their own use.


There is nothing private about this codepoint, and there never has 
been. Here are some data points:


* The IP header codepoint in question (ECT(1) in the ECN field) was 
proposed for use as an alternative ECN behaviour in July 2015 in the 
IETF AQM WG and the IETF's transport area WG (which handles all ECN 
matters).
* A year later there followed a packed IETF BoF on the subject (after 
2 open Bar BoFs).
* Long discussion ensued on the merits of different IP header field 
combinations, on both these IETF lists, involving people active on 
this list (bloat), including Dave Taht, who is acknowledged for his 
contributions in the IETF draft.

* That was when it was decided that ECT(1) was most appropriate.
* The logic of the decision is written up in an appendix of 
draft-ietf-ecn-l4s-id.
* David Black, one of the co-chairs of the IETF's transport area WG 
and co-author of both the original ECN and Diffserv RFCs, wrote 
RFC8311 to lay out the process for reclaiming and reusing the 
necessary codepoints.
* This work and the pro

Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

2019-03-18 Thread Bob Briscoe
s NIC or OS.

So why isn't it in all the receivers' NIC or OS (where it would render
the switch's ordering efforts moot) instead of in all the switches?

I'm guessing the answer is a competition trap for the switch vendors,
plus "with ordering goes faster than without, when you benchmark the
switch with typical load and current (non-RACK) receivers".

If that's the case, it seems like the drive for a competitive
advantage
caused deployment of a packet ordering workaround in the wrong network
location(s), out of a pure misalignment of incentives.

RACK rates to fix that in the end, but a lot of damage is already
done,
and the L4S approach gives switches a flag that can double as
proof that
RACK is there on the receiver, so they can stop trying to order those
packets.

So point granted, I understand and agree there's a cost to abandoning
that advantage.


But as you also said so well in another thread, this is
important.  ("The
last unicorn", IIRC.)  How much does it matter if there's a
feature that
has value today, but only until RACK is widely deployed? If you were
convinced RACK would roll out everywhere within 3 years and SCE would
produce better results than L4S over the following 15 years, would
that
change your mind?

It would for me, and that's why I'd like to see SCE explored before
making a call.  I think at its core, it provides the same thing
L4S does
(a high-fidelity explicit congestion signal for the sender), but with
much cleaner semantics that can be incrementally added to congestion
controls that people are already using.

Granted, it still remains to be seen whether SCE in practice can match
the results of L4S, and L4S was here first.  But it seems to me
L4S comes
with some problems that have not yet been examined, and that are
nicely
dodged by a SCE-based approach.

If L4S really is as good as they seem to think, I could imagine
getting
behind it, but I don't think that's proven yet.  I'm not certain, but
all the comparative analyses I remember seeing have been from more or
less the same team, and I'm not convinced they don't have some
misaligned incentives of their own.

I understand a lot of work has gone into L4S, but this move to jump it
from interesting experiment to de-facto standard without a more
critical
review that digs deeper into some of the potential deployment problems
has me concerned.

If it really does turn out to be good enough to be permanent, I'm not
opposed to it, but I'm just not convinced that it's non-harmful,
and my
default position is that the cleaner solution is going to be better in
the long run, if they can do the same job.

It's not that I want it to be a fight, but I do want to end up
with the
best solution we can get.  We only have the one internet.

Just my 2c.

-Jake




--
New postal address:
Google
1875 Explorer Street, 10th Floor
Reston, VA 20190



--

Bob Briscoe   http://bobbriscoe.net/



Re: [Bloat] [Codel] The "Some Congestion Experienced" ECN codepoint - a new internet draft -

2019-03-11 Thread Bob Briscoe
hat way.


==ECN feedback problems===

Over the last decades, we've made sure that the ECN feedback schemes for 
TCP, QUIC, RTP (but not SCTP yet) can all feed back ECT(1) as well as 
CE, in case a scheme like SCE came along.


However, the solution in the TCP case [draft-ietf-tcpm-accurate-ecn] is 
still problematic for SCE if you're impatient. The base scheme overloads 
3 bits in the TCP header, which it uses to feed back CE only. To feed 
back ECT(1) we had to add a TCP option. That's not going to get through 
middleboxes for many years. The TCP option is also optional to 
implement. Two of the main TCP developers are currently saying they will 
probably not implement it, at least not initially.


==Tunnels and lower layers==

Over the years I've maintained a fairly lonely activity to make sure 
that the ECN propagation behaviour of tunnels and layer 2 protocols will 
treat ECT(1) as either a stronger output signal (as in SCE) or as an 
alternative input signal to an AQM (as in L4S). Theoretically, this 
allows either the SCE or the L4S approach.


HOWEVER, you would probably not be surprised at how many people read the 
spec [RFC6040], and say "Ah, no router alters ECT(0) to ECT(1) today, so 
I'm not going to implement that unnecessary extra line of code in my 
tunnel decap."


==Wider benefit: Relaxing link ordering==

By overloading the ECT(1) marking to mean "the sender uses time for loss 
detection" a link can relax the reordering requirement on ECT(1) packets 
today. You can do that with L4S, cos the sender is selecting the 
marking. You can't do that when the AQM is selecting the marking (as 
with SCE).


If transport protocols detect loss in time units without tying it to any 
marking (as in RACK on its own), a link cannot use this to relax the 
ordering requirement until it is sure that all the legacy non-RACK 
transports have decayed out of the network. That would be measured in 
decades.


HTH



Bob

On 11/03/2019 10:11, Dave Taht wrote:

Everybody, calm down. I put this out merely to get comment before we
submitted the first of several drafts. That draft is now submitted and
we've asked for a talk slot in the tsvwg for it. I cc'd the world to
get quick initial feedback, and I want to shut this overbroad
conversation down and move it to just the ecn-sane mailing list.

The l4s mailing list is dead, and the debates on the AQM mailing list and here,
unhelpful - for decades. So, back in august I started a new working
group here, under house rules that I thought would be more productive,
and asked that people that wanted to debate ecn more sanely, join. few
did.

And jon and I have been working for months (and largely not on the
list) to try and create a compromise proposal of which y'all just saw
the first output. There's more in the bufferbloat-rfcs repo.

The rules for joining the ecn-sane list are simple - take the time to
step back and write a short position paper, and join (or
create) a team. You needn't do that immediately. If you disagree with
the rules of operation of the ecn-sane working group, submit a pull
request or file a bug on the web site. where we can discuss it.

Ironically our ssl cert just expired and I don't remember how to fix it.

Please join the ecn-sane mailing list for discussing this stuff and
stop cc-ing the whole bufferbloat.net  world on it, please.



--
________
Bob Briscoe   http://bobbriscoe.net/



Re: [Bloat] DETNET

2017-11-12 Thread Bob Briscoe

Matthias, Dave,

The sort of industrial control applications that detnet is targeting 
require far lower queuing delay and jitter than fq_CoDel can give. They 
have thrown around numbers like 250us jitter and 1E-9 to 1E-12 packet 
loss probability.


However, like you, I just sigh when I see the behemoth detnet is building.

Nonetheless, it's important to have a debate about where to go to next. 
Personally I don't think fq_CoDel alone has legs to get (that) much better.


I prefer the direction that Mohammad Alizadeh's HULL pointed in:
Less is More: Trading a little Bandwidth for Ultra-Low Latency in the 
Data Center 


In HULL you have i) a virtual queue that models what the queue would be 
if the link were slightly slower, then marks with ECN based on that. 
ii)  a much more well-behaved TCP (HULL uses DCTCP with hardware pacing 
in the NICs).
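
In rough Python, the virtual-queue part is no more than this (gamma, the
threshold and the names are illustrative; HULL's own parameters differ):

class VirtualQueue:
    # Tracks the backlog a slightly slower link would have built, and
    # ECN-marks on that, so the real queue stays close to empty.
    def __init__(self, link_rate_bps, gamma=0.95, mark_threshold_bytes=3000):
        self.drain_rate = gamma * link_rate_bps / 8.0   # bytes per second
        self.mark_threshold = mark_threshold_bytes
        self.vq_bytes = 0.0
        self.last_t = 0.0

    def on_packet(self, now, pkt_bytes):
        # Drain the virtual queue at gamma * C since the last arrival...
        self.vq_bytes = max(0.0, self.vq_bytes - (now - self.last_t) * self.drain_rate)
        self.last_t = now
        # ...then add this packet and mark if the virtual backlog is too big.
        self.vq_bytes += pkt_bytes
        return self.vq_bytes > self.mark_threshold    # True => set ECN CE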


I would love to be able to demonstrate that HULL can achieve the same 
extremely low latency and loss targets as detnet, but with a fraction of 
the complexity.


*Queuing latency?* This keeps the real FIFO queue down in the tens to 
low hundreds of microseconds.


*Loss prob?* Mohammad doesn't recall seeing a loss during the entire 
period of the experiments, but he doubted their measurement 
infrastructure was sufficiently accurate (or went on long enough) to be 
sure they were able to detect one loss per 10^12 packets.


For their research prototype, HULL used a dongle they built, plugged 
into each output port to constrict the link in order to shift the AQM 
out of the box. However, Broadcom mid-range chipsets already contain 
vertual queue hardware (courtesey of a project we did with them when I 
was at BT:
How to Build a Virtual Queue from Two Leaky Buckets (and why one is not 
enough)  ).


*For public Internet, not just for DCs?* You might have seen the work 
we've done (L4S) to get queuing delay 
over regular public Internet and broadband down to about mean 500us; 
90%-ile 1ms, by making DCTCP deployable alongside existing Internet 
traffic (unlike HULL, pacing at the source is in Linux, not hardware). 
My personal roadmap for that is to introduce virtual queues at some 
future stage, to get down to the sort of delays that detnet wants, but 
over the public Internet with just FIFOs.


Radio links are harder, of course, but a lot of us are working on that too.



Bob

On 12/11/2017 22:58, Matthias Tafelmeier wrote:

On 11/07/2017 01:36 AM, Dave Taht wrote:

Perceived that as shareworthy/entertaining ..

https://tools.ietf.org/html/draft-ietf-detnet-architecture-03#section-4.5

without wanting to belittle it.

Hope springs eternal that they might want to look over the relevant
codel and fq_codel RFCS at some point or another.


Not sure, appears like juxtaposing classical mechanics to nanoscale 
physics.


--
Best regards

Matthias Tafelmeier







Re: [Bloat] [iccrg] survey on Internet latency, FYI

2014-08-04 Thread Bob Briscoe

Dave,

Thx for BQL+pFabric+HFSC suggestions. We are now 
in the process of addressing a few items the 
reviewers wanted us to change (each different 
from yours), so we can add stuff. The main 
reason for sending out the pre-print was 
precisely to widen the formal review process out 
to the community, so thx for contributing.


All we hoped for was to ensure we had 
representative papers about each /type/ of 
technology. We knew we could never summarise 
every paper that has ever been written about 
latency, altho obviously we do want to include as 
many /important/ ones as possible.


I'm afraid we have no plans to turn this into a 
living doc. It would be feasible if it was just a 
categorisation of paper summaries. However, the 
main value of the paper (IMO) is the quantitative 
comparison of the main techniques (Gain vs Pain 
section), which isn't so easy to open up to 
public contribution. Even amongst our 10 authors, 
it took a long while to converge on the answer 
for each technology we picked. Now all the 
authors want to get on with the rest of their lives.



Bob

At 14:53 04/08/2014, Dave Taht wrote:

On Mon, Aug 4, 2014 at 8:07 AM, Michael Welzl mich...@ifi.uio.no wrote:
 The link below doesn't work anymore; a slightly updated version is at:
 http://riteproject.eu/?attachment_id=735

0) I enjoyed reading it, but it wasn't as comprehensive as I'd like,
and more than a few paragraphs reflected a bias that I'd have liked to
have had an opportunity to address before publication.

1) I would like a future version to dedicate a bit of space to Byte
Queue Limits (BQL), as that happens to have been the technology that
has made all the AQM and FQ algorithms scalable to 10s of gbits in
Linux. Prior to BQL, NO aqm or FQ worked right at line rate. I have
kept hoping some paper would examine the difference in latency due to
BQL for quite some time now... if it wasn't for the BQL breakthrough:

https://lists.bufferbloat.net/pipermail/bloat/2011-November/000726.html

The bufferbloat.net project would have folded up shop and quit that
year, if BQL hadn't arrived, as our initial focus on fixing wifi
wasn't working out.

The BQL concept and implementation are delightfully simple and fast
enough to run at interrupt cleanup time.

 http://lwn.net/Articles/454390/
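
(For anyone who hasn't read the LWN article, the dynamic-limit idea can be
sketched roughly as follows; this is a simplification of the kernel's dql
logic, not the real algorithm, and the byte limits are illustrative:)

class ByteQueueLimit:
    # Cap the bytes handed to the NIC to roughly what it can transmit
    # between completion interrupts, so excess packets stay up in the
    # qdisc layer where AQM/FQ can act on them.
    def __init__(self, limit=10000, min_limit=3000, max_limit=200000):
        self.limit, self.min_limit, self.max_limit = limit, min_limit, max_limit
        self.inflight = 0

    def can_enqueue(self, pkt_bytes):
        return self.inflight + pkt_bytes <= self.limit

    def enqueue(self, pkt_bytes):
        self.inflight += pkt_bytes

    def on_tx_complete(self, completed_bytes, nic_ran_dry):
        # Run at interrupt cleanup time.  If the NIC ran out of work while
        # packets were being held back, the limit is too small: grow it.
        # If completions keep finding plenty still queued, shrink it slowly.
        self.inflight -= completed_bytes
        if nic_ran_dry:
            self.limit = min(self.limit + completed_bytes, self.max_limit)
        elif self.inflight > 0:
            self.limit = max(self.limit - completed_bytes // 2, self.min_limit)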

And the results, spectacular, particularly on hosts with TSO or GSO
enabled, and at a wide range of rates, particularly high ones.

2) The same person that did Data Center TCP went off to do pfabric. I
don't see that in here.

3) Also HFSC seems to be widely used, in the DSL market in particular,
and is not in here.

4) Numerous other nits. I do hope further input from the community is
solicited for a 1.1, and I'd love it if it were available as a wiki or
html document one day for easier browsing/decoding/searching from the
Internet.



 On 21 July 2014, at 11:17, David Ros wrote:

 Hi all,

 The following paper may be of interest to ICCRGers:

 Reducing Internet Latency: A Survey of Techniques and their Merits

 available here:
 http://riteproject.files.wordpress.com/2014/07/latency_preprint-31.pdf

 This is a much-extended version 
(understatement) of a position paper of ours

 that was presented in last year's ISOC workshop on Internet latency.

 It's currently under submission to a journal, so this preprint will be
 slightly amended for the eventual journal version that in principle should
 be available in a few months. Comments will be gladly welcome.

 Thanks,

 David (as individual, on behalf of all the authors)








--
Dave Täht

NSFW: 
https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article




Bob Briscoe,  BT 




Re: [Bloat] What is a good burst? -- AQM evaluation guidelines

2013-12-15 Thread Bob Briscoe
 for each packet).  It strikes me as 
feasible for NIC hardware to take on some of this burden.


 - Jonathan Morton




Bob Briscoe,  BT 




Re: [Bloat] [aqm] [iccrg] AQM deployment status?

2013-11-04 Thread Bob Briscoe

Curtis & all,

At 17:07 14/10/2013, Curtis Villamizar wrote:

In enterprise and data center there is also very good control over
what equipment is used and how it is used.  However, clue density
decreases exponentially farther from the core and approaches zero in
some data centers and in some enterprise IT departments.  This is also
true of the part of service provider organizations that pick stuff for
consumer access.  Where clue density is lowest, you'll find ignorance
of AQM and buffer issues and lack of consideration for buffering and
AQM when picking equipment for deployment and configuring it.


We (BT) use WRED extensively at the edge routers into our global 
enterprise MPLS network (see the BT MPLS technical specification on 
www2.bt.com).



Bob





Bob Briscoe,  BT


Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!

2012-01-08 Thread Bob Briscoe
 a way to handle TSO properly, as byte oriented AQMs
handled that badly. Sort of fixed now.


Agreed.



Please feel free to evaluate and critique SFQRED.


If I don't agree with the goal, are you expecting me to critique the 
detailed implementation?



In my case I plan to continue working with HTB, QFQ
(which, btw, has multiple interesting ways to engage
weighting mechanisms) and various sub qdiscs... *after SFQ stabilizes*.

I kind of view QFQ as a qdisc construction set.


Do you have any knowledge (accessible to the code) of what the 
weights should be?


The $10M question is: What's the argument against not doing FQ?



We're well aware that these may not be the ultimate answer to
all the known networking problems in the universe, but what
we have working in the lab is making a big dent in them.

And now that bugfix #1 above exists...

Let a thousand new AQM's bloom!

If there is anything in what we're up to that will damage the
internet worse that it is damaged already, let us know soonest.


More *FQ is worse than what there already is.



Bob



Bob Briscoe, BT Innovate & Design




Re: [Bloat] What is fairness, anyway? was: Re: finally... winning on wired!

2012-01-05 Thread Bob Briscoe

Jim, Justin,

Jumping back one posting in this thread...

At 17:36 04/01/2012, Justin McCann wrote:

On Wed, Jan 4, 2012 at 11:16 AM, Dave Taht dave.t...@gmail.com wrote:

 On Wed, Jan 4, 2012 at 4:25 PM, Jim Gettys j...@freedesktop.org wrote:

 1: the 'slower flows gain priority' question is my gravest concern
 (eg, ledbat, bittorrent). It's fixable with per-host FQ.

Meaning that you don't want to hand priority to stuff that is intended
to stay in the background?


The LEDBAT/BitTorrent issue wouldn't be fixed by per-host FQ.
LEDBAT/uTP tries to yield to other hosts, not just its own host.

In fact, in the early part of the last decade, the whole issue of 
long-running vs interactive flows showed how broken any form of FQ 
was. This was why ISPs moved away from rate equality (whether 
per-flow, per-host or per-customer site) to various 
per-customer-volume-based approaches (or a mix of both).


There seems to be an unspoken assumption among many on this list that 
rate equality must be integrated into each AQM implementation. That's 
so 2004. It seems all the developments in fairness over the last 
several years have passed completely over the heads of many on this 
list. This page might fill in the gaps for those who missed the last few years:

http://trac.tools.ietf.org/group/irtf/trac/wiki/CapacitySharingArch

To address buffer bloat, I advise we do one thing and do it well: bulk AQM.

In a nutshell, bit-rate equality, where each of N active users gets 
1/N of the bit-rate, was found to be extremely _unfair_ when the 
activity of different users is widely different. For example:
* 5 light users all active 1% of the time get close to 100% of a 
shared link whenever they need it.
* However, if instead 2 of these users are active 100% of the time, 
FQ gives the other three light users only 33% of the link whenever 
they are active.
* That's pretty rubbish for a solution that claims to isolate each 
user from the excesses of others (the sketch below works through the numbers).
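
A two-line sketch of those numbers (illustrative only):

def fq_share(active_flows):
    # Instantaneous per-flow share under bit-rate equality.
    return 1.0 / active_flows

# Five light users, each active 1% of the time, rarely overlap, so an
# active one usually has the link to itself:
print(fq_share(1))   # ~100% of the link
# With two of them now active 100% of the time, a light user that wakes
# up always finds three active flows:
print(fq_share(3))   # ~33% of the link, despite its tiny overall volume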


Since 2004, we now understand that fairness has to involve accounting 
over time. That requires per-user state, which is not something you 
can do, or that you need to do, within every queue. We should leave 
fairness to separate code, probably on machines specialising in this 
at the edge of a provider's network domain - where it knows about 
users/customers - separate from the AQM code of each specific queue.



Bob










Bob Briscoe, BT Innovate & Design




Re: [Bloat] Simplified Bloat explanation

2011-04-13 Thread Bob Briscoe

Bruce,

The candy factory analogy is a useful one. Thanks

However, I wouldn't associate it with the usage-billing issue or with 
NN. I know that was your original motivation for writing this, but 
the two issues (bandwidth and latency) need to be treated separately.


Just because buffer bloat is slowing the little transfers (individual 
candies) doesn't mean anything about how much bandwidth is needed to 
transfer predicted volumes of data in reasonable time. That's about 
how many Ethels are employed to meet total demand for wrapping 
candies, which could still be an issue (or not) irrespective of 
whether we take away the hats and buckets.


The association with NN & usage-billing merely serves to confuse what 
is an otherwise helpful explanation.



Bob

At 18:44 12/04/2011, Bruce Atherton wrote:
For those that are interested, I've written a simplified explanation 
of BufferBloat for a nontechnical audience using a classic I Love 
Lucy episode as an analogy. I wrote it to introduce the concept to 
the people fighting Usage Based Billing here in Canada during a 
federal election.


I don't need to get all the details right because this is intended 
for the general population, but any feedback about the general 
correctness of the description is appreciated.


http://callenish.blogspot.com/2011/03/usage-based-billing-caused-by-internet.html



Bob Briscoe, BT Innovate & Design

