Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

Bob Briscoe Sun, 24 Mar 2019 19:48:12 -0700

Roland,

On 22/03/2019 13:53, Bless, Roland (TM) wrote:

Hi Bob,


see inline.

Am 21.03.19 um 14:24 schrieb Bob Briscoe:

On 21/03/2019 08:49, Bless, Roland (TM) wrote:

Hi,

Am 21.03.19 um 09:02 schrieb Bob Briscoe:

Just to rapidly reply,


On 21/03/2019 07:46, Jonathan Morton wrote:

The ECN field was never intended to be used as a classifier, except to
distinguish Not-ECT flows from ECT flows (which a middlebox does need
to know, to choose between mark and drop behaviours).  It was intended
to be used to convey congestion information from the network to the
receiver.  SCE adheres to that ideal.

Each PHB has a forwarding behaviour, a DSCP re-marking behaviour and an
ECN marking behaviour. The ECN field is the classifier for the ECN
marking behaviour.

That's exactly the reason, why using ECT(1) as classifier for L4S
behavior is not the right choice. L4S should use a DSCP for
classification, because it is actually defining a PHB.

1/ First Terminology
The definition of 'PHB' includes the drop or ECN-marking behaviour. For
instance, you see this in WRED or in PCN (Pre-Congestion Notification).
If you want to solely talk about scheduling, pls say the scheduling PHB.

I thought that I'm well versed with Diffserv terminology, but I'm not
aware that a Diffserv PHB requires the definition of an ECN marking
behavior.

Ah well, I do not think that any living human has visited all the dark corners of the Diffserv world.

I didn't mean (or say) that a Diffserv PHB /requires/ an ECN marking behaviour, just that if you are talking about a Diffserv PHB that includes an ECN marking behaviour, it helps to say when you are solely talking about the scheduling part of the PHB.

In many cases when there is no ECN marking behaviour, it makes no difference if you omit the word scheduling, cos that is all there is to the behaviour.

In fact ECN is orthogonal to Diffserv as both RFCs 2474 and
2475 do not even mention ECN. RFC 2475:
"A per-hop behavior (PHB) is a description of the externally
observable forwarding behavior of a DS node applied to a particular
DS behavior aggregate." and "Useful behavioral distinctions
are mainly observed when multiple behavior aggregates compete for
buffer and bandwidth resources on a node."

Even the original experimental ECN spec RFC2481 was published just after 2474 & 2475. So you wouldn't expect the original Diffserv specs to mention something that didn't exist then.


Usually, there are different mechanisms for implementing a PHB,
e.g., for EF one could use a tail drop queue and Simple Priority
Queueing, Weighted Fair Queueing, or Deficit Round Robin and so
on. Consequently, queueing and scheduling behavior are used to
_implement_ a PHB, i.e., IMHO it makes sense to distinguish between
the PHB as externally observable behavior and a specific _PHB
implementation_ as also pointed out in RFC2475:
    PHBs are implemented in nodes by means of some buffer management and
    packet scheduling mechanisms.  PHBs are defined in terms of behavior
    characteristics relevant to service provisioning policies, and not in
    terms of particular implementation mechanisms.


So some of the Diffserv PHBs do _not_ require using an AQM,
which is often the basis for ECN marking, e.g., for EF
tail drop should be sufficient. For other PHBs it may be
useful to say something about ECN usage (as I did for LE).

RFC 2475:

    PHBs may be specified in terms of their resource (e.g., buffer,
    bandwidth) priority relative to other PHBs, or in terms of their
    relative observable traffic characteristics (e.g., delay, loss).

Since RFC2474 & 2475, AQM behaviour and/or ECN marking behaviour has become part of some Diffserv PHBs. E.g. WRED in AF. See any of the tables in RFC4594 that have an AQM column.

The need for ECN marking behaviour (rather than just AQM behaviour in general) as part of a PHB arose during the definition of PCN. Jo Babiarz and Kwok Ho Chan were both authors of 4594 and of many of the PCN specs, and proposed the term 'marking behaviour' as part of the PHB. You will find ECN marking behaviours are central to, for instance, RFC5670.

As I've pointed out already, the transitions used by SCE were already in the PCN baseline encoding spec [RFC6660], except only defined if accompanied by a specific DSCP (which was subsequently standardized as EF-ADMIT).


I think that L4S therefore specifies such a PHB as it is defined
in relation to the default PHB (as in the L4S arch draft
"Classic service").

No. The L4S use of ECN is orthogonal to Diffserv, and can be associated with more than one scheduling PHB in a queuing hierarchy. See draft-briscoe-l4s-diffserv (which I would welcome you to review - Brian Carpenter is also currently reviewing it).

However, you are certainly right in thinking that L4S associated with the default PHB is by far and away the most important use-case for L4S. All the other possible schemes in l4s-diffserv are only possibilities - probably for corporate networks where proliferation of Diffserv models is currently most prevalent.

2/ The architectural intent of the ECN field

For many years (long before we thought of L4S) I have been making sure
that ECN propagation through the layers supports the duality of ECN
behaviours as both a classifier (on the way down from L7/L4 to L3/2) and
as a return value (on the way back up).

The architecture of ECN is determined by the valid codepoint
transitions. They are:

I wouldn't say that it's determined solely by the transitions.

Correct. I didn't say that either.

1. 00->11
2. 10->11
3. 01->11
4. 10->01

The first three were in RFC3168, but it did not preclude the fourth.
The fourth was first standardized in RFC6660 (which I co-authored). This
had to be isolated from the e2e use of ECN by inclusion of a DSCP as well.
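
As an illustration only, the four transitions enumerated above can be written as a lookup table. This is a minimal Python sketch: the names `LISTED_TRANSITIONS` and `classify_transition` are hypothetical, and the RFC labels simply repeat the attribution given in this message rather than an independent reading of the RFCs.

```python
# ECN field codepoints (two bits of the IP header).
NOT_ECT = 0b00
ECT1 = 0b01
ECT0 = 0b10
CE = 0b11

# (old, new) -> where the message above says the transition is covered.
# Labels are taken from the message text, not re-derived from the RFCs.
LISTED_TRANSITIONS = {
    (NOT_ECT, CE): "RFC3168",
    (ECT0, CE): "RFC3168",
    (ECT1, CE): "RFC3168",
    (ECT0, ECT1): "RFC6660 (with a DSCP)",
}


def classify_transition(old: int, new: int) -> str:
    """Return how the message categorises a change of the ECN field."""
    if old == new:
        return "unchanged"
    return LISTED_TRANSITIONS.get((old, new), "not listed")


if __name__ == "__main__":
    print(classify_transition(ECT0, ECT1))  # the transition SCE relies on
    print(classify_transition(ECT1, ECT0))  # not among the four listed
```

Anything outside the table, such as clearing a CE mark, falls through to "not listed".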

The relatively late addition of the fourth approach means that an
attempt to mark using the SCE approach (10->01) is more likely to find
that it gets reversed when the outer header is decapsulated, if the
decapsulator hasn't been updated to the latest RFC that catered for this
fourth transition (RFC6040, also co-authored by me).
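
To make the decapsulation concern concrete, here is a simplified Python sketch contrasting a decapsulator that only propagates CE from the outer header with one that also propagates an outer ECT(1). It is an assumption-laden illustration: the function names are hypothetical, only ECN-capable inner headers are modelled (Not-ECT inner handling is deliberately left out), and the second function reflects my reading of the RFC6040 decapsulation rules rather than a complete implementation.

```python
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def decap_ce_only(inner: int, outer: int) -> int:
    """Older-style decap: only an outer CE reaches an ECN-capable inner."""
    if inner == NOT_ECT:
        raise ValueError("Not-ECT inner header is not modelled in this sketch")
    if outer == CE:
        return CE
    return inner


def decap_updated(inner: int, outer: int) -> int:
    """Updated decap (assumed RFC6040-style): outer ECT(1) is propagated too."""
    if inner == NOT_ECT:
        raise ValueError("Not-ECT inner header is not modelled in this sketch")
    if inner == CE or outer == CE:
        return CE
    if inner == ECT0 and outer == ECT1:
        return ECT1  # the 10->01 mark applied in the tunnel survives
    return inner


if __name__ == "__main__":
    # A node inside the tunnel re-marks the outer header ECT(0) -> ECT(1).
    inner, outer = ECT0, ECT1
    print(bin(decap_ce_only(inner, outer)))   # mark is lost, inner stays ECT(0)
    print(bin(decap_updated(inner, outer)))   # mark is forwarded as ECT(1)
```

With the older behaviour the 10->01 mark silently disappears at the tunnel egress, while a CE mark survives in both cases.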

L4S follows the original RFC3168 approach
SCE uses the fourth

So, SCE proposes to use /a/ correct approach, but it might not work.

In case of nodes that implement RFC6040? I think that it would
be useful to measure how many boxes out there actually do this

[BB] To measure this, you would need to have a box between the tunnel endpoints to mark the outer, before you could check what the behaviour of the decap was.

(or how widespread is ECN usage actually, e.g., how many boxes
actually set CE on congestion? MAMI results anyone?).

[BB] Brian Trammell re-ran ETHZ's ECN tests in Jan'19. Informally he told me yesterday that he found about 13 CE marks (i.e. still hardly any). But this might mean the links aren't loaded when he's looking. It's hard to apply enough load while doing a large-scale measurement - takes far too long per path.

Stuart Cheshire is helping me to try to get something meaningful out of the data Apple is continually gathering. The data Padma presented at MAPRG at IETF-98 was % of Apple devices tested that saw at least one CE mark in 12 hours.

The hard problem is, once we find something CE-marking, we're interested in knowing whether it's FQ (which protects flows from each other) or single queue (which doesn't). Greg White, Jake Holland & I devised a test for this between us. Tweak a congestion control so you have an aggressive one. Run it in parallel with another regular CC between the same two hosts. Then look for correlation of RTT movements. Like common bottleneck detection in RMCAT.
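
A rough sketch of the last step of that test, in Python: given RTT samples from the aggressive flow and the regular flow taken at the same instants, compute their correlation and treat a high value as a hint that both share one queue. Everything here is hypothetical (the function names, the 0.7 threshold, the sample data); it is not the actual test procedure, only an illustration of "look for correlation of RTT movements".

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sample series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no co-movement information
    return cov / sqrt(var_x * var_y)


def looks_like_shared_queue(rtt_aggressive, rtt_regular, threshold=0.7):
    """Assumed heuristic: strongly correlated RTTs suggest a single queue."""
    return pearson(rtt_aggressive, rtt_regular) >= threshold


if __name__ == "__main__":
    aggressive = [10, 20, 30, 20, 10]   # RTT in ms, made-up samples
    shared = [11, 21, 29, 19, 12]       # regular flow dragged along
    isolated = [10, 10, 10, 10, 10]     # regular flow unaffected (FQ-like)
    print(looks_like_shared_queue(aggressive, shared))
    print(looks_like_shared_queue(aggressive, isolated))
```

A real measurement would need far more samples and some care over clock alignment and path noise.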

Whereas L4S uses the original correct approach.

Which might also not work...in case RFC3168 boxes set CE, so
the L4S receivers/senders cannot know that the CE wasn't set
by an L4S node.

Yes. That's a different point, but it's true.

3a/ DualQ L4S AQMs
With the DualQ, the difference between the two queues is both in their
ECN marking behaviour and in their forwarding/scheduling behaviour.
However, whenever there's traffic in the classic queue the coupling
between the AQMs overrides the network scheduler. The coupling is solely
ECN behaviour not scheduling behaviour. So the primary difference
between the queues is in their ECN-marking behaviour.

What do I mean by "the coupling overrides the network scheduler"? The
network scheduler certainly does give priority to L4S packets whenever
they arrive, but the coupling makes the L4S sources control how often
packets arrive. It's tough to reason about, because we haven't had a
mechanism like this before.

Yes, the DualQ mechanism is actually nice, but what I particularly don't
like is to fix the coupling into nodes in this way. If the congestion
control behavior is different from your expectation it will not work
properly (as already experienced with BBR). This would ossify congestion
control evolution and I see this as a very big disadvantage of this
approach.

The argument for using the square is to be compatible with the worst-case classic traffic, which is Reno. The worst-case will remain ossified for many years yet. It doesn't mean that better CCs cannot develop within Classic. And to a certain extent, they still have to coexist with the worst case Classic... we'll see what the latest update on BBRv2 is in a few days' time.

But importantly, a square relationship between flow rates is not enforced by the network, it's encouraged for end-systems that have a default behaviour. But, if the network policy becomes wrong in future, end-systems can correct it.
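
For readers who have not seen the coupling written down, a minimal Python sketch follows. It assumes the relationship described in the DualQ Coupled AQM specification, which this message does not spell out: a base probability p' drives both queues, with Classic drop/mark probability p'^2 and L4S marking probability k * p'. The function name and the coupling factor k = 2 are assumptions for illustration.

```python
def coupled_probabilities(p_base: float, k: float = 2.0):
    """Return (p_classic, p_l4s) derived from one base probability.

    p_classic = p_base ** 2      squared, to suit Reno-like senders whose
                                 rate scales with 1 / sqrt(p)
    p_l4s     = k * p_base       linear, for scalable senders whose rate
                                 scales with 1 / p (capped at 1.0)
    """
    if not 0.0 <= p_base <= 1.0:
        raise ValueError("p_base must be within [0, 1]")
    p_classic = p_base ** 2
    p_l4s = min(1.0, k * p_base)
    return p_classic, p_l4s


if __name__ == "__main__":
    for p in (0.01, 0.1, 0.3):
        p_c, p_l = coupled_probabilities(p)
        print(f"p'={p:.2f}  classic={p_c:.4f}  l4s={p_l:.4f}")
```

The network only emits these signals; how a sender reacts to them remains an end-system choice, which is the point made above.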

3b/ FQ L4S AQMs
If the AQM is implemented with per flow queues, the picture is clearer.
The only difference between the queues is in the ECN marking behaviour
of the different AQMs.

This would at least avoid the baked-in coupling law problem...

Er, no. 1:1 is just as much a baked-in policy, and it really is baked-in when the network is enforcing it.

That's why we developed the DualQ - we wanted to avoid the network enforcing rigid fairness at every instant. It's much less ossified when end-systems keep to it voluntarily.

Bob


Regards
  Roland


--
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/
