Re: [Cake] Proposing COBALT

moeller0 Fri, 20 May 2016 04:38:33 -0700

Hi Jonathan,

interesting ideas.


> On May 20, 2016, at 12:04 , Jonathan Morton <chromati...@gmail.com> wrote:
> 
> With the recent debate over handling unresponsive flows in fq_codel, I had a 
> brainwave involving constructing a hybrid AQM which preserves Codel’s 
> excellent properties on responsive flows, while also reacting appropriately 
> when faced with a UDP flood.  The key difficulty was deciding when to switch 
> over from the Codel behaviour to a PIE or RED like behaviour.
> 
> It turns out that BLUE is a perfect fit for this job, because it activates 
> when the queue is completely full - an unambiguous signal that Codel has lost 
> the plot and is unable to control the queue alone.  BLUE was one of the more 
> promising AQMs in the days immediately prior to Codel’s ascendance, so it 
> should be effective outside Codel’s speciality.
> 
> The name COBALT, as well as referring to a nice shade of blue, can read 
> “Codel-BLUE Alternate”.

        That is important, always start with a good acronym ;) (seriously, 
some EU funding programs actually require you to supply an acronym when 
applying for a grant).

> 
> It is unnecessary to explicitly “switch over” between Codel and BLUE; they 
> can work in parallel, since their operating characteristics are independent.  
> It may be feasible to simplify the Codel implementation, since it will no 
> longer need to handle overload conditions as robustly.  For example, the 
> Codel section should use ECN marking whenever possible, and never drop an 
> ECN-Capable packet; the BLUE section should ignore ECN capability and simply 
> drop packets, since the traffic is evidently not responding to any ECN 
> signals if BLUE is triggered.
> 
> One of the major reasons why Codel fails on UDP floods is that its drop 
> schedule is time-based.  This is the correct behaviour for TCP flows, which 
> respond adequately to one congestion signal per RTT, regardless of the packet 
> rate.  However, it means it is easily overwhelmed by high-packet-rate 
> unresponsive (or anti-responsive, as with TCP acks) floods, which an attacker 
> or lab test can easily produce on a high-bandwidth ingress, especially using 
> small packets.

        In essence I agree, but want to point out that the protocol itself does 
not really matter, but rather the observed behavior of a flow. Civilized UDP 
applications (that expect their data to be carried over the best-effort 
internet) will also react to drops similarly to decent TCP flows, and crappy 
TCP implementations might not. I would guess that, given the maturity of TCP 
stacks, misbehaving TCP flows will be rarer than misbehaving UDP flows (which 
might be, for example, well-behaved fixed-rate isochronous flows that simply 
should never have been sent over the internet).

> 
> BLUE, by contrast, uses a drop *probability*, so its effectiveness on floods 
> is independent of the packet rate.  If necessary, its drop rate can increase 
> to 100% in a reasonable amount of time.
> 
> A couple of details are necessary to integrate BLUE with a flow-isolating 
> qdisc:
> 
> BLUE’s up-trigger should be on a packet drop due to overflow (only) targeting 
> the individual subqueue managed by that particular BLUE instance.  It is not 
> correct to trigger BLUE globally when an overall overflow occurs.  Note also 
> that BLUE has a timeout between triggers, which should I think be scaled 
> according to the estimated RTT.

        That sounds nice in that no additional state is required. But with the 
current fq_codel, I believe the packet causing the memory limit overrun is not 
necessarily from the flow that actually caused the problem to begin with, and 
doesn’t fq_codel actually search for the fattest flow and drop from there? But 
I guess that selection procedure could be run with BLUE as well.
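
        Roughly what I have in mind, as a minimal Python sketch: on a global 
overflow, pick the flow with the largest backlog and raise only that flow's 
drop probability. The function name, the dict-based bookkeeping and the step 
size are my own assumptions for illustration, not code from fq_codel or cake, 
and the timeout between triggers is left out for brevity.

```python
def drop_on_overflow(backlogs, blue_p, increment=1 / 256):
    """Pick the overflow victim and raise only its BLUE drop probability.

    backlogs: dict mapping flow id -> queued bytes (hypothetical bookkeeping)
    blue_p:   dict mapping flow id -> current BLUE drop probability
    increment: assumed step size, not taken from any implementation
    """
    # The packet that tripped the limit may belong to an innocent flow,
    # so select the flow with the largest backlog instead.
    fattest = max(backlogs, key=backlogs.get)
    # Up-trigger BLUE for that flow only, capped at 100%.
    blue_p[fattest] = min(1.0, blue_p.get(fattest, 0.0) + increment)
    return fattest


backlogs = {"a": 1500, "b": 90000, "c": 3000}
blue_p = {}
victim = drop_on_overflow(backlogs, blue_p)  # selects "b", the fattest flow
```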

> 
> BLUE’s down-trigger is on the subqueue being empty when a packet is requested 
> from it, again on a timeout.  To ensure this occurs, it may be necessary to 
> retain subqueues in the DRR list while BLUE’s drop probability is nonzero.

        Question: doesn’t this mean the affected flow will be throttled quite 
harshly? Will BLUE slowly decrease the drop probability p if the flow behaves? 
If so, could BLUE just disengage once p drops below a threshold?
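
        To make the question concrete, here is a minimal Python sketch of how I 
understand the per-subqueue BLUE state: p goes up on an overflow drop, down 
when the subqueue is found empty, and both changes are rate-limited by a 
timeout. The class name, the step sizes, the holdoff and the disengage 
threshold are all my own assumptions, not values from any implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class BlueState:
    """Hypothetical per-subqueue BLUE state; every constant is an assumption."""
    p_drop: float = 0.0                 # current drop probability
    last_update: float = float("-inf")  # time of the last change, in seconds
    increment: float = 1 / 256          # assumed step up on overflow
    decrement: float = 1 / 4096         # assumed step down on empty queue
    holdoff: float = 0.1                # assumed timeout, roughly one RTT
    disengage_below: float = 0.0        # assumed threshold for switching off

    def _may_update(self, now):
        return now - self.last_update >= self.holdoff

    def on_overflow_drop(self, now):
        # Up-trigger: a packet was dropped because this subqueue overflowed.
        if self._may_update(now):
            self.p_drop = min(1.0, self.p_drop + self.increment)
            self.last_update = now

    def on_empty_dequeue(self, now):
        # Down-trigger: a packet was requested but the subqueue was empty.
        if self.p_drop > 0.0 and self._may_update(now):
            self.p_drop = max(0.0, self.p_drop - self.decrement)
            self.last_update = now

    def engaged(self):
        return self.p_drop > self.disengage_below

    def should_drop(self, rnd=None):
        # Probabilistic drop, independent of the packet rate.
        rnd = random.random() if rnd is None else rnd
        return rnd < self.p_drop


blue = BlueState()
blue.on_overflow_drop(now=0.0)   # p rises by one increment
blue.on_overflow_drop(now=0.05)  # ignored, still inside the holdoff
blue.on_empty_dequeue(now=1.0)   # p decays by one decrement
```

With a decrement much smaller than the increment, a flow that starts to behave 
is released only gradually, which is what prompted my question above.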

> 
> Note that this does nothing to improve the situation regarding fragmented 
> packets.  I think the correct solution in that case is to divert all 
> fragments (including the first) into a particular queue dependent only on the 
> host pair, by assuming zero for src and dst ports and a “special” protocol 
> number.  

        I believe the RFC recommends using the (SRC IP, DST IP, Protocol, 
Identification) tuple, as otherwise all fragmented flows between a host pair 
will hash into the same bucket…
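
        To illustrate the difference between the two keying schemes, a small 
Python sketch. Packet, FRAG_PROTO, both key functions and the CRC-based bucket 
hash are hypothetical names and choices made up for illustration only.

```python
import zlib
from typing import NamedTuple


class Packet(NamedTuple):
    src: str
    dst: str
    proto: int
    sport: int
    dport: int
    ip_id: int         # IPv4 Identification field
    is_fragment: bool


FRAG_PROTO = 255  # hypothetical stand-in for the "special" protocol number


def key_host_pair(pkt):
    # Host-pair scheme: zero ports and a special protocol for every fragment.
    if pkt.is_fragment:
        return (pkt.src, pkt.dst, FRAG_PROTO, 0, 0)
    return (pkt.src, pkt.dst, pkt.proto, pkt.sport, pkt.dport)


def key_with_ident(pkt):
    # Alternative scheme: add the Identification field for fragments.
    if pkt.is_fragment:
        return (pkt.src, pkt.dst, pkt.proto, pkt.ip_id)
    return (pkt.src, pkt.dst, pkt.proto, pkt.sport, pkt.dport)


def bucket(key, n_buckets=1024):
    # Illustrative hash only; a real qdisc would use its own flow hash.
    return zlib.crc32(repr(key).encode()) % n_buckets


frag_a = Packet("10.0.0.1", "10.0.0.2", 17, 0, 0, 100, True)
frag_b = Packet("10.0.0.1", "10.0.0.2", 17, 0, 0, 101, True)
same_queue = key_host_pair(frag_a) == key_host_pair(frag_b)     # True
same_tuple = key_with_ident(frag_a) == key_with_ident(frag_b)   # False
```

With the host-pair key, every fragment between two hosts shares one queue. 
With the Identification-based key, fragments of one datagram stay together, but 
separate datagrams between the same hosts get different keys.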

Best Regards
        Sebastian

> This has the distinct advantages of keeping related fragments together, and 
> ensuring they can’t take up a disproportionate share of bandwidth in 
> competition with normal traffic.
> 
> - Jonathan Morton
> 
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake

