With the recent debate over handling unresponsive flows in fq_codel, I had a 
brainwave involving constructing a hybrid AQM which preserves Codel’s excellent 
properties on responsive flows, while also reacting appropriately when faced 
with a UDP flood.  The key difficulty was deciding when to switch over from 
Codel behaviour to a PIE- or RED-like behaviour.

It turns out that BLUE is a perfect fit for this job, because it activates when 
the queue is completely full - an unambiguous signal that Codel has lost the 
plot and is unable to control the queue alone.  BLUE was one of the more 
promising AQMs in the days immediately prior to Codel’s ascendance, so it 
should be effective outside Codel’s speciality.

The name COBALT, as well as referring to a nice shade of blue, can be read as 
“Codel-BLUE Alternate”.

It is unnecessary to explicitly “switch over” between Codel and BLUE; they can 
work in parallel, since their operating characteristics are independent.  It 
may be feasible to simplify the Codel implementation, since it will no longer 
need to handle overload conditions as robustly.  For example, the Codel section 
should use ECN marking whenever possible, and never drop an ECN-Capable packet; 
the BLUE section should ignore ECN capability and simply drop packets, since 
the traffic is evidently not responding to any ECN signals if BLUE is triggered.
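
As a rough illustration of the two sections working in parallel, here is a 
minimal Python sketch of the per-packet decision.  The names (cobalt_verdict, 
codel_signals, blue_p) are hypothetical and not taken from any existing 
implementation; codel_signals stands for "Codel's schedule wants to signal 
congestion on this packet" and blue_p is BLUE's current drop probability.

```python
import random
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    MARK = "mark"   # set ECN CE on the packet
    DROP = "drop"


def cobalt_verdict(codel_signals, ecn_capable, blue_p, rng=random.random):
    """Combine the Codel and BLUE sections for one packet (illustrative only)."""
    # BLUE section: probabilistic drop, deliberately ignoring ECN capability,
    # since traffic that triggers BLUE is evidently not responding to marks.
    if rng() < blue_p:
        return Verdict.DROP
    # Codel section: mark whenever possible, never drop an ECN-capable packet.
    if codel_signals:
        return Verdict.MARK if ecn_capable else Verdict.DROP
    return Verdict.PASS
```

With blue_p at zero this behaves as plain Codel with ECN preferred; as blue_p 
rises, an increasing share of packets is dropped before Codel is consulted.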

One of the major reasons why Codel fails on UDP floods is that its drop 
schedule is time-based.  This is the correct behaviour for TCP flows, which 
respond adequately to one congestion signal per RTT, regardless of the packet 
rate.  However, it means Codel is easily overwhelmed by high-packet-rate 
unresponsive (or anti-responsive, as with TCP acks) floods, which an attacker 
or lab test can easily produce on a high-bandwidth ingress, especially using 
small packets.

BLUE, by contrast, uses a drop *probability*, so its effectiveness on floods is 
independent of the packet rate.  If necessary, its drop rate can increase to 
100% in a reasonable amount of time.
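
To put rough numbers on the difference, the sketch below uses Codel's control 
law (next drop scheduled interval/sqrt(count) after the previous one) to 
estimate what fraction of a flood a time-based schedule removes, next to a 
fixed drop probability.  The 100 ms interval, the count of 100, the packet 
rates and the 50% probability are assumed values chosen for illustration, not 
measurements.

```python
import math


def codel_drop_rate(count, interval_s=0.1):
    """Drops per second once Codel's schedule has reached `count` drops."""
    return math.sqrt(count) / interval_s


def time_based_fraction(count, packets_per_s, interval_s=0.1):
    """Share of arriving packets removed by a time-based drop schedule."""
    return min(1.0, codel_drop_rate(count, interval_s) / packets_per_s)


BLUE_P = 0.5  # assumed drop probability; the same at any packet rate

for pps in (1_000, 100_000):
    # The time-based share shrinks as the packet rate grows; BLUE_P does not.
    print(pps, time_based_fraction(100, pps), BLUE_P)
```

At count = 100 the schedule issues about 100 drops per second, which is a 
meaningful share of a 1,000 packet/s flow but a negligible share of a 
100,000 packet/s flood.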

A couple of details are necessary to integrate BLUE with a flow-isolating qdisc:

BLUE’s up-trigger should be on a packet drop due to overflow (only) targeting 
the individual subqueue managed by that particular BLUE instance.  It is not 
correct to trigger BLUE globally when an overall overflow occurs.  Note also 
that BLUE has a timeout between triggers, which I think should be scaled 
according to the estimated RTT.

BLUE’s down-trigger is on the subqueue being empty when a packet is requested 
from it, again on a timeout.  To ensure this occurs, it may be necessary to 
retain subqueues in the DRR list while BLUE’s drop probability is nonzero.
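
The two triggers above could be sketched per subqueue roughly as follows.  
This is a minimal illustration, not an implementation: the class and method 
names are hypothetical, the step sizes are arbitrary assumed values, and 
hold_time stands for the timeout between triggers (which might be derived 
from the estimated RTT).

```python
from dataclasses import dataclass


@dataclass
class BlueState:
    """BLUE state for one subqueue of a flow-isolating qdisc (illustrative)."""
    hold_time: float                 # minimum seconds between adjustments
    step_up: float = 1 / 256         # assumed increment on overflow
    step_down: float = 1 / 512       # assumed decrement on empty dequeue
    p_drop: float = 0.0
    last_update: float = float("-inf")

    def _ready(self, now):
        # Enforce the timeout between successive triggers.
        return now - self.last_update >= self.hold_time

    def on_overflow_drop(self, now):
        # Up-trigger: a packet was dropped for overflow from THIS subqueue.
        if self._ready(now):
            self.p_drop = min(1.0, self.p_drop + self.step_up)
            self.last_update = now

    def on_empty_dequeue(self, now):
        # Down-trigger: a packet was requested but this subqueue was empty.
        if self.p_drop > 0.0 and self._ready(now):
            self.p_drop = max(0.0, self.p_drop - self.step_down)
            self.last_update = now

    def keep_in_drr_list(self):
        # Keep the subqueue scheduled so the down-trigger can still fire.
        return self.p_drop > 0.0
```

The keep_in_drr_list check reflects the point about retaining subqueues: if 
an empty subqueue were removed from the DRR list immediately, nothing would 
ever request a packet from it and the drop probability would never decay.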

Note that this does nothing to improve the situation regarding fragmented 
packets.  I think the correct solution in that case is to divert all fragments 
(including the first) into a particular queue dependent only on the host pair, 
by assuming zero for src and dst ports and a “special” protocol number.  This 
has the distinct advantages of keeping related fragments together, and ensuring 
they can’t take up a disproportionate share of bandwidth in competition with 
normal traffic.
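
A minimal sketch of that classification rule, assuming the flow key is a 
simple tuple that is later hashed to pick a queue.  The function name and the 
FRAGMENT_PSEUDO_PROTO value are hypothetical; the value is merely chosen to 
lie outside the 8-bit IP protocol number space so that it cannot collide with 
a real protocol.

```python
# Hypothetical "special" protocol number, outside the 8-bit IP protocol range.
FRAGMENT_PSEUDO_PROTO = 0xFFFF


def flow_key(src, dst, proto, sport, dport, is_fragment):
    """Return the tuple used to select a subqueue (illustrative only)."""
    if is_fragment:
        # All fragments, including the first, share one key per host pair:
        # ports are treated as zero and the protocol is replaced.
        return (src, dst, FRAGMENT_PSEUDO_PROTO, 0, 0)
    return (src, dst, proto, sport, dport)
```

Here is_fragment is meant to be true for any packet belonging to a fragmented 
datagram, so the first fragment (which does carry port numbers) lands in the 
same queue as the later ones (which do not).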

 - Jonathan Morton

_______________________________________________
Codel mailing list
https://lists.bufferbloat.net/listinfo/codel