Re: [aqm] CoDel's control law that determines drop frequency

2015-11-03 Thread Jonathan Morton

> On 3 Nov, 2015, at 18:22, Jeff Weeks  wrote:
> 
> I believe that means 'count' has to reach some nearly impossibly high value 
> of (100ms/5198ns)^2 == 370,107,128
> 
> I say nearly impossible, because it will take minutes (hours?) to get that 
> high (if my math is correct, it'll take over 17 seconds just to reach 7500).

Note that since count increments on every signal given (every packet dropped, 
in this case), it will increment faster as time goes on, as long as there is a 
sufficient packet frequency to support it (as there is in this case).

However, even 17 seconds is obviously too slow to control the buffer.

> In the meantime, the queue *isn't* being effectively managed, as packets with 
> extremely high latencies will be transmitted for far too long.

The problem is that you have an unresponsive flow here.  Codel is designed to 
give adequate congestion signals to *responsive* flows that require only one 
signal per RTT.  Unresponsive flows are instead caught by the buffer limit, 
causing hard packet drops.

This also means that where a responsive flow and an unresponsive flow share the 
same buffer, Codel doesn’t perform very well; the responsive flow backs off 
while the unresponsive flow continues to saturate the buffer.  (This is also 
what happens, intentionally, with uTP and standard TCP in a dumb FIFO.)

This is fixed if you combine Codel with flow isolation (ie. FQ), since then the 
responsive and unresponsive flows do not share the same buffer.  When the 
buffer limit is reached, only the longest queue is culled, and this will be the 
unresponsive flow.

 - Jonathan Morton

___
aqm mailing list
aqm@ietf.org
https://www.ietf.org/mailman/listinfo/aqm


Re: [aqm] CoDel's control law that determines drop frequency

2015-11-03 Thread Dave Taht
It helps to have the codel mailing list cc'd on codel discussions.
Adding this message to the cc.

One of these days we do have to write up - after finishing - cake's
code-likel implementation.


Dave Täht
I just invested five years of my life to making wifi better. And,
now... the FCC wants to make my work, illegal for people to install.
https://www.gofundme.com/savewifi


On Tue, Nov 3, 2015 at 11:22 AM, Jeff Weeks  wrote:
> The drop rate is affected by sojourn time, yes, but a 2x sojourn time goes 
> through the same incremental reduction of interval size, as does a sojourn 
> time of x.
>
> In investigating codel, I've setup various worst case scenarios, and I feel 
> like the algorithm could be made better by having its response time more 
> dependent upon how far away from the target latency it is.
>
> For example, consider a disabled interface, with a large (or even 
> conceptually infinite queue), that's attached to a fairly small shaper.
>
> The interface is then enabled, and immediately starts seeing 100Mbps, and 
> tries to shape it to 1Mbps.
>
> The queue will obviously build up quickly, and codel will notice this, and 
> enter the drop state.  But it will start at count = 1.
>
> If the interface is receiving 64b udp packets, then it'll be receiving 
> 100,000,000/512 == 195,312 packets per second, and only transmitting 1953 
> packets per second.
>
> The default target is 5ms, which is about 10 packets.  So of those 195,312 
> packets/second, we should ideally be dropping 195,312 - (1953 + 10) == 
> 192,349 packets/second.
>
> But in order to drop that many packets, 'count' needs to ramp up to the point 
> where the drop interval is consistently 5,198 ns.
>
> I believe that means 'count' has to reach some nearly impossibly high value 
> of (100ms/5198ns)^2 == 370,107,128
>
> I say nearly impossible, because it will take minutes (hours?) to get that 
> high (if my math is correct, it'll take over 17 seconds just to reach 7500).
>
> In the meantime, the queue *isn't* being effectively managed, as packets with 
> extremely high latencies will be transmitted for far too long.
>
> Of course, as I stated earlier, simply increasing count more quickly, based 
> on how far away we are from the target latency effectively invalidates the 
> optimization which most (all?) codel implementations use (namely the newton 
> step integer-only sqrt approximation) as, at some point, the approximation 
> starts *diverging* from the appropriate value.
>
> One alternative which I've been investigating is the possibility of skewing 
> the precalculated 1/sqrt(count) value.
>
> If this is kept as a 32-bit all decimal fixed point number, then performing 
> the multiplication by intentionally miss-shifting will result in doubling 
> sqrt(count):
>
> eg, take the following accurate calculation of next interval:
>
> codel->next_interval_start_ticks = base_time + ((interval * 
> codel->one_over_sqrt_count) >> 32)
>
> And intentionally miss-shift by 1 bit:
>
> codel->next_interval_start_ticks = base_time + ((interval * 
> codel->one_over_sqrt_count) >> 33)
>
> Will effectively have the interval reduce twice as fast.
>
> Alternatively, (and similarly to how CAKE halves the count while re-entering 
> the drop interval), count can periodically be doubled, if the current value 
> is seen to not be adequately affecting traffic, and the pre-calculated 
> 1/sqrt(count) can then be divided by sqrt(2) (i.e., do not rely on the newton 
> step approximation for this modification of count).
>
> Cheers,
> --Jeff
>
>
>
> 
> /dev/jeff_weeks.x2936
> Sandvine Incorporated
> 
> From: aqm [aqm-boun...@ietf.org] on behalf of Andrew Mcgregor 
> [andrewm...@google.com]
> Sent: Sunday, October 25, 2015 6:44 PM
> To: Dave Dolson
> Cc: Kathleen Nichols; Bob Briscoe; Dave Taht; Van Jacobson; AQM IETF list
> Subject: Re: [aqm] CoDel's control law that determines drop frequency
>
> CoDel does have the form of a controller; drop rate (not probability) is a 
> function of sojourn time (not queue size) and history, encoded in the state 
> variables.
>
> Now, I don't take it as proven that the particular form of the controller is 
> the best we could do, but making it a rate and based on sojourn time are 
> clear wins. Yes, you can use size as a proxy for sojourn time if your link 
> really has a constant bit rate, but not even ethernet is exactly CBR in 
> practice (and in some hardware situations, knowing the size is much more 
> expensive than measuring sojourn; the opposite can also apply).  Yes, you can 
> use probability as a proxy for rate if you are doing a hardware 
> implementation or otherwise doing the rate the way CoDel does is hard.
>
> PIE is closely related; the controller is the biggest difference, and it uses 
> both those proxies because it was aimed at a situation where there was 
> hardware support for one set of choices.
>
> On 23 October 2015 at 07:07, Dave Dols

Re: [aqm] CoDel's control law that determines drop frequency

2015-11-03 Thread Jeff Weeks
The drop rate is affected by sojourn time, yes, but a 2x sojourn time goes 
through the same incremental reduction of interval size, as does a sojourn time 
of x.

In investigating codel, I've setup various worst case scenarios, and I feel 
like the algorithm could be made better by having its response time more 
dependent upon how far away from the target latency it is.

For example, consider a disabled interface, with a large (or even conceptually 
infinite queue), that's attached to a fairly small shaper.

The interface is then enabled, and immediately starts seeing 100Mbps, and tries 
to shape it to 1Mbps.

The queue will obviously build up quickly, and codel will notice this, and 
enter the drop state.  But it will start at count = 1.

If the interface is receiving 64b udp packets, then it'll be receiving 
100,000,000/512 == 195,312 packets per second, and only transmitting 1953 
packets per second.

The default target is 5ms, which is about 10 packets.  So of those 195,312 
packets/second, we should ideally be dropping 195,312 - (1953 + 10) == 192,349 
packets/second.

But in order to drop that many packets, 'count' needs to ramp up to the point 
where the drop interval is consistently 5,198 ns.

I believe that means 'count' has to reach some nearly impossibly high value of 
(100ms/5198ns)^2 == 370,107,128

I say nearly impossible, because it will take minutes (hours?) to get that high 
(if my math is correct, it'll take over 17 seconds just to reach 7500).

In the meantime, the queue *isn't* being effectively managed, as packets with 
extremely high latencies will be transmitted for far too long.

Of course, as I stated earlier, simply increasing count more quickly, based on 
how far away we are from the target latency effectively invalidates the 
optimization which most (all?) codel implementations use (namely the newton 
step integer-only sqrt approximation) as, at some point, the approximation 
starts *diverging* from the appropriate value.

One alternative which I've been investigating is the possibility of skewing the 
precalculated 1/sqrt(count) value.

If this is kept as a 32-bit all decimal fixed point number, then performing the 
multiplication by intentionally miss-shifting will result in doubling 
sqrt(count):

eg, take the following accurate calculation of next interval:

codel->next_interval_start_ticks = base_time + ((interval * 
codel->one_over_sqrt_count) >> 32)

And intentionally miss-shift by 1 bit:

codel->next_interval_start_ticks = base_time + ((interval * 
codel->one_over_sqrt_count) >> 33)

Will effectively have the interval reduce twice as fast.

Alternatively, (and similarly to how CAKE halves the count while re-entering 
the drop interval), count can periodically be doubled, if the current value is 
seen to not be adequately affecting traffic, and the pre-calculated 
1/sqrt(count) can then be divided by sqrt(2) (i.e., do not rely on the newton 
step approximation for this modification of count).

Cheers,
--Jeff




/dev/jeff_weeks.x2936
Sandvine Incorporated

From: aqm [aqm-boun...@ietf.org] on behalf of Andrew Mcgregor 
[andrewm...@google.com]
Sent: Sunday, October 25, 2015 6:44 PM
To: Dave Dolson
Cc: Kathleen Nichols; Bob Briscoe; Dave Taht; Van Jacobson; AQM IETF list
Subject: Re: [aqm] CoDel's control law that determines drop frequency

CoDel does have the form of a controller; drop rate (not probability) is a 
function of sojourn time (not queue size) and history, encoded in the state 
variables.

Now, I don't take it as proven that the particular form of the controller is 
the best we could do, but making it a rate and based on sojourn time are clear 
wins. Yes, you can use size as a proxy for sojourn time if your link really has 
a constant bit rate, but not even ethernet is exactly CBR in practice (and in 
some hardware situations, knowing the size is much more expensive than 
measuring sojourn; the opposite can also apply).  Yes, you can use probability 
as a proxy for rate if you are doing a hardware implementation or otherwise 
doing the rate the way CoDel does is hard.

PIE is closely related; the controller is the biggest difference, and it uses 
both those proxies because it was aimed at a situation where there was hardware 
support for one set of choices.

On 23 October 2015 at 07:07, Dave Dolson 
mailto:ddol...@sandvine.com>> wrote:
Catching up...

Bob said:
> I meant to say in my mail to Andrew that the rate at which the control
> law increases the dropping frequency ought to depend on how far the
> queue is from target, or even how quickly it is diverging from target.
> I've always said that this control law should not be just a hard-coded
> thing that has no dependency on how the queue is moving.

In other words, make it a PID controller, not just a PI controller?
(Or is it currently just "I" ?)
With consideration of the derivative term (how quickly the queue
size is moving away from