Andrea,

Can you please clarify whether, in your simulation, a common drop probability is 
applied to all queues, or whether the drop probability is explicitly scaled by 
queue length, as described in the DOCSIS report?
"Then, at enqueue time, the drop probability applied to a packet destined for 
queue X
is scaled based on the ratio of the queue depth
of queue X and the queue depth of the current largest queue."

The latter would help explain the observed packet drop rates.
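
For concreteness, here is a minimal sketch (in Python, with my own variable names, 
not taken from the DOCSIS report or from your code) of how I read that rule:

import random

def fq_pie_enqueue_drop(p_aggregate, qlen_x, qlen_max):
    # Sketch of the quoted DOCSIS-style rule: the aggregate drop probability
    # is scaled by the ratio of queue X's depth to the depth of the currently
    # largest queue before the coin flip at enqueue time.
    if qlen_max <= 0:
        return False                    # nothing queued, nothing to scale against
    p_scaled = p_aggregate * (qlen_x / qlen_max)
    return random.random() < p_scaled   # True means drop the arriving packet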

Anil

From: Rong Pan (ropan) [mailto:ro...@cisco.com]
Sent: Thursday, July 09, 2015 1:34 PM
To: Francini, Andrea (Andrea); Agarwal, Anil; Polina Goltsman; Bless, Roland 
(TM); Fred Baker (fred); Toke Høiland-Jørgensen
Cc: Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: Re: [aqm] FQ-PIE kernel module implementation

It is highly possible that the drop probability does not agree with the drop 
rate, as is the case here. However, the power of the PI controller makes sure 
that the latency is kept in check. The exact value of the drop probability does 
not matter as long as the final drop rate is correct, so that the queue/latency 
is stable.

CUBIC is kinda magic. I don't know whether there is a theoretical study of its 
behavior. If so, I would like to know and study it.

Thanks,

Rong

From: aqm <aqm-boun...@ietf.org<mailto:aqm-boun...@ietf.org>> on behalf of 
"Francini, Andrea (Andrea)" 
<andrea.franc...@alcatel-lucent.com<mailto:andrea.franc...@alcatel-lucent.com>>
Date: Thursday, July 9, 2015 at 6:26 AM
To: "Agarwal, Anil" <anil.agar...@viasat.com<mailto:anil.agar...@viasat.com>>, 
Polina Goltsman 
<polina.golts...@student.kit.edu<mailto:polina.golts...@student.kit.edu>>, 
"Bless, Roland (TM)" <roland.bl...@kit.edu<mailto:roland.bl...@kit.edu>>, "Fred 
Baker (fred)" <f...@cisco.com<mailto:f...@cisco.com>>, Toke Høiland-Jørgensen 
<t...@toke.dk<mailto:t...@toke.dk>>
Cc: "Hironori Okano -X (hokano - AAP3 INC at Cisco)" 
<hok...@cisco.com<mailto:hok...@cisco.com>>, AQM IETF list 
<aqm@ietf.org<mailto:aqm@ietf.org>>
Subject: Re: [aqm] FQ-PIE kernel module implementation

The packet drop rate is very different for the TCP and UDP queues: 0.017% and 
12.7% are the values measured in the 100ms RTT case (with PIE drop probability 
at about 16%). The random generator for the drop probability would indeed drop 
at the 16% rate, but whenever a TCP packet arrives at the FQ-PIE queue (10.4% 
of the cases), the drop probability is scaled down quite drastically. In other 
words, the full 16% packet drop probability only applies to a fraction of the 
incoming packets, which yields a lower total packet drop rate (11.3%). This is 
an intrinsic property of FQ-PIE with drop probability derived from aggregate 
state: the effective drop rate is systematically lower than the drop 
probability set by the algorithm, because the scaling by queue length ratio 
produces a drop probability value that is never larger than the one produced by 
the control equation. I wonder if this discrepancy should be a reason for 
concern from a control-theory perspective.
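
As a quick sanity check, taking the arrival shares and per-queue drop rates 
quoted above and just computing their weighted average:

# Arrival shares and per-queue drop rates from the 100ms RTT run above.
f_tcp, p_tcp = 0.104, 0.00017   # TCP: 10.4% of arrivals, ~0.017% drop rate
f_udp, p_udp = 0.896, 0.127     # UDP: 89.6% of arrivals, ~12.7% drop rate

# The aggregate drop rate is the arrival-weighted average of the two, which
# lands near the 11.3% total and well below the ~16% from the control equation.
total = f_tcp * p_tcp + f_udp * p_udp
print("aggregate drop rate ~ %.1f%%" % (100 * total))   # ~ 11.4%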

I also wonder if there is an equation that accurately relates the long-term 
CUBIC drop rate to the long-term CUBIC throughput when the CUBIC drop 
probability oscillates with the CUBIC queue length.

Regards,

Andrea


From: Agarwal, Anil [mailto:anil.agar...@viasat.com]
Sent: Thursday, July 09, 2015 5:31 AM
To: Francini, Andrea (Andrea); Polina Goltsman; Bless, Roland (TM); Fred Baker 
(fred); Toke Høiland-Jørgensen
Cc: Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: RE: [aqm] FQ-PIE kernel module implementation

Andrea,

This is good info.

Slightly surprising how CUBIC manages to keep cwnd large in the presence of 
such a large packet discard probability.

Might be useful to try and compare with the per-queue packet drop probability 
algorithm mentioned in the DOCSIS report.

Regards,
Anil


From: Francini, Andrea (Andrea) [mailto:andrea.franc...@alcatel-lucent.com]
Sent: Wednesday, July 08, 2015 1:52 PM
To: Agarwal, Anil; Polina Goltsman; Bless, Roland (TM); Fred Baker (fred); Toke 
Høiland-Jørgensen
Cc: Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: RE: [aqm] FQ-PIE kernel module implementation

Hi Anil,

I have run a few experiments in my ns2 environment, using my own FQ version of 
a rather old PIE module. I used 16ms for both the tUpdate and qdelay_ref 
parameters.

DISCLAIMER: Please do not take the results as indicative of the actual FQ-PIE 
(and plain PIE) behavior, but rather of a multi-queue (and single-queue) AQM 
scheme that instantiates some of the PIE principles.

The link has 100Mbps capacity. The buffer size is rather large (at least 6x 
BDP) so that buffer overflow occurs only if the AQM drop decisions are not 
sufficient to keep the queue length from growing uncontrolled.

The UDP traffic comes in at the constant rate of 101 Mbps. TCP traffic is from 
a CUBIC source (also a very old ns2 model). I used 50ms and 100ms RTT for the 
TCP traffic.

The experiments run for 300s of simulated time. The collection of statistics 
starts at time 100s, so all averages are computed over a 200s period. The 
scenario may be far from realistic, but it helps highlight basic properties of 
the AQM scheme.

With 50ms RTT, the CUBIC flow gets 32.3% of its 50Mbps fair share (2.15% with 
plain PIE, a 15x drop). The PIE drop probability (which applies directly to 
UDP traffic, while it is scaled down for the shorter TCP queue) settles around 
16%.
With 100ms RTT, the CUBIC flow gets 23.6% of its 50Mbps fair share (1.33% with 
plain PIE). The PIE drop probability also settles around 16%, but with wider 
oscillations.

The attached plot shows a 25s window of the evolution of the aggregate queue 
length and of the congestion window size of the CUBIC source (100ms RTT case). 
The 100% dashed line marks the cwnd size that yields 100% of the fair share.

The drop probability is much larger than 1%. This is because the TCP traffic 
adds to the 101% input load of UDP. The queue length oscillates at a very high 
frequency, loosely modulated by the CUBIC cwnd. As soon as the length of the 
CUBIC queue accumulates a few units, the TCP drop probability given by the 
FQ-PIE formula becomes large enough to cause the loss of a TCP packet and push 
the TCP queue back to zero occupancy.
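
For a rough feel of the mechanism, here is an illustration with assumed numbers 
(1500-byte packets, the 16ms qdelay_ref, and the UDP queue holding essentially 
the whole standing queue); these values are plausible for the setup above but 
are not taken from the simulation output:

link_rate  = 100e6 / 8    # bytes/s on the 100 Mbps link
qdelay_ref = 0.016        # s, the PIE delay target used in these runs
pkt_size   = 1500         # bytes, assumed

qlen_max = qdelay_ref * link_rate / pkt_size   # ~133 packets, essentially the UDP queue
p_aggregate = 0.16                             # PIE drop probability from above

# With only a handful of TCP packets queued the scaled probability is tiny,
# but it applies to every TCP arrival, so a drop still happens within a few
# hundred packets, i.e. every couple of RTTs at the observed TCP rate.
qlen_tcp = 5
p_tcp_scaled = p_aggregate * qlen_tcp / qlen_max   # ~0.006
pkts_between_drops = 1 / p_tcp_scaled              # ~170 packets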

TCP traffic suffers because the TCP drop probability is tied to the overall 
buffer occupancy, which UDP keeps trying to increase. We agree that it would be 
much better if the drop probability were defined exclusively by the state of the 
TCP queue, but this is not easy to realize in practice, at least according to 
the explanation given in the May 2014 CableLabs document for the decision made 
there to use aggregate state for setting the drop probability (the explanation 
overlaps in part with the issues you identify in your message). The aggregate 
approach could be improved with a better formula for the per-queue drop 
probability, but I doubt it will be easy to find one that fits all use cases 
and types of TCP source well.

Regards,

Andrea

From: Agarwal, Anil [mailto:anil.agar...@viasat.com]
Sent: Tuesday, July 07, 2015 9:27 PM
To: Francini, Andrea (Andrea); Polina Goltsman; Bless, Roland (TM); Fred Baker 
(fred); Toke Høiland-Jørgensen
Cc: Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: RE: [aqm] FQ-PIE kernel module implementation

Hi Andrea,

I am quite sure FQ-PIE with aggregate-queue AQM will have some advantages over 
PIE with a single queue, although not by much in this use case described by 
Polina.
In this case, assume that the unresponsive traffic arrives at a rate just 1% 
over the link rate.
PIE will converge to a drop probability of around 1%.
The TCP connection will also experience a ~1% packet drop rate.
At that drop rate, the TCP goodput will be quite small: ~160 kbps.
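
For reference, the usual back-of-the-envelope relation behind such an estimate 
is the Mathis-style square-root law; the exact goodput figure one gets depends 
strongly on the assumed MSS, on the effective RTT (base RTT plus queuing delay) 
and on whether timeouts are modelled, so the sketch below is illustrative only:

from math import sqrt

def reno_rate_estimate(mss_bytes, rtt_s, p):
    # Mathis-style estimate: rate ~ 1.22 * MSS / (RTT * sqrt(p)), in bit/s.
    # Models that also account for timeouts (e.g. PFTK) give lower figures at
    # higher loss rates, and CUBIC follows a different response function.
    return 1.22 * mss_bytes * 8 / (rtt_s * sqrt(p))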

I suspect that the advantages will show up in cases with multiple responsive 
flows, as better fairness and delay properties across flows.

Also, we have not discussed any advantages of FQ-PIE with aggregate queue vs 
FQ-PIE with per-queue AQM. I am sure there are some.
One thought is that the aggregation will result in less "noise" in the 
algorithm input variables and more stability in the state variable values.
Imagine per-queue AQM having to deal with individual short-lived TCP 
connections, with a slow start for each connection and very few RTTs to adapt 
connection rates. How well will it control delays and aggregate buffer usage? 
(Better than a single queue with tail drops, but that is a very low bar.) 
Perhaps we will need help from techniques such as TCP Hybrid Slow Start.
Some analysis or simulations of FQ-PIE with aggregate queue vs FQ-PIE with 
per-queue AQM would be useful.
Note that FQ-Codel with aggregate queue AQM is not a viable option.

Regards,
Anil

From: Francini, Andrea (Andrea) [mailto:andrea.franc...@alcatel-lucent.com]
Sent: Tuesday, July 07, 2015 3:05 PM
To: Agarwal, Anil; Polina Goltsman; Bless, Roland (TM); Fred Baker (fred); Toke 
Høiland-Jørgensen
Cc: 
draft-ietf-aqm-...@tools.ietf.org<mailto:draft-ietf-aqm-...@tools.ietf.org>; 
Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: RE: [aqm] FQ-PIE kernel module implementation

Hi Anil,

One comment about the first point of your summary:

While FQ-PIE does drop TCP throughput compared to the fair share, a 
single-queue AQM will do even worse in the same scenario where the input rate 
of the UDP flow exceeds the output rate of the queue (no TCP throughput at 
all). I also suspect that, if the FQ-PIE experiment is repeated with a smaller 
RTT, closer to the PIE delay target, we may see some improvement for TCP (and 
more so with CUBIC vs. Reno).

FQ-AQM with per-queue state (including the case of a fixed tail-drop threshold 
per queue) does succeed in enforcing the fair share, but if the drop threshold 
is oversized compared to the flow RTT the price to pay is a large 
self-inflicted queuing delay.

It is true that any scheme that uses aggregate state (typically the overall 
buffer occupancy or queuing delay) to make drop decisions will lose flow 
isolation/protection to some extent. However, there are important quantitative 
differences that may emerge depending on the way the FQ-AQM uses the aggregate 
state.

Regards,

Andrea


From: aqm [mailto:aqm-boun...@ietf.org] On Behalf Of Agarwal, Anil
Sent: Tuesday, July 07, 2015 1:31 PM
To: Polina Goltsman; Bless, Roland (TM); Fred Baker (fred); Toke 
Høiland-Jørgensen
Cc: 
draft-ietf-aqm-...@tools.ietf.org<mailto:draft-ietf-aqm-...@tools.ietf.org>; 
Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: Re: [aqm] FQ-PIE kernel module implementation

Polina, Roland,

This is good info.
So, here is a short summary of our analysis, for FQ-PIE with aggregate-queue 
AQM:

1. In the presence of unresponsive flows, FQ-PIE has similar properties to 
single-queue AQMs: the responsive flows are squeezed down to use the leftover 
bandwidth, if any. FQ-AQM with per-queue AQM performs better.

2. In the presence of flows that do not use their fair share (temporarily or 
permanently), FQ-PIE has similar properties to single-queue AQMs: the flows 
that do not use their fair share experience non-zero packet drops. FQ-AQM with 
per-queue AQM performs better.

3. In the presence of flows that do not use their fair share (temporarily or 
permanently), the queue size and queuing delay of flows that do use their fair 
share can grow above the desired target value.

#2 and #3 are probably not major issues, especially in a network bottleneck 
with a large number of diverse flows.
But it is worth pointing out and documenting these properties (somewhere).

Regards,
Anil



From: Polina Goltsman [mailto:polina.golts...@student.kit.edu]
Sent: Tuesday, July 07, 2015 5:09 AM
To: Bless, Roland (TM); Agarwal, Anil; Fred Baker (fred); Toke Høiland-Jørgensen
Cc: 
draft-ietf-aqm-...@tools.ietf.org<mailto:draft-ietf-aqm-...@tools.ietf.org>; 
Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: Re: [aqm] FQ-PIE kernel module implementation

Hello all,

Here are my thoughts about interaction of AQM and fair-queueing system.

I think I will start with a figure. I started a TCP flow with netperf and, 15 
seconds later, an unresponsive UDP flow with iperf with a send rate a little bit 
above the bottleneck link capacity. Both flows run together for 50 seconds.
This figure plots the throughput of the UDP flow as reported by the iperf 
server. (Apparently netperf doesn't produce any output if the throughput is 
below some value, so I can't plot the TCP flow.) The bottleneck is 100 Mb/s and 
the RTT is 100 ms. All AQMs were configured with their default values and the 
noecn flag.
[Figure: throughput of the UDP flow as reported by the iperf server]

Here is my example in theory. A link with capacity C is shared between two 
flows: a non-application-limited TCP flow and an unresponsive UDP flow with 
send rate 105% of C. Both flows send max-sized packets, so round robin can be 
used instead of a fair-queueing scheduler.

By the definition of max-min fair share, both flows are supposed to get 50% of 
the link capacity.

(1) Taildrop queues:
UDP packets will be dropped when their queue is full, and TCP packets will be 
dropped when their queue is full. As long as there are packets in the TCP 
flow's queue, TCP should receive its fair share. (As far as I understand, this 
depends on the size of the queue.)

(2) AQM with state per queue:
The drop probability of the UDP flow will always be non-zero and should 
stabilize at approximately 0.5.
The drop probability of the TCP flow will be non-zero only when it starts 
sending above 50% of C. Thus, while TCP recovers from packet drops, it should 
not receive another drop.
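
As a quick check of the "approximately 0.5" figure, assuming the scheduler holds 
the UDP flow to exactly its fair share:

C = 100e6                          # link capacity, bit/s
udp_send  = 1.05 * C               # unresponsive UDP send rate
udp_share = 0.5 * C                # fair share the scheduler lets through
p_udp = 1 - udp_share / udp_send   # ~0.52, i.e. roughly 0.5 as stated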

(3) AQM with state per aggregate:
The UDP flow always creates a standing queue, so the drop probability of the 
aggregate is always non-zero. Let's call it p_aqm.
The share of TCP packets in the aggregate is p_tcp = TCP send rate / (TCP send 
rate + UDP send rate), and the probability of dropping a TCP packet is p_aqm * 
p_tcp. This probability is non-zero unless TCP doesn't send at all.

In (3) the drop probability is at least different. I assume that it is larger 
than in (2), which will cause more packet drops for the TCP flow, and as a 
result the flow will reduce its sending rate below its fair share.
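
Here is a small sketch of case (3); the TCP rate and p_aqm values are 
assumptions chosen only to show the shape of the formula:

C = 100e6                   # link capacity, bit/s
udp_rate = 1.05 * C         # unresponsive UDP send rate
tcp_rate = 0.30 * C         # whatever the TCP source currently sends (assumed)
p_aqm = 0.5                 # aggregate drop probability (assumed for illustration)

p_tcp = tcp_rate / (tcp_rate + udp_rate)   # share of TCP packets in the aggregate
p_drop_tcp = p_aqm * p_tcp                 # non-zero for any non-zero TCP rate,
                                           # unlike the per-queue case (2) above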

Regards,
Polina
On 07/07/2015 10:06 AM, Bless, Roland (TM) wrote:

Hi,

thanks for your analysis. Indeed, Polina came up with a similar analysis for an 
unresponsive UDP flow and a TCP flow. Flow queueing can achieve link share 
fairness despite the presence of unresponsive flows, but is ineffective if the 
AQM is applied to the aggregate and not to the individual flow queue. Polina 
used the FQ-PIE implementation to verify this behavior (post will follow).

Regards,
 Roland





On 04.07.2015 at 22:12, Agarwal, Anil wrote:

Roland, Fred,

Here is a simple example to illustrate the differences between FQ-AQM with AQM 
per queue vs AQM per aggregate queue.

Let's take 2 flows, each mapped to separate queues in a FQ-AQM system.
   Link rate = 100 Mbps
   Flow 1 rate = 50 Mbps, source rate does not go over 50 Mbps
   Flow 2 rate >= 50 Mbps, adapts based on AQM.

FQ-Codel, AQM per queue:
   Flow 1 delay is minimal
   Flow 1 packet drops = 0
   Flow 2 delay is close to target value

FQ-Codel, AQM for aggregate queue:
   Does not work at all
   Packets are dequeued alternately from queue 1 and queue 2
   Packets from queue 1 experience very small queuing delay
   Hence, CoDel does not enter the dropping state and queue 2 is not controlled :(

FQ-PIE, AQM per queue:
   Flow 1 delay is minimal
   Flow 1 packet drops = 0
   Flow 2 delay is close to target value

FQ-PIE, AQM for aggregate queue:
   Flow 1 delay and queue 1 length are close to zero.
   Flow 2 delay is close to 2 * target_del :(
           qlen2 = target_del * aggregate_depart_rate
   Flow 1 experiences almost the same number of drops or ECN marks as flow 2 :(
           Same drop probability and almost the same packet rate for both flows
   (If flow 1 drops its rate because of packet drops or ECNs, the analysis gets 
slightly more complicated.)

See if this makes sense.

If the analysis is correct, then it illustrates that flow behaviors are quite 
different between AQM per queue and AQM per aggregate queue schemes.
In FQ-PIE for aggregate queue,
   - The total number of queued bytes will slosh between queues depending on 
the nature and data rates of the flows.
   - Flows with data rates within their fair share value will experience 
non-zero packet drops (or ECN marks).
   - Flows that experience no queuing delay will increase the queuing delay of 
other flows.
   - In general, the queuing delay for any given flow will not be close to 
target_delay and can be much higher.
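
To put rough numbers on the 2 * target_del estimate above (the 15 ms target and 
the assumption that flow 1 is never backlogged are mine, for illustration only):

target_del = 0.015             # s, assumed PIE delay target
link_rate  = 100e6 / 8         # bytes/s on the 100 Mbps link

# PIE holds the aggregate delay near target_del, so the standing queue is
# about target_del * departure rate; with flow 1 never backlogged it sits
# almost entirely in queue 2.
qlen2 = target_del * link_rate           # ~187.5 KB

# Queue 2 only drains at its round-robin share (half the link), so its own
# queuing delay is roughly twice the target.
flow2_delay = qlen2 / (link_rate / 2)    # ~0.03 s = 2 * target_del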






