Dear Wolfram, 

Thanks a lot for your useful feedback. 
Please have a look at my answers inline. 

Kind regards, 

Nicolas

On Aug 12, 2014, at 10:04 AM, "LAUTENSCHLAEGER, Wolfram (Wolfram)" 
<[email protected]> wrote:

> I agree with most of the suggested features that must be tested 
> for AQM evaluation.
> 
> But I have some doubts if the proposed experiments/metrics
> are really applicable and able to reveal the required features.
> 
> My comments in detail:
> 
> Section 2.1: Flow completion time:
> 
> Is applicable only for equally sized finite flows. 
> Not really meaningful for variable sized flows, e.g. the Tmix trace. 
> Not applicable to infinite flows.
> 

This comment points out that the metrics may not suit every type of traffic 
detailed in the rest of the draft.
Since Section 2 simply lists the metrics that can be measured, I propose 
to add the following paragraph at the beginning of Section 2:
"The metrics listed in this section may not suit every type of traffic detailed 
in the rest of this document: not all of the following metrics need to be 
measured in every experiment. For each scenario, the metrics should be selected 
from the following list, depending on the traffic considered."

> Section 2.2: Packet loss: 
> 
> - Long term loss probability is meaningful only in a steady state
> scenario. And it characterizes the TCP flavor, not the AQM. (Loss
> probability remains the same, whatever you do with AQM, as long as you
> reach roughly the same throughput.)
> 

The long term loss probability might not be of interest for all scenarios. 
To make the presentation of this metric more general, we could evaluate 
the packet drop probability over the course of the experiment. 

I propose to replace "the long term packet loss probability" with:
"       - the packet loss probability: this metric should be measured 
frequently during the experiment, since the long term loss probability is of 
interest for steady state scenarios only;"
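
To illustrate what I mean by "measured frequently", here is a quick Python
sketch of a windowed loss probability; the (timestamp, dropped) event format
and the one-second window are my own assumptions, not part of the draft:

```python
# Sketch of measuring packet loss probability over fixed windows,
# rather than as a single long-term average. The event format
# (timestamp, dropped-flag) and the window length are illustrative
# assumptions.

def windowed_loss_probability(events, window=1.0):
    """events: list of (timestamp, dropped) pairs, dropped is a bool.
    Returns a list of (window_start, loss_probability) pairs."""
    if not events:
        return []
    end = max(t for t, _ in events)
    results = []
    start = 0.0
    while start <= end:
        in_win = [d for t, d in events if start <= t < start + window]
        if in_win:
            results.append((start, sum(in_win) / len(in_win)))
        start += window
    return results

# 4 packets arrive in [0, 1), one of them dropped, one more in [1, 2):
samples = [(0.1, False), (0.4, True), (0.7, False), (0.9, False), (1.2, False)]
# windowed_loss_probability(samples) -> [(0.0, 0.25), (1.0, 0.0)]
```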

> - interval between consecutive losses: If the losses are well spaced,
> it resembles somehow the loss probability. If not well spaced (bursty), 
> what to record?
> 

To illustrate the interval between losses, we could propose to measure the time 
between two losses and provide the minimum, 10th percentile, median, 90th 
percentile and maximum values of the set of "interval times".

I propose to replace "the interval between consecutive losses" with: 
"       - the interval between consecutive losses: the time between two losses 
should be measured. From the set of interval times, the tester should present 
the median value, the minimum and maximum values and the 10th and 90th 
percentiles." 

> - packet loss patterns: Metric is undefined, except the special case 
> "packet loss synchronization", next section, 2.3. It is, indeed, highly 
> interesting qualitatively in non-stationary cases, e.g. abrupt capacity 
> drop. But how to quantify?
> 

I agree that it will be hard to quantify this metric. The idea behind the 
"packet loss pattern" is to obtain a description of the loss pattern, which 
might be derived from the loss probability and the interval between 
consecutive losses. As a result, we will remove the "packet loss patterns" 
metric in the next version of the document. 

> Section 2.4: Goodput: 
> 
> - Meaningful only with steady state occupancy by a number of more or 
> less greedy TCP flows. Here it shows, to which extent the AQM is able 
> to keep a link close to the 100% utilization. 
> 
> - With a trace of variable sized flows (Tmix) the goodput resembles the 
> traffic offer (if in total below the link capacity; not overloaded). 
> 
> - The overload scenario does not reach steady state. Goodput in 
> overload cases is highly dependent on other things than AQM, e.g. test 
> duration or shuffling of the trace.
> 

Indeed, some things need to be clarified. I propose to add the following 
paragraph: 
" The measurement of the goodput lets the tester evaluate to what extent the 
AQM is able to maintain a high link utilization. This metric should be obtained 
frequently during the experiment: the long term goodput makes sense for 
steady-state scenarios only and may not reflect how the introduction of AQM 
actually impacts the link utilization. It is worth pointing out that the 
fluctuations of this measurement may depend on factors other than the 
introduction of an AQM, such as physical layer losses, fluctuating bandwidths 
(Wi-Fi), heavy congestion levels or transport layer congestion controls."
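
A quick sketch of what "obtained frequently" could look like in practice;
the (timestamp, payload) delivery log and the one-second interval are my
own assumptions:

```python
# Sketch of computing goodput over successive intervals instead of one
# long-term figure. deliveries is an assumed list of
# (timestamp, payload_bytes) pairs for packets received at the sink.

def goodput_series(deliveries, interval=1.0, duration=None):
    """Return the goodput in bits/s for each interval of the experiment."""
    if duration is None:
        duration = max(t for t, _ in deliveries)
    n = int(duration // interval) + 1
    bits = [0.0] * n
    for t, payload in deliveries:
        idx = int(t // interval)
        if idx < n:
            bits[idx] += payload * 8
    return [b / interval for b in bits]

# Two 1500-byte packets in [0, 1) and one in [1, 2):
# goodput_series([(0.2, 1500), (0.8, 1500), (1.1, 1500)], duration=2.0)
# -> [24000.0, 12000.0, 0.0]
```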

> Section 2.6: Trade-off latency vs. goodput:
> 
> The section refers to two (x,y) plots of the form:
>     X=delay(parms)
>     Y=goodput(parms)
> and
>     X=delay(parms)
>     Y=drop_ratio(parms)
> where <parms> are tuples of parameter values, each describing 
> one experimental set-up.
> 
> It remains unclear, what <parms> might be in the context of the given 
> document. The cited document [TCPEVAL2013] suggests that one dimension 
> of <parms> might be the scaling of the applied Tmix trace. The other 
> parameters in the cited documents are not applicable here. More on this 
> see below.
> 

More precision should be given on how to generate these graphs. 
I propose to add the following paragraph: 
"The end-to-end delay, goodput and drop probability should each be measured 
every second. From each of these sets of measurements, the 10th and 90th 
percentiles and the median value should be computed. For each scenario, an 
ellipse can be generated from the percentile measurements and a point can be 
plotted for the median value."
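
To make the construction concrete, here is one possible reading of the
proposal as a sketch: the median of each metric gives the point, and the
10th/90th percentile spreads give the ellipse half-axes. This
interpretation and the helper are mine; a plotting library would draw the
resulting ellipse.

```python
import statistics

# Sketch of the proposed graph construction: per-second samples of two
# metrics (e.g. delay on X, goodput on Y) are reduced to a median point
# plus an ellipse spanning the 10th-90th percentile range of each axis.

def ellipse_from_samples(x_samples, y_samples):
    """Return ((x_median, y_median), (x_half_axis, y_half_axis))."""
    def p10_p90(samples):
        deciles = statistics.quantiles(samples, n=10)  # 9 cut points
        return deciles[0], deciles[8]
    x10, x90 = p10_p90(x_samples)
    y10, y90 = p10_p90(y_samples)
    center = (statistics.median(x_samples), statistics.median(y_samples))
    half_axes = ((x90 - x10) / 2, (y90 - y10) / 2)
    return center, half_axes
```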

> Section 4.1: TCP-friendly sender:
> 
> Requires the plots according to section 2.6. But at the same time it 
> specifies one single long-lived, non application limited flow - this is 
> one single dot in each of the plots.
> 

See above

> Section 4.2: Aggressive Transport Sender: same problem
> 

See above

> Section 4.3: Unresponsive Transport: same problem;

See above

> moreover:
> 
> Scenario is only applicable to scheduling, not to AQM. The described 
> traffic simply overloads the link with no response to AQM (that is, to 
> my understanding the meaning of unresponsive traffic). A "long-lived 
> non application limited UDP flow" is somewhat infinite, other than its 
> counterpart TCP.
> 
> I would suggest a different test here: In a mixture of responsive and 
> unresponsive traffic do a test, to which extent the AQM scheme is still 
> able to keep the responsive fraction under control. This requires that 
> the unresponsive traffic is well below the capacity limit. The 
> rationale behind this test is that an AQM scheme might under- or 
> over-react if it drops packets but does not see the expected reduction.
> 

I propose the following text: 
"    The first scenario is the following.  In order to create a test
  environment that results in queue build up, we consider unresponsive
  flow(s) whose sending rate is greater than the bottleneck link
  capacity between routers L and R. This scenario consists of a long-
  lived non application limited UDP flow that transmits data between
  sender A and receiver B. The graphs described in Section 2.6 MUST be
  generated.

  The second scenario is the following.  In order to test to what
  extent the AQM scheme is able to keep the responsive fraction of the
  traffic under control, this scenario considers a mixture of TCP-
  friendly and unresponsive traffic.  This scenario consists of a long-
  lived non application limited UDP flow and a single long-lived, non
  application limited, TCP New Reno flow that transmit data between
  sender A and receiver B. As opposed to the first scenario, the rate
  of the UDP traffic should be lower than the bottleneck capacity and,
  moreover, should not exceed half of the bottleneck capacity.  For
  each type of traffic, the graphs described in Section 2.6 MUST be
  generated. "

> Section 4.4: Initial Congestion Window:
> 
> Makes sense only with a mix of short lived flows. For long-lived flows 
> the IW does not matter. Alternatively a single experiment of a pre-
> existing long-lived flow and a newly appearing IW3/IW10 flow could be 
> executed. But there is no reference to any traffic mix. The table 
> specifies just 2 flows in parallel.
> 

New paragraph for this scenario: 
"    This scenario helps evaluate how an AQM scheme adapts to a traffic
  mix consisting of TCP flows with different values for the initial
  congestion window (IW).

  For this scenario, we consider two types of flow that MUST be
  generated between sender A and receiver B:

  o  a single long-lived non application limited TCP New Reno flow;

  o  a single long-lived application limited TCP New Reno flow, with an
     IW set to 3 or 10 packets.  The size of the data transmitted MUST
     be strictly higher than 10 packets and should be lower than 100
     packets.

  The transmission of the two flows must not start simultaneously: a
  steady state must be achieved before the transmission of the
  application limited flow begins.  As a result, the transmission of
  the non application limited flow MUST start before that of the
  application limited flow.

  For each of these scenarios, the graphs described in Section 2.6 MUST
  be generated for each class of traffic.  The completion time of the
  application limited TCP flow could be measured.
"

> What are the <parms> for graphs according to section 2.6? 
> 
> Section 4.5: Traffic Mix
> 
> The section defines its own traffic mixes in a table, but requires the 
> graphs according to 2.6, which somehow implies the Tmix traffic.
> 

Now that the graphs in Section 2.6 have been detailed more clearly, and given 
that they must be generated for each class of traffic, is this better? 

> Section 6: Burst absorption
> 
> Same as above; the test comes with its own traffic mix, but requests 
> graphs according to 2.6, thus implying Tmix.
> 

Now that the graphs in Section 2.6 have been detailed more clearly, and given 
that they must be generated for each class of traffic, is this better?

> The proposed bursty scenarios seem to be not specific enough, if 
> compared with 4.5.
> 

I do not see an important difference in the level of detail between the bursty 
scenario and Section 4.5.

> I would propose here, for reproducibility, something like UDP on/off 
> background traffic.
> 

Yes - I have added more cases.  

> Section 7: Stability
> 
> This section mixes two different things: 
> (a) The impact of general drop rate by other cross traffic, which is 
> unrelated to the bottleneck link. 
> (b) Reaction to varying link capacity at the bottleneck.
> 
> The general drop rate experiment (a) is weakly specified: If the drop 
> rate is too high, the bottleneck capacity cannot be reached; AQM does 
> not matter. Or, the other way round, if the drop rate is too low, the 
> AQM algorithm dominates the drop process, whereas the background drops 
> don't really matter. Only the transition between both regimes could be 
> of interest. But how to get there, and is this of relevance in practice?
> 

I propose to add the following scenario:
"
 This scenario helps to evaluate how an AQM scheme reacts to incoming
  traffic resulting in various levels of congestion during the
  experiment.  In this scenario, the congestion level varies over a
  large time scale.  The following phases may be considered: phase
  I - mild congestion during 0-5s; phase II - medium congestion during
  5-10s; phase III - heavy congestion during 10-15s; phase I again, ...
  and so on.  A single long-lived non application limited TCP flow
  transfers data.

  For this scenario, the graphs described in Section 2.6 MUST be
  generated.  Moreover, one graph SHOULD be generated for each of the
  phases previously detailed.
"

> The varying capacity experiment (b) is really relevant. I am asking 
> myself, if there could be resonant effects in the AQM parameter 
> adaptation algorithms, and how to test for this?
> 

This is what we tried to address in the "parameters sensitivity" section. 
Proposing scenarios that capture resonant effects in AQM parameter adaptation 
algorithms is not an easy task.

Thanks a lot for your useful comments. 
Please let us know whether our modifications address your remarks. 

Kind regards, 

Nicolas Kuhn
_______________________________________________
aqm mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/aqm
