(Copied to aqm mailing list as suggested by WG chair).
Hi.

Thanks for your responses. Just a reminder... I am not (these days, anyway) an expert in router queue management, so my comments should not be seen as a deep critique of the individual items, but rather as things that come to mind as matters of general control engineering, and as areas where I feel the language needs clarification - that's what gen-art is for.

As a matter of interest, it might be useful to explain a bit what scale of routing engine you are thinking about in this paper. This is because I got the feeling from your responses to the buffer bloat question that you are primarily thinking big iron here. The buffer bloat phenomenon has tended to be in smaller boxes, where the AQM stuff may or may not be applicable. I don't quite know what your target is here - or whether you are thinking over the whole range of sizes. The responses below clearly indicate that you have some examples in mind (Codel, for example, which I know nothing about except (now) that it is an AQM WG product) and I don't know what scale of equipment these are really relevant to.

Some more responses in line.

Regards,
Elwyn

On 05/01/15 20:32, Fred Baker (fred) wrote:

On Jan 5, 2015, at 1:13 AM, go...@erg.abdn.ac.uk wrote:

Fred, I've applied the minor edits.

I have questions for you on the comments below (see GF:) before I
proceed.

Gorry

Adding Elwyn, as the discussion of his comments should include him -
he might be able to clarify his concerns. I started last night to
write a note, which I will now discard and instead comment here.

I am the assigned Gen-ART reviewer for this draft. For background
on Gen-ART, please see the FAQ at

<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call
comments you may receive.

Document: draft-ietf-aqm-recommendation-08.txt
Reviewer: Elwyn Davies
Review Date: 2014/12/19
IETF LC End Date: 2014/12/24
IESG Telechat date: (if known) -

Summary:  Almost ready for BCP.

Possibly missing issues:

Buffer bloat:  The suggestions/discussions are pretty much all
about keeping buffer size sufficiently large to avoid burst
dropping.  It seems to me that it might be good to mention the
possibility that one can over-provision queues, and that this needs
to be avoided as well as under-provisioning.

GF: I am not sure - to me this depends on the use case.

To me, this is lily-gilding. To pick one example, the Cisco ASR 8X10G
line card comes standard from the factory with 200 ms of queue per
10G interface. If we were to implement Codel on it, Codel would try
desperately to keep the average induced latency less than five ms. If
it tried to make it be 100 microseconds, we would run into the issues
the draft talks about - we're trying to maximize rate while
minimizing mean latency, and due to TCP's dynamics, we would no
longer maximize rate. If 5 ms is a reasonable number (and for
intra-continental terrestrial delays I would think it is), and we set
that variable to 10, 50, or 100 ms, the only harm would be that we
had some probability of a higher mean induced latency than was really
necessary - AQM would be a little less effective. In the worst case
(suppose we set Codel's limit to 200 ms), it would revert to tail
drop, which is what we already have.

There are two reasonable responses to this. One would be to note that
in high RTT cases, even if auto-tuning mostly works, manual tuning may
deliver better results or tune itself correctly more quickly (on a
650 ms RTT satcom link, I'd start by changing Codel's 100 ms trigger
to something in the neighborhood of 650 ms). The other is to simply
say that there is no direct harm in increasing the limits, and there
may be value in some use cases. But I would also tend to think that
anyone that actually operates a network already has a pretty good
handle on that fact. So I don't see the value in saying it - which is
mostly why it's not there already.
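
To pin down the two knobs being discussed here: CoDel's target is the acceptable standing queue delay (the 5 ms), and its interval is how long the delay must stay above target before the controller acts (the 100 ms one would raise toward 650 ms on a satcom path). A minimal sketch of that control law - illustrative Python with invented names, not the reference CoDel code:

    from math import sqrt

    class CoDelSketch:
        """Minimal CoDel-flavoured drop decision; all times in seconds."""

        def __init__(self, target=0.005, interval=0.100):
            self.target = target      # acceptable standing queue delay (5 ms)
            self.interval = interval  # ~ worst-case expected RTT (100 ms;
                                      # something near 0.650 for satcom)
            self.first_above = None   # when delay first stayed above target
            self.dropping = False
            self.count = 0            # drops in the current episode
            self.drop_next = 0.0

        def should_drop(self, sojourn, now):
            """sojourn: how long the packet being dequeued sat in the queue."""
            if sojourn < self.target:
                # Queue drained below target: reset, stop dropping.
                self.first_above = None
                self.dropping = False
                self.count = 0
                return False
            if self.dropping:
                if now >= self.drop_next:
                    # Drop again, sooner each time: interval / sqrt(count).
                    self.count += 1
                    self.drop_next = now + self.interval / sqrt(self.count)
                    return True
                return False
            if self.first_above is None:
                self.first_above = now + self.interval
            elif now >= self.first_above:
                # Above target for a whole interval: start dropping.
                self.dropping = True
                self.count = 1
                self.drop_next = now + self.interval / sqrt(self.count)
                return True
            return False

Fred's worst case falls straight out of this sketch: with target set at or above the physical buffer depth (say 200 ms), sojourn < target essentially always holds, the controller never fires, and the queue degenerates to the tail drop the buffer imposes anyway.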
My take on this would be "make as few assumptions about your audience as possible, and write them down". It's a generally interesting topic and would interest people who are not deeply skilled in the art - as well as potentially pulling in some new researchers!

Interaction between boxes using different or the same algorithms:
Buffer bloat seems to be generally about situations where chains
of boxes all have too much buffer.  One thing that is not
currently mentioned is the possibility that if different AQM
schemes are implemented in various boxes through which a flow
passes, then there could be inappropriate interaction between the
different algorithms.  The old RFC suggested RED and nothing else,
so there was only one algorithm to check to make sure multiple RED
boxes in series didn't do anything bad.  With potentially different
algorithms in series, one had better be sure that the mechanisms
don't interact in a bad way when chained together - another
research topic, I think.

GF: I think this could be added as an area for continued research
mentioned in section 4.7. At least I know of some poor
interactions between PIE and CoDel on particular paths - where both
algorithms are triggered. However, I doubt if this is worth much
discussion in this document? thoughts?

Suggest: "The Internet presents a wide variety of paths where
traffic can experience combinations of mechanisms that can
potentially interact to influence the performance of applications.
Research therefore needs to consider the interactions between
different AQM algorithms, patterns of interaction in network
traffic and other network mechanisms to ensure that multiple
mechanisms do not inadvertently interact to impact performance."

Mentioning it as a possible research area makes sense. Your proposed
text is fine, from my perspective.

Yes, I think something like this would be good. The buffer bloat example is probably an extreme case of things having no AQM at all and interacting badly. It would maybe be worth mentioning that any AQM mechanism also has to work in series with boxes that don't have any active AQM - just tail drop. Ultimately, I would say this is just a matter of control engineering principles: you are potentially building a network in which various control algorithms are implemented on different legs/nodes, and the combination of transfer functions could possibly be unstable. Has anybody applied any of the raft of control-theoretic methods to these algorithms? I have no idea!
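
To make that worry concrete, here is a deliberately simplified, linearized framing (my own sketch of how the question could be posed, not an analysis anyone in this thread has done): model each AQM hop as a controller C_i(s) sitting in the same TCP feedback loop with round-trip delay R, so the loop gain is the product

    L(s) = C_1(s) * C_2(s) * ... * C_n(s) * P_tcp(s) * e^(-s*R)

The Nyquist criterion applies to L(s) as a whole, not to each C_i(s) separately, so n individually stable AQM controllers in series can still yield an unstable combination, particularly with a large delay term e^(-s*R).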

I start by questioning the underlying assumption, though, which is
that bufferbloat is about paths in which there are multiple
simultaneous bottlenecks. Yes, that occurs (think about paths that
include both Cogent and a busy BRAS or CMTS, or more generally, if
any link has some probability of congesting, a sophomore
statistics course would maintain that any pair of links has the product
of the two probabilities of being simultaneously congested), but I'd
be hard-pressed to make a statistically compelling argument out of
it. The research and practice I have seen has been about a single
bottleneck.
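
Spelling out that statistics aside: if two links congest independently with probabilities p_1 and p_2, then

    P(both congested at once) = p_1 * p_2

so two links that are each congested 10% of the time are simultaneously congested only about 1% of the time, which is consistent with the observation that bufferbloat in practice is usually about a single bottleneck.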
Please don't fixate on buffer bloat!

Minor issues: s3, para after end of bullet 3:
The projected increase in the fraction of total Internet
traffic for more aggressive flows in classes 2 and 3 could pose
a threat to the performance of the future Internet.  There is
therefore an urgent need for measurements of current conditions
and for further research into the ways of managing such flows.
This raises many difficult issues in finding methods with an
acceptable overhead cost that can identify and isolate
unresponsive flows or flows that are less responsive than TCP.

Question: Is there actually any published research into how one
would identify class 2 or class 3 traffic in a router/middle box?
If so, it would be worth noting - the text's call for "further
research" seems to indicate there is something out there.

GF: I think the text is OK.

Agreed. Elwyn's objection appears to be to the use of the word
"further"; if we don't know of a paper, he'd like us to call for
"research". The papers that come quickly to my mind are various
papers on non-responsive flows, such as
http://www.icir.org/floyd/papers/collapse.may99.pdf or
http://www2.research.att.com/~jiawang/sstp08-camera/SSTP08_Pan.pdf.
We already have a pretty extensive bibliography...

Right: either remove/alter "further" if there isn't anything already out there, or put in some reference(s).

s4.2, next to last para: Is it worth saying also that the
randomness should avoid targeting a single flow within a
reasonable period, to give a degree of fairness?

Network devices SHOULD use an AQM algorithm to determine the packets
that are marked or discarded due to congestion.  Procedures for
dropping or marking packets within the network need to avoid
increasing synchronization events, and hence randomness SHOULD be
introduced in the algorithms that generate these congestion signals
to the endpoints.

GF: Thoughts?

I worry. The reasons for the randomness are (1) to tend to hit
different sessions, and (2) when the same session is hit, to minimize
the probability of multiple hits in the same RTT. It might be worth
saying as much. However, to *stipulate* that algorithms should limit
the hit rate on a given flow invites a discussion of stateful
inspection algorithms. If someone wants to do such a thing, I'm not
going to try to stop them (you could describe fq_* in those terms),
but I don't want to put the idea into their heads (see later comment
on privacy). Also, that is frankly more of a concern with Reno than
with NewReno, and with NewReno than with anything that uses SACK.
SACK will (usually) retransmit all dropped segments in the subsequent
RTT, while NewReno will retransmit the Nth dropped packet in the Nth
following RTT, and Reno might take that many RTO timeouts.
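
To make the two reasons concrete, here is the classic RED-style way of introducing that randomness (a sketch with invented names, not text from the draft): each packet faces an independent coin flip whose bias grows with the averaged queue, so the congestion signals land on different sessions, and repeated hits on one session within a single RTT stay unlikely.

    import random

    def signal_probability(avg_q, min_th, max_th, max_p=0.10):
        """RED-style ramp: zero below min_th, linear up to max_p at max_th."""
        if avg_q < min_th:
            return 0.0
        if avg_q >= max_th:
            return 1.0            # queue persistently long: always signal
        return max_p * (avg_q - min_th) / (max_th - min_th)

    def congestion_signal(avg_q, min_th=5.0, max_th=15.0):
        # Queue measured in packets here, purely for illustration.
        # The independent per-packet coin flip is what desynchronizes
        # the TCP sessions sharing the queue - no per-flow state needed.
        return random.random() < signal_probability(avg_q, min_th, max_th)

Note the "no per-flow state" comment: that is exactly why randomness gives rough fairness without inviting the stateful inspection Fred is wary of.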

You have thought about what I said.  Put in what you think it needs.

s4.2.1, next to last para:
An AQM algorithm that supports ECN needs to define the
threshold and algorithm for ECN-marking.  This threshold MAY
differ from that used for dropping packets that are not marked
as ECN-capable, and SHOULD be configurable.

Is this suggestion really compatible with recommendation 3 and
s4.3 (no tuning)?

GF: I think making a recommendation here is beyond the "BCP"
experience, although I suspect that a lower marking threshold is
generally good. Should we add it also to the research agenda as an
item at the end of para 3 in S4.7.?

I think you may have misunderstood what I am saying here. Rec 3 and s4.3 say things should work without tuning. Doesn't having to set these thresholds/algorithms constitute tuning? If so, then it is difficult to see these ECN schemes as meeting the constraints. If you disagree, then explain how it isn't tuning - or suggest that there should be research to see how to make ECN zero-config as well.

I can see adding it to the research agenda; the comment comes from
Bob Briscoe's research.

That said, any algorithm using any mechanism by definition needs to
specify any variables it uses - Codel, for example, tries to keep a
queue at 5 ms or less, and cuts in after a queue fails to empty for a
period of 100 ms. I don't see a good argument for saying "but an
ECN-based algorithm doesn't need to define its thresholds or
algorithms". Also, as I recall, the MAY in the text came from the
fact that Bob seemed to think there was value in it (which BTW I
agree with). To my mind, SHOULD and MUST are strong words, but absent
such an assertion, an implementation MAY do just about anything that
comes to the implementor's mind. So saying an implementation MAY <do
something> is mostly a suggestion that an implementor SHOULD think
about it. Are we to say that an implementor, given Bob's research,
should NOT think about giving folks the option?

I also don't think Elwyn's argument quite follows. When I say that an
algorithm should auto-tune, I'm not saying that it should not have
knobs; I'm saying that the default values of those knobs should be
adequate for the vast majority of use cases. I'm also not saying that
there should be exactly one initial default; I could easily imagine
an implementation noting the bit rate of an interface and the ping
RTT to a peer and pulling its initial configuration out of a table.
That would be at least partially acceptable as a mode of operation. But you might have a "warm-up" issue - would it work OK while the algorithm was working out what the RTT actually was? And would the algorithms adapt autonomously (i.e., auto-tune) to close in on optimum values after picking initial values from the table?
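
A sketch of what those two ideas might look like together - the separate, configurable mark/drop thresholds of s4.2.1, with defaults derived from the link rate and a measured RTT rather than hand-set (all names and numbers are hypothetical, offered only to show the shape of such auto-tuning, including the warm-up fallback Elwyn asks about):

    def initial_config(link_rate_bps, measured_rtt=None):
        """Derive starting parameters from observable link properties.

        measured_rtt may be unavailable at boot (the "warm-up" case);
        fall back to a conservative default and re-derive once samples
        arrive, so the algorithm closes in on better values over time.
        """
        interval = measured_rtt if measured_rtt is not None else 0.100
        one_mtu = 1500 * 8 / link_rate_bps        # 1 MTU serialization time
        drop_threshold = max(interval / 20, one_mtu)  # ~5 ms at 100 ms RTT
        ecn_mark_threshold = drop_threshold / 2       # mark earlier than drop
        return interval, drop_threshold, ecn_mark_threshold

    def per_packet_action(queue_delay, packet_is_ect,
                          drop_threshold, ecn_mark_threshold):
        """ECT packets are marked at a lower threshold than the one at
        which non-ECT packets are dropped (the MAY in the quoted text)."""
        if packet_is_ect and queue_delay > ecn_mark_threshold:
            return "mark"
        if queue_delay > drop_threshold:
            return "drop"
        return "forward"

Whether deriving defaults this way still counts as "tuning" in the sense of recommendation 3 is, of course, exactly the question on the table.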

s7:  There is an arguable privacy concern that if schemes are
able to identify class 2 or class 3 flows, then a core device can
extract privacy related info from the identified flows.

GF: I don't see how traffic profiles expose privacy concerns; sure,
users and apps can be characterised by patterns of interaction -
but this isn't what is being talked about here.

Agreed. If the reference is to RFC 6973, I don't see a violation of
https://tools.ietf.org/html/rfc6973#section-7. I would if we appeared
to be inviting stateful inspection algorithms. To give an example of
how difficult sessions are managed, RFC 6057 uses the CTS message in
round-robin fashion to push back on top-talker users in order to
enable the service provider to give consistent service to all of his
subscribers when a few are behaving in a manner that might prevent
him from doing so. Note that the "session", in that case, is not a
single TCP session, but a bittorrent-or-whatever server engaged in
sessions to tens or hundreds of peers. The fact that a few users
receive some pushback doesn't reveal the identities of those users.
I'd need to hear the substance behind Elwyn's concern before I could
write anything.

My reaction was that if your algorithm identifies flows then you have
potentially helped a bad actor to pick off such flows, or to get to know who is communicating in a situation where currently it would be very difficult to know, since the queueing is basically flow agnostic. OK, this is fairly far out, but we have seen some pretty serious stuff apparently being done around core routers, according to Snowden et al.

s4.7, para 3:
the use of Map/Reduce applications in data centers
I think this needs a reference or a brief explanation.

GF: Fred, do you know a reference or can you suggest extra text?

The concern has to do with incast, which is a pretty active research
area (http://lmgtfy.com/?q=research+incast). The paragraph asks a
question, which is whether the common taxonomy of network flows (mice
vs elephants) needs to be extended to include references to herds of
mice traveling together, with the result that congestion control
algorithms designed under the assumption that a heavy data flow
contains an elephant merely introduce head-of-line blocking in short
flows. The word "lemmings" is mine.

I know of at least four papers (Microsoft Research, CAIA, Tsinghua,
and KAIST) submitted to various journals in 2014 on the topic. It's
also, at least in part, the basis for the DCLC RG. The only ones we
could reference, among those, would relate to DCTCP, as the rest have
not yet been published.

Again, I'd like to understand the underlying issue. I doubt that it
is that Elwyn doesn't like the question as such. Is it that he's
looking for the word “incast” to replace "map/reduce"?

I was just looking for somebody to define the jargon - as far as I am concerned, at this moment "incast" would be just as "bad", since it would produce an equally blank stare followed by a grab for Google.

--- The edits below have been incorporated in the XML for v-09 ---

Nits/editorial comments:

General: s/e.g./e.g.,/, s/i.e./i.e.,/

s1.2, para 2(?) - top of p4: s/and often necessary/and is often necessary/

s1.2, para 3: s/a > class of technologies that/a class of technologies that/

s2, first bullet 3: s/Large burst of packets/Large bursts of
packets/

s2, last para: Probably need to expand POP, IMAP and RDP; maybe
provide refs??

s2.1, last para: s/open a large numbers of short TCP flows/may
open a large number of short duration TCP flows/

s4, last para: s/experience occasional issues that need
moderation./can experience occasional issues that warrant
mitigation./

s4.2, para 6, last sentence: s/similarly react/react similarly/

s4.2.1, para 1: s/using AQM to decider when/using AQM to decide
when/

s4.7, para 3:
In 2013,
"At the time of writing" ?

s4.7, para 3:
the use of Map/Reduce applications in data centers
I think this needs a reference or a brief explanation.


_______________________________________________
aqm mailing list
aqm@ietf.org
https://www.ietf.org/mailman/listinfo/aqm
