Re: [aqm] last call results on draft-ietf-aqm-recommendation
I agree the focus has been on updating the recommendations section. I quite understand that you have other things to do, so I'll try and summarise comments on the earlier sections and see if you agree. When we have a list we (with AQM chairs) can then decide what to do (work out how long it will take to complete the job). If other people *DO* have comments against specific parts of the current text, please do tell the list. Gorry Gorry, I'm concerned that you've asked me to supply text to /add/ to the early sections, when they actually need a lot /subtracted/. They need to be written knowing where they want to go to, and just say enough to get there. Effectively a re-write. A lot of people are confused about what AQM can and should do, so this doc should have an important role in deconfusing. However, I don't have the b/w to volunteer to do this re-write. Bob At 17:27 15/05/2014, go...@erg.abdn.ac.uk wrote: Gorry, And just on this... a) Congestion collapse: An AQM cannot prevent congestion collapse - that is the job of congestion control and, failing that, of policing. Even isolation (e.g. flow separation) doesn't prevent congestion collapse, because collapse is caused by the load from new flow arrivals exceeding the ability of the system to serve and clear load from existing flows, most likely because many existing flows are not sufficiently responsive to congestion, so retransmissions dominate over goodput (even if each unresponsive flow is in an isolated silo). Flow separation doesn't help when the problem is too many flows. That would seem OK to call-out, at least to me. My concern is that it's wrong to introduce a doc with a description of a problem that we're not addressing in the body of the doc (even tho collapse is an important problem, AQM doesn't address it, so why is it even relevant at all?). E.g. we could also add world hunger to the introduction, but it wouldn't be relevant. 
So, we will find some way on this topic - the current editors did not start this... the document we update uses this language and I think in the update it needs to be confined to the early sections, possibly with your text on why there is not mention elsewhere. We look forward to the detailed comments. Gorry Bob Briscoe, BT ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
Re: [aqm] last call results on draft-ietf-aqm-recommendation
I strongly agree with Bob's concerns:
- Congestion collapse cannot be prevented by AQM
- Flow fairness is not a topic for AQM
- AQM should be possible without any knowledge of particular flows. As such it is clearly an L2 mechanism (which does not mean that it cannot be applied in L3 boxes).

Of course, in technical implementations an AQM could be combined with fairness mechanisms, as the fq_X proposals do. (I assume that in such a combination AQM and scheduling essentially operate on the same queue or group of queues.) But for a clear understanding of what a particular AQM does, I would prefer to see the AQM algorithm in isolation.

AQM should be applied to every buffer that can be overloaded and where TCP is involved, i.e. where the ingress rate can be higher than the egress rate and no backpressure is in place. In middleboxes with ingress == egress rates this is not the case, unless the processing capacity is insufficient.

In my opinion AQM applies to multiplexed links, where several flows share the same transmission capacity. AQM cannot do a lot for a single flow on one link. In the capacity sharing case the effects of synchronization and burstiness of drops are strongly tied together. Removing the one also removes the other, and this is the most that AQM can achieve.

Wolfram

-----Original Message-----
From: aqm [mailto:aqm-boun...@ietf.org] On Behalf Of Bob Briscoe
Sent: Thursday, 15 May 2014 16:32
To: Wesley Eddy; Fred Baker; Gorry Fairhurst
Cc: aqm@ietf.org
Subject: Re: [aqm] last call results on draft-ietf-aqm-recommendation

Wes, Thx. In case I don't get time to read, then type, I'll shoot my mouth off anyway... Sorry this is a bit rushed and dismissive. That's not my intention - I'm v supportive of the recommendations that have now been carefully and nicely worded. I will give more detailed comments, but these are the MSBs. 1) My main concern: The two halves of the document seemed nearly unrelated (at least in draft-03 and it looks like draft-04 hasn't changed this).
The first half (Sections 1,2,3) framed the problem as primarily about preventing congestion collapse and preventing flow-unfairness, while the recommendations (section 4) were about AQM. The irony of this sentence is deliberate. I had few concerns about the recommendations text (section 4), which we've all been focusing on, including me. But I hadn't realised the introductory text was so out of kilter with the recommendations. Sections 1,2 and 3 seemed to focus on problems that I wouldn't even address with AQM (from a quick scan it looks like these sections haven't changed in this respect for draft-04): a) Congestion collapse: An AQM cannot prevent congestion collapse - that is the job of congestion control and, failing that, of policing. Even isolation (e.g. flow separation) doesn't prevent congestion collapse, because collapse is caused by the load from new flow arrivals exceeding the ability of the system to serve and clear load from existing flows, most likely because many existing flows are not sufficiently responsive to congestion, so retransmissions dominate over goodput (even if each unresponsive flow is in an isolated silo). Flow separation doesn't help when the problem is too many flows. b) Flow fairness (or user-fairness etc): this is a policy issue that needs to be built in a modular way, for optional addition to AQM. Therefore an AQM must also work well without fairness mechanisms. This conclusion was actually reached in the early sections, but it's not carried forward into the recommendations in section 4. If the conclusion is that AQM isn't intended to solve these two problems, we need to clearly say so. Most people who need to read this will be confused, so we shouldn't confuse them further! 2) There's no statement of scope. Can we really make all these recommendations irrespective of whether we're talking about high stat-mux core links, low stat-mux access links, low- stat-mux data centre links, or host buffers? 
Are there different recommendations for edge links (on trust boundaries) vs interior links? Does AQM apply at L2 as well as L3 (of course it does)? Which recommendations are different for each layer? Does AQM apply for middleboxes (firewalls, NATs etc) as much as for switches and routers? If not why not (only need AQM if there can be queuing - perhaps due to processor overload)? To illustrate the problem, our goal should be AQM in every buffer. But we really don't need and shouldn't have policing or isolation in every buffer. 4) Because sections 1,2,3 focused heavily on the above two problems (collapse and fairness) that can't really be addressed by AQM, these sections also gave insufficient attention to problems that AQM does address (and should address), E.g.: * synchronisation and lock-out were both described as vaguely the same problem, * synchronisation wasn't explained, * lock-out wasn't explained
Re: [aqm] last call results on draft-ietf-aqm-recommendation
See comments in-line,
Gorry

First, is this based on the discussion or on the revised draft text in -04, where the words were chosen to try to reflect this - but carefully avoid saying you cannot integrate AQM and scheduling?

I strongly agree with Bob's concerns:
- Congestion collapse cannot be prevented by AQM
- Flow fairness is not a topic for AQM
- AQM should be possible without any knowledge of particular flows. As such it is clearly an L2 mechanism (which does not mean that it cannot be applied in L3 boxes).

Of course, in technical implementations an AQM could be combined with fairness mechanisms, as the fq_X proposals do. (I assume that in such a combination AQM and scheduling essentially operate on the same queue or group of queues.) But for a clear understanding of what a particular AQM does, I would prefer to see the AQM algorithm in isolation.

AQM should be applied to every buffer that can be overloaded and where TCP is involved, i.e. where the ingress rate can be higher than the egress rate and no backpressure is in place. In middleboxes with ingress == egress rates this is not the case, unless the processing capacity is insufficient.

There can also be questions with current algorithms about parameter settings if you implement this away from the edge, where you don't know the flow path RTT and other params - but that's something to work on.

In my opinion AQM applies to multiplexed links, where several flows share the same transmission capacity. AQM cannot do a lot for a single flow on one link. In the capacity sharing case the effects of synchronization and burstiness of drops are strongly tied together. Removing the one also removes the other, and this is the most that AQM can achieve.

OK

Wolfram

-----Original Message-----
From: aqm [mailto:aqm-boun...@ietf.org] On Behalf Of Bob Briscoe
Sent: Thursday, 15 May 2014 16:32
To: Wesley Eddy; Fred Baker; Gorry Fairhurst
Cc: aqm@ietf.org
Subject: Re: [aqm] last call results on draft-ietf-aqm-recommendation

Wes, Thx.
In case I don't get time to read, then type, I'll shoot my mouth off anyway... Sorry this is a bit rushed and dismissive. That's not my intention - I'm v supportive of the recommendations that have now been carefully and nicely worded. I will give more detailed comments, but these are the MSBs. 1) My main concern: The two halves of the document seemed nearly unrelated (at least in draft-03 and it looks like draft-04 hasn't changed this). The first half (Sections 1,2,3) framed the problem as primarily about preventing congestion collapse and preventing flow-unfairness, while the recommendations (section 4) were about AQM. The irony of this sentence is deliberate. I had few concerns about the recommendations text (section 4), which we've all been focusing on, including me. But I hadn't realised the introductory text was so out of kilter with the recommendations. Sections 1,2 and 3 seemed to focus on problems that I wouldn't even address with AQM (from a quick scan it looks like these sections haven't changed in this respect for draft-04): a) Congestion collapse: An AQM cannot prevent congestion collapse - that is the job of congestion control and, failing that, of policing. Even isolation (e.g. flow separation) doesn't prevent congestion collapse, because collapse is caused by the load from new flow arrivals exceeding the ability of the system to serve and clear load from existing flows, most likely because many existing flows are not sufficiently responsive to congestion, so retransmissions dominate over goodput (even if each unresponsive flow is in an isolated silo). Flow separation doesn't help when the problem is too many flows. b) Flow fairness (or user-fairness etc): this is a policy issue that needs to be built in a modular way, for optional addition to AQM. Therefore an AQM must also work well without fairness mechanisms. This conclusion was actually reached in the early sections, but it's not carried forward into the recommendations in section 4. 
If the conclusion is that AQM isn't intended to solve these two problems, we need to clearly say so. Most people who need to read this will be confused, so we shouldn't confuse them further! 2) There's no statement of scope. Can we really make all these recommendations irrespective of whether we're talking about high stat-mux core links, low stat-mux access links, low- stat-mux data centre links, or host buffers? Are there different recommendations for edge links (on trust boundaries) vs interior links? Does AQM apply at L2 as well as L3 (of course it does)? Which recommendations are different for each layer? Does AQM apply for middleboxes (firewalls, NATs etc) as much as for switches and routers? If not why not (only need AQM if there can be queuing - perhaps due to processor overload)? To illustrate the problem, our goal should be AQM in every buffer. But we really don't need
Re: [aqm] last call results on draft-ietf-aqm-recommendation
On 5/15/2014 5:09 AM, Bob Briscoe wrote:
Wes, I assume you also want comments on the new version. Is there a deadline for comments?

Absolutely, yes. There's no deadline at the moment, but it would be good to get any out sooner rather than later, especially if they're likely to need more discussion or are asking for major changes.

I prepared comments on the previous version, but didn't get the time to type them up. So I want to try to remedy this with the new version (that I haven't read yet).

The diffs aren't huge, so many of your comments on the previous revision might still be valid.

--
Wes Eddy
MTI Systems
___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
Re: [aqm] last call results on draft-ietf-aqm-recommendation
Gorry, At 16:55 15/05/2014, go...@erg.abdn.ac.uk wrote: Great, I look forward to comments on the actual text. I agree the front part needs more structure and more topics called out. i started adding that in -04 and would be pleased to add a few more subsections if we get agreement. I'll wait until I see comments before looking at updating the text with Fred. Gorry Wes, Thx. In case I don't get time to read, then type, I'll shoot my mouth off anyway... Sorry this is a bit rushed and dismissive. That's not my intention - I'm v supportive of the recommendations that have now been carefully and nicely worded. I will give more detailed comments, but these are the MSBs. 1) My main concern: The two halves of the document seemed nearly unrelated (at least in draft-03 and it looks like draft-04 hasn't changed this). The first half (Sections 1,2,3) framed the problem as primarily about preventing congestion collapse and preventing flow-unfairness, while the recommendations (section 4) were about AQM. The irony of this sentence is deliberate. I had few concerns about the recommendations text (section 4), which we've all been focusing on, including me. But I hadn't realised the introductory text was so out of kilter with the recommendations. Sections 1,2 and 3 seemed to focus on problems that I wouldn't even address with AQM (from a quick scan it looks like these sections haven't changed in this respect for draft-04): a) Congestion collapse: An AQM cannot prevent congestion collapse - that is the job of congestion control and, failing that, of policing. Even isolation (e.g. flow separation) doesn't prevent congestion collapse, because collapse is caused by the load from new flow arrivals exceeding the ability of the system to serve and clear load from existing flows, most likely because many existing flows are not sufficiently responsive to congestion, so retransmissions dominate over goodput (even if each unresponsive flow is in an isolated silo). 
Flow separation doesn't help when the problem is too many flows. That would seem OK to call-out, at least to me. My concern is that it's wrong to introduce a doc with a description of a problem that we're not addressing in the body of the doc (even tho collapse is an important problem, AQM doesn't address it, so why is it even relevant at all?). E.g. we could also add world hunger to the introduction, but it wouldn't be relevant. b) Flow fairness (or user-fairness etc): this is a policy issue that needs to be built in a modular way, for optional addition to AQM. Therefore an AQM must also work well without fairness mechanisms. This conclusion was actually reached in the early sections, but it's not carried forward into the recommendations in section 4. If the conclusion is that AQM isn't intended to solve these two problems, we need to clearly say so. Most people who need to read this will be confused, so we shouldn't confuse them further! OK - as long as we get agreement from the various AQM proposals, some methods rely heavily on flow isolation to achieve their wanted behaviour. Indeed, my point about fairness is different from my point about collapse (which just isn't even relevant). Fairness isn't strictly relevant to AQM, but flow isolation is used to complement AQMs. And flow isolation (which the doc calls 'scheduling') can't be done without affecting fairness. So 2.1 concludes: In short, scheduling algorithms and queue management should be seen as complementary, not as replacements for each other. This is a conclusion that should be reflected in the recommendations and conclusions. I.e. if they are complements, they need to be separable, not integrated. Because scheduling requires policy and AQM doesn't. So operators don't want to have to face the dilemma of needing the AQM part, but not being able to have it because they don't want the policy implicit in the scheduling part. 
This is critical for fq_codel, because apparently CoDel alone is not recommended (which I would agree with). This means that we really need fq_X, where X is something that can be recommended either alone or with fq. 2) There's no statement of scope. Can we really make all these recommendations irrespective of whether we're talking about high stat-mux core links, low stat-mux access links, low-stat-mux data centre links, or host buffers? Are there different recommendations for edge links (on trust boundaries) vs interior links? Does AQM apply at L2 as well as L3 (of course it does)? Which recommendations are different for each layer? Does AQM apply for middleboxes (firewalls, NATs etc) as much as for switches and routers? If not why not (only need AQM if there can be queuing - perhaps due to processor overload)? To illustrate the problem, our goal should be AQM in every buffer. But we really don't need and shouldn't have policing or isolation in every buffer. Hmmm... that will be interesting
Re: [aqm] last call results on draft-ietf-aqm-recommendation
Gorry, And just on this... a) Congestion collapse: An AQM cannot prevent congestion collapse - that is the job of congestion control and, failing that, of policing. Even isolation (e.g. flow separation) doesn't prevent congestion collapse, because collapse is caused by the load from new flow arrivals exceeding the ability of the system to serve and clear load from existing flows, most likely because many existing flows are not sufficiently responsive to congestion, so retransmissions dominate over goodput (even if each unresponsive flow is in an isolated silo). Flow separation doesn't help when the problem is too many flows. That would seem OK to call-out, at least to me. My concern is that it's wrong to introduce a doc with a description of a problem that we're not addressing in the body of the doc (even tho collapse is an important problem, AQM doesn't address it, so why is it even relevant at all?). E.g. we could also add world hunger to the introduction, but it wouldn't be relevant. So, we will find some way on this topic - the current editors did not start this... the document we update uses this language and I think in the update it needs to be confined to the early sections, possibly with your text on why there is not mention elsewhere. We look forward to the detailed comments. Gorry ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
Re: [aqm] last call results on draft-ietf-aqm-recommendation
I agree with the complement language. I don't mind if they are separable. Integration, however, is highly advantageous. I started another thread on the backlog issue.

Because scheduling requires policy and AQM doesn't.

Machine gunning down packets randomly until the flows start to behave does not require any policy, agreed. A 5 tuple fq system is not a lot of policy to impose. Certainly qos and rate scheduling systems impose a lot more policy.

Actually, I'm going to retract part of what I just said. Everything is a policy.

Drop tail is a policy, it's useful for e2e mechanisms like ledbat if the queue size is greater than 100ms. Not helpful for bufferbloat.
Drop head is a policy, it's useful for voip (actually useful for tcp too). Not helpful for ledbat.
Shooting randomly and increasingly until flows get under control is a policy, a decent compromise between drop head and drop tail, that also shoots at a lot of packets it doesn't need to.
drr is a policy that does better mixing and does byte fairness.
sfq is a policy that does better mixing of packet fairness.
qfq does weighted fq.
red/ared/wred is a policy.
hfsc is a policy that does interesting scheduling and drops things all its own.
htb based policies are often complex and interesting.

So the problem is in defining what policies are needed and what algorithms can be used to implement that policy. May the ones that provide the best QoE for the end user succeed in the marketplace, and networks get ever better.

https://www0.comp.nus.edu/~bleong/publications/pam14-ispcheck.pdf

So operators don't want to have to face the dilemma of needing the AQM part, but not being able to have it because they don't want the policy implicit in the scheduling part.

A dilemma of choosing which single line of code to incorporate in an otherwise far more complex system? I certainly do wish it was entirely parameterless, and perhaps a future version could be more so than this is today.
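To make the "drop tail is a policy, drop head is a policy" point concrete, here is a minimal sketch in POSIX shell. It is a toy model only, not how any qdisc is implemented: the queue holds packet sequence numbers, and `LIMIT`, `enqueue` and `run_policy` are hypothetical names invented for this illustration. With a full queue, drop tail discards the arriving packet and drop head discards the oldest queued one.

```shell
#!/bin/sh
# Toy model: a queue of packet sequence numbers with a fixed size.
# LIMIT is a hypothetical buffer size, chosen only for illustration.
LIMIT=3

# enqueue POLICY PKT [QUEUED...]
# Prints the queue (oldest first) after PKT arrives.
enqueue() {
    policy=$1 pkt=$2
    shift 2                      # "$@" is now the current queue
    if [ "$#" -lt "$LIMIT" ]; then
        set -- "$@" "$pkt"       # room left: accept the packet
    elif [ "$policy" = head ]; then
        shift                    # drop head: discard the oldest packet
        set -- "$@" "$pkt"
    fi                           # drop tail: the arriving packet is discarded
    echo "$*"
}

# run_policy POLICY
# Offers packets 1..5 to an empty queue that is never drained.
run_policy() {
    q=""
    for pkt in 1 2 3 4 5; do
        q=$(enqueue "$1" "$pkt" $q)   # $q is unquoted on purpose (word splitting)
    done
    echo "$q"
}

run_policy tail   # prints: 1 2 3
run_policy head   # prints: 3 4 5
```

Drop tail keeps the oldest packets and drop head keeps the freshest ones, which is one way to read why the latter suits voip: stale samples are what get thrown away.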
I can write up the complexity required to do, for example, qfq + pie, but it would be a great deal longer than the below, and qfq + RED, or red alone, is much longer than either. Scripting is needed to configure those...

# to do both AQM + DRR at the same time, with reasonable defaults for 4mbit-10gbit
tc qdisc add dev your_device root fq_codel

# AQM only
# ecn not presently recommended
tc qdisc add dev your_device root codel
# or (functional equivalent)
tc qdisc add dev your_device root fq_codel flows 1 noecn
# (you could also replace the default tc filter, to get, like,
# a 4 queued system on dscp...)

# DRR + SQF-like behavior with minimal AQM, probably mostly reverting
# to drop head from largest queue (with the largest delay I consider
# even slightly reasonable)
tc qdisc add dev your_device root fq_codel target 250ms interval 2500ms

# if your desire is to completely rip out the codel portion of fq_codel that's
# doable. I know a fq_pie exists, too.

# reasonable default for satellite systems (might need to be closer to 120ms,
# and given the speed of most satellites, quantum 300 makes sense as well as
# a reduced mtu and IW)
tc qdisc add dev your_device root fq_codel target 60ms interval 1200ms

# useful option for lower bandwidth systems is quantum 300

# Data center only use can run at reduced target and interval
tc qdisc add dev your_device root fq_codel target 500us interval 10ms

# above 10Gbit, increasing the packet limit is good, probably a good idea to increase flows
# a current problematic interaction with htb below 2.5mb leads to a need for a larger target
# (it would be better to fix htb or to write a better rate limiter)

It's about a page of directions to handle every use case. I'd LOVE to have similar guideline and cookbook page(s) for EVERY well known aqm and packet scheduling system - notably red and ared. I lack data on pie's scalability presently, too.
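As an illustration of how short that cookbook is, the recipes above could be wrapped in a small helper. This is only a sketch under stated assumptions: the function name `fq_codel_cmd` and the profile names are hypothetical, `eth0` is a placeholder device, and the target and interval values are copied from the commands above rather than independently tuned. The helper prints the tc command instead of running it, so it can be inspected without root.

```shell
#!/bin/sh
# fq_codel_cmd DEV PROFILE
# Prints (does not run) a tc command for the given profile.
# Profile names are hypothetical; option values come from the recipes above.
fq_codel_cmd() {
    dev=$1 profile=$2
    case $profile in
        default)    opts="" ;;                               # AQM + DRR, defaults
        aqm-only)   opts=" flows 1 noecn" ;;                 # single queue, no ecn
        satellite)  opts=" target 60ms interval 1200ms" ;;   # long-RTT links
        datacenter) opts=" target 500us interval 10ms" ;;    # short-RTT links
        *)
            echo "unknown profile: $profile" >&2
            return 1
            ;;
    esac
    echo "tc qdisc add dev $dev root fq_codel$opts"
}

# Example: show the command for a data center interface named eth0.
fq_codel_cmd eth0 datacenter
```

Printing rather than executing keeps the sketch safe to try; piping the output to `sh` as root would apply it, at which point the usual caveats about choosing values for a real link apply.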
Most rate shaping code layered on top of this sort of stuff, and most shaping/qos related code too, is orders of magnitude more complex than this. Take htb's compensator for ATM and/or PPPoE framing. Please. Or the hideous QoS schemes people have designed using DPI. As things stand fq_codel is a simpler/faster/better drop-in replacement for tons of code that shaped and used RED, or shaped and did sfq.

Sensing the line rate, choosing an appropriate packet limit based on available memory, and auto-choosing the number of flows are things the C code could be smarter about. They are something I currently do in a shell script (that also tries to figure out atm framing and a 3 tier qos system). I think that adding a rate limiter directly to a fq_codel or wfq + codel derived algo is a great idea and would be better than htb or hfsc + X. Been meaning to polish up the code...

This is critical for fq_codel, because apparently CoDel alone is not recommended (which I would agree with).

The present version of that is useful (without ecn) in many scenarios. It has been used in combination with hfsc, htb, and standalone. We've long
[aqm] last call results on draft-ietf-aqm-recommendation
Hello AQMers, the WG last call on the 2309bis / AQM recommendation draft has turned up a couple of reviews that said the document isn't quite ready. I think some of the comments could be resolved relatively easily with an update, though others might take some discussion to converge on what really is needed to say or not say in this document. There haven't been any responses to these yet that I've seen, nor a solid set of positive comments. So, we're not going to advance this quite yet. We do want to make progress between now and the next meeting. We thought that perhaps a webex / teleconference side meeting would give people a chance to talk about next steps and help to advance this. If you're interested in participating in this, please respond to this poll on some possible times: http://doodle.com/ng3y444te6mbendx Assuming there's a critical mass of responses, we'll try to pick a time that's least inconvenient, though guaranteed to be terrible for some. Thanks for your feedback on this, and help towards finishing the document. -- Wes Eddy MTI Systems ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm