Re: [aqm] [gautamramk/FQ-PIE-for-Linux-Kernel] max_prob & ecn (#2)
Somehow our naive attempt at putting ecn into pie became part of the standard. This project is making that more configurable. I'd like it if more pie folk took a look at it. https://github.com/gautamramk/FQ-PIE-for-Linux-Kernel/issues/2 Gautam Ramakrishnan writes: > I have added this feature in the latest commit. > > — > You are receiving this because you authored the thread. > Reply to this email directly, view it on GitHub, or mute the thread. ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
Re: [aqm] [Bloat] [Cake] paper: per flow fairness in a data center network
Luca Muscariello writes: > I disagree on the claims that DC switches do not implement anything. > They do, from quite some time now. > > https://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-738488.html I'm really impressed. I'd have probably heard about it if they'd mentioned bufferbloat once :/. The graphs comparing their performance to arista's are far, far, far too small to read. You can certainly see a huge improvement on mice in this paper. is there a better copy of this paper around? What's the cheapest form of this switch I can buy? (or beg, borrow, or steal?) I do need a 10GigE-40GigE capable switch in the lab, and BOY oh boy oh boy would I love to test this one. Has this tech made it into their routing products? > > On Thu, Dec 6, 2018 at 4:19 AM Dave Taht wrote: > > While I strongly agree with their premise: > > "Multi-tenant DCNs cannot rely on specialized protocols and > mechanisms > that assume single ownership and end-system compliance. It is > necessary rather to implement general, well-understood mechanisms > provided as a network service that require as few assumptions > about DC > workload as possible." > > ... And there's a solid set of links to current work, and a very > interesting comparison to pfabric, their DCTCP emulation is too > flawed > to be convincing, and we really should get around to making the > ns2 > fq_codel emulation fully match reality. This is also a scenario > where > I'd like to see cake tried, to demonstrate the effectiveness (or > not!) > of 8 way set associative queuing, cobalt, per host/per flow fq, > etc, > vs some of the workloads they outline. 
> > https://perso.telecom-paristech.fr/drossi/paper/rossi18hpsr.pdf > > -- > > Dave Täht > CTO, TekLibre, LLC > http://www.teklibre.com > Tel: 1-831-205-9740 > ___ > Cake mailing list > c...@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/cake > > > > ___ > Bloat mailing list > bl...@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] paper: per flow fairness in a data center network
While I strongly agree with their premise: "Multi-tenant DCNs cannot rely on specialized protocols and mechanisms that assume single ownership and end-system compliance. It is necessary rather to implement general, well-understood mechanisms provided as a network service that require as few assumptions about DC workload as possible." ... And there's a solid set of links to current work, and a very interesting comparison to pfabric, their DCTCP emulation is too flawed to be convincing, and we really should get around to making the ns2 fq_codel emulation fully match reality. This is also a scenario where I'd like to see cake tried, to demonstrate the effectiveness (or not!) of 8 way set associative queuing, cobalt, per host/per flow fq, etc, vs some of the workloads they outline. https://perso.telecom-paristech.fr/drossi/paper/rossi18hpsr.pdf -- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740 ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] sch_cake and sch_tbs now in linux 4.19
Of possible interest to the members of this (former) working group is that sch_cake (our all-singing, all-dancing shaper + per host fq + revised codel qdisc) is now in the Linux mainline. Of other possible interest is the new sch_tbs scheduler which allows for time-based packet releases and hardware offload support. https://kernelnewbies.org/Linux_4.19#Better_networking_experience_with_the_CAKE_queue_management_algorithm -- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740 ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] upstreamed sqm sch_cake in openwrt-18.06-rc2
hopefully the identical version of sch_cake that will also be in linux 4.19 (presently in net-next) is now in openwrt's 18.06-rc2 release. It would be good for tons more folk to beat it up thoroughly over the next several weeks before it is formally released. Come on, don't you remember back when reflashing for the cause was fun? https://downloads.openwrt.org/releases/18.06.0-rc2/targets/ For those of you not paying attention to sch_cake's development, see https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=046f6fd5daefac7f5abdafb436b30f63bc7c602b 1) It would be good to get back more results from docsis mode on cable modems (where we hope now you can use 99.99% of the uplink set rate rather than 85-95%). Also, if anyone's got docsis-3.1 with pie enabled, it would be good to know if we co-exist well with that. 2) We hope everyone digs the default per-host/per-flow fq 3) There's a zillion other features worth exercising, like diffserv. A late addition was the ability to run at speeds far greater than the <= 1gbit speeds we initially targeted for the shaper component, on suitable hw. (shaper works great at a gigabit, try it!) Establishing good cpu constraints by architecture would be good too. Etc. huge thx to kevin db, toke, jon, and everyone else for finally "making it real". -- Dave Täht CEO, TekLibre, LLC http://www.teklibre.com Tel: 1-669-226-2619 ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
Re: [aqm] CoDel: After much ado ...
Jana Iyengar <j...@google.com> writes: > ... draft-ietf-aqm-codel-08 is finally posted. This new version addresses all > IESG comments during IESG review, in addition to review comments by Patrick > Timmons and Yoav Nir. We thank everyone for their help with reviews. > > Most importantly, I want to personally thank the fq_codel authors for sending > me > Yerba Mate, Dave Taht for sending me delicious freshly-baked cookies, and Paul > McKenney for sending me a ton of organic green tea to help me move on the > document. I will say that you all managed to do something nobody has managed > so > far: you successfully shamed me into getting this work done. > > I also received bungee cords from the fq_codel authors to tie myself to my > chair > with, which I put to good use: I would like to share here evidence of my > atonement. (Cookies are not in the picture, because they were delicious. > Thanks, > Dave!) Yer welcome, and thank you VERY MUCH for completing this. I got some bungee cords for myself, too, as I have more than a few things 98% done I'd like to get off my plate. Perhaps we could include these new concepts in future standards for the RFC creation processes? However, I think at least one new hardware standard is necessary. There needs to be some sort of laptop mounting bracket for the cookies and a powerful feedback loop between interface and future IOT enabled-bungee cords. Particularly, reaching for the mouse, rather than cookies or tea, should be de-incentivised. I couldn't come up with a good way to distinguish between those forms of muscular traffic. > > - jana > > (P.S.: I now look forward to receiving thank you gifts. Oh, and I'm > caffeine-free and vegetarian, just in case.) > > > ___ > aqm mailing list > aqm@ietf.org > https://www.ietf.org/mailman/listinfo/aqm ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] I am setting up a per holiday cron job
the template: For [INSERT HOLIDAY], I'd really love to see a codel & fq_codel RFC published. -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] make-wifi-fast linuxplumbers talk summary on lwn.net
and available here: https://lwn.net/SubscriberLink/705884/1bdb9c4aa048b0d5/ After the talk I discussed applying the same debloating techniques to other chipsets with several folk. I don't remember, unfortunately, who all those folk were, nor the candidate chipsets! We are still wrestling with "good" settings to get fq_codel to scale properly, and mostly trying to move in the direction of less inherent latency on more stations. -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] "Globally, the average loss rates on policed flows are over 20%"
And while I'm catching up on my academic backlog (scholar.google.com has a ton of newer things on it about bufferbloat), this report on the effects of policing was pretty good: http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45411.pdf On Wed, Aug 3, 2016 at 3:37 PM, Dave Täht wrote: > I am especially grateful for the full documentation of how to configure > the bsd versions of this stuff, but the rest of the report was pretty > good too. > > http://caia.swin.edu.au/reports/160708A/CAIA-TR-160708A.pdf > ___ > Bloat mailing list > bl...@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] transport protocols in userspace
good discussion of a new feature for linux, proposed by facebook, that will make it much easier to write protocols in userspace, the positives, and negatives. https://lwn.net/SubscriberLink/691887/9388e53741d4c93e/ Please don't discuss on this list, I've had a bad morning already. -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] A bit of history on RFC970 and RFC896 from john nagle
-- Forwarded message -- From: John Nagle Date: Thu, Apr 14, 2016 at 7:14 PM Subject: Re: Bufferbloat and FQ and you To: Dave Taht <dave.t...@gmail.com> Cc: ro...@cisco.com On 04/14/2016 03:33 PM, Dave Taht wrote: > > https://www.rfc-editor.org/rfc/rfc7806.txt was published today. "There is extensive history in the set of algorithms collectively referred to as "fair queuing". The model was initially discussed in [RFC970], which proposed it hypothetically as a solution to the TCP Silly Window Syndrome issue in BSD 4.1." That's somewhat wrong. First, tinygram prevention (the "Nagle algorithm") is not about "silly window syndrome". Silly window syndrome occurs when the window is full, and the reader does a small read, resulting in the sender being allowed a small write, resulting in very short messages. The solution is clearly to not offer more window until there's at least one full size datagram worth of window available. Tinygram prevention is a problem when the window is empty, not full, and the writer is doing small writes. The question is how to consolidate those writes. It's not obvious how to do this without impacting interactive response. The classic solution, from X.25, was an accumulation timer with a human response time sized delay. That's a bad idea, but unfortunately the people who put in delayed ACKs didn't know that. They were trying to fix TELNET responsiveness at Berkeley, which was using a large number of dumb terminals connected to terminal servers at the time. Delayed ACKs with a fixed timer are useful in that situation, and in few others. Actually, this didn't involve 4.1BSD's networking; we at Ford Aerospace were running a heavily modified version of 3COM's UNET TCP/IP stack on various UNIX systems. That TCP/IP stack lost out because it cost about $4000 per node for the software. The tinygram stuff was in my RFC 896, and isn't really relevant to fair queuing or congestion management in routers and other middle boxes.
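The tinygram-prevention rule Nagle describes above (hold small writes while data is in flight, send full segments immediately; RFC 896) can be sketched in a few lines. This is a simplification for intuition, not the BSD or UNET code:

```python
def nagle_should_send(payload_len, mss, unacked_data_in_flight):
    """Tinygram prevention (RFC 896), simplified: a full-sized
    segment always goes out; a small write goes out only when the
    connection is idle; otherwise it is held and coalesced with
    later writes until the in-flight data is acknowledged."""
    if payload_len >= mss:
        return True   # full segment: send immediately
    if not unacked_data_in_flight:
        return True   # idle connection: no reason to delay
    return False      # small write with data in flight: coalesce
```

This is why the rule never delays bulk transfers, and also why it interacts so badly with the delayed-ACK timer mentioned above: the held small write is waiting on an ACK that the receiver is itself delaying.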
The important part of RFC970 is at the section headed "Game Theoretic Aspects of Network Congestion". This discusses the relationship between endpoints and middle boxes, and the need to create an ecosystem which does not reward bad endpoint behavior. The [NOFAIR] reference is interesting. Yes, fairness is gameable. But FIFO is so much worse, as the bufferbloat people point out. It's worth thinking about when a packet becomes useless and should be dropped. If the packet times out (this was originally what TTL was for; it was a seconds count), it can be dropped as obsolete. A router which looks above the IP level could also detect that the packet has been superseded by a later packet in the same flow, that is, there's a retransmitted copy also queued. If enough resources are available, that's the only packet dropping you have to do. As an optimization, you can also drop packets that are so far back in queues that they'll time out before they're sent. It's worth viewing that as a goal - don't drop any packets that would be useful if they were delivered. Just reorder based on notions of fairness and quality of service. This is the opposite of Random Early Drop, but that's sort of a meat-axe approach. Bufferbloat is only bad if the queuing is FIFO-dumb. It's fine to have lots of queue space if you manage it well. (I'm retired from all this. It's up to you guys now.) John Nagle -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
Re: [aqm] Last Call: (FlowQueue-Codel) to Experimental RFC
rity as I could muster... and went and tested the hell out of it, whenever I could. As for bufferbloat.net's efforts: We've published all the source code, all the (flent.org) benchmark code, made all the code available widely for anyone to try for under 50 bucks worth of hardware that can be reflashed with openwrt (well, 54 dollars on amazon, for the edgerouter X), and begged interested parties to try *every* bufferbloat-fighting technology we have. I jumped all over pie when it came out, helped polish the code, added ecn support, and got it out there so it could be tested by as many as possible, as soon as possible. I thought fq_pie was pretty neat. I'd fiddle with your latest stuff if you'd just fix dctcp's behaviors vs loss. I love the work on BQL and on fixing TCPs, like pacing in sch_fq. My principal interest is in ending bufferbloat in my lifetime via any means possible. And I did not, and do not, intend to make a career out of it. If I, personally, can just get to where a few more pieces of gear can be bought off the shelf with stuff that has the products of the AQM wg in it - hopefully including wifi, 3g, and homeplug! - I can go back to things I consider far, far, far more interesting. and... Jeebus, it's just one experimental RFC. > Otherwise, FQ_CoDel will get bad press later. Then, after riding the hype > curve of coolness it will fall over the cliff of disillusionment. We'll see. Never in my life have I seen a set of ideas so enthusiastically adopted by those that have adopted it, with so few complaints. The only things that bother me at this point are behaviors below ~2mbit sans tuning, and the ecn support, for which we have research ongoing in cake that we can easily fold back into fq_codel if we need to. > > > Bob > > > > On 22/03/16 04:41, Dave Taht wrote: >> >> I don't even know where to start bob. This part of the language has >> been in the draft for 2 years, and you are the only person to object >> that I can recall.
>> >> It's an experimental RFC. By "safe" we mean that deploying it, within >> the guidelines, won't break anything to any huge extent, and brings enormous >> benefits. "unsafe", for example, would be promoting use of dctcp while >> it still responds incorrectly to packet loss. >> >> Versus your decades-long quest for better variable rate video, we've >> had over a decade of the bufferbloat problem to deal with on all >> traffic, particularly along the edge, and even after solutions started >> to appear in mid 2012, we haven't made a real dent in what's deployed, >> except for the small select group of devs, academics, ISPs, and >> manufacturers willing to try something new. I'd like to imagine >> things are shifting to the left side of the green line here, but under >> load, most users are still experiencing latency orders of magnitude in >> excess of what can be achieved >> http://www.dslreports.com/speedtest/results/bufferbloat?up=1 >> >> I've been testing the latest generation of wifi APs of late, and the >> "best" of them, under load, in a single direction, has over 2 seconds >> of latency at the lower rates. Applying any of these algorithms to wifi is >> proving hard, and it's where the bottlenecks are shifting to at least >> in my world, where the default download speed is hovering at around >> 75mbit, and wifi starts breaking down long before that is hit. >> >> ... >> >> I tore apart that HAS experiment you cited here: >> https://lists.bufferbloat.net/pipermail/bloat/2016-February/007198.html >> - where I was, at least, happy >> to see fq_codel handle the onslaught of dctcp traffic, gracefully. (It >> makes me nervous to have such tcps loose on the internet where a >> configuration mistake might send that at the wrong people. fq_codel, >> "safe" - not, perhaps, optimal - in the face of dctcp.)
>> >> my key objections to nearly all the experiments on your side are >> non-reproducibility, no competing traffic (not even bothering to >> measure web PLT in >> that paper, for example), no competing upload traffic, and no >> inclusion of the typical things that are latency sensitive at all >> (voip, dns, tcp neg, ssl neg, etc). >> >> with competing download and upload traffic, fq_codel *dramatically* >> improves the responsiveness and utilization of the link, for all >> traffic. Above 5mbits pretty much the only thing that matters for web >> traffic is RTT, the google cite for this is around somewhere. >> >> I tend to weigh low latency for every other form of traffic... >> today... over marginal improvements in a contrived video download >> scenario someday. >> >> As for pie vs fq_codel,
Re: [aqm] Alia Atlas' No Objection on draft-ietf-aqm-fq-codel-05: (with COMMENT)
On Thu, Mar 17, 2016 at 10:13 AM, Toke Høiland-Jørgensen wrote: > "Alia Atlas" writes: > >> -- >> COMMENT: >> -- >> >> I think it would be useful to have a reference to the Linux >> implementation ("current" version and pointer). > > Hi Alia > > I've added a reference pointing to the fq_codel code in Linux git tree > to the latest updated version, available here: > https://kau.toke.dk/ietf/draft-ietf-aqm-fq-codel-06.html (or .txt). I'm not huge on calling this reference [LINUX]. [LINUXSRC]? [SRC]? I also felt compelled, after this round of cite-adding, to add a few more cites, (what will be) rfc7806, BQL, HTB, and HFSC, with a brief section explaining why they are needed also. BQL was the underappreciated breakthrough that made scaling past a gbit possible, and would (if implemented) make dsl and cable modems a lot better, at their (much slower) speeds. https://github.com/dtaht/bufferbloat-rfcs/commit/7d500133008857b7b78000abac9d592e66477ffb adding: ## Device queues must also be well controlled It is best that these AQM and FQ algorithms run as close to the hardware as possible. Scheduling such complexity at interrupt time is difficult, so a small standing queue between the algorithm and the wire is often needed at higher transmit rates. In Linux, this is accomplished via "Byte Queue Limits" {{BQL}} in the device driver ring buffer (for physical line rates), and via a software rate limiter such as {{HTB}}, {{HFSC}}, or {{CAKE}} otherwise. Other issues with concatenated queues are described in {{CODEL}}. ... There has been such an accumulation of small changes in response to this wonderful review process that I fear that going through another "last, last" call will be needed. > -Toke ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
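The BQL idea in the draft text above can be illustrated with a toy model. This is a sketch for intuition only, not the kernel's dynamic queue limits algorithm (which also grows and shrinks the byte limit adaptively from completion feedback):

```python
class ByteQueueLimit:
    """Toy BQL: cap the bytes outstanding in the driver ring buffer
    so the backlog accumulates in the qdisc layer instead, where
    fq_codel, cake, etc. can actually manage it."""
    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.inflight = 0   # bytes currently handed to the hardware
    def can_enqueue(self, nbytes):
        # when this returns False, the packet stays in the qdisc,
        # which keeps the AQM/FQ machinery in control of it
        return self.inflight + nbytes <= self.limit
    def enqueue(self, nbytes):
        self.inflight += nbytes
    def complete(self, nbytes):
        # called on tx-completion; frees budget for the qdisc to refill
        self.inflight -= nbytes
```

With the limit set to only a few packets' worth at line rate, the hardware never starves but also never hoards a deep, unmanaged FIFO.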
Re: [aqm] working group last call on CoDel drafts
In just about every benchmark we have created to date, the linux version of the codel implementation wins over dozens of attempted alternatives. We have one that is mildly better at a 10ms RTT, but not as good at 80ms, but that's it. This doesn't mean that more experimentation isn't called for (there are two radical alternatives I know of still being tested), but I would vote for putting the linux version into the codel draft. On Fri, Dec 4, 2015 at 11:16 AM, Bless, Roland (TM) wrote: > Dear all, > > we believe that the Codel specification > https://datatracker.ietf.org/doc/draft-ietf-aqm-codel/ needs at least one > major clarification. > > The following lines are present in the draft's pseudo-code, but are not > explained further anywhere in the document text, and moreover differ from > the Linux implementation [*], which the document also suggests as reference > implementation. > >// If min went above target close to when it last went >// below, assume that the drop rate that controlled the >// queue on the last cycle is a good starting point to >// control it now. ('drop_next' will be at most 'interval' >// later than the time of the last drop so 'now - drop_next' >// is a good approximation of the time from the last drop >// until now.) >count_ = (count_ > 2 && now - drop_next_ < 8*interval_)? >count_ - 2 : 1; > This line makes sure that when two dropping states are entered within a > short interval from each other, the variable count is not reset (to 1), > but is rather changed somehow. In this document, count is decreased by two, > while in the Linux version, count is set to the number of packets that were > dropped in the previous dropping state. > > Based on the email-thread that was started from these messages ... > http://www.ietf.org/mail-archive/web/aqm/current/msg00376.html > http://www.ietf.org/mail-archive/web/aqm/current/msg01250.html > http://www.ietf.org/mail-archive/web/aqm/current/msg01455.html > > ...
one can infer that: > 1) the case where count is not reset is not an exception, but rather a > common case (that we can confirm from our measurements), It is a common case. Most of the other behaviors in codel are in attempting to seek to the optimum drop rate; that bit is the one that maintains the optimal drop rate. > 2) several options for this behavior were described on the mailing list some > time ago, > > Since it is the most common case, this part of the algorithm should be > explained in the specification. > If the two versions continue to differ, both algorithms (and their > difference in behavior) should be explained, > but in order to avoid confusion for implementers/operators we believe that > specification of a single algorithm is preferable. > > Regards, > Roland and Polina > > [*] https://github.com/torvalds/linux/blob/master/include/net/codel.h#L341 > > Am 02.12.2015 um 16:45 schrieb Wesley Eddy: > > These both have the intended status designated as "Informational". Similar > to the questions asked for PIE, we/chairs need to understand if there's > consensus on: > - Are these specifications of clear and sufficient quality to publish? > - Should the status of the RFCs be "Experimental", "Proposed Standard", or > "Informational"? > > > > ___ > aqm mailing list > aqm@ietf.org > https://www.ietf.org/mailman/listinfo/aqm > ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
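The two count-resumption variants being contrasted in this thread can be sketched side by side. This is simplified from the draft pseudo-code and from the Linux include/net/codel.h of that era; the 8x vs 16x interval windows and the lastcount bookkeeping are taken from those sources, so treat it as an illustration rather than normative text:

```python
def next_count_draft(count, now, drop_next, interval):
    """Draft pseudo-code: on re-entering the dropping state soon
    after leaving it, resume from count - 2 instead of restarting."""
    if count > 2 and now - drop_next < 8 * interval:
        return count - 2
    return 1

def next_count_linux(count, lastcount, now, drop_next, interval):
    """Linux variant (simplified): resume from the number of drops
    made during the previous dropping state (count - lastcount)."""
    delta = count - lastcount
    if delta > 1 and now - drop_next < 16 * interval:
        return delta
    return 1
```

Both variants try to resume near the drop rate that last controlled the queue; they differ in what they remember about the previous dropping cycle, which is exactly the divergence Roland and Polina ask the draft to document.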
Re: [aqm] CoDel's control law that determines drop frequency
It helps to have the codel mailing list cc'd on codel discussions. Adding this message to the cc. One of these days we do have to write up - after finishing - cake's codel-like implementation. Dave Täht I just invested five years of my life to making wifi better. And, now... the FCC wants to make my work illegal for people to install. https://www.gofundme.com/savewifi On Tue, Nov 3, 2015 at 11:22 AM, Jeff Weeks <jwe...@sandvine.com> wrote: > The drop rate is affected by sojourn time, yes, but a 2x sojourn time goes > through the same incremental reduction of interval size, as does a sojourn > time of x. > > In investigating codel, I've set up various worst case scenarios, and I feel > like the algorithm could be made better by having its response time more > dependent upon how far away from the target latency it is. > > For example, consider a disabled interface, with a large (or even > conceptually infinite queue), that's attached to a fairly small shaper. > > The interface is then enabled, and immediately starts seeing 100Mbps, and > tries to shape it to 1Mbps. > > The queue will obviously build up quickly, and codel will notice this, and > enter the drop state. But it will start at count = 1. > > If the interface is receiving 64-byte udp packets, then it'll be receiving > 100,000,000/512 == 195,312 packets per second, and only transmitting 1953 > packets per second. > > The default target is 5ms, which is about 10 packets. So of those 195,312 > packets/second, we should ideally be dropping 195,312 - (1953 + 10) == > 193,349 packets/second. > > But in order to drop that many packets, 'count' needs to ramp up to the point > where the drop interval is consistently 5,172 ns. > > I believe that means 'count' has to reach some nearly impossibly high value > of (100ms/5172ns)^2 ≈ 374,000,000 > > I say nearly impossible, because it will take minutes (hours?) to get that > high (if my math is correct, it'll take over 17 seconds just to reach 7500).
> > In the meantime, the queue *isn't* being effectively managed, as packets with > extremely high latencies will be transmitted for far too long. > > Of course, as I stated earlier, simply increasing count more quickly, based > on how far away we are from the target latency effectively invalidates the > optimization which most (all?) codel implementations use (namely the newton > step integer-only sqrt approximation) as, at some point, the approximation > starts *diverging* from the appropriate value. > > One alternative which I've been investigating is the possibility of skewing > the precalculated 1/sqrt(count) value. > > If this is kept as a 32-bit all-fraction fixed-point number, then performing > the multiplication by intentionally mis-shifting will result in doubling > sqrt(count): > > eg, take the following accurate calculation of next interval: > > codel->next_interval_start_ticks = base_time + ((interval * > codel->one_over_sqrt_count) >> 32) > > And intentionally mis-shift by 1 bit: > > codel->next_interval_start_ticks = base_time + ((interval * > codel->one_over_sqrt_count) >> 33) > > Will effectively have the interval reduce twice as fast. > > Alternatively, (and similarly to how CAKE halves the count while re-entering > the drop interval), count can periodically be doubled, if the current value > is seen to not be adequately affecting traffic, and the pre-calculated > 1/sqrt(count) can then be divided by sqrt(2) (i.e., do not rely on the newton > step approximation for this modification of count).
> > Cheers, > --Jeff > > > > > /dev/jeff_weeks.x2936 > Sandvine Incorporated > > From: aqm [aqm-boun...@ietf.org] on behalf of Andrew Mcgregor > [andrewm...@google.com] > Sent: Sunday, October 25, 2015 6:44 PM > To: Dave Dolson > Cc: Kathleen Nichols; Bob Briscoe; Dave Taht; Van Jacobson; AQM IETF list > Subject: Re: [aqm] CoDel's control law that determines drop frequency > > CoDel does have the form of a controller; drop rate (not probability) is a > function of sojourn time (not queue size) and history, encoded in the state > variables. > > Now, I don't take it as proven that the particular form of the controller is > the best we could do, but making it a rate and based on sojourn time are > clear wins. Yes, you can use size as a proxy for sojourn time if your link > really has a constant bit rate, but not even ethernet is exactly CBR in > practice (and in some hardware situations, knowing the size is much more > expensive than measuring sojourn; the opposite can also apply). Yes, you can > use probability as a proxy for rate if y
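The back-of-envelope numbers in Jeff's 100 Mbit/s into 1 Mbit/s scenario above can be re-derived in a few lines (assuming codel's default 100 ms interval, 64-byte = 512-bit packets, and the idealized interval/sqrt(count) drop spacing):

```python
import math

arrival_pps   = 100_000_000 // 512      # 195_312 packets/s arriving
departure_pps = 1_000_000 // 512        # 1_953 packets/s the shaper passes
standing      = 10                      # ~5 ms target worth of packets
drops_pps     = arrival_pps - (departure_pps + standing)

interval_ns     = 100 * 1_000_000       # codel's default 100 ms interval
drop_spacing_ns = 1e9 / drops_pps       # gap needed between drops (~5.2 us)

# codel spaces successive drops interval/sqrt(count) apart, so the
# steady-state count must satisfy interval/sqrt(count) == spacing:
needed_count = (interval_ns / drop_spacing_ns) ** 2   # roughly 3.7e8

# and since the n-th drop lands near 2*interval*sqrt(n), reaching
# even count = 7500 takes on the order of:
t_7500 = 2 * 0.1 * math.sqrt(7500)      # seconds, ~17.3
```

This is why an unresponsive flood can stay badly managed for a long time under plain codel, and part of why fq_codel sidesteps the problem by isolating such a flood into its own queue.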
[aqm] Catching up on diffserv markings
I unsubscribed from the rmcat and rtcweb groups a while back after I got overloaded, and appear.in started working so well (for both ipv6 and ipv4! I use it all day long now!), to focus on finishing up the new "cake" qdisc/shaper/aqm/QoS system, among other things. http://www.bufferbloat.net/projects/codel/wiki/CakeTechnical Cake is now entering the testlab, and among other things, it has support for the diffserv markings discussed in the related, now-concluded dart wg, but in ways somewhat different from that imagined there. We have not got any good code in our testbeds yet to test videoconferencing behavior, and we could use some, although it does look like we can drive firefox with some remote control stuff with a fixed video playback now. Five questions: 1) Has anyone implemented or tested putting voice and video on two different 5-tuples in any running code out there? 2) How about diffserv markings in general? Do any browsers or webrtc capable software support what was discussed way back when? 3) Were diffserv marking changes eventually allowed on the same 5-tuple? 4) Did the ECN support that was originally in one draft or another ever make it into any running code? (yea, apple plans to turn on ecn universally in their next OS!) 5) What else did I miss in the past year I should know about? Feel free to contact me off list if these have already been discussed. I have totally lost track of the relevant drafts. Sincerely, Dave Täht I just lost five years of my life to making the edge of the internet, and wifi, better. And, now... the FCC wants to make my work illegal for ordinary people to install. https://www.gofundme.com/savewifi ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] one way to truly screw up ecn - is to mark CE on all packets
https://forums.developer.apple.com/thread/16699 Unitymedia sets “CE” on all packets. And totally messes up a vpn that adheres to bob's guidelines regarding encapsulation. Sigh. Can someone call those guys and straighten them out? -- Dave Täht Do you want faster, better, wifi? https://www.patreon.com/dtaht ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] Last call for signatures to the FCC on the wifi lockdown issue
The CeroWrt project's letter to the FCC on how to better manage the software on wifi and home routers vs some proposed regulations is now in last call for signatures. The final draft of our FCC submittal is here: https://docs.google.com/document/d/15QhugvMlIOjH7iCxFdqJFhhwT6_nmYT2j8xAscCImX0/edit?usp=sharing The principal signers (Dave Taht and Vint Cerf) are joined by many network researchers, open source developers, and dozens of developers of aftermarket firmware projects like OpenWrt. Prominent signers currently include: Jonathan Corbet, David P. Reed, Dan Geer, Jim Gettys, Phil Karn, Felix Fietkau, Corinna "Elektra" Aichele, Randell Jesup, Eric S. Raymond, Simon Kelly, Andreas Petlund, Sascha Meinrath, Joe Touch, Dave Farber, Nick Feamster, Paul Vixie, Bob Frankston, Eric Schultz, Bram Cohen, Jeff Osborn, Harald Alvestrand, and James Woodyatt. If you would like to join our call for substituting sane software engineering practices for misguided regulations, the window for adding your signature to the letter closes at 11:59AM ET, today, Friday, 2015-10-08. Sign via webform here: http://goo.gl/forms/WCF7kPcFl9 We are at approximately 170 signatures as I write. For more details on the controversy we are attempting to address, or to submit your own filing to the FCC see: https://libreplanet.org/wiki/Save_WiFi https://www.dearfcc.org/ Sincerely, Dave Täht CeroWrt Project Architect Tel: +46547001161 ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] FCC vs Wifi: The cerowrt letter to the FCC about the wifi firmware lockdown issue is nearly final
We go into a lot of bufferbloat and homenet stuff... it is my hope others involved in these efforts would be willing to add their voice to the mix, either by signing, commenting, or producing your own letters. For your comments, please see the current draft, and especially the 5 mandates at the end at: https://docs.google.com/document/d/1E1D1vWP9uA97Yj5UuBPZXuQEPHARp-AhRqUOeQB2WPk/edit?usp=sharing Final signatures are being accepted now via web form at: http://goo.gl/forms/WCF7kPcFl9 If there is another more apropos ietf mailing list for this sort of announcement/RFC, please forward. Also discussions are mostly on the make-wifi-fast mailing list on lists.bufferbloat.net or the fcc mailing list at prpl. I note that a similar letter needs to be constructed to the EU commission. Deadline for filing is Oct 8. -- Dave Täht Do you want faster, better, wifi? https://www.patreon.com/dtaht
Re: [aqm] CoDel's control law that determines drop frequency
On Wed, Sep 30, 2015 at 1:50 AM, Bob Briscoe wrote: > Andrew, > > I am also not so interested in an AQM dealing directly with unresponsive > traffic - I prefer to keep policing and AQM as separately deployable > functions, because AQM should be policy-neutral, whereas policing inherently > involves policy. > > My concern was merely that CoDel's linear increase in drop probability can > take a long time to reach where it intends to get to. I would have thought > some form of exponential increase, or at least super-linear, would have been > more responsive to changing traffic conditions. I.e., rather than have to > answer the question "how quickly should drop probability increase?", make it > increase increasingly quickly. > > Early on, Rong Pan showed that it takes CoDel ages to bring high load under > control. I think this linear increase is the reason. cake uses a better curve for codel, but we still need to do more testing in the lab. http://www.bufferbloat.net/projects/codel/wiki/CakeTechnical > > Bob > > > > On 30/09/15 01:42, Andrew McGregor wrote: > > Hmm, that's really interesting. > > Most interesting is that my understanding is that the control law was > intended to deal with aggregates of mostly TCP-like traffic, and that an > overload of unresponsive traffic wasn't much of a goal; this seems like > vaguely reasonable behaviour, I suppose, given that pathological situation. > > But I don't have a way to derive the control law from first principles at > this time (I haven't been working on that for a long time now). > > On 25 September 2015 at 06:27, Bob Briscoe wrote: >> >> Toke, >> >> Having originally whinged that no-one ever responded to my original 2013 >> posting, now it's my turn to be embarrassed for having missed your >> interesting response for over 3 months. >> >> Cool that the analysis proves correct in practice - always nice. 
>> >> The question is still open whether this was the intention, and if so why >> this particular control law was intended. >> I would rather we started from a statement of what the control law ought >> to do, then derive it. >> >> Andrew McGregor said he would have a go at this question some time ago... >> Andrew? >> >> >> Bob >> >> >> >> On 07/06/15 20:27, Toke Høiland-Jørgensen wrote: >> >> Hi Bob >> >> Apologies for reviving this ancient thread; been meaning to get around >> to it sooner, but well... better late than never I suppose. >> >> (Web link to your original mail, in case Message-ID referencing breaks: >> https://www.ietf.org/mail-archive/web/aqm/current/msg00376.html ). >> >> Having recently had a need to understand CoDel's behaviour in more >> detail, your analysis popped out of wherever it's been hiding in the >> back of my mind and presented itself as maybe a good place to start. :) >> >> So anyhow, I'm going to skip the initial assertions in your email and >> focus on the analysis: >> >> Here's my working (pls check it - I may have made mistakes) >> _ >> For brevity, I'll define some briefer variable names: >> interval = I [s] >> next_drop = D [s] >> packet-rate = R [pkt/s] >> count = n [pkt] >> >> From the CoDel control law code: >> D(n) = I / sqrt(n) >> And the instantaneous drop probability is: >> p(n) = 1/( R * D(n) ) >> >> Then the slope of the rise in drop probability with time is: >> Delta p / Delta t = [p(n+1) - p(n)] / D(n) >> = [1/D(n+1) - 1/D(n)] / [ R * D(n) ] >> = sqrt(n) * [sqrt(n+1) - sqrt(n)] / >> [R*I*I] >> = [ sqrt(n(n+1)) - n ] / [R*I^2] >> >> I couldn't find anything wrong with the derivation. I'm not entirely >> sure that I think it makes sense to speak about an "instantaneous drop >> probability" for an algorithm that is not probabilistic in nature. >> However, interpreting p(n) as "the fraction of packets dropped over the >> interval from D(n) to D(n+1)" makes sense, I guess, and for this >> analysis that works. 
>> >> At count = 1, the numerator starts at sqrt(2)-1 = 0.414. >> And as n increases, it rapidly tends to 1/2. >> >> So CoDel's rate of increase of drop probability with time is nearly >> constant (it >> is always between 0.414 and 0.5) and it rapidly approaches 0.5 after a few >> drops, tending towards: >> dp/dt = 1/(2*R*I^2) >> >> This constant increase clearly has very little to do with the square-root >> law of >> TCP Reno. >> >> In the above formula, drop probability increases inversely proportional to >> the >> packet rate. For instance, with I = 100ms and 1500B packets >> at 10Mb/s => R = 833 pkt/s => dp/dt = 6.0% /s >> at 100Mb/s => R = 8333 pkt/s => dp/dt = 0.6% /s >> >> I also tried to test this. I configured CoDel (on a Linux 4.0 box) on >> 1Mbps, 2Mbps and 10Mbps links with interval settings of 1 second and >> 500ms, and a total packet limit
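Bob's arithmetic above is easy to check numerically; a quick Python sketch (the code is mine, not from the thread, but the variable names follow his I, R, n):

```python
import math

def slope(I, R, n):
    """dp/dt between drops n and n+1, per the derivation above:
    [sqrt(n*(n+1)) - n] / (R * I^2)."""
    return (math.sqrt(n * (n + 1)) - n) / (R * I ** 2)

I = 0.100  # CoDel interval: 100 ms

# The numerator starts at sqrt(2)-1 ~= 0.414 at n=1 and tends to 1/2,
# so dp/dt rapidly approaches the constant 1/(2*R*I^2):
for R in (833, 8333):  # pkt/s at 10 Mb/s and 100 Mb/s with 1500 B packets
    print(f"R={R:5d} pkt/s: dp/dt -> {100 / (2 * R * I ** 2):.1f} %/s")
# prints 6.0 %/s and 0.6 %/s, matching the figures in the mail
```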
Re: [aqm] WGLC on draft-ietf-aqm-eval-guidelines
On Tue, Aug 18, 2015 at 3:03 PM, Roland Bless roland.bl...@kit.edu wrote: Hi, On 10.08.2015 at 15:43, Wesley Eddy wrote: As chairs, Richard and I would like to start a 2-week working group last call on the AQM characterization guidelines: https://datatracker.ietf.org/doc/draft-ietf-aqm-eval-guidelines/ Please make a review of this, and send comments to the list or chairs. Any comments that you might have will be useful to us, even if it's just to say that you've read it and have no other comments. Unfortunately, we (Polina and I) did a thorough review, which is attached. TL;DR: from our point-of-view the I-D needs a major revision. I am so tired of this document that I can hardly bear to read it again, but I agree with the majority of the comments. Sometimes I do wish we could do graphics and charts as the IEEE does. Regards, Roland -- Dave Täht worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast
[aqm] cake status ( was Codel's count variable and re-entering dropping state at small time intervals)
I would like to stress that cake is a work in progress, taking place with very limited resources - Jonathan's funding ran out last month and we've had to scramble to keep a floor under him F/T, toke is contributing his testbed and test scripts that he used for The Good, the Bad and the WiFi, recently published in Computer Networks: https://kau.toke.dk/experiments/good-bad-wifi/ so we can compare all prior qdiscs... but he is otherwise on vacation... various other parties have contributed scripts to use it in openwrt... and I am entirely unpaid, yet contributing a few servers and clients in real world scenarios while working primarily on the make-wifi-fast stuff, for which some of cake's algorithms may apply but the code needs to move to the mac80211e layer, which was discussed at battlemesh. Other bits - like the new more robust linux hashing api which supports macaddr and mpls targets - are in rapid development elsewhere and we are not tracking that work well. Any suggestions towards putting a better floor under this increasingly promising work are welcomed. Any grant money out there? Exploration of various constants, ratios, and other bits of math throughout the code is welcomed, also. All the code is open source and easily buildable for many versions of linux now. Feel free to play. Much needed are testing and analysis at both line and shaped rates at 1gigE, 10gige and higher, (anyone got 10GigE in a testbed we can use?) - testing at longer rtts is needed (we probably need to expose the interval parameter for the satcomm folk), and with more mixtures of traffic than we currently use. We worked out how to test webrtc only recently (at ietf), for example, but have not coded it up. ns2 and ns3 models are needed. There are some thoughts towards leveraging qfq in another group of researchers, more news on that as it happens. 
Lastly, if anyone knows of some cites for previous attempts at deficit mode schedulers and the other key ideas in cake - we have not done an exhaustive literature search yet, for that penultimate paper that is in progress. Having a ton of fun though! What we did on our summer vacation!
Re: [aqm] [tsvwg] New Liaison Statement, Explicit Congestion Notification for Lower Layer Protocols
Is there anyone doing ECN outreach also to IEEE 802.11? On Tue, Jul 21, 2015 at 10:42 AM, Liaison Statement Management Tool l...@ietf.org wrote: Title: Explicit Congestion Notification for Lower Layer Protocols Submission Date: 2015-07-20 URL of the IETF Web page: https://datatracker.ietf.org/liaison/1424/ Please reply by 2015-10-30 From: Transport Area Working Group (David Black david.bl...@emc.com) To: 3GPP (susanna.koois...@etsi.org) Cc: Gonzalo Camarillo gonzalo.camari...@ericsson.com,Gorry Fairhurst go...@erg.abdn.ac.uk,Martin Stiemerling mls.i...@gmail.com,Spencer Dawkins spencerdawkins.i...@gmail.com,John Kaippallimalil john.kaippallima...@huawei.com,Bob Briscoe i...@bobbriscoe.net,Transport Area Working Group Discussion List ts...@ietf.org Response Contact: David Black david.bl...@emc.com Technical Contact: Bob Briscoe i...@bobbriscoe.net Purpose: For comment Body: To: 3GPP SA, 3GPP CT, 3GPP RAN, 3GPP SA4, 3GPP SA2, 3GPP RAN2 From: IETF TSVWG In 2001, the IETF introduced explicit congestion notification (ECN) to the Internet Protocol as a proposed standard [RFC3168]. The purpose of ECN was to notify congestion without having to drop packets. The IETF originally specified ECN for cases where buffers were IP-aware. However, ECN is now being used in a number of environments including codec selection and rate adaptation, where 3GPP protocols such as PDCP encapsulate IP. As active queue management (AQM) and ECN become widely deployed in 3GPP networks and interconnected IP networks, it could be incompatible with the standardized use of ECN across the end-to-end IP transport [RFC7567]. The IETF is now considering new uses of ECN for low latency [draft-welzl-ecn-benefits] that would be applicable to 5G mobile flows. However, the IETF has realized that it has given little if any guidance on how to add explicit congestion notification to lower layer protocols or interfaces between lower layers and ECN in IP. 
This liaison statement is to inform 3GPP, in particular those groups including those involved in 3GPP Release-10 work on the work item ECSRA_LA (TR23.860) - SA4, CT4, SA2 and RAN2. Please distribute to all groups that have used or plan to use IETF ECN /AQM RFCs in 3GPP specifications. The IETF has started work on guidelines for adding ECN to protocols that may encapsulate IP and interfacing these protocols with ECN in IP. Then IP may act in its role as an interoperability protocol over multiple forwarding protocols. This activity is led by the IETF's transport services working group (tsvwg). Actions: The IETF tsvwg kindly asks 3GPP: 1) to tell the IETF tsvwg which 3GPP working groups could be affected by this work. 2) To inform the IETF tsvwg of any specific 3GPP specifications affected by this work. 3) to forward this liaison statement to these affected working groups, and to invite them to review the latest draft of the guidelines, available here: http://tools.ietf.org/html/draft-ietf-tsvwg-ecn-encap-guidelines Review comments are particularly welcome on: - comprehensibility for the 3GPP community - usefulness and applicability - technical feasibility Review comments may be posted directly to the IETF tsvwg mailing list mailto: ts...@ietf.org. Postings from non-subscribers may be delayed by moderation. Alternatively, subscription is open to all at: https://www.ietf.org/mailman/listinfo/tsvwg. The following IETF specifications or drafts are particularly relevant to this activity (the relevance of each of them is explained in the first item below): * draft-ietf-tsvwg-ecn-encap-guidelines * RFC3168 updated by RFC4301, RFC6040 (ECN in respectively: IP/TCP, IPsec IP-in-IP tunnels) * RFC6679 (ECN in RTP) * RFC5129 updated by RFC5462 (ECN in MPLS) * RFC4774 (Specifying alternative semantics for the ECN field) * RFC7567 (Recommendations Regarding Active Queue Management * draft-welzl-ecn-benefits (Benefits to Applications of Using ECN) Yours, --David L. 
Black (TSVWG co-chair) Attachments: No document has been attached -- Dave Täht worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast
[aqm] quick comment on aqm-eval-guidelines
From each of these sets of measurements, the 10th and 90th percentiles and the median value SHOULD be computed. For each scenario, a graph can be generated, with the x-axis showing the end-to-end delay and the y-axis the goodput. This graph provides part of a better understanding of (1) the delay/goodput trade-off for a given congestion control mechanism, and (2) how the goodput and average queue size vary as a function of the traffic load. This is lame. Capturing *all* the data as in a CDF or a Winstein ellipsis plot, across the entire range, is to be preferred when engineering a system. 90th percentile is a very, very low bar to cross; most of the nasty bufferbloat happens at the top end of the range. Packet crcs, as one example, are measured out to what, one in 6 million? Would you drive a car that had the steering wheel fail one time in 10 turns? As for medians, seven figure summaries, if you must... -- Dave Täht worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast
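To make the complaint concrete, here is a toy example (all numbers invented) where the 10th/50th/90th percentiles the draft asks for look perfectly healthy, while the full distribution shows the bloat hiding in the tail:

```python
import random

random.seed(1)
# Toy latency samples: mostly ~10 ms, but 5% of them bloated to 1.5-2.5 s.
samples = sorted([random.uniform(8, 12) for _ in range(950)] +
                 [random.uniform(1500, 2500) for _ in range(50)])

def percentile(sorted_data, p):
    """Nearest-rank percentile of an already-sorted list."""
    k = max(0, round(p / 100 * len(sorted_data)) - 1)
    return sorted_data[k]

# The three numbers the draft asks for: all look fine (~8-12 ms).
for p in (10, 50, 90):
    print(f"p{p} = {percentile(samples, p):.1f} ms")

# A full CDF would expose what they hide: the top 5% is two orders of
# magnitude worse, which is exactly where the bufferbloat lives.
print(f"p99 = {percentile(samples, 99):.1f} ms, max = {max(samples):.1f} ms")
```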
Re: [aqm] FQ-PIE kernel module implementation
On Fri, Jul 3, 2015 at 11:22 AM, Fred Baker (fred) f...@cisco.com wrote: On Jul 3, 2015, at 10:56 AM, Dave Taht dave.t...@gmail.com wrote: There are also weighted FQ systems (like qfq+ + pie or codel) under development. Actually, a WFQ system has been in Cisco product for 20 years, and I wrote one at a different company four years earlier. Having FQ systems be weighted is pretty normal. yep! Sorry! What is the current limit on number of queues, however? /me gets head out of linux sand -- Dave Täht worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast
Re: [aqm] FQ-PIE kernel module implementation
On Fri, Jul 3, 2015 at 2:52 AM, Fred Baker f...@cisco.com wrote: On Jul 3, 2015, at 2:45 AM, Polina Goltsman uu...@student.kit.edu wrote: As I understand the FQ-Codel draft, it seems to be fundamental to FQ-Codel that each queue has separate state variables. So my question is: is it indeed fundamental? If you're asking whether it is fundamental to fair queuing, I'll recommend you start researching that question with RFC 970 and the articles in SIGCOMM and INFOCOM on the topic circa 1988-1995 or so. Also take a look at Class-based Queueing (aka CBQ) in the same timeframe. I think you'll find that FQ systems are not approached as collections of queues with different characteristics; they are collections of queues with essentially the same set of characteristics, using scheduling to make the queues share bandwidth in a manner similar to the Generalized Processor Sharing model. On the other hand, CBQ systems are systems with separate queues or classes for different sets of traffic, with different characteristics such as drop policy or target latency. I do not think how FQ-codel works is fundamental to FQ; rather, it is an innovation that seems to work well in practice, with extremely low overhead, leaving (mostly) untouched request-response traffic like dns and other low rate traffic, while getting large bursts under control *rapidly*, yet allowing short flows through, with no further configuration. So it is a set of queues with different characteristics - which is why we ended up calling it Flow queueing, not fair queuing. AQMs (like pie) traditionally rely on applying increasing amounts of random drops/marks to a stream of packets, hoping to eventually pick out the fattest flows and shoot at them, and yet modern traffic is A) asymmetric, and B) bidirectional, and C) bursty, with large bursts coming from various forms of TSO/GSO/GRO offloads, IW10 (and now quic IW10+22 paced packets), and things like web browsers opening up many connections simultaneously. 
Any form of FQ reduces the impact of C) enormously. So much so that I regard FQ as the biggest part of the answer to achieving reliably low latency for all forms of traffic. No AQM deals particularly well with B - if you have 20 acks vs 1 full size MTU packet filling up the queue, it can take a lot longer for the AQM to find an ideal drop rate - which is not, actually, ideal. Toke did a preso on this on B and C. As for A) - I am seeing a 12x1 ratio of down to up in my current cable modem services, and this makes acks actually far more important and painful than they ever have been before. It does not take a lot of fat uplink traffic to start starving the downlink. Wifi and LTE are also often painfully asymmetric. FQ does impose some structure - per packet fairness has problems in some scenarios, byte fairness at the MTU size does not shoot at enough acks, the compromise in deployed fq_codel systems at lower bandwidths is a DRR quantum of 300 bytes. Cake uses peeling (to deal with up to 64k byte packets common now in consumer and server hardware), an 8 way set associative hash (much more flow isolation) and has a variable quantum (presently), based on the number of extant flows, and has some thoughts about the ideal BDP in the AQM, as the number of flows goes up, and is smarter about quite a few things, and perhaps dumber about others. http://www.bufferbloat.net/projects/codel/wiki/Cake I think FQ-pie is a good idea (aside from worrying about it over-prioritizing millions of small flows). I still lean towards stochastic methods (as in cake and fq_codel) along the edge. As for the core, or where you have a lot of cpus on the ingress side, damned if I know. As for wifi/wireless, we still have tons of work left to do there. On my benchmarks fq-pie, fq_codel, and cake all win pretty big. I think cake is now more or less better than sqm-scripts was, and handles more edge cases. 
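The DRR compromise mentioned above (a 300-byte quantum, so streams of small acks are not starved by MTU-sized bulk packets) can be sketched in a few lines. This is a simplification of my own for illustration - no hashing, peeling, or AQM, just the deficit round robin scheduling:

```python
from collections import deque

def drr(queues, quantum=300):
    """Minimal deficit round robin sketch: each queue is a deque of packet
    sizes in bytes; yields (queue_index, size) in dequeue order."""
    deficits = [0] * len(queues)
    active = deque(i for i, q in enumerate(queues) if q)
    while active:
        i = active.popleft()
        deficits[i] += quantum  # each visit earns one quantum of credit
        while queues[i] and queues[i][0] <= deficits[i]:
            size = queues[i].popleft()
            deficits[i] -= size
            yield i, size
        if queues[i]:
            active.append(i)   # still backlogged: back of the round
        else:
            deficits[i] = 0    # flow went idle; it keeps no credit

# A 1500 B bulk flow vs a stream of 64 B acks: with a 300 B quantum,
# most of the acks drain before the first bulk packet is released.
bulk = deque([1500] * 3)
acks = deque([64] * 20)
order = list(drr([bulk, acks]))
```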
When we built the differentiated services model, we modeled a FQ subsystem as if it were a single queue in a larger CBQ system. We might, for example, have a FQ system for an AF class, but give EF priority over the entire FQ subsystem. What we did with sqm-scripts (and other deployed fq_codel based systems, like free.fr's) was to have 3 tiers of relative fq_codel queues for priority, best effort, and background. That seems to work pretty well. There is not a lot of effective use of classification in my sample sets. Cake is experimenting with various means of layering diffserv on top of that, presently with a default of 4 fq_codel-ish queues. We are collecting a great deal more stats on queue behavior and actual loads in cake, now, example here: http://pastebin.com/bX1HmDP6 Couple notes on that url: Class 0 is just a name (background traffic goes here), class 1 is CS0, classes 2 and 3 are higher prio than 1. We need a better name than class as CS0 is actually in class 1. Sent 65725311796 bytes 52559409 pkt (dropped 11935,
Re: [aqm] Questioning the goal of a hard delay target
On Fri, Jul 3, 2015 at 10:42 AM, Bob Briscoe i...@bobbriscoe.net wrote: Simon, Y, if you're going to start autoadjusting a hard-coded parameter, you have to first question whether it was right to choose that parameter to hard-code in the first place. In codel, target was never a hardcoded parameter. It has always been specified as 5-10% of the interval, with a default of 5% (which equals 5ms on an interval of 100ms. In retrospect I really wish we had made it be an actual percentage in the code and configuration, and on other days wish we had only exposed interval as a parameter). We have always thought target in the case of wifi especially needed to be a function of active stations. This is sort of where cake is going. Target is merely a delay that codel *aims for*. When it hits a drop rate (or the flow slows down) enough - it turns off, and the algorithm goes into behavior that only goes on again after the delay exceeds target for the current computed interval. It is good to have some new thinking on this, of course, and codifying how to modify the target - or work on different curves on various other algorithms, is wonderful. I like bob's other piece on smaller cwnds to keep the tcp signal strength up, but am not as allergic as he to reducing mss in such cases. One of these days, perhaps, someone will successfully write up and explain the 3 modes of codel. People look too hard at the ramp up portion of the algo and not at what happens when you are at or near steady state. I would like to develop a model that shows what is going on in all the queues, all the time, and presents it graphically, somehow. Bob On 03/07/15 18:34, Simon Barber wrote: Hi Bob, Very interesting to see this. I had just recently privately proposed an extension to Codel - to auto tune the target parameter. The proposal is to observe the characteristics that are exhibited when target is too large or too small, and make adjustments appropriately. i.e. 
if you make a single drop during an interval, and the response of the flow is to go idle (even momentarily) then perhaps it was because target is too small. Using some rule you could increase target. Conversely you can heuristically identify when target is likely too large, and reduce it. Simon On 7/3/2015 5:20 AM, Bob Briscoe wrote: AQM chairs and list, 1) Delay-loss tradeoff We (Koen de Schepper and I) have designed an AQM aimed at removing the need for low delay QoS classes, initially as a cost/complexity reduction exercise for broadband remote access servers (BRASs). One of the requirements given to us was: * As background load increases, delay-sensitive apps previously given priority QoS treatment (e.g. voice, conversational video) should continue to get the same QoS as they got with Diffserv. We found that AQMs with a hard delay threshold (PIE, CoDel) have to drive up loss really high in order to maintain the hard cap on delay. The levels of loss start to cause QoS problems for voice, even tho delay is fine. Indeed, we found that the high levels of loss become the dominant cause of delay for Web traffic, due to tail losses and timeouts. Everyone has been focusing on delay, but we've not been noticing consequent really bad loss levels at high load. Once you know where to look, the problem is easy to grasp: As load increases, the bottleneck link has to get each TCP flow to go slower to use a smaller share of the link. The network can increase either drop or RTT. If it holds queuing delay (and therefore RTT) constant (as PIE and CoDel do), it has to increase drop more. We found that by softening the delay threshold a little, at high load we don't need crazy loss levels to keep delay within bounds. BTW, the implementation needs fewer operations per packet than RED, PIE or CoDel. Conversely, at low load, a hard queuing delay threshold also means that delay will be /higher/ than it needs to be. 
I've written up a brief (4pp) tech report quantifying the problem analytically. http://www.bobbriscoe.net/projects/latency/credi_tr.pdf Koen and colleagues have since done thousands of experiments on their broadband testbed with real equipment. It's looking good, even before we've explored varying what we call the 'curviness' parameter (which varies how hard the target is). We have a paper under submission with all the results, which we'll post as soon as it's not sub judice. 2) Does Flow Aggregation Increase or Decrease the Queue? Something else had been bugging me about how queue lengths vary with load: The above argument explains how more TCP flows /increase/ the queue. But queues are meant to get /smaller/ at higher levels of aggregation. The second half of the above tech report explains why there's no paradox. And it goes on to explain when you have to configure an AQM with different parameters for higher link capacity, and when you don't. It gives the formula for how to set the config too. Writing this
Re: [aqm] tackling torrent on a 10mbit uplink (100mbit down)
sometimes I pick the wrong week to actually try to benchmark a protocol in the wild. https://torrentfreak.com/popular-torrents-being-sabotaged-by-ipv6-peer-flood-150619/ On Fri, Jun 19, 2015 at 9:01 AM, Dave Taht dave.t...@gmail.com wrote: I just downloaded and seeded 4 popular torrents overnight using the latest version of the transmission-gtk client. I have not paid much attention to this app or protocol of late (about 2.5 years since last I did this), I got a little sparked by wanting to test cdg, but did not get that far. Some egress stats this morning (fq_codel on the uplink) bytes 32050522339 packets 3379478 dropped 702799 percent 20.80% maxpacket 28614 Some notes: 1) The link stayed remarkably usable: http://snapon.lab.bufferbloat.net/~d/withtorrent/vs64connectedpeers.png This graph shows what happened when one of the 4 torrents completed. The percentage of bandwidth the uplink on this test got was a bit larger than I expected. Subjectively, web browsing was slower but usable, and my other normal usages (like ssh and mosh and google music over quic) were seemingly unaffected. (latency for small flows stayed pretty flat) 2) even with 69 peers going at peak, I generally did not get anywhere near saturating the 100mbit downlink with torrent alone. 3) Offloads are a pita. Merely counting packets here does not show the real truth of what's going on (max packet of 28614 bytes!?), so linux, benchmarkers, and so on, should also be counting bytes dropped these days. (cake does peeling of superpackets but I was not testing that, and it too does not return bytes dropped) 4) *All* the traffic was udp. (uTP) Despite ipv6 being enabled (with two source specific ipv6 ips), I did not see any ipv6 peers connect. Bug? Death of torrent over ipv6? Blocking? What? 5) transmission-generated uplink traffic seemed bursty, but I did not tear apart the data or code. I will track queue length next time. 
6) Although transmission seems to support setting the diffserv bytes, it did not do so on the udp marked traffic. I think that was a tcp only option. Also it is incorrect for ipv6 (not using IPV6_TCLASS). I had figured (before starting the test) that this was going to be a good test of cake's diffserv support. Sigh. Is there some other client I could use? 7) transmission ate a metric ton of cpu (30% on a i3) at these speeds. 8) My (cable) link actually is 140mbit down, 11 up. I did not much care for asymmetric networks when the ratios were 6x1, so 13x1 is way up there Anyway, 20% packet loss of the right packets was survivable. I will subject myself to the same test on other fq or aqms. And, if I can force myself to, with no aqm or fq. For SCIENCE! Attention, DMCA lawyers: Please send takedown notices to bufferbloat-research@/dev/null.org . One of the things truly astonishing about this is that in 12 hours in one night I downloaded more stuff than I could ever watch (mp4) or listen to (even in flac format) in several days of dedicated consumption. And it all just got rm -rf'd. It occurs to me there is a human upper bound to how much data one would ever want to consume, and we cracked that limit at 20mbit, with only 4k+ video driving demand any harder. When we started bufferbloat.net 20mbit downlinks were the best you could easily get. -- Dave Täht worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast -- Dave Täht worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
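The headline numbers in the quoted stats are easy to sanity-check, and the average "packet" size makes point 3 (offloads distort packet counts) concrete:

```python
# Egress counters from the fq_codel stats quoted above.
packets = 3_379_478
dropped = 702_799
bytes_sent = 32_050_522_339

drop_pct = 100 * dropped / packets
print(f"drop rate: {drop_pct:.2f}%")   # reproduces the reported 20.80%

# With a 1500 B MTU this "average packet" is impossible on the wire --
# these are GSO/GRO superpackets, which is why counting bytes dropped
# (not just packets dropped) matters on modern linux.
avg_pkt = bytes_sent / packets
print(f"average packet: {avg_pkt:.0f} bytes")
```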
[aqm] loss + ect
After subjecting myself to the cable dslreports.com/speedtest on a 1mbit link, against current implementations of codel and fq_codel (no overload protection), pie and cake (overload protection)... and witnessing the carnage... ...I kind of think transports should treat loss with ect(3) also being sent as a stronger signal than they do.
[aqm] tackling torrent on a 10mbit uplink (100mbit down)
I just downloaded and seeded 4 popular torrents overnight using the latest version of the transmission-gtk client. I have not paid much attention to this app or protocol of late (about 2.5 years since last I did this), I got a little sparked by wanting to test cdg, but did not get that far. Some egress stats this morning (fq_codel on the uplink) bytes 32050522339 packets 3379478 dropped 702799 percent 20.80% maxpacket 28614 Some notes: 1) The link stayed remarkably usable: http://snapon.lab.bufferbloat.net/~d/withtorrent/vs64connectedpeers.png This graph shows what happened when one of the 4 torrents completed. The percentage of bandwidth the uplink on this test got was a bit larger than I expected. Subjectively, web browsing was slower but usable, and my other normal usages (like ssh and mosh and google music over quic) were seemingly unaffected. (latency for small flows stayed pretty flat) 2) even with 69 peers going at peak, I generally did not get anywhere near saturating the 100mbit downlink with torrent alone. 3) Offloads are a pita. Merely counting packets here does not show the real truth of what's going on (max packet of 28614 bytes!?), so linux, benchmarkers, and so on, should also be counting bytes dropped these days. (cake does peeling of superpackets but I was not testing that, and it too does not return bytes dropped) 4) *All* the traffic was udp. (uTP) Despite ipv6 being enabled (with two source specific ipv6 ips), I did not see any ipv6 peers connect. Bug? Death of torrent over ipv6? Blocking? What? 5) transmission-generated uplink traffic seemed bursty, but I did not tear apart the data or code. I will track queue length next time. 6) Although transmission seems to support setting the diffserv bytes, it did not do so on the udp marked traffic. I think that was a tcp only option. Also it is incorrect for ipv6 (not using IPV6_TCLASS). I had figured (before starting the test) that this was going to be a good test of cake's diffserv support. Sigh. 
Is there some other client I could use?

7) transmission ate a metric ton of cpu (30% on an i3) at these speeds.

8) My (cable) link actually is 140mbit down, 11 up. I did not much care for asymmetric networks when the ratios were 6x1, so 13x1 is way up there.

Anyway, 20% packet loss of the right packets was survivable. I will subject myself to the same test on other fq or aqms. And, if I can force myself to, with no aqm or fq. For SCIENCE!

Attention, DMCA lawyers: Please send takedown notices to bufferbloat-research@/dev/null.org .

One of the things truly astonishing about this is that in 12 hours in one night I downloaded more stuff than I could ever watch (mp4) or listen to (even in flac format) in several days of dedicated consumption. And it all just got rm -rf'd. It occurs to me there is a human upper bound to how much data one would ever want to consume, and we cracked that limit at 20mbit, with only 4k+ video driving demand any harder. When we started bufferbloat.net, 20mbit downlinks were the best you could easily get.

-- Dave Täht worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast

___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
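A quick sanity check on the egress stats above (a sketch; it assumes the "percent" field is dropped/sent-packets, which is what matches the numbers shown):

```python
# Back-of-envelope check of the fq_codel egress stats quoted above.
# Assumption: "percent" is dropped / sent packets, not dropped / offered.
sent_packets = 3_379_478
dropped = 702_799

percent = 100 * dropped / sent_packets
print(f"drop percent: {percent:.2f}%")  # matches the reported 20.80%

# Note: with offloads, one "packet" here can be a 28614-byte superpacket,
# which is why byte-based drop accounting would tell a truer story.
```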
Re: [aqm] I-D Action: draft-ietf-aqm-ecn-benefits-04.txt
For many positive bullet points in the present document I can think of a negative counter-example in the real world that needs to be defeated in detail. Just off the top of my head:

3.2: tinc, when carrying tos, encapsulates the ecn markings also and does not apply them properly according to the various rfcs.

3.3: recently, one large provider's equal-cost multipath implementation used the full 8-bit tos field as part of the tuple. This worked fine, until CE started getting exerted by new aqms in the path, which led to massive packet re-ordering. Fixing it required fixing a ton of pretty modern vendor gear.

3.4: thus far, even with multiple queues, on the aqms I have, ECN marked traffic causes extra loss and delay in non-ecn-marked traffic. I agree that we should ecn mark sooner than drop; work is progressing.

I would like it if non-traditional (ab)uses of ecn were covered: 1) attacks using ecn marked packets on dns servers, for example; 2) future protocols that could use it (say, Quic); 3) as an example of something I've been fiddling with for a long time, coupling a routing protocol's metrics to something other than packet loss, and getting a better signal under congestion by using ecn marked packets for more reliable communications.

The draft touches upon voip uses (where I kind of think ecn is not the best idea), but does not cover videoconferencing well, where I think ecn protection of iframes would be a very good idea. So the guidance in sec 2.4 is a bit vague. Aggregating transports with retries (e.g. wifi) could use ecn basically for free when experiencing trouble at the lowest layers of the stack.

I know I have a tendency to accumulate the negatives (I do LIKE ecn), but would certainly like to have a forum, or a living document or wiki, for potential sysadmins, vendors, and deployers to get a clear grip on what can go wrong when attempting to roll out ecn.
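The 3.3 counter-example is easy to sketch: hash the full 8-bit tos byte into the ECMP path choice, and a CE mark can move a flow mid-stream (illustrative code only; the function names and hash are assumptions, not the vendor's implementation):

```python
import zlib

ECN_MASK = 0x03  # low two bits of the old TOS byte carry ECN

def ecmp_path(src, dst, sport, dport, tos, n_paths=4, buggy=False):
    # Pick an equal-cost path by hashing the flow key.
    # The bug: including the ECN bits in the hashed key.
    key_tos = tos if buggy else (tos & ~ECN_MASK)
    key = f"{src}|{dst}|{sport}|{dport}|{key_tos}".encode()
    return zlib.crc32(key) % n_paths

# Same flow before and after an AQM sets CE (ECT(0)=0b10 -> CE=0b11).
# With the masked (fixed) hash the path never changes:
fixed_before = ecmp_path("10.0.0.1", "10.0.0.2", 1234, 80, 0x02)
fixed_after  = ecmp_path("10.0.0.1", "10.0.0.2", 1234, 80, 0x03)
assert fixed_before == fixed_after

# With the buggy hash, a CE mark can flip the path mid-flow,
# which is exactly the re-ordering failure described above.
```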
So I am mostly in favor of this document getting published, so long as someone steps up to also be an ecn news central, chock full of user-generated content on the pitfalls, tips, and tricks - and benefits! - to guide ecn deployment further along. ecn is inevitable. finally.
Re: [aqm] CoDel on high-speed links
On Mon, Jun 15, 2015 at 4:40 PM, Agarwal, Anil anil.agar...@viasat.com wrote:

> I guess this is pointing to the age-old problem - what is the right buffer size or equivalent delay limit, when packets should be dropped or ECN-marked, so that the link is never under-utilized? For a single TCP connection, the answer is the bandwidth-delay product BDP. For large number of connections, it is BDP / sqrt(numConnections). Hence, one size does not fit all. E.g., for RTT of 100 ms or 500 ms, CoDel target delay of 5 or 10 ms is too short - when handling a small number of connections.

The codel recommendation is that the target be set to 5-10% of the typical interval (RTT), so in the case of a sat link, interval would be 600(?)ms, and target 5-10% of that, 30-60ms. The recommendation bears testing in this scenario. If you would like to specify what you expect your (say) 98th percentile real physical RTT to be, I can exhaustively simulate that rather rapidly later this week against reno, cdg, cubic, and westwood and various aqms, against flows from 1 to 100.

Most of this convo has also missed other advancements in tcp (reno is thoroughly dead), like PRR. A very good read which incorporates a discussion of PRR is http://folk.uio.no/kennetkl/jonassen_thesis.pdf which I plan to finish reading on the plane.

> I am not sure what the pie settings would be for a sat system. Perhaps, there is a need for a design that adapts the queue size (or delay) target dynamically by estimating numConnections !

Not perhaps.
That is one of the avenues cake is exploring: https://lists.bufferbloat.net/pipermail/cake/2015-June/000241.html

> Anil

-Original Message- From: aqm [mailto:aqm-boun...@ietf.org] On Behalf Of Simon Barber Sent: Monday, June 15, 2015 2:01 PM To: Dave Taht Cc: Jonathan Morton; aqm@ietf.org; Steven Blake Subject: Re: [aqm] CoDel on high-speed links

On 6/14/2015 10:26 PM, Dave Taht wrote: On Sun, Jun 14, 2015 at 4:10 PM, Simon Barber si...@superduper.net wrote: Indeed - I believe that Codel will drop too much to allow maximum bandwidth utilization, when there are very few flows, and RTT is significantly greater than target.

Interval. Not target. Interval defaults to 100ms. Target is 5ms. Dropping behaviors stop when the queue falls below the target.

In this case I specifically mean target, not interval. Dropping stops when queue falls below target, but by then it's too late. In the case I'm talking about (cwind cut by more than queue length) a period of link idle occurs, and so bandwidth is hurt. It happens repeatedly.

Range of tests from near zero to 300ms RTT codel does quite well with reno, better with cubic, on single flows. 4 flows, better. fq_codel does better than that on more than X flows in general.

The effect is not huge, but the bandwidth loss is there. More flows significantly reduce the effect, since the other flows keep the link busy. This bandwidth reduction effect only happens with very few flows. I think TCP Reno will be worse than Cubic, due to its 50% reduction in cwind on drop vs Cubic's 20% reduction - but Cubic's RTT-independent increase in cwind after the drop may make the effect happen more often with larger RTTs. What results have you seen for codel on single flow for these larger RTTs?

You can easily do whatever experiments you like with off the shelf hardware and RTTs around half the planet to get the observations you need to confirm your thinking. Remember that a drop tail queue of various sizes has problems of its own.
I have a long overdue rant in progress of being wikified about how to use netem correctly to properly emulate any rtt you like.

I note that a main aqm goal is not maximum bandwidth utilization, but maximum bandwidth while still having working congestion avoidance and minimal queue depth, so other new flows can rapidly grab their fair share of the link. The bufferbloat problem was the result of wanting maximum bandwidth for single flows.

Indeed - with many TCP CC algorithms it's just not possible to achieve maximum bandwidth utilization with only 5ms induced latency when the RTTs are long, and a single queue (no FQ, only drop tail or single queue AQM). The multiplicative decrease part of TCP CC simply does not allow it unless the decrease is smaller than the queue (PRR might mitigate a little here). Now add in FQ and you can have the best of both worlds.

The theory is - with a Reno based CC the cwind gets cut in half on a drop. If the drop in cwind is greater than the number of packets in the queue, then the queue will empty out, and the link will then be idle for a flight + queue. When cwind gets cut by N packets, the sender stops sending data while ACKs for N data packets are received. If the queue has fewer than N data packets, then it will empty out, resulting in an idle link at that point, and eventually at the receiver (hence bandwidth loss).
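Simon's cwnd-cut-versus-queue argument above reduces to a one-liner: with Wmax = BDP + Q at overflow, Reno's halving drains the queue dry exactly when Q < BDP. A toy model (my sketch; it ignores PRR, pacing, and delayed acks):

```python
def reno_idles_after_drop(rate_bps, rtt_s, queue_pkts, mss_bytes=1500):
    # BDP in packets: what the pipe itself holds at this rate and RTT.
    bdp_pkts = rate_bps * rtt_s / (8 * mss_bytes)
    wmax = bdp_pkts + queue_pkts   # cwnd when the queue overflows
    cut = wmax / 2                 # Reno halves cwnd on loss
    # If the cut exceeds the queued backlog, the queue empties and the
    # link goes idle until the reduced window catches up again.
    return cut > queue_pkts        # equivalent to queue_pkts < bdp_pkts

# 10 Mbit/s at 100 ms RTT -> BDP ~83 packets: a 10-packet queue
# goes idle after a drop; a 200-packet (>BDP) queue does not.
print(reno_idles_after_drop(10e6, 0.1, 10))    # True
print(reno_idles_after_drop(10e6, 0.1, 200))   # False
```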
Re: [aqm] CoDel on high-speed links
On Mon, Jun 15, 2015 at 5:12 PM, Agarwal, Anil anil.agar...@viasat.com wrote:

> Dave, I guess I need to read up on cake.

some basic doc is at: http://www.bufferbloat.net/projects/codel/wiki/Cake

The most important thing in cake at the moment is GRO packet peeling, which turned out desperately needed in all the new router hardware we have encountered. Huge wins there. the other stuff is not fully baked or implemented yet. We are in a bit of a debate about the most troublesome and misunderstood aspect of codel on the list over there.

> If you have time, can you simulate an RTT of 600 ms?

ok.

> With a few queue drain rates from 1 Mbps to 100 Mbps.

10,50,100,200,300 was my planned range, fed by gigE. I can't go much faster than that. (and even as low as 300 requires GRO offloads)

> Would help us satellite folks get a better understanding of CoDel parameters.

I have not looked at the very long rtt problem in several years, and since then tcps in particular have changed muchly (pacing in particular).

> Thanks, Anil
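The target/interval rule of thumb from earlier in the thread is trivial to encode (a sketch; the 600 ms figure is the assumed sat-path RTT from this discussion):

```python
def codel_params_for_rtt(typical_rtt_ms):
    """Rule of thumb from the thread: set interval to the typical
    (worst-case-ish) RTT, and target to 5-10% of interval."""
    interval = typical_rtt_ms
    target_lo = interval * 5 / 100
    target_hi = interval * 10 / 100
    return interval, (target_lo, target_hi)

# Terrestrial default: interval 100 ms -> target 5-10 ms.
print(codel_params_for_rtt(100))
# Sat link: interval ~600 ms -> target 30-60 ms, as suggested above.
print(codel_params_for_rtt(600))
```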
Re: [aqm] CoDel on high-speed links
On Sun, Jun 14, 2015 at 4:10 PM, Simon Barber si...@superduper.net wrote:

> Indeed - I believe that Codel will drop too much to allow maximum bandwidth utilization, when there are very few flows, and RTT is significantly greater than target.

Interval. Not target. Interval defaults to 100ms. Target is 5ms. Dropping behaviors stop when the queue falls below the target.

Range of tests from near zero to 300ms RTT codel does quite well with reno, better with cubic, on single flows. 4 flows, better. fq_codel does better than that on more than X flows in general. You can easily do whatever experiments you like with off the shelf hardware and RTTs around half the planet to get the observations you need to confirm your thinking. Remember that a drop tail queue of various sizes has problems of its own. I have a long overdue rant in progress of being wikified about how to use netem correctly to properly emulate any rtt you like.

I note that a main aqm goal is not maximum bandwidth utilization, but maximum bandwidth while still having working congestion avoidance and minimal queue depth, so other new flows can rapidly grab their fair share of the link. The bufferbloat problem was the result of wanting maximum bandwidth for single flows.

> The theory is - with a Reno based CC the cwind gets cut in half on a drop. If the drop in cwind is greater than the number of packets in the queue, then the queue will empty out, and the link will then be idle for a flight + queue. If you want to keep the data flowing uninterrupted, then you must have a full unloaded RTT's worth of data in the queue at that point.

Do the experiment? Recently landed in flent is the ability to monitor queue depth while running another test.

> A drop will happen, the cwind will be halved (assuming a Reno TCP), and the sender will stop sending until one (unloaded) RTT's worth of data has been received. At that point the queue will just hit empty as the sender starts sending again.

And reno is dead.
Long live reno!

> Simon

On 6/9/2015 10:30 AM, Jonathan Morton wrote:

> Wouldn't that be a sign of dropping too much, in contrast to your previous post suggesting it wouldn't drop enough? In practice, statistical multiplexing works just fine with fq_codel, and you do in fact get more throughput with multiple flows in those cases where a single flow fails to reach adequate utilisation. Additionally, utilisation below 100% is really characteristic of Reno on any worthwhile AQM queue and significant RTT. Other TCPs, particularly CUBIC and Westwood+, do rather better.

> - Jonathan Morton
Re: [aqm] CoDel on high-speed links
to. Regards, Anil Agarwal ViaSat Inc. -Original Message- From: aqm [mailto:aqm-boun...@ietf.org] On Behalf Of Steven Blake Sent: Tuesday, June 09, 2015 4:40 PM To: Dave Taht Cc: aqm@ietf.org Subject: Re: [aqm] CoDel on high-speed links On Tue, 2015-06-09 at 12:44 -0700, Dave Taht wrote: The below makes several mis-characterisations of codel in the first place, and then attempts to reason from there. Hmmm... On Tue, Jun 9, 2015 at 9:11 AM, Steven Blake slbl...@petri-meat.com wrote: I have a question about how CoDel (as defined in draft-ietf-aqm-codel-01) behaves on high-speed (e.g., = 1 Gbps) links. If this has been discussed before, please just point me in the right direction. In the text below, I'm using drop to mean either packet discard/ECN mark. I'm using (instantaneous) drop frequency to mean the inverse of the interval between consecutive drops during a congestion epoch, measured in drops/sec. The control law for CoDel computes the next time to drop a packet, and is given as: t + interval/sqrt(count) where t is the current time, interval is a value roughly proportional to maximum RTT (recommended 100 msec), and count is cumulative number of drops during a congestion epoch. No. Count is just a variable to control the curve of the drop rate. It is not constantly incremented, either, it goes up and down based on how successful it is at controlling the flow(s), only incrementing while latency exceeds the target, decrementing slightly after it stays below the target. The time spent below the target is not accounted for, so you might have a high bang-bang drop rate retained, when something goes above from below. This subtlety is something people consistently miss and something I tried to elucidate in the first stanford talk. I specifically mentioned during a congestion epoch, but let me be more precise: count is continuously incremented during an extended period where latency exceeds the target (perhaps because CoDel isn't yet dropping hard enough). Correct? 
The fact that the drop frequency doesn't ramp down quickly when congestion is momentarily relieved is good, but doesn't help if it takes forever for the algorithm to ramp up to an effective drop frequency (i.e., something greater than 1 drop/flow/minute). It is not hard to see that drop frequency increases with sqrt(count). At the first drop, the frequency is 10 drops/sec; after 100 drops it is 100 drops/sec; after 1000 drops it is 316 drops/sec. On a 4 Mbps link serving say 1000 packets/sec (on average), CoDel immediately starts dropping 1% of packets and ramps up to ~10% after 100 drops (1.86 secs).

No, it will wait 100ms after stuff first exceeds the target, then progressively shoot harder based on the progress of the interval/sqrt(count).

Ok. At the first drop it is dropping at a rate of 1 packet/100 msec == 10 drops/sec and ramps up from there. At the 100th drop it is dropping at a rate of 100 msec/sqrt(100) == 1 packet/10 msec == 100 drops/sec. This just so happens to occur after 1.8 secs. Aside: as described, CoDel's drop frequency during a congestion epoch increases approximately linearly with time (at a rate of about 50 drops/sec^2 when interval = 100 msec).

Secondly, people have this tendency to measure full size packets, or a 1k average packet. The reality is a dynamic range of 64 bytes to 64k (gso/tso/gro offloads). So bytes is a far better proxy than packets in order to think about this properly. offloads of various sorts bulking up packet sizes has been a headache. I favor reducing mss on highly congested underbuffered links (and bob favors sub-packet windows) to keep the signal strength up. The original definition of packet (circa 1962) was 1000 bits, with up to 8 fragments. I do wish the materials that were the foundation of packet behavior were online somewhere...

I don't see how this has anything to do with the text of the draft or my questions. This seems like a reasonable range.
On a 10 GE link serving 2.5 MPPs on average, CoDel would only drop 0.013% of packets after 1000 drops (which would occur after 6.18 secs).

I am allergic to averages as a statistic in the network measurement case.

This doesn't seem to be very effective. It's possible to reduce interval to ramp up drop frequency more quickly, but that is counter-intuitive because interval should be roughly proportional to maximum RTT, which is link-speed independent.

Except that tcps drop their rates by (typically) half on a drop, and it is a matter of debate as to when on CE.

Ex/ 10 GE link, ~10K flows (average). During a congestion epoch, CoDel with interval = 100 msec starts dropping 257 packets/sec after 5 secs. How many flows is that effectively managing? Unless I am mistaken, it appears that the control law should be normalized in some way to average packet rate. On a high-speed link, it might be common to drop multiple packets per-msec, so it also isn't clear to me whether the drop frequency needs to be recalculated on every drop, or whether it could be recalculated over a shorter interval (e.g., 5 msec).
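The numbers in this exchange check out against the control law t + interval/sqrt(count) (a sketch of the drop schedule only, not the full codel state machine with its count decay):

```python
import math

INTERVAL = 0.100  # seconds, the recommended default

def drop_frequency(count):
    # The next drop is scheduled at t + INTERVAL/sqrt(count), so the
    # instantaneous drop frequency is sqrt(count)/INTERVAL.
    return math.sqrt(count) / INTERVAL

print(drop_frequency(1))     # ~10 drops/sec at the first drop
print(drop_frequency(100))   # ~100 drops/sec at the 100th drop
print(drop_frequency(1000))  # ~316 drops/sec at the 1000th drop

# Time of the 100th drop: sum of INTERVAL/sqrt(k) for k = 1..100,
# which lands right on the 1.86 secs quoted above.
t_100 = sum(INTERVAL / math.sqrt(k) for k in range(1, 101))
print(f"{t_100:.2f}s")       # 1.86s
```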
Re: [aqm] CoDel on high-speed links
On Tue, Jun 9, 2015 at 10:14 AM, Simon Barber si...@superduper.net wrote:

> My concern with fq_codel is that by putting single flows into single Codel instances you hit the problem with Codel where it limits bandwidth on higher RTT paths.

I recently did a bit of work, testing rtt_fairness from my location (los gatos, california) to linodes in london, dallas, tokyo, and newark, at RTTs of roughly 145, 45, 115, and 85ms. The servers are all using sch_fq and a modern linux (on a vm). There is a rangeley box in between running the sqm scripts that lets me test pie, codel, fq_codel, cake, etc.

On the long path, with pie, the download rate was generally higher than on the shorter paths, which was kind of interesting and would bear a repeated look. Codel, more even, and fq_codel was very even across all rtts.

http://snapon.lab.bufferbloat.net/~d/qdisc-stats2/download_comparison.png
http://snapon.lab.bufferbloat.net/~d/qdisc-stats2/upload_comparison.png

(rawer data is in that dir, or you can get it all via http://snapon.lab.bufferbloat.net/~d/qdisc-stats2.tgz - toke is also working on getting buffering, drop, and delay measurements; some of those and preliminary plot types are in there also. pull flent from git.)

the rtt_fair4be dataset is noisy (and limited to my local connection speed of 70/10mbits). If anyone would like access to these servers for more extensive testing, I still have quite a few more gigabits to use up, and no time to use them. Contact me offlist for access.

> Simon

On June 9, 2015 9:32:15 AM Jonathan Morton chromati...@gmail.com wrote:

> On 9 Jun, 2015, at 19:11, Steven Blake slbl...@petri-meat.com wrote: On a 10 GE link serving 2.5 MPPs on average, CoDel would only drop 0.013% of packets after 1000 drops (which would occur after 6.18 secs). This doesn't seem to be very effective.

> Question: have you worked out what drop rate is required to achieve control of a TCP at that speed?
There are well-known formulae for standard TCPs, particularly Reno. You might be surprised by the result. Fundamentally, Codel operates on the principle that one mark/drop per RTT per flow is sufficient to control a TCP, or a flow which behaves like a TCP; *not* a particular percentage of packets. This is because TCPs are generally required to perform multiplicative decrease upon a *single* congestion event. The increasing count over time is meant to adapt to higher flow counts and lower RTTs. Other types of flows tend to be sparse and unresponsive in general, and must be controlled using some harder mechanism if necessary. One such mechanism is to combine Codel with an FQ system, which is exactly what fq_codel in Linux does. Fq_codel has been tested successfully at 10Gbps. Codel then operates separately for each flow, and unresponsive flows are isolated. - Jonathan Morton
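Jonathan's "you might be surprised" can be made concrete with the classic Reno steady-state (Mathis) model, rate ≈ (MSS/RTT) * sqrt(3/2) / sqrt(p) - a sketch only; CUBIC and modern paced stacks behave differently:

```python
def reno_drop_prob(rate_bps, rtt_s, mss_bytes=1448):
    # Invert the Mathis model: rate = (MSS/RTT) * sqrt(3/2) / sqrt(p)
    # => p = 1.5 * (MSS / (RTT * rate))^2, with rate in bytes/sec.
    rate_Bps = rate_bps / 8
    return 1.5 * (mss_bytes / (rtt_s * rate_Bps)) ** 2

# A single Reno flow filling 10 GigE at 100 ms RTT needs a loss
# probability around 2e-10: roughly one drop per 5 billion packets,
# which is why percentage-based intuitions mislead at these speeds.
p = reno_drop_prob(10e9, 0.100)
print(f"p ~ {p:.2e}")
```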
Re: [aqm] CoDel on high-speed links
The below makes several mis-characterisations of codel in the first place, and then attempts to reason from there. On Tue, Jun 9, 2015 at 9:11 AM, Steven Blake slbl...@petri-meat.com wrote: I have a question about how CoDel (as defined in draft-ietf-aqm-codel-01) behaves on high-speed (e.g., = 1 Gbps) links. If this has been discussed before, please just point me in the right direction. In the text below, I'm using drop to mean either packet discard/ECN mark. I'm using (instantaneous) drop frequency to mean the inverse of the interval between consecutive drops during a congestion epoch, measured in drops/sec. The control law for CoDel computes the next time to drop a packet, and is given as: t + interval/sqrt(count) where t is the current time, interval is a value roughly proportional to maximum RTT (recommended 100 msec), and count is cumulative number of drops during a congestion epoch. No. Count is just a variable to control the curve of the drop rate. It is not constantly incremented, either, it goes up and down based on how successful it is at controlling the flow(s), only incrementing while latency exceeds the target, decrementing slightly after it stays below the target. The time spent below the target is not accounted for, so you might have a high bang-bang drop rate retained, when something goes above from below. This subtlety is something people consistently miss and something I tried to elucidate in the first stanford talk. It is not hard to see that drop frequency increases with sqrt(count). At the first drop, the frequency is 10 drop/sec; after 100 drops it is 100 drops/sec; after 1000 drops it is 316 drops/sec. On a 4 Mbps link serving say 1000 packets/sec (on average), CoDel immediately starts dropping 1% of packets and ramps up to ~10% after 100 drops (1.86 secs). No it will wait 100ms after stuff first exceeds the target, then progressively shoot harder based on the progress of the interval/sqrt(count). 
Secondly, people have this tendency to measure full size packets, or a 1k average packet. The reality is a dynamic range of 64 bytes to 64k (gso/tso/gro offloads). So bytes is a far better proxy than packets in order to think about this properly. offloads of various sorts bulking up packet sizes has been a headache. I favor reducing mss on highly congested underbuffered links (and bob favors sub-packet windows) to keep the signal strength up. The original definition of packet (circa 1962) was 1000 bits, with up to 8 fragments. I do wish the materials that were the foundation of packet behavior were online somewhere...

This seems like a reasonable range. On a 10 GE link serving 2.5 MPPs on average, CoDel would only drop 0.013% of packets after 1000 drops (which would occur after 6.18 secs).

I am allergic to averages as a statistic in the network measurement case.

This doesn't seem to be very effective. It's possible to reduce interval to ramp up drop frequency more quickly, but that is counter-intuitive because interval should be roughly proportional to maximum RTT, which is link-speed independent.

Except that tcps drop their rates by (typically) half on a drop, and it is a matter of debate as to when on CE.

Unless I am mistaken, it appears that the control law should be normalized in some way to average packet rate. On a high-speed link, it might be common to drop multiple packets per-msec, so it also isn't clear to me whether the drop frequency needs to be recalculated on every drop, or whether it could be recalculated over a shorter interval (e.g., 5 msec).

Pie took the approach of sampling, setting a rate for shooting, over a 16ms interval. That's pretty huge, but also low cost in some hardware. Codel's timestamp per-packet control law is continuous (but you do need to have a cheap packet timestamping ability). Certainly in all cases more work is needed to address the problems 100gbps rates have in general, and it is not just all queue theory!
A small packet is .62 *ns* in that regime. A benefit of fq in this case is that you can parallelize fib table lookups across multiple processors/caches, and of fq_codel is that all codels operate independently.

> Regards, // Steve
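For scale at those rates, a back-of-envelope serialization-time calculation (my sketch, assuming a minimum 64-byte frame plus the 20 bytes of on-wire preamble and inter-frame gap):

```python
def serialization_ns(frame_bytes, rate_bps, overhead_bytes=20):
    # Wire time for one frame, including Ethernet preamble + IFG.
    return (frame_bytes + overhead_bytes) * 8 / rate_bps * 1e9

# At 100 Gbit/s a minimum frame occupies the wire for only a handful
# of nanoseconds, and a full-size frame for ~122 ns.
print(serialization_ns(64, 100e9))    # ~6.7 ns
print(serialization_ns(1500, 100e9))  # ~121.6 ns
```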
Re: [aqm] FQ-PIE kernel module implementation
On Thu, Jun 4, 2015 at 3:06 PM, Hironori Okano -X (hokano - AAP3 INC at Cisco) hok...@cisco.com wrote:

> Hi all, I'm Hironori Okano, Fred's intern. I'd like to let you know that I have implemented FQ-PIE as a linux kernel module, "fq-pie", along with iproute2 support for fq-pie. This was done in collaboration with others at Cisco including Fred Baker, Rong Pan, Bill Ver Steeg, and Preethi Natarajan. The source code is in my github repositories; I attached the patch file "fq-pie_patch.tar.gz" to this email also. I'm using the latest linux kernel (git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git).
>
> fq-pie kernel module: https://github.com/hironoriokano/fq-pie.git
> iproute2 for fq-pie: https://github.com/hironoriokano/iproute2_fq-pie.git
>
> If you have any comments, please reach out to me. Best regards,

Very cool. I have been building this as part of my testbed for some time now with some very impressive results. I will update my openwrt tree to pull from yours (if possible; openwrt is still largely linux-3.18 based, otherwise I might have to slip in some backport code)

https://github.com/dtaht/ceropackages-3.10/tree/master/net/kmod-sched-fq_pie

thanks for such a cool and interesting qdisc!

> Hironori Okano hok...@cisco.com
[aqm] big science discovers sch_fq and pacing
https://fasterdata.es.net/host-tuning/linux/fair-queuing-scheduler/
Re: [aqm] AQM hurts utilization with a single TCP stream?
On Fri, May 22, 2015 at 11:42 PM, Jonathan Morton chromati...@gmail.com wrote:

> In practice, I haven't noticed any loss of throughput due to using Codel on 100ms+ RTTs. Probably most servers now use CUBIC, which contributes to that impression.

There are only slight differences between these tcps (and everybody just uses multiple flows to do stuff anyway).

> Using ECN rather than tail drops also makes the delivery smoother.

I can try some longer rtts than this (10,70). this was against the latest cake on linux 4.1rc3. http://snapon.lab.bufferbloat.net/~cero3/renovscubic.tgz (or the dir)

> There is a flaw in your analysis. Codel only starts dropping (or marking) when the sojourn time has remained above target (5ms) for an entire interval (100ms), during which time the cwnd is still growing. Thus the peak queue occupancy is more than 5ms.

yes. people keep missing the wait-for-an-interval thing in codel.

> It is however probably fair to say that a single Reno flow does lose some throughput under AQM.

very clear plots of reno's classic sawtooth behavior vs cubic's:

http://snapon.lab.bufferbloat.net/~cero3/renovscubic/reno.png
http://snapon.lab.bufferbloat.net/~cero3/renovscubic/cubic.png

On a 200ms rtt things look more interesting, but aqm hardly enters into it. Reno is simply less efficient than cubic, period. But I'll leave it to you to generate your own plots if you want, out of the netperf-eu dataset above. (it would be nice to be able to generate directly comparable plots vs cc algo types; presently you can't combine these data sets in flent)

> - Jonathan Morton

-- Dave Täht Open Networking needs **Open Source Hardware** https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67
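The single-flow throughput loss being debated here shows up even in a toy per-RTT fluid model of Reno (my sketch: fixed RTT, no slow start, no idle-period modeling, loss on queue overflow at BDP + Q packets):

```python
def reno_utilization(bdp_pkts, queue_cap_pkts, rtts=20000):
    """Fraction of link capacity a lone Reno-like flow achieves when
    the bottleneck queue overflows at bdp + queue_cap packets."""
    cwnd = max(1, bdp_pkts // 2)
    delivered = 0
    for _ in range(rtts):
        delivered += min(cwnd, bdp_pkts)       # link drains bdp per RTT
        if cwnd > bdp_pkts + queue_cap_pkts:   # queue overflow -> loss
            cwnd = max(1, cwnd // 2)           # multiplicative decrease
        else:
            cwnd += 1                          # additive increase
    return delivered / (rtts * bdp_pkts)

# A queue of a full BDP keeps the pipe full; a near-empty queue costs
# a Reno flow roughly a fifth of the link.
print(reno_utilization(100, 100))  # ~1.0
print(reno_utilization(100, 5))    # ~0.79
```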
Re: [aqm] I-D Action: draft-ietf-aqm-recommendation-04.txt
On Thu, May 21, 2015 at 10:18 PM, Simon Barber si...@superduper.net wrote: On 5/18/2015 10:00 AM, Dave Taht wrote: LEDBAT was probably my first concern and area of research before entering this project full time. I *knew* we were going to break ledbat, but the two questions were: how badly? (ans: pretty badly) and did it matter? (not that much, compared to saving seconds overall in induced delay) LEDBAT is about more than just reducing the delay caused by the stream - it's also about the bandwidth impact. AQM solves the delay situation, but breaks the bandwidth reduction that LEDBAT can achieve today when other traffic is present. In pointing out that paper I have to stress that their good ledbat result (read the text around table 3) was "look! it's scavenging!" And mine was: "with over 7 seconds of inherent delay on the link!" Revisiting the data sets with reasonable amounts of buffering on the link, a correctly functioning tcp stack, and a few other variables more under control would be good... (much as I pushed for dctcp to be looked at once real patches landed for it) ... as would investigating actual behavior of ledbat on real links with aqm and fq technologies on them. While I poked into it quite a lot, I did not do much more rigorous than observe that web traffic worked a lot better when torrent was present in a fq/aqm'd environment, and that cubic outcompeted it slightly, generally. There was supposed to be someone else updating the tcp_ledbat kernel module we used, but that never got fixed, and it is in dire need of an update since the change to usec from msec and other major tcp modifications in the linux kernel. While we have long recommended CS1 be set on torrent, it turns out that a lot of gear actually prioritizes that over BE, still. It helps on the outbound, where you can still control your dscp settings. Many torrent users have reported just setting their stuff to max outbound and rate limiting inbound, and observing no real effects on their link.
Do you have examples of the gear that prioritizes CS1 over best effort? How often have you seen it? Did you see it in places where it would be important? Yes; a lot; and yes. More details I can do later. Simon
Re: [aqm] I-D Action: draft-ietf-aqm-recommendation-04.txt
On Mon, May 18, 2015 at 8:54 AM, Jonathan Morton chromati...@gmail.com wrote: On 18 May, 2015, at 18:27, Simon Barber si...@superduper.net wrote: Apparently a significant chunk of bittorrent traffic and Windows updates use these techniques to deprioritise their traffic. Widespread adoption of AQM will remove their ability to avoid impacting the network at peak times. Use of DSCP could be one way to mitigate this problem with AQM, and this merits further study. I'm working on a comprehensive algorithm (including AQM, FQ and Diffserv support and a shaper in one neat package) which does address this problem, or at least provides a platform for doing so. Some information here: http://www.bufferbloat.net/projects/codel/wiki/Cake This is partially an outgrowth of some of the ideas and problems I attempted to discuss at ietf90. https://www.ietf.org/proceedings/90/slides/slides-90-aqm-6.pdf Since then various other working groups (like dart) have attempted to answer some of the same questions. I am pretty convinced (now) that inbound policing on cpe can be improved to better fool dumb upstream rate limiters (like those in cmtses), but I haven't got around to doing the work (it's called bobbie). The biggest problem we have with applying a shaper + fq/aqm algorithm to inbound traffic on a link that is already being dumbly rate limited is that a burst can back up in the upstream cmts and stay backed up - a rate differential of 90 to 100 takes a long time for an aqm to bring under control. Analysis of smoothness might also help. When the ratios are 10s or 1000s to 1 and there is only one bottleneck link, we do better. This is working code, albeit still under development. I'm actively dogfooding it, and I'm not the only one doing so. Pushing it into openwrt soon, we hope. As it stands cake is a win across the board on cpu cost and fairness, it does saner things with ecn, and so on...
We have discussed a few more advanced ideas that are not currently in cake on the cake mailing list, including better coupling between flows, more rapid response to overload, etc. The Diffserv layer provides a four-class system by default, corresponding in principle with the 802.1p classes - background, best-effort, video and voice. It does not inherit the naive mapping from DSCPs to those classes, though - only CS1 (001000) is mapped to the background class. I see a ton of traffic remarked to CS1 from comcast. Others may be more lucky. Since dart I have basically come to the conclusion that we need at least one new diffserv priority class for scavenging traffic. An important part of the Diffserv support in Cake is that the enhanced priority given to the video and voice classes applies only up to given shares of the overall bandwidth. If traffic in those classes exceeds that allocated share, deprioritisation occurs. This ensures that improperly marked traffic cannot starve the link, and attempts to incentivise correct marking. - Jonathan Morton
Re: [aqm] I-D Action: draft-ietf-aqm-recommendation-04.txt
On Mon, May 18, 2015 at 8:27 AM, Simon Barber si...@superduper.net wrote: Thank you Mikael, these are useful observations about the choice of exact DSCP value and various potential impacts. I agree that ultimately without operator agreement none of this matters. I do think that an important step towards garnering that operator agreement is to have the concerns clearly elucidated in this group's recommendations. I found a study of the interaction between low priority background techniques, including LEDBAT and AQM. http://www.enst.fr/~drossi/paper/rossi12conext.pdf That paper was continually extended and revised. (I have had very little to do with it since the first release.) http://perso.telecom-paristech.fr/~drossi/paper/rossi14comnet-b.pdf While it is pretty good... my favorite part of that paper is table 3, where the authors ignore the 7 second delay on the link but otherwise show the optimal ratio between real tcp and utp in their testbed. LEDBAT was probably my first concern and area of research before entering this project full time. I *knew* we were going to break ledbat, but the two questions were: how badly? (ans: pretty badly) and did it matter? (not that much, compared to saving seconds overall in induced delay) Its conclusion states: Shortly, our investigation confirms the negative interference: while AQM fixes the bufferbloat, it destroys the relative priority among Cc protocols. Yep. I do wish the paper was updated to account for 4 concepts: 0) never got around to trying ns2/ns3 fq_codel or the sqm_scripts against it 1) utp has a lower IW. With the move to IW10 in linux, tcp knocks utp more out of the way (note that a ton of torrent clients still use tcp and thus they are getting an advantage now by using iw10 that they shouldn't be). Anyway, most web traffic knocks utp out of the way handily 2) ledbat when first proposed had a 25ms target for induced delay. I would not mind that tried again.
3) coupled congestion control (one app, many flows) Apparently a significant chunk of bittorrent traffic and Windows updates use these techniques to deprioritise their traffic. So, torrent and ledbat are different things. Torrent has LOTS of flows (worst case 6 active per torrent, 50 or more connected, and switching one into an active state every 15 seconds). Ledbat is just a cc algorithm that torrent and some other heavy apps use. Widespread adoption of AQM will remove their ability to avoid impacting the network at peak times. No. A single ledbat flow will behave like a single tcp flow. Widespread adoption of AQM will make it easier for many flows to share the network with low latency. I don't see any impact from continued use of ledbat for applications like updates, backups, etc. My own recommendation is merely to try torrent today with your aqm or fq system of choice and see what happens. I did, and stopped worrying about ledbat. Use of DSCP could be one way to mitigate this problem with AQM, and this merits further study. While we have long recommended CS1 be set on torrent, it turns out that a lot of gear actually prioritizes that over BE, still. It helps on the outbound, where you can still control your dscp settings. Many torrent users have reported just setting their stuff to max outbound and rate limiting inbound, and observing no real effects on their link. Simon On May 13, 2015 1:47:33 AM Mikael Abrahamsson swm...@swm.pp.se wrote: On Tue, 12 May 2015, Simon Barber wrote: Hi John, Where would be the best place to see if it would be possible to get agreement on a global low priority DSCP? Currently the general assumption among ISPs is that DSCP should be zeroed between ISPs unless there is a commercial agreement saying that it shouldn't. This is generally accepted (there are NANOG mailing list threads on several occasions in the past 5-10 years where this was the outcome).
The problem is quite complex if you actually want things to act on this DSCP value, as there are devices whose default behaviour is 4-queue 802.1p, where queues 1 and 2 (which will match AF1x and AF2x) have lower priority than 0 and 3 (BE and AF3x), while people doing DSCP-based forwarding usually do things the other way around. It might be possible to get the last DSCP bits to map into this, because to DSCP-ignorant equipment that only looks at CSx (precedence) this would still be standard BE, while DSCP-aware gear could treat it as lower than 000000. So DSCP 000110 (high drop BE) might work, because it's incremental. Possibly DSCP 10 (low drop BE) might be able to get some agreement, because it doesn't really cause any problems in existing networks (most likely) and it could be enabled incrementally. I would suggest bringing this kind of proposal to operator organizations and the IETF. It needs to get sold to the ISPs mostly, because in this aspect the IETF decision will mostly be empty
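For what it's worth, the precedence arithmetic behind Mikael's point can be sketched like this (an illustrative helper of my own naming, not anything from the thread):

```python
# Sketch of why DSCP 000110 is "safe" for precedence-only gear: such
# equipment reads only the top 3 bits of the 6-bit DSCP (the legacy IP
# precedence / CSx bits), so any low-order-bit codepoint still looks
# like best effort (precedence 0).

def dscp_precedence(dscp):
    """Top 3 bits of the 6-bit DSCP = legacy IP precedence (CSx)."""
    return dscp >> 3

# e.g. the proposed "high drop BE" 000110 keeps precedence 0,
# whereas CS1 (001000) reads as precedence 1.
```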
Re: [aqm] I-D Action: draft-ietf-aqm-recommendation-04.txt
On Tue, May 12, 2015 at 9:17 PM, Simon Barber si...@superduper.net wrote: Hi Wesley, Thanks for considering my comments, and apologies for being so late in the process - I've only recently been able to put time into this area, and I understand it may be too late in the process to hack things in. I replied to John with where I'm concerned with the current -11 text. I am glad you are able to put time in, you have been a long way away. Re: background / low priority streams. There are other ways to achieve a 'lower priority', such as changing the AIMD parameters. Does not help if FQ is involved though. There are many ways to do lower priority streams if fq is present. Simplest: 1) Send 3 packets back to back, timestamped. The first packet arrives in an empty queue and gets sent out immediately; the 2nd and 3rd packets are affected by the total number of flows extant (fq_codel) (or, in SFQ, all are affected by the total number of flows). Keep that to 1/2 OWD (or less) plus fuzz/smoothing and you have a solution for how much additional load you are willing to add to the network. 2) For coupled congestion control on, say, 6 flows from one app, do the same sort of bunching and measure, then drop off when one or more of the flows experiences excessive delay. In both cases the timestamps would be received differently, and in order, via pure aqm or drop tail most of the time. In other words, it is relatively easy to get low priority in an fq'd system. It is harder to get to an optimal bandwidth while still staying low priority, and somewhat hard to figure out if you are being fq'd in the first place. My concern is that implementing AQM removes a capability from the network, so doing so without providing a mechanism to support low priority is a negative for certain applications (backups, updates - and the impact these have on other applications). Would be good for this to be at least common knowledge. Is there any other document this could go in? see dart.
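A rough sketch of probe idea 1) above, under the assumption of DRR-style per-flow FQ at the bottleneck spacing back-to-back packets by roughly one quantum per competing flow. The helper name and its calibration input (`pkt_time`) are hypothetical, purely to illustrate the measurement:

```python
# Sketch: under per-flow FQ, three back-to-back probe packets come out
# of the bottleneck separated by roughly one quantum's serialization
# time per competing flow, so the receive-side gaps estimate how many
# flows share the link (and hence how much load a scavenger may add).

def estimate_competing_flows(recv_times, pkt_time):
    """recv_times: arrival times of 3 back-to-back probes;
    pkt_time: serialization time of one quantum at the bottleneck."""
    gaps = [recv_times[i + 1] - recv_times[i] for i in range(2)]
    avg_gap = sum(gaps) / len(gaps)
    return max(1, round(avg_gap / pkt_time))
```

With pure AQM or drop tail the probes would instead arrive nearly back to back, which is also how a sender might detect whether it is being fq'd at all.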
Simon On 5/12/2015 5:11 PM, Wesley Eddy wrote: On 5/8/2015 11:42 PM, Simon Barber wrote: I have a couple of concerns with the recommendations of this document as they stand. Firstly - implementing AQM widely will reduce or even possibly completely remove the ability to use delay based congestion control in order to provide a low priority or background service. I think there should be a recommendation that if you are implementing AQM then you should also implement a low priority service using DSCP, e.g. CS1. This will enable these low priority applications to continue to work in an environment where AQM is increasingly deployed. Unlike DSCPs that give higher priority access to the network, a background or low priority DSCP is not going to be gamed to get better service! Secondly, there is a recommendation that AQM be implemented both within classes of service, and across all classes of service. This does not make sense. If you are implementing AQM across multiple classes of service, then you are making marks or drops while ignoring what class the data belongs to. This destroys the very unfairness that you wanted to achieve by implementing the classes in the first place. Hi Simon, thanks for your comments. These comments appear to be in response to version -04 of the document, from around 1 year ago. The document is currently on version -11, has passed working group last call and IESG evaluation, and is in the RFC Editor's queue. I mention this, because it isn't clear to me how applicable your comments are with regard to the current copy. The current copy can be found at: https://datatracker.ietf.org/doc/draft-ietf-aqm-recommendation/ The current revision does mention the impact to delay-based end-host algorithms as an area for future research. While I agree that in a lot of cases it seems like logically a good idea to have a DiffServ configuration like you mention, I don't think we have seen data on this yet in the working group.
Looking into this could be part of that mentioned future work, though not something I'd want to see hacked into this document today, so late in its publication process.
Re: [aqm] splat start?
On Mon, May 11, 2015 at 5:26 AM, Mirja Kühlewind mirja.kuehlew...@tik.ee.ethz.ch wrote: Hi Dave, Michael Scharf did his PhD thesis on startup mechanisms. Here is one of his papers (from 2009): Am I the only person that works in spreadsheets to model stuff? :( Scharf, M.: Work in Progress: Performance Evaluation of Fast Startup Congestion Control Schemes Proceedings of the 8th IFIP-TC6 Networking Conference (Networking 2009), Lecture Notes in Computer Science (LNCS) 5550, Aachen, May 2009 Thank you. I did not know that work had also fed back into RFC6928. https://www.bell-labs.com/researchers/537/ Thesis here, but I will need a spare weekend to read it: http://www.ikr.uni-stuttgart.de/Content/Publications/Archive/Sf_Diss_40112.pdf I guess he can further comment on his own (cc'ed). Mirja On 10.05.2015 at 04:18, Dave Taht dave.t...@gmail.com wrote: One of the things bugging me lately is that we actually have a lot of forms of slow start on the table - HyStart, Initial Spreading, reno vs cubic, dctcp, IW2, IW4, IW10, TSO offloads, the effect of GRO on it, etc. I don't know what is in QUIC, either. I would love a comprehensive guide to exactly the behaviors of slow start in every tcp known to man, and some sane way to refer to them all in a cross reference and a spreadsheet. Does something like that exist? Just the * start behavior. The world has spent way too much time analyzing congestion avoidance mode. -- Dave Täht
[aqm] splat start?
One of the things bugging me lately is that we actually have a lot of forms of slow start on the table - HyStart, Initial Spreading, reno vs cubic, dctcp, IW2, IW4, IW10, TSO offloads, the effect of GRO on it, etc. I don't know what is in QUIC, either. I would love a comprehensive guide to exactly the behaviors of slow start in every tcp known to man, and some sane way to refer to them all in a cross reference and a spreadsheet. Does something like that exist? Just the * start behavior. The world has spent way too much time analyzing congestion avoidance mode. -- Dave Täht
Re: [aqm] ECN AQM parameters
On Sat, May 9, 2015 at 10:20 AM, Bob Briscoe bob.bris...@bt.com wrote: Dave, As promised, here's my thoughts on what PIE (and CoDel) should do when ECN is enabled. There's also new info in here that I think is important: CoDel uses an RTT estimate in two different places. One has to be the max expected RTT, the other should be the (harmonic) mean of expected RTTs. I like the harmonic idea a lot, it ties in with some of my packet pair thinking on wifi aggregation. The former might be 100ms, but the latter is more likely to be 15-20ms, given most traffic in the developed world these days comes from CDNs. This could make a significant difference to performance. Bob Date: Tue, 14 Apr 2015 19:59:47 +0100 To: Fred Baker (fred) f...@cisco.com From: Bob Briscoe bob.bris...@bt.com Subject: ECN AQM parameters (was: AQM Recommendation: last minute change?) Cc: Gorry Fairhurst go...@erg.abdn.ac.uk, Richard Scheffenegger r...@netapp.com, Eddy Wesley M. [VZ] wesley.m.e...@nasa.gov, aqm-...@ietf.org aqm-...@ietf.org Fred, At 22:27 13/04/2015, Fred Baker (fred) wrote: I think we have a pregnant statement in this case. What parameters do you have in mind? The point was simply to ensure that implementers provide sufficient flexibility so that /any or all/ of the AQM parameters for ECN traffic could be separate instances from those for drop. But they would still apply to the same queue, much like the different RED curves for different traffic classes in the WRED algo. With RED, the parameters available to change are min-threshold, max-threshold, the limit mark/drop rate, and (IIRC) the minimum inter-mark/drop interval. ...and, importantly, the EWMA constant, which is the main parameter I would change for ECN (for ECN, set ewma-const = 0, assuming the Cisco definition of ewma-const where EWMA weight = 2^{ewma-const}; so for ECN, EWMA weight = 2^0 = 1). See also {Note 1} about inter-mark/drop interval. With PIE, the equation is p = p + alpha*(est_del-target_del) + beta*(est_del-est_del_old).
meaning that we can meaningfully tune alpha, beta, and target_del, and there is an additional 'max_burst' parameter. Yes. Strictly, the min data in the queue before PIE measures the rate ('dq_threshold') is a parameter as well. With Codel, if I understand section 4, the only parameters are a round-trip time metric (100 ms by default) and the setpoint, which they set to 5 ms based on it being 5-10% of the RTT. If it's not the target delay, which is essentially what Codel's setpoint is, I'm not sure what parameter you want to change. There are actually two more hidden parameters in CoDel's control law, which is written in the pseudocode as: t + interval / sqrt(count) but ought to have been written as: t + rtt_ave / (count)^b These parameters have been hard-coded as rtt_ave = interval and b = 1/2. 'rtt_ave' in the control law is better set to a likely /average/ RTT, whereas the interval used to determine whether to enter dropping mode is better set to a likely /maximum/ RTT. To implement an ECN variant of CoDel I would set interval = 0 (or very close to zero) and I would leave 'rtt_ave' (in the control law) as an average RTT, decoupled from 'interval'. However, the CoDel control law was designed assuming it will remove packets from the queue, so I'm not convinced that any naive approach for implementing ECN will work. I suspect a CoDel-ECN doesn't just need different parameters, it needs a different algo. I have no interest in solving this problem, because I wouldn't start from CoDel in the first place - I would never design an AQM that switches between discrete modes, and CoDel's control law assumes that the e2e congestion control is NewReno, which contravenes our AQM recommendations anyway. In saying 'in this case you might want to mess with the parameters', I'm not sure what parameters are under discussion, and in any event we're talking about the document that says 'we should have an algorithm', not the discussion of any of them in particular.
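For reference, the two formulas under discussion, sketched in floating point (the gains and target values here are illustrative, not the drafts' mandated defaults, and this is nothing like the fixed-point kernel code):

```python
import math

ALPHA, BETA = 0.125, 1.25  # illustrative PIE gains
TARGET_DEL = 0.015         # illustrative target delay (15 ms)

def pie_update(p, est_del, est_del_old):
    """One Tupdate step of PIE's drop probability:
    p += alpha*(est_del - target_del) + beta*(est_del - est_del_old),
    clamped to a valid probability."""
    p += ALPHA * (est_del - TARGET_DEL) + BETA * (est_del - est_del_old)
    return min(max(p, 0.0), 1.0)

def codel_next_drop(t, count, interval=0.100):
    """CoDel's control law as written in the pseudocode:
    next drop time = t + interval / sqrt(count)."""
    return t + interval / math.sqrt(count)
```

The beta term acts on the trend (delay rising or falling), which is what lets PIE react before the queue settles; Bob's point is that CoDel's `interval` silently plays two roles (dropping-mode trigger and control-law scale) that arguably deserve separate values.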
To my mind, this begs for a new draft on your part. Certainly. We're still doing the research and evaluation tho (see www.ietf.org/proceedings/92/slides/slides-92-iccrg-5.pdf - I don't remember whether you were in the room for that). But, yes, we will write it up. So far it's not based on RED, PIE or CoDel, but a new drop-based AQM that is most similar to RED but with only 2 parameters (not 4). This is because we needed the drop probability to be the square of the marking probability. So it made the implementation really simple to use a square curve through the origin for drop. It doesn't need min_thresh, because the square curve near-enough runs along the axis when it is close to the origin. For the square curve we used a probability trick - we merely had to compare the queue delay with the max of two random numbers. RED (especially gentle RED) can be thought
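The "max of two random numbers" trick mentioned above can be sketched as follows (my own illustration of the squaring property, not their implementation):

```python
# Sketch of the probability trick: if marking is decided by
# (uniform() < p), then dropping only when BOTH of two uniform draws
# are below p -- i.e. max(u1, u2) < p -- happens with probability p^2.
# That yields the square curve for drop with no extra arithmetic.

import random

def mark(p):
    """ECN mark with probability p."""
    return random.random() < p

def drop(p):
    """Drop with probability p^2: compare against the max of two draws."""
    return max(random.random(), random.random()) < p
```

Comparing the queue delay against the max of two random thresholds is the same idea with the comparison inverted, which is presumably why their implementation needs neither a min_thresh nor a multiply.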
Re: [aqm] ECN AQM parameters
I am looking over the rest of your email. It is a lot to absorb... but: I have no interest in solving this problem, because I wouldn't start from CoDel in the first place - I would never design an AQM that switches between discrete modes, and CoDel's control law assumes that the e2e congestion control is NewReno which contravenes our AQM recommendations anyway. 0) I too have trouble... particularly with the decay of codel as it stands. 1) I note we generally got better results from cubic than reno. But: whatever people call cubic is not what has been in linux, and is certainly not what is in it now after umpteen revisions and bug fixes over the past 4 years. It is not what is in QUIC, and it is not how tcp with sch_fq behaves, and I have tried to document each major change to tcp in every talk I give, with things like hystart being modified, etc. As a baseline reference, reno was useless 6+ years ago. Tracking the continuous changes and bugfixes to linux tcp and the driver subsystems over the past 4 years has been one of my biggest headaches. I sometimes wish I was tracking something stable and obsolete like windows or ns2. 2) It is good to know your honest opinion of why you would not start with codel as a base for an AQM, and also good to know your lines of inquiry. 3) I really hope that it is clear to everyone that my own main objective is to fix bufferbloat, and I really don't care what algorithms we use to do that - the parts that I care about are getting good data, working clean code, stuff that won't break the internet, clear rfcs, and all the solutions out there before the heat death of the universe. Obviously I think highly of fq as a big means to get there in many circumstances.
I got really good results from fq_pie btw, and published them in a dataset that I thought others would find interesting (apparently nobody looks at my data sets, no matter how fast you can fly through them now with netperf-wrapper): http://snapon.lab.bufferbloat.net/~d/cake3-fixed/baseline.png In the next year I hope to finally buckle down to writing some papers with all the needed math and results while we ramp up on the very difficult problems wifi represents... but now, damn it, I have to go re-run thousands of tests with whatever version of pie emerges from your analysis and the new draft (pie v8 as far as I am concerned - I have been tracking the code for far too long, and they kept changing stuff all over the place, where *codel has stayed stable). I have a bunch of ideas queued up for cake, which is going to be a test vehicle for also testing enhancements to codel if some more folk were willing to help implement: https://lists.bufferbloat.net/pipermail/cake/2015-April/02.html Particularly the cake_drop_monitor bits seem very useful to basically import over from ns3 and try on real traffic on real machines. tc qdisc add dev whatever root cake flowblind # is the codel-only test and I totally welcome new attempts at the problem as you allude to below and will gladly fix up, polish, and test *anything*. But I would like to be able to test stuff in the 3 testbeds I have, the dozens of routers that I have - or the 10s of thousands we can quickly muster by leveraging the openwrt effort - and in the 10 servers I have over the world, and so on, and all the other testbeds now out there, so we can lock the theorists in the same room with the experimenters, coders, and EEs making hardware, so we all line up in the same place(s) at the end. Implementations DO require tradeoffs from ideal circumstances (like fixed point) and sometimes those are significant and not understood by the implementers.
So I would like very much for the linux pie code that I have so many results on to also be subject to the same scrutiny as you subjected the draft to, and drew plots of, in the hope that some of the experimental data will line up fully with your analysis. And I really want to be creating and providing data people can use. Which I don't feel like I am doing right now. Highest on my list is webrtc behaviors, followed by a stack of wifi related issues so high that I don't think 2 years will be enough to get all the coding done... I have been delightfully distracted by debugging the dslreports tests of late. 4) So I really liked very much you identifying edge cases in particular in that document, and much else besides. That yields testable concepts instead of having to explore the whole parameter space. Thank you thank you thank you! I really understood pie, codel, and aqm behavior overall a lot better after reading that critique! 5) One thing I really gotta do is test the drop-on-overload-even-if-ecn-marked, then mark the next packet, idea out more fully and commit that to mainline (and look over the tcp scoreboard) and also figure out what to do with pie ecn. High on my list is producing some results showing the existing ecn and drop behaviors in all these algos on the table. 6) There is no 6! On Sat, May 9,
Re: [aqm] draft-ietf-aqm-pie-01: review
Dear Bob: I now understand the linux codebase for pie a lot better, as well as some of the experimental data I have. It looks like I could make several of the changes you describe and put them in my next series of tests, and based on your parameters I should be able to exercise some edge cases across those changes. Wow, thx! I have not actually read the latest pie draft, but would like to make a few comments on your comments quickly: re: 3.1: in linux, params->ecn is presently a boolean, and could easily be modified to mean anything and compare anything you want. What would be a good default? The ECN support in the linux code on enqueue looks like: if (!drop_early(sch, skb->len)) { enqueue = true; } else if (q->params.ecn && (q->vars.prob <= MAX_PROB / 10) && INET_ECN_set_ce(skb)) { /* If packet is ecn capable, mark it if drop probability * is lower than 10%, else drop it. */ Re: 5.0: will look over that code re: 5.1, linux code: /* Non-linear drop in probability: Reduce drop probability quickly if * delay is 0 for 2 consecutive Tupdate periods. */ if ((qdelay == 0) && (qdelay_old == 0) && update_prob) q->vars.prob = (q->vars.prob * 98) / 100; re: 5.2: strongly agree that the lookup table doesn't scale properly. The linux code appears to differ from the draft also, here, with a smaller lookup table, and some other smoothing functions. I am going to stop pasting now and just point at: https://github.com/torvalds/linux/blob/master/net/sched/sch_pie.c#L334 Will await more feedback from y'all on that. Codel also has a similar undershoot problem, for which I proposed we try a fixed-point fractional count variable in a recent post to the cake mailing list. re: 5.3.1: In the environments I work in it is extremely hard to get timers to reliably fire in under 2ms intervals, particularly on vm'd systems. Also, as you fire the timer more rapidly, the current calculations in pie, now done out of band of the packet processing, have a couple of divides in them, which tend to be processor intensive...
Both things said, I figure this and other implementations could fire faster than the default 16ms... re: 5.3.2: I like what you are saying but I gotta go work it out for myself, which will take a while. Patches wanted. re: 5.4: linux and all my tests have always been against: /* default of 100 ms in pschedtime */ vars->burst_time = PSCHED_NS2TICKS(100 * NSEC_PER_MSEC); 5.5: explains a lot. Probably. Will think on it. 5.6: :chortle: heats_this_room() indeed! Derandomization always looked to me like an overly complex solution to a non-problem. 5.7: don't think this problem exists in the linux code but will step through. But: in one of my recent (450 rrul_50_up tcp flows) tests neither codel nor pie got to a sane drop rate in under 60 seconds, and pie stayed stuck at 1 packet outstanding; I did not try more - on gigE local links. I think a lot of pie tests were run by others with a very low outside packet limit (200 packets) and thus the tail drop kicked in before pie itself could react. 5.8: I think this is a quibblish change not relevant for any reasonable length of queue measured in packets. But I do note that we switched from packet limits to byte limits in cake; that was for other reasons - primarily due to the extreme (1000x1) dynamic range of a modern packet. 6: I do wish the draft and the code I have still lined up, and the constants were clearly defined. 7: exp(p) is not cheap, and there ain't no floating point in the kernel either. 8. Haven't read the draft, can't comment on the nits. One quick note on: 4.1 Random Dropping s/Like any state-of-the-art AQM scheme, PIE would drop packets randomly/ /PIE drops packets randomly/ Rationale: The other scheme with a claim to be state-of-the-art doesn't (CoDel). I would agree if the draft had said "A state of the art scheme should introduce randomness into packet dropping in order to desynchronize flows," but maybe it was decided not to introduce such underhand criticism of CoDel.
Whatever, the draft needs to be careful about evangelising random drop, given it attempts to derandomize later. I don't buy the gospel that randomness is needed to avoid tcp global synchronization. I would prefer that whatever evangelization of randomness exists in any draft, here or elsewhere, be dropped in favor of discussing the real problem of tcp global sync instead... ... which, as near as I can tell, both codel and fq_codel avoid just fine without introducing a random number generator. I welcome evidence to the contrary.
[aqm] bob's summary of ecn and aqm use cases
I thought every bullet point here was marvelous: http://www.ietf.org/mail-archive/web/aqm/current/msg01118.html and would like to see it captured in a formal document somewhere, if it is not captured in the ecn advocacy document. I have only three quibbles, one kind of major.

re: #5 Slow-starts{Note 3} cause spikes in delay.
- AQM without ECN cannot remove this delay, and typically AQM is designed to allow such bursts of delay in the hope they will disappear of their own accord.
- Flow queuing can remove the effect of these delay bursts on other flows, but only if it gives all flows a separate queue from the start.

I don't think this last point is proven. Further, I am pretty sure that a fully dedicated queue per flow is kind of dangerous. I would have said:
- AQM *with* ECN cannot remove this delay, and typically AQM is designed to allow such bursts of delay in the hope they will disappear of their own accord. AQM without ECN, or with ECN overload protection, can make a dent in this delay, but not instantaneously. FQ can remove the effect of these delay bursts on other flows.

re: #6 Slow-starts{Note 3} can cause runs of losses, which in turn cause delays.
- AQM without ECN cannot remove these delays.
- Flow queuing cannot remove these losses, if self-induced.
- Delay-based SS like HSS can mitigate these losses, with increased risk of longer completion time.
- ECN can remove these losses, and the consequent delays.

Throughout much of our debates we have a problem distinguishing delay within a flow from delay induced in other flows, and perhaps we should come up with a clean word to distinguish these two forms of induced delay. AQM with ECN in this case would reduce the amount of induced delay on that flow, but cause delay for other flows, and the overall rate would only be reduced by half, while perhaps 5 of the IW10 packets (as an example) could have been dropped (clearing the immediate congestion for another flow).
With the current overload protection in the different ecn enabled AQM algorithms, different things happen, as I have noted elsewhere. pie very quickly starts dropping even ecn marked packets when slammed with stuff in slow start, which is perhaps as it should be.

Re:
> Whether flow queuing is applicable depends on the scale. The work I'm doing with Koen is to reduce the cost of the queuing mechanisms on our BNGs (broadband network gateways). We're trying to reduce the cost of per-customer queuing at scale, so per-flow queuing is simply out of the question. Whereas ECN requires no more processing than drop.

Three subpoints.

1) It turns out that the amount of packet inspection needed to pry apart a packet and mark it can be quite a lot at dequeue time, with additional memory accesses for configuration variables as well. Hashing and timestamping the headers at enqueue time is in some ways lighter weight, particularly if offloaded to the rx hardware.

There are other things besides queue algorithms that are pretty heavyweight in the code path, notably FIB lookups (recently massively improved in linux 4.0) - which benefit from being parallelized, as they are with the current 10GigE hardware in most intel systems, with 16 cpus handling the load of, typically, 64 rx and tx queues. So it is a total systems (Amdahl's law) sort of problem as to where the trade-offs are. I would be interested to know of the cpu, network hardware, and memory design of your BNGs.

I am painfully aware, by this point, of how hard it is to do software rate limiting in Linux in particular. Doing it in hardware turned out to be straightforward (senic). In the design of cake it was basically my hope to find a simple means to apply it to many, many customer specific queues, but that requires a customer lookup service filter not yet designed, some attention to how rx queues are handled in the stack on a per cpu basis, and perhaps some custom hardware. I look forward to trying it at 10GigE soon.
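The enqueue-time hashing mentioned above can be sketched like this. A toy illustration only: the Linux fq_codel code actually uses a Jenkins hash with a per-qdisc random seed, not Python's hashlib, and the queue count below is merely fq_codel's default.

```python
import hashlib

# Map a flow 5-tuple to one of N sub-queues at enqueue time.
# Toy sketch: real qdiscs use a seeded Jenkins hash, not SHA-1.
N_QUEUES = 1024  # fq_codel's default 'flows' count

def flow_queue_index(src, dst, sport, dport, proto):
    """Return a stable queue index in [0, N_QUEUES) for a 5-tuple."""
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:4], "big") % N_QUEUES

# Packets of the same flow always land in the same queue:
a = flow_queue_index("10.0.0.1", "10.0.0.2", 5000, 80, 6)
b = flow_queue_index("10.0.0.1", "10.0.0.2", 5000, 80, 6)
print(a == b)  # True
```

The point being that this is one cheap hash per packet at enqueue, rather than deep inspection plus config lookups at dequeue.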
2) per-flow queuing is a mere matter of memory organization after that point, and we already know how to scale that to millions of flows on a day to day basis on intel hardware.[1]

As sort of a side note that doesn't really fit anywhere:

3) FQ, for lack of a better word, can act as a step-down transformer. Imagine, if you will, a TSO burst emitted at 10GigE hitting a saturated 10GigE link with 1000 flows with FQ enabled. Each packet from the bursting flow will be slowed down and delivered at effectively 10Mbit/sec. At one level this is desirable, giving the ultimate endpoint more time to maneuver. Breaking up the burst applies a form of pacing, even if the bottleneck link is only momentarily saturated by another flow(s). At another level it isn't desirable, particularly in the case of packet aggregation on the endpoint. This burst break-up is, of course, something that already basically happens on switched ports, by design.

[1] Please note I just said per-flow queuing not any particular form of fq algorithm and
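The "step-down transformer" arithmetic above can be made concrete with a tiny sketch. This assumes idealized packet-by-packet round-robin; real schedulers like DRR are byte-based and flow counts fluctuate, so treat the numbers as illustrative.

```python
# FQ as a step-down transformer: with N backlogged flows sharing a link
# under round-robin, each flow's packets drain at LINK/N, and its packets
# are spaced one full scheduler round apart.
LINK_BPS = 10e9       # 10GigE bottleneck
N_FLOWS = 1000        # concurrently backlogged flows
MTU_BITS = 1514 * 8   # bits per full-size ethernet frame

per_flow_bps = LINK_BPS / N_FLOWS
print(per_flow_bps / 1e6)  # → 10.0 (Mbit/s, matching the text above)

# Inter-packet gap seen by any one flow: one round of the scheduler.
gap_s = N_FLOWS * MTU_BITS / LINK_BPS
print(f"{gap_s * 1e6:.1f} us between this flow's packets")
```

So the TSO burst arrives wire-rate but leaves paced at ~10Mbit/s, with over a millisecond between its packets - exactly the "more time to maneuver" effect described.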
Re: [aqm] [homenet] IEEE 1905.1 and 1905.1a
up until this moment I had never heard of this spec: http://en.wikipedia.org/wiki/IEEE_1905 - and it does sound useful. +10 on more open access to it. +100 on anyone working on open source code for it. I would certainly like closer relationships between the IEEE and IETF one day, perhaps even a truly joint (as opposed to back to back) conference. For far too long members of these two orgs have been going to different parties, and many, many cross layer issues have arisen because of this. In my own case I had hoped (in dropping ietf) to be able to attend more IEEE 802.11 wg meetings - but I would really prefer to stay home and code for a while. I would be very supportive of someone(s) taking on the task of better grokking wifi and other non-ethernet media across both orgs, both in the context of homenet and in aqm.

PS While I have a good grip on cable media layers, I am lacking such on gpon...
[aqm] comcast research funding tracks
http://techfund.comcast.com/ has quite a few topics on it that might be of interest to those working on networking and bufferbloat. I am going to put in for a bit of funding from there myself, but certainly others here have the right interests, yet not the time or money to pursue them, so... do check that url out. There are a few other programs I am exploring. Things like SBIR didn't seem useful, and most of what DHS is funding is security related rather than network performance related.

-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
[aqm] fixing bufferbloat on bigpond cable...
I was very pleased to see this tweet go by today: https://twitter.com/mnot/status/575581792650018816 where Mark Nottingham fixed his bufferbloat on bigpond cable using a very simple htb + fq_codel script. (I note ubnt edgerouters also have a nice gui for that, as does openwrt)

But: he does point out a flaw in netanalyzr's current tests[1], in that they do not correctly detect the presence of aqm or FQing on the link (in part due to not running long enough, and also due to not using multiple distinct flows). And, as with the "ping loss considered harmful" thread last week on the aqm and bloat lists, matching user expectations and perceptions would be good for any public tests that exist. There is some stuff in the aqm evaluation guide's burst tolerance tests that sort of applies, but... ideas?

[1] I am not aware of any other tests for FQ than mine, which are still kind of hacky. What I have is in my isochronous repo on github.

-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
Re: [aqm] fixing bufferbloat on bigpond cable...
Sorry, didn't read the thread closely. I made a few suggestions on that person's gist, as you probably also have downstream bufferbloat, which you can fix (on the edgerouter and openwrt) at speeds up to 60mbit on those weak cpus using the user-supplied edgerouter gui for the ingress stuff. The code for doing inbound shaping is also not much harder; a simple example is in the ingress section of the gentoo wiki here: http://wiki.gentoo.org/wiki/Traffic_shaping (sqm-scripts in openwrt and other linuxen also has the logic for this built in.)

It is grand to have helped you out a bit. Thx for all the work on http/2! How about some ecn? ;)

On Wed, Mar 11, 2015 at 7:14 PM, Mark Nottingham m...@mnot.net wrote:
> Hi, Just to clarify -- the credit goes to 'saltspork' on that thread, not I :) Cheers,

-- Mark Nottingham https://www.mnot.net/
-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
Re: [aqm] fixing bufferbloat on bigpond cable...
cake, if we ever get around to finishing it, gets it down to 1 line of code for outbound, and maybe 1 or 2 for inbound. That said, we probably need a policer for inbound traffic on the lowest end hardware, built around fq_codel principles. The design is called bobbie, and I have kept meaning to get around to it for about 3 years now.

That one line (for anyone willing to try the patches):

tc qdisc add dev eth0 root cake bandwidth 2500kbit diffserv

But back to my open question - how can we get better public benchmarks that accurately detect the presence of AQM and FQ technologies on the link?

-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
Re: [aqm] a test post on a thread that disappeared
On Sat, Mar 7, 2015 at 12:14 PM, Dave Taht dave.t...@gmail.com wrote:
> I was wondering why a certain thread did not show up in the ietf aqm archive: http://www.ietf.org/mail-archive/web/aqm/current/maillist.html

and now, stripping out the urls with an invalid cert, also as a test. Sorry for the noise... I wanted to see if it was merely me that goofed on the cc.

The aqm list was cc'd on a thread titled "some thoughts towards medals and other recognition for fundamental contributions to the internet" on the cerowrt-devel and bloat mailing lists, with a follow-on message from Vint Cerf. That title was certainly not indexed by google, thus far... and I really do need to fix that cert... but the post did make it to gmane, at least: http://article.gmane.org/gmane.network.routing.codel/629/match=medals

-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
Re: [aqm] [Cerowrt-devel] ping loss considered harmful
I had spoken to someone at nznog who promised to combine mrtg + smokeping or cacti + smokeping, so as to be able to get long term latency and bandwidth numbers on one graph. cc added.

On Thu, Mar 5, 2015 at 12:38 PM, Matt Taggart m...@lackof.org wrote:
> Dave Taht writes:
>> wow. It never registered to me that users might make a value judgement based on the amount of ping *loss*, rather than latency, and in looking back in time, I can think of multiple people that have said things based on their perception that losing pings was bad, and that sqm-scripts was worse than something else because of it.
> This thread makes me realize that my standard method of measuring latency over time might have issues. I use smokeping http://oss.oetiker.ch/smokeping/

in sqm-scripts's case, possibly, all you have been collecting is largely worst case behavior, which I don't mind collecting as it tends to be pretty good. :)

However, I have been unclear. In the main (modern - I don't know what version you have) sqm code, IF you enable dscp squashing on inbound (the default), you do end up with a single fq_codel queue, not 3 - no classification or ping prioritization. (it is the default because of all the re-marking I have seen from comcast) So if you are, as I am, monitoring your boxes from the outside, there is no classification and prioritization present for ping. Do a tc -s qdisc show ifbwhatever (varies by platform) to see how many queues you have.

Example of a single queued inbound rate limiter + fq_codel (yea! packet drop AND ecn working great!)
root@lorna-gw:~# tc -s qdisc show dev ifb4ge00
qdisc htb 1: root refcnt 2 r2q 10 default 10 direct_packets_stat 0 direct_qlen 32
 Sent 168443514948 bytes 334370551 pkt (dropped 0, overlimits 143273498 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 110: parent 1:10 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
 Sent 168443514948 bytes 334370551 pkt (dropped 17480, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 maxpacket 1514 drop_overlimit 0 new_flow_count 125872421 ecn_mark 1044 new_flows_len 0 old_flows_len 1
root@lorna-gw:~# uptime
 12:45:35 up 54 days, 22:33, load average: 0.05, 0.05, 0.04

dscp classification, in general, is only useful from within your own network, going outside.

> which is a really nice way of measuring and visualizing packet loss and variations in latency. I am using the default probe type which uses fping (ICMP http://www.fping.org/ ).

I LOVE smokeping and wish very much we had a way to combine it with mrtg data to see latency AND bandwidth at the same time.

> It has been working well, I set it up for a site in advance of setting up SQM and then afterwards I can see the changes and determine if more tuning is needed. But if ICMP is having its priority adjusted (up or down), then the results might not reflect the latency of other services. Fortunately the nice thing is that many other probe types exist http://oss.oetiker.ch/smokeping/probe/index.en.html So which probe types would be good to use for bufferbloat measurement? I guess the answer is whatever is important to you, but I also suspect there is a set of things that ISPs are known to mess with. HTTP? But also maybe HTTPS in case they are doing some sort of transparent proxy? DNS? SIP? I suppose you could even do explicit checks for things like Netflix (but then it's easy to go off on a tangent of building a net neutrality observatory).
> On a somewhat related note, I was once using smokeping to measure a fiber link to a bandwidth provider and had it configured to ping the router IP on the other side of the link. In talking to one of their engineers, I learned that they deprioritize ICMP when talking _with_ their routers, so my measurements weren't valid. (I don't know if they deprioritize ICMP traffic going _through_ their routers)

I do strongly recommend deprioritizing ping slightly, and, as I noted, I have seen many a borken script that actually prioritized it, which is foolish at best.

I keep hoping multiple (many!) someones here will go have lunch with their company's oft lonely, oft starving sysadmin(s), to ask them what they are doing as to firewalling, QoS and traffic shaping. Most of the ones I have talked to are quite eager to show off their work, which is unfortunately often of wildly varying quality and complexity. I find that an offer of sake and sushi is most conducive to getting that conversation started. I certainly would like to see more default corporate firewall/QoS/shaping rules than I have personally, for various platforms. Someone's got to have some good ideas in them... and it would be nice to know how far the bad ones have propagated.

-- Matt Taggart m...@lackof.org
-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
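Counters like the ones in that tc dump can be scraped for long-term graphing (the mrtg + smokeping combination wished for above). A hedged sketch; field names follow the fq_codel output shown earlier, but tc's human-readable text is not a stable API and varies by iproute2 version (where available, `tc -s -j qdisc` JSON output is more robust):

```python
import re

# Parse drop/ECN-mark counters out of `tc -s qdisc` text output.
# Sample is the fq_codel stanza from the message above.
SAMPLE = """\
qdisc fq_codel 110: parent 1:10 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
 Sent 168443514948 bytes 334370551 pkt (dropped 17480, overlimits 0 requeues 0)
 maxpacket 1514 drop_overlimit 0 new_flow_count 125872421 ecn_mark 1044
"""

def qdisc_counters(text):
    """Return (dropped, ecn_mark) parsed from tc -s qdisc output."""
    dropped = int(re.search(r"dropped (\d+)", text).group(1))
    ecn = re.search(r"ecn_mark (\d+)", text)
    return dropped, int(ecn.group(1)) if ecn else 0

print(qdisc_counters(SAMPLE))  # → (17480, 1044)
```

Polled periodically and differenced, these two counters give a drop-vs-mark rate over time to plot alongside smokeping's latency.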
Re: [aqm] the cisco pie patent and IETF IPR filing
On Wed, Mar 4, 2015 at 12:17 AM, Vishal Misra mi...@cs.columbia.edu wrote:
> Hi Dave, Thanks for your email. A few quick points:
> - I have actually sent a note already to someone on the Cisco PIE team about the error in the IETF IPR filing and am sure they will get it corrected. You have helpfully dug out the actual patent application, and it appears that one digit got inadvertently changed in the Cisco IETF IPR declaration of the patent application.
> - I wish I had a marketing department that would do stories for me :-). I work at Columbia University and that story that you point out was done by a writer at the UMass-Amherst engineering school as an example of academic research having practical impact. There is an urgent need to support more academic research and I think stories like this one support the cause.

Well, yes and no. One thing I have tried really hard to do throughout this project is give credit where credit is due - at every talk, for example, always mentioning pie, even before I actually had any data on its performance. I try to give every individual that has contributed something to this stone soup project praise for what they did to help out, as here at uknof: https://plus.google.com/u/0/107942175615993706558/posts/immF8Pkj19C

There has been an amazing level of detail to sort out along the way here, at every level in the OS stack and in the hardware, and there is simply no one individual or company I would single out as truly key, except maybe George P. Burdell!

A lesson I have learned is that folk in marketing are not particularly good at correctly distributing credit, and I assume that is how they are taught to write - to not look at any facts outside of their immediate objectives.
[1] http://newsroom.cisco.com/feature-content?type=webcontentarticleId=1414442

and 'course nobody in the press has shown up with a photographer to write puff pieces about the overall effort - except, well, cringely's work is not puffy enough by marketing standards: ( http://www.cringely.com/tag/bufferbloat/ )

I admit to a great deal of frustration when Nick Weaver writes an otherwise *excellent* piece in forbes, http://www.forbes.com/sites/valleyvoices/2015/02/27/this-one-clause-in-the-new-net-neutrality-regs-would-be-a-fiasco-for-the-internet/ and expends 3+ paragraphs explaining bufferbloat, but never gives the reader a link back *to the word*, so that maybe some CTO or CEO that reads that rag would have some context and clue when an engineer comes up to him asking for permission to go implement a fix that is now, basically, off the shelf.

*I* am going to keep giving credit to everyone I can, in every talk and presentation I do, and there are quite a few core contributors that I wish I had called out by name more - for example, I would have mentioned Felix Fietkau's contribution towards fixing wifi at the nznog talk if I could correctly pronounce his name! I struggled for years to be able to pronounce juliusz's!

At the very least, I hope we can do more from a SEO perspective - and all *pull together* to get the message out: that bufferbloat is fixed, that solutions are being standardized in the ietf, and that the code is widely available on a ton of platforms already - and move somehow to where ISPs are announcing settings for things like openwrt + sqm-scripts, and more importantly, schedules for rolling out fixes (like docsis 3.1 and better CPE) to their customers.

everyone: What else can we do here to cross the chasm?

> - Indeed neither me nor any of the other PI authors had any idea of the PIE work. I discovered it accidentally when I was at MIT giving a talk on Network Neutrality and Dave Clark mentioned Cisco's PIE and DOCSIS 3.1 to me.
> I later read up on PIE and was pleasantly surprised that our PI work from more than a decade back evolved into it.
> - I had contributed the PI code to Sally Floyd back in 2001 and it has been part of ns2 for the longest time (pi.cc). It shouldn't be difficult to adapt that for a Linux implementation and I am happy to help anyone who wishes to try it. Maybe that might affect your loyalty to fq_codel.

I let the data take me where it may. I have not always, but I reformed about 15 years ago. [1]

I hope that you and your students also do some experiments on the successors to PI and RED and DRR - and also follow the data wherever it leads you. I was fiercely proud of sfqred - until fq_codel blew it away on every benchmark I could devise. I have long longed to find another independent expert in the field to create new experiments and/or recreate/reproduce/disprove our results.

[1] "For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled." - Richard P. Feynman, Challenger disaster report: https://www.youtube.com/watch?v=6Rwcbsn19c0

-Vishal
-- http://www.cs.columbia.edu/~misra/

On Mar 4, 2015, at 1:07 AM, Dave Taht dave.t...@gmail.com wrote:
> Two items: A) The IETF
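For readers unfamiliar with the PI controller being discussed (the pi.cc code contributed to ns2), its core update is a classic proportional-integral step on the drop probability. A hedged sketch of that textbook form (Hollot et al.'s PI AQM); the gains and target below are arbitrary illustrative values, not tuned constants from pi.cc or PIE:

```python
# Classic PI AQM update, run once per sampling interval:
#   p += A * (q_now - q_ref) - B * (q_prev - q_ref)
# It pushes drop probability up when the queue is above target and growing.
Q_REF = 50.0              # target queue length (packets) - illustrative
A, B = 0.00182, 0.00181   # example PI gains - illustrative only

def pi_update(p, q_now, q_prev):
    """One update of the drop probability, clamped to [0, 1]."""
    p += A * (q_now - Q_REF) - B * (q_prev - Q_REF)
    return min(max(p, 0.0), 1.0)

# Queue above target and growing -> probability rises:
p = pi_update(0.01, q_now=80.0, q_prev=70.0)
print(p > 0.01)  # True
```

PIE's refinement, very roughly, is to run this style of controller on queuing *delay* rather than queue length.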
Re: [aqm] the cisco pie patent and IETF IPR filing
On Wed, Mar 4, 2015 at 2:08 PM, Rong Pan (ropan) ro...@cisco.com wrote:
> The correct Cisco IPR is http://datatracker.ietf.org/ipr/2540/.

Thank you very much for the pointer to the correct IPR filing. I apologize for being grumpy.

-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
Re: [aqm] [Bloat] ping loss considered harmful
On Tue, Mar 3, 2015 at 10:00 AM, Fred Baker (fred) f...@cisco.com wrote:
> On Mar 3, 2015, at 9:29 AM, Wesley Eddy w...@mti-systems.com wrote:
>> On 3/3/2015 12:20 PM, Fred Baker (fred) wrote:
>>> On Mar 1, 2015, at 7:57 PM, Dave Taht dave.t...@gmail.com wrote:
>>>> How can we fix this user perception, short of re-prioritizing ping in sqm-scripts?
>>> IMHO, ping should go at the same priority as general traffic - the default class, DSCP=0. When I send one, I am asking whether a random packet can get to a given address and get a response back. I can imagine having a command-line parameter to set the DSCP to another value of my choosing.
>> I generally agree, however ... The DSCP of the response isn't controllable though, and likely the DSCP that is ultimately received will not be the one that was sent, so it can't be as simple as echoing back the same one. Ping doesn't tell you latency components in the forward or return path (some other protocols can do this though). So, setting the DSCP on the outgoing request may not be all that useful, depending on what the measurement is really for.
> Note that I didn't say "I demand"... :-)

My point was A) I have seen tons of shapers out there that actually prioritize ping over other traffic. I figure everyone here will agree that is a terrible practice, but I can certainly say it exists, as it is a dumb mistake replicated in tons of shapers I have seen... that makes people in marketing happy. I already put up extensive commentary on that bit of foolishness in "wondershaper must die". Please feel free to review any shapers or firewall code you might have access to for the same sort of BS, and/or post the code somewhere for public review. A BCP for these two things would be nice.

And B) Deprioritizing ping (slightly) as I do came from what has happened to me multiple times when hit by a bot that ping floods the network.
One time, 30+ virtual windows boxes in a lab got infected by something that went nuts pinging the entire 10/8 network we were on. It actually DID melt the switch - and merely isolating that network from the rest was a PITA, as getting to the (SFQ-ing) router involved was nearly impossible via ssh (like, 2 minutes between keystrokes). Thus, ping, deprioritized. I tend to feel deprioritizing it slightly is much more important in the post-ipv6 world.

> I share the perception that ping is useful when it's useful, and that it is at best an approximation. If I can get a packet to the destination and a response back, and I know the time I sent it and the time I received the response, I know exactly that - messages went out and back and took some amount of total time. I don't know anything about the specifics of the path, of buffers en route, or delay time in the target. Traceroute tells me a little more, at the cost of a more intense process. In places I use ping, I tend to send a number of them over a period of time and observe the statistics that result, not a single ping result.

-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
[aqm] speedtest-like results for 3g and 4g at ofcom
Anybody know anybody here that could ask them to run a valid latency under load test? http://media.ofcom.org.uk/news/2014/3g-4g-bb-speeds/

-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
Re: [aqm] Gathering Queue Length Statistics
On Wed, Feb 25, 2015 at 9:53 AM, Ryan Doyle rpdo...@live.unc.edu wrote:
> Hello, I am a senior undergraduate student at the University of North Carolina at Chapel Hill and am studying the effectiveness of AQMs. I have set up a lab network and plan on running different sets of experiments with different AQMs. My router machines are running Linux kernel version 3.16.0. I am using the fq_codel, codel, and pie qdiscs for my research and am wondering if there is a way to collect statistics regarding the average queue length since a qdisc was enabled? I have looked at tc's -s flag for statistics, but they show nothing about queue length, and I have been unable to find anything else that might help me get queue length statistics.

Oh, god. I am getting incredibly sensitive about average queue length, and I realize that that is not what you meant. But since not enough people have seemingly read this or any of the related materials, here it is again: http://www.pollere.net/Pdfdocs/QrantJul06.pdf

And I of course always recommend van's talk on the fountain model for thinking about closed loop servo systems: http://www.bufferbloat.net/projects/cerowrt/wiki/Bloat-videos

In the bufferbloat project we have developed many tools that drive aqms hard - see netperf-wrapper on github - which measure e2e delay without requiring any tools on the routers inbetween. e2e delay, in my mind, is way more important than average queue length. And you can derive the queue length(s) from tcp timestamps in those netperf-wrapper tests, or from additional packet captures, if you must.

If you absolutely MUST derive average queue length from the box, you can poll the interface frequently with tc -s qdisc show, as well as with ifconfig, and parse out the number of packets and the number of bytes.
But you can do MUCH more valid statistical analysis than that with that sort of data set - and if you poll too frequently you will heisenbug your tests, as those data collection calls take locks that interfere with the path. And we have all sorts of advice about traps for the unwary here: http://www.bufferbloat.net/projects/codel/wiki/Best_practices_for_benchmarking_Codel_and_FQ_Codel

Please use things like CDFs to see the range of delays, rather than averages. It is what happens above 90% of the range that makes bufferbloat maddening to ordinary users. I am summarily rejecting any papers that I review that report average queue length as if it meant anything - and for a few other reasons. You have been warned. I really lost my temper after the last paper I reviewed last weekend, and the resulting flamage is all over the bloat and codel lists, starting here: https://lists.bufferbloat.net/pipermail/codel/2015-February/000872.html

> Best, Ryan Doyle

-- Dave Täht Let's make wifi fast, less jittery and reliable again! https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
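The CDFs-not-averages point can be shown with a toy delay distribution. The numbers below are made up purely for illustration: a link that is fine 90% of the time and badly bloated 10% of the time.

```python
import random

# Why averages hide bufferbloat: a bimodal delay distribution where the
# mean looks tolerable but the tail (what users feel) is awful.
# All numbers here are invented for illustration.
random.seed(42)
delays_ms = [random.gauss(10, 2) for _ in range(900)] + \
            [random.gauss(500, 50) for _ in range(100)]

mean = sum(delays_ms) / len(delays_ms)
ordered = sorted(delays_ms)

def percentile(p):
    """Nearest-rank percentile over the sorted samples."""
    return ordered[int(p / 100 * (len(ordered) - 1))]

print(f"mean   = {mean:6.1f} ms")
print(f"median = {percentile(50):6.1f} ms")
print(f"p90    = {percentile(90):6.1f} ms")
print(f"p99    = {percentile(99):6.1f} ms")  # the part that maddens users
```

The median says the link is fine and the mean says it is merely mediocre; only the upper percentiles (the right edge of the CDF) reveal the half-second stalls.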
Re: [aqm] PIE implementation on NS2
pie, codel, sfq_codel, codel-dt and other variants are all part of the upcoming ns2 release. A release candidate is here: http://nsnam.isi.edu/nsnam/index.php/Roadmap

codel ended up in the september release of ns-3.21; the fq_codel variant is (hopefully) being merged in the 3.22 release, with the present tree for that awaiting some pending refactoring and someone to help do the work: https://www.nsnam.org/wiki/Ns-3.22

Please give 'em a try and report any bugs, etc, to the relevant ns* mailing lists.

On Tue, Jan 20, 2015 at 6:40 PM, ETAF dancing_li...@foxmail.com wrote:
> Hello! Does anyone have an implementation of PIE for NS2? Thanks a lot!

-- Dave Täht http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
[aqm] incorrect defaults for RED in ns2 cross check
I am curious whether those of you fiddling with RED have been setting q_weight in your simulations and papers, overriding the incorrect ns2 default? Similarly, are those of you testing ARED making sure the adaptive parameter is really on? ... and if someone can come up with a way of validating that correct configurations for RED were used in every paper that used it in ns2 for the last 12 years, I would love to hear it. -- Forwarded message -- From: Tom Henderson t...@tomh.org Date: Sun, Dec 21, 2014 at 12:07 PM Subject: [Ns-developers] proposed changes to ns-2 RED code To: ns-developers list ns-develop...@isi.edu, ns-us...@isi.edu ns-us...@isi.edu Cc: Mohit P. Tahiliani tahiliani.n...@gmail.com If you are using ns-2 and RED queues, please help us to evaluate the following proposed change. Mohit Tahiliani has been working on Adaptive RED in ns-2, and has patched a few issues, including: - use of ARED in wireless networks leads to a floating point exception - the default value of Queue/RED set q_weight_ -1 is incorrect - Queue/RED set adaptive_ 0: this must be set to 1, otherwise the max_p parameter never adapts. While the default values of some parameters (such as thresh_, maxthresh_, q_weight_) were changed in 2001 to make ARED the default RED mechanism in ns-2, those of other parameters were left unchanged. The resulting code defaults to something that is neither RED nor ARED; this patch will fix the default to ARED. The proposed patch is in a tracker issue here: http://sourceforge.net/p/nsnam/patches/25/ I'm testing release candidates for ns-2.36, which are described here: http://nsnam.isi.edu/nsnam/index.php/Roadmap Mohit's patch is _not_ part of the first release candidate. If we move forward with it, it will be merged as part of a later release candidate. So to test it yourself, I recommend downloading the release candidate and applying the patch there. I've been through a couple of review cycles with Mohit on this patch.
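Given how easy it is to inherit the bad defaults, a quick audit of existing simulation scripts is worthwhile. A hedged sketch (the parameter names come from the patch discussion; the glob and the warning text are mine):

```shell
# Flag ns-2 scripts that never override the RED/ARED defaults in question.
for f in *.tcl; do
  grep -qE 'Queue/RED set (q_weight_|adaptive_)' "$f" \
    || echo "$f: runs RED with stock defaults (neither RED nor ARED)"
done
```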
We'll use lazy consensus to try to decide on its inclusion. Unless we hear from the community that these changes should be reconsidered (let's set a date, such as by January 10), I plan to work with Mohit to evaluate the ns-2 validate trace changes and update the traces, and commit this to ns-2 prior to the ns-2.36 release. Of course, even if you support this, it would be nice to hear positive feedback if you read over this patch, test it, and like what you see. Thanks, Tom -- Dave Täht http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
Re: [aqm] RED implementation on Linux 3.14.22
On Mon, Dec 15, 2014 at 5:41 AM, Jim Gettys j...@freedesktop.org wrote: On Mon, Dec 15, 2014 at 2:51 AM, Simone Ferlin-Oliveira fer...@simula.no wrote: All, I am doing some work with shared bottleneck detection that requires some evaluation with different AQMs, in particular, RED. Since I haven't been following the evolution of the implementation, I would like to ask about your experience with the code on Linux 3.14 (and newer). I know that Dave Taht ran into bugs in RED a while back, which I believe have been fixed for quite a while. The power of git to answer questions like this is unparalleled. Taking a look at my current kernel tree and doing a: git log net/sched/sch_red.c shows Eric fixed two bugs in Linux RED in commit 1ee5fa1e9970a16036e37c7b9d5ce81c778252fc Author: Eric Dumazet eric.duma...@gmail.com Date: Thu Dec 1 11:06:34 2011 + sch_red: fix red_change ... http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1ee5fa1e9970a16036e37c7b9d5ce81c778252fc ARED was added slightly later, and sfqred (a first attempt at blending fq + AQM together, though it doesn't do ARED) shortly after that; sfqred never made it to mainline, as fq_codel landed soon after. Advice: keep track of net-next, do git pulls regularly, and watch git log net for changes. You should always be looking at whether code for a module you are interested in has been patched in the current kernel.org tree, so do a diff between 3.14 and the current Linux source. 3.14 is recent enough that it may be viable for experiments, for the time being. Planning to keep up with Linux development is wise long term in any case, as the rate of improvement/change in the networking stack is very high at the moment as draining the bufferbloat swamp and other performance work continues. Important changes since 3.14: pie added, DCTCP added, gso/tso offloads seriously reworked and made gentler, sch_fq's pacing improved.
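The "diff between 3.14 and current" advice, spelled out as a sketch (to be run inside a kernel.org git clone; the tag name and path are the ones from the message):

```shell
# What has touched RED since the 3.14 release, and the full delta.
git log --oneline v3.14..HEAD -- net/sched/sch_red.c
git diff v3.14..HEAD -- net/sched/sch_red.c
```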
The last kernel rounds (3.18, 3.19) were seriously productive: hystart improved at longer RTTs, still more TSO/GSO improvements, and xmit_more support was added for some devices. Also, support for per-route congestion control settings (primarily targeted at DCTCP) was just added. I believe some of the long-RTT falloff we saw in toke's paper was due to hystart issues, as I have been unable to duplicate some of his results with this upcoming release. I have basically thrown out all my 3.14 results at this point and am starting over with the soon-to-stabilize 3.19 release. (Well, in fact, I ended up starting over 3 times in the last 2 months as each of the new features above landed in the kernel.) (But as for red, no changes except in the underlying TCPs and device drivers.) Relevant commits were: Hystart change: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=42eef7a0bb0989cd50d74e673422ff98a0ce4d7b xmit_more: http://netoptimizer.blogspot.com/2014/10/unlocked-10gbps-tx-wirespeed-smallest.html very good lwn article on it: http://lwn.net/Articles/615238/ one of several GSO fixes: commit d649a7a81f3b5bacb1d60abd7529894d8234a666 Author: Eric Dumazet eduma...@google.com Date: Thu Nov 13 09:45:22 2014 -0800 tcp: limit GSO packets to half cwnd ... etc. Do a git log net. :) preso that convinced systemd to switch to fq_codel: http://lwn.net/Articles/616241/ Also note that underlying device drivers may have (sometimes lots of) buffering out of the control of the Linux queue discipline. For Ethernet devices, you should ensure that the drivers have BQL support implemented to minimize this buffering. Other classes of drivers are more problematic, and may have lots of buffering to surprise you. +10 (or rather, -10). It's up to 25 devices now. I note that TSO/GSO used to interact very badly with soft rate limiting (htb); it seems better now.
Also be aware that ethernet flow control may move the bottleneck from where you expect to somewhere else, and that switches in networks also have to be well understood. Most consumer switches have this *on* by default, and mixed 1G/100Mb networks can be particularly entertaining in this regard. Cable modems, unfortunately, typically do not implement flow control, but some DSL modems do (putting the bottleneck into your router, rather than in the modem). I should probably put red back into my test matrices. I stopped benchmarking it and pfifo_fast a long time ago. A netperf-wrapper data set that predates the hystart fix, testing 3 RTTs: http://snapon.lab.bufferbloat.net/~d/comprehensive.puck/ or: http://snapon.lab.bufferbloat.net/~d/comprehensive_puck.tgz *Any* help is appreciated. Hope this helps. Thanks, Simone ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm -- Dave Täht http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
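Two quick checks relevant to the flow-control and BQL points above. A sketch where "eth0" is a placeholder interface name and ethtool is assumed to be installed; both commands only inspect state:

```shell
# Is ethernet flow control (pause frames) negotiated on this NIC?
ethtool -a eth0
# BQL: these sysfs entries exist only when the driver implements it.
ls /sys/class/net/eth0/queues/tx-0/byte_queue_limits
```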
Re: [aqm] RED implementation on Linux 3.14.22
On Mon, Dec 15, 2014 at 7:54 AM, Dave Taht dave.t...@gmail.com wrote: On Mon, Dec 15, 2014 at 5:41 AM, Jim Gettys j...@freedesktop.org wrote: On Mon, Dec 15, 2014 at 2:51 AM, Simone Ferlin-Oliveira fer...@simula.no wrote: All, I am doing some work with shared bottleneck detection that requires some evaluation with different AQMs, in particular, RED. Since I haven't been following the evolution of the implementation, I would like to ask about your experience with the code on Linux 3.14 (and newer). I need to clarify something about "newer". The third number in a Linux version is for bug fixes only: 3.14 is the major release, and 3.14.22 is its 22nd bug-fix release. A -X or fourth component, if it exists, denotes distro-specific changes, which can often, particularly in major distros like redhat or ubuntu, be quite extensive. New features, such as the ones I mentioned in the previous email, generally do not make it to the bug-fix releases, and I don't know (without checking) if, for example, the hystart change or the GSO half-cwnd change will make it to the -stable tree for older releases, as usually only security- or crash-critical fixes make it into stable. I mention this in light of a fairly recent DCTCP paper which used a pre-bufferbloat-fixes kernel of 3.2.something, discussed (well, ranted about slightly, apologies) here: https://lists.bufferbloat.net/pipermail/bloat/2013-November/001736.html (I would dearly like to see that paper's experiments revised and updated in light of that discussion, now that all these other fixes have landed and DCTCP is in mainline linux.) I try to regularly publish a simple debian kernel build script and my own patch set of the codel-related research in progress, somewhere: http://snapon.lab.bufferbloat.net/~d/codel_patches/ and will probably restart publishing a separate debloat-testing tree for the upcoming make-wifi-fast effort, as that set of changes is going to be quite extensive, and buggy, for a while.
-- Dave Täht http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
Re: [aqm] analysis paper on PIE...
On Wed, Nov 12, 2014 at 4:03 PM, Scheffenegger, Richard r...@netapp.com wrote: Hi Martin, I believe these papers may qualify for that requirement: http://ipv6.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf This documents the docsis-pie implementation, which has quite a few basic improvements on pie by itself, notably bytemode, some predictive stuff, and drop semi-derandomization. It also uses overlarge, not-recommended-by-the-inventors constants for codel and sfq_codel, and lumps together all results at all bandwidths, where, as we've shown, current aqm implementations perform differently at different bandwidths and RTTs. The earlier paper covered the bandwidth scenarios more broadly and in depth, with less twiddling of the constants. I also had a very long post on this list going into the problems with the testing done here, which I'll search for unless someone beats me to it. Some of the tools used in this evaluation landed in ns2 earlier this year, and I would certainly like these tests reproduced independently, with sane values for codel and fq_codel, and preferably against a simulator I trust more, like ns3. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6925768 Yep. Toke points to codel's falloff at higher rates in his paper also - this is mostly due to a problem in the control law introduced in the linux version that isn't in the ns2 version, and is nearly invisible in the fq_codel version. I do fear a similar problem is in PIE when dealing with TSO packets, but have not tested. And the load transients are a problem with any straight aqm system. https://www.duo.uio.no/handle/10852/37381 I have picked apart this paper elsewhere also. I would have liked it if, in particular, fq_codel had been used throughout the tests in comparison, particularly with ecn. To add to the comparisons: http://caia.swin.edu.au/reports/140630A/CAIA-TR-140630A.pdf While quite good, this was kind of limited to steady-state performance, and at rates below 10mbit.
I was delighted to see someone actually use cerowrt for its intended purpose, evaluating each new algorithm on real hardware, in their bachelor's thesis, but I can't find that url right now. tl;dr - both the pie and codel camps did some independent implementations and testing of the respective other algorithm, with discussions ironing out some poorly described aspects in the process; it's my understanding that this led to better-quality drafts in both instances. I tracked codel, fq_codel, pie, improvements to red (ared), sfq, sfqred, and SFB closely for the past 4 years. All of these are in linux 3.14 or later (with pie entering last). The ability to do basic testing of everything on the table is a download away, on nearly every linux distro; cerowrt has 'em all, and openwrt several... I spent GSOC2014 getting ns3 up to speed. For some reason I don't seem to have any time to write papers myself... Ya know, I'm not part of the codel or pie camp. I'm pretty firmly in the low-knobs fq + an aqm with ecn support camp. There aren't a lot of people in a codel-only camp. The algorithm is a toolkit, much like pie is a toolkit for docsis-pie, and setting up the debate as codel vs pie feels like an exercise in dialectical dualism for the sake of excluding the third alternative. Certainly codel the algorithm, and codel the stand-alone aqm, can be improved (and I have had patches for that available for a long time now), but it has taken a long time to have an adequate suite of test tools to be able to analyze the often microscopic differences between versions of the base aqm algorithms, be they pie or codel or red derived.
(For example, I was unaware, until I got a preview of toke's paper, of the degree of decline in effectiveness of codel alone at 100Mbit, having been focused primarily on finding improvements at 5mbit or below (the speed most of the edge of the internet runs at), with my test hardware peaking out at about 60mbits before htb rolls over and dies on a cpu designed in 1989. So instead of fiddling with codel I've been fiddling with a better rate shaper embedded into fq_codel.) I do certainly hope that work can move forward on the evaluation guidelines based on a wide variety of scenarios. In my own case, my biases are towards managing slow start better (vs steady-state TCPs), capturing all the latency sources (e.g. DNS, tcp syns), and enabling voip, gaming, web, and videoconferencing traffic better, at the expense of full single-flow goodput, at rates well below 100mbit and typically at baseline physical rtts in the 4-50ms range. I utterly agree that more testing is needed. However the only open source test suite (netperf-wrapper) only implements a few of the tests in the pie and docsis-pie papers, making reproduction difficult, and the earlier Richard Scheffenegger -Original Message- From: aqm [mailto:aqm-boun...@ietf.org] On Behalf Of Martin Stiemerling Sent: Wednesday, 12 November 2014
Re: [aqm] adoption call: draft-welzl-ecn-benefits
On Tue, Aug 12, 2014 at 3:24 AM, Gorry Fairhurst go...@erg.abdn.ac.uk wrote: OK, so I have many comments, see below. Gorry On 12/08/2014 10:43, Bob Briscoe wrote: Wes, and responders so far, A doc on the benefits and pitfalls of ECN is needed. Personally I wouldn't assign much of my own time as a priority for such work; I'd rather work on finding the best road through the protocol engineering. But I'm glad others are doing this. We need to be clear that this doc (at the moment) is about the benefits of 'something like RFC3168 ECN'. I think that is the right direction. I would not be interested in a doc solely about the benefits of 'classic' ECN (we had RFC2884 for that). However, if it is about the benefits of some other ECN-like thing, it will not be worth writing unless it is more concrete on what that other ECN-like thing is. At present different and sometimes conflicting ideas are floating around (I'm to blame for a few). In order to write about benefits, surely experiments are needed to quantify the benefits? +10 Alternatively, this could be a manifesto to identify /potential/ benefits of ECN that the current classic ECN is failing to give. I think at the moment it's the latter (and that's OK given that's where we have reached today). GF: If someone wishes to write this research paper, I'd be happy to join them, but it was not what I had in mind for this ID. How about the title Explicit Congestion Notification (ECN): Benefits, Opportunities and Pitfalls ? GF: I could live with that, if the group wished this! +1 We (in the RITE project) have agreed to start work on an 'ECN Roadmap' in order to identify all the potential ideas for using ECN coming out of research, and write down whether new standards will be needed for some, whether they can evolve without changing standards, which are complementary, which conflict, etc. 
I'd like to see experiments done through the free.fr network as it's the only one I know of with ecn enabled along the edge in their revolution v6 product. Presently cerowrt ships with ecn enabled on the inbound rate limiter and disabled on the outbound, I have considered enabling it by default on the outbound for connections 4mbits. (users can override these settings, of course) I don't know whether this ECN benefits doc ought to include this detailed ECN roadmap work, but if it's going to talk about something like ECN I believe it will have to include a summary of the main items on such a roadmap to be concrete. more inline... At 00:38 12/08/2014, John Leslie wrote: (I have read Michael's reply to this, but I'll respond here.) Dave Taht dave.t...@gmail.com wrote: On Mon, Aug 11, 2014 at 7:48 AM, Wesley Eddy w...@mti-systems.com wrote: This draft has been discussed a bit here and in TSVWG: http://tools.ietf.org/html/draft-welzl-ecn-benefits-01 I do think this is the right place to discuss it. As I understand, the IAB has also discussed it a bit, and would be happy if this was something that an IETF working group published. I believe the TSVWG chairs also discussed this and would be fine if the AQM working group adopted it. Thus, I am in favor of adopting it, with the understanding that it will see significant changes during our discussion. I think we can and should agree the direction of those changes in this thread. I'd rather not agree to start on a doc and plan to meander. GF: +1, we can add comments to the ID to align to this, personally I've already said that I'd like to see text on: - bleaching and middlebox requirements to deploy. - Need to verify the paths actually really *do support* ECN (sorry, but may be needed). I agree that verifying that a path can take a congestion notification e2e is important. I don't think this will be a quick (6 months) job, because of the problem of being clear about the things like ECN that it needs to talk about. 
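For anyone wanting to reproduce a cerowrt-style ECN setup on stock linux, the knobs involved look roughly like this. A sketch: "eth0" is a placeholder interface, and these are not cerowrt's shipped values, which are described in prose above:

```shell
sysctl -w net.ipv4.tcp_ecn=1                 # negotiate ECN on TCP connections
tc qdisc replace dev eth0 root fq_codel ecn  # have the AQM mark rather than drop
```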
GF: That depends also in part on whether these new mechanisms will actually change the message to potential users of transports and people considering deployment. In my mind the definition of the protocol techniques does not HAVE to be the same document that tells people *HOW* to implement this in stacks or network devices. (My own choice would be to keep these to research papers and RFCs targeted at their respective communities.) I don't share the relentless optimism of this document, and would like it - or a competing document - to go into the potential negatives. I think it should concentrate on what its name says: the benefits of ECN, both now and in an expected future; but that it should also at least mention downsides this WG sees, and that it should avoid any recommendation stronger than make ECN available to consenting applications. I agree it should be informative, rather than making too many detailed recommendations. GF: Any other bullets listing additional topics are most welcome
[aqm] Sane analysis of typical traffic
changing the title, as this is not relevant to the aqm document... but to an attitude that is driving me absolutely crazy. On Tue, Jul 15, 2014 at 10:46 AM, Akhtar, Shahid (Shahid) shahid.akh...@alcatel-lucent.com wrote: Dave, The message of the results that we presented in November is that it is possible, with currently deployed access hardware, to configure RED so that it consistently improves the end user experience of common network services over Tail-Drop (which is most often configured), and that this improvement can be achieved with a fixed set of RED configuration guidelines. We did not run experiments with sfq_codel because it is not deployed in access networks today. We ran experiments with plain CoDel to understand the difference between a well-configured RED and a more recent single-bucket AQM in our target scenarios, and as reported, didn't observe significant differences in application QoE. Your application was a bunch of video streams. Not web traffic, not voip, not gaming, not bittorrent, not a family of four doing a combination of these things, nor a small business that isn't going to use HAS at all. Please don't over-generalize your results. "RED proven suitable for a family of couch potatoes watching 4 movies at once over the internet, but not 5, at 8mbit/sec" might have been a better title for this paper. In this fictional family, just one kid under the stair, trying to do something useful, interactive and/or fun, can both wreck the couch potatoes' internet experience, and have his own wrecked also. Additional inline clarifications below. -Shahid. -Original Message- From: Dave Taht [mailto:dave.t...@gmail.com] Sent: Monday, July 14, 2014 2:00 PM To: Akhtar, Shahid (Shahid) Cc: Fred Baker (fred); John Leslie; aqm@ietf.org Subject: Re: [aqm] Obsoleting RFC 2309 On Mon, Jul 14, 2014 at 11:08 AM, Akhtar, Shahid (Shahid) shahid.akh...@alcatel-lucent.com wrote: Hi Fred, All, Let me add an additional thought to this issue.
Given that (W)RED has been deployed extensively in operators' networks, and most vendors are still shipping equipment with (W)RED, the concern is that obsoleting 2309 would discourage research on trying to find good configurations to make (W)RED work. We had previously given a presentation at the ICCRG on why RED can still provide value to operators (http://www.ietf.org/proceedings/88/slides/slides-88-iccrg-0.pdf). We have a paper at Globecom 2014 that explains this study much better, but I cannot share a link to it until the proceedings are available. My problem with the above preso, and no doubt the resulting study, is that it doesn't appear to cover the classic, most basic bufferbloat scenario, which is 1 stream up, 1 stream down, one ping (or some form of voip-like traffic), usually on an edge network with asymmetric bandwidth. Two additional analyses of use from the download perspective might be Arris's analysis of the benefits of red and fq over cable head ends: http://snapon.lab.bufferbloat.net/~d/trimfat/Cloonan_Paper.pdf and the cablelabs work, which focused more on the effects of traffic going upstream and has been discussed fairly extensively here. SA: We tried to cover the typical expected traffic over the Internet. I don't know where you get your data, but my measured edge traffic looks nothing like yours. Sure, bandwidth-wise there's the netflix spike 3 hours out of the day, but the rest sure isn't HAS. Most of the traffic is now HAS traffic (as per the sandvine report), so if only a single stream is present, it is likely to be HAS. The closest approximation of a continuous TCP stream, as you mention, would be a progressive download, which can last long enough to look continuous. These were modeled together with other types of traffic. You keep saying download, download, download.
I am saying merely: please ALWAYS try an upload at the same time you are testing downloads - be it videoconferencing (which can easily use up that 1.6mbit link), a youtube upload, an rsync backup, a scp, anything... It needn't be the crux of your paper! But adding several tests of that sort does need to inform your total modeling experience. If you do that much, you will get a feel for how present-day systems interact with things like ack clocking, which will do very interesting things to your downstream couch-potato performance metrics. It's not clear from the study that this is an 8mbit down, 1mbit up DSL network (?) SA: In the study presented, it was 8M down and 1.6M up - slide 9 Thx. Nor is it clear if RED is being applied in both directions or only one direction? SA: AQMs (including RED) were only applied in the downstream direction - slide 9 Are you going to follow up with stuff that looks at the upstream direction? (The results you get from an asymmetric network are quite interesting, particularly in the face of any cross traffic at all.) SA
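The simplest way to follow this advice is a test that loads both directions at once. netperf-wrapper's rrul test does exactly that (several TCP flows each way plus latency probes). A sketch, assuming netserver is already running on the target host and that the -H/-l flags behave as in contemporary versions of the tool:

```shell
# Bidirectional load plus latency measurement, 60 seconds.
netperf-wrapper -H testhost.example.org -l 60 rrul
```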
Re: [aqm] Obsoleting RFC 2309
On Mon, Jul 14, 2014 at 11:23 AM, Fred Baker (fred) f...@cisco.com wrote: On Jul 14, 2014, at 11:08 AM, Akhtar, Shahid (Shahid) shahid.akh...@alcatel-lucent.com wrote: Hi Fred, All, Let me add an additional thought to this issue. Given that (W)RED has been deployed extensively in operators' networks, and most vendors are still shipping equipment with (W)RED, the concern is that obsoleting 2309 would discourage research on trying to find good configurations to make (W)RED work. Well, note that we're not saying to pull RED out of the network; we're saying to not make it the default. Note that even in the networks you mention, (W)RED is not the default configuration; you have to give it several parameters, and therefore have to actively turn it on. We had previously given a presentation at the ICCRG on why RED can still provide value to operators (http://www.ietf.org/proceedings/88/slides/slides-88-iccrg-0.pdf). We have a paper at Globecom 2014 that explains this study much better, but I cannot share a link to it until the proceedings are available. One of the major reasons why operators chose not to deploy (W)RED was a number of studies and research which gave operators conflicting messages on the value of (W)RED and appropriate parameters to use. Some of these are mentioned in the presentation above. In it we show that the previous studies which showed low value for RED used web traffic with very small file sizes (on the order of 5-10 packets), which reduces the effectiveness of all AQMs that work by dropping or ECN-marking flows to indicate congestion. Today's traffic is composed mostly of multimedia traffic like HAS or video progressive download, which has much larger file sizes and can be controlled much better with AQMs, and in our research we show that RED can be quite effective with this traffic, with little tuning needed for typical residential access flows.
I prefer John's proposal of updating 2309 rather than obsoleting it, but if we can have some text in Fred's draft acknowledging the large deployment of (W)RED and the need to still find good configurations - that may work. I can volunteer to provide that text. The existing draft doesn't mention any specific AQM algorithms. It seems to me that the more consistent approach would be to write a short draft documenting WRED, that the WG could pass along as informational or experimental on the basis of not meeting the requirements of being self-configuring/tuning, at the same time as it passes along others as PS or whatever. I strongly support a good, consistent set of recommendations for how, when, and where to use and not use WRED. -Shahid. -Original Message- From: aqm [mailto:aqm-boun...@ietf.org] On Behalf Of Fred Baker (fred) Sent: Monday, July 14, 2014 2:06 AM To: John Leslie Cc: aqm@ietf.org Subject: Re: [aqm] Obsoleting RFC 2309 On Jul 3, 2014, at 10:22 AM, John Leslie j...@jlc.net wrote: It would be possible for someone to argue that restating a recommendation from another document weakens both statements; but I disagree: We should clearly state what we mean in this document, and I believe this wording does so. The argument for putting it in there started from the fact that we are obsoleting 2309, as stated in the charter. I would understand a document that updates 2309 to be in a strange state if 2309 is itself made historic or obsolete. So we carried the recommendation into this document so it wouldn't get lost. ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm -- Dave Täht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
Re: [aqm] [iccrg] Fwd: New Version Notification for draft-irtf-iccrg-tcpeval-01.txt
I like what I see here. I will have a few suggestions for the text after the (USA) holiday... On Jul 4, 2014 5:11 AM, David Ros d...@simula.no wrote: Dear all, After a long hiatus, we have finally posted an update to the TCP evaluation suite. Comments are welcome. Besides mostly editorial fixes, the main changes wrt version -00 that David Hayes presented in Berlin (*) concern parameter values for several scenarios (some have been fixed, some have been added). (*) We just realised version -00 never got posted to the IETF datatracker, so there was only a privately-hosted version online; we've just fixed this omission. Our apologies. Thanks, David (as individual) Begin forwarded message: From: internet-dra...@ietf.org Subject: New Version Notification for draft-irtf-iccrg-tcpeval-01.txt Date: 4 Jul 2014 13:50:53 GMT+2 To: Lachlan L.H. Andrew lachlan.and...@monash.edu, Lachlan L.H. Andrew lachlan.and...@monash.edu, David Hayes davi...@ifi.uio.no, Sally Floyd fl...@acm.org, David Ros d...@simula.no, David Ros d...@simula.no, Sally Floyd fl...@acm.org, David Hayes davi...@ifi.uio.no A new version of I-D, draft-irtf-iccrg-tcpeval-01.txt has been successfully submitted by David Ros and posted to the IETF repository. Name: draft-irtf-iccrg-tcpeval Revision: 01 Title:Common TCP Evaluation Suite Document date:2014-07-04 Group:iccrg Pages:34 URL: http://www.ietf.org/internet-drafts/draft-irtf-iccrg-tcpeval-01.txt Status: https://datatracker.ietf.org/doc/draft-irtf-iccrg-tcpeval/ Htmlized: http://tools.ietf.org/html/draft-irtf-iccrg-tcpeval-01 Diff: http://www.ietf.org/rfcdiff?url2=draft-irtf-iccrg-tcpeval-01 Abstract: This document presents an evaluation test suite for the initial assessment of proposed TCP modifications. 
The goal of the test suite is to allow researchers to quickly and easily evaluate their proposed TCP extensions in simulators and testbeds using a common set of well- defined, standard test cases, in order to compare and contrast proposals against standard TCP as well as other proposed modifications. This test suite is not intended to result in an exhaustive evaluation of a proposed TCP modification or new congestion control mechanism. Instead, the focus is on quickly and easily generating an initial evaluation report that allows the networking community to understand and discuss the behavioral aspects of a new proposal, in order to guide further experimentation that will be needed to fully investigate the specific aspects of such proposal. Please note that it may take a couple of minutes from the time of submission until the htmlized version and diff are available at tools.ietf.org. The IETF Secretariat ___ iccrg mailing list ic...@irtf.org https://www.irtf.org/mailman/listinfo/iccrg ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
[aqm] aqm conference call results?
There were some slides presented on the aqm evaluation guide's directions that I'd like to see again. Link? As it is being broken up into an overview and a second document detailing tests, I'd like people to look over the tests proposed in http://tools.ietf.org/html/draft-sarker-rmcat-eval-test-01 as a possible inspiration. While I like the above a lot, it bothers me that it is only targeted at very low bandwidth scenarios (4mbit being the topmost). There are hopefully other tests proposed by other relevant working groups (ippm, http 2.0, sctp come to mind immediately) that I'd like to be aware of, and yet don't have the energy to sort through each wg to find. If there is a way to get a list of tests each wg considers important to work with, that would be a starting point. -- Dave Täht ___ aqm mailing list aqm@ietf.org https://www.ietf.org/mailman/listinfo/aqm
Re: [aqm] New Version Notification for draft-baker-aqm-sfq-implementation-00.txt
On Tue, Jun 24, 2014 at 1:01 PM, Fred Baker (fred) f...@cisco.com wrote: On Jun 24, 2014, at 12:45 PM, Daniel Havey dha...@yahoo.com wrote: So IMHO it really doesn't matter except in the weird corner case where a running flow has already bloated the queue and then we switch on the AQM. Hmm? In practice, changing the qdisc in Linux, at least, does completely blow up the existing queue: all packets are discarded, the various data structures removed, the new data structures created, then switched. Don't do that. Do it once, at init time, or before address acquisition. Simple schemes can be handled now (linux 3.13 and later) by a single sysctl variable, set either in /etc/sysctl.conf or via sysctl -w net.core.default_qdisc=fq_codel # or pie, or sfq, or fq Arguably this needs to allow for arguments, and to be more flexible and interface specific. The same goes for enabling ecn or not (net.ipv4.tcp_ecn=0). More complex implementations, like htb, have a default direction things go until they are fully set up. Other linux implementations, like drr and qfq, do not, and result in packet loss until entirely set up. Recently, support for a plug scheduler was developed in order to assist vm migration, which might make it more possible to switch out qdiscs without interrupting service. That actually has me a little worried with the 100 ms delay built into codel: the initial delay until it finds a sane delay to drop at. Imagine, if you will, that you have a deep buffer at the bottleneck and a relatively short RTT, and are moving a large file. I could imagine codel's delay allowing a session to build a large backlog and then "suddenly turning on AQM". On a 10 ms link with O(200) packets of queue depth, for example, you could build up 100 ms plus of data in the queue, spend the interval mostly emptying it, and then drop the last queued packet because 100 ms had gone by, there was still data in the queue, and the next packet had sat there longer than 5 ms.
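The "do it once, at init time" advice above can be sketched as a small boot-time script. This is a sketch under assumptions: eth0 is an assumed interface name, and whether fq_codel, pie, or fq is the right choice depends on the host's role.

```shell
# Hypothetical init-time sketch: pick the qdisc once, before traffic
# starts, rather than swapping qdiscs on a live, loaded queue.
# "eth0" is an assumed interface name.

# System-wide default for interfaces brought up from now on (linux 3.13+):
sysctl -w net.core.default_qdisc=fq_codel   # or pie, or sfq, or fq

# Or set it explicitly, per interface, at init time:
tc qdisc replace dev eth0 root fq_codel

# ECN policy is a separate, host-wide knob:
sysctl -w net.ipv4.tcp_ecn=1   # 0 disables; 1 enables when the peer requests it
```

Run at init (or before address acquisition, per the advice above), the queue blown away by the qdisc swap is empty, so nothing is lost.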
Given that pie depends on an estimation window being filled, this is not a problem pie has. However, needing that window filled is a big problem at low bandwidths for pie. As for codel: well, there is a specific inhibit in present forms of codel to not drop the last packet in the queue even if it has sat there too long. Codel stops dropping at minbytes (called maxpacket in the code), which is a variable determined from the flow characteristics, and is usually 1 MTU in size, but can be larger if TSO or GRO are in operation on the device. The first versions of fq_codel preserved this behavior: it would never drop the last packet in any fq_codel queue. This (still) seems like desirable behavior in the case of having nearly one queue per flow, but it led inevitably to what I had called the horizontal standing queue problem (where we could end up with 1024 queues, all with one packet, and no longer meeting the latency target(s)). So eric made the backlog maxpacket check global to all queues, and that's what's been deployed ever since. Later work (I think) is showing that in practice any inhibit at all hurts on the architectures available, as htb or (bql and the tx-ring) are already buffering up packets below where codel was dropping from near-head. More packets will always be along, later. This patch disables the maxpacket check entirely, and results in a space and cpu savings, without much observable negative or positive effect on latency and utilization at the bandwidths available to me. I remain a bit concerned about what happens with TSO and/or GRO enabled. http://snapon.lab.bufferbloat.net/~cero2/0003-codel-eliminate-maxpacket-variable.patch I'd love it if people tried it. Of higher concern to me has long been more sanely applying hysteresis in the drop rate over wildly varying high bandwidths and loads, but not a lot of work has gone into codel since its inception, as it was so good to start with, and so dramatically improved by fq_codel, as to be barely worth debating.
But certainly better control laws are welcomed! -- Dave Täht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
Re: [aqm] New Version Notification for draft-baker-aqm-sfq-implementation-00.txt
On Tue, Jun 24, 2014 at 1:48 PM, Fred Baker (fred) f...@cisco.com wrote: On Jun 24, 2014, at 1:33 PM, Daniel Havey dha...@yahoo.com wrote: There may be scenarios where the interaction of the interval, the RTT and the bandwidth cause this to happen recurringly, constantly underflowing the bandwidth. To be honest, the real concern is very long delay paths, and it applies to AQM algorithms generally. During TCP slow start (which is not particularly slow, but contains exponential growth), we have an initial burst, which with TCP Offload Engines can, I'm told, spit 65K bytes out in the initial burst. The burst travels somewhere and results in a set of acks, which presumably arrive at the sender at approximately the rate the burst went through the bottleneck, but elicit a burst roughly twice as fast as the bottleneck. That happens again and again until either a loss/mark event is detected or cwnd hits ssthresh, at which point the growth of cwnd becomes linear. I think tcp offloads have been thoroughly shown by now to blow up all sorts of networks, and there has been a lot of work in recent linux kernels for hosts to mitigate it (use smaller bursts), most recently the sch_fq + pacing work. The objective of slow start is to fill the pipe, and especially in the case of long rtts, as in satellite and lte networks, it needs to be, well, slower. tcp offloads are an assist for slower cpus and a per-ethernet-device feature to get more bandwidth for less cpu... at the cost of latency, bursty loss, and packet mixing. Modern x86 hardware can easily saturate gigE links without TSO in use at all. Many lower-end (arm) products can't, as yet, and 10GigE is still the realm of TSO (with mitigations arriving in software as per above). I do hope things like TSO2 (bursts of 256k packets) are not widely adopted, and that smarter mixing happens on multi-queued ethernet devices instead.
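The per-RTT doubling Fred describes can be made concrete with a little arithmetic. The numbers below are illustrative assumptions, not values from the thread: an initial window of 10 packets and an ssthresh of 100 packets.

```shell
# Illustrative only: count round trips of slow-start doubling from an
# assumed initial window of 10 packets until cwnd crosses an assumed
# ssthresh of 100 packets. Each RTT elicits a burst roughly twice the
# size of the previous one.
awk 'BEGIN {
  cwnd = 10; rtts = 0
  while (cwnd < 100) { cwnd *= 2; rtts++ }
  printf "%d RTTs, final burst of %d packets\n", rtts, cwnd
}'
# prints: 4 RTTs, final burst of 160 packets
```

The point is how few round trips it takes for the bursts to overshoot the bottleneck, which is why the loss/mark event (or pacing, per the sch_fq work above) matters so much.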
If the burst is allowed to use the entire memory of the bottleneck system's interface, it will very possibly approach the capacity of the bottleneck. However, with pretty much any AQM algorithm I'm aware of, the algorithm will sense an issue and drop or mark something, kicking the session into congestion avoidance relatively early. Big bursts are bad. Let packets be packets! Kicking things into congestion avoidance early turns out to have interesting interactions with hystart. This is well-known behavior, and something we have a couple of RFCs on. But yes, it can happen on more nominal paths as well. -- Dave Täht
[aqm] AQM in every buffer?
In the other long thread, Gorry said something that didn't quite ring true with me: our goal should be AQM in every buffer. Well, that's somewhat desirable but not doable (at least in my world) - 1) The device has sufficient buffering to get at least one packet out. 2) There's a tx ring which holds packets for the device to pick up from. 3) In linux now (and some older cisco boxes) there is this thing called byte queue limits (BQL) which moderates the tx ring to only have enough data in it to keep the device busy. 4) These layers give the upper portions of the stack time to think harder about what to put on the tx ring. *Ideally* an AQM should have a picture of the total buffering in the system all the way to the wire, but in practice, at higher speeds, once things are controlled by BQL, it's a trivial amount of extra buffering. (This is partially why I get nonplussed by people dissing drop head, when what's on the tx ring is already past the drop-head point of the AQM layer.) Now, I imagine that at least some hardware switches *could* have a picture all the way to the wire, but I doubt that it's feasible, also. -- Dave Täht
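On a Linux box with a BQL-capable driver, the per-queue limits in layer 3 above can be inspected via sysfs. A sketch, assuming an interface named eth0 (an assumption; not all drivers expose BQL):

```shell
# Sketch: inspect byte queue limits on an assumed interface "eth0".
# Drivers without BQL support will not populate these directories.
for q in /sys/class/net/eth0/queues/tx-*/byte_queue_limits; do
    echo "$q:"
    echo "  limit:     $(cat "$q/limit") bytes currently allowed onto the tx ring"
    echo "  limit_max: $(cat "$q/limit_max") hard ceiling"
    echo "  inflight:  $(cat "$q/inflight") bytes on the ring right now"
done
```

On a busy gigE link the inflight figure is typically tens of kilobytes, which is the "trivial amount of extra buffering" below the AQM that the paragraph above refers to.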
Re: [aqm] AQM conference call - June 24
On Mon, Jun 2, 2014 at 8:19 AM, Wesley Eddy w...@mti-systems.com wrote: Hello, we're planning on holding an AQM conference call on June 24 at 1PM US/Eastern time. We'll publish webex/telecon coordinates closer to the day. This is just a notification for calendar planning purposes. I don't have webex capability. Do have google hangouts. Can dial in. We will also have it announced to the IETF announcement list soon. The goal of this is to give us a chance to focus some higher bandwidth discussion around the working group milestones, and hopefully make a little bit of progress prior to the next actual working group meeting at IETF 90. I'm hoping that we need no more than about an hour and a half. The rough agenda (for bashing) is: 1 - discuss overall WG status quickly 2 - discuss state of the 2309bis / recommendation draft - if any editors or people with comments are online, this will be a chance to discuss any remaining items that haven't converged through the mailing list yet 3 - discuss state of evaluation guidelines / scenarios - if one of the editors is available, we'd like them to share plans and status briefly 4 - discuss possibly adopting algorithms, as mentioned on the mailing list and get some feedback on this I am interested in feedback and discussion on the following two drafts before that date, if possible: http://tools.ietf.org/html/draft-nichols-tsvwg-codel-02 http://tools.ietf.org/html/draft-hoeiland-joergensen-aqm-fq-codel-00 IF the wg is interested in seeing this draft completed before ietf, let me know soonest: http://snapon.lab.bufferbloat.net/~d/draft-taht-home-gateway-best-practices-00.html I would probably accompany it with a preso on the case for comprehensive queue management, talking about the expected network behavior and test suites developed by other wgs like webrtc. 
5 - plan agenda for Toronto -- Wes Eddy MTI Systems -- Dave Täht
Re: [aqm] last call results on draft-ietf-aqm-recommendation
I agree with the complement language. I don't mind if they are separable. Integration, however, is highly advantageous. I started another thread on the backlog issue. Because scheduling requires policy and AQM doesn't. Machine-gunning down packets randomly until the flows start to behave does not require any policy, agreed. A 5-tuple fq system is not a lot of policy to impose. Certainly qos and rate scheduling systems impose a lot more policy. Actually, I'm going to retract part of what I just said. Everything is a policy. Drop tail is a policy; it's useful for e2e mechanisms like ledbat if the queue size is greater than 100ms. Not helpful for bufferbloat. Drop head is a policy; it's useful for voip (actually useful for tcp too). Not helpful for ledbat. Shooting randomly and increasingly until flows get under control is a decent compromise between drop head and drop tail, though it also shoots at a lot of packets it doesn't need to. drr is a policy that does better mixing and does byte fairness. sfq is a policy that does better mixing with packet fairness. qfq does weighted fq. red/ared/wred is a policy. hfsc is a policy that does interesting scheduling and drop things all its own. htb-based policies are often complex and interesting. So the problem is in defining what policies are needed and what algorithms can be used to implement each policy. May the ones that provide the best QoE for the end user succeed in the marketplace, and networks get ever better. https://www0.comp.nus.edu/~bleong/publications/pam14-ispcheck.pdf So operators don't want to have to face the dilemma of needing the AQM part, but not being able to have it because they don't want the policy implicit in the scheduling part. A dilemma of choosing which single line of code to incorporate in an otherwise far more complex system? I certainly do wish it was entirely parameterless, and perhaps a future version could be more so than it is today.
I can write up the complexity required to do, for example, qfq + pie, but it would be a great deal longer than the below, and qfq + RED, or red alone, is much longer than either. Scripting is needed to configure those...

# To do both AQM + DRR at the same time, with reasonable defaults for 4mbit-10gbit:
tc qdisc add dev your_device root fq_codel

# AQM only (ecn not presently recommended):
tc qdisc add dev your_device root codel
# or (functional equivalent):
tc qdisc add dev your_device root fq_codel flows 1 noecn

# (You could also replace the default tc filter, to get, like,
# a 4-queued system on dscp...)

# DRR + SQF-like behavior with minimal AQM, probably mostly reverting
# to drop head from the largest queue (with the largest delay I consider
# even slightly reasonable):
tc qdisc add dev your_device root fq_codel target 250ms interval 2500ms

# If your desire is to completely rip out the codel portion of fq_codel,
# that's doable. I know a fq_pie exists, too.

# Reasonable default for satellite systems (might need to be closer to 120ms,
# and given the speed of most satellites, quantum 300 makes sense, as does
# a reduced mtu and IW):
tc qdisc add dev your_device root fq_codel target 60ms interval 1200ms

# A useful option for lower-bandwidth systems is quantum 300.

# Data-center-only use can run at a reduced target and interval:
tc qdisc add dev your_device root fq_codel target 500us interval 10ms

# Above 10Gbit, increasing the packet limit is good, and probably a good idea
# to increase flows. A current problematic interaction with htb below 2.5mbit
# leads to a need for a larger target (it would be better to fix htb or to
# write a better rate limiter).

It's about a page of directions to handle every use case. I'd LOVE to have similar guideline and cookbook page(s) for EVERY well-known aqm and packet scheduling system - notably red and ared. I lack data on pie's scalability presently, too.
Most rate shaping code on top of this sort of stuff, and most shaping/qos-related code, is orders of magnitude more complex than this. Take htb's compensator for ATM and/or PPPoE framing. Please. Or the hideous QoS schemes people have designed using DPI. As things stand, fq_codel is a simpler/faster/better drop-in replacement for tons of code that shaped and used RED, or shaped and did sfq. Sensing the line rate, choosing an appropriate packet limit based on available memory, and auto-choosing the number of flows are things the C code could be smarter about. They are something I currently do in a shell script (that also tries to figure out atm framing and a 3-tier qos system). I think that adding a rate limiter directly to an fq_codel or wfq + codel derived algo is a great idea and would be better than htb or hfsc + X. Been meaning to polish up the code... This is critical for fq_codel, because apparently CoDel alone is not recommended (which I would agree with). The present version of that is useful (without ecn) in many scenarios. It has been used in combination with hfsc, htb, and standalone. We've long
Re: [aqm] chrome web page benchmarker fixed
Doug Orr recommended to us that we give http://www.chromium.org/developers/telemetry a shot in generating reproducible web traffic models.
Re: [aqm] the side effects of 330ms lag in the real world
On Tue, Apr 29, 2014 at 12:56 AM, Mikael Abrahamsson swm...@swm.pp.se wrote: On Tue, 29 Apr 2014, Fred Baker (fred) wrote: A couple of points here. 1) The video went viral, and garnered over 600,000 new hits in the 12 hours since I posted it here. There is pent-up demand for less latency. While the ad conflates bandwidth with latency, they could have published their RTTs on their local fiber network, which is probably a great deal less than dsl or cable. That counts for a lot when accessing local services. 2) There are a lot of things an ISP can do to improve apparent latency on the long haul: A) co-locating with a major dns server like f-root to reduce dns latency B) co-locating with major services like google and netflix (publishing ping times to google, for example, might be a good tactic) C) better peering Well, we could discuss international communications. I happen to be at Infocom in Toronto, VPN'd into Cisco San Jose, and did a ping to you: Yes, but as soon as you hit the long distance network the latency is the same regardless of access method. So while I agree that understanding the effect of latency is important, it's no longer a meaningful way of selling fiber access. A fiber last mile instead of ADSL2+ won't improve your long-distance latency. Well, it chops a great deal from the baseline physical latency, and most people tend to access resources closer to them rather than farther away. An American in Paris might want to access the NYT, but Parisians read Le Monde. Similarly, most major websites are replicated and use CDNs to distribute their data closer to the user. The physical RTT of the last mile matters more and more as resources are co-located in the local data center. -- Dave Täht
Re: [aqm] [Bloat] the side effects of 330ms lag in the real world
On Tue, Apr 29, 2014 at 9:44 AM, Jim Gettys j...@freedesktop.org wrote: On Tue, Apr 29, 2014 at 3:56 AM, Mikael Abrahamsson swm...@swm.pp.se wrote: On Tue, 29 Apr 2014, Fred Baker (fred) wrote: Well, we could discuss international communications. I happen to be at Infocom in Toronto, VPN'd into Cisco San Jose, and did a ping to you: Yes, but as soon as you hit the long distance network the latency is the same regardless of access method. So while I agree that understanding the effect of latency is important, it's no longer a meaningful way of selling fiber access. A fiber last mile instead of ADSL2+ won't improve your long-distance latency. FIOS bufferbloat is a problem too. Measured bufferbloat, on symmetric 25/25 service in New Jersey at my in-laws' house, is 200ms (on the ethernet port of the Actiontec router provided by Verizon). So latency under load is the usual problem. ESR's link, before and after the cerowrt SQM treatment: https://www.bufferbloat.net/projects/codel/wiki/RRUL_Rogues_Gallery#Verizon-FIOS-Testing-at-25Mbit-up-and-25Mbit-down Why would you think the GPON guys are any better in principle than cable or DSL? Cable and DSL may be somewhat worse, just because the gear is older and downward compatibility means that new modems on low bandwidth tiers are even more grossly overbuffered. Well, buffering on the DSLAM or CMTS needs to be more actively managed. Fixed limits are much like conventional policing: always either too large or too small, for sustained or bursty traffic respectively. I have been fiddling with Tim Shepard's udpburst tool as a quick means of measuring head-end buffering, even with fq_codel present on the inbound. (It's not suitable for open internet use as yet, but code in progress can be had or enhanced at https://github.com/dtaht/isochronous .) I just added ecn and tos setting support to it.
server: ./udpburst -S -E -D 32 # Server mode, enable ECN marking, set dscp to 0x20 (CS1) client: This is from a 22Mbit down CMTS d@nuc:~/git/isochronous$ ./udpburst -f 149.20.63.30 -E -C -d -n 400 -s 1400 1400 bytes -- received 382 of 400 -- 365 consecutive 0 ooo 0 dups 2 ect .. . ... . ... . .... . . 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 or roughly 512k of buffering. A DSL link (6400 down) d@puck:~/git/isochronous$ ./udpburst -f snapon.lab.bufferbloat.net -n 100 -C -d -s 1000 1000 bytes -- received 71 of 100 -- 71 consecutive 0 ooo 0 dups 0 ect .. 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 or roughly 64k worth of buffering. Interestingly the bandwidth disparity between the server (gigE in isc.org's co-lo), is so great that fq_codel can't kick in before the 64k dslam buffer is overrun. You can look at the netalyzr scatter plots in http://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-throw-stones-at-another/ Now, if someone gives me real fiber to the home, with a real switch fabric upstream, rather than gpon life might be somewhat better (if the switches aren't themselves overbuffered But so far, it isn't. 
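As a sanity check on the "roughly 512k" and "roughly 64k" estimates above, the back-of-the-envelope arithmetic is just consecutive packets accepted before the first loss, times packet size (the DSL figure comes out nearer 70k than 64k; the rounding down to a likely power-of-two buffer size is my guess, not something stated above):

```shell
# Buffer-depth estimate from a udpburst run:
# consecutive packets accepted before the first drop, times packet size.
awk 'BEGIN {
  printf "CMTS: %d bytes\n", 365 * 1400   # 22Mbit cable head-end, 1400B packets
  printf "DSL:  %d bytes\n", 71 * 1000    # 6400kbit DSLAM, 1000B packets
}'
# prints:
# CMTS: 511000 bytes
# DSL:  71000 bytes
```

This only bounds the head-end buffer from below, of course; a longer burst than the buffer can absorb is needed for the estimate to be meaningful at all.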
- Jim -- Mikael Abrahamsson email: swm...@swm.pp.se -- Dave Täht
Re: [aqm] [Bloat] the side effects of 330ms lag in the real world
On Tue, Apr 29, 2014 at 10:01 AM, Toke Høiland-Jørgensen t...@toke.dk wrote: Jim Gettys j...@freedesktop.org writes: Now, if someone gives me real fiber to the home, with a real switch fabric upstream, rather than gpon, life might be somewhat better (if the switches aren't themselves overbuffered). But so far, it isn't. As a data point for this, I have fibre to my apartment building and ethernet into the apartment. I get .5 ms to my upstream gateway and about 6 ms to Google. Still measured up to ~20 ms of bufferbloat while running at 100 Mbps... http://files.toke.dk/bufferbloat/data/karlstad/cdf_comparison.png I need to note that what this wonderfully flat CDF for the measurement stream shows is that short flows under fq_codel leap to the head of the queue ever better as you get more and more bandwidth available. The background load flows not shown on this graph are experiencing 5-20ms worth of latency in each direction, as per codel's algorithm. A better test (in progress) would measure typical voip behaviors. However, as that graph shows, it is quite possible to completely avoid bufferbloat by deploying the right shaping. It does not completely avoid bufferbloat; the fq_codel fast queue merely eliminates queuing delay for sparse flows - things like arp, syn, syn/ack, dns, ntp, etc., as well as the first packet of any flow that has not built up a queue yet (which is, admittedly, quite a lot of bufferbloat reduction). The rest of the magic comes from codel. And in that case fibre *does* have a significant latency advantage. The best latency I've seen to the upstream gateway on DSL has been ~12 ms. And reduced RTT = money. This piece states observed average RTTs at peak times were 17ms for fiber, 28ms for cable, and 44ms for DSL: http://www.igvita.com/2012/07/19/latency-the-new-web-performance-bottleneck/ I don't know if the underlying report measures baseline unloaded last mile RTT.
-Toke -- Dave Täht
[aqm] the side effects of 330ms lag in the real world
pretty wonderful experiment and video http://livingwithlag.com/ -- Dave Täht
Re: [aqm] chrome web page benchmarker fixed
On Fri, Apr 18, 2014 at 1:41 PM, Greg White g.wh...@cablelabs.com wrote: On 4/18/14, 1:05 PM, Dave Taht dave.t...@gmail.com wrote: On Fri, Apr 18, 2014 at 11:15 AM, Greg White g.wh...@cablelabs.com wrote: The choice of RTTs also came from the web traffic captures. I saw RTTmin=16ms, RTTmean=53.8ms, RTTmax=134ms. Get a median? Median value was 62ms. My own stats are probably quite skewed lower from being in california, and doing some tests from places like isc.org in redwood city, which is insanely well co-located. Mine are probably skewed too. I was told that the global median (at the time I collected this data) was around 100ms. Well, the future is already here, just not evenly distributed. Nearly every sample I'd taken at the same time, almost entirely from major cities, came in at under a 70ms median. It strikes me that a possibly useful metric would be object size vs RTT, over time. -- Dave Täht
Re: [aqm] chrome web page benchmarker fixed
On Fri, Apr 18, 2014 at 11:15 AM, Greg White g.wh...@cablelabs.com wrote: Dave, We used the 25k object size for a short time back in 2012 until we had resources to build a more advanced model (appendix A). I did a bunch of captures of real web pages back in 2011 and compared the object size statistics to models that I'd seen published. Lognormal didn't seem to be *exactly* right, but it wasn't a bad fit to what I saw. I've attached a CDF. That does seem a bit large on the initial 20%. Hmm. There is a second kind of major case, where you are moving around on the same web property, and hopefully many core portions of the web page(s), such as the css and javascript, basic logos and other images, are cached. Caching is handled two ways: one is to explicitly mark the data as cacheable for a certain period; the other is an if-modified-since request, which costs RTTs for setup and the query. I am under the impression that we generally see a lot more of the latter than the former these days. The choice of 4 servers was based somewhat on logistics, and also on a finding that across our data set, the average web page retrieved 81% of its resources from the top 4 servers. Increasing to 5 servers only increased that percentage to 84%. The choice of RTTs also came from the web traffic captures. I saw RTTmin=16ms, RTTmean=53.8ms, RTTmax=134ms. Get a median? My own stats are probably quite skewed lower from being in california, and doing some tests from places like isc.org in redwood city, which is insanely well co-located. Much of this can be found in https://tools.ietf.org/html/draft-white-httpbis-spdy-analysis-00 Thx! In many of the cases that we've simulated, the packet drop probability is less than 1% for DNS packets. In our web model, there are a total of 4 servers, so 4 DNS lookups assuming none of the addresses are cached. (I think we have the ability to get a better number for dns loss now.)
If PLR = 1%, there would be a 3.9% chance of losing one or more DNS packets (with a resulting ~5 second additional delay on load time). I've probably oversimplified this, but Kathie N. and I made the call that it would be significantly easier to just do this math than to build a dns implementation in ns2. The specific thing I've been concerned about was not the probability of a dns loss, although as you note the consequences are huge - but the frequency and cost of a cache miss and the resulting fill. This is a very simple namebench test against the alexa top 1000: http://snapon.lab.bufferbloat.net/~d/namebench/namebench_2014-03-20_1255.html This is a more comprehensive one, taken against my own recent web history file: http://snapon.lab.bufferbloat.net/~d/namebench/namebench_2014-03-24_1541.html Both of these were taken against the default SQM system in cerowrt against a cable modem, so you can pretty safely assume the ~20ms (middle) knee in the curve is basically based on physical RTT to the nearest upstream DNS server. And it's a benchmark, so I don't generally believe in the relative hit ratios vis-a-vis normal traffic, but I do think the baseline RTT, and the knees in the curves for the cost of a miss and fill, are relevant. (It's also not clear to me if all cable modems run a local dns server.) Recently simon kelly added support for gathering hit and miss statistics to dnsmasq 2.69. They can be obtained via a simple dns lookup, as answers to queries of class CHAOS and type TXT in domain bind. The domain names are cachesize.bind, insertions.bind, evictions.bind, misses.bind, hits.bind, auth.bind and servers.bind. An example command to query this, using the dig utility, would be dig +short chaos txt cachesize.bind It would be very interesting to see the differences between dnsmasq without DNSSEC, with DNSSEC, and with DNSSEC and --dnssec-check-unsigned (checking for proof of non-existence) - we've been a bit concerned about the overheads of the last in particular.
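The 3.9% figure above checks out: with four independent lookups, each surviving with probability 0.99, the chance of losing at least one is 1 - 0.99^4.

```shell
# Probability of losing one or more of 4 DNS packets at a 1% PLR:
#   1 - (1 - 0.01)^4
awk 'BEGIN { printf "%.4f\n", 1 - (1 - 0.01)^4 }'
# prints: 0.0394
```

The same one-liner generalizes to other PLRs or lookup counts, which is presumably why doing this math beat building a dns implementation in ns2.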
Getting more elaborate stats (hit, miss, and fill costs) is under discussion. We've open sourced the web model (it's on Kathie's web page and will be part of ns2.36) with an encouragement to the community to improve on it. If you'd like to port it to ns3 and add a dns model, that would be fantastic. As part of the google summer of code I am signed up to mentor a student with tom for the *codel related bits in ns3, and certainly plan to get fingers dirty in the cablelabs drop, and there was a very encouraging patch set distributed around for tcp-cubic with hystart support recently as well as a halfway decent 802.11 mac emulation. As usual, I have no funding, personally, to tackle the job, but I'll do what I can anyway. It would be wonderful to finally have all the ns2 and ns3 code mainlined for more people to use it. -- Dave Täht
[aqm] The Effect of Network and Infrastructural Variables on SPDY's Performance
Last night's reading was quite good: http://arxiv.org/pdf/1401.6508.pdf As RTT goes up, it becomes increasingly expensive for HTTPS to establish separate connections for each resource. Each HTTPS connection costs one round trip on TCP handshaking and a further two on negotiating SSL setup. SPDY does this only once (per server) and hence reduces such large waste by multiplexing streams over a single connection. ... the separation between RTT and bandwidth is not particularly distinct. This is because HTTPS tends to operate in a somewhat network-unfriendly manner, creating queueing delays where bandwidth is low. The bursty use of HTTPS' parallel connections creates congestion at the gateway queues, causing up to 3% PLR and inflating RTT by up to 570%. In contrast, SPDY causes negligible packet loss at the gateway. The network-friendly behaviour of SPDY is particularly interesting as Google has recently argued for the use of a larger IW for TCP [7]. The aim of this is to reduce round trips and speed up delivery, an idea which has been criticised for potentially causing congestion. One question here is whether or not this is a strategy that is specifically designed to operate in conjunction with SPDY. To explore this, we run further tests with bandwidth fixed at 1Mbps (all other parameters as above). For HTTPS, it appears that the critics are right: RTT and loss increase greatly with larger IWs. In contrast, SPDY achieves much higher gains when increasing the IW, without these negative side effects. And then they inject packet loss: we inspect the impact of packet loss on SPDY's performance. We fix RTT at 150ms Sigh, the rest of the paper is pretty good, but they should have looked at packet loss at 10-30ms at least. and BW at 1Mbps, varying packet loss using the Linux kernel firewall with a stochastic proportional packet processing rule between 0 and 3%. Figure 6 presents the results.
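The setup-cost asymmetry quoted above (three round trips per HTTPS connection vs. a single multiplexed SPDY connection per server) is easy to quantify. The 150ms RTT is the paper's figure; the 24 connections across 4 servers are assumed counts, borrowed from typical browser behavior discussed elsewhere in this digest, not from the paper:

```shell
# Aggregate connection-setup overhead at a 150ms RTT:
# each HTTPS connection pays 1 RTT (TCP) + 2 RTTs (SSL) = 3 RTTs;
# SPDY pays that once per server and multiplexes everything else.
awk 'BEGIN {
  rtt = 0.150; conns = 24; servers = 4
  printf "HTTPS: %.2f s of handshake time\n", conns * 3 * rtt
  printf "SPDY:  %.2f s of handshake time\n", servers * 3 * rtt
}'
# prints:
# HTTPS: 10.80 s of handshake time
# SPDY:  1.80 s of handshake time
```

The parallel connections overlap their handshakes, so the HTTPS total does not serialize into a 10.8-second stall; but every object fetched on a fresh connection still waits the full 3 RTTs before its first byte, and the synchronized SYN/SSL bursts are exactly the gateway-queue congestion the paper measures.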
Immediately, we see that SPDY is far more adversely affected by packet loss than HTTPS is. This has been anticipated in other work [29] but never before tested. It is also contrary to what has been reported in the SPDY white paper [2], which states that SPDY is better able to deal with loss. The authors suggest that because SPDY sends fewer packets, the negative effect of TCP backoff is mitigated. We find that SPDY does, indeed, send fewer packets (up to 49% fewer due to TCP connection reuse). However, SPDY's multiplexed connections persist far longer compared to HTTPS. -- Dave Täht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
Re: [aqm] chrome web page benchmarker fixed
- Once the initial HTTP GET completes, initiate 24 simultaneous HTTP GETs (via separate TCP connections), 6 connections each to 4 different server nodes I usually don't see more than 15, and certainly not 25 kB-sized objects. - Once each individual HTTP GET completes, initiate a subsequent GET to the same server, until 25 objects have been retrieved from each server. * We don't make sure to flush all the network state in between runs, so if you're using that option, don't trust it to work. The typical scenario we used was a run against dozens or hundreds of urls, capturing traffic, while varying network conditions. I regarded the first run as the most interesting. One can exit the browser and restart after a run like that. At the moment, I merely plan to use the tool primarily to survey various web sites and load times while doing packet captures. The hope was to get valid data from the network portion of the load, tho... * If you have an advanced Chromium setup, this definitely does not work. I advise using the benchmark extension only with a separate Chromium profile for testing purposes. Our flushing of sockets, caches, etc does not actually work correctly when you use the Chromium multiprofile feature and also fails to flush lots of our other network caches. Noted. * No one on Chromium really believes the time-to-paint numbers that we output :) It's complicated. Our graphics stack is complicated. I actually care only about time-to-full-layout, as that's a core network effect... The time from when Blink thinks it painted to when the GPU actually blits to the screen cannot currently be corroborated with any high degree of accuracy from within our code. * It has not been maintained since 2010. It is quite likely there are many other subtle inaccuracies here. Grok. In short, while you can expect it to give you a very high level understanding of performance issues, I advise against placing non-trivial confidence in the accuracy of the numbers generated by the benchmark extension.
The fact that numbers are produced by the extension should not be treated as evidence that the extension actually functions correctly. OK, noted. Still delighted to be able to have a simple load generator that exercises the browsers and generates some results, however dubious. Cheers. On Thu, Apr 17, 2014 at 10:49 AM, Dave Taht dave.t...@gmail.com wrote: Getting a grip on real web page load time behavior in an age of sharded websites, dozens of dns lookups, javascript, and fairly random behavior in ad services and cdns against how a modern browser behaves is very, very hard. It turns out that if you run google-chrome --enable-benchmarking --enable-net-benchmarking (Mac users have to embed these options in their startup script - see http://www.chromium.org/developers/how-tos/run-chromium-with-flags ), enable developer options, and install and run the chrome web page benchmarker ( https://chrome.google.com/webstore/detail/page-benchmarker/channimfdomahekjcahlbpccbgaopjll?hl=en ), it works (at least for me, on a brief test of the latest chrome, on linux. Can someone try windows and mac?) You can then feed in a list of urls to test against, and post-process the resulting .csv file to your heart's content. We used to use this benchmark a lot while trying to characterise typical web behaviors under aqm and packet scheduling systems under load. Running it simultaneously with a rrul test or one of the simpler tcp upload or download tests in the rrul suite was often quite interesting. It turned out the doc has been wrong a while as to the name of the second command line option.
I was gearing up mentally for having to look at the source http://code.google.com/p/chromium/issues/detail?id=338705 /me happy -- Dave Täht Heartbleed POC on wifi campus networks with EAP auth: http://www.eduroam.edu.au/advisory.html
Re: [aqm] [AQM Evaluation Guidelines]
On Tue, Apr 15, 2014 at 6:57 AM, Nicolas KUHN nicolas.k...@telecom-bretagne.eu wrote: Thank you for detailing the content of the Cable Labs document and where these 700kB come from. Concerning your last point: As such I would be strongly in favour of changing the draft to actually describe realistic web client behaviour, rather than just summarising it as repeated downloads of 700KB. +100. I understand that it may be a drastic simplification to just summarise the web client behaviour as only repeated downloads of 700kB. However, the draft may not detail realistic web client behaviour: I believe that it may be off topic and the draft cannot contain such a level of complexity for all the covered protocols/traffic. I propose the following changes: Was: - Realistic HTTP web traffic (repeated download of 700kB); Changed to: - Realistic HTTP web page downloads: the tester should at least consider repeated downloads of 700kB - for more accurate web traffic, a single user web page download [White] may be exploited; What do you think? An AQM evaluation guide MUST include evaluations against real traffic patterns. Period. The White PLT model was decent; the repeated single-flow 700k download proposal is nuts. (I can certainly see attempting to emulate DASH traffic, however) I have further pointed out some flaws in the White PLT model in previous emails - notably as to the effect of not emulating DNS traffic - and have been working towards acquiring a reasonable distribution of DNS hit, miss, and fill numbers to plug into it for some time. That has required work - work on finding a decent web benchmark - and work on acquiring statistics that make sense - and some of that work is beginning to bear fruit. Dnsmasq, for example, has sprouted the ability to collect statistics, we are trying to get the chrome web page benchmarker working again, and so on. You ignore the overhead of DNS lookups at your peril. There are other overheads worth looking at, too...
Similarly I regard testing a correct emulation of bittorrent's real-world behavior in an AQM'd environment as pretty critical. [white] was not even close in this respect. (but it was a good first try!) Overall I suggest that we also adopt the same tests that other WGs are proposing for their protocols. rmcat had a good starter set here: http://www.ietf.org/proceedings/89/slides/slides-89-rmcat-2.pdf Regards, Nicolas On Apr 15, 2014, at 12:28 PM, Toke Høiland-Jørgensen t...@toke.dk wrote: Nicolas KUHN nicolas.k...@telecom-bretagne.eu writes: and realistic HTTP web traffic (repeated download of 700kB). As a reminder, please find here the comments of Shahid Akhtar regarding these values: The Cablelabs work doesn't specify web traffic as simply repeated downloads of 700KB, though. Quoting from [0], the actual wording is: Webs indicates the number of simultaneous web users (repeated downloads of a 700 kB page as described in Appendix A of [White]), Where [White] refers to [1] which states (in the Appendix): The file sizes are generated via a log-normal distribution, such that the log10 of file size is drawn from a normal distribution with mean = 3.34 and standard deviation = 0.84. The file sizes (yi) are calculated from the resulting 100 draws (xi ) using the following formula, in order to produce a set of 100 files whose total size =~ 600 kB (614400 B): And in the main text it specifies (in section 3.2.3) the actual model for the web traffic used: Model single user web page download as follows: - Web page modeled as single HTML page + 100 objects spread evenly across 4 servers. Web object sizes are currently fixed at 25 kB each, whereas the initial HTML page is 100 kB. Appendix A provides an alternative page model that may be explored in future work. - Server RTTs set as follows (20 ms, 30 ms, 50 ms, 100 ms). - Initial HTTP GET to retrieve a moderately sized object (100 kB HTML page) from server 1. 
- Once initial HTTP GET completes, initiate 24 simultaneous HTTP GETs (via separate TCP connections), 6 connections each to 4 different server nodes - Once each individual HTTP GET completes, initiate a subsequent GET to the same server, until 25 objects have been retrieved from each server. Which is a pretty far cry from just saying repeated downloads of 700 KB and, while still somewhat bigger, matches the numbers from Google better in terms of distribution between page sizes and other objects. And, more importantly, it features the kind of parallelism and interactions that a real web browser does; which, as Shahid mentioned is (can be) quite important for the treatment it receives by an AQM. As such I would be strongly in favour of changing the draft to actually describe realistic web client behaviour, rather than just summarising it as repeated downloads of 700KB. -Toke [0]
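The log-normal size model quoted from [White] above is easy to reproduce. Here is a minimal sketch; the rescaling step (dividing through so the 100 draws total 614400 bytes) is an assumption on my part, since the quoted text mentions "the following formula" but the formula itself was elided:

```python
import random

def white_web_object_sizes(n=100, total=614400, seed=1):
    """Draw n file sizes per [White] Appendix A: log10(size) ~ N(3.34, 0.84),
    then rescale so the whole set sums to ~600 kB (614400 B).
    The proportional rescaling is assumed; the exact formula is not quoted."""
    rng = random.Random(seed)
    draws = [10 ** rng.gauss(3.34, 0.84) for _ in range(n)]
    scale = total / sum(draws)
    return [d * scale for d in draws]

sizes = white_web_object_sizes()
print(round(sum(sizes)))  # ~614400 by construction
print(max(sizes) / min(sizes) > 100)  # heavy-tailed: a few big objects dominate
```

The heavy tail is the point: a handful of large objects coexist with many tiny ones, which is exactly the mix that exercises an AQM differently than one repeated 700 kB transfer.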
Re: [aqm] working group LAST CALL on recommendations draft
I still don't support wglc. a) Nit: Network Working Group? b) I have given up on using the term AQM to describe anything other than Active Queue Length Management algorithms. What I wrote about SQM is mostly outside the scope of the AQM guidelines document, but it's here: http://www.bufferbloat.net/projects/cerowrt/wiki/Smart_Queue_Management but I can live with the broad definition as used in this document. c) I also don't just mean fair or flow queuing when I say packet scheduling; we've identified an elephant in the room that is actually rate limiting (tbf, htb), hybrid rate limiting/scheduling (hfsc), or powerboost-style rate limiting/expansion. Moving on: d) The traditional technique for managing the queue length in a network device is to set a maximum length (in terms of packets) - well, bytes are common on many devices like DSLAMs, CMTSes, and modems... substitute: packets or bytes e) 2. Provide a lower-delay interactive service I tend to regard dns traffic as rather significant here too, and telnet is rather obsolete; substitute ssh. f) 3. Non-TCP-friendly Transport Protocols There's also the problem of DDOS attacks. g) Another topic requiring consideration is the appropriate granularity of a flow when considering a queue management method. There are a few natural answers: 1) a transport (e.g. TCP or UDP) flow (source address/port, destination address/port, Differentiated Services Code Point - DSCP); 2) a source/destination host pair (IP addresses, DSCP); 3) a given source host or a given destination host. We suggest that the source/destination host pair gives the most appropriate granularity in many circumstances. I don't suggest the last, as I have no data that backs it up. And my request to include a 5-tuple (source address/port, destination address/port, and protocol) isn't in here, although the MF classifier is mentioned later on in section 4.4. Elsewhere (in webrtc) there is an assumption that the 5-tuple is addr/port daddr/dport protocol.
I don't actually think doing a 5-tuple including the dscp rather than the protocol is a very good idea, given the amount of misclassified flows I see transiting site boundaries. I do think moving dscp-marked flows out into their own queues and then 5-tupling with protocol is not a bad idea. As for the final recommendations: h) 4. AQM algorithms SHOULD respond to measured congestion, not application profiles. I'm not sure if this precludes active classification and optimization measures? http://www.smallnetbuilder.com/lanwan/lanwan-features/32297-does-qualcomms-streamboost-really-work Not all applications transmit packets of the same size. Although applications may be characterized by particular profiles of packet size this should not be used as the basis for AQM (see next section). From a packet scheduling perspective I strongly support using some differentiation based on packet size at low bandwidths. 300 bytes works well. For AQM, I don't care; I just care about latency, no matter whether it comes from a pps problem or a packet size problem. I didn't mind pie's increasing probability of a drop based on packet size (which, so far as I know, is still in cablemodem pie, and it helps on competing voip traffic). i) 4.5 might want to also mention more modern protocols like uTP and QUIC. In 2013, an obvious example of further research is the need to consider the use of Map/Reduce applications in data centers; do we need to extend our taxonomy of TCP/SCTP sessions to include not only mice and elephants, but lemmings? Lemmings are flash crowds of mice that the network inadvertently tries to signal to as if they were elephant flows, resulting in head of line blocking in data center applications. I like to talk about ANTS. Can suggest some language if you want. On Wed, Apr 9, 2014 at 8:35 AM, Wesley Eddy w...@mti-systems.com wrote: We didn't receive any comments yet on the updated recommendations draft, which we were trying to have a working group last call on per Richard's email to the list on 3/5.
Since we think people might not have noticed the last call, we're re-announcing it. In the next two weeks, please review this document: https://datatracker.ietf.org/doc/draft-ietf-aqm-recommendation/ and relay any comments, questions, corrections, words of support, etc. to this AQM mailing list. Thanks for your help in finishing this document! -- Wes Eddy MTI Systems -- Dave Täht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
Re: [aqm] packet loss (and therefore ECN) might not happen much actually
I'd be very interested in DNS request/reply analysis of that traffic. On Fri, Mar 21, 2014 at 9:27 PM, Fred Baker (fred) f...@cisco.com wrote: On Mar 4, 2014, at 9:12 AM, Eggert, Lars l...@netapp.com wrote: it looks like (in japan at least) TCP is very rarely controlled by packet loss (dupack or timeout) but more by sender or receiver rate limiting (or just being too short lived:) It would be interesting to know their delay variation. You've seen my famous 9 second delay graphic. There was no packet loss at all in that... You have also seen, I believe, my annotated Shepherd Diagram of an upload to Picasa. That was from Akasaka, and had three drops in a five second window, resulting in the session spending 40% of its duration underrunning available capacity. It would be interesting to know the traffic mix, the line speeds and latencies end to end, and so on. From my perspective, it's Really Hard to say the internet acts this way; consider the problem of the six blind philosophers and the elephant... What I think I *can* say is that I measured something in a certain way in a particular topological place at a particular time and with a particular workload, I analyzed it in a certain way, and in that measurement and analysis I observed ... something. If you want my guess at what the Japanese trace measured, it had upwards of 50 MBPS end to end and enough buffer at that rate in the bottleneck switch to prevent tail-drop loss in the ambient workload. Short sessions, which predominate, would not touch that, and high volume sessions might, as you say, self-limit in one of several ways. For comparison, yesterday, I took 24 hours of tcpdump trace on my laptop and wrote a reduction script. I started out by capturing 38 hours of traces earlier in the week in one hour chunks, and discovered that tcpdump zero-bases its data structures when it switches output files. Then I took a single 24 hour trace file.
In that reduction, I distinguished between microflows *from* me and microflows *to* me (where me might be my IPv4 or my IPv6 address or name), which would be the two halves of a TCP session. I also threw out sessions that didn't make sense to me, such as ones that might have already been open when I started the trace. Reason? I have asymmetric bandwidth (12 MBPS down and 2 MBPS up, sez the contract, and I think that's interpreted as at least, as I have seen higher), and I expect the two directions to behave a little differently. Rates are in kilobits/second, and all numbers are for a session. I have TCP sessions that are as short as a single packet each way (data/RST, for whatever reason I might receive such things, and maybe SYN/SYN-ACK) and pipelined tcp connections lasting the better part of an hour (I opened all of my face:b00c friends' pages, which moved quite a bit of data, all using IPv6). my flows: 10548 my retransmissions: 4009 my packets: min=1 median=10 95%=33 max=73732 my bytes: min=1 median=2493 95%=19608 max=697486314 my durations: min=0.002751 median=58.096561 95%=120.355108 max=35764.656936 my kbps: min=0.74 median=0.577529 95%=17.171851 max=1049048788.929813 his flows: 14977 his retransmissions: 2859 his packets: min=1 median=9 95%=104 max=181542 his bytes: min=1 median=3795 95%=110354 max=221579702 his durations: min=0.15 median=46.146412 95%=148.102106 max=35764.620901 his kbps: min=0.000459 median=0.928163 95%=114.575466 max=22604.513177 There are some weird questions I want to understand about the max fields. I edited out sessions that were open when I started the trace, of which there were a few. There are a couple of other strange sessions. One of these days I might sort out the difference between 14977 and 10548. But I think the bottom line is that while the median session in my home office probably doesn't incur a loss, it looks to me like the ones at the 95th percentile for size probably do - and maybe several.
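Fred's bottom line falls out of simple binomial arithmetic. A hedged sketch, assuming independent per-packet loss (real links only approximate this, and the 1% rate is purely illustrative): a median inbound flow of ~9-10 packets almost never sees a loss, while a 95th-percentile flow of ~104 packets more likely than not does.

```python
def p_at_least_one_loss(n_packets, loss_rate):
    """P(a flow of n packets experiences >= 1 loss), assuming i.i.d. per-packet loss."""
    return 1 - (1 - loss_rate) ** n_packets

# Using the trace's "his packets" median (~9-10) and 95th percentile (~104),
# with an assumed 1% per-packet loss rate:
print(p_at_least_one_loss(10, 0.01))   # ~0.096: the median flow usually sees none
print(p_at_least_one_loss(104, 0.01))  # ~0.65: the big flows probably see one or more
```

This is why "median session incurs no loss, 95th percentile probably does" is exactly what one expects: loss exposure scales with flow size, not flow count.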
-- Dave Täht Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
[aqm] stochastic hashing in hardware with a limited number of queues
The thread on netdev starting here: http://comments.gmane.org/gmane.linux.network/307532 was pretty interesting; a research group at SUNY looked hard at the behavior of a 64-hardware-queue system running giant flows: http://www.fsl.cs.sunysb.edu/~mchen/fast14poster-hashcast-portrait.pdf They ran smack into the birthday problem inherent in a small number of queues. And also a bug (now fixed). The conclusion of the thread was amusing: the new sch_fq scheduler with a single hardware queue (and a string of fixes over the past year for tcp small queues and tso offloads) performed as well as the multi-queue implementation... with utter fairness. On Sun, Mar 9, 2014 at 9:44 AM, Eric Dumazet eric.dumazet at gmail.com wrote: Multiqueue is not a requirement in your case. You can easily reach line rate with a single queue on a 10Gbe NIC. I repeated the experiment 10 times using one tx queue with FQ, and all clients get a fair share of the bandwidth. The overall throughput showed no difference between the single queue case and the mq case, and the throughput in both cases is close to the line rate. Merely because a feature is available in hardware does not mean it should be used. Certainly multiple hw queues are a good idea for some traffic mixes, but not for the circumstances of this particular test series. -- Dave Täht Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
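The birthday problem here is easy to quantify. A quick sketch, assuming the stochastic hash is uniform over the 64 hardware queues (the idealized best case; a biased hash collides even sooner):

```python
from math import prod

def p_hash_collision(flows, queues=64):
    """P(at least two flows hash to the same queue), uniform hash assumed."""
    if flows > queues:
        return 1.0  # pigeonhole: collision is certain
    return 1 - prod((queues - i) / queues for i in range(flows))

# Even a handful of elephant flows collide more often than intuition suggests:
print(round(p_hash_collision(10, 64), 3))  # already better than even odds
```

With only 10 large flows on 64 queues the collision probability is already above 50%, so two elephants sharing one hardware queue (and its FIFO) is the expected case, not a corner case.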
Re: [aqm] Notes
On Tue, Mar 4, 2014 at 5:53 PM, Scheffenegger, Richard r...@netapp.com wrote: First of all, thanks to the note takers. We've had quite some discussion around the AQM evaluation guideline draft, and I believe the notes capture many of the points brought up. If you got up and made a comment at the microphone, I would like you to check whether the spirit of your comment has been properly captured in the notes: http://etherpad.tools.ietf.org:9000/p/notes-ietf-89-aqm Not even close to what I said. Is it too much to request that for future meetings the proceedings be recorded and comments transcribed? The technology exists... Dave Taht: Care a lot about inter-flow packet loss. Bursty is really bad. Like to have a metric on inter flow loss This reminds me of an old far side joke. http://hubpages.com/hub/Gary-Larson#slide209782 Substitute packet loss for Ginger here. What I said was: I care a lot about interflow latency, jitter, and packet loss. Only bursty packet loss is really bad. I'd like to have a metric on interflow latency, jitter, and packet loss. Of these, packet loss is the *least* of my concerns. Our protocols recover from and compensate well for non-bursty packet loss, and packet loss IS the most common signal to tell protocols to slow down, and thus desirable... As an illustrative example, the cerowrt group has been working on ways to make aqm and packet scheduling technologies work well at rates well below 10Mbit, notably on the 768kbit uplinks common in the DSL world (which also has weird framing derived from the bad olde days of ATM). At below 100Mbit, TCP behavior is dominated by certain constants - notably the initial window, be it 3, 4 or 10, but also MTU * IWx in relation to MSS, availability of pacing on on/off traffic with a large cwnd, etc.
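That "weird framing" aside is worth quantifying, since it is why naive rate estimates misbehave on DSL uplinks. A rough sketch of AAL5-over-ATM cell math; the 8-byte AAL5 trailer is standard, but per-packet encapsulation overheads (PPPoE, LLC/SNAP, etc.) vary by link, so treat the exact numbers as illustrative:

```python
from math import ceil

ATM_CELL = 53      # bytes on the wire per ATM cell
ATM_PAYLOAD = 48   # usable payload bytes per cell
AAL5_TRAILER = 8   # AAL5 trailer, appended before padding to a cell boundary

def atm_wire_bytes(packet_bytes):
    """Bytes actually consumed on an ATM-framed DSL link by one packet."""
    cells = ceil((packet_bytes + AAL5_TRAILER) / ATM_PAYLOAD)
    return cells * ATM_CELL

# A full-size packet carries ~13% framing overhead; a bare TCP ack ~66%:
print(atm_wire_bytes(1500))  # 1696 bytes on the wire for a 1500-byte packet
print(atm_wire_bytes(64))    # 106 bytes on the wire for a 64-byte ack
```

On a 768kbit uplink that step function (every packet rounds up to whole 53-byte cells) is exactly why a shaper that counts IP bytes instead of cells systematically under-accounts, especially for ack-heavy traffic.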
There are a string of recent tests put up here: http://richb-hanover.com/ The first graph shows bufferbloat in all its glory on the link - well over 2secs of delay and goodput of about 1.6Mbits on the download. The remainder of the graphs are on variants of nfq_codel and fq_codel setups, but the core result was that after applying the cerowrt SQM system (scheduling and aqm) goodput was way, way up and latency way, way down compared to the bufferbloated alternative: nearly triple the download goodput, and 1/50th the latency. (The debate is over how best to get better interflow results, and the differences in results are not much above a percentage point) - packet loss on this link after applying AQM was well over 35%! But as it is not bursty, and latency is held low, the link remains markedly useful, all the flows work pretty well, and the low rate flows are doing well... Thread for ongoing discussion here: https://lists.bufferbloat.net/pipermail/cerowrt-devel/2014-February/002370.html Packet captures seem to show that Mac TCP is not reducing its window to a reasonable value, nor is it reducing MSS to something more appropriate for the link rate. I'd recommend looking at the packet captures from that test to get a feel for how slow start, fast recovery, and dup acks interact at these timescales. Packet loss, particularly when taken as a pure percentage, is not a good metric for most measurements. Most of the time, I don't give a rats arse about it. Richard Scheffenegger NetApp r...@netapp.com +43 1 3676811 3146 Office (2143 3146 - internal) +43 676 654 3146 Mobile www.netapp.com EURO PLAZA Gebäude G, Stiege 7, 3.OG Am Euro Platz 2 A-1120 Wien -- Dave Täht Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
Re: [aqm] Draft Agenda for IETF89
On Sat, Feb 15, 2014 at 7:10 AM, Michael Welzl mich...@ifi.uio.no wrote: 14:40 draft-fairhurst-ecn-motivation Gorry Fairhurst 15 min This is apparently not a published draft yet. It's draft-welzl-ecn-benefits, http://tools.ietf.org/html/draft-welzl-ecn-benefits-00 It describes the benefits of ECN persuasively and well. I would rather like a section discussing the negatives. -- Dave Täht Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
Re: [aqm] I-D Action: draft-ietf-aqm-recommendation-02.txt
On Thu, Feb 13, 2014 at 4:30 PM, Fred Baker (fred) f...@cisco.com wrote: Gorry and I have posted a second update to the AQM Recommendations draft discussed at IETF 88. This update mostly picks up nit-level matters. We, of course, invite review, and would suggest that reviews look at this version. A few nits. A) I have not bought into byte-pkt. I don't want to go into it today. In particular, I'd like the original pie benchmarks rerun now that that code doesn't have a byte-sensitive dropping mode, and the two compared. Perhaps that would shed some light on the issue. B) Another topic requiring consideration is the appropriate granularity of a flow when considering a queue management method. There are a few natural answers: 1) a transport (e.g. TCP or UDP) flow (source address/port, destination address/port, Differentiated Services Code Point - DSCP); 2) a source/destination host pair (IP addresses, DSCP); 3) a given source host or a given destination host. add: 4) a 5-tuple consisting of source ip/port, dest ip/port, proto. And we can hash it out later. C) We suggest that the source/destination host pair gives the most appropriate granularity in many circumstances. Back that up with measurements of real traffic from real homes and small businesses, and I'll believe you. Breaking up packet trains back into packets in sane ways is the only way to deal with the impact of iw10 at low bandwidths that I can think of, in particular. In the interim I would suggest language that waffles more as to appropriate methods. D) Traffic classes may be differentiated based on an Access Control List (ACL), the packet DiffServ Code Point (DSCP) [RFC5559], setting of the ECN field [RFC3168] [RFC4774] or an equivalent codepoint at a lower layer. Are you ruling out port number? I have no problem with (for example) deprioritizing port 873 (rsync) somewhat relative to other traffic. Same goes for some other well known ports... Are you ruling out protocol number? Destination address?
(stuff inside my network gets treated differently than stuff egressing) These are all common methods of classifying traffic that has codepoints that cannot be trusted. And regrettably, on inbound from another domain, diffserv values cannot be trusted, period. I don't know how to fit that into this draft, but a MUST regarding remarking inbound diffserv appropriately is needed. Right now I just quash everything inbound to BE. E) A malfunctioning or non-conforming network device may similarly hide an ECN mark. In normal operation such cases should be very uncommon. I disagree with the last sentence. ECN unleashed will be ECN abused. If the recent ntp flooding attacks were ECN marked, and ECN widely deployed, what would have happened? (I still strongly support the notion of ECN, but don't want to deprecate the dangers) A diff from IETF 88's version may be found at http://tools.ietf.org/rfcdiff?url1=http://tools.ietf.org/id/draft-ietf-aqm-recommendation-00.txt&url2=http://tools.ietf.org/id/draft-ietf-aqm-recommendation-02.txt which is also http://tinyurl.com/k9tfufm On Feb 13, 2014, at 1:20 PM, internet-dra...@ietf.org wrote: A New Internet-Draft is available from the on-line Internet-Drafts directories. This draft is a work item of the Active Queue Management and Packet Scheduling Working Group of the IETF. Title : IETF Recommendations Regarding Active Queue Management Authors : Fred Baker Godred Fairhurst Filename: draft-ietf-aqm-recommendation-02.txt Pages : 22 Date: 2014-02-13 Abstract: This memo presents recommendations to the Internet community concerning measures to improve and preserve Internet performance. It presents a strong recommendation for testing, standardization, and widespread deployment of active queue management (AQM) in network devices, to improve the performance of today's Internet.
It also urges a concerted effort of research, measurement, and ultimate deployment of AQM mechanisms to protect the Internet from flows that are not sufficiently responsive to congestion notification. The note largely repeats the recommendations of RFC 2309, updated after fifteen years of experience and new research. The IETF datatracker status page for this draft is: https://datatracker.ietf.org/doc/draft-ietf-aqm-recommendation/ There's also a htmlized version available at: http://tools.ietf.org/html/draft-ietf-aqm-recommendation-02 A diff from the previous version is available at: http://www.ietf.org/rfcdiff?url2=draft-ietf-aqm-recommendation-02 Please note that it may take a couple of minutes from the time of submission until the htmlized version and diff are available at tools.ietf.org. Internet-Drafts are also available by anonymous FTP at: ftp://ftp.ietf.org/internet-drafts/
[aqm] some comments I'd made on the cablelabs study privately
Since that study and test design were highly influential on the AQM requirements draft, I am going to publish here now what my comments were at the time, with a couple of updates here and there. I am not sure if any of the ns2 code from the last round made it out to the public? * Executive summary The Cablelabs AQM paper was the best simulation study of the effects of the new bufferbloat-fighting AQMs versus the common Internet traffic types in the home that has been created to date... However, it is only a study of half the edge network. It assumes throughout that there are no excessive latencies to be had from the CMTS side. Field measurements show latencies in excess of 1.8 seconds from the CMTS side, 300ms on Verizon gpon, and buffer sizes in the range of 64k to 512k on DSL in general, with some RED, SFQ, and SQF actually deployed there. So... while the focus has been on what is perceived as the larger problem, the cable modems themselves, downstream behavior was not studied, and the entire simulation was set to values that seem reasonable to ns2 modelers... and not seen in the real world. In the real world (RW), flows are almost always bidirectional. What happens on the downstream side affects the upstream side and vice versa, as per Van Jacobson's fountain analogy. Correctly compensating for bidirectional TCP dynamics is incredibly important. The second largest problem with the original cablelabs study is that it only analyzed traffic at one specific (although common) setting for cable operators: 20Mbits down and 5 Mbits up. A common, lower setting should be analyzed, as well as more premier services. Some tweaking of codel-derived technologies (flows and quantum), and of pie (alpha and beta), is indicated both at lower and higher bandwidths for optimum results. Additionally the effects of classification, notably of background traffic, have not been explored.
There are numerous other difficulties in the simulations and models that need to be understood in order to make good decisions moving forward. This document goes into more detail on those later. All the AQMs tested performed vastly better than standard FIFO drop tail as well as buffercontrol. They all require minimal configuration to work. With some configuration they can be made to work better. * Recommendations I'd made at the time ** Study be repeated using at least two more bandwidth settings ** More exact emulation of current CMTS behavior, based on real world measurements ** Addition of more traffic types, notably VPN and videoconferencing ** Improvements to the VOIP and web models ** Continued attempts at getting real world and simulated benchmarks to line up. My approach has been to follow the simulation work and try to devise real world benchmarks that are similar, and feed back the results into the ongoing simulation process. There are multiple limitations in this method, too, notably getting repeatable results, and doing large scale tests on customer equipment, both of which are subject to heisenbugs. * Issues in the cablelabs study ** Downstream behavior Tests with actual cablemodems in actual configurations show a significant amount of buffering on the downstream. At 20Mbits, DS buffering well in excess of 1 second has been observed. The effect of excessive buffering on this side has not been explored in these tests. Certain behaviors - TCP's burstiness as it opens its window to account for what it thinks is a long path - reflect interestingly on congestion avoidance on the downstream, and the effects on the upstream side of the pair are interesting too. I note that my own RW statistics were often very skewed by some very bad ack behavior on TSO offloads that had been a bug in Linux for years and was recently fixed. ** Web model *** The web model does not emulate DNS lookups.
Caching DNS forwarders are typically located on a gateway box (not sure about cablemodems ??), and the ISP locates a full DNS server nearby (within 10ms RTT). DNS traffic is particularly sensitive to delay, loss, and head of line blocking, and slowed DNS traffic stalls subsequent tcp connections, on sharded web traffic in particular.

*** The web model does no caching

A fairly large percentage (though not high enough) of websites make use of various forms of caching, ranging from marking whole objects as cacheable for a certain amount of time, to using the etags method to provide a checksum-like value for a conditional GET request. The former method eliminates a RTT entirely; the latter works well inside of a http 1.1 pipeline.

*** The web model does not use https

Establishing a secure http connection requires additional round trips.

*** The web model doesn't emulate tons of tabs

Web users, already highly interactive, now tend to have tons of tabs open, each on an individual web site, many of which are doing some sort of polling or interaction in the background against the remote web server. These benchmarks do not emulate this highly parallel background behavior.
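As a back-of-envelope illustration of why these omissions matter, here is a rough round-trip count for fetching one object. The counts are generic illustrations of mine, not part of the study's model:

```python
def fetch_rtts(tls=False, cached=False, dns_cached=True):
    """Rough round-trip count to fetch one web object.

    Illustrative only: real handshakes vary (TLS session resumption,
    HTTP keep-alive, pipelining, etc. all change the count).
    """
    if cached:
        return 0                    # fresh local copy: no request at all
    rtts = 0 if dns_cached else 1   # DNS lookup to a (nearby) resolver
    rtts += 1                       # TCP three-way handshake
    if tls:
        rtts += 2                   # TLS-1.2-style handshake
    rtts += 1                       # GET + response (or a 304 via etag)
    return rtts

print(fetch_rtts())                            # plain http, warm DNS
print(fetch_rtts(tls=True, dns_cached=False))  # https + cold DNS costs more
```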
Re: [aqm] Prefatory comments re draft-aqm-reccommendation and -evaluation, and a question
On Thu, Jan 23, 2014 at 1:10 AM, Fred Baker (fred) f...@cisco.com wrote:

No, you're not blowing smoke. I'm not sure I would compare the behavior to PMTUD, in that there the endpoint is given a magic number and manages to it, whereas in this case it is given the results of its behavior, and it manages to improve on that. But this is what I have rambled on about in threads relating to the size of a buffer. Folks would really like to have a magic way to calculate the buffer size (an amount of memory) they need to install in a router or switch, and it isn't that easy, because it has a lot to do with where in the network a system is located and how it is used by the applications that use it.

But AQM, in the end, isn't about buffer size. It is about buffer occupancy. In the ideal case, if there are N sessions active on the bottleneck link in a path, we would like each to obtain 1/N of the bottleneck's capacity, which is to say that each should be able to maximize its throughput while keeping an average of zero packets standing in the queue (minimizing both latency and variation in latency). If you know your math, you know that the ideal goal isn't actually achievable. But that doesn't stop us from trying to asymptotically approach it.

I prefer to think of the goal as to keep a minimum of 1 packet in the queue, not as an average of 0.

On Jan 17, 2014, at 3:51 PM, David Collier-Brown dave...@rogers.com wrote:

I've been reading through the internet-drafts, and one paragraph struck me as very illuminating. This is therefore a sanity-check before I go full-hog down a particular path... The comment is from Baker and Fairhurst, https://datatracker.ietf.org/doc/draft-ietf-aqm-recommendation/ and the paragraph is [emphases added]:

The point of buffering in the network is to absorb data bursts and to transmit them during the (hopefully) ensuing bursts of silence. This is essential to permit the transmission of bursty data.
Normally small queues are preferred in network devices, with sufficient queue capacity to absorb the bursts. The counter-intuitive result is that maintaining normally-small queues can result in higher throughput as well as lower end-to-end delay. In summary, queue limits should not reflect the steady state queues we want to be maintained in the network; instead, they should reflect the size of bursts that a network device needs to absorb.

All of a sudden we're talking about the kinds of queues I know a little about (:-))

---

I'm going to suggest that these are queues and associated physical buffers that do two things:

- hold packets that arrive at a bottleneck for as long as it takes to send them out a slower link than the one they came in on, and
- hold bursts of packets that arrive adjacent to each other until they can be sent out with a normal spacing, with some small amount of time between them

In an illustration of Dave Taht's, the first looks something like this:

[ASCII diagram: packets queueing at a choke-point where a fat pipe narrows into a thin one]

At the choke-point there is a buffer at least big enough to give the packets a chance to wheel from line into column (:-)) and start down the smaller pipe. The speed at which the acks come back, the frequency of drops, and any explicit congestion notifications slow the sender until it doesn't overload the skinnier pipe, thus spacing out the packets in the fatter pipe.

Various causes [Leland] can slow or speed the packets in the fat pipe, making it possible for several to arrive adjacent to each other, followed by a gap. The second purpose of a buffer is to hold these bursts while things space themselves back out. Buffers need to be big enough at minimum to do the speed matching, and at maximum big enough to spread a burst back into a normal progression, always assuming that acks, drops, and explicit congestion notifications are slowing the sender to the speed of the slowest part of the network.
---

If I'm right about this, we can draw some helpful conclusions:

- buffer sizes can be set based on measurements: speed differences, which are pretty static, plus observed burstiness
- drops and ECN can be done to match the slowest speed in the path

The latter suddenly sounds a bit like path MTU discovery, except it's a bit more dynamic, and varies with both the path and what's happening in various parts of it. To me, as a capacity/performance nerd, this sounds a lot more familiar and manageable.

My question to you, before I start madly scribbling on the internet drafts, is: Am I blowing smoke?

--dave
--
David Collier-Brown, | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dav...@spamcop.net | -- Mark Twain
(416) 223-8968
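David's conclusion that buffer sizes can be set from measured speed differences plus observed burstiness echoes the classic rules of thumb for sizing a bottleneck buffer. A sketch (the sqrt(N) refinement comes from the Stanford buffer-sizing work, not from this thread):

```python
import math

def buffer_bytes(rate_bps, rtt_s, n_flows=1):
    """Rule-of-thumb bottleneck buffer: one bandwidth-delay product,
    reduced by sqrt(N) when many desynchronized flows share the link
    (per Appenzeller et al.). A sizing sketch only - as Fred notes,
    AQM then manages *occupancy* within whatever buffer exists."""
    bdp_bytes = rate_bps * rtt_s / 8   # bandwidth-delay product in bytes
    return bdp_bytes / math.sqrt(n_flows)

# 20 Mbit/s with a 100 ms RTT: a 250 KB buffer for a single flow,
# shrinking to ~25 KB with 100 concurrent flows.
print(buffer_bytes(20e6, 0.1))
print(buffer_bytes(20e6, 0.1, n_flows=100))
```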
Re: [aqm] Prefatory comments re draft-aqm-reccommendation and -evaluation, and a question
On Thu, Jan 23, 2014 at 1:15 AM, Dave Taht dave.t...@gmail.com wrote:

On Thu, Jan 23, 2014 at 1:10 AM, Fred Baker (fred) f...@cisco.com wrote:

[quote of the preceding message trimmed]

> I prefer to think of the goal as to keep a minimum of 1 packet in the queue, not as an average of 0.

And that's not strictly true either. In the case of wifi, and other bundling technologies like those used in cable, you want to keep a minimum of a good aggregate of packets in the queue for that technology.
Re: [aqm] Text for aqm-recommendation on independent ECN config
For starters, codel's signaling delay, from the onset of continuous delay above the 5ms target, defaults to the 100ms interval, not 200ms. I don't know who started saying 200ms, but even I started believing it with the few brain cells I've had to spare of late. 5x a CDN rtt, in a world of 30-60k images, sounds about right. Secondly, codel drops/marks from the head of the queue, not the tail, so the signal gets back to the sender in 1/2 the real physical RTT after that, rather than from the tail of a queue that may be out of control at that point. Much faster than pie.

There has been so much misinformation spread of late on these threads. I'm hoping we're beginning to make a dent in it? I look forward to making all this clear in the upcoming RFCs. I think I should stop now, revisit the rest of this thread, and see what else can be cleared up before even beginning to tackle fq_codel, after I get caught up on sleep.

As for your other comments... I have always said deploy RED, and for that matter DRR, SFQ, or SQF, where you can. I distinctly remember polling the crowd at the first uknof I went to and being sad to discover only about 4% of the room had (4 people). I DO hold that RED is too hard to configure for ordinary mortals, and that it doesn't work at all on variable bandwidth links like cable or wireless, which happen to be the dominant forms of end-user link nowadays.

As for the hysteresis problem, in practice it doesn't seem to be much of a problem. Things get well under control before a web page completes. The same goes for my tests against DASH traffic. I have plenty of plots and traces of this; many are on the results webpage for bufferbloat.net.

As for a good default for interval, a good number IS dependent on your RTT, and without coupling the ingress and output queues it's difficult to determine or even auto-tune that. Perhaps with connection tracking or some other form of coupling, one day. The ACC code from the gargoyle router project is worth looking at.
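To pin down those numbers: codel acts only once the sojourn time has stayed above the 5ms target for a full 100ms interval, then head-drops (or marks) and schedules successive signals closer and closer together. A sketch of the published control law:

```python
import math

TARGET_S = 0.005    # 5 ms: the acceptable standing-queue delay
INTERVAL_S = 0.100  # 100 ms: on the order of a worst-case expected RTT

def next_drop_time(now_s, count):
    """After a head drop/mark, codel schedules the next one
    interval/sqrt(count) later, so the signaling rate accelerates
    for as long as the sojourn time stays above target."""
    return now_s + INTERVAL_S / math.sqrt(count)

# Drop spacing shrinks as the drop count climbs:
for count in (1, 4, 16):
    print(round(next_drop_time(0.0, count), 4))
```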
I am satisfied that fq_codel can be deployed on fixed rate lines without any tuning, on bandwidths ranging from 4mbit to 1gbit, today, as it stands. I have done hundreds of thousands of tests to prove that. Optimizations are helpful for the 3 band system that is mostly deployed today, such as smaller quantums on slow asymmetric links and a smaller packet limit on low memory routers. A larger target is working well on sub-4ms links; I think that could auto-tune better. A lower target and interval seem right for data center use, but I have yet to get anyone to run my suite of published tests.

A rate limiter is required to compensate for ISPs' lousy dslam/cmts/gpon head ends and CPE, at least until this code makes it onto those devices. Long lead times predominate on this sort of hardware - we have three years to get DOCSIS 3.1 right, as one example. These are second order problems that will be fixed over time. Wifi and wireless remain problematic, but dents in those problems seem imminent by next year, and many of the problems aren't aqm or packet scheduling ones.

SO, damn straight, I'm one of the people pushing for deployment, notably on boxes that are easy to upgrade and fix as we learn more about what we should be doing. I'm definitely reluctant to hard code stuff into big iron or hard-to-replace firmware as yet. But as Matt Mathis said at ietf - what we have is such an improvement over what is in place today that it is time to deploy. After almost 3 years of effort I'm happy to have a few million boxes in place to learn more from. Aren't you? We just have a couple billion boxes left to fix. Plenty of time to tweak things as we go along.

If you want RED, or ARED, in linux, it's been fixed now for 2 years to perform to the spec. Go for it. If you could create something to automate RED configuration, as I have for the ceroshaper tool in cerowrt, let me know. Any time someone has debloating code worth working on... I'm willing to help.
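The rate limiter mentioned above (an htb/tbf-style shaper in front of fq_codel, roughly what ceroshaper sets up) is at heart a token bucket. A minimal sketch, with made-up parameter values:

```python
class TokenBucket:
    """Minimal token-bucket shaper sketch: set the rate slightly below
    the ISP link speed so the queue (and hence the AQM) lives in the
    CPE rather than in the head-end gear."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last = 0.0

    def allow(self, pkt_len, now):
        # Refill tokens for elapsed time, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_len:
            self.tokens -= pkt_len
            return True   # transmit now
        return False      # hold back; let fq_codel manage the backlog

tb = TokenBucket(rate_bytes_per_s=2_500_000, burst_bytes=1514)  # ~20 Mbit
print(tb.allow(1514, now=0.0))   # full bucket: sent
print(tb.allow(1514, now=0.0))   # bucket empty: queued
```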
I've been helping on pie, and as you know I've been looking over your DCTCP experiment carefully, finding and fixing bugs, and moving the code forward to where it can be compared against a modern kernel, a modern TCP, and modern AQM and packet scheduling systems.

On Thu, Dec 12, 2013 at 4:05 PM, Bob Briscoe bob.bris...@bt.com wrote:

Dave,

At 22:11 12/12/2013, Dave Taht wrote:

> but quickly... Bob, I object to your characterization of users' links being busy 1-3% of the time. That's an average.

I said it was an average. You're repeating and agreeing with what I said, but saying you object to me saying it?

> When they are busy, they are very busy for short periods, typically 2-16 seconds in the case of web traffic, then idle for minutes. DASH traffic is busy for 2+ seconds every 10 on a 20mbit link, and so on, for 1.5 hours or so. Etc.

Yes, again, you're agreeing with me. The mean for a Web session is towards the low end of the 2-16 seconds range even now. And as we get the other latency-saving advances out