[Cerowrt-devel] Fwd: Dave's wishlist [was: Source-specific routing merged]

2015-03-17 Thread Dave Taht
-- Forwarded message --
From: Dave Taht 
Date: Tue, Mar 17, 2015 at 7:41 AM
Subject: Re: Dave's wishlist [was: Source-specific routing merged]
To: Juliusz Chroboczek 
Cc: "babel-us...@lists.alioth.debian.org" <
babel-us...@lists.alioth.debian.org>, Gabriel Kerneis ,
Steven Barth , Henning Rogge , Paul
McKenney , Felix Fietkau 


My quest is always for an extra "9" of reliability. Anyplace where you can
make something more robust (even if it is out at the .99 level), I
tend to like to do it in order to have the highest MTBF possible in
combination with all the other moving parts on the spacecraft (spaceship
earth).

One of the reasons why I love Paul McKenney so much is that he deeply cares
about stuff that happens only once in a billion times.

From this blog post of his: http://paulmck.livejournal.com/37782.html

"I quickly learned that the bug is difficult to reproduce, requiring
something like 100 hours of focused rcutorture testing. Bisection based on
100-hour tests would have consumed the remainder of 2014 and a significant
fraction of 2015, so something better was required. In fact, something *way*
better was required because there was only a very small number of failures,
which meant that the expected test time to reproduce the bug might well
have been 200 hours or even 300 hours instead of my best guess of 100
hours."

so, thus, I get picky on system daemons.

On Tue, Mar 17, 2015 at 4:45 AM, Juliusz Chroboczek <
j...@pps.univ-paris-diderot.fr> wrote:

> > 1) Did the issue with procd ever get resolved? (sighup I think it was)
>
> Gabriel, Steven?  Can procd be configured not to send SIGHUP, or shall
> I add an option to babeld to ignore it?  (Currently babeld terminates on
> SIGHUP, and I like it that way, since it prevents a babeld from sticking
> around after you log off.)
>

This basically logjammed on this issue: either procd needed to be modified
to be able to send an arbitrary signal, or babel changed to take SIGHUP as a
reload.

I'd done the simple patch to babel.

But I understand your use case also (stop routing on hup via remote
access), and have never poked into procd. Perhaps it could be changed to
take a var for the actual signal number it uses per daemon, but I will
argue that has overhead the openwrt devs would be loath to take. But: I
will look. Babel can't be the only thing that needs a different signal to
reload...

a third way out is just to patch babel for openwrt...


> > 2) got the new vars into openwrt, or shall I do?
>

rtt branch had 2 new vars as best as I recall, and I envisioned the babels
package being retired, which as best as I recall has extra vars like src-eq.

Secondly the command lines would get complex on me, and I figured just
re-writing the conf file was saner than command line args. Get a
sigWHATEVER, reload (or mmap) the conf file, checksum it against the
previous version, and do nothing if it didn't change.
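
A minimal sketch of that reload-and-checksum idea, in Python rather than babeld's C, and not taken from any actual babeld or procd code. The class name ConfReloader and the apply_config callback are hypothetical, and SHA-256 is just one reasonable choice of checksum.

```python
import hashlib
from pathlib import Path


class ConfReloader:
    """Re-read a config file on demand, skipping the work if it is unchanged."""

    def __init__(self, path, apply_config):
        self.path = Path(path)
        self.apply_config = apply_config  # hypothetical callback taking the file text
        self.last_digest = None           # checksum of the last applied version

    def reload(self):
        """Return True if the file changed and was applied, else False."""
        data = self.path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        if digest == self.last_digest:
            return False                  # same bytes as last time: do nothing
        self.apply_config(data.decode())
        self.last_digest = digest         # only remember versions that applied cleanly
        return True
```

A daemon would call reload() from whatever signal it settles on, for instance signal.signal(signal.SIGUSR1, lambda signum, frame: reloader.reload()), which leaves SIGHUP free to keep its terminate-on-logoff meaning.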

Thirdly, having an openwrt specific uci and/or ubus parser that could be
compiled in would be more reliable than a script, simpler and faster. I can
try to find funding for doing that... (in like 1.7's timeframe!!) I looked
over the libubus and libuci interfaces and staggered away confused.

Gabriel?
>
> > ecn
>
> No, since I don't understand why you think that setting ECN on Babel
> packets makes sense.  (It might make sense to set ECN on some Babel
> packets -- the ones that are marked as "urgent" -- but I'm interested in
> hearing your reasoning.)
>

fq_codel is the default on openwrt. ECN is enabled on that. Basic ECN
marking is 2 characters of new code. (tracking it harder, but that's
boilerplate code now). hnetd is presently very dumb about coalescing /60s
out of /64s...  I'd like to be trying much faster update schedules on
ethernet as per some of the discussion on homenet.

But, let me take this subject to another thread than this.


>
> > atomic route updates
>
> Ausgeschlossen.


Nothing is impossible.


> Last time I tried, I got a number of complaints that it
> broke operational networks.
>

The new FIB table patches that have landed in Linux 4.0 and later do some
odd things with RCU, and I am not sure those would be a good thing with
the present delete+add routes system that everything like quagga + babel
seems to use.

I'd written about it here while discussing the amazing new FIB patches (7x
reductions in lookup time or more), but was not aware that Henning had
actually got atomic route updates working.

http://lists.openwall.net/netdev/2015/03/11/136

So, perhaps autodetection of some sort here, also, would be of help.

And figuring out why it used to break.

And ooh! atomic route changes! no packet loss at all! Look at that extra 9!


> It's also less important than it used to be, since the hysteresis
> algorithm in 1.5.0 dramatically reduced the number of route switches --
> current versions of Babel should not be losing a measurable number of
> packets due to non-atomic switches.
>

How many 9s do you want?


>
> > IP

Re: [Cerowrt-devel] Fwd: Dave's wishlist [was: Source-specific routing merged]

2015-03-17 Thread David Lang

On Tue, 17 Mar 2015, Dave Taht wrote:


> My quest is always for an extra "9" of reliability. Anyplace where you can
> make something more robust (even if it is out at the .99 level), I
> tend to like to do it in order to have the highest MTBF possible in
> combination with all the other moving parts on the spacecraft (spaceship
> earth).


There are different ways to add reliability

one is to try and make sure nothing ever fails

the second is to have a way of recovering when things go wrong.


Bufferbloat came about because people got trapped into the first mode of 
thinking (packets should never get lost), when the right answer ended up being 
to realize that we have a recovery method and use it.


Sometimes trying to make sure nothing ever fails adds a lot of complexity to the 
code to handle all the corner cases, and the overall reliability will improve by 
instead simplifying the normal flow, even if that adds a small number of 
failures, if it means that you can have a common set of recovery code that's 
well exercised and tested.


As you are talking about losing packets with route changes, watch out that you 
don't fall into this trap.


David Lang
___
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel


[Cerowrt-devel] The next slice of cake

2015-03-17 Thread Jonathan Morton
After far too long, it looks like I’ll have the opportunity to work on sch_cake 
a bit more.  So here’s a little bit of a “state of the union” speech about what 
we’ve got and what I’m planning to add to it.

So far we’ve got a deficit-mode, non-bursting shaper that works pretty well, 
and an integrated implementation of fq_codel that tunes itself (that is, the 
target delay) to the bandwidth set on the shaper.  The configuration is “as 
easy as cake”; the intention is that you can just specify one parameter (the 
bandwidth to shape at) and leave everything else at the defaults; there simply 
aren’t very many visible knobs, because they aren’t needed.

We’ve also got Diffserv classification, and that part hasn’t been so 
successful.  Each class grabs all traffic with some subset of the codepoints, 
and stuffs them into a separate shaper+fq_codel instance, and the 
higher-priority shapers steal bandwidth from the lower ones to enforce 
priority.  High-priority classes can only use a limited amount of bandwidth, 
exactly as specified in generic Diffserv PHBs.

It works, perfectly as designed, but the resulting behaviour isn’t particularly 
desirable from an end-user perspective.  In particular, people run tests using 
best-effort traffic to see how much bandwidth they’re getting, resulting in 
complaints that cake had to be given a bigger number to get the correct 
throughput - which of course also stops it from functioning correctly when 
background traffic is added to the mix.  So that needed a rethink.

Incidentally, the existing Diffserv implementation can be disabled by 
specifying the “besteffort” keyword.  This lumps all traffic into a single 
class, handled by a single shaper at the configured rate.  Cake already works 
pretty well in that mode; sometimes I turn the shaper down to analogue-modem 
speeds and note, with some satisfaction, that everything *still* works.  Except 
YouTube, but that’s only because streaming video really does need more than 
analogue-modem bandwidth.

As for performance, I’m able to make my ancient Pentium-MMX shape at over 50 
Mbps, summing traffic in both directions between two bridged Fast Ethernet 
cards.  This limitation is probably a combination of timer latency and 
context-switch overhead.  I don’t expect it to improve much, unless we find a 
way to seriously reduce those overheads (which are already quite low for a 
modern desktop OS).  A faster machine with better timers gets better 
performance, of course.

So there are two big things I want to change in the next version:

The easy part (at least in terms of how many unknowns there are) is adjusting 
the flow-queueing part so that it uses set-associative hashing instead of 
straight hashing when selecting a queue.  This should reduce the incidence of 
hash collisions considerably for a given number of flow queues, or conversely 
provide equivalent collision performance with a smaller number of queues.
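
To illustrate the difference, here is a toy Python model (not cake's code) that counts how many flows are forced to share a queue under each scheme. The function names are hypothetical, each integer stands in for both a flow's identity and its hash value, and the way count is a free parameter rather than anything cake is known to use.

```python
import random


def direct_collisions(flow_hashes, n_queues):
    """Straight hashing: each flow maps to exactly one queue."""
    owner = {}
    collisions = 0
    for h in flow_hashes:
        q = h % n_queues
        if q in owner and owner[q] != h:
            collisions += 1              # forced to share with another flow
        else:
            owner[q] = h
    return collisions


def set_assoc_collisions(flow_hashes, n_queues, ways):
    """Set-associative: the hash picks a set, the flow may use any free way."""
    n_sets = n_queues // ways
    sets = [[] for _ in range(n_sets)]   # each list holds up to `ways` flow tags
    collisions = 0
    for h in flow_hashes:
        tags = sets[h % n_sets]
        if h in tags:
            continue                     # this flow already owns a queue
        if len(tags) < ways:
            tags.append(h)               # claim a free queue in this set
        else:
            collisions += 1              # every way is busy: must share
    return collisions


if __name__ == "__main__":
    random.seed(1)
    flows = random.sample(range(2**32), 256)
    print("direct:         ", direct_collisions(flows, 1024))
    print("8-way set-assoc:", set_assoc_collisions(flows, 1024, 8))
```

Two flows that land on the same index collide at once under straight hashing, while in the set-associative version a collision needs a whole set to fill up first, which is much rarer at the same total queue count.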

The more interesting part is to rework the Diffserv prioritiser so that it 
behaves more usefully.  I think I’ve hit upon the right idea which should make 
this work in practice - rather than individually hard-shaping each class, 
use each class’s shaper logic as a threshold function between high and low 
priority, and implement a single shaper to handle all traffic.  The 
priority function can then be handled by a weighted DRR system - which is 
already in place, but doesn’t do much - with just that small modification for 
changing the weights based on the shaper state.

So high-priority traffic gets high priority - but only if it limits itself to a 
reasonable bandwidth.  Above that bandwidth, it gets low priority, but is still 
able to use the full shaped bandwidth if nobody else contends for it.  And 
(unlike say HFSC) we need precisely two parameters per class to do this, both 
specified as ratios rather than hard bandwidth numbers: a bandwidth share 
(which determines both the shaper setting and the low-priority-mode DRR 
weighting) and a priority factor (which determines the high-priority-mode DRR 
weighting).  So if those knobs end up being exposed to userspace, they’ll be 
easier to understand and thus use correctly.
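
A rough Python sketch of how the two per-class parameters could drive a weighted DRR pass. This is not the planned cake implementation: the names TrafficClass and drr_round are hypothetical, and a simple byte-count window stands in for the real time-based shaper as the threshold.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TrafficClass:
    name: str
    share: float      # bandwidth share: threshold and low-priority DRR weight
    priority: float   # priority factor: DRR weight while within the share
    queue: deque = field(default_factory=deque)  # queued packet sizes in bytes
    sent: int = 0     # bytes sent in the current accounting window
    deficit: float = 0.0

    def weight(self, window_bytes):
        """Threshold function: within share -> priority weight, else share."""
        within_share = self.sent < self.share * window_bytes
        return self.priority if within_share else self.share


def drr_round(classes, window_bytes, quantum=1500):
    """One weighted-DRR pass; returns the (class name, size) pairs sent."""
    out = []
    for c in classes:
        if not c.queue:
            c.deficit = 0.0              # idle classes do not bank credit
            continue
        c.deficit += quantum * c.weight(window_bytes)
        while c.queue and c.queue[0] <= c.deficit:
            size = c.queue.popleft()
            c.deficit -= size
            c.sent += size
            out.append((c.name, size))
    return out


if __name__ == "__main__":
    voice = TrafficClass("voice", share=0.25, priority=4.0,
                         queue=deque([1500] * 10))
    bulk = TrafficClass("bulk", share=1.0, priority=1.0,
                        queue=deque([1500] * 10))
    for n in range(1, 6):
        print(n, drr_round([voice, bulk], window_bytes=12000))
```

In this toy run the voice class gets four packets out in the first round, then drops to its low weight once it exceeds its share, yet it still drains slowly instead of being hard-limited.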

All of this feeds my main goal with Diffserv, which is to start giving 
applications natural incentives to mark their traffic appropriately.  Each 
class has both an advantage, and a tradeoff which must be accepted to realise 
that advantage.  If you need absolutely minimal latency, you can choose a 
high-priority class, but you’ll have to be frugal about bandwidth.  If you need 
maximum throughput, you’ll have to put up with reduced priority compared to 
latency-sensitive traffic.  And if you want to be altruistic, you can choose to 
mark your stuff as bulk, background traffic, and it’ll be treated accordingly.  
All of this is in accordance with existing RFCs.

A small caveat: cake is not designed for wifi.  It’s designed for links that 
can at least be treated as full-duplex to a clos

Re: [Cerowrt-devel] Fwd: Dave's wishlist [was: Source-specific routing merged]

2015-03-17 Thread dpreed
I agree wholeheartedly with your point, David.

One other clarifying point (I'm not trying to be pedantic, here, but it may 
sound that way):

Reliability is not the same as Availability.  The two are quite different.

Bufferbloat is pretty much an "availability" issue, not a reliability issue.  
In other words, packets are not getting lost.  The system is just preventing 
desired use.

Availability issues can be due to actual failures of components, but there are 
lots of availability issues that are caused (as you suggest) by attempts to 
focus narrowly on "loss of data" or "component failures".

When you build a system, there is a temptation to apply what is called the 
Fallacy of Composition (look it up on Wikipedia for precise definition).  The 
key thing in the Fallacy of Composition is the mistaken belief that when a 
system of components has a property as a whole, every component of the system 
must by definition have that property.

(The end-to-end argument is a specific rule that is based on a recognition of 
the Fallacy of Composition in one case.)

We all know that there is never a single moment when any moderately large part 
of the Internet does not contain failed components.  Yet the Internet has 
*very* high availability - 24x7x365, and we don't need to know very much about 
what parts are failing.  That's by design, of course. And it is a design that 
does not derive its properties from a trivial notion of "proof of correctness", 
or even "bug freeness".

The relevance of a "failure" or even a "design flaw" to system availability is 
a matter of a much bigger perspective of what the system does, and what its 
users perceive as to whether they can get work done.




On Tuesday, March 17, 2015 3:30pm, "David Lang"  said:

> On Tue, 17 Mar 2015, Dave Taht wrote:
> 
>> My quest is always for an extra "9" of reliability. Anyplace where you can
>> make something more robust (even if it is out at the .99 level), I
>> tend to like to do it in order to have the highest MTBF possible in
>> combination with all the other moving parts on the spacecraft (spaceship
>> earth).
> 
> There are different ways to add reliability
> 
> one is to try and make sure nothing ever fails
> 
> the second is to have a way of recovering when things go wrong.
> 
> 
> Bufferbloat came about because people got trapped into the first mode of
> thinking (packets should never get lost), when the right answer ended up being
> to realize that we have a recovery method and use it.
> 
> Sometimes trying to make sure nothing ever fails adds a lot of complexity to
> the code to handle all the corner cases, and the overall reliability will
> improve by instead simplifying the normal flow, even if that adds a small
> number of failures, if it means that you can have a common set of recovery
> code that's well exercised and tested.
> 
> As you are talking about losing packets with route changes, watch out that
> you don't fall into this trap.
> 
> David Lang




Re: [Cerowrt-devel] The next slice of cake

2015-03-17 Thread Jonathan Morton

> On 17 Mar, 2015, at 22:34, Carlos R. Pasqualini wrote:
> 
> would you mind pointing me to a repository or download area and some
> docs about how to get it working and test its performance?
> 
> in a (too) fast (and lazy) search on Google I can't find anything
> 
> Here, I have only 3 DSL links with 3 Mbps bandwidth each, for approx. 300
> students' computers.

That certainly sounds like a situation where cake could help.

Dave Täht made patches a few months ago, based on linux-net-next, which are 
available here:

http://snapon.lab.bufferbloat.net/~d/codel_patches/new_codels.tgz

Those include *two* versions of cake, one of which is configured to use a 
different version of the Codel algorithm than the other.  The intent at the 
time was to compare those two versions against each other, but they also happen 
to have the most up-to-date version of cake.  It really has been a while since 
I’ve been able to work on it.

You’ll need to build the kernel with “sch_cake” or “sch_cake2” turned on.  If 
you copy over your existing kernel config and run “make oldconfig”, you should 
get asked about them (as well as other things).

You’ll also need a patched version of the iproute2 utilities to configure cake. 
 Patches here:

http://snapon.lab.bufferbloat.net/~d/codel_patches/iproute_patches.tgz

Then it’s as simple as running:

# tc qdisc replace dev ethX root cake besteffort bandwidth ${UPLINK}Kbps atm

That will take care of your outbound traffic, if you replace “ethX” and 
“${UPLINK}” with whatever is appropriate (and “cake” with “cake2” if you built 
that version).  If you have control of both ends of the link, then you can do 
the same thing to handle inbound traffic.

If you only have control of one end of the link, you’ll need to use ingress 
shaping to handle inbound traffic.  This is a little bit more complicated to 
set up (via an Intermediate Functional Block device) than the usual egress 
shaping, and has a couple of disadvantages, but it does work and does help:

# ifconfig ifb0 up
# tc qdisc replace dev ethX handle ffff: ingress
# tc filter add dev ethX parent ffff: protocol all u32 match u32 0 0 action \
    mirred egress redirect dev ifb0
# tc qdisc replace dev ifb0 root cake besteffort bandwidth ${DOWNLINK}Kbps atm

Both bandwidth figures (outbound and inbound) should be slightly below your 
actual link rates, to ensure that cake controls the bottleneck queue.  The 
“atm” flag is there to take account of ATM framing, which ADSL uses.  You can 
experiment with the precise rates without disrupting existing traffic flows:

# tc -s qdisc

(the above is to look up the correct handle figures to use below)

# tc qdisc change dev ethX handle N:M cake bandwidth ...
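
For a feel of why the “atm” flag and the "slightly below" margin matter, here is a back-of-the-envelope Python sketch. It is not cake's actual accounting. It relies only on the standard AAL5 layout (an 8-byte trailer, padding to whole 48-byte payloads, 53 bytes on the wire per cell); the encap_overhead parameter and the 5% default margin are assumptions to adjust for your own line.

```python
import math

ATM_CELL = 53      # bytes on the wire per ATM cell
ATM_PAYLOAD = 48   # payload bytes carried by each cell
AAL5_TRAILER = 8   # AAL5 trailer added before padding to whole cells


def atm_wire_bytes(packet_len, encap_overhead=0):
    """Bytes actually sent on an ATM link for one packet.

    encap_overhead is whatever your ISP's encapsulation adds per packet
    (assumed 0 here; the real value depends on the line).
    """
    cells = math.ceil((packet_len + encap_overhead + AAL5_TRAILER) / ATM_PAYLOAD)
    return cells * ATM_CELL


def shaper_rate_kbps(link_kbps, margin_percent=5):
    """A rate slightly below the link rate; 5% is only an assumed start."""
    return link_kbps * (100 - margin_percent) // 100


if __name__ == "__main__":
    for size in (40, 41, 576, 1500):
        print(size, "bytes ->", atm_wire_bytes(size), "bytes on the wire")
    print("shape a 3000 Kbps line at", shaper_rate_kbps(3000), "Kbps")
```

A 1500-byte packet needs 32 cells, so it occupies 1696 bytes of link time, and a packet just one byte over a cell boundary costs a whole extra cell. A shaper that ignores this will let the modem's own queue fill up on small-packet traffic.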

Have fun!

 - Jonathan Morton



Re: [Cerowrt-devel] The next slice of cake

2015-03-17 Thread Dave Taht
Now in mainline OpenWrt Chaos Calmer there is a version of sqm-scripts that
will allow for the use of cake or cake2, if you have iproute2 support for it
(which is not mainlined). It simplifies the sqm-scripts code
dramatically.

Our intent is to work closer to upstream openwrt in the future. The plan is
to have (for example) a kmod-cake package, which builds outside the
kernel tree, and a specialized iproute2 package specifically for
bufferbloat experiments, and in both cases have them built by the openwrt
build cluster and made available for all their platforms.

unfortunately this concept does not extend as easily to working on
mac80211, as yet.


Re: [Cerowrt-devel] DOCSIS 3+ recommendation?

2015-03-17 Thread Valdis . Kletnieks
On Mon, 16 Mar 2015 13:35:32 -0700, Matt Taggart said:
> Hi cerowrt-devel,
>
> My cable internet provider (Comcast) has been pestering me (monthly email
> and robocalls) to upgrade my cable modem to something newer. But I _like_
> my current one (no wifi, battery backup) and it's been very stable and can
> handle the data rates I am paying for. But they are starting to roll out
> faster service plans and I guess it would be good to have that option (and
> eventually they will probably boost the speed of the plan I'm paying for).
> So...
>
> Any recommendations for cable modems that are known to be solid and less
> bufferbloated?

I've been using the Motorola Surfboard SB6141 on Comcast with good results.
Anybody got a good suggestion on how to test a cablemodem for bufferbloat,
or what you can do about it anyhow (given that firmware is usually pushed
from the ISP side)?




Re: [Cerowrt-devel] DOCSIS 3+ recommendation?

2015-03-17 Thread David P. Reed
It is not the cable modem itself that is bufferbloated. It is the head end 
working with the cable modem. Docsis 3 has mechanisms to avoid queue buildup 
but they are turned on by the head end.

I don't know for sure but I believe that the modem itself cannot measure or 
control the queueing in the system to minimize latency.

You can use codel or whatever if you bound your traffic upward and stifle 
traffic downward. But that doesn't deal with the queueing in the link away from 
your home.

On Mar 17, 2015, valdis.kletni...@vt.edu wrote:
>On Mon, 16 Mar 2015 13:35:32 -0700, Matt Taggart said:
>> Hi cerowrt-devel,
>>
>> My cable internet provider (Comcast) has been pestering me (monthly email
>> and robocalls) to upgrade my cable modem to something newer. But I _like_
>> my current one (no wifi, battery backup) and it's been very stable and can
>> handle the data rates I am paying for. But they are starting to roll out
>> faster service plans and I guess it would be good to have that option (and
>> eventually they will probably boost the speed of the plan I'm paying for).
>> So...
>>
>> Any recommendations for cable modems that are known to be solid and less
>> bufferbloated?
>
>I've been using the Motorola Surfboard SB6141 on Comcast with good results.
>Anybody got a good suggestion on how to test a cablemodem for bufferbloat,
>or what you can do about it anyhow (given that firmware is usually pushed
>from the ISP side)?
>



Re: [Cerowrt-devel] DOCSIS 3+ recommendation?

2015-03-17 Thread Jonathan Morton
DOCSIS 3.1 mandates support for AQM (at minimum the PIE algorithm) in both
CPE and head end. If you can get hold of a D3.1 modem, you'll at least be
ready for the corresponding upgrade by your ISP.

Unfortunately I don't know which cable modems support which DOCSIS
versions, but it should be straightforward to look that up for any given
model.

- Jonathan Morton