On Fri, 25 Jul 2014 14:37:34 -0400, valdis.kletni...@vt.edu wrote:
On Sat, 24 May 2014 10:02:53 -0400, "R." said:

Further, this function could be auto-scheduled or made enabled on
router boot up.

Yeah, if such a thing worked, it would be good.

(Note in the following that a big part of my *JOB* is doing "What could
possibly go wrong?" analysis on mission-critical systems, which tends to color
my viewpoint on projects. I still think the basic concept is good, just
difficult to do, and am listing the obvious challenges for anybody brave
enough to tackle it... :)

I must be missing something important which prevents this. What is it?

There are a few biggies. The first is what the linux-kernel calls -ENOPATCH -
nobody's written the code.  The second is you need an upstream target someplace
to test against. You need to deal with both the "server is unavailable due
to a backhoe incident 2 time zones away" problem (which isn't *that* hard, just
default to Something Not Obviously Bad(TM)), and "server is slashdotted" (which
is a bit harder to deal with). Remember that there are some really odd corner
cases to worry about - for instance, if there's a power failure in a town, then
when the electric company restores power you're going to have every cerowrt box
hit the server within a few seconds - all over the same uplink most likely.  No
good data can result from that... (Holy crap, it's been almost 3 decades since
I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the
network at once when building power was restored).
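
One common way to soften the "everybody reboots at once" case is to add a
random delay before the boot-time test, and to fall back to a conservative
default when no measurement is available. The following is only a sketch of
that idea, not anything that exists in cerowrt; the function names, the delay
bounds and the fallback value are all hypothetical.

```python
import random

def startup_test_delay(min_delay_s=60.0, max_jitter_s=900.0, rng=random):
    """Seconds to wait after boot before running the bandwidth test.

    The fixed part lets the link settle; the random part spreads out
    routers that all came back up at the same moment. Both bounds are
    assumptions picked for illustration.
    """
    return min_delay_s + rng.uniform(0.0, max_jitter_s)

def choose_setting(measured_kbps, fallback_kbps):
    """Use the measurement if there is one, else the safe default.

    measured_kbps is None when the test server could not be reached
    (the "backhoe incident" case).
    """
    if measured_kbps is None:
        return fallback_kbps
    return measured_kbps
```

Jitter only helps with the synchronized-reboot problem; it does nothing for a
server that is overloaded for other reasons.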

And if you're in Izbekistan and the closest server netwise is at 60 Hudson, the
analysis to compute the correct values becomes.... interesting.

Dealing with non-obvious error conditions is also a challenge - a router may only boot once every few months. And if you happen to be booting just as a BGP routing flap is causing your traffic to take a vastly suboptimal path, you may end up encoding a vastly inaccurate setting and have it stuck there, causing suckage for non-obvious reasons for the non-technical, so you really don't want to enable auto-tuning unless you also have a good plan for
auto-*RE*tuning....

Have the router record its findings, and then repeat the test periodically, recording those findings as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough). Otherwise, if there aren't many samples, schedule a test 'soon'; if there are a lot of samples, schedule a test in a while.
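
A rough sketch of that retest policy, to make the branches concrete. The
thresholds, the intervals, the use of the median as the "prior" value, and the
name next_action are all assumptions chosen for illustration; nothing here is
existing cerowrt code.

```python
from statistics import median

SOON_S = 3600          # hypothetical "soon": one hour
LATER_S = 7 * 86400    # hypothetical "in a while": one week

def next_action(history, new_kbps, min_samples=5, differ=0.25, bad=0.50):
    """Return (setting_to_use_kbps, seconds_until_next_test).

    history holds earlier measurements in kbps (all positive).
    differ and bad are relative changes against the median of history.
    """
    if not history:
        # First measurement ever: use it, but confirm it soon.
        return new_kbps, SOON_S
    prior = median(history)
    change = abs(new_kbps - prior) / prior
    if change > bad:
        # Bad enough: keep the prior setting and retest soon.
        return prior, SOON_S
    if change > differ:
        # Substantially different: accept it for now, retest soon.
        return new_kbps, SOON_S
    if len(history) + 1 < min_samples:
        # Consistent, but not many samples yet.
        return new_kbps, SOON_S
    # Consistent with plenty of samples.
    return new_kbps, LATER_S
```

The caller would append new_kbps to history after each call. Whether a single
outlier should be stored at all is a separate question this sketch does not
answer.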

However, I think the big question is how much the tuning is required.

If a connection with BQL and fq_codel is 90% as good as a tuned setup, default to untuned unless the user explicitly hits a button to measure (and then a second button to accept the measurement).

If BQL and fq_codel by default are 70% as good as a tuned setup, there's more space to argue that all setups must be tuned, but then the question is how do they fare against an old, non-BQL, non-fq_codel setup? If they are considerably better, it may still be worthwhile.

David Lang
_______________________________________________
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel
