Re: [Bloat] Congestion control with FQ-Codel/Cake with Multicast?

2024-05-24 Thread Toke Høiland-Jørgensen via Bloat
Linus Lüssing via Bloat  writes:

> I was wondering, have there been any tests with multicast on a
> recent OpenWrt with FQ-Codel or Cake, do these queueing mechanisms
> work as a viable, fair congestion control option for multicast,
> too? Has anyone looked at / tested this?

FQ-CoDel or CAKE on the interface probably wouldn't help much (there's a
reason we put it into the WiFi stack instead of the qdisc layer).

However, AQL in the WiFi stack could control the multicast/broadcast
queue. It does not currently do so, but Felix sent an RFC patch to
enable this back in February, so it may turn up at some point in
the future in a WiFi stack near you:

https://lore.kernel.org/r/20240209184730.69589-1-...@nbd.name

-Toke
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] The sad state of MP-TCP

2024-04-02 Thread Toke Høiland-Jørgensen via Bloat
Juliusz Chroboczek  writes:

>>> There should be a knob in the kernel to transparently replace TCP with
>>> MP-TCP, but I couldn't find one.
>
>> There is, sorta. Specifically, a BPF hook that can override the protocol
>> (added in kernel 6.6):
>> 
>> https://lore.kernel.org/all/cover.1692147782.git.geliang.t...@suse.com/
>
> So we're no longer doing sysctls, we're now monkey patching system
> calls? I guess if it works for JavaScript, why shouldn't it work for
> the kernel.

If it helps you sleep at night, you can think of it more as a lisp
machine than a javascript runtime ;)

You're not the first to make the comparison, though:

https://dl.acm.org/doi/abs/10.1145/3609021.3609306

-Toke


Re: [Bloat] The sad state of MP-TCP

2024-04-01 Thread Toke Høiland-Jørgensen via Bloat
Juliusz Chroboczek via Bloat  writes:

> Unfortunately, MP-TCP does not replace TCP in Linux, it's implemented as
> a separate transport protocol.  That means that in order to use MP-TCP,
> every application needs to be patched to use PROT_MPTCP instead of
> PROT_TCP.  Go applications need to call SetMultipathTCP(true) on every
> net.Conn and every net.Listener.
>
> There should be a knob in the kernel to transparently replace TCP with
> MP-TCP, but I couldn't find one.

There is, sorta. Specifically, a BPF hook that can override the protocol
(added in kernel 6.6):

https://lore.kernel.org/all/cover.1692147782.git.geliang.t...@suse.com/
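For reference, the per-application opt-in that such a hook would make unnecessary looks roughly like this (a hedged sketch of my own, not anyone's production code; on Linux the constant is IPPROTO_MPTCP, protocol number 262, and Python only exposes it where the system headers define it):

```python
import socket

# Linux protocol number for MPTCP; socket.IPPROTO_MPTCP only exists on
# builds whose system headers define it, so fall back to the raw number.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)

def stream_socket():
    """Prefer an MPTCP socket, falling back to plain TCP when the
    kernel (or libc) lacks MPTCP support."""
    try:
        return socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_MPTCP)
    except OSError:
        return socket.socket(socket.AF_INET, socket.SOCK_STREAM)
```

This is exactly the kind of boilerplate that otherwise has to be patched into every application.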

-Toke


Re: [Bloat] The Confucius queue management scheme

2024-02-14 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Morton  writes:

>> On 10 Feb, 2024, at 7:05 pm, Toke Høiland-Jørgensen via Bloat 
>>  wrote:
>> 
>> This looks interesting: https://arxiv.org/pdf/2310.18030.pdf
>> 
>> They propose a scheme to gradually let new flows achieve their fair
>> share of the bandwidth, to avoid the sudden drops in the available
>> capacity for existing flows that can happen with FQ if a lot of flows
>> start up at the same time.
>
> I took some time to read and think about this.
>
> The basic idea is delightfully simple: "old" flows have a fixed weight
> of 1.0; "new" flows have a weight of (old flows / new flows) *
> 2^(k*t), where t is the age of the flow and k is a tuning constant,
> and are reclassified as "old" flows when this quantity reaches 1.0.
> They also describe a queuing mechanism which uses these weights, which
> while mildly interesting in itself, isn't directly relevant since a
> variant of DRR++ would also work here.
>
> I noticed four significant problems, three of which arise from
> significant edge cases, and the fourth is an implementation detail
> which can easily be remedied. I didn't see any discussion of these
> edge cases in the paper, only the implementation detail. The latter is
> just a discretisation of the exponential function into doubling
> epochs, probably due to an unfamiliarity with fixed-point arithmetic
> techniques. We can ignore it when thinking about the wider design
> theory.
>
> The first edge case is already fatal unless somehow handled: starting
> with an idle link, there are no "old" flows and thus the numerator of
> the equation is zero, resulting in a zero weight for any number of new
> flows which then arise. There are several reasonable and quite trivial
> ways to handle this.
>
> The second edge case is the dynamic behaviour when "new" flows
> transition to "old" ones. This increases the numerator and decreases
> the denominator for other "new" flows, causing a cascade effect where
> several "new" flows of similar but not identical age suddenly become
> "old", and younger flows see a sudden jump in weight, thus available
> capacity. This would become apparent in realistic traffic more easily
> than in a lab setting. A formulation which remains smooth over this
> transition would be preferable.
>
> The third edge case is that there is no described mechanism to remove
> flows from the "old" set when they become idle. Most flows on the
> Internet are in practice short, so they might even go permanently idle
> before leaving the "new" set. If not addressed, this becomes either a
> memory leak or a mechanism for the flow hash table to rapidly fill up,
> so that in practice all flows are soon seen as "old". The DRR++
> mechanism doesn't suffice, because the state in Confucius is supposed
> to evolve over longer time periods, much longer than the sojourn time
> of an individual packet in the queue.
>
> The basic idea is interesting, but the algorithmic realisation of the
> idea needs work.

Thank you for taking a detailed look! I think you're basically echoing
my immediate sentiment when reading this: neat idea, not quite convinced
about the implementation details. But I didn't spend enough time
thinking about it to express the problems in such concrete detail, so
thank you for doing that! :)
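For concreteness, here is the weighting rule as I read it from the paper, with a trivial guard for the idle-link edge case you point out (a sketch of my own; the authors haven't released code, and the function name, the cap, and the guard are all mine):

```python
def new_flow_weight(n_old, n_new, t, k=1.0):
    """Confucius weight for a 'new' flow of age t seconds:
    (old flows / new flows) * 2**(k*t), capped at 1.0, at which point
    the flow is reclassified as 'old'. The max(n_old, 1) guard avoids
    the fatal zero-weight edge case on an otherwise idle link."""
    if n_new <= 0:
        return 1.0
    return min(1.0, (max(n_old, 1) / n_new) * 2 ** (k * t))
```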

-Toke


[Bloat] The Confucius queue management scheme

2024-02-10 Thread Toke Høiland-Jørgensen via Bloat
This looks interesting: https://arxiv.org/pdf/2310.18030.pdf

They propose a scheme to gradually let new flows achieve their fair
share of the bandwidth, to avoid the sudden drops in the available
capacity for existing flows that can happen with FQ if a lot of flows
start up at the same time.

No code available, unfortunately, although they promise to release it
later...

-Toke


Re: [Bloat] [Cake] Two questions re high speed congestionmanagement anddatagram protocols

2023-06-28 Thread Toke Høiland-Jørgensen via Bloat
"David P. Reed via Bloat"  writes:

> (One such nightmare can be seen in LKML... Search for
> dpr...@deepplum.com patch emails. I tried hard, was worn down, and
> eventually gave up after a year, since I had found a way to avoid the
> bug in virtualization code on x86. Life is too short.)

Went looking, since I think it's important to learn from such process
failures (and you're certainly not the first to opine that getting
patches into the kernel is challenging).

I'm assuming you're referring to this series, right?

https://lore.kernel.org/all/20200704203809.76391-4-dpr...@deepplum.com/

Which, to me, looks like it was pretty close to being accepted; another
revision would probably have made the cut...

...In fact it seems those patches were later resurrected by Sean as part
of this series, six months later:

https://lore.kernel.org/all/20201231002702.2223707-1-sea...@google.com/

One of the patches retained your authorship and made it into the kernel
in this commit:

https://git.kernel.org/torvalds/c/5364a305


So I wouldn't necessarily call that a complete failure :) It seems the
main process issue you hit here was a subsystem that was too resource
constrained on the review side, which sadly does happen. The kernel
process tends to heavily favour the convenience of reviewers for that
same reason (which is part of why it can be off-putting to outsiders,
so it's a bit of a chicken-and-egg situation...)

-Toke


Re: [Bloat] PhD thesis with results related to buffering needs on variable-capacity links

2023-01-03 Thread Toke Høiland-Jørgensen via Bloat
Bjørn Ivar Teigen via Bloat  writes:

> Hi everyone,
>
> I defended my PhD in December. I hope some of the results are interesting
> to the bufferbloat community.
>
> The title is "Opportunities and Limitations in Network Quality
> Optimization: Quality Attenuation Models of WiFi Network Variability"
> Full text here: https://www.duo.uio.no/handle/10852/98385

Congratulations! Looks interesting, I look forward to looking at it in
more detail :)

-Toke


Re: [Bloat] [bbr-dev] Aggregating without bloating - hard times for tcp and wifi

2022-11-23 Thread Toke Høiland-Jørgensen via Bloat
Bob McMahon  writes:

> Does the TSQ code honor no-aggregation per voice access class or
> TCP_NODELAY where the app making the socket write calls knows that the WiFi
> aggregation isn't likely helpful? Sorry, my Linux stack expertise is quite
> limited.

TSQ only influences the buffering in the TCP layer. The WiFi stack will
still limit aggregation using its own logic (I think it turns it off
entirely for voice?). TCP_NODELAY is also orthogonal to TSQ; TSQ only
kicks in when there's a bunch of data buffered, in which case
TCP_NODELAY has no effect...

-Toke


Re: [Bloat] [bbr-dev] Aggregating without bloating - hard times for tcp and wifi

2022-11-22 Thread Toke Høiland-Jørgensen via Bloat
Neal Cardwell via Bloat  writes:

> On Tue, Nov 22, 2022 at 2:43 PM 'Bob McMahon' via BBR Development <
> bbr-...@googlegroups.com> wrote:
>
>> Thanks for sharing this. Curious about how the xTSQ value can be set? Can
>> it be done with sysctl?
>>
>> *We continue our analysis by using the ms-version of TSQ patch, which
>> enables the tune of the TSQ size allowing each TCP variant to enqueue more
>> than 1 ms of data at the current TCP rate. In particular, we allow to
>> enqueue the equivalent of x ms of data, naming each test xTSQ, with x being
>> an integer value. It is important to notice that this patch has been
>> included in the Linux kernel mainline, and each Wi-Fi driver can now set
>> the desired xTSQ value**.*
>>
>
> I believe they are setting the xTSQ value using the sk_pacing_shift field,
> which was added here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a9b76fd0db9f0d426533f96a68a62a58753a51e
>
> AFAIK the intent is only for drivers to set that, and there's no sysctl for
> that, but of course you could add a sysctl for testing if you wanted.
> :-)

Yup, indeed this is what mac80211 fiddles with:
https://elixir.bootlin.com/linux/latest/source/net/mac80211/main.c#L739
https://elixir.bootlin.com/linux/latest/source/net/mac80211/tx.c#L4156

AFAICT, no in-tree drivers override the value set by mac80211.
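Roughly, the knob works like this (a simplified model of my own, not kernel code): TSQ caps per-socket queued data at about sk_pacing_rate >> sk_pacing_shift bytes, i.e. the amount of data sent in 2^-shift seconds, so the default shift of 10 corresponds to ~1 ms of data and mac80211 lowers the shift to allow larger aggregates.

```python
def tsq_budget_bytes(pacing_rate_Bps, sk_pacing_shift=10):
    """Approximate TSQ per-socket budget: rate >> shift bytes, i.e.
    the data transmitted in 2**-shift seconds. shift=10 is ~1 ms of
    data; a smaller shift means more buffering (what mac80211 does
    to keep WiFi aggregation efficient). Simplified model only."""
    return pacing_rate_Bps >> sk_pacing_shift

# 1 Gbit/s pacing rate = 125_000_000 bytes/s
budget_1ms = tsq_budget_bytes(125_000_000, 10)  # ~122 KB (~1 ms)
budget_4ms = tsq_budget_bytes(125_000_000, 8)   # ~488 KB (~4 ms), shift 8 as an example
```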

I believe the tests in that paper were conducted with this series
applied:
https://lore.kernel.org/all/20180105113256.14835-1-natale.patricie...@gmail.com/

-Toke


Re: [Bloat] Up-to-date buffer sizes?

2022-03-09 Thread Toke Høiland-Jørgensen via Bloat
Michael Menth  writes:

> Hi all,
>
> are there up-to-date references giving evidence about typical buffer 
> sizes for various link speeds and technologies?

Heh. There was a whole workshop on it a couple of years ago; not sure if
it concluded anything: http://buffer-workshop.stanford.edu/program/

But really, asking about buffer sizing is missing the point; if you have
static buffers with no other management (like AQM and FQ) you're most
likely already doing it wrong... :)
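For context, the classic rule of thumb the sizing literature argues about is the bandwidth-delay product (one RTT's worth of data). A quick sketch with illustrative numbers of my own choosing:

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth-delay product: the traditional static buffer sizing
    rule of thumb. The point above stands regardless: a static buffer
    of any size still wants AQM/FQ on top of it."""
    return int(rate_bps * rtt_s / 8)

# 100 Mbit/s link, 50 ms RTT -> 625,000 bytes by the old rule
example = bdp_bytes(100_000_000, 0.05)
```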

-Toke


Re: [Bloat] [Cerowrt-devel] uplink bufferbloat and scheduling problems

2021-12-01 Thread Toke Høiland-Jørgensen via Bloat
"Valdis Klētnieks"  writes:

> On Wed, 01 Dec 2021 13:09:46 -0800, David Lang said:
>
>> with wifi where you can transmit multiple packets in one airtime slot, you 
>> need 
>> enough buffer to handle the entire burst.
>
> OK, I'll bite... roughly how many min-sized or max-sized packets can you fit
> into one slot?

On 802.11n, 64kB; on 802.11ac, 4MB(!); on 802.11ax, no idea - the same as 
802.11ac?
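Converting those byte limits into the packet counts Valdis asked about is straightforward arithmetic (this sketch ignores per-MPDU framing overhead, so the counts are upper bounds; the function name is mine):

```python
def pkts_per_aggregate(agg_bytes, pkt_bytes):
    """Upper bound on packets of a given size per aggregate,
    ignoring aggregation headers and padding."""
    return agg_bytes // pkt_bytes

# 802.11n aggregate limit: 64 kB; 802.11ac: 4 MB
n_max  = pkts_per_aggregate(64 * 1024, 1500)        # ~43 max-size packets
ac_max = pkts_per_aggregate(4 * 1024 * 1024, 1500)  # ~2796 max-size packets
n_min  = pkts_per_aggregate(64 * 1024, 64)          # ~1024 64-byte packets
```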

-Toke


Re: [Bloat] Beyond Bufferbloat: End-to-End Congestion Control Cannot Avoid Latency Spikes

2021-11-02 Thread Toke Høiland-Jørgensen via Bloat
Bjørn Ivar Teigen  writes:

> Hi everyone,
>
> I've recently published a paper on Arxiv which is relevant to the
> Bufferbloat problem. I hope it will be helpful in convincing AQM doubters.
> Discussions at the recent IAB workshop inspired me to write a detailed
> argument for why end-to-end methods cannot avoid latency spikes. I couldn't
> find this argument in the literature.
>
> Here is the Arxiv link: https://arxiv.org/abs/2111.00488

I found this a very approachable paper expressing a phenomenon that
should be no surprise to anyone on this list: when flow rate drops,
latency spikes.

> A direct consequence is that we need AQMs at all points in the internet
> where congestion is likely to happen, even for short periods, to mitigate
> the impact of latency spikes. Here I am assuming we ultimately want an
> Internet without lag-spikes, not just low latency on average.

This was something I was wondering when reading your paper. How will
AQMs help? When the rate drops the AQM may be able to react faster, but
it won't be able to affect the flow xmit rate any faster than your
theoretical "perfect" propagation time...

So in effect, your paper seems to be saying "a flow that saturates the
link cannot avoid latency spikes from self-congestion when the link rate
drops, and the only way we can avoid this interfering with *other* flows
is by using FQ"? Or?

Also, another follow-on question that might be worth looking into is
short flows: Many flows fit entirely in an IW, or at least never exit
slow start. So how does that interact with what you're describing? Is it
possible to quantify this effect?

-Toke


Re: [Bloat] Little's Law mea culpa, but not invalidating my main point

2021-07-09 Thread Toke Høiland-Jørgensen via Bloat
"Holland, Jake via Bloat"  writes:

> Hi David,
>
> That’s an interesting point, and I think you’re right that packet
> arrival is poorly modeled as a Poisson process, because in practice
> packet transmissions are very rarely unrelated to other packet
> transmissions.
>
> But now you’ve got me wondering what the right approach is. Do you
> have any advice for how to improve this kind of modeling?

I actually tried my hand at finding something better for my master's
thesis and came across something called a Markov-Modulated Poisson
Process (MMPP/D/1 queue)[0]. It looked promising, but unfortunately I
failed to make it produce any useful predictions. Most likely this was
as much a result of my own failings as a queueing theorist as it was the
fault of the model (I was in way over my head by the time I got to that
model); so I figured I'd mention it here in case anyone more qualified
would have any opinion on it.

I did manage to get the Linux kernel to produce queueing behaviour that
resembled that of a standard M/M/1 queue (if you squint a bit); all you
have to do is to use a traffic generator that emits packets with the
distribution the model assumes... :)
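For anyone wanting to repeat that kind of sanity check, the closed-form M/M/1 steady-state results are simple enough to compare a measurement against (just the textbook formulas, wrapped in a helper of my own):

```python
def mm1_stats(lam, mu):
    """Steady-state M/M/1 metrics for Poisson arrivals at rate lam and
    exponential service at rate mu (both in pkts/s; requires lam < mu).
    Returns (utilization, mean packets in system, mean sojourn time);
    the last two are tied together by Little's law, L = lam * W."""
    assert lam < mu, "queue is unstable"
    rho = lam / mu
    L = rho / (1 - rho)   # mean number in system
    W = 1 / (mu - lam)    # mean time in system, seconds
    return rho, L, W
```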

The full thesis is still available[1] for the perusal of the morbidly curious.

-Toke

[0] https://www.sciencedirect.com/science/article/abs/pii/016653169390035S
[1] https://rucforsk.ruc.dk/ws/files/57613884/thesis-final.pdf


[Bloat] Fwd: [ncc-announce] [news] Apply now for the RIPE NCC Community Projects Fund

2021-06-16 Thread Toke Høiland-Jørgensen via Bloat
Figured this might be of interest to some people here :)

-Toke

--- Begin Message ---
Hello,

For the fifth year in a row, the RIPE NCC is committing to support activities 
and projects for 'the good of the Internet'. We are pleased to invite 
applications for funding from the RIPE NCC Community Projects Fund. 

Through the Community Projects Fund, the RIPE NCC will provide up to EUR 
250,000 per year to support projects of value to the operation, resilience and 
sustainability of the Internet, with a focus on tools and services benefitting 
the technical community in Europe, the Middle East and Central Asia. 

The Internet today owes its success to a lot of non-commercial, 
community-driven work, and this continues to ensure the Internet remains 
resilient, secure and accessible. If you or your organisation are working on a 
new or existing project that fits the criteria below and need some financial 
support to get it off the ground or keep it going, we want to hear from you!

Is your project of benefit to the Internet, particularly the RIPE community?
Is your project non-commercial in nature? 
Do you have a clear project plan and timeline? 

Some things to keep in mind:

- Funding may be used to purchase equipment, but this cannot be the sole 
expenditure
- Funding cannot be used for humanitarian aid, donations, or to encourage 
political reform
- Funding cannot be used to provide scholarships or cover tuition fees
- Funding cannot be used to support any form of commercial activity 

The call for applicants will remain open until 23:59 (UTC), Saturday, 31 July 
2021. You can find more details on the RIPE NCC Community Projects Fund page:
https://www.ripe.net/support/cpf

We also welcome Samaneh Tajalizadehkhoob and Flavio Luciani to the selection 
committee. The selection committee for 2021 consists of Jaya Baloo, Bert 
Hubert, Flavio Luciani, Samaneh Tajalizadehkhoob, Tim Wattenberg and Remco van 
Mook (RIPE NCC Executive Board representative). Thank you to the selection 
committee for volunteering their time to support the Community Projects Fund.

Best regards,
Alastair Strachan
External Relations Officer
RIPE NCC

--- End Message ---


Re: [Bloat] Fwd: Traffic shaping at 10~300mbps at a 10Gbps link

2021-06-07 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Morton  writes:

>> On 7 Jun, 2021, at 8:28 pm, Rich Brown  wrote:
>> 
>> Saw this on the lartc mailing list... For my own information, does anyone 
>> have thoughts, esp. for this quote:
>> 
>> "... when the speed comes to about 4.5Gbps download (upload is about 
>> 500mbps), chaos kicks in. CPU load goes sky high (all 24x2.4Ghz physical 
>> cores above 90% - 48x2.4Ghz if count that virtualization is on)..."
>
> This is probably the same phenomenon that limits most cheap CPE
> devices to about 100Mbps or 300Mbps with software shaping, just on a
> bigger scale due to running on fundamentally better hardware.
>
> My best theory to date on the root cause of this phenomenon is a
> throughput bottleneck between the NIC and the system RAM via DMA,
> which happens to be bypassed by a hardware forwarding engine within
> the NIC (or in an external switch chip) when software shaping is
> disabled. I note that 4.5Gbps is close to the capacity of a single
> PCIe v2 lane, so checking the topology of the NIC's attachment to the
> machine might help to confirm my theory.
>
> To avoid the problem, you'll either need to shape to a rate lower than
> the bottleneck capacity, or eliminate the unexpected bottleneck by
> implementing a faster connection to the NIC that can support
> wire-speed transfers.

I very much doubt this has anything to do with system bottlenecks. We
hit the PCIe bottleneck when trying to push 100Gbit through a server; 5
Gbps is trivial for a modern device.

Rather, as Jesper pointed out this sounds like root qdisc lock
contention...

-Toke


[Bloat] Freenode IRC kerfuffle

2021-05-30 Thread Toke Høiland-Jørgensen via Bloat
Hi everyone

In case you haven't noticed, there's been quite a kerfuffle around the
Freenode IRC network. As always, LWN has excellent coverage[0]. This has
resulted in quite an exodus from Freenode, and since some people were
hanging on the #bufferbloat channel there, I figured I'd mention it on
this list as well.

People seem to be migrating to either irc.oftc.net or the new
libera.chat network set up by the old Freenode admins. OpenWrt picked
OFTC[1], so I registered #bufferbloat there as well, although I seem to
be the only person there. There also seem to be a few people hanging out
in #bufferbloat on Libera.

Anyway, the above just FYI in case any of y'all are still using IRC and
want a place to talk about bloat there :)

-Toke

[0] https://lwn.net/SubscriberLink/857140/68d7cc81b184fa26/
[1] https://openwrt.org/irc


Re: [Bloat] Netperf re-licensed as MIT

2021-03-29 Thread Toke Høiland-Jørgensen via Bloat
Aaron Wood  writes:

> iperf3 isn’t “academic”, but is more focused on scientific computing (ESNet
> pushes a LOT of CERN data around, on 100Gbps backbones).
>
> But that also skews their usage/needs.  Very high throughput bulk transfers
> with long durations, over mixed systems. Not as many concerns about
> latency, except in that latency can cause messes with congestion
> control.

Yeah, I'm not too concerned about the code quality of iperf either - if
it ever reaches (rough) feature parity with netperf (the list I posted
up-thread) I'm quite happy to turn it into an automatic fallback for
netperf in Flent, the same way we do with some of the other tools...

-Toke


Re: [Bloat] Netperf re-licensed as MIT

2021-03-29 Thread Toke Høiland-Jørgensen via Bloat
Aaron Wood  writes:

> One of my long concerns with the RRUL test is that the ICMP ping test
> portion is not isochronous

That would be the UDP_RR test, you mean (ICMP is isochronous)? Yeah,
that is a bit annoying, but as Dave says if irtt is available, Flent
will use that, and that *is* isochronous :)

-Toke


Re: [Bloat] Netperf re-licensed as MIT

2021-03-28 Thread Toke Høiland-Jørgensen via Bloat
Dave Taht  writes:

> so glad to hear the license has been fixed.
>
> carl, iperf is only used in a few of flent's tests. We trusted netperf
> - as did the linux kernel developers - a lot further than all the
> iperf variants combined - at the time we started work on flent.

And, more importantly, netperf provides a lot of features that iperf
doesn't; such as dumping TCP info, setting congestion control, etc -
full list here:

https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-January/002648.html

-Toke


[Bloat] Netperf re-licensed as MIT

2021-03-26 Thread Toke Høiland-Jørgensen via Bloat
Hopefully this means we can get it packaged for the distros that have
thus far refused to because of the license - i.e., Debian and Fedora!

-Toke


Re: [Bloat] Trouble Installing PPing in MacOS

2021-02-27 Thread Toke Høiland-Jørgensen via Bloat
Jason Iannone  writes:

> Ideally, always on monitoring and export at the PE. An incremental first
> step in an adhoc off box tester, maybe deployed with the perfsonar suite.

Right. Are any of those boxes Linux-based? Otherwise the BPF
implementation is not much use to you :)

> Meaningfully visualizing the output seems challenging. Do you have any
> insights on that part?

No concrete plans yet, but the obvious thing that comes to mind is
looking for latency variation within flows, possibly correlating it with
rate. I tried to do something with the Measurement Labs NDT dataset some
years back:

https://dl.acm.org/doi/10.1145/2999572.2999603

For a middlebox monitor, the most interesting for identifying possible
interventions is to figure out which path the data corresponds to.
Depends a little on the network topology and the position of the
middlebox what would be a good way to do that...

-Toke


Re: [Bloat] Trouble Installing PPing in MacOS

2021-02-27 Thread Toke Høiland-Jørgensen via Bloat
Jason Iannone  writes:

> Beyond getting acquainted with a new dataset? I'm a transit network that
> supports, among other traffic types, science flows. I think new monitoring
> methods can help identify targets for intervention.

Right, I meant more in terms of deployment: are you looking to run this
as an always-on monitor on a middlebox, or are you just running ad-hoc
measurements on a client device?

I ask because we have a PhD student working on a re-implementation of
pping in BPF, the goal of which is precisely to be able to run as an
always-on monitor:
https://github.com/xdp-project/bpf-examples/tree/master/pping

So any insights into what you're thinking of doing with the tool would
potentially be helpful - adding in Simon, who's writing the code.

-Toke


Re: [Bloat] Trouble Installing PPing in MacOS

2021-02-26 Thread Toke Høiland-Jørgensen via Bloat
Jason Iannone  writes:

> I ended up cloning the pping repo and running make locally.
>
> Installing was a few steps:
>
> 1. mkdir ~/src/libtins/build
> 2. cd ~/src/libtins/build
> 3. git clone https://github.com/mfontanini/libtins.git
> 4. make
> 5. sudo make install
> 6. cd ~/src
> 7. git clone https://github.com/pollere/pping.git
> 8. cd pping
> 9. make
> 10. ./pping
>
> The promise of this, as Kathleen Nichols points out, is that we can
> passively monitor production flows to get a novel sense of end to end
> performance per flow. I don't know of any other passive monitoring
> technique, beyond a port mirror + a whole gang of systems, that can provide
> this level of detail. Please enlighten me if I'm wrong. The only other
> passive monitoring mechanisms I'm aware of are SNMP polling, IPFIX/*Flow,
> and Streaming Telemetry Interface. None of those systems provide end to end
> flow performance details. The standard in-band active monitoring tools are
> good for determining node to node and full path metrics, but this provides
> a more complete picture of end to end performance beyond active
> y.1731/802.3ag/OAM probes. I'm a little surprised that I'm only learning
> about it now.

What's your use case? :)

-Toke


Re: [Bloat] [Cake] Fwd: [Galene] Dave on bufferbloat and jitter at 8pm CET Tuesday 23

2021-02-26 Thread Toke Høiland-Jørgensen via Bloat
Taraldsen Erik  writes:

> This is getting LTE/5G spesific.  Not sure if it belongs on the list.
> Let us know if we are generating noise.

I for one think it's fascinating - carry on! :)

-Toke


Re: [Bloat] [Cake] Fwd: [Galene] Dave on bufferbloat and jitter at 8pm CET Tuesday 23

2021-02-26 Thread Toke Høiland-Jørgensen via Bloat
Nils Andreas Svee  writes:

> On 2/25/21 11:30 AM, Toke Høiland-Jørgensen wrote:
>
>> Ah, wireguard doesn't have XDP support, so that's likely not going to
>> work; and if you run it on the physical interface, even if you didn't
>> get driver errors, the tool would just see the encrypted packets which
>> is not terribly helpful (it parses TCP timestamps to match
>> incoming/outgoing packets and compute the RTT).
>
> I figured that might be the case. Yes I would've disabled the VPN if I 
> didn't get driver errors.
> I changed the network interface to use an emulated Intel E1000 tonight, 
> and if I bypass the VPN it works as it should.

Right, awesome!

>> I guess we should be more flexible about which hooks we support, so it
>> can also be used on devices with no XDP support. Adding in Simon, who is
>> writing the code; I think he is focused on getting a couple of other
>> features done first, but this could go on the TODO list :)
>
> It's not like I'm in a hurry, and I'd probably need some time to figure 
> out how to tweak the CAKE parameters correctly with this anyway ;)
>
> Speaking of, isn't one of the challenges with solutions like these that 
> it's hard to tell when conditions have improved and allow for more 
> throughput? At least that's what I remember being the issue when I 
> tested CAKE's autorate-ingress back in the day.

Yeah, there would have to be some kind of probing to discover when the
bandwidth goes up (maybe something like what BBR does?). Working out the
details of this is still in the future, this is all just loose plans
that I'll try to get back to once we have the measurement tool working
reasonably well. Input and experiments welcome, of course!

-Toke


Re: [Bloat] Trouble Installing PPing in MacOS

2021-02-25 Thread Toke Høiland-Jørgensen via Bloat
Jason Iannone  writes:

> Hi,
>
> I'm new here. Can anyone help me get pping installed? As far as I can tell,
> cmake, make, and make install all worked, but I don't have pping. Does
> anyone with a bigger brain than mine have a suggestion?
>
> $ pping
> -bash: pping: command not found

My bet would be a $PATH issue. You could try just running it from the
directory where you compiled it? I.e., substitute './pping' for 'pping'
- or look at the output of 'make install' and see if you have the
corresponding directory in your $PATH.
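A quick way to check the $PATH theory programmatically (using only the Python stdlib; 'pping' is just the binary name from this thread, any name works):

```python
import os
import shutil

def locate(binary="pping"):
    """Report whether a binary is reachable via $PATH, and where."""
    path = shutil.which(binary)
    if path:
        return f"{binary} found at {path}"
    return f"{binary} not on PATH; searched: {os.environ.get('PATH', '')}"
```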

-Toke


Re: [Bloat] Updated Bufferbloat Test

2021-02-25 Thread Toke Høiland-Jørgensen via Bloat
Toke Høiland-Jørgensen  writes:

>>   * We tried really hard to get as close to saturating gigabit
>> connections as possible. We redesigned completely the way we chunk
>> files, added a “warming up” period, and spent quite a bit optimizing
>> our code to minimize CPU usage, as we found that was often the
>> limiting factor to our speed test results.
>
> Yup, this seems to work better now! I can basically saturate my
> connection now; Chromium seems to be a bit better than Firefox in this
> respect, but I ended up getting very close on both:
>
> Chromium:
> https://www.waveform.com/tools/bufferbloat?test-id=b14731d3-46d7-49ba-8cc7-3641b495e6c7
> Firefox:
> https://www.waveform.com/tools/bufferbloat?test-id=877f496a-457a-4cc2-8f4c-91e23065c59e
>
> (this is with a ~100Mbps base load on a Gbps connection, so at least the
> Chromium result is pretty much link speed).

Did another test while replacing the queue on my router with a big FIFO.
Still got an A+ score:

https://www.waveform.com/tools/bufferbloat?test-id=9965c8db-367c-45f1-927c-a94eb8da0e08

However, note the max latency in download; quite a few outliers, yet I
still get a jitter score of only 22.6ms. Also, this time there's a
warning triangle on the "low latency gaming" row of the table, but the
score is still A+. Should it really be possible to get the highest score
while one of the rows has a warning in it?

-Toke


Re: [Bloat] Updated Bufferbloat Test

2021-02-25 Thread Toke Høiland-Jørgensen via Bloat
Sina Khanifar  writes:

> Based on Toke’s feedback:
> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015960.html
> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015976.html

Thank you for the update, and especially this very detailed changelog!
I'm impressed! A few points on the specific items below:

>   * We changed the way the speed tests run to show an instantaneous
> speed as the test is being run.

Much better, I can actually see what's going on now :)
Maybe an 'abort' button somewhere would be useful? Once you've clicked
start the only way to abort is currently to close the browser tab...

>   * We moved the bufferbloat grade into the main results box.

Also very good!

>   * We tried really hard to get as close to saturating gigabit
> connections as possible. We redesigned completely the way we chunk
> files, added a “warming up” period, and spent quite a bit optimizing
> our code to minimize CPU usage, as we found that was often the
> limiting factor to our speed test results.

Yup, this seems to work better now! I can basically saturate my
connection now; Chromium seems to be a bit better than Firefox in this
respect, but I ended up getting very close on both:

Chromium:
https://www.waveform.com/tools/bufferbloat?test-id=b14731d3-46d7-49ba-8cc7-3641b495e6c7
Firefox:
https://www.waveform.com/tools/bufferbloat?test-id=877f496a-457a-4cc2-8f4c-91e23065c59e

(this is with a ~100Mbps base load on a Gbps connection, so at least the
Chromium result is pretty much link speed).

Interestingly, while my link is not bloated (the above results are
without running any shaping, just FQ-CoDel on the physical link), it did
manage to knock out the BFD exchange with my upstream BGP peers, causing
routes to flap. So it's definitely saturating something! :D

>   * We changed the shield grades altogether and went through a few
> different iterations of how to show the effect of bufferbloat on
> connectivity, and ended up with a “table view” to try to show the
> effect that bufferbloat specifically is having on the connection
> (compared to when the connection is unloaded).

I like this, with one caveat: When you have a good score, you end up
with a table that has all checkmarks in both the "normally" and "with
bufferbloat" columns, which is a bit confusing (makes one think "huh, I
can do low-latency gaming with bufferbloat?"). So I think changing the
column headings would be good; if I'm interpreting what you're trying to
convey, the second column should really say "your connection", right?
And maybe "normally" should be "Ideally"?

>   * We now link from the results table view to the FAQ where the
> conditions for each type of connection are explained.

This works well. I also like the FAQ in general (the water/oil in the
sink analogy is great!). What did you base the router recommendations
on? I haven't heard about that Asus gaming router before, does that ship
SQM? Also, the first time you mention the open source distributions,
OpenWrt is not a link (but it is the second time around).

>   * We also changed the way we measure latency and now use the faster
> of either Google’s CDN or Cloudflare at any given location.

Are you sure this is working? Mine seems to pick the Google fonts CDN.
The Javascript console outputs 'latency_source_selector.js:26 times
(2) [12.8101472421, 12.8098562038]', but in the network tab I
see two OPTIONS requests to fonts.gstatic.com, so I suspect those two
requests both go there? My ICMP ping time to Google is ~11ms, and it's
1.8ms to speed.cloudflare.com, so it seems a bit odd that it would pick
the other one... But maybe it's replying faster to HTTP?

> We’re also using the WebTiming APIs to get a more accurate latency
> number, though this does not work on some mobile browsers (e.g. iOS
> Safari) and as a result we show a higher latency on mobile devices.
> Since our test is less a test of absolute latency and more a test of
> relative latency with and without load, we felt this was workable.

That seems reasonable.

>   * Our jitter is now an average (was previously RMS).

I'll echo what the others have said about jitter.

>   * The “before you start” text was rewritten and moved above the start 
> button.
>   * We now spell out upload and download instead of having arrows.
>   * We hugely reduced the number of cross-site scripts. I was a bit
> embarrassed by this if I’m honest - I spent a long time building web
> tools for the EFF, where we almost never allowed any cross-site
> scripts. * Our site is hosted on Shopify, and adding any features via
> their app store ends up adding a whole lot of gunk. But we uninstalled
> some apps, rewrote our template, and ended up removing a whole lot of
> the gunk. There’s still plenty of room for improvement, but it should
> be a lot better than before.

Thank you for this! It even works without allowing all the shopify
scripts, so that's all good :)

-Toke

Re: [Bloat] [Cake] Fwd: [Galene] Dave on bufferbloat and jitter at 8pm CET Tuesday 23

2021-02-25 Thread Toke Høiland-Jørgensen via Bloat
"Nils Andreas Svee"  writes:

> I ran it on my router though, which has a decent amount of TCP flows running 
> at any given time.
> It's all going over a wg tunnel though, that might be wonky for all I
> know.

Ah, wireguard doesn't have XDP support, so that's likely not going to
work; and if you run it on the physical interface, even if you didn't
get driver errors, the tool would just see the encrypted packets which
is not terribly helpful (it parses TCP timestamps to match
incoming/outgoing packets and compute the RTT).
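That timestamp-matching approach is compact enough to sketch; the toy model below (illustrative only, not pping's actual code) records when each TCP TSval first leaves and matches it against the TSecr echoed back by the peer:

```python
# Toy model of passive RTT measurement from TCP timestamps (the idea
# behind pping, not its actual implementation): match each outgoing
# TSval against the TSecr echoed back by the peer.
class PassiveRtt:
    def __init__(self):
        self.sent = {}  # TSval -> local time it was first seen outgoing

    def on_outgoing(self, tsval, now):
        # Only the first packet carrying a given TSval is useful;
        # later duplicates would underestimate the RTT.
        self.sent.setdefault(tsval, now)

    def on_incoming(self, tsecr, now):
        t0 = self.sent.pop(tsecr, None)
        return None if t0 is None else now - t0
```

This only works when the tool sees the cleartext TCP headers, which is exactly why running it inside an encrypted tunnel sees nothing useful.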

I guess we should be more flexible about which hooks we support, so it
can also be used on devices with no XDP support. Adding in Simon, who is
writing the code; I think he is focused on getting a couple of other
features done first, but this could go on the TODO list :)

-Toke


Re: [Bloat] [Cake] Fwd: [Galene] Dave on bufferbloat and jitter at 8pm CET Tuesday 23

2021-02-24 Thread Toke Høiland-Jørgensen via Bloat


On 24 February 2021 23:49:48 CET, Nils Andreas Svee  wrote:
>I'll look into pping. Admittedly I'm quite ignorant about BPF, so I'll
>likely blunder about for a bit, but hey, got it to compile - *and* run,
>but I didn't get any output other than the messages from clean_map.
>Dunno if I did something wrong, I'll look at it again tomorrow.


It monitors TCP traffic, so it'll only show something if you have TCP flows 
going while you run it...

-Toke


Re: [Bloat] [Cake] Fwd: [Galene] Dave on bufferbloat and jitter at 8pm CET Tuesday 23

2021-02-24 Thread Toke Høiland-Jørgensen via Bloat
Taraldsen Erik  writes:

> Disclamer: I'm working on the Fixed Wireless products for Telenor
> (Zyxel NR7101 outdoor wall mounted unit). Not the Mobile Broadband
> products. We are working with Zyxel and Qualcom to try and implement
> an upstream queue which adapts to available radio resources. To much
> NDA so can't really disclose anything useful. Lets just say we are
> aware of the issues and are actively working to try and improve the
> situation - but don't hold your breath for a sollution.
>
> What sort of HW are you running your LTE on?
>
> Do you have a subscription with rate limitations? The PGW (router
> which enforces the limit) is a lot more latency friendly than if you
> are radio limited. So it may be beneficial to have a "slow"
> subscription rather than "free speed" then it comes to latency. Slow
> meaning lower subscrption rate than radio rate.

Ah, this is lovely! "How do I get my internet to be faster?" "Just buy a
slower connection!" :D

-Toke


Re: [Bloat] [Cake] Fwd: [Galene] Dave on bufferbloat and jitter at 8pm CET Tuesday 23

2021-02-24 Thread Toke Høiland-Jørgensen via Bloat
Dave Taht  writes:

> wow, that is (predictably) miserable, even with cake. The only
> solution that is going to
> work is to somehow actively monitor your link quality and adjust cake
> to suit. Or we can start trying to use kathie's passive ping tools.

We have a PhD student working on a BPF-based implementation of pping:
https://github.com/xdp-project/bpf-examples/tree/master/pping

My hope is that this can end up being an always-on thing that runs on
the router and can be used to adjust the CAKE parameters as latency
spikes.

There are still a few rough edges on the implementation (most notably
the data output can become quite high), but it should otherwise be
usable, so feel free to take it for a spin. Needs a fairly recent LLVM
(10+ IIRC) to compile the BPF parts.

-Toke


Re: [Bloat] HardenedBSD implementation of CAKE

2021-02-16 Thread Toke Høiland-Jørgensen via Bloat
"Mark D."  writes:

> Hello all,
> I am wondering if anyone is working on implementing fq_codel or CAKE on the
> HardenedBSD network stack? A few popular firewall and router distributions
> use this OS, and I believe that it would be beneficial for it to have AQM.
> If this work has already been done, please point me to the documentation
> for it. I loved what fq_codel was able to do on my OpenWRT flashed
> off-the-shelf router, and I would love to be able to use it on my x86
> OPNSense box. I am not a developer, but am a user that would love to
> support both the CAKE project and the HardenedBSD project if possible.

Not aware of any effort in this direction. However, the code in the
Linux kernel is dual-licensed GPL/BSD so at least there should be no
licensing impediments to porting. I believe that is how fq_codel ended
up in BSD.

CAKE does rely on quite a few kernel internals, though, so it would take
someone with in-depth BSD kernel knowledge to port it to their
equivalents there; in that sense it's not a trivial project...

-Toke


Re: [Bloat] Measuring CoDel

2021-01-22 Thread Toke Høiland-Jørgensen via Bloat
Hal Murray  writes:

> Toke said:
>> Yeah, the overhead of CoDel itself (and even FQ-CoDel) is basically nil (as
>> in, we have not been able to measure it), when otherwise doing forwarding
>> using the regular Linux stack. 
>
> I may be able to help with that.
>
> Are you familiar with Dick Sites' KUtrace?
>   Stanford Seminar - KUtrace 2020
>   https://www.youtube.com/watch?v=2HE7tSZGna0

Nope - but from a quick glance it looks similar to what you can do with
'perf'? :)

> The catch is that I've never used CoDel so somebody will have to teach
> me how to setup a test environment, and then show me the chunks in the
> kernel you want to measure.

To measure the CoDel algorithm, I guess the thing to measure would be
codel_dequeue():

https://elixir.bootlin.com/linux/latest/source/include/net/codel_impl.h#L142

However, that has loops in it that depend on flow state, so its
execution time will vary some. For fq_codel it would be the
fq_codel_enqueue() and fq_codel_dequeue() functions, but they have a
similar problem:

https://elixir.bootlin.com/linux/latest/source/net/sched/sch_fq_codel.c
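For reference, the control law those functions implement is tiny in isolation, which is why its cost drowns in the per-packet overhead around it. A sketch of the drop schedule (RFC 8289 default parameters; not the kernel code itself):

```python
# The CoDel control law in isolation: once the standing queue delay has
# exceeded `target` for a full `interval`, drop a packet and schedule
# the next drop at interval/sqrt(count). Values are the RFC 8289
# defaults; this is a sketch, not the kernel implementation.
import math

TARGET = 0.005     # 5 ms standing-queue delay threshold
INTERVAL = 0.100   # 100 ms

def next_drop_time(now, count):
    """Time of the next drop, given how many drops this episode has had."""
    return now + INTERVAL / math.sqrt(count)
```

With count = 4 the next drop comes after interval/2; drops keep accelerating as long as the queue delay stays above target.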

Also, the larger problem is that the overhead of these drowns in all the
other processing the kernel does for each packet (none of the
queueing-related functions even register on a 'perf' report when
forwarding packets). Still, it might be interesting to see, who knows? So
feel free to take a stab at it :)

-Toke


Re: [Bloat] New OpenWrt release fixing several dnsmasq CVEs

2021-01-22 Thread Toke Høiland-Jørgensen via Bloat
Daniel Sterling  writes:

> On Fri, Jan 22, 2021 at 4:25 PM Jonathan Foulkes  
> wrote:
>> I did not install .6, I only performed an opkg update of the dnamasq package 
>> itself. So kernal is the same in my case.
>
> given you just updated the package -- that's extremely weird.
> userspace update shouldn't be able to lock up the kernel in any
> case. very bizarre.

Well, it's a daemon with pretty wide permissions to mess with the
system, so it's not impossible (evidently). Still, will be interesting
to see what turns out to be the root cause of this...

-Toke


Re: [Bloat] New OpenWrt release fixing several dnsmasq CVEs

2021-01-22 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Foulkes  writes:

> I installed the updated package on a 19.07.4 box running cake, and QoS 
> performance went down the tubes.
> Last night it locked up completely while attempting to stream.
>
> See the PingPlots others have posted to this forum thread, mine look similar, 
> went from constant sub 50ms to very spiky, then some loss, loss increasing, 
> and if high traffic, lock-up.
> https://forum.openwrt.org/t/security-advisory-2021-01-19-1-dnsmasq-multiple-vulnerabilities/85903/39
>
> load is low, sirq is low, so box does not seem stressed.
>
> Any reason Cake would be sensitive to a dnsmasq bug?

No, not really. I mean, dnsmasq could be sending some traffic that
interferes with stuff? Or it could be a kernel regression - the release
did bump the kernel version as well...

-Toke


Re: [Bloat] UniFi Dream Machine Pro

2021-01-22 Thread Toke Høiland-Jørgensen via Bloat
Sebastian Moeller  writes:

> I am confident that this device will happily run CoDel and even
> fq_codel close to line-rate, as codel/fq_codel have a relative modest
> processing cost.

Yeah, the overhead of CoDel itself (and even FQ-CoDel) is basically nil
(as in, we have not been able to measure it), when otherwise doing
forwarding using the regular Linux stack.

As Sebastian says, the source of lower performance when using SQM on
some boxes is the traffic shaper, and sometimes the lack of hardware
offloads.

-Toke


Re: [Bloat] Thanks to developers / htb+fq_codel ISP shaper

2021-01-21 Thread Toke Høiland-Jørgensen via Bloat
Robert Chacon  writes:

> Toke,
>
> Thank you very much for pointing me in the right direction.
> I am having some fun in the lab tinkering with the 'mq' qdisc and Jesper's
> xdp-cpumap-tc.
> It seems I will need to use iptables or nftables to filter packets to
> corresponding queues, since mq apparently cannot have u32 filters on its
> root.
> I will try to familiarize myself with iptables and nftables, and hopefully
> get it working soon and report back. Thank you!

Cool - adding in Jesper, maybe he has some input on this :)

-Toke


> On Fri, Jan 15, 2021 at 5:30 AM Toke Høiland-Jørgensen  wrote:
>
>> Robert Chacon  writes:
>>
>> >> Cool! What kind of performance are you seeing? The README mentions being
>> >> limited by the BPF hash table size, but can you actually shape 2000
>> >> customers on one machine? On what kind of hardware and at what rate(s)?
>> >
>> > On our production network our peak throughput is 1.5Gbps from 200
>> clients,
>> > and it works very well.
>> > We use a simple consumer-class AMD 2700X CPU in production because
>> > utilization of the shaper VM is ~15% at 1.5Gbps load.
>> > Customers get reliably capped within ±2Mbps of their allocated
>> htb/fq_codel
>> > bandwidth, which is very helpful to control network congestion.
>> >
>> > Here are some graphs from RRUL performed on our test bench hypervisor:
>> >
>> https://raw.githubusercontent.com/rchac/LibreQoS/main/docs/fq_codel_1000_subs_4G.png
>> > In that example, bandwidth for the "subscriber" client VM was set to
>> 4Gbps.
>> > 1000 IPv4 IPs and 1000 IPv6 IPs were in the filter hash table of
>> LibreQoS.
>> > The test bench server has an AMD 3900X running Ubuntu in Proxmox. 4Gbps
>> > utilizes 10% of the VM's 12 cores. Paravirtualized VirtIO network drivers
>> > are used and most offloading types are enabled.
>> > In our setup, VM networking multiqueue isn't enabled (it kept disrupting
>> > traffic flow), so 6Gbps is probably the most it can achieve like this.
>> Our
>> > qdiscs in this VM may be limited to one core because of that.
>>
>> I suspect the issue you had with multiqueue is that it requires per-CPU
>> partitioning on a per-customer base to work well. This is possible to do
>> with XDP, as Jesper demonstrates here:
>>
>> https://github.com/netoptimizer/xdp-cpumap-tc
>>
>> With this it should be possible to scale the hardware queues across
>> multiple CPUs properly, and you should be able to go to much higher
>> rates by just throwing more CPU cores at it. At least on bare metal; not
>> sure if the VM virt-drivers have the needed support yet...
>>
>> -Toke
>>
>
>
> -- 
> [image: photograph]
>
>
> *Robert Chacón* Owner
> *M* (915) 730-1472
> *E* robert.cha...@jackrabbitwireless.com
> *JackRabbit Wireless LLC*
> P.O. Box 222111
> El Paso, TX 79913
> *jackrabbitwireless.com* 


[Bloat] New OpenWrt release fixing several dnsmasq CVEs

2021-01-19 Thread Toke Høiland-Jørgensen via Bloat
Hi everyone

In case you haven't seen, there's a new OpenWrt release out[0] that
fixes several CVEs in dnsmasq; seems like quite a bunch at once[1].

So in the interest of keeping everyone's routers safe, here's a gentle
nudge to update :)

-Toke

[0] https://openwrt.org/releases/19.07/notes-19.07.6
[1] https://www.jsof-tech.com/disclosures/dnspooq/


Re: [Bloat] Thanks to developers / htb+fq_codel ISP shaper

2021-01-15 Thread Toke Høiland-Jørgensen via Bloat
Robert Chacon  writes:

>> Cool! What kind of performance are you seeing? The README mentions being
>> limited by the BPF hash table size, but can you actually shape 2000
>> customers on one machine? On what kind of hardware and at what rate(s)?
>
> On our production network our peak throughput is 1.5Gbps from 200 clients,
> and it works very well.
> We use a simple consumer-class AMD 2700X CPU in production because
> utilization of the shaper VM is ~15% at 1.5Gbps load.
> Customers get reliably capped within ±2Mbps of their allocated htb/fq_codel
> bandwidth, which is very helpful to control network congestion.
>
> Here are some graphs from RRUL performed on our test bench hypervisor:
> https://raw.githubusercontent.com/rchac/LibreQoS/main/docs/fq_codel_1000_subs_4G.png
> In that example, bandwidth for the "subscriber" client VM was set to 4Gbps.
> 1000 IPv4 IPs and 1000 IPv6 IPs were in the filter hash table of LibreQoS.
> The test bench server has an AMD 3900X running Ubuntu in Proxmox. 4Gbps
> utilizes 10% of the VM's 12 cores. Paravirtualized VirtIO network drivers
> are used and most offloading types are enabled.
> In our setup, VM networking multiqueue isn't enabled (it kept disrupting
> traffic flow), so 6Gbps is probably the most it can achieve like this. Our
> qdiscs in this VM may be limited to one core because of that.
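The per-subscriber structure described above boils down to one HTB class plus an fq_codel leaf per customer. A rough sketch of the tc commands involved (interface name, class id and rate are placeholders; this is not LibreQoS's actual code):

```python
# Generate the tc commands for one customer: an HTB class capping the
# rate, with fq_codel as the leaf qdisc. All names and ids here are
# placeholders, not LibreQoS's actual implementation.
def shaper_cmds(iface, classid, rate_mbit):
    return [
        f"tc class add dev {iface} parent 1: classid 1:{classid} "
        f"htb rate {rate_mbit}mbit ceil {rate_mbit}mbit",
        f"tc qdisc add dev {iface} parent 1:{classid} fq_codel",
    ]

for cmd in shaper_cmds("eth0", 10, 50):
    print(cmd)
```

One such class/qdisc pair per customer, plus a filter steering that customer's IPs into the class, is the whole shaping core; the scaling question is which CPU each class's work lands on.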

I suspect the issue you had with multiqueue is that it requires per-CPU
partitioning on a per-customer base to work well. This is possible to do
with XDP, as Jesper demonstrates here:

https://github.com/netoptimizer/xdp-cpumap-tc

With this it should be possible to scale the hardware queues across
multiple CPUs properly, and you should be able to go to much higher
rates by just throwing more CPU cores at it. At least on bare metal; not
sure if the VM virt-drivers have the needed support yet...

-Toke


Re: [Bloat] Thanks to developers / htb+fq_codel ISP shaper

2021-01-14 Thread Toke Høiland-Jørgensen via Bloat
Robert Chacon  writes:

> Hello everyone,
>
> I am new here, my name is Robert. I operate a small ISP in the US. I wanted
> to post here to thank Dave Täht, as well as the dozens of contributors to
> the fq_codel and cake projects.

Thank you for reaching out! It's always fun to hear about real-world
deployments of this technology, and it's great to hear that it's working
well for you! :)

> I created a simple python application that uses htb+fq_codel to shape my
> customers' traffic, and have seen great performance improvements. I am
> maintaining it as an open source project for other ISPs to use at
> https://github.com/rchac/LibreQoS

Cool! What kind of performance are you seeing? The README mentions being
limited by the BPF hash table size, but can you actually shape 2000
customers on one machine? On what kind of hardware and at what rate(s)?

-Toke


Re: [Bloat] Rebecca Drucker's talk sounds like it exposes an addressable bloat issue in Ciscos

2021-01-10 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Morton  writes:

> The virtual-clock algorithm I implemented in Cake is essentially a
> deficit-mode algorithm.  During any continuous period of traffic
> delivery, defined as finding a packet in the queue when one is
> scheduled to deliver, the time of delivering the next packet is
> updated after every packet is delivered, by calculating the
> serialisation time of that packet and adding it to the previous
> delivery schedule.  As long as that time is in the past, the next
> packet may be delivered immediately.  When it goes into the future,
> the time to wait before delivering the next packet is precisely known.
> Hence bursts occur only due to quantum effects and are automatically
> of the minimum size necessary to maintain throughput, without any
> configuration (explicit or otherwise).
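A toy model of that deficit-mode virtual clock (illustrative Python, not Cake's actual code; assumes a fixed configured rate):

```python
# Toy model of a deficit-mode virtual-clock shaper: the delivery
# schedule advances by each packet's serialisation time, and a packet
# may leave once the wall clock has caught up with the schedule.
# Illustrative only, not Cake's actual implementation.
class VirtualClockShaper:
    def __init__(self, rate_bps):
        self.rate_bps = rate_bps
        self.next_send = 0.0  # virtual delivery schedule, in seconds

    def schedule(self, pkt_len_bytes, now):
        """Return the earliest departure time for this packet."""
        send_at = max(self.next_send, now)  # an idle link restarts from 'now'
        self.next_send = send_at + pkt_len_bytes * 8 / self.rate_bps
        return send_at
```

At 1 Mbit/s, each 1500-byte packet advances the schedule by 12 ms, so back-to-back packets are paced at exactly the serialisation rate without any configured burst size.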

Also, while CAKE's shaper predates it, the rest of the Linux kernel is
also moving to a timing-based packet scheduling model, following Van
Jacobson's talk at Netdevconf in 2018:

https://netdevconf.info/0x12/session.html?evolving-from-afap-teaching-nics-about-time

In particular, the TCP stack uses early departure time since 2018:
https://lwn.net/Articles/766564/

The (somewhat misnamed) sch_fq packet scheduler will also obey packet
timestamps and when scheduling, which works with both the timestamps set
by the TCP stack as per the commit above, but can also be set from
userspace with a socket option, or from a BPF filter.

Jesper wrote a BPF-based implementation of a shaper that uses a BPF
filter to set packet timestamps to shape traffic at a set rate with
precise timing (avoiding bursts):
https://github.com/xdp-project/bpf-examples/tree/master/traffic-pacing-edt

The use case here is an ISP middlebox that can smooth out traffic to
avoid tail drops in shallow-buffered switches. He tells me it scales
quite well, although some tuning of the kernel and drivers is necessary
to completely avoid microbursts. There's also a BPF implementation of
CoDel in there, BTW.

I've been talking to Jesper about comparing his implementation's
performance to the shaper in CAKE, but we haven't gotten around to it
yet. We'll share data once we do, obviously :)

-Toke


Re: [Bloat] Openwrt stability?

2021-01-05 Thread Toke Høiland-Jørgensen via Bloat
Stephen Hemminger  writes:

> I having lots of issues with openwrt stability.
> Today's issue seems to be some device on one leg of the home LAN causing 
> OpenWrt router
> to crash. That leg has Xbox and other audio gear.
>
> Any idea how to debug this? Is there a way to get serial console?

There usually is, but depending on the device it may involve disassembly
and/or soldering. Which device are you running? And which openwrt
version?

-Toke


Re: [Bloat] Why you need at least 3Mbps upload to get good game performance with ~1500byte packets: Doing the math

2020-12-09 Thread Toke Høiland-Jørgensen via Bloat
Sebastian Moeller  writes:

> Hi Toke,
>
>
>> On Dec 9, 2020, at 12:20, Toke Høiland-Jørgensen  wrote:
>> 
>> Sebastian Moeller  writes:
>> 
>>> Hi Toke,
>>> 
>>> 
>>>> On Dec 9, 2020, at 11:52, Toke Høiland-Jørgensen via Bloat 
>>>>  wrote:
>>>> 
>>>> Kenneth Porter  writes:
>>>> 
>>>>> <https://forum.openwrt.org/t/why-you-need-at-least-3mbps-upload-to-get-good-game-performance-with-1500byte-packets-doing-the-math/81240>
>>>>> 
>>>>> Upstream article:
>>>>> 
>>>>> <http://models.street-artists.org/2020/12/05/why-gaming-on-a-dsl-line-is-terrible-and-the-math-says-theres-nothing-you-can-do-about-it/>
>>>> 
>>>> Good points, but doesn't mention options to decrease the packet size
>>>> (lower MTU/MSS clamping)... :)
>>> 
>>> But he is doing exactly that in the script he developed for OpenWrt 
>>> games on poor links:
>> 
>> Ah, cool! May be necessary to actually decrease the interface MTU as
>> well, though, since TCP MSS clamping won't work for QUIC...
>
>   Mmmh, QUIC does PMTUD, no? In that case a "simple" filter to drop QUIC 
> packets above a certain size might already do the trick?

Maybe? But actually lowering the MTU of the interface would have the
same effect, I guess? And what happens in the wild is anyone's guess, of
course... ;)

>> And of course, for IPv6 you can't decrease the MTU below 1280 bytes
>> without breaking spec :(
>
>   Jepp, but MSS clamping still works, except there are limits to how
>   low the OS will go; macOS will not go below ~200, and I believe
>   Linux also recently increased its minimum MSS values to counter
>   some DOS issues with SACK and friends, no? That said, it is well
>   possible that even IPv6 might work with smaller MTUs...

Sure, MSS clamping will work even for IPv6, but only for TCP...

-Toke


Re: [Bloat] Why you need at least 3Mbps upload to get good game performance with ~1500byte packets: Doing the math

2020-12-09 Thread Toke Høiland-Jørgensen via Bloat
Sebastian Moeller  writes:

> Hi Toke,
>
>
>> On Dec 9, 2020, at 11:52, Toke Høiland-Jørgensen via Bloat 
>>  wrote:
>> 
>> Kenneth Porter  writes:
>> 
>>> <https://forum.openwrt.org/t/why-you-need-at-least-3mbps-upload-to-get-good-game-performance-with-1500byte-packets-doing-the-math/81240>
>>> 
>>> Upstream article:
>>> 
>>> <http://models.street-artists.org/2020/12/05/why-gaming-on-a-dsl-line-is-terrible-and-the-math-says-theres-nothing-you-can-do-about-it/>
>> 
>> Good points, but doesn't mention options to decrease the packet size
>> (lower MTU/MSS clamping)... :)
>
>   But he is doing exactly that in the script he developed for OpenWrt 
> games on poor links:

Ah, cool! May be necessary to actually decrease the interface MTU as
well, though, since TCP MSS clamping won't work for QUIC...

And of course, for IPv6 you can't decrease the MTU below 1280 bytes
without breaking spec :(

-Toke


Re: [Bloat] Why you need at least 3Mbps upload to get good game performance with ~1500byte packets: Doing the math

2020-12-09 Thread Toke Høiland-Jørgensen via Bloat
Kenneth Porter  writes:

> 
>
> Upstream article:
>
> 

Good points, but doesn't mention options to decrease the packet size
(lower MTU/MSS clamping)... :)
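The arithmetic behind both the article's conclusion and the packet-size suggestion is plain serialisation delay (a sketch; the rates are illustrative):

```python
# Serialisation delay of one packet: how long a single frame occupies
# the uplink. This is the arithmetic behind the "~3 Mbps for good
# gaming" rule of thumb, and why a smaller MTU helps on slow links.
def serialisation_ms(pkt_bytes, rate_mbps):
    return pkt_bytes * 8 / (rate_mbps * 1e6) * 1e3

for rate_mbps in (0.5, 1, 3):
    print(rate_mbps, "Mbps:", serialisation_ms(1500, rate_mbps), "ms")
```

A 1500-byte packet takes 12 ms at 1 Mbit/s but only 4 ms at 3 Mbit/s, and cutting the packet size to 150 bytes cuts the delay tenfold again, which is what lowering the MTU (or clamping the MSS) buys you.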

-Toke


Re: [Bloat] Xfinity Flex streaming box starves on cake?

2020-12-06 Thread Toke Høiland-Jørgensen via Bloat
Kenneth Porter  writes:

> I suspect that my Xfinity Flex box has too small an internal buffer and is 
> starving when fed by my cake-enabled OpenWrt router.

From your config you have around 10x the downstream bandwidth you need
for streaming, so unless you are really hitting your connection hard
with other things I would not expect packets to buffer in the
modem/router at all while you're streaming.

So if this really is caused by sqm-scripts (which I think you should do
a few more tests to definitely confirm), I would think it more likely it
was due to some other weird interaction. Streaming box expecting a
certain DSCP marking? Server choking on ACK stream due to filtering?
Hard to say... A packet dump of the stream going to the box while the
black screen happens may be illuminating, but could be hard to pull off
if it's not reliably reproducible...

-Toke


Re: [Bloat] starlink

2020-12-01 Thread Toke Høiland-Jørgensen via Bloat
Conor Beh  writes:

> Hello,
>
> Jim Gettys made a reddit post on r/Starlink asking for data from beta
> testers. I am one of those testers. I spun up an Ubuntu VM and did three
> runs of flent and rrul as depicted in the getting started page. You may
> find the results here:
> https://drive.google.com/file/d/1NIGPpCMrJgi8Pb27t9a9VbVOGzsKLE0K/view?usp=sharing

Thanks for sharing! That is some terrible bloat, though! :(

-Toke


Re: [Bloat] BBR implementations, knobs to turn?

2020-11-20 Thread Toke Høiland-Jørgensen via Bloat
Jesper Dangaard Brouer  writes:

> Hi Erik,
>
> I really appreciate that you are reaching out to the bufferbloat community
> for this real-life 5G mobile testing.  Lets all help out Erik.

Yes! FYI, I've been communicating off-list with Erik for quite some
time, he's doing great work but fighting the usual up-hill battle to get
others to recognise the issues; so +1, let's give him all the help we
can :)

> From your graphs, it does look like you are measuring latency
> under-load, e.g. while the curl download/upload is running.  This is
> great as this is the first rule of bufferbloat measuring :-)  (and Luca
> hinted to this)
>
> The Huawei policer/shaper sounds scary.  And 1000 packets deep queue
> also sound like a recipe for bufferbloat.  I would of-cause like to
> re-write the Huawei policer/shaper with the knowledge and techniques we
> know from our bufferbloat work in the Linux Kernel.  (If only I knew
> someone that coded on 5G solutions that could implement this on their
> hardware solution, and provide a better product Cc. Carlo)
>
> Are you familiar with Toke's (cc) work/PhD on handling bufferbloat on
> wireless networks?  (Hint: Airtime fairness)
>
> Solving bufferbloat in wireless networks require more than applying
> fq_codel on the bottleneck queue, it requires Airtime fairness.  Doing
> scheduling based Clients use of Radio-time and transmit-opportunities
> (TXOP), instead of shaping based on bytes. (This is why it can (if you
> are very careful) make sense to "holding back packets a bit" to
> generate a packet aggregate that only consumes one TXOP).
>
> The culprit is that each Client/MobilePhone will be sending at
> different rates, and scheduling based on bytes, will cause a Client with
> a low rate to consume a too large part of the shared radio airtime.
> That basically sums up Toke's PhD ;-)

Much as I of course appreciate the call-out, airtime fairness itself is
not actually much of an issue with mobile networks (LTE/5G/etc)... :)

The reason being that they use TDMA scheduling enforced by the base
station; so there's a central controller that enforces airtime usage
built into the protocol, which ensures fairness (unless the operator
explicitly configures it to be unfair for policy reasons). So the new
insight in my PhD is not so much "airtime fairness is good for wireless
links" as it is "we can achieve airtime fairness in CSMA/CA-scheduled
networks like WiFi".
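For illustration, the airtime-vs-bytes point can be sketched as a toy deficit scheduler in a few lines. This is a hypothetical sketch with made-up rates and packet sizes, not the actual mac80211 airtime scheduler:

```python
# Toy airtime-fair scheduler: always serve the station that has used
# the least airtime, charging airtime = packet bits / station PHY rate.
# Illustrative only -- not the mac80211 implementation, and the rates
# below are made up.

def schedule(rates_bps, n_rounds, pkt_bytes=1500):
    airtime = {s: 0.0 for s in rates_bps}  # seconds of airtime consumed
    sent = {s: 0 for s in rates_bps}       # packets delivered
    for _ in range(n_rounds):
        s = min(rates_bps, key=lambda k: airtime[k])
        airtime[s] += pkt_bytes * 8 / rates_bps[s]
        sent[s] += 1
    return airtime, sent

# A 100 Mbps station and a 10 Mbps station: byte-fair scheduling would
# give both the same packet count, letting the slow one hog the medium;
# airtime-fair scheduling gives both the same share of the medium, so
# the fast station delivers ~10x the packets.
airtime, sent = schedule({"fast": 100e6, "slow": 10e6}, n_rounds=1100)
```

Under byte-based scheduling the slow station would instead consume roughly ten times the airtime of the fast one, which is exactly the unfairness described above.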

Your other points about bloated queues etc, are spot on. Ideally, we
could get operators to fix their gear, but working around the issues
like Erik is doing can work in the meantime. And it's great to see that
it seems like Telenor is starting to roll this out; as far as I can tell
that has taken quite a bit of advocacy from Erik's side to get there! :)

-Toke
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] configuration on an OpenVPN server

2020-11-19 Thread Toke Høiland-Jørgensen via Bloat
Matt Taggart  writes:

> Hi,
>
> I would like to configure SQM on an OpenVPN server and I am thinking 
> about how to do this. I have already set up piece_of_cake on the upstream 
> connection (in this case 900Mbit down/250Mbit up). I think that by itself 
> should do a decent job of keeping things fair between all the VPN 
> clients: they are assigned private IPs and triple-isolate should do the 
> right thing.
>
> But OpenVPN creates a tun device for that traffic and I could 
> potentially do more to manage the VPN traffic separately from the server 
> host traffic. One thing that occurred to me is that due to asymmetric 
> upload/download the host has, and the fact that the VPN traffic has to 
> go to/from the client, maybe the download rate of the tun device will 
> never exceed the upload rate of the host (since we need to retransmit 
> that data to the clients) and vice versa for the upload? So to force 
> myself to be a bottleneck should I have qdiscs on the tun device 
> limiting to ~240Mbit in each direction?
>
> Hopefully that is clear. Let me know if it's not. Also anything else I 
> should consider in this situation?

I think what you're saying is that all the VPN traffic goes back out the
same link, right? I.e., there are no other links on that server (to a
LAN or something) that the traffic can go out?

In that case, yeah, the ingress traffic should mostly be limited by the
egress since it all needs to be forwarded anyway (assuming the clients
are not accessing any services on the host itself either). I run a
similar setup for a tor bridge, and I just have this on the egress:

qdisc cake 8006: dev eth0 root refcnt 2 bandwidth 100Mbit besteffort flows nonat nowash no-ack-filter no-split-gso rtt 100.0ms raw overhead 0

This keeps both ingress and egress traffic pretty much exactly at a
steady 100 Mbps (which means it eats up about 1 terabyte of data a day
:)).

One additional wrinkle you may get with OpenVPN is that the VPN driver
itself can suffer from bufferbloat. So when you shape the underlying
link, packets will queue in the tun device first, adding delay. You
could maybe fix this by shaping the tun device as well, but it's not
obvious what bandwidth to shape at, since that will depend on the
ingress/egress distribution of your clients. And I'm not even sure
traffic going from one VPN client to another will pass through the tun
device...

-Toke


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-09 Thread Toke Høiland-Jørgensen via Bloat
>> Also, it's TX, and we are only doing RX, as I said already somewhere, 
>> it's async routing, so the TX data comes via another router back.
>
> Okay, but as this is a router you also need to transmit this
> (asymmetric) traffic out another interface right.
>
> Could you also provide ethtool_stats for the TX interface?
>
> Notice that the tool[1] ethtool_stats.pl support monitoring several
> interfaces at the same time, e.g. run:
>
>  ethtool_stats.pl --sec 3 --dev eth4 --dev ethTX
>
> And provide output as pastebin.

Also, from your drawing, everything goes through the same switch, right?
And since pause frames are involved... Maybe it's the switch being
overwhelmed?

In general, I find that pause frames give more problems than they solve,
so I always turn them off...

-Toke


Re: [Bloat] Comparing bufferbloat tests (was: We built a new bufferbloat test and keen for feedback)

2020-11-06 Thread Toke Høiland-Jørgensen via Bloat
Stephen Hemminger  writes:

> PS: Why do US providers have such asymmetric bandwidth? Getting something
> symmetric requires going to a $$$ business rate.

For Cable, the DOCSIS standard is asymmetric by design, but not *that*
asymmetric. I *think* the rest is because providers have to assign
channels independently for upstream and downstream, and if they just
assign them all to downstream they can advertise a bigger number...

-Toke


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-06 Thread Toke Høiland-Jørgensen via Bloat
"Thomas Rosenstein"  writes:

> On 6 Nov 2020, at 12:18, Jesper Dangaard Brouer wrote:
>
>> On Fri, 06 Nov 2020 10:18:10 +0100
>> "Thomas Rosenstein"  wrote:
>>
> I just tested 5.9.4 seems to also fix it partly, I have long
> stretches where it looks good, and then some increases again. (3.10
> Stock has them too, but not so high, rather 1-3 ms)
>
>>
>> That you have long stretches where latency looks good is interesting
>> information.   My theory is that your system have a periodic userspace
>> process that does a kernel syscall that takes too long, blocking
>> network card from processing packets. (Note it can also be a kernel
>> thread).
>
> The weird part is, I first only updated router-02 and pinged to 
> router-04 (out of traffic flow), there I noticed these long stretches of 
> ok ping.
>
> When I updated also router-03 and router-04, the old behaviour kind of 
> was back, this confused me.
>
> Could this be related to netlink? I have gobgpd running on these 
> routers, which injects routes via netlink.
> But the churn rate during the tests is very minimal, maybe 30 - 40 
> routes every second.
>
> Otherwise we got: salt-minion, collectd, node_exporter, sshd

collectd may be polling the interface stats; try turning that off?

>>
>> Another theory is the NIC HW does strange things, but it is not very
>> likely.  E.g. delaying the packets before generating the IRQ 
>> interrupt,
>> which hide it from my IRQ-to-softirq latency tool.
>>
>> A question: What traffic control qdisc are you using on your system?
>
> kernel 4+ uses pfifo, but there's no dropped packets
> I have also tested with fq_codel, same behaviour and also no weirdness 
> in the packets queue itself
>
> kernel 3.10 uses mq, and for the vlan interfaces noqueue

Do you mean that you only have a single pfifo qdisc on kernel 4+? Why is
it not using mq?

Was there anything in the ethtool stats?

-Toke


Re: [Bloat] Comparing bufferbloat tests (was: We built a new bufferbloat test and keen for feedback)

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
> I believe TLS handshake time is not included here. I’m using the
> Resource Timing API
> 
> to measure the time-to-first-byte for a request that I’m sending to
> retrieve a static file. The resource loading phases
> 
> section of the documentation explicitly shows the different stages for
> DNS Lookup, TCP connection establishment, etc. I’m using the
> difference between requestStart and responseStart values. This value
> is deemed to be the same as time-to-first-byte
> 
> seen in the inspector’s network tab.

This does not seem completely ludicrous, at least :)

> We’re using this static file
> 
> that is hosted on a google CDN. We tried multiple different files, and
> this one had the lowest latency in both locations that we tested it
> (I’m in Toronto, and my colleague Sina is in San Francisco).

Ah, so that's why that request showed up :)

Curious to know why you picked this instead of, say, something from
speed.cloudflare.com (since you're using that for the speed tests anyway)?

> @Toke Høiland-Jørgensen
>> Your test does a decent job and comes pretty close, at least
>> in Chromium (about 800 Mbps which is not too far off at the application
>> layer, considering I have a constant 100Mbps flow in the background
>> taking up some of the bandwidth). Firefox seems way off (one test said
>> 500Mbps the other >1000).
>
>
> The way I’m measuring download is that I make multiple simultaneous
> requests to cloudflare’s backend requesting 100MB files. Their backend
> simply returns a file that has “0”s in the body repeated until 100MB
> of file is generated. Then I use readable streams
> 
> to make multiple measurements of (total bytes downloaded, timestamp).
> Then I fit a line to the measurements collected, and the slope of that
> line is the calculated bandwidth. For gigabit connections, this
> download happens very quickly, and it may be the case that not a lot
> of points are collected, in which case the fitted line is not accurate
> and one might get overly-huge bandwidths, as is the >1000 case in your
> Firefox browser. I think this might be fixed if we increase the
> download time. Currently it’s 5s, maybe changing that to 10-20s would
> help. I think in general it’d be a good feature to have a "more
> advanced options” feature that allows the user to adjust some
> parameters of the connection (such as number of parallel connections,
> download scenario’s duration, upload scenario’s duration, etc.)

Yeah, I think running the test for longer will help; 5s is not nearly
enough to saturate a connection, especially not as the link speed increases.

> The reason I do this line-fitting is because I want to get rid of the
> bandwidth ramp-up time when the download begins.

Yeah, allowing some ramp-up time before determining the bandwidth seems
reasonable, but it's not generally possible to just pick a static number
of (say) seconds to chop off... Having the graph over time helps
sanity-check things, though.
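As a sketch of the line-fitting idea discussed above (the slope of cumulative bytes over time is the bandwidth), with synthetic samples and an arbitrary ramp-up cutoff; this is a hypothetical illustration, not the actual test's code:

```python
# Estimate bandwidth as the least-squares slope of cumulative bytes
# over time, optionally discarding early ramp-up samples.  The sample
# data and the skip threshold are made up for illustration.

def fit_bandwidth_bps(samples, skip=0):
    """samples: list of (t_seconds, cumulative_bytes) -> bits per second."""
    pts = samples[skip:]
    n = len(pts)
    mt = sum(t for t, _ in pts) / n
    mb = sum(b for _, b in pts) / n
    num = sum((t - mt) * (b - mb) for t, b in pts)
    den = sum((t - mt) ** 2 for t, _ in pts)
    return 8 * num / den  # slope is bytes/sec; *8 for bits/sec

# Synthetic download: linear ramp-up for the first second, then a
# steady 100 Mbit/s, sampled every 100 ms.
samples, total = [], 0.0
for i in range(50):
    t = i * 0.1
    rate = 100e6 if t >= 1.0 else 100e6 * t  # bits/sec at time t
    total += rate * 0.1 / 8                  # bytes added this interval
    samples.append((t, total))

full = fit_bandwidth_bps(samples)             # biased low by the ramp-up
steady = fit_bandwidth_bps(samples, skip=15)  # drop the first 1.5 s
```

This also shows why a static cutoff is fragile: how many early samples to drop depends on how long slow-start lasts on the link under test.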

Also, a continuous graph of latency samples over time (for the whole
duration, including idle/upload/download) is usually very instructive
when plotting such a test.

> Real-time Bandwidth Reporting
> Using readable-streams also allows for instantaneous bandwidth
> reporting (maybe using average of a moving window) similar to what
> fast.com  or speedtest.net 
> do, but I unfortunately am not able to do the same thing with upload,
> since getting progress on http uploads adds some pre-flight OPTIONS
> requests which cloudflare’s speedtest backend
>  doesn’t allow those requests. For this
> test we are directly hitting cloudflare’s backend, you can see this in
> the network tab:
>
> Our download is by sending an http GET request to this endpoint:
> https://speed.cloudflare.com/__down?bytes=1
>  and our upload
> is done by sending and http POST request to this endpoint:
> https://speed.cloudflare.com/__up 
>
> Since we are using cloudflare’s backend we are limited by what they
> allow us to do.

The test at speed.cloudflare.com does seem to plot real-time upload
bandwidth; is that a privileged operation for themselves, or something?

> I did try making my own worker which essentially does the same thing
> as cloudflare’s speedtest backend (They do have this template worker
>  that for the
> most part does the same thing.) I modified that worker a bit so that
> it all

Re: [Bloat] Comparing bufferbloat tests (was: We built a new bufferbloat test and keen for feedback)

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
Dave Collier-Brown  writes:

> On 2020-11-05 6:48 a.m., Toke Høiland-Jørgensen via Bloat wrote:
>
> Also, holy cow, what's going on with your connection? The unloaded
> latency says 17/110/200 min/median/max RTT. Is that due to bad
> measurements, or do you have a lot of cross traffic and a really bloated
> link? :/
>
> -Toke
>
>
> The tests differ somewhat while looking at an unloaded residential link 
> provided by a local monopoly, Rogers Cable, and mitigated by an IQrouter (my 
> old linksys is long dead (;-))
>
> DSLReports says
>
>   *   144.7 Mb/s down
>   *   14.05 Mb/s up
>   *   bufferbloat A+
>   *   downloading lag 40-100 ms

Still a pretty big span from 40-100ms; how does that turn into an A+
score, I wonder?

> Waveform says:
>
>   *   43.47 Mbps down
>   *   16.05 Mbps up
>   *   bufferbloat grade A+
>   *   unloaded latency 93.5 ms
>
> So we're reporting different speeds and RTTs. Are we using different
> units or definitions, I wonder?

Well either that, or one of the tests is just busted. My immediate guess
would be the not-yet-released prototype is the least accurate ;)
I do wonder why, though...

-Toke


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
"Thomas Rosenstein"  writes:

> On 5 Nov 2020, at 13:38, Toke Høiland-Jørgensen wrote:
>
>> "Thomas Rosenstein"  writes:
>>
>>> On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:
>>>
 "Thomas Rosenstein"  writes:

>> If so, this sounds more like a driver issue, or maybe something to
>> do
>> with scheduling. Does it only happen with ICMP? You could try this
>> tool
>> for a userspace UDP measurement:
>
> It happens with all packets, therefore the transfer to backblaze 
> with
> 40
> threads goes down to ~8MB/s instead of >60MB/s

 Huh, right, definitely sounds like a kernel bug; or maybe the new
 kernel
 is getting the hardware into a state where it bugs out when there 
 are
 lots of flows or something.

 You could try looking at the ethtool stats (ethtool -S) while 
 running
 the test and see if any error counters go up. Here's a handy script 
 to
 monitor changes in the counters:

 https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl

> I'll try what that reports!
>
>> Also, what happens if you ping a host on the internet (*through* 
>> the
>> router instead of *to* it)?
>
> Same issue, but twice pronounced, as it seems all interfaces are
> affected.
> So, ping on one interface and the second has the issue.
> Also all traffic across the host has the issue, but on both sides, 
> so
> ping to the internet increased by 2x

 Right, so even an unloaded interface suffers? But this is the same
 NIC,
 right? So it could still be a hardware issue...

> Yep default that CentOS ships, I just tested 4.12.5 there the issue
> also
> does not happen. So I guess I can bisect it then...(really don't 
> want
> to
> 😃)

 Well that at least narrows it down :)
>>>
>>> I just tested 5.9.4 seems to also fix it partly, I have long 
>>> stretches
>>> where it looks good, and then some increases again. (3.10 Stock has 
>>> them
>>> too, but not so high, rather 1-3 ms)
>>>
>>> for example:
>>>
>>> 64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
>>> 64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
>>> 64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
>>> 64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
>>> 64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms
>>>
>>> and then again:
>>>
>>> 64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
>>> 64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
>>> 64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
>>> 64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
>>> 64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
>>> 64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
>>> 64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
>>> 64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
>>> 64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
>>> 64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
>>> 64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
>>> 64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
>>> 64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
>>> 64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
>>> 64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
>>> 64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
>>> 64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
>>> 64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
>>> 64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
>>> 64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms
>>>
>>>
>>> For me it looks now that there was some fix between 5.4.60 and 5.9.4 
>>> ...
>>> anyone can pinpoint it?
>>
>> $ git log --no-merges --oneline v5.4.60..v5.9.4|wc -l
>> 72932
>>
>> Only 73k commits; should be easy, right? :)
>>
>> (In other words no, I have no idea; I'd suggest either (a) asking on
>> netdev, (b) bisecting or (c) using 5.9+ and just making peace with not
>> knowing).
>
> Guess I'll go the easy route and let it be ...
>
> I'll update all routers to the 5.9.4 and see if it fixes the traffic 
> flow - will report back once more after that.

Sounds like a plan :)

>>
>> How did you configure the new kernel? Did you start from scratch, 
>> or
>> is
>> it based on the old centos config?
>
> first oldconfig and from there then added additional options for 
> IB,
> NVMe, etc (which I don't really need on the routers)

 OK, so you're probably building with roughly the same options in 
 terms
 of scheduling granularity etc. That's good. Did you enable spectre
 mitigations etc on the new kernel? What's the output of
 `tail /sys/devices/system/cpu/vulnerabilities/*` ?
>>>
>>> mitigations are off
>>
>> Right, I just figured maybe you were hitting some threshold that
>> involved a lot of indirect calls which slowed things down due to
>> miti

Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
"Thomas Rosenstein"  writes:

> On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:
>
>> "Thomas Rosenstein"  writes:
>>
 If so, this sounds more like a driver issue, or maybe something to 
 do
 with scheduling. Does it only happen with ICMP? You could try this
 tool
 for a userspace UDP measurement:
>>>
>>> It happens with all packets, therefore the transfer to backblaze with 
>>> 40
>>> threads goes down to ~8MB/s instead of >60MB/s
>>
>> Huh, right, definitely sounds like a kernel bug; or maybe the new 
>> kernel
>> is getting the hardware into a state where it bugs out when there are
>> lots of flows or something.
>>
>> You could try looking at the ethtool stats (ethtool -S) while running
>> the test and see if any error counters go up. Here's a handy script to
>> monitor changes in the counters:
>>
>> https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
>>
>>> I'll try what that reports!
>>>
 Also, what happens if you ping a host on the internet (*through* the
 router instead of *to* it)?
>>>
>>> Same issue, but twice pronounced, as it seems all interfaces are
>>> affected.
>>> So, ping on one interface and the second has the issue.
>>> Also all traffic across the host has the issue, but on both sides, so
>>> ping to the internet increased by 2x
>>
>> Right, so even an unloaded interface suffers? But this is the same 
>> NIC,
>> right? So it could still be a hardware issue...
>>
>>> Yep default that CentOS ships, I just tested 4.12.5 there the issue 
>>> also
>>> does not happen. So I guess I can bisect it then...(really don't want 
>>> to
>>> 😃)
>>
>> Well that at least narrows it down :)
>
> I just tested 5.9.4 seems to also fix it partly, I have long stretches 
> where it looks good, and then some increases again. (3.10 Stock has them 
> too, but not so high, rather 1-3 ms)
>
> for example:
>
> 64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
> 64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
> 64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
> 64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
> 64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms
>
> and then again:
>
> 64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
> 64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
> 64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
> 64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
> 64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
> 64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
> 64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
> 64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
> 64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
> 64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
> 64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
> 64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
> 64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
> 64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
> 64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
> 64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
> 64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
> 64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
> 64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
> 64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms
>
>
> For me it looks now that there was some fix between 5.4.60 and 5.9.4 ... 
> anyone can pinpoint it?

$ git log --no-merges --oneline v5.4.60..v5.9.4|wc -l
72932

Only 73k commits; should be easy, right? :)

(In other words no, I have no idea; I'd suggest either (a) asking on
netdev, (b) bisecting or (c) using 5.9+ and just making peace with not
knowing).
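That said, bisection scales logarithmically, so the number of build-and-test steps is far smaller than the raw commit count suggests:

```python
# Bisection halves the candidate range each step, so the number of
# kernel build+test cycles needed by `git bisect` grows only with
# log2 of the commit count.
import math

commits = 72932  # the commit count between v5.4.60 and v5.9.4 quoted above
steps = math.ceil(math.log2(commits))
print(steps)  # -> 17
```

Seventeen kernel builds plus latency tests is still tedious, which is why option (c) is tempting.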

 How did you configure the new kernel? Did you start from scratch, or
 is
 it based on the old centos config?
>>>
>>> first oldconfig and from there then added additional options for IB,
>>> NVMe, etc (which I don't really need on the routers)
>>
>> OK, so you're probably building with roughly the same options in terms
>> of scheduling granularity etc. That's good. Did you enable spectre
>> mitigations etc on the new kernel? What's the output of
>> `tail /sys/devices/system/cpu/vulnerabilities/*` ?
>
> mitigations are off

Right, I just figured maybe you were hitting some threshold that
involved a lot of indirect calls which slowed things down due to
mitigations. Guess not, then...

-Toke


Re: [Bloat] We built a new bufferbloat test and keen for feedback

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
Dave Collier-Brown  writes:

> Tried it, and I really like the header and use of candle-charts!
>
> I got this:
>
> [cid:part1.81AC21AC.758FE66F@indexexchange.com]
>
> I'd like to be able to explain it to non-techie folks (my grandma, and also 
> my IT team at work (;-)), so I wonder on their behalf...
>
>   *   Why is unloaded a large number, and loaded a small one?
>  *   milliseconds sound like delay, so 111.7 ms sounds slower than 0.0 ms
>   *   Is bloat and latency something bad? The zeroes are in green, does that 
> mean they're good?
>   *   Is max "bad"? In that case I'd call it "worst" and min "best"
>   *   Is median the middle or the average? (no kidding, I've been asked that! 
> I'd call it average)
>   *   Is 25% twenty-five percent of the packets? (I suspect it's a percentile)
>   *   What does this mean in terms of how many Skype calls I can have 
> happening at my house? I have two kids, a wife and a grandmother, all of whom 
> Skype a lot.
>
> Looking at the cool stuff in the banner, it looks like I can browse,
> do audio calls, video calls (just one, or many?) but not streaming
> (any or just 4k?) or gaming.  Emphasizing that would be instantly
> understandable by grandma and the kids.

Also, holy cow, what's going on with your connection? The unloaded
latency says 17/110/200 min/median/max RTT. Is that due to bad
measurements, or do you have a lot of cross traffic and a really bloated
link? :/

-Toke


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
"Thomas Rosenstein"  writes:

>> If so, this sounds more like a driver issue, or maybe something to do
>> with scheduling. Does it only happen with ICMP? You could try this 
>> tool
>> for a userspace UDP measurement:
>
> It happens with all packets, therefore the transfer to backblaze with 40 
> threads goes down to ~8MB/s instead of >60MB/s

Huh, right, definitely sounds like a kernel bug; or maybe the new kernel
is getting the hardware into a state where it bugs out when there are
lots of flows or something.

You could try looking at the ethtool stats (ethtool -S) while running
the test and see if any error counters go up. Here's a handy script to
monitor changes in the counters:

https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl

> I'll try what that reports!
>
>> Also, what happens if you ping a host on the internet (*through* the
>> router instead of *to* it)?
>
> Same issue, but twice pronounced, as it seems all interfaces are 
> affected.
> So, ping on one interface and the second has the issue.
> Also all traffic across the host has the issue, but on both sides, so 
> ping to the internet increased by 2x

Right, so even an unloaded interface suffers? But this is the same NIC,
right? So it could still be a hardware issue...

> Yep default that CentOS ships, I just tested 4.12.5 there the issue also 
> does not happen. So I guess I can bisect it then...(really don't want to 
> 😃)

Well that at least narrows it down :)

>>
>> How did you configure the new kernel? Did you start from scratch, or 
>> is
>> it based on the old centos config?
>
> first oldconfig and from there then added additional options for IB, 
> NVMe, etc (which I don't really need on the routers)

OK, so you're probably building with roughly the same options in terms
of scheduling granularity etc. That's good. Did you enable spectre
mitigations etc on the new kernel? What's the output of
`tail /sys/devices/system/cpu/vulnerabilities/*` ?

-Toke


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-04 Thread Toke Høiland-Jørgensen via Bloat
"Thomas Rosenstein"  writes:

> On 4 Nov 2020, at 17:10, Toke Høiland-Jørgensen wrote:
>
>> Thomas Rosenstein via Bloat  writes:
>>
>>> Hi all,
>>>
>>> I'm coming from the lartc mailing list, here's the original text:
>>>
>>> =
>>>
>>> I have multiple routers which connect to multiple upstream providers, 
>>> I
>>> have noticed a high latency shift in icmp (and generally all 
>>> connection)
>>> if I run b2 upload-file --threads 40 (and I can reproduce this)
>>>
>>> What options do I have to analyze why this happens?
>>>
>>> General Info:
>>>
>>> Routers are connected between each other with 10G Mellanox Connect-X
>>> cards via 10G SPF+ DAC cables via a 10G Switch from fs.com
>>> Latency generally is around 0.18 ms between all routers (4).
>>> Throughput is 9.4 Gbit/s with 0 retransmissions when tested with 
>>> iperf3.
>>> 2 of the 4 routers are connected upstream with a 1G connection 
>>> (separate
>>> port, same network card)
>>> All routers have the full internet routing tables, i.e. 80k entries 
>>> for
>>> IPv6 and 830k entries for IPv4
>>> Conntrack is disabled (-j NOTRACK)
>>> Kernel 5.4.60 (custom)
>>> 2x Xeon X5670 @ 2.93 Ghz
>>> 96 GB RAM
>>> No Swap
>>> CentOs 7
>>>
>>> During high latency:
>>>
>>> Latency on routers which have the traffic flow increases to 12 - 20 
>>> ms,
>>> for all interfaces, moving of the stream (via bgp disable session) 
>>> moves
>>> also the high latency
>>> iperf3 performance plumets to 300 - 400 MBits
>>> CPU load (user / system) are around 0.1%
>>> Ram Usage is around 3 - 4 GB
>>> if_packets count is stable (around 8000 pkt/s more)
>>
>> I'm not sure I get your topology. Packets are going from where to 
>> where,
>> and what link is the bottleneck for the transfer you're doing? Are you
>> measuring the latency along the same path?
>>
>> Have you tried running 'mtr' to figure out which hop the latency is 
>> at?
>
> I tried to draw the topology, I hope this is okay and explains betters 
> what's happening:
>
> https://drive.google.com/file/d/15oAsxiNfsbjB9a855Q_dh6YvFZBDdY5I/view?usp=sharing

Ohh, right, you're pinging between two of the routers across a 10 Gbps
link with plenty of capacity to spare, and *that* goes up by two orders
of magnitude when you start the transfer, even though the transfer
itself is <1Gbps? Am I understanding you correctly now?

If so, this sounds more like a driver issue, or maybe something to do
with scheduling. Does it only happen with ICMP? You could try this tool
for a userspace UDP measurement:

https://github.com/heistp/irtt/

Also, what happens if you ping a host on the internet (*through* the
router instead of *to* it)?

And which version of the Connect-X cards are you using (or rather, which
driver? mlx4?)

> So it must be something in the kernel tacking on a delay, I could try to 
> do a bisect and build like 10 kernels :)

That may ultimately end up being necessary. However, when you say 'stock
kernel' you mean what CentOS ships, right? If so, that's not really a
3.10 kernel - the RHEL kernels (that centos is based on) are... somewhat
creative... about their versioning. So if you've switched to a vanilla
upstream kernel you may find bisecting difficult :/

How did you configure the new kernel? Did you start from scratch, or is
it based on the old centos config?

-Toke


Re: [Bloat] We built a new bufferbloat test and keen for feedback

2020-11-04 Thread Toke Høiland-Jørgensen via Bloat
Sam Westwood  writes:

> Hi everyone,
>
> My name is Sam and I'm the co-founder and COO of Waveform.com. At Waveform
> we provide equipment to help improve cell phone service, and being in the
> industry we've always been interested in all aspects of network
> connectivity. Bufferbloat for us has always been interesting, and while
> there are a few tests out there we never found one that was fantastic. So
> we thought we'd try and build one!
>
> My colleague Arshan has built the test, which we based upon the Cloudflare
> Speedtest template that was discussed earlier in the summer in a previous
> thread.
>
> We measure bufferbloat under two conditions: when downlink is saturated and
> when uplink is saturated. The test involves three stages: Unloaded,
> Downlink Saturated, and Uplink Saturated. In the first stage we simply
> measure latency to a file hosted on a CDN. This is usually around 5ms and
> could vary a bit based on the user's location. We use the webTiming API to
> find the time-to-first-byte, and consider that as the latency. In the
> second stage we run a download, while simultaneously measuring latency. In
> the third stage we do the same but for upload. Both download and upload
> usually take around 5 seconds. We show the median, first quartile and the
> third quartile on distribution charts corresponding to each stage to
> provide a visual representation of the latency variations. For download and
> upload we have used Cloudflare's speedtest backend.

This sounds great, thanks for doing this! It certainly sounds like
you're on the right track here. Some comments below...

> You can find the test here: https://www.waveform.com/apps/dev-arshan
>
> We built testing it on Chrome, but it works on Firefox and mobile too. On
> mobile results may be a little different, as the APIs aren't available and
> so instead we implemented a more manual method, which can be a little
> noisier.
>
> This is a really early alpha, and so we are keen to get any and all
> feedback you have :-). Things that we would particularly like feedback on:
>
>- How does the bufferbloat measure compare to other tests you may have
>run on the same connection (e.g. dslreports, fast.com)
>- How the throughput results (download/upload/latency) look compared to
>other tools

I'm fortunate enough to have a full Gbps fibre link, which makes it
really hard to saturate the connection from a browser test (or at all,
sometimes). Your test does a decent job and comes pretty close, at least
in Chromium (about 800 Mbps which is not too far off at the application
layer, considering I have a constant 100Mbps flow in the background
taking up some of the bandwidth). Firefox seems way off (one test said
500Mbps, the other >1000).

This does mean that I can't say much that's useful about your
bufferbloat scores, unfortunately. The latency measurement puzzled me a
bit (your tool says 16.6ms, but I get half that when I ping the
cloudfront.net CDN, which I think is what you're measuring against?),
but it does seem to stay fairly constant.

How do you calculate the jitter score? It's not obvious how you get from
the percentiles to the jitter.
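One plausible way to derive a jitter figure from latency quartiles (just a guess at a method, not a description of Waveform's actual calculation) would be the interquartile range of the samples, something like:

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (in ms)."""
    s = sorted(samples)
    idx = max(0, min(len(s) - 1, round(p / 100 * (len(s) - 1))))
    return s[idx]

def jitter_iqr(samples):
    """Jitter as the spread between the first and third quartiles."""
    return percentile(samples, 75) - percentile(samples, 25)

latency_ms = [15, 16, 16, 17, 18, 25, 16, 17]
print(jitter_iqr(latency_ms))  # spread of the middle half of the samples
```

An IQR-style measure has the nice property of ignoring the occasional outlier ping, unlike max-minus-min.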

Link to the test in Chromium:
https://www.waveform.com/apps/dev-arshan?test-id=91a55adc-7513-4b55-b8a6-0fa698ce634e

>- Any feedback on the user interface of the test itself? We know that
>before releasing more widely we will put more effort into explaining
>bufferbloat than we have so far.

Brain dump of thoughts on the UI:

I found it hard to tell whether it was doing anything while the test was
running. Most other tests have some kind of very obvious feedback
(moving graphs of bandwidth-over-time for cloudflare/dslreports, a
honking big number going up and down for fast.com), which I was missing
here. I would also have liked to see a measure of bandwidth over time; it
seems a bit suspicious (from a "verify that this is doing something
reasonable" PoV) that it just spits out a number at the end without
telling me how long it ran, or how it got to that number.

It wasn't obvious at first either that the header changes from
"bufferbloat test" to "your bufferbloat grade" once the test is over. I
think the stages + result would be better put somewhere else where
they're more obvious (the rest of the page grows downwards, so why isn't
the result at the "end"?).

Also, what are the shields below the grade supposed to mean? Do they
change depending on the result? On which criteria? And it's telling me I
have an A+ grade, so why is there a link to fix my bufferbloat issues?

Smaller nit, I found the up/down arrows in "up saturated" and "down
saturated" a bit hard to grasp at first, I think spelling out
upload/download would be better. Also not sure I like the "saturated"
term in the first place; do people know what that means in a networking
context? And can you be sure the network is actually *being* saturated?

Why is the "before you start" text below the page? Shouldn't it be at
the top? And maybe

Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-04 Thread Toke Høiland-Jørgensen via Bloat
Thomas Rosenstein via Bloat  writes:

> Hi all,
>
> I'm coming from the lartc mailing list, here's the original text:
>
> =
>
> I have multiple routers which connect to multiple upstream providers, I 
> have noticed a high latency shift in icmp (and generally all connection) 
> if I run b2 upload-file --threads 40 (and I can reproduce this)
>
> What options do I have to analyze why this happens?
>
> General Info:
>
> Routers are connected between each other with 10G Mellanox Connect-X 
> cards via 10G SPF+ DAC cables via a 10G Switch from fs.com
> Latency generally is around 0.18 ms between all routers (4).
> Throughput is 9.4 Gbit/s with 0 retransmissions when tested with iperf3.
> 2 of the 4 routers are connected upstream with a 1G connection (separate 
> port, same network card)
> All routers have the full internet routing tables, i.e. 80k entries for 
> IPv6 and 830k entries for IPv4
> Conntrack is disabled (-j NOTRACK)
> Kernel 5.4.60 (custom)
> 2x Xeon X5670 @ 2.93 Ghz
> 96 GB RAM
> No Swap
> CentOs 7
>
> During high latency:
>
> Latency on routers which have the traffic flow increases to 12 - 20 ms, 
> for all interfaces; moving the stream (by disabling the bgp session) also 
> moves the high latency
> iperf3 performance plummets to 300 - 400 MBits
> CPU load (user / system) are around 0.1%
> Ram Usage is around 3 - 4 GB
> if_packets count is stable (around 8000 pkt/s more)

I'm not sure I get your topology. Packets are going from where to where,
and what link is the bottleneck for the transfer you're doing? Are you
measuring the latency along the same path?

Have you tried running 'mtr' to figure out which hop the latency is at?

> Here is the tc -s qdisc output:

This indicates ("dropped 0" and "ecn_mark 0") that there's no
backpressure on the qdisc, so something else is going on.

Also, you said the issue goes away if you downgrade the kernel? That
does sound odd...

-Toke
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


[Bloat] Detecting FQ at the bottleneck

2020-10-25 Thread Toke Høiland-Jørgensen via Bloat
This popped up in my Google Scholar mentions:

https://arxiv.org/pdf/2010.08362

It proposes using a delay-based CC when FQ is present, and a loss-based
when it isn't. It has a fairly straight-forward mechanism for detecting
an FQ bottleneck: Start two flows where one has 2x the sending rate of
the other, keep increasing their sending rates until both suffer losses,
and observe the goodput at this point: If it's ~1:1 there's FQ,
otherwise there isn't.

They cite 98% detection accuracy using netns-based tests and sch_fq.
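The detection idea can be illustrated with a toy model (my own sketch of the logic, not the paper's code): under a fair-queueing bottleneck the two flows converge to equal goodput, while a shared FIFO splits capacity roughly in proportion to the sending rates:

```python
def goodput(rates, capacity, fq):
    """Steady-state goodput for flows sharing one bottleneck (toy model).

    fq=True: max-min fair sharing, as a fair-queueing scheduler gives.
    fq=False: a plain FIFO, which tends to split capacity roughly in
    proportion to each flow's offered rate."""
    if not fq:
        total = sum(rates)
        return [capacity * r / total for r in rates]
    # Max-min fair allocation: satisfy the smallest demands first,
    # then split what's left equally among the rest.
    alloc = [0.0] * len(rates)
    todo = sorted(range(len(rates)), key=lambda i: rates[i])
    cap = capacity
    while todo:
        share = cap / len(todo)
        i = todo[0]
        if rates[i] <= share:
            alloc[i] = rates[i]
            cap -= rates[i]
            todo.pop(0)
        else:
            for j in todo:
                alloc[j] = share
            break
    return alloc

# The probe: one flow at 2x the other's rate, both pushed past capacity.
g_fq = goodput([200, 100], 100, fq=True)
g_fifo = goodput([200, 100], 100, fq=False)
print(g_fq[0] / g_fq[1], g_fifo[0] / g_fifo[1])  # ~1:1 under FQ, ~2:1 under FIFO
```

The real mechanism of course has to deal with noise, cross traffic and loss timing, hence the 98% rather than 100%.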

-Toke


Re: [Bloat] cake + ipv6

2020-10-01 Thread Toke Høiland-Jørgensen via Bloat
Daniel Sterling  writes:

> On Mon, Sep 28, 2020 at 11:14 AM Toke Høiland-Jørgensen  wrote:
>> It depends. A 'sparse' flow should get consistent priority
>
>> That's per flow. But you're misunderstanding the 100ms value. That's the
>> 'interval', which is (simplifying a bit) the amount of time CAKE will
>> wait until it reacts to a flow building a queue.
>
>> If a flow exceeds its fair share *rate*,
>> it'll no longer (from CAKE's PoV) be a 'sparse flow', and it'll get the
>> same treatment as all other flows (round-robin scheduling), and if it
>> keeps sending at this higher rate, it'll keep being scheduled in this
>> way. If the flow is non-elastic (i.e., doesn't slow down in response to
>> packet drops), it'll self-congest and you'll see that as increased
>> latency.
>
> Ah! Thank you *very* much for this explanation. I greatly appreciate
> the effort everyone in this group puts into explaining (and tolerating
> :) ) new users of cake.
>
> In my case: I am happy to report this is *not* a bug or an issue with
> cake, as I originally thought. I am able to reproduce the issue I was
> seeing (high ping times as reported by the xbox game's network
> monitoring) w/o cake being in the mix at all. So this issue is either
> with how I've configured / built openwrt, or with my wireless network
> mesh, or with the xbox itself. It is NOT an issue with cake.
>
> Thank you all very much again. I will continue to use and test cake
> and let you know if I encounter further issues with cake itself.

You're welcome! Happy experimenting :)

-Toke


Re: [Bloat] cake + ipv6

2020-09-28 Thread Toke Høiland-Jørgensen via Bloat
Daniel Sterling  writes:

> I guess the reason I'm surprised is I'm confused about the following:
>
> For UDP streams that use <1mbit down , should I expect cake in ingress
> mode to keep those at low latency even in the face of a constantly
> full queue / backlog, using "besteffort" but also using host
> isolation?

It depends. A 'sparse' flow should get consistent priority (and hence
low latency), but what exactly constitutes a sparse flow varies with the
link bandwidth and usage. I wrote a longish analysis of this for
FQ-CoDel[0] which should more or less carry over to CAKE; basically it
boils down to any flow using less than its "fair share" of the link
should get priority, assuming its packets are spaced evenly. The main
difference between CAKE and FQ-CoDel is that the host fairness changes a
flow's 'fair share', and that CAKE has a 'cooldown' period for sparse
flows after they disappear, which may change dynamics slightly. But the
basic mechanism is the same.

[0] https://ieeexplore.ieee.org/document/8469111/
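As a very rough sketch of the condition (simplified from that analysis; the paper has the exact bounds, and CAKE's host isolation first divides the fair share per host, which this ignores):

```python
def is_sparse(flow_rate_bps, link_rate_bps, n_active_flows):
    """Simplified sparseness test: a flow keeps getting the sparse-flow
    priority boost roughly as long as it uses less than its fair share
    of the link (assuming its packets are evenly spaced)."""
    fair_share = link_rate_bps / max(1, n_active_flows)
    return flow_rate_bps < fair_share

# A 256 kbit/s game stream vs. 4 bulk flows on a 10 Mbit/s link:
print(is_sparse(256_000, 10_000_000, 5))    # the game flow stays sparse
print(is_sparse(3_000_000, 10_000_000, 5))  # a bulk flow does not
```

Note how the threshold moves with both link rate and the number of competing flows, which is why "sparse" isn't a fixed rate.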

> By backlog I mean what I see with tc -s qdisc. I assume that's the
> total of all individual flow backlogs, right?

Yup.

> I'm guessing no, but I'm wondering why not. Let's say we have some
> hosts doing downloads as fast as they can with as many connections as
> they can, and another separate host doing "light" UDP (xbox FPS
> traffic).
>
> So that's, say, 4 hosts constantly filling their queue's backlog -- it
> will always have as many bytes as the rtt setting allows -- so by
> default up to 100ms of bytes at my bandwidth setting, per flow, right?
> Or is that per host?

That's per flow. But you're misunderstanding the 100ms value. That's the
'interval', which is (simplifying a bit) the amount of time CAKE will
wait until it reacts to a flow building a queue. The actual amount of
queueing CAKE is *aiming for* is the 'target', which is interval/20 so
5ms by default. So in the 'steady state' a flow's backlog should
oscillate around this (ignoring for the moment that "steady state" is an
idealisation that rarely, if ever, exists in the real world).
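In numbers (using the simplified interval/20 relation above; real CAKE also clamps the target upwards at low bandwidths so that at least one MTU-sized packet fits, which this sketch ignores):

```python
def cake_target_ms(interval_ms):
    # target is (roughly) 5% of the interval, i.e. interval / 20
    return interval_ms / 20.0

print(cake_target_ms(100))  # default 100 ms interval -> 5.0 ms target
```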

BTW, you can see per-flow statistics by using 'tc -s class show
$DEVICE'.

> And then another host (the xbox) that will have a constant flow that
> doesn't really respond to shaping hints -- it's going to have a steady
> state of packets that it wants to receive and send no matter what. It
> might go from "high" to "low" update resolution (e.g. 256kbit to
> 128bkit), but that's about it. It will always want about 256kbit down
> and 128kbit up with v4 UDP.
>
> Normally that stream will have an rtt of < 50ms. Sometimes, e.g.
> in-between rounds of the same game (thus the same UDP flow), the
> server might let the rtt spike to 100+ ms since nothing much needs to
> be sent between rounds.

I'm not quite sure what you mean by "the server will let the RTT spike",
actually. Above you seemed to be saying that the gaming flow would just
carry on sending at whatever rate it wants?

> But once the new round starts, we'll want low latency again.
>
> Is it at all possible that cake, seeing the UDP stream is no longer
> demanding low latency (in-between rounds), it figures it can let its
> rtt stay at 100+ms per its rtt settings, even after the new round
> starts and the xbox wants low latency again?

You're attributing a bit more intent to CAKE here than it really
possesses ;)

CAKE doesn't have a notion of "what a flow wants". It just schedules
flows in a certain way, and if a flow happens to use less than its fair
share of bandwidth (the analysis linked above), it'll get temporary
priority whenever a packet from that flow arrives.

> That is, since every host wants some traffic, and most if not all the
> queues / backlogs will always be filled, is it possible that once a
> flow allows its rtt to rise, cake won't let it back down again until
> there's a lull?

Yes, this part is right, sorta. If a flow exceeds its fair share *rate*,
it'll no longer (from CAKE's PoV) be a 'sparse flow', and it'll get the
same treatment as all other flows (round-robin scheduling), and if it
keeps sending at this higher rate, it'll keep being scheduled in this
way. If the flow is non-elastic (i.e., doesn't slow down in response to
packet drops), it'll self-congest and you'll see that as increased
latency. So if what you meant by "the server will let the RTT spike", is
that the flow is bursty at times, and in some periods it'll increase its
rate above what CAKE's notion of fair share is, then yeah, that can lead
to the behaviour you're seeing.
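A toy queue model of that self-congestion (my sketch, assuming a sender that doesn't back off): once the arrival rate exceeds the rate the scheduler grants the flow, the backlog, and hence the queueing delay, grows with time:

```python
def queue_delay_ms(arrival_bps, service_bps, duration_s):
    """Queueing delay after `duration_s` of sustained overload, in a
    toy fluid model: backlog = excess rate x time, drained at the
    service rate."""
    excess_bps = max(0, arrival_bps - service_bps)
    backlog_bits = excess_bps * duration_s
    return 1000.0 * backlog_bits / service_bps

# A flow sending 256 kbit/s but only scheduled for 200 kbit/s:
print(queue_delay_ms(256_000, 200_000, 1.0))  # delay keeps growing each second
```

Which is why a non-elastic flow that bursts above its fair share sees its own latency climb until it slows down (or CoDel starts dropping).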

> As I said, I solved this by giving xbox traffic absolute first
> priority with the "prio" qdisc. Obviously this means the xbox traffic
> can starve everything else given a malicious flow, but that's not
> likely to happen and if it does, I will notice.

If it helps to always prioritise the flow, then that seems to be an
indication that what I'm describing above is what you're seeing.

The various diffserv modes of CAKE are meant to be a 

Re: [Bloat] How about a topical LWN article on demonstrating the real-world goodness of CAKE?

2020-09-07 Thread Toke Høiland-Jørgensen via Bloat
Dave Collier-Brown  writes:

> LWN said OK, but I'm stuck on the search for a striking test, one that 
> resonates with "grandma".
>
> My next two thoughts, probably for the current long weekend, is either
> to call a loop-back number with Skype and/or ask Toke how he got "Big
> Buck Bunny" to suffer dropouts. I'd love to use the latter, as it
> aligns with the observation that this is a time when conference-call
> failures are driving my colleagues to drink (;-))

I assume you're referring to the Dash data included in the PoliFi paper,
right? :)

Not sure if I ever got it to drop out completely; just did some
measurements of which bitrate it picked, and the context was airtime
prioritisation, not so much the latency improvements. But anyway, the
tests just used the reference dash.js[0] player, with a logger addition
that Flent can parse[1].

I don't recall what exactly is needed on the server side to run this,
but I think it's basically just dropping the right Big Buck Bunny
tarball on a web server along with dash-logger.js from the Flent sources :)

-Toke


[0] https://github.com/Dash-Industry-Forum/dash.js/wiki
[1] https://github.com/tohojo/flent/commit/6b83896cc0df1d468577ef0f35abbab6dd025c3f


Re: [Bloat] Other CAKE territory (was: CAKE in openwrt high CPU)

2020-09-04 Thread Toke Høiland-Jørgensen via Bloat
David Collier-Brown  writes:

> On 2020-09-03 10:32 a.m., Toke Høiland-Jørgensen via Bloat wrote
>
>> Yeah, offloading of some sort is another option, but I consider that
>> outside of the "CAKE stays relevant" territory, since that will most
>> likely involve an entirely programmable packet scheduler. There was some
>> discussion of adding such a qdisc to Linux at LPC[0]. The Eiffel[1]
>> algorithm seems promising.
>>
>> -Toke
>
> I'm wondering if edge servers with 1Gb NICs are inside the "CAKE stays 
> relevant" territory?
>
> My main customer/employer has a gazillion of those, currently reporting
>
> qdisc mq 0: root
>
> qdisc pfifo_fast 0: parent :8 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 
> 1 1 1
>
> ...
>
> because their OS is just a tiny bit elderly (;-)). We were planning to 
> roll forward this quarter to centos 8.2, where CAKE is an option.
>
> It strikes me that the self-tuning capacity of CAKE might be valuable 
> for a whole /class/ of small rack-mounted machines, but you just 
> mentioned the desire for better multi-processor support.
>
> Am I reaching for the moon, or is this something within reach?

As Jonathan says, servers mostly have enough CPU that running at 1gbps
is not an issue. And especially if you're not shaping, running CAKE in
unlimited mode should not be an issue.

However, do consider what you're trying to achieve here. Most of the
specific features of CAKE are targeting gateway routers. For instance,
for a server you may be better off with sch_fq to also get efficient
pacing support. Depends on what the server is doing...

But please, get rid of pfifo_fast! Anything is better than that! ;)

-Toke


Re: [Bloat] CAKE in openwrt high CPU

2020-09-03 Thread Toke Høiland-Jørgensen via Bloat


On 3 September 2020 17:31:07 CEST, Luca Muscariello  
wrote:
>On Thu, Sep 3, 2020 at 4:32 PM Toke Høiland-Jørgensen 
>wrote:
>>
>> Luca Muscariello  writes:
>>
>> > On Thu, Sep 3, 2020 at 3:19 PM Mikael Abrahamsson via Bloat
>> >  wrote:
>> >>
>> >> On Tue, 1 Sep 2020, Toke Høiland-Jørgensen wrote:
>> >>
>> >> > Yup, the number of cores is only going to go up, so for CAKE to
>stay
>> >> > relevant it'll need to be able to take advantage of this
>eventually :)
>> >>
>> >> https://www.hardkernel.com/shop/odroid-h2plus/ is an interesting
>platform,
>> >> it has a quad core machine with 2 x 2.5GbE NICs.
>> >>
>> >> When using something like this for routing with HTB+CAKE for
>bidirectional
>> >> shaping below line rate, what would be the main things that would
>need to
>> >> be improved?
>> >
>> > IMO, hardware offloading for shaping, beyond this specific
>platform.
>> > I don't know if there is any roadmap with that objective.
>>
>> Yeah, offloading of some sort is another option, but I consider that
>> outside of the "CAKE stays relevant" territory, since that will most
>> likely involve an entirely programmable packet scheduler. There was
>some
>> discussion of adding such a qdisc to Linux at LPC[0]. The Eiffel[1]
>> algorithm seems promising.
>>
>> -Toke
>>
>> [0] https://linuxplumbersconf.org/event/7/contributions/679/
>> [1] https://www.usenix.org/conference/nsdi19/presentation/saeed
>
>These are all interesting efforts for scheduling but orthogonal to
>shaping
>and not going to help make shaping more scalable.

Eiffel says it can do shaping by way of a global calendar queue... Planning to 
put that to the test :)

-Toke


Re: [Bloat] CAKE in openwrt high CPU

2020-09-03 Thread Toke Høiland-Jørgensen via Bloat
Luca Muscariello  writes:

> On Thu, Sep 3, 2020 at 3:19 PM Mikael Abrahamsson via Bloat
>  wrote:
>>
>> On Tue, 1 Sep 2020, Toke Høiland-Jørgensen wrote:
>>
>> > Yup, the number of cores is only going to go up, so for CAKE to stay
>> > relevant it'll need to be able to take advantage of this eventually :)
>>
>> https://www.hardkernel.com/shop/odroid-h2plus/ is an interesting platform,
>> it has a quad core machine with 2 x 2.5GbE NICs.
>>
>> When using something like this for routing with HTB+CAKE for bidirectional
>> shaping below line rate, what would be the main things that would need to
>> be improved?
>
> IMO, hardware offloading for shaping, beyond this specific platform.
> I don't know if there is any roadmap with that objective.

Yeah, offloading of some sort is another option, but I consider that
outside of the "CAKE stays relevant" territory, since that will most
likely involve an entirely programmable packet scheduler. There was some
discussion of adding such a qdisc to Linux at LPC[0]. The Eiffel[1]
algorithm seems promising.

-Toke

[0] https://linuxplumbersconf.org/event/7/contributions/679/
[1] https://www.usenix.org/conference/nsdi19/presentation/saeed


Re: [Bloat] CAKE in openwrt high CPU

2020-09-03 Thread Toke Høiland-Jørgensen via Bloat
Mikael Abrahamsson  writes:

> On Mon, 31 Aug 2020, Toke Høiland-Jørgensen wrote:
>
>> And what about when you're running CAKE in 'unlimited' mode?
>
> I tried this:
>
> # tc qdisc add dev eth0 root cake bandwidth 900mbit

So the difference from before is just the lack of inbound shaping, or?

-Toke


Re: [Bloat] CAKE in openwrt high CPU

2020-09-03 Thread Toke Høiland-Jørgensen via Bloat
Mikael Abrahamsson  writes:

> On Tue, 1 Sep 2020, Toke Høiland-Jørgensen wrote:
>
>> Yup, the number of cores is only going to go up, so for CAKE to stay 
>> relevant it'll need to be able to take advantage of this eventually :)
>
> https://www.hardkernel.com/shop/odroid-h2plus/ is an interesting platform, 
> it has a quad core machine with 2 x 2.5GbE NICs.
>
> When using something like this for routing with HTB+CAKE for bidirectional 
> shaping below line rate, what would be the main things that would need to 
> be improved?

The aforementioned multi-processor support...

-Toke


Re: [Bloat] CAKE in openwrt high CPU

2020-09-02 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Foulkes  writes:

>> Right, so some benefit might be possible here. Does the NIC have
>> multiple hardware queues (`ls /sys/class/net/$IFACE/queues` should tell
>> you)?
>
> Here is the output of:
> /sys/devices/virtual/net/eth0.2/queues# ls
> rx-0  tx-0
> /sys/devices/virtual/net/eth0.2/queues/rx-0# cat rps_cpus 
> 0
>
> /sys/devices/virtual/net/eth0.2/queues/tx-0# cat xps_cpus 
> 0

Hmm, so no multiq support on this driver, it looks like. So not sure to
what extent it will be possible to effectively utilise both cores on
this box, sadly :/

>> Yup, the number of cores is only going to go up, so for CAKE to stay
>> relevant it'll need to be able to take advantage of this eventually :)
>
> True, the mid-range market is already there, and so soon will be the
> lower-end. And with ISPs lighting up more and more capacity, the
> demand will be there to be able to shape higher and higher rates.
>
> But I agree with Jonathan Morton that once every device has sufficient
> capacity, more makes no difference. I went from 100/15 to 300/24 and
> never noticed the difference.
>
> Hell, there are days I switch to my backup 10/0.7 DSL line for a test,
> and forget to switch back, and will work for hours and not notice I’m
> not on the 300Mbps line ;-)

Heh, if you can live with a 10/0.7 line without noticing I think you're
more patient than me ;) But still, fair point; that doesn't mean people
won't still *want* to run at higher speeds, though... :)

-Toke


Re: [Bloat] CAKE in openwrt high CPU

2020-09-01 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Foulkes  writes:

> Thanks Toke, we currently are on an MT7621a @880, so a dual-core.

Right, so some benefit might be possible here. Does the NIC have
multiple hardware queues (`ls /sys/class/net/$IFACE/queues` should tell
you)?

> And we are looking for a good quad-core platform that will support
> 600Mbps or more with Cake enabled, hopefully with AX radios as well.

Yup, the number of cores is only going to go up, so for CAKE to stay
relevant it'll need to be able to take advantage of this eventually :)

-Toke


Re: [Bloat] CAKE in openwrt high CPU

2020-09-01 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Morton  writes:

>> On 1 Sep, 2020, at 9:45 pm, Toke Høiland-Jørgensen via Bloat 
>>  wrote:
>> 
>> CAKE takes the global qdisc lock.
>
> Presumably this is a default mechanism because CAKE doesn't handle any
> locking itself.
>
> Obviously it would need to be replaced with at least a lock over
> CAKE's complete data structures, taking the lock on each entry point
> and releasing it at each return point, and I assume there is a flag we
> can set to indicate we do so. Finer-grained locking might be possible,
> but CAKE is fairly complex so that might be hard to implement. Locking
> per CAKE instance would at least allow running ingress and egress on
> different CPUs.

What you're describing here is basically the existing qdisc root lock.
It is per instance of the qdisc, and it is held only while enqueueing
and dequeueing packets from that qdisc. So it is possible today to run
the ingress and egress instances of CAKE on different CPUs. All you have
to do is schedule the packets to be processed on different CPUs in the
different directions - which usually means messing with RPS settings for
the NIC, and as I remarked to Sebastian, for many OpenWrt SOCs this is
not really supported...

To make CAKE truly take advantage of multiple CPUs, there are two
options:

1. Make it aware of multiple hardware queues. To do this, we need to
   implement the 'attach()' method in the Qdisc_ops struct (see sch_mq
   for an example). The idea here would be to create stub child qdiscs
   with a separate struct Qdisc_ops implementing enqueue() and
   dequeue(). These would be called separately for each hardware queue,
   with their separate locks held at the time; and with proper XPS
   steering, each hardware queue can be serviced by a separate CPU.

2. Set TCQ_F_NOLOCK in the qdisc flags; this will cause the existing
   enqueue() and dequeue() functions to be called without the root lock
   being held, and the qdisc is responsible for dealing with that
   itself.

Of course in either case, the trick is to get the CAKE data structures
to play nice with concurrent access from multiple CPUs. For option 1.
above, we could just duplicate all the flow queues for each netdev queue
and take the hit in wasted space - or we could partition the data
structure, either statically at init, or dynamically as each flow
becomes active. But at a minimum there would need to be some way for the
shaper to enforce the maximum rate. Maybe a granular lock or an atomic
is good enough for this, though?

Note also that for 2. there's an ongoing issue[0] with packets getting
stuck which is still unresolved, as far as I can tell - so not sure if
this is the right way to go. However, apart from this, the benefit of 2.
is that CAKE could *potentially* process packets on multiple CPUs
without relying on hardware multi-Q. I'm not quite sure if the stack
will actually process packets on more than one CPU without them,
though.

Either way, I suppose some experimentation would be needed to find the
best solution.

-Toke

[0] https://lore.kernel.org/netdev/CACS=qq+a0H=e8ylfu95ae7hr0bq9ytcbbn2rfx82ojnppkb...@mail.gmail.com/


Re: [Bloat] CAKE in openwrt high CPU

2020-09-01 Thread Toke Høiland-Jørgensen via Bloat
Sebastian Moeller  writes:

> Hi Toke,
>
>
>> On Sep 1, 2020, at 18:11, Toke Høiland-Jørgensen via Bloat 
>>  wrote:
>> 
>> Jonathan Foulkes  writes:
>> 
>>> Toke, that link returns a 404 for me.
>> 
>> Ah, seems an extra character snuck in at the end - try this:
>> 
>> https://github.com/dtaht/sch_cake/commit/3152477235c934022049fcddc063c45d37ec10e6
>> 
>>> For others, I’ve found that testing cake throughput with isolation options 
>>> enabled is tricky if there are many competing connections. 
>>> Like I keep having to tell my customers, fairness algorithms mean no one 
>>> device will ever gain 100% of the bandwidth so long as there are other open 
>>> & active connections from other devices.
>>> 
>>> That said, I’d love to find options to increase throughput for
>>> single-tin configs.
>> 
>> Yeah, doing something about this is on my list, one way or another. Not
>> sure how much more we can do in terms of overhead, so we may have to go
>> for multi-q (and multi-CPU) support. How many CPU cores does the
>> IQrouter have?
>
>   It might be worth looking at how the typical two cake instances
>   distribute across the available CPUs; in some versions of OpenWrt
>   all the cake instances and ethernet interrupt processing crowded
>   up on a single CPU, leading to "out of CPU" behaviour with 50%
>   idle remaining... I think that using a different RPS scheme might
>   work better.

Well, many home routers don't have any functional RPS at all. Also, it
doesn't help since CAKE takes the global qdisc lock. Both of those
issues should be fixed, ideally :)

-Toke


Re: [Bloat] CAKE in openwrt high CPU

2020-09-01 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Foulkes  writes:

> Toke, that link returns a 404 for me.

Ah, seems an extra character snuck in at the end - try this:

https://github.com/dtaht/sch_cake/commit/3152477235c934022049fcddc063c45d37ec10e6

> For others, I’ve found that testing cake throughput with isolation options 
> enabled is tricky if there are many competing connections. 
> Like I keep having to tell my customers, fairness algorithms mean no one 
> device will ever gain 100% of the bandwidth so long as there are other open & 
> active connections from other devices.
>
> That said, I’d love to find options to increase throughput for
> single-tin configs.

Yeah, doing something about this is on my list, one way or another. Not
sure how much more we can do in terms of overhead, so we may have to go
for multi-q (and multi-CPU) support. How many CPU cores does the
IQrouter have?

-Toke


[Bloat] Applying FQ-CoDel to multipath tunnelled QUIC

2020-09-01 Thread Toke Høiland-Jørgensen via Bloat
This popped up in my Google Scholar alerts:

https://www.diva-portal.org/smash/get/diva2:1451478/FULLTEXT02

It's a master's thesis (from my old university in Karlstad) describing
an integration of FQ-CoDel into a QUIC-based tunnel implementation, with
what seems to be very nice results[0] :)

Seems there's source code as well: https://github.com/FajFs/pquic

-Toke

[0] I only glanced through the results, didn't read the whole thing;
hence the 'seems to be'...


Re: [Bloat] CAKE in openwrt high CPU

2020-08-31 Thread Toke Høiland-Jørgensen via Bloat
Mikael Abrahamsson  writes:

> On Mon, 31 Aug 2020, Toke Høiland-Jørgensen wrote:
>
>> Hmm, you say CAKE and FQ-Codel - so you're not enabling the shaper (that
>> would be FQ-CoDel+HTB)? An exact config might be useful (or just the
>> output of tc -s qdisc).
>
> Yeah, I guess I'm also using HTB to get the 900 megabit/s SQM is looking 
> for.

Ah, right, makes more sense :)

> If I only use FQ_CODEL to get interface speeds my performance is fine.

And what about when you're running CAKE in 'unlimited' mode?

>> If you are indeed not shaping, maybe you're hitting the issue fixed by this 
>> commit?
>>
>> https://github.com/dtaht/sch_cake/commit/3152477235c934022049fcddc063c45d37ec10e6n
>
> I enabled it just now to get the config.
>
> qdisc cake 8030: dev eth0 root refcnt 9 bandwidth 900Mbit besteffort 
> triple-isolate nonat nowash no-ack-filter split-gso rtt 100.0ms raw 
> overhead 0

Hmm, right, you could try no-split-gso as an option as well; you're
pretty close to the point where we turn it off by default, and you're
getting pretty large packets (max_len), so your performance may be
suffering from the splitting...

-Toke


Re: [Bloat] CAKE in openwrt high CPU

2020-08-31 Thread Toke Høiland-Jørgensen via Bloat
Mikael Abrahamsson via Bloat  writes:

> Hi,
>
> I migrated to an APU2 (https://www.pcengines.ch/apu2.htm) as residential 
> router, from my previous WRT1200AC (marvell armada 385).
>
> I was running OpenWrt 18.06 on that one, now I am running latest 19.07.3 
> on the APU2.
>
> Before I had 500/100 and I had to use FQ_CODEL because CAKE took too much 
> CPU to be able to do 500/100 on the WRT1200AC. Now I upgraded to 1000/1000 
> and tried it again, and even the APU2 can only do CAKE up to ~300 
> megabit/s. With FQ_CODEL I get full speed (configure 900/900 in SQM in 
> OpenWrt).
>
> Looking in top, I see sirq% sitting at 50% pegged. This is typical what I 
> see when CPU based forwarding is maxed out. From my recollection of 
> running CAKE on earlier versions of openwrt (17.x) I don't remember CAKE 
> using more CPU than FQ_CODEL.
>
> Anyone know what's up? I'm fine running FQ_CODEL, it solves any 
> bufferbloat but... I thought CAKE supposedly should use less CPU, not 
> more?

Hmm, you say CAKE and FQ-Codel - so you're not enabling the shaper (that
would be FQ-CoDel+HTB)? An exact config might be useful (or just the
output of tc -s qdisc).

If you are indeed not shaping, maybe you're hitting the issue fixed by this 
commit?

https://github.com/dtaht/sch_cake/commit/3152477235c934022049fcddc063c45d37ec10e6

-Toke


Re: [Bloat] Phoronix: Linux 5.9 to allow FQ_PIE as default

2020-07-16 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Morton  writes:

> In any case, it is already possible to chose any qdisc you like (with
> default parameters) as the default qdisc.  I'm really not sure what
> the fuss is about.

No, it isn't - not at compile time. Which is what the phoronix post
references. It's literally this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=b97e9d9d67c88bc413a3c27734d45d98d8d52b00

-Toke


Re: [Bloat] Phoronix: Linux 5.9 to allow FQ_PIE as default

2020-07-16 Thread Toke Høiland-Jørgensen via Bloat
Rich Brown  writes:

> I was asking whether there was anything "there" (that is, interesting)
> in that Phoronix posting / Linux announcement.

Ah. No, as I said, not really; the patch just added fq_pie to the list
of qdiscs that can be set as default via sysctl...

>> On Jul 15, 2020, at 9:02 AM, Toke Høiland-Jørgensen  wrote:
>> 
>>> Is there any "there" here?
>> 
>> I'm sorry, what? :)
>
> Sorry, this is a reference to Gertrude Stein's quote... See Wiktionary
> - https://en.wiktionary.org/wiki/there_is_no_there_there

Ah, right, not familiar with that :)

-Toke


Re: [Bloat] Phoronix: Linux 5.9 to allow FQ_PIE as default

2020-07-15 Thread Toke Høiland-Jørgensen via Bloat
Rich Brown  writes:

> Is there any "there" here?

I'm sorry, what? :)

> https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1193738-linux-5-9-to-allow-defaulting-to-fq-pie-queuing-discipline-for-fighting-bufferbloat

As far as the patch goes, allowing fq_pie as default makes sense in that
it is a "works with defaults" type of qdisc. Which was also the reason
given in the patch, IIRC.
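For reference, the runtime equivalent looks something like this (a sketch;
assumes the fq_pie module is available on your kernel):

```shell
# Select fq_pie as the default qdisc for newly created interfaces.
sysctl -w net.core.default_qdisc=fq_pie

# Persist the setting across reboots (the file name is arbitrary):
echo 'net.core.default_qdisc=fq_pie' > /etc/sysctl.d/99-default-qdisc.conf
```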

Not sure why anyone would *want* to run it as default over fq_codel; but
TBH I have never tried it either, so... *shrugs* :)

-Toke


[Bloat] Finally shipping: Second print run of "Bufferbloat and Beyond" (my thesis)

2020-06-16 Thread Toke Høiland-Jørgensen via Bloat
Hi everyone

Dave already spilled the beans a few months ago. However, everything got
delayed by COVID-related logistics challenges, and since this has now
finally been cleared up and shipping has commenced, I thought I'd make
an "official" announcement:

The second print run of my thesis ("Bufferbloat and Beyond") is now
shipping! The university kindly agreed to do this and fund it from their
general "research dissemination" budget, which means you can order a
physical copy for free(!), and it will be shipped to you anywhere in the
world!

Simply go here:

https://bufferbloat-and-beyond.net/

and fill in the form, and a copy should find its way to you in whatever
time frame your local postal service operates at for international
shipments. It is also possible to download a digital version, of course,
and the web site contains links to the individual papers.

Nothing has changed content-wise since the original version was
published a bit over a year and a half ago (apart from the correction of
an embarrassing TeX-related typesetting mistake). However, I think it's
pretty cool to be able to offer up physical copies this way, so don't be
shy about ordering one if I didn't already give you one! Also, feel free
to share the link :)

Many thanks to the university for doing this, and to Dave for his
insistent reminders to get me to set this all up!

-Toke


Re: [Bloat] Is still netperf a valid tool?

2020-06-15 Thread Toke Høiland-Jørgensen via Bloat
Sergio Belkin  writes:

> Hi,
> I've seen that many of the recommended tools to diagnose/troubleshoot
> bufferbloat use netperf.
> Netperf in https://github.com/HewlettPackard/netperf has many years of
> inactivity. In fact, in recent versions of distros don't include it.
> So, my question is: is still netperf a reliable tool?

Reliable in the sense that it works and produces results that are likely
to be fairly close to the reality you want to measure? Absolutely.
Reliable in the sense that you can always rely on it being available?
Unfortunately not.

The latter is more of a licensing issue, though, which unfortunately
also means that it is not likely to be fixed... :(

-Toke


Re: [Bloat] [Ecn-sane] Fwd: [tsvwg] Fwd: Working Group Last Call: QUIC protocol drafts

2020-06-10 Thread Toke Høiland-Jørgensen via Bloat
Dave Taht  writes:

> I am happy to see quic in last call. there are a ton of interoperble
> implementations now.

And in related news, this RFC of pacing from userspace was posted on
netdev yesterday:

https://lore.kernel.org/netdev/20200609140934.110785-1-willemdebruijn.ker...@gmail.com/T/

-Toke


Re: [Bloat] What's a good non-intrusive way to look at bloat (and perhaps things like gout (:-))

2020-06-04 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Morton  writes:

>> On 4 Jun, 2020, at 1:21 am, Dave Collier-Brown 
>>  wrote:
>> 
>> We've good tools to measure network performance under stress, by the
>> simple expedient of stressing it, but is there a good approach I
>> could recommend to my company to monitor a bunch of reasonably modern
>> links, without the measurement significantly affecting their state?
>> 
>> I don't mind increasing bandwidth usage, but I'm downright grumpy
>> about adding to the service time: I have a transaction that times out
>> for gross slowness if it takes much more that an tenth of a second,
>> and it involves a scatter-gather interaction with at least 10
>> customers in that time.
>> 
>> I'm topically interested in bloat, but really we should understand
>> "everything" about our links. If they can get the bloats like cattle,
>> they can probably get the gout, like King Henry the Eighth (;-))
>> 
>> My platform is Centos 8, and I have lots of Smarter Colleagues to
>> help.
>
> My first advice would be to browse pollere.net for tools - like pping
> (passive ping), which monitors the latency of flows in transit. That
> should give you some interesting information without adding any load
> at all. There is also connmon (https://github.com/pollere/connmon).

Ah, good idea, totally forgot about Kathy's tools! :)

I figure one could probably implement something like connmon in eBPF (as
an XDP or TC hook program) and have it run as an always-on monitor with
fairly low overhead. Dave, if you have development resources to throw at
this, I'll be happy to help with pointers on how to get the eBPF bits
working. I believe CentOS 8.2+ should have the needed kernel support...
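To illustrate the idea those tools are built on, here is a minimal, hypothetical
sketch of passive RTT estimation in the pping style (not pollere.net's actual
code): record when each TCP timestamp value (TSval) first leaves, then take an
RTT sample when the peer echoes it back as TSecr.

```python
# Simplified sketch of pping-style passive RTT estimation. Real tools parse
# live packets; here the input is synthetic (direction, tsval, tsecr, time)
# records standing in for captured TCP headers.

def passive_rtts(packets):
    """Return RTT samples by matching outbound TSvals to echoed TSecrs."""
    first_seen = {}  # TSval -> time it was first seen leaving
    rtts = []
    for direction, tsval, tsecr, t in packets:
        if direction == "out":
            first_seen.setdefault(tsval, t)
        elif tsecr in first_seen:
            # Inbound packet echoing a TSval we saw leave: one RTT sample.
            rtts.append(round(t - first_seen.pop(tsecr), 6))
    return rtts

# Synthetic two-way trace (times in seconds):
trace = [
    ("out", 100, 0, 0.000),
    ("in", 900, 100, 0.030),   # echoes TSval 100 -> 30 ms sample
    ("out", 101, 900, 0.040),
    ("in", 901, 101, 0.065),   # echoes TSval 101 -> 25 ms sample
]
print(passive_rtts(trace))  # [0.03, 0.025]
```

Since it only observes traffic already in flight, this adds no load to the
links being monitored.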

Of course, you could also just use the connmon utility as-is if you have
CPU cycles to spare for the extra overhead (it looks like it's using
libpcap to capture the packets and process them in userspace).

-Toke


Re: [Bloat] New speed/latency/jitter test site from Cloudflare

2020-06-04 Thread Toke Høiland-Jørgensen via Bloat
Jonathan Morton  writes:

>> On 3 Jun, 2020, at 7:48 pm, Dave Taht  wrote:
>> 
>> I am of course, always interested in how they are measuring latency, and 
>> where.
>
> They don't seem to be adding more latency measurements once the
> download tests begin.  So in effect they are only measuring idle
> latency.

Yup, but they do seem to be amenable to fixing that:
https://github.com/cloudflare/worker-speedtest-template/issues/13

As for where the latency is measured to, it seems to be HTTP 'pings' to
the nearest Cloudflare CDN endpoint.

-Toke


Re: [Bloat] New speed/latency/jitter test site from Cloudflare

2020-06-03 Thread Toke Høiland-Jørgensen via Bloat
Adam Hunt  writes:

> https://speed.cloudflare.com/
> https://blog.cloudflare.com/test-your-home-network-performance/
> https://news.ycombinator.com/item?id=23313657

Ah, cool; we already had a mention of this the other day, but didn't see
the announcement or discussion.

Opened an issue to request bloat tests:
https://github.com/cloudflare/worker-speedtest-template/issues/13 if
anyone wants to go and chime in (or bug the developers, they seem to be
active in that hnews thread...)

-Toke


Re: [Bloat] this explains speedtest stuff

2020-04-26 Thread Toke Høiland-Jørgensen via Bloat
Kenneth Porter  writes:

> Maybe we need some Youtube videos showing end user experiences of the
> benefit of reduced latency.

The (now defunct) RITE project did this one, which I thought was rather
good: https://youtu.be/F1a-eMF9xdY

-Toke


Re: [Bloat] this explains speedtest stuff

2020-04-26 Thread Toke Høiland-Jørgensen via Bloat
Kenneth Porter  writes:

> On 4/25/2020 9:00 AM, Dave Taht wrote:
>> Oh, I misread your report. I thought this was cake, not fq_codel. Care
>> to try that?
>
> I'd love to if I knew how to add cake to CentOS 7. I've installed kernel 
> modules for unsupported Ethernet interfaces before, so perhaps cake is 
> available that way from a 3rd party repo? Or I could adapt a 3rd party 
> driver RPM's source to add cake, instead. I really don't want to do a 
> full custom kernel, though. That's hard to maintain over time as there's 
> a new kernel in the updates every month or two. Some hints on how to add 
> a qdisc to a kernel would be welcome.

There's an out-of-tree version of cake here that should theoretically
build as a module if you have the right kernel headers installed:

https://github.com/dtaht/sch_cake

However, RHEL kernels (and thus CentOS) lie about their kernel versions,
so it may be that all the compatibility stuff for old kernels we have in
that repo is going to break. Feel free to give it a shot, though.

Or, y'know, just upgrade to CentOS 8 ;)
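For anyone who wants to try, the build typically goes something like this (a
sketch; the package names are assumptions for CentOS, and the compatibility
caveat above still applies):

```shell
# Toolchain plus headers matching the *running* kernel.
yum install -y gcc make "kernel-devel-$(uname -r)"

# Fetch, build, and load the out-of-tree module.
git clone https://github.com/dtaht/sch_cake.git
cd sch_cake
make && make install
modprobe sch_cake
```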

-Toke


Re: [Bloat] this explains speedtest stuff

2020-04-24 Thread Toke Høiland-Jørgensen via Bloat
Kenneth Porter  writes:

> On 4/23/2020 6:20 PM, Dave Taht wrote:
>> I used layer_cake with this overriding the defaults on my cable modems
>> in /etc/config/sqm
>>
>>  option iqdisc_opts 'docsis wash besteffort nat ingress '
>>  option eqdisc_opts 'docsis ack-filter nat'
>
> I think I see where to do this in the GUI. (I can't remember if I 
> installed a mini-emacs and I can never remember how to drive vi, so raw 
> file editing might be harder than a GUI field.)
>
> Where can I find documentation on what those options do?

`man tc-cake`

-Toke


Re: [Bloat] Quick question about lists

2020-03-30 Thread Toke Høiland-Jørgensen via Bloat
Erika Miller  writes:

>  Quick question about lists
> Is it okay if we feature lists.bufferbloat in our next email newsletter?
> It's a perfect fit for a piece we're doing and I think our audience would
> find some of the content on your site super useful.

Hi Erika

Your email got stuck in the mailing list moderation queue, hence the
lack of replies.

As for your question, well, lists.bufferbloat.net is a public mailing
list archive, so I'd say feature away. Although I must say I'm a bit
confused as to why you'd want to feature that, but maybe you meant the
bufferbloat.net site itself?

-Toke


[Bloat] Clearing out the moderation queues

2020-03-30 Thread Toke Høiland-Jørgensen via Bloat
Hey everyone

I'm going through the moderation queue of these two lists, and, well,
turns out there was a bit of a backlog. Up to four years of backlog, in
fact. I'm letting everything through that is not spam, so if you're
wondering why a lot of old emails suddenly show up in your mailbox
that's why.

Apologies for the inconvenience, I'll keep an eye on the moderation
requests in the future (I wasn't getting notifications before, but now
that I am it shouldn't be an issue to get a moderation turn-around
slightly below four years ;)).

-Toke