Re: [Bloat] We built a new bufferbloat test and keen for feedback

2020-11-05 Thread Sebastian Moeller

Hi Sam,
 
first, thanks! It is always good to see more dedicated tools to assess latency
under load, especially tools that are easy to use and do not require the user
to maintain her/his own dedicated endpoints!
More below in-line, prefixed [SM].
 
 

Sent: Wednesday, 04 November 2020 at 22:30
From: "Sam Westwood" 
To: bloat@lists.bufferbloat.net
Subject: [Bloat] We built a new bufferbloat test and keen for feedback

Hi everyone, 
My name is Sam and I'm the co-founder and COO of Waveform.com. At Waveform we 
provide equipment to help improve cell phone service, and being in the industry 
we've always been interested in all aspects of network connectivity. 
Bufferbloat for us has always been interesting, and while there are a few tests 
out there we never found one that was fantastic. So we thought we'd try and 
build one!
My colleague Arshan has built the test, which we based upon the Cloudflare 
Speedtest template that was discussed earlier in the summer in a previous 
thread.
 
We measure bufferbloat under two conditions: when downlink is saturated and 
when uplink is saturated.
 
        [SM] This is a decent starting point. In addition, it might be helpful
to at least optionally include a test with bidirectional saturating load; in
the past, such tests were typically quite successful at detecting bufferbloat
sources that were less obvious in the uni-directional load tests. I am not
sure, however, how well that can work with a browser-based test.
 
 
The test involves three stages: Unloaded, Downlink Saturated, and Uplink 
Saturated. In the first stage we simply measure latency to a file hosted on a 
CDN. This is usually around 5ms and could vary a bit based on the user's 
location. We use the webTiming API to find the time-to-first-byte, and consider 
that as the latency. 

[SM] Mmmh, I like that this is a relevant latency measure; it might
make sense, though, to make sure users realize that this is not equivalent to
the number from running an ICMP echo request against the same endpoint.


In the second stage we run a download, while simultaneously measuring latency. 
In the third stage we do the same but for upload. Both download and upload 
usually take around 5 seconds. 

[SM] On heavily bufferbloated links it often takes a considerable
amount of time for the bottleneck buffers to drain after a uni-directional
test, so it might make sense to separate the two directional tests with an
additional phase of idle latency measurements. If that latency matches the
initial unloaded latency, all is well; but if latency slowly ramps down in
that phase, you have a smoking gun for bad bufferbloat.
 Also, there are link technologies and scheduler techniques that can
prioritise relatively short flows (e.g. Comcast's PowerBoost). To avoid just
measuring the properties of these short-duration special modes, it might make
sense to optionally lengthen the test durations considerably, to say 30
seconds (empirically, PowerBoost does not engage for a full 30-second period
at full rate, but that might become an arms race). Also, to assess possible
root causes for latency and rate issues, it is very helpful to show
time-resolved plots that show the development of rate and latency over the
duration of all phases of the test. For example, using longer-running flent
tests I could pinpoint the cyclic channel scanning of my laptop's wifi as a
source of repeated bufferbloat with a period of ~10 seconds, by seeing evenly
spaced latency spikes and rate dips every 10 seconds that went away when
switching to wired ethernet...

We show the median, first quartile and the third quartile on distribution 
charts corresponding to each stage to provide a visual representation of the 
latency variations. For download and upload we have used Cloudflare's speedtest 
backend.

[SM] This is great; it would be nice, though, to also add a graphical
representation, be it a histogram or a cumulative distribution plot of
latencies (split out for idle, download, upload, and the idle period between
down- and upload).
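
(Purely for illustration, not the Waveform test's actual code and with all
names below made up, here is a small TypeScript sketch of how the reported
quartiles and such a per-phase cumulative distribution could be computed from
raw latency samples:)

  // Summarize one phase's latency samples (milliseconds) into the
  // quartiles shown on the candle charts plus empirical CDF points.
  function quantile(sorted: number[], q: number): number {
    // Linear interpolation between the two closest ranks; `sorted`
    // must already be in ascending order.
    const pos = (sorted.length - 1) * q;
    const lo = Math.floor(pos);
    const hi = Math.ceil(pos);
    return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
  }

  function summarize(samplesMs: number[]) {
    const sorted = [...samplesMs].sort((a, b) => a - b);
    return {
      q1: quantile(sorted, 0.25),
      median: quantile(sorted, 0.5),
      q3: quantile(sorted, 0.75),
      // Empirical CDF: fraction of samples at or below each value.
      cdf: sorted.map((x, i) => ({ ms: x, frac: (i + 1) / sorted.length })),
    };
  }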


Best Regards
Sebastian


 
You can find the test here: https://www.waveform.com/apps/dev-arshan
 
We built and tested it on Chrome, but it works on Firefox and mobile too. On
mobile, results may be a little different, as the APIs aren't available, so
instead we implemented a more manual method, which can be a little noisier.
 
This is a really early alpha, and so we are keen to get any and all feedback 
you have :-). Things that we would particularly like feedback on:
 *   How does the bufferbloat measure compare to other tests you may have run
on the same connection (e.g. dslreports, fast.com)
 *   How the throughput results (download/upload/latency) look compared to
other tools
 *   Any feedback on the user interface of the test itself? We know that
before releasing more widely we will put more effort into explaining
bufferbloat than we have so far.
 *   Anything else you would

Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Thomas Rosenstein via Bloat



On 5 Nov 2020, at 1:10, Toke Høiland-Jørgensen wrote:


"Thomas Rosenstein"  writes:


On 4 Nov 2020, at 17:10, Toke Høiland-Jørgensen wrote:


Thomas Rosenstein via Bloat  writes:


Hi all,

I'm coming from the lartc mailing list, here's the original text:

=

I have multiple routers which connect to multiple upstream providers, and I
have noticed a high latency shift in icmp (and generally all connections)
if I run b2 upload-file --threads 40 (and I can reproduce this)

What options do I have to analyze why this happens?

General Info:

Routers are connected to each other with 10G Mellanox Connect-X cards via
10G SFP+ DAC cables through a 10G switch from fs.com
Latency generally is around 0.18 ms between all routers (4).
Throughput is 9.4 Gbit/s with 0 retransmissions when tested with iperf3.
2 of the 4 routers are connected upstream with a 1G connection (separate
port, same network card)
All routers have the full internet routing tables, i.e. 80k entries for
IPv6 and 830k entries for IPv4
Conntrack is disabled (-j NOTRACK)
Kernel 5.4.60 (custom)
2x Xeon X5670 @ 2.93 GHz
96 GB RAM
No swap
CentOS 7

During high latency:

Latency on routers which carry the traffic flow increases to 12 - 20 ms,
for all interfaces; moving the stream (via bgp disable session) also moves
the high latency
iperf3 performance plummets to 300 - 400 MBits
CPU load (user / system) is around 0.1%
RAM usage is around 3 - 4 GB
if_packets count is stable (around 8000 pkt/s more)


I'm not sure I get your topology. Packets are going from where to where,
and what link is the bottleneck for the transfer you're doing? Are you
measuring the latency along the same path?

Have you tried running 'mtr' to figure out which hop the latency is at?


I tried to draw the topology; I hope this is okay and explains better
what's happening:

https://drive.google.com/file/d/15oAsxiNfsbjB9a855Q_dh6YvFZBDdY5I/view?usp=sharing


Ohh, right, you're pinging between two of the routers across a 10 Gbps
link with plenty of capacity to spare, and *that* goes up by two orders
of magnitude when you start the transfer, even though the transfer
itself is <1Gbps? Am I understanding you correctly now?


Exactly :)



If so, this sounds more like a driver issue, or maybe something to do
with scheduling. Does it only happen with ICMP? You could try this tool
for a userspace UDP measurement:


It happens with all packets, therefore the transfer to backblaze with 40 
threads goes down to ~8MB/s instead of >60MB/s




https://github.com/heistp/irtt/



I'll try what that reports!


Also, what happens if you ping a host on the internet (*through* the
router instead of *to* it)?


Same issue, but twice as pronounced, as it seems all interfaces are
affected.

So, ping on one interface and the second has the issue.
Also all traffic across the host has the issue, but on both sides, so
ping to the internet increased by 2x




And which version of the Connect-X cards are you using (or rather, which
driver? mlx4?)



It's Connect-X 4 Lx cards, specifically: MCX4121A-ACAT
Driver is mlx5_core

So it must be something in the kernel tacking on a delay; I could try to
do a bisect and build like 10 kernels :)


That may ultimately end up being necessary. However, when you say 'stock
kernel' you mean what CentOS ships, right? If so, that's not really a
3.10 kernel - the RHEL kernels (that CentOS is based on) are... somewhat
creative... about their versioning. So if you've switched to a vanilla
upstream kernel you may find bisecting difficult :/


Yep, the default that CentOS ships. I just tested 4.12.5; there the issue
also does not happen. So I guess I can bisect it then... (really don't want
to 😃)




How did you configure the new kernel? Did you start from scratch, or is
it based on the old CentOS config?


First oldconfig, and from there I then added additional options for IB,
NVMe, etc. (which I don't really need on the routers)




-Toke

___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
"Thomas Rosenstein"  writes:

>> If so, this sounds more like a driver issue, or maybe something to do
>> with scheduling. Does it only happen with ICMP? You could try this 
>> tool
>> for a userspace UDP measurement:
>
> It happens with all packets, therefore the transfer to backblaze with 40 
> threads goes down to ~8MB/s instead of >60MB/s

Huh, right, definitely sounds like a kernel bug; or maybe the new kernel
is getting the hardware into a state where it bugs out when there are
lots of flows or something.

You could try looking at the ethtool stats (ethtool -S) while running
the test and see if any error counters go up. Here's a handy script to
monitor changes in the counters:

https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl

> I'll try what that reports!
>
>> Also, what happens if you ping a host on the internet (*through* the
>> router instead of *to* it)?
>
> Same issue, but twice pronounced, as it seems all interfaces are 
> affected.
> So, ping on one interface and the second has the issue.
> Also all traffic across the host has the issue, but on both sides, so 
> ping to the internet increased by 2x

Right, so even an unloaded interface suffers? But this is the same NIC,
right? So it could still be a hardware issue...

> Yep default that CentOS ships, I just tested 4.12.5 there the issue also 
> does not happen. So I guess I can bisect it then...(really don't want to 
> 😃)

Well that at least narrows it down :)

>>
>> How did you configure the new kernel? Did you start from scratch, or 
>> is
>> it based on the old centos config?
>
> first oldconfig and from there then added additional options for IB, 
> NVMe, etc (which I don't really need on the routers)

OK, so you're probably building with roughly the same options in terms
of scheduling granularity etc. That's good. Did you enable spectre
mitigations etc on the new kernel? What's the output of
`tail /sys/devices/system/cpu/vulnerabilities/*` ?

-Toke
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] We built a new bufferbloat test and keen for feedback

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
Dave Collier-Brown  writes:

> Tried it, and I really like the header and use of candle-charts!
>
> I got this:
>
> [cid:part1.81AC21AC.758FE66F@indexexchange.com]
>
> I'd like to be able to explain it to non-techie folks (my grandma, and also 
> my IT team at work (;-)), so I wonder on their behalf...
>
>   *   Why is unloaded a large number, and loaded a small one?
>  *   milliseconds sound like delay, so 111.7 ms sounds slower than 0.0 ms
>   *   Is bloat and latency something bad? The zeroes are in green, does that 
> mean they're good?
>   *   Is max "bad"? In that case I'd call it "worst" and min "best"
>   *   Is median the middle or the average? (no kidding, I've been asked that! 
> I'd call it average)
>   *   Is 25% twenty-five percent of the packets? (I suspect it's a percentile)
>   *   What does this mean in terms of how many Skype calls I can have 
> happening at my house? I have two kids, a wife and a grandmother, all of whom 
> Skype a lot.
>
> Looking at the cool stuff in the banner, it looks like I can browse,
> do audio calls, video calls (just one, or many?) but not streaming
> (any or just 4k?) or gaming.  Emphasizing that would be instantly
> understandable by grandma and the kids.

Also, holy cow, what's going on with your connection? The unloaded
latency says 17/110/200 min/median/max RTT. Is that due to bad
measurements, or do you have a lot of cross traffic and a really bloated
link? :/

-Toke
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Thomas Rosenstein via Bloat



On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:


"Thomas Rosenstein"  writes:

If so, this sounds more like a driver issue, or maybe something to 
do

with scheduling. Does it only happen with ICMP? You could try this
tool
for a userspace UDP measurement:


It happens with all packets, therefore the transfer to backblaze with 
40

threads goes down to ~8MB/s instead of >60MB/s


Huh, right, definitely sounds like a kernel bug; or maybe the new 
kernel

is getting the hardware into a state where it bugs out when there are
lots of flows or something.

You could try looking at the ethtool stats (ethtool -S) while running
the test and see if any error counters go up. Here's a handy script to
monitor changes in the counters:

https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl


I'll try what that reports!


Also, what happens if you ping a host on the internet (*through* the
router instead of *to* it)?


Same issue, but twice pronounced, as it seems all interfaces are
affected.
So, ping on one interface and the second has the issue.
Also all traffic across the host has the issue, but on both sides, so
ping to the internet increased by 2x


Right, so even an unloaded interface suffers? But this is the same 
NIC,

right? So it could still be a hardware issue...

Yep default that CentOS ships, I just tested 4.12.5 there the issue 
also
does not happen. So I guess I can bisect it then...(really don't want 
to

😃)


Well that at least narrows it down :)


I just tested 5.9.4; it seems to also fix it partly. I have long stretches
where it looks good, and then some increases again. (3.10 stock has them
too, but not as high, rather 1-3 ms)


for example:

64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms

and then again:

64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms


To me it looks like there was some fix between 5.4.60 and 5.9.4 ...
can anyone pinpoint it?






How did you configure the new kernel? Did you start from scratch, or
is
it based on the old centos config?


first oldconfig and from there then added additional options for IB,
NVMe, etc (which I don't really need on the routers)


OK, so you're probably building with roughly the same options in terms
of scheduling granularity etc. That's good. Did you enable spectre
mitigations etc on the new kernel? What's the output of
`tail /sys/devices/system/cpu/vulnerabilities/*` ?


mitigations are off

==> /sys/devices/system/cpu/vulnerabilities/itlb_multihit <==
KVM: Vulnerable

==> /sys/devices/system/cpu/vulnerabilities/l1tf <==
Mitigation: PTE Inversion; VMX: vulnerable

==> /sys/devices/system/cpu/vulnerabilities/mds <==
Vulnerable; SMT vulnerable

==> /sys/devices/system/cpu/vulnerabilities/meltdown <==
Vulnerable

==> /sys/devices/system/cpu/vulnerabilities/spec_store_bypass <==
Vulnerable

==> /sys/devices/system/cpu/vulnerabilities/spectre_v1 <==
Vulnerable: __user pointer sanitization and usercopy barriers only; no 
swapgs barriers


==> /sys/devices/system/cpu/vulnerabilities/spectre_v2 <==
Vulnerable, STIBP: disabled

==> /sys/devices/system/cpu/vulnerabilities/srbds <==
Not affected

==> /sys/devices/system/cpu/vulnerabilities/tsx_async_abort <==
Not affected


Grub Boot options are: crashkernel=896M rd.lvm.lv=cl/root net.ifnames=0 
biosdevname=0 scsi_mod.use_blk_mq=1 dm_mod.use_blk_mq=y mitigations=off 
console=tty0 console=ttyS1,115200




-Toke

___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
"Thomas Rosenstein"  writes:

> On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:
>
>> "Thomas Rosenstein"  writes:
>>
 If so, this sounds more like a driver issue, or maybe something to 
 do
 with scheduling. Does it only happen with ICMP? You could try this
 tool
 for a userspace UDP measurement:
>>>
>>> It happens with all packets, therefore the transfer to backblaze with 
>>> 40
>>> threads goes down to ~8MB/s instead of >60MB/s
>>
>> Huh, right, definitely sounds like a kernel bug; or maybe the new 
>> kernel
>> is getting the hardware into a state where it bugs out when there are
>> lots of flows or something.
>>
>> You could try looking at the ethtool stats (ethtool -S) while running
>> the test and see if any error counters go up. Here's a handy script to
>> monitor changes in the counters:
>>
>> https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
>>
>>> I'll try what that reports!
>>>
 Also, what happens if you ping a host on the internet (*through* the
 router instead of *to* it)?
>>>
>>> Same issue, but twice pronounced, as it seems all interfaces are
>>> affected.
>>> So, ping on one interface and the second has the issue.
>>> Also all traffic across the host has the issue, but on both sides, so
>>> ping to the internet increased by 2x
>>
>> Right, so even an unloaded interface suffers? But this is the same 
>> NIC,
>> right? So it could still be a hardware issue...
>>
>>> Yep default that CentOS ships, I just tested 4.12.5 there the issue 
>>> also
>>> does not happen. So I guess I can bisect it then...(really don't want 
>>> to
>>> 😃)
>>
>> Well that at least narrows it down :)
>
> I just tested 5.9.4 seems to also fix it partly, I have long stretches 
> where it looks good, and then some increases again. (3.10 Stock has them 
> too, but not so high, rather 1-3 ms)
>
> for example:
>
> 64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
> 64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
> 64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
> 64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
> 64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms
>
> and then again:
>
> 64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
> 64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
> 64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
> 64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
> 64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
> 64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
> 64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
> 64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
> 64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
> 64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
> 64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
> 64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
> 64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
> 64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
> 64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
> 64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
> 64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
> 64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
> 64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
> 64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms
>
>
> For me it looks now that there was some fix between 5.4.60 and 5.9.4 ... 
> anyone can pinpoint it?

$ git log --no-merges --oneline v5.4.60..v5.9.4|wc -l
72932

Only 73k commits; should be easy, right? :)

(In other words no, I have no idea; I'd suggest either (a) asking on
netdev, (b) bisecting or (c) using 5.9+ and just making peace with not
knowing).

 How did you configure the new kernel? Did you start from scratch, or
 is
 it based on the old centos config?
>>>
>>> first oldconfig and from there then added additional options for IB,
>>> NVMe, etc (which I don't really need on the routers)
>>
>> OK, so you're probably building with roughly the same options in terms
>> of scheduling granularity etc. That's good. Did you enable spectre
>> mitigations etc on the new kernel? What's the output of
>> `tail /sys/devices/system/cpu/vulnerabilities/*` ?
>
> mitigations are off

Right, I just figured maybe you were hitting some threshold that
involved a lot of indirect calls which slowed things down due to
mitigations. Guess not, then...

-Toke
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Thomas Rosenstein via Bloat



On 5 Nov 2020, at 13:38, Toke Høiland-Jørgensen wrote:


"Thomas Rosenstein"  writes:


On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:


"Thomas Rosenstein"  writes:


If so, this sounds more like a driver issue, or maybe something to
do
with scheduling. Does it only happen with ICMP? You could try this
tool
for a userspace UDP measurement:


It happens with all packets, therefore the transfer to backblaze 
with

40
threads goes down to ~8MB/s instead of >60MB/s


Huh, right, definitely sounds like a kernel bug; or maybe the new
kernel
is getting the hardware into a state where it bugs out when there 
are

lots of flows or something.

You could try looking at the ethtool stats (ethtool -S) while 
running
the test and see if any error counters go up. Here's a handy script 
to

monitor changes in the counters:

https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl


I'll try what that reports!

Also, what happens if you ping a host on the internet (*through* 
the

router instead of *to* it)?


Same issue, but twice pronounced, as it seems all interfaces are
affected.
So, ping on one interface and the second has the issue.
Also all traffic across the host has the issue, but on both sides, 
so

ping to the internet increased by 2x


Right, so even an unloaded interface suffers? But this is the same
NIC,
right? So it could still be a hardware issue...


Yep default that CentOS ships, I just tested 4.12.5 there the issue
also
does not happen. So I guess I can bisect it then...(really don't 
want

to
😃)


Well that at least narrows it down :)


I just tested 5.9.4 seems to also fix it partly, I have long 
stretches
where it looks good, and then some increases again. (3.10 Stock has 
them

too, but not so high, rather 1-3 ms)

for example:

64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms

and then again:

64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms


For me it looks now that there was some fix between 5.4.60 and 5.9.4 
...

anyone can pinpoint it?


$ git log --no-merges --oneline v5.4.60..v5.9.4|wc -l
72932

Only 73k commits; should be easy, right? :)

(In other words no, I have no idea; I'd suggest either (a) asking on
netdev, (b) bisecting or (c) using 5.9+ and just making peace with not
knowing).


Guess I'll go the easy route and let it be ...

I'll update all routers to 5.9.4 and see if it fixes the traffic
flow - will report back once more after that.




How did you configure the new kernel? Did you start from scratch, 
or

is
it based on the old centos config?


first oldconfig and from there then added additional options for 
IB,

NVMe, etc (which I don't really need on the routers)


OK, so you're probably building with roughly the same options in 
terms

of scheduling granularity etc. That's good. Did you enable spectre
mitigations etc on the new kernel? What's the output of
`tail /sys/devices/system/cpu/vulnerabilities/*` ?


mitigations are off


Right, I just figured maybe you were hitting some threshold that
involved a lot of indirect calls which slowed things down due to
mitigations. Guess not, then...



Thanks for the support :)


-Toke

___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
"Thomas Rosenstein"  writes:

> On 5 Nov 2020, at 13:38, Toke Høiland-Jørgensen wrote:
>
>> "Thomas Rosenstein"  writes:
>>
>>> On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:
>>>
 "Thomas Rosenstein"  writes:

>> If so, this sounds more like a driver issue, or maybe something to
>> do
>> with scheduling. Does it only happen with ICMP? You could try this
>> tool
>> for a userspace UDP measurement:
>
> It happens with all packets, therefore the transfer to backblaze 
> with
> 40
> threads goes down to ~8MB/s instead of >60MB/s

 Huh, right, definitely sounds like a kernel bug; or maybe the new
 kernel
 is getting the hardware into a state where it bugs out when there 
 are
 lots of flows or something.

 You could try looking at the ethtool stats (ethtool -S) while 
 running
 the test and see if any error counters go up. Here's a handy script 
 to
 monitor changes in the counters:

 https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl

> I'll try what that reports!
>
>> Also, what happens if you ping a host on the internet (*through* 
>> the
>> router instead of *to* it)?
>
> Same issue, but twice pronounced, as it seems all interfaces are
> affected.
> So, ping on one interface and the second has the issue.
> Also all traffic across the host has the issue, but on both sides, 
> so
> ping to the internet increased by 2x

 Right, so even an unloaded interface suffers? But this is the same
 NIC,
 right? So it could still be a hardware issue...

> Yep default that CentOS ships, I just tested 4.12.5 there the issue
> also
> does not happen. So I guess I can bisect it then...(really don't 
> want
> to
> 😃)

 Well that at least narrows it down :)
>>>
>>> I just tested 5.9.4 seems to also fix it partly, I have long 
>>> stretches
>>> where it looks good, and then some increases again. (3.10 Stock has 
>>> them
>>> too, but not so high, rather 1-3 ms)
>>>
>>> for example:
>>>
>>> 64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
>>> 64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
>>> 64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
>>> 64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
>>> 64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms
>>>
>>> and then again:
>>>
>>> 64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
>>> 64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
>>> 64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
>>> 64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
>>> 64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
>>> 64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
>>> 64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
>>> 64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
>>> 64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
>>> 64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
>>> 64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
>>> 64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
>>> 64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
>>> 64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
>>> 64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
>>> 64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
>>> 64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
>>> 64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
>>> 64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
>>> 64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms
>>>
>>>
>>> For me it looks now that there was some fix between 5.4.60 and 5.9.4 
>>> ...
>>> anyone can pinpoint it?
>>
>> $ git log --no-merges --oneline v5.4.60..v5.9.4|wc -l
>> 72932
>>
>> Only 73k commits; should be easy, right? :)
>>
>> (In other words no, I have no idea; I'd suggest either (a) asking on
>> netdev, (b) bisecting or (c) using 5.9+ and just making peace with not
>> knowing).
>
> Guess I'll go the easy route and let it be ...
>
> I'll update all routers to the 5.9.4 and see if it fixes the traffic 
> flow - will report back once more after that.

Sounds like a plan :)

>>
>> How did you configure the new kernel? Did you start from scratch, 
>> or
>> is
>> it based on the old centos config?
>
> first oldconfig and from there then added additional options for 
> IB,
> NVMe, etc (which I don't really need on the routers)

 OK, so you're probably building with roughly the same options in 
 terms
 of scheduling granularity etc. That's good. Did you enable spectre
 mitigations etc on the new kernel? What's the output of
 `tail /sys/devices/system/cpu/vulnerabilities/*` ?
>>>
>>> mitigations are off
>>
>> Right, I just figured maybe you were hitting some threshold that
>> involved a lot of indirect calls which slowed things down due to
>> miti

[Bloat] Comparing bufferbloat tests (was: We built a new bufferbloat test and keen for feedback)

2020-11-05 Thread Dave Collier-Brown


On 2020-11-05 6:48 a.m., Toke Høiland-Jørgensen via Bloat wrote:

Also, holy cow, what's going on with your connection? The unloaded
latency says 17/110/200 min/median/max RTT. Is that due to bad
measurements, or do you have a lot of cross traffic and a really bloated
link? :/

-Toke


The tests differ somewhat while looking at an unloaded residential link 
provided by a local monopoly, Rogers Cable, and mitigated by an IQrouter (my 
old linksys is long dead (;-))

DSLReports says

 *   144.7 Mb/s down
 *   14.05 MB/s up
 *   bufferbloat A+
 *   downloading lag 40-100 ms

Waveform says:

 *   43.47 Mbps down
 *   16.05 Mbps up
 *   bufferbloat grade A+
 *   unloaded latency 93.5 ms

So we're reporting different speeds and RTTs. Are we using different units or 
definitions, I wonder?

--dave

--
David Collier-Brown, | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dave.collier-br...@indexexchange.com
 |  -- Mark Twain



___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

2020-11-05 Thread Jesper Dangaard Brouer
On Thu, 05 Nov 2020 13:22:10 +0100
Thomas Rosenstein via Bloat  wrote:

> On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:
> 
> > "Thomas Rosenstein"  writes:
> >  
> >>> If so, this sounds more like a driver issue, or maybe something to 
> >>> do with scheduling. Does it only happen with ICMP? You could try this
> >>> tool for a userspace UDP measurement:  
> >>
> >> It happens with all packets, therefore the transfer to backblaze with 
> >> 40 threads goes down to ~8MB/s instead of >60MB/s  
> >
> > Huh, right, definitely sounds like a kernel bug; or maybe the new 
> > kernel is getting the hardware into a state where it bugs out when
> > there are lots of flows or something.
> >
> > You could try looking at the ethtool stats (ethtool -S) while
> > running the test and see if any error counters go up. Here's a
> > handy script to monitor changes in the counters:
> >
> > https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
> >  
> >> I'll try what that reports!
> >>  
> >>> Also, what happens if you ping a host on the internet (*through*
> >>> the router instead of *to* it)?  
> >>
> >> Same issue, but twice pronounced, as it seems all interfaces are
> >> affected.
> >> So, ping on one interface and the second has the issue.
> >> Also all traffic across the host has the issue, but on both sides,
> >> so ping to the internet increased by 2x  
> >
> > Right, so even an unloaded interface suffers? But this is the same 
> > NIC, right? So it could still be a hardware issue...
> >  
> >> Yep default that CentOS ships, I just tested 4.12.5 there the
> >> issue also does not happen. So I guess I can bisect it
> >> then...(really don't want to 😃)  
> >
> > Well that at least narrows it down :)  
> 
> I just tested 5.9.4 seems to also fix it partly, I have long
> stretches where it looks good, and then some increases again. (3.10
> Stock has them too, but not so high, rather 1-3 ms)
> 
> for example:
> 
> 64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
> 64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
> 64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
> 64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
> 64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms
> 
> and then again:
> 
> 64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
> 64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
> 64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
> 64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
> 64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
> 64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
> 64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
> 64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
> 64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
> 64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
> 64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
> 64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
> 64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
> 64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
> 64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
> 64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
> 64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
> 64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
> 64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
> 64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms
> 
> 
> For me it looks now that there was some fix between 5.4.60 and 5.9.4
> ... anyone can pinpoint it?

I have some bpftrace tools to measure these kind of latency spikes here:
 [1] https://github.com/xdp-project/xdp-project/blob/master/areas/latency/

The tool you want is: softirq_net_latency.bt
 [2] 
https://github.com/xdp-project/xdp-project/blob/master/areas/latency/softirq_net_latency.bt

Example output see[3]:
 [3] https://bugzilla.redhat.com/show_bug.cgi?id=1795049#c8

Based on the kernel versions, I don't expect this to be the same latency
issue as described in the bugzilla[3] case (as IIRC it was fixed in
4.19).  It can still be a similar issue, where some userspace process is
reading information from the kernel (/sys/fs/cgroup/memory/memory.stat
in the BZ case) that blocks softirq from running, and results in these
latency spikes.



Install guide to bpftrace[4]:
 [4] 
https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/INSTALL.org
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] Comparing bufferbloat tests (was: We built a new bufferbloat test and keen for feedback)

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
Dave Collier-Brown  writes:

> On 2020-11-05 6:48 a.m., Toke Høiland-Jørgensen via Bloat wrote:
>
> Also, holy cow, what's going on with your connection? The unloaded
> latency says 17/110/200 min/median/max RTT. Is that due to bad
> measurements, or do you have a lot of cross traffic and a really bloated
> link? :/
>
> -Toke
>
>
> The tests differ somewhat while looking at an unloaded residential link 
> provided by a local monopoly, Rogers Cable, and mitigated by an IQrouter (my 
> old linksys is long dead (;-))
>
> DSLReports says
>
>   *   144.7 Mb/s down
>   *   14.05 MB/s up
>   *   bufferbloat A+
>   *   downloading lag 40-100 ms

Still a pretty big span from 40-100ms; how does that turn into an A+
score, I wonder?

> Waveform says:
>
>   *   43.47 Mbps down
>   *   16.05 Mbps up
>   *   bufferbloat grade A+
>   *   unloaded latency 93.5 ms
>
> So we're reporting different speeds and RTTs. Are we using different
> units or definitions, I wonder?

Well either that, or one of the tests is just busted. My immediate guess
would be the not-yet-released prototype is the least accurate ;)
I do wonder why, though...

-Toke
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] Comparing bufferbloat tests (was: We built a new bufferbloat test and keen for feedback)

2020-11-05 Thread Toke Høiland-Jørgensen via Bloat
> I believe TLS handshake time is not included here. I’m using the
> Resource Timing API to measure the time-to-first-byte for a request
> that I’m sending to retrieve a static file. The resource loading phases
> section of the documentation explicitly shows the different stages for
> DNS Lookup, TCP connection establishment, etc. I’m using the
> difference between requestStart and responseStart values. This value
> is deemed to be the same as the time-to-first-byte seen in the
> inspector’s network tab.

This does not seem completely ludicrous, at least :)
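
(For reference, a minimal TypeScript sketch of the approach described above:
fetch a static file and take responseStart - requestStart from its Resource
Timing entry. This is not the actual test code; the URL parameter is a
placeholder, and cross-origin resources need a Timing-Allow-Origin header
for these timing fields to be populated.)

  // Fetch a small static file and read its Resource Timing entry; the
  // time-to-first-byte is responseStart - requestStart, which excludes
  // DNS lookup and TCP/TLS connection setup.
  async function measureTtfbMs(url: string): Promise<number> {
    const resp = await fetch(url, { cache: "no-store" });
    await resp.arrayBuffer(); // make sure the timing entry is complete
    const entries =
      performance.getEntriesByName(url) as PerformanceResourceTiming[];
    const last = entries[entries.length - 1];
    return last.responseStart - last.requestStart;
  }

  // Usage sketch: take a handful of samples against a CDN-hosted file.
  // const samples: number[] = [];
  // for (let i = 0; i < 5; i++) samples.push(await measureTtfbMs(FILE_URL));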

> We’re using this static file that is hosted on a Google CDN. We tried
> multiple different files, and this one had the lowest latency in both
> locations that we tested it (I’m in Toronto, and my colleague Sina is
> in San Francisco).

Ah, so that's why that request showed up :)

Curious to know why you picked this instead of, say, something from
speed.cloudflare.com (since you're using that for the speed tests anyway)?

> @Toke Høiland-Jørgensen
>> Your test does a decent job and comes pretty close, at least
>> in Chromium (about 800 Mbps which is not too far off at the application
>> layer, considering I have a constant 100Mbps flow in the background
>> taking up some of the bandwidth). Firefox seems way off (one test said
>> 500Mbps the other >1000).
>
>
> The way I’m measuring download is that I make multiple simultaneous
> requests to cloudflare’s backend requesting 100MB files. Their backend
> simply returns a file that has “0”s in the body repeated until 100MB
> of file is generated. Then I use readable streams to make multiple
> measurements of (total bytes downloaded, timestamp). Then I fit a line
> to the measurements collected, and the slope of that line is the
> calculated bandwidth. For gigabit connections, this download happens
> very quickly, and it may be the case that not a lot of points are
> collected, in which case the fitted line is not accurate and one might
> get overly huge bandwidths, as in the >1000 case in your Firefox
> browser. I think this might be fixed if we increase the download time.
> Currently it’s 5s; maybe changing that to 10-20s would help. I think in
> general it’d be a good feature to have a “more advanced options”
> feature that allows the user to adjust some parameters of the
> connection (such as number of parallel connections, download
> scenario’s duration, upload scenario’s duration, etc.)

Yeah, I think running the test for longer will help; 5s is not nearly
enough to saturate a connection, especially not as the link speed increases.

> The reason I do this line-fitting is because I want to get rid of the
> bandwidth ramp-up time when the download begins.

Yeah, allowing some ramp-up time before determining the bandwidth seems
reasonable, but it's not generally possible to just pick a static number
of (say) seconds to chop off... Having the graph over time helps
sanity-check things, though.
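
(To make the described method concrete, a hedged TypeScript sketch: read the
download through a ReadableStream, record cumulative (time, bytes) samples,
and take the slope of a least-squares line fit as the bandwidth. This is not
the actual test code; only the /__down endpoint is taken from the thread,
everything else is assumption.)

  // Stream a large download, record cumulative (elapsed ms, bytes) samples,
  // and report the least-squares slope as throughput in Mbit/s. Fitting a
  // line discounts some of the ramp-up versus total-bytes / total-time.
  async function measureDownloadMbps(bytes = 100_000_000): Promise<number> {
    const resp = await fetch(`https://speed.cloudflare.com/__down?bytes=${bytes}`);
    const reader = resp.body!.getReader();
    const t0 = performance.now();
    const samples: { t: number; b: number }[] = [];
    let total = 0;
    for (;;) {
      const chunk = await reader.read();
      if (chunk.done) break;
      total += chunk.value.byteLength;
      samples.push({ t: performance.now() - t0, b: total });
    }
    // Least-squares slope of bytes over milliseconds.
    // (A real implementation would guard against too few samples.)
    const n = samples.length;
    const mt = samples.reduce((s, p) => s + p.t, 0) / n;
    const mb = samples.reduce((s, p) => s + p.b, 0) / n;
    let num = 0, den = 0;
    for (const p of samples) {
      num += (p.t - mt) * (p.b - mb);
      den += (p.t - mt) ** 2;
    }
    return ((num / den) * 1000 * 8) / 1e6; // bytes/ms -> Mbit/s
  }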

Also, a continuous graph of latency samples over time (for the whole
duration, including idle/upload/download) is usually very instructive
when plotting such a test.

> Real-time Bandwidth Reporting
> Using readable streams also allows for instantaneous bandwidth
> reporting (maybe using the average of a moving window) similar to what
> fast.com or speedtest.net do, but I unfortunately am not able to do the
> same thing with upload, since getting progress on http uploads adds
> some pre-flight OPTIONS requests, which cloudflare’s speedtest backend
> doesn’t allow. For this test we are directly hitting cloudflare’s
> backend; you can see this in the network tab:
>
> Our download is done by sending an http GET request to this endpoint:
> https://speed.cloudflare.com/__down?bytes=1
> and our upload is done by sending an http POST request to this endpoint:
> https://speed.cloudflare.com/__up
>
> Since we are using cloudflare’s backend we are limited by what they
> allow us to do.

The test at speed.cloudflare.com does seem to plot real-time upload
bandwidth; is that a privileged operation for themselves, or something?
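
(As a point of comparison, a hedged TypeScript sketch of a fallback that
needs no upload progress events at all, and therefore no preflight: time a
whole POST of a known-size body and report a single end-to-end average. This
is an assumption about one possible approach, not necessarily what the
Waveform test or Cloudflare's own test does.)

  // Without progress events the best we get is one average figure: upload a
  // zero-filled body of known size and time the whole request (note the
  // elapsed time also includes waiting for the server's response).
  async function measureUploadMbps(sizeBytes = 10_000_000): Promise<number> {
    const body = new Uint8Array(sizeBytes);
    const t0 = performance.now();
    await fetch("https://speed.cloudflare.com/__up", { method: "POST", body });
    const elapsedMs = performance.now() - t0;
    return (sizeBytes * 8) / 1000 / elapsedMs; // -> megabits per second
  }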

> I did try making my own worker which essentially does the same thing
> as cloudflare’s speedtest backend (they do have this template worker
> that for the most part does the same thing). I modified that worker a
> bit so that it all

[Bloat] Starlink packet routing details

2020-11-05 Thread Kenneth Porter




___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat