On 5 Nov 2020, at 1:10, Toke Høiland-Jørgensen wrote:

"Thomas Rosenstein" <thomas.rosenst...@creamfinance.com> writes:

On 4 Nov 2020, at 17:10, Toke Høiland-Jørgensen wrote:

Thomas Rosenstein via Bloat <bloat@lists.bufferbloat.net> writes:

Hi all,

I'm coming from the lartc mailing list, here's the original text:

=====

I have multiple routers which connect to multiple upstream providers. I
have noticed a high latency shift in ICMP (and generally all
connections) if I run b2 upload-file --threads 40 (and I can reproduce
this).

What options do I have to analyze why this happens?

General Info:

Routers are connected to each other with 10G Mellanox Connect-X cards
via 10G SFP+ DAC cables through a 10G switch from fs.com
Latency is generally around 0.18 ms between all routers (4).
Throughput is 9.4 Gbit/s with 0 retransmissions when tested with iperf3.
2 of the 4 routers are connected upstream with a 1G connection
(separate port, same network card)
All routers have the full internet routing tables, i.e. 80k entries for
IPv6 and 830k entries for IPv4
Conntrack is disabled (-j NOTRACK; rules sketched below)
Kernel 5.4.60 (custom)
2x Xeon X5670 @ 2.93 GHz
96 GB RAM
No swap
CentOS 7
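(For reference, bypassing conntrack with NOTRACK is typically done in
the raw table; the exact rules on these routers are an assumption, but
they usually look roughly like this:)

    iptables -t raw -A PREROUTING -j NOTRACK   # skip connection tracking for incoming packets
    iptables -t raw -A OUTPUT -j NOTRACK       # and for locally generated packets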

During high latency:

Latency on routers which carry the traffic flow increases to 12 - 20 ms
on all interfaces; moving the stream (via bgp disable session) also
moves the high latency
iperf3 performance plummets to 300 - 400 Mbit/s
CPU load (user / system) is around 0.1%
RAM usage is around 3 - 4 GB
if_packets count is stable (around 8000 pkt/s more)

I'm not sure I get your topology. Packets are going from where to
where, and what link is the bottleneck for the transfer you're doing?
Are you measuring the latency along the same path?

Have you tried running 'mtr' to figure out which hop the latency is at?
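Something along these lines would show per-hop latency; the target and
flags here are just an illustration, not a prescription:

    mtr -n -c 100 --report <far-end address>   # numeric output, 100 probes, per-hop summary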

I tried to draw the topology, I hope this is okay and explains better
what's happening:

https://drive.google.com/file/d/15oAsxiNfsbjB9a855Q_dh6YvFZBDdY5I/view?usp=sharing

Ohh, right, you're pinging between two of the routers across a 10 Gbps
link with plenty of capacity to spare, and *that* goes up by two orders
of magnitude when you start the transfer, even though the transfer
itself is <1Gbps? Am I understanding you correctly now?

Exactly :)


If so, this sounds more like a driver issue, or maybe something to do
with scheduling. Does it only happen with ICMP? You could try this tool
for a userspace UDP measurement:

It happens with all packets; the transfer to Backblaze with 40 threads goes down to ~8 MB/s instead of >60 MB/s.


https://github.com/heistp/irtt/
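A basic run might look like the sketch below; the probe interval,
duration and host are placeholders, not a recommendation:

    # on one router:
    irtt server
    # on another router (or a host behind it), 10 ms probes for 30 seconds:
    irtt client -i 10ms -d 30s <other-router>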


I'll try what that reports!

Also, what happens if you ping a host on the internet (*through* the
router instead of *to* it)?

Same issue, but twice as pronounced, as it seems all interfaces are
affected. So, ping on one interface and the second has the issue too.
Also, all traffic across the host has the issue, but on both sides, so
ping to the internet increases by 2x.


And which version of the Connect-X cards are you using (or rather, which
driver? mlx4?)


It's ConnectX-4 Lx cards, specifically: MCX4121A-ACAT
Driver is mlx5_core
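(For completeness, the bound driver can be confirmed per interface with
ethtool; the interface name is a placeholder:)

    ethtool -i <interface>   # shows driver (mlx5_core), firmware version and bus-info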

So it must be something in the kernel tacking on a delay; I could try to
do a bisect and build like 10 kernels :)

That may ultimately end up being necessary. However, when you say 'stock
kernel' you mean what CentOS ships, right? If so, that's not really a
3.10 kernel - the RHEL kernels (that CentOS is based on) are... somewhat
creative... about their versioning. So if you've switched to a vanilla
upstream kernel you may find bisecting difficult :/

Yep, the default that CentOS ships. I just tested 4.12.5; there the issue also does not happen. So I guess I can bisect it then... (really don't want to 😃)
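A rough sketch of that bisect, assuming a vanilla upstream tree and
using the kernels already tested as the endpoints:

    cd linux
    git bisect start
    git bisect bad v5.4        # 5.4.x shows the latency problem
    git bisect good v4.12      # 4.12.x does not
    # build and boot the commit git checks out, test, then mark it:
    git bisect good            # or: git bisect bad
    # repeat until git reports the first bad commit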


How did you configure the new kernel? Did you start from scratch, or is
it based on the old centos config?

First oldconfig, and from there I then added additional options for IB, NVMe, etc. (which I don't really need on the routers).
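For reference, that workflow is roughly the following; the paths and
extra options are illustrative:

    cp /boot/config-$(uname -r) .config   # start from the running CentOS config
    make oldconfig                        # answer prompts for options new to this kernel
    make -j$(nproc) bzImage modules
    sudo make modules_install install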


-Toke
_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat
