On 6 Nov 2020, at 15:13, Jesper Dangaard Brouer wrote:

On Fri, 6 Nov 2020 13:53:58 +0100
Jesper Dangaard Brouer <bro...@redhat.com> wrote:

[...]

Could this be related to netlink? I have gobgpd running on these
routers, which injects routes via netlink.
But the churn rate during the tests is minimal, maybe 30-40 routes
every second.

Yes, this could be related.  The internal data structure for FIB
lookups is a fib_trie, which is a compressed patricia tree, related to
the radix tree idea.  Thus, I can imagine that the kernel has to
rebuild/rebalance the tree with all these updates.
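As a very rough illustration of the structure (a much-simplified
sketch, not the real fib_trie code; the real thing lives in
net/ipv4/fib_trie.c as struct key_vector and friends):

    #include <stdint.h>

    /* Each internal node consumes 'bits' bits of the destination at
     * once (level compression), and 'pos' skips bits that all children
     * share (path compression).  Route inserts/deletes can force nodes
     * to grow or shrink, which is the rebuild/rebalance work mentioned
     * above. */
    struct tnode {
            uint32_t key;            /* prefix bits matched so far */
            uint8_t  pos;            /* bit position to branch on */
            uint8_t  bits;           /* node has 2^bits children */
            struct tnode *child[];
    };

    static struct tnode *trie_descend(struct tnode *n, uint32_t dst)
    {
            while (n && n->bits) {
                    uint32_t idx = (dst >> n->pos) & ((1u << n->bits) - 1);
                    n = n->child[idx];
            }
            return n;  /* candidate leaf; caller still verifies the prefix */
    }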

Reading the kernel code: the IPv4 fib_trie code is very well tuned and
fully RCU-ified, meaning the read side is lock-free.  The resize()
function in net/ipv4/fib_trie.c has a max_work limiter to avoid it
using too much time, and the update path also looks lock-free.
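The read side follows the normal RCU pattern, roughly like this (a
simplified sketch of the pattern only, with placeholder names
(trie_root, is_leaf(), index_of()), not the actual fib_table_lookup()
code):

    rcu_read_lock();
    /* Walk the trie; each pointer is fetched with rcu_dereference(),
     * so a concurrent update or resize never blocks the lookup. */
    n = rcu_dereference(trie_root);
    while (n && !is_leaf(n))
            n = rcu_dereference(n->child[index_of(n, dst)]);
    rcu_read_unlock();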

The IPv6 update path looks scarier, as it seems to take a "bh"
spinlock that can block softirqs from running; see net/ipv6/ip6_fib.c
(spin_lock_bh(&f6i->fib6_table->tb6_lock)).
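The write-side pattern there is roughly (simplified; 'table' is just a
placeholder for the fib6 table pointer):

    spin_lock_bh(&table->tb6_lock);   /* also disables BH on this CPU */
    /* ... insert/delete fib6 nodes ... */
    spin_unlock_bh(&table->tb6_lock);

So while a route update holds that lock, softirq processing on that
CPU is deferred, which is exactly the kind of thing that could show up
as latency spikes.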

I'm using ping on IPv4, but I'll try to see if IPv6 makes any difference!


Have you tried using 'perf record' to observe what is happening on the system while these latency incidents occur? (Let me know if you want some cmdline hints.)
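For example, something along these lines would be a generic starting
point (not tuned to this specific case):

    # system-wide profile with call-graphs, started during a latency spike
    perf record -a -g -- sleep 10
    perf report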

Haven't tried this yet. If you have some hints on what events to monitor, I'll take them!


--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat
