I've been digging into this, and it appears to be a regression
introduced by the following patch, which was first released in
Linux 6.0.0:
https://github.com/torvalds/linux/commit/3cd3399dd7a84

The bug is not a memory leak but rather an error in how memory usage is
counted. Excess memory is not actually being consumed, though the bug is
still fatal since the counter controls Linux's memory pressure logic.

The (apparently) responsible patch is a performance optimisation that
attempts to reduce the frequency of writes to the system-wide counter
and that (I suspect) subtly misuses some atomic operation on ARM. If
you revert this patch in a recent kernel, the bug disappears.
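
For illustration only, the general idea of batching counter updates
through per-CPU reserves can be sketched as below. This is not the
kernel's code: the names account_pages, global_view and exact_total, the
CPU count and the flush threshold are all hypothetical, and the sketch is
single-threaded. It only shows why the shared counter becomes approximate
and why every local delta must eventually reach it intact.

```c
#include <assert.h>
#include <stdatomic.h>

#define NCPU 4            /* hypothetical CPU count for this sketch */
#define RESERVE_LIMIT 256 /* hypothetical flush threshold, in pages */

static atomic_long global_pages;  /* shared counter, costly to write */
static long local_reserve[NCPU];  /* per-CPU delta not yet flushed */

/* Record a change of `delta` pages on behalf of `cpu`. The shared
 * counter is only touched once the local delta grows large enough. */
static void account_pages(int cpu, long delta)
{
    assert(cpu >= 0 && cpu < NCPU);
    local_reserve[cpu] += delta;
    if (local_reserve[cpu] >= RESERVE_LIMIT ||
        local_reserve[cpu] <= -RESERVE_LIMIT) {
        atomic_fetch_add(&global_pages, local_reserve[cpu]);
        local_reserve[cpu] = 0;
    }
}

/* What pressure logic reading only the shared counter would see. */
static long global_view(void)
{
    return atomic_load(&global_pages);
}

/* Exact usage: the shared counter plus every unflushed local delta. */
static long exact_total(void)
{
    long total = atomic_load(&global_pages);
    for (int i = 0; i < NCPU; i++)
        total += local_reserve[i];
    return total;
}
```

In a scheme of this shape, the shared counter stays within a bounded
distance of the exact total only as long as no local update is lost. If
an update to a local reserve were dropped or applied twice, the shared
counter would drift permanently even though no memory is actually
leaked.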

I am currently working on a detailed bug report for the relevant kernel
maintainers.

NB: It appears that the "5.15" kernel shipped by Rocky (and RHEL)
includes a back-port of the offending patch, hence my seeing the bug in
that kernel version on Rocky Linux. A vanilla build of 5.15 without the
Red Hat patches does not exhibit the bug on my system.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2045560

Title:
  TCP memory  leak, slow network (arm64)
