[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-24 Thread Stefan Fleischmann
** Package changed: linux-signed (Ubuntu) => slurm-wlm (Ubuntu)
** Changed in: slurm-wlm (Ubuntu)
   Status: Confirmed => Invalid
** Summary changed:
- cgroup2 broken since 5.15.0-90-generic?
+ load_ebpf_prog() fails for long bpf() logs
-- You received this bug notification because you

[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-24 Thread Stefan Fleischmann
So it turns out this is not a kernel bug after all. As @hedrick mentioned, it is indeed related to the bpf logs. I suppose kernel 5.15 just produces longer logs here than the newer kernels do. The original Slurm bug report, which includes a patch, is at https://bugs.schedmd.com/show_bug.cgi?id=17210
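For readers who want to see the failure mode in code: per bpf(2), BPF_PROG_LOAD fails with ENOSPC when the supplied log buffer is too small for the verifier output. The sketch below shows one possible way a caller could cope with that, by retrying once with the log disabled. It is not the patch from the SchedMD report. The names load_with_log_fallback, prog_load_fn and fake_load_log_overflows are hypothetical and exist only for this illustration; a real callback would wrap the bpf() system call.

```c
#include <errno.h>
#include <linux/bpf.h>

/* Hypothetical loader callback: returns a program fd, or -1 with errno set,
 * in the style of bpf(BPF_PROG_LOAD, ...). */
typedef int (*prog_load_fn)(union bpf_attr *attr);

/* Try the load as configured. If it fails with ENOSPC (log buffer too
 * small), clear all three log fields and retry once without a log. */
int load_with_log_fallback(prog_load_fn load, union bpf_attr *attr)
{
    int fd = load(attr);

    if (fd >= 0 || errno != ENOSPC)
        return fd;

    /* Logging off: level, buffer and size must all be zero together. */
    attr->log_level = 0;
    attr->log_buf = 0;
    attr->log_size = 0;
    return load(attr);
}

/* Stand-in loader for demonstration only: behaves like a kernel whose
 * verifier output never fits the caller's buffer. */
int fake_load_log_overflows(union bpf_attr *attr)
{
    if (attr->log_level != 0) {
        errno = ENOSPC;
        return -1;
    }
    return 3; /* pretend file descriptor */
}
```

Passing the loader as a callback keeps the retry policy separate from the system call itself, so the policy can be exercised without root privileges.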

[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-23 Thread Stefan Fleischmann
@hedrick: Regarding that workaround you mentioned above, I would guess it only suppresses the error message but doesn't fix the problem with broken cgroup confinement. Is that correct? I've done more testing and have identified the following commit:

[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-23 Thread Charles Hedrick
Furthermore, slurm doesn't actually use the log data that the kernel would pass back. So my code is better even with kernels that work. Why pass a log buffer that you aren't going to use?

[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-23 Thread Charles Hedrick
However the same patch is in 6.5. I give up. At least I have a workaround. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed in Ubuntu. https://bugs.launchpad.net/bugs/2050098 Title: cgroup2 broken since 5.15.0-90-generic?

[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-23 Thread Charles Hedrick
From the timing, I suspect kernel.org commit 2dcb31e65d26a29a6842500e904907180e80a091, but I don't understand the code, so I can't tell whether there's actually a problem there.

[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-23 Thread Charles Hedrick
In src/plugins/cgroup/v2/ebpf.c, comment out logging. I.e. change

    attr.log_level = 1;
    attr.log_buf = (size_t) log;
    attr.log_size = sizeof(log);

to

    attr.log_level
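
The message is cut off before the replacement lines, so the exact change is not shown here. As a rough sketch of what "no verifier log" looks like at the bpf() level, the following fills a BPF_PROG_LOAD attribute block with all three log fields set to zero, which is the state the kernel treats as logging disabled. The helper name init_prog_load_attr_no_log and the choice of BPF_PROG_TYPE_CGROUP_DEVICE are assumptions made for this illustration, not a copy of Slurm's code.

```c
#include <linux/bpf.h>
#include <string.h>

/* Hypothetical helper: prepare a BPF_PROG_LOAD attribute block that asks
 * the kernel for no verifier log at all. */
void init_prog_load_attr_no_log(union bpf_attr *attr,
                                const struct bpf_insn *insns,
                                unsigned int insn_cnt,
                                const char *license)
{
    /* Unused fields of union bpf_attr must be zero. */
    memset(attr, 0, sizeof(*attr));

    /* Assumed program type for a cgroup device filter. */
    attr->prog_type = BPF_PROG_TYPE_CGROUP_DEVICE;
    attr->insns = (__u64)(unsigned long)insns;
    attr->insn_cnt = insn_cnt;
    attr->license = (__u64)(unsigned long)license;

    /* Logging off: level, buffer and size are all zero, so the kernel
     * has no buffer to fill and no log overflow can occur. */
    attr->log_level = 0;
    attr->log_buf = 0;
    attr->log_size = 0;
}
```

A caller would then pass the prepared block to the bpf() system call with the BPF_PROG_LOAD command. With no log buffer there is nothing to overflow, at the cost of losing the verifier's diagnostics if a program is rejected.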

[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-23 Thread Charles Hedrick
I'm concerned with the security implications of being frozen on a kernel we can't update. I suspect we should start testing an HWE kernel. That raises other issues, since there are lots of other features we also need to work, so it's going to require a fair amount of testing. I'd accelerate our

[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-23 Thread Charles Hedrick
We are also seeing this. We're now stuck on old kernels until this is fixed.

[Kernel-packages] [Bug 2050098] Re: cgroup2 broken since 5.15.0-90-generic?

2024-01-23 Thread Stefan Fleischmann
** Summary changed:
- cgroup2 appears to be broken
+ cgroup2 broken since 5.15.0-90-generic?