[Kernel-packages] [Bug 2016186] Re: 5.19 not reporting cgroups v1 blkio.throttle.io_serviced
Hey Andrea,

Thanks for the help getting this all fixed up. I see that the change is committed for Lunar and Kinetic. Is there a good way for me to follow when this will land in the Ubuntu Jammy linux-aws, linux-gcp, and linux-azure packages?

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2016186

Title:
  5.19 not reporting cgroups v1 blkio.throttle.io_serviced

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Kinetic:
  Fix Committed
Status in linux source package in Lunar:
  Fix Committed
Status in linux source package in Mantic:
  Incomplete

Bug description:
  [Impact]

  Commit f382fb0bcef4 ("block: remove legacy IO schedulers") introduced a behavior change in the blkio throttle cgroup subsystem: IO statistics are no longer reported unless a throttling rule is explicitly defined, because the current code only counts bios that are actually throttled.

  This behavior change can break user-space applications that rely on the old behavior (see original bug report below).

  [Test case]

  - mount cgroup v1
  - create a blkio cgroup
  - move a task into the blkio cgroup
  - perform some I/O (e.g., dd)
  - read the IO stats for the cgroup (blkio.throttle.io_serviced and blkio.throttle.io_service_bytes in cgroupfs)
  - IO stats are all 0, unless a throttle rule is defined

  Previous behavior (kernel 5.15) was to show I/O statistics even without throttling rules defined.

  [Fix]

  Apply / backport this fix:
  https://lore.kernel.org/lkml/20230507170631.89607-1-hanjinke@bytedance.com/t/

  [Regression potential]

  The fix affects the block IO cgroup subsystem, so we may see regressions in this particular cgroup subsystem with the fix applied.

  [Original bug report]

  Hi, I'm still investigating but am a bit stuck. Here's what I've found so far.

  Today I've upgraded some nodes in AWS EC2 from the previous v5.15 linux-aws package to the recently published v5.19 package and rebooted.

  It seems that even when there's disk activity, the files:

  /sys/fs/cgroup/blkio/blkio.throttle.io_serviced
  /sys/fs/cgroup/blkio/blkio.throttle.io_service_bytes

  are only ever populated with 0's. Previously, on v5.15, these would reflect the actual disk usage. No other system configuration changes were applied, just the kernel upgrade and reboot. I've also verified that simply rebooting a v5.15 node where this does work doesn't break the reporting.

  These EC2 instances are running with cgroups v1 due to other compatibility issues and I suspect that might be the issue.

  So far, I cannot find any differences. mtab shows the same v1 mount setup, and the kernel options match between v5.15 and v5.19. I'm more than happy to fetch whatever info would help out here.

  I'd love to get 5.19 working for us but we really need the data from these files.

  Info:
  Prior version that works: Linux ip-10-128-168-154 5.15.0-1031-aws #35-Ubuntu SMP Fri Feb 10 02:07:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  Upgraded version that's broken: Linux ip-10-128-166-219 5.19.0-1022-aws #23~22.04.1-Ubuntu SMP Fri Mar 17 15:38:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

  EC2 instances built off of the published 22.04 LTS AMI in us-east-1.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2016186/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
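For anyone wanting to check a node for the symptom described in the test case above, here is a minimal sketch that parses the throttle stats file and reports whether every counter is zero. It assumes the usual cgroup v1 layout of that file (per-device lines of the form "MAJ:MIN Operation Count" followed by a final "Total Count" line) and the mount path quoted in the report; the function names `parse_blkio_stats` and `looks_unreported` are made up for illustration and are not part of any existing tool.

```python
from pathlib import Path

# Path taken from the report; assumes a cgroup v1 blkio hierarchy is mounted here.
STATS_FILE = Path("/sys/fs/cgroup/blkio/blkio.throttle.io_serviced")


def parse_blkio_stats(text):
    """Parse cgroup v1 blkio stat text into {device: {operation: count}}.

    Per-device lines are assumed to look like "259:0 Read 1234"; the trailing
    grand-total line ("Total 1234") has two fields and is stored under "Total".
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            device, operation, count = fields
            stats.setdefault(device, {})[operation] = int(count)
        elif len(fields) == 2 and fields[0] == "Total":
            stats["Total"] = int(fields[1])
    return stats


def looks_unreported(stats):
    """Return True when no non-zero counter is present (the symptom in this bug)."""
    for value in stats.values():
        counts = value.values() if isinstance(value, dict) else [value]
        if any(count != 0 for count in counts):
            return False
    return True


if __name__ == "__main__":
    if STATS_FILE.exists():
        parsed = parse_blkio_stats(STATS_FILE.read_text())
        print("all zero / empty:", looks_unreported(parsed))
    else:
        print(f"{STATS_FILE} not found (cgroup v1 blkio not mounted?)")
```

Running this after generating some disk activity should print "all zero / empty: True" on an affected kernel with no throttle rule defined, going by the behavior described in this bug.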
[Kernel-packages] [Bug 2016186] Re: 5.19 not reporting cgroups v1 blkio.throttle.io_serviced
I've updated this to report the issue for linux-azure and linux-gcp, as their jammy-updates repos have recently updated to kernel 5.19 and appear to be affected as well.

I could use some guidance on reporting this upstream, as 5.19 doesn't seem to be a supported kernel version (looking at https://www.kernel.org/), so the correct way to go about that is unclear.

Related: https://packages.ubuntu.com/jammy-updates/linux-gcp-lts-22.04 doesn't exist as of right now. If that could be published, similar to https://packages.ubuntu.com/jammy-updates/linux-azure-lts-22.04 and https://packages.ubuntu.com/jammy-updates/linux-aws-lts-22.04, that'd be a huge help for me.

** Also affects: linux-gcp (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: linux-azure (Ubuntu)
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/2016186

Title:
  5.19 not reporting cgroups v1 blkio.throttle.io_serviced

Status in linux-aws package in Ubuntu:
  New
Status in linux-azure package in Ubuntu:
  New
Status in linux-gcp package in Ubuntu:
  New
[Kernel-packages] [Bug 2016186] Re: 5.19 not reporting cgroups v1 blkio.throttle.io_serviced
A few clarifications from IRC:

1. We run all of our Ubuntu 22.04 LTS nodes with the kernel args 'systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller=true' to force cgroups v1 as, unfortunately, we cannot safely turn on cgroups v2 yet (that's another pile of work I want to do!).

2. If you install 'linux-modules-extra-aws', run 'modprobe bfq', and then run 'echo bfq > /sys/block/nvme0n1/queue/scheduler', you will see stats in the '/sys/fs/cgroup/blkio/blkio.bfq.io_service*' files.

3. However, we continue to only see 0's in the '/sys/fs/cgroup/blkio/blkio.throttle.io_service*' files.

Potentially an upstream change, but definitely something that breaks with the '5.19.0.1022.23~22.04.6' Jammy package update. For me, this likely means I need to pin everything to the older 5.15 package pending cgroups v2 working or a fix to this.

Obviously I'd prefer having this fixed so that we can get to 5.19 and stick w/ cgroups v1. I'd also offer a note that pushing 5.19 to Jammy without this support feels like a breaking change. I'm more worried that _other_ cgroups v1 controllers aren't working in a way I haven't noticed yet.

Anyway, thanks so much for the help so far and gimme a holler if I can test/confirm anything else!

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/2016186

Title:
  5.19 not reporting cgroups v1 blkio.throttle.io_serviced

Status in linux-aws package in Ubuntu:
  New
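When repeating the bfq check from point 2 of the IRC clarifications, it can help to confirm which scheduler is actually active before reading the blkio.bfq.* files. The sketch below assumes the common sysfs convention that the scheduler file lists the available schedulers with the active one in square brackets (for example "none mq-deadline [bfq]"); the helper name `active_scheduler` is hypothetical, and the device path is simply the one quoted in that message.

```python
import re
from pathlib import Path

# Device path as quoted in the message; adjust for the block device in use.
SCHEDULER_FILE = Path("/sys/block/nvme0n1/queue/scheduler")


def active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    if SCHEDULER_FILE.exists():
        print("active scheduler:", active_scheduler(SCHEDULER_FILE.read_text()))
    else:
        print(f"{SCHEDULER_FILE} not found")
```

If this does not print "bfq" after the 'echo bfq > ...' step, the blkio.bfq.* files would not be expected to show activity either, which is a separate question from the blkio.throttle.* counters staying at 0.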
[Kernel-packages] [Bug 2016186] [NEW] 5.19 not reporting cgroups v1 blkio.throttle.io_serviced
Public bug reported:

Hi, I'm still investigating but am a bit stuck. Here's what I've found so far.

Today I've upgraded some nodes in AWS EC2 from the previous v5.15 linux-aws package to the recently published v5.19 package and rebooted.

It seems that even when there's disk activity, the files:

/sys/fs/cgroup/blkio/blkio.throttle.io_serviced
/sys/fs/cgroup/blkio/blkio.throttle.io_service_bytes

are only ever populated with 0's. Previously, on v5.15, these would reflect the actual disk usage. No other system configuration changes were applied, just the kernel upgrade and reboot. I've also verified that simply rebooting a v5.15 node where this does work doesn't break the reporting.

These EC2 instances are running with cgroups v1 due to other compatibility issues and I suspect that might be the issue.

So far, I cannot find any differences. mtab shows the same v1 mount setup, and the kernel options match between v5.15 and v5.19. I'm more than happy to fetch whatever info would help out here.

I'd love to get 5.19 working for us but we really need the data from these files.

Info:
Prior version that works: Linux ip-10-128-168-154 5.15.0-1031-aws #35-Ubuntu SMP Fri Feb 10 02:07:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Upgraded version that's broken: Linux ip-10-128-166-219 5.19.0-1022-aws #23~22.04.1-Ubuntu SMP Fri Mar 17 15:38:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

EC2 instances built off of the published 22.04 LTS AMI in us-east-1.

** Affects: linux-aws (Ubuntu)
   Importance: Undecided
       Status: New

** Description changed:

+ Info:
+ Prior version that works: Linux ip-10-128-168-154 5.15.0-1031-aws #35-Ubuntu SMP Fri Feb 10 02:07:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
+ Upgraded version that's broken: Linux ip-10-128-166-219 5.19.0-1022-aws #23~22.04.1-Ubuntu SMP Fri Mar 17 15:38:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
+
+ EC2 instances built off of the published 22.04 LTS AMI in us-east-1.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/2016186

Title:
  5.19 not reporting cgroups v1 blkio.throttle.io_serviced

Status in linux-aws package in Ubuntu:
  New