[Kernel-packages] [Bug 2016186] Re: 5.19 not reporting cgroups v1 blkio.throttle.io_serviced

2023-05-22 Thread Jared Ledvina (Datadog)
Hey Andrea, 
Thanks for the help getting this all fixed up. I see that the change is 
committed for Lunar and Kinetic. 

Is there a good way for me to follow when this'll land for the Ubuntu
Jammy linux-aws, linux-gcp, and linux-azure packages?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2016186

Title:
  5.19 not reporting cgroups v1 blkio.throttle.io_serviced

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Kinetic:
  Fix Committed
Status in linux source package in Lunar:
  Fix Committed
Status in linux source package in Mantic:
  Incomplete

Bug description:
  [Impact]

  Commit f382fb0bcef4 ("block: remove legacy IO schedulers") introduced
  a behavior change in the blkio throttle cgroup subsystem: IO
  statistics are not reported anymore unless a throttling rule is
  explicitly defined, because the current code only counts bios that are
  actually throttled.

  This behavior change may break user-space applications that rely on
  the old behavior (see original bug report below).

  [Test case]

   - mount cgroup v1
   - create a blkio cgroup
   - move a task into the blkio cgroup
   - perform some I/O (e.g., with dd)
   - read the IO stats for the cgroup (blkio.throttle.io_serviced and
     blkio.throttle.io_service_bytes in cgroupfs)
   - IO stats are all 0, unless a throttle rule is defined

  Previous behavior (kernel 5.15) was showing I/O statistics even
  without throttling rules defined.
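
  A minimal sketch of the final check in the test case above, not part
  of the original report. It assumes the usual cgroup v1 stat layout of
  "<major>:<minor> <Operation> <count>" lines followed by a bare
  "Total <count>" line. The function names and the "_all" key are
  hypothetical, chosen only for illustration.

```python
import os
from collections import defaultdict


def parse_blkio_stat(text):
    """Parse the contents of a cgroup v1 blkio.throttle.* stat file.

    Assumed line format: "<major>:<minor> <Operation> <count>", plus a
    final "Total <count>" line without a device prefix.
    Returns {device: {operation: count}}; the grand total is stored
    under the hypothetical key "_all".
    """
    stats = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            device, operation, count = fields
            stats[device][operation] = int(count)
        elif len(fields) == 2 and fields[0] == "Total":
            stats["_all"]["Total"] = int(fields[1])
    return dict(stats)


def all_zero(stats):
    """Return True when every parsed counter is 0 (or nothing was parsed)."""
    return all(count == 0 for ops in stats.values() for count in ops.values())


def read_cgroup_stat(cgroup_dir, name="blkio.throttle.io_serviced"):
    """Read and parse one stat file from a blkio cgroup directory."""
    with open(os.path.join(cgroup_dir, name)) as stat_file:
        return parse_blkio_stat(stat_file.read())
```

  On an affected kernel, all_zero(read_cgroup_stat("/sys/fs/cgroup/blkio"))
  would be expected to stay True after the dd run unless a throttle
  rule is defined.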

  [Fix]

  Apply / backport this fix:

  https://lore.kernel.org/lkml/20230507170631.89607-1-hanjinke@bytedance.com/t/

  [Regression potential]

  The fix touches the block IO cgroup subsystem, so any regressions it
  introduces would most likely show up in that subsystem.

  [Original bug report]

  Hi,

  I'm still investigating, but I'm a bit stuck. Here's what I've found
  so far.

  Today I upgraded some nodes in AWS EC2 from the previous v5.15
  linux-aws package to the recently published v5.19 package and
  rebooted. It seems that even when there's disk activity, the files:

  /sys/fs/cgroup/blkio/blkio.throttle.io_serviced
  /sys/fs/cgroup/blkio/blkio.throttle.io_service_bytes

  are only ever populated with 0's. Previously, on v5.15, these would
  reflect the actual disk usage. No other system configuration changes
  were applied, just the kernel upgrade and reboot. I've also verified
  that simply rebooting a v5.15 node, where this does work, doesn't
  break the reporting. These EC2 instances are running with cgroups v1
  due to other compatibility issues, and I suspect that might be the
  issue. So far, I cannot find any differences: mtab shows the same v1
  mount setup, and the kernel options match between v5.15 and v5.19.

  I'm more than happy to fetch whatever info would help out here. I'd
  love to get 5.19 working for us, but we really need the data from
  these files.

  Info:
  Prior version that works:
    Linux ip-10-128-168-154 5.15.0-1031-aws #35-Ubuntu SMP Fri Feb 10 02:07:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  Upgraded version that's broken:
    Linux ip-10-128-166-219 5.19.0-1022-aws #23~22.04.1-Ubuntu SMP Fri Mar 17 15:38:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

  EC2 instances built off of the published 22.04 LTS AMI in us-east-1.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2016186/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2016186] Re: 5.19 not reporting cgroups v1 blkio.throttle.io_serviced

2023-04-26 Thread Jared Ledvina (Datadog)
I've updated this to report the issue for linux-azure and linux-gcp, as
their jammy-updates repos have recently updated to kernel 5.19 and
appear to be affected as well. I could use some guidance on reporting
this upstream: 5.19 doesn't seem to be a supported kernel version
(according to https://www.kernel.org/), so the correct way to go about
that is unclear.

Related, https://packages.ubuntu.com/jammy-updates/linux-gcp-lts-22.04
doesn't exist as of right now. If that could be published similar to
https://packages.ubuntu.com/jammy-updates/linux-azure-lts-22.04 and
https://packages.ubuntu.com/jammy-updates/linux-aws-lts-22.04 that'd be
a huge help for me.

** Also affects: linux-gcp (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: linux-azure (Ubuntu)
   Importance: Undecided
   Status: New


Status in linux-aws package in Ubuntu:
  New
Status in linux-azure package in Ubuntu:
  New
Status in linux-gcp package in Ubuntu:
  New



[Kernel-packages] [Bug 2016186] Re: 5.19 not reporting cgroups v1 blkio.throttle.io_serviced

2023-04-14 Thread Jared Ledvina (Datadog)
A few clarifications from IRC:
1. We run all of our Ubuntu 22.04 LTS nodes with the kernel args
   'systemd.unified_cgroup_hierarchy=0
   systemd.legacy_systemd_cgroup_controller=true' to force cgroups v1 as,
   unfortunately, we cannot safely turn on cgroups v2 yet (that's another
   pile of work I want to do!).
2. If you install 'linux-modules-extra-aws', 'modprobe bfq', and then
   'echo bfq > /sys/block/nvme0n1/queue/scheduler', you will see stats in
   the '/sys/fs/cgroup/blkio/blkio.bfq.io_service*' files.
3. However, we continue to only see 0's in the
   '/sys/fs/cgroup/blkio/blkio.throttle.io_service*' files.
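
For anyone reproducing the setup in points 1 and 2, a small sketch
(not from the IRC discussion) that checks both preconditions from text
read out of /proc/cmdline and /sys/block/<dev>/queue/scheduler. It
assumes the scheduler file marks the active entry with square brackets,
and it only recognises the exact '=0' spelling of the kernel arg quoted
above. Both function names are hypothetical.

```python
def forces_cgroup_v1(cmdline):
    """Return True if the kernel command line carries the exact argument
    'systemd.unified_cgroup_hierarchy=0' used in this thread.

    Other spellings that systemd may accept are deliberately not handled.
    """
    return "systemd.unified_cgroup_hierarchy=0" in cmdline.split()


def active_scheduler(scheduler_text):
    """Return the active I/O scheduler from the contents of
    /sys/block/<dev>/queue/scheduler, assuming the active entry is the
    one wrapped in square brackets. Returns None if none is marked.
    """
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None
```

With bfq selected as in point 2, active_scheduler() on that file's
contents should return "bfq".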

Potentially an upstream change, but definitely something that breaks
with the '5.19.0.1022.23~22.04.6' Jammy package update. For me, this
likely means I need to pin everything to the older 5.15 package pending
cgroups v2 working or a fix to this. Obviously I'd prefer having this
fixed so that we can get to 5.19 and stick w/ cgroups v1. I'd also offer
a note that pushing 5.19 to Jammy without this support feels like a
breaking change. I'm more worried that _other_ cgroups v1 controllers
aren't working in a way I haven't noticed yet. Anyway, thanks so much
for the help so far and gimme a holler if I can test/confirm anything
else!


Status in linux-aws package in Ubuntu:
  New



[Kernel-packages] [Bug 2016186] [NEW] 5.19 not reporting cgroups v1 blkio.throttle.io_serviced

2023-04-13 Thread Jared Ledvina (Datadog)
Public bug reported:

Hi,

I'm still investigating, but I'm a bit stuck. Here's what I've found so
far.

Today I upgraded some nodes in AWS EC2 from the previous v5.15
linux-aws package to the recently published v5.19 package and rebooted.
It seems that even when there's disk activity, the files:

/sys/fs/cgroup/blkio/blkio.throttle.io_serviced
/sys/fs/cgroup/blkio/blkio.throttle.io_service_bytes

are only ever populated with 0's. Previously, on v5.15, these would
reflect the actual disk usage. No other system configuration changes
were applied, just the kernel upgrade and reboot. I've also verified
that simply rebooting a v5.15 node, where this does work, doesn't break
the reporting. These EC2 instances are running with cgroups v1 due to
other compatibility issues, and I suspect that might be the issue. So
far, I cannot find any differences: mtab shows the same v1 mount setup,
and the kernel options match between v5.15 and v5.19.
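
The mount comparison mentioned above can be automated. The following is
only a sketch, assuming the standard /proc/mounts field order (source,
mount point, filesystem type, options) and that a cgroup v1 controller
shows up as filesystem type "cgroup" with the controller name among the
mount options. The function name is hypothetical.

```python
def find_v1_controller_mount(mounts_text, controller="blkio"):
    """Return the mount point of a cgroup v1 controller, or None.

    mounts_text is the content of /proc/mounts (or mtab). A v1 hierarchy
    is assumed to have filesystem type "cgroup" and to list the
    controller name in its comma-separated mount options.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _source, mountpoint, fstype, options = fields[:4]
        if fstype == "cgroup" and controller in options.split(","):
            return mountpoint
    return None
```

Running it against /proc/mounts on both a v5.15 and a v5.19 node and
comparing the results is one way to confirm the two share the same v1
blkio mount.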

I'm more than happy to fetch whatever info would help out here. I'd love
to get 5.19 working for us, but we really need the data from these
files.

Info:
Prior version that works:
  Linux ip-10-128-168-154 5.15.0-1031-aws #35-Ubuntu SMP Fri Feb 10 02:07:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Upgraded version that's broken:
  Linux ip-10-128-166-219 5.19.0-1022-aws #23~22.04.1-Ubuntu SMP Fri Mar 17 15:38:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

EC2 instances built off of the published 22.04 LTS AMI in us-east-1.

** Affects: linux-aws (Ubuntu)
 Importance: Undecided
 Status: New

** Description changed:

+ 
+ Info:
+ Prior version that works: Linux ip-10-128-168-154 5.15.0-1031-aws #35-Ubuntu SMP Fri Feb 10 02:07:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
+ Upgraded version that's broken: Linux ip-10-128-166-219 5.19.0-1022-aws #23~22.04.1-Ubuntu SMP Fri Mar 17 15:38:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
+ 
+ EC2 instances built off of the published 22.04 LTS AMI in us-east-1.


Status in linux-aws package in Ubuntu:
  New
