@mruffel just wanted to check back to see if the instructions in the report worked to reproduce the problem for you. If so, do you have any estimate when packages with the patch will be made available? Thanks!
-- You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to e2fsprogs in Ubuntu. https://bugs.launchpad.net/bugs/2036467 Title: Resizing cloud-images occasionally fails due to superblock checksum mismatch in resize2fs Status in cloud-images: New Status in e2fsprogs package in Ubuntu: In Progress Status in e2fsprogs source package in Trusty: Won't Fix Status in e2fsprogs source package in Xenial: Won't Fix Status in e2fsprogs source package in Bionic: Won't Fix Status in e2fsprogs source package in Focal: In Progress Status in e2fsprogs source package in Jammy: In Progress Status in e2fsprogs source package in Lunar: In Progress Status in e2fsprogs source package in Mantic: In Progress Bug description: [Impact] This is a long running bug plaguing cloud-images, where on a rare occasion resize2fs would fail and the image would not resize to fit the entire disk. Online resizes would fail due to a superblock checksum mismatch, where the superblock in memory differs from what is currently on disk due to changes made to the image. $ resize2fs /dev/nvme1n1p1 resize2fs 1.47.0 (5-Feb-2023) resize2fs: Superblock checksum does not match superblock while trying to open /dev/nvme1n1p1 Couldn't find valid filesystem superblock. Changing the read of the superblock to Direct I/O solves the issue. [Testcase] Start an c5.large instance on AWS, and attach a 60gb gp3 volume for use as a scratch disk. Run the following script, courtesy of Krister Johansen and his team: #!/usr/bin/bash set -euxo pipefail while true do parted /dev/nvme1n1 mklabel gpt mkpart primary 2048s 2099200s sleep .5 mkfs.ext4 /dev/nvme1n1p1 mount -t ext4 /dev/nvme1n1p1 /mnt stress-ng --temp-path /mnt -D 4 & STRESS_PID=$! sleep 1 growpart /dev/nvme1n1 1 resize2fs /dev/nvme1n1p1 kill $STRESS_PID wait $STRESS_PID umount /mnt wipefs -a /dev/nvme1n1p1 wipefs -a /dev/nvme1n1 done Test packages are available in the following ppa: https://launchpad.net/~mruffell/+archive/ubuntu/lp2036467-test If you install the test packages, the race no longer occurs. [Where problems could occur] We are changing how resize2fs reads the superblock from underlying disks. If a regression were to occur, resize2fs could fail to resize offline or online volumes. As all cloud-images are online resized during their initial boot, this could have a large impact to public and private clouds should a regression occur. [Other info] Upstream mailing list discussion: https://lore.kernel.org/linux-ext4/20230605225221.ga5...@templeofstupid.com/ https://lore.kernel.org/linux-ext4/20230609042239.ga1436...@mit.edu/ This was fixed in the below commit upstream: commit 43a498e938887956f393b5e45ea6ac79cc5f4b84 Author: Theodore Ts'o <ty...@mit.edu> Date: Thu, 15 Jun 2023 00:17:01 -0400 Subject: resize2fs: use Direct I/O when reading the superblock for online resizes Link: https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/commit/?id=43a498e938887956f393b5e45ea6ac79cc5f4b84 The commit has not been tagged to any release. All supported Ubuntu releases require this fix, and need to be published in standard non- ESM archives to be picked up in cloud images. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-images/+bug/2036467/+subscriptions -- Mailing list: https://launchpad.net/~touch-packages Post to : touch-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~touch-packages More help : https://help.launchpad.net/ListHelp