One relevant point to discuss is why we decided not to change the mdadm code after all. There is an inter-dependency between mdadm and cryptroot: we first changed the mdadm max counter to "untangle" that relation, so that cryptroot would run more times than mdadm.
But after studying initramfs-tools more closely, we noticed that we could reuse for cryptroot the same "hack" that is currently in the code to execute mdadm on local-block, and add an extra cryptroot run if mdadm was executed. This way, we make things work as expected by (ab-)using code already present in initramfs-tools, without having to modify yet another boot component.

I've set mdadm as "Opinion" in this LP because *it is affected*; in fact, it is part of the problem. But not changing mdadm is the cheaper option in my opinion, at least for now. I plan to try a refactor of initramfs-tools to cope with inter-relations between components on local-block, regardless of their number (and this will require changing mdadm).

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to initramfs-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1879980

Title:
  Fail to boot with LUKS on top of RAID1 if the array is broken/degraded

Status in cryptsetup package in Ubuntu: In Progress
Status in initramfs-tools package in Ubuntu: In Progress
Status in mdadm package in Ubuntu: Opinion
Status in cryptsetup source package in Xenial: Won't Fix
Status in initramfs-tools source package in Xenial: Won't Fix
Status in mdadm source package in Xenial: Won't Fix
Status in cryptsetup source package in Bionic: In Progress
Status in initramfs-tools source package in Bionic: In Progress
Status in mdadm source package in Bionic: Opinion
Status in cryptsetup source package in Focal: In Progress
Status in initramfs-tools source package in Focal: In Progress
Status in mdadm source package in Focal: Opinion
Status in cryptsetup source package in Groovy: In Progress
Status in initramfs-tools source package in Groovy: In Progress
Status in mdadm source package in Groovy: Opinion
Status in cryptsetup package in Debian: New

Bug description:

[Impact]

* Considering a setup of an encrypted rootfs on top of an md RAID1 device, Ubuntu is currently unable to decrypt the
rootfs if the array gets degraded, for example if one of the array's members gets removed.

* The problem has 2 main aspects: first, the cryptsetup initramfs script attempts to decrypt the array only in the local-top boot stage, and if that fails, it gives up and shows the user a shell (boot is aborted).

* Second, the mdadm initramfs script that assembles degraded arrays executes later in the boot, in the local-block stage. So, in a stacked setup of encrypted root on top of RAID, if the RAID is degraded, cryptsetup fails early in the boot, preventing mdadm from assembling the degraded array.

* The solution proposed here has 2 components: first, the cryptsetup script is modified to allow a gentle failure in the local-top stage; it then retries for a while (according to a heuristic based on ROOTDELAY, with a minimum of 30 executions) in a later stage (local-block). This gives other initramfs scripts time to run, like mdadm in the local-block stage. This is how it is meant to work according to the initramfs-tools documentation (although Ubuntu changed it a bit with wait-for-root, hence we stopped looping on local-block, see next bullet).

* Second, initramfs-tools was adjusted. Currently, it runs the mdadm local-block script for a while, in order to assemble the arrays in non-degraded mode. We extended this approach to also execute cryptsetup, so that after mdadm ends its execution, we execute cryptsetup at least one more time. In an ideal world we should loop on local-block as Debian's initramfs does (which would remove the hardcoded mdadm/cryptsetup mentions from the initramfs-tools code), but this would be a really big change, probably not SRUable. I plan to work on that for future Ubuntu releases.

[Test case]

* Install Ubuntu in a Virtual Machine with 2 disks. Use the installer to create a RAID1 volume and an encrypted root on top of it.

* Boot the VM, and use "sgdisk"/"wipefs" to erase the partition table from one of the RAID members.
Reboot: the system will fail to mount the rootfs and will not continue the boot process.

* If using the initramfs-tools/cryptsetup patches proposed here, the rootfs can be mounted normally.

[Regression potential]

* There is potential for regressions, since this is a change in 2 boot components. The patches were designed to keep the regular case working; they only change the failure case, which does not currently work anyway.

* A modification in the behavior of cryptsetup was introduced: right now, if we fail the password 3 times (the default maximum attempts), the script doesn't "panic" and drop to a shell immediately; instead it runs once more (or twice, if mdadm is installed) before failing. This is a minor change given the benefit of being able to mount the rootfs in a degraded RAID1 scenario.

* Other potential regressions could show up as boot problems, but the change in initramfs-tools specifically is not invasive. It may just delay boot time a bit, given that we now run cryptsetup multiple times on local-block, with 1 sec delays between executions.
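
For readers who want to see the retry flow from [Impact] in one place, here is a rough toy model in Python. It is only a sketch, not the actual initramfs shell code: the function names, the `mdadm_passes_needed` parameter and the `max(30, rootdelay)` formula are all assumptions made for illustration (the bug description only says the heuristic is based on ROOTDELAY with a minimum of 30 executions), and the 1 sec delay between runs is left out.

```python
MIN_CRYPTROOT_RUNS = 30  # minimum number of executions stated in the bug description


def cryptroot_retry_budget(rootdelay: int) -> int:
    """Hypothetical stand-in for the ROOTDELAY-based heuristic.

    The exact formula is not given in the bug description; max() is an assumption.
    """
    return max(MIN_CRYPTROOT_RUNS, rootdelay)


def simulate_boot(degraded: bool, patched: bool = True, rootdelay: int = 0,
                  mdadm_passes_needed: int = 3) -> bool:
    """Return True if the encrypted root gets unlocked in this toy model.

    mdadm_passes_needed is a made-up knob: the number of local-block passes
    before mdadm assembles a degraded array.
    """
    # local-top: a healthy array is already assembled, a degraded one is not.
    array_ready = not degraded

    # local-top: cryptroot can only unlock the root if the md device exists.
    if array_ready:
        return True
    if not patched:
        # Old behaviour: a failure in local-top aborts the boot (user gets a shell).
        return False

    # Patched behaviour: fail gently in local-top, then retry in local-block.
    for attempt in range(1, cryptroot_retry_budget(rootdelay) + 1):
        # mdadm's local-block script assembles the degraded array after a few passes.
        if attempt >= mdadm_passes_needed:
            array_ready = True
        # cryptroot runs after mdadm in every pass, so it always gets at least
        # one more run once mdadm has finished its work.
        if array_ready:
            return True
    return False


if __name__ == "__main__":
    print("healthy array, patched: ", simulate_boot(degraded=False))
    print("degraded array, patched:", simulate_boot(degraded=True))
    print("degraded array, old:    ", simulate_boot(degraded=True, patched=False))
```

The point of the model is the ordering: cryptroot no longer gives up in local-top, and it is guaranteed a run after mdadm has had its chance in local-block, which is what lets the degraded case boot.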