Hi,

My wording might be a bit over the top; apologies in advance. I really do mean well, I am just a bit baffled by the wontfix reply and the lack of acknowledgement of the problem.
I do not really understand the "Half VG was never a real thing" approach and the general dismissal of this bug. I think it should be reopened and this use case handled properly, or at least discussed further. I consider it a regression, and quite a nasty one at that, especially for servers without a BMC.

LVM officially supports activation of any LV in a VG as long as the specific PVs needed for *that* LV are present. Specifying --activationmode is something else entirely and should never be done (except for disaster recovery). From the lvchange(8) man page about this option:

> Determines if LV activation is allowed when PVs are missing, e.g. because of
> a device failure. ...

Meaning that --activationmode decides whether an LV with missing PVs may be activated or not. That situation does not apply to my setup, nor to many other server setups. To make sure: *only* the PVs for the *exact* LV in question need to be present; PVs for *other* LVs in the same VG may be missing.

What LVM does not support, unless you force it, is actually changing a VG while PVs of that VG are missing, since to my knowledge the VG metadata is stored independently on each and every PV. LV activation does not count as VG modification.

The entire reason to have a single VG for many things is the flexibility LVM provides within a VG. Almost all of the features *require* the PVs to be within the same VG: snapshots, thin LVM, cache, etc. Even basic features like resizing require a nice PV - VG - LV structure. Since a PV can only be in one VG, the refusal of this bug suggests that one has to manually partition (GPT) drives for root-related stuff, with their own PVs and VGs and LVs, so the system can boot, and then create other partitions with PVs on the same disk for the additional stuff. This literally defeats the entire purpose of LVM, easy volume management, and means you have to resize the physical disk partitions to extend things. Good luck with that!
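To make the supported behaviour concrete, here is a minimal sketch. The VG name, LV names and device paths are all hypothetical; the point is that activating one LV only needs that LV's own PVs:

# Hypothetical layout: VG "vg0" spans /dev/sda2 (holding LV "root")
# and /dev/sdb1 (holding LV "data"); /dev/sdb1 is not yet available.
lvchange -ay vg0/root
# Succeeds, since every PV backing *this* LV is present; LVM only
# warns about the absent PV, along the lines of:
#   WARNING: VG vg0 is missing PV ... (last written to /dev/sdb1).
# No --activationmode override is involved; it would only come into
# play if "root" itself had segments on a missing PV.

Activating vg0/data at this point would of course be refused (without --activationmode partial), which is exactly the distinction --activationmode exists for.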
A few examples: root filesystem on some NVMe SSD and a bunch of HDDs connected for persistent storage. Now let's say I want to:

* When running backups, keep the copy-on-write snapshot on the SSD PV so the HDD array PVs' performance remains in good condition, instead of doing COW on the HDDs.
* Implement LVM caching for the HDD array PVs on the fast SSD PV.
* Put a thin pool on the HDDs and store its metadata on the SSD PV for performance.
* Reminder: simply having a /etc/luks.key on the rootfs used to open the other PVs (HDD array) will result in a boot failure if they are in the same VG!

All totally reasonable use cases. But if you require a split VG for "root stuff needed during booting", none of this is supported, because the HDDs are only activated later. Maybe those HDD PVs are even on networked storage (NBD, NFS) and only become available later. LVM can do all of this safely, no problem. (I have used many of the above in production environments with great results.)

Anyway, the main problem is that the udev rules are currently coded to only do activation on LVM_VG_NAME_COMPLETE, i.e. when a VG has all the PVs used by all the LVs within the VG. This is not suitable for the early-stage boot process and activation of the LVs for root, swap and other early-required filesystems. It *is* suitable for auto-activation of LVs during the late-stage boot process or during runtime.

On my workstation, from which I am writing this mail, a typical boot is as follows:

# The PV (SSD) containing root and swap is unlocked (LUKS, manual pw input).
# A manual initramfs hack runs "lvchange -aay -y root swap".
lvm: WARNING: VG verm-r4e is missing PV HDD_ARRAY_1...
lvm: WARNING: VG verm-r4e is missing PV HDD_ARRAY_2...
# Rootfs gets mounted; late-stage boot lvm, cryptsetup etc. run.
# It notices the verm-r4e VG is incomplete.
lvm[1535]: PV /dev/dm-2 online, VG verm-r4e incomplete (need 2).
# The /etc/luks.key is used to open the HDD array PVs.
lvm[2387]: PV /dev/dm-17 online, VG verm-r4e incomplete (need 1).
lvm[2534]: PV /dev/dm-19 online, VG verm-r4e is complete.
# Now that all the PVs are there, the remaining LVs get auto-activated.
lvm[2540]: 9 logical volume(s) in volume group "verm-r4e" now active
# As a result, my HDD pool LV is up and gets fsck'd and mounted cleanly.

This works very well, is by design, and is a very normal way of using LVM. Both startup and shutdown work perfectly, with clean mounts/unmounts etc.

Also, if the system configuration somehow triggers RAID rebuilds every boot, the sysadmin will get mail from the tooling (if properly configured, at least). I am pretty sure it only syncs the dirty pages in such a case anyway, which is very quick. In any case, I would rather have the system mail root about some issue than hard-fail in the initramfs for no good reason.

This strongly feels like removing upstream-supported use cases to "protect" sysadmins from wrongly configuring their systems, at the expense of people who do know what they are doing and rely on this feature for more complex LVM setups. If LVM is too complex, people can use plain partitioning.

I'd be glad to assist with any potential solutions or further explanation, so definitely feel free to ask!

Cheers,

-- 
Melvin Vermeeren
Systems engineer
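PS: For reference, the gating I am referring to is roughly the following, paraphrased from memory from the lvm2 udev rules (69-dm-lvm.rules or similar; the exact options and paths differ between versions, so treat this as a sketch, not the literal rule file):

# pvscan records this PV as online and checks VG completeness:
IMPORT{program}="(LVM_EXEC)/lvm pvscan --cache --listvg --checkcomplete ... $env{DEVNAME}"
# Auto-activation is gated on the VG being complete:
ENV{LVM_VG_NAME_COMPLETE}=="?*", RUN+="... vgchange -aay $env{LVM_VG_NAME_COMPLETE}"

pvscan only exports LVM_VG_NAME_COMPLETE once every PV of the VG has been seen, so an incomplete VG never gets any LV auto-activated, even when some of its LVs are fully backed by PVs that are already online.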