On 03/28/22 18:16, Richard W.M. Jones wrote: > On Mon, Mar 28, 2022 at 01:25:42PM +0300, Nikolay Ivanets wrote: >> Hello >> >> This commit caused a regression with LDM devices: >> https://github.com/libguestfs/libguestfs/commit/86577ee3883836c1c4fff258c05261bd3858e22b >> >> 1. list-filesystems double all LDM volumes: >> >>> <fs> list-filesystems >> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5: ntfs >> /dev/sda1: ntfs >> /dev/sda2: vfat >> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5: ntfs >> >> 2. inspect-os in this case detects 2 OSes: >> >>> <fs> inspect-os >> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5 >> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5 > >> The issue is that now we explicitly adds DM devices to a list of >> file systems, but latter we also selectively adds LDM volumes. Which >> also DM devices: > >>> <fs> list-ldm-volumes >> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5 >> >>> <fs> list-dm-devices >> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5 > > I can understand why this would happen ... > > From a historical point of view, list-filesystems wasn't very well > specified or implemented. In the simple case where you just have > whole block devices, partitions and LVs, it means something like > "examine every block device, partition, and LV, and return the > filesystems found", with some additional hot spicy sauce on top for > ignoring a block device if it's partitioned (or a partition if it > contains a PV etc). The implementation worked only in simple cases > and we've layered more hacks on top since then. > > I wonder if what we really want to do is something like: > > - for each { block device, partition, LV, LDM, etc } > > - examine it for a filesystem, if found, add to list > > - deduplicate the final list > > In particular we'd get rid of functions like "is_not_partitioned_device". > > There's some danger that when examining (eg) a block device, the tools > we use for finding filesystems are not very accurate and could find a > filesystem where it's actually located on a contained structure. I > think the main (only?) danger would be with RAID0/1 devices? > > The more fundamental problem is that we don't have a good idea of what > devices contain other devices (partly because Linux itself doesn't > have a good idea of this concept, and partly because the problem > doesn't really make sense since devices can contain other devices in > quite arbitrary ways). > > I'm just thinking out loud. > >> It would be easy to fix leaving only unique file systems eliminating >> duplicates, > > It would fix this problem, but ... > >> however it would not help if LDM contains volumes of >> type other than simple. Such a volumes consist of LDM partitions >> which are also DM devices and thus the mentioned patch will include >> them into a list if file systems but LDM partitions cannot contains >> file system. Here you can see an example of both cases where >> list-filesystems returns doubled LDM volumes as well as LDM >> partitions. Later cannot contains file system: > > ... I guess this is like the RAID example? > >>> <fs> list-filesystems >> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk1-01: unknown >> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk1-02: ntfs >> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk2-01: unknown >> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk2-02: ntfs >> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk3-01: unknown >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume1: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume10: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume11: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume2: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume3: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume4: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume5: vfat >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume6: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume7: vfat >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume8: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume9: unknown >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume1: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume10: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume11: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume2: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume3: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume4: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume5: vfat >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume6: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume7: vfat >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume8: ntfs >> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume9: unknown > > I think for this particular bug we could (and in fact, should) just > add a deduplication step at the end. Dead easy patch. > > However, longer term this just adds another hack on the pile of hacks, > and I'm not sure how to fix this properly, if indeed that is possible > at all. > > Maybe Laszlo has some ideas here.
Mykola's report made me fidget uncomfortably in my seat, and I wanted to see your thoughts on it at first. :) I've come across a closely related aspect while working on <https://bugzilla.redhat.com/show_bug.cgi?id=1658126> (LUKS-on-LV support): see libguestfs commit b6ef56187f6d ('TODO: remove "Better support for encrypted devices"', 2022-02-28). Quoting the commit message: We don't seem to need an API like "list-luks-devices", as "list-dm-devices" returns decrypted (i.e., opened) LUKS devices too; for example, in the LUKS-on-LVM case: > ><fs> list-dm-devices > /dev/mapper/luks-0d619854-ccd5-43b1-8883-991fec5ef713 > /dev/mapper/luks-4e9e7a6f-a68c-42fd-92b4-8f4f2579a389 I had totally not expected that, but was happy to take advantage of it. :) So basically the way we deal with *stacked* block devices is ad-hoc and obscure, and I'm concerned about any invasive changes. Thus far only "lsblk" appears to be the only tool that comprehensively and recursively reports a tree of devices, but the output of it is not machine-readable. I had checked the outputs of "dmsetup info" and similar stuff, nothing contains enough information apart from lsblk (and lsblk is unparseable). In the BZ I even described the problem that the node names of the decrypted LUKS devices under /dev/mapper/ are *at best* guessable from "common practice" (the UUID-based naming scheme). So it's all ad-hoc, and I certainly prefer plastering it over. Whatever we touch here has a huge chance of causing regressions; at least let us keep the regressions light. A dedup patch should not hurt existing functionality (dedup is idempotent: if the list contains unique elements already, then it has no effect). If we know where the duplication creeps in precisely, we can document that in the dedup code / commit message. I'd strongly prefer not changing any enumeration / traversal code. Thanks Laszlo _______________________________________________ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs