On 03/28/22 18:16, Richard W.M. Jones wrote:
> On Mon, Mar 28, 2022 at 01:25:42PM +0300, Nikolay Ivanets wrote:
>> Hello
>>
>> This commit caused a regression with LDM devices:
>> https://github.com/libguestfs/libguestfs/commit/86577ee3883836c1c4fff258c05261bd3858e22b
>>
>> 1. list-filesystems double all LDM volumes:
>>
>>> <fs> list-filesystems
>> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5: ntfs
>> /dev/sda1: ntfs
>> /dev/sda2: vfat
>> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5: ntfs
>>
>> 2. inspect-os in this case detects 2 OSes:
>>
>>> <fs> inspect-os
>> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5
>> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5
>
>> The issue is that now we explicitly adds DM devices to a list of
>> file systems, but latter we also selectively adds LDM volumes. Which
>> also DM devices:
>
>>> <fs> list-ldm-volumes
>> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5
>>
>>> <fs> list-dm-devices
>> /dev/mapper/ldm_vol_AQ-W10-EFI-VM-Dg0_Volume5
>
> I can understand why this would happen ...
>
> From a historical point of view, list-filesystems wasn't very well
> specified or implemented.  In the simple case where you just have
> whole block devices, partitions and LVs, it means something like
> "examine every block device, partition, and LV, and return the
> filesystems found", with some additional hot spicy sauce on top for
> ignoring a block device if it's partitioned (or a partition if it
> contains a PV etc).  The implementation worked only in simple cases
> and we've layered more hacks on top since then.
>
> I wonder if what we really want to do is something like:
>
>  - for each { block device, partition, LV, LDM, etc }
>
>  - examine it for a filesystem, if found, add to list
>
>  - deduplicate the final list
>
> In particular we'd get rid of functions like "is_not_partitioned_device".
>
> There's some danger that when examining (eg) a block device, the tools
> we use for finding filesystems are not very accurate and could find a
> filesystem where it's actually located on a contained structure.  I
> think the main (only?) danger would be with RAID0/1 devices?
>
> The more fundamental problem is that we don't have a good idea of what
> devices contain other devices (partly because Linux itself doesn't
> have a good idea of this concept, and partly because the problem
> doesn't really make sense since devices can contain other devices in
> quite arbitrary ways).
>
> I'm just thinking out loud.
>
>> It would be easy to fix leaving only unique file systems eliminating
>> duplicates,
>
> It would fix this problem, but ...
>
>> however it would not help if LDM contains volumes of
>> type other than simple. Such a volumes consist of LDM partitions
>> which are also DM devices and thus the mentioned patch will include
>> them into a list if file systems but LDM partitions cannot contains
>> file system. Here you can see an example of both cases where
>> list-filesystems returns doubled LDM volumes as well as LDM
>> partitions. Later cannot contains file system:
>
> ... I guess this is like the RAID example?
>
>>> <fs> list-filesystems
>> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk1-01: unknown
>> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk1-02: ntfs
>> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk2-01: unknown
>> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk2-02: ntfs
>> /dev/mapper/ldm_part_WIN-RSVUTBNGR5D-Dg0_Disk3-01: unknown
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume1: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume10: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume11: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume2: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume3: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume4: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume5: vfat
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume6: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume7: vfat
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume8: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume9: unknown
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume1: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume10: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume11: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume2: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume3: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume4: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume5: vfat
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume6: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume7: vfat
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume8: ntfs
>> /dev/mapper/ldm_vol_WIN-RSVUTBNGR5D-Dg0_Volume9: unknown
>
> I think for this particular bug we could (and in fact, should) just
> add a deduplication step at the end.  Dead easy patch.
>
> However, longer term this just adds another hack on the pile of hacks,
> and I'm not sure how to fix this properly, if indeed that is possible
> at all.
>
> Maybe Laszlo has some ideas here.

Mykola's report made me fidget uncomfortably in my seat, and I wanted to
see your thoughts on it at first. :)

I've come across a closely related aspect while working on
<https://bugzilla.redhat.com/show_bug.cgi?id=1658126> (LUKS-on-LV
support): see libguestfs commit b6ef56187f6d ('TODO: remove "Better
support for encrypted devices"', 2022-02-28). Quoting the commit
message:

    We don't seem to need an API like "list-luks-devices", as
    "list-dm-devices" returns decrypted (i.e., opened) LUKS devices too; for
    example, in the LUKS-on-LVM case:

    > ><fs> list-dm-devices
    > /dev/mapper/luks-0d619854-ccd5-43b1-8883-991fec5ef713
    > /dev/mapper/luks-4e9e7a6f-a68c-42fd-92b4-8f4f2579a389

I had totally not expected that, but was happy to take advantage of it.
:)

So basically the way we deal with *stacked* block devices is ad-hoc and
obscure, and I'm concerned about any invasive changes. Thus far only
"lsblk" appears to be the only tool that comprehensively and recursively
reports a tree of devices, but the output of it is not machine-readable.
I had checked the outputs of "dmsetup info" and similar stuff, nothing
contains enough information apart from lsblk (and lsblk is unparseable).
In the BZ I even described the problem that the node names of the
decrypted LUKS devices under /dev/mapper/ are *at best* guessable from
"common practice" (the UUID-based naming scheme).

So it's all ad-hoc, and I certainly prefer plastering it over. Whatever
we touch here has a huge chance of causing regressions; at least let us
keep the regressions light. A dedup patch should not hurt existing
functionality (dedup is idempotent: if the list contains unique elements
already, then it has no effect). If we know where the duplication creeps
in precisely, we can document that in the dedup code / commit message.
I'd strongly prefer not changing any enumeration / traversal code.

Thanks
Laszlo
_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Reply via email to