> The idea of aligning cpio metadata is very interesting. I can see how it'd
> help initramfs building speed tremendously.
>
> As I understand it, RPM is pretty different: the main difference is that
> we're trying (fairly hard) not to change the normal format of rpm as found on
> mirrors for now. There are some very interesting ideas on how to change the
> upstream format, but in doing so, we'd render all existing servers unable to
> read the format.
To clarify, aligning cpio data segments for *newly built* rpms shouldn't
necessarily require any change in format. They'd continue to function the same
as earlier cpio-payload rpms, albeit with some extra zero-padding.
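
For illustration, the padding arithmetic is trivial. This is only a sketch:
the 4 KiB block size is my assumption, and `zero_padding()` is a made-up
helper name, not anything in rpm.

```python
def zero_padding(offset: int, alignment: int = 4096) -> int:
    """Return how many zero bytes must follow `offset` so that the next
    byte (the start of a file data segment) lands on an `alignment`
    boundary. The 4096-byte default is an assumed filesystem block size."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    # Python's % always yields a non-negative result for a positive modulus,
    # so this is 0 when offset is already aligned.
    return (-offset) % alignment


if __name__ == "__main__":
    # Example offsets only; a real builder would use the running archive offset.
    for off in (0, 110, 4096, 4097):
        print(off, zero_padding(off))
```

Where exactly the zeros would go within a cpio entry is a separate question
that I haven't looked into for rpm.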
> If we could tolerate the breakage: I'd love to experiment with
> `BTRFS_IOC_ENCODED_WRITE` which would reduce writes down and eliminate
> explicit decompression. For clients or filesystems without CoW support: RPM
> could decompress and write the normal file. I was hoping encoded writes would
> eliminate the complex path with curl -> librepo -> rpm2extents. I'm not sure
> you could get data from the network and write encoded data to disk in one
> pass like we're doing now. Do you have any ideas on how to resolve that
> challenge?
I'm not too familiar with the rpm on-disk format, but I'd hoped that
`BTRFS_IOC_ENCODED_WRITE` could be used without a change to the format, by
having the rpm header parsed during download to determine whether the
compressed payload could be written as-is. With a cpio payload it'd then be a
matter of copy_file_range()ing the (optimally aligned) compressed file data
segments into the destination during installation.
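
A rough sketch of that last step, assuming the source offset and length of a
data segment have already been parsed out of the payload (`copy_segment()` and
its parameters are hypothetical names of mine). Whether the kernel shares
extents or copies the bytes depends on the filesystem and on alignment.

```python
import os


def copy_segment(src_fd, dst_fd, src_off, dst_off, length):
    """Copy `length` bytes from src_fd at src_off to dst_fd at dst_off.

    Tries copy_file_range() first and falls back to pread()/pwrite() when
    the call is unavailable or rejected. Returns the number of bytes copied.
    """
    copied = 0
    while copied < length:
        remaining = length - copied
        try:
            n = os.copy_file_range(src_fd, dst_fd, remaining,
                                   src_off + copied, dst_off + copied)
        except (AttributeError, OSError):
            # Plain copy fallback, e.g. non-Linux or an unsupported fd pair.
            buf = os.pread(src_fd, min(remaining, 1 << 20), src_off + copied)
            n = os.pwrite(dst_fd, buf, dst_off + copied) if buf else 0
        if n == 0:
            break  # unexpected end of source
        copied += n
    return copied
```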

`BTRFS_IOC_ENCODED_WRITE` appears very restrictive at this stage though:
- it requires `CAP_SYS_ADMIN`, so probably isn't a viable option for
  containers, etc.
- ioctl calls need to specify both unencoded and encoded offset+length, meaning
  that we'd still need to parse rpm payload compression metadata
- the ioctl unencoded length can't exceed 128 KiB
- for zstd encoded I/Os, the ioctl data must represent "as a single zstd frame
  with the windowLog compression parameter set to no more than 17"
  - on openSUSE Tumbleweed I see some rpms currently using zstd compression
    level 19. IIUC, Fedora uses the same zstd level
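
Put as a pre-flight check, the limits above would look something like the
following. It's purely illustrative: the function and constant names are mine,
it only encodes the constraints listed in this mail, and it doesn't issue the
ioctl.

```python
MAX_UNENCODED_LEN = 128 * 1024   # 128 KiB limit on the unencoded length
MAX_ZSTD_WINDOW_LOG = 17         # windowLog limit for zstd frames


def encoded_write_blockers(unencoded_len, zstd_window_log, has_cap_sys_admin):
    """Return a list of reasons why a compressed chunk could not be passed
    as-is to an encoded write; an empty list means no listed limit is hit."""
    blockers = []
    if not has_cap_sys_admin:
        blockers.append("caller lacks CAP_SYS_ADMIN")
    if unencoded_len > MAX_UNENCODED_LEN:
        blockers.append("unencoded length exceeds 128 KiB")
    if zstd_window_log > MAX_ZSTD_WINDOW_LOG:
        blockers.append("zstd windowLog exceeds 17")
    return blockers


if __name__ == "__main__":
    # Hypothetical values for a chunk compressed with a large window.
    print(encoded_write_blockers(256 * 1024, 23, False))
```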
> Adding cpio metadata, along with a "null" compression type could help
> eliminate the change in `fsm.c` on how the payload is iterated. Note that
> `rpm2extents` does not (and cannot) touch headers without invalidating
> signatures, so the change in compression type is inferred and handled in the
> plugin.
>
> Lastly, there's another optimization that would be lost in adopting cpio
> formatting: content de-duplication. I'm not sure how important this is tho in
> the big picture, so it might be a worthwhile tradeoff.
Indeed. FWIW, I think your extent based approach offers a lot of worthwhile
benefits, but just wanted to point out that something similarly CoW friendly
(although less efficient) is possible without necessarily requiring invasive
changes :-)
> Thanks for the feedback! Matthew.
Thanks for the response!
--
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-4337009