On 2018-10-13 18:28, Chris Murphy wrote:
Is it practical and desirable to make Btrfs based OS installation
images reproducible? Or is Btrfs simply too complex and
non-deterministic? [1]

The main three problems with Btrfs right now for reproducibility are:
a. many objects have uuids other than the volume uuid; and mkfs only
lets us set the volume uuid
b. atime, ctime, mtime, otime; and no way to make them all the same
c. non-deterministic allocation of file extents, compression, inode
assignment, logical and physical address allocation

I'm imagining reproducible image creation would be a mkfs feature that
builds on Btrfs seed and --rootdir concepts to constrain Btrfs
features to maybe make reproducible Btrfs volumes possible:

- No raid
- Either all objects needing uuids can have those uuids specified by
switch, or possibly a defined set of uuids expressly for this use
case, or possibly all of them can just be zeros (eek? not sure)
- A flag to set all times the same
- Possibly require that target block device is zero filled before
creation of the Btrfs
- Possibly disallow subvolumes and snapshots
- Require the resulting image is seed/ro and maybe also a new
compat_ro flag to enforce that such Btrfs file systems cannot be
modified after the fact.
- Enforce a consistent means of allocation and compression

The end result is creating two Btrfs volumes would yield image files
with matching hashes.
So in other words, you care about matching the block layout _exactly_. This is a great idea for paranoid people, but it's usually overkill. Realistically, almost nothing in userspace cares about the block layout; worrying about it just makes verifying the reproduced image a bit easier. There's no reason you can't verify all the relevant data without doing a checksum or HMAC of the image as a whole.
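
As a rough sketch of what verifying "the relevant data" could look like without hashing the whole image: mount both images read-only and compare a digest computed over the file tree instead of the block device. The function name tree_digest and the choice of what counts as relevant (relative path, mode, size, contents, symlink targets, but no timestamps, inode numbers or uuids) are assumptions made for illustration, not something any existing tool defines.

```python
import hashlib
import os
import stat


def tree_digest(root):
    """Return a SHA-256 hex digest over paths, modes and contents under root.

    Timestamps, inode numbers and on-disk layout are deliberately ignored,
    so two images with different block layouts can still compare equal.
    """
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(dirnames + filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # do not follow symlinks
            h.update(os.fsencode(os.path.relpath(path, root)) + b"\0")
            h.update(f"{st.st_mode:o}".encode() + b"\0")  # file type + permissions
            if stat.S_ISLNK(st.st_mode):
                h.update(os.fsencode(os.readlink(path)) + b"\0")
            elif stat.S_ISREG(st.st_mode):
                h.update(str(st.st_size).encode() + b"\0")
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        h.update(chunk)
    return h.hexdigest()
```

Two mounted images would then be considered equivalent when tree_digest("/mnt/a") == tree_digest("/mnt/b"). Ownership, xattrs and device nodes are left out of this sketch and would need to be added if they matter for the use case.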

If I had to guess, the biggest challenge would be allocation. But it's
also possible that such an image may have problems with "sprouts". A
non-removable sprout seems fairly straightforward and safe; but if a
"reproducible build" type of seed is removed, it seems like removal
needs to be smart enough to refresh *all* uuids found in the sprout: a
hard break from the seed.

Competing file systems: ext4 (with the make_ext4 fork) and squashfs. At the
moment I'm thinking it might be easier to teach squashfs integrity
checking than to make Btrfs reproducible.  But then I also think
restricting Btrfs features, and applying some requirements to
constrain Btrfs to make it reproducible, really enhances the Btrfs
seed-sprout feature.

Any thoughts? Useful? Difficult to implement?

Squashfs might be a better fit for this use case *if* it can be taught
about integrity checking. It does per file checksums for the purpose
of deduplication but those checksums aren't retained for later
integrity checking.
I've seen projects with SquashFS that store integrity data separately but leverage other infrastructure. Methods I've seen so far include:

* GPG-signed SquashFS images, usually with detached signatures
* SquashFS with PAR2 integrity checking data
* SquashFS on top of dm-verity
* SquashFS on top of dm-integrity

The first two have to be checked externally before mounting, but doing so is not hard. The fourth is tricky to set up correctly, but integrates better with encrypted images. The third does exactly what's needed, though: you use the embedded data variant of dm-verity, bind the resulting image to a loop device, activate dm-verity on the loop device, and mount the resulting mapped device like any other SquashFS image.
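
For the dm-verity case, the sequence could be sketched roughly as below. This only builds the command lines rather than running them. It assumes "embedded" means the verity hash tree is appended to the same image file at a block-aligned offset, and that the installed cryptsetup provides the `veritysetup format` / `veritysetup open` subcommands with `--hash-offset` (older releases used a different subcommand name and argument order). The function names, the device-mapper name, the loop device and the mount point are all placeholders picked for illustration.

```python
VERITY_BLOCK = 4096  # assumed block size; must match what veritysetup uses


def hash_offset_for(data_size):
    """First block boundary at or after the end of the SquashFS data."""
    return -(-data_size // VERITY_BLOCK) * VERITY_BLOCK


def verity_build_command(image, data_size):
    """Build time: append the hash tree to the image (prints the root hash)."""
    return ["veritysetup", "format",
            f"--hash-offset={hash_offset_for(data_size)}", image, image]


def verity_mount_commands(image, data_size, root_hash,
                          loopdev="/dev/loop0",       # placeholder
                          name="sqfs-verified",       # placeholder
                          mountpoint="/mnt/image"):   # placeholder
    """Run time: loop device -> dm-verity mapping -> read-only mount."""
    offset = hash_offset_for(data_size)
    return [
        ["losetup", "--read-only", loopdev, image],
        # data device and hash device are the same loop device
        ["veritysetup", "open", f"--hash-offset={offset}",
         loopdev, name, loopdev, root_hash],
        ["mount", "-t", "squashfs", "-o", "ro",
         f"/dev/mapper/{name}", mountpoint],
    ]
```

The root hash printed at build time is the one value that has to be distributed through a trusted channel (for example inside a signed manifest); data_size is the size of the SquashFS portion before the hash tree was appended.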

I've also seen some talk of using SquashFS with IMA and IMA appraisal, but I've not seen anybody actually _do_ that, and it wouldn't be on quite the level you seem to want (it verifies the files in the image, but not the image as a whole).
