hello,

In last week's email to the reproducible-builds mailing list[1] about reproducible Arch Linux, I mentioned that there was only one unreproducible package left in docker.io/library/archlinux.

[1]: https://lists.reproducible-builds.org/pipermail/rb-general/2024-March/003291.html

Thanks to amazing work by dvzrv and Foxboron, this package is now also reproducible!

 INFO  arch_repro_status > All packages are reproducible!
 INFO  arch_repro_status > Your system is 100.00% reproducible.

To try it yourself, run:

podman run --rm -t archlinux sh -c 'pacman -Suy arch-repro-status --noconfirm && arch-repro-status'

However, where do we go from here? It would be cool if the OCI container image itself could also be reproduced bit-for-bit, but I'm not sure whether there's any prior work on this (specifically for images listed as 'official' on Docker Hub).

To be specific, given a line like this:

FROM archlinux@sha256:2dbd72d1e5510e047db7f441bf9069e9c53391b87e04e5bee3f379cd03cec060

I want to reproduce the artifact(s) pulled in by this line, using the packages our Arch Linux rebuilders have reproduced from source code. From what I understand, this hash points to a JSON manifest that is not contained in the container image itself and was generated by the registry (should we archive them?). This manifest then points to the sha256 of the tar containing the filesystem (I'm possibly missing an indirection here).
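To make the chain of hashes concrete, here is a minimal sketch of how I understand the OCI image format: the digest in the FROM line is the sha256 of the manifest bytes, and the manifest in turn lists the digests of a config blob and of the layer tarball(s). For multi-platform images the pinned digest may instead refer to an image index that lists one manifest per platform, which could be the extra indirection. The blobs below are hypothetical placeholders, not the real archlinux image contents.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Return an OCI-style content digest for a blob."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def descriptor(media_type: str, blob: bytes) -> dict:
    """Describe a blob the way a manifest references it."""
    return {"mediaType": media_type, "digest": digest(blob), "size": len(blob)}


def build_manifest(layer: bytes, config: bytes) -> bytes:
    """Serialize a single-layer image manifest that points at both blobs."""
    manifest = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": descriptor("application/vnd.oci.image.config.v1+json", config),
        "layers": [descriptor("application/vnd.oci.image.layer.v1.tar", layer)],
    }
    return json.dumps(manifest).encode()


# Hypothetical stand-ins for the real blobs.
layer = b"placeholder for the rootfs tarball"
config = json.dumps({"architecture": "amd64", "os": "linux"}).encode()

manifest_bytes = build_manifest(layer, config)
# The image reference is pinned to the hash of the manifest, not of the tarball.
print("FROM archlinux@" + digest(manifest_bytes))
```

Changing a single byte of the layer changes its digest inside the manifest, which changes the manifest bytes and therefore the pinned image digest.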

Hopefully one of the many SBOM formats can help with this. :P

I know the container image is built from these two repositories, but I don't have any in-depth knowledge of them:

- https://github.com/docker-library/official-images/blob/master/library/archlinux
- https://gitlab.archlinux.org/archlinux/archlinux-docker

The only work towards reproducible container images I'm aware of is by Akihiro Suda:

https://github.com/reproducible-containers/repro-get#are-container-images-bit-to-bit-reproducible

I suspect the current scripts used by Arch Linux would still be prone to mirror changes[2] though, meaning new package uploads would end up in our reproduced artifacts (causing mismatches), so the container image could only be reproduced for a short amount of time.

[2]: https://gitlab.archlinux.org/archlinux/archlinux-docker/-/blob/98cd79111dd530447f491d547d14f3c38e227e46/scripts/make-rootfs.sh#L24-29
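One possible way around the moving mirror, sketched here as an assumption rather than something the current scripts do, would be to install from a dated snapshot of the Arch Linux Archive instead of a live mirror. The helper name `pinned_server_line` is hypothetical; the URL layout is the one the archive uses for its daily repository snapshots, and the date below is only an example.

```python
from datetime import date

# Base URL of the Arch Linux Archive.
ARCHIVE = "https://archive.archlinux.org"


def pinned_server_line(snapshot: date) -> str:
    """Build a pacman mirrorlist entry for a dated repository snapshot."""
    # $repo and $arch are left literal: pacman expands them itself.
    return f"Server = {ARCHIVE}/repos/{snapshot:%Y/%m/%d}/$repo/os/$arch"


print(pinned_server_line(date(2024, 3, 20)))
```

With the snapshot date recorded next to the image, a later rebuild would resolve the same package versions regardless of newer uploads.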

I'm also not sure if there's a missing puzzle piece with reproducible containers with regard to this manifest JSON that is generated by the registry. The image digest being unpredictable has also been mentioned in a cosign GitHub issue[3].

[3]: https://github.com/sigstore/cosign/issues/2516
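A small sketch of why the manifest bytes matter: a digest is computed over the exact serialized bytes, so two manifests with identical content but different JSON formatting hash differently. The manifest below is a hypothetical placeholder (the all-zero layer digest is made up), and I'm not claiming this is what any particular registry or tool does.

```python
import hashlib
import json

# Hypothetical manifest content; the layer digest is a dummy value.
manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "layers": [{"digest": "sha256:" + "0" * 64, "size": 1234}],
}

# Same content, two different serializations.
compact = json.dumps(manifest, separators=(",", ":")).encode()
pretty = json.dumps(manifest, indent=3).encode()

compact_digest = hashlib.sha256(compact).hexdigest()
pretty_digest = hashlib.sha256(pretty).hexdigest()

print("same content:", json.loads(compact) == json.loads(pretty))
print("same digest: ", compact_digest == pretty_digest)
```

So reproducing the filesystem tarball alone would not be enough to arrive at the same image digest; whatever writes the manifest would also need to emit identical bytes.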

Input much appreciated!

## Caveats

Probably worth mentioning: at the time of writing there's no consensus across multiple orgs yet. The https://reproducible.archlinux.org instance reports this status, but two other rebuilders don't report the full 100% yet.

$ arch-repro-status -r https://reproducible.crypto-lab.ch
[...]
 INFO  arch_repro_status > 3/118 packages are not reproducible.
 INFO  arch_repro_status > Your system is 97.46% reproducible.

$ arch-repro-status -r https://wolfpit.net/rebuild
[...]
 INFO  arch_repro_status > 3/118 packages are not reproducible.
 INFO  arch_repro_status > Your system is 97.46% reproducible.
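For reference, the percentage follows directly from the counts above. This is only the arithmetic; I'm assuming two-decimal rounding, and the function name is made up for illustration.

```python
def reproducible_percent(total: int, unreproducible: int) -> float:
    """Share of installed packages that a rebuilder reports as reproducible."""
    return 100 * (total - unreproducible) / total


# 3 of 118 packages unreproducible, as reported by both rebuilders.
print(f"{reproducible_percent(118, 3):.2f}%")  # 97.46%
```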

The packages in question are part of this rebuild todo (specifically gcc-libs, glibc, ncurses):

https://archlinux.org/todo/rebuild-core-with-reproducible-pacman/

This means there's currently some luck involved for these 3 packages, e.g. using btrfs currently increases your chances of getting an exact match (after a few tries). We're obviously trying to get rid of this caveat though.

---

If you appreciate this flavor of supply-chain security, you may be interested in repro-env[4], which I'm currently trying to land[5] in Ubuntu 24.04 LTS, but which is blocked by Debian's libnettle[6].

[4]: https://github.com/kpcyrd/repro-env
[5]: https://tracker.debian.org/pkg/rust-repro-env
[6]: https://tracker.debian.org/pkg/nettle

cheers,
kpcyrd
