I experimented with verification builds: building packages that were recently built by the Debian buildd infrastructure, relatively soon after the .buildinfo files are made available, without relying on snapshot.debian.org... with the goal of getting bit-for-bit identical verification of newly added packages in the Debian archive.
Overall, I think the results are promising and we should actually try something like this in a more systematic way! Fair warning, this has turned into quite a long email...

* Background

For the most part in Debian, we have been doing CI builds, where a package is built twice and the results compared, but this does not verify packages in the official Debian archive. It is useful, especially for catching regressions in toolchains and such, but verifying the packages people actually use is obviously desirable. In order to actually perform a verification build, you need the exact same packages installed in a build environment...

There was a beta project performing verification builds that appears to have stalled sometime in 2022:

  https://beta.tests.reproducible-builds.org/

From what I recall, one of the main challenges was the reliability of the snapshot.debian.org service, which led to the development of an alternative snapshotting service, although that is currently not yet completed... At some point, debsnapshot was used to perform some limited testing, but this was also dependent on a reliable snapshot.debian.org.

There have been several other attempts at rebuilders for Debian, but the main challenge usually seems to come down to having a working snapshot service in order to sufficiently reproduce the build environment a package was originally built in...

* Summary of approach for this experiment

Copy a .buildinfo file from either:

  coccia.debian.org:/srv/ftp-master.debian.org/buildinfo/2023/09/16
  https://buildinfos.debian.net/ftp-master.debian.org/buildinfo/2023/09/16

or other dates, but something fairly recent for best results...

Create a package-specific snapshot of all the exact versions of packages in the .buildinfo file (Installed-Build-Depends).
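As an aside, extracting those exact versions is straightforward. Here is a minimal sketch (not the experiment's actual scripts, and the sample data is made up) of pulling the (package, version) pairs out of a .buildinfo's Installed-Build-Depends field, i.e. the list a package-specific snapshot has to provide:

```python
import re

# Minimal sketch: parse the exact (package, version) pairs from the
# Installed-Build-Depends field of a .buildinfo (deb822-style format).
def parse_installed_build_depends(buildinfo_text):
    deps = []
    in_field = False
    for line in buildinfo_text.splitlines():
        if line.startswith("Installed-Build-Depends:"):
            in_field = True
            continue
        if in_field:
            # Continuation lines start with whitespace; anything else
            # means the next field has begun.
            if not line[:1].isspace():
                break
            # Entries look like " autoconf (= 2.71-3)," -- this regex
            # ignores arch-qualified edge cases for simplicity.
            m = re.match(r"\s*([0-9a-zA-Z.+:-]+) \(= ([^)]+)\)", line)
            if m:
                deps.append((m.group(1), m.group(2)))
    return deps

# Hypothetical sample, heavily trimmed:
sample = """\
Format: 1.0
Source: hello
Installed-Build-Depends:
 autoconf (= 2.71-3),
 base-files (= 13),
 gcc-13 (= 13.2.0-4)
Environment:
 DEB_BUILD_OPTIONS="parallel=8"
"""
print(parse_installed_build_depends(sample))
# [('autoconf', '2.71-3'), ('base-files', '13'), ('gcc-13', '13.2.0-4')]
```

A real implementation would want a proper deb822 parser (e.g. python-debian) rather than a regex, but this shows the shape of the data involved.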
Build the package with the exact versions from the .buildinfo file added as build-dependencies, with the package-specific snapshot added to the available repositories (as well as a bunch of others), leveraging "sbuild --build-dep-resolver=aptitude" to resolve the potentially complicated build dependencies.

This supports sid and experimental reasonably well, including binNMUs. It also supports the few bookworm-proposed-updates and bookworm-backports .buildinfo files to some degree. Not sure where to get .buildinfo files from debian-security, but I would love to test those as well! In theory it supports trixie as well, but nearly all packages for trixie currently get built in sid/unstable rather than directly in trixie.

I found that building sid and experimental worked best starting with a slightly out-of-date trixie tarball, as it was almost always easier to upgrade packages than to downgrade. Currently bookworm-proposed-updates and bookworm-backports are fairly stable, although possibly the same issue might apply.

* Package-specific snapshots vs. complete snapshots

I have mixed feelings about the package-specific snapshots. They solve the problem of getting old versions of packages to verify the build (or at least could, with a bit more work), but with some drawbacks (custom apt keyring, redundant information in *many* little snapshots, kind of complicated).

Having explored package-specific snapshots, I think a better approach might be to make forward-looking snapshots of ftp.debian.org, incoming.debian.org and ideally security.debian.org (in addition to snapshot.debian.org or a replacement)... With locally available complete snapshots, each .buildinfo can be processed as soon as possible to find the list of snapshots that would satisfy the dependencies (reducing the likelihood of having to rummage through older snapshots to find them)... and an addendum made to the .buildinfo file that includes enough information to fully resolve all the build dependencies,
allowing the build to be performed at some other time. This addendum might also need to recommend a snapshot for the build chroot or base tarball, though that might be a bit trickier. This could avoid having to leverage something like metasnap.debian.net, which can process a .buildinfo and spit out the relevant snapshots.

* The Code

My proof-of-concept collection of scripts, configuration, and total lack of documentation:

  https://salsa.debian.org/reproducible-builds/debian-verification-build-experiment

In retrospect, I clearly should have started by poking more at debrebuild and other prior art... oops!

This also did not handle the syncing of the .buildinfo files at all, which I did manually for this experiment, but that is a fairly straightforward problem, and buildinfos.debian.net does this already.

* Some actual results!

Testing only arch:all and arch:amd64 .buildinfos, I had decent luck with 2023/09/16:

  total buildinfos to check: 538
  attempted/building:        535
  unreproducible:             28  (5%)
  reproducible:              461 (85%)
  failed:                     46  (8%)
  unknown:                     3  (0%)

Overall, reasonable results. This day had quite a large number of .buildinfos to process relative to most days (most days are below 300; more on that below). I have not verified that these packages actually match the checksums of .deb packages in the archive, but they match the *-buildd.buildinfos, which is close enough for now.

There may be a small amount of double-counting for builds that were, for one reason or another, performed multiple times, potentially counted several times across failed, reproducible and unreproducible. And probably other smallish discrepancies... but the overall numbers seem representative.

Some of the failures were due to missing or unresolvable build-dependencies, some just regular build failures. Recent version changes of glibc, binutils, and gcc* caused some build-dependency resolution problems. The unknown are simply the discrepancy between how many builds were performed vs.
how many *-buildd.buildinfos were available for that day.

I also had similar results for 2023-09-15 and 2023-09-17, but... this morning most of those results mysteriously disappeared!?! No idea what happened to them. I had also done some earlier testing before I settled on this particular approach, and was still getting reasonably good results with those earlier experiments too.

* Partially reproducible?

A significant number of source packages produce multiple binary packages, of which frequently some are reproducible, even if not all of them are. It would be worth tracking that, as people do not necessarily use all the binary packages of a given source package.

I still want to someday make a partial mirror using packages that were successfully reproduced and match the ones in the official archive... as a very inefficient and unreliable rsync implementation!

* Number of .buildinfos per day

When I started this experiment, I thought of focusing only on a reduced subset of Debian, but quickly realized that a moderately powerful machine or two can usually handle the workload of all of the .buildinfos produced on a given day.

Just to get an idea of how many builds per day this is: looking at the number of *(all|amd64)-buildd.buildinfos per day since 2021, the vast majority of days have 299 or fewer buildinfos (888 days out of 990), with 599 or fewer covering most of the remaining days (93), and a handful of days with more builds (8). Our current CI on the tests.reproducible-builds.org amd64 builders tests thousands of packages per day most days, and that is building each package twice.

I excluded builds that were performed by maintainers, as they do not migrate to testing, are a small minority of .deb-related uploads, and are probably trickier to validate (e.g. arbitrary build paths, built on arbitrary days in the past due to NEW processing delays, possibly arch:amd64+all builds, etc.)...
maybe important to validate for some of the same reasons, but outside the scope for now.

* Time Troubles

One concern I have is that building relatively close in time may produce false positives for general reproducibility, due to building in the same year, month or day. I am not sure of the value of a verification build that can only be verified if performed in the same day, month or year. I guess verification builds could be retried at a later date, with some sort of snapshot service, to be more sure.

Since it is hard to control the time in the build environment, building in a VM with a future clock (+398 days) could work around this to get more confident results of reproducibility, although that may trigger other time-related failures.

* Package-specific caveats and doubts

The little package-specific snapshots for each .buildinfo do not recursively resolve the dependencies, instead relying for the most part on those landing in the official archive. With a bit more work, I suspect those dependencies could get fully resolved in these package-specific snapshots and it could be made more reliable... but I also think this might be the wrong approach.

One big downside of this approach is that it requires trusting another apt keyring, as these package-specific snapshots are not signed by the official Debian builders. One of the big advantages is that packages may depend on versions from a mix of ftp.debian.org dinstall runs, due to also pulling in packages from incoming.debian.org. But maybe that can be resolved in other ways.

For someone to be able to independently verify these builds, these package-specific snapshots would need to be published somehow.
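To make the "+398 days" idea from the Time Troubles section above a bit more concrete, here is a small sketch (a hypothetical helper, not part of the experiment's scripts) of why that particular offset is attractive: one year plus roughly a month means the verification build's clock never shares the year, month, day-of-month, or weekday with the original build date.

```python
from datetime import date, timedelta

# Hypothetical helper: pick a clock for a verification VM.
# 398 days = one year plus ~32 days, so year and month always change,
# the day-of-month extra (32/33 days) never spans whole months exactly,
# and 398 mod 7 == 6, so the weekday changes too.
def verification_clock(original_build_date: date, offset_days: int = 398) -> date:
    return original_build_date + timedelta(days=offset_days)

orig = date(2023, 9, 16)   # the batch of .buildinfos tested above
future = verification_clock(orig)
print(future)              # 2024-10-18
print(future.weekday() != orig.weekday())  # True
```

Actually setting such a clock in the build VM (e.g. via the hypervisor or a faketime-style wrapper) is a separate problem, as noted above; this only shows the date arithmetic behind the offset.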
* Looking forward and backwards at snapshots

I do think that a more complete snapshot approach is probably better than package-specific snapshots, and it might be worth doing forward-looking snapshots of ftp.debian.org (and security.debian.org and incoming.debian.org), in addition to trying to fill out all the missing past snapshots to be able to attempt verification builds of older packages, such as all of bookworm.

Snapshotting the archive(s) multiple times per day, today, tomorrow, and going forward will at least enable verification rebuilds of packages starting from this point, with less immediate overhead than trying to replicate the entire functionality or more complete history of snapshot.debian.org.

I wonder if having multiple snapshot.debian.org implementations might actually be a desirable thing, as it is so essential to the ability to do long-term reproducible-builds verification, and having additional independent snapshots could provide redundancy and the ability to repair breakages if one of the services fails in some way.

* In closing...

To me it seems viable to successfully do verification builds of most of the packages recently built. There are approaches to publishing snapshots of the archive that would make it possible to verify these builds at a later time by independent third parties. Everything is always harder than it looks, but maybe we can get *some* real-world verification sooner than later? :)

live well,
  vagrant
_______________________________________________
Reproducible-builds mailing list
Reproducible-builds@alioth-lists.debian.net
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/reproducible-builds