On 2024-03-05, John Gilmore wrote:
> A quick note:
> Vagrant Cascadian <vagr...@reproducible-builds.org> wrote:
>> It would be pretty impractical, at least for Debian tests, to test
>> without SOURCE_DATE_EPOCH, as dpkg has set SOURCE_DATE_EPOCH from
>> debian/changelog for quite a few years now.
>
> Making a small patch to the local dpkg to alter or remove the value of
> SOURCE_DATE_EPOCH, then trying to reproduce all the packages from source
> using that version of dpkg, would tell you which of them (newly) fail to
> reproduce because they depend on SOURCE_DATE_EPOCH.
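The experiment John describes can be sketched in miniature without patching dpkg. The toy script below (my sketch, using GNU tar's --mtime clamping as a stand-in for a real package build) builds the same source tree twice with SOURCE_DATE_EPOCH pinned and once without it, then compares the artifacts:

```shell
#!/bin/sh
# Toy sketch (GNU tar stands in for dpkg): build the same tree twice
# with SOURCE_DATE_EPOCH pinned, then once without it, and compare.
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir src
echo 'hello' > src/hello.txt

build() {
    # Clamp file mtimes to SOURCE_DATE_EPOCH when set, else use "now"
    tar --mtime="@${SOURCE_DATE_EPOCH:-$(date +%s)}" \
        --sort=name --owner=0 --group=0 -cf "$1" src
}

export SOURCE_DATE_EPOCH=1709596800    # arbitrary fixed timestamp
build a.tar
build b.tar                            # rebuild, same pinned epoch
unset SOURCE_DATE_EPOCH
build c.tar                            # rebuild without it

cmp -s a.tar b.tar && echo "pinned epoch: reproducible"
cmp -s a.tar c.tar || echo "no epoch: differs (timestamps leak in)"
```

The same diff-the-artifacts approach is what a per-variation sweep would automate across a whole archive.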
Sure... which brings us to...

>> Sounds like an interesting project for someone with significant spare
>> time and computing resources to take on!
>
> It looks to me like the whole Ubuntu source code (that gets into the
> standard release) fits in about 25 GB. The Debian 12.0.0 release
> sources fit in 83GB (19 DVD images). Both of these are under 1% of a
> 10TB disk drive that runs about $200. A recent Ryzen mini-desktop,
> with a 0.5TB SSD that could cache it all, costs about $300. Is this
> significant computing resources? For another $40 we could add a better
> heat sink and a USB fan. How many days would recompiling a whole
> release take on this $540 worth of hardware?

You also notably left out RAM requirements, which are almost more
important than CPU, from what I've seen!

You were not talking about a single pass through the archive; you asked
for a combinatorially explosive comparison (e.g. with and without build
paths, with and without SOURCE_DATE_EPOCH, with and without locale
differences, with and without username variations, etc.) ... and for it
to continue to be useful, you'd have to keep doing it... indefinitely.

Debian currently tests over 25 variations (most of which have actually
resulted in differences in the wild):

  https://tests.reproducible-builds.org/debian/index_variations.html

To systematically identify these "simply" by building each possible
combination for any significant set of software is a much larger task.
Obviously, you could narrow it to only the set of variations you want
to research, or to a limited package set.

At least for Debian, with what I would guess is significantly more
computing power than you've described, we usually did no better than
about 30 days from the oldest build, meaning some packages were always
behind. We also blacklist some packages that just take too much RAM,
disk, or time, though those are considerably less than 1% of ~35k
packages.
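To put rough numbers on that explosion (my arithmetic, counting every on/off combination of the variations; other counting conventions, such as pairwise interactions, give smaller totals):

```shell
#!/bin/sh
# Back-of-envelope combinatorics (my numbers, not measured): with 25
# independent on/off variations, a full-factorial sweep needs 2^25
# builds per package, versus the two builds per package that the
# Debian testing framework actually performs.
variations=25
packages=35000                       # roughly the size of the archive
full=$((1 << variations))            # every on/off combination
echo "full factorial: $full builds per package"
echo "current practice: 2 builds per package"
echo "full sweep of the archive: $((full * packages)) builds"
```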
More importantly, that is with only two builds per package, not testing
all 625 permutations of 25 interacting variations per package.

> (I agree that the "spare" time to set it up and configure the build
> would be the hard part. This is why I advocate for writing and
> releasing, directly in the source release DVDs, the tools that would
> automate the recompilation and binary comparison. The end user should
> be able to boot the matching binary release DVD, download or copy in
> the source DVD images, and type "reproduce-release".)

Automation can help significantly, although at some point you need to
write all that automation, write the code that processes the results
meaningfully, and verify that it is working correctly... and continue
to verify it as new package versions come in, and so on.

In short, easier said than done?

live well, vagrant