Re: Arch Linux minimal container userland 100% reproducible - now what?
John Gilmore wrote:
> It seems to me that the next step in making the Arch release ISOs
> reproducible is to have the Arch release engineering team create a
> source-code release ISO that matches each binary release ISO. Then you
> (or anyone) could test the reproducibility of the release by having
> merely those two ISO images and a bare amd64 computer (without even an
> Internet connection).

kpcyrd wrote:
> I think this falls under "bootstrappable builds", a bare amd64 computer
> still needs something to boot into (a CD with only source code won't do
> the trick).

Bootstrappable builds are a different thing. Worthwhile, but not what I was asking for. I just wanted provable reproducibility from two ISO images and nothing more.

I was asking that a bare amd64 be able to boot from an Arch Linux *binary* ISO image. And then be fed a matching Arch Linux *source* ISO image. And that the scripts in the source image would be able to reproduce the binary image from its source code, running the binaries (like the kernel, shell, and compiler) from the binary ISO image to do the rebuilds (without Internet access).

This should be much simpler than doing a bootstrap from bare metal *without* a binary ISO image.

And if your source/binary ISO images can do that, it's not just an academic exercise in reproducibility. It can also produce a new binary ISO that is built from that source ISO plus a few patches (e.g. for fixing security issues). Or, it can "recompile-the-world" after you (or any user) make a small change to a kernel, include file, library, or compiler -- and show exactly how many programs compile to something *different* as a result.

Basically, that pair of ISOs becomes a seed that can carry forward, or fork, the whole distribution. For anybody who receives them. That is the promise of free software, but the complexity of modern distros plus the convenience of ubiquitous Internet have inadvertently tended to undermine that promise. Until the reproducible builds effort!

If someday an Electromagnetic Pulse weapon destroys all the running computers, we'd like to bootstrap the whole industry up again, without breadboarding 8-bit micros and manually toggling in programs. Instead, a chip foundry can take these two ISOs and a bare laptop out of a locked fire-safe, reboot the (Arch Linux) world from them, and then use that Linux machine to control the chip-making and chip-testing machines that can make more high-function chips. (This would depend on the chip-makers keeping good offline fireproof backups of their own application software -- but even if they had that, they can't reboot and maintain the chip foundry without working source code for their controller's OS.)

	John
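The "recompile-the-world" comparison described in the message above (counting how many programs compile to something different after a change) can be illustrated with a small sketch. It assumes the outputs of the two builds have been unpacked into two directory trees; the function names `tree_digests` and `changed_files` and the demo file names are hypothetical, and this is not a tool from the thread.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before, after):
    """List files that are missing from one tree or differ between the two."""
    a, b = tree_digests(before), tree_digests(after)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))


# Demo with two tiny hypothetical build outputs: only bin/tool changes.
with tempfile.TemporaryDirectory() as tmp:
    before, after = Path(tmp, "before"), Path(tmp, "after")
    for root, tool_bytes in ((before, b"v1"), (after, b"v2")):
        (root / "bin").mkdir(parents=True)
        (root / "bin" / "tool").write_bytes(tool_bytes)
        (root / "bin" / "shell").write_bytes(b"unchanged")
    changed = changed_files(before, after)
    print(f"{len(changed)} file(s) differ: {changed}")
```

With fully reproducible builds, any file reported here differs because of the change that was made, not because of noise in the build process.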
Re: Verifying reproducibility of Java builds from Maven Central
On Thu, Mar 28, 2024, at 16:41, Railean, Alexander via rb-general wrote:
> I am trying to understand how someone can independently verify the
> reproducibility of Java projects on Maven Central. Having explored the
> repositories on Maven Central, I could not find examples where the
> “buildinfo” file was present.

Publishing a buildinfo to Maven Central is indeed relatively uncommon.

> The archives of this mailing list pointed out examples such as
> https://repo1.maven.org/maven2/com/typesafe/akka/akka-actor_2.13/2.6.4/akka-actor_2.13-2.6.4.buildinfo,
> and yet my understanding is that this is not enough [but why?], hence
> reproducible-central was created to address some sort of gap.
>
> So far, my mental model is that:
> • By including buildinfo in the artifacts on Maven Central, library authors
>   empower users to check for themselves if the build is reproducible or not.
> • Reproducible-central takes it a step further and attempts to do a build
>   and then gives you a “yes/no” result.
>
> Thus, the former makes the problem solvable in principle, whereas the latter
> actually solves it. Is my understanding correct?

Mostly: publishing the buildinfo is optional; it is possible to have a reproducible build without publishing the buildinfo metadata (but you might need some other way to convey the requirements for your build environment). Indeed, reproducible-central has successfully rebuilt many artifacts that haven't published a buildinfo.

> Besides that, I have some additional questions:
> 1. Can you provide references to documentation that explains how to make sure
>    buildinfo ends up on Maven Central?

In the case of Akka, they/we use the https://github.com/raboof/sbt-reproducible-builds/ plugin for the sbt build tool that is used to build Akka.

> 2. Is there a tutorial that describes how to get featured on Reproducible
>    Central?
>
> I had a look at
> https://github.com/jvm-repo-rebuild/reproducible-central/blob/master/doc/BUILDSPEC.md,
> and my understanding is that this is not working for projects built on
> Windows, because it relies on rebuild.sh, which implies one has bash. The
> library I publish on Maven Central is built on a Windows computer – does this
> mean that I won’t be able to list it in reproducible-builds?

Hmm, that sounds tricky. However, my experience with Java/Maven is that it is often possible to achieve reproducibility across operating systems: artifacts built on MacOS can often be rebuilt on Linux and vice-versa, so perhaps the same is also true for Windows?

Kind regards,

--
Arnout Engelen
Engelen Open Source
https://engelen.eu
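When a rebuild on another operating system does not match the published artifact byte for byte, it helps to see which entries inside the jar differ. The following is a minimal sketch of such a check, relying only on the fact that jar files are zip archives. The helper names and the demo contents are hypothetical, and it does not represent how reproducible-central performs its comparison. It ignores entry order and other archive metadata, which can also affect reproducibility.

```python
import tempfile
import zipfile
from pathlib import Path


def entry_summary(jar_path):
    """Map each archive entry name to (CRC-32, uncompressed size, timestamp)."""
    with zipfile.ZipFile(jar_path) as jar:
        return {i.filename: (i.CRC, i.file_size, i.date_time) for i in jar.infolist()}


def differing_entries(reference, rebuilt):
    """Names of entries that are missing or differ in content or timestamp."""
    a, b = entry_summary(reference), entry_summary(rebuilt)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


def write_jar(path, entries, timestamp):
    """Write a small archive with a fixed timestamp on every entry (demo helper)."""
    with zipfile.ZipFile(path, "w") as jar:
        for name, data in entries.items():
            jar.writestr(zipfile.ZipInfo(name, date_time=timestamp), data)


# Demo: two hypothetical builds that differ only in manifest line endings.
FIXED_TIME = (2024, 1, 1, 0, 0, 0)
with tempfile.TemporaryDirectory() as tmp:
    reference = Path(tmp) / "reference.jar"
    rebuilt = Path(tmp) / "rebuilt.jar"
    write_jar(reference, {"A.class": b"\xca\xfe\xba\xbe",
                          "META-INF/MANIFEST.MF": b"Manifest-Version: 1.0\n"}, FIXED_TIME)
    write_jar(rebuilt, {"A.class": b"\xca\xfe\xba\xbe",
                        "META-INF/MANIFEST.MF": b"Manifest-Version: 1.0\r\n"}, FIXED_TIME)
    differing = differing_entries(reference, rebuilt)
    print(differing)
```

An empty result means the entries match in content and timestamp; the archives themselves could still differ in entry order or compression settings.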
Verifying reproducibility of Java builds from Maven Central
Hi everybody,

I am trying to understand how someone can independently verify the reproducibility of Java projects on Maven Central. Having explored the repositories on Maven Central, I could not find examples where the "buildinfo" file was present.

The archives of this mailing list pointed out examples such as https://repo1.maven.org/maven2/com/typesafe/akka/akka-actor_2.13/2.6.4/akka-actor_2.13-2.6.4.buildinfo, and yet my understanding is that this is not enough [but why?], hence reproducible-central was created to address some sort of gap.

So far, my mental model is that:
* By including buildinfo in the artifacts on Maven Central, library authors empower users to check for themselves if the build is reproducible or not.
* Reproducible-central takes it a step further and attempts to do a build and then gives you a "yes/no" result.

Thus, the former makes the problem solvable in principle, whereas the latter actually solves it. Is my understanding correct?

Besides that, I have some additional questions:
1. Can you provide references to documentation that explains how to make sure buildinfo ends up on Maven Central?
2. Is there a tutorial that describes how to get featured on Reproducible Central?

I had a look at https://github.com/jvm-repo-rebuild/reproducible-central/blob/master/doc/BUILDSPEC.md, and my understanding is that this is not working for projects built on Windows, because it relies on rebuild.sh, which implies one has bash. The library I publish on Maven Central is built on a Windows computer - does this mean that I won't be able to list it in reproducible-builds?

Looking forward to your feedback,
Alex
Re: Arch Linux minimal container userland 100% reproducible - now what?
On 3/26/24 5:03 PM, Michael Schierl via rb-general wrote:
> So we can expect many year/month pairs embedded in manpages that got
> unnoticed since mostly the build happens in the same month? Or have
> they been manually vetted?

The results on reproducible.archlinux.org don't aim to guarantee the absence of reproducible builds issues, they instead aim to confirm the binary can be built from the given source code and build instructions (which is, at least for me, why I'm working on reproducible builds, since this means we can take the source code at face value for what's in the binaries). Embedded timestamps are considered bad because they are usually a show-stopper for this (and timestamps with second/minute precision still are for us).

There's a different kind of system that tries to prove the absence of reproducible builds issues - I've referred to this as "build environment fuzzing" in the past and it's the kind of thing tests.reproducible-builds.org does. These results also still exist for Arch Linux[1] (since 2017), and if you're concerned about this you could check over there, but since Arch Linux _integrates_ with other eco-systems (instead of re-implementing them like Debian tries to), some builds fail to build if the clock is too far off, since https certificates would be considered expired. There's a lot of `curl -k` going on to work around this, but e.g. cargo has no option to "turn off all security", so these packages simply won't build on there.

[1]: https://tests.reproducible-builds.org/archlinux/

In late 2019 it turned out to be easier to "do the real thing" instead of trying to find more workarounds, and "not having enough true-positives" isn't really a problem we're having at the moment. If you find a false-negative please shout.

If anybody is bothered by the claims Arch Linux is making they're very welcome to run a rebuilder with a clock that is off by 48h (this would be interesting to have, but still wouldn't guarantee the absence of other reproducible builds issues, like missing Cargo.lock files).

> Apart from Guix pushing bootstrappable builds for quite some time,
> recent builds of Freedesktop SDK (container userland mostly used for
> flatpaks) are fully bootstrapped from stage0 - except for Rust which is
> not bootstrapped via mrustc but built using the binary package from
> upstream.

Is there any public website I could look at for results? According to our tests, having reproducible distro tooling isn't enough because there's still plenty of opensource software doing silly things in their build processes.

> Assuming I wanted to bootstrap some (non-reproducible) Arch setup from
> Freedesktop SDK and then use it to verify the reproducible builds, what
> steps would I have to take?

If you want to bootstrap the 114 packages that are present in docker.io/library/archlinux from source, you would need to:

- Build any version of pacman (which is C and shell scripts, but for makepkg you might even get away with just the shell scripts)
- Download all 114 buildinfo files for these packages (they are contained inside of the package itself)
- Identify all packages and their versions that are referenced in there as build dependency
- Build these packages on Freedesktop SDK with `makepkg --nodeps`; this disables dependency checks and simply assumes the required tools/compilers are going to be in $PATH. The checksums of packages built this way are naturally going to be different from the official packages but that's ok
- Use the packages you built to set up the build environment that is described in each buildinfo file
- Run the build with makepkg and SOURCE_DATE_EPOCH set to the value in the buildinfo file

This should result in exact matches of the official packages, but of course there are a few things that could go wrong so I cannot make any guarantees.

Instead of doing the last two steps you could also remove the signature checks in archlinux-repro[2] and populate its download cache folder with the packages you built yourself, archlinux-repro then takes care of the rest.

[2]: https://github.com/archlinux/archlinux-repro

> Has anything like that been tried for Arch? How many dependency loops
> are there in the build dependencies of the packages mentioned above,
> and can they be broken by using packages from Freedesktop SDK?

I'm not aware of anybody having tried this. There wasn't much point in trying without having achieved reproducible builds first.

cheers,
kpcyrd
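The "identify all packages and their versions" step in the list above can be sketched as a small parser. This is a minimal sketch under stated assumptions: that the buildinfo file consists of `key = value` lines, that build dependencies appear as repeated `installed = name-pkgver-pkgrel-arch` entries, and that the `builddate` field is the value meant for SOURCE_DATE_EPOCH. These should be checked against the BUILDINFO documentation. The function names and the sample contents are hypothetical and not taken from a real package.

```python
from collections import defaultdict


def parse_buildinfo(text):
    """Parse `key = value` lines into a dict of lists (keys may repeat)."""
    fields = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(" = ")
        if sep:  # ignore lines that are not key/value pairs
            fields[key].append(value)
    return dict(fields)


def split_installed(entry):
    """Split 'name-pkgver-pkgrel-arch' into (name, 'pkgver-pkgrel', arch).

    Package names may contain hyphens, so the split is done from the right.
    """
    name, pkgver, pkgrel, arch = entry.rsplit("-", 3)
    return name, f"{pkgver}-{pkgrel}", arch


# Hypothetical sample contents, for illustration only.
SAMPLE = """\
format = 2
pkgname = example
pkgver = 1.0-1
builddate = 1700000000
installed = gcc-libs-13.2.1-3-x86_64
installed = pacman-6.0.2-8-x86_64
"""

info = parse_buildinfo(SAMPLE)
deps = [split_installed(entry) for entry in info["installed"]]
source_date_epoch = int(info["builddate"][0])

for name, version, arch in deps:
    print(name, version, arch)
print("SOURCE_DATE_EPOCH =", source_date_epoch)
```

Running this over all 114 buildinfo files and merging the results would give the set of package versions that need to be built with `makepkg --nodeps` first.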