> Does that come close enough to your "explainable differences" to alleviate concerns about lack of bit-to-bit reproducibility?
Of course the devil is in the details, but I would say - personally, because it's not really my decision, it's more of an infra policy - if you can write a script (purely from sources you can read) and release it together with the sources, and if you can run that script on two files and it shows and explains all the differences between the same package built locally and in the cloud, then this meets **my** definition of "reproducible enough".

However (according to my reproducibility check), this only matters if you actually **do** verify it - that is, if 3 PMC members actually build the thing (locally), compare the results, and give their +1s. That is at least what we do in Airflow, for example. I'm not sure if this meets your "decrease load" expectation, because for me "reproducibility" is not just something that exists "in theory" (even bit-to-bit) but something that is actually verified. It's a property of the release process that 3 PMC members check during the voting process.

Also, the question is where the input files are downloaded from. I assume from our SVN, and there is a similar reproducibility check before producing those. :)

In short, for me the goal of building things in GH is not to **reduce the load** - in fact, I think proper security for a release introduces some healthy friction and requires effort. It's deliberately designed to make people think, pause, and actually verify that what they are doing is good and secure. Rather than striving to spend as little effort as possible, the optimisation goal is to ensure we catch all potential deviations and perform manual inspection of the process to catch potential process issues.

But maybe I'm also overthinking it :) - I'd love to hear what others think as well.

J.

On Sat, Mar 21, 2026 at 5:53 PM Neil C Smith <[email protected]> wrote:

> Hi Jarek,
>
> Thanks. Lots to think about ... however,
>
> On Sat, 21 Mar 2026 at 12:29, Jarek Potiuk <[email protected]> wrote:
> > ... it's the
> > reason why dropping reproducible builds and switching to GH makes it a bit
> > more dangerous.
>
> Let's be clear this is not what we're looking to do. You don't need
> to convince me that reproducible builds are a desirable thing. This
> particular process is not reproducible (at a bit-for-bit level, anyway
> - more below). And it is highly unlikely to be so, for the
> foreseeable future at least. This is partly because of the very thing
> we want to use GHA for: accessing code signing. We have up to 4
> different levels of code signing, each building on the output of the
> other, using OS native tooling. Not to mention other OS native
> tooling in use for other aspects of the installer building.
>
> So, this is very much coming from the perspective that reproducible
> would be great, it's not happening, so what is the next best thing we
> can do? It is also definitely not about "rogue release managers"! :-)
> It is about reducing workload, making installers possible again, and -
> having done our over-complicated releases many times - having more eyes
> on possible issues before and after release.
>
> > But it's not either-or; you do not have to have bit-to-bit reproducibility.
> > People often mistake those two. I don't even think this is what INFRA
> > expects. As long as whatever you produce has "explainable differences,"
> > this is still reproducibility and serves the same purpose as bit-to-bit.
> > Bit-for-bit verification is easy, but if you can demonstrably show and
> > verify that the difference between two artifacts you produce comes only
> > from metadata (timestamps, file permissions, and so on) and not the code -
> > it's still reproducibility.
> >
> > This is actually the best of both worlds: when you can provide GH
> > "controlled environment" and some kind of reproducibility, you have all the
> > benefits of both.
> >
> > So maybe that is worth exploring?
>
> Maybe we already have? To explain, besides use of the different OS
> runner images plus Java runtime, the workflow here only downloads two
> additional binaries, both releases of the Apache NetBeans project.
> The first is the cross-platform IDE binary Zip bundle itself, the
> second is the NBPackage tool. Only the latter is actually executed on
> the runner - the former is purely payload.
>
> NBPackage is deliberately designed to bundle from a binary zip, so
> that after installing any OS-specific package, you can binary diff the
> installed files. Therefore it is possible to verify all the
> unmodified files, the few additional files generated by the packager,
> and the small list of modified files with signed native binaries within
> them.
>
> This does not 100% guarantee that the installer packages themselves
> are correct, although they are mostly introspectable with the right
> tooling.
>
> Does that come close enough to your "explainable differences" to
> alleviate concerns about lack of bit-to-bit reproducibility?
>
> Thanks and best wishes,
>
> Neil
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
