Re: "Reproducible build" definition in OpenSSF glossary

Timo Pohl Wed, 23 Apr 2025 05:39:36 -0700

I have been thinking about this for a bit myself, and at least from the point of view of someone trying to verify the reproducibility of a certain build, I would say the environment is definitely necessary, and if anything the current definition is too unspecific to allow generic verification.

> What matters is that someone is able to get the same bit-by-bit identical output.

I would argue that what's relevant is that *everyone* is able to get the same bit-by-bit identical output. Without specifying the relevant parts of the environment, you quickly get into a situation where some rebuilders happen to have an environment where the output artifact is the same, while others with a different environment get a different artifact. In my opinion, the inputs for a "reproducible build" should be specific enough so that anyone adhering to these inputs gets the same artifact out, and you don't need to get lucky, or get additional information from somewhere else to successfully reproduce the same artifact.

> However, I believe that achieving "verified reproducible builds" does not require reproducing the build environment bit-by-bit identically.

I agree! And I see that the given definition in the OpenSSF glossary may be a little misleading in this regard. However, the source definition on reproducible-builds.org further explains that it is about the *relevant attributes of the build environment*, which are up to the project maintainer to determine. That way, they ideally don't overspecify the build environment, and rebuilders don't have to bit-by-bit reproduce the whole original build environment. They just have to reproduce the parts that the maintainer has deemed relevant for the build to be reproducible.
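To illustrate the "relevant attributes" idea, here is a minimal Python sketch. It assumes the maintainer publishes the relevant attributes as simple key/value pairs; the function name `unmet_requirements` and the attribute names in the usage note are hypothetical and only serve as an example. A rebuilder checks only the declared attributes and ignores everything else about the local environment.

```python
from typing import Dict, Optional, Tuple


def unmet_requirements(
    relevant: Dict[str, str],
    local: Dict[str, str],
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return the declared attributes that the local environment does not match.

    `relevant` holds the attributes the maintainer deemed relevant
    (hypothetical key/value format). `local` describes the rebuilder's
    environment and may contain arbitrary additional keys, which are ignored.
    Each entry of the result maps an attribute name to
    (required value, local value or None if absent).
    """
    return {
        key: (required, local.get(key))
        for key, required in relevant.items()
        if local.get(key) != required
    }
```

With `relevant = {"compiler": "gcc-13", "SOURCE_DATE_EPOCH": "1700000000"}`, a rebuilder whose environment has those two values plus, say, a different hostname would get an empty result, so nothing beyond the declared attributes needs to match.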

In a recent pre-publication [1] I proposed a more formal definition of reproducibility:

> A tuple (source code, build instructions, build environment, artifacts) is considered reproducible, if executing the build instructions on the source code within the build environment always produces the same artifacts when compared via bit-by-bit equality.

But as this is designed as a more precise version for academic use, I can see that it may feel a little... clunky for a general purpose glossary.
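As a rough illustration of the tuple definition above, the following Python sketch models the build instructions as a function of (source code, build environment) and compares the resulting artifacts by SHA-256 digest, which stands in for bit-by-bit equality. All names here (`is_reproducible`, `deterministic_build`, `counting_build`) are hypothetical. Note that a finite number of runs can only falsify reproducibility; it cannot prove the "always" in the definition.

```python
import hashlib
import itertools
from typing import Callable, Dict, Mapping

Artifacts = Mapping[str, bytes]
Build = Callable[[bytes, Mapping[str, str]], Artifacts]


def digest_artifacts(artifacts: Artifacts) -> Dict[str, str]:
    """Map each artifact name to the SHA-256 digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in artifacts.items()}


def is_reproducible(build: Build, source: bytes, environment: Mapping[str, str], runs: int = 3) -> bool:
    """Run the build `runs` times and check that every run yields identical artifacts.

    Returns False as soon as one run differs from the first. A True result
    only means no difference was observed in `runs` attempts.
    """
    reference = digest_artifacts(build(source, environment))
    for _ in range(runs - 1):
        if digest_artifacts(build(source, environment)) != reference:
            return False
    return True


def deterministic_build(source: bytes, environment: Mapping[str, str]) -> Artifacts:
    """Toy build whose output depends only on its declared inputs."""
    return {"out.bin": source + environment["compiler"].encode()}


_run_counter = itertools.count()


def counting_build(source: bytes, environment: Mapping[str, str]) -> Artifacts:
    """Toy build that leaks hidden state (a run counter) into the artifact."""
    return {"out.bin": source + str(next(_run_counter)).encode()}
```

Here `is_reproducible(deterministic_build, b"src", {"compiler": "gcc-13"})` holds, while the same call with `counting_build` fails because an input outside the tuple influences the artifact.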


Best
Timo

On 23.04.25 11:07, Simon Josefsson via rb-general wrote:

"David A. Wheeler via rb-general"
<[email protected]> writes:

The OpenSSF is building a "glossary" set (so we consistently use the
same meaning for the same term), and I drafted a definition for
"reproducible build" based on this group:

https://glossary.openssf.org/reproducible-build/

Thanks.  I think the "same source code, build environment and build
instructions" part may lead people the wrong way.

Others may have different goals, but for me the point of supporting
reproducible builds is so that we can get to "verified reproducible
builds", which to me is what matters for end-users.  It is great that
you mention this goal above!  It seems this goal is often forgotten.

With that goal in mind, I don't think it matters what the build input or
build environment is.

What matters is that someone is able to get the same bit-by-bit
identical output.

Where I think people may go wrong with the text above is that you are
led to believe that there is a one-to-one mapping involved for the build
environment.

However, I believe that achieving "verified reproducible builds" does
not require reproducing the build environment bit-by-bit identically.

For example, if I'm able to independently rebuild Debian's version of
Firefox using the same Firefox source code but some other build
environment, I would still count the Firefox binary as a "verified
reproducible build".  Does anyone disagree with that?  Why?

Here is my attempt at clarification:

OLD:

    A build is reproducible if given the same source code, build
    environment and build instructions, any party can recreate bit-by-bit
    identical copies of all specified artifacts.

NEW:

    A build is reproducible if given the same source code, any party can
    recreate bit-by-bit identical copies of all specified artifacts.
    Information about the build environment and build instructions is
    usually needed to achieve that.

What do you think?

Btw, I recently wrote about verifying reproducible source tarballs:

https://blog.josefsson.org/2025/04/17/verified-reproducible-tarballs/

Turns out I was not able to reproduce any upstream-published tarballs
that I looked at.  Does anyone know of any earlier systematic efforts to
verify reproducibility of source tarballs in a similar way?  Is anyone
interested in working on this, for a couple of high-profile packages to
see if we are able to reproduce them?

/Simon


--
Timo Pohl

Institut für Informatik IV             Raum:   1.018
Universität Bonn                       Tel.:   +49 228 73-54246
Friedrich-Hirzebruch-Allee 8           E-Mail: [email protected]
53115 Bonn                             PGP key id: 0x4872A6DD1019A4D8


Department of Computer Science IV      Room:   1.018
University of Bonn                     Phone:  +49 228 73-54246
Friedrich-Hirzebruch-Allee 8           E-Mail: [email protected]
53115 Bonn                             PGP key id: 0x4872A6DD1019A4D8
Germany
