I have been thinking about this for a bit myself, and at least from the point of view of someone trying to verify the reproducibility of a certain build, I would say the environment is definitely necessary. If anything, the current definition is too unspecific to allow generic verification.

> What matters is that someone is able to get the same bit-by-bit identical output.

I would argue that what's relevant is that *everyone* is able to get the same bit-by-bit identical output. Without specifying the relevant parts of the environment, you quickly get into a situation where some rebuilders happen to have an environment where the output artifact is the same, while others with a different environment get a different artifact. In my opinion, the inputs for a "reproducible build" should be specific enough so that anyone adhering to these inputs gets the same artifact out, and you don't need to get lucky, or get additional information from somewhere else to successfully reproduce the same artifact.

> However, I believe that achieving "verified reproducible builds" does not require reproducing the build environment bit-by-bit identically.

I agree! And I see that the definition given in the OpenSSF glossary may be a little misleading in this regard. However, the source definition on reproducible-builds.org further explains that it is about the *relevant attributes of the build environment*, which are up to the project maintainer to determine. That way, they ideally don't overspecify the build environment, and rebuilders don't have to reproduce the whole original build environment bit-by-bit. They just have to reproduce the parts that the maintainer has deemed relevant for the build to be reproducible.
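
To make the "relevant attributes" idea a bit more concrete, here is a minimal sketch of how a rebuilder might check their environment against what a maintainer declared. The attribute names and values (`compiler`, `LC_ALL`, `hostname`, and so on) and the function `mismatched_attributes` are assumptions made up for illustration; they are not taken from the reproducible-builds.org definition or from any existing tool.

```python
from typing import Dict, Mapping, Optional, Tuple


def mismatched_attributes(
    declared: Mapping[str, str],
    actual: Mapping[str, str],
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return the declared attributes whose value differs in `actual`.

    Attributes that the maintainer did not declare are ignored, so the
    rebuilder is free to differ in them.
    """
    return {
        name: (expected, actual.get(name))
        for name, expected in declared.items()
        if actual.get(name) != expected
    }


# Hypothetical attributes a maintainer deemed relevant for the build.
declared = {
    "compiler": "gcc 12.2.0",
    "SOURCE_DATE_EPOCH": "1700000000",
    "LC_ALL": "C",
}

# Hypothetical rebuilder environment: it matches every declared attribute
# and differs only in attributes nobody declared as relevant.
rebuilder = dict(declared, hostname="builder-7", kernel="6.1")

# An empty result means all relevant attributes match.
print(mismatched_attributes(declared, rebuilder))
```

With this shape, the rebuilder's hostname or kernel version never enters the comparison, which is the point: only the attributes the maintainer listed need to match.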

In a recent pre-publication [1] I proposed a more formal definition of reproducibility:

> A tuple (source code, build instructions, build environment, artifacts) is considered reproducible, if executing the build instructions on the source code within the build environment always produces the same artifacts when compared via bit-by-bit equality.

But as this is designed as a more precise version for academic use, I can see that it may feel a little... clunky for a general purpose glossary.
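
As a rough illustration of the tuple definition above, the following sketch models a build as a function from (source code, build environment) to artifact bytes and compares the outputs of repeated runs by their SHA-256 digests. All names here (`looks_reproducible`, `stable_build`, `leaky_build`) are hypothetical, and the two toy builds only stand in for real build instructions. Note that a finite number of runs can refute reproducibility but cannot prove the "always" in the definition.

```python
import hashlib
import os
from typing import Callable, Mapping

# Assumption for this sketch: the build instructions are modelled as a
# function mapping (source code, build environment) to artifact bytes.
Build = Callable[[bytes, Mapping[str, str]], bytes]


def looks_reproducible(
    build: Build, source: bytes, env: Mapping[str, str], runs: int = 3
) -> bool:
    """Run the build `runs` times and check that all artifacts are
    bit-by-bit identical, compared via their SHA-256 digests."""
    digests = {
        hashlib.sha256(build(source, env)).hexdigest() for _ in range(runs)
    }
    return len(digests) == 1


def stable_build(source: bytes, env: Mapping[str, str]) -> bytes:
    # The output depends only on the source and a declared environment value.
    return source + env["SOURCE_DATE_EPOCH"].encode()


def leaky_build(source: bytes, env: Mapping[str, str]) -> bytes:
    # The output depends on something outside the tuple (random bytes),
    # standing in for e.g. an embedded timestamp or build path.
    return source + os.urandom(8)


env = {"SOURCE_DATE_EPOCH": "0"}
print(looks_reproducible(stable_build, b"src", env))
print(looks_reproducible(leaky_build, b"src", env))
```

The second build fails the check because its output depends on an input that is not part of the tuple, which is exactly the kind of unspecified influence the definition is meant to rule out.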

Best
Timo

On 23.04.25 11:07, Simon Josefsson via rb-general wrote:
"David A. Wheeler via rb-general"
<[email protected]> writes:

> The OpenSSF is building a "glossary" set (so we consistently use the
> same meaning for the same term), and I drafted a definition for
> "reproducible build" based on this group:
>
> https://glossary.openssf.org/reproducible-build/

Thanks.  I think the "same source code, build environment and build
instructions" part may lead people the wrong way.

Others may have different goals, but for me the point of supporting
reproducible builds is so that we can get to "verified reproducible
builds", which to me is what matters for end-users.  It is great that
you mention this goal above!  It seems this goal is often forgotten.

With that goal in mind, I don't think it matters what the build input or
build environment is.

What matters is that someone is able to get the same bit-by-bit
identical output.

Where I think people may go wrong with the text above is that you are
led to believe that there is a one-to-one mapping involved for the build
environment.

However, I believe that achieving "verified reproducible builds" does
not require reproducing the build environment bit-by-bit identically.

For example, if I'm able to independently rebuild Debian's version of
Firefox using the same Firefox source code but some other build
environment, I would still count the Firefox binary as a "verified
reproducible build".  Does anyone disagree with that?  Why?

Here is my attempt at clarification:

OLD:

    A build is reproducible if given the same source code, build
    environment and build instructions, any party can recreate bit-by-bit
    identical copies of all specified artifacts.

NEW:

    A build is reproducible if given the same source code, any party can
    recreate bit-by-bit identical copies of all specified artifacts.
    Information about the build environment and build instructions is
    usually needed to achieve that.

What do you think?

Btw, I recently wrote about verifying reproducible source tarballs:

https://blog.josefsson.org/2025/04/17/verified-reproducible-tarballs/

Turns out I was not able to reproduce any upstream-published tarballs
that I looked at.  Does anyone know of any earlier systematic efforts to
verify reproducibility of source tarballs in a similar way?  Is anyone
interested in working on this, for a couple of high-profile packages to
see if we are able to reproduce them?

/Simon

--
Timo Pohl

Institut für Informatik IV             Raum:   1.018
Universität Bonn                       Tel.:   +49 228 73-54246
Friedrich-Hirzebruch-Allee 8           E-Mail: [email protected]
53115 Bonn                             PGP key id: 0x4872A6DD1019A4D8


Department of Computer Science IV      Room:   1.018
University of Bonn                     Phone:  +49 228 73-54246
Friedrich-Hirzebruch-Allee 8           E-Mail: [email protected]
53115 Bonn                             PGP key id: 0x4872A6DD1019A4D8
Germany
