Re: [rb-general] Definition of "reproducible build"

2019-02-14 Thread Richard Purdie
On Thu, 2019-02-14 at 12:25 -0800, John Gilmore wrote:
> > I like the idea, however what you are proposing is basically a new
> > distro/fork, where you would remove all unreproducible packages, as
> > every distro still has some unreproducible bits.
> 
> I suggest going the other way -- produce a distro that is "80%
> reproducible" from its source code USB stick and its binary boot USB
> stick.  You'd already have the global reproducibility structure and
> scripts written and working, even before the last packages are
> individually reproducible.  That global reproducibility tech would be
> immediately adoptable by any distro.  The output of the reproduction
> scripts would be a bootable binary that does boot and run!  It would
> still have differences from the "release master" bootable binary, but
> those differences would be irrelevant to the functioning of the binary,
> and would be clearly visible with "diff -r".
> 
> (For one thing, this would cause the distros to actually produce a
> "source code USB stick image".  Currently most of them don't.  They
> instead require you to download thousands of separate source packages or
> tarballs, and have no scripts readily visible for building those into a
> bootable binary image.)
> 
> After accomplishing that, then the focus could go on the 20% (or 10% or
> whatever) of packages that aren't yet reproducible.  And, people making
> small distros could cut out such packages to make a 100% reproducible
> distro, as Holger suggested.

FWIW, the Yocto Project supports that today in the form of our
"build-appliance" images. They contain all the sources and tools to
rebuild the image.

We don't go for full reproducibility "out of the box" at a timestamp
level, but you can configure the build to do that. Even out of the box
we're way better than 80% though!

Cheers,

Richard




___
rb-general@lists.reproducible-builds.org mailing list

To change your subscription options, visit 
https://lists.reproducible-builds.org/listinfo/rb-general.

To unsubscribe, send an email to 
rb-general-unsubscr...@lists.reproducible-builds.org.

Re: [rb-general] Definition of "reproducible build"

2019-02-14 Thread John Gilmore
> I like the idea, however what you are proposing is basically a new
> distro/fork, where you would remove all unreproducible packages, as
> every distro still has some unreproducible bits.

I suggest going the other way -- produce a distro that is "80%
reproducible" from its source code USB stick and its binary boot USB
stick.  You'd already have the global reproducibility structure and
scripts written and working, even before the last packages are
individually reproducible.  That global reproducibility tech would be
immediately adoptable by any distro.  The output of the reproduction
scripts would be a bootable binary that does boot and run!  It would
still have differences from the "release master" bootable binary, but
those differences would be irrelevant to the functioning of the binary,
and would be clearly visible with "diff -r".
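
A minimal sketch of that comparison, assuming both the "release master"
image and the rebuilt image have been unpacked into directories. It is a
rough Python stand-in for "diff -r" that also reports what fraction of
files came out identical. The function names are illustrative, not taken
from any distro's tooling, and symlinks and special files are skipped.

```python
import hashlib
import os


def tree_manifest(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # assumption: symlinks and special files are ignored here
            with open(full, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            manifest[os.path.relpath(full, root)] = digest
    return manifest


def compare_trees(master_root, rebuilt_root):
    """Return (sorted differing or missing paths, fraction of identical files)."""
    master = tree_manifest(master_root)
    rebuilt = tree_manifest(rebuilt_root)
    all_paths = set(master) | set(rebuilt)
    # A path counts as differing if its digest differs or it exists on one side only.
    differing = sorted(p for p in all_paths if master.get(p) != rebuilt.get(p))
    if not all_paths:
        return differing, 1.0
    return differing, 1.0 - len(differing) / len(all_paths)
```

Calling compare_trees("master/", "rebuilt/") on two unpacked images would
give the list of files still to be made reproducible, plus a figure that
could be quoted as the "80%" above. Note that this counts files, not
packages.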

(For one thing, this would cause the distros to actually produce a
"source code USB stick image".  Currently most of them don't.  They
instead require you to download thousands of separate source packages or
tarballs, and have no scripts readily visible for building those into a
bootable binary image.)

After accomplishing that, then the focus could go on the 20% (or 10% or
whatever) of packages that aren't yet reproducible.  And, people making
small distros could cut out such packages to make a 100% reproducible
distro, as Holger suggested.

John


Re: [rb-general] Definition of "reproducible build"

2019-02-14 Thread Holger Levsen
Hi John,

On Mon, Jan 28, 2019 at 11:18:43PM -0800, John Gilmore wrote:
> Ludovic Courtès wrote:
> > I agree that insisting on provenance is crucial.  Dockerfiles (and similar)
> > are often viewed as “source”, but they really aren’t source: the actual
> > source would come with the distros they refer to (Debian, pip, etc.)
> > Those distros might in turn refer to external pre-built binaries, though,
> > such as “bootstrap binaries” for compilers (Rust, OpenJDK, and so on.)
> 
> I propose a definition for whether a bootable OS distro is reproducible.
> (If what you're building is not a whole distro that can self-compile,
> this definition doesn't apply.)
> 
> Our initial goal would be to produce a bootable binary release (DVD or
> USB stick) and a source release (ditto).  The source release would
> include the script that allows the binary release to recompile the
> source release to a new binary release that ends up bit-for-bit
> identical.  Such a binary/source release pair would be called
> "reproducible".

I like the idea, however what you are proposing is basically a new
distro/fork, where you would remove all unreproducible packages, as
every distro still has some unreproducible bits.

(I'm not opposed to creating yet another distro/fork, I just wanted to
point that out.)

It definitely would be a good prototype for others to learn from.


-- 
tschau,
Holger

---
   holger@(debian|reproducible-builds|layer-acht).org
   PGP fingerprint: B8BF 5413 7B09 D35C F026 FE9D 091A B856 069A AA1C



Re: [rb-general] Definition of "reproducible build"

2019-01-28 Thread John Gilmore
Ludovic Courtès wrote:
> I agree that insisting on provenance is crucial.  Dockerfiles (and similar)
> are often viewed as “source”, but they really aren’t source: the actual
> source would come with the distros they refer to (Debian, pip, etc.)
> Those distros might in turn refer to external pre-built binaries, though,
> such as “bootstrap binaries” for compilers (Rust, OpenJDK, and so on.)

I propose a definition for whether a bootable OS distro is reproducible.
(If what you're building is not a whole distro that can self-compile,
this definition doesn't apply.)

Our initial goal would be to produce a bootable binary release (DVD or
USB stick) and a source release (ditto).  The source release would
include the script that allows the binary release to recompile the
source release to a new binary release that ends up bit-for-bit
identical.  Such a binary/source release pair would be called
"reproducible".

That's useful: If you have to fix a bug in it, you can make the mods you
need in the source tree, rebuild the world, and out will come a release
with just that one change in the binaries, verifiably identical except
where it matters.  And developers can use such a release to detect what
changes matter to whom, such as: when you alter a system include file,
which binaries change?

During development, the code would be built by some earlier release's
tools, built piecemeal, etc, like current build processes do.  Anytime
before release, the developers can test whether a draft source release
builds into a binary release that itself can build the sources into the
same binary release.  And fix any discrepancies, ideally long before
release.

This is similar to what GCC does to test itself, or what Cygnus did to
test the whole toolchain for cross-compiling.  But applied to the
entire OS release.
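
A minimal sketch of that self-rebuild test, assuming a caller-supplied
rebuild(builder_image, source_image) function that returns the bytes of
the new binary release. That function is hypothetical and stands in for
whatever script the source release ships. The loop mirrors the staged
comparison in GCC's bootstrap: keep rebuilding with the previous output
until the output equals the image that built it.

```python
import hashlib


def find_fixed_point(seed_binary, source, rebuild, max_rounds=3):
    """Return the round at which a rebuild reproduces its own builder.

    seed_binary: bytes of the image used for the first build (for example
                 one produced piecemeal by an earlier release's tools).
    source:      bytes of the source release.
    rebuild:     hypothetical callable (builder_bytes, source_bytes) -> bytes.
    Returns None if no fixed point is reached within max_rounds.
    """
    builder = seed_binary
    for round_number in range(1, max_rounds + 1):
        rebuilt = rebuild(builder, source)
        # Bit-for-bit identity, checked via SHA-256 digests.
        if hashlib.sha256(rebuilt).digest() == hashlib.sha256(builder).digest():
            return round_number
        builder = rebuilt  # the new image becomes the next round's builder
    return None
```

A result of 1 means the seed image already reproduces itself, which is
the "reproducible" pair described above. A result of None points at a
discrepancy to fix before release.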

Such a paired source/binary release doesn't require a chain of
provenance of earlier binary software, particularly if people can
demonstrate bootstrapping it using several different earlier compiler
toolchains, still producing the same binaries.  You can bootstrap
it with itself.

The separate efforts to minimize the amount of binary code we have to
trust to do a rebuild are laudable and fascinating.  Keep going!  But we
shouldn't require whole distros to do that yet.  We haven't even
accomplished a basic paired binary/source reproducible release yet, for
any major release -- or have we?

John

PS: For extra points, the binary release should be able to cross-compile
its source release into a binary release for each other supported
platform, reproducibly.  And those other-platform binary releases should
cross-compile the source release back bit-for-bit into the same binary
release you started with.


Re: [rb-general] Definition of "reproducible build"

2019-01-25 Thread Ludovic Courtès
Hello,

Marvin Humphrey wrote:

> 2.  Hoovering up the build environment into a Docker container or
> similar might be enough to produce "reproducible" results, but
> without provenance information for the "relevant attributes of the
> build environment", the benefits are diminished. ("Does the all-new
> opaque build environment for release X.Y.Z contain a trojan?")

I agree that insisting on provenance is crucial.  Dockerfiles (and
similar) are often viewed as “source”, but they really aren’t source:
the actual source would come with the distros they refer to (Debian,
pip, etc.)

Those distros might in turn refer to external pre-built binaries,
though, such as “bootstrap binaries” for compilers (Rust, OpenJDK, and
so on.)

(In a way this has a connection to the work on
 by fellow hackers.)

> Assuming that keeping the generality of the official definition is
> important to you, can you suggest any options for downstream
> "authors or distributors" to tighten that up?

The GPL (probably other licenses have something similar) has a
definition of “Corresponding Source” that suggests provenance:

The "Corresponding Source" for a work in object code form means all
  the source code needed to generate, install, and (for an executable
  work) run the object code and to modify the work, including scripts to
  control those activities.  However, it does not include the work's
  System Libraries, or general-purpose tools or generally available free
  programs which are used unmodified in performing those activities but
  which are not part of the work. […]

Perhaps we would need a “recursive” definition of “Corresponding Source”
to really convey the idea of provenance tracking and reproducibility of
a complete software stack?

Anyway, I’m interested in what the ASF or Debian might come up with!

Thanks,
Ludo’.

Re: [rb-general] Definition of "reproducible build"

2019-01-24 Thread Holger Levsen
Hi Marvin,

thanks for reaching out to us reproducible-builds.org folks!

On Mon, Jan 21, 2019 at 03:10:44PM -0800, Marvin Humphrey wrote:
> Over on the legal-discuss list at the Apache Software Foundation, we are
> currently discussing reproducible builds.
> 
> https://markmail.org/message/k7ldwepd3ph2qxsp

yup, David Wheeler also has pointed us to that thread. Exciting!

> If anyone would like to participate in the discussion, you can subscribe
> by sending an email to: legal-discuss-subscr...@apache.org

I fear I cannot commit to yet another mailing list. But please do feel
free to cc: me on any mail on this topic you find relevant!

> The history of binary packages at the ASF is long and fraught.  The
> Foundation only officially endorses pure source code packages; what is
> being considered is whether the ASF should give its official imprimatur
> to binary releases and whether such binary release packages should be
> required to be the result of a reproducible build.
> 
> For a while now, I've been contemplating what a patch to the ASF's
> Release Policy[1] requiring reproducibility ought to look like.  In some
> ways it would be nice if you folks could serve as a steward for the
> definition of "reproducible build", similar to how the Open Source
> Initiative maintains the Open Source Definition[2], so that an external
> policy document could reference it.

Thanks. A lot! :)

> You currently have a definitions page[3] which is nice and easy to
> understand.  A couple of comments:

thanks! also for the comments!

> 1.  The current definition would be a bit awkward to reference in an
> official document or policy because it is not either frozen or
> versioned.

excellent idea, I've recorded it at
https://salsa.debian.org/reproducible-builds/reproducible-website/issues/5

> 2.  Hoovering up the build environment into a Docker container or
> similar might be enough to produce "reproducible" results, but
> without provenance information for the "relevant attributes of the
> build environment", the benefits are diminished. ("Does the all-new
> opaque build environment for release X.Y.Z contain a trojan?")
> Assuming that keeping the generality of the official definition is
> important to you, can you suggest any options for downstream
> "authors or distributors" to tighten that up?

not really. I believe https://bugs.debian.org/844431 has some more thoughts on
this issue though.


-- 
tschüß,
Holger

---
   holger@(debian|reproducible-builds|layer-acht).org
   PGP fingerprint: B8BF 5413 7B09 D35C F026 FE9D 091A B856 069A AA1C

