Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Hi! We were discussions the restrictions on buildinfo identifiers: fweb_1.62-12+b2_brahms-20120530T114812Z.buildinfo ^^^ this part The proposal was “the string should consist only of alphanumeric characters and hyphens”. Guillem made the following comment while reviewing the patches for dpkg: Guillem Jover: > Can we just simply use the package name rules instead? It also avoids > potential problems with case and similar. (There's a > pkg_name_is_illegal function in Dpkg::Package already.) After reaching out to Ansgar with a patch for dak to implement the above, he replied: Ansgar Burchardt: > The allowed sets for package names and the suffix of buildinfo > filenames won't be the same even with that change. However currently > the suffix of buildinfo filenames matches what is allowed for .changes > files. > > I'm not sure why allowing capital letters used in the suffix of > buildinfo files should be an issue? After all we allow capital letters > for both version numbers (part of the filename) and in names of changes > files. > > (In the other direction not everything allowed as a package name can be > used as the suffix of .changes and .buildinfo files either.) Guillem, any further comments? Do you have any strong opposition to the initial proposal? -- Lunar.''`. lu...@debian.org: :Ⓐ : # apt-get install anarchism `. `'` `- signature.asc Description: Digital signature ___ Reproducible-builds mailing list Reproducible-builds@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Guillem Jover: > Lunar: > > I think the proposed patch is missing a field to record some environment > > variables that can affect the build process. Right now, I'm thinking of > > DEB_BUILD_OPTIONS and DEB_flag_{SET,STRIP,APPEND,PREPEND} (from > > dpkg-buildflags). Maybe others? Build profiles? > > > > Initially I was against recording such information, but now that we > > understand the purpose of .buildinfo files better and not mandate that > > they be reproducible themselves, it doesn't matter if one contains > > `DEB_BUILD_OPTIONS=parallel=4` and the other > > `DEB_BUILD_OPTIONS=parallel=16`. It is the responsibility of users > > trying to recreate a given package to filter this out. > > Hmm right, makes sense, but I see this also to be a bit problematic. > There are many variables that do affect the build which we'd need to > record. Including all of the them seems like another privacy > concerning issue. Whitelisting, we might end up missing some but it's > privacy-safe; blacklisting we might end-up including sensitive ones, > but not miss any build-related ones, which is privacy-unsafe. > > Some things that come to mind that do affect the build in a significant > way: CC, LD_LIBRARY_PATH, DEB*, DPKG_*, PATH, MAKEFLAGS. > > The traditional build flags (i.e. CFLAGS, LDFLAGS, etc) might also affect > the build depending on the rules file. I was thinking of a whitelist approach and to start with use cases we can already think of, adding more variables to record if we identify them as missing in the future. How about naming the field “Environment-Variables”? I will also add another vendor hook for the list of variables. > Build profiles are already recorded in the binary packages, but having > that in the .buildinfo file seems right as it makes it easier to > reproduce the build environment. Should it be a new field or would recording the DEB_BUILD_PROFILES variable be enough? -- Lunar.''`. lu...@debian.org: :Ⓐ : # apt-get install anarchism `. `'` `- signature.asc Description: Digital signature ___ Reproducible-builds mailing list Reproducible-builds@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Guillem Jover: > > One of the main change is that `.buildinfo` should now be named with an > > arbitrary identifier. By default this defaults to $HOSTNAME-$TIMESTAMP > > but can be set to an arbitrary value by the `--buildinfo-identifier` > > command line flag. > > Hmmm, leaking the hostname seems slightly privacy-concerning? If the > information therein is not relevant I'd rather use something like an > UUID (although that would require increasing the pseudo-build-essential > set), or just hashing the hostname-timestamp with something like md5 > or sha1 or similar. My hunch is that having a timestamp visible in the file name might help recognizing files quickly after a bunch of builds, especially to identify the last one. So I'd rather keep it. If privacy is the goal, hashing just the hostname would not be help much, as any precomputed table would work. How about $TIMESTAMP-$EIGHT_FIRST_CHARS_OF_BUILDINFO_MD5? (I'm picking md5 because it's already used in dpkg-gensymbols.) > Can we just simply use the package name rules instead? It also avoids > potential problems with case and similar. (There's a > pkg_name_is_illegal function in Dpkg::Package already.) Sounds reasonable. I've updated the wiki page and prepared a patch for dak. > > +} else { > > +warning(_g('no .dsc file, skipping .buildinfo generation')); > > +} > > } > > ISTR mentioning this before, but I don't see why generating the > buildinfo file is tied to existing a source package at all? The source > should be included if we are including a source in the upload, that's > it. The whole puprose of the reproducible builds effort is to provide a verifiable path from sources to binaries. Signed .buildinfo files are certifications that the listed binary packages have been built using the described source and environment. Only listing the source in .buildinfo when a source is included in the upload would prevent us to have multiple builders certify the same binaries. That would cut us from providing multiple certifications and would undermine the purpose of reproducible builds. So I could remove the limitation, but the resulting .buildinfo file would not be very useful for reproducible builds. > > +$fields->{'Source'} = $spackage; > > +if ($changelog->{'Binary-Only'}) { > > +$fields->{'Source'} .= ' (' . $sourceversion . ')'; > > +$fields->{'Changes'} = $changelog->{'Changes'} . "\n\n" > > + . ' -- ' . $changelog->{'Maintainer'} > > + . ' ' . $changelog->{'Date'}; > > +} > > Hmmm, it bothers me slightly that the Changes field here diverges in > form from the one in the .changes file. I can understand. It's been designed that way because it's actually only there for binNMUs where the source is the same as the original and we need a way to reconstruct the right changelog file. Because sbuild might actually change its strings in the future, it felt like plain copy/pasting was the safest. So recreating the changelog in case of binNMU is about outputing the value of the Changes field in the .buildinfo, a blank line, and the changelog from the original source. > I think I'd prefer to have the Date as its own field, maybe always > included. And also the Maintainer field. Any reason to not include > them all the time or on their own? I think they would be confusing. If we would to include the “Maintainer” I guess we should call it “Changed-By” like in .changes. “Date” as such would be a confusing name because I would tend to think of it as the date of the build, and not as the date of the latest changelog of a binNMU… So maybe “Changed-On” or “Change-Date”. But this feels just more complicated than just the current implementation, even if the format differs slightly. Maybe we can rename that field instead to “Extra-Changelog-Entry” or something else so it's clear they have different format? > > +my $environment = Dpkg::Deps::AND->new(); > > +foreach my $pkg (sort keys %env_pkgs) { > > +foreach my $installed_pkg (@{$facts->{pkg}->{$pkg}}) { > > +my $version = $installed_pkg->{version}; > > +my $architecture = $installed_pkg->{architecture}; > > +my $pkg_name = $pkg; > > > +if (defined $architecture && > > +$architecture ne 'all' && $architecture ne $build_arch) { > > +$pkg_name = "$pkg_name:$architecture"; > > +} > […] > Also this will include all Multi-Arch instances for a given package > regardless of them being used or not, I don't think that's desirable. Can we know for sure which one have been used? I'm already working on other changes you suggested. Thanks, -- Lunar.''`. lu...@debian.org: :Ⓐ : # apt-get install anarchism `. `'` `- signature.asc Description: Digital signature ___ Reproducible-builds mailing list Repr
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Hi, On Freitag, 29. Januar 2016, Guillem Jover wrote: > > (I'd be in favor of naming the first accepted buildinfo > > file of the archive just "" so that it's predictable… > I'm not sure how we'd use a sequential number in a distributed manner > starting with 0s though? :9 we can't :) but dak could do it. (in the Debian use case.) cheers, Holger signature.asc Description: This is a digitally signed message part. ___ Reproducible-builds mailing list Reproducible-builds@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Hi! zOn Thu, 2016-01-28 at 19:01:59 +0100, Holger Levsen wrote: > On Donnerstag, 28. Januar 2016, Guillem Jover wrote: > > Hmmm, leaking the hostname seems slightly privacy-concerning? If the > > information therein is not relevant I'd rather use something like an > > UUID (although that would require increasing the pseudo-build-essential > > set), or just hashing the hostname-timestamp with something like md5 > > or sha1 or similar. > > yeah, "something / anything" is fine. dak / the archive software can rename > it > anyway, as it likes. Ah ok, perfect then. > (I'd be in favor of naming the first accepted buildinfo > file of the archive just "" so that it's predictable… I'm not sure how we'd use a sequential number in a distributed manner starting with 0s though? :9 > > I've some pending changes I'll be committing to master or a separate > > branch, that I'd like to be tested on the reproducible setup (ideally > > against the already generated and pre-existing reproducible binaries), > > if that's possible, I'll ask about that when those land, I just need > > to finish up fewm more unit tests. > > That's possible, though not (yet nor in near future) against pre-existing > binaries. (We lack the code for that.) > > What we can do easily, is build and upload dpkg to our repo and use it to > build the whole Debian archive on amd64, which roughly takes 8 days for both > sid+stretch, and thus roughly 4 days for one suite, if we disable building > the > other. (Which we can definitly do, especially if we don't disable building of > new versions in that other suite…) Oh ok, that's a bit unfortunate, it would be nice to be able to catch any possible regressions in the generated binaries, and it would also show how to use reproducible builds to see that nothing has actually changed, even after implementation changes in the toolchain. On Thu, 2016-01-28 at 19:02:53 +0100, Holger Levsen wrote: > +many thanks for your thorough review! :-) No problem! Regards, Guillem ___ Reproducible-builds mailing list Reproducible-builds@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Guillem Jover: > I've some pending changes I'll be committing to master or a separate > branch, that I'd like to be tested on the reproducible setup (ideally > against the already generated and pre-existing reproducible binaries), > if that's possible, I'll ask about that when those land, I just need > to finish up fewm more unit tests. > > Here's a quick review: […] Thanks for the review! Just so I can organize my work, are your pending changes related to the .buildinfo? I can spend the next days improving the patch with your remarks, but I'd rather not duplicate your work. :) -- Lunar.''`. lu...@debian.org: :Ⓐ : # apt-get install anarchism `. `'` `- signature.asc Description: Digital signature ___ Reproducible-builds mailing list Reproducible-builds@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
+many thanks for your thorough review! :-) signature.asc Description: This is a digitally signed message part. ___ Reproducible-builds mailing list Reproducible-builds@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Hi Guillem, just quickly commenting on two sub topics… On Donnerstag, 28. Januar 2016, Guillem Jover wrote: > > One of the main change is that `.buildinfo` should now be named with an > > arbitrary identifier. By default this defaults to $HOSTNAME-$TIMESTAMP > > but can be set to an arbitrary value by the `--buildinfo-identifier` > > command line flag. > Hmmm, leaking the hostname seems slightly privacy-concerning? If the > information therein is not relevant I'd rather use something like an > UUID (although that would require increasing the pseudo-build-essential > set), or just hashing the hostname-timestamp with something like md5 > or sha1 or similar. yeah, "something / anything" is fine. dak / the archive software can rename it anyway, as it likes. (I'd be in favor of naming the first accepted buildinfo file of the archive just "" so that it's predictable… > I've some pending changes I'll be committing to master or a separate > branch, that I'd like to be tested on the reproducible setup (ideally > against the already generated and pre-existing reproducible binaries), > if that's possible, I'll ask about that when those land, I just need > to finish up fewm more unit tests. That's possible, though not (yet nor in near future) against pre-existing binaries. (We lack the code for that.) What we can do easily, is build and upload dpkg to our repo and use it to build the whole Debian archive on amd64, which roughly takes 8 days for both sid+stretch, and thus roughly 4 days for one suite, if we disable building the other. (Which we can definitly do, especially if we don't disable building of new versions in that other suite…) cheers, Holger signature.asc Description: This is a digitally signed message part. ___ Reproducible-builds mailing list Reproducible-builds@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Hi! On Wed, 2016-01-27 at 08:58:47 +0100, Jérémy Bobbio wrote: > Jérémy Bobbio: > > The attached patch will enable dpkg-buildpackage to create .buildinfo > > files as specified on the Debian wiki [1]. They have two main purposes: > > > > * recording information about the system environment used during a > >particular build—versions of the build dependencies installed, system > >architecture, etc. for easier forensics/debugging; > > * describe how to recreate (partially or in full) the original > >environment when trying to reproduce a particular build. > > I think the proposed patch is missing a field to record some environment > variables that can affect the build process. Right now, I'm thinking of > DEB_BUILD_OPTIONS and DEB_flag_{SET,STRIP,APPEND,PREPEND} (from > dpkg-buildflags). Maybe others? Build profiles? > > Initially I was against recording such information, but now that we > understand the purpose of .buildinfo files better and not mandate that > they be reproducible themselves, it doesn't matter if one contains > `DEB_BUILD_OPTIONS=parallel=4` and the other > `DEB_BUILD_OPTIONS=parallel=16`. It is the responsibility of users > trying to recreate a given package to filter this out. Hmm right, makes sense, but I see this also to be a bit problematic. There are many variables that do affect the build which we'd need to record. Including all of the them seems like another privacy concerning issue. Whitelisting, we might end up missing some but it's privacy-safe; blacklisting we might end-up including sensitive ones, but not miss any build-related ones, which is privacy-unsafe. Some things that come to mind that do affect the build in a significant way: CC, LD_LIBRARY_PATH, DEB*, DPKG_*, PATH, MAKEFLAGS. The traditional build flags (i.e. CFLAGS, LDFLAGS, etc) might also affect the build depending on the rules file. Build profiles are already recorded in the binary packages, but having that in the .buildinfo file seems right as it makes it easier to reproduce the build environment. Ideally, parallel=N should not have any visible effect but I guess it currently might. Most of the other DEB_BUILD_OPTIONS do have a visible effect on the artifacts generated. Thanks, Guillem ___ Reproducible-builds mailing list Reproducible-builds@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Hi! On Tue, 2016-01-05 at 14:32:51 +0100, Jérémy Bobbio wrote: > Control: retitle -1 dpkg-dev: please add support for .buildinfo files > Control: tag -1 + patch > The attached patch will enable dpkg-buildpackage to create .buildinfo > files as specified on the Debian wiki [1]. They have two main purposes: > > * recording information about the system environment used during a >particular build—versions of the build dependencies installed, system >architecture, etc. for easier forensics/debugging; > * describe how to recreate (partially or in full) the original >environment when trying to reproduce a particular build. > > Since Guillem's preliminary review in February 2015 [2], the > specification has slightly elvolved to be a bit more relaxed and the > code have been improved. Cool, thanks! > One of the main change is that `.buildinfo` should now be named with an > arbitrary identifier. By default this defaults to $HOSTNAME-$TIMESTAMP > but can be set to an arbitrary value by the `--buildinfo-identifier` > command line flag. Hmmm, leaking the hostname seems slightly privacy-concerning? If the information therein is not relevant I'd rather use something like an UUID (although that would require increasing the pseudo-build-essential set), or just hashing the hostname-timestamp with something like md5 or sha1 or similar. > To address privacy concerns, the Build-Path field is now only included > when either the build path starts by `/build/` or > `--always-include-path` has been specified on the command line of > `dpkg-genbuildinfo`. Thanks! > .buildinfo files are now accepted (although discarded) by the Debian > archive [3]. This change should thus not affect Debian developpers in > their daily work. Ah perfect, thanks for pushing for this. I'm planning on including most of the patches that look fine for 1.18.5 or .6 ideally. I've some pending changes I'll be committing to master or a separate branch, that I'd like to be tested on the reproducible setup (ideally against the already generated and pre-existing reproducible binaries), if that's possible, I'll ask about that when those land, I just need to finish up fewm more unit tests. Here's a quick review: > From 7b6d953f834b1e8800d3f8af4570d57d86e5c592 Mon Sep 17 00:00:00 2001 > From: =?UTF-8?q?J=C3=A9r=C3=A9my=20Bobbio?= > Date: Fri, 6 Nov 2015 12:17:39 + > Subject: [PATCH] Add support for .buildinfo files > diff --git a/man/dpkg-buildpackage.1 b/man/dpkg-buildpackage.1 > index 13770ba..54cee7b 100644 > --- a/man/dpkg-buildpackage.1 > +++ b/man/dpkg-buildpackage.1 > @@ -317,6 +322,12 @@ The source package version (without the epoch). > The upstream version. > .RE > .TP > +.BI \-\-buildinfo-identifier= identifier > +By default, \fBdpkg\-buildpackage\fP put the system hostname and the “puts”, but perhaps better “uses foo and bar to create the quux filename”. > +current time in the name of the \fB.buildinfo\fP file. An arbitrary > +identifier can be specified as a replacement (since dpkg 1.18.5). I'd probably describe first what the option actually does, so that the version when it got intruduced makes sense to what it refers to. And then how does that diverges from the default. And please, place new sentences on a new line. (I should probably update the "coding-style" about that.) > It must > +contain only alphanumeric characters and hyphens. > +.TP > .BI \-p sign-command > When \fBdpkg\-buildpackage\fP needs to execute GPG to sign a source > control (\fB.dsc\fP) file or a \fB.changes\fP file it will run > diff --git a/man/dpkg-genbuildinfo.1 b/man/dpkg-genbuildinfo.1 > new file mode 100644 > index 000..77f2a76 > --- /dev/null > +++ b/man/dpkg-genbuildinfo.1 > @@ -0,0 +1,98 @@ > +.BI \-\-always\-include\-path > +By default, the \fBBuild-Path\fR field will only be written if the current > +directory starts with \fB/build/\fR. Specify this option to always write > +a \fBBuild-Path\fR field when generating the \fB.buildinfo\fR. Missing escaped dash in field names. > diff --git a/scripts/Dpkg/Control/Types.pm b/scripts/Dpkg/Control/Types.pm > index 09e12d1..ad6e11b 100644 > --- a/scripts/Dpkg/Control/Types.pm > +++ b/scripts/Dpkg/Control/Types.pm > @@ -51,16 +52,17 @@ between Dpkg::Control and Dpkg::Control::Fields. > > use constant { > CTRL_UNKNOWN => 0, > -CTRL_INFO_SRC => 1, # First control block in debian/control > -CTRL_INFO_PKG => 2, # Subsequent control blocks in debian/control > -CTRL_INDEX_SRC => 4, # Entry in repository's Packages files > -CTRL_INDEX_PKG => 8, # Entry in repository's Sources files > -CTRL_PKG_SRC => 16, # .dsc file of source package > -CTRL_PKG_DEB => 32, # DEBIAN/control in binary packages > -CTRL_FILE_CHANGES => 64, # .changes file > -CTRL_FILE_VENDOR => 128, # File in $Dpkg::CONFDIR/origins > -CTRL_FILE_STATUS => 256, # $Dpkg::ADMINDIR/status > -CTRL_CHANGELOG => 512, # Output of dpkg-parsechangelog > +
Re: [Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files
Jérémy Bobbio: > The attached patch will enable dpkg-buildpackage to create .buildinfo > files as specified on the Debian wiki [1]. They have two main purposes: > > * recording information about the system environment used during a >particular build—versions of the build dependencies installed, system >architecture, etc. for easier forensics/debugging; > * describe how to recreate (partially or in full) the original >environment when trying to reproduce a particular build. I think the proposed patch is missing a field to record some environment variables that can affect the build process. Right now, I'm thinking of DEB_BUILD_OPTIONS and DEB_flag_{SET,STRIP,APPEND,PREPEND} (from dpkg-buildflags). Maybe others? Build profiles? Initially I was against recording such information, but now that we understand the purpose of .buildinfo files better and not mandate that they be reproducible themselves, it doesn't matter if one contains `DEB_BUILD_OPTIONS=parallel=4` and the other `DEB_BUILD_OPTIONS=parallel=16`. It is the responsibility of users trying to recreate a given package to filter this out. -- Lunar.''`. lu...@debian.org: :Ⓐ : # apt-get install anarchism `. `'` `- signature.asc Description: Digital signature ___ Reproducible-builds mailing list Reproducible-builds@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds