Re: [Reproducible-builds] Bug#805321: debian-installer: builds unreproducible netboot images

2015-11-22 Thread Steven Chamberlain
Hi,

I rewrote the patches according to KiBi's feedback and they are
now uploaded to our jessie-kfreebsd suite, and this Git branch:
https://anonscm.debian.org/cgit/d-i/debian-installer.git/log/?h=jessie-kfreebsd

In my own testing on ZFS, file ordering was still an issue for the
makefs tool that builds the initrd.  But if I were to try again
on UFS, I hope to be able to reproduce the entire
netboot-installer-images tarball as built by the buildds.
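
A minimal sketch of how such a comparison could be done, assuming two copies of the artifact are at hand (one built locally, one fetched from a buildd). The helper name same_artifact and the file names in the usage comment are hypothetical, not part of d-i:

```shell
#!/bin/sh
# Sketch: report whether two build artifacts are bit-for-bit identical.
set -eu

# same_artifact FILE_A FILE_B: return 0 if identical, 1 otherwise.
same_artifact() {
    if cmp -s "$1" "$2"; then
        echo "reproducible: $1 == $2"
    else
        echo "differs: $1 != $2"
        # Print both checksums so the mismatch can be quoted in a bug report.
        sha256sum "$1" "$2"
        return 1
    fi
}

# Usage (hypothetical file names):
#   same_artifact local/netboot.tar.gz buildd/netboot.tar.gz
```

When the two files differ, a tool such as diffoscope can then be used to see where.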

This tarball includes bits that are bundled onto the official release
images by debian-cd tools.  Making this reproducible is a prerequisite
for someday having reproducibly-built official release images.

I could merge these patches into sid if they seem okay?  The only
commit that should not be merged is this one, which is specific to
jessie-kfreebsd and must be slightly changed for sid:
kfreebsd: use makefs -T to clamp timestamps

I expect that Linux d-i builds will have some reproducibility issues
in whatever generates the initrd or ISOs, but I may look into that after
the jessie-kfreebsd release is done.

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org


signature.asc
Description: Digital signature
___
Reproducible-builds mailing list
Reproducible-builds@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds

Re: [Reproducible-builds] Bug#805321: debian-installer: builds unreproducible netboot images

2015-11-16 Thread Cyril Brulebois
(Trimming a lot because running out of time.)

Steven Chamberlain (2015-11-17):
> Seems reasonable to factor out and put it here.  If we don't, someone
> may add a new $GZIP call later, forget -n and make it unreproducible
> again.
> 
> Although it is a macro here, GZIP is also the name of an environment
> variable used by gzip (but not pigz?), which is likely confusing.  And
> the later tar invocations call gzip (not pigz) in quite a few places
> regardless of the GZIP macro contents;  those do look at the GZIP
> environment variable though.

Feel free to rename stuff as needed.

> > Finally, not everything is built using debian/rules targets (with or
> > without dpkg-buildpackage). One should still be able to just run e.g.:
> > “make -C build build_netboot-gtk USE_UDEBS_FROM=sid”
> > 
> > See BUILD_DATE handling, for example. We end up with a default setting
> > through:
> > |build/config/common:BUILD_DATE ?= $(shell date -u '+%Y%m%d-%H:%M')
> 
> That would not be reproducible, then!  (it is embedded in the tarballs)

Sure. I just meant to point out that even if BUILD_DATE is set through
debian/rules, a default value is set for users not building through
debian/rules, meaning BUILD_DATE gets set in the end, instead of just
being empty.

> > [ Please note that calling $(shell dpkg-parsechangelog -SDate) to set
> > SOURCE_DATE_EPOCH there would only work when building from the toplevel
> > directory, and not from the build/ subdirectory for example. ]
> 
> If it's not going to be reproducible anyway, we could similarly fall
> back to a SOURCE_DATE_EPOCH ?= now;  or the caller could choose to
> provide them.

Mraw,
KiBi.



Re: [Reproducible-builds] Bug#805321: debian-installer: builds unreproducible netboot images

2015-11-16 Thread Steven Chamberlain
Hi KiBi,

Many thanks for reviewing this.

Cyril Brulebois wrote:
> Please make sure not to depend on features which are not found in stable
> (I'm not entirely sure about oldstable at this point), which might hinder
> our ability to cherry-pick bits and pieces from master to jessie.

I think I could make that easier by splitting into smaller commits:

  * parts only needed for kfreebsd
  * parts more appropriate for stable, oldstable, etc.
  * parts only appropriate for sid (or just not use new GNU tar
    features yet)

> > --- a/build/Makefile
> > +++ b/build/Makefile
> > @@ -56,7 +56,7 @@
> >  # Add to PATH so dpkg will always work, and so local programs will be found.
> >  PATH := util:$(PATH):/usr/sbin:/sbin
> >  EATMYDATA = $(shell which eatmydata 2>/dev/null)
> > -GZIP = $(shell which pigz gzip | head -1)
> > +GZIP = $(shell which pigz gzip | head -1) -n
> 
> I think I already added -n to a bunch of calls. Not sure whether adding
> it here once and for all would be better than adding it where it's
> missing though. Anyway, not my biggest question/comment/concern here.

Seems reasonable to factor out and put it here.  If we don't, someone
may add a new $GZIP call later, forget -n and make it unreproducible
again.

Although it is a macro here, GZIP is also the name of an environment
variable used by gzip (but not pigz?), which is likely confusing.  And
the later tar invocations call gzip (not pigz) in quite a few places
regardless of the GZIP macro contents;  those do look at the GZIP
environment variable though.
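
To illustrate why -n matters, here is a small sketch; the temporary paths are made up for the example. By default gzip records the input file's name and modification time in its header, and -n leaves both out, so two copies of the same data with different mtimes compress to the same bytes:

```shell
#!/bin/sh
# Sketch: compare gzip -n output for two copies of the same data whose
# mtimes differ. All paths are temporary and hypothetical.
set -eu

work=$(mktemp -d)
mkdir "$work/one" "$work/two"
printf 'same payload\n' > "$work/one/data"
printf 'same payload\n' > "$work/two/data"
touch -d '2015-01-01 00:00:00 UTC' "$work/one/data"
touch -d '2015-11-16 00:00:00 UTC' "$work/two/data"

# With -n, gzip omits the original name and mtime from the header.
sum_one=$(gzip -n -c "$work/one/data" | cksum)
sum_two=$(gzip -n -c "$work/two/data" | cksum)

if [ "$sum_one" = "$sum_two" ]; then
    echo "gzip -n: identical output"
else
    echo "gzip -n: output differs"
fi
rm -rf "$work"
```

Passing -n on the command line, as in the macro, avoids depending on whether a given compressor honours the GZIP environment variable.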

> I think those 3(.5) occurrences really should be factorized, especially
> given the logic is the same: replacing "cd foo && tar bar ." with more
> code. Somewhere under build/util would probably be suitable.

I agree.  It is a very common pattern in other Debian packages too, and
it often needs patching for reproducibility.
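
A factored-out helper could look roughly like the sketch below. The function name sorted_tar_gz is hypothetical; it assumes GNU find, sort and tar, and it assumes timestamps have already been clamped by an earlier step. It only covers the stable file order and the gzip header:

```shell
#!/bin/sh
set -eu

# sorted_tar_gz DIR: write a gzipped tarball of DIR's contents to stdout.
# The file list is produced by find and sorted in the C locale, so the
# order does not depend on the filesystem. tar gets --no-recursion so
# that each listed path is added exactly once. gzip -n keeps the name
# and timestamp out of the gzip header.
sorted_tar_gz() {
    (
        cd "$1"
        find . -print0 | LC_ALL=C sort -z |
            tar --no-recursion --null -T - -cf -
    ) | gzip -n
}

# Usage, in place of "(cd DIR; tar czf - .) > OUT":
#   sorted_tar_gz "$TEMP_NETBOOT_DIR" > netboot.tar.gz
```

Piping into gzip -n rather than using tar -z sidesteps the question of which compressor tar invokes and whether it reads GZIP from the environment.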

> > --- a/build/config/x86.cfg
> > +++ b/build/config/x86.cfg
> > @@ -332,6 +332,11 @@ arch_miniiso: x86_syslinux x86_grub_efi
> >   | todos > $(TEMP_CD_TREE)/win32-loader.ini; \
> >   fi
> >  
> > + # Clamp timestamps to be no newer than last changelog entry, see
> > + # https://wiki.debian.org/ReproducibleBuilds/TimestampsInTarball
> > + find $(TEMP_CD_TREE) -newermt "$(SOURCE_DATE)" -print0 \
> > +  | xargs -0r touch --no-dereference --date="$(SOURCE_DATE)"
> > +

> [...] above is using SOURCE_DATE, which is undefined as
> far as I can tell since SOURCE_DATE_EPOCH is what's getting defined.
> Maybe it should call the same new util (with different parameters since
> we only need touch here, and no tar call)?

That's a typo, and was buggy - it would touch all timestamps, not just
the ones necessary.  I may drop this part as I don't think it's needed
any more with the newer makefs, which will clamp the timestamps itself.
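
For reference, a stand-alone version of the clamping step might look like the sketch below. The function name clamp_mtimes is hypothetical; it assumes GNU find, xargs and touch, and it refuses to run without an epoch so that an undefined variable cannot silently change which files get touched:

```shell
#!/bin/sh
set -eu

# clamp_mtimes TREE EPOCH: ensure no mtime under TREE is newer than EPOCH.
clamp_mtimes() {
    tree=$1
    # Abort if EPOCH is missing or empty instead of passing "" to find/touch.
    epoch=${2:?usage: clamp_mtimes TREE EPOCH}
    # Only entries modified after EPOCH are selected, so older mtimes,
    # which say something real about a file's age, are left alone.
    find "$tree" -newermt "@$epoch" -print0 |
        xargs -0r touch --no-dereference --date="@$epoch"
}

# Usage (EPOCH in seconds since 1970-01-01 UTC):
#   clamp_mtimes "$TEMP_CD_TREE" "$SOURCE_DATE_EPOCH"
```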

> Finally, not everything is built using debian/rules targets (with or
> without dpkg-buildpackage). One should still be able to just run e.g.:
> “make -C build build_netboot-gtk USE_UDEBS_FROM=sid”
> 
> See BUILD_DATE handling, for example. We end up with a default setting
> through:
> |build/config/common:BUILD_DATE ?= $(shell date -u '+%Y%m%d-%H:%M')

That would not be reproducible, then!  (it is embedded in the tarballs)

> [ Please note that calling $(shell dpkg-parsechangelog -SDate) to set
> SOURCE_DATE_EPOCH there would only work when building from the toplevel
> directory, and not from the build/ subdirectory for example. ]

If it's not going to be reproducible anyway, we could similarly fall
back to a SOURCE_DATE_EPOCH ?= now;  or the caller could choose to
provide them.
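
The fallback order could be sketched like this. resolve_epoch is a hypothetical helper, and taking the changelog path as a parameter is an assumption made so that it also works when invoked from build/ rather than the top-level directory. It assumes GNU date and uses dpkg-parsechangelog's -l and -S options:

```shell
#!/bin/sh
set -eu

# resolve_epoch [CHANGELOG]: print the timestamp to clamp to.
resolve_epoch() {
    changelog=${1:-debian/changelog}
    if [ -n "${SOURCE_DATE_EPOCH:-}" ]; then
        # 1. The caller provided a value: use it unchanged.
        echo "$SOURCE_DATE_EPOCH"
    elif [ -r "$changelog" ] &&
         command -v dpkg-parsechangelog >/dev/null 2>&1; then
        # 2. Date of the most recent changelog entry, as seconds since 1970.
        date -d "$(dpkg-parsechangelog -l"$changelog" -SDate)" +%s
    else
        # 3. Last resort: the current time (this build is not reproducible).
        date +%s
    fi
}

# Usage from the build/ subdirectory:
#   SOURCE_DATE_EPOCH=$(resolve_epoch ../debian/changelog)
#   export SOURCE_DATE_EPOCH
```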

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org



Re: [Reproducible-builds] Bug#805321: debian-installer: builds unreproducible netboot images

2015-11-16 Thread Cyril Brulebois
Hi,

Steven Chamberlain (2015-11-17):
> Attached is my jessie-kfreebsd implementation.  As I said, it should be
> much cleaner to implement this in sid with newer GNU tar.
> 
> Regards,
> -- 
> Steven Chamberlain
> ste...@pyro.eu.org

> diff --git a/build/Makefile b/build/Makefile
> index ec5a084..6261a4d 100644
> --- a/build/Makefile
> +++ b/build/Makefile
> @@ -56,7 +56,7 @@
>  # Add to PATH so dpkg will always work, and so local programs will be found.
>  PATH := util:$(PATH):/usr/sbin:/sbin
>  EATMYDATA = $(shell which eatmydata 2>/dev/null)
> -GZIP = $(shell which pigz gzip | head -1)
> +GZIP = $(shell which pigz gzip | head -1) -n

I think I already added -n to a bunch of calls. Not sure whether adding
it here once and for all would be better than adding it where it's
missing though. Anyway, not my biggest question/comment/concern here.

>  # We don't want this to be run each time we re-enter.
>  ifndef DEB_HOST_ARCH
> @@ -149,7 +149,7 @@ MFSROOT_LIMIT := 68m
>  endif
>  
>  define mkfs.ufs1
> -  sh -c 'makefs -t ffs -s $(MFSROOT_LIMIT) -f 3000 -o minfree=0,version=1 $$0 ${TREE}'
> +  sh -c 'makefs -t ffs -T $(SOURCE_DATE_EPOCH) -s $(MFSROOT_LIMIT) -f 3000 -o minfree=0,version=1 $$0 ${TREE}'

Straightforward enough.

>  define e2fsck
> @@ -803,7 +803,14 @@ $(TEMP_MINIISO): $(TEMP_BOOT_SCREENS) arch_miniiso
>  
>  # various kinds of information, for use on debian-cd isos
>  $(DEBIAN_CD_INFO): $(TEMP_BOOT_SCREENS) $(TEMP_CD_INFO_DIR)
> - (cd $(TEMP_CD_INFO_DIR); tar czf - .) > $@
> + # Clamp timestamps to be no newer than last changelog entry, see
> + # https://wiki.debian.org/ReproducibleBuilds/TimestampsInTarball
> + find $(TEMP_CD_INFO_DIR) -newermt "@$(SOURCE_DATE_EPOCH)" -print0 | xargs -0r touch --no-dereference --date="@$(SOURCE_DATE_EPOCH)"
> + # Create tarball with files sorted in a stable order, see
> + # https://wiki.debian.org/ReproducibleBuilds/FileOrderInTarballs
> + # and without timestamp in the gzip header, see
> + # https://wiki.debian.org/ReproducibleBuilds/TimestampsInGzipHeaders
> + ( cd $(TEMP_CD_INFO_DIR) && find . -print0 | LC_ALL=C sort -z | GZIP=-n tar --no-recursion --null -T - -czf -) > $@
>   update-manifest $@ $(MANIFEST-DEBIAN_CD_INFO)

Once.

>  # a directory full of files for netbooting
> @@ -822,7 +829,14 @@ $(NETBOOT_TAR): $(TEMP_NETBOOT_DIR)
>   # Create an version info file.
>   echo 'Debian version:  $(DEBIAN_VERSION)' > $(TEMP_NETBOOT_DIR)/version.info
>   echo 'Installer build: $(BUILD_DATE)' >> $(TEMP_NETBOOT_DIR)/version.info
> - (cd $(TEMP_NETBOOT_DIR); tar czf - .) > $@
> + # Clamp timestamps to be no newer than last changelog entry, see
> + # https://wiki.debian.org/ReproducibleBuilds/TimestampsInTarball
> + find $(TEMP_NETBOOT_DIR) -newermt "@$(SOURCE_DATE_EPOCH)" -print0 | xargs -0r touch --no-dereference --date="@$(SOURCE_DATE_EPOCH)"
> + # Create tarball with files sorted in a stable order, see
> + # https://wiki.debian.org/ReproducibleBuilds/FileOrderInTarballs
> + # and without timestamp in the gzip header, see
> + # https://wiki.debian.org/ReproducibleBuilds/TimestampsInGzipHeaders
> + ( cd $(TEMP_NETBOOT_DIR) && find . -print0 | LC_ALL=C sort -z | GZIP=-n tar --no-recursion --null -T - -czf -) > $@

Twice.

>   update-manifest $@ $(MANIFEST-NETBOOT_TAR) $(UDEB_LISTS)
>  
>  $(TEMP_BOOT_SCREENS): arch_boot_screens
> diff --git a/build/config/x86.cfg b/build/config/x86.cfg
> index 3caadd2..b0fc9a2 100644
> --- a/build/config/x86.cfg
> +++ b/build/config/x86.cfg
> @@ -332,6 +332,11 @@ arch_miniiso: x86_syslinux x86_grub_efi
>   | todos > $(TEMP_CD_TREE)/win32-loader.ini; \
>   fi
>  
> + # Clamp timestamps to be no newer than last changelog entry, see
> + # https://wiki.debian.org/ReproducibleBuilds/TimestampsInTarball
> + find $(TEMP_CD_TREE) -newermt "$(SOURCE_DATE)" -print0 \
> +  | xargs -0r touch --no-dereference --date="$(SOURCE_DATE)"
> +

Refraining from writing “almost thrice”. [XXX]

>   if [ "$(GRUB_EFI)" = y ]; then \
>   xorriso -as mkisofs -r -J -b isolinux.bin -c boot.cat \
>   -no-emul-boot -boot-load-size 4 -boot-info-table \
> diff --git a/debian/changelog b/debian/changelog
> index 42aed37..09c8a02 100644
> --- a/debian/changelog
> +++ b/debian/changelog
> @@ -1,3 +1,21 @@
> +debian-installer (20150422+kbsd8u2) jessie-kfreebsd; urgency=medium
> +
> +  * Improve reproducibility of debian-installer netboot images:
> +(Closes: #805321)
> +- clamp timestamps in the d-i ramdisk to be no later than
> +  the most recent debian/changelog entry of this package
> +  - raise makefs dependency on >= 20100306-5+kbsd8u1
> +- clamp timestamps in the mini.iso similarly
> +- clamp timestamps in the netboot tarball;  store files in a
> +  stable order
> +- clamp timestamps in the cd info tarball;  store files in a
> +  stable o

Re: [Reproducible-builds] Bug#805321: debian-installer: builds unreproducible netboot images

2015-11-16 Thread Cyril Brulebois
(Keeping everyone initially x-d-cc'd in the loop.)

Hi,

Steven Chamberlain (2015-11-16):
> Package: debian-installer
> Version: 20150422
> Severity: wishlist
> Tags: patch

Where's the patch? :p

> The debian-installer package build produces netboot.tar.gz and
> the mini.iso netboot install media.  It doesn't do this in an easily
> reproducible way:
> 
>   * the d-i initrd/mfsroot is a filesystem image, having variable
> mtime/ctime/atime timestamps from package build time;
>   * likewise in the generated mini.iso;
>   * netboot.tar.gz also has varying timestamps;  the order of files
> may also vary depending on the filesystem;
>   * likewise in the cd info tarball;
>   * likewise in the debian-installer-images tarball;
>   * all gzipped output files have a timestamp in the header.
> 
> I have a patch aimed at jessie-kfreebsd that should fix all of the
> above.  It should be possible to do the same in sid with much less
> code, due to new GNU tar features and other reproducible builds work.

Please make sure not to depend on features which are not found in stable
(I'm not entirely sure about oldstable at this point), which might hinder
our ability to cherry-pick bits and pieces from master to jessie.

I know this might sound a bit silly since you're targeting
jessie-kfreebsd anyway, but I'd like to point it out all the same, just
in case someone wants to rework/“simplify” your work later on.

> I've 'clamped' timestamps to be no later than the most recent
> debian/changelog entry date.  That way, the non-useful timestamps
> from during the build are adjusted to a constant value.  Older
> timestamps, actually indicating how old a file is, are untouched.
> The BUILD_DATE, actually the package version number, is unchanged.
> 
> Specifically on kfreebsd, the generated mfsroot is a ffs filesystem
> having file atimes, and another timestamp in the filesystem superblock.
> I intend to patch makefs so that it can clamp timestamps to a given
> SOURCE_DATE_EPOCH.
> 
> Besides a file ordering issue in makefs, all output files including
> netboot.tar.gz and mini.iso then seem to be reproducible for
> jessie-kfreebsd, at least.  :)

I don't have much knowledge in this area (or time to investigate right
away), so I'll probably let reproducible people comment on this once they
see your patch.

Mraw,
KiBi.

