Hello fellow MacPorts people,

Brace yourselves, this mail is going to be long.
As some of you may know [1], I recently represented MacPorts at a workshop on build reproducibility in Athens organized by the Debian folks. A wide range of projects and package management systems was represented; the full list is available on the website [2]. The *BSDs and Homebrew (Mike McQuaid) are probably most relevant to our interests.

Use Cases
=========

If you are not sure why we should care about reproducible builds, let me assure you there are plenty of reasons. A more detailed rationale was prepared at the workshop and should be available at [3] soon, but I'll take the time to mention a few points. Note that "reproducible builds" in this context always means bit-by-bit reproducibility, unless explicitly stated otherwise.

- Source <-> Binary Correspondence: Reproducible builds allow developers to verify that what our buildbot is serving us is actually what it claims to be.
- Attack Surface Reduction: Having reproducible builds reduces the motivation to attack our buildbot setup, because modifications can be detected.
- Caching: You can avoid rebuilds of packages that build reproducibly if the inputs haven't changed. This doesn't seem to be *that* important to us at the moment, but is a big selling point, especially for commercial software development.
- Delta Reduction: With reproducible builds, small changes in source will be more likely to cause small changes in the resulting binary. This could be used to allow binary-delta updating, reducing download time, bandwidth requirements, and update time.
- Support Burden Reduction: Build reproducibility can provide confidence that a user's build is exactly what a packager intended and rule out a whole class of bugs.

How to Build Reproducibly
=========================

The process to get reproducible builds is pretty well-understood. The reproducible-builds.org documentation outlines the most common problems and issues that prevent reproducible builds [3].
For the most part, all distributions face the same issues, which allows us to build on the effort of projects with more manpower, like Debian. There are a couple of points that might not be obvious or are easily overlooked that I'd like to point out:

- Filesystem ordering and locale-dependent sorting: Relying on the order of files that readdir(3) returns makes builds unreproducible. Sorting those files will only help if the sort result doesn't differ by locale.
- Timestamps are everywhere and are responsible for a large part of unreproducible builds. Using __DATE__, __TIME__, __TIMESTAMP__, or similar macros should be avoided. Version numbers or version control system information are much better replacements: If your build is reproducible, it does not matter *when* it happened. However, lots of tools include timestamps by default, such as gzip(1) when compressing our manpages (r143068) or tar(1) when creating our binary archives. Strategies to solve these problems exist, e.g. by providing a ceiling value for all timestamps while creating a tarball or using the environment variable SOURCE_DATE_EPOCH [4] for date-dependent macros.
- Well-defined build environments: Pretty much the rest of the world has good OS-level support for a chroot(2)-like mechanism that can be used to provide a build environment that only contains inputs from a controlled list of dependencies. FreeBSD has jails, Linux has namespaces, but the only thing OS X supports in this direction is chroot, and chroots have a reputation of breaking some of Apple's tools like xcodebuild (a reputation I may set out to verify or falsify). Trace mode is a step in the right direction, but doesn't catch everything and is very slow compared to other methods. To complicate matters further, we rely on Apple's toolchain, which can be updated and/or changed independently of MacPorts.
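
To make the first two points above a bit more concrete, here is a minimal Python sketch of an archive step that sorts entries independently of the locale, applies a ceiling to all mtimes, and keeps the build time out of the gzip header. This is not how MacPorts creates its archives today; the function name reproducible_tar_gz and the fallback value of 0 are made up for illustration, and file mode bits are taken as found on disk, so they still depend on the umask of the build.

```python
import gzip
import io
import os
import tarfile


def reproducible_tar_gz(src_dir, epoch=None):
    """Return the bytes of a .tar.gz of src_dir that do not depend on
    readdir(3) order, the locale, file ownership, or the time of the build."""
    if epoch is None:
        # SOURCE_DATE_EPOCH is the variable specified in [4]; falling back
        # to 0 is only a placeholder chosen for this sketch.
        epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))

    # Collect every directory and file below src_dir as a relative path.
    paths = []
    for root, dirs, files in os.walk(src_dir):
        for name in dirs + files:
            paths.append(os.path.relpath(os.path.join(root, name), src_dir))
    # list.sort() on str compares code points, so the resulting order does
    # not depend on LANG or LC_ALL.
    paths.sort()

    tar_bytes = io.BytesIO()
    with tarfile.open(fileobj=tar_bytes, mode="w",
                      format=tarfile.GNU_FORMAT) as tar:
        for rel in paths:
            full = os.path.join(src_dir, rel)
            info = tar.gettarinfo(full, arcname=rel)
            if info is None:
                continue  # file type that tar cannot store, e.g. a socket
            # Ceiling: anything newer than epoch is clamped down to epoch.
            info.mtime = min(int(info.mtime), epoch)
            # Do not record who happened to run the build.
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            if info.isreg():
                with open(full, "rb") as fh:
                    tar.addfile(info, fh)
            else:
                tar.addfile(info)

    out = io.BytesIO()
    # mtime=0 and an empty filename keep the build time and the archive
    # name out of the gzip header.
    with gzip.GzipFile(filename="", fileobj=out, mode="wb", mtime=0) as gz:
        gz.write(tar_bytes.getvalue())
    return out.getvalue()
```

Calling reproducible_tar_gz() twice on two copies of the same tree, even if the files were created in a different order and at different times, should give identical bytes as long as both calls use the same epoch and the same Python version.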

Testing Build Reproducibility
=============================

In order to find out whether a build can be reproduced, it should be done multiple times, with possibly varying input settings. The more input and environment settings can be modified without the build result changing, the higher the reproducibility. Debian has a couple of machines available and runs a Jenkins setup that will build each package twice but vary a couple of settings for the second build, such as: hostname, domainname, environment variables (TZ, LANG, LC_ALL, PATH), UID/GID, kernel version, umask, CPU type, current time (by a large amount to trigger changes in year, month and day regardless of timezone), and filesystem sort order (by using a FUSE filesystem that will make readdir(3) return different results).

While Debian's setup is available for use by other projects, it is of little use to us because OS X cannot be virtualized on non-Apple hardware without violating the EULA. One of the biggest hurdles towards systematic testing for build reproducibility in MacPorts (and Homebrew as well, btw) is thus the availability of Apple hardware.

To track down the differences that cause builds to be non-reproducible, a couple of people from the Debian reproducible builds effort have written diffoscope [5], a Python diff tool that will interpret file formats and try hard to give you a human-readable difference between two files. Support for Mach-O binaries is available as a patch at [6] (and I hope to push it upstream soon). This tool could also be helpful to look at differences in stealth updates.

State of Reproducible Builds in MacPorts
========================================

Despite the several obstacles mentioned above, build reproducibility in MacPorts is actually not a lost cause. This is partly because we have historically always tried to keep a clean and similar build environment across machines, e.g.
by using privilege separation, removing all but a few white-listed environment variables, and trace mode. Timestamps are our biggest issue on the road towards reproducible tarballs at the moment. In a sloppy test done by Marius Schamschula and me, we managed to reproduce our builds of bash down to timestamp issues in gzip headers and tarball metadata. Unfortunately, generating statistics on reproducibility requires buildserver support.

To fix the timestamp issues, I am looking for a suitable value to use as SOURCE_DATE_EPOCH. I then plan to add a find statement before creating the archive that will put an upper mtime limit on all files to be packaged. I am not yet sure what a good (reproducible!) timestamp might be:

- The Portfile mtime would be perfect, but is not preserved by Subversion, so we cannot rely on it. It is preserved by our rsync sync, but the mtime in that is probably meaningless since it's the one generated on the rsync server during svn update.
- The newest timestamp inside a source code tree is a good choice (and https://github.com/0-wiz-0/findnewest could easily give us that timestamp), but sources fetched from version control systems do not always set it to the time of the commit (AFAIK Git doesn't, for example).
- A fixed value of 0 or 1 is not a very good choice. We could put an additional piece of metadata into Portfiles to be used as timestamp (e.g. just like we have checksums). It is my understanding that FreeBSD will choose to go this route.

Miscellaneous Topics
====================

I've learned that our builds of GHC and all Haskell modules are likely ABI-incompatible when downloaded from the buildbot vs. built locally. We should disable parallel building for Haskell to fix this until upstream provides a better solution. Luckily, this hasn't affected us much yet, because binary availability in Haskell land is high.

Homebrew achieves good binary package coverage for non-default prefixes by scanning the build results for $prefix.
In library load commands, the path is changed locally on installation using install_name_tool(1); in text files, the path is simply replaced. If $prefix is found in a binary file, the archive is marked as non-prefix-invariant and ignored by non-default prefix installations.

Homebrew has methods to provide compiler wrappers that ensure that build systems are UsingTheRightCompiler, and additionally ensure that the compiler flags are set as expected (e.g. -arch flags, -stdlib flag for C++).

Google's Blaze (Open Source: Bazel) build system supports license annotation on build results and license compatibility analysis. Their approach to the problem might be interesting input for the set of scripts we use to determine whether a binary archive is distributable.

Acknowledgments
===============

I'd like to thank portmgr@ for giving me the chance to represent the MacPorts Project at this event. Travel and Accommodation have been sponsored by the Linux Foundation. Conference Location and Moderation have been sponsored by the Open Technology Fund. Dinner has been provided by Google ;-)

[1] https://lists.macosforge.org/pipermail/macports-dev/2015-September/031440.html
[2] https://reproducible-builds.org/events/athens2015/
[3] https://reproducible-builds.org/docs/
[4] https://reproducible-builds.org/specs/source-date-epoch/
[5] https://diffoscope.org/
[6] https://lists.reproducible-builds.org/pipermail/diffoscope/2015-December/000000.html

-- 
Clemens Lang
MacPorts Developer
_______________________________________________
macports-dev mailing list
macports-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/macports-dev