(Please keep me on CC, I am not subscribed)

Background
==========

Previous background is here:
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00182.html

Upon further discussion, we decided to add support for multiple mappings and
to rename the environment variable to BUILD_PATH_PREFIX_MAP. We have also
prepared a document that describes how this works in detail, so that projects
can be confident that they are interoperable:
https://reproducible-builds.org/specs/build-path-prefix-map/

The specification is currently in DRAFT status, awaiting some final feedback,
including what the GCC maintainers think about it.

If you are interested in reading about this topic in the wider context of
reproducible builds, there is more background here:
https://wiki.debian.org/ReproducibleBuilds/StandardEnvironmentVariables

Proposal
========

This patch series adds a new environment variable BUILD_PATH_PREFIX_MAP. When
it is set, GCC treats its value as extra implicit "-fdebug-prefix-map=$value"
command-line arguments that precede any explicit ones. This makes the final
binary output reproducible, and also hides the unreproducible value (the
source path prefixes) from CFLAGS et al., which many build tools
(understandably) embed as-is into their build output.

This environment variable also acts on the __FILE__ macro, mapping it in the
same way that debug-prefix-map works for debug symbols. We have seen that
__FILE__ is also a very large source of unreproducibility, and is represented
quite heavily in the 3k+ figure given earlier.

Finally, we tweak the mapping algorithm so that it applies only to whole path
components when matching prefixes. This algorithm contains fewer corner cases
and is more predictable, so it is easier for users to figure out how to set
the mapping appropriately, and it is better as a standardised algorithm that
other build tools might like to adopt. (The original idea came from
discussions with some rustc developers about this same topic.)
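
To illustrate what "whole path components" means for the prefix matching
described above, here is a minimal sketch in C. It is not the code from the
patches: the names component_prefix_p and remap_path are hypothetical, the
separator is assumed to be '/', an empty prefix is assumed never to match, and
parsing of the environment variable itself is left out entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the patch): return 1 if PREFIX matches PATH
   on a whole-component boundary, 0 otherwise.  Assumes '/' separators.  */
static int
component_prefix_p (const char *path, const char *prefix)
{
  size_t len = strlen (prefix);

  /* Assumption for this sketch: an empty prefix never matches.  */
  if (len == 0 || strncmp (path, prefix, len) != 0)
    return 0;

  /* The match must end at the end of PATH, at a separator in PATH, or
     PREFIX itself must end with a separator (e.g. the root "/").  */
  return path[len] == '\0' || path[len] == '/' || prefix[len - 1] == '/';
}

/* Hypothetical helper (not from the patch): return a newly allocated copy of
   PATH with the leading FROM replaced by TO if FROM matches on a component
   boundary, or an unchanged copy otherwise.  Returns NULL if allocation
   fails.  The caller frees the result.  */
static char *
remap_path (const char *path, const char *from, const char *to)
{
  const char *rest = path;
  const char *head = "";
  size_t head_len, rest_len;
  char *out;

  assert (path != NULL && from != NULL && to != NULL);

  if (component_prefix_p (path, from))
    {
      head = to;
      rest = path + strlen (from);
    }

  head_len = strlen (head);
  rest_len = strlen (rest);
  out = malloc (head_len + rest_len + 1);
  if (out == NULL)
    return NULL;

  memcpy (out, head, head_len);
  memcpy (out + head_len, rest, rest_len + 1); /* Copies the trailing NUL.  */
  return out;
}
```

With this sketch, mapping "/build/src" to "." turns "/build/src/foo.c" into
"./foo.c" but leaves "/build/src2/foo.c" alone, whereas a plain string-prefix
match would rewrite the latter to ".2/foo.c".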

This change to the matching algorithm does technically break backwards
compatibility, but I was under the impression that this option is not seen as
so critical a feature that the break would matter much. I am also happy to
justify it in more detail on request. Nevertheless, for this reason our draft
specification currently offers two algorithms for implementers, but I would
reduce this to one if the GCC maintainers agree to accept this third patch.

Testing
=======

I've tested these patches on a Debian unstable x86_64-linux-gnu schroot
running inside a Debian jessie system, on a full-bootstrap build. The output
of contrib/compare_tests is as follows:

~~~~
gcc-7-20170409$ contrib/compare_tests ../gcc-build-0 ../gcc-build-1
# Comparing directories
## Dir1=../gcc-build-0: 8 sum files
## Dir2=../gcc-build-1: 8 sum files

# Comparing 8 common sum files
## /bin/sh contrib/compare_tests /tmp/gxx-sum1.24154 /tmp/gxx-sum2.24154
New tests that PASS:

gcc.dg/cpp/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-1.c execution test
gcc.dg/cpp/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-2.c execution test
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c scan-assembler DW_AT_comp_dir: "DWARF2TEST/gcc
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c scan-assembler DW_AT_comp_dir: "/

# No differences found in 8 common sum files
~~~~

I can also provide the full logs on request.

--

I've also fuzzed the prefix-map code using AFL with ASAN enabled. Due to how
AFL works, I did not fuzz this patch directly but a smaller program containing
just the parser and remapper, available here:
https://anonscm.debian.org/cgit/reproducible/build-path-prefix-map-spec.git/tree/consume

Over the course of about 4k cycles, no crashes were found.
To reproduce, you could run something like:

  $ echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
  $ make CC=afl-gcc clean reset-fuzz-pecsplit.c fuzz-pecsplit.c

--

I will soon test this patch, backported to Debian GCC-6, on
tests.reproducible-builds.org and expect to have results in a few days or
weeks. Some earlier preliminary tests gave good results (about +40 packages
reproducible over ~2 days), but we had to abort them due to a scheduling
mistake.

Copyright disclaimer
====================

I dedicate these patches to the public domain by waiving all of my rights to
the work worldwide under copyright law, including all related and neighboring
rights, to the extent allowed by law.

See https://creativecommons.org/publicdomain/zero/1.0/legalcode for full text.

Please let me know if the above is insufficient and I will be happy to sign
any relevant forms. However, I would prefer that prefix-map.{h,c} remain in
the public domain, since that code is also duplicated in our "example code"
repo (URL above), which is meant for other projects to copy and paste.