>> But today, if you're building an executable for others, it's common to
>> build using a container/chroot or similar that makes it easy to implement
>> "must compile with these paths", while *fixing* this is often a lot of
>> work.
I know that my opinion is not popular, but let me try again before we lay this decision to rest. By avoiding fixing directory dependencies, you move the complexity around, but you don't reduce it. Our instructions for reproducing any package would have to identify what container/chroot/namespace/whatever the end-user must set up to successfully reproduce that package. Will these be the same for every package, for every distro, and for every other environment in which we want to inspire reproducibility? Do we need to add those constraints to the Linux Foundation's Filesystem Hierarchy Standard? Do we need to add them to the buildinfo files?

Ideally the tools that ordinary people traditionally use to reproduce a package, such as dpkg-buildpackage or rpmbuild, will have been improved to do the container/chroot setup automatically. Otherwise, naive users will have to figure out what a container is, and why they must grok this obscure environmental thing in order to tell whether their binary package was tampered with. Will they always have to build software as root, because chroot doesn't and can't work for ordinary users? If we punt on this, there will be an ongoing flow of "my package doesn't build to the same binary, somebody must be 0wning me" emails from people who do the obvious thing, like type "make" and "cmp".

Do we want successful reproducibility to depend on setting up servers and virtual machines and web servers and databases and build farms and CI queues and such? Yes, to reproduce a whole distro, reproducibility has to WORK there, but does it have to DEPEND on that complex infrastructure? I'm an old Unix guy and so are millions of end-users and sysadmins. Containers are a recent Linux thing. Namespaces ditto.
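To make the "make and cmp" point concrete, here is a minimal sketch of why the naive check breaks when build paths leak into the output. The two files and the embedded directories are invented for illustration; they stand in for two builds of the same source done from different directories:

```shell
# Simulate two builds of identical source whose only difference is the
# build directory that got embedded in the output (debug info, __FILE__,
# etc.). All filenames and paths here are illustrative.
printf 'binary built in /build/pkg-1.0\n'     > build-a
printf 'binary built in /home/user/pkg-1.0\n' > build-b

# The naive user's check: byte-for-byte comparison.
if cmp -s build-a build-b; then
    echo "identical"
else
    echo "differ"   # the path difference alone breaks bit-for-bit equality
fi
```

This is exactly the case where the user cannot tell "tampered with" apart from "built in a different directory", which is the ambiguity the flow of worried emails would be about.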
I still have never found a use for containers; I tried using Docker for something and was bemused to discover that it could calculate all kinds of stuff, but none of the output of the calculation could come back into my ordinary Linux filesystem (without some kind of obscure per-invocation JCL-like configuration setup), so I stopped trying to use it. Another time, I tried booting an on-disk, installed copy of Ubuntu inside a virtual machine, so I could keep running an older service that's hard to port forward while migrating the rest of my machine to a newer Ubuntu release. VM/360 could do that decades ago, but I discovered that that use-case is not well supported in the Linux VM tools and documentation, so I gave up on that too. There are more things in heaven and earth, Horatio, than spending all of your time doing sysadmin. These newfangled tools are just not as well rounded as the stuff that's been well understood in Unix since the 1970s or 1980s, like "directories".

If only seventeen experts in the world can figure out whether a package has been tampered with, we will have labored mightily but not done much to improve computer security. Also recall what pains the full-source bootstrap people are having to go through after some, imho, foolish decisions were made about depending on modern C++ features inside core tools like gcc and gdb. Reproducible builds should make the underlying software LESS dependent on the particular configuration of the build environment; that's kind of the point.

>>> ... it makes reproducibility from around 80-85% of all packages to >95%,
>>> IOW with this shortcut we can have meaningful reproducibility *many
>>> years* sooner, than without.

If we move the goal posts in order to claim victory, who are we fooling but ourselves?
I'd rather that we knew and documented that 57% of packages are absolutely reproducible, 23% require SOURCE_DATE_EPOCH, and 12% still require a standardized source code directory, than to claim all 95% are "meaningfully reproducible" today.

	John
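For readers who haven't met it: SOURCE_DATE_EPOCH is the Reproducible Builds convention covering that middle class of packages. A build step that would otherwise embed "now" uses the variable's value when it is set, so rebuilds at different times come out identical. A hedged sketch follows; the build_stamp function is invented for illustration, and only the variable name comes from the convention:

```shell
# build_stamp stands in for any build step that embeds a timestamp.
# Per the SOURCE_DATE_EPOCH convention, it prefers the variable over the
# wall clock when the variable is set (GNU date syntax assumed).
build_stamp() {
    date -u -d "@${SOURCE_DATE_EPOCH:-$(date +%s)}" '+%Y-%m-%dT%H:%M:%SZ'
}

SOURCE_DATE_EPOCH=1700000000   # pinned by the package metadata, say
a=$(build_stamp)
sleep 1                         # a later rebuild...
b=$(build_stamp)
[ "$a" = "$b" ] && echo "stamps match: $a"
```

Note this still leaves the 12% whose outputs depend on the build directory, which no environment variable fixes; that is the class the container-vs-standardized-path argument above is about.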