Eric Blake <ebl...@redhat.com> writes:

> Widening the audience to include bug-gnulib, which is the upstream
> source of "# build-to-host.m4 serial 3" which was bypassed by the
> malicious "# build-to-host.m4 serial 30".
>
> On Sun, Mar 31, 2024 at 11:51:36PM +0200, Guillem Jover wrote:
>> Hi!
>>
>> While analyzing the recent xz backdoor hook into the build system [A],
>> I noticed that one of the aspects why the hook worked was because it
>> seems like «autoreconf -f -i» (that is run in Debian as part of
>> dh-autoreconf via dh) still seems to take the serial into account,
>> which was bumped in the tampered .m4 file. If either the gettext.m4
>> had gotten downgraded (to the version currently in Debian, which would
>> not have pulled the tampered build-to-host.m4), or once Debian upgrades
>> gettext, the build-to-host.m4 would get downgraded to the upstream
>> clean version, then the hook would have been disabled and the backdoor
>> would be inert. (Of course at that point the malicious actor would
>> have found another way to hook into the build system, but the less
>> avenues there are the better.)
>>
>> I've tried to search the list and checked for old bug reports on the
>> debbugs.gnu.org site, but didn't notice anything. To me this looks like
>> a very unexpected behavior, but it's not clear whether this is intentional
>> or a bug. In any case regardless of either position, it would be good to
>> improve this (either by fixing --force to force things even if
>> downgrading, or otherwise perhaps to add a new option to really force
>> everything).
>>
>> [A] <https://lists.debian.org/debian-devel/2024/03/msg00367.html>
>> Longish mail, search for "try to go in detail" for the analysis.
>
> My understanding is that the use of serial numbers in .m4 snippets was
> intentional in gnulib (more or less where the practice originated),
> but only because gnulib prefers a linear history (everything is
> monotonically increasing, no forks for the serial number to diverge
> on). In light of this weekend's mess, Bruno may have more ideas about
> how to prevent his files from being turned into backdoor delivery
> mechanisms in the future.
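
To make the serial issue quoted above concrete, here is a minimal Python sketch of a "highest serial wins" rule. It is only a simplified model of the behavior described in this thread, not the actual logic of aclocal, autopoint or autoreconf; the function names parse_serial and pick_by_serial are hypothetical.

```python
import re

# Matches a header comment such as "# build-to-host.m4 serial 30".
SERIAL_RE = re.compile(r"^#\s*(\S+\.m4)\s+serial\s+(\d+)", re.MULTILINE)


def parse_serial(m4_text):
    """Return the serial from an .m4 header comment, or 0 if none is found."""
    match = SERIAL_RE.search(m4_text)
    return int(match.group(2)) if match else 0


def pick_by_serial(shipped_text, system_text):
    """Hypothetical selection rule: keep whichever copy has the higher serial."""
    if parse_serial(shipped_text) >= parse_serial(system_text):
        return "shipped"
    return "system"


# The two header lines mentioned in the thread.
upstream = "# build-to-host.m4 serial 3\n"
tampered = "# build-to-host.m4 serial 30\n"

# Under this rule the tampered copy from the tarball is kept.
print(pick_by_serial(tampered, upstream))
```

Under such a rule, anyone who can edit the shipped file can win the comparison just by bumping the number, which is why a serial only works as an ordering hint when the history is linear and trusted.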

I think the root cause here is the assumption that 'autoreconf -fi' achieves anything related to re-bootstrapping. I think the entire concept of re-bootstrapping from a source tarball with generated contents in it is fundamentally flawed.

I have proposed that we should start to release *-src.tar.gz tarballs that don't have anything pre-generated in them, and that can be completely bootstrapped using external tools. See writeup here:

https://blog.josefsson.org/2024/04/01/towards-reproducible-minimal-source-code-tarballs-please-welcome-src-tar-gz/

To me, moving things towards this approach allows incremental work that eventually will be more reliable than anything that attempts to re-bootstrap from a tarball with some pre-generated artifacts in it (because there will always be uncertainty about whether the artifact used was actually built or came from the tarball).

I suggest that we extend 'make dist' to produce these *-src.tar.gz tarballs, possibly only when some new automake AM_INIT_AUTOMAKE flag is used. There could be some functions to modify how the tarball is generated, much like we have dist-hooks today that are often used to generate ChangeLog for the tarballs.

Thoughts?

/Simon
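
As a rough illustration of the uncertainty described above, the following Python sketch lists the files in a release tarball that are not among the version-controlled sources. It is only a sketch under stated assumptions: the helper names, the "pkg-1.0/" prefix and the demo file list are hypothetical, and the set of tracked files is assumed to come from elsewhere (for example from the project's version control).

```python
import io
import tarfile


def untracked_members(tarball, tracked, prefix):
    """Return tarball file paths (minus the top-level prefix) not in `tracked`."""
    extras = []
    with tarfile.open(fileobj=tarball, mode="r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            name = member.name
            if name.startswith(prefix):
                name = name[len(prefix):]
            if name not in tracked:
                extras.append(name)
    return sorted(extras)


def make_demo_tarball(paths):
    """Build a small in-memory .tar.gz with empty files, for demonstration only."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path in paths:
            info = tarfile.TarInfo(name=path)
            info.size = 0
            tar.addfile(info, io.BytesIO(b""))
    buf.seek(0)
    return buf


# Hypothetical release tarball: one tracked source plus two shipped extras.
demo = make_demo_tarball([
    "pkg-1.0/configure.ac",
    "pkg-1.0/configure",
    "pkg-1.0/m4/build-to-host.m4",
])
tracked = {"configure.ac"}

# Every path printed here would have to be regenerated or audited.
print(untracked_members(demo, tracked, "pkg-1.0/"))
```

For a *-src.tar.gz as proposed, the expectation would be that such a list comes out empty.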