Bruno Haible <br...@clisp.org> writes:

> Richard Stallman commented on Jacob Bachmeyer's idea:
>> > > Another related check that /would/ have caught this attempt would be
>> > > comparing the aclocal m4 files in a release against their (meta)upstream
>> > > sources before building a package.  This is something distribution
>> > > maintainers could do without cooperation from upstream.  If
>> > > m4/build-to-host.m4 had been recognized as coming from gnulib and
>> > > compared to the copy in gnulib, the nonempty diff would have been
>> > > suspicious.
>>
>> I have a hunch that some effort is needed to do that comparison, but
>> that it is feasible to write a script to do it could make it easy.
>> Is that so?
>
> Yes, the technical side of such a comparison is relatively easy to
> implement:
>   - There are less than about 2000 or 5000 *.m4 files that are shared
>     between projects. Downloading and storing all historical versions
>     of these files will take ca. 0.1 to 1 GB.
>   - They would be stored in a content-based index, i.e. indexed by
>     sha256 hash code.
>   - A distribution could then quickly test whether a *.m4 file found
>     in a distrib tarball is "known".
>
> The recurrently time-consuming part is, whenever an "unknown" *.m4 file
> appears, to
>   - manually review it,
>   - update the list of upstream git repositories (e.g. when a project
>     has been forked) or the list of releases to consider (e.g. snapshots
>     of GNU Autoconf or GNU libtool, or distribution-specific modifications).
>
> I agree with Jacob that a distro can put this in place, without needing
> to bother upstream developers.
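
The content-based index described above could look roughly like the
following. This is only a minimal sketch, not what distro-backdoor-scanner
does: the function names (sha256_of, build_index, unknown_m4_files) are
made up for illustration, and it assumes the upstream sources and the
release tarball have already been unpacked into local directories. It
also hashes only the checked-out version of each file, whereas a real
index would need every historical version.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_index(upstream_dirs):
    """Map digest -> set of paths for every *.m4 file under the given trees.

    upstream_dirs is a hypothetical list of local checkouts (gnulib,
    Autoconf, libtool, ...); only the current working tree is hashed.
    """
    index = {}
    for root in upstream_dirs:
        for m4 in sorted(Path(root).rglob("*.m4")):
            if m4.is_file():
                index.setdefault(sha256_of(m4), set()).add(str(m4))
    return index


def unknown_m4_files(unpacked_tarball, index):
    """Return the *.m4 files in an unpacked release whose digest is not indexed."""
    return sorted(
        str(m4)
        for m4 in Path(unpacked_tarball).rglob("*.m4")
        if m4.is_file() and sha256_of(m4) not in index
    )
```

Anything returned by unknown_m4_files() is what would then go to manual
review; a byte-for-byte copy of a known upstream file never shows up.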

I'm currently looking at adding support for this to
https://github.com/hlein/distro-backdoor-scanner.

It was brought up at https://openwall.com/lists/oss-security/2024/04/02/5.

>
> Bruno

best,
sam