Richard Stallman wrote:
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I was recently reading about the backdoor announced in xz-utils the
  > other day, and one of the things that caught my attention was how
  > (ab)use of the GNU build system played a role in allowing the backdoor
  > to go unnoticed: https://openwall.com/lists/oss-security/2024/03/29/4

[...]

I don't want to get involved in fixing the bug, but I want to
make sure the GNU Project is working on it.

I believe the GNU Project is blameless and uninvolved in this matter. I am aware of three elements from the GNU Project that may have been used in the attack; however, two of them are innocent improvements abused by the cracker, and the third was modified by the cracker.

The backdoor is in two major segments: a binary blob hidden in a testsuite data file and two shell scripts that drop it, also hidden in testsuite data files. A modified version of build-to-host.m4 from gnulib is used to insert code into configure to initially extract the dropper and run it (via pipeline---the dropper shell scripts never touch the filesystem).

The dropper acts only if several conditions are met: the build produces a shared library for 'x86_64-*-linux-gnu'; HAVE_FUNC_ATTRIBUTE_IFUNC is defined; the GNU toolchain is in use (or at least a linker that claims to be "GNU ld" and a compiler invoked as "gcc"); and the build runs under either dpkg or rpm. In that case, the dropper extracts a binary blob and links it with a legitimate object, which is patched and recompiled (using sed in a pipeline; the modified C source never touches the filesystem) to call a function exported from the blob.
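
To make the gating easier to reason about, here is a rough sketch in Python of the conditions listed above. It is only a model of the prose description, not a transcription of the dropper's shell code: the BuildEnv type, its field names, and the exact way each check is performed (glob match on the triplet, substring match on the linker banner, basename match on the compiler command) are my own assumptions for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
import os.path
import shlex


@dataclass
class BuildEnv:
    """Hypothetical summary of a build environment (all names are assumptions)."""
    host_triplet: str                 # e.g. "x86_64-pc-linux-gnu"
    building_shared: bool             # shared library build enabled
    have_func_attribute_ifunc: bool   # configure defined HAVE_FUNC_ATTRIBUTE_IFUNC
    linker_banner: str                # first line printed by the linker's --version
    compiler_command: str             # the command used as the C compiler
    under_dpkg_or_rpm: bool           # build is driven by dpkg or rpm packaging


def gate_conditions(env: BuildEnv) -> dict:
    """Return each condition from the description with its truth value."""
    words = shlex.split(env.compiler_command)
    return {
        "x86_64 GNU/Linux host": fnmatchcase(env.host_triplet, "x86_64-*-linux-gnu"),
        "shared library build": env.building_shared,
        "ifunc attribute available": env.have_func_attribute_ifunc,
        # Assumption: "claims to be GNU ld" is modelled as a substring test.
        "linker claims GNU ld": "GNU ld" in env.linker_banner,
        # Assumption: "invoked as gcc" is modelled as the command's basename.
        "compiler invoked as gcc": bool(words) and os.path.basename(words[0]) == "gcc",
        "dpkg or rpm build": env.under_dpkg_or_rpm,
    }


def would_drop(env: BuildEnv) -> bool:
    """The dropper proceeds only when every condition holds."""
    return all(gate_conditions(env).values())
```

Calling gate_conditions() on a description of your own build environment shows which single condition, if any, would have kept the dropper inactive there; for instance, a build with ifunc support disabled fails the third check regardless of the others.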

The aclocal m4 files involved in the attack were never committed to the xz Git repository, instead being added to each release tarball using autopoint. (This was the package's standard practice before the attack.) The offending build-to-host.m4 was modified by the cracker, either directly in the release tree or at the location where autopoint will find it. Some of the modifications "sound like" the cracker may have used a language model trained on other GNU sources---they are very plausible at first glance.

The elements from the GNU Project potentially implicated are build-to-host.m4, gettext.m4, and the ifunc feature in glibc. All of these turn out to be innocent.

The initial "key" that activated the backdoor dropper was a modified version of the gl_BUILD_TO_HOST macro from gnulib. The dropper also checks m4/gettext.m4 for the text "dnl Convert it to C string syntax." and fails to extract the blob if found. It turns out that gl_BUILD_TO_HOST is used only as a dependency of gettext.m4 and that the comment in question was removed in the same commit that factored out gl_BUILD_TO_HOST to gnulib. (Commit 3adaddd73c8edcceaed059e859bd5262df65fc5a in the GNU gettext repository is by Bruno Haible; his involvement in the backdoor campaign is *extremely* unlikely in my view.)
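
The gettext.m4 check amounts to a plain text search, which can be sketched as follows. This is a Python model of the behaviour described above rather than the dropper's actual code; the function name is hypothetical, and only the marker string comes from the analysis.

```python
# Comment text the dropper looks for in m4/gettext.m4, per the description above.
MARKER = "dnl Convert it to C string syntax."


def gettext_m4_blocks_extraction(gettext_m4_text: str) -> bool:
    """Return True if the marker comment is present.

    Per the description, a gettext.m4 that still contains this comment
    predates the commit that factored gl_BUILD_TO_HOST out to gnulib,
    and the dropper does not extract the blob in that case.
    """
    return MARKER in gettext_m4_text
```

To try it against a real tree, read m4/gettext.m4 from an unpacked tarball and pass its contents to the function.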

The "ifunc" feature merely allows the programmer to store function pointers in the PLT instead of the data segment when alternate implementations of a function are involved. Theoretically, it should actually be a security improvement, as the PLT can be made read-only after all links are resolved, while the data segment must remain generally writable.

The backdoor will not be dropped if the use of ifunc is disabled or if the feature is unavailable. I currently believe that the cracker used ifunc support as a covert flag to disable the backdoor when the oss-fuzz project was scanning the package. I also suspect that the cracker's claim that ifuncs cause segfaults under -fsanitize=address (in the message for commit ee44863ae88e377a5df10db007ba9bfadde3d314 in the xz Git repository) may have been less than honest; that commit also gives credit to another of the cracker's sockpuppets for the original patch and was committed by the cracker's main "Jia Tan" sockpuppet, so the involvement of the primary maintainer (who is listed as the author of the commit in Git) is uncertain. (In other words, the xz Git repository likely contains blatant lies put there by the cracker.)

Looking into this a little more, I now know what the dropper's C source patch does: the blob's initialization entrypoint is named _get_cpuid (note only one leading underscore) and is called from an inserted static inline function that crc{32,64}_resolve (the ifunc resolvers that choose CRC implementations) are patched to call. The dropper also ensures (by modifying liblzma_la_LDFLAGS in src/liblzma/Makefile) that liblzma.so will be linked with -Wl,-z,now so that ifuncs are resolved as the shared object is loaded. That is how the backdoor blob initially gains control at a time during early process initialization when the PLT is still writable despite other hardening.


-- Jacob
