Katherine Mcmillan:

> Just for clarity, does anyone know what "Unix-like operating systems"
> would be affected by this?
None.

TLDR: The build process of the backdoor explicitly aborts on platforms other than Linux x86-64.

As the maintainer of the archivers/xz port, I took a look at the build stages of the malicious code, because I had already prepared an update to 5.6.1 and run the code in question.

Two ostensible test files were committed to the xz repository immediately before the 5.6.0 release and updated immediately before 5.6.1: bad-3-corrupt_lzma2.xz, which as the name suggests is a malformed compressed file, and good-large_compressed.lzma, which is a valid file and extracts to a mixture of easily compressible repeated characters and incompressible pseudo-random data. By themselves those files are completely harmless.

As is common practice, the xz repository only contains input files like configure.ac and Makefile.am for the GNU autotools. For the release tarball, an autotools run generates the actual configure script, Makefile.in, etc., so the result can be built with "./configure && make". For the 5.6.0 and 5.6.1 releases, the build-to-host.m4 macro package that ships as part of GNU gettext was replaced by a modified version that was copied into the release tarball and, importantly, was used to generate a modified configure script. Let's call this stage 0.

When you run the configure script, the stage 0 shell snippet is executed. The malicious code runs a pipe of commands that reads the bad-3-corrupt_lzma2.xz file, swaps some byte values to turn it into a valid file, extracts the file with xz (which must already be installed), and feeds the content--let's call it stage 1--into a shell.

In 5.6.1, the stage 1 script will abort right away if the operating system doesn't identify as "Linux" with uname(1). The script runs another pipe of commands that decompresses good-large_compressed.lzma, picks some chunks of the result, replaces some byte values to turn it into a valid LZMA data stream, extracts the content--let's call it stage 2--and feeds it into a shell.
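To make the "swap some byte values, then extract" idea concrete, here is a minimal Python sketch of the general technique. It is not the actual stage 0 code: the substitution table SWAP, the helper names disguise() and recover(), and the placeholder payload are all made up for illustration. The only point is that a reversible byte substitution can make a compressed stream look corrupt to a decompressor, while a one-line translation turns it back into a valid file.

```python
import lzma

# Hypothetical substitution table, chosen only for illustration: it swaps
# two pairs of byte values, so applying it twice restores the input.
SWAP = bytes.maketrans(b"\x00\x01\xfe\xff", b"\x01\x00\xff\xfe")


def disguise(valid_xz: bytes) -> bytes:
    """Swap byte values so the stream no longer parses as a valid .xz file."""
    return valid_xz.translate(SWAP)


def recover(disguised: bytes) -> bytes:
    """Undo the swap and decompress, like a translate-then-extract pipe."""
    return lzma.decompress(disguised.translate(SWAP))


if __name__ == "__main__":
    payload = b"echo harmless placeholder\n"  # stands in for a hidden script
    hidden = disguise(lzma.compress(payload))
    try:
        lzma.decompress(hidden)
    except lzma.LZMAError as exc:
        print("decompressor rejects the disguised file:", exc)
    print(recover(hidden))
```

The .xz header magic contains a zero byte, so with this particular table the disguised file already fails at the header check.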
The data manipulation in stage 1 uses the head(1) command with the "-c" flag, which isn't available on OpenBSD.

In 5.6.1 there is another early attempt in the stage 2 script to verify that the operating system is Linux; however, the syntax is broken, so it doesn't actually do anything.

The stage 2 script runs quite a number of tests to ensure that the environment in which it executes is the one it expects: details of the directory tree, details of the files generated by configure, that the platform is x86-64 Linux, that the compiler is gcc and the linker GNU ld, that the IFUNC feature is available, that it runs as part of a .deb or .rpm package build. If any single one of those tests fails, the script aborts right away.

If everything checks out, stage 2 again runs a series of data manipulation commands to extract two object files from good-large_compressed.lzma and inject them into the build to generate a manipulated liblzma.

Various checks that stage 2 performs will fail on OpenBSD, and again it relies on "head -c" and now also on the GNU version of sed(1) to perform the required data manipulations.

For the actual code inserted into liblzma on Linux x86-64, I have to refer to the ongoing reverse engineering performed by the Linux people. It is my understanding that its code is triggered by an IFUNC constructor during dynamic linking that checks that it is in the address space of a /usr/sbin/sshd process and then proceeds to redirect an RSA signature verification routine to its own malicious code. Liblzma ends up dynamically linked to sshd because of a systemd-related extension added by many Linux packagers that pulls in liblzma as an unrelated dependency.
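Whether a given sshd binary pulls in liblzma at all can be checked from its list of dynamic dependencies. The following Python sketch assumes you already have the textual output of a tool such as ldd(1) for /usr/sbin/sshd. The function name links_liblzma() is mine, and the sample lines in the usage note are invented for illustration, not copied from a real system.

```python
def links_liblzma(ldd_output: str) -> bool:
    """Return True if any dependency line names a liblzma shared object.

    Expects one dependency per line, with the library name as the first
    whitespace-separated field, as ldd(1) typically prints it.
    """
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("liblzma.so"):
            return True
    return False
```

For example, links_liblzma("\tliblzma.so.5 => /lib/liblzma.so.5 (0x0)\n") returns True for that invented line, while a listing that only mentions libc and libcrypto returns False. On a Linux system the input could be obtained with subprocess.run(["ldd", "/usr/sbin/sshd"], capture_output=True, text=True).stdout.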
The actual backdoor is triggered by an SSH connection that authenticates with a certificate that includes an RSA public key, part of which is a payload that is checked against a fingerprint, then verified for a correct Ed448 signature with a key only the attacker knows, and then this content is directly executed in a shell spawned by sshd for remote code execution.

The build stage of the backdoor is well hidden. The stage 0 shell snippet looks at first glance like a plausible part of the poorly readable autoconf/automake tooling. The test files that hide the further stages and actual backdoor code are unsuspicious by themselves. 5.6.1 added further tests to abort early on non-Linux platforms, presumably so that nobody examining build problems would stumble over anything suspicious. I think the check for a .deb or .rpm build is intended to inject the backdoor only during automated package building, so people developing or debugging xz would not accidentally discover it in the build directory.

I can identify four commits in the xz Git repository that are related to the backdoor. In chronological order:

2024-02-23 cf44e4b Tests: Add a few test files.
2024-03-09 82ecc53 liblzma: Fix false Valgrind error report with GCC.
2024-03-09 8c9b8b2 liblzma: Fix typos in crc32_fast.c and crc64_fast.c.
2024-03-09 6e63681 Tests: Update two test files.

cf44e4b and 6e63681 directly add and update hidden malicious code.

Aside from its documented change, 82ecc53 introduces an unmotivated whitespace change...

    - return is_arch_extension_supported()
    + return is_arch_extension_supported()

... which is then reverted by 8c9b8b2. The stage 2 script actually relies on matching "return is_arch_extension_supported", so 82ecc53 breaks the backdoor injection and 8c9b8b2 restores it. Maybe a change intended for testing by the malware author accidentally slipped in.
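Why a mere whitespace change can break the injection is easy to show with a literal match. In this minimal Python sketch, only the matched text "return is_arch_extension_supported" comes from the analysis above. The function name, the use of a plain substring test (stage 2 may well match differently), and above all the doubled space in the changed line are assumptions for illustration. The exact whitespace of the real change is not reproduced here.

```python
# Text the stage 2 script is described as matching.
PATTERN = "return is_arch_extension_supported"


def injection_point_found(source_line: str) -> bool:
    """Literal substring match; any whitespace difference defeats it."""
    return PATTERN in source_line


if __name__ == "__main__":
    original = "\treturn is_arch_extension_supported()"
    # Hypothetical whitespace variant (two spaces after "return").
    changed = "\treturn  is_arch_extension_supported()"
    print(injection_point_found(original))  # True
    print(injection_point_found(changed))   # False
```

A C compiler treats both lines identically, which is why such a change looks harmless in review even though a text-matching script sees two different lines.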
Another malicious commit, entirely unrelated to the backdoor, is

2024-02-26 328c52d Build: Fix Linux Landlock feature test in Autotools and CMake builds.

This introduces a syntax error that breaks Landlock detection when using CMake instead of the autotools build framework, so the Linux sandboxing is disabled in this case. The syntax error is a single period '.' as the first character on an otherwise empty line of C code. That is designed so it will be easily missed. It does not plausibly pass for a typo because no typical editing glitch will leave a '.' character there.

I'm not aware of any clearly malicious commit before 2024-02-23.

I'll conclude this brain dump by pointing out that much of the emerging narrative about this backdoor that you can read all over the net is based on idle speculation and selective interpretation of facts.

-- 
Christian "naddy" Weisgerber
na...@mips.inka.de