https://github.com/zyedidia created https://github.com/llvm/llvm-project/pull/189569
This PR introduces an x86-64 backend for Lightweight Fault Isolation (LFI), similar to the one being developed for AArch64 (see #167061 for the initial AArch64 PR). LFI is a compiler-based mechanism that enables efficient in-process sandboxing. See the [RFC](https://discourse.llvm.org/t/rfc-lightweight-fault-isolation-lfi-efficient-native-code-sandboxing-upstream-lfi-target-and-compiler-changes/88380) from last fall for details. This PR adds the `x86_64_lfi` target (similar to `aarch64_lfi`), sets up reserved registers, and implements some initial rewrites for system instructions (system calls and TLS accesses). The rewrites are done at the MC level, using the `MCLFIRewriter` infrastructure. I have updated the documentation to describe the x86-64 sandboxing scheme and to list rewrites that will be implemented in future PRs (to keep each individual PR small). For performance and compatibility reasons, the plan is currently to use bundling for maintaining control-flow integrity in the sandbox, which requires #175830. For now we are setting up the rewrites without using bundling, but we can also use a CFI mechanism based on shadow stack+endbr in order to have something usable while bundling is reviewed. >From d4e56d4c89a5edbe938f9f95a55038670c24d781 Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Mon, 16 Mar 2026 19:26:26 -0400 Subject: [PATCH 1/2] [LFI][X86] Initial LFI X86 rewriter for system instructions --- clang/lib/Basic/Targets/X86.cpp | 4 + llvm/docs/LFI.rst | 597 ++++++++++++++---- llvm/include/llvm/TargetParser/Triple.h | 11 +- llvm/lib/MC/MCLFI.cpp | 4 + .../Target/X86/MCTargetDesc/CMakeLists.txt | 1 + .../X86/MCTargetDesc/X86MCLFIRewriter.cpp | 132 ++++ .../X86/MCTargetDesc/X86MCLFIRewriter.h | 55 ++ .../X86/MCTargetDesc/X86MCTargetDesc.cpp | 14 + llvm/lib/Target/X86/X86ISelLoweringCall.cpp | 5 + llvm/lib/Target/X86/X86RegisterInfo.cpp | 10 + llvm/lib/Target/X86/X86Subtarget.h | 2 + llvm/lib/TargetParser/Triple.cpp | 8 + llvm/test/MC/X86/LFI/sys.s | 12 + llvm/test/MC/X86/LFI/tp.s | 10 + 14 files changed, 748 insertions(+), 117 deletions(-) create mode 100644 llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp create mode 100644 llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h create mode 100644 llvm/test/MC/X86/LFI/sys.s create mode 100644 llvm/test/MC/X86/LFI/tp.s diff --git a/clang/lib/Basic/Targets/X86.cpp b/clang/lib/Basic/Targets/X86.cpp index cb941c94c84a7..39daa2c473317 100644 --- a/clang/lib/Basic/Targets/X86.cpp +++ b/clang/lib/Basic/Targets/X86.cpp @@ -534,6 +534,10 @@ void X86TargetInfo::getTargetDefines(const LangOptions &Opts, DefineStd(Builder, "i386", Opts); } + if (getTriple().isLFI()) { + Builder.defineMacro("__LFI__"); + } + Builder.defineMacro("__SEG_GS"); Builder.defineMacro("__SEG_FS"); Builder.defineMacro("__seg_gs", "__attribute__((address_space(256)))"); diff --git a/llvm/docs/LFI.rst b/llvm/docs/LFI.rst index 65d8b70f17e0b..ffaf3010a4174 100644 --- a/llvm/docs/LFI.rst +++ b/llvm/docs/LFI.rst @@ -38,10 +38,9 @@ runtime), responsible for initializing the sandbox region, loading the program, and servicing system call requests, or other forms of runtime calls. LFI uses an architecture-specific sandboxing scheme based on the general -technique of Software-Based Fault Isolation (SFI). Initial support for LFI in -LLVM is focused on the AArch64 platform, with x86-64 support planned for the -future. The initial version of LFI for AArch64 is designed to support the -Armv8.1 AArch64 architecture. 
+technique of Software-Based Fault Isolation (SFI). LLVM currently supports LFI +for the AArch64 and X86-64 platforms. The AArch64 version is designed to +support the Armv8.1 AArch64 architecture. See `https://github.com/lfi-project <https://github.com/lfi-project/>`__ for details about the LFI project and additional software needed to run LFI @@ -50,28 +49,87 @@ programs. Compiler Requirements +++++++++++++++++++++ -When building for the ``aarch64_lfi`` target, the compiler must restrict use of -the instruction set to a subset of instructions, which are known to be safe -from a sandboxing perspective. To do this, we apply a set of simple rewrites at -the assembly language level to transform standard native AArch64 assembly into -LFI-compatible AArch64 assembly. +When building for an LFI target (``aarch64_lfi`` or ``x86_64_lfi``), the +compiler must restrict use of the instruction set to a subset of instructions, +which are known to be safe from a sandboxing perspective. To do this, we apply a +set of simple rewrites at the assembly language level to transform standard +native assembly into LFI-compatible assembly. These rewrites (also called "expansions") are applied at the very end of the LLVM compilation pipeline (during the assembler step). This allows the rewrites to be applied to hand-written assembly, including inline assembly. +Context Register +++++++++++++++++ + +Both architectures designate a context register that points to a block of +thread-local memory managed by the LFI runtime. The context register is ``x25`` +on AArch64 and ``r15`` on X86-64. The layout is as follows: + ++--------+--------+----------------------------------------------+ +| Offset | Size | Description | ++--------+--------+----------------------------------------------+ +| 0 | 8 | Reserved for future use. | ++--------+--------+----------------------------------------------+ +| 8 | 8 | Reserved for use by the LFI runtime. | ++--------+--------+----------------------------------------------+ +| 16 | 8 | Virtual thread pointer (used for TP access). | ++--------+--------+----------------------------------------------+ + +Linker Support +++++++++++++++ + +In the initial version, LFI only supports static linking, and only supports +creating ``static-pie`` binaries. There is nothing that fundamentally precludes +support for dynamic linking on the LFI target, but such support would require +that the code generated by the linker for PLT entries be slightly modified in +order to conform to the LFI architecture subset. + +Assembler Directives +++++++++++++++++++++ + +The LFI assembler supports the following directives for controlling the +rewriter. + +``.lfi_rewrite_disable`` +======================== + +Disables LFI assembly rewrites for all subsequent instructions, until +``.lfi_rewrite_enable`` is used. This can be useful for hand-written assembly +that is already safe and should not be modified by the rewriter. + +``.lfi_rewrite_enable`` +======================= + +Re-enables LFI assembly rewrites after a previous ``.lfi_rewrite_disable``. + +Example: + +.. code-block:: gas + + .lfi_rewrite_disable + // No rewrites applied here. + ldr x0, [x27, w1, uxtw] + .lfi_rewrite_enable + +AArch64 ++++++++ + +The AArch64 LFI target is ``aarch64_lfi``. + Compiler Options ================ -The LFI target has several configuration options. +The AArch64 LFI target has several configuration options, specified via +``-mattr=``: -* ``+lfi-loads``: enable sandboxing for loads (default: true). 
-* ``+lfi-stores``: enable sandboxing for stores (default: true). +* ``+no-lfi-loads``: Disable sandboxing for load instructions (stores-only mode). +* ``+no-lfi-stores``: Disable sandboxing for store instructions. -Use ``+nolfi-loads`` to create a "stores-only" sandbox that may read, but not +Use ``+no-lfi-loads`` to create a "stores-only" sandbox that may read, but not write, outside the sandbox region. -Use ``+nolfi-loads+nolfi-stores`` to create a "jumps-only" sandbox that may +Use ``+no-lfi-loads,+no-lfi-stores`` to create a "jumps-only" sandbox that may read/write outside the sandbox region but may not transfer control outside (e.g., may not execute system calls directly). This is primarily useful in combination with some other form of memory sandboxing, such as Intel MPK. @@ -79,8 +137,8 @@ combination with some other form of memory sandboxing, such as Intel MPK. Reserved Registers ================== -The LFI target uses a custom ABI that reserves additional registers for the -platform. The registers are listed below, along with the security invariant +The AArch64 LFI target uses a custom ABI that reserves additional registers for +the platform. The registers are listed below, along with the security invariant that must be maintained. * ``x27``: always holds the sandbox base address. @@ -88,16 +146,7 @@ that must be maintained. * ``sp``: always holds an address within the sandbox. * ``x30``: always holds an address within the sandbox. * ``x26``: scratch register. -* ``x25``: points to a thread-local virtual register file for storing runtime context information. - -Linker Support -============== - -In the initial version, LFI only supports static linking, and only supports -creating ``static-pie`` binaries. There is nothing that fundamentally precludes -support for dynamic linking on the LFI target, but such support would require -that the code generated by the linker for PLT entries be slightly modified in -order to conform to the LFI architecture subset. +* ``x25``: context register (see `Context Register`_). Assembly Rewrites ================= @@ -240,73 +289,178 @@ before moving it back into ``sp`` with a safe ``add``. Link register modification ~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When the link register is modified, we write the modified value to a -temporary, before loading it back into ``x30`` with a safe ``add``. - -+-----------------------+----------------------------+ -| Original | Rewritten | -+-----------------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| ldr x30, [...] | ldr x26, [...] | -| | add x30, x27, w26, uxtw | -| | | -+-----------------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| ldp xN, x30, [...] | ldp xN, x26, [...] | -| | add x30, x27, w26, uxtw | -| | | -+-----------------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| ldp x30, xN, [...] | ldp x26, xN, [...] | -| | add x30, x27, w26, uxtw | -| | | -+-----------------------+----------------------------+ +When the link register is modified, the guard is deferred until the next +control flow instruction. This approach maintains compatibility with Pointer +Authentication Code (PAC) instructions by keeping signed pointers intact until +they are needed for control flow. The guard uses ``x30`` as both the source and +destination (``add x30, x27, w30, uxtw``). 
+ ++---------------------------+-------------------------------+ +| Original | Rewritten | ++---------------------------+-------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| ldr x30, [...] | ldr x30, [...] | +| ret | add x30, x27, w30, uxtw | +| | ret | +| | | ++---------------------------+-------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| ldp xN, x30, [...] | ldp xN, x30, [...] | +| ret | add x30, x27, w30, uxtw | +| | ret | +| | | ++---------------------------+-------------------------------+ + +Pointer Authentication Code (PAC) support +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +LFI is designed to be compatible with ARM Pointer Authentication Code (PAC) +instructions. PAC signs and authenticates pointers (typically the return +address in ``x30``) to protect against control-flow hijacking attacks. + +To get the security benefits of PAC with LFI-compiled code, the hardware must +support **FEAT_FPAC** (Faulting PAC), which causes authentication failures to +immediately fault. Without FEAT_FPAC, a failed authentication produces a +"poisoned" pointer that only faults when dereferenced, which may not provide +immediate detection of authentication failures. + +**PACIASP** (sign return address) passes through unchanged. It signs the +current value of ``x30`` using the stack pointer as a modifier, which does not +affect LFI's security guarantees. + +**AUTIASP** (authenticate return address) passes through unchanged. On +processors with FEAT_FPAC, authentication failure automatically faults. + ++-------------------+------------------------+ +| Original | Rewritten | ++-------------------+------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| paciasp | paciasp | +| | | ++-------------------+------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| autiasp | autiasp | +| | | ++-------------------+------------------------+ + +Note that the deferred LR guard approach is essential for PAC compatibility. +If the guard were applied immediately after loading a signed return address, +it would corrupt the PAC signature, causing subsequent ``autiasp`` to fail. +By deferring the guard until control flow, signed pointers remain intact +through the authentication process. + +**Authenticated returns** (``retaa``/``retab``) combine authentication with +return. LFI expands these into their component operations: + ++-------------------+-------------------------------+ +| Original | Rewritten | ++-------------------+-------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| retaa | autiasp | +| | add x30, x27, w30, uxtw | +| | ret | +| | | ++-------------------+-------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| retab | autibsp | +| | add x30, x27, w30, uxtw | +| | ret | +| | | ++-------------------+-------------------------------+ + +**Authenticated branches** (``braa``/``brab``/``braaz``/``brabz``) combine +authentication with indirect branch. LFI expands these by first authenticating +the target register, then performing a normal sandboxed branch: + ++-------------------+-------------------------------+ +| Original | Rewritten | ++-------------------+-------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| braa xN, xM | autia xN, xM | +| | add x28, x27, wN, uxtw | +| | br x28 | +| | | ++-------------------+-------------------------------+ +| .. code-block:: | .. 
code-block:: | +| | | +| braaz xN | autiza xN | +| | add x28, x27, wN, uxtw | +| | br x28 | +| | | ++-------------------+-------------------------------+ + +**Authenticated calls** (``blraa``/``blrab``/``blraaz``/``blrabz``) are +expanded similarly: + ++-------------------+-------------------------------+ +| Original | Rewritten | ++-------------------+-------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| blraa xN, xM | autia xN, xM | +| | add x28, x27, wN, uxtw | +| | blr x28 | +| | | ++-------------------+-------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| blraaz xN | autiza xN | +| | add x28, x27, wN, uxtw | +| | blr x28 | +| | | ++-------------------+-------------------------------+ + +**Authenticated exception returns** (``eretaa``/``eretab``) are not supported +by LFI and will produce an error. System instructions ~~~~~~~~~~~~~~~~~~~ System calls are rewritten into a sequence that loads the address of the first runtime call entrypoint and jumps to it. The runtime call entrypoint table is -stored at the start of the sandbox, so it can be referenced by ``x27``. The -rewrite also saves and restores the link register, since it is used for -branching into the runtime. - -+-----------------+----------------------------+ -| Original | Rewritten | -+-----------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| svc #0 | mov w26, w30 | -| | ldr x30, [x27] | -| | blr x30 | -| | add x30, x27, w26, uxtw | -| | | -+-----------------+----------------------------+ +stored at a negative offset from the sandbox base, so it can be referenced by +``x27``. The rewrite also saves and restores the link register, since it is +used for branching into the runtime. + ++-----------------+------------------------------+ +| Original | Rewritten | ++-----------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| svc #0 | mov x26, x30 | +| | ldur x30, [x27, #-8] | +| | blr x30 | +| | add x30, x27, w26, uxtw | +| | | ++-----------------+------------------------------+ + +Thread pointer +~~~~~~~~~~~~~~ -Thread-local storage -~~~~~~~~~~~~~~~~~~~~ - -TLS accesses are rewritten into accesses offset from ``x25``, which is a -reserved register that points to a virtual register file, with a location for -storing the sandbox's thread pointer. ``TP`` is the offset into that virtual -register file where the thread pointer is stored. - -+----------------------+-----------------------+ -| Original | Rewritten | -+----------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| mrs xN, tpidr_el0 | ldr xN, [x25, #TP] | -| | | -+----------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| mrs tpidr_el0, xN | str xN, [x25, #TP] | -| | | -+----------------------+-----------------------+ +TP accesses are rewritten into loads/stores from the context register +(``x25``), which holds the virtual thread pointer at offset 16 (see +`Context Register`_). + ++----------------------+-------------------------+ +| Original | Rewritten | ++----------------------+-------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| mrs xN, tpidr_el0 | ldr xN, [x25, #16] | +| | | ++----------------------+-------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| msr tpidr_el0, xN | str xN, [x25, #16] | +| | | ++----------------------+-------------------------+ Optimizations ============= @@ -335,22 +489,14 @@ can be removed. 
Address generation ~~~~~~~~~~~~~~~~~~ +**Note**: this optimization has not been implemented. + Addresses to global symbols in position-independent executables are frequently generated via ``adrp`` followed by ``ldr``. Since the address generated by ``adrp`` can be statically guaranteed to be within the sandbox, it is safe to directly target ``x28`` for these sequences. This allows the omission of a guard instruction before the ``ldr``. -+----------------------+-----------------------+ -| Original | Rewritten | -+----------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| adrp xN, target | adrp x28, target | -| ldr xN, [xN, imm] | ldr xN, [x28, imm] | -| | | -+----------------------+-----------------------+ - Stack guard elimination ~~~~~~~~~~~~~~~~~~~~~~~ @@ -398,32 +544,255 @@ In certain cases, guards may be hoisted outside of loops. | | | +-----------------------+-------------------------------+ -Assembler Directives -==================== +X86-64 +++++++ -The LFI assembler supports the following directives for controlling the -rewriter. +The X86-64 LFI target is ``x86_64_lfi``. -``.lfi_rewrite_disable`` -~~~~~~~~~~~~~~~~~~~~~~~~ +Reserved Registers +================== -Disables LFI assembly rewrites for all subsequent instructions, until -``.lfi_rewrite_enable`` is used. This can be useful for hand-written assembly -that is already safe and should not be modified by the rewriter. +The X86-64 LFI target reserves the following registers: -``.lfi_rewrite_enable`` -~~~~~~~~~~~~~~~~~~~~~~~ +* ``r14``: always holds the sandbox base address. Also used as the runtime call + table pointer (the runtime call table is stored at the sandbox base). +* ``gs``: always holds the sandbox base address (used as a segment register for + memory access sandboxing). +* ``rsp``: always holds an address within the sandbox. +* ``r15``: context register (see `Context Register`_). +* ``r11``: scratch register. -Re-enables LFI assembly rewrites after a previous ``.lfi_rewrite_disable``. +Assembly Rewrites +================= -Example: +Terminology +~~~~~~~~~~~ -.. code-block:: gas +In the following assembly rewrites, some shorthand is used. - .lfi_rewrite_disable - // No rewrites applied here. - ldr x0, [x27, w1, uxtw] - .lfi_rewrite_enable +* ``%rN`` or ``%eN``: refers to any general-purpose non-reserved register. +* ``{a,b,c}``: matches any of ``a``, ``b``, or ``c``. + +Instructions placed between ``.bundle_lock`` and ``.bundle_unlock`` directives +must all be placed inside the same bundle. The directive ``.bundle_lock +align_to_end`` ensures that the last instruction in the ``.bundle_lock`` +sequence is placed at the end of the bundle. + +Control flow +~~~~~~~~~~~~ + +**Note**: these rewrites have not been implemented. + +Indirect jumps are rewritten to first apply a mask that zeroes the top 32 bits +and bottom 5 bits of the target. An ``addq`` instruction is then used to fill +in the top 32 bits with the sandbox base. + +Indirect calls are similar, but the call instruction must be placed at the +end of the bundle so that the return address is bundle-aligned. Direct calls +must also be placed at the end of a bundle. + +The addressing mode ``LFI:N(...)`` specifies to apply an LFI addressing mode +transformation (see the Memory accesses section) when rewriting the addressing +mode. + ++------------------+------------------------------+ +| Original | Rewritten | ++------------------+------------------------------+ +| .. code-block:: | .. 
code-block:: | +| | | +| jmpq *%rX | .bundle_lock | +| | andl $0xffffffe0, %eX | +| | addq %r14, %rX | +| | jmpq *%rX | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| jmpq *N(...) | movq LFI:N(...), %r11 | +| | .bundle_lock | +| | andl $0xffffffe0, %r11d | +| | addq %r14, %r11 | +| | jmpq *%r11 | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| callq *%rX | .bundle_lock align_to_end | +| | andl $0xffffffe0, %eX | +| | addq %r14, %r11 | +| | callq *%r11 | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| callq *N(...) | movq LFI:N(...), %r11 | +| | .bundle_lock align_to_end | +| | andl $0xffffffe0, %r11d | +| | addq %r14, %r11 | +| | callq *%r11 | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| ret | popq %r11 | +| | .bundle_lock | +| | andl $0xffffffe0, %r11d | +| | addq %r14, %r11 | +| | jmpq *%r11 | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| call ... | .bundle_lock align_to_end | +| | call ... | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ + +Memory accesses +~~~~~~~~~~~~~~~ + +**Note**: these rewrites have not been implemented. + +Memory accesses are transformed to safe versions by rewriting the addressing +mode. The rewrite prefixes the addressing mode with ``%gs:`` to make the access +relative to the sandbox base. All registers must be changed to the 32-bit form +(``%eX``). + +The stack ``%rsp`` may be accessed directly because it is always guaranteed to +contain a valid sandbox address. ``lea`` instructions do not need rewriting for +their addressing mode since they do not actually perform a memory access. + ++--------------------+--------------------+ +| Original | Rewritten | ++--------------------+--------------------+ +| .. code-block:: | .. code-block:: | +| | | +| lea N(...), ... | lea N(...), ... | +| | | ++--------------------+--------------------+ + ++-------------------+-----------------------+ +| Original | Rewritten | ++-------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| N(%rsp) | N(%rsp) | +| | | ++-------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| N(%rip) | N(%rip) | +| | | ++-------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| N(%rX) | %gs:N(%eX) | +| | | ++-------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| N(%rX, %rY, S) | %gs:N(%eX, %eY, S) | +| | | ++-------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| N(, %rX, S) | N(, %eX, S) | +| | | ++-------------------+-----------------------+ + +String instructions +~~~~~~~~~~~~~~~~~~~ + +**Note**: these rewrites have not been implemented. + +String instructions perform memory accesses using specific registers. Those +registers must be manually guarded before the instruction. + ++-----------------+----------------------------+ +| Original | Rewritten | ++-----------------+----------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| rep? stosq | .bundle_lock | +| | movl %edi, %edi | +| | leaq (%r14, %rdi), %rdi | +| | rep? 
stosq | +| | .bundle_unlock | +| | | ++-----------------+----------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| rep? movsq | .bundle_lock | +| | movl %edi, %edi | +| | leaq (%r14, %rdi), %rdi | +| | movl %esi, %esi | +| | leaq (%r14, %rsi), %rsi | +| | rep? movsq | +| | .bundle_unlock | +| | | ++-----------------+----------------------------+ + +Stack modification +~~~~~~~~~~~~~~~~~~ + +**Note**: these rewrites have not been implemented. + +Since the stack pointer must always contain a valid sandbox address, any +modification to the stack pointer must be rewritten to modify it via ``%esp`` +and then re-guard it with ``leaq (%rsp, %r14), %rsp``. We use this guard form +instead of ``addq %r14, %rsp`` to avoid modifying the flags. + ++------------------+----------------------------+ +| Original | Rewritten | ++------------------+----------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| MOD ..., %rsp | .bundle_lock | +| | MOD ..., %esp | +| | leaq (%rsp, %r14), %rsp | +| | .bundle_unlock | +| | | ++------------------+----------------------------+ + +System instructions +~~~~~~~~~~~~~~~~~~~ + +System calls are rewritten into a sequence that loads the return address into +the scratch register and jumps to the runtime call handler. The runtime call +handler table is stored at the address pointed to by ``r14``. + ++-------------------+-------------------------------+ +| Original | Rewritten | ++-------------------+-------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| syscall | leaq .Ltmp(%rip), %r11 | +| | jmpq *(%r14) | +| | .Ltmp: | +| | | ++-------------------+-------------------------------+ + +Thread pointer +~~~~~~~~~~~~~~ + +The ``movq %fs:0, %rX`` pattern (used for TLS access) is rewritten to load the +virtual thread pointer from the context register (``r15``) at offset 16 (see +`Context Register`_). + ++-----------------------+---------------------------+ +| Original | Rewritten | ++-----------------------+---------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| movq %fs:0, %rX | movq 16(%r15), %rX | +| | | ++-----------------------+---------------------------+ References ++++++++++ diff --git a/llvm/include/llvm/TargetParser/Triple.h b/llvm/include/llvm/TargetParser/Triple.h index 8d238a527b7f1..0ce2b1cd79bd0 100644 --- a/llvm/include/llvm/TargetParser/Triple.h +++ b/llvm/include/llvm/TargetParser/Triple.h @@ -155,6 +155,8 @@ class Triple { AArch64SubArch_arm64ec, AArch64SubArch_lfi, + X8664SubArch_lfi, + KalimbaSubArch_v3, KalimbaSubArch_v4, KalimbaSubArch_v5, @@ -960,8 +962,10 @@ class Triple { /// Tests whether the target is LFI. bool isLFI() const { - return getArch() == Triple::aarch64 && - getSubArch() == Triple::AArch64SubArch_lfi; + return (getArch() == Triple::aarch64 && + getSubArch() == Triple::AArch64SubArch_lfi) || + (getArch() == Triple::x86_64 && + getSubArch() == Triple::X8664SubArch_lfi); } /// Tests whether the target supports the EHABI exception @@ -1231,7 +1235,8 @@ class Triple { /// True if the target uses TLSDESC by default. bool hasDefaultTLSDESC() const { - return isAArch64() || (isAndroid() && isRISCV64()) || isOSFuchsia(); + return isAArch64() || (isAndroid() && isRISCV64()) || isOSFuchsia() || + (isX86() && isLFI()); } /// Tests whether the target uses -data-sections as default. 
diff --git a/llvm/lib/MC/MCLFI.cpp b/llvm/lib/MC/MCLFI.cpp index 2d0d1caec1430..7b39964dfae86 100644 --- a/llvm/lib/MC/MCLFI.cpp +++ b/llvm/lib/MC/MCLFI.cpp @@ -41,6 +41,10 @@ void initializeLFIMCStreamer(MCStreamer &Streamer, MCContext &Ctx, NoteName = ".note.LFI.ABI.aarch64"; NoteArch = "aarch64"; break; + case Triple::x86_64: + NoteName = ".note.LFI.ABI.x86_64"; + NoteArch = "x86_64"; + break; default: reportFatalUsageError("Unsupported architecture for LFI"); } diff --git a/llvm/lib/Target/X86/MCTargetDesc/CMakeLists.txt b/llvm/lib/Target/X86/MCTargetDesc/CMakeLists.txt index f2e7d43fc17f6..94d11cb1573be 100644 --- a/llvm/lib/Target/X86/MCTargetDesc/CMakeLists.txt +++ b/llvm/lib/Target/X86/MCTargetDesc/CMakeLists.txt @@ -9,6 +9,7 @@ add_llvm_component_library(LLVMX86Desc X86MCTargetDesc.cpp X86MCAsmInfo.cpp X86MCCodeEmitter.cpp + X86MCLFIRewriter.cpp X86MachObjectWriter.cpp X86MnemonicTables.cpp X86ELFObjectWriter.cpp diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp new file mode 100644 index 0000000000000..6bf290dceae08 --- /dev/null +++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp @@ -0,0 +1,132 @@ +//===- X86MCLFIRewriter.cpp -------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the X86MCLFIRewriter class, which rewrites X86-64 +// instructions for LFI (Lightweight Fault Isolation) sandboxing. +// +//===----------------------------------------------------------------------===// + +#include "X86MCLFIRewriter.h" +#include "X86BaseInfo.h" +#include "X86MCTargetDesc.h" +#include "llvm/MC/MCContext.h" +#include "llvm/MC/MCExpr.h" +#include "llvm/MC/MCInst.h" +#include "llvm/MC/MCStreamer.h" +#include "llvm/MC/MCSubtargetInfo.h" + +using namespace llvm; + +// LFI reserved registers. +static constexpr MCRegister LFIBaseReg = X86::R14; +static constexpr MCRegister LFIScratchReg = X86::R11; +static constexpr MCRegister LFITPReg = X86::R15; + +// Byte offset into the context register file (pointed to by R15) where the +// thread pointer is stored. 
+static constexpr int TPOffset = 16; + +static bool isSyscall(const MCInst &Inst) { + return Inst.getOpcode() == X86::SYSCALL; +} + +static bool isTPRead(const MCInst &Inst) { + // Match movq %fs:0, %rX + return Inst.getOpcode() == X86::MOV64rm && + Inst.getOperand(1).getReg() == X86::NoRegister && + Inst.getOperand(2).isImm() && Inst.getOperand(2).getImm() == 1 && + Inst.getOperand(3).getReg() == X86::NoRegister && + Inst.getOperand(4).isImm() && Inst.getOperand(4).getImm() == 0 && + Inst.getOperand(5).getReg() == X86::FS; +} + +// syscall +// -> +// leaq .Ltmp(%rip), %r11 +// jmpq *(%r14) +// .Ltmp: +void X86::X86MCLFIRewriter::emitLFICall(MCStreamer &Out, + const MCSubtargetInfo &STI) { + MCSymbol *Symbol = Out.getContext().createTempSymbol(); + + // leaq .Ltmp(%rip), %r11 + MCInst Lea; + Lea.setOpcode(X86::LEA64r); + Lea.addOperand(MCOperand::createReg(LFIScratchReg)); + Lea.addOperand(MCOperand::createReg(X86::RIP)); + Lea.addOperand(MCOperand::createImm(1)); + Lea.addOperand(MCOperand::createReg(X86::NoRegister)); + Lea.addOperand( + MCOperand::createExpr(MCSymbolRefExpr::create(Symbol, Out.getContext()))); + Lea.addOperand(MCOperand::createReg(X86::NoRegister)); + Out.emitInstruction(Lea, STI); + + // jmpq *(%r14) + MCInst Jmp; + Jmp.setOpcode(X86::JMP64m); + Jmp.addOperand(MCOperand::createReg(LFIBaseReg)); + Jmp.addOperand(MCOperand::createImm(1)); + Jmp.addOperand(MCOperand::createReg(X86::NoRegister)); + Jmp.addOperand(MCOperand::createImm(0)); + Jmp.addOperand(MCOperand::createReg(X86::NoRegister)); + Out.emitInstruction(Jmp, STI); + + Out.emitLabel(Symbol); +} + +void X86::X86MCLFIRewriter::expandSyscall(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + emitLFICall(Out, STI); +} + +// movq %fs:0, %rX +// -> +// movq TPOffset(%r15), %rX +void X86::X86MCLFIRewriter::expandTPRead(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + MCRegister DestReg = Inst.getOperand(0).getReg(); + + MCInst Mov; + Mov.setOpcode(X86::MOV64rm); + Mov.addOperand(MCOperand::createReg(DestReg)); + Mov.addOperand(MCOperand::createReg(LFITPReg)); // Base + Mov.addOperand(MCOperand::createImm(1)); // Scale + Mov.addOperand(MCOperand::createReg(X86::NoRegister)); // Index + Mov.addOperand(MCOperand::createImm(TPOffset)); // Displacement + Mov.addOperand(MCOperand::createReg(X86::NoRegister)); // Segment + Out.emitInstruction(Mov, STI); +} + +void X86::X86MCLFIRewriter::doRewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + if (mayModifyRegister(Inst, LFIBaseReg) || mayModifyRegister(Inst, LFITPReg)) + return error(Inst, "illegal modification of reserved LFI register"); + + if (isSyscall(Inst)) + return expandSyscall(Inst, Out, STI); + + if (isTPRead(Inst)) + return expandTPRead(Inst, Out, STI); + + // Pass through all other instructions unchanged. + Out.emitInstruction(Inst, STI); +} + +bool X86::X86MCLFIRewriter::rewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + // The guard prevents rewrite-recursion when we emit instructions from inside + // the rewriter (such instructions should not be rewritten). 
+ if (!Enabled || Guard) + return false; + Guard = true; + + doRewriteInst(Inst, Out, STI); + + Guard = false; + return true; +} diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h new file mode 100644 index 0000000000000..59d3ac24a4b58 --- /dev/null +++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h @@ -0,0 +1,55 @@ +//===- X86MCLFIRewriter.h ---------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file declares the X86MCLFIRewriter class, the X86 specific +// subclass of MCLFIRewriter. +// +//===----------------------------------------------------------------------===// +#ifndef LLVM_LIB_TARGET_X86_MCTARGETDESC_X86MCLFIREWRITER_H +#define LLVM_LIB_TARGET_X86_MCTARGETDESC_X86MCLFIREWRITER_H + +#include "llvm/MC/MCInstrInfo.h" +#include "llvm/MC/MCLFIRewriter.h" +#include "llvm/MC/MCRegisterInfo.h" + +namespace llvm { +class MCContext; +class MCInst; +class MCStreamer; +class MCSubtargetInfo; + +namespace X86 { + +class X86MCLFIRewriter : public MCLFIRewriter { +public: + X86MCLFIRewriter(MCContext &Ctx, std::unique_ptr<MCRegisterInfo> &&RI, + std::unique_ptr<MCInstrInfo> &&II) + : MCLFIRewriter(Ctx, std::move(RI), std::move(II)) {} + + bool rewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) override; + +private: + /// Recursion guard to prevent infinite loops when emitting instructions. + bool Guard = false; + + void doRewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + + void expandSyscall(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + + void expandTPRead(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + + void emitLFICall(MCStreamer &Out, const MCSubtargetInfo &STI); +}; + +} // namespace X86 +} // namespace llvm +#endif // LLVM_LIB_TARGET_X86_MCTARGETDESC_X86MCLFIREWRITER_H diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp index 0c874b7e6d674..97a4736f869c1 100644 --- a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp +++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp @@ -16,6 +16,7 @@ #include "X86BaseInfo.h" #include "X86IntelInstPrinter.h" #include "X86MCAsmInfo.h" +#include "X86MCLFIRewriter.h" #include "X86TargetStreamer.h" #include "llvm-c/Visibility.h" #include "llvm/ADT/APInt.h" @@ -698,6 +699,16 @@ static MCInstrAnalysis *createX86MCInstrAnalysis(const MCInstrInfo *Info) { return new X86_MC::X86MCInstrAnalysis(Info); } +static MCLFIRewriter * +createX86MCLFIRewriter(MCStreamer &S, std::unique_ptr<MCRegisterInfo> &&RegInfo, + std::unique_ptr<MCInstrInfo> &&InstInfo) { + auto RW = std::make_unique<X86::X86MCLFIRewriter>( + S.getContext(), std::move(RegInfo), std::move(InstInfo)); + auto *Ptr = RW.get(); + S.setLFIRewriter(std::move(RW)); + return Ptr; +} + // Force static initialization. extern "C" LLVM_C_ABI void LLVMInitializeX86TargetMC() { for (Target *T : {&getTheX86_32Target(), &getTheX86_64Target()}) { @@ -720,6 +731,9 @@ extern "C" LLVM_C_ABI void LLVMInitializeX86TargetMC() { // Register the code emitter. TargetRegistry::RegisterMCCodeEmitter(*T, createX86MCCodeEmitter); + // Register the LFI rewriter. 
+ TargetRegistry::RegisterMCLFIRewriter(*T, createX86MCLFIRewriter); + // Register the obj target streamer. TargetRegistry::RegisterObjectTargetStreamer(*T, createX86ObjectTargetStreamer); diff --git a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp index 37c80e27f4bd2..97663e7e98664 100644 --- a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp +++ b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp @@ -2973,6 +2973,11 @@ bool X86TargetLowering::isEligibleForSiblingCallOpt( if (IsCalleeWin64 != IsCallerWin64) return false; + // Do not optimize vararg calls with 6 arguments for LFI since LFI reserves + // %r11, meaning there will not be enough registers available. + if (Subtarget.isLFI() && ArgLocs.size() > 5) + return false; + // If we are using a GOT, don't generate sibling calls to non-local, // default-visibility symbols. Tail calling such a symbol requires using a GOT // relocation, which forces early binding of the symbol. This breaks code that diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp index 83dd6ea287e83..9eac7441f07d5 100644 --- a/llvm/lib/Target/X86/X86RegisterInfo.cpp +++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp @@ -617,6 +617,16 @@ BitVector X86RegisterInfo::getReservedRegs(const MachineFunction &MF) const { Reserved.set(*AI); } + // Reserve registers for LFI sandboxing. + if (MF.getSubtarget<X86Subtarget>().isLFI()) { + for (MCRegAliasIterator AI(X86::R11, this, true); AI.isValid(); ++AI) + Reserved.set(*AI); + for (MCRegAliasIterator AI(X86::R14, this, true); AI.isValid(); ++AI) + Reserved.set(*AI); + for (MCRegAliasIterator AI(X86::R15, this, true); AI.isValid(); ++AI) + Reserved.set(*AI); + } + assert(checkAllSuperRegsMarked(Reserved, {X86::SIL, X86::DIL, X86::BPL, X86::SPL, X86::SIH, X86::DIH, X86::BPH, X86::SPH})); diff --git a/llvm/lib/Target/X86/X86Subtarget.h b/llvm/lib/Target/X86/X86Subtarget.h index 692c7938ddc00..4ef765755ace9 100644 --- a/llvm/lib/Target/X86/X86Subtarget.h +++ b/llvm/lib/Target/X86/X86Subtarget.h @@ -309,6 +309,8 @@ class X86Subtarget final : public X86GenSubtargetInfo { bool isTargetMCU() const { return TargetTriple.isOSIAMCU(); } bool isTargetFuchsia() const { return TargetTriple.isOSFuchsia(); } + bool isLFI() const { return TargetTriple.isLFI(); } + bool isTargetWindowsMSVC() const { return TargetTriple.isWindowsMSVCEnvironment(); } diff --git a/llvm/lib/TargetParser/Triple.cpp b/llvm/lib/TargetParser/Triple.cpp index c80cee39989be..ac53683083797 100644 --- a/llvm/lib/TargetParser/Triple.cpp +++ b/llvm/lib/TargetParser/Triple.cpp @@ -119,6 +119,10 @@ StringRef Triple::getArchName(ArchType Kind, SubArchType SubArch) { if (SubArch == AArch64SubArch_lfi) return "aarch64_lfi"; break; + case Triple::x86_64: + if (SubArch == X8664SubArch_lfi) + return "x86_64_lfi"; + break; case Triple::spirv: switch (SubArch) { case Triple::SPIRVSubArch_v10: @@ -599,6 +603,7 @@ static Triple::ArchType parseArch(StringRef ArchName) { // FIXME: Do we need to support these? 
.Cases({"i786", "i886", "i986"}, Triple::x86) .Cases({"amd64", "x86_64", "x86_64h"}, Triple::x86_64) + .Case("x86_64_lfi", Triple::x86_64) .Cases({"powerpc", "powerpcspe", "ppc", "ppc32"}, Triple::ppc) .Cases({"powerpcle", "ppcle", "ppc32le"}, Triple::ppcle) .Cases({"powerpc64", "ppu", "ppc64"}, Triple::ppc64) @@ -856,6 +861,9 @@ static Triple::SubArchType parseSubArch(StringRef SubArchName) { if (SubArchName == "aarch64_lfi") return Triple::AArch64SubArch_lfi; + if (SubArchName == "x86_64_lfi") + return Triple::X8664SubArch_lfi; + if (SubArchName.starts_with("spirv")) return StringSwitch<Triple::SubArchType>(SubArchName) .EndsWith("v1.0", Triple::SPIRVSubArch_v10) diff --git a/llvm/test/MC/X86/LFI/sys.s b/llvm/test/MC/X86/LFI/sys.s new file mode 100644 index 0000000000000..0a41f7f7e6b36 --- /dev/null +++ b/llvm/test/MC/X86/LFI/sys.s @@ -0,0 +1,12 @@ +// RUN: llvm-mc -triple x86_64_lfi %s | FileCheck %s + +syscall +// CHECK: leaq .Ltmp0(%rip), %r11 +// CHECK-NEXT: jmpq *(%r14) +// CHECK-NEXT: .Ltmp0: + +movq %fs:0, %rax +// CHECK: movq 16(%r15), %rax + +movq %fs:0, %rdi +// CHECK: movq 16(%r15), %rdi diff --git a/llvm/test/MC/X86/LFI/tp.s b/llvm/test/MC/X86/LFI/tp.s new file mode 100644 index 0000000000000..85f70c2b6b162 --- /dev/null +++ b/llvm/test/MC/X86/LFI/tp.s @@ -0,0 +1,10 @@ +// RUN: llvm-mc -triple x86_64_lfi %s | FileCheck %s + +movq %fs:0, %rax +// CHECK: movq 16(%r15), %rax + +movq %fs:0, %rdi +// CHECK: movq 16(%r15), %rdi + +movq %fs:0, %rcx +// CHECK: movq 16(%r15), %rcx >From 3191d48ec2bfc3dc9b56e31784fb1140b2b064a1 Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Tue, 31 Mar 2026 01:34:06 -0700 Subject: [PATCH 2/2] Call initSections to allow tests to run This call will be removed once a separate fix is applied. --- llvm/tools/llvm-mc/llvm-mc.cpp | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/llvm/tools/llvm-mc/llvm-mc.cpp b/llvm/tools/llvm-mc/llvm-mc.cpp index 3763ce0ea974c..6d277527e241d 100644 --- a/llvm/tools/llvm-mc/llvm-mc.cpp +++ b/llvm/tools/llvm-mc/llvm-mc.cpp @@ -632,8 +632,12 @@ int main(int argc, char **argv) { std::move(CE), std::move(MAB))); Triple T(TripleName); - if (T.isLFI()) + if (T.isLFI()) { + // TODO: Do not merge this change. This is a temporary fix until #188625 + // is merged. + Str->initSections(*STI); initializeLFIMCStreamer(*Str.get(), Ctx, T); + } } else if (FileType == OFT_Null) { Str.reset(TheTarget->createNullStreamer(Ctx)); } else { _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
