https://github.com/zyedidia updated https://github.com/llvm/llvm-project/pull/189569
>From 0de48827f8abb2ada0ddb489366427459aad5063 Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Mon, 16 Mar 2026 19:26:26 -0400 Subject: [PATCH 01/10] [LFI][X86] Initial LFI X86 rewriter for system instructions --- clang/lib/Basic/Targets/X86.cpp | 4 + llvm/docs/LFI.rst | 284 ++++++++++++++++-- llvm/include/llvm/TargetParser/Triple.h | 11 +- llvm/lib/MC/MCLFI.cpp | 4 + .../Target/X86/MCTargetDesc/CMakeLists.txt | 1 + .../X86/MCTargetDesc/X86MCLFIRewriter.cpp | 132 ++++++++ .../X86/MCTargetDesc/X86MCLFIRewriter.h | 55 ++++ .../X86/MCTargetDesc/X86MCTargetDesc.cpp | 14 + llvm/lib/Target/X86/X86ISelLoweringCall.cpp | 5 + llvm/lib/Target/X86/X86RegisterInfo.cpp | 10 + llvm/lib/Target/X86/X86Subtarget.h | 2 + llvm/lib/TargetParser/Triple.cpp | 8 + llvm/test/MC/X86/LFI/sys.s | 12 + llvm/test/MC/X86/LFI/tp.s | 10 + 14 files changed, 528 insertions(+), 24 deletions(-) create mode 100644 llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp create mode 100644 llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h create mode 100644 llvm/test/MC/X86/LFI/sys.s create mode 100644 llvm/test/MC/X86/LFI/tp.s diff --git a/clang/lib/Basic/Targets/X86.cpp b/clang/lib/Basic/Targets/X86.cpp index 60c001a826078..fa570a63761fd 100644 --- a/clang/lib/Basic/Targets/X86.cpp +++ b/clang/lib/Basic/Targets/X86.cpp @@ -534,6 +534,10 @@ void X86TargetInfo::getTargetDefines(const LangOptions &Opts, DefineStd(Builder, "i386", Opts); } + if (getTriple().isLFI()) { + Builder.defineMacro("__LFI__"); + } + Builder.defineMacro("__SEG_GS"); Builder.defineMacro("__SEG_FS"); Builder.defineMacro("__seg_gs", "__attribute__((address_space(256)))"); diff --git a/llvm/docs/LFI.rst b/llvm/docs/LFI.rst index a173cd57e0ee0..1c045a65c2dc9 100644 --- a/llvm/docs/LFI.rst +++ b/llvm/docs/LFI.rst @@ -38,10 +38,9 @@ runtime), responsible for initializing the sandbox region, loading the program, and servicing system call requests, or other forms of runtime calls. 
LFI uses an architecture-specific sandboxing scheme based on the general -technique of Software-Based Fault Isolation (SFI). Initial support for LFI in -LLVM is focused on the AArch64 platform, with x86-64 support planned for the -future. The initial version of LFI for AArch64 is designed to support the -Armv8.1 AArch64 architecture. +technique of Software-Based Fault Isolation (SFI). LLVM currently supports LFI +for the AArch64 and X86-64 platforms. The AArch64 version is designed to +support the Armv8.1 AArch64 architecture. See `https://github.com/lfi-project <https://github.com/lfi-project/>`__ for details about the LFI project and additional software needed to run LFI @@ -50,11 +49,11 @@ programs. Compiler Requirements +++++++++++++++++++++ -When building for the ``aarch64_lfi`` target, the compiler must restrict use of -the instruction set to a subset of instructions, which are known to be safe -from a sandboxing perspective. To do this, we apply a set of simple rewrites at -the assembly language level to transform standard native AArch64 assembly into -LFI-compatible AArch64 assembly. +When building for an LFI target (``aarch64_lfi`` or ``x86_64_lfi``), the +compiler must restrict use of the instruction set to a subset of instructions, +which are known to be safe from a sandboxing perspective. To do this, we apply a +set of simple rewrites at the assembly language level to transform standard +native assembly into LFI-compatible assembly. These rewrites (also called "expansions") are applied at the very end of the LLVM compilation pipeline (during the assembler step). This allows the rewrites @@ -302,8 +301,11 @@ before moving it back into ``sp`` with a safe ``add``. Link register modification ~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When the link register is modified, we write the modified value to a -temporary, before loading it back into ``x30`` with a safe ``add``. +When the link register is modified, the guard is deferred until the next +control flow instruction. 
This approach maintains compatibility with Pointer +Authentication Code (PAC) instructions by keeping signed pointers intact until +they are needed for control flow. The guard uses ``x30`` as both the source and +destination (``add x30, x27, w30, uxtw``). +---------------------------+-------------------------------+ | Original | Rewritten | @@ -402,16 +404,6 @@ generated via ``adrp`` followed by ``ldr``. Since the address generated by directly target ``x28`` for these sequences. This allows the omission of a guard instruction before the ``ldr``. -+----------------------+-----------------------+ -| Original | Rewritten | -+----------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| adrp xN, target | adrp x28, target | -| ldr xN, [xN, imm] | ldr xN, [x28, imm] | -| | | -+----------------------+-----------------------+ - Stack guard elimination ~~~~~~~~~~~~~~~~~~~~~~~ @@ -459,6 +451,256 @@ In certain cases, guards may be hoisted outside of loops. | | | +-----------------------+-------------------------------+ +X86-64 +++++++ + +The X86-64 LFI target is ``x86_64_lfi``. + +Reserved Registers +================== + +The X86-64 LFI target reserves the following registers: + +* ``r14``: always holds the sandbox base address. Also used as the runtime call + table pointer (the runtime call table is stored at the sandbox base). +* ``gs``: always holds the sandbox base address (used as a segment register for + memory access sandboxing). +* ``rsp``: always holds an address within the sandbox. +* ``r15``: context register (see `Context Register`_). +* ``r11``: scratch register. + +Assembly Rewrites +================= + +Terminology +~~~~~~~~~~~ + +In the following assembly rewrites, some shorthand is used. + +* ``%rN`` or ``%eN``: refers to any general-purpose non-reserved register. +* ``{a,b,c}``: matches any of ``a``, ``b``, or ``c``. 
+ +Instructions placed between ``.bundle_lock`` and ``.bundle_unlock`` directives +must all be placed inside the same bundle. The directive ``.bundle_lock +align_to_end`` ensures that the last instruction in the ``.bundle_lock`` +sequence is placed at the end of the bundle. + +Control flow +~~~~~~~~~~~~ + +**Note**: these rewrites have not been implemented. + +Indirect jumps are rewritten to first apply a mask that zeroes the top 32 bits +and bottom 5 bits of the target. An ``addq`` instruction is then used to fill +in the top 32 bits with the sandbox base. + +Indirect calls are similar, but the call instruction must be placed at the +end of the bundle so that the return address is bundle-aligned. Direct calls +must also be placed at the end of a bundle. + +The addressing mode ``LFI:N(...)`` specifies to apply an LFI addressing mode +transformation (see the Memory accesses section) when rewriting the addressing +mode. + ++------------------+------------------------------+ +| Original | Rewritten | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| jmpq *%rX | .bundle_lock | +| | andl $0xffffffe0, %eX | +| | addq %r14, %rX | +| | jmpq *%rX | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| jmpq *N(...) | movq LFI:N(...), %r11 | +| | .bundle_lock | +| | andl $0xffffffe0, %r11d | +| | addq %r14, %r11 | +| | jmpq *%r11 | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| callq *%rX | .bundle_lock align_to_end | +| | andl $0xffffffe0, %eX | +| | addq %r14, %rX | +| | callq *%rX | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| callq *N(...)
| movq LFI:N(...), %r11 | +| | .bundle_lock align_to_end | +| | andl $0xffffffe0, %r11d | +| | addq %r14, %r11 | +| | callq *%r11 | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| ret | popq %r11 | +| | .bundle_lock | +| | andl $0xffffffe0, %r11d | +| | addq %r14, %r11 | +| | jmpq *%r11 | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| call ... | .bundle_lock align_to_end | +| | call ... | +| | .bundle_unlock | +| | | ++------------------+------------------------------+ + +Memory accesses +~~~~~~~~~~~~~~~ + +**Note**: these rewrites have not been implemented. + +Memory accesses are transformed to safe versions by rewriting the addressing +mode. The rewrite prefixes the addressing mode with ``%gs:`` to make the access +relative to the sandbox base. All registers must be changed to the 32-bit form +(``%eX``). + +The stack ``%rsp`` may be accessed directly because it is always guaranteed to +contain a valid sandbox address. ``lea`` instructions do not need rewriting for +their addressing mode since they do not actually perform a memory access. + ++--------------------+--------------------+ +| Original | Rewritten | ++--------------------+--------------------+ +| .. code-block:: | .. code-block:: | +| | | +| lea N(...), ... | lea N(...), ... | +| | | ++--------------------+--------------------+ + ++-------------------+-----------------------+ +| Original | Rewritten | ++-------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| N(%rsp) | N(%rsp) | +| | | ++-------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| N(%rip) | N(%rip) | +| | | ++-------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| N(%rX) | %gs:N(%eX) | +| | | ++-------------------+-----------------------+ +| .. code-block:: | .. 
code-block:: | +| | | +| N(%rX, %rY, S) | %gs:N(%eX, %eY, S) | +| | | ++-------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| N(, %rX, S) | %gs:N(, %eX, S) | +| | | ++-------------------+-----------------------+ + +String instructions +~~~~~~~~~~~~~~~~~~~ + +**Note**: these rewrites have not been implemented. + +String instructions perform memory accesses using specific registers. Those +registers must be manually guarded before the instruction. + ++-----------------+----------------------------+ +| Original | Rewritten | ++-----------------+----------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| rep? stosq | .bundle_lock | +| | movl %edi, %edi | +| | leaq (%r14, %rdi), %rdi | +| | rep? stosq | +| | .bundle_unlock | +| | | ++-----------------+----------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| rep? movsq | .bundle_lock | +| | movl %edi, %edi | +| | leaq (%r14, %rdi), %rdi | +| | movl %esi, %esi | +| | leaq (%r14, %rsi), %rsi | +| | rep? movsq | +| | .bundle_unlock | +| | | ++-----------------+----------------------------+ + +Stack modification +~~~~~~~~~~~~~~~~~~ + +**Note**: these rewrites have not been implemented. + +Since the stack pointer must always contain a valid sandbox address, any +modification to the stack pointer must be rewritten to modify it via ``%esp`` +and then re-guard it with ``leaq (%rsp, %r14), %rsp``. We use this guard form +instead of ``addq %r14, %rsp`` to avoid modifying the flags. + ++------------------+----------------------------+ +| Original | Rewritten | ++------------------+----------------------------+ +| .. code-block:: | ..
code-block:: | +| | | +| MOD ..., %rsp | .bundle_lock | +| | MOD ..., %esp | +| | leaq (%rsp, %r14), %rsp | +| | .bundle_unlock | +| | | ++------------------+----------------------------+ + +System instructions +~~~~~~~~~~~~~~~~~~~ + +System calls are rewritten into a sequence that loads the return address into +the scratch register and jumps to the runtime call handler. The runtime call +handler table is stored at the address pointed to by ``r14``. + ++-------------------+-------------------------------+ +| Original | Rewritten | ++-------------------+-------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| syscall | leaq .Ltmp(%rip), %r11 | +| | jmpq *(%r14) | +| | .Ltmp: | +| | | ++-------------------+-------------------------------+ + +Thread pointer +~~~~~~~~~~~~~~ + +The ``movq %fs:0, %rX`` pattern (used for TLS access) is rewritten to load the +virtual thread pointer from the context register (``r15``) at offset 16 (see +`Context Register`_). + ++-----------------------+---------------------------+ +| Original | Rewritten | ++-----------------------+---------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| movq %fs:0, %rX | movq 16(%r15), %rX | +| | | ++-----------------------+---------------------------+ + References ++++++++++ diff --git a/llvm/include/llvm/TargetParser/Triple.h b/llvm/include/llvm/TargetParser/Triple.h index 7b24db121818f..4a029e1cdc5a4 100644 --- a/llvm/include/llvm/TargetParser/Triple.h +++ b/llvm/include/llvm/TargetParser/Triple.h @@ -159,6 +159,8 @@ class Triple { AArch64SubArch_arm64ec, AArch64SubArch_lfi, + X8664SubArch_lfi, + KalimbaSubArch_v3, KalimbaSubArch_v4, KalimbaSubArch_v5, @@ -921,8 +923,10 @@ class Triple { /// Tests whether the target is LFI. 
bool isLFI() const { - return getArch() == Triple::aarch64 && - getSubArch() == Triple::AArch64SubArch_lfi; + return (getArch() == Triple::aarch64 && + getSubArch() == Triple::AArch64SubArch_lfi) || + (getArch() == Triple::x86_64 && + getSubArch() == Triple::X8664SubArch_lfi); } /// Tests whether the target supports the EHABI exception @@ -1190,7 +1194,8 @@ class Triple { /// True if the target uses TLSDESC by default. bool hasDefaultTLSDESC() const { - return isAArch64() || (isAndroid() && isRISCV64()) || isOSFuchsia(); + return isAArch64() || (isAndroid() && isRISCV64()) || isOSFuchsia() || + (isX86() && isLFI()); } /// Tests whether the target uses -data-sections as default. diff --git a/llvm/lib/MC/MCLFI.cpp b/llvm/lib/MC/MCLFI.cpp index a5c65540044b6..7b149f4b53b4b 100644 --- a/llvm/lib/MC/MCLFI.cpp +++ b/llvm/lib/MC/MCLFI.cpp @@ -61,6 +61,10 @@ void emitLFINoteSection(MCStreamer &Streamer, MCContext &Ctx) { NoteName = ".note.LFI.ABI.aarch64"; NoteArch = "aarch64"; break; + case Triple::x86_64: + NoteName = ".note.LFI.ABI.x86_64"; + NoteArch = "x86_64"; + break; default: reportFatalUsageError("Unsupported architecture for LFI"); } diff --git a/llvm/lib/Target/X86/MCTargetDesc/CMakeLists.txt b/llvm/lib/Target/X86/MCTargetDesc/CMakeLists.txt index f2e7d43fc17f6..94d11cb1573be 100644 --- a/llvm/lib/Target/X86/MCTargetDesc/CMakeLists.txt +++ b/llvm/lib/Target/X86/MCTargetDesc/CMakeLists.txt @@ -9,6 +9,7 @@ add_llvm_component_library(LLVMX86Desc X86MCTargetDesc.cpp X86MCAsmInfo.cpp X86MCCodeEmitter.cpp + X86MCLFIRewriter.cpp X86MachObjectWriter.cpp X86MnemonicTables.cpp X86ELFObjectWriter.cpp diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp new file mode 100644 index 0000000000000..6bf290dceae08 --- /dev/null +++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp @@ -0,0 +1,132 @@ +//===- X86MCLFIRewriter.cpp -------------------------------------*- C++ -*-===// +// +// Part of the LLVM 
Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the X86MCLFIRewriter class, which rewrites X86-64 +// instructions for LFI (Lightweight Fault Isolation) sandboxing. +// +//===----------------------------------------------------------------------===// + +#include "X86MCLFIRewriter.h" +#include "X86BaseInfo.h" +#include "X86MCTargetDesc.h" +#include "llvm/MC/MCContext.h" +#include "llvm/MC/MCExpr.h" +#include "llvm/MC/MCInst.h" +#include "llvm/MC/MCStreamer.h" +#include "llvm/MC/MCSubtargetInfo.h" + +using namespace llvm; + +// LFI reserved registers. +static constexpr MCRegister LFIBaseReg = X86::R14; +static constexpr MCRegister LFIScratchReg = X86::R11; +static constexpr MCRegister LFITPReg = X86::R15; + +// Byte offset into the context register file (pointed to by R15) where the +// thread pointer is stored. 
+static constexpr int TPOffset = 16; + +static bool isSyscall(const MCInst &Inst) { + return Inst.getOpcode() == X86::SYSCALL; +} + +static bool isTPRead(const MCInst &Inst) { + // Match movq %fs:0, %rX + return Inst.getOpcode() == X86::MOV64rm && + Inst.getOperand(1).getReg() == X86::NoRegister && + Inst.getOperand(2).isImm() && Inst.getOperand(2).getImm() == 1 && + Inst.getOperand(3).getReg() == X86::NoRegister && + Inst.getOperand(4).isImm() && Inst.getOperand(4).getImm() == 0 && + Inst.getOperand(5).getReg() == X86::FS; +} + +// syscall +// -> +// leaq .Ltmp(%rip), %r11 +// jmpq *(%r14) +// .Ltmp: +void X86::X86MCLFIRewriter::emitLFICall(MCStreamer &Out, + const MCSubtargetInfo &STI) { + MCSymbol *Symbol = Out.getContext().createTempSymbol(); + + // leaq .Ltmp(%rip), %r11 + MCInst Lea; + Lea.setOpcode(X86::LEA64r); + Lea.addOperand(MCOperand::createReg(LFIScratchReg)); + Lea.addOperand(MCOperand::createReg(X86::RIP)); + Lea.addOperand(MCOperand::createImm(1)); + Lea.addOperand(MCOperand::createReg(X86::NoRegister)); + Lea.addOperand( + MCOperand::createExpr(MCSymbolRefExpr::create(Symbol, Out.getContext()))); + Lea.addOperand(MCOperand::createReg(X86::NoRegister)); + Out.emitInstruction(Lea, STI); + + // jmpq *(%r14) + MCInst Jmp; + Jmp.setOpcode(X86::JMP64m); + Jmp.addOperand(MCOperand::createReg(LFIBaseReg)); + Jmp.addOperand(MCOperand::createImm(1)); + Jmp.addOperand(MCOperand::createReg(X86::NoRegister)); + Jmp.addOperand(MCOperand::createImm(0)); + Jmp.addOperand(MCOperand::createReg(X86::NoRegister)); + Out.emitInstruction(Jmp, STI); + + Out.emitLabel(Symbol); +} + +void X86::X86MCLFIRewriter::expandSyscall(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + emitLFICall(Out, STI); +} + +// movq %fs:0, %rX +// -> +// movq TPOffset(%r15), %rX +void X86::X86MCLFIRewriter::expandTPRead(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + MCRegister DestReg = Inst.getOperand(0).getReg(); + + MCInst Mov; + 
Mov.setOpcode(X86::MOV64rm); + Mov.addOperand(MCOperand::createReg(DestReg)); + Mov.addOperand(MCOperand::createReg(LFITPReg)); // Base + Mov.addOperand(MCOperand::createImm(1)); // Scale + Mov.addOperand(MCOperand::createReg(X86::NoRegister)); // Index + Mov.addOperand(MCOperand::createImm(TPOffset)); // Displacement + Mov.addOperand(MCOperand::createReg(X86::NoRegister)); // Segment + Out.emitInstruction(Mov, STI); +} + +void X86::X86MCLFIRewriter::doRewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + if (mayModifyRegister(Inst, LFIBaseReg) || mayModifyRegister(Inst, LFITPReg)) + return error(Inst, "illegal modification of reserved LFI register"); + + if (isSyscall(Inst)) + return expandSyscall(Inst, Out, STI); + + if (isTPRead(Inst)) + return expandTPRead(Inst, Out, STI); + + // Pass through all other instructions unchanged. + Out.emitInstruction(Inst, STI); +} + +bool X86::X86MCLFIRewriter::rewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + // The guard prevents rewrite-recursion when we emit instructions from inside + // the rewriter (such instructions should not be rewritten). + if (!Enabled || Guard) + return false; + Guard = true; + + doRewriteInst(Inst, Out, STI); + + Guard = false; + return true; +} diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h new file mode 100644 index 0000000000000..59d3ac24a4b58 --- /dev/null +++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h @@ -0,0 +1,55 @@ +//===- X86MCLFIRewriter.h ---------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file declares the X86MCLFIRewriter class, the X86 specific +// subclass of MCLFIRewriter. +// +//===----------------------------------------------------------------------===// +#ifndef LLVM_LIB_TARGET_X86_MCTARGETDESC_X86MCLFIREWRITER_H +#define LLVM_LIB_TARGET_X86_MCTARGETDESC_X86MCLFIREWRITER_H + +#include "llvm/MC/MCInstrInfo.h" +#include "llvm/MC/MCLFIRewriter.h" +#include "llvm/MC/MCRegisterInfo.h" + +namespace llvm { +class MCContext; +class MCInst; +class MCStreamer; +class MCSubtargetInfo; + +namespace X86 { + +class X86MCLFIRewriter : public MCLFIRewriter { +public: + X86MCLFIRewriter(MCContext &Ctx, std::unique_ptr<MCRegisterInfo> &&RI, + std::unique_ptr<MCInstrInfo> &&II) + : MCLFIRewriter(Ctx, std::move(RI), std::move(II)) {} + + bool rewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) override; + +private: + /// Recursion guard to prevent infinite loops when emitting instructions. 
+ bool Guard = false; + + void doRewriteInst(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + + void expandSyscall(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + + void expandTPRead(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); + + void emitLFICall(MCStreamer &Out, const MCSubtargetInfo &STI); +}; + +} // namespace X86 +} // namespace llvm +#endif // LLVM_LIB_TARGET_X86_MCTARGETDESC_X86MCLFIREWRITER_H diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp index 5ec4c836572ef..7b42e934c1967 100644 --- a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp +++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp @@ -16,6 +16,7 @@ #include "X86BaseInfo.h" #include "X86IntelInstPrinter.h" #include "X86MCAsmInfo.h" +#include "X86MCLFIRewriter.h" #include "X86TargetStreamer.h" #include "llvm-c/Visibility.h" #include "llvm/ADT/APInt.h" @@ -723,6 +724,16 @@ static MCInstrAnalysis *createX86MCInstrAnalysis(const MCInstrInfo *Info) { return new X86_MC::X86MCInstrAnalysis(Info); } +static MCLFIRewriter * +createX86MCLFIRewriter(MCStreamer &S, std::unique_ptr<MCRegisterInfo> &&RegInfo, + std::unique_ptr<MCInstrInfo> &&InstInfo) { + auto RW = std::make_unique<X86::X86MCLFIRewriter>( + S.getContext(), std::move(RegInfo), std::move(InstInfo)); + auto *Ptr = RW.get(); + S.setLFIRewriter(std::move(RW)); + return Ptr; +} + // Force static initialization. extern "C" LLVM_C_ABI void LLVMInitializeX86TargetMC() { for (Target *T : {&getTheX86_32Target(), &getTheX86_64Target()}) { @@ -745,6 +756,9 @@ extern "C" LLVM_C_ABI void LLVMInitializeX86TargetMC() { // Register the code emitter. TargetRegistry::RegisterMCCodeEmitter(*T, createX86MCCodeEmitter); + // Register the LFI rewriter. + TargetRegistry::RegisterMCLFIRewriter(*T, createX86MCLFIRewriter); + // Register the obj target streamer. 
TargetRegistry::RegisterObjectTargetStreamer(*T, createX86ObjectTargetStreamer); diff --git a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp index 65d77769b3c45..a2425146e08ff 100644 --- a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp +++ b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp @@ -2974,6 +2974,11 @@ bool X86TargetLowering::isEligibleForSiblingCallOpt( if (IsCalleeWin64 != IsCallerWin64) return false; + // Do not optimize vararg calls with 6 arguments for LFI since LFI reserves + // %r11, meaning there will not be enough registers available. + if (Subtarget.isLFI() && ArgLocs.size() > 5) + return false; + // If we are using a GOT, don't generate sibling calls to non-local, // default-visibility symbols. Tail calling such a symbol requires using a GOT // relocation, which forces early binding of the symbol. This breaks code that diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp index c92e20ad7ef13..777faec20e3b8 100644 --- a/llvm/lib/Target/X86/X86RegisterInfo.cpp +++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp @@ -621,6 +621,16 @@ BitVector X86RegisterInfo::getReservedRegs(const MachineFunction &MF) const { Reserved.set(*AI); } + // Reserve registers for LFI sandboxing. 
+ if (MF.getSubtarget<X86Subtarget>().isLFI()) { + for (MCRegAliasIterator AI(X86::R11, this, true); AI.isValid(); ++AI) + Reserved.set(*AI); + for (MCRegAliasIterator AI(X86::R14, this, true); AI.isValid(); ++AI) + Reserved.set(*AI); + for (MCRegAliasIterator AI(X86::R15, this, true); AI.isValid(); ++AI) + Reserved.set(*AI); + } + assert(checkAllSuperRegsMarked(Reserved, {X86::SIL, X86::DIL, X86::BPL, X86::SPL, X86::SIH, X86::DIH, X86::BPH, X86::SPH})); diff --git a/llvm/lib/Target/X86/X86Subtarget.h b/llvm/lib/Target/X86/X86Subtarget.h index 692c7938ddc00..4ef765755ace9 100644 --- a/llvm/lib/Target/X86/X86Subtarget.h +++ b/llvm/lib/Target/X86/X86Subtarget.h @@ -309,6 +309,8 @@ class X86Subtarget final : public X86GenSubtargetInfo { bool isTargetMCU() const { return TargetTriple.isOSIAMCU(); } bool isTargetFuchsia() const { return TargetTriple.isOSFuchsia(); } + bool isLFI() const { return TargetTriple.isLFI(); } + bool isTargetWindowsMSVC() const { return TargetTriple.isWindowsMSVCEnvironment(); } diff --git a/llvm/lib/TargetParser/Triple.cpp b/llvm/lib/TargetParser/Triple.cpp index c6515425b7eb5..e3673088061ed 100644 --- a/llvm/lib/TargetParser/Triple.cpp +++ b/llvm/lib/TargetParser/Triple.cpp @@ -181,6 +181,10 @@ StringRef Triple::getArchName(ArchType Kind, SubArchType SubArch) { if (SubArch == AArch64SubArch_lfi) return "aarch64_lfi"; break; + case Triple::x86_64: + if (SubArch == X8664SubArch_lfi) + return "x86_64_lfi"; + break; case Triple::spirv: switch (SubArch) { case Triple::SPIRVSubArch_v10: @@ -801,6 +805,7 @@ Triple::ArchType Triple::parseArch(StringRef ArchName) { // FIXME: Do we need to support these? 
.Cases({"i786", "i886", "i986"}, Triple::x86) .Cases({"amd64", "x86_64", "x86_64h"}, Triple::x86_64) + .Case("x86_64_lfi", Triple::x86_64) .Cases({"powerpc", "powerpcspe", "ppc", "ppc32"}, Triple::ppc) .Cases({"powerpcle", "ppcle", "ppc32le"}, Triple::ppcle) .Cases({"powerpc64", "ppu", "ppc64"}, Triple::ppc64) @@ -1060,6 +1065,9 @@ static Triple::SubArchType parseSubArch(StringRef SubArchName) { if (SubArchName == "aarch64_lfi") return Triple::AArch64SubArch_lfi; + if (SubArchName == "x86_64_lfi") + return Triple::X8664SubArch_lfi; + if (SubArchName.starts_with("spirv")) return StringSwitch<Triple::SubArchType>(SubArchName) .EndsWith("v1.0", Triple::SPIRVSubArch_v10) diff --git a/llvm/test/MC/X86/LFI/sys.s b/llvm/test/MC/X86/LFI/sys.s new file mode 100644 index 0000000000000..0a41f7f7e6b36 --- /dev/null +++ b/llvm/test/MC/X86/LFI/sys.s @@ -0,0 +1,12 @@ +// RUN: llvm-mc -triple x86_64_lfi %s | FileCheck %s + +syscall +// CHECK: leaq .Ltmp0(%rip), %r11 +// CHECK-NEXT: jmpq *(%r14) +// CHECK-NEXT: .Ltmp0: + +movq %fs:0, %rax +// CHECK: movq 16(%r15), %rax + +movq %fs:0, %rdi +// CHECK: movq 16(%r15), %rdi diff --git a/llvm/test/MC/X86/LFI/tp.s b/llvm/test/MC/X86/LFI/tp.s new file mode 100644 index 0000000000000..85f70c2b6b162 --- /dev/null +++ b/llvm/test/MC/X86/LFI/tp.s @@ -0,0 +1,10 @@ +// RUN: llvm-mc -triple x86_64_lfi %s | FileCheck %s + +movq %fs:0, %rax +// CHECK: movq 16(%r15), %rax + +movq %fs:0, %rdi +// CHECK: movq 16(%r15), %rdi + +movq %fs:0, %rcx +// CHECK: movq 16(%r15), %rcx >From ed678be48b165eb662c5a58264dd8c49cf6abe12 Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Tue, 31 Mar 2026 01:34:06 -0700 Subject: [PATCH 02/10] Call initSections to allow tests to run This call will be removed once a separate fix is applied. 
--- llvm/tools/llvm-mc/llvm-mc.cpp | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/llvm/tools/llvm-mc/llvm-mc.cpp b/llvm/tools/llvm-mc/llvm-mc.cpp index 903f82e6855ba..55a9ddb6cb318 100644 --- a/llvm/tools/llvm-mc/llvm-mc.cpp +++ b/llvm/tools/llvm-mc/llvm-mc.cpp @@ -631,8 +631,12 @@ int main(int argc, char **argv) { std::move(CE), std::move(MAB))); Triple T(TripleName); - if (T.isLFI()) + if (T.isLFI()) { + // TODO: Do not merge this change. This is a temporary fix until #188625 + // is merged. + Str->initSections(*STI); initializeLFIMCStreamer(*Str.get(), Ctx, T); + } } else if (FileType == OFT_Null) { Str.reset(TheTarget->createNullStreamer(Ctx)); } else { >From bab339a4a957567d76f6900e0e5aa0742c1070cf Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Mon, 6 Apr 2026 15:17:59 -0700 Subject: [PATCH 03/10] Update based on recent MCLFI infrastructure changes --- .../Target/X86/MCTargetDesc/X86MCTargetDesc.cpp | 10 ++++------ llvm/test/MC/X86/LFI/abi-note.s | 15 +++++++++++++++ llvm/tools/llvm-mc/llvm-mc.cpp | 6 +----- 3 files changed, 20 insertions(+), 11 deletions(-) create mode 100644 llvm/test/MC/X86/LFI/abi-note.s diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp index 7b42e934c1967..445b1fda0bc3c 100644 --- a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp +++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp @@ -725,13 +725,11 @@ static MCInstrAnalysis *createX86MCInstrAnalysis(const MCInstrInfo *Info) { } static MCLFIRewriter * -createX86MCLFIRewriter(MCStreamer &S, std::unique_ptr<MCRegisterInfo> &&RegInfo, +createX86MCLFIRewriter(MCContext &Ctx, + std::unique_ptr<MCRegisterInfo> &&RegInfo, std::unique_ptr<MCInstrInfo> &&InstInfo) { - auto RW = std::make_unique<X86::X86MCLFIRewriter>( - S.getContext(), std::move(RegInfo), std::move(InstInfo)); - auto *Ptr = RW.get(); - S.setLFIRewriter(std::move(RW)); - return Ptr; + return new 
X86::X86MCLFIRewriter(Ctx, std::move(RegInfo), + std::move(InstInfo)); } // Force static initialization. diff --git a/llvm/test/MC/X86/LFI/abi-note.s b/llvm/test/MC/X86/LFI/abi-note.s new file mode 100644 index 0000000000000..92fb457873a63 --- /dev/null +++ b/llvm/test/MC/X86/LFI/abi-note.s @@ -0,0 +1,15 @@ +// RUN: llvm-mc -triple x86_64_lfi %s | FileCheck %s +// RUN: llvm-mc -filetype=obj -triple x86_64_lfi %s | llvm-readelf -S - | FileCheck %s --check-prefix=ELF + +// CHECK: .section .note.LFI.ABI.x86_64,"aG",@note,.note.LFI.ABI.x86_64,comdat +// CHECK-NEXT: .long 4 +// CHECK-NEXT: .long 7 +// CHECK-NEXT: .long 1 +// CHECK-NEXT: .ascii "LFI" +// CHECK-NEXT: .byte 0 +// CHECK-NEXT: .p2align 2, 0x0 +// CHECK-NEXT: .ascii "x86_64" +// CHECK-NEXT: .byte 0 +// CHECK-NEXT: .p2align 2, 0x0 + +// ELF: .note.LFI.ABI.x86_64 NOTE {{.*}} AG diff --git a/llvm/tools/llvm-mc/llvm-mc.cpp b/llvm/tools/llvm-mc/llvm-mc.cpp index 55a9ddb6cb318..903f82e6855ba 100644 --- a/llvm/tools/llvm-mc/llvm-mc.cpp +++ b/llvm/tools/llvm-mc/llvm-mc.cpp @@ -631,12 +631,8 @@ int main(int argc, char **argv) { std::move(CE), std::move(MAB))); Triple T(TripleName); - if (T.isLFI()) { - // TODO: Do not merge this change. This is a temporary fix until #188625 - // is merged. 
- Str->initSections(*STI); + if (T.isLFI()) initializeLFIMCStreamer(*Str.get(), Ctx, T); - } } else if (FileType == OFT_Null) { Str.reset(TheTarget->createNullStreamer(Ctx)); } else { >From c67507ae3b9d673d0be25e7daa2bf16ccdd8bb52 Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Thu, 9 Apr 2026 17:33:27 -0700 Subject: [PATCH 04/10] Remove doc changes related to aarch64 --- llvm/docs/LFI.rst | 92 +++++++++++++++++++++++++---------------------- 1 file changed, 50 insertions(+), 42 deletions(-) diff --git a/llvm/docs/LFI.rst b/llvm/docs/LFI.rst index 1c045a65c2dc9..72ed48fd8ed33 100644 --- a/llvm/docs/LFI.rst +++ b/llvm/docs/LFI.rst @@ -118,13 +118,13 @@ Compiler Options The LFI target has several configuration options, specified via ``-mattr=``: -* ``+no-lfi-loads``: Disable sandboxing for load instructions (stores-only mode). -* ``+no-lfi-stores``: Disable sandboxing for store instructions. +* ``+lfi-loads``: enable sandboxing for loads (default: true). +* ``+lfi-stores``: enable sandboxing for stores (default: true). -Use ``+no-lfi-loads`` to create a "stores-only" sandbox that may read, but not +Use ``+nolfi-loads`` to create a "stores-only" sandbox that may read, but not write, outside the sandbox region. -Use ``+no-lfi-loads,+no-lfi-stores`` to create a "jumps-only" sandbox that may +Use ``+nolfi-loads+nolfi-stores`` to create a "jumps-only" sandbox that may read/write outside the sandbox region but may not transfer control outside (e.g., may not execute system calls directly). This is primarily useful in combination with some other form of memory sandboxing, such as Intel MPK. @@ -148,7 +148,7 @@ that must be maintained. * ``sp``: always holds an address within the sandbox. * ``x30``: always holds an address within the sandbox. * ``x26``: scratch register. -* ``x25``: context register (see `Context Register`_). +* ``x25``: points to a thread-local virtual register file for storing runtime context information. 
The current design only supports 4GiB sandboxes, which requires the sandbox base address to be 4GiB-aligned. This is because LFI's ABI stores pointers as @@ -301,11 +301,8 @@ before moving it back into ``sp`` with a safe ``add``. Link register modification ~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When the link register is modified, the guard is deferred until the next -control flow instruction. This approach maintains compatibility with Pointer -Authentication Code (PAC) instructions by keeping signed pointers intact until -they are needed for control flow. The guard uses ``x30`` as both the source and -destination (``add x30, x27, w30, uxtw``). +When the link register is modified, we write the modified value to a +temporary, before loading it back into ``x30`` with a safe ``add``. +---------------------------+-------------------------------+ | Original | Rewritten | @@ -330,42 +327,43 @@ System instructions System calls are rewritten into a sequence that loads the address of the first runtime call entrypoint and jumps to it. The runtime call entrypoint table is -stored at a negative offset from the sandbox base, so it can be referenced by -``x27``. The rewrite also saves and restores the link register, since it is -used for branching into the runtime. - -+-----------------+------------------------------+ -| Original | Rewritten | -+-----------------+------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| svc #0 | mov x26, x30 | -| | ldur x30, [x27, #-8] | -| | blr x30 | -| | add x30, x27, w26, uxtw | -| | | -+-----------------+------------------------------+ +stored at the start of the sandbox, so it can be referenced by ``x27``. The +rewrite also saves and restores the link register, since it is used for +branching into the runtime. + ++-----------------+----------------------------+ +| Original | Rewritten | ++-----------------+----------------------------+ +| .. code-block:: | .. 
code-block:: | +| | | +| svc #0 | mov w26, w30 | +| | ldr x30, [x27] | +| | blr x30 | +| | add x30, x27, w26, uxtw | +| | | ++-----------------+----------------------------+ Thread pointer (TP) ~~~~~~~~~~~~~~~~~~~ -TP accesses are rewritten into loads/stores from the context register -(``x25``), which holds the virtual thread pointer at offset 16 (see -`Context Register`_). - -+----------------------+-------------------------+ -| Original | Rewritten | -+----------------------+-------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| mrs xN, tpidr_el0 | ldr xN, [x25, #16] | -| | | -+----------------------+-------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| msr tpidr_el0, xN | str xN, [x25, #16] | -| | | -+----------------------+-------------------------+ +TLS accesses are rewritten into accesses offset from ``x25``, which is a +reserved register that points to a virtual register file, with a location for +storing the sandbox's thread pointer. ``TP`` is the offset into that virtual +register file where the thread pointer is stored. + ++----------------------+-----------------------+ +| Original | Rewritten | ++----------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| mrs xN, tpidr_el0 | ldr xN, [x25, #TP] | +| | | ++----------------------+-----------------------+ +| .. code-block:: | .. code-block:: | +| | | +| mrs tpidr_el0, xN | str xN, [x25, #TP] | +| | | ++----------------------+-----------------------+ Optimizations ============= @@ -404,6 +402,16 @@ generated via ``adrp`` followed by ``ldr``. Since the address generated by directly target ``x28`` for these sequences. This allows the omission of a guard instruction before the ``ldr``. ++----------------------+-----------------------+ +| Original | Rewritten | ++----------------------+-----------------------+ +| .. code-block:: | .. 
code-block:: | +| | | +| adrp xN, target | adrp x28, target | +| ldr xN, [xN, imm] | ldr xN, [x28, imm] | +| | | ++----------------------+-----------------------+ + Stack guard elimination ~~~~~~~~~~~~~~~~~~~~~~~ >From 45ca02854a6ec23b6d2053d985f98b00ab7b0fe5 Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Thu, 16 Apr 2026 03:20:38 -0700 Subject: [PATCH 05/10] Handle all %fs access patterns Since we are now using a reserved register for the context, we can directly rewrite all %fs access sequences without relying on mno-tls-direct-seg-refs to force the compiler to only produce %fs:0. --- llvm/docs/LFI.rst | 51 +++++-- .../X86/MCTargetDesc/X86MCLFIRewriter.cpp | 135 ++++++++++++++---- .../X86/MCTargetDesc/X86MCLFIRewriter.h | 11 +- llvm/test/MC/X86/LFI/sys.s | 6 - llvm/test/MC/X86/LFI/tp.s | 35 +++++ 5 files changed, 186 insertions(+), 52 deletions(-) diff --git a/llvm/docs/LFI.rst b/llvm/docs/LFI.rst index 72ed48fd8ed33..8397ee03172f8 100644 --- a/llvm/docs/LFI.rst +++ b/llvm/docs/LFI.rst @@ -696,18 +696,45 @@ handler table is stored at the address pointed to by ``r14``. Thread pointer ~~~~~~~~~~~~~~ -The ``movq %fs:0, %rX`` pattern (used for TLS access) is rewritten to load the -virtual thread pointer from the context register (``r15``) at offset 16 (see -`Context Register`_). - -+-----------------------+---------------------------+ -| Original | Rewritten | -+-----------------------+---------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| movq %fs:0, %rX | movq 16(%r15), %rX | -| | | -+-----------------------+---------------------------+ +Thread pointer accesses via the ``%fs`` segment (used for TLS) are rewritten to +use the virtual thread pointer from the context register (``r15``) at offset 16 +(see `Context Register`_). The rewrite handles any load or store instruction +with an ``%fs``-segment memory operand. ``Op`` represents any such instruction. 
+ ++--------------------------------------+----------------------------------------+ +| Original | Rewritten | ++--------------------------------------+----------------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| Op %fs:0, %rD | Op 16(%r15), %rD | +| | | ++--------------------------------------+----------------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| Op %fs:(%rX), %rD | movq 16(%r15), %rD | +| | Op (%rD, %rX), %rD | +| | | ++--------------------------------------+----------------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| Op %rS, %fs:(%rX) | movq 16(%r15), %r11 | +| | Op %rS, (%r11, %rX) | +| | | ++--------------------------------------+----------------------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| Op %fs:N(%rX, %rY, S), %rD | movq 16(%r15), %r11 | +| | leaq (%r11, %rX), %r11 | +| | Op N(%r11, %rY, S), %rD | +| | | ++--------------------------------------+----------------------------------------+ +| .. code-block:: | .. 
code-block:: | +| | | +| Op %rS, %fs:N(%rX, %rY, S) | movq 16(%r15), %r11 | +| | leaq (%r11, %rX), %r11 | +| | Op %rS, N(%r11, %rY, S) | +| | | ++--------------------------------------+----------------------------------------+ References ++++++++++ diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp index 6bf290dceae08..6e55789c43920 100644 --- a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp +++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp @@ -35,14 +35,19 @@ static bool isSyscall(const MCInst &Inst) { return Inst.getOpcode() == X86::SYSCALL; } -static bool isTPRead(const MCInst &Inst) { - // Match movq %fs:0, %rX - return Inst.getOpcode() == X86::MOV64rm && - Inst.getOperand(1).getReg() == X86::NoRegister && - Inst.getOperand(2).isImm() && Inst.getOperand(2).getImm() == 1 && - Inst.getOperand(3).getReg() == X86::NoRegister && - Inst.getOperand(4).isImm() && Inst.getOperand(4).getImm() == 0 && - Inst.getOperand(5).getReg() == X86::FS; +// Find the index of the first memory operand with %fs segment override. +// Returns -1 if not found. 
+static int findFSMemOperand(const MCInst &Inst, const MCInstrInfo &InstInfo) { + const MCInstrDesc &Desc = InstInfo.get(Inst.getOpcode()); + for (unsigned I = 0, E = Desc.getNumOperands(); I < E; ++I) { + if (Desc.operands()[I].OperandType == MCOI::OPERAND_MEMORY) { + if (I + 4 < Inst.getNumOperands() && Inst.getOperand(I + 4).isReg() && + Inst.getOperand(I + 4).getReg() == X86::FS) + return I; + I += 4; + } + } + return -1; } // syscall @@ -50,8 +55,7 @@ static bool isTPRead(const MCInst &Inst) { // leaq .Ltmp(%rip), %r11 // jmpq *(%r14) // .Ltmp: -void X86::X86MCLFIRewriter::emitLFICall(MCStreamer &Out, - const MCSubtargetInfo &STI) { +static void emitLFICall(MCStreamer &Out, const MCSubtargetInfo &STI) { MCSymbol *Symbol = Out.getContext().createTempSymbol(); // leaq .Ltmp(%rip), %r11 @@ -79,39 +83,114 @@ void X86::X86MCLFIRewriter::emitLFICall(MCStreamer &Out, Out.emitLabel(Symbol); } -void X86::X86MCLFIRewriter::expandSyscall(const MCInst &Inst, MCStreamer &Out, - const MCSubtargetInfo &STI) { +void X86::X86MCLFIRewriter::rewriteSyscall(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { emitLFICall(Out, STI); } -// movq %fs:0, %rX -// -> -// movq TPOffset(%r15), %rX -void X86::X86MCLFIRewriter::expandTPRead(const MCInst &Inst, MCStreamer &Out, - const MCSubtargetInfo &STI) { - MCRegister DestReg = Inst.getOperand(0).getReg(); - +// Emit: movq TPOffset(%r15), %Reg +static void emitTPLoad(MCRegister Reg, MCStreamer &Out, + const MCSubtargetInfo &STI) { MCInst Mov; Mov.setOpcode(X86::MOV64rm); - Mov.addOperand(MCOperand::createReg(DestReg)); - Mov.addOperand(MCOperand::createReg(LFITPReg)); // Base - Mov.addOperand(MCOperand::createImm(1)); // Scale - Mov.addOperand(MCOperand::createReg(X86::NoRegister)); // Index - Mov.addOperand(MCOperand::createImm(TPOffset)); // Displacement - Mov.addOperand(MCOperand::createReg(X86::NoRegister)); // Segment + Mov.addOperand(MCOperand::createReg(Reg)); + Mov.addOperand(MCOperand::createReg(LFITPReg)); + 
Mov.addOperand(MCOperand::createImm(1)); + Mov.addOperand(MCOperand::createReg(X86::NoRegister)); + Mov.addOperand(MCOperand::createImm(TPOffset)); + Mov.addOperand(MCOperand::createReg(X86::NoRegister)); Out.emitInstruction(Mov, STI); } +bool X86::X86MCLFIRewriter::isFSAccess(const MCInst &Inst) { + return (mayLoad(Inst) || mayStore(Inst)) && + findFSMemOperand(Inst, *InstInfo) >= 0; +} + +// Rewrite %fs-segment memory accesses to use the virtual thread pointer stored +// at TPOffset(%r15). The actual memory access is currently unsandboxed because +// load/store sandboxing is not yet supported. Example rewrites: +// +// movq %fs:0, %rax +// -> +// movq 16(%r15), %rax +// +// movq %fs:(%rdi), %rax +// -> +// movq 16(%r15), %rax +// movq (%rax, %rdi), %rax +// +// movq %fs:8(%rdi, %rsi, 2), %rax +// -> +// movq 16(%r15), %rax +// leaq (%rax, %rdi), %rax +// movq 8(%rax, %rsi, 2), %rax +void X86::X86MCLFIRewriter::rewriteFSAccess(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI) { + int MemIdx = findFSMemOperand(Inst, *InstInfo); + assert(MemIdx >= 0); + + MCRegister BaseReg = Inst.getOperand(MemIdx).getReg(); + MCRegister IndexReg = Inst.getOperand(MemIdx + 2).getReg(); + bool HasBase = BaseReg != X86::NoRegister; + bool HasIndex = IndexReg != X86::NoRegister; + bool HasDisp = !Inst.getOperand(MemIdx + 3).isImm() || + Inst.getOperand(MemIdx + 3).getImm() != 0; + + // %fs:0 -> TPOffset(%r15) + if (!HasBase && !HasIndex && !HasDisp) { + MCInst Modified(Inst); + Modified.getOperand(MemIdx).setReg(LFITPReg); + Modified.getOperand(MemIdx + 3).setImm(TPOffset); + Modified.getOperand(MemIdx + 4).setReg(X86::NoRegister); + return Out.emitInstruction(Modified, STI); + } + + // Use the dest register as TP temporary when it is available and not used in + // the addressing mode, otherwise use %r11.
+ MCRegister TPDest = LFIScratchReg; + if (MemIdx > 0 && Inst.getOperand(0).isReg()) { + const MCInstrDesc &Desc = InstInfo->get(Inst.getOpcode()); + MCRegister DestReg = Inst.getOperand(0).getReg(); + if (Desc.getOperandConstraint(0, MCOI::TIED_TO) == -1 && + X86MCRegisterClasses[X86::GR64RegClassID].contains(DestReg) && + (!HasBase || DestReg != BaseReg) && (!HasIndex || DestReg != IndexReg)) + TPDest = DestReg; + } + + emitTPLoad(TPDest, Out, STI); + + // Both slots occupied: fold base into TPDest via lea. + if (HasBase && HasIndex) { + MCInst Lea; + Lea.setOpcode(X86::LEA64r); + Lea.addOperand(MCOperand::createReg(TPDest)); + Lea.addOperand(MCOperand::createReg(TPDest)); + Lea.addOperand(MCOperand::createImm(1)); + Lea.addOperand(MCOperand::createReg(BaseReg)); + Lea.addOperand(MCOperand::createImm(0)); + Lea.addOperand(MCOperand::createReg(X86::NoRegister)); + Out.emitInstruction(Lea, STI); + } + + MCInst Modified(Inst); + Modified.getOperand(MemIdx).setReg(TPDest); + if (HasBase && !HasIndex) + Modified.getOperand(MemIdx + 2).setReg(BaseReg); + Modified.getOperand(MemIdx + 4).setReg(X86::NoRegister); + Out.emitInstruction(Modified, STI); +} + void X86::X86MCLFIRewriter::doRewriteInst(const MCInst &Inst, MCStreamer &Out, const MCSubtargetInfo &STI) { if (mayModifyRegister(Inst, LFIBaseReg) || mayModifyRegister(Inst, LFITPReg)) return error(Inst, "illegal modification of reserved LFI register"); if (isSyscall(Inst)) - return expandSyscall(Inst, Out, STI); + return rewriteSyscall(Inst, Out, STI); - if (isTPRead(Inst)) - return expandTPRead(Inst, Out, STI); + if (isFSAccess(Inst)) + return rewriteFSAccess(Inst, Out, STI); // Pass through all other instructions unchanged. 
Out.emitInstruction(Inst, STI); diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h index 59d3ac24a4b58..d74f875311d94 100644 --- a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h +++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.h @@ -41,13 +41,12 @@ class X86MCLFIRewriter : public MCLFIRewriter { void doRewriteInst(const MCInst &Inst, MCStreamer &Out, const MCSubtargetInfo &STI); - void expandSyscall(const MCInst &Inst, MCStreamer &Out, - const MCSubtargetInfo &STI); - - void expandTPRead(const MCInst &Inst, MCStreamer &Out, - const MCSubtargetInfo &STI); + void rewriteSyscall(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); - void emitLFICall(MCStreamer &Out, const MCSubtargetInfo &STI); + bool isFSAccess(const MCInst &Inst); + void rewriteFSAccess(const MCInst &Inst, MCStreamer &Out, + const MCSubtargetInfo &STI); }; } // namespace X86 diff --git a/llvm/test/MC/X86/LFI/sys.s b/llvm/test/MC/X86/LFI/sys.s index 0a41f7f7e6b36..099bb969373b0 100644 --- a/llvm/test/MC/X86/LFI/sys.s +++ b/llvm/test/MC/X86/LFI/sys.s @@ -4,9 +4,3 @@ syscall // CHECK: leaq .Ltmp0(%rip), %r11 // CHECK-NEXT: jmpq *(%r14) // CHECK-NEXT: .Ltmp0: - -movq %fs:0, %rax -// CHECK: movq 16(%r15), %rax - -movq %fs:0, %rdi -// CHECK: movq 16(%r15), %rdi diff --git a/llvm/test/MC/X86/LFI/tp.s b/llvm/test/MC/X86/LFI/tp.s index 85f70c2b6b162..652aeedf91b27 100644 --- a/llvm/test/MC/X86/LFI/tp.s +++ b/llvm/test/MC/X86/LFI/tp.s @@ -8,3 +8,38 @@ movq %fs:0, %rdi movq %fs:0, %rcx // CHECK: movq 16(%r15), %rcx + +addq %fs:0, %rax +// CHECK: addq 16(%r15), %rax + +movq %fs:(%rdi), %rax +// CHECK: movq 16(%r15), %rax +// CHECK-NEXT: movq (%rax,%rdi), %rax + +movq %fs:(%rcx), %rdx +// CHECK: movq 16(%r15), %rdx +// CHECK-NEXT: movq (%rdx,%rcx), %rdx + +// base == dest, falls back to %r11 +movq %fs:(%rax), %rax +// CHECK: movq 16(%r15), %r11 +// CHECK-NEXT: movq (%r11,%rax), %rax + +movq %rax, %fs:(%rdi) +// CHECK: 
movq 16(%r15), %r11 +// CHECK-NEXT: movq %rax, (%r11,%rdi) + +movq %fs:8(%rdi,%rsi,2), %rax +// CHECK: movq 16(%r15), %rax +// CHECK-NEXT: leaq (%rax,%rdi), %rax +// CHECK-NEXT: movq 8(%rax,%rsi,2), %rax + +movq %fs:(%rax,%rbx,4), %rcx +// CHECK: movq 16(%r15), %rcx +// CHECK-NEXT: leaq (%rcx,%rax), %rcx +// CHECK-NEXT: movq (%rcx,%rbx,4), %rcx + +movq %rax, %fs:8(%rdi,%rsi,2) +// CHECK: movq 16(%r15), %r11 +// CHECK-NEXT: leaq (%r11,%rdi), %r11 +// CHECK-NEXT: movq %rax, 8(%r11,%rsi,2) >From 6365dc137d3b152b709ad60aa9cc88d55af3b411 Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Wed, 22 Apr 2026 14:36:21 -0700 Subject: [PATCH 06/10] Small improvements to x86_64_lfi triple --- llvm/include/llvm/TargetParser/Triple.h | 4 ++-- llvm/lib/TargetParser/Triple.cpp | 7 +++---- 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/llvm/include/llvm/TargetParser/Triple.h b/llvm/include/llvm/TargetParser/Triple.h index 4a029e1cdc5a4..e6e0777d3fd71 100644 --- a/llvm/include/llvm/TargetParser/Triple.h +++ b/llvm/include/llvm/TargetParser/Triple.h @@ -159,7 +159,7 @@ class Triple { AArch64SubArch_arm64ec, AArch64SubArch_lfi, - X8664SubArch_lfi, + X86_64SubArch_lfi, KalimbaSubArch_v3, KalimbaSubArch_v4, @@ -926,7 +926,7 @@ class Triple { return (getArch() == Triple::aarch64 && getSubArch() == Triple::AArch64SubArch_lfi) || (getArch() == Triple::x86_64 && - getSubArch() == Triple::X8664SubArch_lfi); + getSubArch() == Triple::X86_64SubArch_lfi); } /// Tests whether the target supports the EHABI exception diff --git a/llvm/lib/TargetParser/Triple.cpp b/llvm/lib/TargetParser/Triple.cpp index e3673088061ed..c3ac3762d549b 100644 --- a/llvm/lib/TargetParser/Triple.cpp +++ b/llvm/lib/TargetParser/Triple.cpp @@ -182,7 +182,7 @@ StringRef Triple::getArchName(ArchType Kind, SubArchType SubArch) { return "aarch64_lfi"; break; case Triple::x86_64: - if (SubArch == X8664SubArch_lfi) + if (SubArch == X86_64SubArch_lfi) return "x86_64_lfi"; break; case 
Triple::spirv: @@ -804,8 +804,7 @@ Triple::ArchType Triple::parseArch(StringRef ArchName) { .Cases({"i386", "i486", "i586", "i686"}, Triple::x86) // FIXME: Do we need to support these? .Cases({"i786", "i886", "i986"}, Triple::x86) - .Cases({"amd64", "x86_64", "x86_64h"}, Triple::x86_64) - .Case("x86_64_lfi", Triple::x86_64) + .Cases({"amd64", "x86_64", "x86_64h", "x86_64_lfi"}, Triple::x86_64) .Cases({"powerpc", "powerpcspe", "ppc", "ppc32"}, Triple::ppc) .Cases({"powerpcle", "ppcle", "ppc32le"}, Triple::ppcle) .Cases({"powerpc64", "ppu", "ppc64"}, Triple::ppc64) @@ -1066,7 +1065,7 @@ static Triple::SubArchType parseSubArch(StringRef SubArchName) { return Triple::AArch64SubArch_lfi; if (SubArchName == "x86_64_lfi") - return Triple::X8664SubArch_lfi; + return Triple::X86_64SubArch_lfi; if (SubArchName.starts_with("spirv")) return StringSwitch<Triple::SubArchType>(SubArchName) >From eea83a4345dc36730e113073fd44070384288185 Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Wed, 22 Apr 2026 14:37:44 -0700 Subject: [PATCH 07/10] Remove unimplemented doc sections --- llvm/docs/LFI.rst | 242 +++++++--------------------------------------- 1 file changed, 37 insertions(+), 205 deletions(-) diff --git a/llvm/docs/LFI.rst b/llvm/docs/LFI.rst index 8397ee03172f8..f971116f95144 100644 --- a/llvm/docs/LFI.rst +++ b/llvm/docs/LFI.rst @@ -118,13 +118,13 @@ Compiler Options The LFI target has several configuration options, specified via ``-mattr=``: -* ``+lfi-loads``: enable sandboxing for loads (default: true). -* ``+lfi-stores``: enable sandboxing for stores (default: true). +* ``+no-lfi-loads``: Disable sandboxing for load instructions (stores-only mode). +* ``+no-lfi-stores``: Disable sandboxing for store instructions. -Use ``+nolfi-loads`` to create a "stores-only" sandbox that may read, but not +Use ``+no-lfi-loads`` to create a "stores-only" sandbox that may read, but not write, outside the sandbox region. 
-Use ``+nolfi-loads+nolfi-stores`` to create a "jumps-only" sandbox that may +Use ``+no-lfi-loads,+no-lfi-stores`` to create a "jumps-only" sandbox that may read/write outside the sandbox region but may not transfer control outside (e.g., may not execute system calls directly). This is primarily useful in combination with some other form of memory sandboxing, such as Intel MPK. @@ -148,7 +148,7 @@ that must be maintained. * ``sp``: always holds an address within the sandbox. * ``x30``: always holds an address within the sandbox. * ``x26``: scratch register. -* ``x25``: points to a thread-local virtual register file for storing runtime context information. +* ``x25``: context register (see `Context Register`_). The current design only supports 4GiB sandboxes, which requires the sandbox base address to be 4GiB-aligned. This is because LFI's ABI stores pointers as @@ -327,43 +327,42 @@ System instructions System calls are rewritten into a sequence that loads the address of the first runtime call entrypoint and jumps to it. The runtime call entrypoint table is -stored at the start of the sandbox, so it can be referenced by ``x27``. The -rewrite also saves and restores the link register, since it is used for -branching into the runtime. - -+-----------------+----------------------------+ -| Original | Rewritten | -+-----------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| svc #0 | mov w26, w30 | -| | ldr x30, [x27] | -| | blr x30 | -| | add x30, x27, w26, uxtw | -| | | -+-----------------+----------------------------+ +stored at a negative offset from the sandbox base, so it can be referenced by +``x27``. The rewrite also saves and restores the link register, since it is +used for branching into the runtime. + ++-----------------+------------------------------+ +| Original | Rewritten | ++-----------------+------------------------------+ +| .. code-block:: | .. 
code-block:: | +| | | +| svc #0 | mov x26, x30 | +| | ldur x30, [x27, #-8] | +| | blr x30 | +| | add x30, x27, w26, uxtw | +| | | ++-----------------+------------------------------+ Thread pointer (TP) ~~~~~~~~~~~~~~~~~~~ -TLS accesses are rewritten into accesses offset from ``x25``, which is a -reserved register that points to a virtual register file, with a location for -storing the sandbox's thread pointer. ``TP`` is the offset into that virtual -register file where the thread pointer is stored. - -+----------------------+-----------------------+ -| Original | Rewritten | -+----------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| mrs xN, tpidr_el0 | ldr xN, [x25, #TP] | -| | | -+----------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| mrs tpidr_el0, xN | str xN, [x25, #TP] | -| | | -+----------------------+-----------------------+ +TP accesses are rewritten into loads/stores from the context register +(``x25``), which holds the virtual thread pointer at offset 16 (see +`Context Register`_). + ++----------------------+-------------------------+ +| Original | Rewritten | ++----------------------+-------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| mrs xN, tpidr_el0 | ldr xN, [x25, #16] | +| | | ++----------------------+-------------------------+ +| .. code-block:: | .. code-block:: | +| | | +| msr tpidr_el0, xN | str xN, [x25, #16] | +| | | ++----------------------+-------------------------+ Optimizations ============= @@ -488,193 +487,26 @@ In the following assembly rewrites, some shorthand is used. * ``%rN`` or ``%eN``: refers to any general-purpose non-reserved register. * ``{a,b,c}``: matches any of ``a``, ``b``, or ``c``. -Instructions placed between ``.bundle_lock`` and ``.bundle_unlock`` directives -must all be placed inside the same bundle. 
The directive ``.bundle_lock -align_to_end`` ensures that the last instruction in the ``.bundle_lock`` -sequence is placed at the end of the bundle. - Control flow ~~~~~~~~~~~~ **Note**: these rewrites have not been implemented. -Indirect jumps are rewritten to first apply a mask that zeroes the top 32 bits -and bottom 5 bits of the target. An ``addq`` instruction is then used to fill -in the top 32 bits with the sandbox base. - -Indirect calls are similar, but the call instruction must be placed at the -end of the bundle so that the return address is bundle-aligned. Direct calls -must also be placed at the end of a bundle. - -The addressing mode ``LFI:N(...)`` specifies to apply an LFI addressing mode -transformation (see the Memory accesses section) when rewriting the addressing -mode. - -+------------------+------------------------------+ -| Original | Rewritten | -+------------------+------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| jmpq *%rX | .bundle_lock | -| | andl $0xffffffe0, %eX | -| | addq %r14, %rX | -| | jmpq *%rX | -| | .bundle_unlock | -| | | -+------------------+------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| jmpq *N(...) | movq LFI:N(...), %r11 | -| | .bundle_lock | -| | andl $0xffffffe0, %r11d | -| | addq %r14, %r11 | -| | jmpq *%r11 | -| | .bundle_unlock | -| | | -+------------------+------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| callq *%rX | .bundle_lock align_to_end | -| | andl $0xffffffe0, %eX | -| | addq %r14, %r11 | -| | callq *%r11 | -| | .bundle_unlock | -| | | -+------------------+------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| callq *N(...) | movq LFI:N(...), %r11 | -| | .bundle_lock align_to_end | -| | andl $0xffffffe0, %r11d | -| | addq %r14, %r11 | -| | callq *%r11 | -| | .bundle_unlock | -| | | -+------------------+------------------------------+ -| .. code-block:: | .. 
code-block:: | -| | | -| ret | popq %r11 | -| | .bundle_lock | -| | andl $0xffffffe0, %r11d | -| | addq %r14, %r11 | -| | jmpq *%r11 | -| | .bundle_unlock | -| | | -+------------------+------------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| call ... | .bundle_lock align_to_end | -| | call ... | -| | .bundle_unlock | -| | | -+------------------+------------------------------+ - Memory accesses ~~~~~~~~~~~~~~~ **Note**: these rewrites have not been implemented. -Memory accesses are transformed to safe versions by rewriting the addressing -mode. The rewrite prefixes the addressing mode with ``%gs:`` to make the access -relative to the sandbox base. All registers must be changed to the 32-bit form -(``%eX``). - -The stack ``%rsp`` may be accessed directly because it is always guaranteed to -contain a valid sandbox address. ``lea`` instructions do not need rewriting for -their addressing mode since they do not actually perform a memory access. - -+--------------------+--------------------+ -| Original | Rewritten | -+--------------------+--------------------+ -| .. code-block:: | .. code-block:: | -| | | -| lea N(...), ... | lea N(...), ... | -| | | -+--------------------+--------------------+ - -+-------------------+-----------------------+ -| Original | Rewritten | -+-------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| N(%rsp) | N(%rsp) | -| | | -+-------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| N(%rip) | N(%rip) | -| | | -+-------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| N(%rX) | %gs:N(%eX) | -| | | -+-------------------+-----------------------+ -| .. code-block:: | .. code-block:: | -| | | -| N(%rX, %rY, S) | %gs:N(%eX, %eY, S) | -| | | -+-------------------+-----------------------+ -| .. code-block:: | .. 
code-block:: | -| | | -| N(, %rX, S) | N(, %eX, S) | -| | | -+-------------------+-----------------------+ - String instructions ~~~~~~~~~~~~~~~~~~~ **Note**: these rewrites have not been implemented. -String instructions perform memory accesses using specific registers. Those -registers must be manually guarded before the instruction. - -+-----------------+----------------------------+ -| Original | Rewritten | -+-----------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| rep? stosq | .bundle_lock | -| | movl %edi, %edi | -| | leaq (%r14, %rdi), %rdi | -| | rep? stosq | -| | .bundle_unlock | -| | | -+-----------------+----------------------------+ -| .. code-block:: | .. code-block:: | -| | | -| rep? movsq | .bundle_lock | -| | movl %edi, %edi | -| | leaq (%r14, %rdi), %rdi | -| | movl %esi, %esi | -| | leaq (%r14, %rsi), %rsi | -| | rep? movsq | -| | .bundle_unlock | -| | | -+-----------------+----------------------------+ - Stack modification ~~~~~~~~~~~~~~~~~~ **Note**: these rewrites have not been implemented. -Since the stack pointer must always contain a valid sandbox address, any -modification to the stack pointer must be rewritten to modify it via ``%esp`` -and then re-guard it with ``leaq (%rsp, %r14), %rsp``. We use this guard form -instead of ``addq %r14, %rsp`` to avoid modifying the flags. - -+------------------+----------------------------+ -| Original | Rewritten | -+------------------+----------------------------+ -| .. code-block:: | .. 
code-block:: | -| | | -| MOD ..., %rsp | .bundle_lock | -| | MOD ..., %esp | -| | leaq (%rsp, %r14), %rsp | -| | .bundle_unlock | -| | | -+------------------+----------------------------+ - System instructions ~~~~~~~~~~~~~~~~~~~ >From 203d29768d90e2836d4831928d42502389c3f1a0 Mon Sep 17 00:00:00 2001 From: Zachary Yedidia <[email protected]> Date: Tue, 26 May 2026 01:49:13 -0700 Subject: [PATCH 08/10] Updates based on reviewer feedback --- llvm/docs/LFI.rst | 4 +- llvm/include/llvm/TargetParser/Triple.h | 2 +- .../X86/MCTargetDesc/X86MCLFIRewriter.cpp | 64 +++++++++++-------- llvm/test/CodeGen/X86/lfi-sibcall.ll | 21 ++++++ 4 files changed, 61 insertions(+), 30 deletions(-) create mode 100644 llvm/test/CodeGen/X86/lfi-sibcall.ll diff --git a/llvm/docs/LFI.rst b/llvm/docs/LFI.rst index f971116f95144..5bff4c1a4fb6f 100644 --- a/llvm/docs/LFI.rst +++ b/llvm/docs/LFI.rst @@ -512,7 +512,9 @@ System instructions System calls are rewritten into a sequence that loads the return address into the scratch register and jumps to the runtime call handler. The runtime call -handler table is stored at the address pointed to by ``r14``. +handler table is stored at the address pointed to by ``r14``. The ``r11`` +register stores the return address (marked by the label ``.Ltmp`` in the +block below). +-------------------+-------------------------------+ | Original | Rewritten | diff --git a/llvm/include/llvm/TargetParser/Triple.h b/llvm/include/llvm/TargetParser/Triple.h index e6e0777d3fd71..1711cae0d7b66 100644 --- a/llvm/include/llvm/TargetParser/Triple.h +++ b/llvm/include/llvm/TargetParser/Triple.h @@ -1195,7 +1195,7 @@ class Triple { /// True if the target uses TLSDESC by default. bool hasDefaultTLSDESC() const { return isAArch64() || (isAndroid() && isRISCV64()) || isOSFuchsia() || - (isX86() && isLFI()); + isLFI(); } /// Tests whether the target uses -data-sections as default. 
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp
index 6e55789c43920..49c0bac048494 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp
@@ -35,18 +35,17 @@ static bool isSyscall(const MCInst &Inst) {
   return Inst.getOpcode() == X86::SYSCALL;
 }

-// Find the index of the first memory operand with %fs segment override.
-// Returns -1 if not found.
+// Find the index of the memory operand if it has an %fs segment override.
+// Returns -1 if there is no memory operand or no %fs override.
 static int findFSMemOperand(const MCInst &Inst, const MCInstrInfo &InstInfo) {
   const MCInstrDesc &Desc = InstInfo.get(Inst.getOpcode());
-  for (unsigned I = 0, E = Desc.getNumOperands(); I < E; ++I) {
-    if (Desc.operands()[I].OperandType == MCOI::OPERAND_MEMORY) {
-      if (I + 4 < Inst.getNumOperands() && Inst.getOperand(I + 4).isReg() &&
-          Inst.getOperand(I + 4).getReg() == X86::FS)
-        return I;
-      I += 4;
-    }
-  }
+  int MemRefIdx = X86II::getMemoryOperandNo(Desc.TSFlags);
+  if (MemRefIdx < 0)
+    return -1;
+  int MemIdx = MemRefIdx + X86II::getOperandBias(Desc);
+  const MCOperand &Seg = Inst.getOperand(MemIdx + X86::AddrSegmentReg);
+  if (Seg.isReg() && Seg.getReg() == X86::FS)
+    return MemIdx;
   return -1;
 }

@@ -55,7 +54,8 @@ static int findFSMemOperand(const MCInst &Inst, const MCInstrInfo &InstInfo) {
 // leaq .Ltmp(%rip), %r11
 // jmpq *(%r14)
 // .Ltmp:
-static void emitLFICall(MCStreamer &Out, const MCSubtargetInfo &STI) {
+void X86::X86MCLFIRewriter::rewriteSyscall(const MCInst &Inst, MCStreamer &Out,
+                                           const MCSubtargetInfo &STI) {
   MCSymbol *Symbol = Out.getContext().createTempSymbol();

   // leaq .Ltmp(%rip), %r11
@@ -83,11 +83,6 @@ static void emitLFICall(MCStreamer &Out, const MCSubtargetInfo &STI) {
   Out.emitLabel(Symbol);
 }

-void X86::X86MCLFIRewriter::rewriteSyscall(const MCInst &Inst, MCStreamer &Out,
-                                           const MCSubtargetInfo &STI) {
-  emitLFICall(Out, STI);
-}
-
 // Emit: movq TPOffset(%r15), %Reg
 static void emitTPLoad(MCRegister Reg, MCStreamer &Out,
                        const MCSubtargetInfo &STI) {
@@ -130,24 +125,24 @@ void X86::X86MCLFIRewriter::rewriteFSAccess(const MCInst &Inst, MCStreamer &Out,
   int MemIdx = findFSMemOperand(Inst, *InstInfo);
   assert(MemIdx >= 0);

-  MCRegister BaseReg = Inst.getOperand(MemIdx).getReg();
-  MCRegister IndexReg = Inst.getOperand(MemIdx + 2).getReg();
+  MCRegister BaseReg = Inst.getOperand(MemIdx + X86::AddrBaseReg).getReg();
+  MCRegister IndexReg = Inst.getOperand(MemIdx + X86::AddrIndexReg).getReg();
   bool HasBase = BaseReg != X86::NoRegister;
   bool HasIndex = IndexReg != X86::NoRegister;
-  bool HasDisp = !Inst.getOperand(MemIdx + 3).isImm() ||
-                 Inst.getOperand(MemIdx + 3).getImm() != 0;
+  bool HasDisp = !Inst.getOperand(MemIdx + X86::AddrDisp).isImm() ||
+                 Inst.getOperand(MemIdx + X86::AddrDisp).getImm() != 0;

   // %fs:0 -> TPOffset(%r15)
   if (!HasBase && !HasIndex && !HasDisp) {
     MCInst Modified(Inst);
-    Modified.getOperand(MemIdx).setReg(LFITPReg);
-    Modified.getOperand(MemIdx + 3).setImm(TPOffset);
-    Modified.getOperand(MemIdx + 4).setReg(X86::NoRegister);
+    Modified.getOperand(MemIdx + X86::AddrBaseReg).setReg(LFITPReg);
+    Modified.getOperand(MemIdx + X86::AddrDisp).setImm(TPOffset);
+    Modified.getOperand(MemIdx + X86::AddrSegmentReg).setReg(X86::NoRegister);
     return Out.emitInstruction(Modified, STI);
   }

   // Use the dest register as TP temporary when it is available and not used in
-  // the addressing mode, otherwise use %r11.
+  // the addressing mode, otherwise use %r11 (e.g., movq %fs:(%rax), %rax).
   MCRegister TPDest = LFIScratchReg;
   if (MemIdx > 0 && Inst.getOperand(0).isReg()) {
     const MCInstrDesc &Desc = InstInfo->get(Inst.getOpcode());
@@ -160,7 +155,13 @@ void X86::X86MCLFIRewriter::rewriteFSAccess(const MCInst &Inst, MCStreamer &Out,

   emitTPLoad(TPDest, Out, STI);

-  // Both slots occupied: fold base into TPDest via lea.
+  // Both slots occupied: fold base into TPDest via lea. For example:
+  //
+  // movq %fs:8(%rdi,%rsi,2), %rax
+  // ->
+  // movq 16(%r15), %rax
+  // leaq (%rax,%rdi), %rax
+  // movq 8(%rax,%rsi,2), %rax
   if (HasBase && HasIndex) {
     MCInst Lea;
     Lea.setOpcode(X86::LEA64r);
@@ -173,11 +174,18 @@ void X86::X86MCLFIRewriter::rewriteFSAccess(const MCInst &Inst, MCStreamer &Out,
     Out.emitInstruction(Lea, STI);
   }

+  // Emit the access with TPDest as the new base, and the original base
+  // (offset from %fs) as the new index. For example:
+  //
+  // movq %fs:(%rdi), %rax
+  // ->
+  // movq 16(%r15), %rax
+  // movq (%rax,%rdi), %rax
   MCInst Modified(Inst);
-  Modified.getOperand(MemIdx).setReg(TPDest);
+  Modified.getOperand(MemIdx + X86::AddrBaseReg).setReg(TPDest);
   if (HasBase && !HasIndex)
-    Modified.getOperand(MemIdx + 2).setReg(BaseReg);
-  Modified.getOperand(MemIdx + 4).setReg(X86::NoRegister);
+    Modified.getOperand(MemIdx + X86::AddrIndexReg).setReg(BaseReg);
+  Modified.getOperand(MemIdx + X86::AddrSegmentReg).setReg(X86::NoRegister);
   Out.emitInstruction(Modified, STI);
 }

diff --git a/llvm/test/CodeGen/X86/lfi-sibcall.ll b/llvm/test/CodeGen/X86/lfi-sibcall.ll
new file mode 100644
index 0000000000000..84be6594a2836
--- /dev/null
+++ b/llvm/test/CodeGen/X86/lfi-sibcall.ll
@@ -0,0 +1,21 @@
+; RUN: llc < %s -mtriple=x86_64_lfi | FileCheck %s
+
+; LFI reserves %r11. For indirect vararg tail calls that consume all six
+; argument GPRs, the function pointer would normally be loaded into %r11. With
+; %r11 reserved there is no free register for the call target, so sibling-call
+; optimization must be disabled.
+
+define void @caller6_indirect_vararg(ptr %fn, i64 %a, i64 %b, i64 %c, i64 %d, i64 %e, i64 %f) {
+; CHECK-LABEL: caller6_indirect_vararg:
+; CHECK: callq *
+; CHECK-NOT: TAILCALL
+  tail call void (i64, i64, i64, i64, i64, i64, ...) %fn(i64 %a, i64 %b, i64 %c, i64 %d, i64 %e, i64 %f)
+  ret void
+}
+
+define void @caller5_indirect_vararg(ptr %fn, i64 %a, i64 %b, i64 %c, i64 %d, i64 %e) {
+; CHECK-LABEL: caller5_indirect_vararg:
+; CHECK: jmpq *{{.*}} # TAILCALL
+  tail call void (i64, i64, i64, i64, i64, ...) %fn(i64 %a, i64 %b, i64 %c, i64 %d, i64 %e)
+  ret void
+}

>From 21665e1b527f73a18b2f732ead9d3ba2c9182a16 Mon Sep 17 00:00:00 2001
From: Zachary Yedidia <[email protected]>
Date: Tue, 26 May 2026 15:03:48 -0700
Subject: [PATCH 09/10] Use -8 rtcall offset for syscalls to match AArch64

---
 llvm/docs/LFI.rst                                     | 2 +-
 llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp | 2 +-
 llvm/test/MC/X86/LFI/sys.s                            | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/docs/LFI.rst b/llvm/docs/LFI.rst
index 5bff4c1a4fb6f..ea2d07191276b 100644
--- a/llvm/docs/LFI.rst
+++ b/llvm/docs/LFI.rst
@@ -522,7 +522,7 @@ block below).
 | .. code-block::   | .. code-block::               |
 |                   |                               |
 |    syscall        |    leaq .Ltmp(%rip), %r11     |
-|                   |    jmpq *(%r14)               |
+|                   |    jmpq *-8(%r14)             |
 |                   |    .Ltmp:                     |
 |                   |                               |
 +-------------------+-------------------------------+
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp
index 49c0bac048494..aaf2f8f4980f2 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp
@@ -76,7 +76,7 @@ void X86::X86MCLFIRewriter::rewriteSyscall(const MCInst &Inst, MCStreamer &Out,
   Jmp.addOperand(MCOperand::createReg(LFIBaseReg));
   Jmp.addOperand(MCOperand::createImm(1));
   Jmp.addOperand(MCOperand::createReg(X86::NoRegister));
-  Jmp.addOperand(MCOperand::createImm(0));
+  Jmp.addOperand(MCOperand::createImm(-8));
   Jmp.addOperand(MCOperand::createReg(X86::NoRegister));
   Out.emitInstruction(Jmp, STI);

diff --git a/llvm/test/MC/X86/LFI/sys.s b/llvm/test/MC/X86/LFI/sys.s
index 099bb969373b0..b184bd481df17 100644
--- a/llvm/test/MC/X86/LFI/sys.s
+++ b/llvm/test/MC/X86/LFI/sys.s
@@ -2,5 +2,5 @@

 syscall
 // CHECK: leaq .Ltmp0(%rip), %r11
-// CHECK-NEXT: jmpq *(%r14)
+// CHECK-NEXT: jmpq *-8(%r14)
 // CHECK-NEXT: .Ltmp0:

>From 7f36440fce76c7cc4c194a37fc0c5bebdd94d342 Mon Sep 17 00:00:00 2001
From: Zachary Yedidia <[email protected]>
Date: Wed, 27 May 2026 15:56:38 -0700
Subject: [PATCH 10/10] Updates based on reviewer feedback

---
 .../X86/MCTargetDesc/X86MCLFIRewriter.cpp     |  2 +-
 llvm/test/MC/X86/LFI/{sys.s => syscall.s}     |  0
 .../MC/X86/LFI/{tp.s => thread-pointer.s}     | 21 +++++++++++++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)
 rename llvm/test/MC/X86/LFI/{sys.s => syscall.s} (100%)
 rename llvm/test/MC/X86/LFI/{tp.s => thread-pointer.s} (66%)

diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp
index aaf2f8f4980f2..b499691ecaf33 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCLFIRewriter.cpp
@@ -115,7 +115,7 @@ bool X86::X86MCLFIRewriter::isFSAccess(const MCInst &Inst) {
 // movq 16(%r15), %rax
 // movq (%rax, %rdi), %rax
 //
-// movq 8(%rdi, %rsi, 2), %rax
+// movq %fs:8(%rdi, %rsi, 2), %rax
 // ->
 // movq 16(%r15), %rax
 // leaq (%rax, %rdi), %rax
diff --git a/llvm/test/MC/X86/LFI/sys.s b/llvm/test/MC/X86/LFI/syscall.s
similarity index 100%
rename from llvm/test/MC/X86/LFI/sys.s
rename to llvm/test/MC/X86/LFI/syscall.s
diff --git a/llvm/test/MC/X86/LFI/tp.s b/llvm/test/MC/X86/LFI/thread-pointer.s
similarity index 66%
rename from llvm/test/MC/X86/LFI/tp.s
rename to llvm/test/MC/X86/LFI/thread-pointer.s
index 652aeedf91b27..48dd7a0b376e5 100644
--- a/llvm/test/MC/X86/LFI/tp.s
+++ b/llvm/test/MC/X86/LFI/thread-pointer.s
@@ -43,3 +43,24 @@ movq %rax, %fs:8(%rdi,%rsi,2)
 // CHECK: movq 16(%r15), %r11
 // CHECK-NEXT: leaq (%r11,%rdi), %r11
 // CHECK-NEXT: movq %rax, 8(%r11,%rsi,2)
+
+movq %fs:foo, %rax
+// CHECK: movq 16(%r15), %rax
+// CHECK-NEXT: movq foo(%rax), %rax
+
+movq %fs:foo@TPOFF, %rax
+// CHECK: movq 16(%r15), %rax
+// CHECK-NEXT: movq foo@TPOFF(%rax), %rax
+
+movq %fs:foo(%rdi), %rax
+// CHECK: movq 16(%r15), %rax
+// CHECK-NEXT: movq foo(%rax,%rdi), %rax
+
+movq %fs:foo@TPOFF(%rdi,%rsi,2), %rax
+// CHECK: movq 16(%r15), %rax
+// CHECK-NEXT: leaq (%rax,%rdi), %rax
+// CHECK-NEXT: movq foo@TPOFF(%rax,%rsi,2), %rax
+
+movq %rcx, %fs:foo
+// CHECK: movq 16(%r15), %r11
+// CHECK-NEXT: movq %rcx, foo(%r11)
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
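
A note for readers tracing rewriteFSAccess in this series: the case split on base, index and displacement can be modelled outside LLVM with plain strings. The sketch below is only an illustration under stated assumptions. MemOperand, formatMem and rewriteFSMov are hypothetical names that do not appear in the patch. The 16(%r15) slot is taken from the CHECK lines in thread-pointer.s. The choice of temporary is simplified to "loads reuse the destination unless it appears in the address, stores use %r11", whereas the real code consults the instruction description.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an x86 memory operand, Disp(Base,Index,Scale).
// An empty string plays the role of X86::NoRegister or a zero displacement.
struct MemOperand {
  std::string Base;
  std::string Index;
  int Scale;
  std::string Disp; // text, so symbolic displacements like "foo@TPOFF" work
};

// Print an operand in AT&T syntax, omitting the scale when it is 1.
static std::string formatMem(const MemOperand &M) {
  std::string S = M.Disp;
  if (M.Base.empty() && M.Index.empty())
    return S;
  S += "(" + M.Base;
  if (!M.Index.empty()) {
    S += "," + M.Index;
    if (M.Scale != 1)
      S += "," + std::to_string(M.Scale);
  }
  return S + ")";
}

// Model the rewrite of `movq %fs:Mem, Reg` (IsLoad) or `movq Reg, %fs:Mem`.
// Returns the replacement instructions as text. Simplified assumption: only
// loads may reuse Reg as the thread-pointer temporary.
static std::vector<std::string> rewriteFSMov(const MemOperand &Mem,
                                             const std::string &Reg,
                                             bool IsLoad) {
  assert((Mem.Scale == 1 || Mem.Scale == 2 || Mem.Scale == 4 ||
          Mem.Scale == 8) && "invalid x86 scale");
  const std::string TPSlot = "16(%r15)"; // TPOffset(%r15), per the CHECK lines
  auto Mov = [&](const std::string &M) {
    return IsLoad ? "movq " + M + ", " + Reg : "movq " + Reg + ", " + M;
  };
  bool HasBase = !Mem.Base.empty();
  bool HasIndex = !Mem.Index.empty();
  bool HasDisp = !Mem.Disp.empty() && Mem.Disp != "0";

  // %fs:0 -> TPOffset(%r15): a single instruction is enough.
  if (!HasBase && !HasIndex && !HasDisp)
    return {Mov(TPSlot)};

  // Pick the temporary: the destination when it is free, otherwise %r11.
  std::string TP = "%r11";
  if (IsLoad && Reg != Mem.Base && Reg != Mem.Index)
    TP = Reg;

  std::vector<std::string> Out;
  Out.push_back("movq " + TPSlot + ", " + TP);

  MemOperand New = Mem;
  New.Base = TP;
  if (HasBase && HasIndex) {
    // Both slots occupied: fold the original base into TP via lea.
    Out.push_back("leaq (" + TP + "," + Mem.Base + "), " + TP);
  } else if (HasBase) {
    // Only a base: it moves into the (previously empty) index slot.
    New.Index = Mem.Base;
    New.Scale = 1;
  }
  Out.push_back(Mov(formatMem(New)));
  return Out;
}
```

Calling rewriteFSMov({"%rdi", "%rsi", 2, "8"}, "%rax", true) should yield the three-instruction leaq form from the comment added in patch 08, and the same operand with IsLoad set to false should fall back to %r11, as in the store case of the thread-pointer test.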
