samitolvanen updated this revision to Diff 433795.
samitolvanen added a comment.
Herald added subscribers: jsji, laytonio, arichardson, emaste.
Herald added a reviewer: MaskRay.

- Changed Clang to emit operand bundles for indirect calls, as pcc suggested,
and dropped the `llvm.kcfi.check` intrinsic.
- Based on further LKML discussion, implemented arch-specific lowering that
ensures the KCFI check is placed immediately before the call instruction on
X86 (a brief sketch of the resulting code follows this list).
- Switched to relative offsets in `.kcfi_traps` and fixed the `__cfi_` preamble
linkage on X86.
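
For reference, a rough sketch of what this looks like now, taken from the
llvm/test/CodeGen/X86/kcfi.ll test included in this diff (label names are
illustrative). Clang attaches the type hash to an indirect call as a "kcfi"
operand bundle:

  call void %x() [ "kcfi"(i32 12345678) ]

and on X86 the check is bundled with the call so nothing can be scheduled
between them, with the trap location recorded as a relative offset in
.kcfi_traps:

    cmpl $12345678, -6(%rdi)
    je   .Ltmp0
  .Ltmp1:
    ud2
    .section .kcfi_traps,"awo",@progbits,.text
  .Ltmp2:
    .long .Ltmp1-.Ltmp2
    .text
  .Ltmp0:
    callq *%rdi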


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119296/new/

https://reviews.llvm.org/D119296

Files:
  clang/docs/ControlFlowIntegrity.rst
  clang/docs/UsersManual.rst
  clang/include/clang/Basic/Features.def
  clang/include/clang/Basic/Sanitizers.def
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/Driver/SanitizerArgs.cpp
  clang/lib/Driver/ToolChain.cpp
  clang/test/CodeGen/kcfi.c
  clang/test/Driver/fsanitize.c
  lld/ELF/Symbols.cpp
  llvm/docs/LangRef.rst
  llvm/include/llvm/CodeGen/AsmPrinter.h
  llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
  llvm/include/llvm/CodeGen/TargetLowering.h
  llvm/include/llvm/IR/InstrTypes.h
  llvm/include/llvm/IR/LLVMContext.h
  llvm/include/llvm/MC/MCObjectFileInfo.h
  llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
  llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/IR/Instructions.cpp
  llvm/lib/IR/LLVMContext.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/MC/MCObjectFileInfo.cpp
  llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
  llvm/lib/Target/AArch64/AArch64FastISel.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
  llvm/lib/Target/X86/X86AsmPrinter.cpp
  llvm/lib/Target/X86/X86AsmPrinter.h
  llvm/lib/Target/X86/X86ExpandPseudo.cpp
  llvm/lib/Target/X86/X86FastISel.cpp
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86ISelLowering.h
  llvm/lib/Target/X86/X86InstrCompiler.td
  llvm/lib/Target/X86/X86InstrControl.td
  llvm/lib/Target/X86/X86InstrInfo.td
  llvm/lib/Target/X86/X86MCInstLower.cpp
  llvm/lib/Target/X86/X86TargetMachine.cpp
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Bitcode/operand-bundles-bc-analyzer.ll
  llvm/test/CodeGen/AArch64/kcfi-bti.ll
  llvm/test/CodeGen/AArch64/kcfi.ll
  llvm/test/CodeGen/X86/O0-pipeline.ll
  llvm/test/CodeGen/X86/kcfi.ll
  llvm/test/CodeGen/X86/opt-pipeline.ll
  llvm/test/Transforms/InstCombine/kcfi-operand-bundles.ll
  llvm/test/Transforms/TailCallElim/kcfi-bundle.ll
  llvm/test/Verifier/kcfi-operand-bundles.ll

Index: llvm/test/Verifier/kcfi-operand-bundles.ll
===================================================================
--- /dev/null
+++ llvm/test/Verifier/kcfi-operand-bundles.ll
@@ -0,0 +1,16 @@
+; RUN: not opt -verify < %s 2>&1 | FileCheck %s
+
+define void @test_kcfi_bundle(i64 %arg0, i32 %arg1, void()* %arg2) {
+; CHECK: Multiple kcfi operand bundles
+; CHECK-NEXT: call void %arg2() [ "kcfi"(i32 42), "kcfi"(i32 42) ]
+  call void %arg2() [ "kcfi"(i32 42), "kcfi"(i32 42) ]
+
+; CHECK: Kcfi bundle operand must be an i32 constant
+; CHECK-NEXT: call void %arg2() [ "kcfi"(i64 42) ]
+  call void %arg2() [ "kcfi"(i64 42) ]
+
+; CHECK-NOT: call
+  call void %arg2() [ "kcfi"(i32 42) ] ; OK
+  call void %arg2() [ "kcfi"(i32 42) ] ; OK
+  ret void
+}
Index: llvm/test/Transforms/TailCallElim/kcfi-bundle.ll
===================================================================
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/kcfi-bundle.ll
@@ -0,0 +1,10 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+; Check that the "kcfi" operand bundle doesn't prevent tail calls.
+
+define i64 @f_1(i64 %x, i64(i64)* %f_0) {
+; CHECK-LABEL: @f_1(
+entry:
+; CHECK: tail call i64 %f_0(i64 %x) [ "kcfi"(i32 42) ]
+  %tmp = call i64 %f_0(i64 %x) [ "kcfi"(i32 42) ]
+  ret i64 0
+}
Index: llvm/test/Transforms/InstCombine/kcfi-operand-bundles.ll
===================================================================
--- /dev/null
+++ llvm/test/Transforms/InstCombine/kcfi-operand-bundles.ll
@@ -0,0 +1,25 @@
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+define void @f1() #0 prefix i32 10 {
+  ret void
+}
+
+declare void @f2() #0 prefix i32 11
+
+; CHECK-LABEL: define void @g(ptr noundef %x) #0
+define void @g(ptr noundef %x) #0 {
+  ; CHECK: call void %x() [ "kcfi"(i32 10) ]
+  call void %x() [ "kcfi"(i32 10) ]
+
+  ; COM: Must drop the kcfi operand bundle from direct calls.
+  ; CHECK: call void @f1()
+  ; CHECK-NOT: [ "kcfi"(i32 10) ]
+  call void @f1() [ "kcfi"(i32 10) ]
+
+  ; CHECK: call void @f2()
+  ; CHECK-NOT: [ "kcfi"(i32 10) ]
+  call void @f2() [ "kcfi"(i32 10) ]
+  ret void
+}
+
+attributes #0 = { "kcfi-target" }
Index: llvm/test/CodeGen/X86/opt-pipeline.ll
===================================================================
--- llvm/test/CodeGen/X86/opt-pipeline.ll
+++ llvm/test/CodeGen/X86/opt-pipeline.ll
@@ -206,6 +206,7 @@
 ; CHECK-NEXT:       Check CFA info and insert CFI instructions if needed
 ; CHECK-NEXT:       X86 Load Value Injection (LVI) Ret-Hardening
 ; CHECK-NEXT:       Pseudo Probe Inserter
+; CHECK-NEXT:       Unpack machine instruction bundles
 ; CHECK-NEXT:       Lazy Machine Block Frequency Analysis
 ; CHECK-NEXT:       Machine Optimization Remark Emitter
 ; CHECK-NEXT:       X86 Assembly Printer
Index: llvm/test/CodeGen/X86/kcfi.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/X86/kcfi.ll
@@ -0,0 +1,89 @@
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu < %s | FileCheck %s --check-prefix=ASM
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -stop-before=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,ISEL
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -stop-after=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,FINAL
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -stop-after=x86-pseudo < %s | FileCheck %s --check-prefixes=MIR,PSEUDO
+
+; ASM:       .type __cfi_f1,@function
+; ASM-LABEL: __cfi_f1:
+; ASM-NEXT:    int3
+; ASM-NEXT:    int3
+; ASM-NEXT:    movl $12345678, %eax
+; ASM-NEXT:    int3
+; ASM-NEXT:    int3
+; ASM-LABEL: .L__cfi_func_end0:
+; ASM-NEXT:  .size   __cfi_f1, .L__cfi_func_end0-__cfi_f1
+define void @f1(ptr noundef %x) #0 prefix i32 12345678 {
+; ASM-LABEL: f1:
+; ASM:       # %bb.0:
+; ASM:         cmpl $12345678, -6(%rdi) # imm = 0xBC614E
+; ASM-NEXT:    je .Ltmp0
+; ASM-NEXT:  .Ltmp1:
+; ASM-NEXT:    ud2
+; ASM-NEXT:    .section .kcfi_traps,"awo",@progbits,.text
+; ASM-NEXT:  .Ltmp2:
+; ASM-NEXT:    .long .Ltmp1-.Ltmp2
+; ASM-NEXT:    .text
+; ASM-NEXT:  .Ltmp0:
+; ASM-NEXT:    callq *%rdi
+
+; MIR-LABEL: name: f1
+; MIR: body:
+; ISEL:   KCFI_CALL64r 12345678, %[[#]], csr_64, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp
+; PSEUDO:       BUNDLE implicit-def $eflags, implicit-def $rsp, implicit-def $esp, implicit-def $sp, implicit-def $spl, implicit-def $sph, implicit-def $hsp, implicit-def $ssp, implicit killed $rdi, implicit $rsp, implicit $ssp {
+; PSEUDO-NEXT:    KCFI_CHECK killed renamable $rdi, 12345678, implicit-def $eflags
+; PSEUDO-NEXT:    CALL64r killed renamable $rdi, csr_64, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp
+; PSEUDO-NEXT:  }
+  call void %x() [ "kcfi"(i32 12345678) ]
+  ret void
+}
+
+; ASM-NOT: __cfi_f2:
+define void @f2(ptr noundef %x) {
+; ASM-LABEL: f2:
+
+; MIR-LABEL: name: f2
+; MIR: body:
+; ISEL:   KCFI_TCRETURNri64 12345678, %[[#]], 0, csr_64, implicit $rsp, implicit $ssp
+; PSEUDO:       BUNDLE implicit-def $eflags, implicit killed $rdi, implicit $rsp, implicit $ssp {
+; PSEUDO-NEXT:    KCFI_CHECK killed renamable $rdi, 12345678, implicit-def $eflags
+; PSEUDO-NEXT:    TAILJMPr64 killed renamable $rdi, csr_64, implicit $rsp, implicit $ssp, implicit $rsp, implicit $ssp
+; PSEUDO-NEXT:  }
+  tail call void %x() [ "kcfi"(i32 12345678) ]
+  ret void
+}
+
+; ASM-NOT: __cfi_f3:
+define void @f3(ptr noundef %x) #1 {
+; ASM-LABEL: f3:
+; MIR-LABEL: name: f3
+; MIR: body:
+; ISEL:   KCFI_INDIRECT_THUNK_CALL64 12345678, %0, csr_64, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp
+; FINAL:  KCFI_CALL64pcrel32 12345678, &__llvm_retpoline_r11, csr_64, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, implicit killed $r11
+; PSEUDO:       BUNDLE implicit-def $eflags, implicit-def $rsp, implicit-def $esp, implicit-def $sp, implicit-def $spl, implicit-def $sph, implicit-def $hsp, implicit-def $ssp, implicit killed $r11, implicit $rsp, implicit $ssp {
+; PSEUDO-NEXT:    KCFI_CHECK $r11, 12345678, implicit-def $eflags
+; PSEUDO-NEXT:    CALL64pcrel32 &__llvm_retpoline_r11, csr_64, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, implicit killed $r11
+; PSEUDO-NEXT:  }
+  call void %x() [ "kcfi"(i32 12345678) ]
+  ret void
+}
+
+; ASM-NOT: __cfi_f4:
+define void @f4(ptr noundef %x) #1 {
+; ASM-LABEL: f4:
+; MIR-LABEL: name: f4
+; MIR: body:
+; ISEL:   KCFI_INDIRECT_THUNK_TCRETURN64 12345678, %[[#]], 0, csr_64, implicit $rsp, implicit $ssp
+; FINAL:  KCFI_TCRETURNdi64 12345678, &__llvm_retpoline_r11, 0, csr_64, implicit $rsp, implicit $ssp, implicit killed $r11
+; PSEUDO:       BUNDLE implicit-def $eflags, implicit killed $r11, implicit $rsp, implicit $ssp {
+; PSEUDO-NEXT:    KCFI_CHECK $r11, 12345678, implicit-def $eflags
+; PSEUDO-NEXT:    TAILJMPd64 &__llvm_retpoline_r11, csr_64, implicit $rsp, implicit $ssp, implicit $rsp, implicit $ssp, implicit killed $r11
+; PSEUDO-NEXT:  }
+  tail call void %x() [ "kcfi"(i32 12345678) ]
+  ret void
+}
+
+attributes #0 = { "kcfi-target" }
+attributes #1 = { "target-features"="+retpoline-indirect-branches,+retpoline-indirect-calls" }
+
+!llvm.module.flags = !{!0}
+!0 = !{i32 4, !"kcfi", i32 1}
Index: llvm/test/CodeGen/X86/O0-pipeline.ll
===================================================================
--- llvm/test/CodeGen/X86/O0-pipeline.ll
+++ llvm/test/CodeGen/X86/O0-pipeline.ll
@@ -74,6 +74,7 @@
 ; CHECK-NEXT:       Check CFA info and insert CFI instructions if needed
 ; CHECK-NEXT:       X86 Load Value Injection (LVI) Ret-Hardening  
 ; CHECK-NEXT:       Pseudo Probe Inserter
+; CHECK-NEXT:       Unpack machine instruction bundles
 ; CHECK-NEXT:       Lazy Machine Block Frequency Analysis
 ; CHECK-NEXT:       Machine Optimization Remark Emitter
 ; CHECK-NEXT:       X86 Assembly Printer
Index: llvm/test/CodeGen/AArch64/kcfi.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/kcfi.ll
@@ -0,0 +1,63 @@
+; RUN: llc -mtriple=aarch64-- < %s | FileCheck %s --check-prefix=ASM
+; RUN: llc -mtriple=aarch64-- -global-isel < %s | FileCheck %s --check-prefix=ASM
+; RUN: llc -mtriple=aarch64-- -stop-before=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,ISEL
+; RUN: llc -mtriple=aarch64-- -stop-before=finalize-isel -global-isel < %s | FileCheck %s --check-prefixes=MIR,ISEL
+; RUN: llc -mtriple=aarch64-- -stop-after=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,FINAL
+; RUN: llc -mtriple=aarch64-- -stop-after=finalize-isel -global-isel < %s | FileCheck %s --check-prefixes=MIR,FINAL
+; RUN: llc -mtriple=aarch64-- -mattr=harden-sls-blr -stop-before=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,ISEL-SLS
+; RUN: llc -mtriple=aarch64-- -mattr=harden-sls-blr -stop-before=finalize-isel -global-isel < %s | FileCheck %s --check-prefixes=MIR,ISEL-SLS
+; RUN: llc -mtriple=aarch64-- -mattr=harden-sls-blr -stop-after=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,FINAL-SLS
+; RUN: llc -mtriple=aarch64-- -mattr=harden-sls-blr -stop-after=finalize-isel -global-isel < %s | FileCheck %s --check-prefixes=MIR,FINAL-SLS
+
+; ASM:       .word 12345678
+define void @f1(ptr noundef %x) #0 prefix i32 12345678 {
+; ASM-LABEL: f1:
+; ASM:       // %bb.0:
+; ASM:         ldur w16, [x0, #-4]
+; ASM-NEXT:    movk w17, #24910
+; ASM-NEXT:    movk w17, #188, lsl #16
+; ASM-NEXT:    cmp w16, w17
+; ASM-NEXT:    b.eq .Ltmp0
+; ASM-NEXT:    brk #0x8220
+; ASM-NEXT:  .Ltmp0:
+; ASM-NEXT:    blr x0
+
+; MIR-LABEL: name: f1
+; MIR: body:
+
+; ISEL:       KCFI_BLR 12345678, %0
+; FINAL:      KCFI_CHECK %0, 12345678, implicit-def $x16, implicit-def $x17, implicit-def $nzcv
+; FINAL-NEXT: BLR %0
+
+; ISEL-SLS:        KCFI_BLRNoIP 12345678, %0
+; FINAL-SLS:       KCFI_CHECK %0, 12345678, implicit-def $x16, implicit-def $x17, implicit-def $nzcv
+; FINAL-SLS-NEXT:  BLRNoIP %0
+  call void %x() [ "kcfi"(i32 12345678) ]
+  ret void
+}
+
+; ASM-NOT: .word 12345678
+define void @f2(ptr noundef %x) {
+; ASM-LABEL: f2:
+; ASM:       // %bb.0:
+; ASM:         ldur w16, [x0, #-4]
+; ASM-NEXT:    movk w17, #24910
+; ASM-NEXT:    movk w17, #188, lsl #16
+; ASM-NEXT:    cmp w16, w17
+; ASM-NEXT:    b.eq .Ltmp1
+; ASM-NEXT:    brk #0x8220
+; ASM-NEXT:  .Ltmp1:
+; ASM-NEXT:    br x0
+
+; MIR-LABEL: name: f2
+; MIR: body:
+
+; ISEL:       KCFI_TCRETURNri 12345678, %0, 0
+
+; FINAL:      KCFI_CHECK %0, 12345678, implicit-def $x16, implicit-def $x17, implicit-def $nzcv
+; FINAL-NEXT: TCRETURNri %0, 0
+  tail call void %x() [ "kcfi"(i32 12345678) ]
+  ret void
+}
+
+attributes #0 = { "kcfi-target" }
Index: llvm/test/CodeGen/AArch64/kcfi-bti.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/kcfi-bti.ll
@@ -0,0 +1,75 @@
+; RUN: llc -mtriple=aarch64-- < %s | FileCheck %s --check-prefix=ASM
+; RUN: llc -mtriple=aarch64-- -stop-before=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,ISEL
+; RUN: llc -mtriple=aarch64-- -stop-after=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,FINAL
+
+; ASM:       .word 12345678
+define void @f1(ptr noundef %x) #0 prefix i32 12345678 {
+; ASM-LABEL: f1:
+; ASM:       // %bb.0:
+; ASM:         ldur w16, [x0, #-4]
+; ASM-NEXT:    movk w17, #24910
+; ASM-NEXT:    movk w17, #188, lsl #16
+; ASM-NEXT:    cmp w16, w17
+; ASM-NEXT:    b.eq .Ltmp0
+; ASM-NEXT:    brk #0x8220
+; ASM-NEXT:  .Ltmp0:
+; ASM-NEXT:    blr x0
+
+; MIR-LABEL: name: f1
+; MIR: body:
+; ISEL:       KCFI_BLR 12345678, %0
+; FINAL:      KCFI_CHECK %0, 12345678, implicit-def $x16, implicit-def $x17, implicit-def $nzcv
+; FINAL-NEXT: BLR %0
+  call void %x() [ "kcfi"(i32 12345678) ]
+  ret void
+}
+
+; ASM:       .word 12345678
+define void @f2(ptr noundef %x) #0 prefix i32 12345678 {
+; ASM-LABEL: f2:
+; ASM:       // %bb.0:
+; ASM:         ldur w16, [x0, #-4]
+; ASM-NEXT:    movk w17, #24910
+; ASM-NEXT:    movk w17, #188, lsl #16
+; ASM-NEXT:    cmp w16, w17
+; ASM-NEXT:    b.eq .Ltmp1
+; ASM-NEXT:    brk #0x8220
+; ASM-NEXT:  .Ltmp1:
+; ASM-NEXT:    blr x0
+
+; MIR-LABEL: name: f2
+; MIR: body:
+; ISEL:       KCFI_BLR_BTI 12345678, %0
+; FINAL:      KCFI_CHECK %0, 12345678, implicit-def $x16, implicit-def $x17, implicit-def $nzcv
+; FINAL-NEXT: BLR_BTI %0
+  call void %x() #1 [ "kcfi"(i32 12345678) ]
+  ret void
+}
+
+; ASM-NOT: .word 12345678
+define void @f3(ptr noundef %x) #0 {
+; ASM-LABEL: f3:
+; ASM:       // %bb.0:
+; ASM:         ldur w9, [x16, #-4]
+; ASM-NEXT:    movk w10, #24910
+; ASM-NEXT:    movk w10, #188, lsl #16
+; ASM-NEXT:    cmp w9, w10
+; ASM-NEXT:    b.eq .Ltmp2
+; ASM-NEXT:    brk #0x8150
+; ASM-NEXT:  .Ltmp2:
+; ASM-NEXT:    br x16
+
+; MIR-LABEL: name: f3
+; MIR: body:
+; ISEL:       KCFI_TCRETURNriBTI 12345678, %1, 0
+; FINAL:      KCFI_CHECK_BTI %1, 12345678, implicit-def $x9, implicit-def $x10, implicit-def $nzcv
+; FINAL-NEXT: TCRETURNriBTI %1, 0
+  tail call void %x() [ "kcfi"(i32 12345678) ]
+  ret void
+}
+
+attributes #0 = { "kcfi-target" }
+attributes #1 = { returns_twice }
+
+!llvm.module.flags = !{!0}
+!0 = !{i32 8, !"branch-target-enforcement", i32 1}
Index: llvm/test/Bitcode/operand-bundles-bc-analyzer.ll
===================================================================
--- llvm/test/Bitcode/operand-bundles-bc-analyzer.ll
+++ llvm/test/Bitcode/operand-bundles-bc-analyzer.ll
@@ -11,6 +11,7 @@
 ; CHECK-NEXT:    <OPERAND_BUNDLE_TAG
 ; CHECK-NEXT:    <OPERAND_BUNDLE_TAG
 ; CHECK-NEXT:    <OPERAND_BUNDLE_TAG
+; CHECK-NEXT:    <OPERAND_BUNDLE_TAG
 ; CHECK-NEXT:  </OPERAND_BUNDLE_TAGS_BLOCK
 
 ; CHECK:   <FUNCTION_BLOCK
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===================================================================
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -243,10 +243,12 @@
           isa<PseudoProbeInst>(&I))
         continue;
 
-      // Special-case operand bundles "clang.arc.attachedcall" and "ptrauth".
-      bool IsNoTail =
-          CI->isNoTailCall() || CI->hasOperandBundlesOtherThan(
-            {LLVMContext::OB_clang_arc_attachedcall, LLVMContext::OB_ptrauth});
+      // Special-case operand bundles "clang.arc.attachedcall", "ptrauth", and
+      // "kcfi".
+      bool IsNoTail = CI->isNoTailCall() ||
+                      CI->hasOperandBundlesOtherThan(
+                          {LLVMContext::OB_clang_arc_attachedcall,
+                           LLVMContext::OB_ptrauth, LLVMContext::OB_kcfi});
 
       if (!IsNoTail && CI->doesNotAccessMemory()) {
         // A call to a readnone function whose arguments are all things computed
Index: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
===================================================================
--- llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -3076,6 +3076,27 @@
             Call, Builder.CreateBitOrPointerCast(ReturnedArg, CallTy));
     }
 
+  // Drop unnecessary kcfi operand bundles from calls that were converted
+  // into direct calls.
+  auto Bundle = Call.getOperandBundle(LLVMContext::OB_kcfi);
+  if (Bundle && !Call.isIndirectCall()) {
+    DEBUG_WITH_TYPE(DEBUG_TYPE "-kcfi", {
+      if (CalleeF && CalleeF->hasPrefixData()) {
+        auto *FunctionType = cast<ConstantInt>(CalleeF->getPrefixData());
+        auto *ExpectedType = cast<ConstantInt>(Bundle->Inputs[0]);
+
+        if (FunctionType->getZExtValue() != ExpectedType->getZExtValue())
+          dbgs() << Call.getModule()->getName() << ":"
+                 << Call.getDebugLoc().getLine()
+                 << ": warning: kcfi: " << Call.getCaller()->getName()
+                 << ": call to " << CalleeF->getName()
+                 << " using a mismatching function pointer type\n";
+      }
+    });
+
+    return CallBase::removeOperandBundle(&Call, LLVMContext::OB_kcfi);
+  }
+
   if (isAllocationFn(&Call, &TLI) &&
       isAllocRemovable(&cast<CallBase>(Call), &TLI))
     return visitAllocSite(Call);
Index: llvm/lib/Target/X86/X86TargetMachine.cpp
===================================================================
--- llvm/lib/Target/X86/X86TargetMachine.cpp
+++ llvm/lib/Target/X86/X86TargetMachine.cpp
@@ -594,17 +594,19 @@
   // Insert pseudo probe annotation for callsite profiling
   addPass(createPseudoProbeInserter());
 
-  // On Darwin platforms, BLR_RVMARKER pseudo instructions are lowered to
-  // bundles.
-  if (TT.isOSDarwin())
-    addPass(createUnpackMachineBundles([](const MachineFunction &MF) {
-      // Only run bundle expansion if there are relevant ObjC runtime functions
-      // present in the module.
-      const Function &F = MF.getFunction();
-      const Module *M = F.getParent();
-      return M->getFunction("objc_retainAutoreleasedReturnValue") ||
-             M->getFunction("objc_unsafeClaimAutoreleasedReturnValue");
-    }));
+  // KCFI pseudo instructions are lowered to bundles, and on Darwin platforms,
+  // CALL_RVMARKER pseudo instructions are as well.
+  addPass(createUnpackMachineBundles([&TT](const MachineFunction &MF) {
+    // Only run bundle expansion if the function performs indirect calls
+    // with the kcfi operand bundle, or there are relevant ObjC runtime
+    // functions present in the module.
+    const Function &F = MF.getFunction();
+    const Module *M = F.getParent();
+    return M->getModuleFlag("kcfi") ||
+           (TT.isOSDarwin() &&
+            (M->getFunction("objc_retainAutoreleasedReturnValue") ||
+             M->getFunction("objc_unsafeClaimAutoreleasedReturnValue")));
+  }));
 }
 
 bool X86PassConfig::addPostFastRegAllocRewrite() {
Index: llvm/lib/Target/X86/X86MCInstLower.cpp
===================================================================
--- llvm/lib/Target/X86/X86MCInstLower.cpp
+++ llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -1336,6 +1336,31 @@
           .addExpr(Op));
 }
 
+void X86AsmPrinter::LowerKCFI_CHECK(const MachineInstr &MI) {
+  const MachineFunction &MF = *MI.getMF();
+
+  EmitAndCountInstruction(MCInstBuilder(X86::CMP32mi)
+                              .addReg(MI.getOperand(0).getReg())
+                              .addImm(1)
+                              .addReg(X86::NoRegister)
+                              .addImm(-6)
+                              .addReg(X86::NoRegister)
+                              .addImm(MI.getOperand(1).getImm()));
+
+  MCSymbol *Pass = OutContext.createTempSymbol();
+  EmitAndCountInstruction(
+      MCInstBuilder(X86::JCC_1)
+          .addExpr(MCSymbolRefExpr::create(Pass, OutContext))
+          .addImm(X86::COND_E));
+
+  MCSymbol *Trap = OutContext.createTempSymbol();
+  OutStreamer->emitLabel(Trap);
+  EmitAndCountInstruction(MCInstBuilder(X86::TRAP));
+  emitKCFITrapEntry(MF, Trap);
+
+  OutStreamer->emitLabel(Pass);
+}
+
 void X86AsmPrinter::LowerASAN_CHECK_MEMACCESS(const MachineInstr &MI) {
   // FIXME: Make this work on non-ELF.
   if (!TM.getTargetTriple().isOSBinFormatELF()) {
@@ -2618,6 +2643,9 @@
     EmitAndCountInstruction(MCInstBuilder(getRetOpcode(*Subtarget)));
     return;
 
+  case X86::KCFI_CHECK:
+    return LowerKCFI_CHECK(*MI);
+
   case X86::ASAN_CHECK_MEMACCESS:
     return LowerASAN_CHECK_MEMACCESS(*MI);
 
Index: llvm/lib/Target/X86/X86InstrInfo.td
===================================================================
--- llvm/lib/Target/X86/X86InstrInfo.td
+++ llvm/lib/Target/X86/X86InstrInfo.td
@@ -88,6 +88,8 @@
 
 def SDT_X86Call   : SDTypeProfile<0, -1, [SDTCisVT<0, iPTR>]>;
 
+def SDT_X86Call_kcfi : SDTypeProfile<0, -1, [SDTCisVT<0, i32>, SDTCisVT<1, iPTR>]>;
+
 def SDT_X86NtBrind : SDTypeProfile<0, -1, [SDTCisVT<0, iPTR>]>;
 
 def SDT_X86VASTART_SAVE_XMM_REGS : SDTypeProfile<0, -1, [SDTCisVT<0, i8>,
@@ -121,6 +123,8 @@
 
 def SDT_X86TCRET : SDTypeProfile<0, 2, [SDTCisPtrTy<0>, SDTCisVT<1, i32>]>;
 
+def SDT_X86TCRET_KCFI : SDTypeProfile<0, 3, [SDTCisVT<0, i32>, SDTCisPtrTy<1>, SDTCisVT<2, i32>]>;
+
 def SDT_X86MEMBARRIER : SDTypeProfile<0, 0, []>;
 
 def SDT_X86ENQCMD : SDTypeProfile<1, 2, [SDTCisVT<0, i32>,
@@ -207,6 +211,12 @@
                         [SDNPHasChain, SDNPOutGlue, SDNPOptInGlue,
                          SDNPVariadic]>;
 
+def X86call_kcfi  : SDNode<"X86ISD::KCFI_CALL", SDT_X86Call_kcfi,
+                        [SDNPHasChain, SDNPOutGlue, SDNPOptInGlue,
+                         SDNPVariadic]>;
+def X86NoTrackCall_kcfi  : SDNode<"X86ISD::KCFI_NT_CALL", SDT_X86Call_kcfi,
+                        [SDNPHasChain, SDNPOutGlue, SDNPOptInGlue,
+                         SDNPVariadic]>;
 
 def X86NoTrackCall : SDNode<"X86ISD::NT_CALL", SDT_X86Call,
                             [SDNPHasChain, SDNPOutGlue, SDNPOptInGlue,
@@ -249,6 +259,8 @@
 
 def X86tcret : SDNode<"X86ISD::TC_RETURN", SDT_X86TCRET,
                         [SDNPHasChain,  SDNPOptInGlue, SDNPVariadic]>;
+def X86tcret_kcfi : SDNode<"X86ISD::KCFI_TC_RETURN", SDT_X86TCRET_KCFI,
+                        [SDNPHasChain,  SDNPOptInGlue, SDNPVariadic]>;
 
 def X86add_flag  : SDNode<"X86ISD::ADD",  SDTBinaryArithWithFlags,
                           [SDNPCommutative]>;
Index: llvm/lib/Target/X86/X86InstrControl.td
===================================================================
--- llvm/lib/Target/X86/X86InstrControl.td
+++ llvm/lib/Target/X86/X86InstrControl.td
@@ -415,6 +415,37 @@
   }
 }
 
+let isPseudo = 1, isCall = 1, isCodeGenOnly = 1,
+    Uses = [RSP, SSP],
+    usesCustomInserter = 1,
+    SchedRW = [WriteJump] in {
+  def KCFI_CALL64r :
+    PseudoI<(outs), (ins i32imm:$type, GR64:$dst), [(X86call_kcfi timm:$type, GR64:$dst)]>,
+            Requires<[In64BitMode,NotUseIndirectThunkCalls]>;
+  def KCFI_CALL64r_NT :
+    PseudoI<(outs), (ins i32imm:$type, GR64:$dst), [(X86NoTrackCall_kcfi timm:$type, GR64:$dst)]>,
+            Requires<[In64BitMode]>, NOTRACK;
+
+  // For indirect thunk calls with KCFI checks.
+  def KCFI_INDIRECT_THUNK_CALL64 :
+    PseudoI<(outs), (ins i32imm:$type, GR64:$dst), [(X86call_kcfi timm:$type, GR64:$dst)]>,
+            Requires<[In64BitMode,UseIndirectThunkCalls]>;
+  def KCFI_CALL64pcrel32 :
+    PseudoI<(outs), (ins i32imm:$type, i64imm:$rvfunc, i64i32imm_brtarget:$dst), []>,
+            Requires<[In64BitMode]>;
+
+  let isTerminator = 1, isReturn = 1, isBarrier = 1 in {
+    def KCFI_TCRETURNri64 :
+      PseudoI<(outs), (ins i32imm:$type, ptr_rc_tailcall:$dst, i32imm:$offset),[]>, NotMemoryFoldable;
+
+    // For indirect thunk tail calls with KCFI checks.
+    def KCFI_INDIRECT_THUNK_TCRETURN64 :
+      PseudoI<(outs), (ins i32imm:$type, GR64:$dst, i32imm:$offset), []>;
+    def KCFI_TCRETURNdi64 :
+      PseudoI<(outs), (ins i32imm:$type, i64i32imm_brtarget:$dst, i32imm:$offset),[]>;
+  }
+}
+
 let isPseudo = 1, isCall = 1, isCodeGenOnly = 1,
     Uses = [RSP, SSP],
     SchedRW = [WriteJump] in {
Index: llvm/lib/Target/X86/X86InstrCompiler.td
===================================================================
--- llvm/lib/Target/X86/X86InstrCompiler.td
+++ llvm/lib/Target/X86/X86InstrCompiler.td
@@ -256,6 +256,15 @@
                             "#SEH_Epilogue", []>;
 }
 
+//===----------------------------------------------------------------------===//
+// Pseudo instructions used by KCFI.
+//===----------------------------------------------------------------------===//
+let
+  Defs = [EFLAGS] in {
+def KCFI_CHECK : PseudoI<
+  (outs), (ins GR64:$ptr, i32imm:$type), []>, Sched<[]>;
+}
+
 //===----------------------------------------------------------------------===//
 // Pseudo instructions used by address sanitizer.
 //===----------------------------------------------------------------------===//
@@ -1328,6 +1337,14 @@
           (TCRETURNdi64 texternalsym:$dst, timm:$off)>,
           Requires<[IsLP64]>;
 
+def : Pat<(X86tcret_kcfi timm:$type, ptr_rc_tailcall:$dst, timm:$off),
+          (KCFI_TCRETURNri64 timm:$type, ptr_rc_tailcall:$dst, timm:$off)>,
+          Requires<[In64BitMode, NotUseIndirectThunkCalls]>;
+
+def : Pat<(X86tcret_kcfi timm:$type, ptr_rc_tailcall:$dst, timm:$off),
+          (KCFI_INDIRECT_THUNK_TCRETURN64 timm:$type, ptr_rc_tailcall:$dst, timm:$off)>,
+          Requires<[In64BitMode, UseIndirectThunkCalls]>;
+
 // Normal calls, with various flavors of addresses.
 def : Pat<(X86call (i32 tglobaladdr:$dst)),
           (CALLpcrel32 tglobaladdr:$dst)>;
Index: llvm/lib/Target/X86/X86ISelLowering.h
===================================================================
--- llvm/lib/Target/X86/X86ISelLowering.h
+++ llvm/lib/Target/X86/X86ISelLowering.h
@@ -77,6 +77,11 @@
     /// Same as call except it adds the NoTrack prefix.
     NT_CALL,
 
+    /// Indirect calls with a KCFI check.
+    KCFI_CALL,
+    KCFI_NT_CALL,
+    KCFI_TC_RETURN,
+
     // Pseudo for a OBJC call that gets emitted together with a special
     // marker instruction.
     CALL_RVMARKER,
@@ -1447,6 +1452,8 @@
 
     bool supportSwiftError() const override;
 
+    bool supportKCFIBundles() const override { return true; }
+
     bool hasStackProbeSymbol(MachineFunction &MF) const override;
     bool hasInlineStackProbe(MachineFunction &MF) const override;
     StringRef getStackProbeSymbolName(MachineFunction &MF) const override;
@@ -1673,6 +1680,9 @@
     MachineBasicBlock *EmitLoweredIndirectThunk(MachineInstr &MI,
                                                 MachineBasicBlock *BB) const;
 
+    MachineBasicBlock *EmitLoweredKCFICall(MachineInstr &MI,
+                                           MachineBasicBlock *BB) const;
+
     MachineBasicBlock *emitEHSjLjSetJmp(MachineInstr &MI,
                                         MachineBasicBlock *MBB) const;
 
Index: llvm/lib/Target/X86/X86ISelLowering.cpp
===================================================================
--- llvm/lib/Target/X86/X86ISelLowering.cpp
+++ llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -4176,6 +4176,7 @@
                   CB->hasFnAttr("no_caller_saved_registers"));
   bool HasNoCfCheck = (CB && CB->doesNoCfCheck());
   bool IsIndirectCall = (CB && isa<CallInst>(CB) && CB->isIndirectCall());
+  bool IsKCFICall = IsIndirectCall && CLI.KCFIType;
   const Module *M = MF.getMMI().getModule();
   Metadata *IsCFProtectionSupported = M->getModuleFlag("cf-protection-branch");
 
@@ -4658,6 +4659,12 @@
   if (InFlag.getNode())
     Ops.push_back(InFlag);
 
+  // Insert the type as the first operand after the chain for KCFI calls.
+  if (IsKCFICall)
+    Ops.insert(
+        Ops.begin() + 1,
+        DAG.getTargetConstant(CLI.KCFIType->getZExtValue(), dl, MVT::i32));
+
   if (isTailCall) {
     // We used to do:
     //// If this is the first return lowered for this function, add the regs
@@ -4665,15 +4672,18 @@
     // This isn't right, although it's probably harmless on x86; liveouts
     // should be computed from returns not tail calls.  Consider a void
     // function making a tail call to a function returning int.
+    unsigned TCOpc = X86ISD::TC_RETURN;
+
+    if (IsKCFICall)
+      TCOpc = X86ISD::KCFI_TC_RETURN;
+
     MF.getFrameInfo().setHasTailCall();
-    SDValue Ret = DAG.getNode(X86ISD::TC_RETURN, dl, NodeTys, Ops);
+    SDValue Ret = DAG.getNode(TCOpc, dl, NodeTys, Ops);
     DAG.addCallSiteInfo(Ret.getNode(), std::move(CSInfo));
     return Ret;
   }
 
-  if (HasNoCfCheck && IsCFProtectionSupported && IsIndirectCall) {
-    Chain = DAG.getNode(X86ISD::NT_CALL, dl, NodeTys, Ops);
-  } else if (CLI.CB && objcarc::hasAttachedCallOpBundle(CLI.CB)) {
+  if (CLI.CB && objcarc::hasAttachedCallOpBundle(CLI.CB)) {
     // Calls with a "clang.arc.attachedcall" bundle are special. They should be
     // expanded to the call, directly followed by a special marker sequence and
     // a call to a ObjC library function. Use the CALL_RVMARKER to do that.
@@ -4689,7 +4699,17 @@
     Ops.insert(Ops.begin() + 1, GA);
     Chain = DAG.getNode(X86ISD::CALL_RVMARKER, dl, NodeTys, Ops);
   } else {
-    Chain = DAG.getNode(X86ISD::CALL, dl, NodeTys, Ops);
+    bool NoTrack = IsIndirectCall && HasNoCfCheck && IsCFProtectionSupported;
+    unsigned CallOpc = X86ISD::CALL;
+
+    if (IsKCFICall && NoTrack)
+      CallOpc = X86ISD::KCFI_NT_CALL;
+    else if (IsKCFICall)
+      CallOpc = X86ISD::KCFI_CALL;
+    else if (NoTrack)
+      CallOpc = X86ISD::NT_CALL;
+
+    Chain = DAG.getNode(CallOpc, dl, NodeTys, Ops);
   }
 
   InFlag = Chain.getValue(1);
@@ -32992,6 +33012,9 @@
   NODE_NAME_CASE(FLD)
   NODE_NAME_CASE(FST)
   NODE_NAME_CASE(CALL)
+  NODE_NAME_CASE(KCFI_CALL)
+  NODE_NAME_CASE(KCFI_NT_CALL)
+  NODE_NAME_CASE(KCFI_TC_RETURN)
   NODE_NAME_CASE(CALL_RVMARKER)
   NODE_NAME_CASE(BT)
   NODE_NAME_CASE(CMP)
@@ -34904,10 +34927,21 @@
     return X86::TCRETURNdi;
   case X86::INDIRECT_THUNK_TCRETURN64:
     return X86::TCRETURNdi64;
+  case X86::KCFI_INDIRECT_THUNK_CALL64:
+    return X86::KCFI_CALL64pcrel32;
+  case X86::KCFI_INDIRECT_THUNK_TCRETURN64:
+    return X86::KCFI_TCRETURNdi64;
   }
   llvm_unreachable("not indirect thunk opcode");
 }
 
+static unsigned getVRegOperandIdxForIndirectThunk(unsigned RPOpc) {
+  if (RPOpc == X86::KCFI_INDIRECT_THUNK_CALL64 ||
+      RPOpc == X86::KCFI_INDIRECT_THUNK_TCRETURN64)
+    return 1; // Skip the type operand.
+  return 0;
+}
+
 static const char *getIndirectThunkSymbol(const X86Subtarget &Subtarget,
                                           unsigned Reg) {
   if (Subtarget.useRetpolineExternalThunk()) {
@@ -34981,8 +35015,10 @@
   // call the retpoline thunk.
   const DebugLoc &DL = MI.getDebugLoc();
   const X86InstrInfo *TII = Subtarget.getInstrInfo();
-  Register CalleeVReg = MI.getOperand(0).getReg();
-  unsigned Opc = getOpcodeForIndirectThunk(MI.getOpcode());
+  unsigned RPOpc = MI.getOpcode();
+  unsigned VRegIdx = getVRegOperandIdxForIndirectThunk(RPOpc);
+  Register CalleeVReg = MI.getOperand(VRegIdx).getReg();
+  unsigned Opc = getOpcodeForIndirectThunk(RPOpc);
 
   // Find an available scratch register to hold the callee. On 64-bit, we can
   // just use R11, but we scan for uses anyway to ensure we don't generate
@@ -35020,13 +35056,30 @@
 
   BuildMI(*BB, MI, DL, TII->get(TargetOpcode::COPY), AvailableReg)
       .addReg(CalleeVReg);
-  MI.getOperand(0).ChangeToES(Symbol);
+  MI.getOperand(VRegIdx).ChangeToES(Symbol);
   MI.setDesc(TII->get(Opc));
   MachineInstrBuilder(*BB->getParent(), &MI)
       .addReg(AvailableReg, RegState::Implicit | RegState::Kill);
   return BB;
 }
 
+MachineBasicBlock *
+X86TargetLowering::EmitLoweredKCFICall(MachineInstr &MI,
+                                       MachineBasicBlock *BB) const {
+  assert(MI.getOperand(0).isImm() && MI.getOperand(1).isReg() &&
+         "Invalid operand type for a KCFI call");
+
+  switch (MI.getOpcode()) {
+  case X86::KCFI_INDIRECT_THUNK_CALL64:
+  case X86::KCFI_INDIRECT_THUNK_TCRETURN64:
+    // Emit indirect thunks here.
+    return EmitLoweredIndirectThunk(MI, BB);
+  default:
+    // KCFI instructions are expanded in X86ExpandPseudo::ExpandKCFICall.
+    return BB;
+  }
+}
+
 /// SetJmp implies future control flow change upon calling the corresponding
 /// LongJmp.
 /// Instead of using the 'return' instruction, the long jump fixes the stack and
@@ -35808,6 +35861,12 @@
   case X86::INDIRECT_THUNK_TCRETURN32:
   case X86::INDIRECT_THUNK_TCRETURN64:
     return EmitLoweredIndirectThunk(MI, BB);
+  case X86::KCFI_CALL64r:
+  case X86::KCFI_CALL64r_NT:
+  case X86::KCFI_TCRETURNri64:
+  case X86::KCFI_INDIRECT_THUNK_CALL64:
+  case X86::KCFI_INDIRECT_THUNK_TCRETURN64:
+    return EmitLoweredKCFICall(MI, BB);
   case X86::CATCHRET:
     return EmitLoweredCatchRet(MI, BB);
   case X86::SEG_ALLOCA_32:
Index: llvm/lib/Target/X86/X86FastISel.cpp
===================================================================
--- llvm/lib/Target/X86/X86FastISel.cpp
+++ llvm/lib/Target/X86/X86FastISel.cpp
@@ -3182,6 +3182,10 @@
   if ((CB && CB->hasFnAttr("no_callee_saved_registers")))
     return false;
 
+  // Indirect calls with KCFI checks need special handling.
+  if (CB && CB->isIndirectCall() && CB->getOperandBundle(LLVMContext::OB_kcfi))
+    return false;
+
   // Functions using thunks for indirect calls need to use SDISel.
   if (Subtarget->useIndirectThunkCalls())
     return false;
Index: llvm/lib/Target/X86/X86ExpandPseudo.cpp
===================================================================
--- llvm/lib/Target/X86/X86ExpandPseudo.cpp
+++ llvm/lib/Target/X86/X86ExpandPseudo.cpp
@@ -63,6 +63,7 @@
 private:
   void ExpandICallBranchFunnel(MachineBasicBlock *MBB,
                                MachineBasicBlock::iterator MBBI);
+  void ExpandKCFICall(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
   void expandCALL_RVMARKER(MachineBasicBlock &MBB,
                            MachineBasicBlock::iterator MBBI);
   bool ExpandMI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
@@ -188,6 +189,69 @@
   JTMBB->erase(JTInst);
 }
 
+void X86ExpandPseudo::ExpandKCFICall(MachineBasicBlock &MBB,
+                                     MachineBasicBlock::iterator MBBI) {
+  // Copy the type operand and drop it from the call.
+  MachineOperand Type = MBBI->getOperand(0);
+  MBBI->removeOperand(0);
+  assert(Type.isImm() && "Invalid type operand for a KCFI call");
+
+  static const std::map<unsigned, unsigned> OpcMap = {
+      {X86::KCFI_CALL64r, X86::CALL64r},
+      {X86::KCFI_CALL64r_NT, X86::CALL64r_NT},
+      {X86::KCFI_CALL64pcrel32, X86::CALL64pcrel32},
+      {X86::KCFI_TCRETURNri64, X86::TCRETURNri64},
+      {X86::KCFI_TCRETURNdi64, X86::TCRETURNdi64}};
+
+  unsigned Opc = MBBI->getOpcode();
+  bool IsIndirect =
+      Opc != X86::KCFI_CALL64pcrel32 && Opc != X86::KCFI_TCRETURNdi64;
+  bool IsTailCall =
+      Opc == X86::KCFI_TCRETURNri64 || Opc == X86::KCFI_TCRETURNdi64;
+
+  auto OI = OpcMap.find(Opc);
+  if (OI == OpcMap.end())
+    llvm_unreachable("unexpected opcode");
+
+  // Set the correct opcode for the call.
+  MBBI->setDesc(TII->get(OI->second));
+
+  // Expand tail calls first.
+  if (IsTailCall) {
+    if (!ExpandMI(MBB, MBBI))
+      llvm_unreachable("unexpected failure expanding a tail call");
+
+    unsigned TailCallOpc = MBB.back().getOpcode();
+    assert((TailCallOpc == X86::TAILJMPd64 || TailCallOpc == X86::TAILJMPr64 ||
+            TailCallOpc == X86::TAILJMPr64_REX) &&
+           "Unexpected opcode for a KCFI tail call");
+  }
+
+  MachineInstr &Call = IsTailCall ? MBB.back() : *MBBI;
+  MachineOperand &Target = Call.getOperand(0);
+
+  // Emit the KCFI check immediately before the call.
+  MachineInstr *Check =
+      BuildMI(MBB, Call, Call.getDebugLoc(), TII->get(X86::KCFI_CHECK))
+          .getInstr();
+
+  if (IsIndirect) {
+    assert(Target.isReg() && "Unexpected target operand for an indirect call");
+    Check->addOperand(Target);
+  } else {
+    assert(Target.isSymbol() && "Unexpected target operand for a direct call");
+    // X86TargetLowering::EmitLoweredIndirectThunk always uses r11 for
+    // 64-bit indirect thunk calls.
+    assert(StringRef(Target.getSymbolName()).endswith("_r11") &&
+           "Unexpected register for an indirect thunk KCFI call");
+    Check->addOperand(MachineOperand::CreateReg(X86::R11, false));
+  }
+  Check->addOperand(Type);
+
+  // Bundle the check and the call to prevent further changes.
+  finalizeBundle(MBB, Check->getIterator(), std::next(Call.getIterator()));
+}
+
 void X86ExpandPseudo::expandCALL_RVMARKER(MachineBasicBlock &MBB,
                                           MachineBasicBlock::iterator MBBI) {
   // Expand CALL_RVMARKER pseudo to call instruction, followed by the special
@@ -592,6 +656,13 @@
     MI.setDesc(TII->get(X86::TILEZERO));
     return true;
   }
+  case X86::KCFI_CALL64r:
+  case X86::KCFI_CALL64r_NT:
+  case X86::KCFI_CALL64pcrel32:
+  case X86::KCFI_TCRETURNri64:
+  case X86::KCFI_TCRETURNdi64:
+    ExpandKCFICall(MBB, MBBI);
+    return true;
   case X86::CALL64pcrel32_RVMARKER:
   case X86::CALL64r_RVMARKER:
   case X86::CALL64m_RVMARKER:
Index: llvm/lib/Target/X86/X86AsmPrinter.h
===================================================================
--- llvm/lib/Target/X86/X86AsmPrinter.h
+++ llvm/lib/Target/X86/X86AsmPrinter.h
@@ -98,6 +98,9 @@
 
   void LowerFENTRY_CALL(const MachineInstr &MI, X86MCInstLower &MCIL);
 
+  // KCFI specific lowering for X86.
+  void LowerKCFI_CHECK(const MachineInstr &MI);
+
   // Address sanitizer specific lowering for X86.
   void LowerASAN_CHECK_MEMACCESS(const MachineInstr &MI);
 
@@ -148,6 +151,7 @@
   bool runOnMachineFunction(MachineFunction &MF) override;
   void emitFunctionBodyStart() override;
   void emitFunctionBodyEnd() override;
+  void emitKCFITypeId(const MachineFunction &MF) override;
 
   bool shouldEmitWeakSwiftAsyncExtendedFramePointerFlags() const override {
     return ShouldEmitWeakSwiftAsyncExtendedFramePointerFlags;
Index: llvm/lib/Target/X86/X86AsmPrinter.cpp
===================================================================
--- llvm/lib/Target/X86/X86AsmPrinter.cpp
+++ llvm/lib/Target/X86/X86AsmPrinter.cpp
@@ -33,6 +33,7 @@
 #include "llvm/MC/MCCodeEmitter.h"
 #include "llvm/MC/MCContext.h"
 #include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCInstBuilder.h"
 #include "llvm/MC/MCSectionCOFF.h"
 #include "llvm/MC/MCSectionELF.h"
 #include "llvm/MC/MCSectionMachO.h"
@@ -108,6 +109,39 @@
   }
 }
 
+void X86AsmPrinter::emitKCFITypeId(const MachineFunction &MF) {
+  // Emit a function symbol for the type identifier data.
+  MCSymbol *FnSym = OutContext.getOrCreateSymbol("__cfi_" + MF.getName());
+  // Use the same linkage as the parent function
+  emitLinkage(&MF.getFunction(), FnSym);
+  if (MAI->hasDotTypeDotSizeDirective())
+    OutStreamer->emitSymbolAttribute(FnSym, MCSA_ELF_TypeFunction);
+  OutStreamer->emitLabel(FnSym);
+
+  EmitAndCountInstruction(MCInstBuilder(X86::INT3));
+  EmitAndCountInstruction(MCInstBuilder(X86::INT3));
+
+  // Embed the type hash in a mov instruction.
+  auto *Hash = cast<ConstantInt>(MF.getFunction().getPrefixData());
+
+  EmitAndCountInstruction(MCInstBuilder(X86::MOV32ri)
+                              .addReg(X86::EAX)
+                              .addImm(Hash->getZExtValue()));
+
+  EmitAndCountInstruction(MCInstBuilder(X86::INT3));
+  EmitAndCountInstruction(MCInstBuilder(X86::INT3));
+
+  if (MAI->hasDotTypeDotSizeDirective()) {
+    MCSymbol *EndSym = OutContext.createTempSymbol("__cfi_func_end");
+    OutStreamer->emitLabel(EndSym);
+
+    const MCExpr *SizeExp = MCBinaryExpr::createSub(
+        MCSymbolRefExpr::create(EndSym, OutContext),
+        MCSymbolRefExpr::create(FnSym, OutContext), OutContext);
+    OutStreamer->emitELFSize(FnSym, SizeExp);
+  }
+}
+
 /// PrintSymbolOperand - Print a raw symbol reference operand.  This handles
 /// jump tables, constant pools, global address and external symbols, all of
 /// which print to a label with various suffixes for relocation types etc.
Index: llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
+++ llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
@@ -892,6 +892,23 @@
   return AArch64::TCRETURNri;
 }
 
+static unsigned getKCFICallOpcode(unsigned Opc) {
+  switch (Opc) {
+  case AArch64::BLR:
+    return AArch64::KCFI_BLR;
+  case AArch64::BLRNoIP:
+    return AArch64::KCFI_BLRNoIP;
+  case AArch64::BLR_BTI:
+    return AArch64::KCFI_BLR_BTI;
+  case AArch64::TCRETURNri:
+    return AArch64::KCFI_TCRETURNri;
+  case AArch64::TCRETURNriBTI:
+    return AArch64::KCFI_TCRETURNriBTI;
+  default:
+    llvm_unreachable("unexpected opcode");
+  }
+}
+
 static const uint32_t *
 getMaskForArgs(SmallVectorImpl<AArch64CallLowering::ArgInfo> &OutArgs,
                AArch64CallLowering::CallLoweringInfo &Info,
@@ -943,7 +960,19 @@
     CallSeqStart = MIRBuilder.buildInstr(AArch64::ADJCALLSTACKDOWN);
 
   unsigned Opc = getCallOpcode(MF, Info.Callee.isReg(), true);
+  unsigned CalleeOpNo = 0;
+
+  if (Info.KCFIType)
+    Opc = getKCFICallOpcode(Opc);
+
   auto MIB = MIRBuilder.buildInstrNoInsert(Opc);
+
+  // Add the KCFI type before the call target.
+  if (Info.KCFIType) {
+    MIB.addImm(Info.KCFIType->getZExtValue());
+    ++CalleeOpNo;
+  }
+
   MIB.add(Info.Callee);
 
   // Byte offset for the tail call. When we are sibcalling, this will always
@@ -1045,7 +1074,7 @@
   // If we have -tailcallopt, we need to adjust the stack. We'll do the call
   // sequence start and end here.
   if (!IsSibCall) {
-    MIB->getOperand(1).setImm(FPDiff);
+    MIB->getOperand(CalleeOpNo + 1).setImm(FPDiff);
     CallSeqStart.addImm(0).addImm(0);
     // End the call sequence *before* emitting the call. Normally, we would
     // tidy the frame up after the call. However, here, we've laid out the
@@ -1059,10 +1088,11 @@
 
   // If Callee is a reg, since it is used by a target specific instruction,
   // it must have a register class matching the constraint of that instruction.
-  if (MIB->getOperand(0).isReg())
+  if (MIB->getOperand(CalleeOpNo).isReg())
     constrainOperandRegClass(MF, *TRI, MRI, *MF.getSubtarget().getInstrInfo(),
                              *MF.getSubtarget().getRegBankInfo(), *MIB,
-                             MIB->getDesc(), MIB->getOperand(0), 0);
+                             MIB->getDesc(), MIB->getOperand(CalleeOpNo),
+                             CalleeOpNo);
 
   MF.getFrameInfo().setHasTailCall();
   Info.LoweredTailCall = true;
@@ -1146,6 +1176,10 @@
   else
     Opc = getCallOpcode(MF, Info.Callee.isReg(), false);
 
+  bool IsKCFICall = Info.KCFIType && Opc != AArch64::BLR_RVMARKER;
+  if (IsKCFICall)
+    Opc = getKCFICallOpcode(Opc);
+
   auto MIB = MIRBuilder.buildInstrNoInsert(Opc);
   unsigned CalleeOpNo = 0;
 
@@ -1155,6 +1189,10 @@
     Function *ARCFn = *objcarc::getAttachedARCFunction(Info.CB);
     MIB.addGlobalAddress(ARCFn);
     ++CalleeOpNo;
+  } else if (IsKCFICall) {
+    // Add the KCFI type before the call target.
+    MIB.addImm(Info.KCFIType->getZExtValue());
+    ++CalleeOpNo;
   }
 
   MIB.add(Info.Callee);
Index: llvm/lib/Target/AArch64/AArch64InstrInfo.td
===================================================================
--- llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -308,6 +308,7 @@
                                            SDTCisSameAs<0,2>,
                                            SDTCisSameAs<0,3>]>;
 def SDT_AArch64TCRET : SDTypeProfile<0, 2, [SDTCisPtrTy<0>]>;
+def SDT_AArch64TCRET_KCFI : SDTypeProfile<0, 3, [SDTCisVT<0, i32>, SDTCisPtrTy<1>]>;
 def SDT_AArch64PREFETCH : SDTypeProfile<0, 2, [SDTCisVT<0, i32>, SDTCisPtrTy<1>]>;
 
 def SDT_AArch64ITOF  : SDTypeProfile<1, 1, [SDTCisFP<0>, SDTCisSameAs<0,1>]>;
@@ -535,6 +536,15 @@
                              [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
                               SDNPVariadic]>;
 
+def AArch64call_kcfi    : SDNode<"AArch64ISD::KCFI_CALL",
+                                SDTypeProfile<0, -1, [SDTCisVT<0, i32>, SDTCisPtrTy<1>]>,
+                                [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
+                                 SDNPVariadic]>;
+def AArch64call_kcfi_bti: SDNode<"AArch64ISD::KCFI_CALL_BTI",
+                                SDTypeProfile<0, -1, [SDTCisVT<0, i32>, SDTCisPtrTy<1>]>,
+                                [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
+                                 SDNPVariadic]>;
+
 def AArch64brcond        : SDNode<"AArch64ISD::BRCOND", SDT_AArch64Brcond,
                                 [SDNPHasChain]>;
 def AArch64cbz           : SDNode<"AArch64ISD::CBZ", SDT_AArch64cbz,
@@ -650,6 +660,9 @@
 def AArch64tcret: SDNode<"AArch64ISD::TC_RETURN", SDT_AArch64TCRET,
                   [SDNPHasChain,  SDNPOptInGlue, SDNPVariadic]>;
 
+def AArch64tcret_kcfi: SDNode<"AArch64ISD::KCFI_TC_RETURN", SDT_AArch64TCRET_KCFI,
+                  [SDNPHasChain,  SDNPOptInGlue, SDNPVariadic]>;
+
 def AArch64Prefetch        : SDNode<"AArch64ISD::PREFETCH", SDT_AArch64PREFETCH,
                                [SDNPHasChain, SDNPSideEffect]>;
 
@@ -1420,6 +1433,18 @@
 def MOVbaseTLS : Pseudo<(outs GPR64:$dst), (ins),
                        [(set GPR64:$dst, AArch64threadpointer)]>, Sched<[WriteSys]>;
 
+let Defs = [ X16, X17, NZCV ] in {
+def KCFI_CHECK : Pseudo<
+  (outs), (ins GPR64noip:$ptr, i32imm:$type), []>, Sched<[]>;
+}
+
+// TCRETURNriBTI requires the target address to be in X16 or X17. Define a
+// variant of KCFI_CHECK that avoids clobbering these registers.
+let Defs = [ X9, X10, NZCV ] in {
+def KCFI_CHECK_BTI : Pseudo<
+  (outs), (ins rtcGPR64:$ptr, i32imm:$type), []>, Sched<[]>;
+}
+
 let Uses = [ X9 ], Defs = [ X16, X17, LR, NZCV ] in {
 def HWASAN_CHECK_MEMACCESS : Pseudo<
   (outs), (ins GPR64noip:$ptr, i32imm:$accessinfo),
@@ -2391,6 +2416,15 @@
                      Sched<[WriteBrReg]>;
   def BLR_BTI : Pseudo<(outs), (ins variable_ops), []>,
                 Sched<[WriteBrReg]>;
+
+  let usesCustomInserter = 1 in {
+    def KCFI_BLR : Pseudo<(outs), (ins i32imm:$type, GPR64:$Rn), []>,
+                   Sched<[WriteBrReg]>;
+    def KCFI_BLRNoIP : Pseudo<(outs), (ins i32imm:$type, GPR64noip:$Rn), []>,
+                       Sched<[WriteBrReg]>;
+    def KCFI_BLR_BTI : Pseudo<(outs), (ins i32imm:$type, GPR64:$Rn), []>,
+                       Sched<[WriteBrReg]>;
+  }
 } // isCall
 
 def : Pat<(AArch64call GPR64:$Rn),
@@ -2408,6 +2442,16 @@
           (BLR_BTI GPR64:$Rn)>,
       Requires<[NoSLSBLRMitigation]>;
 
+def : Pat<(AArch64call_kcfi timm:$type, GPR64:$Rn),
+          (KCFI_BLR timm:$type, GPR64:$Rn)>,
+      Requires<[NoSLSBLRMitigation]>;
+def : Pat<(AArch64call_kcfi timm:$type, GPR64noip:$Rn),
+          (KCFI_BLRNoIP timm:$type, GPR64noip:$Rn)>,
+      Requires<[SLSBLRMitigation]>;
+def : Pat<(AArch64call_kcfi_bti timm:$type, GPR64:$Rn),
+          (KCFI_BLR_BTI timm:$type, GPR64:$Rn)>,
+      Requires<[NoSLSBLRMitigation]>;
+
 let isBranch = 1, isTerminator = 1, isBarrier = 1, isIndirectBranch = 1 in {
 def BR  : BranchReg<0b0000, "br", [(brind GPR64:$Rn)]>;
 } // isBranch, isTerminator, isBarrier, isIndirectBranch
@@ -8126,6 +8170,13 @@
   // allowed to tail-call a "BTI c" instruction.
   def TCRETURNriBTI : Pseudo<(outs), (ins rtcGPR64:$dst, i32imm:$FPDiff), []>,
                       Sched<[WriteBrReg]>;
+
+  let usesCustomInserter = 1 in {
+    def KCFI_TCRETURNri : Pseudo<(outs), (ins i32imm:$type, tcGPR64:$dst, i32imm:$FPDiff), []>,
+                          Sched<[WriteBrReg]>;
+    def KCFI_TCRETURNriBTI : Pseudo<(outs), (ins i32imm:$type, rtcGPR64:$dst, i32imm:$FPDiff), []>,
+                             Sched<[WriteBrReg]>;
+  }
 }
 
 def : Pat<(AArch64tcret tcGPR64:$dst, (i32 timm:$FPDiff)),
@@ -8139,6 +8190,13 @@
 def : Pat<(AArch64tcret texternalsym:$dst, (i32 timm:$FPDiff)),
           (TCRETURNdi texternalsym:$dst, imm:$FPDiff)>;
 
+def : Pat<(AArch64tcret_kcfi timm:$type, tcGPR64:$dst, (i32 timm:$FPDiff)),
+          (KCFI_TCRETURNri timm:$type, tcGPR64:$dst, imm:$FPDiff)>,
+      Requires<[NotUseBTI]>;
+def : Pat<(AArch64tcret_kcfi timm:$type, rtcGPR64:$dst, (i32 timm:$FPDiff)),
+          (KCFI_TCRETURNriBTI timm:$type, rtcGPR64:$dst, imm:$FPDiff)>,
+      Requires<[UseBTI]>;
+
 def MOVMCSym : Pseudo<(outs GPR64:$dst), (ins i64imm:$sym), []>, Sched<[]>;
 def : Pat<(i64 (AArch64LocalRecover mcsym:$sym)), (MOVMCSym mcsym:$sym)>;
 
Index: llvm/lib/Target/AArch64/AArch64ISelLowering.h
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -57,6 +57,11 @@
 
   CALL_BTI, // Function call followed by a BTI instruction.
 
+  // Indirect calls with KCFI checks.
+  KCFI_CALL,
+  KCFI_CALL_BTI,
+  KCFI_TC_RETURN,
+
   // Produces the full sequence of instructions for getting the thread pointer
   // offset of a variable into X0, using the TLSDesc model.
   TLSDESC_CALLSEQ,
@@ -571,6 +576,9 @@
   MachineBasicBlock *EmitLoweredCatchRet(MachineInstr &MI,
                                            MachineBasicBlock *BB) const;
 
+  MachineBasicBlock *EmitLoweredKCFICall(MachineInstr &MI,
+                                         MachineBasicBlock *BB) const;
+
   MachineBasicBlock *
   EmitInstrWithCustomInserter(MachineInstr &MI,
                               MachineBasicBlock *MBB) const override;
@@ -808,6 +816,8 @@
     return true;
   }
 
+  bool supportKCFIBundles() const override { return true; }
+
   /// Enable aggressive FMA fusion on targets that want it.
   bool enableAggressiveFMAFusion(EVT VT) const override;
 
Index: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2242,6 +2242,9 @@
     MAKE_CASE(AArch64ISD::MOPS_MEMCOPY)
     MAKE_CASE(AArch64ISD::MOPS_MEMMOVE)
     MAKE_CASE(AArch64ISD::CALL_BTI)
+    MAKE_CASE(AArch64ISD::KCFI_CALL)
+    MAKE_CASE(AArch64ISD::KCFI_CALL_BTI)
+    MAKE_CASE(AArch64ISD::KCFI_TC_RETURN)
   }
 #undef MAKE_CASE
   return nullptr;
@@ -2315,6 +2318,48 @@
   return BB;
 }
 
+MachineBasicBlock *
+AArch64TargetLowering::EmitLoweredKCFICall(MachineInstr &MI,
+                                           MachineBasicBlock *BB) const {
+  static const std::map<unsigned, std::tuple<unsigned, unsigned>> OpcMap = {
+      {AArch64::KCFI_BLR, {AArch64::KCFI_CHECK, AArch64::BLR}},
+      {AArch64::KCFI_BLRNoIP, {AArch64::KCFI_CHECK, AArch64::BLRNoIP}},
+      {AArch64::KCFI_BLR_BTI, {AArch64::KCFI_CHECK, AArch64::BLR_BTI}},
+      {AArch64::KCFI_TCRETURNri, {AArch64::KCFI_CHECK, AArch64::TCRETURNri}},
+      {AArch64::KCFI_TCRETURNriBTI,
+       {AArch64::KCFI_CHECK_BTI, AArch64::TCRETURNriBTI}}};
+
+  auto Opcs = OpcMap.find(MI.getOpcode());
+  if (Opcs == OpcMap.end())
+    llvm_unreachable("unexpected opcode");
+
+  unsigned CheckOpc, CallOpc;
+  std::tie(CheckOpc, CallOpc) = Opcs->second;
+
+  const AArch64InstrInfo *TII = Subtarget->getInstrInfo();
+  MachineOperand Type = MI.getOperand(0);
+
+  // Set the correct call opcode and drop the type operand.
+  MI.setDesc(TII->get(CallOpc));
+  MI.removeOperand(0);
+
+  MachineOperand &Target = MI.getOperand(0);
+  assert(Type.isImm() && Target.isReg() &&
+         "Invalid operand type for a KCFI call");
+
+  // Emit a KCFI check before the call.
+  MachineInstr *Check =
+      BuildMI(*BB, MI, MI.getDebugLoc(), TII->get(CheckOpc)).getInstr();
+  Check->addOperand(Target);
+  Check->addOperand(Type);
+
+  // Note: There's no need to bundle the instructions as we're fine with
+  // additional machine instructions being emitted between the check and
+  // the call. This means we don't have to worry about expanding BLR_BTI
+  // and TCRETURNri* pseudos.
+  return BB;
+}
+
 MachineBasicBlock *AArch64TargetLowering::EmitInstrWithCustomInserter(
     MachineInstr &MI, MachineBasicBlock *BB) const {
   switch (MI.getOpcode()) {
@@ -2345,6 +2390,13 @@
 
   case AArch64::CATCHRET:
     return EmitLoweredCatchRet(MI, BB);
+
+  case AArch64::KCFI_BLR:
+  case AArch64::KCFI_BLRNoIP:
+  case AArch64::KCFI_BLR_BTI:
+  case AArch64::KCFI_TCRETURNri:
+  case AArch64::KCFI_TCRETURNriBTI:
+    return EmitLoweredKCFICall(MI, BB);
   }
 }
 
@@ -6158,6 +6210,7 @@
 
   AArch64FunctionInfo *FuncInfo = MF.getInfo<AArch64FunctionInfo>();
   bool TailCallOpt = MF.getTarget().Options.GuaranteedTailCallOpt;
+  bool IsKCFICall = CLI.CB && CLI.CB->isIndirectCall() && CLI.KCFIType;
   bool IsSibCall = false;
   bool GuardWithBTI = false;
 
@@ -6576,11 +6629,21 @@
 
   SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Glue);
 
+  // Insert the type as the first operand after the chain for KCFI calls.
+  if (IsKCFICall)
+    Ops.insert(
+        Ops.begin() + 1,
+        DAG.getTargetConstant(CLI.KCFIType->getZExtValue(), DL, MVT::i32));
+
   // If we're doing a tall call, use a TC_RETURN here rather than an
   // actual call instruction.
   if (IsTailCall) {
+    unsigned TCOpc = AArch64ISD::TC_RETURN;
+    if (IsKCFICall)
+      TCOpc = AArch64ISD::KCFI_TC_RETURN;
+
     MF.getFrameInfo().setHasTailCall();
-    SDValue Ret = DAG.getNode(AArch64ISD::TC_RETURN, DL, NodeTys, Ops);
+    SDValue Ret = DAG.getNode(TCOpc, DL, NodeTys, Ops);
     DAG.addCallSiteInfo(Ret.getNode(), std::move(CSInfo));
     return Ret;
   }
@@ -6599,7 +6662,11 @@
     Function *ARCFn = *objcarc::getAttachedARCFunction(CLI.CB);
     auto GA = DAG.getTargetGlobalAddress(ARCFn, DL, PtrVT);
     Ops.insert(Ops.begin() + 1, GA);
-  } else if (GuardWithBTI)
+  } else if (IsKCFICall && GuardWithBTI)
+    CallOpc = AArch64ISD::KCFI_CALL_BTI;
+  else if (IsKCFICall)
+    CallOpc = AArch64ISD::KCFI_CALL;
+  else if (GuardWithBTI)
     CallOpc = AArch64ISD::CALL_BTI;
 
   // Returns a chain and a flag for retval copy to use.
Index: llvm/lib/Target/AArch64/AArch64FastISel.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64FastISel.cpp
+++ llvm/lib/Target/AArch64/AArch64FastISel.cpp
@@ -3134,6 +3134,11 @@
       MF->getInfo<AArch64FunctionInfo>()->branchTargetEnforcement())
     return false;
 
+  // Allow SelectionDAG isel to handle indirect calls with KCFI checks.
+  if (CLI.CB && CLI.CB->isIndirectCall() &&
+      CLI.CB->getOperandBundle(LLVMContext::OB_kcfi))
+    return false;
+
   // Allow SelectionDAG isel to handle tail calls.
   if (IsTailCall)
     return false;
Index: llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -111,6 +111,7 @@
 
   typedef std::tuple<unsigned, bool, uint32_t> HwasanMemaccessTuple;
   std::map<HwasanMemaccessTuple, MCSymbol *> HwasanMemaccessSymbols;
+  void LowerKCFI_CHECK(const MachineInstr &MI);
   void LowerHWASAN_CHECK_MEMACCESS(const MachineInstr &MI);
   void emitHwasanMemaccessSymbols(Module &M);
 
@@ -317,6 +318,90 @@
   recordSled(CurSled, MI, Kind, 2);
 }
 
+void AArch64AsmPrinter::LowerKCFI_CHECK(const MachineInstr &MI) {
+  unsigned FunctionTypeReg = AArch64::W16;
+  unsigned ExpectedTypeReg = AArch64::W17;
+
+  // Don't clobber X16 or X17 to avoid unnecessary register shuffling
+  // with BTI tail calls, which must use one of these registers.
+  if (MI.getOpcode() == AArch64::KCFI_CHECK_BTI) {
+    FunctionTypeReg = AArch64::W9;
+    ExpectedTypeReg = AArch64::W10;
+  }
+
+  Register AddrReg = MI.getOperand(0).getReg();
+
+  if (AddrReg.id() == AArch64::XZR) {
+    // Checking XZR makes no sense. Instead of emitting a load, zero the
+    // FunctionTypeReg and use it for the ESR AddrIndex below.
+    AddrReg = Register(getXRegFromWReg(FunctionTypeReg));
+    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::ORRXrs)
+                                     .addReg(AddrReg)
+                                     .addReg(AArch64::XZR)
+                                     .addReg(AArch64::XZR)
+                                     .addImm(0));
+  } else {
+    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::LDURWi)
+                                     .addReg(FunctionTypeReg)
+                                     .addReg(AddrReg)
+                                     .addImm(-4));
+  }
+
+  int64_t Type = MI.getOperand(1).getImm();
+  EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::MOVKWi)
+                                   .addReg(ExpectedTypeReg)
+                                   .addReg(ExpectedTypeReg)
+                                   .addImm(Type & 0xFFFF)
+                                   .addImm(0));
+  EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::MOVKWi)
+                                   .addReg(ExpectedTypeReg)
+                                   .addReg(ExpectedTypeReg)
+                                   .addImm((Type >> 16) & 0xFFFF)
+                                   .addImm(16));
+
+  EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::SUBSWrs)
+                                   .addReg(AArch64::WZR)
+                                   .addReg(FunctionTypeReg)
+                                   .addReg(ExpectedTypeReg)
+                                   .addImm(0));
+
+  MCSymbol *Pass = OutContext.createTempSymbol();
+  EmitToStreamer(*OutStreamer,
+                 MCInstBuilder(AArch64::Bcc)
+                     .addImm(AArch64CC::EQ)
+                     .addExpr(MCSymbolRefExpr::create(Pass, OutContext)));
+
+  assert(AddrReg.isPhysical() &&
+         "Unable to encode the target register for the KCFI trap");
+
+  // The base ESR is 0x8000 and the register information is encoded
+  // in bits 0-9 as follows:
+  // - 0-4: n, where the register Xn contains the target address
+  // - 5-9: m, where the register Wm contains the type hash
+  // Where n, m are in [0, 30].
+  unsigned TypeIndex = ExpectedTypeReg - AArch64::W0;
+  unsigned AddrIndex;
+
+  switch (AddrReg.id()) {
+  default:
+    AddrIndex = AddrReg.id() - AArch64::X0;
+    break;
+  case AArch64::FP:
+    AddrIndex = 29;
+    break;
+  case AArch64::LR:
+    AddrIndex = 30;
+    break;
+  }
+
+  assert(AddrIndex < 31 && TypeIndex < 31);
+
+  unsigned ESR = 0x8000 | ((TypeIndex & 31) << 5) | (AddrIndex & 31);
+  EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::BRK).addImm(ESR));
+
+  OutStreamer->emitLabel(Pass);
+}
+
 void AArch64AsmPrinter::LowerHWASAN_CHECK_MEMACCESS(const MachineInstr &MI) {
   Register Reg = MI.getOperand(0).getReg();
   bool IsShort =
@@ -1434,6 +1519,11 @@
     LowerPATCHABLE_TAIL_CALL(*MI);
     return;
 
+  case AArch64::KCFI_CHECK:
+  case AArch64::KCFI_CHECK_BTI:
+    LowerKCFI_CHECK(*MI);
+    return;
+
   case AArch64::HWASAN_CHECK_MEMACCESS:
   case AArch64::HWASAN_CHECK_MEMACCESS_SHORTGRANULES:
     LowerHWASAN_CHECK_MEMACCESS(*MI);
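
For reference (not part of this patch), the BRK immediate encoding described in
LowerKCFI_CHECK above, base 0x8000 with the target-address register index in
bits 0-4 and the type-hash register index in bits 5-9, can be decoded on the
trap-handler side roughly as follows. This is a minimal standalone C++ sketch;
the helper names are illustrative:

#include <cstdint>
#include <cstdio>

// Register indices recovered from a KCFI BRK immediate.
struct KCFITrapInfo {
  unsigned AddrRegIndex; // n: Xn holds the call target
  unsigned TypeRegIndex; // m: Wm holds the expected type hash
};

// Returns true if Imm looks like a KCFI trap ESR: 0x8000 | (m << 5) | n.
static bool decodeKCFIBrkImm(uint16_t Imm, KCFITrapInfo &Info) {
  if ((Imm & 0xFC00) != 0x8000)
    return false;
  Info.AddrRegIndex = Imm & 31;
  Info.TypeRegIndex = (Imm >> 5) & 31;
  return Info.AddrRegIndex < 31 && Info.TypeRegIndex < 31;
}

int main() {
  // Example: target address in X1, expected type in W17,
  // i.e. 0x8000 | (17 << 5) | 1 = 0x8221.
  KCFITrapInfo Info;
  if (decodeKCFIBrkImm(0x8221, Info))
    printf("target in X%u, type in W%u\n", Info.AddrRegIndex, Info.TypeRegIndex);
  return 0;
}
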
Index: llvm/lib/MC/MCObjectFileInfo.cpp
===================================================================
--- llvm/lib/MC/MCObjectFileInfo.cpp
+++ llvm/lib/MC/MCObjectFileInfo.cpp
@@ -1121,6 +1121,25 @@
                             cast<MCSymbolELF>(TextSec.getBeginSymbol()));
 }
 
+MCSection *
+MCObjectFileInfo::getKCFITrapSection(const MCSection &TextSec) const {
+  if (Ctx->getObjectFileType() != MCContext::IsELF)
+    return nullptr;
+
+  const MCSectionELF &ElfSec = static_cast<const MCSectionELF &>(TextSec);
+  unsigned Flags = ELF::SHF_LINK_ORDER | ELF::SHF_ALLOC | ELF::SHF_WRITE;
+  StringRef GroupName;
+  if (const MCSymbol *Group = ElfSec.getGroup()) {
+    GroupName = Group->getName();
+    Flags |= ELF::SHF_GROUP;
+  }
+
+  return Ctx->getELFSection(".kcfi_traps", ELF::SHT_PROGBITS, Flags, 0,
+                            GroupName,
+                            /*IsComdat=*/true, ElfSec.getUniqueID(),
+                            cast<MCSymbolELF>(TextSec.getBeginSymbol()));
+}
+
 MCSection *
 MCObjectFileInfo::getPseudoProbeSection(const MCSection *TextSec) const {
   if (Ctx->getObjectFileType() == MCContext::IsELF) {
Index: llvm/lib/IR/Verifier.cpp
===================================================================
--- llvm/lib/IR/Verifier.cpp
+++ llvm/lib/IR/Verifier.cpp
@@ -3348,7 +3348,7 @@
   bool FoundDeoptBundle = false, FoundFuncletBundle = false,
        FoundGCTransitionBundle = false, FoundCFGuardTargetBundle = false,
        FoundPreallocatedBundle = false, FoundGCLiveBundle = false,
-       FoundPtrauthBundle = false,
+       FoundPtrauthBundle = false, FoundKCFIBundle = false,
        FoundAttachedCallBundle = false;
   for (unsigned i = 0, e = Call.getNumOperandBundles(); i < e; ++i) {
     OperandBundleUse BU = Call.getOperandBundleAt(i);
@@ -3384,6 +3384,14 @@
             "Ptrauth bundle key operand must be an i32 constant", Call);
       Check(BU.Inputs[1]->getType()->isIntegerTy(64),
             "Ptrauth bundle discriminator operand must be an i64", Call);
+    } else if (Tag == LLVMContext::OB_kcfi) {
+      Check(!FoundKCFIBundle, "Multiple kcfi operand bundles", Call);
+      FoundKCFIBundle = true;
+      Check(BU.Inputs.size() == 1, "Expected exactly one kcfi bundle operand",
+            Call);
+      Check(isa<ConstantInt>(BU.Inputs[0]) &&
+                BU.Inputs[0]->getType()->isIntegerTy(32),
+            "Kcfi bundle operand must be an i32 constant", Call);
     } else if (Tag == LLVMContext::OB_preallocated) {
       Check(!FoundPreallocatedBundle, "Multiple preallocated operand bundles",
             Call);
Index: llvm/lib/IR/LLVMContext.cpp
===================================================================
--- llvm/lib/IR/LLVMContext.cpp
+++ llvm/lib/IR/LLVMContext.cpp
@@ -87,6 +87,11 @@
          "ptrauth operand bundle id drifted!");
   (void)PtrauthEntry;
 
+  auto *KCFIEntry = pImpl->getOrInsertBundleTag("kcfi");
+  assert(KCFIEntry->second == LLVMContext::OB_kcfi &&
+         "kcfi operand bundle id drifted!");
+  (void)KCFIEntry;
+
   SyncScope::ID SingleThreadSSID =
       pImpl->getOrInsertSyncScopeID("singlethread");
   assert(SingleThreadSSID == SyncScope::SingleThread &&
Index: llvm/lib/IR/Instructions.cpp
===================================================================
--- llvm/lib/IR/Instructions.cpp
+++ llvm/lib/IR/Instructions.cpp
@@ -505,7 +505,8 @@
   // Implementation note: this is a conservative implementation of operand
   // bundle semantics, where *any* non-assume operand bundle (other than
   // ptrauth) forces a callsite to be at least readonly.
-  return hasOperandBundlesOtherThan(LLVMContext::OB_ptrauth) &&
+  return hasOperandBundlesOtherThan(
+             {LLVMContext::OB_ptrauth, LLVMContext::OB_kcfi}) &&
          getIntrinsicID() != Intrinsic::assume;
 }
 
Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
===================================================================
--- llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -7808,6 +7808,16 @@
   if (TLI.supportSwiftError() && SwiftErrorVal)
     isTailCall = false;
 
+  ConstantInt *KCFIType = nullptr;
+  auto Bundle = CB.getOperandBundle(LLVMContext::OB_kcfi);
+  if (Bundle && CB.isIndirectCall()) {
+    if (!TLI.supportKCFIBundles())
+      report_fatal_error(
+          "Target doesn't support calls with kcfi operand bundles.");
+    KCFIType = cast<ConstantInt>(Bundle->Inputs[0]);
+    assert(KCFIType->getType()->isIntegerTy(32) && "Invalid KCFI type");
+  }
+
   TargetLowering::CallLoweringInfo CLI(DAG);
   CLI.setDebugLoc(getCurSDLoc())
       .setChain(getRoot())
@@ -7815,7 +7825,8 @@
       .setTailCall(isTailCall)
       .setConvergent(CB.isConvergent())
       .setIsPreallocated(
-          CB.countOperandBundlesOfType(LLVMContext::OB_preallocated) != 0);
+          CB.countOperandBundlesOfType(LLVMContext::OB_preallocated) != 0)
+      .setKCFIType(KCFIType);
   std::pair<SDValue, SDValue> Result = lowerInvokable(CLI, EHPadBB);
 
   if (Result.first.getNode()) {
@@ -8359,7 +8370,7 @@
   assert(!I.hasOperandBundlesOtherThan(
              {LLVMContext::OB_deopt, LLVMContext::OB_funclet,
               LLVMContext::OB_cfguardtarget, LLVMContext::OB_preallocated,
-              LLVMContext::OB_clang_arc_attachedcall}) &&
+              LLVMContext::OB_clang_arc_attachedcall, LLVMContext::OB_kcfi}) &&
          "Cannot lower calls with arbitrary operand bundles!");
 
   SDValue Callee = getValue(I.getCalledOperand());
Index: llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
===================================================================
--- llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+++ llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
@@ -155,6 +155,12 @@
     }
   }
 
+  auto Bundle = CB.getOperandBundle(LLVMContext::OB_kcfi);
+  if (Bundle && CB.isIndirectCall()) {
+    Info.KCFIType = cast<ConstantInt>(Bundle->Inputs[0]);
+    assert(Info.KCFIType->getType()->isIntegerTy(32) && "Invalid KCFI type");
+  }
+
   Info.CB = &CB;
   Info.KnownCallees = CB.getMetadata(LLVMContext::MD_callees);
   Info.CallConv = CallConv;
Index: llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
===================================================================
--- llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -924,21 +924,26 @@
 
   // Emit the prefix data.
   if (F.hasPrefixData()) {
-    if (MAI->hasSubsectionsViaSymbols()) {
-      // Preserving prefix data on platforms which use subsections-via-symbols
-      // is a bit tricky. Here we introduce a symbol for the prefix data
-      // and use the .alt_entry attribute to mark the function's real entry point
-      // as an alternative entry point to the prefix-data symbol.
-      MCSymbol *PrefixSym = OutContext.createLinkerPrivateTempSymbol();
-      OutStreamer->emitLabel(PrefixSym);
+    bool SubsectionsViaSymbols = MAI->hasSubsectionsViaSymbols();
+
+    if (F.hasFnAttribute("kcfi-target"))
+      emitKCFITypeId(*MF);
+    else {
+      if (SubsectionsViaSymbols) {
+        // Preserving prefix data on platforms which use subsections-via-symbols
+        // is a bit tricky. Here we introduce a symbol for the prefix data
+        // and use the .alt_entry attribute to mark the function's real entry
+        // point as an alternative entry point to the prefix-data symbol.
+        MCSymbol *PrefixSym = OutContext.createLinkerPrivateTempSymbol();
+        OutStreamer->emitLabel(PrefixSym);
+      }
 
       emitGlobalConstant(F.getParent()->getDataLayout(), F.getPrefixData());
+    }
 
-      // Emit an .alt_entry directive for the actual function symbol.
+    // Emit an .alt_entry directive for the actual function symbol.
+    if (SubsectionsViaSymbols)
       OutStreamer->emitSymbolAttribute(CurrentFnSym, MCSA_AltEntry);
-    } else {
-      emitGlobalConstant(F.getParent()->getDataLayout(), F.getPrefixData());
-    }
   }
 
   // Emit M NOPs for -fpatchable-function-entry=N,M where M>0. We arbitrarily
@@ -1326,6 +1331,28 @@
   OutStreamer->PopSection();
 }
 
+void AsmPrinter::emitKCFITrapEntry(const MachineFunction &MF,
+                                   const MCSymbol *Symbol) {
+  MCSection *Section =
+      getObjFileLowering().getKCFITrapSection(*MF.getSection());
+
+  if (Section) {
+    OutStreamer->PushSection();
+    OutStreamer->SwitchSection(Section);
+
+    MCSymbol *Loc = OutContext.createLinkerPrivateTempSymbol();
+    OutStreamer->emitLabel(Loc);
+    OutStreamer->emitAbsoluteSymbolDiff(Symbol, Loc, 4);
+
+    OutStreamer->PopSection();
+  }
+}
+
+void AsmPrinter::emitKCFITypeId(const MachineFunction &MF) {
+  const Function &F = MF.getFunction();
+  emitGlobalConstant(F.getParent()->getDataLayout(), F.getPrefixData());
+}
+
 void AsmPrinter::emitPseudoProbe(const MachineInstr &MI) {
   if (PP) {
     auto GUID = MI.getOperand(0).getImm();
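
emitKCFITrapEntry above records each trap location as a 4-byte offset relative
to the .kcfi_traps entry itself (emitAbsoluteSymbolDiff emits Symbol - Loc), so
a consumer would resolve entries back to absolute addresses along these lines.
A hedged sketch, not part of the patch; the section-bound parameters are
illustrative:

#include <cstdint>

// Each .kcfi_traps entry holds (TrapAddress - EntryAddress) as an int32_t.
static uintptr_t kcfiTrapAddress(const int32_t *Entry) {
  return reinterpret_cast<uintptr_t>(Entry) + *Entry;
}

// Scan a [Start, End) range of entries, e.g. delimited by linker-provided
// section bounds, for a faulting address.
static bool isKCFITrap(uintptr_t Addr, const int32_t *Start,
                       const int32_t *End) {
  for (const int32_t *Entry = Start; Entry != End; ++Entry)
    if (kcfiTrapAddress(Entry) == Addr)
      return true;
  return false;
}
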
Index: llvm/include/llvm/MC/MCObjectFileInfo.h
===================================================================
--- llvm/include/llvm/MC/MCObjectFileInfo.h
+++ llvm/include/llvm/MC/MCObjectFileInfo.h
@@ -16,6 +16,7 @@
 #include "llvm/ADT/Optional.h"
 #include "llvm/ADT/Triple.h"
 #include "llvm/BinaryFormat/Swift.h"
+#include "llvm/MC/MCSection.h"
 #include "llvm/Support/VersionTuple.h"
 
 #include <array>
@@ -359,6 +360,8 @@
 
   MCSection *getBBAddrMapSection(const MCSection &TextSec) const;
 
+  MCSection *getKCFITrapSection(const MCSection &TextSec) const;
+
   MCSection *getPseudoProbeSection(const MCSection *TextSec) const;
 
   MCSection *getPseudoProbeDescSection(StringRef FuncName) const;
Index: llvm/include/llvm/IR/LLVMContext.h
===================================================================
--- llvm/include/llvm/IR/LLVMContext.h
+++ llvm/include/llvm/IR/LLVMContext.h
@@ -95,6 +95,7 @@
     OB_gc_live = 5,                // "gc-live"
     OB_clang_arc_attachedcall = 6, // "clang.arc.attachedcall"
     OB_ptrauth = 7,                // "ptrauth"
+    OB_kcfi = 8,                   // "kcfi"
   };
 
   /// getMDKindID - Return a unique non-zero ID for the specified metadata kind.
Index: llvm/include/llvm/IR/InstrTypes.h
===================================================================
--- llvm/include/llvm/IR/InstrTypes.h
+++ llvm/include/llvm/IR/InstrTypes.h
@@ -2079,7 +2079,8 @@
     for (auto &BOI : bundle_op_infos()) {
       if (BOI.Tag->second == LLVMContext::OB_deopt ||
           BOI.Tag->second == LLVMContext::OB_funclet ||
-          BOI.Tag->second == LLVMContext::OB_ptrauth)
+          BOI.Tag->second == LLVMContext::OB_ptrauth ||
+          BOI.Tag->second == LLVMContext::OB_kcfi)
         continue;
 
       // This instruction has an operand bundle that is not known to us.
Index: llvm/include/llvm/CodeGen/TargetLowering.h
===================================================================
--- llvm/include/llvm/CodeGen/TargetLowering.h
+++ llvm/include/llvm/CodeGen/TargetLowering.h
@@ -3895,6 +3895,9 @@
     return false;
   }
 
+  /// Return true if the target supports kcfi operand bundles.
+  virtual bool supportKCFIBundles() const { return false; }
+
   /// Perform necessary initialization to handle a subset of CSRs explicitly
   /// via copies. This function is called at the beginning of instruction
   /// selection.
@@ -4014,6 +4017,7 @@
     SmallVector<SDValue, 32> OutVals;
     SmallVector<ISD::InputArg, 32> Ins;
     SmallVector<SDValue, 4> InVals;
+    const ConstantInt *KCFIType = nullptr;
 
     CallLoweringInfo(SelectionDAG &DAG)
         : RetSExt(false), RetZExt(false), IsVarArg(false), IsInReg(false),
@@ -4136,6 +4140,11 @@
       return *this;
     }
 
+    CallLoweringInfo &setKCFIType(const ConstantInt *Type) {
+      KCFIType = Type;
+      return *this;
+    }
+
     ArgListTy &getArgs() {
       return Args;
     }
Index: llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
===================================================================
--- llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
+++ llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
@@ -144,6 +144,9 @@
 
     /// The stack index for sret demotion.
     int DemoteStackIndex;
+
+    /// Expected type identifier for indirect calls with a KCFI check.
+    const ConstantInt *KCFIType = nullptr;
   };
 
   /// Argument handling is mostly uniform between the four places that
Index: llvm/include/llvm/CodeGen/AsmPrinter.h
===================================================================
--- llvm/include/llvm/CodeGen/AsmPrinter.h
+++ llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -400,6 +400,9 @@
 
   void emitBBAddrMapSection(const MachineFunction &MF);
 
+  void emitKCFITrapEntry(const MachineFunction &MF, const MCSymbol *Symbol);
+  virtual void emitKCFITypeId(const MachineFunction &MF);
+
   void emitPseudoProbe(const MachineInstr &MI);
 
   void emitRemarksSection(remarks::RemarkStreamer &RS);
Index: llvm/docs/LangRef.rst
===================================================================
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -2612,6 +2612,23 @@
 ``"ptrauth"`` operand bundle tag.  They are described in the
 `Pointer Authentication <PointerAuth.html#operand-bundle>`__ document.
 
+.. _ob_kcfi:
+
+KCFI Operand Bundles
+^^^^^^^^^^^^^^^^^^^^
+
+A ``"kcfi"`` operand bundle on an indirect call indicates that the call is
+preceded by a runtime type check, which validates that the call target is
+prefixed with a type identifier that matches the operand bundle attribute. For
+example:
+
+.. code-block:: llvm
+
+      call void %0() ["kcfi"(i32 1234)]
+
+Clang emits KCFI operand bundles and the necessary function prefix data with
+``-fsanitize=kcfi``.
+
 .. _moduleasm:
 
 Module-Level Inline Assembly
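
For producers other than Clang, the textual IR example above corresponds
roughly to the following use of the C++ IRBuilder API. A minimal sketch, not
part of the patch; the function names and the 1234 type hash are illustrative:

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

int main() {
  LLVMContext Ctx;
  Module M("kcfi-example", Ctx);
  IRBuilder<> B(Ctx);

  // void caller(void (*target)(void)) { target(); } with a "kcfi" bundle
  // attached to the indirect call.
  FunctionType *TargetTy = FunctionType::get(B.getVoidTy(), /*isVarArg=*/false);
  FunctionType *CallerTy = FunctionType::get(
      B.getVoidTy(), {TargetTy->getPointerTo()}, /*isVarArg=*/false);
  Function *Caller =
      Function::Create(CallerTy, Function::ExternalLinkage, "caller", M);
  B.SetInsertPoint(BasicBlock::Create(Ctx, "entry", Caller));

  std::vector<Value *> BundleArgs = {B.getInt32(1234)};
  OperandBundleDef KCFIBundle("kcfi", BundleArgs);
  B.CreateCall(TargetTy, Caller->getArg(0), /*Args=*/{}, {KCFIBundle});
  B.CreateRetVoid();

  M.print(outs(), nullptr);
  return 0;
}
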
Index: lld/ELF/Symbols.cpp
===================================================================
--- lld/ELF/Symbols.cpp
+++ lld/ELF/Symbols.cpp
@@ -547,6 +547,13 @@
       return false;
   }
 
+  // The weak __kcfi_typeid_ symbols contain expected KCFI type identifiers
+  // for external symbols. A mismatch in these values means object files have
+  // incompatible declarations for the same function.
+  if (isWeak() && getName().startswith("__kcfi_typeid_") &&
+      cast<Defined>(this)->value != other.value)
+    warn("kcfi: declaration type mismatch: different values for " + getName());
+
   // Incoming STB_GLOBAL overrides STB_WEAK/STB_GNU_UNIQUE. -fgnu-unique changes
   // some vague linkage data in COMDAT from STB_WEAK to STB_GNU_UNIQUE. Treat
   // STB_GNU_UNIQUE like STB_WEAK so that we prefer the first among all
Index: clang/test/Driver/fsanitize.c
===================================================================
--- clang/test/Driver/fsanitize.c
+++ clang/test/Driver/fsanitize.c
@@ -649,6 +649,27 @@
 // RUN: %clang -target x86_64-linux-gnu -fsanitize=cfi -fsanitize-stats -flto -c %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-CFI-STATS
 // CHECK-CFI-STATS: -fsanitize-stats
 
+// RUN: %clang -target x86_64-linux-gnu -fsanitize=kcfi -fsanitize=cfi -flto -fvisibility=hidden %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-KCFI-NOCFI
+// CHECK-KCFI-NOCFI: error: invalid argument '-fsanitize=kcfi' not allowed with '-fsanitize=cfi'
+
+// RUN: %clang -target x86_64-linux-gnu -fsanitize=kcfi -fpatchable-function-entry=1 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-KCFI-PATCHABLE-NOM
+// CHECK-KCFI-PATCHABLE-NOM: "-fsanitize=kcfi"
+
+// RUN: %clang -target x86_64-linux-gnu -fsanitize=kcfi -fpatchable-function-entry=1,0 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-KCFI-PATCHABLE-M0
+// CHECK-KCFI-PATCHABLE-M0: "-fsanitize=kcfi"
+
+// RUN: %clang -target x86_64-linux-gnu -fsanitize=kcfi -fpatchable-function-entry=1,1 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-KCFI-PATCHABLE-M1
+// CHECK-KCFI-PATCHABLE-M1: error: invalid argument '-fsanitize=kcfi' not allowed with '-fpatchable-function-entry=1,1'
+
+// RUN: %clang -target x86_64-linux-gnu -fsanitize=kcfi -fsanitize-trap=kcfi %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-KCFI-NOTRAP
+// CHECK-KCFI-NOTRAP: error: unsupported argument 'kcfi' to option '-fsanitize-trap='
+
+// RUN: %clang -target x86_64-linux-gnu -fsanitize=kcfi %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-KCFI
+// CHECK-KCFI: "-fsanitize=kcfi"
+
+// RUN: %clang -target x86_64-linux-gnu -fsanitize=kcfi -fno-sanitize-recover=kcfi %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-KCFI-RECOVER
+// CHECK-KCFI-RECOVER: error: unsupported argument 'kcfi' to option '-fno-sanitize-recover='
+
 // RUN: %clang_cl -fsanitize=address -c -MDd -### -- %s 2>&1 | FileCheck %s -check-prefix=CHECK-ASAN-DEBUGRTL
 // RUN: %clang_cl -fsanitize=address -c -MTd -### -- %s 2>&1 | FileCheck %s -check-prefix=CHECK-ASAN-DEBUGRTL
 // RUN: %clang_cl -fsanitize=address -c -LDd -### -- %s 2>&1 | FileCheck %s -check-prefix=CHECK-ASAN-DEBUGRTL
Index: clang/test/CodeGen/kcfi.c
===================================================================
--- /dev/null
+++ clang/test/CodeGen/kcfi.c
@@ -0,0 +1,78 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fsanitize=kcfi -o - %s | FileCheck --check-prefixes=CHECK,O0 %s
+// RUN: %clang_cc1 -O2 -triple x86_64-unknown-linux-gnu -emit-llvm -fsanitize=kcfi -o - %s | FileCheck --check-prefixes=CHECK,O2 %s
+#if !__has_feature(kcfi)
+#error Missing kcfi?
+#endif
+
+// COM: Must emit __kcfi_typeid symbols for address-taken function declarations
+// CHECK: module asm ".weak __kcfi_typeid_f4"
+// CHECK: module asm ".set __kcfi_typeid_f4, [[#%d,HASH:]]"
+
+typedef int (*fn_t)(void);
+
+// CHECK: define dso_local i32 @f1(){{.*}} #[[#TARGET:]] prefix i32 [[#HASH]]
+int f1(void) { return 0; }
+
+// CHECK: define dso_local i32 @f2(){{.*}} #[[#TARGET]] prefix i32 [[#%d,HASH2:]]
+unsigned int f2(void) { return 2; }
+
+// CHECK-LABEL: define dso_local i32 @__call(ptr{{.*}} %f)
+int __call(fn_t f) __attribute__((__no_sanitize__("kcfi"))) {
+  // CHECK-NOT: call i32 %{{.}}(){{.*}} [ "kcfi"
+  return f();
+}
+
+// CHECK: define dso_local i32 @call(ptr{{.*}} %f){{.*}}
+int call(fn_t f) {
+  // CHECK: call i32 %{{.}}(){{.*}} [ "kcfi"(i32 [[#HASH]]) ]
+  return f();
+}
+
+// O0-DAG: define internal i32 @f3() #[[#TARGET]] prefix i32 [[#HASH]]
+static int f3(void) { return 1; }
+
+// CHECK-DAG: declare i32 @f4(){{.*}} #[[#F4ATTR:]] prefix i32 [[#HASH]]
+extern int f4(void);
+
+// COM: Must not emit prefix data for non-address-taken local functions
+// O0: define internal i32 @f5() #[[#LOCAL:]]
+// O0-NOT: prefix i32
+// O0-SAME: {
+static int f5(void) { return 2; }
+
+int test(void) {
+  return call(f1) +
+         __call((fn_t)f2) +
+         call(f3) +
+         call(f4) +
+         f5();
+}
+
+// CHECK: define dso_local i32 @test2(ptr{{.*}} [[PTR:%.]])
+int test2(fn_t p) {
+  // O0: call i32 %{{.}}() [ "kcfi"(i32 [[#HASH]]) ]
+  // O2: tail call i32 [[PTR]](){{.*}} [ "kcfi"(i32 [[#HASH]]) ]
+  int n = p();
+
+  // COM: Must drop the kcfi operand bundle from indirect calls that were
+  // COM: converted to direct calls.
+  // O0: call i32 %{{.}}() [ "kcfi"(i32 [[#HASH]]) ]
+  // O2: tail call i32 @f4()
+  // O2-NOT: "kcfi"
+  p = &f4;
+  n += p();
+
+  // O0: call i32 %{{.}}() [ "kcfi"(i32 [[#HASH]]) ]
+  // O2-NOT: call i32 %{{.}}() [ "kcfi"
+  p = (fn_t)&f2;
+  return n + p();
+}
+
+// CHECK-DAG: attributes #[[#TARGET]] = {{{.*}}"kcfi-target"
+// CHECK-DAG: attributes #[[#F4ATTR]] = {{{.*}}"kcfi-target"
+
+// O0-DAG: attributes #[[#LOCAL]] = {
+// O0-NOT: {{.*}}"kcfi-target"
+// O0-SAME: }
+
+// CHECK-DAG: ![[#]] = !{i32 4, !"kcfi", i32 1}
Index: clang/lib/Driver/ToolChain.cpp
===================================================================
--- clang/lib/Driver/ToolChain.cpp
+++ clang/lib/Driver/ToolChain.cpp
@@ -1081,6 +1081,9 @@
       getTriple().getArch() == llvm::Triple::arm || getTriple().isWasm() ||
       getTriple().isAArch64() || getTriple().isRISCV())
     Res |= SanitizerKind::CFIICall;
+  if (getTriple().getArch() == llvm::Triple::x86_64 ||
+      getTriple().isAArch64(64))
+    Res |= SanitizerKind::KCFI;
   if (getTriple().getArch() == llvm::Triple::x86_64 ||
       getTriple().isAArch64(64) || getTriple().isRISCV())
     Res |= SanitizerKind::ShadowCallStack;
Index: clang/lib/Driver/SanitizerArgs.cpp
===================================================================
--- clang/lib/Driver/SanitizerArgs.cpp
+++ clang/lib/Driver/SanitizerArgs.cpp
@@ -37,7 +37,8 @@
 static const SanitizerMask NotAllowedWithMinimalRuntime =
     SanitizerKind::Function | SanitizerKind::Vptr;
 static const SanitizerMask RequiresPIE =
-    SanitizerKind::DataFlow | SanitizerKind::HWAddress | SanitizerKind::Scudo;
+    SanitizerKind::DataFlow | SanitizerKind::HWAddress | SanitizerKind::Scudo |
+    SanitizerKind::KCFI;
 static const SanitizerMask NeedsUnwindTables =
     SanitizerKind::Address | SanitizerKind::HWAddress | SanitizerKind::Thread |
     SanitizerKind::Memory | SanitizerKind::DataFlow;
@@ -58,8 +59,9 @@
     SanitizerKind::FloatDivideByZero | SanitizerKind::ObjCCast;
 static const SanitizerMask Unrecoverable =
     SanitizerKind::Unreachable | SanitizerKind::Return;
-static const SanitizerMask AlwaysRecoverable =
-    SanitizerKind::KernelAddress | SanitizerKind::KernelHWAddress;
+static const SanitizerMask AlwaysRecoverable = SanitizerKind::KernelAddress |
+                                               SanitizerKind::KernelHWAddress |
+                                               SanitizerKind::KCFI;
 static const SanitizerMask NeedsLTO = SanitizerKind::CFI;
 static const SanitizerMask TrappingSupported =
     (SanitizerKind::Undefined & ~SanitizerKind::Vptr) | SanitizerKind::Integer |
@@ -692,6 +694,25 @@
                      options::OPT_fno_sanitize_cfi_canonical_jump_tables, true);
   }
 
+  if (AllAddedKinds & SanitizerKind::KCFI && DiagnoseErrors) {
+    if (AllAddedKinds & SanitizerKind::CFI)
+      D.Diag(diag::err_drv_argument_not_allowed_with)
+          << "-fsanitize=kcfi"
+          << lastArgumentForMask(D, Args, SanitizerKind::CFI);
+
+    if (Arg *A = Args.getLastArg(options::OPT_fpatchable_function_entry_EQ)) {
+      StringRef S = A->getValue();
+      unsigned N, M;
+      // With -fpatchable-function-entry=N,M, where M > 0,
+      // llvm::AsmPrinter::emitFunctionHeader injects nops before the KCFI
+      // type identifier, which is currently unsupported.
+      if (!S.consumeInteger(10, N) && S.consume_front(",") &&
+          !S.consumeInteger(10, M) && M > 0)
+        D.Diag(diag::err_drv_argument_not_allowed_with)
+            << "-fsanitize=kcfi" << A->getAsString(Args);
+    }
+  }
+
   Stats = Args.hasFlag(options::OPT_fsanitize_stats,
                        options::OPT_fno_sanitize_stats, false);
 
Index: clang/lib/CodeGen/CodeGenModule.h
===================================================================
--- clang/lib/CodeGen/CodeGenModule.h
+++ clang/lib/CodeGen/CodeGenModule.h
@@ -1396,6 +1396,9 @@
   /// Generate a cross-DSO type identifier for MD.
   llvm::ConstantInt *CreateCrossDsoCfiTypeId(llvm::Metadata *MD);
 
+  /// Generate a KCFI type identifier for T.
+  llvm::ConstantInt *CreateKCFITypeId(QualType T);
+
   /// Create a metadata identifier for the given type. This may either be an
   /// MDString (for external identifiers) or a distinct unnamed MDNode (for
   /// internal identifiers).
@@ -1414,6 +1417,12 @@
   void CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
                                           llvm::Function *F);
 
+  /// Set the KCFI type hash as prefix data for the given function
+  void SetKCFITypePrefix(const FunctionDecl *FD, llvm::Function *F);
+
+  /// Emit KCFI type identifier constants and remove unused identifiers
+  void FinalizeKCFITypePrefixes();
+
   /// Whether this function's return type has no side effects, and thus may
   /// be trivially discarded if it is unused.
   bool MayDropFunctionReturn(const ASTContext &Context, QualType ReturnType);
@@ -1658,7 +1667,8 @@
                                     llvm::AttrBuilder &FuncAttrs);
 
   llvm::Metadata *CreateMetadataIdentifierImpl(QualType T, MetadataTypeMap &Map,
-                                               StringRef Suffix);
+                                               StringRef Suffix,
+                                               bool OnlyExternal = true);
 };
 
 }  // end namespace CodeGen
Index: clang/lib/CodeGen/CodeGenModule.cpp
===================================================================
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -47,6 +47,7 @@
 #include "clang/CodeGen/BackendUtil.h"
 #include "clang/CodeGen/ConstantInitBuilder.h"
 #include "clang/Frontend/FrontendDiagnostic.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/ADT/Triple.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
@@ -65,6 +66,7 @@
 #include "llvm/Support/MD5.h"
 #include "llvm/Support/TimeProfiler.h"
 #include "llvm/Support/X86TargetParser.h"
+#include "llvm/Support/xxhash.h"
 
 using namespace clang;
 using namespace CodeGen;
@@ -554,6 +556,8 @@
     CodeGenFunction(*this).EmitCfiCheckFail();
     CodeGenFunction(*this).EmitCfiCheckStub();
   }
+  if (LangOpts.Sanitize.has(SanitizerKind::KCFI))
+    FinalizeKCFITypePrefixes();
   emitAtAvailableLinkGuard();
   if (Context.getTargetInfo().getTriple().isWasm() &&
       !Context.getTargetInfo().getTriple().isOSEmscripten()) {
@@ -738,6 +742,9 @@
                               CodeGenOpts.SanitizeCfiCanonicalJumpTables);
   }
 
+  if (LangOpts.Sanitize.has(SanitizerKind::KCFI))
+    getModule().addModuleFlag(llvm::Module::Override, "kcfi", 1);
+
   if (CodeGenOpts.CFProtectionReturn &&
       Target.checkCFProtectionReturnSupported(getDiags())) {
     // Indicate that we want to instrument return control flow protection.
@@ -1642,6 +1649,15 @@
   return llvm::ConstantInt::get(Int64Ty, llvm::MD5Hash(MDS->getString()));
 }
 
+llvm::ConstantInt *CodeGenModule::CreateKCFITypeId(QualType T) {
+  if (auto *MDS = dyn_cast<llvm::MDString>(CreateMetadataIdentifierImpl(
+          T, MetadataIdMap, "", /*OnlyExternal=*/false)))
+    return llvm::ConstantInt::get(
+        Int32Ty, static_cast<uint32_t>(llvm::xxHash64(MDS->getString())));
+
+  return nullptr;
+}
+
 void CodeGenModule::SetLLVMFunctionAttributes(GlobalDecl GD,
                                               const CGFunctionInfo &Info,
                                               llvm::Function *F, bool IsThunk) {
@@ -2235,6 +2251,60 @@
       F->addTypeMetadata(0, llvm::ConstantAsMetadata::get(CrossDsoTypeId));
 }
 
+void CodeGenModule::SetKCFITypePrefix(const FunctionDecl *FD,
+                                      llvm::Function *F) {
+
+  if (isa<CXXMethodDecl>(FD) && !cast<CXXMethodDecl>(FD)->isStatic())
+    return;
+
+  F->setPrefixData(CreateKCFITypeId(FD->getType()));
+  F->addFnAttr("kcfi-target");
+}
+
+static bool allowKCFIIdentifier(StringRef Name) {
+  // KCFI type identifier constants are only necessary for external assembly
+  // functions, which means it's safe to skip unusual names. Subset of
+  // MCAsmInfo::isAcceptableChar() and MCAsmInfoXCOFF::isAcceptableChar().
+  for (const char &C : Name) {
+    if (llvm::isAlnum(C) || C == '_' || C == '.')
+      continue;
+    return false;
+  }
+  return true;
+}
+
+void CodeGenModule::FinalizeKCFITypePrefixes() {
+  llvm::Module &M = getModule();
+  for (auto &F : M.functions()) {
+    bool AddressTaken = F.hasAddressTaken();
+
+    // Remove KCFI prefix data and attribute from non-address-taken local
+    // functions.
+    if (!AddressTaken && F.hasLocalLinkage()) {
+      F.setPrefixData(nullptr);
+      F.removeFnAttr("kcfi-target");
+    }
+
+    if (!AddressTaken || !F.isDeclaration() || !F.hasPrefixData())
+      continue;
+
+    // Generate a weak constant with the expected KCFI type identifier for all
+    // address-taken function declarations.
+    auto *Id = dyn_cast<llvm::ConstantInt>(F.getPrefixData());
+    if (!Id)
+      continue;
+
+    StringRef Name = F.getName();
+    if (!allowKCFIIdentifier(Name))
+      continue;
+
+    std::string Asm = (".weak __kcfi_typeid_" + Name + "\n.set __kcfi_typeid_" +
+                       Name + ", " + Twine(Id->getSExtValue()) + "\n")
+                          .str();
+    M.appendModuleInlineAsm(Asm);
+  }
+}
+
 void CodeGenModule::SetFunctionAttributes(GlobalDecl GD, llvm::Function *F,
                                           bool IsIncompleteFunction,
                                           bool IsThunk) {
@@ -2317,6 +2387,9 @@
       !CodeGenOpts.SanitizeCfiCanonicalJumpTables)
     CreateFunctionTypeMetadataForIcall(FD, F);
 
+  if (LangOpts.Sanitize.has(SanitizerKind::KCFI))
+    SetKCFITypePrefix(FD, F);
+
   if (getLangOpts().OpenMP && FD->hasAttr<OMPDeclareSimdDeclAttr>())
     getOpenMPRuntime().emitDeclareSimdFunction(FD, F);
 
@@ -6611,7 +6684,8 @@
 
 llvm::Metadata *
 CodeGenModule::CreateMetadataIdentifierImpl(QualType T, MetadataTypeMap &Map,
-                                            StringRef Suffix) {
+                                            StringRef Suffix,
+                                            bool OnlyExternal /*=true*/) {
   if (auto *FnType = T->getAs<FunctionProtoType>())
     T = getContext().getFunctionType(
         FnType->getReturnType(), FnType->getParamTypes(),
@@ -6621,7 +6695,7 @@
   if (InternalId)
     return InternalId;
 
-  if (isExternallyVisible(T->getLinkage())) {
+  if (isExternallyVisible(T->getLinkage()) || !OnlyExternal) {
     std::string OutName;
     llvm::raw_string_ostream Out(OutName);
     getCXXABI().getMangleContext().mangleTypeName(T, Out);
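
CreateKCFITypeId above derives the identifier by hashing the mangled function
type name with xxHash64 and keeping the low 32 bits. For cross-checking the
values that FinalizeKCFITypePrefixes emits into the __kcfi_typeid_* module asm,
the same computation can be reproduced standalone. A sketch under the
assumption that the input string matches what mangleTypeName produces; the
"_ZTSFivE" example is illustrative only:

#include "llvm/ADT/StringRef.h"
#include "llvm/Support/xxhash.h"
#include <cstdint>
#include <cstdio>

// 32-bit KCFI type identifier for a mangled function type name: the low
// 32 bits of xxHash64 over the string, mirroring CreateKCFITypeId.
static uint32_t kcfiTypeId(llvm::StringRef MangledTypeName) {
  return static_cast<uint32_t>(llvm::xxHash64(MangledTypeName));
}

int main() {
  // Clang hashes the string produced by mangleTypeName() for the function
  // type; "_ZTSFivE" (int (void)) is shown here only as an example input.
  printf("0x%08x\n", kcfiTypeId("_ZTSFivE"));
  return 0;
}
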
Index: clang/lib/CodeGen/CodeGenFunction.h
===================================================================
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -4604,6 +4604,9 @@
   /// passing to a runtime sanitizer handler.
   llvm::Constant *EmitCheckSourceLocation(SourceLocation Loc);
 
+  void EmitKCFIOperandBundle(const CGCallee &Callee,
+                             SmallVectorImpl<llvm::OperandBundleDef> &Bundles);
+
   /// Create a basic block that will either trap or call a handler function in
   /// the UBSan runtime with the provided arguments, and create a conditional
   /// branch to it.
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===================================================================
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -2600,6 +2600,14 @@
   CGM.getSanStats().create(IRB, SSK);
 }
 
+void CodeGenFunction::EmitKCFIOperandBundle(
+    const CGCallee &Callee, SmallVectorImpl<llvm::OperandBundleDef> &Bundles) {
+  const FunctionProtoType *FP =
+      Callee.getAbstractInfo().getCalleeFunctionProtoType();
+  if (FP)
+    Bundles.emplace_back("kcfi", CGM.CreateKCFITypeId(FP->desugar()));
+}
+
 llvm::Value *
 CodeGenFunction::FormResolverCondition(const MultiVersionResolverOption &RO) {
   llvm::Value *Condition = nullptr;
Index: clang/lib/CodeGen/CGCall.cpp
===================================================================
--- clang/lib/CodeGen/CGCall.cpp
+++ clang/lib/CodeGen/CGCall.cpp
@@ -5317,6 +5317,10 @@
   SmallVector<llvm::OperandBundleDef, 1> BundleList =
       getBundlesForFunclet(CalleePtr);
 
+  if (SanOpts.has(SanitizerKind::KCFI) &&
+      !isa_and_nonnull<FunctionDecl>(TargetDecl))
+    EmitKCFIOperandBundle(ConcreteCallee, BundleList);
+
   if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(CurFuncDecl))
     if (FD->hasAttr<StrictFPAttr>())
       // All calls within a strictfp function are marked strictfp
Index: clang/include/clang/Basic/Sanitizers.def
===================================================================
--- clang/include/clang/Basic/Sanitizers.def
+++ clang/include/clang/Basic/Sanitizers.def
@@ -126,6 +126,9 @@
                 CFIDerivedCast | CFIICall | CFIMFCall | CFIUnrelatedCast |
                     CFINVCall | CFIVCall)
 
+// Kernel Control Flow Integrity
+SANITIZER("kcfi", KCFI)
+
 // Safe Stack
 SANITIZER("safe-stack", SafeStack)
 
Index: clang/include/clang/Basic/Features.def
===================================================================
--- clang/include/clang/Basic/Features.def
+++ clang/include/clang/Basic/Features.def
@@ -228,6 +228,7 @@
 FEATURE(is_trivially_copyable, LangOpts.CPlusPlus)
 FEATURE(is_union, LangOpts.CPlusPlus)
 FEATURE(modules, LangOpts.Modules)
+FEATURE(kcfi, LangOpts.Sanitize.has(SanitizerKind::KCFI))
 FEATURE(safe_stack, LangOpts.Sanitize.has(SanitizerKind::SafeStack))
 FEATURE(shadow_call_stack,
         LangOpts.Sanitize.has(SanitizerKind::ShadowCallStack))
Index: clang/docs/UsersManual.rst
===================================================================
--- clang/docs/UsersManual.rst
+++ clang/docs/UsersManual.rst
@@ -1692,6 +1692,8 @@
       flow analysis.
    -  ``-fsanitize=cfi``: :doc:`control flow integrity <ControlFlowIntegrity>`
       checks. Requires ``-flto``.
+   -  ``-fsanitize=kcfi``: kernel indirect call forward-edge control flow
+      integrity.
    -  ``-fsanitize=safe-stack``: :doc:`safe stack <SafeStack>`
       protection against stack-based memory corruption errors.
 
Index: clang/docs/ControlFlowIntegrity.rst
===================================================================
--- clang/docs/ControlFlowIntegrity.rst
+++ clang/docs/ControlFlowIntegrity.rst
@@ -306,6 +306,19 @@
 library boundaries are no different from calls within a single program or
 shared library.
 
+.. _kcfi:
+
+``-fsanitize=kcfi``
+-------------------
+
+KCFI, enabled by ``-fsanitize=kcfi``, is an alternative indirect call
+control-flow integrity scheme designed for low-level system software, such
+as operating system kernels. Unlike ``-fsanitize=cfi-icall``, it doesn't
+require ``-flto``, won't result in function pointers being replaced with jump
+table references, and never breaks cross-DSO function address equality. These
+properties make KCFI easier to adopt in low-level software. KCFI is limited to
+indirect call checking only, and isn't compatible with executable-only memory.
+
 Member Function Pointer Call Checking
 =====================================
 