https://github.com/sivadeilra created 
https://github.com/llvm/llvm-project/pull/144745

This fixes a long-standing bug in LLVM's support for generating Windows EH 
tables, specifically the IP2State tables.

On most Windows platforms (AMD64, ARM64, ARM32, IA64, but *not* x86-32), 
exception handling works by looking up instruction pointers in lookup tables. 
These lookup tables are stored in `.xdata` sections in executables. One element 
of the lookup tables are the `IP2State` tables (Instruction Pointer to State).

If a function has any instructions that require cleanup during exception 
unwinding, then it will have an IP2State table. Each entry in the IP2State 
table describes a range of bytes in the function's instruction stream, and 
associates an "EH state number" with that range of instructions. A value of -1 
means "the null state", which does not require any code to execute. A value 
other than -1 is an index into the State table.

The entries in the IP2State table contain byte offsets within the instruction 
stream of the function. The Windows ABI requires that these offsets are aligned 
to instruction boundaries; they are not permitted to point to a byte that is 
not the first byte of an instruction.

Unfortunately, CALL instructions present a problem during unwinding. CALL 
instructions push the address of the instruction after the CALL instruction, so 
that execution can resume after the CALL. If the CALL is the last instruction 
within an IP2State region, then the return address (on the stack) points to the 
*next* IP2State region. This means that the unwinder will use the wrong cleanup 
funclet during unwinding.

To fix this problem, MSVC will insert a NOP after a CALL instruction, if the 
CALL instruction is the last instruction within an IP2State region. The NOP is 
placed within the same IP2State region as the CALL, so that the return address 
points to the NOP and the unwinder will locate the correct region.

Previously, LLVM fixed this by adding 1 to the instruction offsets in the 
IP2State table. This caused the instruction boundary to point *within* the 
instruction after a CALL. This works for the purposes of unwinding, since there 
are no AMD64 instructions that can be encoded in a single byte and which throw 
C++ exceptions. Unfortunately, this violates the Windows ABI specification, 
which requires that the IP2State table entries point to the boundaries between 
exceptions.

To fix this properly, LLVM will now insert a 1-byte NOP after CALL 
instructions, in the same situations that MSVC does. In performance tests, the 
NOP has no detectable significance. The NOP is rarely inserted, since it is 
only inserted when the CALL is the last instruction before an IP2State 
transition or the CALL is the last instruction before the function epilogue.

NOP padding is only necessary on Windows AMD64 targets. On ARM64 and ARM32, 
instructions have a fixed size so the unwinder knows how to "back up" by one 
instruction.

Interaction with Import Call Optimization (ICO):

Import Call Optimization (ICO) is a compiler + OS feature on Windows which 
improves the performance and security of DLL imports. ICO relies on using a 
specific CALL idiom that can be replaced by the OS DLL loader. This removes a 
load and indirect CALL and replaces it with a single direct CALL.

To achieve this, ICO also inserts NOPs after the CALL instruction. If the end 
of the CALL is aligned with an EH state transition, we *also* insert a 
single-byte NOP.  **Both forms of NOPs must be preserved.**  They cannot be 
combined into a single larger NOP; nor can the second NOP be removed.

This is necessary because, if ICO is active and the call site is modified by 
the loader, the loader will end up overwriting the NOPs that were inserted for 
ICO. That means that those NOPs cannot be used for the correct termination of 
the exception handling region (the IP2State transition), so we still need an 
additional NOP instruction.  The NOPs cannot be combined into a longer NOP 
(which is ordinarily desirable) because then ICO would split one instruction, 
producing a malformed instruction after the ICO call.

>From 50c418d4259ac5039d9ab7532afd518a0cfbbc64 Mon Sep 17 00:00:00 2001
From: Arlie Davis <arlie.da...@microsoft.com>
Date: Mon, 16 Jun 2025 11:09:23 -0700
Subject: [PATCH] Fix IP2State tables

---
 .../CodeGenCXX/microsoft-abi-eh-async.cpp     | 180 +++++++++++++
 .../CodeGenCXX/microsoft-abi-eh-disabled.cpp  | 139 ++++++++++
 .../CodeGenCXX/microsoft-abi-eh-ip2state.cpp  | 241 ++++++++++++++++++
 llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp    |   4 +-
 llvm/lib/CodeGen/AsmPrinter/WinException.cpp  |  16 +-
 llvm/lib/CodeGen/AsmPrinter/WinException.h    |   1 -
 llvm/lib/Target/X86/X86AsmPrinter.h           |   2 +
 llvm/lib/Target/X86/X86MCInstLower.cpp        | 151 +++++++++--
 .../test/CodeGen/WinEH/wineh-noret-cleanup.ll |  16 +-
 9 files changed, 706 insertions(+), 44 deletions(-)
 create mode 100644 clang/test/CodeGenCXX/microsoft-abi-eh-async.cpp
 create mode 100644 clang/test/CodeGenCXX/microsoft-abi-eh-disabled.cpp
 create mode 100644 clang/test/CodeGenCXX/microsoft-abi-eh-ip2state.cpp

diff --git a/clang/test/CodeGenCXX/microsoft-abi-eh-async.cpp 
b/clang/test/CodeGenCXX/microsoft-abi-eh-async.cpp
new file mode 100644
index 0000000000000..00bcdaf55aacc
--- /dev/null
+++ b/clang/test/CodeGenCXX/microsoft-abi-eh-async.cpp
@@ -0,0 +1,180 @@
+// RUN: %clang_cl -c --target=x86_64-windows-msvc /EHa -O2 /GS- \
+// RUN:   -Xclang=-import-call-optimization \
+// RUN:   /clang:-S /clang:-o- %s 2>&1 \
+// RUN:   | FileCheck %s
+
+#ifdef __clang__
+#define NO_TAIL __attribute((disable_tail_calls))
+#else
+#define NO_TAIL
+#endif
+
+void might_throw();
+void other_func(int x);
+
+void does_not_throw() noexcept(true);
+
+extern "C" void __declspec(dllimport) some_dll_import();
+
+class HasDtor {
+    int x;
+    char foo[40];
+
+public:
+    explicit HasDtor(int x);
+    ~HasDtor();
+};
+
+class BadError {
+public:
+    int errorCode;
+};
+
+void normal_has_regions() {
+    // CHECK-LABEL: .def "?normal_has_regions@@YAXXZ"
+    // CHECK: .seh_endprologue
+
+    // <-- state -1 (none)
+    {
+        HasDtor hd{42};
+
+        // <-- state goes from -1 to 0
+        // because state changes, we expect the HasDtor::HasDtor() call to 
have a NOP
+        // CHECK: call "??0HasDtor@@QEAA@H@Z"
+        // CHECK-NEXT: nop
+
+        might_throw();
+        // CHECK: call "?might_throw@@YAXXZ"
+        // CHECK-NEXT: nop
+
+        // <-- state goes from 0 to -1 because we're about to call 
HasDtor::~HasDtor()
+        // CHECK: call "??1HasDtor@@QEAA@XZ"
+        // <-- state -1
+    }
+
+    // <-- state -1
+    other_func(10);
+    // CHECK: call "?other_func@@YAXH@Z"
+    // CHECK-NEXT: nop
+    // CHECK: .seh_startepilogue
+
+    // <-- state -1
+}
+
+// This tests a tail call to a destructor.
+void case_dtor_arg_empty_body(HasDtor x)
+{
+    // CHECK-LABEL: .def "?case_dtor_arg_empty_body@@YAXVHasDtor@@@Z"
+    // CHECK: jmp "??1HasDtor@@QEAA@XZ"
+}
+
+int case_dtor_arg_empty_with_ret(HasDtor x)
+{
+    // CHECK-LABEL: .def "?case_dtor_arg_empty_with_ret@@YAHVHasDtor@@@Z"
+    // CHECK: .seh_endprologue
+
+    // CHECK: call "??1HasDtor@@QEAA@XZ"
+    // CHECK-NOT: nop
+
+    // The call to HasDtor::~HasDtor() should NOT have a NOP because the
+    // following "mov eax, 100" instruction is in the same EH state.
+
+    return 100;
+
+    // CHECK: mov eax, 100
+    // CHECK: .seh_startepilogue
+    // CHECK: .seh_endepilogue
+    // CHECK: .seh_endproc
+}
+
+int case_noexcept_dtor(HasDtor x) noexcept(true)
+{
+    // CHECK: .def "?case_noexcept_dtor@@YAHVHasDtor@@@Z"
+    // CHECK: call "??1HasDtor@@QEAA@XZ"
+    // CHECK-NEXT: mov eax, 100
+    // CHECK: .seh_startepilogue
+    return 100;
+}
+
+void case_except_simple_call() NO_TAIL
+{
+    does_not_throw();
+}
+
+void case_noexcept_simple_call() noexcept(true) NO_TAIL
+{
+    does_not_throw();
+}
+
+// This tests that the destructor is called right before SEH_BeginEpilogue,
+// but in a function that has a return value.
+int case_dtor_arg_calls_no_throw(HasDtor x)
+{
+    does_not_throw(); // no NOP expected
+    return 100;
+}
+
+// Check the behavior of CALLs that are at the end of MBBs. If a CALL is within
+// a non-null EH state (state -1) and is at the end of an MBB, then we expect
+// to find an EH_LABEL after the CALL. This causes us to insert a NOP, which
+// is the desired result.
+void case_dtor_runs_after_join(int x) {
+    // CHECK-LABEL: .def "?case_dtor_runs_after_join@@YAXH@Z"
+    // CHECK: .seh_endprologue
+
+    // <-- EH state -1
+
+    // ctor call does not need a NOP, because it has real instructions after it
+    HasDtor hd{42};
+    // CHECK: call "??0HasDtor@@QEAA@H@Z"
+    // CHECK-NEXT: nop
+    // CHECK: test
+
+    // <-- EH state transition from -1 0
+    if (x) {
+        might_throw(); // <-- NOP expected (at end of BB w/ EH_LABEL)
+        // CHECK: call "?might_throw@@YAXXZ"
+        // CHECK-NEXT: nop
+    } else {
+        other_func(10); // <-- NOP expected (at end of BB w/ EH_LABEL)
+        // CHECK: call "?other_func@@YAXH@Z"
+        // CHECK-NEXT: nop
+    }
+    does_not_throw();
+    // <-- EH state transition 0 to -1
+    // ~HasDtor() runs
+
+    // CHECK: .seh_endproc
+
+    // CHECK: "$ip2state$?case_dtor_runs_after_join@@YAXH@Z":
+    // CHECK-NEXT: .long [[func_begin:.Lfunc_begin([0-9]+)@IMGREL]]
+    // CHECK-NEXT: .long -1
+    // CHECK-NEXT: .long [[tmp1:.Ltmp([0-9]+)]]@IMGREL
+    // CHECK-NEXT: .long 0
+    // CHECK-NEXT: .long [[tmp2:.Ltmp([0-9]+)]]@IMGREL
+    // CHECK-NEXT: .long -1
+}
+
+
+// Check the behavior of NOP padding around tail calls.
+// We do not expect to insert NOPs around tail calls.
+// However, the first call (to other_func()) does get a NOP
+// because it comes before .seh_startepilogue.
+void case_tail_call_no_eh() {
+    // CHECK-LABEL: .def "?case_tail_call_no_eh@@YAXXZ"
+    // CHECK: .seh_endprologue
+
+    // ordinary call
+    other_func(10);
+    // CHECK: call "?other_func@@YAXH@Z"
+    // CHECK-NEXT: nop
+
+    // tail call; no NOP padding after JMP
+    does_not_throw();
+
+    // CHECK: .seh_startepilogue
+    // CHECK: .seh_endepilogue
+    // CHECK: jmp "?does_not_throw@@YAXXZ"
+    // CHECK-NOT: nop
+    // CHECK: .seh_endproc
+}
diff --git a/clang/test/CodeGenCXX/microsoft-abi-eh-disabled.cpp 
b/clang/test/CodeGenCXX/microsoft-abi-eh-disabled.cpp
new file mode 100644
index 0000000000000..16fb381d814f2
--- /dev/null
+++ b/clang/test/CodeGenCXX/microsoft-abi-eh-disabled.cpp
@@ -0,0 +1,139 @@
+// RUN: %clang_cl -c --target=x86_64-windows-msvc /EHs-c- -O2 /GS- \
+// RUN:   -Xclang=-import-call-optimization \
+// RUN:   /clang:-S /clang:-o- %s 2>&1 \
+// RUN:   | FileCheck %s
+
+#ifdef __clang__
+#define NO_TAIL __attribute((disable_tail_calls))
+#else
+#define NO_TAIL
+#endif
+
+void might_throw();
+void other_func(int x);
+
+void does_not_throw() noexcept(true);
+
+extern "C" void __declspec(dllimport) some_dll_import();
+
+class HasDtor {
+    int x;
+    char foo[40];
+
+public:
+    explicit HasDtor(int x);
+    ~HasDtor();
+};
+
+void normal_has_regions() {
+    {
+        HasDtor hd{42};
+
+        // because state changes, we expect the HasDtor::HasDtor() call to 
have a NOP
+        might_throw();
+    }
+
+    other_func(10);
+}
+// CHECK-LABEL: .def "?normal_has_regions@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK: call "??0HasDtor@@QEAA@H@Z"
+// CHECK-NEXT: call "?might_throw@@YAXXZ"
+// CHECK-NEXT: mov
+// CHECK: call "??1HasDtor@@QEAA@XZ"
+// CHECK-NEXT: mov ecx, 10
+// CHECK-NEXT: call "?other_func@@YAXH@Z"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+// CHECK-NOT: "$ip2state$?normal_has_regions@@YAXXZ"
+
+// This tests a tail call to a destructor.
+void case_dtor_arg_empty_body(HasDtor x)
+{
+}
+// CHECK-LABEL: .def "?case_dtor_arg_empty_body@@YAXVHasDtor@@@Z"
+// CHECK: jmp "??1HasDtor@@QEAA@XZ"
+
+int case_dtor_arg_empty_with_ret(HasDtor x)
+{
+    // The call to HasDtor::~HasDtor() should NOT have a NOP because the
+    // following "mov eax, 100" instruction is in the same EH state.
+    return 100;
+}
+// CHECK-LABEL: .def "?case_dtor_arg_empty_with_ret@@YAHVHasDtor@@@Z"
+// CHECK: .seh_endprologue
+// CHECK: call "??1HasDtor@@QEAA@XZ"
+// CHECK-NOT: nop
+// CHECK: mov eax, 100
+// CHECK: .seh_startepilogue
+// CHECK: .seh_endepilogue
+// CHECK: .seh_endproc
+
+void case_except_simple_call() NO_TAIL
+{
+    does_not_throw();
+}
+
+// This tests that the destructor is called right before SEH_BeginEpilogue,
+// but in a function that has a return value.
+int case_dtor_arg_calls_no_throw(HasDtor x)
+{
+    does_not_throw(); // no NOP expected
+    return 100;
+}
+
+// Check the behavior of CALLs that are at the end of MBBs. If a CALL is within
+// a non-null EH state (state -1) and is at the end of an MBB, then we expect
+// to find an EH_LABEL after the CALL. This causes us to insert a NOP, which
+// is the desired result.
+void case_dtor_runs_after_join(int x) {
+
+    // ctor call does not need a NOP, because it has real instructions after it
+    HasDtor hd{42};
+
+    if (x) {
+        might_throw();
+    } else {
+        other_func(10);
+    }
+    does_not_throw();
+    // ~HasDtor() runs
+}
+
+// CHECK-LABEL: .def "?case_dtor_runs_after_join@@YAXH@Z"
+// CHECK: .seh_endprologue
+// CHECK: call "??0HasDtor@@QEAA@H@Z"
+// CHECK-NEXT: test
+// CHECK: call "?might_throw@@YAXXZ"
+// CHECK-NEXT: jmp
+// CHECK: call "?other_func@@YAXH@Z"
+// CHECK-NEXT: .LBB
+// CHECK: call "?does_not_throw@@YAXXZ"
+// CHECK-NEXT: lea
+// CHECK-NEXT: call "??1HasDtor@@QEAA@XZ"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+// CHECK-NOT: "$ip2state$?case_dtor_runs_after_join@@YAXH@Z":
+
+
+// Check the behavior of NOP padding around tail calls.
+// We do not expect to insert NOPs around tail calls.
+// However, the first call (to other_func()) does get a NOP
+// because it comes before .seh_startepilogue.
+void case_tail_call_no_eh() {
+    // ordinary call
+    other_func(10);
+
+    // tail call; no NOP padding after JMP
+    does_not_throw();
+}
+
+// CHECK-LABEL: .def "?case_tail_call_no_eh@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK: call "?other_func@@YAXH@Z"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+// CHECK: .seh_endepilogue
+// CHECK: jmp "?does_not_throw@@YAXXZ"
+// CHECK-NOT: nop
+// CHECK: .seh_endproc
diff --git a/clang/test/CodeGenCXX/microsoft-abi-eh-ip2state.cpp 
b/clang/test/CodeGenCXX/microsoft-abi-eh-ip2state.cpp
new file mode 100644
index 0000000000000..5e3f89c4b4680
--- /dev/null
+++ b/clang/test/CodeGenCXX/microsoft-abi-eh-ip2state.cpp
@@ -0,0 +1,241 @@
+// RUN: %clang_cl -c --target=x86_64-windows-msvc -O2 /EHsc /GS- \
+// RUN:   -Xclang=-import-call-optimization \
+// RUN:   /clang:-S /clang:-o- %s 2>&1 \
+// RUN:   | FileCheck %s
+
+#ifdef __clang__
+#define NO_TAIL __attribute((disable_tail_calls))
+#else
+#define NO_TAIL
+#endif
+
+void might_throw();
+void other_func(int x);
+
+void does_not_throw() noexcept(true);
+
+extern "C" void __declspec(dllimport) some_dll_import();
+
+class HasDtor {
+    int x;
+    char foo[40];
+
+public:
+    explicit HasDtor(int x);
+    ~HasDtor();
+};
+
+class BadError {
+public:
+    int errorCode;
+};
+
+// Verify that when NOP padding for IP2State is active *and* Import Call
+// Optimization is active that we see both forms of NOP padding.
+void case_calls_dll_import() NO_TAIL {
+    some_dll_import();
+}
+// CHECK-LABEL: .def "?case_calls_dll_import@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK: .Limpcall{{[0-9]+}}:
+// CHECK-NEXT: rex64
+// CHECK-NEXT: call __imp_some_dll_import
+// CHECK-NEXT: nop dword ptr {{\[.*\]}}
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+
+void normal_has_regions() {
+
+    // <-- state -1 (none)
+    {
+        HasDtor hd{42};
+
+        // <-- state goes from -1 to 0
+        // because state changes, we expect the HasDtor::HasDtor() call to 
have a NOP
+
+        might_throw();
+
+        // <-- state goes from 0 to -1 because we're about to call 
HasDtor::~HasDtor()
+        // <-- state -1
+    }
+
+    // <-- state -1
+    other_func(10);
+
+    // <-- state -1
+}
+// CHECK-LABEL: .def "?normal_has_regions@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK: call "??0HasDtor@@QEAA@H@Z"
+// CHECK-NEXT: nop
+// CHECK: call "?might_throw@@YAXXZ"
+// CHECK-NEXT: nop
+// CHECK: call "??1HasDtor@@QEAA@XZ"
+// CHECK: call "?other_func@@YAXH@Z"
+// CHECK-NEXT: nop
+// CHECK: .seh_startepilogue
+
+// This tests a tail call to a destructor.
+void case_dtor_arg_empty_body(HasDtor x)
+{
+}
+// CHECK-LABEL: .def "?case_dtor_arg_empty_body@@YAXVHasDtor@@@Z"
+// CHECK: jmp "??1HasDtor@@QEAA@XZ"
+
+int case_dtor_arg_empty_with_ret(HasDtor x)
+{
+    // CHECK-LABEL: .def "?case_dtor_arg_empty_with_ret@@YAHVHasDtor@@@Z"
+    // CHECK: .seh_endprologue
+
+    // CHECK: call "??1HasDtor@@QEAA@XZ"
+    // CHECK-NOT: nop
+
+    // The call to HasDtor::~HasDtor() should NOT have a NOP because the
+    // following "mov eax, 100" instruction is in the same EH state.
+
+    return 100;
+
+    // CHECK: mov eax, 100
+    // CHECK: .seh_startepilogue
+    // CHECK: .seh_endepilogue
+    // CHECK: .seh_endproc
+}
+
+int case_noexcept_dtor(HasDtor x) noexcept(true)
+{
+    // CHECK: .def "?case_noexcept_dtor@@YAHVHasDtor@@@Z"
+    // CHECK: call "??1HasDtor@@QEAA@XZ"
+    // CHECK-NEXT: mov eax, 100
+    // CHECK-NEXT: .seh_startepilogue
+    return 100;
+}
+
+// Simple call of a function that can throw
+void case_except_simple_call() NO_TAIL
+{
+    might_throw();
+}
+// CHECK-LABEL: .def "?case_except_simple_call@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK-NEXT: call "?might_throw@@YAXXZ"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+
+// Simple call of a function that cannot throw, in a noexcept context.
+void case_noexcept_simple_call() noexcept(true) NO_TAIL
+{
+    does_not_throw();
+}
+// CHECK-LABEL: .def "?case_noexcept_simple_call@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK-NEXT: call "?does_not_throw@@YAXXZ"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+
+
+// This tests that the destructor is called right before SEH_BeginEpilogue,
+// but in a function that has a return value.
+int case_dtor_arg_calls_no_throw(HasDtor x)
+{
+    does_not_throw(); // no NOP expected
+    return 100;
+}
+
+// Check the behavior of CALLs that are at the end of MBBs. If a CALL is within
+// a non-null EH state (state -1) and is at the end of an MBB, then we expect
+// to find an EH_LABEL after the CALL. This causes us to insert a NOP, which
+// is the desired result.
+void case_dtor_runs_after_join(int x) {
+    // CHECK-LABEL: .def "?case_dtor_runs_after_join@@YAXH@Z"
+    // CHECK: .seh_endprologue
+
+    // <-- EH state -1
+
+    // ctor call does not need a NOP, because it has real instructions after it
+    HasDtor hd{42};
+    // CHECK: call "??0HasDtor@@QEAA@H@Z"
+    // CHECK-NEXT: test
+
+    // <-- EH state transition from -1 0
+    if (x) {
+        might_throw(); // <-- NOP expected (at end of BB w/ EH_LABEL)
+        // CHECK: call "?might_throw@@YAXXZ"
+        // CHECK-NEXT: nop
+    } else {
+        other_func(10); // <-- NOP expected (at end of BB w/ EH_LABEL)
+        // CHECK: call "?other_func@@YAXH@Z"
+        // CHECK-NEXT: nop
+    }
+    does_not_throw();
+    // <-- EH state transition 0 to -1
+    // ~HasDtor() runs
+
+    // CHECK: .seh_endproc
+
+    // CHECK: "$ip2state$?case_dtor_runs_after_join@@YAXH@Z":
+    // CHECK-NEXT: .long [[func_begin:.Lfunc_begin([0-9]+)@IMGREL]]
+    // CHECK-NEXT: .long -1
+    // CHECK-NEXT: .long [[tmp1:.Ltmp([0-9]+)]]@IMGREL
+    // CHECK-NEXT: .long 0
+    // CHECK-NEXT: .long [[tmp2:.Ltmp([0-9]+)]]@IMGREL
+    // CHECK-NEXT: .long -1
+}
+
+
+// Check the behavior of NOP padding around tail calls.
+// We do not expect to insert NOPs around tail calls.
+// However, the first call (to other_func()) does get a NOP
+// because it comes before .seh_startepilogue.
+void case_tail_call_no_eh() {
+    // CHECK-LABEL: .def "?case_tail_call_no_eh@@YAXXZ"
+    // CHECK: .seh_endprologue
+
+    // ordinary call
+    other_func(10);
+    // CHECK: call "?other_func@@YAXH@Z"
+    // CHECK-NEXT: nop
+
+    // tail call; no NOP padding after JMP
+    does_not_throw();
+
+    // CHECK: .seh_startepilogue
+    // CHECK: .seh_endepilogue
+    // CHECK: jmp "?does_not_throw@@YAXXZ"
+    // CHECK-NOT: nop
+    // CHECK: .seh_endproc
+}
+
+
+// Check the behavior of a try/catch
+int case_try_catch() {
+    // CHECK-LABEL: .def "?case_try_catch@@YAHXZ"
+    // CHECK: .seh_endprologue
+
+    // Because of the EH_LABELs, the ctor and other_func() get NOPs.
+
+    int result = 0;
+    try {
+        // CHECK: call "??0HasDtor@@QEAA@H@Z"
+        // CHECK-NEXT: nop
+        HasDtor hd{20};
+
+        // CHECK: call "?other_func@@YAXH@Z"
+        // CHECK-NEXT: nop
+        other_func(10);
+
+        // CHECK: call "??1HasDtor@@QEAA@XZ"
+        // CHECK: mov
+    } catch (BadError& e) {
+        result = 1;
+    }
+    return result;
+
+    // CHECK: .seh_endproc
+
+    // CHECK: .def "?dtor$4@?0??case_try_catch@@YAHXZ@4HA"
+    // CHECK: .seh_endprologue
+    // CHECK: call "??1HasDtor@@QEAA@XZ"
+    // CHECK-NEXT: nop
+    // CHECK: .seh_startepilogue
+    // CHECK: .seh_endproc
+}
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp 
b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index e13e92378d4aa..6545b8b6404e0 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1858,6 +1858,7 @@ void AsmPrinter::emitFunctionBody() {
         OutStreamer->emitLabel(MI.getOperand(0).getMCSymbol());
         break;
       case TargetOpcode::EH_LABEL:
+        OutStreamer->AddComment("EH_LABEL");
         OutStreamer->emitLabel(MI.getOperand(0).getMCSymbol());
         // For AsynchEH, insert a Nop if followed by a trap inst
         //   Or the exception won't be caught.
@@ -1932,8 +1933,9 @@ void AsmPrinter::emitFunctionBody() {
 
         auto CountInstruction = [&](const MachineInstr &MI) {
           // Skip Meta instructions inside bundles.
-          if (MI.isMetaInstruction())
+          if (MI.isMetaInstruction()) {
             return;
+          }
           ++NumInstsInFunction;
           if (CanDoExtraAnalysis) {
             StringRef Name = getMIMnemonic(MI, *OutStreamer);
diff --git a/llvm/lib/CodeGen/AsmPrinter/WinException.cpp 
b/llvm/lib/CodeGen/AsmPrinter/WinException.cpp
index 55d1350e446ab..258349c241369 100644
--- a/llvm/lib/CodeGen/AsmPrinter/WinException.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/WinException.cpp
@@ -325,12 +325,6 @@ const MCExpr *WinException::getLabel(const MCSymbol 
*Label) {
                                  Asm->OutContext);
 }
 
-const MCExpr *WinException::getLabelPlusOne(const MCSymbol *Label) {
-  return MCBinaryExpr::createAdd(getLabel(Label),
-                                 MCConstantExpr::create(1, Asm->OutContext),
-                                 Asm->OutContext);
-}
-
 const MCExpr *WinException::getOffset(const MCSymbol *OffsetOf,
                                       const MCSymbol *OffsetFrom) {
   return MCBinaryExpr::createSub(
@@ -657,7 +651,7 @@ void WinException::emitSEHActionsForRange(const 
WinEHFuncInfo &FuncInfo,
     AddComment("LabelStart");
     OS.emitValue(getLabel(BeginLabel), 4);
     AddComment("LabelEnd");
-    OS.emitValue(getLabelPlusOne(EndLabel), 4);
+    OS.emitValue(getLabel(EndLabel), 4);
     AddComment(UME.IsFinally ? "FinallyFunclet" : UME.Filter ? "FilterFunction"
                                                              : "CatchAll");
     OS.emitValue(FilterOrFinally, 4);
@@ -952,13 +946,7 @@ void WinException::computeIP2StateTable(
       if (!ChangeLabel)
         ChangeLabel = StateChange.PreviousEndLabel;
       // Emit an entry indicating that PCs after 'Label' have this EH state.
-      // NOTE: On ARM architectures, the StateFromIp automatically takes into
-      // account that the return address is after the call instruction (whose 
EH
-      // state we should be using), but on other platforms we need to +1 to the
-      // label so that we are using the correct EH state.
-      const MCExpr *LabelExpression = (isAArch64 || isThumb)
-                                          ? getLabel(ChangeLabel)
-                                          : getLabelPlusOne(ChangeLabel);
+      const MCExpr *LabelExpression = getLabel(ChangeLabel);
       IPToStateTable.push_back(
           std::make_pair(LabelExpression, StateChange.NewState));
       // FIXME: assert that NewState is between CatchLow and CatchHigh.
diff --git a/llvm/lib/CodeGen/AsmPrinter/WinException.h 
b/llvm/lib/CodeGen/AsmPrinter/WinException.h
index 638589adf0ddc..47dd30cef133d 100644
--- a/llvm/lib/CodeGen/AsmPrinter/WinException.h
+++ b/llvm/lib/CodeGen/AsmPrinter/WinException.h
@@ -80,7 +80,6 @@ class LLVM_LIBRARY_VISIBILITY WinException : public 
EHStreamer {
   const MCExpr *create32bitRef(const MCSymbol *Value);
   const MCExpr *create32bitRef(const GlobalValue *GV);
   const MCExpr *getLabel(const MCSymbol *Label);
-  const MCExpr *getLabelPlusOne(const MCSymbol *Label);
   const MCExpr *getOffset(const MCSymbol *OffsetOf, const MCSymbol 
*OffsetFrom);
   const MCExpr *getOffsetPlusOne(const MCSymbol *OffsetOf,
                                  const MCSymbol *OffsetFrom);
diff --git a/llvm/lib/Target/X86/X86AsmPrinter.h 
b/llvm/lib/Target/X86/X86AsmPrinter.h
index efb951b73532f..6c04f8729f1ff 100644
--- a/llvm/lib/Target/X86/X86AsmPrinter.h
+++ b/llvm/lib/Target/X86/X86AsmPrinter.h
@@ -151,6 +151,8 @@ class LLVM_LIBRARY_VISIBILITY X86AsmPrinter : public 
AsmPrinter {
                                     MCSymbol *LazyPointer) override;
 
   void emitCallInstruction(const llvm::MCInst &MCI);
+  bool needsNopAfterCallForWindowsEH(const MachineInstr *MI);
+  void emitNopAfterCallForWindowsEH(const MachineInstr *MI);
 
   // Emits a label to mark the next instruction as being relevant to Import 
Call
   // Optimization.
diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp 
b/llvm/lib/Target/X86/X86MCInstLower.cpp
index 55d57d15f8d42..26780d44a6493 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -32,6 +32,7 @@
 #include "llvm/CodeGen/MachineModuleInfoImpls.h"
 #include "llvm/CodeGen/MachineOperand.h"
 #include "llvm/CodeGen/StackMaps.h"
+#include "llvm/CodeGen/WinEHFuncInfo.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/GlobalValue.h"
 #include "llvm/IR/Mangler.h"
@@ -2538,26 +2539,6 @@ void X86AsmPrinter::emitInstruction(const MachineInstr 
*MI) {
 
   case X86::SEH_BeginEpilogue: {
     assert(MF->hasWinCFI() && "SEH_ instruction in function without WinCFI?");
-    // Windows unwinder will not invoke function's exception handler if IP is
-    // either in prologue or in epilogue.  This behavior causes a problem when 
a
-    // call immediately precedes an epilogue, because the return address points
-    // into the epilogue.  To cope with that, we insert a 'nop' if it ends up
-    // immediately after a CALL in the final emitted code.
-    MachineBasicBlock::const_iterator MBBI(MI);
-    // Check if preceded by a call and emit nop if so.
-    for (MBBI = PrevCrossBBInst(MBBI);
-         MBBI != MachineBasicBlock::const_iterator();
-         MBBI = PrevCrossBBInst(MBBI)) {
-      // Pseudo instructions that aren't a call are assumed to not emit any
-      // code. If they do, we worst case generate unnecessary noops after a
-      // call.
-      if (MBBI->isCall() || !MBBI->isPseudo()) {
-        if (MBBI->isCall())
-          EmitAndCountInstruction(MCInstBuilder(X86::NOOP));
-        break;
-      }
-    }
-
     EmitSEHInstruction(MI);
     return;
   }
@@ -2586,6 +2567,7 @@ void X86AsmPrinter::emitInstruction(const MachineInstr 
*MI) {
       EmitAndCountInstruction(MCInstBuilder(X86::REX64_PREFIX));
       emitCallInstruction(TmpInst);
       emitNop(*OutStreamer, 5, Subtarget);
+      emitNopAfterCallForWindowsEH(MI);
       return;
     }
 
@@ -2606,6 +2588,7 @@ void X86AsmPrinter::emitInstruction(const MachineInstr 
*MI) {
       // For Import Call Optimization to work, we need a 3-byte nop after the
       // call instruction.
       emitNop(*OutStreamer, 3, Subtarget);
+      emitNopAfterCallForWindowsEH(MI);
       return;
     }
     break;
@@ -2639,6 +2622,7 @@ void X86AsmPrinter::emitInstruction(const MachineInstr 
*MI) {
 
   if (MI->isCall()) {
     emitCallInstruction(TmpInst);
+    emitNopAfterCallForWindowsEH(MI);
     return;
   }
 
@@ -2660,6 +2644,133 @@ void X86AsmPrinter::emitCallInstruction(const 
llvm::MCInst &MCI) {
   OutStreamer->emitInstruction(MCI, getSubtargetInfo());
 }
 
+// Checks whether a NOP is required after a CALL and inserts the NOP, if
+// necessary.
+void X86AsmPrinter::emitNopAfterCallForWindowsEH(const MachineInstr *MI) {
+  if (needsNopAfterCallForWindowsEH(MI))
+    EmitAndCountInstruction(MCInstBuilder(X86::NOOP));
+}
+
+// Determines whether a NOP is required after a CALL, so that Windows EH
+// IP2State tables have the correct information.
+//
+// On most Windows platforms (AMD64, ARM64, ARM32, IA64, but *not* x86-32),
+// exception handling works by looking up instruction pointers in lookup
+// tables. These lookup tables are stored in .xdata sections in executables.
+// One element of the lookup tables are the "IP2State" tables (Instruction
+// Pointer to State).
+//
+// If a function has any instructions that require cleanup during exception
+// unwinding, then it will have an IP2State table. Each entry in the IP2State
+// table describes a range of bytes in the function's instruction stream, and
+// associates an "EH state number" with that range of instructions. A value of
+// -1 means "the null state", which does not require any code to execute.
+// A value other than -1 is an index into the State table.
+//
+// The entries in the IP2State table contain byte offsets within the 
instruction
+// stream of the function. The Windows ABI requires that these offsets are
+// aligned to instruction boundaries; they are not permitted to point to a byte
+// that is not the first byte of an instruction.
+//
+// Unfortunately, CALL instructions present a problem during unwinding. CALL
+// instructions push the address of the instruction after the CALL instruction,
+// so that execution can resume after the CALL. If the CALL is the last
+// instruction within an IP2State region, then the return address (on the 
stack)
+// points to the *next* IP2State region. This means that the unwinder will
+// use the wrong cleanup funclet during unwinding.
+//
+// To fix this problem, MSVC will insert a NOP after a CALL instruction, if the
+// CALL instruction is the last instruction within an IP2State region. The NOP
+// is placed within the same IP2State region as the CALL, so that the return
+// address points to the NOP and the unwinder will locate the correct region.
+//
+// Previously, LLVM fixed this by adding 1 to the instruction offsets in the
+// IP2State table. This caused the instruction boundary to point *within* the
+// instruction after a CALL. This works for the purposes of unwinding, since
+// there are no AMD64 instructions that can be encoded in a single byte and
+// which throw C++ exceptions. Unfortunately, this violates the Windows ABI
+// specification, which requires that the IP2State table entries point to the
+// boundaries between exceptions.
+//
+// To fix this properly, LLVM will now insert a 1-byte NOP after CALL
+// instructions, in the same situations that MSVC does. In performance tests,
+// the NOP has no detectable significance. The NOP is rarely inserted, since
+// it is only inserted when the CALL is the last instruction before an IP2State
+// transition or the CALL is the last instruction before the function epilogue.
+//
+// NOP padding is only necessary on Windows AMD64 targets. On ARM64 and ARM32,
+// instructions have a fixed size so the unwinder knows how to "back up" by
+// one instruction.
+//
+// Interaction with Import Call Optimization (ICO):
+//
+// Import Call Optimization (ICO) is a compiler + OS feature on Windows which
+// improves the performance and security of DLL imports. ICO relies on using a
+// specific CALL idiom that can be replaced by the OS DLL loader. This removes
+// a load and indirect CALL and replaces it with a single direct CALL.
+//
+// To achieve this, ICO also inserts NOPs after the CALL instruction. If the
+// end of the CALL is aligned with an EH state transition, we *also* insert
+// a single-byte NOP.  **Both forms of NOPs must be preserved.**  They cannot
+// be combined into a single larger NOP; nor can the second NOP be removed.
+//
+// This is necessary because, if ICO is active and the call site is modified
+// by the loader, the loader will end up overwriting the NOPs that were 
inserted
+// for ICO. That means that those NOPs cannot be used for the correct
+// termination of the exception handling region (the IP2State transition),
+// so we still need an additional NOP instruction.  The NOPs cannot be combined
+// into a longer NOP (which is ordinarily desirable) because then ICO would
+// split one instruction, producing a malformed instruction after the ICO call.
+bool X86AsmPrinter::needsNopAfterCallForWindowsEH(const MachineInstr *MI) {
+  // We only need to insert NOPs after CALLs when targeting Windows on AMD64.
+  // Since this code is already restricted to X86, we just test for Win64.
+  if (!this->Subtarget->isTargetWin64()) {
+    return false;
+  }
+
+  MachineBasicBlock::const_iterator MBBI(MI);
+  ++MBBI; // Step over MI
+  auto End = MI->getParent()->end();
+  for (; MBBI != End; ++MBBI) {
+    // Check the instruction that follows this CALL.
+    const MachineInstr &NextMI = *MBBI;
+
+    // If there is an EH_LABEL after this CALL, then there is an EH state
+    // transition after this CALL. This is exactly the situation which requires
+    // NOP padding.
+    if (NextMI.isEHLabel()) {
+      return true;
+    }
+
+    // Somewhat similarly, if the CALL is the last instruction before the
+    // SEH prologue, then we also need a NOP. This is necessary because the
+    // Windows stack unwinder will not invoke a function's exception handler if
+    // the instruction pointer is in the function prologue or epilogue.
+    if (NextMI.getOpcode() == X86::SEH_BeginEpilogue) {
+      return true;
+    }
+
+    if (!NextMI.isPseudo() && !NextMI.isMetaInstruction()) {
+      // We found a real instruction. During the CALL, the return IP will point
+      // to this instruction. Since this instruction has the same EH state as
+      // the call itself (because there is no intervening EH_LABEL), the
+      // IP2State table will be accurate; there is no need to insert a NOP.
+      return false;
+    }
+
+    // The next instruction is a pseudo-op. Ignore it and keep searching.
+    // Because these instructions do not generate any machine code, they cannot
+    // prevent the IP2State table from pointing at the wrong instruction during
+    // a CALL.
+  }
+
+  // The CALL is at the end of the MBB. We do not insert a NOP. If the CALL
+  // is at the end of an MBB that is within a non-null EH state, then the
+  // MBB will already include an EH_LABEL at the end of this MBB. In that
+  // case, we would see the EH_LABEL, not the CALL, at the end of the MBB.
+  return false;
+}
+
 void X86AsmPrinter::emitLabelAndRecordForImportCallOptimization(
     ImportCallKind Kind) {
   assert(EnableImportCallOptimization);
diff --git a/llvm/test/CodeGen/WinEH/wineh-noret-cleanup.ll 
b/llvm/test/CodeGen/WinEH/wineh-noret-cleanup.ll
index e42b005cf64bd..f35248cf8d44d 100644
--- a/llvm/test/CodeGen/WinEH/wineh-noret-cleanup.ll
+++ b/llvm/test/CodeGen/WinEH/wineh-noret-cleanup.ll
@@ -46,15 +46,15 @@ catch.body.2:
 ; CXX-LABEL: $ip2state$test:
 ; CXX-NEXT:   .long   .Lfunc_begin0@IMGREL
 ; CXX-NEXT:   .long   -1
-; CXX-NEXT:   .long   .Ltmp0@IMGREL+1
+; CXX-NEXT:   .long   .Ltmp0@IMGREL
 ; CXX-NEXT:   .long   1
-; CXX-NEXT:   .long   .Ltmp1@IMGREL+1
+; CXX-NEXT:   .long   .Ltmp1@IMGREL
 ; CXX-NEXT:   .long   -1
 ; CXX-NEXT:   .long   "?catch$3@?0?test@4HA"@IMGREL
 ; CXX-NEXT:   .long   2
-; CXX-NEXT:   .long   .Ltmp2@IMGREL+1
+; CXX-NEXT:   .long   .Ltmp2@IMGREL
 ; CXX-NEXT:   .long   3
-; CXX-NEXT:   .long   .Ltmp3@IMGREL+1
+; CXX-NEXT:   .long   .Ltmp3@IMGREL
 ; CXX-NEXT:   .long   2
 ; CXX-NEXT:   .long   "?catch$5@?0?test@4HA"@IMGREL
 ; CXX-NEXT:   .long   4
@@ -62,19 +62,19 @@ catch.body.2:
 ; SEH-LABEL: test:
 ; SEH-LABEL: .Llsda_begin0:
 ; SEH-NEXT:    .long   .Ltmp0@IMGREL
-; SEH-NEXT:    .long   .Ltmp1@IMGREL+1
+; SEH-NEXT:    .long   .Ltmp1@IMGREL
 ; SEH-NEXT:    .long   dummy_filter@IMGREL
 ; SEH-NEXT:    .long   .LBB0_3@IMGREL
 ; SEH-NEXT:    .long   .Ltmp0@IMGREL
-; SEH-NEXT:    .long   .Ltmp1@IMGREL+1
+; SEH-NEXT:    .long   .Ltmp1@IMGREL
 ; SEH-NEXT:    .long   dummy_filter@IMGREL
 ; SEH-NEXT:    .long   .LBB0_5@IMGREL
 ; SEH-NEXT:    .long   .Ltmp2@IMGREL
-; SEH-NEXT:    .long   .Ltmp3@IMGREL+1
+; SEH-NEXT:    .long   .Ltmp3@IMGREL
 ; SEH-NEXT:    .long   "?dtor$2@?0?test@4HA"@IMGREL
 ; SEH-NEXT:    .long   0
 ; SEH-NEXT:    .long   .Ltmp2@IMGREL
-; SEH-NEXT:    .long   .Ltmp3@IMGREL+1
+; SEH-NEXT:    .long   .Ltmp3@IMGREL
 ; SEH-NEXT:    .long   dummy_filter@IMGREL
 ; SEH-NEXT:    .long   .LBB0_5@IMGREL
 ; SEH-NEXT:  .Llsda_end0:

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to