[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
DavidSpickett wrote: I've [disabled](https://github.com/llvm/llvm-project/commit/f7fdc8d0cf7cd5caa7d463e1e80e0d5a08c71bb2) the test when threads are disabled. I don't think it needs to be limited to Apple Silicon actually, because a new OS thread should pass the test (and apparently does). But @Bigcheese please see if you agree with that. https://github.com/llvm/llvm-project/pull/136046 ___ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
DavidSpickett wrote:
That build sets `-DLLVM_ENABLE_THREADS=OFF`. `LLVM_HAS_SPLIT_STACKS_AARCH64`
and `LLVM_HAS_SPLIT_STACKS` are false, which means we use:
```
void llvm::runOnNewStack(unsigned StackSize, function_ref Fn) {
llvm::thread Thread(
StackSize == 0 ? std::nullopt : std::optional(StackSize), Fn);
Thread.join();
}
```
Without threading I think this just calls the function.
So this test may need to 1 - only run on Apple Silicon and 2 - only when
threading is enabled. I wonder if it passes on other platforms because stack
protections make the offset look correct, something like that. We did not see
this failing on AArch64 Linux for example.
https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
DavidSpickett wrote: Same sort of thing on our Armv8 32-bit build: https://lab.llvm.org/buildbot/#/builders/122/builds/1478 ``` ../llvm/llvm/unittests/Support/ProgramStackTest.cpp:32 Expected: (StackDistance) > (llvm::sys::Process::getPageSizeEstimate()), actual: 8 vs 4096 ``` https://github.com/llvm/llvm-project/pull/136046 ___ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
evodius96 wrote: Hello, I'm seeing ProgramStackTest fail when running on our downstream toolchain on Linux: ``` 2 | | FAIL: llvm_regressions :: LLVM-Unit/Support/SupportTests/ProgramStackTest/runOnNewStack 3 | | 4 | | Script: 5 | | -- 6 | | /scratch/aphipps/triage_repo/tools/llvm_cgt/build/RelWithAsserts/llvm/unittests/Support/./SupportTests --gtest_filter=ProgramStackTest.runOnNewStack 7 | | -- 8 | | /scratch/aphipps/triage_repo/tools/llvm_cgt/llvm-project/llvm/unittests/Support/ProgramStackTest.cpp:32 9 | | Expected: (StackDistance) > (llvm::sys::Process::getPageSizeEstimate()), actual: 16 vs 4096 ``` Not sure where to begin with debugging this. Your commit says it should only apply to Apple? https://github.com/llvm/llvm-project/pull/136046 ___ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `lld-x86_64-win` running on `as-worker-93` while building `clang,llvm` at step 7 "test-build-unified-tree-check-all". Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/2823 Here is the relevant piece of the build log for the reference ``` Step 7 (test-build-unified-tree-check-all) failure: test (failure) TEST 'LLVM-Unit :: Support/./SupportTests.exe/90/95' FAILED Script(shard): -- GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-24332-90-95.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=95 GTEST_SHARD_INDEX=90 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe -- Script: -- C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath -- C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values: 0 RC Which is: -2 C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success. error number: 13 error message: permission denied C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160 Expected equality of these values: 0 RC Which is: -2 C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163 fs::remove(Twine(LongPath)): did not return errc::success. error number: 13 error message: permission denied ``` https://github.com/llvm/llvm-project/pull/136046 ___ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
https://github.com/rnk approved this pull request. https://github.com/llvm/llvm-project/pull/136046 ___ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
@@ -0,0 +1,127 @@
+//===--- RunOnNewStack.cpp - Crash Recovery
---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "llvm/Support/ProgramStack.h"
+#include "llvm/Config/config.h"
+#include "llvm/Support/Compiler.h"
+
+#ifdef LLVM_ON_UNIX
+# include // for getrlimit
+#endif
+
+#ifdef _MSC_VER
+# include // for _AddressOfReturnAddress
+#endif
+
+#ifndef LLVM_HAS_SPLIT_STACKS
+# include "llvm/Support/thread.h"
+#endif
+
+using namespace llvm;
+
+uintptr_t llvm::getStackPointer() {
+#if __GNUC__ || __has_builtin(__builtin_frame_address)
+ return (uintptr_t)__builtin_frame_address(0);
+#elif defined(_MSC_VER)
+ return (uintptr_t)_AddressOfReturnAddress();
+#else
+ volatile char CharOnStack = 0;
+ // The volatile store here is intended to escape the local variable, to
+ // prevent the compiler from optimizing CharOnStack into anything other
+ // than a char on the stack.
+ //
+ // Tested on: MSVC 2015 - 2019, GCC 4.9 - 9, Clang 3.2 - 9, ICC 13 - 19.
+ char *volatile Ptr = &CharOnStack;
+ return (uintptr_t)Ptr;
+#endif
+}
+
+unsigned llvm::getDefaultStackSize() {
+#ifdef LLVM_ON_UNIX
+ rlimit RL;
+ getrlimit(RLIMIT_STACK, &RL);
+ return RL.rlim_cur;
+#else
+ // Clang recursively parses, instantiates templates, and evaluates constant
+ // expressions. We've found 8MiB to be a reasonable stack size given the way
+ // Clang works and the way C++ is commonly written.
+ return 8 << 20;
+#endif
+}
+
+// Not an anonymous namespace to avoid warning about undefined local function.
+namespace llvm {
+#ifdef LLVM_HAS_SPLIT_STACKS_AARCH64
+void runOnNewStackImpl(void *Stack, void (*Fn)(void *), void *Ctx) __asm__(
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_");
+
+// This can't use naked functions because there is no way to know if cfi
+// directives are being emitted or not.
+//
+// When adding new platforms it may be better to move to a .S file with macros
+// for dealing with platform differences.
+__asm__ (
+".globl _ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_\n\t"
+".p2align 2\n\t"
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_:\n\t"
+".cfi_startproc\n\t"
+"mov x16, sp\n\t"
+"sub x0, x0, #0x20\n\t"// subtract space from stack
+"stp xzr, x16, [x0, #0x00]\n\t"// save old sp
+"stp x29, x30, [x0, #0x10]\n\t"// save fp, lr
+"mov sp, x0\n\t" // switch to new stack
+"add x29, x0, #0x10\n\t" // switch to new frame
+".cfi_def_cfa w29, 16\n\t"
Bigcheese wrote:
The CFI info is about what `x29` points to, which is only valid after the
assignment via `add`. I verified that stopping lldb between the `mov sp, x0`
and `addx29, x0, #0x10` works.
https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
https://github.com/rnk edited https://github.com/llvm/llvm-project/pull/136046 ___ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
@@ -0,0 +1,127 @@
+//===--- RunOnNewStack.cpp - Crash Recovery
---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "llvm/Support/ProgramStack.h"
+#include "llvm/Config/config.h"
+#include "llvm/Support/Compiler.h"
+
+#ifdef LLVM_ON_UNIX
+# include // for getrlimit
+#endif
+
+#ifdef _MSC_VER
+# include // for _AddressOfReturnAddress
+#endif
+
+#ifndef LLVM_HAS_SPLIT_STACKS
+# include "llvm/Support/thread.h"
+#endif
+
+using namespace llvm;
+
+uintptr_t llvm::getStackPointer() {
+#if __GNUC__ || __has_builtin(__builtin_frame_address)
+ return (uintptr_t)__builtin_frame_address(0);
+#elif defined(_MSC_VER)
+ return (uintptr_t)_AddressOfReturnAddress();
+#else
+ volatile char CharOnStack = 0;
+ // The volatile store here is intended to escape the local variable, to
+ // prevent the compiler from optimizing CharOnStack into anything other
+ // than a char on the stack.
+ //
+ // Tested on: MSVC 2015 - 2019, GCC 4.9 - 9, Clang 3.2 - 9, ICC 13 - 19.
+ char *volatile Ptr = &CharOnStack;
+ return (uintptr_t)Ptr;
+#endif
+}
+
+unsigned llvm::getDefaultStackSize() {
+#ifdef LLVM_ON_UNIX
+ rlimit RL;
+ getrlimit(RLIMIT_STACK, &RL);
+ return RL.rlim_cur;
+#else
+ // Clang recursively parses, instantiates templates, and evaluates constant
+ // expressions. We've found 8MiB to be a reasonable stack size given the way
+ // Clang works and the way C++ is commonly written.
+ return 8 << 20;
+#endif
+}
+
+// Not an anonymous namespace to avoid warning about undefined local function.
+namespace llvm {
+#ifdef LLVM_HAS_SPLIT_STACKS_AARCH64
+void runOnNewStackImpl(void *Stack, void (*Fn)(void *), void *Ctx) __asm__(
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_");
+
+// This can't use naked functions because there is no way to know if cfi
+// directives are being emitted or not.
+//
+// When adding new platforms it may be better to move to a .S file with macros
+// for dealing with platform differences.
+__asm__ (
+".globl _ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_\n\t"
+".p2align 2\n\t"
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_:\n\t"
+".cfi_startproc\n\t"
+"mov x16, sp\n\t"
+"sub x0, x0, #0x20\n\t"// subtract space from stack
+"stp xzr, x16, [x0, #0x00]\n\t"// save old sp
+"stp x29, x30, [x0, #0x10]\n\t"// save fp, lr
+"mov sp, x0\n\t" // switch to new stack
+"add x29, x0, #0x10\n\t" // switch to new frame
+".cfi_def_cfa w29, 16\n\t"
rnk wrote:
Shouldn't these appear immediately after the stack switch, since as soon as you
update SP, the values from the parent frame can no longer be recovered? This
will fix stack unwinding when single-stepping through this code.
https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
https://github.com/rnk approved this pull request. If we ever generalize this, it would make sense to have a standalone .s /.asm file in the build. https://github.com/llvm/llvm-project/pull/136046 ___ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
github-actions[bot] wrote:
:warning: C/C++ code formatter, clang-format found issues in your code.
:warning:
You can test this locally with the following command:
``bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp,h --
llvm/include/llvm/Support/ProgramStack.h llvm/lib/Support/ProgramStack.cpp
llvm/unittests/Support/ProgramStackTest.cpp clang/include/clang/Basic/Stack.h
clang/lib/Basic/Stack.cpp clang/lib/Frontend/CompilerInstance.cpp
llvm/include/llvm/Support/CrashRecoveryContext.h
llvm/include/llvm/Support/thread.h llvm/lib/Support/CrashRecoveryContext.cpp
``
View the diff from clang-format here.
``diff
diff --git a/clang/lib/Basic/Stack.cpp b/clang/lib/Basic/Stack.cpp
index 8cbb84943..aa3862a95 100644
--- a/clang/lib/Basic/Stack.cpp
+++ b/clang/lib/Basic/Stack.cpp
@@ -49,10 +49,12 @@ void
clang::runWithSufficientStackSpaceSlow(llvm::function_ref Diag,
llvm::CrashRecoveryContext CRC;
// Preserve the BottomOfStack in case RunSafelyOnNewStack uses split stacks.
uintptr_t PrevBottom = BottomOfStack;
- CRC.RunSafelyOnNewStack([&] {
-noteBottomOfStack(true);
-Diag();
-Fn();
- }, DesiredStackSize);
+ CRC.RunSafelyOnNewStack(
+ [&] {
+noteBottomOfStack(true);
+Diag();
+Fn();
+ },
+ DesiredStackSize);
BottomOfStack = PrevBottom;
}
diff --git a/llvm/include/llvm/Support/ProgramStack.h
b/llvm/include/llvm/Support/ProgramStack.h
index 55964c977..524cdd468 100644
--- a/llvm/include/llvm/Support/ProgramStack.h
+++ b/llvm/include/llvm/Support/ProgramStack.h
@@ -18,8 +18,8 @@
// and other tooling.
#if defined(__APPLE__) && defined(__MACH__) && defined(__aarch64__) &&
\
__has_extension(gnu_asm)
-# define LLVM_HAS_SPLIT_STACKS
-# define LLVM_HAS_SPLIT_STACKS_AARCH64
+#define LLVM_HAS_SPLIT_STACKS
+#define LLVM_HAS_SPLIT_STACKS_AARCH64
#endif
namespace llvm {
diff --git a/llvm/lib/Support/ProgramStack.cpp
b/llvm/lib/Support/ProgramStack.cpp
index 66f74ff66..993ceb452 100644
--- a/llvm/lib/Support/ProgramStack.cpp
+++ b/llvm/lib/Support/ProgramStack.cpp
@@ -11,15 +11,15 @@
#include "llvm/Support/Compiler.h"
#ifdef LLVM_ON_UNIX
-# include // for getrlimit
+#include // for getrlimit
#endif
#ifdef _MSC_VER
-# include // for _AddressOfReturnAddress
+#include // for _AddressOfReturnAddress
#endif
#ifndef LLVM_HAS_SPLIT_STACKS
-# include "llvm/Support/thread.h"
+#include "llvm/Support/thread.h"
#endif
using namespace llvm;
@@ -65,38 +65,34 @@ void runOnNewStackImpl(void *Stack, void (*Fn)(void *),
void *Ctx) __asm__(
//
// When adding new platforms it may be better to move to a .S file with macros
// for dealing with platform differences.
-__asm__ (
-".globl _ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_\n\t"
-".p2align 2\n\t"
-"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_:\n\t"
-".cfi_startproc\n\t"
-"mov x16, sp\n\t"
-"sub x0, x0, #0x20\n\t"// subtract space from stack
-"stp xzr, x16, [x0, #0x00]\n\t"// save old sp
-"stp x29, x30, [x0, #0x10]\n\t"// save fp, lr
-"mov sp, x0\n\t" // switch to new stack
-"add x29, x0, #0x10\n\t" // switch to new frame
-".cfi_def_cfa w29, 16\n\t"
-".cfi_offset w30, -8\n\t"// lr
-".cfi_offset w29, -16\n\t" // fp
-
-"mov x0, x2\n\t" // Ctx is the only argument
-"blr x1\n\t" // call Fn
-
-"ldp x29, x30, [sp, #0x10]\n\t"// restore fp, lr
-"ldp xzr, x16, [sp, #0x00]\n\t"// load old sp
-"mov sp, x16\n\t"
-"ret\n\t"
-".cfi_endproc"
-);
+__asm__(".globl _ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_\n\t"
+".p2align 2\n\t"
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_:\n\t"
+".cfi_startproc\n\t"
+"mov x16, sp\n\t"
+"sub x0, x0, #0x20\n\t" // subtract space from stack
+"stp xzr, x16, [x0, #0x00]\n\t" // save old sp
+"stp x29, x30, [x0, #0x10]\n\t" // save fp, lr
+"mov sp, x0\n\t"// switch to new stack
+"add x29, x0, #0x10\n\t"// switch to new frame
+".cfi_def_cfa w29, 16\n\t"
+".cfi_offset w30, -8\n\t" // lr
+".cfi_offset w29, -16\n\t" // fp
+
+"mov x0, x2\n\t" // Ctx is the only argument
+"blr x1\n\t" // call Fn
+
+"ldp x29, x30, [sp, #0x10]\n\t" // restore fp, lr
+"ldp xzr, x16, [sp, #0x00]\n\t" // load old sp
+"mov sp, x16\n\t"
+"ret\n\t"
+".cfi_endproc");
#endif
} // namespace llvm
namespace {
#ifdef LLVM_HAS_SPLIT_STACKS
-void callback(void *Ctx) {
- (*reinterpret_cast *>(Ctx))();
-}
+void callback(void *Ctx) { (*reinterpret_cast *>(Ctx))();
}
#endif
} // namespace
diff --git a/llvm/unittests/Support/ProgramStackTest.cpp
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
llvmbot wrote:
@llvm/pr-subscribers-clang
Author: Michael Spencer (Bigcheese)
Changes
Reland https://github.com/llvm/llvm-project/pull/133173. This changes the
assembly code to use `.cfi_{start,end}proc` directly in a file scope asm
statement and restricts enabling it to MachO to help ensure that the assembly
is only used where it's valid. The previous code had issues with the
`-fno-asynchronous-unwind-tables -g0` configuration due to naked functions
being responsible for emitting `.cfi_{start,end}proc`. There's not a good way
to detect if CFI is enabled in general, so we always emit it. Removing CFI is
only done as a size optimization, so having it for one function is fine.
GCC actually does have `__GCC_HAVE_DWARF2_CFI_ASM` when it's emitting CFI
directives, but Clang does not. compiler-rt just always emits CFI directives
when built using Clang.
This also includes the other follow up fixes for GCC compatibility and fixing
thread.h
---
Clang spawns a new thread to avoid running out of stack space. This can make
debugging and performance analysis more difficult as how the threads are
connected is difficult to recover.
This patch introduces `runOnNewStack` and applies it in Clang. On platforms
that have good support for it this allocates a new stack and moves to it using
assembly. Doing split stacks like this actually runs on most platforms, but
many debuggers and unwinders reject the large or backwards stack offsets that
occur. Apple platforms and tools are known to support this, so this only
enables it there for now.
---
Full diff: https://github.com/llvm/llvm-project/pull/136046.diff
12 Files Affected:
- (modified) clang/docs/ReleaseNotes.rst (+4)
- (modified) clang/include/clang/Basic/Stack.h (+4-1)
- (modified) clang/lib/Basic/Stack.cpp (+12-28)
- (modified) clang/lib/Frontend/CompilerInstance.cpp (+1-1)
- (modified) llvm/include/llvm/Support/CrashRecoveryContext.h (+3)
- (added) llvm/include/llvm/Support/ProgramStack.h (+63)
- (modified) llvm/include/llvm/Support/thread.h (+1)
- (modified) llvm/lib/Support/CMakeLists.txt (+1)
- (modified) llvm/lib/Support/CrashRecoveryContext.cpp (+11)
- (added) llvm/lib/Support/ProgramStack.cpp (+127)
- (modified) llvm/unittests/Support/CMakeLists.txt (+1)
- (added) llvm/unittests/Support/ProgramStackTest.cpp (+35)
``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 07ff1251fc1ad..7d714d93ce854 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -195,6 +195,10 @@ Non-comprehensive list of changes in this release
- Added `__builtin_elementwise_exp10`.
- For AMDPGU targets, added `__builtin_v_cvt_off_f32_i4` that maps to the
`v_cvt_off_f32_i4` instruction.
- Added `__builtin_elementwise_minnum` and `__builtin_elementwise_maxnum`.
+- Clang itself now uses split stacks instead of threads for allocating more
+ stack space when running on Apple AArch64 based platforms. This means that
+ stack traces of Clang from debuggers, crashes, and profilers may look
+ different than before.
New Compiler Flags
--
diff --git a/clang/include/clang/Basic/Stack.h
b/clang/include/clang/Basic/Stack.h
index 30ebd94aedd1f..9674b9d9b62c3 100644
--- a/clang/include/clang/Basic/Stack.h
+++ b/clang/include/clang/Basic/Stack.h
@@ -27,7 +27,10 @@ namespace clang {
/// Call this once on each thread, as soon after starting the thread as
/// feasible, to note the approximate address of the bottom of the stack.
- void noteBottomOfStack();
+ ///
+ /// \param ForceSet set to true if you know the call is near the bottom of a
+ /// new stack. Used for split stacks.
+ void noteBottomOfStack(bool ForceSet = false);
/// Determine whether the stack is nearly exhausted.
bool isStackNearlyExhausted();
diff --git a/clang/lib/Basic/Stack.cpp b/clang/lib/Basic/Stack.cpp
index aa15d8e66950f..8cbb84943f8d3 100644
--- a/clang/lib/Basic/Stack.cpp
+++ b/clang/lib/Basic/Stack.cpp
@@ -13,33 +13,13 @@
#include "clang/Basic/Stack.h"
#include "llvm/Support/CrashRecoveryContext.h"
+#include "llvm/Support/ProgramStack.h"
-#ifdef _MSC_VER
-#include // for _AddressOfReturnAddress
-#endif
+static LLVM_THREAD_LOCAL uintptr_t BottomOfStack = 0;
-static LLVM_THREAD_LOCAL void *BottomOfStack = nullptr;
-
-static void *getStackPointer() {
-#if __GNUC__ || __has_builtin(__builtin_frame_address)
- return __builtin_frame_address(0);
-#elif defined(_MSC_VER)
- return _AddressOfReturnAddress();
-#else
- char CharOnStack = 0;
- // The volatile store here is intended to escape the local variable, to
- // prevent the compiler from optimizing CharOnStack into anything other
- // than a char on the stack.
- //
- // Tested on: MSVC 2015 - 2019, GCC 4.9 - 9, Clang 3.2 - 9, ICC 13 - 19.
- char *volatile Ptr = &CharOnStack;
- return Ptr;
-#endif
-}
-
-void clang::noteBottomOfStack() {
- if (!BottomOfStack)
-BottomOfStack = getStackPointer();
+void
[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)
https://github.com/Bigcheese created
https://github.com/llvm/llvm-project/pull/136046
Reland https://github.com/llvm/llvm-project/pull/133173. This changes the
assembly code to use `.cfi_{start,end}proc` directly in a file scope asm
statement and restricts enabling it to MachO to help ensure that the assembly
is only used where it's valid. The previous code had issues with the
`-fno-asynchronous-unwind-tables -g0` configuration due to naked functions
being responsible for emitting `.cfi_{start,end}proc`. There's not a good way
to detect if CFI is enabled in general, so we always emit it. Removing CFI is
only done as a size optimization, so having it for one function is fine.
GCC actually does have `__GCC_HAVE_DWARF2_CFI_ASM` when it's emitting CFI
directives, but Clang does not. compiler-rt just always emits CFI directives
when built using Clang.
This also includes the other follow up fixes for GCC compatibility and fixing
thread.h
---
Clang spawns a new thread to avoid running out of stack space. This can make
debugging and performance analysis more difficult as how the threads are
connected is difficult to recover.
This patch introduces `runOnNewStack` and applies it in Clang. On platforms
that have good support for it this allocates a new stack and moves to it using
assembly. Doing split stacks like this actually runs on most platforms, but
many debuggers and unwinders reject the large or backwards stack offsets that
occur. Apple platforms and tools are known to support this, so this only
enables it there for now.
>From 2c44bbbf944ec9e2d15c431342a835ef1c7ed18c Mon Sep 17 00:00:00 2001
From: Michael Spencer
Date: Wed, 26 Mar 2025 14:48:19 -0700
Subject: [PATCH] [llvm][clang] Allocate a new stack instead of spawning a new
thread to get more stack space
Clang spawns a new thread to avoid running out of stack space. This
can make debugging and performance analysis more difficult as how the
threads are connected is difficult to recover.
This patch introduces `runOnNewStack` and applies it in Clang. On
platforms that have good support for it this allocates a new stack and
moves to it using assembly. Doing split stacks like this actually runs
on most platforms, but many debuggers and unwinders reject the large
or backwards stack offsets that occur. Apple platforms and tools are
known to support this, so this only enables it there for now.
---
clang/docs/ReleaseNotes.rst | 4 +
clang/include/clang/Basic/Stack.h | 5 +-
clang/lib/Basic/Stack.cpp | 40 ++
clang/lib/Frontend/CompilerInstance.cpp | 2 +-
.../llvm/Support/CrashRecoveryContext.h | 3 +
llvm/include/llvm/Support/ProgramStack.h | 63 +
llvm/include/llvm/Support/thread.h| 1 +
llvm/lib/Support/CMakeLists.txt | 1 +
llvm/lib/Support/CrashRecoveryContext.cpp | 11 ++
llvm/lib/Support/ProgramStack.cpp | 127 ++
llvm/unittests/Support/CMakeLists.txt | 1 +
llvm/unittests/Support/ProgramStackTest.cpp | 35 +
12 files changed, 263 insertions(+), 30 deletions(-)
create mode 100644 llvm/include/llvm/Support/ProgramStack.h
create mode 100644 llvm/lib/Support/ProgramStack.cpp
create mode 100644 llvm/unittests/Support/ProgramStackTest.cpp
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 07ff1251fc1ad..7d714d93ce854 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -195,6 +195,10 @@ Non-comprehensive list of changes in this release
- Added `__builtin_elementwise_exp10`.
- For AMDPGU targets, added `__builtin_v_cvt_off_f32_i4` that maps to the
`v_cvt_off_f32_i4` instruction.
- Added `__builtin_elementwise_minnum` and `__builtin_elementwise_maxnum`.
+- Clang itself now uses split stacks instead of threads for allocating more
+ stack space when running on Apple AArch64 based platforms. This means that
+ stack traces of Clang from debuggers, crashes, and profilers may look
+ different than before.
New Compiler Flags
--
diff --git a/clang/include/clang/Basic/Stack.h
b/clang/include/clang/Basic/Stack.h
index 30ebd94aedd1f..9674b9d9b62c3 100644
--- a/clang/include/clang/Basic/Stack.h
+++ b/clang/include/clang/Basic/Stack.h
@@ -27,7 +27,10 @@ namespace clang {
/// Call this once on each thread, as soon after starting the thread as
/// feasible, to note the approximate address of the bottom of the stack.
- void noteBottomOfStack();
+ ///
+ /// \param ForceSet set to true if you know the call is near the bottom of a
+ /// new stack. Used for split stacks.
+ void noteBottomOfStack(bool ForceSet = false);
/// Determine whether the stack is nearly exhausted.
bool isStackNearlyExhausted();
diff --git a/clang/lib/Basic/Stack.cpp b/clang/lib/Basic/Stack.cpp
index aa15d8e66950f..8cbb84943f8d3 100644
--- a/clang/lib/Basic/Stack.cpp
+++ b/clang/lib/Basic/Stack.cpp
