[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-05-02 Thread David Spickett via cfe-commits

DavidSpickett wrote:

I've 
[disabled](https://github.com/llvm/llvm-project/commit/f7fdc8d0cf7cd5caa7d463e1e80e0d5a08c71bb2)
 the test when threads are disabled.

I don't think it needs to be limited to Apple Silicon actually, because a new 
OS thread should pass the test (and apparently does). But @Bigcheese please see 
if you agree with that.

https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-05-02 Thread David Spickett via cfe-commits

DavidSpickett wrote:

That build sets `-DLLVM_ENABLE_THREADS=OFF`. `LLVM_HAS_SPLIT_STACKS_AARCH64` 
and `LLVM_HAS_SPLIT_STACKS` are false, which means we use:
```
void llvm::runOnNewStack(unsigned StackSize, function_ref Fn) {
  llvm::thread Thread(
  StackSize == 0 ? std::nullopt : std::optional(StackSize), Fn);
  Thread.join();
}
```
Without threading I think this just calls the function.

So this test may need to 1 - only run on Apple Silicon and 2 - only when 
threading is enabled. I wonder if it passes on other platforms because stack 
protections make the offset look correct, something like that. We did not see 
this failing on AArch64 Linux for example.

https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-05-02 Thread David Spickett via cfe-commits

DavidSpickett wrote:

Same sort of thing on our Armv8 32-bit build: 
https://lab.llvm.org/buildbot/#/builders/122/builds/1478
```
../llvm/llvm/unittests/Support/ProgramStackTest.cpp:32
Expected: (StackDistance) > (llvm::sys::Process::getPageSizeEstimate()), 
actual: 8 vs 4096
```


https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-05-01 Thread Alan Phipps via cfe-commits

evodius96 wrote:

Hello, I'm seeing ProgramStackTest fail when running on our downstream 
toolchain on Linux:
```
2 |   | FAIL: llvm_regressions :: 
LLVM-Unit/Support/SupportTests/ProgramStackTest/runOnNewStack
3 |   | 

4 |   | Script:
5 |   | --
6 |   | 
/scratch/aphipps/triage_repo/tools/llvm_cgt/build/RelWithAsserts/llvm/unittests/Support/./SupportTests
 --gtest_filter=ProgramStackTest.runOnNewStack
7 |   | --
8 |   | 
/scratch/aphipps/triage_repo/tools/llvm_cgt/llvm-project/llvm/unittests/Support/ProgramStackTest.cpp:32
9 |   | Expected: (StackDistance) > 
(llvm::sys::Process::getPageSizeEstimate()), actual: 16 vs 4096
```
Not sure where to begin with debugging this.  Your commit says it should only 
apply to Apple?

https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-05-01 Thread LLVM Continuous Integration via cfe-commits

llvm-ci wrote:

LLVM Buildbot has detected a new failure on builder `lld-x86_64-win` running on 
`as-worker-93` while building `clang,llvm` at step 7 
"test-build-unified-tree-check-all".

Full details are available at: 
https://lab.llvm.org/buildbot/#/builders/146/builds/2823


Here is the relevant piece of the build log for the reference

```
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
 TEST 'LLVM-Unit :: Support/./SupportTests.exe/90/95' 
FAILED 
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-24332-90-95.json
 GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=95 GTEST_SHARD_INDEX=90 
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe 
--gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): 
error: Expected equality of these values:
  0
  RC
Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): 
error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied







```



https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-04-30 Thread Reid Kleckner via cfe-commits

https://github.com/rnk approved this pull request.


https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-04-30 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,127 @@
+//===--- RunOnNewStack.cpp - Crash Recovery 
---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "llvm/Support/ProgramStack.h"
+#include "llvm/Config/config.h"
+#include "llvm/Support/Compiler.h"
+
+#ifdef LLVM_ON_UNIX
+# include  // for getrlimit
+#endif
+
+#ifdef _MSC_VER
+# include   // for _AddressOfReturnAddress
+#endif
+
+#ifndef LLVM_HAS_SPLIT_STACKS
+# include "llvm/Support/thread.h"
+#endif
+
+using namespace llvm;
+
+uintptr_t llvm::getStackPointer() {
+#if __GNUC__ || __has_builtin(__builtin_frame_address)
+  return (uintptr_t)__builtin_frame_address(0);
+#elif defined(_MSC_VER)
+  return (uintptr_t)_AddressOfReturnAddress();
+#else
+  volatile char CharOnStack = 0;
+  // The volatile store here is intended to escape the local variable, to
+  // prevent the compiler from optimizing CharOnStack into anything other
+  // than a char on the stack.
+  //
+  // Tested on: MSVC 2015 - 2019, GCC 4.9 - 9, Clang 3.2 - 9, ICC 13 - 19.
+  char *volatile Ptr = &CharOnStack;
+  return (uintptr_t)Ptr;
+#endif
+}
+
+unsigned llvm::getDefaultStackSize() {
+#ifdef LLVM_ON_UNIX
+  rlimit RL;
+  getrlimit(RLIMIT_STACK, &RL);
+  return RL.rlim_cur;
+#else
+  // Clang recursively parses, instantiates templates, and evaluates constant
+  // expressions. We've found 8MiB to be a reasonable stack size given the way
+  // Clang works and the way C++ is commonly written.
+  return 8 << 20;
+#endif
+}
+
+// Not an anonymous namespace to avoid warning about undefined local function.
+namespace llvm {
+#ifdef LLVM_HAS_SPLIT_STACKS_AARCH64
+void runOnNewStackImpl(void *Stack, void (*Fn)(void *), void *Ctx) __asm__(
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_");
+
+// This can't use naked functions because there is no way to know if cfi
+// directives are being emitted or not.
+//
+// When adding new platforms it may be better to move to a .S file with macros
+// for dealing with platform differences.
+__asm__ (
+".globl  _ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_\n\t"
+".p2align  2\n\t"
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_:\n\t"
+".cfi_startproc\n\t"
+"mov   x16, sp\n\t"
+"sub   x0, x0, #0x20\n\t"// subtract space from stack
+"stp   xzr, x16, [x0, #0x00]\n\t"// save old sp
+"stp   x29, x30, [x0, #0x10]\n\t"// save fp, lr
+"mov   sp, x0\n\t"   // switch to new stack
+"add   x29, x0, #0x10\n\t"   // switch to new frame
+".cfi_def_cfa w29, 16\n\t"

Bigcheese wrote:

The CFI info is about what `x29` points to, which is only valid after the 
assignment via `add`.  I verified that stopping lldb between the `mov sp, x0` 
and `addx29, x0, #0x10` works.

https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-04-23 Thread Reid Kleckner via cfe-commits

https://github.com/rnk edited https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-04-23 Thread Reid Kleckner via cfe-commits


@@ -0,0 +1,127 @@
+//===--- RunOnNewStack.cpp - Crash Recovery 
---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "llvm/Support/ProgramStack.h"
+#include "llvm/Config/config.h"
+#include "llvm/Support/Compiler.h"
+
+#ifdef LLVM_ON_UNIX
+# include  // for getrlimit
+#endif
+
+#ifdef _MSC_VER
+# include   // for _AddressOfReturnAddress
+#endif
+
+#ifndef LLVM_HAS_SPLIT_STACKS
+# include "llvm/Support/thread.h"
+#endif
+
+using namespace llvm;
+
+uintptr_t llvm::getStackPointer() {
+#if __GNUC__ || __has_builtin(__builtin_frame_address)
+  return (uintptr_t)__builtin_frame_address(0);
+#elif defined(_MSC_VER)
+  return (uintptr_t)_AddressOfReturnAddress();
+#else
+  volatile char CharOnStack = 0;
+  // The volatile store here is intended to escape the local variable, to
+  // prevent the compiler from optimizing CharOnStack into anything other
+  // than a char on the stack.
+  //
+  // Tested on: MSVC 2015 - 2019, GCC 4.9 - 9, Clang 3.2 - 9, ICC 13 - 19.
+  char *volatile Ptr = &CharOnStack;
+  return (uintptr_t)Ptr;
+#endif
+}
+
+unsigned llvm::getDefaultStackSize() {
+#ifdef LLVM_ON_UNIX
+  rlimit RL;
+  getrlimit(RLIMIT_STACK, &RL);
+  return RL.rlim_cur;
+#else
+  // Clang recursively parses, instantiates templates, and evaluates constant
+  // expressions. We've found 8MiB to be a reasonable stack size given the way
+  // Clang works and the way C++ is commonly written.
+  return 8 << 20;
+#endif
+}
+
+// Not an anonymous namespace to avoid warning about undefined local function.
+namespace llvm {
+#ifdef LLVM_HAS_SPLIT_STACKS_AARCH64
+void runOnNewStackImpl(void *Stack, void (*Fn)(void *), void *Ctx) __asm__(
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_");
+
+// This can't use naked functions because there is no way to know if cfi
+// directives are being emitted or not.
+//
+// When adding new platforms it may be better to move to a .S file with macros
+// for dealing with platform differences.
+__asm__ (
+".globl  _ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_\n\t"
+".p2align  2\n\t"
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_:\n\t"
+".cfi_startproc\n\t"
+"mov   x16, sp\n\t"
+"sub   x0, x0, #0x20\n\t"// subtract space from stack
+"stp   xzr, x16, [x0, #0x00]\n\t"// save old sp
+"stp   x29, x30, [x0, #0x10]\n\t"// save fp, lr
+"mov   sp, x0\n\t"   // switch to new stack
+"add   x29, x0, #0x10\n\t"   // switch to new frame
+".cfi_def_cfa w29, 16\n\t"

rnk wrote:

Shouldn't these appear immediately after the stack switch, since as soon as you 
update SP, the values from the parent frame can no longer be recovered? This 
will fix stack unwinding when single-stepping through this code.

https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-04-23 Thread Reid Kleckner via cfe-commits

https://github.com/rnk approved this pull request.

If we ever generalize this, it would make sense to have a standalone .s /.asm 
file in the build.

https://github.com/llvm/llvm-project/pull/136046
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-04-17 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


``bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp,h -- 
llvm/include/llvm/Support/ProgramStack.h llvm/lib/Support/ProgramStack.cpp 
llvm/unittests/Support/ProgramStackTest.cpp clang/include/clang/Basic/Stack.h 
clang/lib/Basic/Stack.cpp clang/lib/Frontend/CompilerInstance.cpp 
llvm/include/llvm/Support/CrashRecoveryContext.h 
llvm/include/llvm/Support/thread.h llvm/lib/Support/CrashRecoveryContext.cpp
``





View the diff from clang-format here.


``diff
diff --git a/clang/lib/Basic/Stack.cpp b/clang/lib/Basic/Stack.cpp
index 8cbb84943..aa3862a95 100644
--- a/clang/lib/Basic/Stack.cpp
+++ b/clang/lib/Basic/Stack.cpp
@@ -49,10 +49,12 @@ void 
clang::runWithSufficientStackSpaceSlow(llvm::function_ref Diag,
   llvm::CrashRecoveryContext CRC;
   // Preserve the BottomOfStack in case RunSafelyOnNewStack uses split stacks.
   uintptr_t PrevBottom = BottomOfStack;
-  CRC.RunSafelyOnNewStack([&] {
-noteBottomOfStack(true);
-Diag();
-Fn();
-  }, DesiredStackSize);
+  CRC.RunSafelyOnNewStack(
+  [&] {
+noteBottomOfStack(true);
+Diag();
+Fn();
+  },
+  DesiredStackSize);
   BottomOfStack = PrevBottom;
 }
diff --git a/llvm/include/llvm/Support/ProgramStack.h 
b/llvm/include/llvm/Support/ProgramStack.h
index 55964c977..524cdd468 100644
--- a/llvm/include/llvm/Support/ProgramStack.h
+++ b/llvm/include/llvm/Support/ProgramStack.h
@@ -18,8 +18,8 @@
 // and other tooling.
 #if defined(__APPLE__) && defined(__MACH__) && defined(__aarch64__) && 
\
 __has_extension(gnu_asm)
-# define LLVM_HAS_SPLIT_STACKS
-# define LLVM_HAS_SPLIT_STACKS_AARCH64
+#define LLVM_HAS_SPLIT_STACKS
+#define LLVM_HAS_SPLIT_STACKS_AARCH64
 #endif
 
 namespace llvm {
diff --git a/llvm/lib/Support/ProgramStack.cpp 
b/llvm/lib/Support/ProgramStack.cpp
index 66f74ff66..993ceb452 100644
--- a/llvm/lib/Support/ProgramStack.cpp
+++ b/llvm/lib/Support/ProgramStack.cpp
@@ -11,15 +11,15 @@
 #include "llvm/Support/Compiler.h"
 
 #ifdef LLVM_ON_UNIX
-# include  // for getrlimit
+#include  // for getrlimit
 #endif
 
 #ifdef _MSC_VER
-# include   // for _AddressOfReturnAddress
+#include  // for _AddressOfReturnAddress
 #endif
 
 #ifndef LLVM_HAS_SPLIT_STACKS
-# include "llvm/Support/thread.h"
+#include "llvm/Support/thread.h"
 #endif
 
 using namespace llvm;
@@ -65,38 +65,34 @@ void runOnNewStackImpl(void *Stack, void (*Fn)(void *), 
void *Ctx) __asm__(
 //
 // When adding new platforms it may be better to move to a .S file with macros
 // for dealing with platform differences.
-__asm__ (
-".globl  _ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_\n\t"
-".p2align  2\n\t"
-"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_:\n\t"
-".cfi_startproc\n\t"
-"mov   x16, sp\n\t"
-"sub   x0, x0, #0x20\n\t"// subtract space from stack
-"stp   xzr, x16, [x0, #0x00]\n\t"// save old sp
-"stp   x29, x30, [x0, #0x10]\n\t"// save fp, lr
-"mov   sp, x0\n\t"   // switch to new stack
-"add   x29, x0, #0x10\n\t"   // switch to new frame
-".cfi_def_cfa w29, 16\n\t"
-".cfi_offset w30, -8\n\t"// lr
-".cfi_offset w29, -16\n\t"   // fp
-
-"mov   x0, x2\n\t"   // Ctx is the only argument
-"blr   x1\n\t"   // call Fn
-
-"ldp   x29, x30, [sp, #0x10]\n\t"// restore fp, lr
-"ldp   xzr, x16, [sp, #0x00]\n\t"// load old sp
-"mov   sp, x16\n\t"
-"ret\n\t"
-".cfi_endproc"
-);
+__asm__(".globl  _ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_\n\t"
+".p2align  2\n\t"
+"_ZN4llvm17runOnNewStackImplEPvPFvS0_ES0_:\n\t"
+".cfi_startproc\n\t"
+"mov   x16, sp\n\t"
+"sub   x0, x0, #0x20\n\t" // subtract space from stack
+"stp   xzr, x16, [x0, #0x00]\n\t" // save old sp
+"stp   x29, x30, [x0, #0x10]\n\t" // save fp, lr
+"mov   sp, x0\n\t"// switch to new stack
+"add   x29, x0, #0x10\n\t"// switch to new frame
+".cfi_def_cfa w29, 16\n\t"
+".cfi_offset w30, -8\n\t"  // lr
+".cfi_offset w29, -16\n\t" // fp
+
+"mov   x0, x2\n\t" // Ctx is the only argument
+"blr   x1\n\t" // call Fn
+
+"ldp   x29, x30, [sp, #0x10]\n\t" // restore fp, lr
+"ldp   xzr, x16, [sp, #0x00]\n\t" // load old sp
+"mov   sp, x16\n\t"
+"ret\n\t"
+".cfi_endproc");
 #endif
 } // namespace llvm
 
 namespace {
 #ifdef LLVM_HAS_SPLIT_STACKS
-void callback(void *Ctx) {
-  (*reinterpret_cast *>(Ctx))();
-}
+void callback(void *Ctx) { (*reinterpret_cast *>(Ctx))(); 
}
 #endif
 } // namespace
 
diff --git a/llvm/unittests/Support/ProgramStackTest.cpp

[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-04-16 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Michael Spencer (Bigcheese)


Changes

Reland https://github.com/llvm/llvm-project/pull/133173. This changes the 
assembly code to use `.cfi_{start,end}proc` directly in a file scope asm 
statement and restricts enabling it to MachO to help ensure that the assembly 
is only used where it's valid. The previous code had issues with the 
`-fno-asynchronous-unwind-tables -g0` configuration due to naked functions 
being responsible for emitting `.cfi_{start,end}proc`. There's not a good way 
to detect if CFI is enabled in general, so we always emit it. Removing CFI is 
only done as a size optimization, so having it for one function is fine.

GCC actually does have `__GCC_HAVE_DWARF2_CFI_ASM` when it's emitting CFI 
directives, but Clang does not. compiler-rt just always emits CFI directives 
when built using Clang.

This also includes the other follow up fixes for GCC compatibility and fixing 
thread.h

---

Clang spawns a new thread to avoid running out of stack space. This can make 
debugging and performance analysis more difficult as how the threads are 
connected is difficult to recover.

This patch introduces `runOnNewStack` and applies it in Clang. On platforms 
that have good support for it this allocates a new stack and moves to it using 
assembly. Doing split stacks like this actually runs on most platforms, but 
many debuggers and unwinders reject the large or backwards stack offsets that 
occur. Apple platforms and tools are known to support this, so this only 
enables it there for now.

---
Full diff: https://github.com/llvm/llvm-project/pull/136046.diff


12 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+4) 
- (modified) clang/include/clang/Basic/Stack.h (+4-1) 
- (modified) clang/lib/Basic/Stack.cpp (+12-28) 
- (modified) clang/lib/Frontend/CompilerInstance.cpp (+1-1) 
- (modified) llvm/include/llvm/Support/CrashRecoveryContext.h (+3) 
- (added) llvm/include/llvm/Support/ProgramStack.h (+63) 
- (modified) llvm/include/llvm/Support/thread.h (+1) 
- (modified) llvm/lib/Support/CMakeLists.txt (+1) 
- (modified) llvm/lib/Support/CrashRecoveryContext.cpp (+11) 
- (added) llvm/lib/Support/ProgramStack.cpp (+127) 
- (modified) llvm/unittests/Support/CMakeLists.txt (+1) 
- (added) llvm/unittests/Support/ProgramStackTest.cpp (+35) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 07ff1251fc1ad..7d714d93ce854 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -195,6 +195,10 @@ Non-comprehensive list of changes in this release
 - Added `__builtin_elementwise_exp10`.
 - For AMDPGU targets, added `__builtin_v_cvt_off_f32_i4` that maps to the 
`v_cvt_off_f32_i4` instruction.
 - Added `__builtin_elementwise_minnum` and `__builtin_elementwise_maxnum`.
+- Clang itself now uses split stacks instead of threads for allocating more
+  stack space when running on Apple AArch64 based platforms. This means that
+  stack traces of Clang from debuggers, crashes, and profilers may look
+  different than before.
 
 New Compiler Flags
 --
diff --git a/clang/include/clang/Basic/Stack.h 
b/clang/include/clang/Basic/Stack.h
index 30ebd94aedd1f..9674b9d9b62c3 100644
--- a/clang/include/clang/Basic/Stack.h
+++ b/clang/include/clang/Basic/Stack.h
@@ -27,7 +27,10 @@ namespace clang {
 
   /// Call this once on each thread, as soon after starting the thread as
   /// feasible, to note the approximate address of the bottom of the stack.
-  void noteBottomOfStack();
+  ///
+  /// \param ForceSet set to true if you know the call is near the bottom of a
+  /// new stack. Used for split stacks.
+  void noteBottomOfStack(bool ForceSet = false);
 
   /// Determine whether the stack is nearly exhausted.
   bool isStackNearlyExhausted();
diff --git a/clang/lib/Basic/Stack.cpp b/clang/lib/Basic/Stack.cpp
index aa15d8e66950f..8cbb84943f8d3 100644
--- a/clang/lib/Basic/Stack.cpp
+++ b/clang/lib/Basic/Stack.cpp
@@ -13,33 +13,13 @@
 
 #include "clang/Basic/Stack.h"
 #include "llvm/Support/CrashRecoveryContext.h"
+#include "llvm/Support/ProgramStack.h"
 
-#ifdef _MSC_VER
-#include   // for _AddressOfReturnAddress
-#endif
+static LLVM_THREAD_LOCAL uintptr_t BottomOfStack = 0;
 
-static LLVM_THREAD_LOCAL void *BottomOfStack = nullptr;
-
-static void *getStackPointer() {
-#if __GNUC__ || __has_builtin(__builtin_frame_address)
-  return __builtin_frame_address(0);
-#elif defined(_MSC_VER)
-  return _AddressOfReturnAddress();
-#else
-  char CharOnStack = 0;
-  // The volatile store here is intended to escape the local variable, to
-  // prevent the compiler from optimizing CharOnStack into anything other
-  // than a char on the stack.
-  //
-  // Tested on: MSVC 2015 - 2019, GCC 4.9 - 9, Clang 3.2 - 9, ICC 13 - 19.
-  char *volatile Ptr = &CharOnStack;
-  return Ptr;
-#endif
-}
-
-void clang::noteBottomOfStack() {
-  if (!BottomOfStack)
-BottomOfStack = getStackPointer();
+void 

[clang] [llvm] Reland: [llvm][clang] Allocate a new stack instead of spawning a new thread to get more stack space (PR #136046)

2025-04-16 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese created 
https://github.com/llvm/llvm-project/pull/136046

Reland https://github.com/llvm/llvm-project/pull/133173. This changes the 
assembly code to use `.cfi_{start,end}proc` directly in a file scope asm 
statement and restricts enabling it to MachO to help ensure that the assembly 
is only used where it's valid. The previous code had issues with the 
`-fno-asynchronous-unwind-tables -g0` configuration due to naked functions 
being responsible for emitting `.cfi_{start,end}proc`. There's not a good way 
to detect if CFI is enabled in general, so we always emit it. Removing CFI is 
only done as a size optimization, so having it for one function is fine.

GCC actually does have `__GCC_HAVE_DWARF2_CFI_ASM` when it's emitting CFI 
directives, but Clang does not. compiler-rt just always emits CFI directives 
when built using Clang.

This also includes the other follow up fixes for GCC compatibility and fixing 
thread.h

---

Clang spawns a new thread to avoid running out of stack space. This can make 
debugging and performance analysis more difficult as how the threads are 
connected is difficult to recover.

This patch introduces `runOnNewStack` and applies it in Clang. On platforms 
that have good support for it this allocates a new stack and moves to it using 
assembly. Doing split stacks like this actually runs on most platforms, but 
many debuggers and unwinders reject the large or backwards stack offsets that 
occur. Apple platforms and tools are known to support this, so this only 
enables it there for now.

>From 2c44bbbf944ec9e2d15c431342a835ef1c7ed18c Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Wed, 26 Mar 2025 14:48:19 -0700
Subject: [PATCH] [llvm][clang] Allocate a new stack instead of spawning a new
 thread to get more stack space

Clang spawns a new thread to avoid running out of stack space. This
can make debugging and performance analysis more difficult as how the
threads are connected is difficult to recover.

This patch introduces `runOnNewStack` and applies it in Clang. On
platforms that have good support for it this allocates a new stack and
moves to it using assembly. Doing split stacks like this actually runs
on most platforms, but many debuggers and unwinders reject the large
or backwards stack offsets that occur. Apple platforms and tools are
known to support this, so this only enables it there for now.
---
 clang/docs/ReleaseNotes.rst   |   4 +
 clang/include/clang/Basic/Stack.h |   5 +-
 clang/lib/Basic/Stack.cpp |  40 ++
 clang/lib/Frontend/CompilerInstance.cpp   |   2 +-
 .../llvm/Support/CrashRecoveryContext.h   |   3 +
 llvm/include/llvm/Support/ProgramStack.h  |  63 +
 llvm/include/llvm/Support/thread.h|   1 +
 llvm/lib/Support/CMakeLists.txt   |   1 +
 llvm/lib/Support/CrashRecoveryContext.cpp |  11 ++
 llvm/lib/Support/ProgramStack.cpp | 127 ++
 llvm/unittests/Support/CMakeLists.txt |   1 +
 llvm/unittests/Support/ProgramStackTest.cpp   |  35 +
 12 files changed, 263 insertions(+), 30 deletions(-)
 create mode 100644 llvm/include/llvm/Support/ProgramStack.h
 create mode 100644 llvm/lib/Support/ProgramStack.cpp
 create mode 100644 llvm/unittests/Support/ProgramStackTest.cpp

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 07ff1251fc1ad..7d714d93ce854 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -195,6 +195,10 @@ Non-comprehensive list of changes in this release
 - Added `__builtin_elementwise_exp10`.
 - For AMDPGU targets, added `__builtin_v_cvt_off_f32_i4` that maps to the 
`v_cvt_off_f32_i4` instruction.
 - Added `__builtin_elementwise_minnum` and `__builtin_elementwise_maxnum`.
+- Clang itself now uses split stacks instead of threads for allocating more
+  stack space when running on Apple AArch64 based platforms. This means that
+  stack traces of Clang from debuggers, crashes, and profilers may look
+  different than before.
 
 New Compiler Flags
 --
diff --git a/clang/include/clang/Basic/Stack.h 
b/clang/include/clang/Basic/Stack.h
index 30ebd94aedd1f..9674b9d9b62c3 100644
--- a/clang/include/clang/Basic/Stack.h
+++ b/clang/include/clang/Basic/Stack.h
@@ -27,7 +27,10 @@ namespace clang {
 
   /// Call this once on each thread, as soon after starting the thread as
   /// feasible, to note the approximate address of the bottom of the stack.
-  void noteBottomOfStack();
+  ///
+  /// \param ForceSet set to true if you know the call is near the bottom of a
+  /// new stack. Used for split stacks.
+  void noteBottomOfStack(bool ForceSet = false);
 
   /// Determine whether the stack is nearly exhausted.
   bool isStackNearlyExhausted();
diff --git a/clang/lib/Basic/Stack.cpp b/clang/lib/Basic/Stack.cpp
index aa15d8e66950f..8cbb84943f8d3 100644
--- a/clang/lib/Basic/Stack.cpp
+++ b/clang/lib/Basic/Stack.cpp