https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/71739
From 5e378ae3efdebedb044528167131c8cae4571a59 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 7 Nov 2023 17:12:31 -0600
Subject: [PATCH] [OpenMP] Rework handling of global ctor/dtors in OpenMP
Summary:
This patch reworks how we handle global constructors in OpenMP.
Previously, we emitted individual kernels that were all registered and
called individually. In order to provide more generic support, this
patch moves all handling of this to the target backend and the runtime
plugin. This has the benefit of supporting the GNU extensions for
constructors and destructors, removing a class of failures related to
shared library destruction order, and allowing targets other than OpenMP
to use the same support without needing to change the frontend.
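For reference, the GNU extensions mentioned above are the `constructor` and `destructor` function attributes. The following is a minimal host-side sketch of how they behave, not code from this patch; every function and variable name in it is hypothetical.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical bookkeeping used only for this illustration.
static int init_order[2];
static int init_count = 0;

// Constructors with a lower priority value run earlier. Values up to 100
// are reserved for the implementation, so user code starts at 101.
__attribute__((constructor(101))) static void early_init() {
  init_order[init_count++] = 101;
}

__attribute__((constructor(102))) static void late_init() {
  init_order[init_count++] = 102;
}

// Runs after main returns (or when exit is called).
__attribute__((destructor)) static void teardown() {
  std::puts("teardown ran at program exit");
}

// Number of attribute-marked constructors that have run so far.
int constructors_run() { return init_count; }

// Priority of the constructor that ran first.
int first_priority() {
  assert(init_count > 0 && "no constructor has run yet");
  return init_order[0];
}
```

Both constructors have already run by the time `main` is entered, so no explicit registration call appears anywhere in user code.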
This is primarily done by calling kernels that the backend emits to
iterate over a list of ctor / dtor functions. For x64, this is automatic and
we get it for free with the standard `dlopen` handling. For AMDGPU, we
emit `amdgcn.device.init` and `amdgcn.device.fini` functions which
handle everything automatically and simply need to be called. For NVPTX,
a patch https://github.com/llvm/llvm-project/pull/71549 provides the
kernels to call, but the runtime needs to set up the array manually by
pulling out all the known constructor / destructor functions.
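The idea of iterating a list of ctor / dtor functions can be sketched on the host as below. This is only an illustration under stated assumptions: the `InitEntry` type, the priority ordering, and all function names are hypothetical and are not taken from the backend or the plugin code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical table entry: a priority plus the function to call.
struct InitEntry {
  uint32_t priority;
  void (*fn)();
};

// Records the order in which the sample functions run.
static std::vector<int> trace;
static void ctor_a() { trace.push_back(1); }
static void ctor_b() { trace.push_back(2); }

// Assumed ordering for constructors: ascending priority. The stable sort
// keeps entries with equal priority in their original order.
void run_init_array(std::vector<InitEntry> entries) {
  std::stable_sort(entries.begin(), entries.end(),
                   [](const InitEntry &l, const InitEntry &r) {
                     return l.priority < r.priority;
                   });
  for (const InitEntry &e : entries)
    e.fn();
}

// Assumed ordering for destructors: descending priority.
void run_fini_array(std::vector<InitEntry> entries) {
  std::stable_sort(entries.begin(), entries.end(),
                   [](const InitEntry &l, const InitEntry &r) {
                     return l.priority > r.priority;
                   });
  for (const InitEntry &e : entries)
    e.fn();
}
```

In this sketch the caller only has to collect the entries; the ordering policy lives in one place, which mirrors the split between "set up the array" and "call the kernel" described above.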
One concession this patch requires is that GPU targets in OpenMP
offloading will now use `llvm.global_dtors` instead of `atexit`. This is
because `atexit` is a separate runtime function that does not mesh well
with the handling we're trying to do here. This should be equivalent in
all cases except those where we would need to destruct manually, such
as:
```
void foo();
struct S { ~S() { foo(); } };
void foo() {
  static S s;
}
```
However, this is broken in many other ways on the GPU, so this patch
does not regress any support; it simply increases the scope of what we
can handle.
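To see why a function-local static does not fit a fixed destructor list, consider this host-only sketch. It is an illustration, not the patch's mechanism: `runtime_registered` stands in for `atexit`, and every name here is hypothetical.

```cpp
#include <cassert>
#include <vector>

using Callback = void (*)();

// Counts how many times the stand-in destructor has run.
static int destroyed = 0;
static void destroy_s() { ++destroyed; }

// Stand-in for `atexit`: callbacks are appended while the program runs.
static std::vector<Callback> runtime_registered;

// Stand-in for a function with a local static object: the destructor is
// registered on the first call only, so it depends on control flow.
void foo() {
  static bool initialized = false;
  if (!initialized) {
    initialized = true;
    runtime_registered.push_back(destroy_s);
  }
}

// Runs the callbacks in reverse registration order, as `atexit` does,
// and returns how many were run.
int run_registered() {
  int count = 0;
  for (auto it = runtime_registered.rbegin(); it != runtime_registered.rend();
       ++it) {
    (*it)();
    ++count;
  }
  runtime_registered.clear();
  return count;
}
```

If `foo` is never called, nothing is registered and nothing must be destroyed. A list fixed at build time cannot express that condition, which is the case called out above.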
This changes the handling of ctors / dtors. This patch now outputs an
informational message regarding the deprecation if the old format is used.
Support for the old format will be completely removed in a later release.
Depends on: https://github.com/llvm/llvm-project/pull/71549
---
clang/lib/CodeGen/CGDeclCXX.cpp | 13 +-
clang/lib/CodeGen/CGOpenMPRuntime.cpp | 130 --
clang/lib/CodeGen/CGOpenMPRuntime.h | 8 --
clang/lib/CodeGen/CodeGenFunction.h | 5 +
clang/lib/CodeGen/CodeGenModule.h | 14 +-
clang/lib/CodeGen/ItaniumCXXABI.cpp | 8 ++
.../amdgcn_openmp_device_math_constexpr.cpp | 48 +--
.../amdgcn_target_global_constructor.cpp | 30 ++--
clang/test/OpenMP/declare_target_codegen.cpp | 1 -
...x_declare_target_var_ctor_dtor_codegen.cpp | 35 +
.../llvm/Frontend/OpenMP/OMPIRBuilder.h | 4 -
llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp | 7 +-
.../plugins-nextgen/amdgpu/src/rtl.cpp| 52 +++
.../common/PluginInterface/GlobalHandler.h| 10 +-
.../PluginInterface/PluginInterface.cpp | 7 +
.../common/PluginInterface/PluginInterface.h | 14 ++
.../plugins-nextgen/cuda/src/rtl.cpp | 115
openmp/libomptarget/src/rtl.cpp | 6 +
18 files changed, 291 insertions(+), 216 deletions(-)
diff --git a/clang/lib/CodeGen/CGDeclCXX.cpp b/clang/lib/CodeGen/CGDeclCXX.cpp
index 3fa28b343663f61..e08a1e5f42df20c 100644
--- a/clang/lib/CodeGen/CGDeclCXX.cpp
+++ b/clang/lib/CodeGen/CGDeclCXX.cpp
@@ -327,6 +327,15 @@ void CodeGenFunction::registerGlobalDtorWithAtExit(const
VarDecl &VD,
registerGlobalDtorWithAtExit(dtorStub);
}
+/// Register a global destructor using the LLVM 'llvm.global_dtors' global.
+void CodeGenFunction::registerGlobalDtorWithLLVM(const VarDecl &VD,
+ llvm::FunctionCallee Dtor,
+ llvm::Constant *Addr) {
+ // Create a function which calls the destructor.
+ llvm::Function *dtorStub = createAtExitStub(VD, Dtor, Addr);
+ CGM.AddGlobalDtor(dtorStub);
+}
+
void CodeGenFunction::registerGlobalDtorWithAtExit(llvm::Constant *dtorStub) {
// extern "C" int atexit(void (*f)(void));
assert(dtorStub->getType() ==
@@ -519,10 +528,6 @@ CodeGenModule::EmitCXXGlobalVarDeclInitFunc(const VarDecl
*D,
D->hasAttr()))
return;
- if (getLangOpts().OpenMP &&
- getOpenMPRuntime().emitDeclareTargetVarDefinition(D, Addr, PerformInit))
-return;
-
// Check if we've already initialized this decl.
auto I = DelayedCXXInitPosition.find(D);
if (I != DelayedCXXInitPosition.end() && I->second == ~0U)
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index a8e1150e44566b8..d2be8141a3a4b31 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -1747,136 +1747,6 @@ llvm::Function