[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/87009 >From 4fd1510c2013fd975ac2ad94b3d201bcd5a9d029 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 28 Mar 2024 16:18:19 -0500 Subject: [PATCH] [Libomptarget] Statically link all plugin runtimes Summary: This patch overhauls the `libomptarget` and plugin interface. Currently, we define a C API and compile each plugin as a separate shared library. Then, `libomptarget` loads these API functions and forwards its internal calls to them. This was originally designed to allow multiple implementations of a library to be live. However, since then no one has used this functionality and it prevents us from using much nicer interfaces. If the old behavior is desired it should instead be implemented as a separate plugin. This patch replaces the `PluginAdaptorTy` interface with the `GenericPluginTy` that is used by the plugins. Each plugin exports a `createPlugin_` function that is used to get the specific implementation. This code is now shared with `libomptarget`. There are some notable improvements to this. 1. Massively improved lifetimes of life runtime objects 2. The plugins can use a C++ interface 3. Global state does not need to be duplicated for each plugin + libomptarget 4. Easier to use and add features and improve error handling 5. Less function call overhead / Improved LTO performance. Additional changes in this plugin are related to contending with the fact that state is now shared. Initialization and deinitialization is now handled correctly and in phase with the underlying runtime, allowing us to actually know when something is getting deallocated. Depends on https://github.com/llvm/llvm-project/pull/86971 https://github.com/llvm/llvm-project/pull/86875 https://github.com/llvm/llvm-project/pull/86868 --- clang/test/Driver/linker-wrapper-image.c | 2 +- .../Frontend/Offloading/OffloadWrapper.cpp| 7 +- offload/include/PluginManager.h | 61 ++ offload/include/device.h | 8 +- offload/plugins-nextgen/CMakeLists.txt| 19 +- offload/plugins-nextgen/amdgpu/CMakeLists.txt | 5 - offload/plugins-nextgen/amdgpu/src/rtl.cpp| 14 +- offload/plugins-nextgen/common/CMakeLists.txt | 4 +- .../common/include/PluginInterface.h | 94 +--- .../common/include/Utils/ELF.h| 2 - offload/plugins-nextgen/common/src/JIT.cpp| 40 ++-- .../common/src/PluginInterface.cpp| 205 -- offload/plugins-nextgen/cuda/CMakeLists.txt | 5 - offload/plugins-nextgen/cuda/src/rtl.cpp | 14 +- offload/plugins-nextgen/host/CMakeLists.txt | 8 - offload/plugins-nextgen/host/src/rtl.cpp | 14 +- offload/src/CMakeLists.txt| 4 + offload/src/OffloadRTL.cpp| 1 + offload/src/OpenMP/InteropAPI.cpp | 4 +- offload/src/PluginManager.cpp | 129 --- offload/src/device.cpp| 3 +- offload/src/interface.cpp | 2 +- .../kernelreplay/llvm-omp-kernel-replay.cpp | 2 - .../unittests/Plugins/NextgenPluginsTest.cpp | 1 - 24 files changed, 127 insertions(+), 521 deletions(-) diff --git a/clang/test/Driver/linker-wrapper-image.c b/clang/test/Driver/linker-wrapper-image.c index d01445e3aed04e..5d5d62805e174d 100644 --- a/clang/test/Driver/linker-wrapper-image.c +++ b/clang/test/Driver/linker-wrapper-image.c @@ -30,8 +30,8 @@ // OPENMP: define internal void @.omp_offloading.descriptor_reg() section ".text.startup" { // OPENMP-NEXT: entry: -// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: call void @__tgt_register_lib(ptr @.omp_offloading.descriptor) +// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: ret void // OPENMP-NEXT: } diff --git a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp index 7241d15ed1c670..8b6f9ea1f4cca3 100644 --- a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp +++ b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp @@ -232,12 +232,13 @@ void createRegisterFunction(Module &M, GlobalVariable *BinDesc, // Construct function body IRBuilder<> Builder(BasicBlock::Create(C, "entry", Func)); + Builder.CreateCall(RegFuncC, BinDesc); + // Register the destructors with 'atexit'. This is expected by the CUDA // runtime and ensures that we clean up before dynamic objects are destroyed. - // This needs to be done before the runtime is called and registers its own. + // This needs to be done after plugin initialization to ensure that it is + // called before the plugin runtime is destroyed. Builder.CreateCall(AtExit, UnregFunc); - - Builder.CreateCall(RegFuncC, BinDesc); Builder.CreateRetVoid(); // Add this function to constructors. diff --git a/offload/include/PluginManager.h b/offlo
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
https://github.com/jdoerfert edited https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
@@ -34,38 +33,10 @@ #include #include -struct PluginManager; - -/// Plugin adaptors should be created via `PluginAdaptorTy::create` which will -/// invoke the constructor and call `PluginAdaptorTy::init`. Eventual errors are -/// reported back to the caller, otherwise a valid and initialized adaptor is -/// returned. -struct PluginAdaptorTy { - /// Try to create a plugin adaptor from a filename. - static llvm::Expected> - create(const std::string &Name); - - /// Name of the shared object file representing the plugin. - std::string Name; - - /// Access to the shared object file representing the plugin. - std::unique_ptr LibraryHandler; - -#define PLUGIN_API_HANDLE(NAME) \ - using NAME##_ty = decltype(__tgt_rtl_##NAME); \ - NAME##_ty *NAME = nullptr; +#include "PluginInterface.h" jdoerfert wrote: Order the includes. https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
@@ -3476,3 +3472,9 @@ void *AMDGPUDeviceTy::allocate(size_t Size, void *, TargetAllocTy Kind) { } // namespace target } // namespace omp } // namespace llvm + +extern "C" { +llvm::omp::target::plugin::GenericPluginTy *createPlugin_amdgpu() { jdoerfert wrote: These should use the macro to create the name, one place to change. https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
https://github.com/jdoerfert approved this pull request. I left minor comments. This LGTM, assuming it passes our tests. https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
@@ -92,10 +65,10 @@ struct PluginManager { std::make_unique(TgtBinDesc, TgtDeviceImage)); } - /// Initialize as many devices as possible for this plugin adaptor. Devices - /// that fail to initialize are ignored. Returns the offset the devices were - /// registered at. - void initDevices(PluginAdaptorTy &RTL); + /// Initialize as many devices as possible for this plugin. Devices that fail + /// to initialize are ignored. Returns the offset the devices were registered + /// at. jdoerfert wrote: No return value. https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
@@ -92,10 +65,10 @@ struct PluginManager { std::make_unique(TgtBinDesc, TgtDeviceImage)); } - /// Initialize as many devices as possible for this plugin adaptor. Devices - /// that fail to initialize are ignored. Returns the offset the devices were - /// registered at. - void initDevices(PluginAdaptorTy &RTL); + /// Initialize as many devices as possible for this plugin. Devices that fail jdoerfert wrote: 2 spaces https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
@@ -34,38 +33,10 @@ #include #include -struct PluginManager; - -/// Plugin adaptors should be created via `PluginAdaptorTy::create` which will -/// invoke the constructor and call `PluginAdaptorTy::init`. Eventual errors are -/// reported back to the caller, otherwise a valid and initialized adaptor is -/// returned. -struct PluginAdaptorTy { - /// Try to create a plugin adaptor from a filename. - static llvm::Expected> - create(const std::string &Name); - - /// Name of the shared object file representing the plugin. - std::string Name; - - /// Access to the shared object file representing the plugin. - std::unique_ptr LibraryHandler; - -#define PLUGIN_API_HANDLE(NAME) \ - using NAME##_ty = decltype(__tgt_rtl_##NAME); \ - NAME##_ty *NAME = nullptr; +#include "PluginInterface.h" +using GenericPluginTy = llvm::omp::target::plugin::GenericPluginTy; -#include "Shared/PluginAPI.inc" -#undef PLUGIN_API_HANDLE - - /// Create a plugin adaptor for filename \p Name with a dynamic library \p DL. - PluginAdaptorTy(const std::string &Name, - std::unique_ptr DL); - - /// Initialize the plugin adaptor, this can fail in which case the adaptor is - /// useless. - llvm::Error init(); -}; +struct PluginManager; jdoerfert wrote: Is't it defined just 2 lines later? https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
@@ -3476,3 +3472,9 @@ void *AMDGPUDeviceTy::allocate(size_t Size, void *, TargetAllocTy Kind) { } // namespace target } // namespace omp } // namespace llvm + +extern "C" { +llvm::omp::target::plugin::GenericPluginTy *createPlugin_amdgpu() { jhuber6 wrote: I tried that, but couldn't figure out how to set a name from a macro within a macro. https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/87009 >From 473a4b9bad09bd9af8186932984be7696711692e Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 28 Mar 2024 16:18:19 -0500 Subject: [PATCH] [Libomptarget] Statically link all plugin runtimes Summary: This patch overhauls the `libomptarget` and plugin interface. Currently, we define a C API and compile each plugin as a separate shared library. Then, `libomptarget` loads these API functions and forwards its internal calls to them. This was originally designed to allow multiple implementations of a library to be live. However, since then no one has used this functionality and it prevents us from using much nicer interfaces. If the old behavior is desired it should instead be implemented as a separate plugin. This patch replaces the `PluginAdaptorTy` interface with the `GenericPluginTy` that is used by the plugins. Each plugin exports a `createPlugin_` function that is used to get the specific implementation. This code is now shared with `libomptarget`. There are some notable improvements to this. 1. Massively improved lifetimes of life runtime objects 2. The plugins can use a C++ interface 3. Global state does not need to be duplicated for each plugin + libomptarget 4. Easier to use and add features and improve error handling 5. Less function call overhead / Improved LTO performance. Additional changes in this plugin are related to contending with the fact that state is now shared. Initialization and deinitialization is now handled correctly and in phase with the underlying runtime, allowing us to actually know when something is getting deallocated. Depends on https://github.com/llvm/llvm-project/pull/86971 https://github.com/llvm/llvm-project/pull/86875 https://github.com/llvm/llvm-project/pull/86868 --- clang/test/Driver/linker-wrapper-image.c | 2 +- .../Frontend/Offloading/OffloadWrapper.cpp| 7 +- offload/include/PluginManager.h | 61 ++ offload/include/device.h | 8 +- offload/plugins-nextgen/CMakeLists.txt| 19 +- offload/plugins-nextgen/amdgpu/CMakeLists.txt | 5 - offload/plugins-nextgen/amdgpu/src/rtl.cpp| 14 +- offload/plugins-nextgen/common/CMakeLists.txt | 4 +- .../common/include/PluginInterface.h | 94 +--- .../common/include/Utils/ELF.h| 2 - offload/plugins-nextgen/common/src/JIT.cpp| 40 ++-- .../common/src/PluginInterface.cpp| 205 -- offload/plugins-nextgen/cuda/CMakeLists.txt | 5 - offload/plugins-nextgen/cuda/src/rtl.cpp | 14 +- offload/plugins-nextgen/host/CMakeLists.txt | 8 - offload/plugins-nextgen/host/src/rtl.cpp | 14 +- offload/src/CMakeLists.txt| 4 + offload/src/OffloadRTL.cpp| 1 + offload/src/OpenMP/InteropAPI.cpp | 4 +- offload/src/PluginManager.cpp | 129 --- offload/src/device.cpp| 3 +- offload/src/interface.cpp | 2 +- .../kernelreplay/llvm-omp-kernel-replay.cpp | 2 - .../unittests/Plugins/NextgenPluginsTest.cpp | 1 - 24 files changed, 127 insertions(+), 521 deletions(-) diff --git a/clang/test/Driver/linker-wrapper-image.c b/clang/test/Driver/linker-wrapper-image.c index d01445e3aed04e..5d5d62805e174d 100644 --- a/clang/test/Driver/linker-wrapper-image.c +++ b/clang/test/Driver/linker-wrapper-image.c @@ -30,8 +30,8 @@ // OPENMP: define internal void @.omp_offloading.descriptor_reg() section ".text.startup" { // OPENMP-NEXT: entry: -// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: call void @__tgt_register_lib(ptr @.omp_offloading.descriptor) +// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: ret void // OPENMP-NEXT: } diff --git a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp index 7241d15ed1c670..8b6f9ea1f4cca3 100644 --- a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp +++ b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp @@ -232,12 +232,13 @@ void createRegisterFunction(Module &M, GlobalVariable *BinDesc, // Construct function body IRBuilder<> Builder(BasicBlock::Create(C, "entry", Func)); + Builder.CreateCall(RegFuncC, BinDesc); + // Register the destructors with 'atexit'. This is expected by the CUDA // runtime and ensures that we clean up before dynamic objects are destroyed. - // This needs to be done before the runtime is called and registers its own. + // This needs to be done after plugin initialization to ensure that it is + // called before the plugin runtime is destroyed. Builder.CreateCall(AtExit, UnregFunc); - - Builder.CreateCall(RegFuncC, BinDesc); Builder.CreateRetVoid(); // Add this function to constructors. diff --git a/offload/include/PluginManager.h b/offlo
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/87009 >From 8c4b7ffb49c8f91768af3bec00669bac5433ec0f Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 28 Mar 2024 16:18:19 -0500 Subject: [PATCH] [Libomptarget] Statically link all plugin runtimes Summary: This patch overhauls the `libomptarget` and plugin interface. Currently, we define a C API and compile each plugin as a separate shared library. Then, `libomptarget` loads these API functions and forwards its internal calls to them. This was originally designed to allow multiple implementations of a library to be live. However, since then no one has used this functionality and it prevents us from using much nicer interfaces. If the old behavior is desired it should instead be implemented as a separate plugin. This patch replaces the `PluginAdaptorTy` interface with the `GenericPluginTy` that is used by the plugins. Each plugin exports a `createPlugin_` function that is used to get the specific implementation. This code is now shared with `libomptarget`. There are some notable improvements to this. 1. Massively improved lifetimes of life runtime objects 2. The plugins can use a C++ interface 3. Global state does not need to be duplicated for each plugin + libomptarget 4. Easier to use and add features and improve error handling 5. Less function call overhead / Improved LTO performance. Additional changes in this plugin are related to contending with the fact that state is now shared. Initialization and deinitialization is now handled correctly and in phase with the underlying runtime, allowing us to actually know when something is getting deallocated. Depends on https://github.com/llvm/llvm-project/pull/86971 https://github.com/llvm/llvm-project/pull/86875 https://github.com/llvm/llvm-project/pull/86868 --- clang/test/Driver/linker-wrapper-image.c | 2 +- .../Frontend/Offloading/OffloadWrapper.cpp| 7 +- offload/include/PluginManager.h | 61 ++ offload/include/device.h | 8 +- offload/plugins-nextgen/CMakeLists.txt| 19 +- offload/plugins-nextgen/amdgpu/CMakeLists.txt | 5 - offload/plugins-nextgen/amdgpu/src/rtl.cpp| 14 +- offload/plugins-nextgen/common/CMakeLists.txt | 4 +- .../common/include/PluginInterface.h | 94 +--- .../common/include/Utils/ELF.h| 2 - offload/plugins-nextgen/common/src/JIT.cpp| 40 ++-- .../common/src/PluginInterface.cpp| 205 -- offload/plugins-nextgen/cuda/CMakeLists.txt | 5 - offload/plugins-nextgen/cuda/src/rtl.cpp | 14 +- offload/plugins-nextgen/host/CMakeLists.txt | 8 - offload/plugins-nextgen/host/src/rtl.cpp | 14 +- offload/src/CMakeLists.txt| 4 + offload/src/OffloadRTL.cpp| 1 + offload/src/OpenMP/InteropAPI.cpp | 4 +- offload/src/PluginManager.cpp | 129 --- offload/src/device.cpp| 3 +- offload/src/interface.cpp | 2 +- .../kernelreplay/llvm-omp-kernel-replay.cpp | 2 - .../unittests/Plugins/NextgenPluginsTest.cpp | 1 - 24 files changed, 126 insertions(+), 522 deletions(-) diff --git a/clang/test/Driver/linker-wrapper-image.c b/clang/test/Driver/linker-wrapper-image.c index d01445e3aed04e..5d5d62805e174d 100644 --- a/clang/test/Driver/linker-wrapper-image.c +++ b/clang/test/Driver/linker-wrapper-image.c @@ -30,8 +30,8 @@ // OPENMP: define internal void @.omp_offloading.descriptor_reg() section ".text.startup" { // OPENMP-NEXT: entry: -// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: call void @__tgt_register_lib(ptr @.omp_offloading.descriptor) +// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: ret void // OPENMP-NEXT: } diff --git a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp index 7241d15ed1c670..8b6f9ea1f4cca3 100644 --- a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp +++ b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp @@ -232,12 +232,13 @@ void createRegisterFunction(Module &M, GlobalVariable *BinDesc, // Construct function body IRBuilder<> Builder(BasicBlock::Create(C, "entry", Func)); + Builder.CreateCall(RegFuncC, BinDesc); + // Register the destructors with 'atexit'. This is expected by the CUDA // runtime and ensures that we clean up before dynamic objects are destroyed. - // This needs to be done before the runtime is called and registers its own. + // This needs to be done after plugin initialization to ensure that it is + // called before the plugin runtime is destroyed. Builder.CreateCall(AtExit, UnregFunc); - - Builder.CreateCall(RegFuncC, BinDesc); Builder.CreateRetVoid(); // Add this function to constructors. diff --git a/offload/include/PluginManager.h b/offlo
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/87009 >From 50099312ab7c60b8cfce5473a4c154f8d917dc41 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 27 Mar 2024 15:27:16 -0500 Subject: [PATCH 1/3] [Libomptarget] Rename `libomptarget.rtl.x86_64` to `libomptarget.rtl.host` Summary: All of these are functionally the same code, just compiled for separate architectures. We currently do not expose a way to execute these on separate architectures as the host plugin works using `dlopen` into the same process, and therefore cannot possibly be an incompatible architecture. (This could work with a remote plugin, but this is not supported yet). This patch simply renames all of these to the same thing so we no longer need to check around for its varying definitions. --- offload/plugins-nextgen/host/CMakeLists.txt | 32 ++--- offload/src/CMakeLists.txt | 5 +--- 2 files changed, 17 insertions(+), 20 deletions(-) diff --git a/offload/plugins-nextgen/host/CMakeLists.txt b/offload/plugins-nextgen/host/CMakeLists.txt index 7da18ee278d498..250ff2225c16bf 100644 --- a/offload/plugins-nextgen/host/CMakeLists.txt +++ b/offload/plugins-nextgen/host/CMakeLists.txt @@ -14,36 +14,36 @@ if(CMAKE_SYSTEM_PROCESSOR MATCHES "ppc64le$") endif() # Create the library and add the default arguments. -add_target_library(omptarget.rtl.${machine} ${machine}) +add_target_library(omptarget.rtl.host ${machine}) -target_sources(omptarget.rtl.${machine} PRIVATE src/rtl.cpp) +target_sources(omptarget.rtl.host PRIVATE src/rtl.cpp) if(LIBOMPTARGET_DEP_LIBFFI_FOUND) libomptarget_say("Building ${machine} plugin linked with libffi") if(FFI_STATIC_LIBRARIES) -target_link_libraries(omptarget.rtl.${machine} PRIVATE FFI::ffi_static) +target_link_libraries(omptarget.rtl.host PRIVATE FFI::ffi_static) else() -target_link_libraries(omptarget.rtl.${machine} PRIVATE FFI::ffi) +target_link_libraries(omptarget.rtl.host PRIVATE FFI::ffi) endif() else() libomptarget_say("Building ${machine} plugin for dlopened libffi") - target_sources(omptarget.rtl.${machine} PRIVATE dynamic_ffi/ffi.cpp) - target_include_directories(omptarget.rtl.${machine} PRIVATE dynamic_ffi) + target_sources(omptarget.rtl.host PRIVATE dynamic_ffi/ffi.cpp) + target_include_directories(omptarget.rtl.host PRIVATE dynamic_ffi) endif() # Install plugin under the lib destination folder. -install(TARGETS omptarget.rtl.${machine} +install(TARGETS omptarget.rtl.host LIBRARY DESTINATION "${OFFLOAD_INSTALL_LIBDIR}") -set_target_properties(omptarget.rtl.${machine} PROPERTIES +set_target_properties(omptarget.rtl.host PROPERTIES INSTALL_RPATH "$ORIGIN" BUILD_RPATH "$ORIGIN:${CMAKE_CURRENT_BINARY_DIR}/.." POSITION_INDEPENDENT_CODE ON CXX_VISIBILITY_PRESET protected) -target_include_directories(omptarget.rtl.${machine} PRIVATE +target_include_directories(omptarget.rtl.host PRIVATE ${LIBOMPTARGET_INCLUDE_DIR}) if(LIBOMPTARGET_DEP_LIBFFI_FOUND) - list(APPEND LIBOMPTARGET_TESTED_PLUGINS omptarget.rtl.${machine}) + list(APPEND LIBOMPTARGET_TESTED_PLUGINS omptarget.rtl.host) set(LIBOMPTARGET_TESTED_PLUGINS "${LIBOMPTARGET_TESTED_PLUGINS}" PARENT_SCOPE) else() @@ -66,22 +66,22 @@ elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "ppc64$") "powerpc64-ibm-linux-gnu" "powerpc64-ibm-linux-gnu-LTO") set(LIBOMPTARGET_SYSTEM_TARGETS "${LIBOMPTARGET_SYSTEM_TARGETS}" PARENT_SCOPE) elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "x86_64$") - target_compile_definitions(omptarget.rtl.${machine} PRIVATE TARGET_ELF_ID=EM_X86_64) - target_compile_definitions(omptarget.rtl.${machine} PRIVATE + target_compile_definitions(omptarget.rtl.host PRIVATE TARGET_ELF_ID=EM_X86_64) + target_compile_definitions(omptarget.rtl.host PRIVATE LIBOMPTARGET_NEXTGEN_GENERIC_PLUGIN_TRIPLE="x86_64-pc-linux-gnu") list(APPEND LIBOMPTARGET_SYSTEM_TARGETS "x86_64-pc-linux-gnu" "x86_64-pc-linux-gnu-LTO") set(LIBOMPTARGET_SYSTEM_TARGETS "${LIBOMPTARGET_SYSTEM_TARGETS}" PARENT_SCOPE) elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64$") - target_compile_definitions(omptarget.rtl.${machine} PRIVATE TARGET_ELF_ID=EM_AARCH64) - target_compile_definitions(omptarget.rtl.${machine} PRIVATE + target_compile_definitions(omptarget.rtl.host PRIVATE TARGET_ELF_ID=EM_AARCH64) + target_compile_definitions(omptarget.rtl.host PRIVATE LIBOMPTARGET_NEXTGEN_GENERIC_PLUGIN_TRIPLE="aarch64-unknown-linux-gnu") list(APPEND LIBOMPTARGET_SYSTEM_TARGETS "aarch64-unknown-linux-gnu" "aarch64-unknown-linux-gnu-LTO") set(LIBOMPTARGET_SYSTEM_TARGETS "${LIBOMPTARGET_SYSTEM_TARGETS}" PARENT_SCOPE) elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "s390x$") - target_compile_definitions(omptarget.rtl.${machine} PRIVATE TARGET_ELF_ID=EM_S390) - target_compile_definitions(omptarget.rtl.${machine} PRIVATE + target_compile_definitions(omptarget.rtl.host PRIVATE TARGET_ELF_I
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
jhuber6 wrote: Going to land this soon. @jplehr @estewart08 @ronlieb Applied this on the AMD fork, here the diff. https://gist.github.com/jhuber6/e856fbe9c73acea13b6d30b20605c73e https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/87009 >From 3ea2ae0f5c438b38d0480cfb38a72d2f7a60142c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 28 Mar 2024 16:18:19 -0500 Subject: [PATCH] [Libomptarget] Statically link all plugin runtimes Summary: This patch overhauls the `libomptarget` and plugin interface. Currently, we define a C API and compile each plugin as a separate shared library. Then, `libomptarget` loads these API functions and forwards its internal calls to them. This was originally designed to allow multiple implementations of a library to be live. However, since then no one has used this functionality and it prevents us from using much nicer interfaces. If the old behavior is desired it should instead be implemented as a separate plugin. This patch replaces the `PluginAdaptorTy` interface with the `GenericPluginTy` that is used by the plugins. Each plugin exports a `createPlugin_` function that is used to get the specific implementation. This code is now shared with `libomptarget`. There are some notable improvements to this. 1. Massively improved lifetimes of life runtime objects 2. The plugins can use a C++ interface 3. Global state does not need to be duplicated for each plugin + libomptarget 4. Easier to use and add features and improve error handling 5. Less function call overhead / Improved LTO performance. Additional changes in this plugin are related to contending with the fact that state is now shared. Initialization and deinitialization is now handled correctly and in phase with the underlying runtime, allowing us to actually know when something is getting deallocated. Depends on https://github.com/llvm/llvm-project/pull/86971 https://github.com/llvm/llvm-project/pull/86875 https://github.com/llvm/llvm-project/pull/86868 --- clang/test/Driver/linker-wrapper-image.c | 2 +- .../Frontend/Offloading/OffloadWrapper.cpp| 7 +- offload/include/PluginManager.h | 61 ++ offload/include/device.h | 8 +- offload/plugins-nextgen/CMakeLists.txt| 19 +- offload/plugins-nextgen/amdgpu/CMakeLists.txt | 5 - offload/plugins-nextgen/amdgpu/src/rtl.cpp| 14 +- offload/plugins-nextgen/common/CMakeLists.txt | 4 +- .../common/include/PluginInterface.h | 94 +--- .../common/include/Utils/ELF.h| 2 - offload/plugins-nextgen/common/src/JIT.cpp| 40 ++-- .../common/src/PluginInterface.cpp| 205 -- offload/plugins-nextgen/cuda/CMakeLists.txt | 5 - offload/plugins-nextgen/cuda/src/rtl.cpp | 14 +- offload/plugins-nextgen/host/CMakeLists.txt | 8 - offload/plugins-nextgen/host/src/rtl.cpp | 14 +- offload/src/CMakeLists.txt| 4 + offload/src/OffloadRTL.cpp| 1 + offload/src/OpenMP/InteropAPI.cpp | 4 +- offload/src/PluginManager.cpp | 129 --- offload/src/device.cpp| 3 +- offload/src/interface.cpp | 2 - .../kernelreplay/llvm-omp-kernel-replay.cpp | 2 - .../unittests/Plugins/NextgenPluginsTest.cpp | 1 - 24 files changed, 125 insertions(+), 523 deletions(-) diff --git a/clang/test/Driver/linker-wrapper-image.c b/clang/test/Driver/linker-wrapper-image.c index d01445e3aed04e..5d5d62805e174d 100644 --- a/clang/test/Driver/linker-wrapper-image.c +++ b/clang/test/Driver/linker-wrapper-image.c @@ -30,8 +30,8 @@ // OPENMP: define internal void @.omp_offloading.descriptor_reg() section ".text.startup" { // OPENMP-NEXT: entry: -// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: call void @__tgt_register_lib(ptr @.omp_offloading.descriptor) +// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: ret void // OPENMP-NEXT: } diff --git a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp index 7241d15ed1c670..8b6f9ea1f4cca3 100644 --- a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp +++ b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp @@ -232,12 +232,13 @@ void createRegisterFunction(Module &M, GlobalVariable *BinDesc, // Construct function body IRBuilder<> Builder(BasicBlock::Create(C, "entry", Func)); + Builder.CreateCall(RegFuncC, BinDesc); + // Register the destructors with 'atexit'. This is expected by the CUDA // runtime and ensures that we clean up before dynamic objects are destroyed. - // This needs to be done before the runtime is called and registers its own. + // This needs to be done after plugin initialization to ensure that it is + // called before the plugin runtime is destroyed. Builder.CreateCall(AtExit, UnregFunc); - - Builder.CreateCall(RegFuncC, BinDesc); Builder.CreateRetVoid(); // Add this function to constructors. diff --git a/offload/include/PluginManager.h b/offloa
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
dhruvachak wrote: There are duplicate definitions of the following `` bool llvm::omp::target::ompt::Initialized = false; ompt_get_callback_t llvm::omp::target::ompt::lookupCallbackByCode = nullptr; ompt_function_lookup_t llvm::omp::target::ompt::lookupCallbackByName = nullptr; `` in src/OpenMP/OMPT/Callback.cpp and plugins-nextgen/common/OMPT/OmptCallback.cpp Can you remove the ones in the plugin? Otherwise, it's not clear which definition is being used. https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
jhuber6 wrote: > There are duplicate definitions of the following > > ``` > bool llvm::omp::target::ompt::Initialized = false; > > ompt_get_callback_t llvm::omp::target::ompt::lookupCallbackByCode = nullptr; > ompt_function_lookup_t llvm::omp::target::ompt::lookupCallbackByName = > nullptr; > ``` > > in src/OpenMP/OMPT/Callback.cpp and > plugins-nextgen/common/OMPT/OmptCallback.cpp > > Can you remove the ones in the plugin? Otherwise, it's not clear which > definition is being used. Sure, do you get that upstream? If so, wonder why I didn't get it. I guess we would just have it to where the `libomptarget` instance handles all the OMPT. https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
dhruvachak wrote: > > There are duplicate definitions of the following > > ``` > > bool llvm::omp::target::ompt::Initialized = false; > > > > ompt_get_callback_t llvm::omp::target::ompt::lookupCallbackByCode = nullptr; > > ompt_function_lookup_t llvm::omp::target::ompt::lookupCallbackByName = > > nullptr; > > ``` > > > > > > > > > > > > > > > > > > > > > > > > in src/OpenMP/OMPT/Callback.cpp and > > plugins-nextgen/common/OMPT/OmptCallback.cpp > > Can you remove the ones in the plugin? Otherwise, it's not clear which > > definition is being used. > > Sure, do you get that upstream? If so, wonder why I didn't get it. I did not build upstream but looking at downstream, I think for some reason they don't show up as duplicate symbols. But looking at the code, they should be removed. There are uses of those variables in the plugin, so there should be only 1 definition. > > I guess we would just have it to where the `libomptarget` instance handles > all the OMPT. https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
jhuber6 wrote: > I did not build upstream but looking at downstream, I think for some reason > they don't show up as duplicate symbols. But looking at the code, they should > be removed. There are uses of those variables in the plugin, so there should > be only 1 definition. Does this apply for anything OMPT related inside of the plugin? There's a few places where we mark callbacks, but those can all be moved into `libomptarget` once this lands. https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
dhruvachak wrote: I think OmptCallback.cpp can be completely removed from the plugin since that functionality is subsumed by offload. As in ``` diff --git a/offload/plugins-nextgen/common/CMakeLists.txt b/offload/plugins-nextgen/common/CMakeLists.txt index acf0af63f050..dc942db826bf 100644 --- a/offload/plugins-nextgen/common/CMakeLists.txt +++ b/offload/plugins-nextgen/common/CMakeLists.txt @@ -46,7 +46,6 @@ endif() # If we have OMPT enabled include it in the list of sources. if (OMPT_TARGET_DEFAULT AND LIBOMPTARGET_OMPT_SUPPORT) - target_sources(PluginCommon PRIVATE OMPT/OmptCallback.cpp) target_include_directories(PluginCommon PRIVATE OMPT) endif() ``` https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/87009 >From 6dfa6dc2956ca714e98bf24b176315da42446553 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 28 Mar 2024 16:18:19 -0500 Subject: [PATCH] [Libomptarget] Statically link all plugin runtimes Summary: This patch overhauls the `libomptarget` and plugin interface. Currently, we define a C API and compile each plugin as a separate shared library. Then, `libomptarget` loads these API functions and forwards its internal calls to them. This was originally designed to allow multiple implementations of a library to be live. However, since then no one has used this functionality and it prevents us from using much nicer interfaces. If the old behavior is desired it should instead be implemented as a separate plugin. This patch replaces the `PluginAdaptorTy` interface with the `GenericPluginTy` that is used by the plugins. Each plugin exports a `createPlugin_` function that is used to get the specific implementation. This code is now shared with `libomptarget`. There are some notable improvements to this. 1. Massively improved lifetimes of life runtime objects 2. The plugins can use a C++ interface 3. Global state does not need to be duplicated for each plugin + libomptarget 4. Easier to use and add features and improve error handling 5. Less function call overhead / Improved LTO performance. Additional changes in this plugin are related to contending with the fact that state is now shared. Initialization and deinitialization is now handled correctly and in phase with the underlying runtime, allowing us to actually know when something is getting deallocated. Depends on https://github.com/llvm/llvm-project/pull/86971 https://github.com/llvm/llvm-project/pull/86875 https://github.com/llvm/llvm-project/pull/86868 --- clang/test/Driver/linker-wrapper-image.c | 2 +- .../Frontend/Offloading/OffloadWrapper.cpp| 7 +- offload/include/PluginManager.h | 61 ++ offload/include/device.h | 8 +- offload/plugins-nextgen/CMakeLists.txt| 19 +- offload/plugins-nextgen/amdgpu/CMakeLists.txt | 5 - offload/plugins-nextgen/amdgpu/src/rtl.cpp| 14 +- offload/plugins-nextgen/common/CMakeLists.txt | 5 +- .../common/include/PluginInterface.h | 94 +--- .../common/include/Utils/ELF.h| 2 - offload/plugins-nextgen/common/src/JIT.cpp| 40 ++-- .../common/src/PluginInterface.cpp| 205 -- offload/plugins-nextgen/cuda/CMakeLists.txt | 5 - offload/plugins-nextgen/cuda/src/rtl.cpp | 14 +- offload/plugins-nextgen/host/CMakeLists.txt | 8 - offload/plugins-nextgen/host/src/rtl.cpp | 14 +- offload/src/CMakeLists.txt| 4 + offload/src/OffloadRTL.cpp| 1 + offload/src/OpenMP/InteropAPI.cpp | 4 +- offload/src/PluginManager.cpp | 129 --- offload/src/device.cpp| 3 +- offload/src/interface.cpp | 2 - .../kernelreplay/llvm-omp-kernel-replay.cpp | 2 - .../unittests/Plugins/NextgenPluginsTest.cpp | 1 - 24 files changed, 125 insertions(+), 524 deletions(-) diff --git a/clang/test/Driver/linker-wrapper-image.c b/clang/test/Driver/linker-wrapper-image.c index d01445e3aed04e..5d5d62805e174d 100644 --- a/clang/test/Driver/linker-wrapper-image.c +++ b/clang/test/Driver/linker-wrapper-image.c @@ -30,8 +30,8 @@ // OPENMP: define internal void @.omp_offloading.descriptor_reg() section ".text.startup" { // OPENMP-NEXT: entry: -// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: call void @__tgt_register_lib(ptr @.omp_offloading.descriptor) +// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg) // OPENMP-NEXT: ret void // OPENMP-NEXT: } diff --git a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp index 7241d15ed1c670..8b6f9ea1f4cca3 100644 --- a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp +++ b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp @@ -232,12 +232,13 @@ void createRegisterFunction(Module &M, GlobalVariable *BinDesc, // Construct function body IRBuilder<> Builder(BasicBlock::Create(C, "entry", Func)); + Builder.CreateCall(RegFuncC, BinDesc); + // Register the destructors with 'atexit'. This is expected by the CUDA // runtime and ensures that we clean up before dynamic objects are destroyed. - // This needs to be done before the runtime is called and registers its own. + // This needs to be done after plugin initialization to ensure that it is + // called before the plugin runtime is destroyed. Builder.CreateCall(AtExit, UnregFunc); - - Builder.CreateCall(RegFuncC, BinDesc); Builder.CreateRetVoid(); // Add this function to constructors. diff --git a/offload/include/PluginManager.h b/offloa
[clang] [llvm] [Libomptarget] Statically link all plugin runtimes (PR #87009)
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/87009 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits