[llvm] [clang] [Offloading][NFC] Refactor handling of offloading entries (PR #72544)
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/72544
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[llvm] [clang] [Offloading][NFC] Refactor handling of offloading entries (PR #72544)
https://github.com/JonChesterfield approved this pull request.

The test change is suspect for a patch claiming NFC, but it looks like the change is harmless. Thanks for separating the refactor from the functional change.

https://github.com/llvm/llvm-project/pull/72544
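For readers unfamiliar with the pattern the test exercises: the `__start_..._offloading_entries` and `__stop_..._offloading_entries` symbols checked in this PR's test bracket a section of entry records, and the CUDA registration loop in the test starts at the start symbol and loads `addr`, `name`, and `size` from each record. Below is a minimal C++ sketch of that kind of traversal. `OffloadEntry` and `collectEntryNames` are hypothetical names used only for illustration, only the three fields visible in the test output are modeled, and a plain array stands in for the linker-provided section bounds. It is not the code from the patch.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an offload entry record. Only the three fields
// visible in the test's IR (addr, name, size) are modeled here; the real
// record has further fields that this sketch deliberately leaves out.
struct OffloadEntry {
  void *addr;
  const char *name;
  std::uint64_t size;
};

// Walk the half-open range [begin, end) and collect each entry's name.
// In a real binary, begin/end would be the __start_<section> and
// __stop_<section> symbols; here they are ordinary pointers (assumption).
std::vector<std::string> collectEntryNames(const OffloadEntry *begin,
                                           const OffloadEntry *end) {
  std::vector<std::string> names;
  for (const OffloadEntry *entry = begin; entry != end; ++entry)
    names.emplace_back(entry->name);
  return names;
}
```

Because such a loop only uses the start and stop symbols as addresses, declaring them as a single record or as a zero-length array of records would not change how the range is walked in a sketch like this.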
[llvm] [clang] [Offloading][NFC] Refactor handling of offloading entries (PR #72544)
llvmbot wrote:

@llvm/pr-subscribers-clang

Author: Joseph Huber (jhuber6)

Changes

Summary:
This patch is a simple refactoring of code out of the linker wrapper into a common location. The main motivation behind this change is to make it easier to change the handling in the future to accept a triple to be used to emit entries that function on that target.

---

Patch is 21.18 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/72544.diff

5 Files Affected:

- (modified) clang/test/Driver/linker-wrapper-image.c (+19-19)
- (modified) clang/tools/clang-linker-wrapper/CMakeLists.txt (+1)
- (modified) clang/tools/clang-linker-wrapper/OffloadWrapper.cpp (+22-94)
- (modified) llvm/include/llvm/Frontend/Offloading/Utility.h (+10-1)
- (modified) llvm/lib/Frontend/Offloading/Utility.cpp (+30-1)

```diff
diff --git a/clang/test/Driver/linker-wrapper-image.c b/clang/test/Driver/linker-wrapper-image.c
index 83e7db6a49a6bb3..bb641a08bc023d5 100644
--- a/clang/test/Driver/linker-wrapper-image.c
+++ b/clang/test/Driver/linker-wrapper-image.c
@@ -10,9 +10,9 @@
 // RUN: clang-linker-wrapper --print-wrapped-module --dry-run --host-triple=x86_64-unknown-linux-gnu \
 // RUN: --linker-path=/usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=OPENMP
-// OPENMP: @__start_omp_offloading_entries = external hidden constant %__tgt_offload_entry
-// OPENMP-NEXT: @__stop_omp_offloading_entries = external hidden constant %__tgt_offload_entry
-// OPENMP-NEXT: @__dummy.omp_offloading.entry = hidden constant [0 x %__tgt_offload_entry] zeroinitializer, section "omp_offloading_entries"
+// OPENMP: @__start_omp_offloading_entries = external hidden constant [0 x %struct.__tgt_offload_entry]
+// OPENMP-NEXT: @__stop_omp_offloading_entries = external hidden constant [0 x %struct.__tgt_offload_entry]
+// OPENMP-NEXT: @__dummy.omp_offloading_entries = hidden constant [0 x %struct.__tgt_offload_entry] zeroinitializer, section "omp_offloading_entries"
 // OPENMP-NEXT: @.omp_offloading.device_image = internal unnamed_addr constant [[[SIZE:[0-9]+]] x i8] c"\10\FF\10\AD{{.*}}"
 // OPENMP-NEXT: @.omp_offloading.device_images = internal unnamed_addr constant [1 x %__tgt_device_image] [%__tgt_device_image { ptr @.omp_offloading.device_image, ptr getelementptr inbounds ([[[SIZE]] x i8], ptr @.omp_offloading.device_image, i64 1, i64 0), ptr @__start_omp_offloading_entries, ptr @__stop_omp_offloading_entries }]
 // OPENMP-NEXT: @.omp_offloading.descriptor = internal constant %__tgt_bin_desc { i32 1, ptr @.omp_offloading.device_images, ptr @__start_omp_offloading_entries, ptr @__stop_omp_offloading_entries }
@@ -39,10 +39,10 @@
 // CUDA: @.fatbin_image = internal constant [0 x i8] zeroinitializer, section ".nv_fatbin"
 // CUDA-NEXT: @.fatbin_wrapper = internal constant %fatbin_wrapper { i32 1180844977, i32 1, ptr @.fatbin_image, ptr null }, section ".nvFatBinSegment", align 8
-// CUDA-NEXT: @__dummy.cuda_offloading.entry = hidden constant [0 x %__tgt_offload_entry] zeroinitializer, section "cuda_offloading_entries"
 // CUDA-NEXT: @.cuda.binary_handle = internal global ptr null
-// CUDA-NEXT: @__start_cuda_offloading_entries = external hidden constant [0 x %__tgt_offload_entry]
-// CUDA-NEXT: @__stop_cuda_offloading_entries = external hidden constant [0 x %__tgt_offload_entry]
+// CUDA-NEXT: @__start_cuda_offloading_entries = external hidden constant [0 x %struct.__tgt_offload_entry]
+// CUDA-NEXT: @__stop_cuda_offloading_entries = external hidden constant [0 x %struct.__tgt_offload_entry]
+// CUDA-NEXT: @__dummy.cuda_offloading_entries = hidden constant [0 x %struct.__tgt_offload_entry] zeroinitializer, section "cuda_offloading_entries"
 // CUDA-NEXT: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @.cuda.fatbin_reg, ptr null }]
 // CUDA: define internal void @.cuda.fatbin_reg() section ".text.startup" {
@@ -68,13 +68,13 @@
 // CUDA: while.entry:
 // CUDA-NEXT: %entry1 = phi ptr [ @__start_cuda_offloading_entries, %entry ], [ %7, %if.end ]
-// CUDA-NEXT: %1 = getelementptr inbounds %__tgt_offload_entry, ptr %entry1, i64 0, i32 0
+// CUDA-NEXT: %1 = getelementptr inbounds %struct.__tgt_offload_entry, ptr %entry1, i64 0, i32 0
 // CUDA-NEXT: %addr = load ptr, ptr %1, align 8
-// CUDA-NEXT: %2 = getelementptr inbounds %__tgt_offload_entry, ptr %entry1, i64 0, i32 1
+// CUDA-NEXT: %2 = getelementptr inbounds %struct.__tgt_offload_entry, ptr %entry1, i64 0, i32 1
 // CUDA-NEXT: %name = load ptr, ptr %2, align 8
-// CUDA-NEXT: %3 = getelementptr inbounds %__tgt_offload_entry, ptr %entry1, i64 0, i32 2
+// CUDA-NEXT: %3 = getelementptr inbounds %struct.__tgt_offload_entry, ptr %entry1, i64 0, i32 2
 // CUDA-NEXT: %size = load i64, ptr %3, align 4
-// CUDA-NEXT: %4 = getelementptr inbounds %__tgt_offload_entry, ptr %entry1, i64 0, i32 3
+// CUDA-NEXT: %4 =
```
[llvm] [clang] [Offloading][NFC] Refactor handling of offloading entries (PR #72544)
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/72544

Summary:
This patch is a simple refactoring of code out of the linker wrapper into a common location. The main motivation behind this change is to make it easier to change the handling in the future to accept a triple to be used to emit entries that function on that target.

>From 0047be2207b775e6de6dda24751daa933bd66ce5 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Thu, 16 Nov 2023 12:05:09 -0600
Subject: [PATCH] [Offloading][NFC] Refactor handling of offloading entries

Summary:
This patch is a simple refactoring of code out of the linker wrapper into a common location. The main motivation behind this change is to make it easier to change the handling in the future to accept a triple to be used to emit entries that function on that target.
---
 clang/test/Driver/linker-wrapper-image.c | 38 +++---
 .../tools/clang-linker-wrapper/CMakeLists.txt | 1 +
 .../clang-linker-wrapper/OffloadWrapper.cpp | 116 --
 .../llvm/Frontend/Offloading/Utility.h | 11 +-
 llvm/lib/Frontend/Offloading/Utility.cpp | 31 -
 5 files changed, 82 insertions(+), 115 deletions(-)

diff --git a/clang/test/Driver/linker-wrapper-image.c b/clang/test/Driver/linker-wrapper-image.c
index 83e7db6a49a6bb3..bb641a08bc023d5 100644
--- a/clang/test/Driver/linker-wrapper-image.c
+++ b/clang/test/Driver/linker-wrapper-image.c
@@ -10,9 +10,9 @@
 // RUN: clang-linker-wrapper --print-wrapped-module --dry-run --host-triple=x86_64-unknown-linux-gnu \
 // RUN: --linker-path=/usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=OPENMP
-// OPENMP: @__start_omp_offloading_entries = external hidden constant %__tgt_offload_entry
-// OPENMP-NEXT: @__stop_omp_offloading_entries = external hidden constant %__tgt_offload_entry
-// OPENMP-NEXT: @__dummy.omp_offloading.entry = hidden constant [0 x %__tgt_offload_entry] zeroinitializer, section "omp_offloading_entries"
+// OPENMP: @__start_omp_offloading_entries = external hidden constant [0 x %struct.__tgt_offload_entry]
+// OPENMP-NEXT: @__stop_omp_offloading_entries = external hidden constant [0 x %struct.__tgt_offload_entry]
+// OPENMP-NEXT: @__dummy.omp_offloading_entries = hidden constant [0 x %struct.__tgt_offload_entry] zeroinitializer, section "omp_offloading_entries"
 // OPENMP-NEXT: @.omp_offloading.device_image = internal unnamed_addr constant [[[SIZE:[0-9]+]] x i8] c"\10\FF\10\AD{{.*}}"
 // OPENMP-NEXT: @.omp_offloading.device_images = internal unnamed_addr constant [1 x %__tgt_device_image] [%__tgt_device_image { ptr @.omp_offloading.device_image, ptr getelementptr inbounds ([[[SIZE]] x i8], ptr @.omp_offloading.device_image, i64 1, i64 0), ptr @__start_omp_offloading_entries, ptr @__stop_omp_offloading_entries }]
 // OPENMP-NEXT: @.omp_offloading.descriptor = internal constant %__tgt_bin_desc { i32 1, ptr @.omp_offloading.device_images, ptr @__start_omp_offloading_entries, ptr @__stop_omp_offloading_entries }
@@ -39,10 +39,10 @@
 // CUDA: @.fatbin_image = internal constant [0 x i8] zeroinitializer, section ".nv_fatbin"
 // CUDA-NEXT: @.fatbin_wrapper = internal constant %fatbin_wrapper { i32 1180844977, i32 1, ptr @.fatbin_image, ptr null }, section ".nvFatBinSegment", align 8
-// CUDA-NEXT: @__dummy.cuda_offloading.entry = hidden constant [0 x %__tgt_offload_entry] zeroinitializer, section "cuda_offloading_entries"
 // CUDA-NEXT: @.cuda.binary_handle = internal global ptr null
-// CUDA-NEXT: @__start_cuda_offloading_entries = external hidden constant [0 x %__tgt_offload_entry]
-// CUDA-NEXT: @__stop_cuda_offloading_entries = external hidden constant [0 x %__tgt_offload_entry]
+// CUDA-NEXT: @__start_cuda_offloading_entries = external hidden constant [0 x %struct.__tgt_offload_entry]
+// CUDA-NEXT: @__stop_cuda_offloading_entries = external hidden constant [0 x %struct.__tgt_offload_entry]
+// CUDA-NEXT: @__dummy.cuda_offloading_entries = hidden constant [0 x %struct.__tgt_offload_entry] zeroinitializer, section "cuda_offloading_entries"
 // CUDA-NEXT: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @.cuda.fatbin_reg, ptr null }]
 // CUDA: define internal void @.cuda.fatbin_reg() section ".text.startup" {
@@ -68,13 +68,13 @@
 // CUDA: while.entry:
 // CUDA-NEXT: %entry1 = phi ptr [ @__start_cuda_offloading_entries, %entry ], [ %7, %if.end ]
-// CUDA-NEXT: %1 = getelementptr inbounds %__tgt_offload_entry, ptr %entry1, i64 0, i32 0
+// CUDA-NEXT: %1 = getelementptr inbounds %struct.__tgt_offload_entry, ptr %entry1, i64 0, i32 0
 // CUDA-NEXT: %addr = load ptr, ptr %1, align 8
-// CUDA-NEXT: %2 = getelementptr inbounds %__tgt_offload_entry, ptr %entry1, i64 0, i32 1
+// CUDA-NEXT: %2 = getelementptr inbounds %struct.__tgt_offload_entry, ptr %entry1, i64 0, i32 1
 // CUDA-NEXT: %name = load ptr, ptr %2, align 8
-// CUDA-NEXT: %3 =