[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2024-01-18 Thread Jan Patrick Lehr via cfe-commits

https://github.com/jplehr closed https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2024-01-18 Thread Jan Patrick Lehr via cfe-commits

jplehr wrote:

Closing this. Test is now part of feature-PR.

https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2024-01-12 Thread Johannes Doerfert via cfe-commits

https://github.com/jdoerfert approved this pull request.


https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2024-01-04 Thread Johannes Doerfert via cfe-commits

jdoerfert wrote:

I see. FWIW "pGI_decl_tgt_ref_ptr" is really not a good name (not your fault).

https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2024-01-04 Thread Jan Patrick Lehr via cfe-commits

jplehr wrote:

The IR is impacted for the global that is in that test case. Lines ~46-50 
(first IR section) vs line ~68 (second IR section). The remaining code is 
indeed the same.
The way that this test is executed is out of date however, given that I have 
reimplemented the flag.
My plan is to keep this test and add a few runtime tests as well, since we 
should be able to validate that we do not see data transfers.

https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2024-01-03 Thread Johannes Doerfert via cfe-commits

jdoerfert wrote:

I mean, the test shows device IR for two functions, doesn't it? I am not aware 
that part is impacted by USM. Host globals/register functions are.

https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2024-01-03 Thread Joseph Huber via cfe-commits

jhuber6 wrote:

Test should probably show that IR is equivalent to `#pragma omp requires 
unified_shared_memory` or however that's spelled. Basic documentation should be 
provided by the help test in the new flag, but we probably have somewhere in 
the OpenMP docs you could add it to if desired.

https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2024-01-03 Thread Johannes Doerfert via cfe-commits

jdoerfert wrote:

How do the IR checks verify the flag works?

https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2023-12-29 Thread Jan Patrick Lehr via cfe-commits

jplehr wrote:

I updated the feature PR (#75468) with a different solution. Will update the 
test after feedback if the route I took in the other PR is seen as OK.

https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2023-12-14 Thread Jan Patrick Lehr via cfe-commits

https://github.com/jplehr updated 
https://github.com/llvm/llvm-project/pull/75467

>From d3d073d7f57f2a5d06cd8c1de8c1503034af3b6b Mon Sep 17 00:00:00 2001
From: JP Lehr 
Date: Wed, 12 Jul 2023 05:04:41 -0400
Subject: [PATCH] [OpenMP][USM] Adds test for -fopenmp-force-usm flag

This adds a basic test to check the correct generation of double
indirect access to declare target globals in USM mode vs non-USM mode.

Marked as XFAIL to first land test and then enable in subsequent patch.
---
 clang/test/OpenMP/force-usm.c | 74 +++
 1 file changed, 74 insertions(+)
 create mode 100644 clang/test/OpenMP/force-usm.c

diff --git a/clang/test/OpenMP/force-usm.c b/clang/test/OpenMP/force-usm.c
new file mode 100644
index 00..f04d499e5f71c2
--- /dev/null
+++ b/clang/test/OpenMP/force-usm.c
@@ -0,0 +1,74 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" "pl_cond[.].+[.|,]" 
--prefix-filecheck-ir-name _ --version 3
+// REQUIRES: amdgpu-registered-target
+// XFAIL: amdgpu-registered-target
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -emit-llvm-bc %s -o 
%t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-USM %s
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm-bc %s -o %t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-DEFAULT %s
+// expected-no-diagnostics
+
+extern "C" void *malloc(unsigned int b);
+
+int GI;
+#pragma omp declare target
+int *pGI;
+#pragma omp end declare target
+
+int main(void) {
+
+  GI = 0;
+
+  pGI = (int *) malloc(sizeof(int));
+  *pGI = 42;
+
+#pragma omp target map(pGI[:1], GI)
+  {
+GI = 1;
+*pGI = 2;
+  }
+
+  return 0;
+}
+
+// CHECK-USM-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-USM-SAME: (ptr noundef nonnull align 4 dereferenceable(4) [[GI:%.*]]) 
#[[ATTR0:[0-9]+]] {
+// CHECK-USM-NEXT:  entry:
+// CHECK-USM-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-USM-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) 
[[GI_ADDR]] to ptr
+// CHECK-USM-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-USM-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-USM-NEXT:br i1 [[EXEC_USER_CODE]], label [[USER_CODE_ENTRY:%.*]], 
label [[WORKER_EXIT:%.*]]
+// CHECK-USM:   user_code.entry:
+// CHECK-USM-NEXT:store i32 1, ptr [[TMP0]], align 4
+// CHECK-USM-NEXT:[[TMP2:%.*]] = load ptr, ptr @pGI_decl_tgt_ref_ptr, 
align 8
+// CHECK-USM-NEXT:[[TMP3:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK-USM-NEXT:store i32 2, ptr [[TMP3]], align 4
+// CHECK-USM-NEXT:call void @__kmpc_target_deinit(ptr addrspacecast (ptr 
addrspace(1) @[[GLOB1]] to ptr), i8 1)
+// CHECK-USM-NEXT:ret void
+// CHECK-USM:   worker.exit:
+// CHECK-USM-NEXT:ret void
+//
+//
+// CHECK-DEFAULT-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-DEFAULT-SAME: (ptr noundef nonnull align 4 dereferenceable(4) 
[[GI:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-DEFAULT-NEXT:  entry:
+// CHECK-DEFAULT-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-DEFAULT-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr 
addrspace(5) [[GI_ADDR]] to ptr
+// CHECK-DEFAULT-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-DEFAULT-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], 
align 8
+// CHECK-DEFAULT-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-DEFAULT-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-DEFAULT-NEXT:br i1 [[EXEC_USER_CODE]], label 
[[USER_CODE_ENTRY:%.*]], label [[WORKER_EXIT:%.*]]
+// CHECK-DEFAULT:   user_code.entry:
+// CHECK-DEFAULT-NEXT:store i32 1, ptr [[TMP0]], align 4
+// CHECK-DEFAULT-NEXT:[[TMP2:%.*]] = load ptr, ptr addrspacecast (ptr 
addrspace(1) @pGI to ptr), align 8
+// 

[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2023-12-14 Thread via cfe-commits

https://github.com/ronlieb approved this pull request.


https://github.com/llvm/llvm-project/pull/75467
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Jan Patrick Lehr (jplehr)


Changes

This adds a basic test to check the correct generation of double indirect 
access to declare target globals in USM mode vs non-USM mode.
I am a bit unhappy with the way this test is set up, but could not find a 
better way to do it. Happy to improve that and add more tests then.

Marked as XFAIL to first land test and then enable in subsequent patch.

---
Full diff: https://github.com/llvm/llvm-project/pull/75467.diff


1 Files Affected:

- (added) clang/test/OpenMP/force-usm.c (+73) 


``diff
diff --git a/clang/test/OpenMP/force-usm.c b/clang/test/OpenMP/force-usm.c
new file mode 100644
index 00..222705322b8976
--- /dev/null
+++ b/clang/test/OpenMP/force-usm.c
@@ -0,0 +1,73 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" "pl_cond[.].+[.|,]" 
--prefix-filecheck-ir-name _ --version 3
+// XFAIL: amdgpu-registered-target
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -emit-llvm-bc %s -o 
%t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-USM %s
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm-bc %s -o %t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-DEFAULT %s
+// expected-no-diagnostics
+
+extern "C" void *malloc(unsigned int b);
+
+int GI;
+#pragma omp declare target
+int *pGI;
+#pragma omp end declare target
+
+int main(void) {
+
+  GI = 0;
+
+  pGI = (int *) malloc(sizeof(int));
+  *pGI = 42;
+
+#pragma omp target map(pGI[:1], GI)
+  {
+GI = 1;
+*pGI = 2;
+  }
+
+  return 0;
+}
+
+// CHECK-USM-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-USM-SAME: (ptr noundef nonnull align 4 dereferenceable(4) [[GI:%.*]]) 
#[[ATTR0:[0-9]+]] {
+// CHECK-USM-NEXT:  entry:
+// CHECK-USM-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-USM-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) 
[[GI_ADDR]] to ptr
+// CHECK-USM-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-USM-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-USM-NEXT:br i1 [[EXEC_USER_CODE]], label [[USER_CODE_ENTRY:%.*]], 
label [[WORKER_EXIT:%.*]]
+// CHECK-USM:   user_code.entry:
+// CHECK-USM-NEXT:store i32 1, ptr [[TMP0]], align 4
+// CHECK-USM-NEXT:[[TMP2:%.*]] = load ptr, ptr @pGI_decl_tgt_ref_ptr, 
align 8
+// CHECK-USM-NEXT:[[TMP3:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK-USM-NEXT:store i32 2, ptr [[TMP3]], align 4
+// CHECK-USM-NEXT:call void @__kmpc_target_deinit(ptr addrspacecast (ptr 
addrspace(1) @[[GLOB1]] to ptr), i8 1)
+// CHECK-USM-NEXT:ret void
+// CHECK-USM:   worker.exit:
+// CHECK-USM-NEXT:ret void
+//
+//
+// CHECK-DEFAULT-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-DEFAULT-SAME: (ptr noundef nonnull align 4 dereferenceable(4) 
[[GI:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-DEFAULT-NEXT:  entry:
+// CHECK-DEFAULT-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-DEFAULT-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr 
addrspace(5) [[GI_ADDR]] to ptr
+// CHECK-DEFAULT-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-DEFAULT-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], 
align 8
+// CHECK-DEFAULT-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-DEFAULT-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-DEFAULT-NEXT:br i1 [[EXEC_USER_CODE]], label 
[[USER_CODE_ENTRY:%.*]], label [[WORKER_EXIT:%.*]]
+// CHECK-DEFAULT:   user_code.entry:
+// CHECK-DEFAULT-NEXT:store i32 1, ptr [[TMP0]], align 4
+// CHECK-DEFAULT-NEXT:[[TMP2:%.*]] = load ptr, ptr addrspacecast (ptr 
addrspace(1) @pGI to ptr), align 8
+// CHECK-DEFAULT-NEXT:store i32 2, ptr [[TMP2]], align 4
+// CHECK-DEFAULT-NEXT:call void 

[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2023-12-14 Thread Jan Patrick Lehr via cfe-commits

https://github.com/jplehr created 
https://github.com/llvm/llvm-project/pull/75467

This adds a basic test to check the correct generation of double indirect 
access to declare target globals in USM mode vs non-USM mode.
I am a bit unhappy with the way this test is set up, but could not find a 
better way to do it. Happy to improve that and add more tests then.

Marked as XFAIL to first land test and then enable in subsequent patch.

>From ea2a9191122c5659aac380803b381f763c816e07 Mon Sep 17 00:00:00 2001
From: JP Lehr 
Date: Wed, 12 Jul 2023 05:04:41 -0400
Subject: [PATCH] [OpenMP][USM] Adds test for -fopenmp-force-usm flag

This adds a basic test to check the correct generation of double
indirect access to declare target globals in USM mode vs non-USM mode.

Marked as XFAIL to first land test and then enable in subsequent patch.
---
 clang/test/OpenMP/force-usm.c | 73 +++
 1 file changed, 73 insertions(+)
 create mode 100644 clang/test/OpenMP/force-usm.c

diff --git a/clang/test/OpenMP/force-usm.c b/clang/test/OpenMP/force-usm.c
new file mode 100644
index 00..222705322b8976
--- /dev/null
+++ b/clang/test/OpenMP/force-usm.c
@@ -0,0 +1,73 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" "pl_cond[.].+[.|,]" 
--prefix-filecheck-ir-name _ --version 3
+// XFAIL: amdgpu-registered-target
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -emit-llvm-bc %s -o 
%t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-USM %s
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm-bc %s -o %t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-DEFAULT %s
+// expected-no-diagnostics
+
+extern "C" void *malloc(unsigned int b);
+
+int GI;
+#pragma omp declare target
+int *pGI;
+#pragma omp end declare target
+
+int main(void) {
+
+  GI = 0;
+
+  pGI = (int *) malloc(sizeof(int));
+  *pGI = 42;
+
+#pragma omp target map(pGI[:1], GI)
+  {
+GI = 1;
+*pGI = 2;
+  }
+
+  return 0;
+}
+
+// CHECK-USM-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-USM-SAME: (ptr noundef nonnull align 4 dereferenceable(4) [[GI:%.*]]) 
#[[ATTR0:[0-9]+]] {
+// CHECK-USM-NEXT:  entry:
+// CHECK-USM-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-USM-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) 
[[GI_ADDR]] to ptr
+// CHECK-USM-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-USM-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-USM-NEXT:br i1 [[EXEC_USER_CODE]], label [[USER_CODE_ENTRY:%.*]], 
label [[WORKER_EXIT:%.*]]
+// CHECK-USM:   user_code.entry:
+// CHECK-USM-NEXT:store i32 1, ptr [[TMP0]], align 4
+// CHECK-USM-NEXT:[[TMP2:%.*]] = load ptr, ptr @pGI_decl_tgt_ref_ptr, 
align 8
+// CHECK-USM-NEXT:[[TMP3:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK-USM-NEXT:store i32 2, ptr [[TMP3]], align 4
+// CHECK-USM-NEXT:call void @__kmpc_target_deinit(ptr addrspacecast (ptr 
addrspace(1) @[[GLOB1]] to ptr), i8 1)
+// CHECK-USM-NEXT:ret void
+// CHECK-USM:   worker.exit:
+// CHECK-USM-NEXT:ret void
+//
+//
+// CHECK-DEFAULT-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-DEFAULT-SAME: (ptr noundef nonnull align 4 dereferenceable(4) 
[[GI:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-DEFAULT-NEXT:  entry:
+// CHECK-DEFAULT-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-DEFAULT-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr 
addrspace(5) [[GI_ADDR]] to ptr
+// CHECK-DEFAULT-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-DEFAULT-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], 
align 8
+// CHECK-DEFAULT-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-DEFAULT-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-DEFAULT-NEXT: