[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-01-19 Thread via cfe-commits

https://github.com/mmoadeli created 
https://github.com/llvm/llvm-project/pull/78759

- Address space cast of nullptr in local_space into a generic_space for the 
CUDA backend. The reason for this cast was having invalid local memory base 
address for the associated variable.
- In the context of AMD GPU, assigns a NULL value as ~0 for the address spaces 
of sycl_local and sycl_private to match the ones for opencl_local and 
opencl_private.
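
To make the second bullet concrete, here is a minimal sketch of the selection logic. It assumes a simplified stand-in enum: the `LangAS` below is hypothetical and only models the members named in this description, and `nullPointerValue` is an illustrative name, not the clang hook itself.

```cpp
#include <cstdint>

// Hypothetical stand-in for clang's LangAS enum; only the address spaces
// mentioned in the description are modelled.
enum class LangAS {
  Default,
  opencl_global,
  opencl_local,
  opencl_private,
  sycl_global,
  sycl_local,
  sycl_private
};

// Local and private address spaces get an all-ones null value; every other
// address space keeps a zero null value.
constexpr std::uint64_t nullPointerValue(LangAS AS) {
  return (AS == LangAS::opencl_local || AS == LangAS::opencl_private ||
          AS == LangAS::sycl_local || AS == LangAS::sycl_private)
             ? ~std::uint64_t{0}
             : std::uint64_t{0};
}
```

With this sketch, the SYCL and OpenCL spellings of local and private agree on the null value, which is the behaviour the bullet describes.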

>From 286ac8f3ea6aec711827ccab9608b010e78b18cf Mon Sep 17 00:00:00 2001
From: m moadeli 
Date: Fri, 19 Jan 2024 18:42:24 +
Subject: [PATCH] - Address space cast of a `local_space` specialized `nullptr`
 into a generic space for the CUDA backend. - In the context of AMD GPU,
 assigns a NULL value of `~0` for the address spaces of `sycl_local` and
 `sycl_private`

---
 clang/lib/Basic/Targets/AMDGPU.h  |   6 +-
 clang/lib/CodeGen/Targets/NVPTX.cpp   |  18 +++
 .../CodeGenSYCL/address-space-conversions.cpp |   4 +
 .../amd-address-space-conversions.cpp | 128 ++
 .../cuda-address-space-conversions.cpp| 122 +
 5 files changed, 276 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
 create mode 100644 clang/test/CodeGenSYCL/cuda-address-space-conversions.cpp

diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index 90a1516ecdd20d..94d6ee7f5f72df 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -418,8 +418,10 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
   // value ~0.
   uint64_t getNullPointerValue(LangAS AS) const override {
     // FIXME: Also should handle region.
-    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private)
-      ? ~0 : 0;
+    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private ||
+            AS == LangAS::sycl_local || AS == LangAS::sycl_private)
+               ? ~0
+               : 0;
   }
 
   void setAuxTarget(const TargetInfo *Aux) override;
diff --git a/clang/lib/CodeGen/Targets/NVPTX.cpp b/clang/lib/CodeGen/Targets/NVPTX.cpp
index d0dc7c258a03a6..8718f1ecf3a7e0 100644
--- a/clang/lib/CodeGen/Targets/NVPTX.cpp
+++ b/clang/lib/CodeGen/Targets/NVPTX.cpp
@@ -47,6 +47,10 @@ class NVPTXTargetCodeGenInfo : public TargetCodeGenInfo {
                            CodeGen::CodeGenModule &M) const override;
   bool shouldEmitStaticExternCAliases() const override;
 
+  llvm::Constant *getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                 llvm::PointerType *T,
+                                 QualType QT) const override;
+
   llvm::Type *getCUDADeviceBuiltinSurfaceDeviceType() const override {
     // On the device side, surface reference is represented as an object handle
     // in 64-bit integer.
@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}
 }
 
 void CodeGenModule::handleCUDALaunchBoundsAttr(llvm::Function *F,
diff --git a/clang/test/CodeGenSYCL/address-space-conversions.cpp b/clang/test/CodeGenSYCL/address-space-conversions.cpp
index 3933ad375412da..10a181318a174b 100644
--- a/clang/test/CodeGenSYCL/address-space-conversions.cpp
+++ b/clang/test/CodeGenSYCL/address-space-conversions.cpp
@@ -25,6 +25,10 @@ void usages() {
   __attribute__((opencl_local)) int *LOC;
   // CHECK-DAG: [[NoAS:%[a-zA-Z0-9]+]] = alloca ptr addrspace(4)
   // CHECK-DAG: [[NoAS]].ascast = addrspacecast ptr [[NoAS]] to ptr addrspace(4)
+  LOC = nullptr;
+  // CHECK-DAG: store ptr addrspace(3) null, ptr addrspace(4) [[LOC]].ascast, align 8
+  GLOB = nullptr;
+  // CHECK-DAG: store ptr addrspace(1) null, ptr addrspace(4) [[GLOB]].ascast, align 8
   int *NoAS;
   // CHECK-DAG: [[PRIV:%[a-zA-Z0-9]+]] = alloca ptr
   // CHECK-DAG: [[PRIV]].ascast = addrspacecast ptr [[PRIV]] to ptr addrspace(4)
diff --git a/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
new file mode 100644
index 00..35da61cd8cbbe3
--- /dev/null
+++ b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
@@ -0,0 +1,128 @@
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o -


2024-01-19 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (mmoadeli)


Changes

- Address space cast of nullptr in local_space into a generic_space for the 
CUDA backend. The reason for this cast was having invalid local memory base 
address for the associated variable.
- In the context of AMD GPU, assigns a NULL value as ~0 for the address spaces 
of sycl_local and sycl_private to match the ones for opencl_local and 
opencl_private.

---
Full diff: https://github.com/llvm/llvm-project/pull/78759.diff


5 Files Affected:

- (modified) clang/lib/Basic/Targets/AMDGPU.h (+4-2) 
- (modified) clang/lib/CodeGen/Targets/NVPTX.cpp (+18) 
- (modified) clang/test/CodeGenSYCL/address-space-conversions.cpp (+4) 
- (added) clang/test/CodeGenSYCL/amd-address-space-conversions.cpp (+128) 
- (added) clang/test/CodeGenSYCL/cuda-address-space-conversions.cpp (+122) 



2024-01-22 Thread Yaxun Liu via cfe-commits

yxsamliu wrote:

LGTM for amdgpu

https://github.com/llvm/llvm-project/pull/78759
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits



2024-01-22 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B commented:

It would be great to add some tests for local AS null pointers for NVPTX and 
AMDGPU back-ends.


https://github.com/llvm/llvm-project/pull/78759

2024-01-22 Thread via cfe-commits

mmoadeli wrote:

> It would be great to add some tests for local AS null pointers for NVPTX and 
> AMDGPU back-ends.

They already have it 
[here](https://github.com/llvm/llvm-project/blob/286ac8f3ea6aec711827ccab9608b010e78b18cf/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp#L24)

https://github.com/llvm/llvm-project/pull/78759

2024-01-22 Thread Artem Belevich via cfe-commits


@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}

Artem-B wrote:

I don't quite understand what's going on here. Why are we ASC'ing *all* null 
pointers to `LangAS::opencl_generic`?

Will it work for CUDA (as in the CUDA language)? I think this code should be 
restricted to apply the ASC only for OpenCL and leave CUDA/HIP with the default.
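
For reference when reading the hunk quoted above, here is a minimal standalone sketch of the branch structure as posted. It assumes the numeric address spaces that appear in the nvptx64 CHECK lines elsewhere in this thread (0 for generic, 1 for global, 3 for local). The names `GenericAS`, `LocalAS`, `pointerType` and `nullConstantFor` are hypothetical, and the strings only imitate IR text; this is not clang or LLVM code.

```cpp
#include <string>

// Assumed target address-space numbers, taken from the CHECK lines in this
// thread for the nvptx64 triple.
constexpr unsigned GenericAS = 0;
constexpr unsigned LocalAS = 3;

// Spells a pointer type the way opaque-pointer IR prints it.
std::string pointerType(unsigned AS) {
  if (AS == GenericAS)
    return std::string("ptr");
  return "ptr addrspace(" + std::to_string(AS) + ")";
}

// Models the early return in the posted hook: every address space except
// local keeps a plain null constant, and local gets a generic null that is
// address-space-cast to local.
std::string nullConstantFor(unsigned AS) {
  if (AS != LocalAS)
    return "null";
  return "addrspacecast (" + pointerType(GenericAS) + " null to " +
         pointerType(AS) + ")";
}
```

Under these assumptions only the local case takes the cast path, which is consistent with the `LOC = nullptr` and `GLOB = nullptr` CHECK lines in the NVPTX test. Whether the hook should additionally depend on the source language is the open question raised in the comment.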




https://github.com/llvm/llvm-project/pull/78759

2024-01-22 Thread Artem Belevich via cfe-commits

Artem-B wrote:

> * Address space cast of nullptr in local_space into a generic_space for the 
> CUDA backend.

I think you mean "NVPTX back-end". CUDA is a front-end entity (C++ w/ GPU 
extensions)

https://github.com/llvm/llvm-project/pull/78759

2024-02-18 Thread via cfe-commits

https://github.com/mmoadeli updated 
https://github.com/llvm/llvm-project/pull/78759

>From 9d743cbf91dd727dc32e994e82205f8114a44d7b Mon Sep 17 00:00:00 2001
From: m moadeli 
Date: Fri, 19 Jan 2024 18:42:24 +
Subject: [PATCH 1/2] - Address space cast of a `local_space` specialized
 `nullptr` into a generic space for the CUDA backend. - In the context of AMD
 GPU, assigns a NULL value of `~0` for the address spaces of `sycl_local` and
 `sycl_private`

---
 clang/lib/Basic/Targets/AMDGPU.h  |   6 +-
 clang/lib/CodeGen/Targets/NVPTX.cpp   |  18 +++
 .../CodeGenSYCL/address-space-conversions.cpp |   4 +
 .../amd-address-space-conversions.cpp | 128 ++
 .../cuda-address-space-conversions.cpp| 122 +
 5 files changed, 276 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
 create mode 100644 clang/test/CodeGenSYCL/cuda-address-space-conversions.cpp

diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index e80589dde0ecb5..94d9ba93ed226f 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -414,8 +414,10 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
   // value ~0.
   uint64_t getNullPointerValue(LangAS AS) const override {
     // FIXME: Also should handle region.
-    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private)
-      ? ~0 : 0;
+    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private ||
+            AS == LangAS::sycl_local || AS == LangAS::sycl_private)
+               ? ~0
+               : 0;
   }
 
   void setAuxTarget(const TargetInfo *Aux) override;
diff --git a/clang/lib/CodeGen/Targets/NVPTX.cpp b/clang/lib/CodeGen/Targets/NVPTX.cpp
index d0dc7c258a03a6..8718f1ecf3a7e0 100644
--- a/clang/lib/CodeGen/Targets/NVPTX.cpp
+++ b/clang/lib/CodeGen/Targets/NVPTX.cpp
@@ -47,6 +47,10 @@ class NVPTXTargetCodeGenInfo : public TargetCodeGenInfo {
                            CodeGen::CodeGenModule &M) const override;
   bool shouldEmitStaticExternCAliases() const override;
 
+  llvm::Constant *getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                 llvm::PointerType *T,
+                                 QualType QT) const override;
+
   llvm::Type *getCUDADeviceBuiltinSurfaceDeviceType() const override {
     // On the device side, surface reference is represented as an object handle
     // in 64-bit integer.
@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}
 }
 
 void CodeGenModule::handleCUDALaunchBoundsAttr(llvm::Function *F,
diff --git a/clang/test/CodeGenSYCL/address-space-conversions.cpp b/clang/test/CodeGenSYCL/address-space-conversions.cpp
index 3933ad375412da..10a181318a174b 100644
--- a/clang/test/CodeGenSYCL/address-space-conversions.cpp
+++ b/clang/test/CodeGenSYCL/address-space-conversions.cpp
@@ -25,6 +25,10 @@ void usages() {
   __attribute__((opencl_local)) int *LOC;
   // CHECK-DAG: [[NoAS:%[a-zA-Z0-9]+]] = alloca ptr addrspace(4)
   // CHECK-DAG: [[NoAS]].ascast = addrspacecast ptr [[NoAS]] to ptr addrspace(4)
+  LOC = nullptr;
+  // CHECK-DAG: store ptr addrspace(3) null, ptr addrspace(4) [[LOC]].ascast, align 8
+  GLOB = nullptr;
+  // CHECK-DAG: store ptr addrspace(1) null, ptr addrspace(4) [[GLOB]].ascast, align 8
   int *NoAS;
   // CHECK-DAG: [[PRIV:%[a-zA-Z0-9]+]] = alloca ptr
   // CHECK-DAG: [[PRIV]].ascast = addrspacecast ptr [[PRIV]] to ptr addrspace(4)
diff --git a/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
new file mode 100644
index 00..35da61cd8cbbe3
--- /dev/null
+++ b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
@@ -0,0 +1,128 @@
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
+void bar(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF:[a-zA-Z0-9_]+]](ptr noundef nonnull align 4 dereferenceable(4) %
+void bar2(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF2:[a-zA-Z0-9_]+]](ptr noundef nonnull align 4 dereferenceable(4) %
+void bar(__attribute__((opencl_local)) int &Data) {}
+// CHEC


2024-02-19 Thread via cfe-commits

https://github.com/mmoadeli updated 
https://github.com/llvm/llvm-project/pull/78759

>From 717ad72ef5f1f318ef707cc829df8d4a9b46b131 Mon Sep 17 00:00:00 2001
From: m moadeli 
Date: Fri, 19 Jan 2024 18:42:24 +
Subject: [PATCH 1/2] - Address space cast of a `local_space` specialized
 `nullptr` into a generic space for the CUDA backend. - In the context of AMD
 GPU, assigns a NULL value of `~0` for the address spaces of `sycl_local` and
 `sycl_private`


2024-02-26 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/78759

2024-02-26 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/78759

2024-02-15 Thread via cfe-commits

https://github.com/mmoadeli updated 
https://github.com/llvm/llvm-project/pull/78759

>From 286ac8f3ea6aec711827ccab9608b010e78b18cf Mon Sep 17 00:00:00 2001
From: m moadeli 
Date: Fri, 19 Jan 2024 18:42:24 +
Subject: [PATCH 1/2] - Address space cast of a `local_space` specialized
 `nullptr` into a generic space for the CUDA backend. - In the context of AMD
 GPU, assigns a NULL value of `~0` for the address spaces of `sycl_local` and
 `sycl_private`


2024-02-15 Thread via cfe-commits


@@ -0,0 +1,122 @@
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsycl-is-device -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
+void bar(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF:[a-zA-Z0-9_]+]](ptr noundef nonnull align 4 dereferenceable(4) %
+void bar2(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF2:[a-zA-Z0-9_]+]](ptr noundef nonnull align 4 dereferenceable(4) %
+void bar(__attribute__((opencl_local)) int &Data) {}
+// CHECK-DAG: define dso_local void @[[LOCAL_REF:[a-zA-Z0-9_]+]](ptr addrspace(3) noundef align 4 dereferenceable(4) %
+void foo(int *Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_PTR:[a-zA-Z0-9_]+]](ptr noundef %
+void foo2(int *Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_PTR2:[a-zA-Z0-9_]+]](ptr noundef %
+void foo(__attribute__((opencl_local)) int *Data) {}
+// CHECK-DAG: define dso_local void @[[LOC_PTR:[a-zA-Z0-9_]+]](ptr addrspace(3) noundef %
+
+template <typename T>
+void tmpl(T t);
+// See Check Lines below.
+
+void usages() {
+  __attribute__((opencl_global)) int *GLOB;
+  // CHECK-DAG: [[GLOB:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  __attribute__((opencl_local)) int *LOC;
+  // CHECK-DAG: [[LOC:%[a-zA-Z0-9]+]] = alloca ptr addrspace(3), align 8
+  LOC = nullptr;
+  // CHECK-DAG: store ptr addrspace(3) addrspacecast (ptr null to ptr addrspace(3)), ptr [[LOC]], align 8
+  GLOB = nullptr;
+  // CHECK-DAG: store ptr addrspace(1) null, ptr [[GLOB]], align 8
+  int *NoAS;
+  // CHECK-DAG: [[NoAS:%[a-zA-Z0-9]+]] = alloca ptr, align 8
+  __attribute__((opencl_private)) int *PRIV;
+  // CHECK-DAG: [[PRIV:%[a-zA-Z0-9]+]] = alloca ptr, align 8
+  __attribute__((opencl_global_device)) int *GLOBDEVICE;
+  // CHECK-DAG: [[GLOB_DEVICE:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  __attribute__((opencl_global_host)) int *GLOBHOST;
+  // CHECK-DAG: [[GLOB_HOST:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  NoAS = (int *)GLOB;
+  // CHECK-DAG: [[GLOB_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr [[GLOB]], align 8
+  // CHECK-DAG: [[GLOB_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr addrspace(1) [[GLOB_LOAD]] to ptr
+  // CHECK-DAG: store ptr [[GLOB_CAST]], ptr [[NoAS]], align 8
+  NoAS = (int *)LOC;
+  // CHECK-DAG: [[LOC_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(3), ptr [[LOC]], align 8
+  // CHECK-DAG: [[LOC_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr addrspace(3) [[LOC_LOAD]] to ptr
+  // CHECK-DAG: store ptr [[LOC_CAST]], ptr [[NoAS]], align 8
+  NoAS = (int *)PRIV;
+  // CHECK-DAG: [[LOC_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[PRIV]], align 8
+  // CHECK-DAG: store ptr [[LOC_LOAD]], ptr [[NoAS]], align 8
+  GLOB = (__attribute__((opencl_global)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: [[NoAS_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr [[NoAS_LOAD]] to ptr addrspace(1)
+  // CHECK-DAG: store ptr addrspace(1) [[NoAS_CAST]], ptr [[GLOB]], align 8
+  LOC = (__attribute__((opencl_local)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: [[NoAS_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr [[NoAS_LOAD]] to ptr addrspace(3)
+  // CHECK-DAG: store ptr addrspace(3) [[NoAS_CAST]], ptr [[LOC]], align 8
+  PRIV = (__attribute__((opencl_private)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: store ptr [[NoAS_LOAD]], ptr [[PRIV]], align 8
+  GLOB = (__attribute__((opencl_global)) int *)GLOBDEVICE;
+  // CHECK-DAG: [[GLOBDEVICE_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr [[GLOB_DEVICE]], align 8
+  // CHECK-DAG: store ptr addrspace(1) [[GLOBDEVICE_LOAD]], ptr %GLOB, align 8
+  GLOB = (__attribute__((opencl_global)) int *)GLOBHOST;
+  // CHECK-DAG: [[GLOB_HOST_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr [[GLOB_HOST]], align 8
+  // CHECK-DAG: store ptr addrspace(1) [[GLOB_HOST_LOAD]], ptr [[GLOB]], align 8
+  bar(*GLOB);

mmoadeli wrote:

@arichardson , thank you for the comment. It's done.

https://github.com/llvm/llvm-project/pull/78759
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-15 Thread via cfe-commits

https://github.com/mmoadeli updated 
https://github.com/llvm/llvm-project/pull/78759

>From 37504a970c8cf78a8f221fb75ad5653f89526288 Mon Sep 17 00:00:00 2001
From: m moadeli 
Date: Fri, 19 Jan 2024 18:42:24 +
Subject: [PATCH 1/2] - Address space cast of a `local_space` specialized
 `nullptr` into a generic space for the CUDA backend. - In the context of AMD
 GPU, assigns a NULL value of `~0` for the address spaces of `sycl_local` and
 `sycl_private`

---
 clang/lib/Basic/Targets/AMDGPU.h  |   6 +-
 clang/lib/CodeGen/Targets/NVPTX.cpp   |  18 +++
 .../CodeGenSYCL/address-space-conversions.cpp |   4 +
 .../amd-address-space-conversions.cpp | 128 ++
 .../cuda-address-space-conversions.cpp| 122 +
 5 files changed, 276 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
 create mode 100644 clang/test/CodeGenSYCL/cuda-address-space-conversions.cpp

diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index e80589dde0ecb5..94d9ba93ed226f 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -414,8 +414,10 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
   // value ~0.
   uint64_t getNullPointerValue(LangAS AS) const override {
     // FIXME: Also should handle region.
-    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private)
-      ? ~0 : 0;
+    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private ||
+            AS == LangAS::sycl_local || AS == LangAS::sycl_private)
+               ? ~0
+               : 0;
   }
 
   void setAuxTarget(const TargetInfo *Aux) override;
diff --git a/clang/lib/CodeGen/Targets/NVPTX.cpp 
b/clang/lib/CodeGen/Targets/NVPTX.cpp
index d0dc7c258a03a6..8718f1ecf3a7e0 100644
--- a/clang/lib/CodeGen/Targets/NVPTX.cpp
+++ b/clang/lib/CodeGen/Targets/NVPTX.cpp
@@ -47,6 +47,10 @@ class NVPTXTargetCodeGenInfo : public TargetCodeGenInfo {
CodeGen::CodeGenModule &M) const override;
   bool shouldEmitStaticExternCAliases() const override;
 
+  llvm::Constant *getNullPointer(const CodeGen::CodeGenModule &CGM,
+ llvm::PointerType *T,
+ QualType QT) const override;
+
   llvm::Type *getCUDADeviceBuiltinSurfaceDeviceType() const override {
 // On the device side, surface reference is represented as an object handle
 // in 64-bit integer.
@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}
 }
 
 void CodeGenModule::handleCUDALaunchBoundsAttr(llvm::Function *F,
diff --git a/clang/test/CodeGenSYCL/address-space-conversions.cpp 
b/clang/test/CodeGenSYCL/address-space-conversions.cpp
index 3933ad375412da..10a181318a174b 100644
--- a/clang/test/CodeGenSYCL/address-space-conversions.cpp
+++ b/clang/test/CodeGenSYCL/address-space-conversions.cpp
@@ -25,6 +25,10 @@ void usages() {
   __attribute__((opencl_local)) int *LOC;
   // CHECK-DAG: [[NoAS:%[a-zA-Z0-9]+]] = alloca ptr addrspace(4)
   // CHECK-DAG: [[NoAS]].ascast = addrspacecast ptr [[NoAS]] to ptr 
addrspace(4)
+  LOC = nullptr;
+  // CHECK-DAG: store ptr addrspace(3) null, ptr addrspace(4) [[LOC]].ascast, 
align 8
+  GLOB = nullptr;
+  // CHECK-DAG: store ptr addrspace(1) null, ptr addrspace(4) [[GLOB]].ascast, 
align 8
   int *NoAS;
   // CHECK-DAG: [[PRIV:%[a-zA-Z0-9]+]] = alloca ptr
   // CHECK-DAG: [[PRIV]].ascast = addrspacecast ptr [[PRIV]] to ptr 
addrspace(4)
diff --git a/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp 
b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
new file mode 100644
index 00..35da61cd8cbbe3
--- /dev/null
+++ b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
@@ -0,0 +1,128 @@
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fsycl-is-device 
-disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
+void bar(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar2(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF2:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar(__attribute__((opencl_local)) int &Data) {}
+// CHEC

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-16 Thread via cfe-commits

https://github.com/mmoadeli updated 
https://github.com/llvm/llvm-project/pull/78759

>From ea5af38849f3c2badbb3c1c1161a13d1c855e19b Mon Sep 17 00:00:00 2001
From: m moadeli 
Date: Fri, 19 Jan 2024 18:42:24 +
Subject: [PATCH 1/2] - Address space cast of a `local_space` specialized
 `nullptr` into a generic space for the CUDA backend. - In the context of AMD
 GPU, assigns a NULL value of `~0` for the address spaces of `sycl_local` and
 `sycl_private`

---
 clang/lib/Basic/Targets/AMDGPU.h  |   6 +-
 clang/lib/CodeGen/Targets/NVPTX.cpp   |  18 +++
 .../CodeGenSYCL/address-space-conversions.cpp |   4 +
 .../amd-address-space-conversions.cpp | 128 ++
 .../cuda-address-space-conversions.cpp| 122 +
 5 files changed, 276 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
 create mode 100644 clang/test/CodeGenSYCL/cuda-address-space-conversions.cpp

diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index e80589dde0ecb5..94d9ba93ed226f 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -414,8 +414,10 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
   // value ~0.
   uint64_t getNullPointerValue(LangAS AS) const override {
     // FIXME: Also should handle region.
-    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private)
-      ? ~0 : 0;
+    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private ||
+            AS == LangAS::sycl_local || AS == LangAS::sycl_private)
+               ? ~0
+               : 0;
   }
 
   void setAuxTarget(const TargetInfo *Aux) override;
diff --git a/clang/lib/CodeGen/Targets/NVPTX.cpp 
b/clang/lib/CodeGen/Targets/NVPTX.cpp
index d0dc7c258a03a6..8718f1ecf3a7e0 100644
--- a/clang/lib/CodeGen/Targets/NVPTX.cpp
+++ b/clang/lib/CodeGen/Targets/NVPTX.cpp
@@ -47,6 +47,10 @@ class NVPTXTargetCodeGenInfo : public TargetCodeGenInfo {
CodeGen::CodeGenModule &M) const override;
   bool shouldEmitStaticExternCAliases() const override;
 
+  llvm::Constant *getNullPointer(const CodeGen::CodeGenModule &CGM,
+ llvm::PointerType *T,
+ QualType QT) const override;
+
   llvm::Type *getCUDADeviceBuiltinSurfaceDeviceType() const override {
 // On the device side, surface reference is represented as an object handle
 // in 64-bit integer.
@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}
 }
 
 void CodeGenModule::handleCUDALaunchBoundsAttr(llvm::Function *F,
diff --git a/clang/test/CodeGenSYCL/address-space-conversions.cpp 
b/clang/test/CodeGenSYCL/address-space-conversions.cpp
index 3933ad375412da..10a181318a174b 100644
--- a/clang/test/CodeGenSYCL/address-space-conversions.cpp
+++ b/clang/test/CodeGenSYCL/address-space-conversions.cpp
@@ -25,6 +25,10 @@ void usages() {
   __attribute__((opencl_local)) int *LOC;
   // CHECK-DAG: [[NoAS:%[a-zA-Z0-9]+]] = alloca ptr addrspace(4)
   // CHECK-DAG: [[NoAS]].ascast = addrspacecast ptr [[NoAS]] to ptr 
addrspace(4)
+  LOC = nullptr;
+  // CHECK-DAG: store ptr addrspace(3) null, ptr addrspace(4) [[LOC]].ascast, 
align 8
+  GLOB = nullptr;
+  // CHECK-DAG: store ptr addrspace(1) null, ptr addrspace(4) [[GLOB]].ascast, 
align 8
   int *NoAS;
   // CHECK-DAG: [[PRIV:%[a-zA-Z0-9]+]] = alloca ptr
   // CHECK-DAG: [[PRIV]].ascast = addrspacecast ptr [[PRIV]] to ptr 
addrspace(4)
diff --git a/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp 
b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
new file mode 100644
index 00..35da61cd8cbbe3
--- /dev/null
+++ b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
@@ -0,0 +1,128 @@
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fsycl-is-device 
-disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
+void bar(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar2(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF2:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar(__attribute__((opencl_local)) int &Data) {}
+// CHEC

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-06 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm approved this pull request.

AMDGPU parts LGTM (they could be split into a separate change from the PTX 
change)

https://github.com/llvm/llvm-project/pull/78759


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-06 Thread Alexander Richardson via cfe-commits


@@ -0,0 +1,122 @@
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsycl-is-device 
-disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
+void bar(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar2(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF2:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar(__attribute__((opencl_local)) int &Data) {}
+// CHECK-DAG: define dso_local void @[[LOCAL_REF:[a-zA-Z0-9_]+]](ptr 
addrspace(3) noundef align 4 dereferenceable(4) %
+void foo(int *Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_PTR:[a-zA-Z0-9_]+]](ptr noundef %
+void foo2(int *Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_PTR2:[a-zA-Z0-9_]+]](ptr noundef %
+void foo(__attribute__((opencl_local)) int *Data) {}
+// CHECK-DAG: define dso_local void @[[LOC_PTR:[a-zA-Z0-9_]+]](ptr 
addrspace(3) noundef %
+
+template <typename T>
+void tmpl(T t);
+// See Check Lines below.
+
+void usages() {
+  __attribute__((opencl_global)) int *GLOB;
+  // CHECK-DAG: [[GLOB:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  __attribute__((opencl_local)) int *LOC;
+  // CHECK-DAG: [[LOC:%[a-zA-Z0-9]+]] = alloca ptr addrspace(3), align 8
+  LOC = nullptr;
+  // CHECK-DAG: store ptr addrspace(3) addrspacecast (ptr null to ptr 
addrspace(3)), ptr [[LOC]], align 8
+  GLOB = nullptr;
+  // CHECK-DAG: store ptr addrspace(1) null, ptr [[GLOB]], align 8
+  int *NoAS;
+  // CHECK-DAG: [[NoAS:%[a-zA-Z0-9]+]] = alloca ptr, align 8
+  __attribute__((opencl_private)) int *PRIV;
+  // CHECK-DAG: [[PRIV:%[a-zA-Z0-9]+]] = alloca ptr, align 8
+  __attribute__((opencl_global_device)) int *GLOBDEVICE;
+  // CHECK-DAG: [[GLOB_DEVICE:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 
8
+  __attribute__((opencl_global_host)) int *GLOBHOST;
+  // CHECK-DAG: [[GLOB_HOST:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  NoAS = (int *)GLOB;
+  // CHECK-DAG: [[GLOB_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr 
[[GLOB]], align 8
+  // CHECK-DAG: [[GLOB_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr addrspace(1) 
[[GLOB_LOAD]] to ptr
+  // CHECK-DAG: store ptr [[GLOB_CAST]], ptr [[NoAS]], align 8
+  NoAS = (int *)LOC;
+  // CHECK-DAG: [[LOC_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(3), ptr 
[[LOC]], align 8
+  // CHECK-DAG: [[LOC_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr addrspace(3) 
[[LOC_LOAD]] to ptr
+  // CHECK-DAG: store ptr [[LOC_CAST]], ptr [[NoAS]], align 8
+  NoAS = (int *)PRIV;
+  // CHECK-DAG: [[LOC_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[PRIV]], align 8
+  // CHECK-DAG: store ptr [[LOC_LOAD]], ptr [[NoAS]], align 8
+  GLOB = (__attribute__((opencl_global)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: [[NoAS_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr [[NoAS_LOAD]] 
to ptr addrspace(1)
+  // CHECK-DAG: store ptr addrspace(1) [[NoAS_CAST]], ptr [[GLOB]], align 8
+  LOC = (__attribute__((opencl_local)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: [[NoAS_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr [[NoAS_LOAD]] 
to ptr addrspace(3)
+  // CHECK-DAG: store ptr addrspace(3) [[NoAS_CAST]], ptr [[LOC]], align 8
+  PRIV = (__attribute__((opencl_private)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: store ptr [[NoAS_LOAD]], ptr [[PRIV]], align 8
+  GLOB = (__attribute__((opencl_global)) int *)GLOBDEVICE;
+  // CHECK-DAG: [[GLOBDEVICE_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr 
[[GLOB_DEVICE]], align 8
+  // CHECK-DAG: store ptr addrspace(1) [[GLOBDEVICE_LOAD]], ptr %GLOB, align 8
+  GLOB = (__attribute__((opencl_global)) int *)GLOBHOST;
+  // CHECK-DAG: [[GLOB_HOST_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr 
[[GLOB_HOST]], align 8
+  // CHECK-DAG: store ptr addrspace(1) [[GLOB_HOST_LOAD]], ptr [[GLOB]], align 
8
+  bar(*GLOB);

arichardson wrote:

Why are all these CHECK lines using CHECK-DAG? I would not expect them to be 
reordered, so `CHECK:` should work just fine?
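
For anyone following along, the distinction being raised can be sketched with a toy matcher. This is a deliberately simplified model and not FileCheck's actual algorithm: patterns are plain substrings, each pattern claims one whole line, and the names `matchesInOrder` and `matchesAnyOrder` are made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model of plain `CHECK:` lines: every pattern must be found on a line
// strictly after the line that satisfied the previous pattern.
bool matchesInOrder(const std::vector<std::string> &lines,
                    const std::vector<std::string> &patterns) {
  std::size_t next = 0; // first line still available for matching
  for (const std::string &pattern : patterns) {
    bool found = false;
    for (std::size_t i = next; i < lines.size(); ++i) {
      if (lines[i].find(pattern) != std::string::npos) {
        next = i + 1;
        found = true;
        break;
      }
    }
    if (!found)
      return false;
  }
  return true;
}

// Toy model of a block of `CHECK-DAG:` lines: patterns may match in any
// order, but each one has to claim a line no other pattern has claimed.
bool matchesAnyOrder(const std::vector<std::string> &lines,
                     const std::vector<std::string> &patterns) {
  std::vector<bool> used(lines.size(), false);
  for (const std::string &pattern : patterns) {
    bool found = false;
    for (std::size_t i = 0; i < lines.size(); ++i) {
      if (!used[i] && lines[i].find(pattern) != std::string::npos) {
        used[i] = true;
        found = true;
        break;
      }
    }
    if (!found)
      return false;
  }
  return true;
}
```

Under this model, a pattern list written in the same order as the emitted IR passes both functions, while a list written out of order passes only `matchesAnyOrder`. The ordered form is therefore the stricter test when no reordering is expected.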

https://github.com/llvm/llvm-project/pull/78759
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-06 Thread Alexander Richardson via cfe-commits


@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}

arichardson wrote:

As far as I can tell, the reason the AMDGPU code uses an addrspacecast is 
the following comment: `// Currently LLVM assumes null pointers always have 
value 0, which results in incorrectly transformed IR.` So it can't use a `null` 
literal for all ones.

Looking at the LangRef, I can't actually see anywhere that says `ptr addrspace(X) 
null` is the zero value, so this should probably be clarified in the LangRef: 
https://llvm.org/docs/LangRef.html#t-pointer
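
To make the point concrete, here is a small standalone sketch of the choice between a `null` literal and an addrspacecast. It does not use the LLVM API: `pointerType` and `renderNullConstant` are hypothetical helpers that only build IR-like text, and the decision is keyed on the null value, which is my reading of the AMDGPU path. The NVPTX patch above keys on the local address space instead.

```cpp
#include <cstdint>
#include <string>

// Textual IR type of a pointer in the given target address space.
std::string pointerType(unsigned addrSpace) {
  if (addrSpace == 0)
    return "ptr";
  return "ptr addrspace(" + std::to_string(addrSpace) + ")";
}

// Hypothetical helper: renders the typed constant a frontend would emit for a
// language-level null pointer. A zero null value can use the `null` literal
// directly; a non-zero one is spelled as a cast of the generic null, because
// the `null` literal is assumed to mean the all-zero bit pattern.
std::string renderNullConstant(unsigned addrSpace, std::uint64_t nullValue,
                               unsigned genericAddrSpace = 0) {
  const std::string type = pointerType(addrSpace);
  if (nullValue == 0)
    return type + " null";
  return type + " addrspacecast (" + pointerType(genericAddrSpace) +
         " null to " + type + ")";
}
```

With address space 3 and an all-ones null value, this produces the same operand text as the `store` CHECK line for `LOC = nullptr;` in the new CUDA test.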

https://github.com/llvm/llvm-project/pull/78759
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-07 Thread via cfe-commits

https://github.com/mmoadeli updated 
https://github.com/llvm/llvm-project/pull/78759

>From 286ac8f3ea6aec711827ccab9608b010e78b18cf Mon Sep 17 00:00:00 2001
From: m moadeli 
Date: Fri, 19 Jan 2024 18:42:24 +
Subject: [PATCH 1/2] - Address space cast of a `local_space` specialized
 `nullptr` into a generic space for the CUDA backend. - In the context of AMD
 GPU, assigns a NULL value of `~0` for the address spaces of `sycl_local` and
 `sycl_private`

---
 clang/lib/Basic/Targets/AMDGPU.h  |   6 +-
 clang/lib/CodeGen/Targets/NVPTX.cpp   |  18 +++
 .../CodeGenSYCL/address-space-conversions.cpp |   4 +
 .../amd-address-space-conversions.cpp | 128 ++
 .../cuda-address-space-conversions.cpp| 122 +
 5 files changed, 276 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
 create mode 100644 clang/test/CodeGenSYCL/cuda-address-space-conversions.cpp

diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index 90a1516ecdd20d..94d6ee7f5f72df 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -418,8 +418,10 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
   // value ~0.
   uint64_t getNullPointerValue(LangAS AS) const override {
     // FIXME: Also should handle region.
-    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private)
-      ? ~0 : 0;
+    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private ||
+            AS == LangAS::sycl_local || AS == LangAS::sycl_private)
+               ? ~0
+               : 0;
   }
 
   void setAuxTarget(const TargetInfo *Aux) override;
diff --git a/clang/lib/CodeGen/Targets/NVPTX.cpp 
b/clang/lib/CodeGen/Targets/NVPTX.cpp
index d0dc7c258a03a6..8718f1ecf3a7e0 100644
--- a/clang/lib/CodeGen/Targets/NVPTX.cpp
+++ b/clang/lib/CodeGen/Targets/NVPTX.cpp
@@ -47,6 +47,10 @@ class NVPTXTargetCodeGenInfo : public TargetCodeGenInfo {
CodeGen::CodeGenModule &M) const override;
   bool shouldEmitStaticExternCAliases() const override;
 
+  llvm::Constant *getNullPointer(const CodeGen::CodeGenModule &CGM,
+ llvm::PointerType *T,
+ QualType QT) const override;
+
   llvm::Type *getCUDADeviceBuiltinSurfaceDeviceType() const override {
 // On the device side, surface reference is represented as an object handle
 // in 64-bit integer.
@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}
 }
 
 void CodeGenModule::handleCUDALaunchBoundsAttr(llvm::Function *F,
diff --git a/clang/test/CodeGenSYCL/address-space-conversions.cpp 
b/clang/test/CodeGenSYCL/address-space-conversions.cpp
index 3933ad375412da..10a181318a174b 100644
--- a/clang/test/CodeGenSYCL/address-space-conversions.cpp
+++ b/clang/test/CodeGenSYCL/address-space-conversions.cpp
@@ -25,6 +25,10 @@ void usages() {
   __attribute__((opencl_local)) int *LOC;
   // CHECK-DAG: [[NoAS:%[a-zA-Z0-9]+]] = alloca ptr addrspace(4)
   // CHECK-DAG: [[NoAS]].ascast = addrspacecast ptr [[NoAS]] to ptr 
addrspace(4)
+  LOC = nullptr;
+  // CHECK-DAG: store ptr addrspace(3) null, ptr addrspace(4) [[LOC]].ascast, 
align 8
+  GLOB = nullptr;
+  // CHECK-DAG: store ptr addrspace(1) null, ptr addrspace(4) [[GLOB]].ascast, 
align 8
   int *NoAS;
   // CHECK-DAG: [[PRIV:%[a-zA-Z0-9]+]] = alloca ptr
   // CHECK-DAG: [[PRIV]].ascast = addrspacecast ptr [[PRIV]] to ptr 
addrspace(4)
diff --git a/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp 
b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
new file mode 100644
index 00..35da61cd8cbbe3
--- /dev/null
+++ b/clang/test/CodeGenSYCL/amd-address-space-conversions.cpp
@@ -0,0 +1,128 @@
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fsycl-is-device 
-disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
+void bar(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar2(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF2:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar(__attribute__((opencl_local)) int &Data) {}
+// CHEC

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-07 Thread via cfe-commits


@@ -0,0 +1,122 @@
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsycl-is-device 
-disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
+void bar(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar2(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF2:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar(__attribute__((opencl_local)) int &Data) {}
+// CHECK-DAG: define dso_local void @[[LOCAL_REF:[a-zA-Z0-9_]+]](ptr 
addrspace(3) noundef align 4 dereferenceable(4) %
+void foo(int *Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_PTR:[a-zA-Z0-9_]+]](ptr noundef %
+void foo2(int *Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_PTR2:[a-zA-Z0-9_]+]](ptr noundef %
+void foo(__attribute__((opencl_local)) int *Data) {}
+// CHECK-DAG: define dso_local void @[[LOC_PTR:[a-zA-Z0-9_]+]](ptr 
addrspace(3) noundef %
+
+template <typename T>
+void tmpl(T t);
+// See Check Lines below.
+
+void usages() {
+  __attribute__((opencl_global)) int *GLOB;
+  // CHECK-DAG: [[GLOB:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  __attribute__((opencl_local)) int *LOC;
+  // CHECK-DAG: [[LOC:%[a-zA-Z0-9]+]] = alloca ptr addrspace(3), align 8
+  LOC = nullptr;
+  // CHECK-DAG: store ptr addrspace(3) addrspacecast (ptr null to ptr 
addrspace(3)), ptr [[LOC]], align 8
+  GLOB = nullptr;
+  // CHECK-DAG: store ptr addrspace(1) null, ptr [[GLOB]], align 8
+  int *NoAS;
+  // CHECK-DAG: [[NoAS:%[a-zA-Z0-9]+]] = alloca ptr, align 8
+  __attribute__((opencl_private)) int *PRIV;
+  // CHECK-DAG: [[PRIV:%[a-zA-Z0-9]+]] = alloca ptr, align 8
+  __attribute__((opencl_global_device)) int *GLOBDEVICE;
+  // CHECK-DAG: [[GLOB_DEVICE:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 
8
+  __attribute__((opencl_global_host)) int *GLOBHOST;
+  // CHECK-DAG: [[GLOB_HOST:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  NoAS = (int *)GLOB;
+  // CHECK-DAG: [[GLOB_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr 
[[GLOB]], align 8
+  // CHECK-DAG: [[GLOB_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr addrspace(1) 
[[GLOB_LOAD]] to ptr
+  // CHECK-DAG: store ptr [[GLOB_CAST]], ptr [[NoAS]], align 8
+  NoAS = (int *)LOC;
+  // CHECK-DAG: [[LOC_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(3), ptr 
[[LOC]], align 8
+  // CHECK-DAG: [[LOC_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr addrspace(3) 
[[LOC_LOAD]] to ptr
+  // CHECK-DAG: store ptr [[LOC_CAST]], ptr [[NoAS]], align 8
+  NoAS = (int *)PRIV;
+  // CHECK-DAG: [[LOC_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[PRIV]], align 8
+  // CHECK-DAG: store ptr [[LOC_LOAD]], ptr [[NoAS]], align 8
+  GLOB = (__attribute__((opencl_global)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: [[NoAS_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr [[NoAS_LOAD]] 
to ptr addrspace(1)
+  // CHECK-DAG: store ptr addrspace(1) [[NoAS_CAST]], ptr [[GLOB]], align 8
+  LOC = (__attribute__((opencl_local)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: [[NoAS_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr [[NoAS_LOAD]] 
to ptr addrspace(3)
+  // CHECK-DAG: store ptr addrspace(3) [[NoAS_CAST]], ptr [[LOC]], align 8
+  PRIV = (__attribute__((opencl_private)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: store ptr [[NoAS_LOAD]], ptr [[PRIV]], align 8
+  GLOB = (__attribute__((opencl_global)) int *)GLOBDEVICE;
+  // CHECK-DAG: [[GLOBDEVICE_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr 
[[GLOB_DEVICE]], align 8
+  // CHECK-DAG: store ptr addrspace(1) [[GLOBDEVICE_LOAD]], ptr %GLOB, align 8
+  GLOB = (__attribute__((opencl_global)) int *)GLOBHOST;
+  // CHECK-DAG: [[GLOB_HOST_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr 
[[GLOB_HOST]], align 8
+  // CHECK-DAG: store ptr addrspace(1) [[GLOB_HOST_LOAD]], ptr [[GLOB]], align 
8
+  bar(*GLOB);

mmoadeli wrote:

Thanks @arichardson. In the added AMD and CUDA tests, the `CHECK-DAG`s are 
replaced with `CHECK`. 

https://github.com/llvm/llvm-project/pull/78759
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-07 Thread Matt Arsenault via cfe-commits


@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}

arsenm wrote:

"The semantics of non-zero address spaces are target-specific. Memory access 
through a non-dereferenceable pointer is undefined behavior in any address 
space. Pointers with the bit-value 0 are only assumed to be non-dereferenceable 
in address space 0, unless the function is marked with the 
null_pointer_is_valid attribute."
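
Read literally, the last sentence of that quote boils down to a two-input rule. A minimal sketch follows; the function name is invented here and this is only a restatement of the quoted paragraph, not an LLVM API.

```cpp
// Restates the quoted rule: a pointer whose bit-value is 0 is assumed to be
// non-dereferenceable only in address space 0, and only when the enclosing
// function is not marked null_pointer_is_valid.
constexpr bool zeroIsAssumedNonDereferenceable(unsigned addrSpace,
                                               bool nullPointerIsValid) {
  return addrSpace == 0 && !nullPointerIsValid;
}

// Address space 3 gets no such assumption from this paragraph.
static_assert(zeroIsAssumedNonDereferenceable(0, false), "default case");
static_assert(!zeroIsAssumedNonDereferenceable(3, false), "non-zero AS");
```

Note that this only covers dereferenceability of the zero bit pattern. It says nothing about which bit pattern the `null` constant has in a non-zero address space, which is the open question in this thread.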

https://github.com/llvm/llvm-project/pull/78759
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-07 Thread Matt Arsenault via cfe-commits


@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}

arsenm wrote:

I think this really needs to be a property of the datalayout. The addrspacecast 
is a roundabout way of using -1. 
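
As a rough illustration of what such a property could look like, here is a standalone sketch. Everything in it is hypothetical: `NullPointerTable` is not part of LLVM's `DataLayout`, and the idea of keying a null bit pattern by address-space number is just one possible shape for the suggestion.

```cpp
#include <cstdint>
#include <map>

// Hypothetical per-target table: address spaces whose null pointer is not the
// all-zero bit pattern. Any address space that is not listed uses 0.
struct NullPointerTable {
  std::map<unsigned, std::uint64_t> nonZeroNull;

  std::uint64_t nullValueFor(unsigned addrSpace) const {
    auto it = nonZeroNull.find(addrSpace);
    return it == nonZeroNull.end() ? 0 : it->second;
  }

  // With the pattern recorded here, a consumer could ask directly whether a
  // given bit pattern is the null pointer, without decoding an addrspacecast.
  bool isNull(unsigned addrSpace, std::uint64_t bits) const {
    return bits == nullValueFor(addrSpace);
  }
};
```

For example, a target could register address space 3 with the all-ones pattern, after which `isNull(3, 0)` is false and `isNull(3, ~0)` is true, while address space 1 keeps the usual zero null.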

https://github.com/llvm/llvm-project/pull/78759
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-07 Thread Alexander Richardson via cfe-commits


@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}

arichardson wrote:

Agreed. I believe that right now some LLVM passes assume a `null` constant is 
all zeroes, even though LangRef does not specify it, so we need to either 
document those semantics or add a property to the data layout.

https://github.com/llvm/llvm-project/pull/78759
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-07 Thread Alexander Richardson via cfe-commits


@@ -0,0 +1,122 @@
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsycl-is-device 
-disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
+void bar(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar2(int &Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_REF2:[a-zA-Z0-9_]+]](ptr noundef 
nonnull align 4 dereferenceable(4) %
+void bar(__attribute__((opencl_local)) int &Data) {}
+// CHECK-DAG: define dso_local void @[[LOCAL_REF:[a-zA-Z0-9_]+]](ptr 
addrspace(3) noundef align 4 dereferenceable(4) %
+void foo(int *Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_PTR:[a-zA-Z0-9_]+]](ptr noundef %
+void foo2(int *Data) {}
+// CHECK-DAG: define dso_local void @[[RAW_PTR2:[a-zA-Z0-9_]+]](ptr noundef %
+void foo(__attribute__((opencl_local)) int *Data) {}
+// CHECK-DAG: define dso_local void @[[LOC_PTR:[a-zA-Z0-9_]+]](ptr 
addrspace(3) noundef %
+
+template <typename T>
+void tmpl(T t);
+// See Check Lines below.
+
+void usages() {
+  __attribute__((opencl_global)) int *GLOB;
+  // CHECK-DAG: [[GLOB:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  __attribute__((opencl_local)) int *LOC;
+  // CHECK-DAG: [[LOC:%[a-zA-Z0-9]+]] = alloca ptr addrspace(3), align 8
+  LOC = nullptr;
+  // CHECK-DAG: store ptr addrspace(3) addrspacecast (ptr null to ptr 
addrspace(3)), ptr [[LOC]], align 8
+  GLOB = nullptr;
+  // CHECK-DAG: store ptr addrspace(1) null, ptr [[GLOB]], align 8
+  int *NoAS;
+  // CHECK-DAG: [[NoAS:%[a-zA-Z0-9]+]] = alloca ptr, align 8
+  __attribute__((opencl_private)) int *PRIV;
+  // CHECK-DAG: [[PRIV:%[a-zA-Z0-9]+]] = alloca ptr, align 8
+  __attribute__((opencl_global_device)) int *GLOBDEVICE;
+  // CHECK-DAG: [[GLOB_DEVICE:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  __attribute__((opencl_global_host)) int *GLOBHOST;
+  // CHECK-DAG: [[GLOB_HOST:%[a-zA-Z0-9]+]] = alloca ptr addrspace(1), align 8
+  NoAS = (int *)GLOB;
+  // CHECK-DAG: [[GLOB_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr [[GLOB]], align 8
+  // CHECK-DAG: [[GLOB_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr addrspace(1) [[GLOB_LOAD]] to ptr
+  // CHECK-DAG: store ptr [[GLOB_CAST]], ptr [[NoAS]], align 8
+  NoAS = (int *)LOC;
+  // CHECK-DAG: [[LOC_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(3), ptr [[LOC]], align 8
+  // CHECK-DAG: [[LOC_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr addrspace(3) [[LOC_LOAD]] to ptr
+  // CHECK-DAG: store ptr [[LOC_CAST]], ptr [[NoAS]], align 8
+  NoAS = (int *)PRIV;
+  // CHECK-DAG: [[LOC_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[PRIV]], align 8
+  // CHECK-DAG: store ptr [[LOC_LOAD]], ptr [[NoAS]], align 8
+  GLOB = (__attribute__((opencl_global)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: [[NoAS_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr [[NoAS_LOAD]] to ptr addrspace(1)
+  // CHECK-DAG: store ptr addrspace(1) [[NoAS_CAST]], ptr [[GLOB]], align 8
+  LOC = (__attribute__((opencl_local)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: [[NoAS_CAST:%[a-zA-Z0-9]+]] = addrspacecast ptr [[NoAS_LOAD]] to ptr addrspace(3)
+  // CHECK-DAG: store ptr addrspace(3) [[NoAS_CAST]], ptr [[LOC]], align 8
+  PRIV = (__attribute__((opencl_private)) int *)NoAS;
+  // CHECK-DAG: [[NoAS_LOAD:%[a-zA-Z0-9]+]] = load ptr, ptr [[NoAS]], align 8
+  // CHECK-DAG: store ptr [[NoAS_LOAD]], ptr [[PRIV]], align 8
+  GLOB = (__attribute__((opencl_global)) int *)GLOBDEVICE;
+  // CHECK-DAG: [[GLOBDEVICE_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr [[GLOB_DEVICE]], align 8
+  // CHECK-DAG: store ptr addrspace(1) [[GLOBDEVICE_LOAD]], ptr %GLOB, align 8
+  GLOB = (__attribute__((opencl_global)) int *)GLOBHOST;
+  // CHECK-DAG: [[GLOB_HOST_LOAD:%[a-zA-Z0-9]+]] = load ptr addrspace(1), ptr [[GLOB_HOST]], align 8
+  // CHECK-DAG: store ptr addrspace(1) [[GLOB_HOST_LOAD]], ptr [[GLOB]], align 8
+  bar(*GLOB);

arichardson wrote:

Thanks. I think it would also be good to update 
clang/test/CodeGenSYCL/address-space-conversions.cpp as an NFC commit and rebase 
this PR on top of that.

https://github.com/llvm/llvm-project/pull/78759


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-01-25 Thread Matt Arsenault via cfe-commits


@@ -418,8 +418,10 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
   // value ~0.
   uint64_t getNullPointerValue(LangAS AS) const override {
     // FIXME: Also should handle region.
-    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private)
-      ? ~0 : 0;
+    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private ||
+            AS == LangAS::sycl_local || AS == LangAS::sycl_private)

arsenm wrote:

Is there a point to having separate LangAS enums for these? 

https://github.com/llvm/llvm-project/pull/78759


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-01-25 Thread Victor Lomuller via cfe-commits


@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}

Naghasan wrote:

Hi @Artem-B 

I'm chiming in at @mmoadeli's request. I advised him on the resolution of his 
issue.

> I don't quite understand what's going on here.

This is a similar story to the AMDGPU backend. `0` as a pointer to shared 
memory is valid and points to the base of the shared memory, which means we 
cannot use this value as `nullptr`. AMDGPU uses -1 (all bits set) for this, but 
we couldn't find anything equivalent in the CUDA/PTX documentation. After some 
investigation, we found that the most stable way to do this is simply to insert 
this expression.

Note that `0` as a pointer to the generic address space is the expected value 
for `nullptr`.
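
To make the AMDGPU-style scheme concrete, here is a minimal, self-contained 
sketch. It is not clang code: `AddrSpace`, `nullPointerBits` and `isNull` are 
hypothetical names, and the model ignores the real pointer widths of each 
address space. It only illustrates why `0` cannot double as null when `0` is a 
valid address.

```cpp
#include <cstdint>

// Hypothetical stand-in for the address spaces discussed in this thread;
// this is NOT clang's LangAS enum.
enum class AddrSpace { Generic, Global, Local, Private };

// AMDGPU-style scheme: address 0 is a valid location in local and private
// memory, so those spaces reserve the all-bits-set pattern for null.
// Assumption: pointer width is ignored and every value is 64 bits wide.
constexpr std::uint64_t nullPointerBits(AddrSpace AS) {
  return (AS == AddrSpace::Local || AS == AddrSpace::Private)
             ? ~std::uint64_t{0}
             : std::uint64_t{0};
}

// A raw bit pattern is null only if it matches the sentinel of its space.
constexpr bool isNull(AddrSpace AS, std::uint64_t Bits) {
  return Bits == nullPointerBits(AS);
}

// Address 0 in local memory is a real location, not null.
static_assert(!isNull(AddrSpace::Local, 0), "0 is a valid local address");
static_assert(isNull(AddrSpace::Generic, 0), "generic null is 0");
```

NVPTX has no documented sentinel of this kind, which is why the patch takes a 
different route for the shared address space.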

> Why are we ASC'ing all null pointers to LangAS::opencl_generic ?

The patch isn't doing this. We emit the ASC only if the pointer type *is* to 
the CUDA shared address space (OpenCL's local address space). Otherwise this 
emits the simple `llvm::ConstantPointerNull`.
We used `LangAS::opencl_generic` to emphasize that there is a generic-to-shared 
address space cast going on. The other solution would be to use 
`LangAS::Default` to retrieve the target address space, but `Default` doesn't 
sound right to me, as you have to know that it maps to NVPTX's generic target 
address space. Either way, we don't have a strong opinion on what to use, but a 
comment is probably needed regardless.
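
As a rough illustration of that branch, the sketch below models the decision 
with plain strings instead of LLVM objects. The helper names (`ptrType`, 
`nullConstant`) are hypothetical, and the address-space numbers (0 for generic, 
3 for shared) are the NVPTX values assumed from the test's CHECK lines. It 
prints the constant operand the way it appears in the emitted IR.

```cpp
#include <string>

// Assumed NVPTX target address-space numbers: 0 = generic, 3 = shared.
constexpr unsigned GenericAS = 0;
constexpr unsigned SharedAS = 3;

// Spell a pointer type in LLVM IR's opaque-pointer syntax.
std::string ptrType(unsigned AS) {
  if (AS == GenericAS)
    return "ptr";
  return "ptr addrspace(" + std::to_string(AS) + ")";
}

// Model of the patch's choice: only a shared-space null is expressed as an
// addrspacecast of the generic null; every other space keeps a plain null.
std::string nullConstant(unsigned AS) {
  if (AS != SharedAS)
    return ptrType(AS) + " null";
  return ptrType(AS) + " addrspacecast (" + ptrType(GenericAS) + " null to " +
         ptrType(AS) + ")";
}
```

With these assumptions, `nullConstant(3)` yields 
`ptr addrspace(3) addrspacecast (ptr null to ptr addrspace(3))` and 
`nullConstant(1)` yields `ptr addrspace(1) null`, which are the operands the 
test expects in the stores to `LOC` and `GLOB`.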

> Will it work for CUDA (as in the CUDA language)? I think this code should be 
> restricted to apply the ASC only for OpenCL and leave CUDA/HIP with the 
> default.

Yes and no. To the `Will it work for CUDA?` part: yes, it will, because you 
actually cannot hit this return. CUDA doesn't expose address spaces, so you 
can't have a `nullptr` typed in the CUDA shared address space, and the `if` 
above will always evaluate to true in CUDA.

For the `leave CUDA/HIP with the default` part: you could force things and use 
target address spaces, as is done in the clang headers for CUDA, and this 
change would capture that. However, as explained before, `0` in address space 3 
(NVPTX backend) is a valid address, and this is very easy to observe in SASS.

https://github.com/llvm/llvm-project/pull/78759


[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-01-25 Thread Victor Lomuller via cfe-commits


@@ -418,8 +418,10 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
   // value ~0.
   uint64_t getNullPointerValue(LangAS AS) const override {
     // FIXME: Also should handle region.
-    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private)
-      ? ~0 : 0;
+    return (AS == LangAS::opencl_local || AS == LangAS::opencl_private ||
+            AS == LangAS::sycl_local || AS == LangAS::sycl_private)

Naghasan wrote:

The split is the result of long discussions with the OpenCL code owner.

https://github.com/llvm/llvm-project/pull/78759