[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

Victor Lomuller via cfe-commits Thu, 25 Jan 2024 07:40:03 -0800

================
@@ -285,6 +289,20 @@ void 
NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV,
 bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const {
   return false;
 }
+
+llvm::Constant *
+NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM,
+                                       llvm::PointerType *PT,
+                                       QualType QT) const {
+  auto &Ctx = CGM.getContext();
+  if (PT->getAddressSpace() != Ctx.getTargetAddressSpace(LangAS::opencl_local))
+    return llvm::ConstantPointerNull::get(PT);
+
+  auto NPT = llvm::PointerType::get(
+      PT->getContext(), Ctx.getTargetAddressSpace(LangAS::opencl_generic));
+  return llvm::ConstantExpr::getAddrSpaceCast(
+      llvm::ConstantPointerNull::get(NPT), PT);
+}
----------------
Naghasan wrote:


Hi @Artem-B 

I'm shimming in at @mmoadeli's request. I advised him on the resolution of his 
issue.

> I don't quite understand what's going on here.

So it is a similar story as for the AMDGPU backend. `0` as a pointer to shared 
memory is a valid one and points to the root of the shared memory, so that's 
means we cannot use this value as `nullptr`. AMDGPU uses -1 (all bits set) for 
this, but we couldn't find anything equivalent in the CUDA/PTX documentation. 
After a few investigation, we found out the most stable way to do this is 
simply by inserting this expression.

Note that `0` as a pointer to the generic address space the expected value for 
`nullptr`

> Why are we ASC'ing all null pointers to LangAS::opencl_generic ?

The patch isn't doing this, if the pointer type *is* to the cuda shared address 
space (opencl's local address space) then we do ASC. Otherwise this emits the 
simple `llvm::ConstantPointerNull`.
We used `LangAS::opencl_generic` as a way to emphasis there is a generic to 
shared address space cast going on. The other solution here would be to use 
`LangAS::Default` to retrieve the target address space, but `Default` doesn't 
sound right to me as you have to know this maps to NVPTX's generic target 
address space. Either way, we don't have a strong opinion on what to use. But a 
comment is probably needed regardless.

> Will it work for CUDA (as in the CUDA language)? I think this code should be 
> restricted to apply the ASC only for OpenCL and leave CUDA/HIP with the 
> dafault.

So yes and no. To the `Will it work for CUDA ?` part, yes it will because you 
actually cannot hit this return. CUDA doesn't expose address spaces so you 
can't have that nullptr as an address in the cuda shared address space, so the 
`if` above will always evaluate to true in CUDA.

For the `leave CUDA/HIP with the dafault` part, you could force things and use 
target address spaces like it is done in the clang headers for CUDA and this 
change would capture that. However, as explained before, `0` in the address 
space 3 (NVPTX backend) is a valid address and it is very easy to highlight in 
SASS.

https://github.com/llvm/llvm-project/pull/78759
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

Reply via email to