[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-22 Thread Mehdi Amini via cfe-commits

https://github.com/joker-eph approved this pull request.

MLIR side LGTM

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-22 Thread Alex MacLean via cfe-commits

https://github.com/AlexMaclean approved this pull request.


https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-22 Thread Alex MacLean via cfe-commits

https://github.com/AlexMaclean commented:

llvm changes LGTM, though I'm not too familiar with the MLIR portion of this 
change. 

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-21 Thread Alex MacLean via cfe-commits


@@ -2381,25 +2387,38 @@ def INT_PTX_LDG_G_v4i32_ELE : VLDG_G_ELE_V4<"u32", 
Int32Regs>;
 def INT_PTX_LDG_G_v4f32_ELE : VLDG_G_ELE_V4<"f32", Float32Regs>;
 
 
-multiclass NG_TO_G {
-   def "" : NVPTXInst<(outs Int32Regs:$result), (ins Int32Regs:$src),
-  "cvta." # Str # ".u32 \t$result, $src;", []>;
-   def _64 : NVPTXInst<(outs Int64Regs:$result), (ins Int64Regs:$src),
-  "cvta." # Str # ".u64 \t$result, $src;", []>;
+multiclass NG_TO_G Preds = []> 
{
+  foreach bitwidth = !if(Supports32, ["32", "64"], ["64"]) in {
+if !eq(bitwidth, "32") then {
+  def "" : NVPTXInst<(outs Int32Regs:$result), (ins Int32Regs:$src),
+   "cvta." # Str # ".u32 \t$result, $src;", []>, Requires;
+} else if !eq(bitwidth, "64") then {
+  def _64 : NVPTXInst<(outs Int64Regs:$result), (ins Int64Regs:$src),
+"cvta." # Str # ".u64 \t$result, $src;", []>, Requires;
+}
+  }
 }
 
-multiclass G_TO_NG {
-   def "" : NVPTXInst<(outs Int32Regs:$result), (ins Int32Regs:$src),
-  "cvta.to." # Str # ".u32 \t$result, $src;", []>;
-   def _64 : NVPTXInst<(outs Int64Regs:$result), (ins Int64Regs:$src),
-  "cvta.to." # Str # ".u64 \t$result, $src;", []>;
+multiclass G_TO_NG Preds = []> 
{
+  foreach bitwidth = !if(Supports32, ["32", "64"], ["64"]) in {

AlexMaclean wrote:

I don't think this foreach loop is really accomplishing much. Can it just be 
removed?

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-21 Thread via cfe-commits


@@ -280,8 +282,10 @@ NVPTXTargetMachine::getPredicatedAddrSpace(const Value *V) 
const {
 case Intrinsic::nvvm_isspacep_local:
   return std::make_pair(II->getArgOperand(0), llvm::ADDRESS_SPACE_LOCAL);
 case Intrinsic::nvvm_isspacep_shared:
-case Intrinsic::nvvm_isspacep_shared_cluster:
   return std::make_pair(II->getArgOperand(0), llvm::ADDRESS_SPACE_SHARED);
+case Intrinsic::nvvm_isspacep_shared_cluster:

modiking wrote:

Technically this is conservative since `nvvm_isspacep_shared_cluster == true` 
implies that the pointer is also under `ADDRESS_SPACE_SHARED`. However the 
current interface only allows returning one address so that'll need to get 
updated. I'm planning on improving InferAddrSpaces to also detect from 
mapa/mapa_shared_cluster so I'll update this part as well during that.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-21 Thread via cfe-commits

https://github.com/modiking edited 
https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-21 Thread via cfe-commits


@@ -426,10 +426,7 @@ static std::optional evaluateIsSpace(Intrinsic::ID 
IID, unsigned AS) {
   case Intrinsic::nvvm_isspacep_shared:
 return AS == NVPTXAS::ADDRESS_SPACE_SHARED;

modiking wrote:

Right, if it's `ADDRESS_SPACE_SHARED_CLUSTER` we can't perform any folding.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-21 Thread via cfe-commits


@@ -176,6 +176,7 @@ enum AddressSpace : AddressSpaceUnderlyingType {
   Shared = 3,

modiking wrote:

Will keep it `shared` in this change. We can visit renaming/aliasing later if 
we want.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-21 Thread via cfe-commits


@@ -25,6 +25,7 @@ enum AddressSpace : unsigned {
   ADDRESS_SPACE_CONST = 4,
   ADDRESS_SPACE_LOCAL = 5,
   ADDRESS_SPACE_TENSOR = 6,
+  ADDRESS_SPACE_SHARED_CLUSTER = 7,

modiking wrote:

Will keep it `shared` in this change. We can visit renaming/aliasing later if 
we want.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-21 Thread via cfe-commits


@@ -426,10 +426,7 @@ static std::optional evaluateIsSpace(Intrinsic::ID 
IID, unsigned AS) {
   case Intrinsic::nvvm_isspacep_shared:
 return AS == NVPTXAS::ADDRESS_SPACE_SHARED;
   case Intrinsic::nvvm_isspacep_shared_cluster:
-// We can't tell shared from shared_cluster at compile time from AS alone,
-// but it can't be either is AS is not shared.
-return AS == NVPTXAS::ADDRESS_SPACE_SHARED ? std::nullopt
-   : std::optional{false};
+return AS == NVPTXAS::ADDRESS_SPACE_SHARED_CLUSTER;

modiking wrote:

Good call, fixed.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-21 Thread Artem Belevich via cfe-commits


@@ -25,6 +25,7 @@ enum AddressSpace : unsigned {
   ADDRESS_SPACE_CONST = 4,
   ADDRESS_SPACE_LOCAL = 5,
   ADDRESS_SPACE_TENSOR = 6,
+  ADDRESS_SPACE_SHARED_CLUSTER = 7,

Artem-B wrote:

PTX docs say:
```
 If no sub-qualifier is specified with the .shared state space, then it 
defaults to ::cta.
```

Ideally, it's the most generic variant that should've had the most generic name 
of `shared`, and the specific subsets would have an additional suffix, but 
`shared` already has 10+ years of being in use. For pretty much all existing 
users SHARED already means a per-CTA variant, and I think we should keep it as 
is.  Changing its name to something else will not really buy us anything other 
than churn, IMO.

We may consider adding SHARED_CTA as an alias to SHARED. I can see how in some 
new contexts it may be useful to highlight the intended distinction.



https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-19 Thread Alex MacLean via cfe-commits


@@ -3019,8 +3019,26 @@ SDValue NVPTXTargetLowering::LowerADDRSPACECAST(SDValue 
Op,
   unsigned SrcAS = N->getSrcAddressSpace();
   unsigned DestAS = N->getDestAddressSpace();
   if (SrcAS != llvm::ADDRESS_SPACE_GENERIC &&
-  DestAS != llvm::ADDRESS_SPACE_GENERIC)
+  DestAS != llvm::ADDRESS_SPACE_GENERIC) {
+// Shared and SharedCluster can be converted to each other through generic
+// space
+if ((SrcAS == llvm::ADDRESS_SPACE_SHARED &&
+ DestAS == llvm::ADDRESS_SPACE_SHARED_CLUSTER) ||
+(SrcAS == llvm::ADDRESS_SPACE_SHARED_CLUSTER &&
+ DestAS == llvm::ADDRESS_SPACE_SHARED)) {
+  const MVT GenerictVT =
+  getPointerTy(DAG.getDataLayout(), ADDRESS_SPACE_GENERIC);
+  const MVT OutputVT = getPointerTy(DAG.getDataLayout(), DestAS);
+  SDValue GenericConversion = DAG.getAddrSpaceCast(
+  SDLoc(), GenerictVT, Op.getOperand(0), SrcAS, ADDRESS_SPACE_GENERIC);

AlexMaclean wrote:

Use the DL from the original value

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-19 Thread Alex MacLean via cfe-commits


@@ -3019,8 +3019,26 @@ SDValue NVPTXTargetLowering::LowerADDRSPACECAST(SDValue 
Op,
   unsigned SrcAS = N->getSrcAddressSpace();
   unsigned DestAS = N->getDestAddressSpace();
   if (SrcAS != llvm::ADDRESS_SPACE_GENERIC &&
-  DestAS != llvm::ADDRESS_SPACE_GENERIC)
+  DestAS != llvm::ADDRESS_SPACE_GENERIC) {
+// Shared and SharedCluster can be converted to each other through generic
+// space
+if ((SrcAS == llvm::ADDRESS_SPACE_SHARED &&
+ DestAS == llvm::ADDRESS_SPACE_SHARED_CLUSTER) ||
+(SrcAS == llvm::ADDRESS_SPACE_SHARED_CLUSTER &&
+ DestAS == llvm::ADDRESS_SPACE_SHARED)) {
+  const MVT GenerictVT =
+  getPointerTy(DAG.getDataLayout(), ADDRESS_SPACE_GENERIC);
+  const MVT OutputVT = getPointerTy(DAG.getDataLayout(), DestAS);

AlexMaclean wrote:

Just use `Op.getValueType()`

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-19 Thread Alex MacLean via cfe-commits


@@ -25,6 +25,7 @@ enum AddressSpace : unsigned {
   ADDRESS_SPACE_CONST = 4,
   ADDRESS_SPACE_LOCAL = 5,
   ADDRESS_SPACE_TENSOR = 6,
+  ADDRESS_SPACE_SHARED_CLUSTER = 7,

AlexMaclean wrote:

I think it would be good to rename `ADDRESS_SPACE_SHARED` to 
`ADDRESS_SPACE_SHARED_CTA`. This would be a bit more consistent with PTX and I 
think clearer as well. With the current names it is easy to assume that 
SHARED_CLUSTER is a subspace of SHARED when in fact the opposite is true. 

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-19 Thread Alex MacLean via cfe-commits


@@ -426,10 +426,7 @@ static std::optional evaluateIsSpace(Intrinsic::ID 
IID, unsigned AS) {
   case Intrinsic::nvvm_isspacep_shared:
 return AS == NVPTXAS::ADDRESS_SPACE_SHARED;

AlexMaclean wrote:

If the address space is `ADDRESS_SPACE_SHARED_CLUSTER` this intrinsic may or 
may not return true.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-19 Thread Alex MacLean via cfe-commits


@@ -176,6 +176,7 @@ enum AddressSpace : AddressSpaceUnderlyingType {
   Shared = 3,

AlexMaclean wrote:

Lets rename this to `SharedCTA` as well.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-19 Thread Alex MacLean via cfe-commits


@@ -426,10 +426,7 @@ static std::optional evaluateIsSpace(Intrinsic::ID 
IID, unsigned AS) {
   case Intrinsic::nvvm_isspacep_shared:
 return AS == NVPTXAS::ADDRESS_SPACE_SHARED;
   case Intrinsic::nvvm_isspacep_shared_cluster:
-// We can't tell shared from shared_cluster at compile time from AS alone,
-// but it can't be either is AS is not shared.
-return AS == NVPTXAS::ADDRESS_SPACE_SHARED ? std::nullopt
-   : std::optional{false};
+return AS == NVPTXAS::ADDRESS_SPACE_SHARED_CLUSTER;

AlexMaclean wrote:

If the address space is `NVPTXAS::ADDRESS_SPACE_SHARED` won't this also return 
true?

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-18 Thread Alex MacLean via cfe-commits


@@ -2381,29 +2387,41 @@ def INT_PTX_LDG_G_v4i32_ELE : VLDG_G_ELE_V4<"u32", 
Int32Regs>;
 def INT_PTX_LDG_G_v4f32_ELE : VLDG_G_ELE_V4<"f32", Float32Regs>;
 
 
-multiclass NG_TO_G {
+multiclass NG_TO_G Preds = []> {
def "" : NVPTXInst<(outs Int32Regs:$result), (ins Int32Regs:$src),
-  "cvta." # Str # ".u32 \t$result, $src;", []>;
+  "cvta." # Str # ".u32 \t$result, $src;", []>, Requires;
+   def _64 : NVPTXInst<(outs Int64Regs:$result), (ins Int64Regs:$src),
+  "cvta." # Str # ".u64 \t$result, $src;", []>, Requires;
+}
+
+multiclass NG_TO_G_64 Preds = []> {
def _64 : NVPTXInst<(outs Int64Regs:$result), (ins Int64Regs:$src),
-  "cvta." # Str # ".u64 \t$result, $src;", []>;
+  "cvta." # Str # ".u64 \t$result, $src;", []>, Requires;
 }

AlexMaclean wrote:

I think it would be cleaner to just add a bit to the `NG_TO_G` class for 
`supports_32`

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-18 Thread Alex MacLean via cfe-commits


@@ -0,0 +1,329 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc < %s -o - -mcpu=sm_90 -mattr=+ptx78 | FileCheck %s
+; RUN: %if ptxas-12.0 %{ llc < %s -mcpu=sm_90 -mattr=+ptx78| %ptxas-verify 
-arch=sm_90 %}
+
+target triple = "nvptx64-nvidia-cuda"
+
+@llvm.used = appending global [5 x ptr] [
+  ptr @test_distributed_shared_cluster_common,

AlexMaclean wrote:

our lit tests generally don't use `@llvm.used`, can you remove this?

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-18 Thread Alex MacLean via cfe-commits


@@ -3019,8 +3019,42 @@ SDValue NVPTXTargetLowering::LowerADDRSPACECAST(SDValue 
Op,
   unsigned SrcAS = N->getSrcAddressSpace();
   unsigned DestAS = N->getDestAddressSpace();
   if (SrcAS != llvm::ADDRESS_SPACE_GENERIC &&
-  DestAS != llvm::ADDRESS_SPACE_GENERIC)
+  DestAS != llvm::ADDRESS_SPACE_GENERIC) {
+// Shared and SharedCluster can be converted to each other through generic
+// space
+if (SrcAS == llvm::ADDRESS_SPACE_SHARED &&

AlexMaclean wrote:

This `if` and the one below look essentially duplicated. Can you fold them 
together?

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-18 Thread Alex MacLean via cfe-commits

https://github.com/AlexMaclean commented:

Getting close to ready, a couple more places to update:
- NVPTXTargetTransformInfo.cpp: evaluateIsSpace
- NVPTXUsage.rst: Address Space section, add intrinsics you're modifying, such 
as `mapa`, to the spec

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-18 Thread Alex MacLean via cfe-commits


@@ -0,0 +1,48 @@
+; RUN: llc -O0 < %s -mtriple=nvptx64 -mcpu=sm_80 | FileCheck %s 
-check-prefixes=ALL,NOPTRCONV,CLS64

AlexMaclean wrote:

Use update_llc_test_checks for this test.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-18 Thread Alex MacLean via cfe-commits

https://github.com/AlexMaclean edited 
https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-18 Thread via cfe-commits


@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
 case ADDRESS_SPACE_SHARED:
   Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
   break;
-case ADDRESS_SPACE_DSHARED:
-  Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 : NVPTX::cvta_dshared;
+case ADDRESS_SPACE_SHARED_CLUSTER:
+  Opc = TM.is64Bit() ? NVPTX::cvta_shared_cluster_64
+ : NVPTX::cvta_shared_cluster;

gonzalobg wrote:

If a program/user requests 32-bit ABI and sm_90, that's an error. I'd expect 
this to be reported either by the frontend (e.g. clang) or when parsing the 
llvm module target data. At that point, NVPTX can just assume that this never 
happens and we can hard fail here to be helpful to other compiler developers, 
since they are the only ones that should be seeing this failure.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-17 Thread via cfe-commits

modiking wrote:

Made the requested changes, @AlexMaclean/@Artem-B PTAL when you get the 
chance--thanks!

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-17 Thread via cfe-commits

github-actions[bot] wrote:




:warning: undef deprecator found issues in your code. :warning:



You can test this locally with the following command:


``bash
git diff -U0 --pickaxe-regex -S 
'([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD 
llvm/test/CodeGen/NVPTX/addrspacecast-ptx64.ll 
llvm/test/CodeGen/NVPTX/distributed-shared-cluster.ll 
clang/lib/Basic/Targets/NVPTX.cpp clang/test/CodeGen/target-data.c 
llvm/include/llvm/Support/NVPTXAddrSpace.h llvm/lib/IR/AutoUpgrade.cpp 
llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXInstPrinter.cpp 
llvm/lib/Target/NVPTX/NVPTX.h llvm/lib/Target/NVPTX/NVPTXAliasAnalysis.cpp 
llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp 
llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp 
llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp 
llvm/lib/Target/NVPTX/NVPTXUtilities.h 
llvm/test/Assembler/auto_upgrade_nvvm_intrinsics.ll 
llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-g2s.ll 
llvm/test/CodeGen/NVPTX/cp-async-bulk.ll llvm/test/CodeGen/NVPTX/nvptx-aa.ll 
mlir/include/mlir/Dialect/LLVMIR/NVVMDialect.h
``




The following files introduce new uses of undef:
 - llvm/test/Assembler/auto_upgrade_nvvm_intrinsics.ll

[Undef](https://llvm.org/docs/LangRef.html#undefined-values) is now deprecated 
and should only be used in the rare cases where no replacement is possible. For 
example, a load of uninitialized memory yields `undef`. You should use `poison` 
values for placeholders instead.

In tests, avoid using `undef` and having tests that trigger undefined behavior. 
If you need an operand with some unimportant value, you can add a new argument 
to the function and use that instead.

For example, this is considered a bad practice:
```llvm
define void @fn() {
  ...
  br i1 undef, ...
}
```

Please use the following instead:
```llvm
define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}
```

Please refer to the [Undefined Behavior 
Manual](https://llvm.org/docs/UndefinedBehavior.html) for more information.



https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-16 Thread Artem Belevich via cfe-commits


@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
 case ADDRESS_SPACE_SHARED:
   Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
   break;
-case ADDRESS_SPACE_DSHARED:
-  Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 : NVPTX::cvta_dshared;
+case ADDRESS_SPACE_SHARED_CLUSTER:
+  Opc = TM.is64Bit() ? NVPTX::cvta_shared_cluster_64
+ : NVPTX::cvta_shared_cluster;

Artem-B wrote:

> I think throwing the error in ISel is a good place to fail.

It's a convenient place for LLVM developer. It's problematic for the end user. 
In general, if we're diagnosing a user error, it must be in a way that's 
actionable for the *user*, not for the compiler developer. Diagnostic by 
compiler crash is not the best UI.

Granted, producing an invalid instruction and relying on ptxas to diagnose it 
is only marginally better.

I think the missing bit of the puzzle here is that we still support 32-bit 
compilation on sm_90+. If we make it impossible to do it in principle, then it 
makes the whole point moot, and we no longer have to bother with 32-bit generic 
pointers for the instructions that are available on newer GPUs only.

This is something to be addressed separately. For this patch, I'm fine with 
proceeding with the assumption that 32-bit compilation never happens.


https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-16 Thread via cfe-commits


@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
 case ADDRESS_SPACE_SHARED:
   Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
   break;
-case ADDRESS_SPACE_DSHARED:
-  Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 : NVPTX::cvta_dshared;
+case ADDRESS_SPACE_SHARED_CLUSTER:
+  Opc = TM.is64Bit() ? NVPTX::cvta_shared_cluster_64
+ : NVPTX::cvta_shared_cluster;

modiking wrote:

> I think the missing bit of the puzzle here is that we still support 32-bit 
> compilation on sm_90+

Yeah that's true, having that be enforced early would resolve this problem. I 
like that as an overall solution to match that in reality there is only 64-bit 
mode available on sm_90+. Also agreed that can be addressed separately.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits

modiking wrote:

> A general thought, Can we include the base changes in this PR and create a 
> separate PR for the intrinsics-migration+MLIR changes?

That makes sense, will do once we're happy with all the changes.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits


@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
 case ADDRESS_SPACE_SHARED:
   Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
   break;
-case ADDRESS_SPACE_DSHARED:
-  Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 : NVPTX::cvta_dshared;
+case ADDRESS_SPACE_SHARED_CLUSTER:
+  Opc = TM.is64Bit() ? NVPTX::cvta_shared_cluster_64
+ : NVPTX::cvta_shared_cluster;

modiking wrote:

> The check here is for TM.is64Bit() not whether shared memory is 64 or 32 
> bits. 

Ah I see. Agreed then that 32-bit `cvta.*shared::cluster` shouldn't exist as a 
valid construct during ISel. I think throwing the error in ISel is a good place 
to fail. ptxas will complain if we give it a 32-bit target machine on sm90+:
```
ptxas fatal   :  32-Bit ABI (--machine 32 or 32-Bit addressing) is not 
supported on sm_90 or higher architectures.
```
But given that a 32-bit version doesn't exist we shouldn't generate one.

We should also update the documentation that `cvta.*shared::cluster` is 64 bit 
only as well:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cvta

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits


@@ -0,0 +1,258 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc < %s -o - -mcpu=sm_90 -march=nvptx64 -mattr=+ptx80 | FileCheck %s
+; RUN: %if ptxas-12.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx80| 
%ptxas-verify -arch=sm_90 %}
+
+target triple = "nvptx64-nvidia-cuda"
+
+@llvm.used = appending global [1 x ptr] [ptr 
@test_distributed_shared_cluster], section "llvm.metadata"
+
+declare ptr addrspace(7) @llvm.nvvm.mapa.shared.cluster(ptr addrspace(3), i32)
+declare i1 @llvm.nvvm.isspacep.shared.cluster(ptr)
+declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
+declare ptr @llvm.nvvm.mapa(ptr, i32)
+
+define i32 @test_distributed_shared_cluster(ptr %ptr, ptr addrspace(3) 
%smem_ptr) local_unnamed_addr {
+; CHECK-LABEL: test_distributed_shared_cluster(
+; CHECK:   {
+; CHECK-NEXT:.reg .pred %p<13>;
+; CHECK-NEXT:.reg .b16 %rs<5>;
+; CHECK-NEXT:.reg .b32 %r<69>;
+; CHECK-NEXT:.reg .f32 %f<2>;
+; CHECK-NEXT:.reg .b64 %rd<24>;
+; CHECK-NEXT:.reg .f64 %fd<2>;
+; CHECK-EMPTY:
+; CHECK-NEXT:  // %bb.0: // %entry
+; CHECK-NEXT:ld.param.u64 %rd2, [test_distributed_shared_cluster_param_0];
+; CHECK-NEXT:ld.param.u64 %rd3, [test_distributed_shared_cluster_param_1];
+; CHECK-NEXT:mov.u32 %r24, %ctaid.x;
+; CHECK-NEXT:xor.b32 %r25, %r24, 1;
+; CHECK-NEXT:isspacep.shared::cluster %p1, %rd2;
+; CHECK-NEXT:mapa.u64 %rd4, %rd2, %r25;
+; CHECK-NEXT:isspacep.shared::cluster %p2, %rd4;
+; CHECK-NEXT:mapa.shared::cluster.u64 %rd5, %rd3, %r25;
+; CHECK-NEXT:mov.b16 %rs1, 0x3C00;
+; CHECK-NEXT:atom.shared::cluster.add.noftz.f16 %rs2, [%rd5], %rs1;
+; CHECK-NEXT:mov.b16 %rs3, 0x3F80;
+; CHECK-NEXT:atom.shared::cluster.add.noftz.bf16 %rs4, [%rd5], %rs3;
+; CHECK-NEXT:atom.shared::cluster.add.f32 %f1, [%rd5], 0f3F80;
+; CHECK-NEXT:atom.shared::cluster.add.f64 %fd1, [%rd5], 0d3FF0;
+; CHECK-NEXT:atom.shared::cluster.add.u32 %r26, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.add.u64 %rd6, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.exch.b32 %r27, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.exch.b64 %rd7, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.min.s32 %r28, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.min.s64 %rd8, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.min.u32 %r29, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.min.u64 %rd9, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.max.s32 %r30, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.max.s64 %rd10, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.max.u32 %r31, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.max.u64 %rd11, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.inc.u32 %r32, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.dec.u32 %r33, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.and.b32 %r34, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.and.b64 %rd12, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.or.b32 %r35, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.or.b64 %rd13, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.xor.b32 %r36, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.xor.b64 %rd14, [%rd5], 1;
+; CHECK-NEXT:atom.relaxed.shared::cluster.cas.b32 %r37, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r38, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r39, [%rd5], 1, 0;
+; CHECK-NEXT:atom.release.shared::cluster.cas.b32 %r40, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acq_rel.shared::cluster.cas.b32 %r41, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acq_rel.shared::cluster.cas.b32 %r42, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r43, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r44, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r45, [%rd5], 1, 0;
+; CHECK-NEXT:atom.relaxed.shared::cluster.cas.b64 %rd15, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd16, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd17, [%rd5], 1, 0;
+; CHECK-NEXT:atom.release.shared::cluster.cas.b64 %rd18, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acq_rel.shared::cluster.cas.b64 %rd19, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acq_rel.shared::cluster.cas.b64 %rd20, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd21, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd22, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd23, [%rd5], 1, 0;
+; CHECK-NEXT:and.b64 %rd1, %rd5, -4;
+; CHECK-NEXT:cvt.u32.u64 %r46, %rd5;
+; CHECK-NEXT:and.b32 %r47, %r46, 3;
+; CHECK-NEXT:shl.b32 %r1, %r47, 3;
+; CHECK-NEXT:mov.b

[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits


@@ -0,0 +1,258 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc < %s -o - -mcpu=sm_90 -march=nvptx64 -mattr=+ptx80 | FileCheck %s
+; RUN: %if ptxas-12.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx80| 
%ptxas-verify -arch=sm_90 %}

modiking wrote:

Changed

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread Artem Belevich via cfe-commits


@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
 case ADDRESS_SPACE_SHARED:
   Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
   break;
-case ADDRESS_SPACE_DSHARED:
-  Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 : NVPTX::cvta_dshared;
+case ADDRESS_SPACE_SHARED_CLUSTER:
+  Opc = TM.is64Bit() ? NVPTX::cvta_shared_cluster_64
+ : NVPTX::cvta_shared_cluster;

Artem-B wrote:

> What matters for cvta is the size of the generic pointers, which I think is 
> always 64, right?

It will depend on the compilation mode. You are correct that we can not produce 
valid code for shared_cluster conversions in 32-bit mode, so the question is -- 
how are we going to fail?

If we unconditionally use `NVPTX::cvta_shared_cluster_64` we'll likely break 
other things because it will hardcode the assumption that generic pointer is 
64-bit, while the rest of the DAG and the lowering machinery would be expecting 
it to be 32 bits. I suspect we'll run into an assertion somewhere. The bottom 
line is that we may still want to go through the motions, and pretend that we 
can convert AS to/from cluster_shared in 32-bit mode , so we can finish the 
compilation, and *then* let ptxas report an error. At least at that point the 
user would be able to examine the PTX and reason about what went wrong. 

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits


@@ -117,13 +117,15 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void 
LLVMInitializeNVPTXTarget() {
 static std::string computeDataLayout(bool is64Bit, bool UseShortPointers) {
   std::string Ret = "e";
 
-  if (!is64Bit)
-Ret += "-p:32:32";
-  else if (UseShortPointers)
-Ret += "-p3:32:32-p4:32:32-p5:32:32";
-
   // Tensor Memory (addrspace:6) is always 32-bits.
-  Ret += "-p6:32:32";
+  // Distributed Shared Memory (addrspace:7) follows shared memory
+  // (addrspace:3).
+  if (!is64Bit)
+Ret += "-p:32:32-p6:32:32-p7:32:32";
+  else if (UseShortPointers) {
+Ret += "-p3:32:32-p4:32:32-p5:32:32-p6:32:32-p7:32:32";
+  } else
+Ret += "-p6:32:32";

modiking wrote:

To keep the address spaces in order `p6:32:32` is also moved to the conditions 
above so the comment applies to every case now.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits


@@ -2034,13 +2038,15 @@ multiclass F_ATOMIC_2_AS, 
preds>;
   defm _S : F_ATOMIC_2, 
preds>;
+  defm _S_C : F_ATOMIC_2, !listconcat([hasSM<80>], preds)>;
   defm _GEN : F_ATOMIC_2, 
preds>;
 }
 
 multiclass F_ATOMIC_3_AS preds = []> {
   defvar frag_pat = (frag node:$a, node:$b, node:$c);
   defm _G : F_ATOMIC_3, preds>;
   defm _S : F_ATOMIC_3, preds>;
+  defm _S_C : F_ATOMIC_3, !listconcat([hasSM<80>], preds)>;

modiking wrote:

Yeah it should be hasSM<90> && hasPTX<78> which is captured by `hasClusters`. 
Changed to `hasClusters`

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits


@@ -2038,15 +2038,15 @@ multiclass F_ATOMIC_2_AS, 
preds>;
   defm _S : F_ATOMIC_2, 
preds>;
-  defm _DS : F_ATOMIC_2, !listconcat([hasSM<80>], preds)>;
+  defm _S_C : F_ATOMIC_2, !listconcat([hasSM<80>], preds)>;

modiking wrote:

Changed to use `hasClusters`

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits


@@ -137,6 +137,7 @@ def hasAtomBitwise64 : 
Predicate<"Subtarget->hasAtomBitwise64()">;
 def hasAtomMinMax64 : Predicate<"Subtarget->hasAtomMinMax64()">;
 def hasVote : Predicate<"Subtarget->hasVote()">;
 def hasDouble : Predicate<"Subtarget->hasDouble()">;
+def hasClusters : Predicate<"Subtarget->hasClusters()">;

modiking wrote:

Good catch, it's meant to unify how we check for support of features:
```
  bool hasClusters() const { return SmVersion >= 90 && PTXVersion >= 78; }
```

Updating the predicate checks in the td files to use this instead.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread Alex MacLean via cfe-commits


@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
 case ADDRESS_SPACE_SHARED:
   Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
   break;
-case ADDRESS_SPACE_DSHARED:
-  Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 : NVPTX::cvta_dshared;
+case ADDRESS_SPACE_SHARED_CLUSTER:
+  Opc = TM.is64Bit() ? NVPTX::cvta_shared_cluster_64
+ : NVPTX::cvta_shared_cluster;

AlexMaclean wrote:

The check here is for `TM.is64Bit()` not whether shared memory is 64 or 32 
bits. What matters for cvta is the size of the generic pointers, which I think 
is always 64, right? If that is the case we'll never want the 32-bit variant. 
Even if there is something I'm missing and there are cases where we would want 
the 32 bit variant, we should still be checking PTX and SM version here and 
throwing an error as needed.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits


@@ -4704,6 +4754,43 @@ void llvm::UpgradeIntrinsicCall(CallBase *CI, Function 
*NewFn) {
 CI->eraseFromParent();
 return;
   }
+  case Intrinsic::nvvm_mapa_shared_cluster: {
+// Create a new call with the correct address space.
+NewCall =
+Builder.CreateCall(NewFn, {CI->getArgOperand(0), 
CI->getArgOperand(1)});
+Value *Res = NewCall;
+Res = Builder.CreateAddrSpaceCast(
+Res, Builder.getPtrTy(NVPTXAS::ADDRESS_SPACE_GENERIC));
+Res = Builder.CreateAddrSpaceCast(
+Res, Builder.getPtrTy(NVPTXAS::ADDRESS_SPACE_SHARED));

modiking wrote:

Not in this case. `nvvm_mapa_shared_cluster` when upgraded returns a 
`shared_cluster` pointer but the original IR expects a `shared` pointer so the 
casts are done accordingly

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread via cfe-commits


@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
 case ADDRESS_SPACE_SHARED:
   Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
   break;
-case ADDRESS_SPACE_DSHARED:
-  Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 : NVPTX::cvta_dshared;
+case ADDRESS_SPACE_SHARED_CLUSTER:
+  Opc = TM.is64Bit() ? NVPTX::cvta_shared_cluster_64
+ : NVPTX::cvta_shared_cluster;

modiking wrote:

Yeah 32-bit shared memory pointers are still supported still as an option even 
in 64-bit compilation.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits


@@ -117,13 +117,15 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void 
LLVMInitializeNVPTXTarget() {
 static std::string computeDataLayout(bool is64Bit, bool UseShortPointers) {
   std::string Ret = "e";
 
-  if (!is64Bit)
-Ret += "-p:32:32";
-  else if (UseShortPointers)
-Ret += "-p3:32:32-p4:32:32-p5:32:32";
-
   // Tensor Memory (addrspace:6) is always 32-bits.
-  Ret += "-p6:32:32";
+  // Distributed Shared Memory (addrspace:7) follows shared memory
+  // (addrspace:3).
+  if (!is64Bit)
+Ret += "-p:32:32-p6:32:32-p7:32:32";
+  else if (UseShortPointers) {
+Ret += "-p3:32:32-p4:32:32-p5:32:32-p6:32:32-p7:32:32";
+  } else
+Ret += "-p6:32:32";

durga4github wrote:

We need the comment from line 120 here. Please retain it as is.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits

durga4github wrote:

A general thought,
Can we include the base changes in this PR and create a separate PR for the 
intrinsics-migration+MLIR changes?

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits


@@ -4704,6 +4754,43 @@ void llvm::UpgradeIntrinsicCall(CallBase *CI, Function 
*NewFn) {
 CI->eraseFromParent();
 return;
   }
+  case Intrinsic::nvvm_mapa_shared_cluster: {
+// Create a new call with the correct address space.
+NewCall =
+Builder.CreateCall(NewFn, {CI->getArgOperand(0), 
CI->getArgOperand(1)});
+Value *Res = NewCall;
+Res = Builder.CreateAddrSpaceCast(
+Res, Builder.getPtrTy(NVPTXAS::ADDRESS_SPACE_GENERIC));
+Res = Builder.CreateAddrSpaceCast(
+Res, Builder.getPtrTy(NVPTXAS::ADDRESS_SPACE_SHARED));

durga4github wrote:

Should this be shared_cluster?

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits

durga4github wrote:

(Sorry I clicked the wrong button `Close` instead of `Comment`)

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits

https://github.com/durga4github reopened 
https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits

https://github.com/durga4github closed 
https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits


@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
 case ADDRESS_SPACE_SHARED:
   Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
   break;
-case ADDRESS_SPACE_DSHARED:
-  Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 : NVPTX::cvta_dshared;
+case ADDRESS_SPACE_SHARED_CLUSTER:
+  Opc = TM.is64Bit() ? NVPTX::cvta_shared_cluster_64
+ : NVPTX::cvta_shared_cluster;

durga4github wrote:

okay, I believe we still support 32-bit for shared-memory pointers, even in 
sm_90+?

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits


@@ -2034,13 +2038,15 @@ multiclass F_ATOMIC_2_AS, 
preds>;
   defm _S : F_ATOMIC_2, 
preds>;
+  defm _S_C : F_ATOMIC_2, !listconcat([hasSM<80>], preds)>;
   defm _GEN : F_ATOMIC_2, 
preds>;
 }
 
 multiclass F_ATOMIC_3_AS preds = []> {
   defvar frag_pat = (frag node:$a, node:$b, node:$c);
   defm _G : F_ATOMIC_3, preds>;
   defm _S : F_ATOMIC_3, preds>;
+  defm _S_C : F_ATOMIC_3, !listconcat([hasSM<80>], preds)>;

durga4github wrote:

Did you mean hasSM<90> here?
(May be it is right, I am just trying to get it clarified for myself)

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits


@@ -0,0 +1,258 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc < %s -o - -mcpu=sm_90 -march=nvptx64 -mattr=+ptx80 | FileCheck %s
+; RUN: %if ptxas-12.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx80| 
%ptxas-verify -arch=sm_90 %}

durga4github wrote:

and we only need +ptx78 here..

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-14 Thread Durgadoss R via cfe-commits


@@ -43,7 +43,10 @@ enum NVVMMemorySpace {
   /// Tensor memory space identifier.
   /// Tensor memory is available only in arch-accelerated
   /// variants from sm100 onwards.
-  kTensorMemorySpace = 6
+  kTensorMemorySpace = 6,
+  /// Distributed shared memory space identifier.
+  /// Distributed shared memory is available only in sm80+.

durga4github wrote:

We have sm_86, sm_87 etc. that do not support dsmem.
So, let us say "sm 90 onwards" here.

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#memory-hierarchy


https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-12 Thread Alex MacLean via cfe-commits


@@ -137,6 +137,7 @@ def hasAtomBitwise64 : 
Predicate<"Subtarget->hasAtomBitwise64()">;
 def hasAtomMinMax64 : Predicate<"Subtarget->hasAtomMinMax64()">;
 def hasVote : Predicate<"Subtarget->hasVote()">;
 def hasDouble : Predicate<"Subtarget->hasDouble()">;
+def hasClusters : Predicate<"Subtarget->hasClusters()">;

AlexMaclean wrote:

Is this actually getting used anywhere? If not please remove. 

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-12 Thread Alex MacLean via cfe-commits


@@ -2038,15 +2038,15 @@ multiclass F_ATOMIC_2_AS, 
preds>;
   defm _S : F_ATOMIC_2, 
preds>;
-  defm _DS : F_ATOMIC_2, !listconcat([hasSM<80>], preds)>;
+  defm _S_C : F_ATOMIC_2, !listconcat([hasSM<80>], preds)>;

AlexMaclean wrote:

The PTX doc seems to say this is supported only for sm_90 and ptx_78

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-12 Thread Alex MacLean via cfe-commits


@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
 case ADDRESS_SPACE_SHARED:
   Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
   break;
-case ADDRESS_SPACE_DSHARED:
-  Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 : NVPTX::cvta_dshared;
+case ADDRESS_SPACE_SHARED_CLUSTER:
+  Opc = TM.is64Bit() ? NVPTX::cvta_shared_cluster_64
+ : NVPTX::cvta_shared_cluster;

AlexMaclean wrote:

My understanding is that cluster is not supported until sm_90, and that sm_90+ 
do not support 32bit compilation. Is there something I'm missing? If not we 
should never select the 32-bit version here and instead check to ensure we're 
compiling for sm_90+.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-12 Thread Alex MacLean via cfe-commits


@@ -0,0 +1,258 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc < %s -o - -mcpu=sm_90 -march=nvptx64 -mattr=+ptx80 | FileCheck %s
+; RUN: %if ptxas-12.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx80| 
%ptxas-verify -arch=sm_90 %}
+
+target triple = "nvptx64-nvidia-cuda"
+
+@llvm.used = appending global [1 x ptr] [ptr 
@test_distributed_shared_cluster], section "llvm.metadata"
+
+declare ptr addrspace(7) @llvm.nvvm.mapa.shared.cluster(ptr addrspace(3), i32)
+declare i1 @llvm.nvvm.isspacep.shared.cluster(ptr)
+declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
+declare ptr @llvm.nvvm.mapa(ptr, i32)
+
+define i32 @test_distributed_shared_cluster(ptr %ptr, ptr addrspace(3) 
%smem_ptr) local_unnamed_addr {
+; CHECK-LABEL: test_distributed_shared_cluster(
+; CHECK:   {
+; CHECK-NEXT:.reg .pred %p<13>;
+; CHECK-NEXT:.reg .b16 %rs<5>;
+; CHECK-NEXT:.reg .b32 %r<69>;
+; CHECK-NEXT:.reg .f32 %f<2>;
+; CHECK-NEXT:.reg .b64 %rd<24>;
+; CHECK-NEXT:.reg .f64 %fd<2>;
+; CHECK-EMPTY:
+; CHECK-NEXT:  // %bb.0: // %entry
+; CHECK-NEXT:ld.param.u64 %rd2, [test_distributed_shared_cluster_param_0];
+; CHECK-NEXT:ld.param.u64 %rd3, [test_distributed_shared_cluster_param_1];
+; CHECK-NEXT:mov.u32 %r24, %ctaid.x;
+; CHECK-NEXT:xor.b32 %r25, %r24, 1;
+; CHECK-NEXT:isspacep.shared::cluster %p1, %rd2;
+; CHECK-NEXT:mapa.u64 %rd4, %rd2, %r25;
+; CHECK-NEXT:isspacep.shared::cluster %p2, %rd4;
+; CHECK-NEXT:mapa.shared::cluster.u64 %rd5, %rd3, %r25;
+; CHECK-NEXT:mov.b16 %rs1, 0x3C00;
+; CHECK-NEXT:atom.shared::cluster.add.noftz.f16 %rs2, [%rd5], %rs1;
+; CHECK-NEXT:mov.b16 %rs3, 0x3F80;
+; CHECK-NEXT:atom.shared::cluster.add.noftz.bf16 %rs4, [%rd5], %rs3;
+; CHECK-NEXT:atom.shared::cluster.add.f32 %f1, [%rd5], 0f3F80;
+; CHECK-NEXT:atom.shared::cluster.add.f64 %fd1, [%rd5], 0d3FF0;
+; CHECK-NEXT:atom.shared::cluster.add.u32 %r26, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.add.u64 %rd6, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.exch.b32 %r27, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.exch.b64 %rd7, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.min.s32 %r28, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.min.s64 %rd8, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.min.u32 %r29, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.min.u64 %rd9, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.max.s32 %r30, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.max.s64 %rd10, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.max.u32 %r31, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.max.u64 %rd11, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.inc.u32 %r32, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.dec.u32 %r33, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.and.b32 %r34, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.and.b64 %rd12, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.or.b32 %r35, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.or.b64 %rd13, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.xor.b32 %r36, [%rd5], 1;
+; CHECK-NEXT:atom.shared::cluster.xor.b64 %rd14, [%rd5], 1;
+; CHECK-NEXT:atom.relaxed.shared::cluster.cas.b32 %r37, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r38, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r39, [%rd5], 1, 0;
+; CHECK-NEXT:atom.release.shared::cluster.cas.b32 %r40, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acq_rel.shared::cluster.cas.b32 %r41, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acq_rel.shared::cluster.cas.b32 %r42, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r43, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r44, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b32 %r45, [%rd5], 1, 0;
+; CHECK-NEXT:atom.relaxed.shared::cluster.cas.b64 %rd15, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd16, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd17, [%rd5], 1, 0;
+; CHECK-NEXT:atom.release.shared::cluster.cas.b64 %rd18, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acq_rel.shared::cluster.cas.b64 %rd19, [%rd5], 1, 0;
+; CHECK-NEXT:atom.acq_rel.shared::cluster.cas.b64 %rd20, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd21, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd22, [%rd5], 1, 0;
+; CHECK-NEXT:fence.sc.sys;
+; CHECK-NEXT:atom.acquire.shared::cluster.cas.b64 %rd23, [%rd5], 1, 0;
+; CHECK-NEXT:and.b64 %rd1, %rd5, -4;
+; CHECK-NEXT:cvt.u32.u64 %r46, %rd5;
+; CHECK-NEXT:and.b32 %r47, %r46, 3;
+; CHECK-NEXT:shl.b32 %r1, %r47, 3;
+; CHECK-NEXT:mov.b

[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-12 Thread Alex MacLean via cfe-commits

https://github.com/AlexMaclean commented:

Backend changes look reasonable so far. One concern I have with this change is 
that until now we've assumed specific address-spaces are non-overlapping. 
You've addressed some of the places where this assumption is encoded but I 
think there are others you have not. For example, addrspacecasts between 
shared::cta and shared::cluster seem like they should be valid and should be 
expanded via a pair of cvta instructions through generic space, I think this 
would produce an error right now. I also wonder if InferAS or other places have 
made this assumption as well. 

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-12 Thread Alex MacLean via cfe-commits


@@ -0,0 +1,258 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc < %s -o - -mcpu=sm_90 -march=nvptx64 -mattr=+ptx80 | FileCheck %s
+; RUN: %if ptxas-12.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx80| 
%ptxas-verify -arch=sm_90 %}
+
+target triple = "nvptx64-nvidia-cuda"
+
+@llvm.used = appending global [1 x ptr] [ptr 
@test_distributed_shared_cluster], section "llvm.metadata"
+
+declare ptr addrspace(7) @llvm.nvvm.mapa.shared.cluster(ptr addrspace(3), i32)
+declare i1 @llvm.nvvm.isspacep.shared.cluster(ptr)
+declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
+declare ptr @llvm.nvvm.mapa(ptr, i32)
+
+define i32 @test_distributed_shared_cluster(ptr %ptr, ptr addrspace(3) 
%smem_ptr) local_unnamed_addr {

AlexMaclean wrote:

Please break this up into more than one big function. Perhaps group some of the 
atomic cases together but otherwise each instruction/intrinsic should have it's 
own function. This will make the test easier to debug if something starts 
failing at some point.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-12 Thread Alex MacLean via cfe-commits


@@ -0,0 +1,258 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc < %s -o - -mcpu=sm_90 -march=nvptx64 -mattr=+ptx80 | FileCheck %s
+; RUN: %if ptxas-12.0 %{ llc < %s -mtriple=nvptx64 -mcpu=sm_90 -mattr=+ptx80| 
%ptxas-verify -arch=sm_90 %}

AlexMaclean wrote:

nit: the -march and -mtriple can be removed here.

https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-12 Thread Alex MacLean via cfe-commits

https://github.com/AlexMaclean edited 
https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-11 Thread via cfe-commits

https://github.com/modiking edited 
https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-11 Thread via cfe-commits

https://github.com/modiking edited 
https://github.com/llvm/llvm-project/pull/135444
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits