https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/79768
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>;
def : Proc<"sm_62", [SM62, PTX50]>;
def : Proc<"sm_70", [SM70, PTX60]>;
def : Proc<"sm_72", [SM72, PTX61]>;
-def : Proc<"sm_75", [SM75, PTX63]>;
+def : Proc<"sm_75", [SM75, PTX62, PTX63]>;
jhuber6 wrote:
Sh
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79768
>From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sun, 28 Jan 2024 14:57:05 -0600
Subject: [PATCH 1/3] [NVPTX] Add 'activemask' builtin and intrinsic support
Summar
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>;
def : Proc<"sm_62", [SM62, PTX50]>;
def : Proc<"sm_70", [SM70, PTX60]>;
def : Proc<"sm_72", [SM72, PTX61]>;
-def : Proc<"sm_75", [SM75, PTX63]>;
+def : Proc<"sm_75", [SM75, PTX62, PTX63]>;
jhuber6 wrote:
Ok
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>;
def : Proc<"sm_62", [SM62, PTX50]>;
def : Proc<"sm_70", [SM70, PTX60]>;
def : Proc<"sm_72", [SM72, PTX61]>;
-def : Proc<"sm_75", [SM75, PTX63]>;
+def : Proc<"sm_75", [SM75, PTX62, PTX63]>;
Artem-B wrote:
I'
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>;
def : Proc<"sm_62", [SM62, PTX50]>;
def : Proc<"sm_70", [SM70, PTX60]>;
def : Proc<"sm_72", [SM72, PTX61]>;
-def : Proc<"sm_75", [SM75, PTX63]>;
+def : Proc<"sm_75", [SM75, PTX62, PTX63]>;
jhuber6 wrote:
Ok
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>;
def : Proc<"sm_62", [SM62, PTX50]>;
def : Proc<"sm_70", [SM70, PTX60]>;
def : Proc<"sm_72", [SM72, PTX61]>;
-def : Proc<"sm_75", [SM75, PTX63]>;
+def : Proc<"sm_75", [SM75, PTX62, PTX63]>;
Artem-B wrote:
Wh
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>;
def : Proc<"sm_62", [SM62, PTX50]>;
def : Proc<"sm_70", [SM70, PTX60]>;
def : Proc<"sm_72", [SM72, PTX61]>;
-def : Proc<"sm_75", [SM75, PTX63]>;
+def : Proc<"sm_75", [SM75, PTX62, PTX63]>;
jhuber6 wrote:
Ye
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>;
def : Proc<"sm_62", [SM62, PTX50]>;
def : Proc<"sm_70", [SM70, PTX60]>;
def : Proc<"sm_72", [SM72, PTX61]>;
-def : Proc<"sm_75", [SM75, PTX63]>;
+def : Proc<"sm_75", [SM75, PTX62, PTX63]>;
Artem-B wrote:
Wh
@@ -4599,6 +4599,14 @@ def int_nvvm_vote_ballot_sync :
[IntrInaccessibleMemOnly, IntrConvergent, IntrNoCallback],
"llvm.nvvm.vote.ballot.sync">,
ClangBuiltin<"__nvvm_vote_ballot_sync">;
+//
+// ACTIVEMASK
+//
+def int_nvvm_activemask :
+ Intrinsic<[llvm_i32_ty]
@@ -4599,6 +4599,14 @@ def int_nvvm_vote_ballot_sync :
[IntrInaccessibleMemOnly, IntrConvergent, IntrNoCallback],
"llvm.nvvm.vote.ballot.sync">,
ClangBuiltin<"__nvvm_vote_ballot_sync">;
+//
+// ACTIVEMASK
+//
+def int_nvvm_activemask :
+ Intrinsic<[llvm_i32_ty]
jhuber6 wrote:
Added side effects attribute, I believe this matches the current behavior of
the inline asm better.
https://github.com/llvm/llvm-project/pull/79768
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/m
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79768
>From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sun, 28 Jan 2024 14:57:05 -0600
Subject: [PATCH 1/2] [NVPTX] Add 'activemask' builtin and intrinsic support
Summar
jhuber6 wrote:
> https://bugs.llvm.org/show_bug.cgi?id=35249
Yeah, there's constant issues with convergence analysis. I included one of the
tests to try to show that it won't merge with the covergent attribute. Since
this is a general issue for all of these things. In the past I usually add
i
Artem-B wrote:
'activemask' is a rather peculiar instruction which may not be a good candidate
for exposing it to LLVM.
The problem is that it can 'observe' the past branch decisions and reflects the
state of not-yet-reconverged conditional branches. LLVM does not take it into
account. Opaque
llvmbot wrote:
@llvm/pr-subscribers-clang
Author: Joseph Huber (jhuber6)
Changes
Summary:
This patch adds support for getting the 'activemask' instruction's value
without needing to use inline assembly. See the relevant PTX reference
for details.
https://docs.nvidia.com/cuda/parallel-thre
16 matches
Mail list logo