[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -119,23 +119,17 @@ define void @msubpt1(i32 %index, i32 %elem) {
; CHECK-CPA-O0: // %bb.0: // %entry
; CHECK-CPA-O0-NEXT:// implicit-def: $x8
; CHECK-CPA-O0-NEXT:mov w8, w0
-; CHECK-CPA-O0-NEXT:sxtw x9, w8
-; CHECK-CPA-O0-NEXT:mov x8, xzr
-; CHECK-CPA-O0-NEXT:subs x8, x8, x9
-; CHECK-CPA-O0-NEXT:lsl x8, x8, #1
-; CHECK-CPA-O0-NEXT:subs x10, x8, x9
+; CHECK-CPA-O0-NEXT:sxtw x8, w8
ritter-x2a wrote:
Alright, thanks!
https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -119,23 +119,17 @@ define void @msubpt1(i32 %index, i32 %elem) {
; CHECK-CPA-O0: // %bb.0: // %entry
; CHECK-CPA-O0-NEXT:// implicit-def: $x8
; CHECK-CPA-O0-NEXT:mov w8, w0
-; CHECK-CPA-O0-NEXT:sxtw x9, w8
-; CHECK-CPA-O0-NEXT:mov x8, xzr
-; CHECK-CPA-O0-NEXT:subs x8, x8, x9
-; CHECK-CPA-O0-NEXT:lsl x8, x8, #1
-; CHECK-CPA-O0-NEXT:subs x10, x8, x9
+; CHECK-CPA-O0-NEXT:sxtw x8, w8
rgwott wrote:
It looks good, although I am not a human compiler =D
To me the giveaway is that some more complex chains of addpts are simplified,
and some subpts are substituted by addpts, which is something I have seen
during development.
I think it's fine, go ahead.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -119,23 +119,17 @@ define void @msubpt1(i32 %index, i32 %elem) {
; CHECK-CPA-O0: // %bb.0: // %entry
; CHECK-CPA-O0-NEXT:// implicit-def: $x8
; CHECK-CPA-O0-NEXT:mov w8, w0
-; CHECK-CPA-O0-NEXT:sxtw x9, w8
-; CHECK-CPA-O0-NEXT:mov x8, xzr
-; CHECK-CPA-O0-NEXT:subs x8, x8, x9
-; CHECK-CPA-O0-NEXT:lsl x8, x8, #1
-; CHECK-CPA-O0-NEXT:subs x10, x8, x9
+; CHECK-CPA-O0-NEXT:sxtw x8, w8
ritter-x2a wrote:
With #105669 now landed, this new AArch64 CPA test is also affected by this PR.
The changes to this test with the new combines seem reasonable to me at a
glance. @rgwott, could you please have a look and check if they make sense to
you?
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a edited https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a updated
https://github.com/llvm/llvm-project/pull/142739
>From f8aa9d23e4a70e3155d19b0ad0e209b47866aab9 Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH 1/7] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 92 -
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 49 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 +
.../AMDGPU/ptradd-sdag-optimizations.ll | 194 ++
4 files changed, 201 insertions(+), 135 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 56a5643e13442..b203431232035 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -421,6 +421,7 @@ namespace {
SDValue visitADDLike(SDNode *N);
SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
SDNode *LocReference);
+SDValue visitPTRADD(SDNode *N);
SDValue visitSUB(SDNode *N);
SDValue visitADDSAT(SDNode *N);
SDValue visitSUBSAT(SDNode *N);
@@ -1140,7 +1141,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
return true;
}
- if (Opc != ISD::ADD)
+ if (Opc != ISD::ADD && Opc != ISD::PTRADD)
return false;
auto *C2 = dyn_cast<ConstantSDNode>(N1);
@@ -1860,6 +1861,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::TokenFactor:return visitTokenFactor(N);
case ISD::MERGE_VALUES: return visitMERGE_VALUES(N);
case ISD::ADD:return visitADD(N);
+ case ISD::PTRADD: return visitPTRADD(N);
case ISD::SUB:return visitSUB(N);
case ISD::SADDSAT:
case ISD::UADDSAT:return visitADDSAT(N);
@@ -2630,6 +2632,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+
+// TODO: There is another possible fold here that was proven useful.
+// It would be this:
+//
+// (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+// * (ptradd x, y) has one use; and
+// * y is a constant; and
+// * z is not a constant.
+//
+// In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+// opportunity to select more complex instructions such as SUBPT and
+// MSUBPT.
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2628,6 +2630,87 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0) && PtrVT == IntVT)
+return N1;
ritter-x2a wrote:
@arsenm, do you think we should remove the redundant `PtrVT == IntVT` part here
again? That would be fine by me, and @arichardson already said above that he
doesn't feel strongly about this and is happy with either.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -14944,6 +14945,51 @@ SDValue SITargetLowering::performAddCombine(SDNode *N,
return SDValue();
}
+SDValue SITargetLowering::performPtrAddCombine(SDNode *N,
+ DAGCombinerInfo &DCI) const {
+ SelectionDAG &DAG = DCI.DAG;
+ SDLoc DL(N);
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+
+ if (N1.getOpcode() == ISD::ADD) {
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, y), z) if z is a constant,
+//y is not, and (add y, z) is used only once.
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, z), y) if y is a constant,
+//z is not, and (add y, z) is used only once.
+// The goal is to move constant offsets to the outermost ptradd, to create
+// more opportunities to fold offsets into memory instructions.
+// Together with the generic combines in DAGCombiner.cpp, this also
+// implements (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y)).
+//
+// This transform is here instead of in the general DAGCombiner as it can
+// turn in-bounds pointer arithmetic out-of-bounds, which is problematic for
+// AArch64's CPA.
+SDValue X = N0;
+SDValue Y = N1.getOperand(0);
+SDValue Z = N1.getOperand(1);
+bool N1OneUse = N1.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+if ((ZIsConstant != YIsConstant) && N1OneUse) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N1->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
ritter-x2a wrote:
Done (here and also for similar code in DAGCombiner.cpp).
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -14944,6 +14945,51 @@ SDValue SITargetLowering::performAddCombine(SDNode *N,
return SDValue();
}
+SDValue SITargetLowering::performPtrAddCombine(SDNode *N,
+ DAGCombinerInfo &DCI) const {
+ SelectionDAG &DAG = DCI.DAG;
+ SDLoc DL(N);
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+
+ if (N1.getOpcode() == ISD::ADD) {
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, y), z) if z is a constant,
+//y is not, and (add y, z) is used only once.
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, z), y) if y is a constant,
+//z is not, and (add y, z) is used only once.
+// The goal is to move constant offsets to the outermost ptradd, to create
+// more opportunities to fold offsets into memory instructions.
+// Together with the generic combines in DAGCombiner.cpp, this also
+// implements (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y)).
+//
+// This transform is here instead of in the general DAGCombiner as it can
+// turn in-bounds pointer arithmetic out-of-bounds, which is problematic for
+// AArch64's CPA.
+SDValue X = N0;
+SDValue Y = N1.getOperand(0);
+SDValue Z = N1.getOperand(1);
+bool N1OneUse = N1.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+if ((ZIsConstant != YIsConstant) && N1OneUse) {
ritter-x2a wrote:
Done.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2628,6 +2630,87 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0) && PtrVT == IntVT)
+return N1;
ritter-x2a wrote:
I've applied the suggested change for now.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a updated
https://github.com/llvm/llvm-project/pull/142739
>From 6ea714e83e4714d9fe025e5e9fee48b41f223cb8 Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH 1/6] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 92 -
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 49 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 +
.../AMDGPU/ptradd-sdag-optimizations.ll | 194 ++
4 files changed, 201 insertions(+), 135 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 5d62ded171f4f..505cb264ae948 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -419,6 +419,7 @@ namespace {
SDValue visitADDLike(SDNode *N);
SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
SDNode *LocReference);
+SDValue visitPTRADD(SDNode *N);
SDValue visitSUB(SDNode *N);
SDValue visitADDSAT(SDNode *N);
SDValue visitSUBSAT(SDNode *N);
@@ -1138,7 +1139,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
return true;
}
- if (Opc != ISD::ADD)
+ if (Opc != ISD::ADD && Opc != ISD::PTRADD)
return false;
auto *C2 = dyn_cast<ConstantSDNode>(N1);
@@ -1858,6 +1859,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::TokenFactor:return visitTokenFactor(N);
case ISD::MERGE_VALUES: return visitMERGE_VALUES(N);
case ISD::ADD:return visitADD(N);
+ case ISD::PTRADD: return visitPTRADD(N);
case ISD::SUB:return visitSUB(N);
case ISD::SADDSAT:
case ISD::UADDSAT:return visitADDSAT(N);
@@ -2628,6 +2630,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+
+// TODO: There is another possible fold here that was proven useful.
+// It would be this:
+//
+// (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+// * (ptradd x, y) has one use; and
+// * y is a constant; and
+// * z is not a constant.
+//
+// In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+// opportunity to select more complex instructions such as SUBPT and
+// MSUBPT.
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2628,6 +2630,87 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0) && PtrVT == IntVT)
+return N1;
ritter-x2a wrote:
Yes, @arichardson argued for adding it redundantly here in [a thread
below](https://app.graphite.dev/github/pr/llvm/llvm-project/142739/%5BAMDGPU%5D%5BSDAG%5D-Add-ISD-PTRADD-DAG-combines#comment-PRRC_kwDOBITxeM5_IBA7),
to highlight this constraint more explicitly for when the assertion might be
relaxed in the future.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a updated
https://github.com/llvm/llvm-project/pull/142739
>From c7aea91592109a305725bfd6cf0d60f939c5433e Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH 1/5] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 92 -
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 49 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 +
.../AMDGPU/ptradd-sdag-optimizations.ll | 194 ++
4 files changed, 201 insertions(+), 135 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 5d62ded171f4f..505cb264ae948 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -419,6 +419,7 @@ namespace {
SDValue visitADDLike(SDNode *N);
SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
SDNode *LocReference);
+SDValue visitPTRADD(SDNode *N);
SDValue visitSUB(SDNode *N);
SDValue visitADDSAT(SDNode *N);
SDValue visitSUBSAT(SDNode *N);
@@ -1138,7 +1139,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
return true;
}
- if (Opc != ISD::ADD)
+ if (Opc != ISD::ADD && Opc != ISD::PTRADD)
return false;
auto *C2 = dyn_cast<ConstantSDNode>(N1);
@@ -1858,6 +1859,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::TokenFactor:return visitTokenFactor(N);
case ISD::MERGE_VALUES: return visitMERGE_VALUES(N);
case ISD::ADD:return visitADD(N);
+ case ISD::PTRADD: return visitPTRADD(N);
case ISD::SUB:return visitSUB(N);
case ISD::SADDSAT:
case ISD::UADDSAT:return visitADDSAT(N);
@@ -2628,6 +2630,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+
+// TODO: There is another possible fold here that was proven useful.
+// It would be this:
+//
+// (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+// * (ptradd x, y) has one use; and
+// * y is a constant; and
+// * z is not a constant.
+//
+// In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+// opportunity to select more complex instructions such as SUBPT and
+// MSUBPT.
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a updated
https://github.com/llvm/llvm-project/pull/142739
>From c7aea91592109a305725bfd6cf0d60f939c5433e Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH 1/5] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 92 -
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 49 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 +
.../AMDGPU/ptradd-sdag-optimizations.ll | 194 ++
4 files changed, 201 insertions(+), 135 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 5d62ded171f4f..505cb264ae948 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -419,6 +419,7 @@ namespace {
SDValue visitADDLike(SDNode *N);
SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
SDNode *LocReference);
+SDValue visitPTRADD(SDNode *N);
SDValue visitSUB(SDNode *N);
SDValue visitADDSAT(SDNode *N);
SDValue visitSUBSAT(SDNode *N);
@@ -1138,7 +1139,7 @@ bool
DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
return true;
}
- if (Opc != ISD::ADD)
+ if (Opc != ISD::ADD && Opc != ISD::PTRADD)
return false;
auto *C2 = dyn_cast(N1);
@@ -1858,6 +1859,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::TokenFactor:return visitTokenFactor(N);
case ISD::MERGE_VALUES: return visitMERGE_VALUES(N);
case ISD::ADD:return visitADD(N);
+ case ISD::PTRADD: return visitPTRADD(N);
case ISD::SUB:return visitSUB(N);
case ISD::SADDSAT:
case ISD::UADDSAT:return visitADDSAT(N);
@@ -2628,6 +2630,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+
+// TODO: There is another possible fold here that was proven useful.
+// It would be this:
+//
+// (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+// * (ptradd x, y) has one use; and
+// * y is a constant; and
+// * z is not a constant.
+//
+// In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+// opportunity to select more complex instructions such as SUBPT and
+// MSUBPT. H
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a updated
https://github.com/llvm/llvm-project/pull/142739
>From 3002da1befde734af1904d3424abd72b65f1377b Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH 1/5] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 92 -
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 49 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 +
.../AMDGPU/ptradd-sdag-optimizations.ll | 194 ++
4 files changed, 201 insertions(+), 135 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 5d62ded171f4f..505cb264ae948 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -419,6 +419,7 @@ namespace {
SDValue visitADDLike(SDNode *N);
SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
SDNode *LocReference);
+SDValue visitPTRADD(SDNode *N);
SDValue visitSUB(SDNode *N);
SDValue visitADDSAT(SDNode *N);
SDValue visitSUBSAT(SDNode *N);
@@ -1138,7 +1139,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
return true;
}
- if (Opc != ISD::ADD)
+ if (Opc != ISD::ADD && Opc != ISD::PTRADD)
return false;
auto *C2 = dyn_cast<ConstantSDNode>(N1);
@@ -1858,6 +1859,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::TokenFactor:return visitTokenFactor(N);
case ISD::MERGE_VALUES: return visitMERGE_VALUES(N);
case ISD::ADD:return visitADD(N);
+ case ISD::PTRADD: return visitPTRADD(N);
case ISD::SUB:return visitSUB(N);
case ISD::SADDSAT:
case ISD::UADDSAT:return visitADDSAT(N);
@@ -2628,6 +2630,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+
+// TODO: There is another possible fold here that was proven useful.
+// It would be this:
+//
+// (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+// * (ptradd x, y) has one use; and
+// * y is a constant; and
+// * z is not a constant.
+//
+// In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+// opportunity to select more complex instructions such as SUBPT and
+// MSUBPT. H
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -14944,6 +14945,51 @@ SDValue SITargetLowering::performAddCombine(SDNode *N,
return SDValue();
}
+SDValue SITargetLowering::performPtrAddCombine(SDNode *N,
+ DAGCombinerInfo &DCI) const {
+ SelectionDAG &DAG = DCI.DAG;
+ SDLoc DL(N);
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+
+ if (N1.getOpcode() == ISD::ADD) {
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, y), z) if z is a constant,
+//y is not, and (add y, z) is used only once.
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, z), y) if y is a constant,
+//z is not, and (add y, z) is used only once.
+// The goal is to move constant offsets to the outermost ptradd, to create
+// more opportunities to fold offsets into memory instructions.
+// Together with the generic combines in DAGCombiner.cpp, this also
+// implements (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y)).
+//
+// This transform is here instead of in the general DAGCombiner as it can
+// turn in-bounds pointer arithmetic out-of-bounds, which is problematic for
+// AArch64's CPA.
+SDValue X = N0;
+SDValue Y = N1.getOperand(0);
+SDValue Z = N1.getOperand(1);
+bool N1OneUse = N1.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+if ((ZIsConstant != YIsConstant) && N1OneUse) {
arsenm wrote:
Avoid the DAG.isConstantIntBuildVectorOrConstantInt in the !N1OneUse case?
https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -14944,6 +14945,51 @@ SDValue SITargetLowering::performAddCombine(SDNode *N,
return SDValue();
}
+SDValue SITargetLowering::performPtrAddCombine(SDNode *N,
+ DAGCombinerInfo &DCI) const {
+ SelectionDAG &DAG = DCI.DAG;
+ SDLoc DL(N);
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+
+ if (N1.getOpcode() == ISD::ADD) {
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, y), z) if z is a constant,
+//y is not, and (add y, z) is used only once.
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, z), y) if y is a constant,
+//z is not, and (add y, z) is used only once.
+// The goal is to move constant offsets to the outermost ptradd, to create
+// more opportunities to fold offsets into memory instructions.
+// Together with the generic combines in DAGCombiner.cpp, this also
+// implements (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y)).
+//
+// This transform is here instead of in the general DAGCombiner as it can
+// turn in-bounds pointer arithmetic out-of-bounds, which is problematic for
+// AArch64's CPA.
+SDValue X = N0;
+SDValue Y = N1.getOperand(0);
+SDValue Z = N1.getOperand(1);
+bool N1OneUse = N1.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+if ((ZIsConstant != YIsConstant) && N1OneUse) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N1->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
arsenm wrote:
Can you do SDNodeFlags = (N->getFlags() & N1->getFlags()) &
SDNodeFlags::NoUnsignedWrap?
https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2628,6 +2630,87 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0) && PtrVT == IntVT)
+return N1;
arsenm wrote:
```suggestion
// fold (ptradd 0, x) -> x
if (PtrVT == IntVT && isNullConstant(N0))
return N1;
```
But PtrVT == IntVT was already asserted above?
https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a updated
https://github.com/llvm/llvm-project/pull/142739
>From 743ecdf0cf69d300859d6817fa4a9c48218aa9e5 Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH 1/5] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 92 -
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 49 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 +
.../AMDGPU/ptradd-sdag-optimizations.ll | 194 ++
4 files changed, 201 insertions(+), 135 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 5d62ded171f4f..505cb264ae948 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -419,6 +419,7 @@ namespace {
SDValue visitADDLike(SDNode *N);
SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
SDNode *LocReference);
+SDValue visitPTRADD(SDNode *N);
SDValue visitSUB(SDNode *N);
SDValue visitADDSAT(SDNode *N);
SDValue visitSUBSAT(SDNode *N);
@@ -1138,7 +1139,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
return true;
}
- if (Opc != ISD::ADD)
+ if (Opc != ISD::ADD && Opc != ISD::PTRADD)
return false;
auto *C2 = dyn_cast<ConstantSDNode>(N1);
@@ -1858,6 +1859,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::TokenFactor:return visitTokenFactor(N);
case ISD::MERGE_VALUES: return visitMERGE_VALUES(N);
case ISD::ADD:return visitADD(N);
+ case ISD::PTRADD: return visitPTRADD(N);
case ISD::SUB:return visitSUB(N);
case ISD::SADDSAT:
case ISD::UADDSAT:return visitADDSAT(N);
@@ -2628,6 +2630,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+
+// TODO: There is another possible fold here that was proven useful.
+// It would be this:
+//
+// (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+// * (ptradd x, y) has one use; and
+// * y is a constant; and
+// * z is not a constant.
+//
+// In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+// opportunity to select more complex instructions such as SUBPT and
+// MSUBPT. H
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
github-actions[bot] wrote:
:warning: undef deprecator found issues in your code. :warning:
You can test this locally with the following command:
```bash
git diff -U0 --pickaxe-regex -S
'([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD
llvm/test/CodeGen/AMDGPU/ptradd-sdag-undef-poison.ll
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.h
llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
```
The following files introduce new uses of undef:
- llvm/test/CodeGen/AMDGPU/ptradd-sdag-undef-poison.ll
[Undef](https://llvm.org/docs/LangRef.html#undefined-values) is now deprecated
and should only be used in the rare cases where no replacement is possible. For
example, a load of uninitialized memory yields `undef`. You should use `poison`
values for placeholders instead.
In tests, avoid using `undef` and having tests that trigger undefined behavior.
If you need an operand with some unimportant value, you can add a new argument
to the function and use that instead.
For example, this is considered a bad practice:
```llvm
define void @fn() {
...
br i1 undef, ...
}
```
Please use the following instead:
```llvm
define void @fn(i1 %cond) {
...
br i1 %cond, ...
}
```
Please refer to the [Undefined Behavior
Manual](https://llvm.org/docs/UndefinedBehavior.html) for more information.
https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
ritter-x2a wrote:
I've added some in the `ptradd-sdag-undef-poison.ll` test. Can't have them diff
nicely like the other tests because the folds are already enabled on trunk, but
I did verify that additions are generated if the folds are disabled again.
We obviously need to ignore the undef deprecator github action for this test
(or should I limit it to poison?).
https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a updated
https://github.com/llvm/llvm-project/pull/142739
>From 30bb17d1019285e8937d1c640023c33d5e7c32cf Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH 1/5] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 92 -
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 49 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 +
.../AMDGPU/ptradd-sdag-optimizations.ll | 194 ++
4 files changed, 201 insertions(+), 135 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index aba3c0f80a024..e57e8eb8799e2 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -419,6 +419,7 @@ namespace {
SDValue visitADDLike(SDNode *N);
SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
SDNode *LocReference);
+SDValue visitPTRADD(SDNode *N);
SDValue visitSUB(SDNode *N);
SDValue visitADDSAT(SDNode *N);
SDValue visitSUBSAT(SDNode *N);
@@ -1138,7 +1139,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
return true;
}
- if (Opc != ISD::ADD)
+ if (Opc != ISD::ADD && Opc != ISD::PTRADD)
return false;
auto *C2 = dyn_cast<ConstantSDNode>(N1);
@@ -1858,6 +1859,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::TokenFactor:return visitTokenFactor(N);
case ISD::MERGE_VALUES: return visitMERGE_VALUES(N);
case ISD::ADD:return visitADD(N);
+ case ISD::PTRADD: return visitPTRADD(N);
case ISD::SUB:return visitSUB(N);
case ISD::SADDSAT:
case ISD::UADDSAT:return visitADDSAT(N);
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+
+// TODO: There is another possible fold here that was proven useful.
+// It would be this:
+//
+// (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+// * (ptradd x, y) has one use; and
+// * y is a constant; and
+// * z is not a constant.
+//
+// In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+// opportunity to select more complex instructions such as SUBPT and
+// MSUBPT.
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
ritter-x2a wrote:
I've now added the explicit condition for this fold.
https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
arichardson wrote:
We already know this condition is needed, so I'd prefer adding it here so we
don't need to make this change once we relax the existing assertions. But I
don't feel strongly about this, so I'm happy with either.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
arichardson wrote:
Nice, I was not aware of those folds. Do we already have tests for them? If
not, it would be nice to add one here (but don't consider this a blocker).
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
ritter-x2a wrote:
Indeed, I'll do that if we don't land on moving the target-specific combine
below this one in [the other
thread](https://app.graphite.dev/github/pr/llvm/llvm-project/142739/%5BAMDGPU%5D%5BSDAG%5D-Add-ISD-PTRADD-DAG-combines#comment-PRRC_kwDOBITxeM5-x5pQ).
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1))
{
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+
+// TODO: There is another possible fold here that was proven useful.
+// It would be this:
+//
+// (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+// * (ptradd x, y) has one use; and
+// * y is a constant; and
+// * z is not a constant.
+//
+// In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+// opportunity to select more complex instructions such as SUBPT and
+// MSUBPT. However, a hypothetical corner case has been found that we could
+// not avoid. Consider this (pseudo-POSIX C):
+//
+// char *foo(char *x, int z) {return (x + LARGE_CONSTANT) + z;}
+// char *p = mmap(LARGE_CONSTANT);
+// char *q = foo(p, -LARGE_CONSTANT);
+//
+// Then x + LARGE_CONSTANT is one-past-the-end, so valid, and a
+// further + z takes it back to the start of the mapping, so valid,
+// regardless of the address mmap gave back. However, if mmap gives you an
+// address < LARGE_CONSTANT (ignoring high bits), x - LARGE_CONSTANT will
+// borrow from the high bits (with the subsequent + z carrying back into
+// the high bits to give you a well-defined pointer) and thus trip
+// FEAT_CPA's pointer corruption checks.
+//
+// We leave this fold as an opportunity for future work, addressing the
+// corner case for FEAT_CPA, as well as reconciling the solution with the
+// more general application of pointer arithmetic in other future targets.
ritter-x2a wrote:
My vague idea of handling this properly in the future would be to
- have an `inbounds` flag on `PTRADD` nodes (see #131862),
- have backends generate instructions that break for out-of-bounds arithmetic
only when the `inbounds` flag is present, and
- give targets an option to request that transformations that would generate
`PTRADD`s without the `inbounds` flag are not applied.
I think that would give things like the CPA implementation a better semantic
footing, since otherwise they would just be miscompiling the IR's
`getelementptr`s without `inbounds` flags. However, at least the last point
above is currently not on my critical path, so I'm open to adding the comment
here or moving the other transform here.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1))
{
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
ritter-x2a wrote:
So that `y + z` can be folded into a single constant, which might be folded as
an immediate offset into a memory instruction. `SeparateConstOffsetFromGEP`
should do that for AMDGPU already in many cases when it's beneficial, but
- I don't think that every backend uses `SeparateConstOffsetFromGEP`, so it can
be worthwhile to have anyway,
- There are cases where these are introduced after `SeparateConstOffsetFromGEP`
runs, for example when a wide vector load/store with an offset is legalized to
several loads/stores with nested offsets (see `store_v16i32` in
`ptradd-sdag-optimizations.ll`); with this reassociation we get the code that we
would get with the old non-PTRADD code path, and
- while it's possible that this could lead to worse code, the
`reassociationCanBreakAddressingModePattern` check above _should_ avoid these
cases (I'm not 100% convinced the logic in there is sound, but that seems like a
different problem).
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
ritter-x2a wrote:
There is an `assert(PtrVT == IntVT)` above and a similar assert in
`SelectionDAG::getNode()` that rules that out (the reasoning for adding the
assert in `getNode` was
[here](https://github.com/llvm/llvm-project/pull/140017#issuecomment-2893168440)).
We can add this condition here as well to emphasize it even more, but to make
the combines truly safe against problems when pointer and index type mismatches
are allowed, we'd also need to handle, e.g., cases where the types of `Y` and
`Z` in the reassociation below don't match (and there are probably more cases
where explicit handling would be required).
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
arichardson wrote:
yeah this needs a
```suggestion
if (isNullConstant(N0) && PtrVT == IntVT)
```
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
jrtc27 wrote:
> isNullConstant(X), since there are address spaces where 0 is a perfectly
> normal value that shouldn't be treated specially,
I don't know if it's important for CHERI to have this or if the IR-level
optimisations render it not so needed. But `NULL + int` is how we represent an
integer as a pointer, so `NULL + x + y` is something that can legitimately turn
up, and we want to be able to fold the x and y together as just integer
arithmetic, only converting to a capability at the very end when needed.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
jrtc27 wrote:
Only if they're the same type. This isn't valid for CHERI: the LHS is a
capability and the RHS is an integer. Nor is it valid for architectures where
address size != index size.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -14935,6 +14936,52 @@ SDValue SITargetLowering::performAddCombine(SDNode *N,
return SDValue();
}
+SDValue SITargetLowering::performPtrAddCombine(SDNode *N,
+ DAGCombinerInfo &DCI) const {
+ SelectionDAG &DAG = DCI.DAG;
+ EVT VT = N->getValueType(0);
+ SDLoc DL(N);
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+
+ if (N1.getOpcode() == ISD::ADD) {
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, y), z) if z is a constant,
+//y is not, and (add y, z) is used only once.
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, z), y) if y is a constant,
+//z is not, and (add y, z) is used only once.
+// The goal is to move constant offsets to the outermost ptradd, to create
+// more opportunities to fold offsets into memory instructions.
+// Together with the generic combines in DAGCombiner.cpp, this also
+// implements (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y)).
+//
+// This transform is here instead of in the general DAGCombiner as it can
+// turn in-bounds pointer arithmetic out-of-bounds, which is problematic for
+// AArch64's CPA.
arichardson wrote:
I think having it here is fine, it explains why it can't be moved to a generic
fold.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
shiltian wrote:
I think you can bail out early here to avoid indenting the giant code block.
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -14935,6 +14936,52 @@ SDValue SITargetLowering::performAddCombine(SDNode *N,
return SDValue();
}
+SDValue SITargetLowering::performPtrAddCombine(SDNode *N,
+ DAGCombinerInfo &DCI) const {
+ SelectionDAG &DAG = DCI.DAG;
+ EVT VT = N->getValueType(0);
+ SDLoc DL(N);
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+
+ if (N1.getOpcode() == ISD::ADD) {
shiltian wrote:
bail out early?
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/shiltian edited
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1))
{
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
shiltian wrote:
For my own education: why is that?
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -14935,6 +14936,52 @@ SDValue SITargetLowering::performAddCombine(SDNode *N,
return SDValue();
}
+SDValue SITargetLowering::performPtrAddCombine(SDNode *N,
+ DAGCombinerInfo &DCI) const {
+ SelectionDAG &DAG = DCI.DAG;
+ EVT VT = N->getValueType(0);
+ SDLoc DL(N);
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+
+ if (N1.getOpcode() == ISD::ADD) {
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, y), z) if z is a constant,
+//y is not, and (add y, z) is used only once.
+// (ptradd x, (add y, z)) -> (ptradd (ptradd x, z), y) if y is a constant,
+//z is not, and (add y, z) is used only once.
+// The goal is to move constant offsets to the outermost ptradd, to create
+// more opportunities to fold offsets into memory instructions.
+// Together with the generic combines in DAGCombiner.cpp, this also
+// implements (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y)).
+//
+// This transform is here instead of in the general DAGCombiner as it can
+// turn in-bounds pointer arithmetic out-of-bounds, which is problematic for
+// AArch64's CPA.
shiltian wrote:
remove the aarch64 part?
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
shiltian wrote:
how about poison?
https://github.com/llvm/llvm-project/pull/142739
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a updated
https://github.com/llvm/llvm-project/pull/142739
>From efdaf03ba25edfd254a3b9bc79470ed861e123c1 Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 92 -
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 49 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 +
.../AMDGPU/ptradd-sdag-optimizations.ll | 193 ++
4 files changed, 201 insertions(+), 134 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index aba3c0f80a024..e57e8eb8799e2 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -419,6 +419,7 @@ namespace {
SDValue visitADDLike(SDNode *N);
SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
SDNode *LocReference);
+SDValue visitPTRADD(SDNode *N);
SDValue visitSUB(SDNode *N);
SDValue visitADDSAT(SDNode *N);
SDValue visitSUBSAT(SDNode *N);
@@ -1138,7 +1139,7 @@ bool
DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
return true;
}
- if (Opc != ISD::ADD)
+ if (Opc != ISD::ADD && Opc != ISD::PTRADD)
return false;
auto *C2 = dyn_cast<ConstantSDNode>(N1);
@@ -1858,6 +1859,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::TokenFactor:return visitTokenFactor(N);
case ISD::MERGE_VALUES: return visitMERGE_VALUES(N);
case ISD::ADD:return visitADD(N);
+ case ISD::PTRADD: return visitPTRADD(N);
case ISD::SUB:return visitSUB(N);
case ISD::SADDSAT:
case ISD::UADDSAT:return visitADDSAT(N);
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc
&DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1))
{
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+// * y is a constant and (ptradd x, y) has one use; or
+// * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+
+// TODO: There is another possible fold here that was proven useful.
+// It would be this:
+//
+// (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+// * (ptradd x, y) has one use; and
+// * y is a constant; and
+// * z is not a constant.
+//
+// In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+// opportunity to select more complex instructions such as SUBPT and
+// MSUBPT.
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a updated
https://github.com/llvm/llvm-project/pull/142739
>From efdaf03ba25edfd254a3b9bc79470ed861e123c1 Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 92 -
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 49 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 +
.../AMDGPU/ptradd-sdag-optimizations.ll | 193 ++
4 files changed, 201 insertions(+), 134 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index aba3c0f80a024..e57e8eb8799e2 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -419,6 +419,7 @@ namespace {
     SDValue visitADDLike(SDNode *N);
     SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
                                     SDNode *LocReference);
+    SDValue visitPTRADD(SDNode *N);
     SDValue visitSUB(SDNode *N);
     SDValue visitADDSAT(SDNode *N);
     SDValue visitSUBSAT(SDNode *N);
@@ -1138,7 +1139,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
     return true;
   }
 
-  if (Opc != ISD::ADD)
+  if (Opc != ISD::ADD && Opc != ISD::PTRADD)
     return false;
 
   auto *C2 = dyn_cast<ConstantSDNode>(N1);
@@ -1858,6 +1859,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
   case ISD::TokenFactor:        return visitTokenFactor(N);
   case ISD::MERGE_VALUES:       return visitMERGE_VALUES(N);
   case ISD::ADD:                return visitADD(N);
+  case ISD::PTRADD:             return visitPTRADD(N);
   case ISD::SUB:                return visitSUB(N);
   case ISD::SADDSAT:
   case ISD::UADDSAT:            return visitADDSAT(N);
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
   return SDValue();
 }
 
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  SDValue N1 = N->getOperand(1);
+  EVT PtrVT = N0.getValueType();
+  EVT IntVT = N1.getValueType();
+  SDLoc DL(N);
+
+  // This is already ensured by an assert in SelectionDAG::getNode(). Several
+  // combines here depend on this assumption.
+  assert(PtrVT == IntVT &&
+         "PTRADD with different operand types is not supported");
+
+  // fold (ptradd undef, y) -> undef
+  if (N0.isUndef())
+    return N0;
+
+  // fold (ptradd x, undef) -> undef
+  if (N1.isUndef())
+    return DAG.getUNDEF(PtrVT);
+
+  // fold (ptradd x, 0) -> x
+  if (isNullConstant(N1))
+    return N0;
+
+  // fold (ptradd 0, x) -> x
+  if (isNullConstant(N0))
+    return N1;
+
+  if (N0.getOpcode() == ISD::PTRADD &&
+      !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+    SDValue X = N0.getOperand(0);
+    SDValue Y = N0.getOperand(1);
+    SDValue Z = N1;
+    bool N0OneUse = N0.hasOneUse();
+    bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+    bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+    // (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+    //   * y is a constant and (ptradd x, y) has one use; or
+    //   * y and z are both constants.
+    if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+      SDNodeFlags Flags;
+      // If both additions in the original were NUW, the new ones are as well.
+      if (N->getFlags().hasNoUnsignedWrap() &&
+          N0->getFlags().hasNoUnsignedWrap())
+        Flags |= SDNodeFlags::NoUnsignedWrap;
+      SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+      AddToWorklist(Add.getNode());
+      return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+    }
+
+    // TODO: There is another possible fold here that was proven useful.
+    // It would be this:
+    //
+    //   (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+    //     * (ptradd x, y) has one use; and
+    //     * y is a constant; and
+    //     * z is not a constant.
+    //
+    // In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+    // opportunity to select more complex instructions such as SUBPT and
+    // MSUBPT. Howev
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
ritter-x2a wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite. Learn more: https://graphite.dev/docs/merge-pull-requests

* **#142739** 👈 (View in Graphite)
* **#142738**
* **#141725**
* `main`

This stack of pull requests is managed by Graphite. Learn more about stacking: https://stacking.dev/

https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a ready_for_review
https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
llvmbot wrote:
@llvm/pr-subscribers-backend-amdgpu
Author: Fabian Ritter (ritter-x2a)
Changes
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
---
Patch is 20.34 KiB, truncated to 20.00 KiB below, full version:
https://github.com/llvm/llvm-project/pull/142739.diff
4 Files Affected:
- (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+91-1)
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+49)
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.h (+1)
- (modified) llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll (+60-133)
``diff
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 9e418329d15be..e45134a2e5548 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -419,6 +419,7 @@ namespace {
     SDValue visitADDLike(SDNode *N);
     SDValue visitADDLikeCommutative(SDValue N0, SDValue N1,
                                     SDNode *LocReference);
+    SDValue visitPTRADD(SDNode *N);
     SDValue visitSUB(SDNode *N);
     SDValue visitADDSAT(SDNode *N);
     SDValue visitSUBSAT(SDNode *N);
@@ -1138,7 +1139,7 @@ bool DAGCombiner::reassociationCanBreakAddressingModePattern(unsigned Opc,
     return true;
   }
 
-  if (Opc != ISD::ADD)
+  if (Opc != ISD::ADD && Opc != ISD::PTRADD)
     return false;
 
   auto *C2 = dyn_cast<ConstantSDNode>(N1);
@@ -1858,6 +1859,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
   case ISD::TokenFactor:        return visitTokenFactor(N);
   case ISD::MERGE_VALUES:       return visitMERGE_VALUES(N);
   case ISD::ADD:                return visitADD(N);
+  case ISD::PTRADD:             return visitPTRADD(N);
  case ISD::SUB:                return visitSUB(N);
   case ISD::SADDSAT:
   case ISD::UADDSAT:            return visitADDSAT(N);
@@ -2623,6 +2625,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
   return SDValue();
 }
 
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  SDValue N1 = N->getOperand(1);
+  EVT PtrVT = N0.getValueType();
+  EVT IntVT = N1.getValueType();
+  SDLoc DL(N);
+
+  // This is already ensured by an assert in SelectionDAG::getNode(). Several
+  // combines here depend on this assumption.
+  assert(PtrVT == IntVT &&
+         "PTRADD with different operand types is not supported");
+
+  // fold (ptradd undef, y) -> undef
+  if (N0.isUndef())
+    return N0;
+
+  // fold (ptradd x, undef) -> undef
+  if (N1.isUndef())
+    return DAG.getUNDEF(PtrVT);
+
+  // fold (ptradd x, 0) -> x
+  if (isNullConstant(N1))
+    return N0;
+
+  // fold (ptradd 0, x) -> x
+  if (isNullConstant(N0))
+    return N1;
+
+  if (N0.getOpcode() == ISD::PTRADD &&
+      !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+    SDValue X = N0.getOperand(0);
+    SDValue Y = N0.getOperand(1);
+    SDValue Z = N1;
+    bool N0OneUse = N0.hasOneUse();
+    bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+    bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+    // (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+    //   * y is a constant and (ptradd x, y) has one use; or
+    //   * y and z are both constants.
+    if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+      SDNodeFlags Flags;
+      // If both additions in the original were NUW, the new ones are as well.
+      if (N->getFlags().hasNoUnsignedWrap() &&
+          N0->getFlags().hasNoUnsignedWrap())
+        Flags |= SDNodeFlags::NoUnsignedWrap;
+      SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+      AddToWorklist(Add.getNode());
+      return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+    }
+
+    // TODO: There is another possible fold here that was proven useful.
+    // It would be this:
+    //
+    //   (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+    //     * (ptradd x, y) has one use; and
+    //     * y is a constant; and
+    //     * z is not a constant.
+    //
+    // In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+    // opportunity to select more complex instructions such as SUBPT and
+    // MSUBPT. However, a hypothetical corner case has been found that we co
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
https://github.com/ritter-x2a created
https://github.com/llvm/llvm-project/pull/142739
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.
>From 1afe2b10cb3781c57adb2eec584b7fc07c073cf8 Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Wed, 4 Jun 2025 03:32:32 -0400
Subject: [PATCH] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.
The generic DAG combine is based on a part of PR #105669 by @rgwott, which was
adapted from work by @jrtc27, @arichardson, @davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
they cause regressions in AMDGPU.
For SWDEV-516125.