[llvm-branch-commits] [llvm] release/18.x: [AArch64] Remove invalid uabdl patterns. (#89272) (PR #89380)

2024-04-30 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/89380

>From a96b04442c9fc29cd884b56bf07af8615191176f Mon Sep 17 00:00:00 2001
From: David Green 
Date: Fri, 19 Apr 2024 09:30:13 +0100
Subject: [PATCH] [AArch64] Remove invalid uabdl patterns. (#89272)

These were added in https://reviews.llvm.org/D14208, which look like
they attempt to detect abs from xor+add+ashr. They do not appear to be
detecting the correct value for the src input though, which I think is
intended to be the sub(zext, zext) part of the pattern. We have pattens
from abs now, so the old invalid patterns can be removed.

Fixes #88784

(cherry picked from commit 851462fcaa7f6e3301865de84f98be7e872e64b6)
---
 llvm/lib/Target/AArch64/AArch64InstrInfo.td | 10 -
 llvm/test/CodeGen/AArch64/arm64-vabs.ll | 48 +
 2 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td 
b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 03baa7497615e3..ac61dd8745d4e6 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -4885,19 +4885,9 @@ defm UABDL   : SIMDLongThreeVectorBHSabdl<1, 0b0111, 
"uabdl",
 def : Pat<(abs (v8i16 (sub (zext (v8i8 V64:$opA)),
(zext (v8i8 V64:$opB),
   (UABDLv8i8_v8i16 V64:$opA, V64:$opB)>;
-def : Pat<(xor (v8i16 (AArch64vashr v8i16:$src, (i32 15))),
-   (v8i16 (add (sub (zext (v8i8 V64:$opA)),
-(zext (v8i8 V64:$opB))),
-   (AArch64vashr v8i16:$src, (i32 15),
-  (UABDLv8i8_v8i16 V64:$opA, V64:$opB)>;
 def : Pat<(abs (v8i16 (sub (zext (extract_high_v16i8 (v16i8 V128:$opA))),
(zext (extract_high_v16i8 (v16i8 V128:$opB)),
   (UABDLv16i8_v8i16 V128:$opA, V128:$opB)>;
-def : Pat<(xor (v8i16 (AArch64vashr v8i16:$src, (i32 15))),
-   (v8i16 (add (sub (zext (extract_high_v16i8 (v16i8 V128:$opA))),
-(zext (extract_high_v16i8 (v16i8 V128:$opB,
-   (AArch64vashr v8i16:$src, (i32 15),
-  (UABDLv16i8_v8i16 V128:$opA, V128:$opB)>;
 def : Pat<(abs (v4i32 (sub (zext (v4i16 V64:$opA)),
(zext (v4i16 V64:$opB),
   (UABDLv4i16_v4i32 V64:$opA, V64:$opB)>;
diff --git a/llvm/test/CodeGen/AArch64/arm64-vabs.ll 
b/llvm/test/CodeGen/AArch64/arm64-vabs.ll
index fe4da2e7cf36b5..89c8d540b97e04 100644
--- a/llvm/test/CodeGen/AArch64/arm64-vabs.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-vabs.ll
@@ -1848,3 +1848,51 @@ define <2 x i128> @uabd_i64(<2 x i64> %a, <2 x i64> %b) {
   %absel = select <2 x i1> %abcmp, <2 x i128> %ababs, <2 x i128> %abdiff
   ret <2 x i128> %absel
 }
+
+define <8 x i16> @pr88784(<8 x i8> %l0, <8 x i8> %l1, <8 x i16> %l2) {
+; CHECK-SD-LABEL: pr88784:
+; CHECK-SD:   // %bb.0:
+; CHECK-SD-NEXT:usubl.8h v0, v0, v1
+; CHECK-SD-NEXT:cmlt.8h v1, v2, #0
+; CHECK-SD-NEXT:ssra.8h v0, v2, #15
+; CHECK-SD-NEXT:eor.16b v0, v1, v0
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: pr88784:
+; CHECK-GI:   // %bb.0:
+; CHECK-GI-NEXT:usubl.8h v0, v0, v1
+; CHECK-GI-NEXT:sshr.8h v1, v2, #15
+; CHECK-GI-NEXT:ssra.8h v0, v2, #15
+; CHECK-GI-NEXT:eor.16b v0, v1, v0
+; CHECK-GI-NEXT:ret
+  %l4 = zext <8 x i8> %l0 to <8 x i16>
+  %l5 = ashr <8 x i16> %l2, 
+  %l6 = zext <8 x i8> %l1 to <8 x i16>
+  %l7 = sub <8 x i16> %l4, %l6
+  %l8 = add <8 x i16> %l5, %l7
+  %l9 = xor <8 x i16> %l5, %l8
+  ret <8 x i16> %l9
+}
+
+define <8 x i16> @pr88784_fixed(<8 x i8> %l0, <8 x i8> %l1, <8 x i16> %l2) {
+; CHECK-SD-LABEL: pr88784_fixed:
+; CHECK-SD:   // %bb.0:
+; CHECK-SD-NEXT:uabdl.8h v0, v0, v1
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: pr88784_fixed:
+; CHECK-GI:   // %bb.0:
+; CHECK-GI-NEXT:usubl.8h v0, v0, v1
+; CHECK-GI-NEXT:sshr.8h v1, v0, #15
+; CHECK-GI-NEXT:ssra.8h v0, v0, #15
+; CHECK-GI-NEXT:eor.16b v0, v1, v0
+; CHECK-GI-NEXT:ret
+  %l4 = zext <8 x i8> %l0 to <8 x i16>
+  %l6 = zext <8 x i8> %l1 to <8 x i16>
+  %l7 = sub <8 x i16> %l4, %l6
+  %l5 = ashr <8 x i16> %l7, 
+  %l8 = add <8 x i16> %l5, %l7
+  %l9 = xor <8 x i16> %l5, %l8
+  ret <8 x i16> %l9
+}
+

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AArch64] Remove invalid uabdl patterns. (#89272) (PR #89380)

2024-04-30 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/89380
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] a96b044 - [AArch64] Remove invalid uabdl patterns. (#89272)

2024-04-30 Thread Tom Stellard via llvm-branch-commits

Author: David Green
Date: 2024-04-30T16:01:41-07:00
New Revision: a96b04442c9fc29cd884b56bf07af8615191176f

URL: 
https://github.com/llvm/llvm-project/commit/a96b04442c9fc29cd884b56bf07af8615191176f
DIFF: 
https://github.com/llvm/llvm-project/commit/a96b04442c9fc29cd884b56bf07af8615191176f.diff

LOG: [AArch64] Remove invalid uabdl patterns. (#89272)

These were added in https://reviews.llvm.org/D14208, which look like
they attempt to detect abs from xor+add+ashr. They do not appear to be
detecting the correct value for the src input though, which I think is
intended to be the sub(zext, zext) part of the pattern. We have pattens
from abs now, so the old invalid patterns can be removed.

Fixes #88784

(cherry picked from commit 851462fcaa7f6e3301865de84f98be7e872e64b6)

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64InstrInfo.td
llvm/test/CodeGen/AArch64/arm64-vabs.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td 
b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 03baa7497615e3..ac61dd8745d4e6 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -4885,19 +4885,9 @@ defm UABDL   : SIMDLongThreeVectorBHSabdl<1, 0b0111, 
"uabdl",
 def : Pat<(abs (v8i16 (sub (zext (v8i8 V64:$opA)),
(zext (v8i8 V64:$opB),
   (UABDLv8i8_v8i16 V64:$opA, V64:$opB)>;
-def : Pat<(xor (v8i16 (AArch64vashr v8i16:$src, (i32 15))),
-   (v8i16 (add (sub (zext (v8i8 V64:$opA)),
-(zext (v8i8 V64:$opB))),
-   (AArch64vashr v8i16:$src, (i32 15),
-  (UABDLv8i8_v8i16 V64:$opA, V64:$opB)>;
 def : Pat<(abs (v8i16 (sub (zext (extract_high_v16i8 (v16i8 V128:$opA))),
(zext (extract_high_v16i8 (v16i8 V128:$opB)),
   (UABDLv16i8_v8i16 V128:$opA, V128:$opB)>;
-def : Pat<(xor (v8i16 (AArch64vashr v8i16:$src, (i32 15))),
-   (v8i16 (add (sub (zext (extract_high_v16i8 (v16i8 V128:$opA))),
-(zext (extract_high_v16i8 (v16i8 V128:$opB,
-   (AArch64vashr v8i16:$src, (i32 15),
-  (UABDLv16i8_v8i16 V128:$opA, V128:$opB)>;
 def : Pat<(abs (v4i32 (sub (zext (v4i16 V64:$opA)),
(zext (v4i16 V64:$opB),
   (UABDLv4i16_v4i32 V64:$opA, V64:$opB)>;

diff  --git a/llvm/test/CodeGen/AArch64/arm64-vabs.ll 
b/llvm/test/CodeGen/AArch64/arm64-vabs.ll
index fe4da2e7cf36b5..89c8d540b97e04 100644
--- a/llvm/test/CodeGen/AArch64/arm64-vabs.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-vabs.ll
@@ -1848,3 +1848,51 @@ define <2 x i128> @uabd_i64(<2 x i64> %a, <2 x i64> %b) {
   %absel = select <2 x i1> %abcmp, <2 x i128> %ababs, <2 x i128> %ab
diff 
   ret <2 x i128> %absel
 }
+
+define <8 x i16> @pr88784(<8 x i8> %l0, <8 x i8> %l1, <8 x i16> %l2) {
+; CHECK-SD-LABEL: pr88784:
+; CHECK-SD:   // %bb.0:
+; CHECK-SD-NEXT:usubl.8h v0, v0, v1
+; CHECK-SD-NEXT:cmlt.8h v1, v2, #0
+; CHECK-SD-NEXT:ssra.8h v0, v2, #15
+; CHECK-SD-NEXT:eor.16b v0, v1, v0
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: pr88784:
+; CHECK-GI:   // %bb.0:
+; CHECK-GI-NEXT:usubl.8h v0, v0, v1
+; CHECK-GI-NEXT:sshr.8h v1, v2, #15
+; CHECK-GI-NEXT:ssra.8h v0, v2, #15
+; CHECK-GI-NEXT:eor.16b v0, v1, v0
+; CHECK-GI-NEXT:ret
+  %l4 = zext <8 x i8> %l0 to <8 x i16>
+  %l5 = ashr <8 x i16> %l2, 
+  %l6 = zext <8 x i8> %l1 to <8 x i16>
+  %l7 = sub <8 x i16> %l4, %l6
+  %l8 = add <8 x i16> %l5, %l7
+  %l9 = xor <8 x i16> %l5, %l8
+  ret <8 x i16> %l9
+}
+
+define <8 x i16> @pr88784_fixed(<8 x i8> %l0, <8 x i8> %l1, <8 x i16> %l2) {
+; CHECK-SD-LABEL: pr88784_fixed:
+; CHECK-SD:   // %bb.0:
+; CHECK-SD-NEXT:uabdl.8h v0, v0, v1
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: pr88784_fixed:
+; CHECK-GI:   // %bb.0:
+; CHECK-GI-NEXT:usubl.8h v0, v0, v1
+; CHECK-GI-NEXT:sshr.8h v1, v0, #15
+; CHECK-GI-NEXT:ssra.8h v0, v0, #15
+; CHECK-GI-NEXT:eor.16b v0, v1, v0
+; CHECK-GI-NEXT:ret
+  %l4 = zext <8 x i8> %l0 to <8 x i16>
+  %l6 = zext <8 x i8> %l1 to <8 x i16>
+  %l7 = sub <8 x i16> %l4, %l6
+  %l5 = ashr <8 x i16> %l7, 
+  %l8 = add <8 x i16> %l5, %l7
+  %l9 = xor <8 x i16> %l5, %l8
+  ret <8 x i16> %l9
+}
+



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Don't form anyextending atomic loads. (PR #90435)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/90435

>From 4da5b14174938bc69b8e729bc8b5bb393bd70b9e Mon Sep 17 00:00:00 2001
From: Amara Emerson 
Date: Fri, 5 Apr 2024 10:49:19 -0700
Subject: [PATCH] [GlobalISel] Don't form anyextending atomic loads.

Until we can reliably check the legality and improve our selection of these,
don't form them at all.

(cherry picked from commit 60fc4ac67a613e4e36cef019fb2d13d70a06cfe8)
---
 .../lib/CodeGen/GlobalISel/CombinerHelper.cpp |  4 +-
 .../Atomics/aarch64-atomic-load-rcpc_immo.ll  | 55 +-
 .../AArch64/GlobalISel/arm64-atomic.ll| 56 +--
 .../AArch64/GlobalISel/arm64-pcsections.ll| 28 +-
 .../atomic-anyextending-load-crash.ll | 47 
 5 files changed, 131 insertions(+), 59 deletions(-)
 create mode 100644 
llvm/test/CodeGen/AArch64/GlobalISel/atomic-anyextending-load-crash.ll

diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp 
b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index 772229215e798d..61ddc858ba44c7 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -591,8 +591,8 @@ bool 
CombinerHelper::matchCombineExtendingLoads(MachineInstr &MI,
 UseMI.getOpcode() == TargetOpcode::G_ZEXT ||
 (UseMI.getOpcode() == TargetOpcode::G_ANYEXT)) {
   const auto &MMO = LoadMI->getMMO();
-  // For atomics, only form anyextending loads.
-  if (MMO.isAtomic() && UseMI.getOpcode() != TargetOpcode::G_ANYEXT)
+  // Don't do anything for atomics.
+  if (MMO.isAtomic())
 continue;
   // Check for legality.
   if (!isPreLegalize()) {
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll 
b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
index b0507e9d075fab..9687ba683fb7e6 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
@@ -35,16 +35,24 @@ define i8 @load_atomic_i8_aligned_monotonic_const(ptr 
readonly %ptr) {
 }
 
 define i8 @load_atomic_i8_aligned_acquire(ptr %ptr) {
-; CHECK-LABEL: load_atomic_i8_aligned_acquire:
-; CHECK:ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_aligned_acquire:
+; GISEL:add x8, x0, #4
+; GISEL:ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i8_aligned_acquire:
+; SDAG:ldapurb w0, [x0, #4]
 %gep = getelementptr inbounds i8, ptr %ptr, i32 4
 %r = load atomic i8, ptr %gep acquire, align 1
 ret i8 %r
 }
 
 define i8 @load_atomic_i8_aligned_acquire_const(ptr readonly %ptr) {
-; CHECK-LABEL: load_atomic_i8_aligned_acquire_const:
-; CHECK:ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_aligned_acquire_const:
+; GISEL:add x8, x0, #4
+; GISEL:ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i8_aligned_acquire_const:
+; SDAG:ldapurb w0, [x0, #4]
 %gep = getelementptr inbounds i8, ptr %ptr, i32 4
 %r = load atomic i8, ptr %gep acquire, align 1
 ret i8 %r
@@ -101,16 +109,24 @@ define i16 @load_atomic_i16_aligned_monotonic_const(ptr 
readonly %ptr) {
 }
 
 define i16 @load_atomic_i16_aligned_acquire(ptr %ptr) {
-; CHECK-LABEL: load_atomic_i16_aligned_acquire:
-; CHECK:ldapurh w0, [x0, #8]
+; GISEL-LABEL: load_atomic_i16_aligned_acquire:
+; GISEL:add x8, x0, #8
+; GISEL:ldaprh w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i16_aligned_acquire:
+; SDAG:ldapurh w0, [x0, #8]
 %gep = getelementptr inbounds i16, ptr %ptr, i32 4
 %r = load atomic i16, ptr %gep acquire, align 2
 ret i16 %r
 }
 
 define i16 @load_atomic_i16_aligned_acquire_const(ptr readonly %ptr) {
-; CHECK-LABEL: load_atomic_i16_aligned_acquire_const:
-; CHECK:ldapurh w0, [x0, #8]
+; GISEL-LABEL: load_atomic_i16_aligned_acquire_const:
+; GISEL:add x8, x0, #8
+; GISEL:ldaprh w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i16_aligned_acquire_const:
+; SDAG:ldapurh w0, [x0, #8]
 %gep = getelementptr inbounds i16, ptr %ptr, i32 4
 %r = load atomic i16, ptr %gep acquire, align 2
 ret i16 %r
@@ -367,16 +383,24 @@ define i8 @load_atomic_i8_unaligned_monotonic_const(ptr 
readonly %ptr) {
 }
 
 define i8 @load_atomic_i8_unaligned_acquire(ptr %ptr) {
-; CHECK-LABEL: load_atomic_i8_unaligned_acquire:
-; CHECK:ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_unaligned_acquire:
+; GISEL:add x8, x0, #4
+; GISEL:ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i8_unaligned_acquire:
+; SDAG:ldapurb w0, [x0, #4]
 %gep = getelementptr inbounds i8, ptr %ptr, i32 4
 %r = load atomic i8, ptr %gep acquire, align 1
 ret i8 %r
 }
 
 define i8 @load_atomic_i8_unaligned_acquire_const(ptr readonly %ptr) {
-; CHECK-LABEL: load_atomic_i8_unaligned_acquire_const:
-; CHECK:ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_unaligned_acquire_const:
+; GISEL:add x8, x0, #4
+; GISEL:ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i

[llvm-branch-commits] [llvm] 4da5b14 - [GlobalISel] Don't form anyextending atomic loads.

2024-05-01 Thread Tom Stellard via llvm-branch-commits

Author: Amara Emerson
Date: 2024-05-01T11:30:12-07:00
New Revision: 4da5b14174938bc69b8e729bc8b5bb393bd70b9e

URL: 
https://github.com/llvm/llvm-project/commit/4da5b14174938bc69b8e729bc8b5bb393bd70b9e
DIFF: 
https://github.com/llvm/llvm-project/commit/4da5b14174938bc69b8e729bc8b5bb393bd70b9e.diff

LOG: [GlobalISel] Don't form anyextending atomic loads.

Until we can reliably check the legality and improve our selection of these,
don't form them at all.

(cherry picked from commit 60fc4ac67a613e4e36cef019fb2d13d70a06cfe8)

Added: 
llvm/test/CodeGen/AArch64/GlobalISel/atomic-anyextending-load-crash.ll

Modified: 
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
llvm/test/CodeGen/AArch64/GlobalISel/arm64-pcsections.ll

Removed: 




diff  --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp 
b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index 772229215e798d..61ddc858ba44c7 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -591,8 +591,8 @@ bool 
CombinerHelper::matchCombineExtendingLoads(MachineInstr &MI,
 UseMI.getOpcode() == TargetOpcode::G_ZEXT ||
 (UseMI.getOpcode() == TargetOpcode::G_ANYEXT)) {
   const auto &MMO = LoadMI->getMMO();
-  // For atomics, only form anyextending loads.
-  if (MMO.isAtomic() && UseMI.getOpcode() != TargetOpcode::G_ANYEXT)
+  // Don't do anything for atomics.
+  if (MMO.isAtomic())
 continue;
   // Check for legality.
   if (!isPreLegalize()) {

diff  --git 
a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll 
b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
index b0507e9d075fab..9687ba683fb7e6 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
@@ -35,16 +35,24 @@ define i8 @load_atomic_i8_aligned_monotonic_const(ptr 
readonly %ptr) {
 }
 
 define i8 @load_atomic_i8_aligned_acquire(ptr %ptr) {
-; CHECK-LABEL: load_atomic_i8_aligned_acquire:
-; CHECK:ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_aligned_acquire:
+; GISEL:add x8, x0, #4
+; GISEL:ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i8_aligned_acquire:
+; SDAG:ldapurb w0, [x0, #4]
 %gep = getelementptr inbounds i8, ptr %ptr, i32 4
 %r = load atomic i8, ptr %gep acquire, align 1
 ret i8 %r
 }
 
 define i8 @load_atomic_i8_aligned_acquire_const(ptr readonly %ptr) {
-; CHECK-LABEL: load_atomic_i8_aligned_acquire_const:
-; CHECK:ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_aligned_acquire_const:
+; GISEL:add x8, x0, #4
+; GISEL:ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i8_aligned_acquire_const:
+; SDAG:ldapurb w0, [x0, #4]
 %gep = getelementptr inbounds i8, ptr %ptr, i32 4
 %r = load atomic i8, ptr %gep acquire, align 1
 ret i8 %r
@@ -101,16 +109,24 @@ define i16 @load_atomic_i16_aligned_monotonic_const(ptr 
readonly %ptr) {
 }
 
 define i16 @load_atomic_i16_aligned_acquire(ptr %ptr) {
-; CHECK-LABEL: load_atomic_i16_aligned_acquire:
-; CHECK:ldapurh w0, [x0, #8]
+; GISEL-LABEL: load_atomic_i16_aligned_acquire:
+; GISEL:add x8, x0, #8
+; GISEL:ldaprh w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i16_aligned_acquire:
+; SDAG:ldapurh w0, [x0, #8]
 %gep = getelementptr inbounds i16, ptr %ptr, i32 4
 %r = load atomic i16, ptr %gep acquire, align 2
 ret i16 %r
 }
 
 define i16 @load_atomic_i16_aligned_acquire_const(ptr readonly %ptr) {
-; CHECK-LABEL: load_atomic_i16_aligned_acquire_const:
-; CHECK:ldapurh w0, [x0, #8]
+; GISEL-LABEL: load_atomic_i16_aligned_acquire_const:
+; GISEL:add x8, x0, #8
+; GISEL:ldaprh w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i16_aligned_acquire_const:
+; SDAG:ldapurh w0, [x0, #8]
 %gep = getelementptr inbounds i16, ptr %ptr, i32 4
 %r = load atomic i16, ptr %gep acquire, align 2
 ret i16 %r
@@ -367,16 +383,24 @@ define i8 @load_atomic_i8_unaligned_monotonic_const(ptr 
readonly %ptr) {
 }
 
 define i8 @load_atomic_i8_unaligned_acquire(ptr %ptr) {
-; CHECK-LABEL: load_atomic_i8_unaligned_acquire:
-; CHECK:ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_unaligned_acquire:
+; GISEL:add x8, x0, #4
+; GISEL:ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i8_unaligned_acquire:
+; SDAG:ldapurb w0, [x0, #4]
 %gep = getelementptr inbounds i8, ptr %ptr, i32 4
 %r = load atomic i8, ptr %gep acquire, align 1
 ret i8 %r
 }
 
 define i8 @load_atomic_i8_unaligned_acquire_const(ptr readonly %ptr) {
-; CHECK-LABEL: load_atomic_i8_unaligned_acquire_const:
-; CHECK:ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_unaligned_acquire_const:
+; GISEL:add x8, x0, #4
+; GISEL:ldaprb w0, [x8]
+

[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Don't form anyextending atomic loads. (PR #90435)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/90435
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Enable EVEX512 when host CPU has AVX512 (#90479) (PR #90545)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/90545

>From a7b8b890600a33e0c88d639f311f1d73ccb1c8d2 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Tue, 30 Apr 2024 10:09:41 +0800
Subject: [PATCH] [X86] Enable EVEX512 when host CPU has AVX512 (#90479)

This is used when -march=native run on an unknown CPU to old version of
LLVM.

(cherry picked from commit b3291793f11924a3b62601aabebebdcfbb12a9a1)
---
 llvm/lib/TargetParser/Host.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/TargetParser/Host.cpp b/llvm/lib/TargetParser/Host.cpp
index 4466d50458e198..1adef15771fa17 100644
--- a/llvm/lib/TargetParser/Host.cpp
+++ b/llvm/lib/TargetParser/Host.cpp
@@ -1266,8 +1266,10 @@ static void getAvailableFeatures(unsigned ECX, unsigned 
EDX, unsigned MaxLeaf,
 setFeature(X86::FEATURE_AVX2);
   if (HasLeaf7 && ((EBX >> 8) & 1))
 setFeature(X86::FEATURE_BMI2);
-  if (HasLeaf7 && ((EBX >> 16) & 1) && HasAVX512Save)
+  if (HasLeaf7 && ((EBX >> 16) & 1) && HasAVX512Save) {
 setFeature(X86::FEATURE_AVX512F);
+setFeature(X86::FEATURE_EVEX512);
+  }
   if (HasLeaf7 && ((EBX >> 17) & 1) && HasAVX512Save)
 setFeature(X86::FEATURE_AVX512DQ);
   if (HasLeaf7 && ((EBX >> 19) & 1))
@@ -1772,6 +1774,7 @@ bool sys::getHostCPUFeatures(StringMap &Features) {
   Features["rtm"]= HasLeaf7 && ((EBX >> 11) & 1);
   // AVX512 is only supported if the OS supports the context save for it.
   Features["avx512f"]= HasLeaf7 && ((EBX >> 16) & 1) && HasAVX512Save;
+  Features["evex512"]= Features["avx512f"];
   Features["avx512dq"]   = HasLeaf7 && ((EBX >> 17) & 1) && HasAVX512Save;
   Features["rdseed"] = HasLeaf7 && ((EBX >> 18) & 1);
   Features["adx"]= HasLeaf7 && ((EBX >> 19) & 1);

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] a7b8b89 - [X86] Enable EVEX512 when host CPU has AVX512 (#90479)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

Author: Phoebe Wang
Date: 2024-05-01T11:32:03-07:00
New Revision: a7b8b890600a33e0c88d639f311f1d73ccb1c8d2

URL: 
https://github.com/llvm/llvm-project/commit/a7b8b890600a33e0c88d639f311f1d73ccb1c8d2
DIFF: 
https://github.com/llvm/llvm-project/commit/a7b8b890600a33e0c88d639f311f1d73ccb1c8d2.diff

LOG: [X86] Enable EVEX512 when host CPU has AVX512 (#90479)

This is used when -march=native run on an unknown CPU to old version of
LLVM.

(cherry picked from commit b3291793f11924a3b62601aabebebdcfbb12a9a1)

Added: 


Modified: 
llvm/lib/TargetParser/Host.cpp

Removed: 




diff  --git a/llvm/lib/TargetParser/Host.cpp b/llvm/lib/TargetParser/Host.cpp
index 4466d50458e198..1adef15771fa17 100644
--- a/llvm/lib/TargetParser/Host.cpp
+++ b/llvm/lib/TargetParser/Host.cpp
@@ -1266,8 +1266,10 @@ static void getAvailableFeatures(unsigned ECX, unsigned 
EDX, unsigned MaxLeaf,
 setFeature(X86::FEATURE_AVX2);
   if (HasLeaf7 && ((EBX >> 8) & 1))
 setFeature(X86::FEATURE_BMI2);
-  if (HasLeaf7 && ((EBX >> 16) & 1) && HasAVX512Save)
+  if (HasLeaf7 && ((EBX >> 16) & 1) && HasAVX512Save) {
 setFeature(X86::FEATURE_AVX512F);
+setFeature(X86::FEATURE_EVEX512);
+  }
   if (HasLeaf7 && ((EBX >> 17) & 1) && HasAVX512Save)
 setFeature(X86::FEATURE_AVX512DQ);
   if (HasLeaf7 && ((EBX >> 19) & 1))
@@ -1772,6 +1774,7 @@ bool sys::getHostCPUFeatures(StringMap &Features) {
   Features["rtm"]= HasLeaf7 && ((EBX >> 11) & 1);
   // AVX512 is only supported if the OS supports the context save for it.
   Features["avx512f"]= HasLeaf7 && ((EBX >> 16) & 1) && HasAVX512Save;
+  Features["evex512"]= Features["avx512f"];
   Features["avx512dq"]   = HasLeaf7 && ((EBX >> 17) & 1) && HasAVX512Save;
   Features["rdseed"] = HasLeaf7 && ((EBX >> 18) & 1);
   Features["adx"]= HasLeaf7 && ((EBX >> 19) & 1);



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Enable EVEX512 when host CPU has AVX512 (#90479) (PR #90545)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/90545
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375) (PR #90673)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/90673

>From ece9d35f1a705ab8d66895c6d985907f2b9a2c0c Mon Sep 17 00:00:00 2001
From: Amara Emerson 
Date: Wed, 1 May 2024 05:42:14 +0800
Subject: [PATCH] [GlobalISel] Fix store merging incorrectly classifying an
 unknown index expr as 0. (#90375)

During analysis, we incorrectly leave the offset part of an address info
struct
as zero, when in actual fact we failed to decompose it into base +
offset.
This results in incorrectly assuming that the address is adjacent to
another store
addr. To fix this we wrap the offset in an optional<> so we can
distinguish between
real zero and unknown.

Fixes issue #90242

(cherry picked from commit 19f4d68252b70c81ebb1686a5a31069eda5373de)
---
 .../llvm/CodeGen/GlobalISel/LoadStoreOpt.h| 20 --
 llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp  | 48 --
 .../AArch64/GlobalISel/store-merging.ll   | 19 ++
 .../AArch64/GlobalISel/store-merging.mir  | 62 ---
 4 files changed, 115 insertions(+), 34 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h 
b/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
index 0f20a33f3a755c..7990997835d019 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
@@ -35,11 +35,23 @@ struct LegalityQuery;
 class MachineRegisterInfo;
 namespace GISelAddressing {
 /// Helper struct to store a base, index and offset that forms an address
-struct BaseIndexOffset {
+class BaseIndexOffset {
+private:
   Register BaseReg;
   Register IndexReg;
-  int64_t Offset = 0;
-  bool IsIndexSignExt = false;
+  std::optional Offset;
+
+public:
+  BaseIndexOffset() = default;
+  Register getBase() { return BaseReg; }
+  Register getBase() const { return BaseReg; }
+  Register getIndex() { return IndexReg; }
+  Register getIndex() const { return IndexReg; }
+  void setBase(Register NewBase) { BaseReg = NewBase; }
+  void setIndex(Register NewIndex) { IndexReg = NewIndex; }
+  void setOffset(std::optional NewOff) { Offset = NewOff; }
+  bool hasValidOffset() const { return Offset.has_value(); }
+  int64_t getOffset() const { return *Offset; }
 };
 
 /// Returns a BaseIndexOffset which describes the pointer in \p Ptr.
@@ -89,7 +101,7 @@ class LoadStoreOpt : public MachineFunctionPass {
 // order stores are writing to incremeneting consecutive addresses. So when
 // we walk the block in reverse order, the next eligible store must write 
to
 // an offset one store width lower than CurrentLowestOffset.
-uint64_t CurrentLowestOffset;
+int64_t CurrentLowestOffset;
 SmallVector Stores;
 // A vector of MachineInstr/unsigned pairs to denote potential aliases that
 // need to be checked before the candidate is considered safe to merge. The
diff --git a/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp 
b/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
index 246aa88b09acf6..ee499c41c558c3 100644
--- a/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
@@ -84,21 +84,20 @@ BaseIndexOffset GISelAddressing::getPointerInfo(Register 
Ptr,
 MachineRegisterInfo &MRI) {
   BaseIndexOffset Info;
   Register PtrAddRHS;
-  if (!mi_match(Ptr, MRI, m_GPtrAdd(m_Reg(Info.BaseReg), m_Reg(PtrAddRHS {
-Info.BaseReg = Ptr;
-Info.IndexReg = Register();
-Info.IsIndexSignExt = false;
+  Register BaseReg;
+  if (!mi_match(Ptr, MRI, m_GPtrAdd(m_Reg(BaseReg), m_Reg(PtrAddRHS {
+Info.setBase(Ptr);
+Info.setOffset(0);
 return Info;
   }
-
+  Info.setBase(BaseReg);
   auto RHSCst = getIConstantVRegValWithLookThrough(PtrAddRHS, MRI);
   if (RHSCst)
-Info.Offset = RHSCst->Value.getSExtValue();
+Info.setOffset(RHSCst->Value.getSExtValue());
 
   // Just recognize a simple case for now. In future we'll need to match
   // indexing patterns for base + index + constant.
-  Info.IndexReg = PtrAddRHS;
-  Info.IsIndexSignExt = false;
+  Info.setIndex(PtrAddRHS);
   return Info;
 }
 
@@ -114,15 +113,16 @@ bool GISelAddressing::aliasIsKnownForLoadStore(const 
MachineInstr &MI1,
   BaseIndexOffset BasePtr0 = getPointerInfo(LdSt1->getPointerReg(), MRI);
   BaseIndexOffset BasePtr1 = getPointerInfo(LdSt2->getPointerReg(), MRI);
 
-  if (!BasePtr0.BaseReg.isValid() || !BasePtr1.BaseReg.isValid())
+  if (!BasePtr0.getBase().isValid() || !BasePtr1.getBase().isValid())
 return false;
 
   int64_t Size1 = LdSt1->getMemSize();
   int64_t Size2 = LdSt2->getMemSize();
 
   int64_t PtrDiff;
-  if (BasePtr0.BaseReg == BasePtr1.BaseReg) {
-PtrDiff = BasePtr1.Offset - BasePtr0.Offset;
+  if (BasePtr0.getBase() == BasePtr1.getBase() && BasePtr0.hasValidOffset() &&
+  BasePtr1.hasValidOffset()) {
+PtrDiff = BasePtr1.getOffset() - BasePtr0.getOffset();
 // If the size of memory access is unknown, do not use it to do analysis.
 // One example of 

[llvm-branch-commits] [llvm] ece9d35 - [GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

Author: Amara Emerson
Date: 2024-05-01T11:35:12-07:00
New Revision: ece9d35f1a705ab8d66895c6d985907f2b9a2c0c

URL: 
https://github.com/llvm/llvm-project/commit/ece9d35f1a705ab8d66895c6d985907f2b9a2c0c
DIFF: 
https://github.com/llvm/llvm-project/commit/ece9d35f1a705ab8d66895c6d985907f2b9a2c0c.diff

LOG: [GlobalISel] Fix store merging incorrectly classifying an unknown index 
expr as 0. (#90375)

During analysis, we incorrectly leave the offset part of an address info
struct
as zero, when in actual fact we failed to decompose it into base +
offset.
This results in incorrectly assuming that the address is adjacent to
another store
addr. To fix this we wrap the offset in an optional<> so we can
distinguish between
real zero and unknown.

Fixes issue #90242

(cherry picked from commit 19f4d68252b70c81ebb1686a5a31069eda5373de)

Added: 


Modified: 
llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
llvm/test/CodeGen/AArch64/GlobalISel/store-merging.ll
llvm/test/CodeGen/AArch64/GlobalISel/store-merging.mir

Removed: 




diff  --git a/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h 
b/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
index 0f20a33f3a755c..7990997835d019 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
@@ -35,11 +35,23 @@ struct LegalityQuery;
 class MachineRegisterInfo;
 namespace GISelAddressing {
 /// Helper struct to store a base, index and offset that forms an address
-struct BaseIndexOffset {
+class BaseIndexOffset {
+private:
   Register BaseReg;
   Register IndexReg;
-  int64_t Offset = 0;
-  bool IsIndexSignExt = false;
+  std::optional Offset;
+
+public:
+  BaseIndexOffset() = default;
+  Register getBase() { return BaseReg; }
+  Register getBase() const { return BaseReg; }
+  Register getIndex() { return IndexReg; }
+  Register getIndex() const { return IndexReg; }
+  void setBase(Register NewBase) { BaseReg = NewBase; }
+  void setIndex(Register NewIndex) { IndexReg = NewIndex; }
+  void setOffset(std::optional NewOff) { Offset = NewOff; }
+  bool hasValidOffset() const { return Offset.has_value(); }
+  int64_t getOffset() const { return *Offset; }
 };
 
 /// Returns a BaseIndexOffset which describes the pointer in \p Ptr.
@@ -89,7 +101,7 @@ class LoadStoreOpt : public MachineFunctionPass {
 // order stores are writing to incremeneting consecutive addresses. So when
 // we walk the block in reverse order, the next eligible store must write 
to
 // an offset one store width lower than CurrentLowestOffset.
-uint64_t CurrentLowestOffset;
+int64_t CurrentLowestOffset;
 SmallVector Stores;
 // A vector of MachineInstr/unsigned pairs to denote potential aliases that
 // need to be checked before the candidate is considered safe to merge. The

diff  --git a/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp 
b/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
index 246aa88b09acf6..ee499c41c558c3 100644
--- a/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
@@ -84,21 +84,20 @@ BaseIndexOffset GISelAddressing::getPointerInfo(Register 
Ptr,
 MachineRegisterInfo &MRI) {
   BaseIndexOffset Info;
   Register PtrAddRHS;
-  if (!mi_match(Ptr, MRI, m_GPtrAdd(m_Reg(Info.BaseReg), m_Reg(PtrAddRHS {
-Info.BaseReg = Ptr;
-Info.IndexReg = Register();
-Info.IsIndexSignExt = false;
+  Register BaseReg;
+  if (!mi_match(Ptr, MRI, m_GPtrAdd(m_Reg(BaseReg), m_Reg(PtrAddRHS {
+Info.setBase(Ptr);
+Info.setOffset(0);
 return Info;
   }
-
+  Info.setBase(BaseReg);
   auto RHSCst = getIConstantVRegValWithLookThrough(PtrAddRHS, MRI);
   if (RHSCst)
-Info.Offset = RHSCst->Value.getSExtValue();
+Info.setOffset(RHSCst->Value.getSExtValue());
 
   // Just recognize a simple case for now. In future we'll need to match
   // indexing patterns for base + index + constant.
-  Info.IndexReg = PtrAddRHS;
-  Info.IsIndexSignExt = false;
+  Info.setIndex(PtrAddRHS);
   return Info;
 }
 
@@ -114,15 +113,16 @@ bool GISelAddressing::aliasIsKnownForLoadStore(const 
MachineInstr &MI1,
   BaseIndexOffset BasePtr0 = getPointerInfo(LdSt1->getPointerReg(), MRI);
   BaseIndexOffset BasePtr1 = getPointerInfo(LdSt2->getPointerReg(), MRI);
 
-  if (!BasePtr0.BaseReg.isValid() || !BasePtr1.BaseReg.isValid())
+  if (!BasePtr0.getBase().isValid() || !BasePtr1.getBase().isValid())
 return false;
 
   int64_t Size1 = LdSt1->getMemSize();
   int64_t Size2 = LdSt2->getMemSize();
 
   int64_t PtrDiff;
-  if (BasePtr0.BaseReg == BasePtr1.BaseReg) {
-PtrDiff = BasePtr1.Offset - BasePtr0.Offset;
+  if (BasePtr0.getBase() == BasePtr1.getBase() && BasePtr0.hasValidOffset() &&
+  BasePtr1.hasValidOffset()) {
+PtrDiff = BasePtr1.getOffset() - BasePtr0.getOffset

[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375) (PR #90673)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/90673
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659) (PR #90682)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/90682

>From 20b9ed64ea074f03057e1d775a1d9d0f067ab0b0 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Wed, 1 May 2024 06:51:36 +0800
Subject: [PATCH] [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659)

```
SelectionDAG has 17 nodes:
  t0: ch,glue = EntryToken
t6: i64,ch = CopyFromReg t0, Register:i64 %2
  t8: i1 = truncate t6
  t4: i64,ch = CopyFromReg t0, Register:i64 %1
t7: i1 = truncate t4
t2: i64,ch = CopyFromReg t0, Register:i64 %0
  t10: i64,i1 = saddo t2, Constant:i64<1>
t11: i1 = or t8, t10:1
  t12: i1 = select t7, t8, t11
t13: i64 = any_extend t12
  t15: ch,glue = CopyToReg t0, Register:i64 $x10, t13
  t16: ch = RISCVISD::RET_GLUE t15, Register:i64 $x10, t15:1
```

`OtherOpVT` should be i1, but `OtherOp->getValueType(0)` returns `i64`,
which ignores `ResNo` in `SDValue`.

Fix https://github.com/llvm/llvm-project/issues/90652.

(cherry picked from commit 2647bd73696ae987addd0e74774a44108accb1e6)
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp |  2 +-
 llvm/test/CodeGen/RISCV/pr90652.ll  | 19 +++
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/RISCV/pr90652.ll

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index a0cec426002b6f..d46093b9e260a2 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -14559,7 +14559,7 @@ static SDValue tryFoldSelectIntoOp(SDNode *N, 
SelectionDAG &DAG,
   EVT VT = N->getValueType(0);
   SDLoc DL(N);
   SDValue OtherOp = TrueVal.getOperand(1 - OpToFold);
-  EVT OtherOpVT = OtherOp->getValueType(0);
+  EVT OtherOpVT = OtherOp.getValueType();
   SDValue IdentityOperand =
   DAG.getNeutralElement(Opc, DL, OtherOpVT, N->getFlags());
   if (!Commutative)
diff --git a/llvm/test/CodeGen/RISCV/pr90652.ll 
b/llvm/test/CodeGen/RISCV/pr90652.ll
new file mode 100644
index 00..2162395b92ac3c
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/pr90652.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc < %s -mtriple=riscv64 | FileCheck %s
+
+define i1 @test(i64 %x, i1 %cond1, i1 %cond2) {
+; CHECK-LABEL: test:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:addi a3, a0, 1
+; CHECK-NEXT:slt a0, a3, a0
+; CHECK-NEXT:not a1, a1
+; CHECK-NEXT:and a0, a1, a0
+; CHECK-NEXT:or a0, a2, a0
+; CHECK-NEXT:ret
+entry:
+  %sadd = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %x, i64 1)
+  %ov = extractvalue { i64, i1 } %sadd, 1
+  %or = or i1 %cond2, %ov
+  %sel = select i1 %cond1, i1 %cond2, i1 %or
+  ret i1 %sel
+}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 20b9ed6 - [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

Author: Yingwei Zheng
Date: 2024-05-01T11:39:11-07:00
New Revision: 20b9ed64ea074f03057e1d775a1d9d0f067ab0b0

URL: 
https://github.com/llvm/llvm-project/commit/20b9ed64ea074f03057e1d775a1d9d0f067ab0b0
DIFF: 
https://github.com/llvm/llvm-project/commit/20b9ed64ea074f03057e1d775a1d9d0f067ab0b0.diff

LOG: [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659)

```
SelectionDAG has 17 nodes:
  t0: ch,glue = EntryToken
t6: i64,ch = CopyFromReg t0, Register:i64 %2
  t8: i1 = truncate t6
  t4: i64,ch = CopyFromReg t0, Register:i64 %1
t7: i1 = truncate t4
t2: i64,ch = CopyFromReg t0, Register:i64 %0
  t10: i64,i1 = saddo t2, Constant:i64<1>
t11: i1 = or t8, t10:1
  t12: i1 = select t7, t8, t11
t13: i64 = any_extend t12
  t15: ch,glue = CopyToReg t0, Register:i64 $x10, t13
  t16: ch = RISCVISD::RET_GLUE t15, Register:i64 $x10, t15:1
```

`OtherOpVT` should be i1, but `OtherOp->getValueType(0)` returns `i64`,
which ignores `ResNo` in `SDValue`.

Fix https://github.com/llvm/llvm-project/issues/90652.

(cherry picked from commit 2647bd73696ae987addd0e74774a44108accb1e6)

Added: 
llvm/test/CodeGen/RISCV/pr90652.ll

Modified: 
llvm/lib/Target/RISCV/RISCVISelLowering.cpp

Removed: 




diff  --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index a0cec426002b6f..d46093b9e260a2 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -14559,7 +14559,7 @@ static SDValue tryFoldSelectIntoOp(SDNode *N, 
SelectionDAG &DAG,
   EVT VT = N->getValueType(0);
   SDLoc DL(N);
   SDValue OtherOp = TrueVal.getOperand(1 - OpToFold);
-  EVT OtherOpVT = OtherOp->getValueType(0);
+  EVT OtherOpVT = OtherOp.getValueType();
   SDValue IdentityOperand =
   DAG.getNeutralElement(Opc, DL, OtherOpVT, N->getFlags());
   if (!Commutative)

diff  --git a/llvm/test/CodeGen/RISCV/pr90652.ll 
b/llvm/test/CodeGen/RISCV/pr90652.ll
new file mode 100644
index 00..2162395b92ac3c
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/pr90652.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc < %s -mtriple=riscv64 | FileCheck %s
+
+define i1 @test(i64 %x, i1 %cond1, i1 %cond2) {
+; CHECK-LABEL: test:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:addi a3, a0, 1
+; CHECK-NEXT:slt a0, a3, a0
+; CHECK-NEXT:not a1, a1
+; CHECK-NEXT:and a0, a1, a0
+; CHECK-NEXT:or a0, a2, a0
+; CHECK-NEXT:ret
+entry:
+  %sadd = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %x, i64 1)
+  %ov = extractvalue { i64, i1 } %sadd, 1
+  %or = or i1 %cond2, %ov
+  %sel = select i1 %cond1, i1 %cond2, i1 %or
+  ret i1 %sel
+}



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659) (PR #90682)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/90682
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659) (PR #90682)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @dtcxzyw (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.


https://github.com/llvm/llvm-project/pull/90682
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375) (PR #90673)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @aemerson (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90673
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Enable EVEX512 when host CPU has AVX512 (#90479) (PR #90545)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @nikic (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90545
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Don't form anyextending atomic loads. (PR #90435)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @nikic (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90435
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AArch64] Remove invalid uabdl patterns. (#89272) (PR #89380)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @AtariDreams (or anyone else). If you would like to add a note about this 
fix in the release notes (completely optional). Please reply to this comment 
with a one or two sentence description of the fix.  When you are done, please 
add the release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89380
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: [clang][CoverageMapping] do not emit a gap region when either end doesn't have valid source locations (#89564) (PR #90369)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @whentojump (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90369
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Check hasEVEX512 for canExtendTo512DQ (#90390) (PR #90422)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @phoebewang (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90422
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [CGP] Drop poison-generating flags after hoisting (#90382) (PR #90437)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @dtcxzyw  (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90437
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Clang] Handle structs with inner structs and no fields (#89126) (PR #90133)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @bwendling (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90133
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [IRCE] Skip icmp ptr in `InductiveRangeCheck::parseRangeCheckICmp` (#89967) (PR #90182)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @nikic (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90182
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Backport ARM64EC variadic args fixes to LLVM 18 (PR #81800)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @dpaoliello  (or anyone else). If you would like to add a note about this 
fix in the release notes (completely optional). Please reply to this comment 
with a one or two sentence description of the fix.  When you are done, please 
add the release:note label to this PR.

https://github.com/llvm/llvm-project/pull/81800
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: [clang-format] Fix a regression in ContinuationIndenter (#88414) (PR #89412)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @owenca (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89412
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: [clang-format] Fix a regression in annotating TrailingReturnArrow (#86624) (PR #89415)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @owenca (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89415
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU] Fix setting nontemporal in memory legalizer (#83815) (PR #90204)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @jayfoad (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90204
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [DAGCombiner] Fix miscompile bug in combineShiftOfShiftedLogic (#89616) (PR #89766)

2024-05-01 Thread Tom Stellard via llvm-branch-commits
=?utf-8?q?Björn?= Pettersson 
Message-ID:
In-Reply-To: 


tstellar wrote:

@AtariDreams (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89766
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [polly] release/18.x: [clang-format] Correctly annotate braces in macros (#87… (PR #89491)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@owenca (or anyone else). If you would like to add a note about this fix in the 
release notes (completely optional). Please reply to this comment with a one or 
two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89491
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] release/18.x: [libcxx] [modules] Add _LIBCPP_USING_IF_EXISTS on aligned_alloc (#89827) (PR #89894)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@mstorsjo (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89894
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [analyzer] Backport performace regression fix (PR #89725)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @steakhal (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89725
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Fix miscompile in combineShiftRightArithmetic (PR #86728)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @AtariDreams (or anyone else). If you would like to add a note about this 
fix in the release notes (completely optional). Please reply to this comment 
with a one or two sentence description of the fix.  When you are done, please 
add the release:note label to this PR.

https://github.com/llvm/llvm-project/pull/86728
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Fix fewerElementsVectorPhi to insert after G_PHIs (#87927) (PR #89240)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @AtariDreams (or anyone else). If you would like to add a note about this 
fix in the release notes (completely optional). Please reply to this comment 
with a one or two sentence description of the fix.  When you are done, please 
add the release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89240
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [InstCombine] Fix unexpected overwriting in `foldSelectWithSRem` (#89539) (PR #89546)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @dtcxzyw (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89546
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] Backport fix for crash reported in #88181 (PR #89022)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @steakhal (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89022
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Always use 64-bit relocations in no-PIC large code model (#89101) (PR #89124)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @aeubanks (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/89124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151) (PR #90639)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@efriedma-quic (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.


https://github.com/llvm/llvm-project/pull/90639
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151) (PR #90639)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/90639

>From 617a15a9eac96088ae5e9134248d8236e34b91b1 Mon Sep 17 00:00:00 2001
From: Eli Friedman 
Date: Mon, 29 Apr 2024 12:00:12 -0700
Subject: [PATCH] [clang codegen] Fix MS ABI detection of user-provided
 constructors. (#90151)

In the context of determining whether a class counts as an "aggregate",
a constructor template counts as a user-provided constructor.

Fixes #86384

(cherry picked from commit 3ab4ae9e58c09dfd8203547ba8916f3458a0a481)
---
 clang/docs/ReleaseNotes.rst  |  6 ++
 clang/lib/CodeGen/MicrosoftCXXABI.cpp| 12 +---
 clang/test/CodeGen/arm64-microsoft-arguments.cpp | 15 +++
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 1e88b58725bd95..e533ecfd5aeba5 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -149,6 +149,12 @@ ABI Changes in This Version
 - Following the SystemV ABI for x86-64, ``__int128`` arguments will no longer
   be split between a register and a stack slot.
 
+- Fixed Microsoft calling convention for returning certain classes with a
+  templated constructor. If a class has a templated constructor, it should
+  be returned indirectly even if it meets all the other requirements for
+  returning a class in a register. This affects some uses of std::pair.
+  (#GH86384).
+
 AST Dumping Potentially Breaking Changes
 
 - When dumping a sugared type, Clang will no longer print the desugared type if
diff --git a/clang/lib/CodeGen/MicrosoftCXXABI.cpp 
b/clang/lib/CodeGen/MicrosoftCXXABI.cpp
index 172c4c937b9728..4d0f4c63f843b8 100644
--- a/clang/lib/CodeGen/MicrosoftCXXABI.cpp
+++ b/clang/lib/CodeGen/MicrosoftCXXABI.cpp
@@ -1135,9 +1135,15 @@ static bool isTrivialForMSVC(const CXXRecordDecl *RD, 
QualType Ty,
 return false;
   if (RD->hasNonTrivialCopyAssignment())
 return false;
-  for (const CXXConstructorDecl *Ctor : RD->ctors())
-if (Ctor->isUserProvided())
-  return false;
+  for (const Decl *D : RD->decls()) {
+if (auto *Ctor = dyn_cast(D)) {
+  if (Ctor->isUserProvided())
+return false;
+} else if (auto *Template = dyn_cast(D)) {
+  if (isa(Template->getTemplatedDecl()))
+return false;
+}
+  }
   if (RD->hasNonTrivialDestructor())
 return false;
   return true;
diff --git a/clang/test/CodeGen/arm64-microsoft-arguments.cpp 
b/clang/test/CodeGen/arm64-microsoft-arguments.cpp
index e8309888dcfe21..85472645acb3b3 100644
--- a/clang/test/CodeGen/arm64-microsoft-arguments.cpp
+++ b/clang/test/CodeGen/arm64-microsoft-arguments.cpp
@@ -201,3 +201,18 @@ S11 f11() {
   S11 x;
   return func11(x);
 }
+
+// GH86384
+// Pass and return object with template constructor (pass directly,
+// return indirectly).
+// CHECK: define dso_local void @"?f12@@YA?AUS12@@XZ"(ptr dead_on_unwind inreg 
noalias writable sret(%struct.S12) align 4 {{.*}})
+// CHECK: call void @"?func12@@YA?AUS12@@U1@@Z"(ptr dead_on_unwind inreg 
writable sret(%struct.S12) align 4 {{.*}}, i64 {{.*}})
+struct S12 {
+  template S12(T*) {}
+  int x;
+};
+S12 func12(S12 x);
+S12 f12() {
+  S12 x((int*)0);
+  return func12(x);
+}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] 617a15a - [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

Author: Eli Friedman
Date: 2024-05-01T15:56:33-07:00
New Revision: 617a15a9eac96088ae5e9134248d8236e34b91b1

URL: 
https://github.com/llvm/llvm-project/commit/617a15a9eac96088ae5e9134248d8236e34b91b1
DIFF: 
https://github.com/llvm/llvm-project/commit/617a15a9eac96088ae5e9134248d8236e34b91b1.diff

LOG: [clang codegen] Fix MS ABI detection of user-provided constructors. 
(#90151)

In the context of determining whether a class counts as an "aggregate",
a constructor template counts as a user-provided constructor.

Fixes #86384

(cherry picked from commit 3ab4ae9e58c09dfd8203547ba8916f3458a0a481)

Added: 


Modified: 
clang/docs/ReleaseNotes.rst
clang/lib/CodeGen/MicrosoftCXXABI.cpp
clang/test/CodeGen/arm64-microsoft-arguments.cpp

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 1e88b58725bd95..e533ecfd5aeba5 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -149,6 +149,12 @@ ABI Changes in This Version
 - Following the SystemV ABI for x86-64, ``__int128`` arguments will no longer
   be split between a register and a stack slot.
 
+- Fixed Microsoft calling convention for returning certain classes with a
+  templated constructor. If a class has a templated constructor, it should
+  be returned indirectly even if it meets all the other requirements for
+  returning a class in a register. This affects some uses of std::pair.
+  (#GH86384).
+
 AST Dumping Potentially Breaking Changes
 
 - When dumping a sugared type, Clang will no longer print the desugared type if

diff  --git a/clang/lib/CodeGen/MicrosoftCXXABI.cpp 
b/clang/lib/CodeGen/MicrosoftCXXABI.cpp
index 172c4c937b9728..4d0f4c63f843b8 100644
--- a/clang/lib/CodeGen/MicrosoftCXXABI.cpp
+++ b/clang/lib/CodeGen/MicrosoftCXXABI.cpp
@@ -1135,9 +1135,15 @@ static bool isTrivialForMSVC(const CXXRecordDecl *RD, 
QualType Ty,
 return false;
   if (RD->hasNonTrivialCopyAssignment())
 return false;
-  for (const CXXConstructorDecl *Ctor : RD->ctors())
-if (Ctor->isUserProvided())
-  return false;
+  for (const Decl *D : RD->decls()) {
+if (auto *Ctor = dyn_cast(D)) {
+  if (Ctor->isUserProvided())
+return false;
+} else if (auto *Template = dyn_cast(D)) {
+  if (isa(Template->getTemplatedDecl()))
+return false;
+}
+  }
   if (RD->hasNonTrivialDestructor())
 return false;
   return true;

diff  --git a/clang/test/CodeGen/arm64-microsoft-arguments.cpp 
b/clang/test/CodeGen/arm64-microsoft-arguments.cpp
index e8309888dcfe21..85472645acb3b3 100644
--- a/clang/test/CodeGen/arm64-microsoft-arguments.cpp
+++ b/clang/test/CodeGen/arm64-microsoft-arguments.cpp
@@ -201,3 +201,18 @@ S11 f11() {
   S11 x;
   return func11(x);
 }
+
+// GH86384
+// Pass and return object with template constructor (pass directly,
+// return indirectly).
+// CHECK: define dso_local void @"?f12@@YA?AUS12@@XZ"(ptr dead_on_unwind inreg 
noalias writable sret(%struct.S12) align 4 {{.*}})
+// CHECK: call void @"?func12@@YA?AUS12@@U1@@Z"(ptr dead_on_unwind inreg 
writable sret(%struct.S12) align 4 {{.*}}, i64 {{.*}})
+struct S12 {
+  template S12(T*) {}
+  int x;
+};
+S12 func12(S12 x);
+S12 f12() {
+  S12 x((int*)0);
+  return func12(x);
+}



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151) (PR #90639)

2024-05-01 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/90639
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU] Fix setting nontemporal in memory legalizer (#83815) (PR #90204)

2024-05-02 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@jayfoad If it's not noteworthy, then it's OK to not add a release note.  We 
don't typically have a list of fixed bugs in the release notes.

https://github.com/llvm/llvm-project/pull/90204
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Bump version to 18.1.6 (PR #91094)

2024-05-04 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar created 
https://github.com/llvm/llvm-project/pull/91094

None

>From b71b9cfce7f3e5dce0cf1856df95cfe8d16252f1 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Sat, 4 May 2024 21:56:44 +
Subject: [PATCH] Bump version to 18.1.6

---
 llvm/CMakeLists.txt| 2 +-
 llvm/utils/lit/lit/__init__.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index f82be164ac9c48..26b7b01bb1f8de 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -22,7 +22,7 @@ if(NOT DEFINED LLVM_VERSION_MINOR)
   set(LLVM_VERSION_MINOR 1)
 endif()
 if(NOT DEFINED LLVM_VERSION_PATCH)
-  set(LLVM_VERSION_PATCH 5)
+  set(LLVM_VERSION_PATCH 6)
 endif()
 if(NOT DEFINED LLVM_VERSION_SUFFIX)
   set(LLVM_VERSION_SUFFIX)
diff --git a/llvm/utils/lit/lit/__init__.py b/llvm/utils/lit/lit/__init__.py
index 1cfcc7d37813bc..d8b0e3bd1c69e3 100644
--- a/llvm/utils/lit/lit/__init__.py
+++ b/llvm/utils/lit/lit/__init__.py
@@ -2,7 +2,7 @@
 
 __author__ = "Daniel Dunbar"
 __email__ = "dan...@minormatter.com"
-__versioninfo__ = (18, 1, 5)
+__versioninfo__ = (18, 1, 6)
 __version__ = ".".join(str(v) for v in __versioninfo__) + "dev"
 
 __all__ = []

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)

2024-05-04 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar created 
https://github.com/llvm/llvm-project/pull/91095

None

>From b71b9cfce7f3e5dce0cf1856df95cfe8d16252f1 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Sat, 4 May 2024 21:56:44 +
Subject: [PATCH 1/4] Bump version to 18.1.6

---
 llvm/CMakeLists.txt| 2 +-
 llvm/utils/lit/lit/__init__.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index f82be164ac9c48..26b7b01bb1f8de 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -22,7 +22,7 @@ if(NOT DEFINED LLVM_VERSION_MINOR)
   set(LLVM_VERSION_MINOR 1)
 endif()
 if(NOT DEFINED LLVM_VERSION_PATCH)
-  set(LLVM_VERSION_PATCH 5)
+  set(LLVM_VERSION_PATCH 6)
 endif()
 if(NOT DEFINED LLVM_VERSION_SUFFIX)
   set(LLVM_VERSION_SUFFIX)
diff --git a/llvm/utils/lit/lit/__init__.py b/llvm/utils/lit/lit/__init__.py
index 1cfcc7d37813bc..d8b0e3bd1c69e3 100644
--- a/llvm/utils/lit/lit/__init__.py
+++ b/llvm/utils/lit/lit/__init__.py
@@ -2,7 +2,7 @@
 
 __author__ = "Daniel Dunbar"
 __email__ = "dan...@minormatter.com"
-__versioninfo__ = (18, 1, 5)
+__versioninfo__ = (18, 1, 6)
 __version__ = ".".join(str(v) for v in __versioninfo__) + "dev"
 
 __all__ = []

>From dc6392e374ef8367e98b996569f3bb2898bcb99a Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Wed, 24 Apr 2024 07:47:42 -0700
Subject: [PATCH 2/4] [CMake][Release] Add stage2-package target (#89517)

This target will be used to generate the release binary package for
uploading to GitHub.

(cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e)
---
 clang/cmake/caches/Release.cmake | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index bd1f688d61a7ea..fa972636553f1f 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -14,6 +14,7 @@ if (LLVM_RELEASE_ENABLE_PGO)
   set(CLANG_BOOTSTRAP_TARGETS
 generate-profdata
 stage2
+stage2-package
 stage2-clang
 stage2-distribution
 stage2-install
@@ -57,6 +58,7 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
 set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "")
 set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS
   clang
+  package
   check-all
   check-llvm
   check-clang CACHE STRING "")

>From 89f6c6ed99e27397e1d4ac8a0cf2e7d3cf11bccd Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Thu, 25 Apr 2024 15:32:08 -0700
Subject: [PATCH 3/4] [CMake][Release] Refactor cache file and use two stages
 for non-PGO builds (#89812)

Completely refactor the cache file to simplify it and remove unnecessary
variables. The main functional change here is that the non-PGO builds
now use two stages, so `ninja -C build stage2-package` can be used with
both PGO and non-PGO builds.

(cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69)
---
 clang/cmake/caches/Release.cmake | 134 +++
 1 file changed, 66 insertions(+), 68 deletions(-)

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index fa972636553f1f..c164d5497275f3 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -1,95 +1,93 @@
 # Plain options configure the first build.
 # BOOTSTRAP_* options configure the second build.
 # BOOTSTRAP_BOOTSTRAP_* options configure the third build.
+# PGO Builds have 3 stages (stage1, stage2-instrumented, stage2)
+# non-PGO Builds have 2 stages (stage1, stage2)
 
-# General Options
+
+function (set_final_stage_var name value type)
+  if (LLVM_RELEASE_ENABLE_PGO)
+set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  else()
+set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+function (set_instrument_and_final_stage_var name value type)
+  # This sets the varaible for the final stage in non-PGO builds and in
+  # the stage2-instrumented stage for PGO builds.
+  set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  if (LLVM_RELEASE_ENABLE_PGO)
+# Set the variable in the final stage for PGO builds.
+set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+# General Options:
+# If you want to override any of the LLVM_RELEASE_* variables you can set them
+# on the command line via -D, but you need to do this before you pass this
+# cache file to CMake via -C. e.g.
+#
+# cmake -D LLVM_RELEASE_ENABLE_PGO=ON -C Release.cmake
 set(LLVM_RELEASE_ENABLE_LTO THIN CACHE STRING "")
 set(LLVM_RELEASE_ENABLE_PGO OFF CACHE BOOL "")
-
+set(LLVM_RELEASE_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" 
CACHE STRING "")
+set(LLVM_RELEASE_ENABLE_PROJECTS 
"clang;lld;lldb;clang-tools-extra;bolt;polly;mlir;flang" CACHE STRING "")
+# Note we don't need to add install here, since it is one of the pre-defined
+# steps.
+set(LLVM_RELEASE_FINAL_STAGE_TARGETS 
"clang;package;check-all;check-llvm;check-clang" CACHE STRING "")
 set(CMAKE_BUILD_TYPE RELEASE CACHE STRING "")
 
-# Stage 1 Bo

[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)

2024-05-04 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91095

>From b71b9cfce7f3e5dce0cf1856df95cfe8d16252f1 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Sat, 4 May 2024 21:56:44 +
Subject: [PATCH 1/8] Bump version to 18.1.6

---
 llvm/CMakeLists.txt| 2 +-
 llvm/utils/lit/lit/__init__.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index f82be164ac9c48..26b7b01bb1f8de 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -22,7 +22,7 @@ if(NOT DEFINED LLVM_VERSION_MINOR)
   set(LLVM_VERSION_MINOR 1)
 endif()
 if(NOT DEFINED LLVM_VERSION_PATCH)
-  set(LLVM_VERSION_PATCH 5)
+  set(LLVM_VERSION_PATCH 6)
 endif()
 if(NOT DEFINED LLVM_VERSION_SUFFIX)
   set(LLVM_VERSION_SUFFIX)
diff --git a/llvm/utils/lit/lit/__init__.py b/llvm/utils/lit/lit/__init__.py
index 1cfcc7d37813bc..d8b0e3bd1c69e3 100644
--- a/llvm/utils/lit/lit/__init__.py
+++ b/llvm/utils/lit/lit/__init__.py
@@ -2,7 +2,7 @@
 
 __author__ = "Daniel Dunbar"
 __email__ = "dan...@minormatter.com"
-__versioninfo__ = (18, 1, 5)
+__versioninfo__ = (18, 1, 6)
 __version__ = ".".join(str(v) for v in __versioninfo__) + "dev"
 
 __all__ = []

>From dc6392e374ef8367e98b996569f3bb2898bcb99a Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Wed, 24 Apr 2024 07:47:42 -0700
Subject: [PATCH 2/8] [CMake][Release] Add stage2-package target (#89517)

This target will be used to generate the release binary package for
uploading to GitHub.

(cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e)
---
 clang/cmake/caches/Release.cmake | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index bd1f688d61a7ea..fa972636553f1f 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -14,6 +14,7 @@ if (LLVM_RELEASE_ENABLE_PGO)
   set(CLANG_BOOTSTRAP_TARGETS
 generate-profdata
 stage2
+stage2-package
 stage2-clang
 stage2-distribution
 stage2-install
@@ -57,6 +58,7 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
 set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "")
 set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS
   clang
+  package
   check-all
   check-llvm
   check-clang CACHE STRING "")

>From 89f6c6ed99e27397e1d4ac8a0cf2e7d3cf11bccd Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Thu, 25 Apr 2024 15:32:08 -0700
Subject: [PATCH 3/8] [CMake][Release] Refactor cache file and use two stages
 for non-PGO builds (#89812)

Completely refactor the cache file to simplify it and remove unnecessary
variables. The main functional change here is that the non-PGO builds
now use two stages, so `ninja -C build stage2-package` can be used with
both PGO and non-PGO builds.

(cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69)
---
 clang/cmake/caches/Release.cmake | 134 +++
 1 file changed, 66 insertions(+), 68 deletions(-)

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index fa972636553f1f..c164d5497275f3 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -1,95 +1,93 @@
 # Plain options configure the first build.
 # BOOTSTRAP_* options configure the second build.
 # BOOTSTRAP_BOOTSTRAP_* options configure the third build.
+# PGO Builds have 3 stages (stage1, stage2-instrumented, stage2)
+# non-PGO Builds have 2 stages (stage1, stage2)
 
-# General Options
+
+function (set_final_stage_var name value type)
+  if (LLVM_RELEASE_ENABLE_PGO)
+set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  else()
+set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+function (set_instrument_and_final_stage_var name value type)
+  # This sets the varaible for the final stage in non-PGO builds and in
+  # the stage2-instrumented stage for PGO builds.
+  set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  if (LLVM_RELEASE_ENABLE_PGO)
+# Set the variable in the final stage for PGO builds.
+set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+# General Options:
+# If you want to override any of the LLVM_RELEASE_* variables you can set them
+# on the command line via -D, but you need to do this before you pass this
+# cache file to CMake via -C. e.g.
+#
+# cmake -D LLVM_RELEASE_ENABLE_PGO=ON -C Release.cmake
 set(LLVM_RELEASE_ENABLE_LTO THIN CACHE STRING "")
 set(LLVM_RELEASE_ENABLE_PGO OFF CACHE BOOL "")
-
+set(LLVM_RELEASE_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" 
CACHE STRING "")
+set(LLVM_RELEASE_ENABLE_PROJECTS 
"clang;lld;lldb;clang-tools-extra;bolt;polly;mlir;flang" CACHE STRING "")
+# Note we don't need to add install here, since it is one of the pre-defined
+# steps.
+set(LLVM_RELEASE_FINAL_STAGE_TARGETS 
"clang;package;check-all;check-llvm;check-clang" CACHE STRING "")
 set(CMAKE_BUILD_TYPE RELEASE CACHE STRING "")
 
-# Stage 1 Bootstra

[llvm-branch-commits] [llvm] [workflows] Fix libclang-abi-tests to work with new version scheme (PR #91096)

2024-05-04 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar created 
https://github.com/llvm/llvm-project/pull/91096

None

>From 19cb0cd2e2e499b46593d4708f0beaab671586bd Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Sat, 4 May 2024 23:10:21 +
Subject: [PATCH] [workflows] Fix libclang-abi-tests to work with new version
 scheme

---
 .github/workflows/libclang-abi-tests.yml | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/libclang-abi-tests.yml 
b/.github/workflows/libclang-abi-tests.yml
index ccfc1e5fb8a742..14da910e667ea1 100644
--- a/.github/workflows/libclang-abi-tests.yml
+++ b/.github/workflows/libclang-abi-tests.yml
@@ -51,9 +51,10 @@ jobs:
 id: vars
 run: |
   remote_repo='https://github.com/llvm/llvm-project'
-  if [ ${{ steps.version.outputs.LLVM_VERSION_MINOR }} -ne 0 ] || [ 
${{ steps.version.outputs.LLVM_VERSION_PATCH }} -eq 0 ]; then
+  echo "BASELINE_VERSION_MINOR=1" >> "$GITHUB_OUTPUT"
+  if [ ${{ steps.version.outputs.LLVM_VERSION_PATCH }} -eq 0 ]; then
 major_version=$(( ${{ steps.version.outputs.LLVM_VERSION_MAJOR }} 
- 1))
-baseline_ref="llvmorg-$major_version.0.0"
+baseline_ref="llvmorg-$major_version.1.0"
 
 # If there is a minor release, we want to use that as the base 
line.
 minor_ref=$(git ls-remote --refs -t "$remote_repo" 
llvmorg-"$major_version".[1-9].[0-9] | tail -n1 | grep -o 'llvmorg-.\+' || true)
@@ -75,7 +76,7 @@ jobs:
   else
 {
   echo "BASELINE_VERSION_MAJOR=${{ 
steps.version.outputs.LLVM_VERSION_MAJOR }}"
-  echo "BASELINE_REF=llvmorg-${{ 
steps.version.outputs.LLVM_VERSION_MAJOR }}.0.0"
+  echo "BASELINE_REF=llvmorg-${{ 
steps.version.outputs.LLVM_VERSION_MAJOR }}.1.0"
   echo "ABI_HEADERS=."
   echo "ABI_LIBS=libclang.so libclang-cpp.so"
 } >> "$GITHUB_OUTPUT"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: Prepend all library intrinsics with `#` when building for Arm64EC (PR #88016)

2024-05-07 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Hi @dpaoliello I'm seeing some "Illegal Instruction" errors when running the 
bolt tests on aarch64.  Do you think there is any chance this commit could be 
the cause?  It's the only one between 18.1.3 and 18.1.4 that touches the 
aarch64 code gen.  Here is the full log:

https://kojipkgs.fedoraproject.org//work/tasks/3490/117353490/build.log

https://github.com/llvm/llvm-project/pull/88016
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [workflows] Rework pre-commit CI for the release branch (PR #91550)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar created 
https://github.com/llvm/llvm-project/pull/91550

This rewrites the pre-commit CI for the release branch so that it behaves 
almost exactly like the current buildkite builders.  It builds every project 
and uses a better filtering method for selecting which projects to build.

In addition, with this change we drop the Linux and Windows test configs, since 
these are already covered by buildkite and add a config for macos/aarch64.

>From a590088cbdf37d3c4d274c5ab9d6d4e4de9c922c Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 16 Feb 2024 21:34:02 +
Subject: [PATCH] [workflows] Rework pre-commit CI for the release branch

This rewrites the pre-commit CI for the release branch so that it
behaves almost exactly like the current buildkite builders.  It builds
every project and uses a better filtering method for selecting which
projects to build.

In addition, with this change we drop the Linux and Windows test
configs, since these are already covered by buildkite and add a
config for macos/aarch64.
---
 .github/workflows/ci-tests.yml| 154 
 .../compute-projects-to-test/action.yml   |  21 ++
 .../compute-projects-to-test.sh   | 221 ++
 .github/workflows/continue-timeout-job.yml|  75 ++
 .github/workflows/get-job-id/action.yml   |  30 +++
 .../workflows/pr-sccache-restore/action.yml   |  26 +++
 .github/workflows/pr-sccache-save/action.yml  |  50 
 .github/workflows/timeout-restore/action.yml  |  33 +++
 .github/workflows/timeout-save/action.yml |  94 
 .../unprivileged-download-artifact/action.yml |  77 ++
 10 files changed, 781 insertions(+)
 create mode 100644 .github/workflows/ci-tests.yml
 create mode 100644 .github/workflows/compute-projects-to-test/action.yml
 create mode 100755 
.github/workflows/compute-projects-to-test/compute-projects-to-test.sh
 create mode 100644 .github/workflows/continue-timeout-job.yml
 create mode 100644 .github/workflows/get-job-id/action.yml
 create mode 100644 .github/workflows/pr-sccache-restore/action.yml
 create mode 100644 .github/workflows/pr-sccache-save/action.yml
 create mode 100644 .github/workflows/timeout-restore/action.yml
 create mode 100644 .github/workflows/timeout-save/action.yml
 create mode 100644 .github/workflows/unprivileged-download-artifact/action.yml

diff --git a/.github/workflows/ci-tests.yml b/.github/workflows/ci-tests.yml
new file mode 100644
index 0..22e39174abee7
--- /dev/null
+++ b/.github/workflows/ci-tests.yml
@@ -0,0 +1,154 @@
+name: "CI Tests"
+
+permissions:
+  contents: read
+
+on:
+  pull_request:
+types:
+  - opened
+  - synchronize
+  - reopened
+  # When a PR is closed, we still start this workflow, but then skip
+  # all the jobs, which makes it effectively a no-op.  The reason to
+  # do this is that it allows us to take advantage of concurrency groups
+  # to cancel in progress CI jobs whenever the PR is closed.
+  - closed
+branches:
+  - main
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number }}
+  cancel-in-progress: True
+
+jobs:
+  compute-test-configs:
+name: "Compute Configurations to Test"
+if: github.event.action != 'closed'
+runs-on: ubuntu-22.04
+outputs:
+  projects: ${{ steps.vars.outputs.projects }}
+  check-targets: ${{ steps.vars.outputs.check-targets }}
+  test-build: ${{ steps.vars.outputs.check-targets != '' }}
+  test-platforms: ${{ steps.platforms.outputs.result }}
+steps:
+  - name: Fetch LLVM sources
+uses: actions/checkout@v4
+with:
+  fetch-depth: 2
+
+  - name: Compute projects to test
+id: vars
+uses: ./.github/workflows/compute-projects-to-test
+
+  - name: Compute platforms to test
+uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea 
#v7.0.1
+id: platforms
+with:
+  script: |
+linuxConfig = {
+  name: "linux-x86_64",
+  runs_on: "ubuntu-22.04"
+}
+windowsConfig = {
+  name: "windows-x86_64",
+  runs_on: "windows-2022"
+}
+macConfig = {
+  name: "macos-x86_64",
+  runs_on: "macos-13"
+}
+macArmConfig = {
+  name: "macos-aarch64",
+  runs_on: "macos-14"
+}
+
+configs = []
+
+const base_ref = process.env.GITHUB_BASE_REF;
+if (base_ref.startsWith('release/')) {
+  // This is a pull request against a release branch.
+  configs.push(macConfig)
+  configs.push(macArmConfig)
+}
+
+return configs;
+
+  ci-build-test:
+# If this job name is changed, then we need to update the job-name
+# paramater for the timeout-save step below.
+name: "Build"
+needs:
+

[llvm-branch-commits] [llvm] [workflows] Rework pre-commit CI for the release branch (PR #91550)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91550

>From 8ea4c39bef000973979cc75a39006e5f87481ee2 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 16 Feb 2024 21:34:02 +
Subject: [PATCH] [workflows] Rework pre-commit CI for the release branch

This rewrites the pre-commit CI for the release branch so that it
behaves almost exactly like the current buildkite builders.  It builds
every project and uses a better filtering method for selecting which
projects to build.

In addition, with this change we drop the Linux and Windows test
configs, since these are already covered by buildkite and add a
config for macos/aarch64.
---
 .github/workflows/ci-tests.yml| 156 +
 .../compute-projects-to-test/action.yml   |  21 ++
 .../compute-projects-to-test.sh   | 221 ++
 .github/workflows/continue-timeout-job.yml|  75 ++
 .github/workflows/get-job-id/action.yml   |  30 +++
 .github/workflows/lld-tests.yml   |  38 ---
 .../workflows/pr-sccache-restore/action.yml   |  26 +++
 .github/workflows/pr-sccache-save/action.yml  |  50 
 .github/workflows/timeout-restore/action.yml  |  33 +++
 .github/workflows/timeout-save/action.yml |  94 
 .../unprivileged-download-artifact/action.yml |  77 ++
 11 files changed, 783 insertions(+), 38 deletions(-)
 create mode 100644 .github/workflows/ci-tests.yml
 create mode 100644 .github/workflows/compute-projects-to-test/action.yml
 create mode 100755 
.github/workflows/compute-projects-to-test/compute-projects-to-test.sh
 create mode 100644 .github/workflows/continue-timeout-job.yml
 create mode 100644 .github/workflows/get-job-id/action.yml
 delete mode 100644 .github/workflows/lld-tests.yml
 create mode 100644 .github/workflows/pr-sccache-restore/action.yml
 create mode 100644 .github/workflows/pr-sccache-save/action.yml
 create mode 100644 .github/workflows/timeout-restore/action.yml
 create mode 100644 .github/workflows/timeout-save/action.yml
 create mode 100644 .github/workflows/unprivileged-download-artifact/action.yml

diff --git a/.github/workflows/ci-tests.yml b/.github/workflows/ci-tests.yml
new file mode 100644
index 0..e1d1c02755939
--- /dev/null
+++ b/.github/workflows/ci-tests.yml
@@ -0,0 +1,156 @@
+name: "CI Tests"
+
+permissions:
+  contents: read
+
+on:
+  pull_request:
+types:
+  - opened
+  - synchronize
+  - reopened
+  # When a PR is closed, we still start this workflow, but then skip
+  # all the jobs, which makes it effectively a no-op.  The reason to
+  # do this is that it allows us to take advantage of concurrency groups
+  # to cancel in progress CI jobs whenever the PR is closed.
+  - closed
+branches:
+  - 'release/**'
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number }}
+  cancel-in-progress: True
+
+jobs:
+  compute-test-configs:
+name: "Compute Configurations to Test"
+if: >-
+  github.repository_owner == 'llvm' &&
+  github.event.action != 'closed'
+runs-on: ubuntu-22.04
+outputs:
+  projects: ${{ steps.vars.outputs.projects }}
+  check-targets: ${{ steps.vars.outputs.check-targets }}
+  test-build: ${{ steps.vars.outputs.check-targets != '' }}
+  test-platforms: ${{ steps.platforms.outputs.result }}
+steps:
+  - name: Fetch LLVM sources
+uses: actions/checkout@v4
+with:
+  fetch-depth: 2
+
+  - name: Compute projects to test
+id: vars
+uses: ./.github/workflows/compute-projects-to-test
+
+  - name: Compute platforms to test
+uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea 
#v7.0.1
+id: platforms
+with:
+  script: |
+linuxConfig = {
+  name: "linux-x86_64",
+  runs_on: "ubuntu-22.04"
+}
+windowsConfig = {
+  name: "windows-x86_64",
+  runs_on: "windows-2022"
+}
+macConfig = {
+  name: "macos-x86_64",
+  runs_on: "macos-13"
+}
+macArmConfig = {
+  name: "macos-aarch64",
+  runs_on: "macos-14"
+}
+
+configs = []
+
+const base_ref = process.env.GITHUB_BASE_REF;
+if (base_ref.startsWith('release/')) {
+  // This is a pull request against a release branch.
+  configs.push(macConfig)
+  configs.push(macArmConfig)
+}
+
+return configs;
+
+  ci-build-test:
+# If this job name is changed, then we need to update the job-name
+# paramater for the timeout-save step below.
+name: "Build"
+needs:
+  - compute-test-configs
+permissions:
+  actions: write #pr-sccache-save may delete artifacts.
+runs-on: ${{ matrix.runs_on }}
+strategy:
+  fail-fast: false
+  matrix

[llvm-branch-commits] [llvm] [workflows] Rework pre-commit CI for the release branch (PR #91550)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91550

>From 8ea4c39bef000973979cc75a39006e5f87481ee2 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 16 Feb 2024 21:34:02 +
Subject: [PATCH 1/2] [workflows] Rework pre-commit CI for the release branch

This rewrites the pre-commit CI for the release branch so that it
behaves almost exactly like the current buildkite builders.  It builds
every project and uses a better filtering method for selecting which
projects to build.

In addition, with this change we drop the Linux and Windows test
configs, since these are already covered by buildkite and add a
config for macos/aarch64.
---
 .github/workflows/ci-tests.yml| 156 +
 .../compute-projects-to-test/action.yml   |  21 ++
 .../compute-projects-to-test.sh   | 221 ++
 .github/workflows/continue-timeout-job.yml|  75 ++
 .github/workflows/get-job-id/action.yml   |  30 +++
 .github/workflows/lld-tests.yml   |  38 ---
 .../workflows/pr-sccache-restore/action.yml   |  26 +++
 .github/workflows/pr-sccache-save/action.yml  |  50 
 .github/workflows/timeout-restore/action.yml  |  33 +++
 .github/workflows/timeout-save/action.yml |  94 
 .../unprivileged-download-artifact/action.yml |  77 ++
 11 files changed, 783 insertions(+), 38 deletions(-)
 create mode 100644 .github/workflows/ci-tests.yml
 create mode 100644 .github/workflows/compute-projects-to-test/action.yml
 create mode 100755 
.github/workflows/compute-projects-to-test/compute-projects-to-test.sh
 create mode 100644 .github/workflows/continue-timeout-job.yml
 create mode 100644 .github/workflows/get-job-id/action.yml
 delete mode 100644 .github/workflows/lld-tests.yml
 create mode 100644 .github/workflows/pr-sccache-restore/action.yml
 create mode 100644 .github/workflows/pr-sccache-save/action.yml
 create mode 100644 .github/workflows/timeout-restore/action.yml
 create mode 100644 .github/workflows/timeout-save/action.yml
 create mode 100644 .github/workflows/unprivileged-download-artifact/action.yml

diff --git a/.github/workflows/ci-tests.yml b/.github/workflows/ci-tests.yml
new file mode 100644
index 0..e1d1c02755939
--- /dev/null
+++ b/.github/workflows/ci-tests.yml
@@ -0,0 +1,156 @@
+name: "CI Tests"
+
+permissions:
+  contents: read
+
+on:
+  pull_request:
+types:
+  - opened
+  - synchronize
+  - reopened
+  # When a PR is closed, we still start this workflow, but then skip
+  # all the jobs, which makes it effectively a no-op.  The reason to
+  # do this is that it allows us to take advantage of concurrency groups
+  # to cancel in progress CI jobs whenever the PR is closed.
+  - closed
+branches:
+  - 'release/**'
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number }}
+  cancel-in-progress: True
+
+jobs:
+  compute-test-configs:
+name: "Compute Configurations to Test"
+if: >-
+  github.repository_owner == 'llvm' &&
+  github.event.action != 'closed'
+runs-on: ubuntu-22.04
+outputs:
+  projects: ${{ steps.vars.outputs.projects }}
+  check-targets: ${{ steps.vars.outputs.check-targets }}
+  test-build: ${{ steps.vars.outputs.check-targets != '' }}
+  test-platforms: ${{ steps.platforms.outputs.result }}
+steps:
+  - name: Fetch LLVM sources
+uses: actions/checkout@v4
+with:
+  fetch-depth: 2
+
+  - name: Compute projects to test
+id: vars
+uses: ./.github/workflows/compute-projects-to-test
+
+  - name: Compute platforms to test
+uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea 
#v7.0.1
+id: platforms
+with:
+  script: |
+linuxConfig = {
+  name: "linux-x86_64",
+  runs_on: "ubuntu-22.04"
+}
+windowsConfig = {
+  name: "windows-x86_64",
+  runs_on: "windows-2022"
+}
+macConfig = {
+  name: "macos-x86_64",
+  runs_on: "macos-13"
+}
+macArmConfig = {
+  name: "macos-aarch64",
+  runs_on: "macos-14"
+}
+
+configs = []
+
+const base_ref = process.env.GITHUB_BASE_REF;
+if (base_ref.startsWith('release/')) {
+  // This is a pull request against a release branch.
+  configs.push(macConfig)
+  configs.push(macArmConfig)
+}
+
+return configs;
+
+  ci-build-test:
+# If this job name is changed, then we need to update the job-name
+# paramater for the timeout-save step below.
+name: "Build"
+needs:
+  - compute-test-configs
+permissions:
+  actions: write #pr-sccache-save may delete artifacts.
+runs-on: ${{ matrix.runs_on }}
+strategy:
+  fail-fast: false
+  ma

[llvm-branch-commits] [llvm] [workflows] Rework pre-commit CI for the release branch (PR #91550)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91550

>From 8ea4c39bef000973979cc75a39006e5f87481ee2 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 16 Feb 2024 21:34:02 +
Subject: [PATCH 1/3] [workflows] Rework pre-commit CI for the release branch

This rewrites the pre-commit CI for the release branch so that it
behaves almost exactly like the current buildkite builders.  It builds
every project and uses a better filtering method for selecting which
projects to build.

In addition, with this change we drop the Linux and Windows test
configs, since these are already covered by buildkite and add a
config for macos/aarch64.
---
 .github/workflows/ci-tests.yml| 156 +
 .../compute-projects-to-test/action.yml   |  21 ++
 .../compute-projects-to-test.sh   | 221 ++
 .github/workflows/continue-timeout-job.yml|  75 ++
 .github/workflows/get-job-id/action.yml   |  30 +++
 .github/workflows/lld-tests.yml   |  38 ---
 .../workflows/pr-sccache-restore/action.yml   |  26 +++
 .github/workflows/pr-sccache-save/action.yml  |  50 
 .github/workflows/timeout-restore/action.yml  |  33 +++
 .github/workflows/timeout-save/action.yml |  94 
 .../unprivileged-download-artifact/action.yml |  77 ++
 11 files changed, 783 insertions(+), 38 deletions(-)
 create mode 100644 .github/workflows/ci-tests.yml
 create mode 100644 .github/workflows/compute-projects-to-test/action.yml
 create mode 100755 
.github/workflows/compute-projects-to-test/compute-projects-to-test.sh
 create mode 100644 .github/workflows/continue-timeout-job.yml
 create mode 100644 .github/workflows/get-job-id/action.yml
 delete mode 100644 .github/workflows/lld-tests.yml
 create mode 100644 .github/workflows/pr-sccache-restore/action.yml
 create mode 100644 .github/workflows/pr-sccache-save/action.yml
 create mode 100644 .github/workflows/timeout-restore/action.yml
 create mode 100644 .github/workflows/timeout-save/action.yml
 create mode 100644 .github/workflows/unprivileged-download-artifact/action.yml

diff --git a/.github/workflows/ci-tests.yml b/.github/workflows/ci-tests.yml
new file mode 100644
index 0..e1d1c02755939
--- /dev/null
+++ b/.github/workflows/ci-tests.yml
@@ -0,0 +1,156 @@
+name: "CI Tests"
+
+permissions:
+  contents: read
+
+on:
+  pull_request:
+types:
+  - opened
+  - synchronize
+  - reopened
+  # When a PR is closed, we still start this workflow, but then skip
+  # all the jobs, which makes it effectively a no-op.  The reason to
+  # do this is that it allows us to take advantage of concurrency groups
+  # to cancel in progress CI jobs whenever the PR is closed.
+  - closed
+branches:
+  - 'release/**'
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number }}
+  cancel-in-progress: True
+
+jobs:
+  compute-test-configs:
+name: "Compute Configurations to Test"
+if: >-
+  github.repository_owner == 'llvm' &&
+  github.event.action != 'closed'
+runs-on: ubuntu-22.04
+outputs:
+  projects: ${{ steps.vars.outputs.projects }}
+  check-targets: ${{ steps.vars.outputs.check-targets }}
+  test-build: ${{ steps.vars.outputs.check-targets != '' }}
+  test-platforms: ${{ steps.platforms.outputs.result }}
+steps:
+  - name: Fetch LLVM sources
+uses: actions/checkout@v4
+with:
+  fetch-depth: 2
+
+  - name: Compute projects to test
+id: vars
+uses: ./.github/workflows/compute-projects-to-test
+
+  - name: Compute platforms to test
+uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea 
#v7.0.1
+id: platforms
+with:
+  script: |
+linuxConfig = {
+  name: "linux-x86_64",
+  runs_on: "ubuntu-22.04"
+}
+windowsConfig = {
+  name: "windows-x86_64",
+  runs_on: "windows-2022"
+}
+macConfig = {
+  name: "macos-x86_64",
+  runs_on: "macos-13"
+}
+macArmConfig = {
+  name: "macos-aarch64",
+  runs_on: "macos-14"
+}
+
+configs = []
+
+const base_ref = process.env.GITHUB_BASE_REF;
+if (base_ref.startsWith('release/')) {
+  // This is a pull request against a release branch.
+  configs.push(macConfig)
+  configs.push(macArmConfig)
+}
+
+return configs;
+
+  ci-build-test:
+# If this job name is changed, then we need to update the job-name
+# paramater for the timeout-save step below.
+name: "Build"
+needs:
+  - compute-test-configs
+permissions:
+  actions: write #pr-sccache-save may delete artifacts.
+runs-on: ${{ matrix.runs_on }}
+strategy:
+  fail-fast: false
+  ma

[llvm-branch-commits] [llvm] Bump version to 18.1.6 (PR #91094)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91094
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)

2024-05-08 Thread Tom Stellard via llvm-branch-commits


@@ -22,7 +22,7 @@ if(NOT DEFINED LLVM_VERSION_MINOR)
   set(LLVM_VERSION_MINOR 1)
 endif()
 if(NOT DEFINED LLVM_VERSION_PATCH)
-  set(LLVM_VERSION_PATCH 5)
+  set(LLVM_VERSION_PATCH 6)

tstellar wrote:

I just merged this commit in another PR.

https://github.com/llvm/llvm-project/pull/91095
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91095

>From f2c5a10e1f27768b031b8b54cb056fd4e261ad8f Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Wed, 24 Apr 2024 07:47:42 -0700
Subject: [PATCH 1/7] [CMake][Release] Add stage2-package target (#89517)

This target will be used to generate the release binary package for
uploading to GitHub.

(cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e)
---
 clang/cmake/caches/Release.cmake | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index bd1f688d61a7e..fa972636553f1 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -14,6 +14,7 @@ if (LLVM_RELEASE_ENABLE_PGO)
   set(CLANG_BOOTSTRAP_TARGETS
 generate-profdata
 stage2
+stage2-package
 stage2-clang
 stage2-distribution
 stage2-install
@@ -57,6 +58,7 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
 set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "")
 set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS
   clang
+  package
   check-all
   check-llvm
   check-clang CACHE STRING "")

>From ce88e86e428be7eea517201ddee8d62150ae8de4 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Thu, 25 Apr 2024 15:32:08 -0700
Subject: [PATCH 2/7] [CMake][Release] Refactor cache file and use two stages
 for non-PGO builds (#89812)

Completely refactor the cache file to simplify it and remove unnecessary
variables. The main functional change here is that the non-PGO builds
now use two stages, so `ninja -C build stage2-package` can be used with
both PGO and non-PGO builds.

(cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69)
---
 clang/cmake/caches/Release.cmake | 134 +++
 1 file changed, 66 insertions(+), 68 deletions(-)

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index fa972636553f1..c164d5497275f 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -1,95 +1,93 @@
 # Plain options configure the first build.
 # BOOTSTRAP_* options configure the second build.
 # BOOTSTRAP_BOOTSTRAP_* options configure the third build.
+# PGO Builds have 3 stages (stage1, stage2-instrumented, stage2)
+# non-PGO Builds have 2 stages (stage1, stage2)
 
-# General Options
+
+function (set_final_stage_var name value type)
+  if (LLVM_RELEASE_ENABLE_PGO)
+set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  else()
+set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+function (set_instrument_and_final_stage_var name value type)
+  # This sets the varaible for the final stage in non-PGO builds and in
+  # the stage2-instrumented stage for PGO builds.
+  set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  if (LLVM_RELEASE_ENABLE_PGO)
+# Set the variable in the final stage for PGO builds.
+set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+# General Options:
+# If you want to override any of the LLVM_RELEASE_* variables you can set them
+# on the command line via -D, but you need to do this before you pass this
+# cache file to CMake via -C. e.g.
+#
+# cmake -D LLVM_RELEASE_ENABLE_PGO=ON -C Release.cmake
 set(LLVM_RELEASE_ENABLE_LTO THIN CACHE STRING "")
 set(LLVM_RELEASE_ENABLE_PGO OFF CACHE BOOL "")
-
+set(LLVM_RELEASE_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" 
CACHE STRING "")
+set(LLVM_RELEASE_ENABLE_PROJECTS 
"clang;lld;lldb;clang-tools-extra;bolt;polly;mlir;flang" CACHE STRING "")
+# Note we don't need to add install here, since it is one of the pre-defined
+# steps.
+set(LLVM_RELEASE_FINAL_STAGE_TARGETS 
"clang;package;check-all;check-llvm;check-clang" CACHE STRING "")
 set(CMAKE_BUILD_TYPE RELEASE CACHE STRING "")
 
-# Stage 1 Bootstrap Setup
+# Stage 1 Options
+set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
 set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")
+
+set(STAGE1_PROJECTS "clang")
+set(STAGE1_RUNTIMES "")
+
 if (LLVM_RELEASE_ENABLE_PGO)
+  list(APPEND STAGE1_PROJECTS "lld")
+  list(APPEND STAGE1_RUNTIMES "compiler-rt")
   set(CLANG_BOOTSTRAP_TARGETS
 generate-profdata
-stage2
 stage2-package
 stage2-clang
-stage2-distribution
 stage2-install
-stage2-install-distribution
-stage2-install-distribution-toolchain
 stage2-check-all
 stage2-check-llvm
-stage2-check-clang
-stage2-test-suite CACHE STRING "")
-else()
-  set(CLANG_BOOTSTRAP_TARGETS
-clang
-check-all
-check-llvm
-check-clang
-test-suite
-stage3
-stage3-clang
-stage3-check-all
-stage3-check-llvm
-stage3-check-clang
-stage3-install
-stage3-test-suite CACHE STRING "")
-endif()
+stage2-check-clang CACHE STRING "")
 
-# Stage 1 Options
-set(STAGE1_PROJECTS "clang")
-set(STAGE1_RUNTIMES "")
+  # Configuration for stage2-instrumented
+  set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "")
+  # This e

[llvm-branch-commits] [clang] f2c5a10 - [CMake][Release] Add stage2-package target (#89517)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

Author: Tom Stellard
Date: 2024-05-08T19:47:50-07:00
New Revision: f2c5a10e1f27768b031b8b54cb056fd4e261ad8f

URL: 
https://github.com/llvm/llvm-project/commit/f2c5a10e1f27768b031b8b54cb056fd4e261ad8f
DIFF: 
https://github.com/llvm/llvm-project/commit/f2c5a10e1f27768b031b8b54cb056fd4e261ad8f.diff

LOG: [CMake][Release] Add stage2-package target (#89517)

This target will be used to generate the release binary package for
uploading to GitHub.

(cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e)

Added: 


Modified: 
clang/cmake/caches/Release.cmake

Removed: 




diff  --git a/clang/cmake/caches/Release.cmake 
b/clang/cmake/caches/Release.cmake
index bd1f688d61a7e..fa972636553f1 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -14,6 +14,7 @@ if (LLVM_RELEASE_ENABLE_PGO)
   set(CLANG_BOOTSTRAP_TARGETS
 generate-profdata
 stage2
+stage2-package
 stage2-clang
 stage2-distribution
 stage2-install
@@ -57,6 +58,7 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
 set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "")
 set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS
   clang
+  package
   check-all
   check-llvm
   check-clang CACHE STRING "")



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] ce88e86 - [CMake][Release] Refactor cache file and use two stages for non-PGO builds (#89812)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

Author: Tom Stellard
Date: 2024-05-08T19:47:50-07:00
New Revision: ce88e86e428be7eea517201ddee8d62150ae8de4

URL: 
https://github.com/llvm/llvm-project/commit/ce88e86e428be7eea517201ddee8d62150ae8de4
DIFF: 
https://github.com/llvm/llvm-project/commit/ce88e86e428be7eea517201ddee8d62150ae8de4.diff

LOG: [CMake][Release] Refactor cache file and use two stages for non-PGO builds 
(#89812)

Completely refactor the cache file to simplify it and remove unnecessary
variables. The main functional change here is that the non-PGO builds
now use two stages, so `ninja -C build stage2-package` can be used with
both PGO and non-PGO builds.

(cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69)

Added: 


Modified: 
clang/cmake/caches/Release.cmake

Removed: 




diff  --git a/clang/cmake/caches/Release.cmake 
b/clang/cmake/caches/Release.cmake
index fa972636553f1..c164d5497275f 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -1,95 +1,93 @@
 # Plain options configure the first build.
 # BOOTSTRAP_* options configure the second build.
 # BOOTSTRAP_BOOTSTRAP_* options configure the third build.
+# PGO Builds have 3 stages (stage1, stage2-instrumented, stage2)
+# non-PGO Builds have 2 stages (stage1, stage2)
 
-# General Options
+
+function (set_final_stage_var name value type)
+  if (LLVM_RELEASE_ENABLE_PGO)
+set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  else()
+set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+function (set_instrument_and_final_stage_var name value type)
+  # This sets the varaible for the final stage in non-PGO builds and in
+  # the stage2-instrumented stage for PGO builds.
+  set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  if (LLVM_RELEASE_ENABLE_PGO)
+# Set the variable in the final stage for PGO builds.
+set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+# General Options:
+# If you want to override any of the LLVM_RELEASE_* variables you can set them
+# on the command line via -D, but you need to do this before you pass this
+# cache file to CMake via -C. e.g.
+#
+# cmake -D LLVM_RELEASE_ENABLE_PGO=ON -C Release.cmake
 set(LLVM_RELEASE_ENABLE_LTO THIN CACHE STRING "")
 set(LLVM_RELEASE_ENABLE_PGO OFF CACHE BOOL "")
-
+set(LLVM_RELEASE_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" 
CACHE STRING "")
+set(LLVM_RELEASE_ENABLE_PROJECTS 
"clang;lld;lldb;clang-tools-extra;bolt;polly;mlir;flang" CACHE STRING "")
+# Note we don't need to add install here, since it is one of the pre-defined
+# steps.
+set(LLVM_RELEASE_FINAL_STAGE_TARGETS 
"clang;package;check-all;check-llvm;check-clang" CACHE STRING "")
 set(CMAKE_BUILD_TYPE RELEASE CACHE STRING "")
 
-# Stage 1 Bootstrap Setup
+# Stage 1 Options
+set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
 set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")
+
+set(STAGE1_PROJECTS "clang")
+set(STAGE1_RUNTIMES "")
+
 if (LLVM_RELEASE_ENABLE_PGO)
+  list(APPEND STAGE1_PROJECTS "lld")
+  list(APPEND STAGE1_RUNTIMES "compiler-rt")
   set(CLANG_BOOTSTRAP_TARGETS
 generate-profdata
-stage2
 stage2-package
 stage2-clang
-stage2-distribution
 stage2-install
-stage2-install-distribution
-stage2-install-distribution-toolchain
 stage2-check-all
 stage2-check-llvm
-stage2-check-clang
-stage2-test-suite CACHE STRING "")
-else()
-  set(CLANG_BOOTSTRAP_TARGETS
-clang
-check-all
-check-llvm
-check-clang
-test-suite
-stage3
-stage3-clang
-stage3-check-all
-stage3-check-llvm
-stage3-check-clang
-stage3-install
-stage3-test-suite CACHE STRING "")
-endif()
+stage2-check-clang CACHE STRING "")
 
-# Stage 1 Options
-set(STAGE1_PROJECTS "clang")
-set(STAGE1_RUNTIMES "")
+  # Configuration for stage2-instrumented
+  set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "")
+  # This enables the build targets for the final stage which is called stage2.
+  set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS ${LLVM_RELEASE_FINAL_STAGE_TARGETS} 
CACHE STRING "")
+  set(BOOTSTRAP_LLVM_BUILD_INSTRUMENTED IR CACHE STRING "")
+  set(BOOTSTRAP_LLVM_ENABLE_RUNTIMES "compiler-rt" CACHE STRING "")
+  set(BOOTSTRAP_LLVM_ENABLE_PROJECTS "clang;lld" CACHE STRING "")
 
-if (LLVM_RELEASE_ENABLE_PGO)
-  list(APPEND STAGE1_PROJECTS "lld")
-  list(APPEND STAGE1_RUNTIMES "compiler-rt")
+else()
+  if (LLVM_RELEASE_ENABLE_LTO)
+list(APPEND STAGE1_PROJECTS "lld")
+  endif()
+  # Any targets added here will be given the target name stage2-${target}, so
+  # if you want to run them you can just use:
+  # ninja -C $BUILDDIR stage2-${target}
+  set(CLANG_BOOTSTRAP_TARGETS ${LLVM_RELEASE_FINAL_STAGE_TARGETS} CACHE STRING 
"")
 endif()
 
+# Stage 1 Common Config
 set(LLVM_ENABLE_RUNTIMES ${STAGE1_RUNTIMES} CACHE STRING "")
 set(LLVM_ENABLE_PROJECTS ${STAGE1_PROJECTS} CACHE STRING "")
 

[llvm-branch-commits] [clang] b7e2397 - [CMake][Release] Enable CMAKE_POSITION_INDEPENDENT_CODE (#90139)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

Author: Tom Stellard
Date: 2024-05-08T19:47:50-07:00
New Revision: b7e2397c54b7cddac8fa188e68073f78e895a57a

URL: 
https://github.com/llvm/llvm-project/commit/b7e2397c54b7cddac8fa188e68073f78e895a57a
DIFF: 
https://github.com/llvm/llvm-project/commit/b7e2397c54b7cddac8fa188e68073f78e895a57a.diff

LOG: [CMake][Release] Enable CMAKE_POSITION_INDEPENDENT_CODE (#90139)

Set this in the cache file directly instead of via the test-release.sh
script so that the release builds can be reproduced with just the cache
file.

(cherry picked from commit 53ff002c6f7ec64a75ab0990b1314cc6b4bb67cf)

Added: 


Modified: 
clang/cmake/caches/Release.cmake
llvm/utils/release/test-release.sh

Removed: 




diff  --git a/clang/cmake/caches/Release.cmake 
b/clang/cmake/caches/Release.cmake
index c164d5497275f..c0bfcbdfc1c2a 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -82,6 +82,7 @@ set(LLVM_ENABLE_PROJECTS ${STAGE1_PROJECTS} CACHE STRING "")
 # stage2-instrumented and Final Stage Config:
 # Options that need to be set in both the instrumented stage (if we are doing
 # a pgo build) and the final stage.
+set_instrument_and_final_stage_var(CMAKE_POSITION_INDEPENDENT_CODE "ON" STRING)
 set_instrument_and_final_stage_var(LLVM_ENABLE_LTO 
"${LLVM_RELEASE_ENABLE_LTO}" STRING)
 if (LLVM_RELEASE_ENABLE_LTO)
   set_instrument_and_final_stage_var(LLVM_ENABLE_LLD "ON" BOOL)

diff  --git a/llvm/utils/release/test-release.sh 
b/llvm/utils/release/test-release.sh
index 4314b565e11b0..050004aa08c49 100755
--- a/llvm/utils/release/test-release.sh
+++ b/llvm/utils/release/test-release.sh
@@ -353,8 +353,7 @@ function build_with_cmake_cache() {
   env CC="$c_compiler" CXX="$cxx_compiler" \
   cmake -G "$generator" -B $CMakeBuildDir -S $SrcDir/llvm \
 -C $SrcDir/clang/cmake/caches/Release.cmake \
-   
-DCLANG_BOOTSTRAP_PASSTHROUGH="CMAKE_POSITION_INDEPENDENT_CODE;LLVM_LIT_ARGS" \
--DCMAKE_POSITION_INDEPENDENT_CODE=ON \
+   -DCLANG_BOOTSTRAP_PASSTHROUGH="LLVM_LIT_ARGS" \
 -DLLVM_LIT_ARGS="-j $NumJobs $LitVerbose" \
 $ExtraConfigureFlags
 2>&1 | tee $LogDir/llvm.configure-$Flavor.log



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91095
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595) (PR #90719)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/90719

>From 58e44d3c6f67d5402ec38913d4262b94e73ac123 Mon Sep 17 00:00:00 2001
From: David Stuttard 
Date: Wed, 1 May 2024 11:37:13 +0100
Subject: [PATCH] [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12
 (#90595)

Code to determine if a waitcnt is required before a barrier instruction
only
considered S_BARRIER.
gfx12 adds barrier_signal/wait so need to enhance the existing code to
look for
a barrier start (which is just an S_BARRIER for earlier architectures).
---
 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp   |  2 +-
 llvm/lib/Target/AMDGPU/SIInstrInfo.h  | 11 ++
 .../CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll   |  2 ++
 .../AMDGPU/llvm.amdgcn.s.barrier.wait.ll  | 22 +++
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp 
b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 6ecb1c8bf6e1d..7a3198612f86f 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -1832,7 +1832,7 @@ bool 
SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
   // not, we need to ensure the subtarget is capable of backing off barrier
   // instructions in case there are any outstanding memory operations that may
   // cause an exception. Otherwise, insert an explicit S_WAITCNT 0 here.
-  if (MI.getOpcode() == AMDGPU::S_BARRIER &&
+  if (TII->isBarrierStart(MI.getOpcode()) &&
   !ST->hasAutoWaitcntBeforeBarrier() && !ST->supportsBackOffBarrier()) {
 Wait = Wait.combined(
 AMDGPU::Waitcnt::allZero(ST->hasExtendedWaitCounts(), ST->hasVscnt()));
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h 
b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index 1c9dacc09f815..626d903c0c695 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -908,6 +908,17 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
 return MI.getDesc().TSFlags & SIInstrFlags::IsNeverUniform;
   }
 
+  // Check to see if opcode is for a barrier start. Pre gfx12 this is just the
+  // S_BARRIER, but after support for S_BARRIER_SIGNAL* / S_BARRIER_WAIT we 
want
+  // to check for the barrier start (S_BARRIER_SIGNAL*)
+  bool isBarrierStart(unsigned Opcode) const {
+return Opcode == AMDGPU::S_BARRIER ||
+   Opcode == AMDGPU::S_BARRIER_SIGNAL_M0 ||
+   Opcode == AMDGPU::S_BARRIER_SIGNAL_ISFIRST_M0 ||
+   Opcode == AMDGPU::S_BARRIER_SIGNAL_IMM ||
+   Opcode == AMDGPU::S_BARRIER_SIGNAL_ISFIRST_IMM;
+  }
+
   static bool doesNotReadTiedSource(const MachineInstr &MI) {
 return MI.getDesc().TSFlags & SIInstrFlags::TiedSourceNotRead;
   }
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll 
b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
index a7d3115af29bf..47c021769aa56 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
@@ -96,6 +96,7 @@ define amdgpu_kernel void @test_barrier(ptr addrspace(1) 
%out, i32 %size) #0 {
 ; VARIANT4-NEXT:s_wait_kmcnt 0x0
 ; VARIANT4-NEXT:v_xad_u32 v1, v0, -1, s2
 ; VARIANT4-NEXT:global_store_b32 v3, v0, s[0:1]
+; VARIANT4-NEXT:s_wait_storecnt 0x0
 ; VARIANT4-NEXT:s_barrier_signal -1
 ; VARIANT4-NEXT:s_barrier_wait -1
 ; VARIANT4-NEXT:v_ashrrev_i32_e32 v2, 31, v1
@@ -142,6 +143,7 @@ define amdgpu_kernel void @test_barrier(ptr addrspace(1) 
%out, i32 %size) #0 {
 ; VARIANT6-NEXT:v_dual_mov_b32 v4, s1 :: v_dual_mov_b32 v3, s0
 ; VARIANT6-NEXT:v_sub_nc_u32_e32 v1, s2, v0
 ; VARIANT6-NEXT:global_store_b32 v5, v0, s[0:1]
+; VARIANT6-NEXT:s_wait_storecnt 0x0
 ; VARIANT6-NEXT:s_barrier_signal -1
 ; VARIANT6-NEXT:s_barrier_wait -1
 ; VARIANT6-NEXT:v_ashrrev_i32_e32 v2, 31, v1
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll 
b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll
index 4ab5e97964a85..38a34ec6daf73 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll
@@ -12,6 +12,7 @@ define amdgpu_kernel void @test1_s_barrier_signal(ptr 
addrspace(1) %out) #0 {
 ; GCN-NEXT:v_sub_nc_u32_e32 v0, v1, v0
 ; GCN-NEXT:s_wait_kmcnt 0x0
 ; GCN-NEXT:global_store_b32 v3, v2, s[0:1]
+; GCN-NEXT:s_wait_storecnt 0x0
 ; GCN-NEXT:s_barrier_signal -1
 ; GCN-NEXT:s_barrier_wait -1
 ; GCN-NEXT:global_store_b32 v3, v0, s[0:1]
@@ -28,6 +29,7 @@ define amdgpu_kernel void @test1_s_barrier_signal(ptr 
addrspace(1) %out) #0 {
 ; GLOBAL-ISEL-NEXT:v_sub_nc_u32_e32 v0, v1, v0
 ; GLOBAL-ISEL-NEXT:s_wait_kmcnt 0x0
 ; GLOBAL-ISEL-NEXT:global_store_b32 v3, v2, s[0:1]
+; GLOBAL-ISEL-NEXT:s_wait_storecnt 0x0
 ; GLOBAL-ISEL-NEXT:s_barrier_signal -1
 ; GLOBAL-ISEL-NEXT:s_barrier_wait -1
 ; GLOBAL-ISEL-NEXT:global_store_b32 v3, v0, s[0:1]
@@ -56,6 +58,7 @@ define amdgpu_k

[llvm-branch-commits] [llvm] 58e44d3 - [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

Author: David Stuttard
Date: 2024-05-08T20:08:59-07:00
New Revision: 58e44d3c6f67d5402ec38913d4262b94e73ac123

URL: 
https://github.com/llvm/llvm-project/commit/58e44d3c6f67d5402ec38913d4262b94e73ac123
DIFF: 
https://github.com/llvm/llvm-project/commit/58e44d3c6f67d5402ec38913d4262b94e73ac123.diff

LOG: [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595)

Code to determine if a waitcnt is required before a barrier instruction
only
considered S_BARRIER.
gfx12 adds barrier_signal/wait so need to enhance the existing code to
look for
a barrier start (which is just an S_BARRIER for earlier architectures).

Added: 


Modified: 
llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
llvm/lib/Target/AMDGPU/SIInstrInfo.h
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll

Removed: 




diff  --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp 
b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 6ecb1c8bf6e1d..7a3198612f86f 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -1832,7 +1832,7 @@ bool 
SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
   // not, we need to ensure the subtarget is capable of backing off barrier
   // instructions in case there are any outstanding memory operations that may
   // cause an exception. Otherwise, insert an explicit S_WAITCNT 0 here.
-  if (MI.getOpcode() == AMDGPU::S_BARRIER &&
+  if (TII->isBarrierStart(MI.getOpcode()) &&
   !ST->hasAutoWaitcntBeforeBarrier() && !ST->supportsBackOffBarrier()) {
 Wait = Wait.combined(
 AMDGPU::Waitcnt::allZero(ST->hasExtendedWaitCounts(), ST->hasVscnt()));

diff  --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h 
b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index 1c9dacc09f815..626d903c0c695 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -908,6 +908,17 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
 return MI.getDesc().TSFlags & SIInstrFlags::IsNeverUniform;
   }
 
+  // Check to see if opcode is for a barrier start. Pre gfx12 this is just the
+  // S_BARRIER, but after support for S_BARRIER_SIGNAL* / S_BARRIER_WAIT we 
want
+  // to check for the barrier start (S_BARRIER_SIGNAL*)
+  bool isBarrierStart(unsigned Opcode) const {
+return Opcode == AMDGPU::S_BARRIER ||
+   Opcode == AMDGPU::S_BARRIER_SIGNAL_M0 ||
+   Opcode == AMDGPU::S_BARRIER_SIGNAL_ISFIRST_M0 ||
+   Opcode == AMDGPU::S_BARRIER_SIGNAL_IMM ||
+   Opcode == AMDGPU::S_BARRIER_SIGNAL_ISFIRST_IMM;
+  }
+
   static bool doesNotReadTiedSource(const MachineInstr &MI) {
 return MI.getDesc().TSFlags & SIInstrFlags::TiedSourceNotRead;
   }

diff  --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll 
b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
index a7d3115af29bf..47c021769aa56 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
@@ -96,6 +96,7 @@ define amdgpu_kernel void @test_barrier(ptr addrspace(1) 
%out, i32 %size) #0 {
 ; VARIANT4-NEXT:s_wait_kmcnt 0x0
 ; VARIANT4-NEXT:v_xad_u32 v1, v0, -1, s2
 ; VARIANT4-NEXT:global_store_b32 v3, v0, s[0:1]
+; VARIANT4-NEXT:s_wait_storecnt 0x0
 ; VARIANT4-NEXT:s_barrier_signal -1
 ; VARIANT4-NEXT:s_barrier_wait -1
 ; VARIANT4-NEXT:v_ashrrev_i32_e32 v2, 31, v1
@@ -142,6 +143,7 @@ define amdgpu_kernel void @test_barrier(ptr addrspace(1) 
%out, i32 %size) #0 {
 ; VARIANT6-NEXT:v_dual_mov_b32 v4, s1 :: v_dual_mov_b32 v3, s0
 ; VARIANT6-NEXT:v_sub_nc_u32_e32 v1, s2, v0
 ; VARIANT6-NEXT:global_store_b32 v5, v0, s[0:1]
+; VARIANT6-NEXT:s_wait_storecnt 0x0
 ; VARIANT6-NEXT:s_barrier_signal -1
 ; VARIANT6-NEXT:s_barrier_wait -1
 ; VARIANT6-NEXT:v_ashrrev_i32_e32 v2, 31, v1

diff  --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll 
b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll
index 4ab5e97964a85..38a34ec6daf73 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll
@@ -12,6 +12,7 @@ define amdgpu_kernel void @test1_s_barrier_signal(ptr 
addrspace(1) %out) #0 {
 ; GCN-NEXT:v_sub_nc_u32_e32 v0, v1, v0
 ; GCN-NEXT:s_wait_kmcnt 0x0
 ; GCN-NEXT:global_store_b32 v3, v2, s[0:1]
+; GCN-NEXT:s_wait_storecnt 0x0
 ; GCN-NEXT:s_barrier_signal -1
 ; GCN-NEXT:s_barrier_wait -1
 ; GCN-NEXT:global_store_b32 v3, v0, s[0:1]
@@ -28,6 +29,7 @@ define amdgpu_kernel void @test1_s_barrier_signal(ptr 
addrspace(1) %out) #0 {
 ; GLOBAL-ISEL-NEXT:v_sub_nc_u32_e32 v0, v1, v0
 ; GLOBAL-ISEL-NEXT:s_wait_kmcnt 0x0
 ; GLOBAL-ISEL-NEXT:global_store_b32 v3, v2, s[0:1]
+; GLOBAL-ISEL-NEXT:s_wait_storecnt 0x0
 ; GLOBAL-ISEL-NEXT:s_barrier_signal -1
 ; GLOBAL-ISEL-NEXT:s_b

[llvm-branch-commits] [llvm] [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595) (PR #90719)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/90719
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106) (PR #91118)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91118

>From 047cd915b86a4f35543ad4e691953aaa5a91c4fe Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Sun, 5 May 2024 18:40:27 +0800
Subject: [PATCH] [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit
 patterns (#91106)

With KNL/KNC being deprecated, we don't need to care about such no VLX
cases anymore. We may remove such patterns in the future.

Fixes #90844

(cherry picked from commit 7963d9a2b3c20561278a85b19e156e013231342c)
---
 llvm/lib/Target/X86/X86ISelLowering.cpp |  4 ++-
 llvm/lib/Target/X86/X86InstrAVX512.td   | 42 -
 llvm/test/CodeGen/X86/pr90844.ll| 19 +++
 3 files changed, 43 insertions(+), 22 deletions(-)
 create mode 100644 llvm/test/CodeGen/X86/pr90844.ll

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 71fc6b5047eaa..c572b27fe401e 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -29841,7 +29841,9 @@ static SDValue LowerRotate(SDValue Op, const 
X86Subtarget &Subtarget,
 return R;
 
   // AVX512 implicitly uses modulo rotation amounts.
-  if (Subtarget.hasAVX512() && 32 <= EltSizeInBits) {
+  if ((Subtarget.hasVLX() ||
+   (Subtarget.hasAVX512() && Subtarget.hasEVEX512())) &&
+  32 <= EltSizeInBits) {
 // Attempt to rotate by immediate.
 if (IsCstSplat) {
   unsigned RotOpc = IsROTL ? X86ISD::VROTLI : X86ISD::VROTRI;
diff --git a/llvm/lib/Target/X86/X86InstrAVX512.td 
b/llvm/lib/Target/X86/X86InstrAVX512.td
index bb5e22c714279..0564f2167d8ee 100644
--- a/llvm/lib/Target/X86/X86InstrAVX512.td
+++ b/llvm/lib/Target/X86/X86InstrAVX512.td
@@ -814,7 +814,7 @@ defm : vextract_for_size_lowering<"VEXTRACTF64x4Z", 
v32f16_info, v16f16x_info,
 
 // A 128-bit extract from bits [255:128] of a 512-bit vector should use a
 // smaller extract to enable EVEX->VEX.
-let Predicates = [NoVLX] in {
+let Predicates = [NoVLX, HasEVEX512] in {
 def : Pat<(v2i64 (extract_subvector (v8i64 VR512:$src), (iPTR 2))),
   (v2i64 (VEXTRACTI128rr
   (v4i64 (EXTRACT_SUBREG (v8i64 VR512:$src), sub_ymm)),
@@ -3068,7 +3068,7 @@ def : Pat<(Narrow.KVT (and Narrow.KRC:$mask,
addr:$src2, (X86cmpm_imm_commute timm:$cc)), Narrow.KRC)>;
 }
 
-let Predicates = [HasAVX512, NoVLX] in {
+let Predicates = [HasAVX512, NoVLX, HasEVEX512] in {
   defm : axv512_icmp_packed_cc_no_vlx_lowering;
   defm : axv512_icmp_packed_cc_no_vlx_lowering;
 
@@ -3099,7 +3099,7 @@ let Predicates = [HasAVX512, NoVLX] in {
   defm : axv512_cmp_packed_cc_no_vlx_lowering<"VCMPPD", v2f64x_info, 
v8f64_info>;
 }
 
-let Predicates = [HasBWI, NoVLX] in {
+let Predicates = [HasBWI, NoVLX, HasEVEX512] in {
   defm : axv512_icmp_packed_cc_no_vlx_lowering;
   defm : axv512_icmp_packed_cc_no_vlx_lowering;
 
@@ -3493,7 +3493,7 @@ multiclass mask_move_lowering;
   defm : mask_move_lowering<"VMOVDQA32Z", v4i32x_info, v16i32_info>;
   defm : mask_move_lowering<"VMOVAPSZ", v8f32x_info, v16f32_info>;
@@ -3505,7 +3505,7 @@ let Predicates = [HasAVX512, NoVLX] in {
   defm : mask_move_lowering<"VMOVDQA64Z", v4i64x_info, v8i64_info>;
 }
 
-let Predicates = [HasBWI, NoVLX] in {
+let Predicates = [HasBWI, NoVLX, HasEVEX512] in {
   defm : mask_move_lowering<"VMOVDQU8Z", v16i8x_info, v64i8_info>;
   defm : mask_move_lowering<"VMOVDQU8Z", v32i8x_info, v64i8_info>;
 
@@ -4998,8 +4998,8 @@ defm VPMINUD : avx512_binop_rm_vl_d<0x3B, "vpminud", umin,
 defm VPMINUQ : avx512_binop_rm_vl_q<0x3B, "vpminuq", umin,
 SchedWriteVecALU, HasAVX512, 1>, T8;
 
-// PMULLQ: Use 512bit version to implement 128/256 bit in case NoVLX.
-let Predicates = [HasDQI, NoVLX] in {
+// PMULLQ: Use 512bit version to implement 128/256 bit in case NoVLX, 
HasEVEX512.
+let Predicates = [HasDQI, NoVLX, HasEVEX512] in {
   def : Pat<(v4i64 (mul (v4i64 VR256X:$src1), (v4i64 VR256X:$src2))),
 (EXTRACT_SUBREG
 (VPMULLQZrr
@@ -5055,7 +5055,7 @@ multiclass avx512_min_max_lowering {
  sub_xmm)>;
 }
 
-let Predicates = [HasAVX512, NoVLX] in {
+let Predicates = [HasAVX512, NoVLX, HasEVEX512] in {
   defm : avx512_min_max_lowering<"VPMAXUQZ", umax>;
   defm : avx512_min_max_lowering<"VPMINUQZ", umin>;
   defm : avx512_min_max_lowering<"VPMAXSQZ", smax>;
@@ -6032,7 +6032,7 @@ defm VPSRL : avx512_shift_types<0xD2, 0xD3, 0xD1, 
"vpsrl", X86vsrl,
 SchedWriteVecShift>;
 
 // Use 512bit VPSRA/VPSRAI version to implement v2i64/v4i64 in case NoVLX.
-let Predicates = [HasAVX512, NoVLX] in {
+let Predicates = [HasAVX512, NoVLX, HasEVEX512] in {
   def : Pat<(v4i64 (X86vsra (v4i64 VR256X:$src1), (v2i64 VR128X:$src2))),
 (EXTRACT_SUBREG (v8i64
   (VPSRAQZrr
@@ -6161,14 +6161,14 @@ defm VPSRLV : avx512_var_shift_types<0x45, "vpsrlv", 
X86vsrlv, SchedWriteVarVecS
 defm VPRORV : avx512_var_shift_types<0x14, "vprorv", r

[llvm-branch-commits] [llvm] 047cd91 - [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

Author: Phoebe Wang
Date: 2024-05-08T20:10:38-07:00
New Revision: 047cd915b86a4f35543ad4e691953aaa5a91c4fe

URL: 
https://github.com/llvm/llvm-project/commit/047cd915b86a4f35543ad4e691953aaa5a91c4fe
DIFF: 
https://github.com/llvm/llvm-project/commit/047cd915b86a4f35543ad4e691953aaa5a91c4fe.diff

LOG: [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns 
(#91106)

With KNL/KNC being deprecated, we don't need to care about such no VLX
cases anymore. We may remove such patterns in the future.

Fixes #90844

(cherry picked from commit 7963d9a2b3c20561278a85b19e156e013231342c)

Added: 
llvm/test/CodeGen/X86/pr90844.ll

Modified: 
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/lib/Target/X86/X86InstrAVX512.td

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 71fc6b5047eaa..c572b27fe401e 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -29841,7 +29841,9 @@ static SDValue LowerRotate(SDValue Op, const 
X86Subtarget &Subtarget,
 return R;
 
   // AVX512 implicitly uses modulo rotation amounts.
-  if (Subtarget.hasAVX512() && 32 <= EltSizeInBits) {
+  if ((Subtarget.hasVLX() ||
+   (Subtarget.hasAVX512() && Subtarget.hasEVEX512())) &&
+  32 <= EltSizeInBits) {
 // Attempt to rotate by immediate.
 if (IsCstSplat) {
   unsigned RotOpc = IsROTL ? X86ISD::VROTLI : X86ISD::VROTRI;

diff  --git a/llvm/lib/Target/X86/X86InstrAVX512.td 
b/llvm/lib/Target/X86/X86InstrAVX512.td
index bb5e22c714279..0564f2167d8ee 100644
--- a/llvm/lib/Target/X86/X86InstrAVX512.td
+++ b/llvm/lib/Target/X86/X86InstrAVX512.td
@@ -814,7 +814,7 @@ defm : vextract_for_size_lowering<"VEXTRACTF64x4Z", 
v32f16_info, v16f16x_info,
 
 // A 128-bit extract from bits [255:128] of a 512-bit vector should use a
 // smaller extract to enable EVEX->VEX.
-let Predicates = [NoVLX] in {
+let Predicates = [NoVLX, HasEVEX512] in {
 def : Pat<(v2i64 (extract_subvector (v8i64 VR512:$src), (iPTR 2))),
   (v2i64 (VEXTRACTI128rr
   (v4i64 (EXTRACT_SUBREG (v8i64 VR512:$src), sub_ymm)),
@@ -3068,7 +3068,7 @@ def : Pat<(Narrow.KVT (and Narrow.KRC:$mask,
addr:$src2, (X86cmpm_imm_commute timm:$cc)), Narrow.KRC)>;
 }
 
-let Predicates = [HasAVX512, NoVLX] in {
+let Predicates = [HasAVX512, NoVLX, HasEVEX512] in {
   defm : axv512_icmp_packed_cc_no_vlx_lowering;
   defm : axv512_icmp_packed_cc_no_vlx_lowering;
 
@@ -3099,7 +3099,7 @@ let Predicates = [HasAVX512, NoVLX] in {
   defm : axv512_cmp_packed_cc_no_vlx_lowering<"VCMPPD", v2f64x_info, 
v8f64_info>;
 }
 
-let Predicates = [HasBWI, NoVLX] in {
+let Predicates = [HasBWI, NoVLX, HasEVEX512] in {
   defm : axv512_icmp_packed_cc_no_vlx_lowering;
   defm : axv512_icmp_packed_cc_no_vlx_lowering;
 
@@ -3493,7 +3493,7 @@ multiclass mask_move_lowering;
   defm : mask_move_lowering<"VMOVDQA32Z", v4i32x_info, v16i32_info>;
   defm : mask_move_lowering<"VMOVAPSZ", v8f32x_info, v16f32_info>;
@@ -3505,7 +3505,7 @@ let Predicates = [HasAVX512, NoVLX] in {
   defm : mask_move_lowering<"VMOVDQA64Z", v4i64x_info, v8i64_info>;
 }
 
-let Predicates = [HasBWI, NoVLX] in {
+let Predicates = [HasBWI, NoVLX, HasEVEX512] in {
   defm : mask_move_lowering<"VMOVDQU8Z", v16i8x_info, v64i8_info>;
   defm : mask_move_lowering<"VMOVDQU8Z", v32i8x_info, v64i8_info>;
 
@@ -4998,8 +4998,8 @@ defm VPMINUD : avx512_binop_rm_vl_d<0x3B, "vpminud", umin,
 defm VPMINUQ : avx512_binop_rm_vl_q<0x3B, "vpminuq", umin,
 SchedWriteVecALU, HasAVX512, 1>, T8;
 
-// PMULLQ: Use 512bit version to implement 128/256 bit in case NoVLX.
-let Predicates = [HasDQI, NoVLX] in {
+// PMULLQ: Use 512bit version to implement 128/256 bit in case NoVLX, 
HasEVEX512.
+let Predicates = [HasDQI, NoVLX, HasEVEX512] in {
   def : Pat<(v4i64 (mul (v4i64 VR256X:$src1), (v4i64 VR256X:$src2))),
 (EXTRACT_SUBREG
 (VPMULLQZrr
@@ -5055,7 +5055,7 @@ multiclass avx512_min_max_lowering {
  sub_xmm)>;
 }
 
-let Predicates = [HasAVX512, NoVLX] in {
+let Predicates = [HasAVX512, NoVLX, HasEVEX512] in {
   defm : avx512_min_max_lowering<"VPMAXUQZ", umax>;
   defm : avx512_min_max_lowering<"VPMINUQZ", umin>;
   defm : avx512_min_max_lowering<"VPMAXSQZ", smax>;
@@ -6032,7 +6032,7 @@ defm VPSRL : avx512_shift_types<0xD2, 0xD3, 0xD1, 
"vpsrl", X86vsrl,
 SchedWriteVecShift>;
 
 // Use 512bit VPSRA/VPSRAI version to implement v2i64/v4i64 in case NoVLX.
-let Predicates = [HasAVX512, NoVLX] in {
+let Predicates = [HasAVX512, NoVLX, HasEVEX512] in {
   def : Pat<(v4i64 (X86vsra (v4i64 VR256X:$src1), (v2i64 VR128X:$src2))),
 (EXTRACT_SUBREG (v8i64
   (VPSRAQZrr
@@ -6161,14 +6161,14 @@ defm VPSRLV : avx512_var_shift_types<0x45, "vpsrlv", 
X86vsrlv, SchedWriteVarVecS
 defm VPRORV : avx512_var_sh

[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106) (PR #91118)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91118
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972) (PR #91126)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@marcauberer You can just create manually create a pull request against the 
release/18.x branch with the fixes.

https://github.com/llvm/llvm-project/pull/91126
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91425

>From 2fc32a278e4fd46c6dd085845e69e84c321a3f75 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Mon, 6 May 2024 10:59:44 +0800
Subject: [PATCH 1/2] [X86][FP16] Do not create VBROADCAST_LOAD for f16 without
 AVX2 (#91125)

AVX doesn't provide 16-bit BROADCAST instruction.

Fixes #91005
---
 llvm/lib/Target/X86/X86ISelLowering.cpp |  2 +-
 llvm/test/CodeGen/X86/pr91005.ll| 39 +
 2 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/X86/pr91005.ll

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index c572b27fe401e..3e4ecab8443a9 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -7295,7 +7295,7 @@ static SDValue 
lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp,
 // With pattern matching, the VBROADCAST node may become a VMOVDDUP.
 if (ScalarSize == 32 ||
 (ScalarSize == 64 && (IsGE256 || Subtarget.hasVLX())) ||
-CVT == MVT::f16 ||
+(CVT == MVT::f16 && Subtarget.hasAVX2()) ||
 (OptForSize && (ScalarSize == 64 || Subtarget.hasAVX2( {
   const Constant *C = nullptr;
   if (ConstantSDNode *CI = dyn_cast(Ld))
diff --git a/llvm/test/CodeGen/X86/pr91005.ll b/llvm/test/CodeGen/X86/pr91005.ll
new file mode 100644
index 0..97fd1ce456882
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr91005.ll
@@ -0,0 +1,39 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+f16c < %s | FileCheck %s
+
+define void @PR91005(ptr %0) minsize {
+; CHECK-LABEL: PR91005:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:xorl %eax, %eax
+; CHECK-NEXT:testb %al, %al
+; CHECK-NEXT:je .LBB0_2
+; CHECK-NEXT:  # %bb.1:
+; CHECK-NEXT:vbroadcastss {{.*#+}} xmm0 = [31744,31744,31744,31744]
+; CHECK-NEXT:vpcmpeqw %xmm0, %xmm0, %xmm0
+; CHECK-NEXT:vpinsrw $0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm1
+; CHECK-NEXT:vpand %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:vcvtph2ps %xmm0, %xmm0
+; CHECK-NEXT:vpxor %xmm1, %xmm1, %xmm1
+; CHECK-NEXT:vmulss %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0
+; CHECK-NEXT:vmovd %xmm0, %eax
+; CHECK-NEXT:movw %ax, (%rdi)
+; CHECK-NEXT:  .LBB0_2: # %common.ret
+; CHECK-NEXT:retq
+  %2 = bitcast <2 x half> poison to <2 x i16>
+  %3 = icmp eq <2 x i16> %2, 
+  br i1 poison, label %4, label %common.ret
+
+common.ret:   ; preds = %4, %1
+  ret void
+
+4:; preds = %1
+  %5 = select <2 x i1> %3, <2 x half> , <2 x half> 
zeroinitializer
+  %6 = fmul <2 x half> %5, zeroinitializer
+  %7 = fsub <2 x half> %6, zeroinitializer
+  %8 = extractelement <2 x half> %7, i64 0
+  store half %8, ptr %0, align 2
+  br label %common.ret
+}
+
+declare <2 x half> @llvm.fabs.v2f16(<2 x half>)

>From 4d284b853f26a6cb848028720163561cabf63d95 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Wed, 8 May 2024 10:59:31 +0800
Subject: [PATCH 2/2] Fix difference with LLVM 18 release

---
 llvm/test/CodeGen/X86/pr91005.ll | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/llvm/test/CodeGen/X86/pr91005.ll b/llvm/test/CodeGen/X86/pr91005.ll
index 97fd1ce456882..16b78bf1e7e17 100644
--- a/llvm/test/CodeGen/X86/pr91005.ll
+++ b/llvm/test/CodeGen/X86/pr91005.ll
@@ -8,12 +8,13 @@ define void @PR91005(ptr %0) minsize {
 ; CHECK-NEXT:testb %al, %al
 ; CHECK-NEXT:je .LBB0_2
 ; CHECK-NEXT:  # %bb.1:
-; CHECK-NEXT:vbroadcastss {{.*#+}} xmm0 = [31744,31744,31744,31744]
-; CHECK-NEXT:vpcmpeqw %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:vpinsrw $0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm1
-; CHECK-NEXT:vpand %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:vpcmpeqw {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; CHECK-NEXT:vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; CHECK-NEXT:vpextrw $0, %xmm0, %eax
+; CHECK-NEXT:movzwl %ax, %eax
+; CHECK-NEXT:vmovd %eax, %xmm0
 ; CHECK-NEXT:vcvtph2ps %xmm0, %xmm0
-; CHECK-NEXT:vpxor %xmm1, %xmm1, %xmm1
+; CHECK-NEXT:vxorps %xmm1, %xmm1, %xmm1
 ; CHECK-NEXT:vmulss %xmm1, %xmm0, %xmm0
 ; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0
 ; CHECK-NEXT:vmovd %xmm0, %eax

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91425

>From dfc89f89ed14ebf22effe9dd9605608a975c4ed8 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Mon, 6 May 2024 10:59:44 +0800
Subject: [PATCH] [X86][FP16] Do not create VBROADCAST_LOAD for f16 without
 AVX2 (#91125)

AVX doesn't provide 16-bit BROADCAST instruction.

Fixes #91005
---
 llvm/lib/Target/X86/X86ISelLowering.cpp |  2 +-
 llvm/test/CodeGen/X86/pr91005.ll| 40 +
 2 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/X86/pr91005.ll

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index c572b27fe401e..3e4ecab8443a9 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -7295,7 +7295,7 @@ static SDValue 
lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp,
 // With pattern matching, the VBROADCAST node may become a VMOVDDUP.
 if (ScalarSize == 32 ||
 (ScalarSize == 64 && (IsGE256 || Subtarget.hasVLX())) ||
-CVT == MVT::f16 ||
+(CVT == MVT::f16 && Subtarget.hasAVX2()) ||
 (OptForSize && (ScalarSize == 64 || Subtarget.hasAVX2( {
   const Constant *C = nullptr;
   if (ConstantSDNode *CI = dyn_cast(Ld))
diff --git a/llvm/test/CodeGen/X86/pr91005.ll b/llvm/test/CodeGen/X86/pr91005.ll
new file mode 100644
index 0..16b78bf1e7e17
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr91005.ll
@@ -0,0 +1,40 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+f16c < %s | FileCheck %s
+
+define void @PR91005(ptr %0) minsize {
+; CHECK-LABEL: PR91005:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:xorl %eax, %eax
+; CHECK-NEXT:testb %al, %al
+; CHECK-NEXT:je .LBB0_2
+; CHECK-NEXT:  # %bb.1:
+; CHECK-NEXT:vpcmpeqw {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; CHECK-NEXT:vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; CHECK-NEXT:vpextrw $0, %xmm0, %eax
+; CHECK-NEXT:movzwl %ax, %eax
+; CHECK-NEXT:vmovd %eax, %xmm0
+; CHECK-NEXT:vcvtph2ps %xmm0, %xmm0
+; CHECK-NEXT:vxorps %xmm1, %xmm1, %xmm1
+; CHECK-NEXT:vmulss %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0
+; CHECK-NEXT:vmovd %xmm0, %eax
+; CHECK-NEXT:movw %ax, (%rdi)
+; CHECK-NEXT:  .LBB0_2: # %common.ret
+; CHECK-NEXT:retq
+  %2 = bitcast <2 x half> poison to <2 x i16>
+  %3 = icmp eq <2 x i16> %2, 
+  br i1 poison, label %4, label %common.ret
+
+common.ret:   ; preds = %4, %1
+  ret void
+
+4:; preds = %1
+  %5 = select <2 x i1> %3, <2 x half> , <2 x half> 
zeroinitializer
+  %6 = fmul <2 x half> %5, zeroinitializer
+  %7 = fsub <2 x half> %6, zeroinitializer
+  %8 = extractelement <2 x half> %7, i64 0
+  store half %8, ptr %0, align 2
+  br label %common.ret
+}
+
+declare <2 x half> @llvm.fabs.v2f16(<2 x half>)

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] dfc89f8 - [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

Author: Phoebe Wang
Date: 2024-05-08T20:14:03-07:00
New Revision: dfc89f89ed14ebf22effe9dd9605608a975c4ed8

URL: 
https://github.com/llvm/llvm-project/commit/dfc89f89ed14ebf22effe9dd9605608a975c4ed8
DIFF: 
https://github.com/llvm/llvm-project/commit/dfc89f89ed14ebf22effe9dd9605608a975c4ed8.diff

LOG: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125)

AVX doesn't provide 16-bit BROADCAST instruction.

Fixes #91005

Added: 
llvm/test/CodeGen/X86/pr91005.ll

Modified: 
llvm/lib/Target/X86/X86ISelLowering.cpp

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index c572b27fe401e..3e4ecab8443a9 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -7295,7 +7295,7 @@ static SDValue 
lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp,
 // With pattern matching, the VBROADCAST node may become a VMOVDDUP.
 if (ScalarSize == 32 ||
 (ScalarSize == 64 && (IsGE256 || Subtarget.hasVLX())) ||
-CVT == MVT::f16 ||
+(CVT == MVT::f16 && Subtarget.hasAVX2()) ||
 (OptForSize && (ScalarSize == 64 || Subtarget.hasAVX2( {
   const Constant *C = nullptr;
   if (ConstantSDNode *CI = dyn_cast(Ld))

diff  --git a/llvm/test/CodeGen/X86/pr91005.ll 
b/llvm/test/CodeGen/X86/pr91005.ll
new file mode 100644
index 0..16b78bf1e7e17
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr91005.ll
@@ -0,0 +1,40 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+f16c < %s | FileCheck %s
+
+define void @PR91005(ptr %0) minsize {
+; CHECK-LABEL: PR91005:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:xorl %eax, %eax
+; CHECK-NEXT:testb %al, %al
+; CHECK-NEXT:je .LBB0_2
+; CHECK-NEXT:  # %bb.1:
+; CHECK-NEXT:vpcmpeqw {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; CHECK-NEXT:vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; CHECK-NEXT:vpextrw $0, %xmm0, %eax
+; CHECK-NEXT:movzwl %ax, %eax
+; CHECK-NEXT:vmovd %eax, %xmm0
+; CHECK-NEXT:vcvtph2ps %xmm0, %xmm0
+; CHECK-NEXT:vxorps %xmm1, %xmm1, %xmm1
+; CHECK-NEXT:vmulss %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0
+; CHECK-NEXT:vmovd %xmm0, %eax
+; CHECK-NEXT:movw %ax, (%rdi)
+; CHECK-NEXT:  .LBB0_2: # %common.ret
+; CHECK-NEXT:retq
+  %2 = bitcast <2 x half> poison to <2 x i16>
+  %3 = icmp eq <2 x i16> %2, 
+  br i1 poison, label %4, label %common.ret
+
+common.ret:   ; preds = %4, %1
+  ret void
+
+4:; preds = %1
+  %5 = select <2 x i1> %3, <2 x half> , <2 x half> 
zeroinitializer
+  %6 = fmul <2 x half> %5, zeroinitializer
+  %7 = fsub <2 x half> %6, zeroinitializer
+  %8 = extractelement <2 x half> %7, i64 0
+  store half %8, ptr %0, align 2
+  br label %common.ret
+}
+
+declare <2 x half> @llvm.fabs.v2f16(<2 x half>)



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91425
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [SelectionDAG] Mark frame index as "aliased" at argument copy elison (PR #91035)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91035

>From f5f572f54b32f6ff3ae450fa421ed6d478f09ec8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B6rn=20Pettersson?= 
Date: Tue, 23 Apr 2024 13:49:18 +0200
Subject: [PATCH] [SelectionDAG] Mark frame index as "aliased" at argument copy
 elison (#89712)

This is a fix for miscompiles reported in
  https://github.com/llvm/llvm-project/issues/89060

After argument copy elison the IR value for the eliminated alloca
is aliasing with the fixed stack object. This patch is making sure
that we mark the fixed stack object as being aliased with IR values
to avoid that for example schedulers are reordering accesses to
the fixed stack object. This could otherwise happen when there is a
mix of MemOperands refering the shared fixed stack slow via both
the IR value for the elided alloca, and via a fixed stack pseudo
source value (as would be the case when lowering the arguments).

(cherry picked from commit d8b253be56b3e9073b3e59123cf2da0bcde20c63)
---
 llvm/include/llvm/CodeGen/MachineFrameInfo.h  |  7 
 .../SelectionDAG/SelectionDAGBuilder.cpp  |  3 +-
 llvm/test/CodeGen/Hexagon/arg-copy-elison.ll  | 39 +++
 3 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/Hexagon/arg-copy-elison.ll

diff --git a/llvm/include/llvm/CodeGen/MachineFrameInfo.h 
b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
index 7d11d63d4066f..c35faac09c4d9 100644
--- a/llvm/include/llvm/CodeGen/MachineFrameInfo.h
+++ b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
@@ -697,6 +697,13 @@ class MachineFrameInfo {
 return Objects[ObjectIdx+NumFixedObjects].isAliased;
   }
 
+  /// Set "maybe pointed to by an LLVM IR value" for an object.
+  void setIsAliasedObjectIndex(int ObjectIdx, bool IsAliased) {
+assert(unsigned(ObjectIdx+NumFixedObjects) < Objects.size() &&
+   "Invalid Object Idx!");
+Objects[ObjectIdx+NumFixedObjects].isAliased = IsAliased;
+  }
+
   /// Returns true if the specified index corresponds to an immutable object.
   bool isImmutableObjectIndex(int ObjectIdx) const {
 // Tail calling functions can clobber their function arguments.
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp 
b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 5ce1013f30fd1..7406a8ac1611d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -10888,7 +10888,7 @@ static void tryToElideArgumentCopy(
   }
 
   // Perform the elision. Delete the old stack object and replace its only use
-  // in the variable info map. Mark the stack object as mutable.
+  // in the variable info map. Mark the stack object as mutable and aliased.
   LLVM_DEBUG({
 dbgs() << "Eliding argument copy from " << Arg << " to " << *AI << '\n'
<< "  Replacing frame index " << OldIndex << " with " << FixedIndex
@@ -10896,6 +10896,7 @@ static void tryToElideArgumentCopy(
   });
   MFI.RemoveStackObject(OldIndex);
   MFI.setIsImmutableObjectIndex(FixedIndex, false);
+  MFI.setIsAliasedObjectIndex(FixedIndex, true);
   AllocaIndex = FixedIndex;
   ArgCopyElisionFrameIndexMap.insert({OldIndex, FixedIndex});
   for (SDValue ArgVal : ArgVals)
diff --git a/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll 
b/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll
new file mode 100644
index 0..f0c30c301f446
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll
@@ -0,0 +1,39 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc -mtriple hexagon-- -o - %s | FileCheck %s
+
+; Reproducer for https://github.com/llvm/llvm-project/issues/89060
+;
+; Problem was a bug in argument copy elison. Given that the %alloca is
+; eliminated, the same frame index will be used for accessing %alloca and %a
+; on the fixed stack. Care must be taken when setting up
+; MachinePointerInfo/MemOperands for those accesses to either make sure that
+; we always refer to the fixed stack slot the same way (not using the
+; ir.alloca name), or make sure that we still detect that they alias each
+; other if using different kinds of MemOperands to identify the same fixed
+; stack entry.
+;
+define i32 @f(i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, 
i32 %q1, i32 %a, i32 %q2) {
+; CHECK-LABEL: f:
+; CHECK: .cfi_startproc
+; CHECK-NEXT:  // %bb.0:
+; CHECK-NEXT:{
+; CHECK-NEXT: r0 = memw(r29+#36)
+; CHECK-NEXT: r1 = memw(r29+#28)
+; CHECK-NEXT:}
+; CHECK-NEXT:{
+; CHECK-NEXT: r0 = sub(r1,r0)
+; CHECK-NEXT: r2 = memw(r29+#32)
+; CHECK-NEXT: memw(r29+#32) = ##666
+; CHECK-NEXT:}
+; CHECK-NEXT:{
+; CHECK-NEXT: r0 = xor(r0,r2)
+; CHECK-NEXT: jumpr r31
+; CHECK-NEXT:}
+  %alloca = alloca i32
+  store i32 %a, ptr %alloca ; Should be elided.
+  store i32 666, ptr %alloca
+  %x = sub i32 %q1, %q2
+  %y = xor i32 %x, %a 

[llvm-branch-commits] [llvm] f5f572f - [SelectionDAG] Mark frame index as "aliased" at argument copy elison (#89712)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

Author: Björn Pettersson
Date: 2024-05-08T20:16:03-07:00
New Revision: f5f572f54b32f6ff3ae450fa421ed6d478f09ec8

URL: 
https://github.com/llvm/llvm-project/commit/f5f572f54b32f6ff3ae450fa421ed6d478f09ec8
DIFF: 
https://github.com/llvm/llvm-project/commit/f5f572f54b32f6ff3ae450fa421ed6d478f09ec8.diff

LOG: [SelectionDAG] Mark frame index as "aliased" at argument copy elison 
(#89712)

This is a fix for miscompiles reported in
  https://github.com/llvm/llvm-project/issues/89060

After argument copy elison the IR value for the eliminated alloca
is aliasing with the fixed stack object. This patch is making sure
that we mark the fixed stack object as being aliased with IR values
to avoid that for example schedulers are reordering accesses to
the fixed stack object. This could otherwise happen when there is a
mix of MemOperands refering the shared fixed stack slow via both
the IR value for the elided alloca, and via a fixed stack pseudo
source value (as would be the case when lowering the arguments).

(cherry picked from commit d8b253be56b3e9073b3e59123cf2da0bcde20c63)

Added: 
llvm/test/CodeGen/Hexagon/arg-copy-elison.ll

Modified: 
llvm/include/llvm/CodeGen/MachineFrameInfo.h
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Removed: 




diff  --git a/llvm/include/llvm/CodeGen/MachineFrameInfo.h 
b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
index 7d11d63d4066f..c35faac09c4d9 100644
--- a/llvm/include/llvm/CodeGen/MachineFrameInfo.h
+++ b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
@@ -697,6 +697,13 @@ class MachineFrameInfo {
 return Objects[ObjectIdx+NumFixedObjects].isAliased;
   }
 
+  /// Set "maybe pointed to by an LLVM IR value" for an object.
+  void setIsAliasedObjectIndex(int ObjectIdx, bool IsAliased) {
+assert(unsigned(ObjectIdx+NumFixedObjects) < Objects.size() &&
+   "Invalid Object Idx!");
+Objects[ObjectIdx+NumFixedObjects].isAliased = IsAliased;
+  }
+
   /// Returns true if the specified index corresponds to an immutable object.
   bool isImmutableObjectIndex(int ObjectIdx) const {
 // Tail calling functions can clobber their function arguments.

diff  --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp 
b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 5ce1013f30fd1..7406a8ac1611d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -10888,7 +10888,7 @@ static void tryToElideArgumentCopy(
   }
 
   // Perform the elision. Delete the old stack object and replace its only use
-  // in the variable info map. Mark the stack object as mutable.
+  // in the variable info map. Mark the stack object as mutable and aliased.
   LLVM_DEBUG({
 dbgs() << "Eliding argument copy from " << Arg << " to " << *AI << '\n'
<< "  Replacing frame index " << OldIndex << " with " << FixedIndex
@@ -10896,6 +10896,7 @@ static void tryToElideArgumentCopy(
   });
   MFI.RemoveStackObject(OldIndex);
   MFI.setIsImmutableObjectIndex(FixedIndex, false);
+  MFI.setIsAliasedObjectIndex(FixedIndex, true);
   AllocaIndex = FixedIndex;
   ArgCopyElisionFrameIndexMap.insert({OldIndex, FixedIndex});
   for (SDValue ArgVal : ArgVals)

diff  --git a/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll 
b/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll
new file mode 100644
index 0..f0c30c301f446
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll
@@ -0,0 +1,39 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc -mtriple hexagon-- -o - %s | FileCheck %s
+
+; Reproducer for https://github.com/llvm/llvm-project/issues/89060
+;
+; Problem was a bug in argument copy elison. Given that the %alloca is
+; eliminated, the same frame index will be used for accessing %alloca and %a
+; on the fixed stack. Care must be taken when setting up
+; MachinePointerInfo/MemOperands for those accesses to either make sure that
+; we always refer to the fixed stack slot the same way (not using the
+; ir.alloca name), or make sure that we still detect that they alias each
+; other if using 
diff erent kinds of MemOperands to identify the same fixed
+; stack entry.
+;
+define i32 @f(i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, 
i32 %q1, i32 %a, i32 %q2) {
+; CHECK-LABEL: f:
+; CHECK: .cfi_startproc
+; CHECK-NEXT:  // %bb.0:
+; CHECK-NEXT:{
+; CHECK-NEXT: r0 = memw(r29+#36)
+; CHECK-NEXT: r1 = memw(r29+#28)
+; CHECK-NEXT:}
+; CHECK-NEXT:{
+; CHECK-NEXT: r0 = sub(r1,r0)
+; CHECK-NEXT: r2 = memw(r29+#32)
+; CHECK-NEXT: memw(r29+#32) = ##666
+; CHECK-NEXT:}
+; CHECK-NEXT:{
+; CHECK-NEXT: r0 = xor(r0,r2)
+; CHECK-NEXT: jumpr r31
+; CHECK-NEXT:}
+  %alloca = alloca i32
+  store i32 %a, ptr %alloca ; Should be elided.
+  store i32 666, ptr %alloca
+  %x = sub i32 %q

[llvm-branch-commits] [llvm] release/18.x: [SelectionDAG] Mark frame index as "aliased" at argument copy elison (PR #91035)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91035
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622) (PR #91034)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91034

>From bce9393291a2daa8006d1da629aa2765e00f4e70 Mon Sep 17 00:00:00 2001
From: Jay Foad 
Date: Tue, 23 Apr 2024 14:38:45 +0100
Subject: [PATCH] [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready
 (#89622)

As well as flipping the sense of the bit, GFX12 moved it from bit 0 to
bit 1 in the encoded simm16 operand.

(cherry picked from commit e0a763c490d8ef58dca867e0ef834978ccf8e17d)
---
 llvm/lib/Target/AMDGPU/SOPInstructions.td|  2 +-
 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll | 10 +++---
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SOPInstructions.td 
b/llvm/lib/Target/AMDGPU/SOPInstructions.td
index ae5ef0541929b..5762efde73f02 100644
--- a/llvm/lib/Target/AMDGPU/SOPInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SOPInstructions.td
@@ -1786,7 +1786,7 @@ def : GCNPat<
 let SubtargetPredicate = isNotGFX12Plus in
   def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 
0))>;
 let SubtargetPredicate = isGFX12Plus in
-  def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 
1))>;
+  def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 
2))>;
 
 // The first 10 bits of the mode register are the core FP mode on all
 // subtargets.
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll 
b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll
index 08c77148f6ae1..433fefa434988 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll
@@ -5,14 +5,10 @@
 
 ; GCN-LABEL: {{^}}test_wait_event:
 ; GFX11: s_wait_event 0x0
-; GFX12: s_wait_event 0x1
+; GFX12: s_wait_event 0x2
 
-define amdgpu_ps void @test_wait_event() #0 {
+define amdgpu_ps void @test_wait_event() {
 entry:
-  call void @llvm.amdgcn.s.wait.event.export.ready() #0
+  call void @llvm.amdgcn.s.wait.event.export.ready()
   ret void
 }
-
-declare void @llvm.amdgcn.s.wait.event.export.ready() #0
-
-attributes #0 = { nounwind }

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] bce9393 - [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

Author: Jay Foad
Date: 2024-05-08T20:17:31-07:00
New Revision: bce9393291a2daa8006d1da629aa2765e00f4e70

URL: 
https://github.com/llvm/llvm-project/commit/bce9393291a2daa8006d1da629aa2765e00f4e70
DIFF: 
https://github.com/llvm/llvm-project/commit/bce9393291a2daa8006d1da629aa2765e00f4e70.diff

LOG: [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622)

As well as flipping the sense of the bit, GFX12 moved it from bit 0 to
bit 1 in the encoded simm16 operand.

(cherry picked from commit e0a763c490d8ef58dca867e0ef834978ccf8e17d)

Added: 


Modified: 
llvm/lib/Target/AMDGPU/SOPInstructions.td
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll

Removed: 




diff  --git a/llvm/lib/Target/AMDGPU/SOPInstructions.td 
b/llvm/lib/Target/AMDGPU/SOPInstructions.td
index ae5ef0541929b..5762efde73f02 100644
--- a/llvm/lib/Target/AMDGPU/SOPInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SOPInstructions.td
@@ -1786,7 +1786,7 @@ def : GCNPat<
 let SubtargetPredicate = isNotGFX12Plus in
   def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 
0))>;
 let SubtargetPredicate = isGFX12Plus in
-  def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 
1))>;
+  def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 
2))>;
 
 // The first 10 bits of the mode register are the core FP mode on all
 // subtargets.

diff  --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll 
b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll
index 08c77148f6ae1..433fefa434988 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll
@@ -5,14 +5,10 @@
 
 ; GCN-LABEL: {{^}}test_wait_event:
 ; GFX11: s_wait_event 0x0
-; GFX12: s_wait_event 0x1
+; GFX12: s_wait_event 0x2
 
-define amdgpu_ps void @test_wait_event() #0 {
+define amdgpu_ps void @test_wait_event() {
 entry:
-  call void @llvm.amdgcn.s.wait.event.export.ready() #0
+  call void @llvm.amdgcn.s.wait.event.export.ready()
   ret void
 }
-
-declare void @llvm.amdgcn.s.wait.event.export.ready() #0
-
-attributes #0 = { nounwind }



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622) (PR #91034)

2024-05-08 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91034
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT (PR #90827)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@aemerson Did you submit a new pull request with a fix?

https://github.com/llvm/llvm-project/pull/90827
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: [clang-format] Don't remove parentheses of fold expressions (#91045) (PR #91165)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91165

>From 0abb89a80f5c0736843950dc39f2454ab312b319 Mon Sep 17 00:00:00 2001
From: Owen Pan 
Date: Sun, 5 May 2024 21:33:41 -0700
Subject: [PATCH] [clang-format] Don't remove parentheses of fold expressions
 (#91045)

Fixes #90966.

(cherry picked from commit db0ed5533368414b1c4e1c884eef651c66359da2)
---
 clang/lib/Format/UnwrappedLineParser.cpp | 7 ++-
 clang/unittests/Format/FormatTest.cpp| 9 +
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Format/UnwrappedLineParser.cpp 
b/clang/lib/Format/UnwrappedLineParser.cpp
index a6eb18bb2b322..f70affb732a0d 100644
--- a/clang/lib/Format/UnwrappedLineParser.cpp
+++ b/clang/lib/Format/UnwrappedLineParser.cpp
@@ -2510,6 +2510,7 @@ bool UnwrappedLineParser::parseParens(TokenType 
AmpAmpTokenType) {
   assert(FormatTok->is(tok::l_paren) && "'(' expected.");
   auto *LeftParen = FormatTok;
   bool SeenEqual = false;
+  bool MightBeFoldExpr = false;
   const bool MightBeStmtExpr = Tokens->peekNextToken()->is(tok::l_brace);
   nextToken();
   do {
@@ -2521,7 +2522,7 @@ bool UnwrappedLineParser::parseParens(TokenType 
AmpAmpTokenType) {
 parseChildBlock();
   break;
 case tok::r_paren:
-  if (!MightBeStmtExpr && !Line->InMacroBody &&
+  if (!MightBeStmtExpr && !MightBeFoldExpr && !Line->InMacroBody &&
   Style.RemoveParentheses > FormatStyle::RPS_Leave) {
 const auto *Prev = LeftParen->Previous;
 const auto *Next = Tokens->peekNextToken();
@@ -2564,6 +2565,10 @@ bool UnwrappedLineParser::parseParens(TokenType 
AmpAmpTokenType) {
 parseBracedList();
   }
   break;
+case tok::ellipsis:
+  MightBeFoldExpr = true;
+  nextToken();
+  break;
 case tok::equal:
   SeenEqual = true;
   if (Style.isCSharp() && FormatTok->is(TT_FatArrow))
diff --git a/clang/unittests/Format/FormatTest.cpp 
b/clang/unittests/Format/FormatTest.cpp
index 88877e53d014c..923128672c316 100644
--- a/clang/unittests/Format/FormatTest.cpp
+++ b/clang/unittests/Format/FormatTest.cpp
@@ -26894,8 +26894,14 @@ TEST_F(FormatTest, RemoveParentheses) {
"if ((({ a; })))\n"
"  b;",
Style);
+  verifyFormat("static_assert((std::is_constructible_v && ...));",
+   "static_assert(((std::is_constructible_v && 
...)));",
+   Style);
   verifyFormat("return (0);", "return (((0)));", Style);
   verifyFormat("return (({ 0; }));", "return ((({ 0; })));", Style);
+  verifyFormat("return ((... && std::is_convertible_v));",
+   "return (((... && std::is_convertible_v)));",
+   Style);
 
   Style.RemoveParentheses = FormatStyle::RPS_ReturnStatement;
   verifyFormat("#define Return0 return (0);", Style);
@@ -26903,6 +26909,9 @@ TEST_F(FormatTest, RemoveParentheses) {
   verifyFormat("co_return 0;", "co_return ((0));", Style);
   verifyFormat("return 0;", "return (((0)));", Style);
   verifyFormat("return ({ 0; });", "return ((({ 0; })));", Style);
+  verifyFormat("return (... && std::is_convertible_v);",
+   "return (((... && std::is_convertible_v)));",
+   Style);
   verifyFormat("inline decltype(auto) f() {\n"
"  if (a) {\n"
"return (a);\n"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] 0abb89a - [clang-format] Don't remove parentheses of fold expressions (#91045)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

Author: Owen Pan
Date: 2024-05-09T12:16:32-07:00
New Revision: 0abb89a80f5c0736843950dc39f2454ab312b319

URL: 
https://github.com/llvm/llvm-project/commit/0abb89a80f5c0736843950dc39f2454ab312b319
DIFF: 
https://github.com/llvm/llvm-project/commit/0abb89a80f5c0736843950dc39f2454ab312b319.diff

LOG: [clang-format] Don't remove parentheses of fold expressions (#91045)

Fixes #90966.

(cherry picked from commit db0ed5533368414b1c4e1c884eef651c66359da2)

Added: 


Modified: 
clang/lib/Format/UnwrappedLineParser.cpp
clang/unittests/Format/FormatTest.cpp

Removed: 




diff  --git a/clang/lib/Format/UnwrappedLineParser.cpp 
b/clang/lib/Format/UnwrappedLineParser.cpp
index a6eb18bb2b322..f70affb732a0d 100644
--- a/clang/lib/Format/UnwrappedLineParser.cpp
+++ b/clang/lib/Format/UnwrappedLineParser.cpp
@@ -2510,6 +2510,7 @@ bool UnwrappedLineParser::parseParens(TokenType 
AmpAmpTokenType) {
   assert(FormatTok->is(tok::l_paren) && "'(' expected.");
   auto *LeftParen = FormatTok;
   bool SeenEqual = false;
+  bool MightBeFoldExpr = false;
   const bool MightBeStmtExpr = Tokens->peekNextToken()->is(tok::l_brace);
   nextToken();
   do {
@@ -2521,7 +2522,7 @@ bool UnwrappedLineParser::parseParens(TokenType 
AmpAmpTokenType) {
 parseChildBlock();
   break;
 case tok::r_paren:
-  if (!MightBeStmtExpr && !Line->InMacroBody &&
+  if (!MightBeStmtExpr && !MightBeFoldExpr && !Line->InMacroBody &&
   Style.RemoveParentheses > FormatStyle::RPS_Leave) {
 const auto *Prev = LeftParen->Previous;
 const auto *Next = Tokens->peekNextToken();
@@ -2564,6 +2565,10 @@ bool UnwrappedLineParser::parseParens(TokenType 
AmpAmpTokenType) {
 parseBracedList();
   }
   break;
+case tok::ellipsis:
+  MightBeFoldExpr = true;
+  nextToken();
+  break;
 case tok::equal:
   SeenEqual = true;
   if (Style.isCSharp() && FormatTok->is(TT_FatArrow))

diff  --git a/clang/unittests/Format/FormatTest.cpp 
b/clang/unittests/Format/FormatTest.cpp
index 88877e53d014c..923128672c316 100644
--- a/clang/unittests/Format/FormatTest.cpp
+++ b/clang/unittests/Format/FormatTest.cpp
@@ -26894,8 +26894,14 @@ TEST_F(FormatTest, RemoveParentheses) {
"if ((({ a; })))\n"
"  b;",
Style);
+  verifyFormat("static_assert((std::is_constructible_v && ...));",
+   "static_assert(((std::is_constructible_v && 
...)));",
+   Style);
   verifyFormat("return (0);", "return (((0)));", Style);
   verifyFormat("return (({ 0; }));", "return ((({ 0; })));", Style);
+  verifyFormat("return ((... && std::is_convertible_v));",
+   "return (((... && std::is_convertible_v)));",
+   Style);
 
   Style.RemoveParentheses = FormatStyle::RPS_ReturnStatement;
   verifyFormat("#define Return0 return (0);", Style);
@@ -26903,6 +26909,9 @@ TEST_F(FormatTest, RemoveParentheses) {
   verifyFormat("co_return 0;", "co_return ((0));", Style);
   verifyFormat("return 0;", "return (((0)));", Style);
   verifyFormat("return ({ 0; });", "return ((({ 0; })));", Style);
+  verifyFormat("return (... && std::is_convertible_v);",
+   "return (((... && std::is_convertible_v)));",
+   Style);
   verifyFormat("inline decltype(auto) f() {\n"
"  if (a) {\n"
"return (a);\n"



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: [clang-format] Don't remove parentheses of fold expressions (#91045) (PR #91165)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91165
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180) (PR #91286)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91286

>From 4a28f8e3c625e168c1cb9203150e3dc6495bb0fa Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Tue, 7 May 2024 09:47:28 +0900
Subject: [PATCH] [FunctionAttrs] Fix incorrect nonnull inference for
 non-inbounds GEP (#91180)

For inbounds GEPs, if the source pointer is non-null, the result must
also be non-null. However, this does not hold for non-inbounds GEPs.

Fixes https://github.com/llvm/llvm-project/issues/91177.

(cherry picked from commit f34d30cdae0f59698f660d5cc8fb993fb3441064)
---
 llvm/lib/Transforms/IPO/FunctionAttrs.cpp |  7 +-
 .../Transforms/FunctionAttrs/nocapture.ll |  2 +-
 llvm/test/Transforms/FunctionAttrs/nonnull.ll | 23 +++
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp 
b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
index 7ebf265e17ba1..27c411250d53c 100644
--- a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
@@ -1186,10 +1186,15 @@ static bool isReturnNonNull(Function *F, const 
SCCNodeSet &SCCNodes,
 switch (RVI->getOpcode()) {
 // Extend the analysis by looking upwards.
 case Instruction::BitCast:
-case Instruction::GetElementPtr:
 case Instruction::AddrSpaceCast:
   FlowsToReturn.insert(RVI->getOperand(0));
   continue;
+case Instruction::GetElementPtr:
+  if (cast(RVI)->isInBounds()) {
+FlowsToReturn.insert(RVI->getOperand(0));
+continue;
+  }
+  return false;
 case Instruction::Select: {
   SelectInst *SI = cast(RVI);
   FlowsToReturn.insert(SI->getTrueValue());
diff --git a/llvm/test/Transforms/FunctionAttrs/nocapture.ll 
b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
index 3d483f671b1af..8d6f6a7c73f80 100644
--- a/llvm/test/Transforms/FunctionAttrs/nocapture.ll
+++ b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
@@ -197,7 +197,7 @@ declare i32 @__gxx_personality_v0(...)
 
 define ptr @lookup_bit(ptr %q, i32 %bitno) readnone nounwind {
 ; FNATTRS: Function Attrs: mustprogress nofree norecurse nosync nounwind 
willreturn memory(none)
-; FNATTRS-LABEL: define nonnull ptr @lookup_bit
+; FNATTRS-LABEL: define ptr @lookup_bit
 ; FNATTRS-SAME: (ptr [[Q:%.*]], i32 [[BITNO:%.*]]) #[[ATTR0]] {
 ; FNATTRS-NEXT:[[TMP:%.*]] = ptrtoint ptr [[Q]] to i32
 ; FNATTRS-NEXT:[[TMP2:%.*]] = lshr i32 [[TMP]], [[BITNO]]
diff --git a/llvm/test/Transforms/FunctionAttrs/nonnull.ll 
b/llvm/test/Transforms/FunctionAttrs/nonnull.ll
index d9bdb6298ed0f..ec5545b969e55 100644
--- a/llvm/test/Transforms/FunctionAttrs/nonnull.ll
+++ b/llvm/test/Transforms/FunctionAttrs/nonnull.ll
@@ -905,26 +905,26 @@ define i1 @parent8(ptr %a, ptr %bogus1, ptr %b) 
personality ptr @esfp{
 ; FNATTRS-SAME: ptr nonnull [[A:%.*]], ptr nocapture readnone [[BOGUS1:%.*]], 
ptr nonnull [[B:%.*]]) #[[ATTR7]] personality ptr @esfp {
 ; FNATTRS-NEXT:  entry:
 ; FNATTRS-NEXT:invoke void @use2nonnull(ptr [[A]], ptr [[B]])
-; FNATTRS-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
+; FNATTRS-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
 ; FNATTRS:   cont:
 ; FNATTRS-NEXT:[[NULL_CHECK:%.*]] = icmp eq ptr [[B]], null
 ; FNATTRS-NEXT:ret i1 [[NULL_CHECK]]
 ; FNATTRS:   exc:
 ; FNATTRS-NEXT:[[LP:%.*]] = landingpad { ptr, i32 }
-; FNATTRS-NEXT:filter [0 x ptr] zeroinitializer
+; FNATTRS-NEXT:filter [0 x ptr] zeroinitializer
 ; FNATTRS-NEXT:unreachable
 ;
 ; ATTRIBUTOR-LABEL: define i1 @parent8(
 ; ATTRIBUTOR-SAME: ptr nonnull [[A:%.*]], ptr nocapture nofree readnone 
[[BOGUS1:%.*]], ptr nonnull [[B:%.*]]) #[[ATTR8]] personality ptr @esfp {
 ; ATTRIBUTOR-NEXT:  entry:
 ; ATTRIBUTOR-NEXT:invoke void @use2nonnull(ptr nonnull [[A]], ptr nonnull 
[[B]])
-; ATTRIBUTOR-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
+; ATTRIBUTOR-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
 ; ATTRIBUTOR:   cont:
 ; ATTRIBUTOR-NEXT:[[NULL_CHECK:%.*]] = icmp eq ptr [[B]], null
 ; ATTRIBUTOR-NEXT:ret i1 [[NULL_CHECK]]
 ; ATTRIBUTOR:   exc:
 ; ATTRIBUTOR-NEXT:[[LP:%.*]] = landingpad { ptr, i32 }
-; ATTRIBUTOR-NEXT:filter [0 x ptr] zeroinitializer
+; ATTRIBUTOR-NEXT:filter [0 x ptr] zeroinitializer
 ; ATTRIBUTOR-NEXT:unreachable
 ;
 
@@ -1415,5 +1415,20 @@ define void @PR43833_simple(ptr %0, i32 %1) {
   br i1 %11, label %7, label %8
 }
 
+define ptr @pr91177_non_inbounds_gep(ptr nonnull %arg) {
+; FNATTRS-LABEL: define ptr @pr91177_non_inbounds_gep(
+; FNATTRS-SAME: ptr nonnull readnone [[ARG:%.*]]) #[[ATTR0]] {
+; FNATTRS-NEXT:[[RES:%.*]] = getelementptr i8, ptr [[ARG]], i64 -8
+; FNATTRS-NEXT:ret ptr [[RES]]
+;
+; ATTRIBUTOR-LABEL: define ptr @pr91177_non_inbounds_gep(
+; ATTRIBUTOR-SAME: ptr nofree nonnull readnone [[ARG:%.*]]) #[[ATTR0]] {
+; ATTRIBUTOR-NEXT:[[RES:%.*]] = getelementptr i8, ptr [[ARG]], i64 -8
+; ATTRIBUT

[llvm-branch-commits] [llvm] 4a28f8e - [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2024-05-09T12:18:19-07:00
New Revision: 4a28f8e3c625e168c1cb9203150e3dc6495bb0fa

URL: 
https://github.com/llvm/llvm-project/commit/4a28f8e3c625e168c1cb9203150e3dc6495bb0fa
DIFF: 
https://github.com/llvm/llvm-project/commit/4a28f8e3c625e168c1cb9203150e3dc6495bb0fa.diff

LOG: [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP 
(#91180)

For inbounds GEPs, if the source pointer is non-null, the result must
also be non-null. However, this does not hold for non-inbounds GEPs.

Fixes https://github.com/llvm/llvm-project/issues/91177.

(cherry picked from commit f34d30cdae0f59698f660d5cc8fb993fb3441064)

Added: 


Modified: 
llvm/lib/Transforms/IPO/FunctionAttrs.cpp
llvm/test/Transforms/FunctionAttrs/nocapture.ll
llvm/test/Transforms/FunctionAttrs/nonnull.ll

Removed: 




diff  --git a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp 
b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
index 7ebf265e17ba1..27c411250d53c 100644
--- a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
@@ -1186,10 +1186,15 @@ static bool isReturnNonNull(Function *F, const 
SCCNodeSet &SCCNodes,
 switch (RVI->getOpcode()) {
 // Extend the analysis by looking upwards.
 case Instruction::BitCast:
-case Instruction::GetElementPtr:
 case Instruction::AddrSpaceCast:
   FlowsToReturn.insert(RVI->getOperand(0));
   continue;
+case Instruction::GetElementPtr:
+  if (cast(RVI)->isInBounds()) {
+FlowsToReturn.insert(RVI->getOperand(0));
+continue;
+  }
+  return false;
 case Instruction::Select: {
   SelectInst *SI = cast(RVI);
   FlowsToReturn.insert(SI->getTrueValue());

diff  --git a/llvm/test/Transforms/FunctionAttrs/nocapture.ll 
b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
index 3d483f671b1af..8d6f6a7c73f80 100644
--- a/llvm/test/Transforms/FunctionAttrs/nocapture.ll
+++ b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
@@ -197,7 +197,7 @@ declare i32 @__gxx_personality_v0(...)
 
 define ptr @lookup_bit(ptr %q, i32 %bitno) readnone nounwind {
 ; FNATTRS: Function Attrs: mustprogress nofree norecurse nosync nounwind 
willreturn memory(none)
-; FNATTRS-LABEL: define nonnull ptr @lookup_bit
+; FNATTRS-LABEL: define ptr @lookup_bit
 ; FNATTRS-SAME: (ptr [[Q:%.*]], i32 [[BITNO:%.*]]) #[[ATTR0]] {
 ; FNATTRS-NEXT:[[TMP:%.*]] = ptrtoint ptr [[Q]] to i32
 ; FNATTRS-NEXT:[[TMP2:%.*]] = lshr i32 [[TMP]], [[BITNO]]

diff  --git a/llvm/test/Transforms/FunctionAttrs/nonnull.ll 
b/llvm/test/Transforms/FunctionAttrs/nonnull.ll
index d9bdb6298ed0f..ec5545b969e55 100644
--- a/llvm/test/Transforms/FunctionAttrs/nonnull.ll
+++ b/llvm/test/Transforms/FunctionAttrs/nonnull.ll
@@ -905,26 +905,26 @@ define i1 @parent8(ptr %a, ptr %bogus1, ptr %b) 
personality ptr @esfp{
 ; FNATTRS-SAME: ptr nonnull [[A:%.*]], ptr nocapture readnone [[BOGUS1:%.*]], 
ptr nonnull [[B:%.*]]) #[[ATTR7]] personality ptr @esfp {
 ; FNATTRS-NEXT:  entry:
 ; FNATTRS-NEXT:invoke void @use2nonnull(ptr [[A]], ptr [[B]])
-; FNATTRS-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
+; FNATTRS-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
 ; FNATTRS:   cont:
 ; FNATTRS-NEXT:[[NULL_CHECK:%.*]] = icmp eq ptr [[B]], null
 ; FNATTRS-NEXT:ret i1 [[NULL_CHECK]]
 ; FNATTRS:   exc:
 ; FNATTRS-NEXT:[[LP:%.*]] = landingpad { ptr, i32 }
-; FNATTRS-NEXT:filter [0 x ptr] zeroinitializer
+; FNATTRS-NEXT:filter [0 x ptr] zeroinitializer
 ; FNATTRS-NEXT:unreachable
 ;
 ; ATTRIBUTOR-LABEL: define i1 @parent8(
 ; ATTRIBUTOR-SAME: ptr nonnull [[A:%.*]], ptr nocapture nofree readnone 
[[BOGUS1:%.*]], ptr nonnull [[B:%.*]]) #[[ATTR8]] personality ptr @esfp {
 ; ATTRIBUTOR-NEXT:  entry:
 ; ATTRIBUTOR-NEXT:invoke void @use2nonnull(ptr nonnull [[A]], ptr nonnull 
[[B]])
-; ATTRIBUTOR-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
+; ATTRIBUTOR-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
 ; ATTRIBUTOR:   cont:
 ; ATTRIBUTOR-NEXT:[[NULL_CHECK:%.*]] = icmp eq ptr [[B]], null
 ; ATTRIBUTOR-NEXT:ret i1 [[NULL_CHECK]]
 ; ATTRIBUTOR:   exc:
 ; ATTRIBUTOR-NEXT:[[LP:%.*]] = landingpad { ptr, i32 }
-; ATTRIBUTOR-NEXT:filter [0 x ptr] zeroinitializer
+; ATTRIBUTOR-NEXT:filter [0 x ptr] zeroinitializer
 ; ATTRIBUTOR-NEXT:unreachable
 ;
 
@@ -1415,5 +1415,20 @@ define void @PR43833_simple(ptr %0, i32 %1) {
   br i1 %11, label %7, label %8
 }
 
+define ptr @pr91177_non_inbounds_gep(ptr nonnull %arg) {
+; FNATTRS-LABEL: define ptr @pr91177_non_inbounds_gep(
+; FNATTRS-SAME: ptr nonnull readnone [[ARG:%.*]]) #[[ATTR0]] {
+; FNATTRS-NEXT:[[RES:%.*]] = getelementptr i8, ptr [[ARG]], i64 -8
+; FNATTRS-NEXT:ret ptr [[RES]]
+;
+; ATTRIBUTOR-LABEL: define ptr @pr91177_non_inbounds_gep(
+; ATTRIBUTOR-SAME: ptr nofree nonnull readnone [[ARG

[llvm-branch-commits] [llvm] release/18.x: [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180) (PR #91286)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91286
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972) (PR #91580)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91580

>From 1ffa00a01e5db29cb5c1b2cb6baed8e2b9381f81 Mon Sep 17 00:00:00 2001
From: Marc Auberer 
Date: Thu, 28 Mar 2024 23:08:38 +0100
Subject: [PATCH 1/2] [AArch64][GISEL] Consider fcmp true and fcmp false in
 cond code selection (#86972)

Fixes #86917

`FCMP_TRUE` and `FCMP_FALSE` were previously not considered and we ended
up in an llvm_unreachable assertion.
---
 .../AArch64/GISel/AArch64GlobalISelUtils.cpp  |   6 ++
 .../CodeGen/AArch64/GlobalISel/select.mir |  20 
 .../AArch64/neon-compare-instructions.ll  | 101 ++
 3 files changed, 127 insertions(+)

diff --git a/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp
index 92db89cc0915b..80fe4bcb8b58f 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp
@@ -147,6 +147,12 @@ void AArch64GISelUtils::changeFCMPPredToAArch64CC(
   case CmpInst::FCMP_UNE:
 CondCode = AArch64CC::NE;
 break;
+  case CmpInst::FCMP_TRUE:
+CondCode = AArch64CC::AL;
+break;
+  case CmpInst::FCMP_FALSE:
+CondCode = AArch64CC::NV;
+break;
   }
 }
 
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/select.mir
index 60cddbf794bc7..ae78d4be0f88a 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select.mir
@@ -183,6 +183,14 @@ registers:
   - { id: 5, class: gpr }
   - { id: 6, class: gpr }
   - { id: 7, class: gpr }
+  - { id: 8, class: fpr }
+  - { id: 9, class: gpr }
+  - { id: 10, class: fpr }
+  - { id: 11, class: gpr }
+  - { id: 12, class: gpr }
+  - { id: 13, class: gpr }
+  - { id: 14, class: gpr }
+  - { id: 15, class: gpr }
 
 # CHECK:  body:
 # CHECK:nofpexcept FCMPSrr %0, %0, implicit-def $nzcv
@@ -209,6 +217,18 @@ body: |
 %7(s32) = G_ANYEXT %5
 $w0 = COPY %7(s32)
 
+%8(s32) = COPY $s0
+%9(s32) = G_FCMP floatpred(true), %8, %8
+%12(s8) = G_TRUNC %9(s32)
+%14(s32) = G_ANYEXT %12
+$w0 = COPY %14(s32)
+
+%10(s64) = COPY $d0
+%11(s32) = G_FCMP floatpred(false), %10, %10
+%13(s8) = G_TRUNC %11(s32)
+%15(s32) = G_ANYEXT %13
+$w0 = COPY %15(s32)
+
 ...
 
 ---
diff --git a/llvm/test/CodeGen/AArch64/neon-compare-instructions.ll 
b/llvm/test/CodeGen/AArch64/neon-compare-instructions.ll
index 765c81e26e13c..c4c00f8e97942 100644
--- a/llvm/test/CodeGen/AArch64/neon-compare-instructions.ll
+++ b/llvm/test/CodeGen/AArch64/neon-compare-instructions.ll
@@ -2870,6 +2870,107 @@ define <2 x i64> @fcmune2xdouble(<2 x double> %A, <2 x 
double> %B) {
   ret <2 x i64> %tmp4
 }
 
+define <2 x i32> @fcmal2xfloat(<2 x float> %A, <2 x float> %B) {
+; CHECK-SD-LABEL: fcmal2xfloat:
+; CHECK-SD:   // %bb.0:
+; CHECK-SD-NEXT:movi v0.2d, #0x
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: fcmal2xfloat:
+; CHECK-GI:   // %bb.0:
+; CHECK-GI-NEXT:movi v0.2s, #1
+; CHECK-GI-NEXT:shl v0.2s, v0.2s, #31
+; CHECK-GI-NEXT:sshr v0.2s, v0.2s, #31
+; CHECK-GI-NEXT:ret
+  %tmp3 = fcmp true <2 x float> %A, %B
+  %tmp4 = sext <2 x i1> %tmp3 to <2 x i32>
+  ret <2 x i32> %tmp4
+}
+
+define <4 x i32> @fcmal4xfloat(<4 x float> %A, <4 x float> %B) {
+; CHECK-SD-LABEL: fcmal4xfloat:
+; CHECK-SD:   // %bb.0:
+; CHECK-SD-NEXT:movi v0.2d, #0x
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: fcmal4xfloat:
+; CHECK-GI:   // %bb.0:
+; CHECK-GI-NEXT:mov w8, #1 // =0x1
+; CHECK-GI-NEXT:fmov s0, w8
+; CHECK-GI-NEXT:mov v1.16b, v0.16b
+; CHECK-GI-NEXT:mov v1.h[1], v0.h[0]
+; CHECK-GI-NEXT:mov v0.h[1], v0.h[0]
+; CHECK-GI-NEXT:ushll v1.4s, v1.4h, #0
+; CHECK-GI-NEXT:ushll v0.4s, v0.4h, #0
+; CHECK-GI-NEXT:mov v1.d[1], v0.d[0]
+; CHECK-GI-NEXT:shl v0.4s, v1.4s, #31
+; CHECK-GI-NEXT:sshr v0.4s, v0.4s, #31
+; CHECK-GI-NEXT:ret
+  %tmp3 = fcmp true <4 x float> %A, %B
+  %tmp4 = sext <4 x i1> %tmp3 to <4 x i32>
+  ret <4 x i32> %tmp4
+}
+define <2 x i64> @fcmal2xdouble(<2 x double> %A, <2 x double> %B) {
+; CHECK-SD-LABEL: fcmal2xdouble:
+; CHECK-SD:   // %bb.0:
+; CHECK-SD-NEXT:movi v0.2d, #0x
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: fcmal2xdouble:
+; CHECK-GI:   // %bb.0:
+; CHECK-GI-NEXT:adrp x8, .LCPI221_0
+; CHECK-GI-NEXT:ldr q0, [x8, :lo12:.LCPI221_0]
+; CHECK-GI-NEXT:shl v0.2d, v0.2d, #63
+; CHECK-GI-NEXT:sshr v0.2d, v0.2d, #63
+; CHECK-GI-NEXT:ret
+  %tmp3 = fcmp true <2 x double> %A, %B
+  %tmp4 = sext <2 x i1> %tmp3 to <2 x i64>
+  ret <2 x i64> %tmp4
+}
+
+define <2 x i32> @fcmnv2xfloat(<2 x float> %A, <2 x float> %B) {
+; CHECK-LABEL: fcmnv2xfloat:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:movi v0.2d, #
+; CHECK-NEXT:ret
+  %tmp3 = fcmp false <2 x float> %A, %B
+  %tmp4 = sext <2 x i1> %tmp3 to <2 x i3

[llvm-branch-commits] [llvm] release/18.x: [AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972) (PR #91580)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91580
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705) (PR #90805)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/90805

>From d9a7e5179a89624f23d8d6993e7e9ec8887063fc Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Thu, 2 May 2024 09:38:09 +0900
Subject: [PATCH] [InterleavedLoadCombine] Bail out on non-byte-sized vector
 element type (#90705)

Vectors are always tightly packed, and elements of non-byte-sized
usually do not have a well-defined (byte) offset.

Fixes https://github.com/llvm/llvm-project/issues/90695.

(cherry picked from commit d484c4d3501a7ff3d00a6e0cfad026a3b01d320c)
---
 .../CodeGen/InterleavedLoadCombinePass.cpp|  3 +++
 .../interleaved-load-combine-pr90695.ll   | 19 +++
 2 files changed, 22 insertions(+)
 create mode 100644 
llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll

diff --git a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp 
b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
index f2d5c3c867c2d..bbb0b654dc67b 100644
--- a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
+++ b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
@@ -877,6 +877,9 @@ struct VectorInfo {
 if (LI->isAtomic())
   return false;
 
+if (!DL.typeSizeEqualsStoreSize(Result.VTy->getElementType()))
+  return false;
+
 // Get the base polynomial
 computePolynomialFromPointer(*LI->getPointerOperand(), Offset, BasePtr, 
DL);
 
diff --git a/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll 
b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
new file mode 100644
index 0..ee75b3a083f71
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt -S -passes=interleaved-load-combine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-windows-gnu"
+
+; Make sure we don't crash on loads of vectors of non-byte-sized types.
+define <4 x i1> @test(ptr %p) {
+; CHECK-LABEL: define <4 x i1> @test(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[LOAD:%.*]] = load <2 x i1>, ptr [[P]], align 1
+; CHECK-NEXT:[[SHUF:%.*]] = shufflevector <2 x i1> [[LOAD]], <2 x i1> 
zeroinitializer, <4 x i32> 
+; CHECK-NEXT:ret <4 x i1> [[SHUF]]
+;
+entry:
+  %load = load <2 x i1>, ptr %p, align 1
+  %shuf = shufflevector <2 x i1> %load, <2 x i1> zeroinitializer, <4 x i32> 

+  ret <4 x i1> %shuf
+}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] d9a7e51 - [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2024-05-09T12:48:52-07:00
New Revision: d9a7e5179a89624f23d8d6993e7e9ec8887063fc

URL: 
https://github.com/llvm/llvm-project/commit/d9a7e5179a89624f23d8d6993e7e9ec8887063fc
DIFF: 
https://github.com/llvm/llvm-project/commit/d9a7e5179a89624f23d8d6993e7e9ec8887063fc.diff

LOG: [InterleavedLoadCombine] Bail out on non-byte-sized vector element type 
(#90705)

Vectors are always tightly packed, and elements of non-byte-sized
usually do not have a well-defined (byte) offset.

Fixes https://github.com/llvm/llvm-project/issues/90695.

(cherry picked from commit d484c4d3501a7ff3d00a6e0cfad026a3b01d320c)

Added: 
llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll

Modified: 
llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp

Removed: 




diff  --git a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp 
b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
index f2d5c3c867c2d..bbb0b654dc67b 100644
--- a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
+++ b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
@@ -877,6 +877,9 @@ struct VectorInfo {
 if (LI->isAtomic())
   return false;
 
+if (!DL.typeSizeEqualsStoreSize(Result.VTy->getElementType()))
+  return false;
+
 // Get the base polynomial
 computePolynomialFromPointer(*LI->getPointerOperand(), Offset, BasePtr, 
DL);
 

diff  --git a/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll 
b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
new file mode 100644
index 0..ee75b3a083f71
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt -S -passes=interleaved-load-combine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-windows-gnu"
+
+; Make sure we don't crash on loads of vectors of non-byte-sized types.
+define <4 x i1> @test(ptr %p) {
+; CHECK-LABEL: define <4 x i1> @test(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[LOAD:%.*]] = load <2 x i1>, ptr [[P]], align 1
+; CHECK-NEXT:[[SHUF:%.*]] = shufflevector <2 x i1> [[LOAD]], <2 x i1> 
zeroinitializer, <4 x i32> 
+; CHECK-NEXT:ret <4 x i1> [[SHUF]]
+;
+entry:
+  %load = load <2 x i1>, ptr %p, align 1
+  %shuf = shufflevector <2 x i1> %load, <2 x i1> zeroinitializer, <4 x i32> 

+  ret <4 x i1> %shuf
+}



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705) (PR #90805)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/90805
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595) (PR #90719)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@jayfoad (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90719
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106) (PR #91118)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@phoebewang (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/91118
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [SelectionDAG] Mark frame index as "aliased" at argument copy elison (PR #91035)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@AtariDreams (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/91035
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@phoebewang (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/91425
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622) (PR #91034)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@AtariDreams (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/91034
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: [clang-format] Don't remove parentheses of fold expressions (#91045) (PR #91165)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@owenca (or anyone else). If you would like to add a note about this fix in the 
release notes (completely optional). Please reply to this comment with a one or 
two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/91165
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180) (PR #91286)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@nikic (or anyone else). If you would like to add a note about this fix in the 
release notes (completely optional). Please reply to this comment with a one or 
two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/91286
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705) (PR #90805)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@nikic (or anyone else). If you would like to add a note about this fix in the 
release notes (completely optional). Please reply to this comment with a one or 
two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/90805
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972) (PR #91580)

2024-05-09 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@marcauberer (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/91580
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT - manual merge (PR #91672)

2024-05-10 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/91672

>From 7dbd266e89a70e96a747d8dd4aa5c6abfde15b2c Mon Sep 17 00:00:00 2001
From: Amara Emerson 
Date: Thu, 7 Mar 2024 15:38:33 -0800
Subject: [PATCH] [AArc64][GlobalISel] Fix legalizer assert for
 G_INSERT_VECTOR_ELT

We should moreElements <3 x s1> to <4 x s1> before we try to widen the element,
otherwise we end up with a <3 x s21> nonsense type.

(cherry picked from commit a01e9ce86f4c1bc9af819902db9f287b6d23f54f)

Test has been changed from original commit due to a fallback in a G_BITCAST.
Added abort=2 so we can see partial legalization and check no crash.
---
 .../AArch64/GISel/AArch64LegalizerInfo.cpp|  1 +
 .../GlobalISel/legalize-insert-vector-elt.mir | 69 ++-
 2 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 4b9d549e79114..de3c89e925a2a 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -877,6 +877,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const 
AArch64Subtarget &ST)
 
   getActionDefinitionsBuilder(G_INSERT_VECTOR_ELT)
   .legalIf(typeInSet(0, {v16s8, v8s8, v8s16, v4s16, v4s32, v2s32, v2s64}))
+  .moreElementsToNextPow2(0)
   .widenVectorEltsToVectorMinSize(0, 64);
 
   getActionDefinitionsBuilder(G_BUILD_VECTOR)
diff --git 
a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
index 6f6cf2cc165b9..563d3d3e26edf 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
@@ -1,5 +1,5 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=aarch64-linux-gnu -O0 -run-pass=legalizer %s -o - 
-global-isel-abort=1 | FileCheck %s
+# RUN: llc -mtriple=aarch64-linux-gnu -O0 -run-pass=legalizer %s -o - 
-global-isel-abort=2 | FileCheck %s
 ---
 name:pr63826_v2s16
 body: |
@@ -216,3 +216,70 @@ body: |
 $q0 = COPY %2(<2 x s64>)
 RET_ReallyLR
 ...
+---
+name:v3s8_crash
+body: |
+  ; CHECK-LABEL: name: v3s8_crash
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x8000)
+  ; CHECK-NEXT:   liveins: $w1, $w2, $w3, $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $w2
+  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $w3
+  ; CHECK-NEXT:   [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR 
[[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32)
+  ; CHECK-NEXT:   [[TRUNC:%[0-9]+]]:_(<3 x s8>) = G_TRUNC [[BUILD_VECTOR]](<3 
x s32>)
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+  ; CHECK-NEXT:   [[DEF:%[0-9]+]]:_(s8) = G_IMPLICIT_DEF
+  ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s8) = G_CONSTANT i8 0
+  ; CHECK-NEXT:   [[BUILD_VECTOR1:%[0-9]+]]:_(<3 x s8>) = G_BUILD_VECTOR 
[[C1]](s8), [[DEF]](s8), [[DEF]](s8)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.1(0x8000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+  ; CHECK-NEXT:   [[C3:%[0-9]+]]:_(s8) = G_CONSTANT i8 0
+  ; CHECK-NEXT:   [[IVEC:%[0-9]+]]:_(<3 x s8>) = G_INSERT_VECTOR_ELT 
[[TRUNC]], [[C3]](s8), [[C2]](s64)
+  ; CHECK-NEXT:   [[SHUF:%[0-9]+]]:_(<12 x s8>) = G_SHUFFLE_VECTOR [[IVEC]](<3 
x s8>), [[BUILD_VECTOR1]], shufflemask(0, 3, 3, 3, 1, 3, 3, 3, 2, 3, 3, 3)
+  ; CHECK-NEXT:   [[BITCAST:%[0-9]+]]:_(<3 x s32>) = G_BITCAST [[SHUF]](<12 x 
s8>)
+  ; CHECK-NEXT:   [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), 
[[UV2:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[BITCAST]](<3 x s32>)
+  ; CHECK-NEXT:   [[DEF1:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
+  ; CHECK-NEXT:   [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR 
[[UV]](s32), [[UV1]](s32), [[UV2]](s32), [[DEF1]](s32)
+  ; CHECK-NEXT:   [[UITOFP:%[0-9]+]]:_(<4 x s32>) = G_UITOFP 
[[BUILD_VECTOR2]](<4 x s32>)
+  ; CHECK-NEXT:   [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), 
[[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[UITOFP]](<4 
x s32>)
+  ; CHECK-NEXT:   [[BUILD_VECTOR3:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR 
[[UV3]](s32), [[UV4]](s32), [[UV5]](s32)
+  ; CHECK-NEXT:   [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), 
[[UV9:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[BUILD_VECTOR3]](<3 x s32>)
+  ; CHECK-NEXT:   G_STORE [[UV7]](s32), [[COPY]](p0) :: (store (s32), align 16)
+  ; CHECK-NEXT:   [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
+  ; CHECK-NEXT:   G_STORE [[UV8]](s32), [[PTR_ADD]](p0) :: (store (s32) into 
unknown-address + 4)
+  ; CHECK-NEXT:   [[C5:%[0-9]+]]:_(s64) = G_CON

[llvm-branch-commits] [llvm] 7dbd266 - [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT

2024-05-10 Thread Tom Stellard via llvm-branch-commits

Author: Amara Emerson
Date: 2024-05-10T11:50:51-07:00
New Revision: 7dbd266e89a70e96a747d8dd4aa5c6abfde15b2c

URL: 
https://github.com/llvm/llvm-project/commit/7dbd266e89a70e96a747d8dd4aa5c6abfde15b2c
DIFF: 
https://github.com/llvm/llvm-project/commit/7dbd266e89a70e96a747d8dd4aa5c6abfde15b2c.diff

LOG: [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT

We should moreElements <3 x s1> to <4 x s1> before we try to widen the element,
otherwise we end up with a <3 x s21> nonsense type.

(cherry picked from commit a01e9ce86f4c1bc9af819902db9f287b6d23f54f)

Test has been changed from original commit due to a fallback in a G_BITCAST.
Added abort=2 so we can see partial legalization and check no crash.

Added: 


Modified: 
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir

Removed: 




diff  --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 4b9d549e79114..de3c89e925a2a 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -877,6 +877,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const 
AArch64Subtarget &ST)
 
   getActionDefinitionsBuilder(G_INSERT_VECTOR_ELT)
   .legalIf(typeInSet(0, {v16s8, v8s8, v8s16, v4s16, v4s32, v2s32, v2s64}))
+  .moreElementsToNextPow2(0)
   .widenVectorEltsToVectorMinSize(0, 64);
 
   getActionDefinitionsBuilder(G_BUILD_VECTOR)

diff  --git 
a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
index 6f6cf2cc165b9..563d3d3e26edf 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
@@ -1,5 +1,5 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=aarch64-linux-gnu -O0 -run-pass=legalizer %s -o - 
-global-isel-abort=1 | FileCheck %s
+# RUN: llc -mtriple=aarch64-linux-gnu -O0 -run-pass=legalizer %s -o - 
-global-isel-abort=2 | FileCheck %s
 ---
 name:pr63826_v2s16
 body: |
@@ -216,3 +216,70 @@ body: |
 $q0 = COPY %2(<2 x s64>)
 RET_ReallyLR
 ...
+---
+name:v3s8_crash
+body: |
+  ; CHECK-LABEL: name: v3s8_crash
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x8000)
+  ; CHECK-NEXT:   liveins: $w1, $w2, $w3, $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:_(s32) = COPY $w2
+  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:_(s32) = COPY $w3
+  ; CHECK-NEXT:   [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR 
[[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32)
+  ; CHECK-NEXT:   [[TRUNC:%[0-9]+]]:_(<3 x s8>) = G_TRUNC [[BUILD_VECTOR]](<3 
x s32>)
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+  ; CHECK-NEXT:   [[DEF:%[0-9]+]]:_(s8) = G_IMPLICIT_DEF
+  ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s8) = G_CONSTANT i8 0
+  ; CHECK-NEXT:   [[BUILD_VECTOR1:%[0-9]+]]:_(<3 x s8>) = G_BUILD_VECTOR 
[[C1]](s8), [[DEF]](s8), [[DEF]](s8)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.1(0x8000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+  ; CHECK-NEXT:   [[C3:%[0-9]+]]:_(s8) = G_CONSTANT i8 0
+  ; CHECK-NEXT:   [[IVEC:%[0-9]+]]:_(<3 x s8>) = G_INSERT_VECTOR_ELT 
[[TRUNC]], [[C3]](s8), [[C2]](s64)
+  ; CHECK-NEXT:   [[SHUF:%[0-9]+]]:_(<12 x s8>) = G_SHUFFLE_VECTOR [[IVEC]](<3 
x s8>), [[BUILD_VECTOR1]], shufflemask(0, 3, 3, 3, 1, 3, 3, 3, 2, 3, 3, 3)
+  ; CHECK-NEXT:   [[BITCAST:%[0-9]+]]:_(<3 x s32>) = G_BITCAST [[SHUF]](<12 x 
s8>)
+  ; CHECK-NEXT:   [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), 
[[UV2:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[BITCAST]](<3 x s32>)
+  ; CHECK-NEXT:   [[DEF1:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
+  ; CHECK-NEXT:   [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR 
[[UV]](s32), [[UV1]](s32), [[UV2]](s32), [[DEF1]](s32)
+  ; CHECK-NEXT:   [[UITOFP:%[0-9]+]]:_(<4 x s32>) = G_UITOFP 
[[BUILD_VECTOR2]](<4 x s32>)
+  ; CHECK-NEXT:   [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), 
[[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[UITOFP]](<4 
x s32>)
+  ; CHECK-NEXT:   [[BUILD_VECTOR3:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR 
[[UV3]](s32), [[UV4]](s32), [[UV5]](s32)
+  ; CHECK-NEXT:   [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), 
[[UV9:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[BUILD_VECTOR3]](<3 x s32>)
+  ; CHECK-NEXT:   G_STORE [[UV7]](s32), [[COPY]](p0) :: (store (s32), align 16)
+  ; CHECK-NEXT:   [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)

[llvm-branch-commits] [llvm] release/18.x: [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT - manual merge (PR #91672)

2024-05-10 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91672
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


  1   2   3   4   5   6   7   8   9   10   >