[llvm-branch-commits] [llvm] release/18.x: [AArch64] Remove invalid uabdl patterns. (#89272) (PR #89380)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/89380

>From a96b04442c9fc29cd884b56bf07af8615191176f Mon Sep 17 00:00:00 2001
From: David Green
Date: Fri, 19 Apr 2024 09:30:13 +0100
Subject: [PATCH] [AArch64] Remove invalid uabdl patterns. (#89272)

These were added in https://reviews.llvm.org/D14208, and look like they
attempt to detect abs from xor+add+ashr. They do not appear to be detecting
the correct value for the src input though, which I think is intended to be
the sub(zext, zext) part of the pattern. We have patterns from abs now, so the
old invalid patterns can be removed.

Fixes #88784

(cherry picked from commit 851462fcaa7f6e3301865de84f98be7e872e64b6)
---
 llvm/lib/Target/AArch64/AArch64InstrInfo.td | 10 -
 llvm/test/CodeGen/AArch64/arm64-vabs.ll     | 48 +
 2 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 03baa7497615e3..ac61dd8745d4e6 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -4885,19 +4885,9 @@ defm UABDL : SIMDLongThreeVectorBHSabdl<1, 0b0111, "uabdl",
 def : Pat<(abs (v8i16 (sub (zext (v8i8 V64:$opA)),
                            (zext (v8i8 V64:$opB))))),
           (UABDLv8i8_v8i16 V64:$opA, V64:$opB)>;
-def : Pat<(xor (v8i16 (AArch64vashr v8i16:$src, (i32 15))),
-               (v8i16 (add (sub (zext (v8i8 V64:$opA)),
-                                (zext (v8i8 V64:$opB))),
-                           (AArch64vashr v8i16:$src, (i32 15))))),
-          (UABDLv8i8_v8i16 V64:$opA, V64:$opB)>;
 def : Pat<(abs (v8i16 (sub (zext (extract_high_v16i8 (v16i8 V128:$opA))),
                            (zext (extract_high_v16i8 (v16i8 V128:$opB)))))),
           (UABDLv16i8_v8i16 V128:$opA, V128:$opB)>;
-def : Pat<(xor (v8i16 (AArch64vashr v8i16:$src, (i32 15))),
-               (v8i16 (add (sub (zext (extract_high_v16i8 (v16i8 V128:$opA))),
-                                (zext (extract_high_v16i8 (v16i8 V128:$opB)))),
-                           (AArch64vashr v8i16:$src, (i32 15))))),
-          (UABDLv16i8_v8i16 V128:$opA, V128:$opB)>;
 def : Pat<(abs (v4i32 (sub (zext (v4i16 V64:$opA)),
                            (zext (v4i16 V64:$opB))))),
           (UABDLv4i16_v4i32 V64:$opA, V64:$opB)>;
diff --git a/llvm/test/CodeGen/AArch64/arm64-vabs.ll b/llvm/test/CodeGen/AArch64/arm64-vabs.ll
index fe4da2e7cf36b5..89c8d540b97e04 100644
--- a/llvm/test/CodeGen/AArch64/arm64-vabs.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-vabs.ll
@@ -1848,3 +1848,51 @@ define <2 x i128> @uabd_i64(<2 x i64> %a, <2 x i64> %b) {
   %absel = select <2 x i1> %abcmp, <2 x i128> %ababs, <2 x i128> %abdiff
   ret <2 x i128> %absel
 }
+
+define <8 x i16> @pr88784(<8 x i8> %l0, <8 x i8> %l1, <8 x i16> %l2) {
+; CHECK-SD-LABEL: pr88784:
+; CHECK-SD:       // %bb.0:
+; CHECK-SD-NEXT:    usubl.8h v0, v0, v1
+; CHECK-SD-NEXT:    cmlt.8h v1, v2, #0
+; CHECK-SD-NEXT:    ssra.8h v0, v2, #15
+; CHECK-SD-NEXT:    eor.16b v0, v1, v0
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: pr88784:
+; CHECK-GI:       // %bb.0:
+; CHECK-GI-NEXT:    usubl.8h v0, v0, v1
+; CHECK-GI-NEXT:    sshr.8h v1, v2, #15
+; CHECK-GI-NEXT:    ssra.8h v0, v2, #15
+; CHECK-GI-NEXT:    eor.16b v0, v1, v0
+; CHECK-GI-NEXT:    ret
+  %l4 = zext <8 x i8> %l0 to <8 x i16>
+  %l5 = ashr <8 x i16> %l2, <i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15>
+  %l6 = zext <8 x i8> %l1 to <8 x i16>
+  %l7 = sub <8 x i16> %l4, %l6
+  %l8 = add <8 x i16> %l5, %l7
+  %l9 = xor <8 x i16> %l5, %l8
+  ret <8 x i16> %l9
+}
+
+define <8 x i16> @pr88784_fixed(<8 x i8> %l0, <8 x i8> %l1, <8 x i16> %l2) {
+; CHECK-SD-LABEL: pr88784_fixed:
+; CHECK-SD:       // %bb.0:
+; CHECK-SD-NEXT:    uabdl.8h v0, v0, v1
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: pr88784_fixed:
+; CHECK-GI:       // %bb.0:
+; CHECK-GI-NEXT:    usubl.8h v0, v0, v1
+; CHECK-GI-NEXT:    sshr.8h v1, v0, #15
+; CHECK-GI-NEXT:    ssra.8h v0, v0, #15
+; CHECK-GI-NEXT:    eor.16b v0, v1, v0
+; CHECK-GI-NEXT:    ret
+  %l4 = zext <8 x i8> %l0 to <8 x i16>
+  %l6 = zext <8 x i8> %l1 to <8 x i16>
+  %l7 = sub <8 x i16> %l4, %l6
+  %l5 = ashr <8 x i16> %l7, <i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15, i16 15>
+  %l8 = add <8 x i16> %l5, %l7
+  %l9 = xor <8 x i16> %l5, %l8
+  ret <8 x i16> %l9
+}
+
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
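For readers less familiar with the idiom the removed patterns were trying to match, the following is a minimal scalar C++ sketch of one 16-bit lane. It is not LLVM code: `absViaSignMask` and `uabdlLane` are made-up names for illustration. It shows that `(x + mask) ^ mask` only computes `abs(x)` when the sign mask is taken from `x` itself, which is the constraint the commit message says the old patterns failed to check.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the xor+add+ashr idiom on one i16 lane.
// mask is src >> 15, i.e. 0 for non-negative src and -1 for negative src
// (arithmetic shift; guaranteed since C++20, the usual behaviour before).
// (x + mask) ^ mask equals |x| only if src and x are the same value.
inline int16_t absViaSignMask(int16_t x, int16_t src) {
  int16_t mask = static_cast<int16_t>(src >> 15);
  return static_cast<int16_t>((x + mask) ^ mask);
}

// One lane of an unsigned absolute-difference-long: widen two u8 inputs
// to 16 bits, subtract, and take the absolute value of the difference.
inline int16_t uabdlLane(uint8_t a, uint8_t b) {
  int16_t diff = static_cast<int16_t>(int(a) - int(b));
  return absViaSignMask(diff, diff); // mask derived from diff itself
}
```

With the mask taken from the difference, `uabdlLane(3, 10)` gives 7. With an unrelated mask source, as in the `pr88784` test where the shift operand is `%l2`, the same expression can return a negative value, so folding it to `uabdl` would be a miscompile.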
[llvm-branch-commits] [llvm] release/18.x: [AArch64] Remove invalid uabdl patterns. (#89272) (PR #89380)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/89380
[llvm-branch-commits] [llvm] a96b044 - [AArch64] Remove invalid uabdl patterns. (#89272)
Author: David Green
Date: 2024-04-30T16:01:41-07:00
New Revision: a96b04442c9fc29cd884b56bf07af8615191176f

URL: https://github.com/llvm/llvm-project/commit/a96b04442c9fc29cd884b56bf07af8615191176f
DIFF: https://github.com/llvm/llvm-project/commit/a96b04442c9fc29cd884b56bf07af8615191176f.diff

LOG: [AArch64] Remove invalid uabdl patterns. (#89272)

These were added in https://reviews.llvm.org/D14208, and look like they
attempt to detect abs from xor+add+ashr. They do not appear to be detecting
the correct value for the src input though, which I think is intended to be
the sub(zext, zext) part of the pattern. We have patterns from abs now, so the
old invalid patterns can be removed.

Fixes #88784

(cherry picked from commit 851462fcaa7f6e3301865de84f98be7e872e64b6)

Added:

Modified:
    llvm/lib/Target/AArch64/AArch64InstrInfo.td
    llvm/test/CodeGen/AArch64/arm64-vabs.ll

Removed:
[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Don't form anyextending atomic loads. (PR #90435)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/90435

>From 4da5b14174938bc69b8e729bc8b5bb393bd70b9e Mon Sep 17 00:00:00 2001
From: Amara Emerson
Date: Fri, 5 Apr 2024 10:49:19 -0700
Subject: [PATCH] [GlobalISel] Don't form anyextending atomic loads.

Until we can reliably check the legality and improve our selection of these,
don't form them at all.

(cherry picked from commit 60fc4ac67a613e4e36cef019fb2d13d70a06cfe8)
---
 .../lib/CodeGen/GlobalISel/CombinerHelper.cpp |  4 +-
 .../Atomics/aarch64-atomic-load-rcpc_immo.ll  | 55 +-
 .../AArch64/GlobalISel/arm64-atomic.ll        | 56 +--
 .../AArch64/GlobalISel/arm64-pcsections.ll    | 28 +-
 .../atomic-anyextending-load-crash.ll         | 47
 5 files changed, 131 insertions(+), 59 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/GlobalISel/atomic-anyextending-load-crash.ll

diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index 772229215e798d..61ddc858ba44c7 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -591,8 +591,8 @@ bool CombinerHelper::matchCombineExtendingLoads(MachineInstr &MI,
         UseMI.getOpcode() == TargetOpcode::G_ZEXT ||
         (UseMI.getOpcode() == TargetOpcode::G_ANYEXT)) {
       const auto &MMO = LoadMI->getMMO();
-      // For atomics, only form anyextending loads.
-      if (MMO.isAtomic() && UseMI.getOpcode() != TargetOpcode::G_ANYEXT)
+      // Don't do anything for atomics.
+      if (MMO.isAtomic())
         continue;
       // Check for legality.
       if (!isPreLegalize()) {
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
index b0507e9d075fab..9687ba683fb7e6 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
@@ -35,16 +35,24 @@ define i8 @load_atomic_i8_aligned_monotonic_const(ptr readonly %ptr) {
 }
 
 define i8 @load_atomic_i8_aligned_acquire(ptr %ptr) {
-; CHECK-LABEL: load_atomic_i8_aligned_acquire:
-; CHECK:    ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_aligned_acquire:
+; GISEL:    add x8, x0, #4
+; GISEL:    ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i8_aligned_acquire:
+; SDAG:    ldapurb w0, [x0, #4]
     %gep = getelementptr inbounds i8, ptr %ptr, i32 4
     %r = load atomic i8, ptr %gep acquire, align 1
     ret i8 %r
 }
 
 define i8 @load_atomic_i8_aligned_acquire_const(ptr readonly %ptr) {
-; CHECK-LABEL: load_atomic_i8_aligned_acquire_const:
-; CHECK:    ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_aligned_acquire_const:
+; GISEL:    add x8, x0, #4
+; GISEL:    ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i8_aligned_acquire_const:
+; SDAG:    ldapurb w0, [x0, #4]
     %gep = getelementptr inbounds i8, ptr %ptr, i32 4
     %r = load atomic i8, ptr %gep acquire, align 1
     ret i8 %r
@@ -101,16 +109,24 @@ define i16 @load_atomic_i16_aligned_monotonic_const(ptr readonly %ptr) {
 }
 
 define i16 @load_atomic_i16_aligned_acquire(ptr %ptr) {
-; CHECK-LABEL: load_atomic_i16_aligned_acquire:
-; CHECK:    ldapurh w0, [x0, #8]
+; GISEL-LABEL: load_atomic_i16_aligned_acquire:
+; GISEL:    add x8, x0, #8
+; GISEL:    ldaprh w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i16_aligned_acquire:
+; SDAG:    ldapurh w0, [x0, #8]
     %gep = getelementptr inbounds i16, ptr %ptr, i32 4
     %r = load atomic i16, ptr %gep acquire, align 2
     ret i16 %r
 }
 
 define i16 @load_atomic_i16_aligned_acquire_const(ptr readonly %ptr) {
-; CHECK-LABEL: load_atomic_i16_aligned_acquire_const:
-; CHECK:    ldapurh w0, [x0, #8]
+; GISEL-LABEL: load_atomic_i16_aligned_acquire_const:
+; GISEL:    add x8, x0, #8
+; GISEL:    ldaprh w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i16_aligned_acquire_const:
+; SDAG:    ldapurh w0, [x0, #8]
     %gep = getelementptr inbounds i16, ptr %ptr, i32 4
     %r = load atomic i16, ptr %gep acquire, align 2
     ret i16 %r
@@ -367,16 +383,24 @@ define i8 @load_atomic_i8_unaligned_monotonic_const(ptr readonly %ptr) {
 }
 
 define i8 @load_atomic_i8_unaligned_acquire(ptr %ptr) {
-; CHECK-LABEL: load_atomic_i8_unaligned_acquire:
-; CHECK:    ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_unaligned_acquire:
+; GISEL:    add x8, x0, #4
+; GISEL:    ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i8_unaligned_acquire:
+; SDAG:    ldapurb w0, [x0, #4]
     %gep = getelementptr inbounds i8, ptr %ptr, i32 4
     %r = load atomic i8, ptr %gep acquire, align 1
     ret i8 %r
 }
 
 define i8 @load_atomic_i8_unaligned_acquire_const(ptr readonly %ptr) {
-; CHECK-LABEL: load_atomic_i8_unaligned_acquire_const:
-; CHECK:    ldapurb w0, [x0, #4]
+; GISEL-LABEL: load_atomic_i8_unaligned_acquire_const:
+; GISEL:    add x8, x0, #4
+; GISEL:    ldaprb w0, [x8]
+;
+; SDAG-LABEL: load_atomic_i
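The behavioural change in the CombinerHelper.cpp hunk above is small enough to model in isolation. The sketch below is a simplified stand-in, not the real combiner: `ExtKind` and both function names are hypothetical, and the real code goes on to run further legality checks after this filter.

```cpp
#include <cassert>

// Hypothetical stand-in for the G_SEXT / G_ZEXT / G_ANYEXT use opcodes.
enum class ExtKind { SExt, ZExt, AnyExt };

// Before the patch: an atomic load was still considered for folding,
// but only when the extending use was an any-extend.
inline bool passesAtomicFilterBefore(bool isAtomic, ExtKind use) {
  return !(isAtomic && use != ExtKind::AnyExt);
}

// After the patch: atomic loads are skipped regardless of the use opcode.
inline bool passesAtomicFilterAfter(bool isAtomic, ExtKind /*use*/) {
  return !isAtomic;
}
```

The only input whose answer changes is an atomic load with an any-extending use, which is consistent with the GISEL check lines above no longer matching the offset-folding `ldapurb`/`ldapurh` forms.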
[llvm-branch-commits] [llvm] 4da5b14 - [GlobalISel] Don't form anyextending atomic loads.
Author: Amara Emerson
Date: 2024-05-01T11:30:12-07:00
New Revision: 4da5b14174938bc69b8e729bc8b5bb393bd70b9e

URL: https://github.com/llvm/llvm-project/commit/4da5b14174938bc69b8e729bc8b5bb393bd70b9e
DIFF: https://github.com/llvm/llvm-project/commit/4da5b14174938bc69b8e729bc8b5bb393bd70b9e.diff

LOG: [GlobalISel] Don't form anyextending atomic loads.

Until we can reliably check the legality and improve our selection of these,
don't form them at all.

(cherry picked from commit 60fc4ac67a613e4e36cef019fb2d13d70a06cfe8)

Added:
    llvm/test/CodeGen/AArch64/GlobalISel/atomic-anyextending-load-crash.ll

Modified:
    llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
    llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
    llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
    llvm/test/CodeGen/AArch64/GlobalISel/arm64-pcsections.ll

Removed:
[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Don't form anyextending atomic loads. (PR #90435)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/90435
[llvm-branch-commits] [llvm] release/18.x: [X86] Enable EVEX512 when host CPU has AVX512 (#90479) (PR #90545)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/90545

>From a7b8b890600a33e0c88d639f311f1d73ccb1c8d2 Mon Sep 17 00:00:00 2001
From: Phoebe Wang
Date: Tue, 30 Apr 2024 10:09:41 +0800
Subject: [PATCH] [X86] Enable EVEX512 when host CPU has AVX512 (#90479)

This is used when -march=native is run on a CPU that is unknown to an old
version of LLVM.

(cherry picked from commit b3291793f11924a3b62601aabebebdcfbb12a9a1)
---
 llvm/lib/TargetParser/Host.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/TargetParser/Host.cpp b/llvm/lib/TargetParser/Host.cpp
index 4466d50458e198..1adef15771fa17 100644
--- a/llvm/lib/TargetParser/Host.cpp
+++ b/llvm/lib/TargetParser/Host.cpp
@@ -1266,8 +1266,10 @@ static void getAvailableFeatures(unsigned ECX, unsigned EDX, unsigned MaxLeaf,
     setFeature(X86::FEATURE_AVX2);
   if (HasLeaf7 && ((EBX >> 8) & 1))
     setFeature(X86::FEATURE_BMI2);
-  if (HasLeaf7 && ((EBX >> 16) & 1) && HasAVX512Save)
+  if (HasLeaf7 && ((EBX >> 16) & 1) && HasAVX512Save) {
     setFeature(X86::FEATURE_AVX512F);
+    setFeature(X86::FEATURE_EVEX512);
+  }
   if (HasLeaf7 && ((EBX >> 17) & 1) && HasAVX512Save)
     setFeature(X86::FEATURE_AVX512DQ);
   if (HasLeaf7 && ((EBX >> 19) & 1))
@@ -1772,6 +1774,7 @@ bool sys::getHostCPUFeatures(StringMap<bool> &Features) {
   Features["rtm"]        = HasLeaf7 && ((EBX >> 11) & 1);
   // AVX512 is only supported if the OS supports the context save for it.
   Features["avx512f"]    = HasLeaf7 && ((EBX >> 16) & 1) && HasAVX512Save;
+  Features["evex512"]    = Features["avx512f"];
   Features["avx512dq"]   = HasLeaf7 && ((EBX >> 17) & 1) && HasAVX512Save;
   Features["rdseed"]     = HasLeaf7 && ((EBX >> 18) & 1);
   Features["adx"]        = HasLeaf7 && ((EBX >> 19) & 1);
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
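As a rough illustration of the second hunk, the sketch below derives the three feature flags from a CPUID leaf 7 EBX value in the same way as the patched lines. It is a standalone model under stated assumptions: `hostAvx512Features` is a hypothetical helper, a `std::map` stands in for LLVM's `StringMap<bool>`, and the inputs are passed in rather than read from the CPU.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of the patched feature derivation. ebx is the EBX
// value of CPUID leaf 7; hasAVX512Save says whether the OS saves the
// AVX-512 register state on context switch.
inline std::map<std::string, bool>
hostAvx512Features(bool hasLeaf7, uint32_t ebx, bool hasAVX512Save) {
  std::map<std::string, bool> features;
  features["avx512f"] = hasLeaf7 && ((ebx >> 16) & 1) && hasAVX512Save;
  // The patch makes evex512 follow avx512f exactly.
  features["evex512"] = features["avx512f"];
  features["avx512dq"] = hasLeaf7 && ((ebx >> 17) & 1) && hasAVX512Save;
  return features;
}
```

For example, `hostAvx512Features(true, 1u << 16, true).at("evex512")` is true, while the same call with `hasAVX512Save` set to false reports both `avx512f` and `evex512` as false.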
[llvm-branch-commits] [llvm] a7b8b89 - [X86] Enable EVEX512 when host CPU has AVX512 (#90479)
Author: Phoebe Wang
Date: 2024-05-01T11:32:03-07:00
New Revision: a7b8b890600a33e0c88d639f311f1d73ccb1c8d2

URL: https://github.com/llvm/llvm-project/commit/a7b8b890600a33e0c88d639f311f1d73ccb1c8d2
DIFF: https://github.com/llvm/llvm-project/commit/a7b8b890600a33e0c88d639f311f1d73ccb1c8d2.diff

LOG: [X86] Enable EVEX512 when host CPU has AVX512 (#90479)

This is used when -march=native is run on a CPU that is unknown to an old
version of LLVM.

(cherry picked from commit b3291793f11924a3b62601aabebebdcfbb12a9a1)

Added:

Modified:
    llvm/lib/TargetParser/Host.cpp

Removed:
[llvm-branch-commits] [llvm] release/18.x: [X86] Enable EVEX512 when host CPU has AVX512 (#90479) (PR #90545)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/90545
[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375) (PR #90673)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/90673

>From ece9d35f1a705ab8d66895c6d985907f2b9a2c0c Mon Sep 17 00:00:00 2001
From: Amara Emerson
Date: Wed, 1 May 2024 05:42:14 +0800
Subject: [PATCH] [GlobalISel] Fix store merging incorrectly classifying an
 unknown index expr as 0. (#90375)

During analysis, we incorrectly leave the offset part of an address info
struct as zero, when in actual fact we failed to decompose it into
base + offset. This results in incorrectly assuming that the address is
adjacent to another store addr. To fix this we wrap the offset in an
optional<> so we can distinguish between real zero and unknown.

Fixes issue #90242

(cherry picked from commit 19f4d68252b70c81ebb1686a5a31069eda5373de)
---
 .../llvm/CodeGen/GlobalISel/LoadStoreOpt.h    | 20 --
 llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp  | 48 --
 .../AArch64/GlobalISel/store-merging.ll       | 19 ++
 .../AArch64/GlobalISel/store-merging.mir      | 62 ---
 4 files changed, 115 insertions(+), 34 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h b/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
index 0f20a33f3a755c..7990997835d019 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
@@ -35,11 +35,23 @@ struct LegalityQuery;
 class MachineRegisterInfo;
 namespace GISelAddressing {
 /// Helper struct to store a base, index and offset that forms an address
-struct BaseIndexOffset {
+class BaseIndexOffset {
+private:
   Register BaseReg;
   Register IndexReg;
-  int64_t Offset = 0;
-  bool IsIndexSignExt = false;
+  std::optional<int64_t> Offset;
+
+public:
+  BaseIndexOffset() = default;
+  Register getBase() { return BaseReg; }
+  Register getBase() const { return BaseReg; }
+  Register getIndex() { return IndexReg; }
+  Register getIndex() const { return IndexReg; }
+  void setBase(Register NewBase) { BaseReg = NewBase; }
+  void setIndex(Register NewIndex) { IndexReg = NewIndex; }
+  void setOffset(std::optional<int64_t> NewOff) { Offset = NewOff; }
+  bool hasValidOffset() const { return Offset.has_value(); }
+  int64_t getOffset() const { return *Offset; }
 };
 
 /// Returns a BaseIndexOffset which describes the pointer in \p Ptr.
@@ -89,7 +101,7 @@ class LoadStoreOpt : public MachineFunctionPass {
     // order stores are writing to incremeneting consecutive addresses. So when
     // we walk the block in reverse order, the next eligible store must write to
     // an offset one store width lower than CurrentLowestOffset.
-    uint64_t CurrentLowestOffset;
+    int64_t CurrentLowestOffset;
     SmallVector<GStore *> Stores;
     // A vector of MachineInstr/unsigned pairs to denote potential aliases that
     // need to be checked before the candidate is considered safe to merge. The
diff --git a/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp b/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
index 246aa88b09acf6..ee499c41c558c3 100644
--- a/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
@@ -84,21 +84,20 @@ BaseIndexOffset GISelAddressing::getPointerInfo(Register Ptr,
                                                 MachineRegisterInfo &MRI) {
   BaseIndexOffset Info;
   Register PtrAddRHS;
-  if (!mi_match(Ptr, MRI, m_GPtrAdd(m_Reg(Info.BaseReg), m_Reg(PtrAddRHS)))) {
-    Info.BaseReg = Ptr;
-    Info.IndexReg = Register();
-    Info.IsIndexSignExt = false;
+  Register BaseReg;
+  if (!mi_match(Ptr, MRI, m_GPtrAdd(m_Reg(BaseReg), m_Reg(PtrAddRHS)))) {
+    Info.setBase(Ptr);
+    Info.setOffset(0);
     return Info;
   }
-
+  Info.setBase(BaseReg);
   auto RHSCst = getIConstantVRegValWithLookThrough(PtrAddRHS, MRI);
   if (RHSCst)
-    Info.Offset = RHSCst->Value.getSExtValue();
+    Info.setOffset(RHSCst->Value.getSExtValue());
 
   // Just recognize a simple case for now. In future we'll need to match
   // indexing patterns for base + index + constant.
-  Info.IndexReg = PtrAddRHS;
-  Info.IsIndexSignExt = false;
+  Info.setIndex(PtrAddRHS);
   return Info;
 }
 
@@ -114,15 +113,16 @@ bool GISelAddressing::aliasIsKnownForLoadStore(const MachineInstr &MI1,
   BaseIndexOffset BasePtr0 = getPointerInfo(LdSt1->getPointerReg(), MRI);
   BaseIndexOffset BasePtr1 = getPointerInfo(LdSt2->getPointerReg(), MRI);
 
-  if (!BasePtr0.BaseReg.isValid() || !BasePtr1.BaseReg.isValid())
+  if (!BasePtr0.getBase().isValid() || !BasePtr1.getBase().isValid())
     return false;
 
   int64_t Size1 = LdSt1->getMemSize();
   int64_t Size2 = LdSt2->getMemSize();
 
   int64_t PtrDiff;
-  if (BasePtr0.BaseReg == BasePtr1.BaseReg) {
-    PtrDiff = BasePtr1.Offset - BasePtr0.Offset;
+  if (BasePtr0.getBase() == BasePtr1.getBase() && BasePtr0.hasValidOffset() &&
+      BasePtr1.hasValidOffset()) {
+    PtrDiff = BasePtr1.getOffset() - BasePtr0.getOffset();
     // If the size of memory access is unknown, do not use it to do analysis.
     // One example of
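The core idea of the fix, telling "offset is zero" apart from "offset is unknown", can be sketched outside LLVM as follows. This is a simplified illustration, not the real `BaseIndexOffset` class: `AddrInfo`, `known`, `unknown` and `ptrDiff` are hypothetical names, and registers are modelled as plain integers. It needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, simplified address description: a base register id plus
// an offset that is absent when the address could not be decomposed.
struct AddrInfo {
  unsigned Base = 0;
  std::optional<int64_t> Offset;
};

inline AddrInfo known(unsigned base, int64_t offset) {
  return AddrInfo{base, offset};
}
inline AddrInfo unknown(unsigned base) {
  return AddrInfo{base, std::nullopt};
}

// Distance from A to B in bytes, or nullopt if it cannot be determined.
// With a plain int64_t defaulting to 0, an undecomposed address would
// wrongly look like "base + 0" and could appear adjacent to a real store.
inline std::optional<int64_t> ptrDiff(const AddrInfo &A, const AddrInfo &B) {
  if (A.Base != B.Base || !A.Offset || !B.Offset)
    return std::nullopt;
  return *B.Offset - *A.Offset;
}
```

Here `ptrDiff(known(1, 0), known(1, 8))` yields 8, whereas `ptrDiff(known(1, 0), unknown(1))` yields no value, so a caller has to treat the two stores as not provably adjacent.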
[llvm-branch-commits] [llvm] ece9d35 - [GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375)
Author: Amara Emerson
Date: 2024-05-01T11:35:12-07:00
New Revision: ece9d35f1a705ab8d66895c6d985907f2b9a2c0c

URL: https://github.com/llvm/llvm-project/commit/ece9d35f1a705ab8d66895c6d985907f2b9a2c0c
DIFF: https://github.com/llvm/llvm-project/commit/ece9d35f1a705ab8d66895c6d985907f2b9a2c0c.diff

LOG: [GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375)

During analysis, we incorrectly leave the offset part of an address info
struct as zero, when in actual fact we failed to decompose it into
base + offset. This results in incorrectly assuming that the address is
adjacent to another store addr. To fix this we wrap the offset in an
optional<> so we can distinguish between real zero and unknown.

Fixes issue #90242

(cherry picked from commit 19f4d68252b70c81ebb1686a5a31069eda5373de)

Added:

Modified:
    llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
    llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
    llvm/test/CodeGen/AArch64/GlobalISel/store-merging.ll
    llvm/test/CodeGen/AArch64/GlobalISel/store-merging.mir

Removed:
[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375) (PR #90673)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/90673 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/18.x: [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659) (PR #90682)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/90682

>From 20b9ed64ea074f03057e1d775a1d9d0f067ab0b0 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng
Date: Wed, 1 May 2024 06:51:36 +0800
Subject: [PATCH] [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659)

```
SelectionDAG has 17 nodes:
  t0: ch,glue = EntryToken
  t6: i64,ch = CopyFromReg t0, Register:i64 %2
  t8: i1 = truncate t6
  t4: i64,ch = CopyFromReg t0, Register:i64 %1
  t7: i1 = truncate t4
  t2: i64,ch = CopyFromReg t0, Register:i64 %0
  t10: i64,i1 = saddo t2, Constant:i64<1>
  t11: i1 = or t8, t10:1
  t12: i1 = select t7, t8, t11
  t13: i64 = any_extend t12
  t15: ch,glue = CopyToReg t0, Register:i64 $x10, t13
  t16: ch = RISCVISD::RET_GLUE t15, Register:i64 $x10, t15:1
```

`OtherOpVT` should be i1, but `OtherOp->getValueType(0)` returns `i64`, which ignores `ResNo` in `SDValue`.

Fix https://github.com/llvm/llvm-project/issues/90652.

(cherry picked from commit 2647bd73696ae987addd0e74774a44108accb1e6)
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp |  2 +-
 llvm/test/CodeGen/RISCV/pr90652.ll          | 19 +++
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/RISCV/pr90652.ll

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index a0cec426002b6f..d46093b9e260a2 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -14559,7 +14559,7 @@ static SDValue tryFoldSelectIntoOp(SDNode *N, SelectionDAG &DAG,
   EVT VT = N->getValueType(0);
   SDLoc DL(N);
   SDValue OtherOp = TrueVal.getOperand(1 - OpToFold);
-  EVT OtherOpVT = OtherOp->getValueType(0);
+  EVT OtherOpVT = OtherOp.getValueType();
   SDValue IdentityOperand =
       DAG.getNeutralElement(Opc, DL, OtherOpVT, N->getFlags());
   if (!Commutative)
diff --git a/llvm/test/CodeGen/RISCV/pr90652.ll b/llvm/test/CodeGen/RISCV/pr90652.ll
new file mode 100644
index 00..2162395b92ac3c
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/pr90652.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc < %s -mtriple=riscv64 | FileCheck %s
+
+define i1 @test(i64 %x, i1 %cond1, i1 %cond2) {
+; CHECK-LABEL: test:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    addi a3, a0, 1
+; CHECK-NEXT:    slt a0, a3, a0
+; CHECK-NEXT:    not a1, a1
+; CHECK-NEXT:    and a0, a1, a0
+; CHECK-NEXT:    or a0, a2, a0
+; CHECK-NEXT:    ret
+entry:
+  %sadd = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %x, i64 1)
+  %ov = extractvalue { i64, i1 } %sadd, 1
+  %or = or i1 %cond2, %ov
+  %sel = select i1 %cond1, i1 %cond2, i1 %or
+  ret i1 %sel
+}
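The one-line fix above hinges on the difference between asking the node for result 0 and asking the value for its own result. The following is a minimal mock, assuming heavily simplified stand-ins: `Node` and `Value` are hypothetical types that only imitate the shape of `SDNode` and `SDValue`, with types stored as strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of a DAG node that can produce several results,
// e.g. saddo producing {i64, i1}.
struct Node {
  std::vector<std::string> ResultTypes;
  const std::string &getValueType(unsigned ResNo) const {
    return ResultTypes[ResNo];
  }
};

// Hypothetical miniature of a value handle: a node plus the index of the
// result being referred to.
struct Value {
  const Node *N;
  unsigned ResNo;
  // Honors ResNo, so it reports the type of the result actually used.
  const std::string &getValueType() const { return N->getValueType(ResNo); }
  // Forwards to the node, so callers must pass the result index themselves.
  const Node *operator->() const { return N; }
};
```

With a `saddo`-like node whose results are `i64` and `i1`, a handle to result 1 reports `i1` through `getValueType()`, while `->getValueType(0)` reports `i64` regardless of which result the handle refers to.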
[llvm-branch-commits] [llvm] 20b9ed6 - [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659)
Author: Yingwei Zheng
Date: 2024-05-01T11:39:11-07:00
New Revision: 20b9ed64ea074f03057e1d775a1d9d0f067ab0b0

URL: https://github.com/llvm/llvm-project/commit/20b9ed64ea074f03057e1d775a1d9d0f067ab0b0
DIFF: https://github.com/llvm/llvm-project/commit/20b9ed64ea074f03057e1d775a1d9d0f067ab0b0.diff

LOG: [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659)

```
SelectionDAG has 17 nodes:
  t0: ch,glue = EntryToken
  t6: i64,ch = CopyFromReg t0, Register:i64 %2
  t8: i1 = truncate t6
  t4: i64,ch = CopyFromReg t0, Register:i64 %1
  t7: i1 = truncate t4
  t2: i64,ch = CopyFromReg t0, Register:i64 %0
  t10: i64,i1 = saddo t2, Constant:i64<1>
  t11: i1 = or t8, t10:1
  t12: i1 = select t7, t8, t11
  t13: i64 = any_extend t12
  t15: ch,glue = CopyToReg t0, Register:i64 $x10, t13
  t16: ch = RISCVISD::RET_GLUE t15, Register:i64 $x10, t15:1
```

`OtherOpVT` should be i1, but `OtherOp->getValueType(0)` returns `i64`, which ignores `ResNo` in `SDValue`.

Fix https://github.com/llvm/llvm-project/issues/90652.

(cherry picked from commit 2647bd73696ae987addd0e74774a44108accb1e6)

Added:
    llvm/test/CodeGen/RISCV/pr90652.ll

Modified:
    llvm/lib/Target/RISCV/RISCVISelLowering.cpp

Removed:


diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index a0cec426002b6f..d46093b9e260a2 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -14559,7 +14559,7 @@ static SDValue tryFoldSelectIntoOp(SDNode *N, SelectionDAG &DAG,
   EVT VT = N->getValueType(0);
   SDLoc DL(N);
   SDValue OtherOp = TrueVal.getOperand(1 - OpToFold);
-  EVT OtherOpVT = OtherOp->getValueType(0);
+  EVT OtherOpVT = OtherOp.getValueType();
   SDValue IdentityOperand =
       DAG.getNeutralElement(Opc, DL, OtherOpVT, N->getFlags());
   if (!Commutative)
diff --git a/llvm/test/CodeGen/RISCV/pr90652.ll b/llvm/test/CodeGen/RISCV/pr90652.ll
new file mode 100644
index 00..2162395b92ac3c
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/pr90652.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc < %s -mtriple=riscv64 | FileCheck %s
+
+define i1 @test(i64 %x, i1 %cond1, i1 %cond2) {
+; CHECK-LABEL: test:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    addi a3, a0, 1
+; CHECK-NEXT:    slt a0, a3, a0
+; CHECK-NEXT:    not a1, a1
+; CHECK-NEXT:    and a0, a1, a0
+; CHECK-NEXT:    or a0, a2, a0
+; CHECK-NEXT:    ret
+entry:
+  %sadd = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %x, i64 1)
+  %ov = extractvalue { i64, i1 } %sadd, 1
+  %or = or i1 %cond2, %ov
+  %sel = select i1 %cond1, i1 %cond2, i1 %or
+  ret i1 %sel
+}
[llvm-branch-commits] [llvm] release/18.x: [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659) (PR #90682)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/90682
[llvm-branch-commits] [llvm] release/18.x: [RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (#90659) (PR #90682)
tstellar wrote: Hi @dtcxzyw (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90682
[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375) (PR #90673)
tstellar wrote: Hi @aemerson (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90673
[llvm-branch-commits] [llvm] release/18.x: [X86] Enable EVEX512 when host CPU has AVX512 (#90479) (PR #90545)
tstellar wrote: Hi @nikic (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90545
[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Don't form anyextending atomic loads. (PR #90435)
tstellar wrote: Hi @nikic (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90435
[llvm-branch-commits] [llvm] release/18.x: [AArch64] Remove invalid uabdl patterns. (#89272) (PR #89380)
tstellar wrote: Hi @AtariDreams (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89380
[llvm-branch-commits] [clang] release/18.x: [clang][CoverageMapping] do not emit a gap region when either end doesn't have valid source locations (#89564) (PR #90369)
tstellar wrote: Hi @whentojump (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90369
[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Check hasEVEX512 for canExtendTo512DQ (#90390) (PR #90422)
tstellar wrote: Hi @phoebewang (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90422
[llvm-branch-commits] [llvm] release/18.x: [CGP] Drop poison-generating flags after hoisting (#90382) (PR #90437)
tstellar wrote: Hi @dtcxzyw (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90437
[llvm-branch-commits] [clang] [Clang] Handle structs with inner structs and no fields (#89126) (PR #90133)
tstellar wrote: Hi @bwendling (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90133
[llvm-branch-commits] [llvm] release/18.x: [IRCE] Skip icmp ptr in `InductiveRangeCheck::parseRangeCheckICmp` (#89967) (PR #90182)
tstellar wrote: Hi @nikic (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90182
[llvm-branch-commits] [llvm] Backport ARM64EC variadic args fixes to LLVM 18 (PR #81800)
tstellar wrote: Hi @dpaoliello (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/81800
[llvm-branch-commits] [clang] release/18.x: [clang-format] Fix a regression in ContinuationIndenter (#88414) (PR #89412)
tstellar wrote: Hi @owenca (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89412
[llvm-branch-commits] [clang] release/18.x: [clang-format] Fix a regression in annotating TrailingReturnArrow (#86624) (PR #89415)
tstellar wrote: Hi @owenca (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89415
[llvm-branch-commits] [llvm] [AMDGPU] Fix setting nontemporal in memory legalizer (#83815) (PR #90204)
tstellar wrote: Hi @jayfoad (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90204
[llvm-branch-commits] [llvm] release/18.x: [DAGCombiner] Fix miscompile bug in combineShiftOfShiftedLogic (#89616) (PR #89766)
tstellar wrote: @AtariDreams (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89766
[llvm-branch-commits] [clang] [polly] release/18.x: [clang-format] Correctly annotate braces in macros (#87… (PR #89491)
tstellar wrote: @owenca (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89491
[llvm-branch-commits] [libcxx] release/18.x: [libcxx] [modules] Add _LIBCPP_USING_IF_EXISTS on aligned_alloc (#89827) (PR #89894)
tstellar wrote: @mstorsjo (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89894
[llvm-branch-commits] [clang] [analyzer] Backport performace regression fix (PR #89725)
tstellar wrote: Hi @steakhal (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89725
[llvm-branch-commits] [llvm] release/18.x: [X86] Fix miscompile in combineShiftRightArithmetic (PR #86728)
tstellar wrote: Hi @AtariDreams (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/86728
[llvm-branch-commits] [llvm] release/18.x: [GlobalISel] Fix fewerElementsVectorPhi to insert after G_PHIs (#87927) (PR #89240)
tstellar wrote: Hi @AtariDreams (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89240
[llvm-branch-commits] [llvm] release/18.x: [InstCombine] Fix unexpected overwriting in `foldSelectWithSRem` (#89539) (PR #89546)
tstellar wrote: Hi @dtcxzyw (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89546
[llvm-branch-commits] [clang] Backport fix for crash reported in #88181 (PR #89022)
tstellar wrote: Hi @steakhal (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89022
[llvm-branch-commits] [llvm] release/18.x: [X86] Always use 64-bit relocations in no-PIC large code model (#89101) (PR #89124)
tstellar wrote: Hi @aeubanks (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/89124
[llvm-branch-commits] [clang] release/18.x: [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151) (PR #90639)
tstellar wrote: @efriedma-quic (or anyone else). If you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90639
[llvm-branch-commits] [clang] release/18.x: [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151) (PR #90639)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/90639

>From 617a15a9eac96088ae5e9134248d8236e34b91b1 Mon Sep 17 00:00:00 2001
From: Eli Friedman
Date: Mon, 29 Apr 2024 12:00:12 -0700
Subject: [PATCH] [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151)

In the context of determining whether a class counts as an "aggregate", a constructor template counts as a user-provided constructor.

Fixes #86384

(cherry picked from commit 3ab4ae9e58c09dfd8203547ba8916f3458a0a481)
---
 clang/docs/ReleaseNotes.rst                      |  6 ++
 clang/lib/CodeGen/MicrosoftCXXABI.cpp            | 12 +---
 clang/test/CodeGen/arm64-microsoft-arguments.cpp | 15 +++
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 1e88b58725bd95..e533ecfd5aeba5 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -149,6 +149,12 @@ ABI Changes in This Version
 - Following the SystemV ABI for x86-64, ``__int128`` arguments will no longer
   be split between a register and a stack slot.
 
+- Fixed Microsoft calling convention for returning certain classes with a
+  templated constructor. If a class has a templated constructor, it should
+  be returned indirectly even if it meets all the other requirements for
+  returning a class in a register. This affects some uses of std::pair.
+  (#GH86384).
+
 AST Dumping Potentially Breaking Changes
 ----------------------------------------
 - When dumping a sugared type, Clang will no longer print the desugared type if
diff --git a/clang/lib/CodeGen/MicrosoftCXXABI.cpp b/clang/lib/CodeGen/MicrosoftCXXABI.cpp
index 172c4c937b9728..4d0f4c63f843b8 100644
--- a/clang/lib/CodeGen/MicrosoftCXXABI.cpp
+++ b/clang/lib/CodeGen/MicrosoftCXXABI.cpp
@@ -1135,9 +1135,15 @@ static bool isTrivialForMSVC(const CXXRecordDecl *RD, QualType Ty,
     return false;
   if (RD->hasNonTrivialCopyAssignment())
     return false;
-  for (const CXXConstructorDecl *Ctor : RD->ctors())
-    if (Ctor->isUserProvided())
-      return false;
+  for (const Decl *D : RD->decls()) {
+    if (auto *Ctor = dyn_cast<CXXConstructorDecl>(D)) {
+      if (Ctor->isUserProvided())
+        return false;
+    } else if (auto *Template = dyn_cast<FunctionTemplateDecl>(D)) {
+      if (isa<CXXConstructorDecl>(Template->getTemplatedDecl()))
+        return false;
+    }
+  }
   if (RD->hasNonTrivialDestructor())
     return false;
   return true;
diff --git a/clang/test/CodeGen/arm64-microsoft-arguments.cpp b/clang/test/CodeGen/arm64-microsoft-arguments.cpp
index e8309888dcfe21..85472645acb3b3 100644
--- a/clang/test/CodeGen/arm64-microsoft-arguments.cpp
+++ b/clang/test/CodeGen/arm64-microsoft-arguments.cpp
@@ -201,3 +201,18 @@ S11 f11() {
   S11 x;
   return func11(x);
 }
+
+// GH86384
+// Pass and return object with template constructor (pass directly,
+// return indirectly).
+// CHECK: define dso_local void @"?f12@@YA?AUS12@@XZ"(ptr dead_on_unwind inreg noalias writable sret(%struct.S12) align 4 {{.*}})
+// CHECK: call void @"?func12@@YA?AUS12@@U1@@Z"(ptr dead_on_unwind inreg writable sret(%struct.S12) align 4 {{.*}}, i64 {{.*}})
+struct S12 {
+  template<typename T> S12(T*) {}
+  int x;
+};
+S12 func12(S12 x);
+S12 f12() {
+  S12 x((int*)0);
+  return func12(x);
+}
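The commit message above says a constructor template counts as a user-provided constructor when deciding whether a class is an "aggregate". The standard C++ notion of an aggregate behaves the same way, which can be checked at compile time. This is only a language-level sketch (C++17 or later): it does not exercise the Microsoft ABI rule itself, and `Plain` and `WithCtorTemplate` are hypothetical types modeled loosely on `S12` from the test.

```cpp
#include <cassert>
#include <type_traits>

// No constructors declared: an aggregate.
struct Plain {
  int x;
};

// Declares only a constructor template. No ordinary constructor is written
// out, yet the class is no longer an aggregate.
struct WithCtorTemplate {
  template <typename T> WithCtorTemplate(T *) {}
  int x;
};

static_assert(std::is_aggregate_v<Plain>,
              "a class without constructors is an aggregate");
static_assert(!std::is_aggregate_v<WithCtorTemplate>,
              "a constructor template disqualifies aggregate status");
```

A check that walks only the non-template constructors of a class would miss the second case. The patch avoids that by iterating over all member declarations and looking for function templates whose templated declaration is a constructor.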
[llvm-branch-commits] [clang] 617a15a - [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151)
Author: Eli Friedman
Date: 2024-05-01T15:56:33-07:00
New Revision: 617a15a9eac96088ae5e9134248d8236e34b91b1

URL: https://github.com/llvm/llvm-project/commit/617a15a9eac96088ae5e9134248d8236e34b91b1
DIFF: https://github.com/llvm/llvm-project/commit/617a15a9eac96088ae5e9134248d8236e34b91b1.diff

LOG: [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151)

In the context of determining whether a class counts as an "aggregate", a constructor template counts as a user-provided constructor.

Fixes #86384

(cherry picked from commit 3ab4ae9e58c09dfd8203547ba8916f3458a0a481)

Added:


Modified:
    clang/docs/ReleaseNotes.rst
    clang/lib/CodeGen/MicrosoftCXXABI.cpp
    clang/test/CodeGen/arm64-microsoft-arguments.cpp

Removed:


diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 1e88b58725bd95..e533ecfd5aeba5 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -149,6 +149,12 @@ ABI Changes in This Version
 - Following the SystemV ABI for x86-64, ``__int128`` arguments will no longer
   be split between a register and a stack slot.
 
+- Fixed Microsoft calling convention for returning certain classes with a
+  templated constructor. If a class has a templated constructor, it should
+  be returned indirectly even if it meets all the other requirements for
+  returning a class in a register. This affects some uses of std::pair.
+  (#GH86384).
+
 AST Dumping Potentially Breaking Changes
 ----------------------------------------
 - When dumping a sugared type, Clang will no longer print the desugared type if
diff --git a/clang/lib/CodeGen/MicrosoftCXXABI.cpp b/clang/lib/CodeGen/MicrosoftCXXABI.cpp
index 172c4c937b9728..4d0f4c63f843b8 100644
--- a/clang/lib/CodeGen/MicrosoftCXXABI.cpp
+++ b/clang/lib/CodeGen/MicrosoftCXXABI.cpp
@@ -1135,9 +1135,15 @@ static bool isTrivialForMSVC(const CXXRecordDecl *RD, QualType Ty,
     return false;
   if (RD->hasNonTrivialCopyAssignment())
     return false;
-  for (const CXXConstructorDecl *Ctor : RD->ctors())
-    if (Ctor->isUserProvided())
-      return false;
+  for (const Decl *D : RD->decls()) {
+    if (auto *Ctor = dyn_cast<CXXConstructorDecl>(D)) {
+      if (Ctor->isUserProvided())
+        return false;
+    } else if (auto *Template = dyn_cast<FunctionTemplateDecl>(D)) {
+      if (isa<CXXConstructorDecl>(Template->getTemplatedDecl()))
+        return false;
+    }
+  }
   if (RD->hasNonTrivialDestructor())
     return false;
   return true;
diff --git a/clang/test/CodeGen/arm64-microsoft-arguments.cpp b/clang/test/CodeGen/arm64-microsoft-arguments.cpp
index e8309888dcfe21..85472645acb3b3 100644
--- a/clang/test/CodeGen/arm64-microsoft-arguments.cpp
+++ b/clang/test/CodeGen/arm64-microsoft-arguments.cpp
@@ -201,3 +201,18 @@ S11 f11() {
   S11 x;
   return func11(x);
 }
+
+// GH86384
+// Pass and return object with template constructor (pass directly,
+// return indirectly).
+// CHECK: define dso_local void @"?f12@@YA?AUS12@@XZ"(ptr dead_on_unwind inreg noalias writable sret(%struct.S12) align 4 {{.*}})
+// CHECK: call void @"?func12@@YA?AUS12@@U1@@Z"(ptr dead_on_unwind inreg writable sret(%struct.S12) align 4 {{.*}}, i64 {{.*}})
+struct S12 {
+  template<typename T> S12(T*) {}
+  int x;
+};
+S12 func12(S12 x);
+S12 f12() {
+  S12 x((int*)0);
+  return func12(x);
+}
[llvm-branch-commits] [clang] release/18.x: [clang codegen] Fix MS ABI detection of user-provided constructors. (#90151) (PR #90639)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/90639
[llvm-branch-commits] [llvm] [AMDGPU] Fix setting nontemporal in memory legalizer (#83815) (PR #90204)
tstellar wrote: @jayfoad If it's not noteworthy, then it's OK to not add a release note. We don't typically have a list of fixed bugs in the release notes. https://github.com/llvm/llvm-project/pull/90204
[llvm-branch-commits] [llvm] Bump version to 18.1.6 (PR #91094)
https://github.com/tstellar created https://github.com/llvm/llvm-project/pull/91094

None

>From b71b9cfce7f3e5dce0cf1856df95cfe8d16252f1 Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Sat, 4 May 2024 21:56:44 +
Subject: [PATCH] Bump version to 18.1.6

---
 llvm/CMakeLists.txt            | 2 +-
 llvm/utils/lit/lit/__init__.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index f82be164ac9c48..26b7b01bb1f8de 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -22,7 +22,7 @@ if(NOT DEFINED LLVM_VERSION_MINOR)
   set(LLVM_VERSION_MINOR 1)
 endif()
 if(NOT DEFINED LLVM_VERSION_PATCH)
-  set(LLVM_VERSION_PATCH 5)
+  set(LLVM_VERSION_PATCH 6)
 endif()
 if(NOT DEFINED LLVM_VERSION_SUFFIX)
   set(LLVM_VERSION_SUFFIX)
diff --git a/llvm/utils/lit/lit/__init__.py b/llvm/utils/lit/lit/__init__.py
index 1cfcc7d37813bc..d8b0e3bd1c69e3 100644
--- a/llvm/utils/lit/lit/__init__.py
+++ b/llvm/utils/lit/lit/__init__.py
@@ -2,7 +2,7 @@
 __author__ = "Daniel Dunbar"
 __email__ = "dan...@minormatter.com"
 
-__versioninfo__ = (18, 1, 5)
+__versioninfo__ = (18, 1, 6)
 __version__ = ".".join(str(v) for v in __versioninfo__) + "dev"
 
 __all__ = []
[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)
https://github.com/tstellar created https://github.com/llvm/llvm-project/pull/91095

None

>From b71b9cfce7f3e5dce0cf1856df95cfe8d16252f1 Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Sat, 4 May 2024 21:56:44 +
Subject: [PATCH 1/4] Bump version to 18.1.6

---
 llvm/CMakeLists.txt            | 2 +-
 llvm/utils/lit/lit/__init__.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index f82be164ac9c48..26b7b01bb1f8de 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -22,7 +22,7 @@ if(NOT DEFINED LLVM_VERSION_MINOR)
   set(LLVM_VERSION_MINOR 1)
 endif()
 if(NOT DEFINED LLVM_VERSION_PATCH)
-  set(LLVM_VERSION_PATCH 5)
+  set(LLVM_VERSION_PATCH 6)
 endif()
 if(NOT DEFINED LLVM_VERSION_SUFFIX)
   set(LLVM_VERSION_SUFFIX)
diff --git a/llvm/utils/lit/lit/__init__.py b/llvm/utils/lit/lit/__init__.py
index 1cfcc7d37813bc..d8b0e3bd1c69e3 100644
--- a/llvm/utils/lit/lit/__init__.py
+++ b/llvm/utils/lit/lit/__init__.py
@@ -2,7 +2,7 @@
 __author__ = "Daniel Dunbar"
 __email__ = "dan...@minormatter.com"
 
-__versioninfo__ = (18, 1, 5)
+__versioninfo__ = (18, 1, 6)
 __version__ = ".".join(str(v) for v in __versioninfo__) + "dev"
 
 __all__ = []

>From dc6392e374ef8367e98b996569f3bb2898bcb99a Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Wed, 24 Apr 2024 07:47:42 -0700
Subject: [PATCH 2/4] [CMake][Release] Add stage2-package target (#89517)

This target will be used to generate the release binary package for uploading to GitHub.

(cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e)
---
 clang/cmake/caches/Release.cmake | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index bd1f688d61a7ea..fa972636553f1f 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -14,6 +14,7 @@ if (LLVM_RELEASE_ENABLE_PGO)
   set(CLANG_BOOTSTRAP_TARGETS
     generate-profdata
     stage2
+    stage2-package
     stage2-clang
     stage2-distribution
     stage2-install
@@ -57,6 +58,7 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
 set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "")
 set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS
   clang
+  package
   check-all
   check-llvm
   check-clang CACHE STRING "")

>From 89f6c6ed99e27397e1d4ac8a0cf2e7d3cf11bccd Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Thu, 25 Apr 2024 15:32:08 -0700
Subject: [PATCH 3/4] [CMake][Release] Refactor cache file and use two stages for non-PGO builds (#89812)

Completely refactor the cache file to simplify it and remove unnecessary variables. The main functional change here is that the non-PGO builds now use two stages, so `ninja -C build stage2-package` can be used with both PGO and non-PGO builds.

(cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69)
---
 clang/cmake/caches/Release.cmake | 134 +++
 1 file changed, 66 insertions(+), 68 deletions(-)

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index fa972636553f1f..c164d5497275f3 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -1,95 +1,93 @@
 # Plain options configure the first build.
 # BOOTSTRAP_* options configure the second build.
 # BOOTSTRAP_BOOTSTRAP_* options configure the third build.
+# PGO Builds have 3 stages (stage1, stage2-instrumented, stage2)
+# non-PGO Builds have 2 stages (stage1, stage2)
 
-# General Options
+
+function (set_final_stage_var name value type)
+  if (LLVM_RELEASE_ENABLE_PGO)
+    set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  else()
+    set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+function (set_instrument_and_final_stage_var name value type)
+  # This sets the varaible for the final stage in non-PGO builds and in
+  # the stage2-instrumented stage for PGO builds.
+  set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  if (LLVM_RELEASE_ENABLE_PGO)
+    # Set the variable in the final stage for PGO builds.
+    set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+# General Options:
+# If you want to override any of the LLVM_RELEASE_* variables you can set them
+# on the command line via -D, but you need to do this before you pass this
+# cache file to CMake via -C. e.g.
+#
+# cmake -D LLVM_RELEASE_ENABLE_PGO=ON -C Release.cmake
 set(LLVM_RELEASE_ENABLE_LTO THIN CACHE STRING "")
 set(LLVM_RELEASE_ENABLE_PGO OFF CACHE BOOL "")
-
+set(LLVM_RELEASE_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" CACHE STRING "")
+set(LLVM_RELEASE_ENABLE_PROJECTS "clang;lld;lldb;clang-tools-extra;bolt;polly;mlir;flang" CACHE STRING "")
+# Note we don't need to add install here, since it is one of the pre-defined
+# steps.
+set(LLVM_RELEASE_FINAL_STAGE_TARGETS "clang;package;check-all;check-llvm;check-clang" CACHE STRING "")
 set(CMAKE_BUILD_TYPE RELEASE CACHE STRING "")
 
-# Stage 1 Bo
[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91095 >From b71b9cfce7f3e5dce0cf1856df95cfe8d16252f1 Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Sat, 4 May 2024 21:56:44 + Subject: [PATCH 1/8] Bump version to 18.1.6 --- llvm/CMakeLists.txt| 2 +- llvm/utils/lit/lit/__init__.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt index f82be164ac9c48..26b7b01bb1f8de 100644 --- a/llvm/CMakeLists.txt +++ b/llvm/CMakeLists.txt @@ -22,7 +22,7 @@ if(NOT DEFINED LLVM_VERSION_MINOR) set(LLVM_VERSION_MINOR 1) endif() if(NOT DEFINED LLVM_VERSION_PATCH) - set(LLVM_VERSION_PATCH 5) + set(LLVM_VERSION_PATCH 6) endif() if(NOT DEFINED LLVM_VERSION_SUFFIX) set(LLVM_VERSION_SUFFIX) diff --git a/llvm/utils/lit/lit/__init__.py b/llvm/utils/lit/lit/__init__.py index 1cfcc7d37813bc..d8b0e3bd1c69e3 100644 --- a/llvm/utils/lit/lit/__init__.py +++ b/llvm/utils/lit/lit/__init__.py @@ -2,7 +2,7 @@ __author__ = "Daniel Dunbar" __email__ = "dan...@minormatter.com" -__versioninfo__ = (18, 1, 5) +__versioninfo__ = (18, 1, 6) __version__ = ".".join(str(v) for v in __versioninfo__) + "dev" __all__ = [] >From dc6392e374ef8367e98b996569f3bb2898bcb99a Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Wed, 24 Apr 2024 07:47:42 -0700 Subject: [PATCH 2/8] [CMake][Release] Add stage2-package target (#89517) This target will be used to generate the release binary package for uploading to GitHub. 
(cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e) --- clang/cmake/caches/Release.cmake | 2 ++ 1 file changed, 2 insertions(+) diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake index bd1f688d61a7ea..fa972636553f1f 100644 --- a/clang/cmake/caches/Release.cmake +++ b/clang/cmake/caches/Release.cmake @@ -14,6 +14,7 @@ if (LLVM_RELEASE_ENABLE_PGO) set(CLANG_BOOTSTRAP_TARGETS generate-profdata stage2 +stage2-package stage2-clang stage2-distribution stage2-install @@ -57,6 +58,7 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "") set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "") set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS clang + package check-all check-llvm check-clang CACHE STRING "") >From 89f6c6ed99e27397e1d4ac8a0cf2e7d3cf11bccd Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Thu, 25 Apr 2024 15:32:08 -0700 Subject: [PATCH 3/8] [CMake][Release] Refactor cache file and use two stages for non-PGO builds (#89812) Completely refactor the cache file to simplify it and remove unnecessary variables. The main functional change here is that the non-PGO builds now use two stages, so `ninja -C build stage2-package` can be used with both PGO and non-PGO builds. (cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69) --- clang/cmake/caches/Release.cmake | 134 +++ 1 file changed, 66 insertions(+), 68 deletions(-) diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake index fa972636553f1f..c164d5497275f3 100644 --- a/clang/cmake/caches/Release.cmake +++ b/clang/cmake/caches/Release.cmake @@ -1,95 +1,93 @@ # Plain options configure the first build. # BOOTSTRAP_* options configure the second build. # BOOTSTRAP_BOOTSTRAP_* options configure the third build. 
+# PGO Builds have 3 stages (stage1, stage2-instrumented, stage2) +# non-PGO Builds have 2 stages (stage1, stage2) -# General Options + +function (set_final_stage_var name value type) + if (LLVM_RELEASE_ENABLE_PGO) +set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "") + else() +set(BOOTSTRAP_${name} ${value} CACHE ${type} "") + endif() +endfunction() + +function (set_instrument_and_final_stage_var name value type) + # This sets the varaible for the final stage in non-PGO builds and in + # the stage2-instrumented stage for PGO builds. + set(BOOTSTRAP_${name} ${value} CACHE ${type} "") + if (LLVM_RELEASE_ENABLE_PGO) +# Set the variable in the final stage for PGO builds. +set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "") + endif() +endfunction() + +# General Options: +# If you want to override any of the LLVM_RELEASE_* variables you can set them +# on the command line via -D, but you need to do this before you pass this +# cache file to CMake via -C. e.g. +# +# cmake -D LLVM_RELEASE_ENABLE_PGO=ON -C Release.cmake set(LLVM_RELEASE_ENABLE_LTO THIN CACHE STRING "") set(LLVM_RELEASE_ENABLE_PGO OFF CACHE BOOL "") - +set(LLVM_RELEASE_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" CACHE STRING "") +set(LLVM_RELEASE_ENABLE_PROJECTS "clang;lld;lldb;clang-tools-extra;bolt;polly;mlir;flang" CACHE STRING "") +# Note we don't need to add install here, since it is one of the pre-defined +# steps. +set(LLVM_RELEASE_FINAL_STAGE_TARGETS "clang;package;check-all;check-llvm;check-clang" CACHE STRING "") set(CMAKE_BUILD_TYPE RELEASE CACHE STRING "") -# Stage 1 Bootstra
[llvm-branch-commits] [llvm] [workflows] Fix libclang-abi-tests to work with new version scheme (PR #91096)
https://github.com/tstellar created https://github.com/llvm/llvm-project/pull/91096

None

>From 19cb0cd2e2e499b46593d4708f0beaab671586bd Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Sat, 4 May 2024 23:10:21 +
Subject: [PATCH] [workflows] Fix libclang-abi-tests to work with new version scheme

---
 .github/workflows/libclang-abi-tests.yml | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/libclang-abi-tests.yml b/.github/workflows/libclang-abi-tests.yml
index ccfc1e5fb8a742..14da910e667ea1 100644
--- a/.github/workflows/libclang-abi-tests.yml
+++ b/.github/workflows/libclang-abi-tests.yml
@@ -51,9 +51,10 @@ jobs:
 id: vars
 run: |
 remote_repo='https://github.com/llvm/llvm-project'
- if [ ${{ steps.version.outputs.LLVM_VERSION_MINOR }} -ne 0 ] || [ ${{ steps.version.outputs.LLVM_VERSION_PATCH }} -eq 0 ]; then
+ echo "BASELINE_VERSION_MINOR=1" >> "$GITHUB_OUTPUT"
+ if [ ${{ steps.version.outputs.LLVM_VERSION_PATCH }} -eq 0 ]; then
 major_version=$(( ${{ steps.version.outputs.LLVM_VERSION_MAJOR }} - 1))
-baseline_ref="llvmorg-$major_version.0.0"
+baseline_ref="llvmorg-$major_version.1.0"
 # If there is a minor release, we want to use that as the base line.
 minor_ref=$(git ls-remote --refs -t "$remote_repo" llvmorg-"$major_version".[1-9].[0-9] | tail -n1 | grep -o 'llvmorg-.\+' || true)
@@ -75,7 +76,7 @@ jobs:
 else
 {
 echo "BASELINE_VERSION_MAJOR=${{ steps.version.outputs.LLVM_VERSION_MAJOR }}"
- echo "BASELINE_REF=llvmorg-${{ steps.version.outputs.LLVM_VERSION_MAJOR }}.0.0"
+ echo "BASELINE_REF=llvmorg-${{ steps.version.outputs.LLVM_VERSION_MAJOR }}.1.0"
 echo "ABI_HEADERS=."
 echo "ABI_LIBS=libclang.so libclang-cpp.so"
 } >> "$GITHUB_OUTPUT"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
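For readers following the patch above, here is a rough sketch of the baseline selection it ends up with. This is not the workflow's actual code: the function name `baseline_ref` and its two arguments are made up for illustration, and the `git ls-remote` lookup for a later minor release is left out. The only thing taken from the patch is the rule itself: with the new version scheme the baseline tag is X.1.0, from the previous major when the patch level is 0 and from the current major otherwise.

```shell
#!/bin/sh
# Hypothetical helper, not part of libclang-abi-tests.yml.
# $1 = LLVM_VERSION_MAJOR, $2 = LLVM_VERSION_PATCH
baseline_ref() {
  major="$1"
  patch="$2"
  if [ "$patch" -eq 0 ]; then
    # First release of a major series: compare against the previous major.
    major=$(( major - 1 ))
  fi
  # Under the new scheme the first release tag is X.1.0, not X.0.0.
  echo "llvmorg-${major}.1.0"
}

baseline_ref 19 0   # previous major is used
baseline_ref 18 6   # current major is used
```

Both calls print `llvmorg-18.1.0`: the first because 19.x with patch level 0 falls back to the previous major, the second because an 18.1.6 build is compared against the first 18.x release.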
[llvm-branch-commits] [llvm] release/18.x: Prepend all library intrinsics with `#` when building for Arm64EC (PR #88016)
tstellar wrote:

Hi @dpaoliello I'm seeing some "Illegal Instruction" errors when running the bolt tests on aarch64. Do you think there is any chance this commit could be the cause? It's the only one between 18.1.3 and 18.1.4 that touches the aarch64 code gen.

Here is the full log: https://kojipkgs.fedoraproject.org//work/tasks/3490/117353490/build.log

https://github.com/llvm/llvm-project/pull/88016
[llvm-branch-commits] [llvm] [workflows] Rework pre-commit CI for the release branch (PR #91550)
https://github.com/tstellar created https://github.com/llvm/llvm-project/pull/91550 This rewrites the pre-commit CI for the release branch so that it behaves almost exactly like the current buildkite builders. It builds every project and uses a better filtering method for selecting which projects to build. In addition, with this change we drop the Linux and Windows test configs, since these are already covered by buildkite and add a config for macos/aarch64. >From a590088cbdf37d3c4d274c5ab9d6d4e4de9c922c Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Fri, 16 Feb 2024 21:34:02 + Subject: [PATCH] [workflows] Rework pre-commit CI for the release branch This rewrites the pre-commit CI for the release branch so that it behaves almost exactly like the current buildkite builders. It builds every project and uses a better filtering method for selecting which projects to build. In addition, with this change we drop the Linux and Windows test configs, since these are already covered by buildkite and add a config for macos/aarch64. 
--- .github/workflows/ci-tests.yml| 154 .../compute-projects-to-test/action.yml | 21 ++ .../compute-projects-to-test.sh | 221 ++ .github/workflows/continue-timeout-job.yml| 75 ++ .github/workflows/get-job-id/action.yml | 30 +++ .../workflows/pr-sccache-restore/action.yml | 26 +++ .github/workflows/pr-sccache-save/action.yml | 50 .github/workflows/timeout-restore/action.yml | 33 +++ .github/workflows/timeout-save/action.yml | 94 .../unprivileged-download-artifact/action.yml | 77 ++ 10 files changed, 781 insertions(+) create mode 100644 .github/workflows/ci-tests.yml create mode 100644 .github/workflows/compute-projects-to-test/action.yml create mode 100755 .github/workflows/compute-projects-to-test/compute-projects-to-test.sh create mode 100644 .github/workflows/continue-timeout-job.yml create mode 100644 .github/workflows/get-job-id/action.yml create mode 100644 .github/workflows/pr-sccache-restore/action.yml create mode 100644 .github/workflows/pr-sccache-save/action.yml create mode 100644 .github/workflows/timeout-restore/action.yml create mode 100644 .github/workflows/timeout-save/action.yml create mode 100644 .github/workflows/unprivileged-download-artifact/action.yml diff --git a/.github/workflows/ci-tests.yml b/.github/workflows/ci-tests.yml new file mode 100644 index 0..22e39174abee7 --- /dev/null +++ b/.github/workflows/ci-tests.yml @@ -0,0 +1,154 @@ +name: "CI Tests" + +permissions: + contents: read + +on: + pull_request: +types: + - opened + - synchronize + - reopened + # When a PR is closed, we still start this workflow, but then skip + # all the jobs, which makes it effectively a no-op. The reason to + # do this is that it allows us to take advantage of concurrency groups + # to cancel in progress CI jobs whenever the PR is closed. 
+ - closed +branches: + - main + +concurrency: + group: ${{ github.workflow }}-${{ github.event.pull_request.number }} + cancel-in-progress: True + +jobs: + compute-test-configs: +name: "Compute Configurations to Test" +if: github.event.action != 'closed' +runs-on: ubuntu-22.04 +outputs: + projects: ${{ steps.vars.outputs.projects }} + check-targets: ${{ steps.vars.outputs.check-targets }} + test-build: ${{ steps.vars.outputs.check-targets != '' }} + test-platforms: ${{ steps.platforms.outputs.result }} +steps: + - name: Fetch LLVM sources +uses: actions/checkout@v4 +with: + fetch-depth: 2 + + - name: Compute projects to test +id: vars +uses: ./.github/workflows/compute-projects-to-test + + - name: Compute platforms to test +uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea #v7.0.1 +id: platforms +with: + script: | +linuxConfig = { + name: "linux-x86_64", + runs_on: "ubuntu-22.04" +} +windowsConfig = { + name: "windows-x86_64", + runs_on: "windows-2022" +} +macConfig = { + name: "macos-x86_64", + runs_on: "macos-13" +} +macArmConfig = { + name: "macos-aarch64", + runs_on: "macos-14" +} + +configs = [] + +const base_ref = process.env.GITHUB_BASE_REF; +if (base_ref.startsWith('release/')) { + // This is a pull request against a release branch. + configs.push(macConfig) + configs.push(macArmConfig) +} + +return configs; + + ci-build-test: +# If this job name is changed, then we need to update the job-name +# paramater for the timeout-save step below. +name: "Build" +needs: +
[llvm-branch-commits] [llvm] [workflows] Rework pre-commit CI for the release branch (PR #91550)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91550 >From 8ea4c39bef000973979cc75a39006e5f87481ee2 Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Fri, 16 Feb 2024 21:34:02 + Subject: [PATCH] [workflows] Rework pre-commit CI for the release branch This rewrites the pre-commit CI for the release branch so that it behaves almost exactly like the current buildkite builders. It builds every project and uses a better filtering method for selecting which projects to build. In addition, with this change we drop the Linux and Windows test configs, since these are already covered by buildkite and add a config for macos/aarch64. --- .github/workflows/ci-tests.yml| 156 + .../compute-projects-to-test/action.yml | 21 ++ .../compute-projects-to-test.sh | 221 ++ .github/workflows/continue-timeout-job.yml| 75 ++ .github/workflows/get-job-id/action.yml | 30 +++ .github/workflows/lld-tests.yml | 38 --- .../workflows/pr-sccache-restore/action.yml | 26 +++ .github/workflows/pr-sccache-save/action.yml | 50 .github/workflows/timeout-restore/action.yml | 33 +++ .github/workflows/timeout-save/action.yml | 94 .../unprivileged-download-artifact/action.yml | 77 ++ 11 files changed, 783 insertions(+), 38 deletions(-) create mode 100644 .github/workflows/ci-tests.yml create mode 100644 .github/workflows/compute-projects-to-test/action.yml create mode 100755 .github/workflows/compute-projects-to-test/compute-projects-to-test.sh create mode 100644 .github/workflows/continue-timeout-job.yml create mode 100644 .github/workflows/get-job-id/action.yml delete mode 100644 .github/workflows/lld-tests.yml create mode 100644 .github/workflows/pr-sccache-restore/action.yml create mode 100644 .github/workflows/pr-sccache-save/action.yml create mode 100644 .github/workflows/timeout-restore/action.yml create mode 100644 .github/workflows/timeout-save/action.yml create mode 100644 .github/workflows/unprivileged-download-artifact/action.yml diff --git 
a/.github/workflows/ci-tests.yml b/.github/workflows/ci-tests.yml new file mode 100644 index 0..e1d1c02755939 --- /dev/null +++ b/.github/workflows/ci-tests.yml @@ -0,0 +1,156 @@ +name: "CI Tests" + +permissions: + contents: read + +on: + pull_request: +types: + - opened + - synchronize + - reopened + # When a PR is closed, we still start this workflow, but then skip + # all the jobs, which makes it effectively a no-op. The reason to + # do this is that it allows us to take advantage of concurrency groups + # to cancel in progress CI jobs whenever the PR is closed. + - closed +branches: + - 'release/**' + +concurrency: + group: ${{ github.workflow }}-${{ github.event.pull_request.number }} + cancel-in-progress: True + +jobs: + compute-test-configs: +name: "Compute Configurations to Test" +if: >- + github.repository_owner == 'llvm' && + github.event.action != 'closed' +runs-on: ubuntu-22.04 +outputs: + projects: ${{ steps.vars.outputs.projects }} + check-targets: ${{ steps.vars.outputs.check-targets }} + test-build: ${{ steps.vars.outputs.check-targets != '' }} + test-platforms: ${{ steps.platforms.outputs.result }} +steps: + - name: Fetch LLVM sources +uses: actions/checkout@v4 +with: + fetch-depth: 2 + + - name: Compute projects to test +id: vars +uses: ./.github/workflows/compute-projects-to-test + + - name: Compute platforms to test +uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea #v7.0.1 +id: platforms +with: + script: | +linuxConfig = { + name: "linux-x86_64", + runs_on: "ubuntu-22.04" +} +windowsConfig = { + name: "windows-x86_64", + runs_on: "windows-2022" +} +macConfig = { + name: "macos-x86_64", + runs_on: "macos-13" +} +macArmConfig = { + name: "macos-aarch64", + runs_on: "macos-14" +} + +configs = [] + +const base_ref = process.env.GITHUB_BASE_REF; +if (base_ref.startsWith('release/')) { + // This is a pull request against a release branch. 
+ configs.push(macConfig) + configs.push(macArmConfig) +} + +return configs; + + ci-build-test: +# If this job name is changed, then we need to update the job-name +# paramater for the timeout-save step below. +name: "Build" +needs: + - compute-test-configs +permissions: + actions: write #pr-sccache-save may delete artifacts. +runs-on: ${{ matrix.runs_on }} +strategy: + fail-fast: false + matrix
[llvm-branch-commits] [llvm] [workflows] Rework pre-commit CI for the release branch (PR #91550)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91550 >From 8ea4c39bef000973979cc75a39006e5f87481ee2 Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Fri, 16 Feb 2024 21:34:02 + Subject: [PATCH 1/2] [workflows] Rework pre-commit CI for the release branch This rewrites the pre-commit CI for the release branch so that it behaves almost exactly like the current buildkite builders. It builds every project and uses a better filtering method for selecting which projects to build. In addition, with this change we drop the Linux and Windows test configs, since these are already covered by buildkite and add a config for macos/aarch64. --- .github/workflows/ci-tests.yml| 156 + .../compute-projects-to-test/action.yml | 21 ++ .../compute-projects-to-test.sh | 221 ++ .github/workflows/continue-timeout-job.yml| 75 ++ .github/workflows/get-job-id/action.yml | 30 +++ .github/workflows/lld-tests.yml | 38 --- .../workflows/pr-sccache-restore/action.yml | 26 +++ .github/workflows/pr-sccache-save/action.yml | 50 .github/workflows/timeout-restore/action.yml | 33 +++ .github/workflows/timeout-save/action.yml | 94 .../unprivileged-download-artifact/action.yml | 77 ++ 11 files changed, 783 insertions(+), 38 deletions(-) create mode 100644 .github/workflows/ci-tests.yml create mode 100644 .github/workflows/compute-projects-to-test/action.yml create mode 100755 .github/workflows/compute-projects-to-test/compute-projects-to-test.sh create mode 100644 .github/workflows/continue-timeout-job.yml create mode 100644 .github/workflows/get-job-id/action.yml delete mode 100644 .github/workflows/lld-tests.yml create mode 100644 .github/workflows/pr-sccache-restore/action.yml create mode 100644 .github/workflows/pr-sccache-save/action.yml create mode 100644 .github/workflows/timeout-restore/action.yml create mode 100644 .github/workflows/timeout-save/action.yml create mode 100644 .github/workflows/unprivileged-download-artifact/action.yml diff --git 
a/.github/workflows/ci-tests.yml b/.github/workflows/ci-tests.yml new file mode 100644 index 0..e1d1c02755939 --- /dev/null +++ b/.github/workflows/ci-tests.yml @@ -0,0 +1,156 @@ +name: "CI Tests" + +permissions: + contents: read + +on: + pull_request: +types: + - opened + - synchronize + - reopened + # When a PR is closed, we still start this workflow, but then skip + # all the jobs, which makes it effectively a no-op. The reason to + # do this is that it allows us to take advantage of concurrency groups + # to cancel in progress CI jobs whenever the PR is closed. + - closed +branches: + - 'release/**' + +concurrency: + group: ${{ github.workflow }}-${{ github.event.pull_request.number }} + cancel-in-progress: True + +jobs: + compute-test-configs: +name: "Compute Configurations to Test" +if: >- + github.repository_owner == 'llvm' && + github.event.action != 'closed' +runs-on: ubuntu-22.04 +outputs: + projects: ${{ steps.vars.outputs.projects }} + check-targets: ${{ steps.vars.outputs.check-targets }} + test-build: ${{ steps.vars.outputs.check-targets != '' }} + test-platforms: ${{ steps.platforms.outputs.result }} +steps: + - name: Fetch LLVM sources +uses: actions/checkout@v4 +with: + fetch-depth: 2 + + - name: Compute projects to test +id: vars +uses: ./.github/workflows/compute-projects-to-test + + - name: Compute platforms to test +uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea #v7.0.1 +id: platforms +with: + script: | +linuxConfig = { + name: "linux-x86_64", + runs_on: "ubuntu-22.04" +} +windowsConfig = { + name: "windows-x86_64", + runs_on: "windows-2022" +} +macConfig = { + name: "macos-x86_64", + runs_on: "macos-13" +} +macArmConfig = { + name: "macos-aarch64", + runs_on: "macos-14" +} + +configs = [] + +const base_ref = process.env.GITHUB_BASE_REF; +if (base_ref.startsWith('release/')) { + // This is a pull request against a release branch. 
+ configs.push(macConfig) + configs.push(macArmConfig) +} + +return configs; + + ci-build-test: +# If this job name is changed, then we need to update the job-name +# paramater for the timeout-save step below. +name: "Build" +needs: + - compute-test-configs +permissions: + actions: write #pr-sccache-save may delete artifacts. +runs-on: ${{ matrix.runs_on }} +strategy: + fail-fast: false + ma
[llvm-branch-commits] [llvm] [workflows] Rework pre-commit CI for the release branch (PR #91550)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91550 >From 8ea4c39bef000973979cc75a39006e5f87481ee2 Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Fri, 16 Feb 2024 21:34:02 + Subject: [PATCH 1/3] [workflows] Rework pre-commit CI for the release branch This rewrites the pre-commit CI for the release branch so that it behaves almost exactly like the current buildkite builders. It builds every project and uses a better filtering method for selecting which projects to build. In addition, with this change we drop the Linux and Windows test configs, since these are already covered by buildkite and add a config for macos/aarch64. --- .github/workflows/ci-tests.yml| 156 + .../compute-projects-to-test/action.yml | 21 ++ .../compute-projects-to-test.sh | 221 ++ .github/workflows/continue-timeout-job.yml| 75 ++ .github/workflows/get-job-id/action.yml | 30 +++ .github/workflows/lld-tests.yml | 38 --- .../workflows/pr-sccache-restore/action.yml | 26 +++ .github/workflows/pr-sccache-save/action.yml | 50 .github/workflows/timeout-restore/action.yml | 33 +++ .github/workflows/timeout-save/action.yml | 94 .../unprivileged-download-artifact/action.yml | 77 ++ 11 files changed, 783 insertions(+), 38 deletions(-) create mode 100644 .github/workflows/ci-tests.yml create mode 100644 .github/workflows/compute-projects-to-test/action.yml create mode 100755 .github/workflows/compute-projects-to-test/compute-projects-to-test.sh create mode 100644 .github/workflows/continue-timeout-job.yml create mode 100644 .github/workflows/get-job-id/action.yml delete mode 100644 .github/workflows/lld-tests.yml create mode 100644 .github/workflows/pr-sccache-restore/action.yml create mode 100644 .github/workflows/pr-sccache-save/action.yml create mode 100644 .github/workflows/timeout-restore/action.yml create mode 100644 .github/workflows/timeout-save/action.yml create mode 100644 .github/workflows/unprivileged-download-artifact/action.yml diff --git 
a/.github/workflows/ci-tests.yml b/.github/workflows/ci-tests.yml new file mode 100644 index 0..e1d1c02755939 --- /dev/null +++ b/.github/workflows/ci-tests.yml @@ -0,0 +1,156 @@ +name: "CI Tests" + +permissions: + contents: read + +on: + pull_request: +types: + - opened + - synchronize + - reopened + # When a PR is closed, we still start this workflow, but then skip + # all the jobs, which makes it effectively a no-op. The reason to + # do this is that it allows us to take advantage of concurrency groups + # to cancel in progress CI jobs whenever the PR is closed. + - closed +branches: + - 'release/**' + +concurrency: + group: ${{ github.workflow }}-${{ github.event.pull_request.number }} + cancel-in-progress: True + +jobs: + compute-test-configs: +name: "Compute Configurations to Test" +if: >- + github.repository_owner == 'llvm' && + github.event.action != 'closed' +runs-on: ubuntu-22.04 +outputs: + projects: ${{ steps.vars.outputs.projects }} + check-targets: ${{ steps.vars.outputs.check-targets }} + test-build: ${{ steps.vars.outputs.check-targets != '' }} + test-platforms: ${{ steps.platforms.outputs.result }} +steps: + - name: Fetch LLVM sources +uses: actions/checkout@v4 +with: + fetch-depth: 2 + + - name: Compute projects to test +id: vars +uses: ./.github/workflows/compute-projects-to-test + + - name: Compute platforms to test +uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea #v7.0.1 +id: platforms +with: + script: | +linuxConfig = { + name: "linux-x86_64", + runs_on: "ubuntu-22.04" +} +windowsConfig = { + name: "windows-x86_64", + runs_on: "windows-2022" +} +macConfig = { + name: "macos-x86_64", + runs_on: "macos-13" +} +macArmConfig = { + name: "macos-aarch64", + runs_on: "macos-14" +} + +configs = [] + +const base_ref = process.env.GITHUB_BASE_REF; +if (base_ref.startsWith('release/')) { + // This is a pull request against a release branch. 
+ configs.push(macConfig) + configs.push(macArmConfig) +} + +return configs; + + ci-build-test: +# If this job name is changed, then we need to update the job-name +# paramater for the timeout-save step below. +name: "Build" +needs: + - compute-test-configs +permissions: + actions: write #pr-sccache-save may delete artifacts. +runs-on: ${{ matrix.runs_on }} +strategy: + fail-fast: false + ma
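The platform selection in the github-script step of this patch comes down to a prefix check on the base branch. A minimal shell sketch of the same decision follows, assuming a hypothetical `select_platforms` helper. The real step is JavaScript and returns config objects with `name` and `runs_on` fields; this sketch only prints the config names.

```shell
#!/bin/sh
# Hypothetical helper mirroring the base_ref.startsWith('release/') check.
# $1 = base branch of the pull request (GITHUB_BASE_REF in the workflow)
select_platforms() {
  case "$1" in
    release/*)
      # Pull request against a release branch: only the macOS configs are
      # pushed, since Linux and Windows are already covered by buildkite.
      echo "macos-x86_64 macos-aarch64"
      ;;
    *)
      # Any other base branch: no platforms selected.
      echo ""
      ;;
  esac
}

select_platforms "release/18.x"
select_platforms "main"
```

The first call prints the two macOS config names and the second prints an empty line, which matches `configs` staying empty in the script.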
[llvm-branch-commits] [llvm] Bump version to 18.1.6 (PR #91094)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91094
[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)
@@ -22,7 +22,7 @@ if(NOT DEFINED LLVM_VERSION_MINOR)
 set(LLVM_VERSION_MINOR 1)
 endif()
 if(NOT DEFINED LLVM_VERSION_PATCH)
- set(LLVM_VERSION_PATCH 5)
+ set(LLVM_VERSION_PATCH 6)

tstellar wrote:

I just merged this commit in another PR.

https://github.com/llvm/llvm-project/pull/91095
[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91095 >From f2c5a10e1f27768b031b8b54cb056fd4e261ad8f Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Wed, 24 Apr 2024 07:47:42 -0700 Subject: [PATCH 1/7] [CMake][Release] Add stage2-package target (#89517) This target will be used to generate the release binary package for uploading to GitHub. (cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e) --- clang/cmake/caches/Release.cmake | 2 ++ 1 file changed, 2 insertions(+) diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake index bd1f688d61a7e..fa972636553f1 100644 --- a/clang/cmake/caches/Release.cmake +++ b/clang/cmake/caches/Release.cmake @@ -14,6 +14,7 @@ if (LLVM_RELEASE_ENABLE_PGO) set(CLANG_BOOTSTRAP_TARGETS generate-profdata stage2 +stage2-package stage2-clang stage2-distribution stage2-install @@ -57,6 +58,7 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "") set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "") set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS clang + package check-all check-llvm check-clang CACHE STRING "") >From ce88e86e428be7eea517201ddee8d62150ae8de4 Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Thu, 25 Apr 2024 15:32:08 -0700 Subject: [PATCH 2/7] [CMake][Release] Refactor cache file and use two stages for non-PGO builds (#89812) Completely refactor the cache file to simplify it and remove unnecessary variables. The main functional change here is that the non-PGO builds now use two stages, so `ninja -C build stage2-package` can be used with both PGO and non-PGO builds. 
(cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69) --- clang/cmake/caches/Release.cmake | 134 +++ 1 file changed, 66 insertions(+), 68 deletions(-) diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake index fa972636553f1..c164d5497275f 100644 --- a/clang/cmake/caches/Release.cmake +++ b/clang/cmake/caches/Release.cmake @@ -1,95 +1,93 @@ # Plain options configure the first build. # BOOTSTRAP_* options configure the second build. # BOOTSTRAP_BOOTSTRAP_* options configure the third build. +# PGO Builds have 3 stages (stage1, stage2-instrumented, stage2) +# non-PGO Builds have 2 stages (stage1, stage2) -# General Options + +function (set_final_stage_var name value type) + if (LLVM_RELEASE_ENABLE_PGO) +set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "") + else() +set(BOOTSTRAP_${name} ${value} CACHE ${type} "") + endif() +endfunction() + +function (set_instrument_and_final_stage_var name value type) + # This sets the varaible for the final stage in non-PGO builds and in + # the stage2-instrumented stage for PGO builds. + set(BOOTSTRAP_${name} ${value} CACHE ${type} "") + if (LLVM_RELEASE_ENABLE_PGO) +# Set the variable in the final stage for PGO builds. +set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "") + endif() +endfunction() + +# General Options: +# If you want to override any of the LLVM_RELEASE_* variables you can set them +# on the command line via -D, but you need to do this before you pass this +# cache file to CMake via -C. e.g. +# +# cmake -D LLVM_RELEASE_ENABLE_PGO=ON -C Release.cmake set(LLVM_RELEASE_ENABLE_LTO THIN CACHE STRING "") set(LLVM_RELEASE_ENABLE_PGO OFF CACHE BOOL "") - +set(LLVM_RELEASE_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" CACHE STRING "") +set(LLVM_RELEASE_ENABLE_PROJECTS "clang;lld;lldb;clang-tools-extra;bolt;polly;mlir;flang" CACHE STRING "") +# Note we don't need to add install here, since it is one of the pre-defined +# steps. 
+set(LLVM_RELEASE_FINAL_STAGE_TARGETS "clang;package;check-all;check-llvm;check-clang" CACHE STRING "") set(CMAKE_BUILD_TYPE RELEASE CACHE STRING "") -# Stage 1 Bootstrap Setup +# Stage 1 Options +set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "") set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "") + +set(STAGE1_PROJECTS "clang") +set(STAGE1_RUNTIMES "") + if (LLVM_RELEASE_ENABLE_PGO) + list(APPEND STAGE1_PROJECTS "lld") + list(APPEND STAGE1_RUNTIMES "compiler-rt") set(CLANG_BOOTSTRAP_TARGETS generate-profdata -stage2 stage2-package stage2-clang -stage2-distribution stage2-install -stage2-install-distribution -stage2-install-distribution-toolchain stage2-check-all stage2-check-llvm -stage2-check-clang -stage2-test-suite CACHE STRING "") -else() - set(CLANG_BOOTSTRAP_TARGETS -clang -check-all -check-llvm -check-clang -test-suite -stage3 -stage3-clang -stage3-check-all -stage3-check-llvm -stage3-check-clang -stage3-install -stage3-test-suite CACHE STRING "") -endif() +stage2-check-clang CACHE STRING "") -# Stage 1 Options -set(STAGE1_PROJECTS "clang") -set(STAGE1_RUNTIMES "") + # Configuration for stage2-instrumented + set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "") + # This e
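The two CMake helper functions in this patch differ only in which `BOOTSTRAP_` prefixes they write. As an illustration, here is the prefix choice of `set_final_stage_var` expressed as a small shell function. The name `final_stage_var_name` is invented for this sketch and nothing like it exists in Release.cmake; the logic is only what the patch shows: PGO builds have three stages, so the final stage is configured through `BOOTSTRAP_BOOTSTRAP_*`, while non-PGO builds have two and use `BOOTSTRAP_*`.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1 = value of LLVM_RELEASE_ENABLE_PGO (ON or OFF), $2 = variable name
final_stage_var_name() {
  if [ "$1" = "ON" ]; then
    # PGO: stage1, stage2-instrumented, stage2 -> final stage is the third build.
    echo "BOOTSTRAP_BOOTSTRAP_$2"
  else
    # non-PGO: stage1, stage2 -> final stage is the second build.
    echo "BOOTSTRAP_$2"
  fi
}

final_stage_var_name ON LLVM_ENABLE_LTO
final_stage_var_name OFF LLVM_ENABLE_LTO
```

`LLVM_ENABLE_LTO` is used here only as a sample variable name.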
[llvm-branch-commits] [clang] f2c5a10 - [CMake][Release] Add stage2-package target (#89517)
Author: Tom Stellard
Date: 2024-05-08T19:47:50-07:00
New Revision: f2c5a10e1f27768b031b8b54cb056fd4e261ad8f

URL: https://github.com/llvm/llvm-project/commit/f2c5a10e1f27768b031b8b54cb056fd4e261ad8f
DIFF: https://github.com/llvm/llvm-project/commit/f2c5a10e1f27768b031b8b54cb056fd4e261ad8f.diff

LOG: [CMake][Release] Add stage2-package target (#89517)

This target will be used to generate the release binary package for
uploading to GitHub.

(cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e)

Added:

Modified:
    clang/cmake/caches/Release.cmake

Removed:

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index bd1f688d61a7e..fa972636553f1 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -14,6 +14,7 @@ if (LLVM_RELEASE_ENABLE_PGO)
   set(CLANG_BOOTSTRAP_TARGETS
     generate-profdata
     stage2
+    stage2-package
     stage2-clang
     stage2-distribution
     stage2-install
@@ -57,6 +58,7 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
 set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "")
 set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS
   clang
+  package
   check-all
   check-llvm
   check-clang CACHE STRING "")

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] ce88e86 - [CMake][Release] Refactor cache file and use two stages for non-PGO builds (#89812)
Author: Tom Stellard
Date: 2024-05-08T19:47:50-07:00
New Revision: ce88e86e428be7eea517201ddee8d62150ae8de4

URL: https://github.com/llvm/llvm-project/commit/ce88e86e428be7eea517201ddee8d62150ae8de4
DIFF: https://github.com/llvm/llvm-project/commit/ce88e86e428be7eea517201ddee8d62150ae8de4.diff

LOG: [CMake][Release] Refactor cache file and use two stages for non-PGO builds (#89812)

Completely refactor the cache file to simplify it and remove unnecessary
variables. The main functional change here is that the non-PGO builds now
use two stages, so `ninja -C build stage2-package` can be used with both
PGO and non-PGO builds.

(cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69)

Added:

Modified:
    clang/cmake/caches/Release.cmake

Removed:

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index fa972636553f1..c164d5497275f 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -1,95 +1,93 @@
 # Plain options configure the first build.
 # BOOTSTRAP_* options configure the second build.
 # BOOTSTRAP_BOOTSTRAP_* options configure the third build.
+# PGO Builds have 3 stages (stage1, stage2-instrumented, stage2)
+# non-PGO Builds have 2 stages (stage1, stage2)
-# General Options
+
+function (set_final_stage_var name value type)
+  if (LLVM_RELEASE_ENABLE_PGO)
+    set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  else()
+    set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+function (set_instrument_and_final_stage_var name value type)
+  # This sets the varaible for the final stage in non-PGO builds and in
+  # the stage2-instrumented stage for PGO builds.
+  set(BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  if (LLVM_RELEASE_ENABLE_PGO)
+    # Set the variable in the final stage for PGO builds.
+    set(BOOTSTRAP_BOOTSTRAP_${name} ${value} CACHE ${type} "")
+  endif()
+endfunction()
+
+# General Options:
+# If you want to override any of the LLVM_RELEASE_* variables you can set them
+# on the command line via -D, but you need to do this before you pass this
+# cache file to CMake via -C. e.g.
+#
+# cmake -D LLVM_RELEASE_ENABLE_PGO=ON -C Release.cmake
 set(LLVM_RELEASE_ENABLE_LTO THIN CACHE STRING "")
 set(LLVM_RELEASE_ENABLE_PGO OFF CACHE BOOL "")
-
+set(LLVM_RELEASE_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" CACHE STRING "")
+set(LLVM_RELEASE_ENABLE_PROJECTS "clang;lld;lldb;clang-tools-extra;bolt;polly;mlir;flang" CACHE STRING "")
+# Note we don't need to add install here, since it is one of the pre-defined
+# steps.
+set(LLVM_RELEASE_FINAL_STAGE_TARGETS "clang;package;check-all;check-llvm;check-clang" CACHE STRING "")
 set(CMAKE_BUILD_TYPE RELEASE CACHE STRING "")
-# Stage 1 Bootstrap Setup
+# Stage 1 Options
+set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
 set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")
+
+set(STAGE1_PROJECTS "clang")
+set(STAGE1_RUNTIMES "")
+
 if (LLVM_RELEASE_ENABLE_PGO)
+  list(APPEND STAGE1_PROJECTS "lld")
+  list(APPEND STAGE1_RUNTIMES "compiler-rt")
   set(CLANG_BOOTSTRAP_TARGETS
     generate-profdata
-    stage2
     stage2-package
     stage2-clang
-    stage2-distribution
     stage2-install
-    stage2-install-distribution
-    stage2-install-distribution-toolchain
     stage2-check-all
     stage2-check-llvm
-    stage2-check-clang
-    stage2-test-suite CACHE STRING "")
-else()
-  set(CLANG_BOOTSTRAP_TARGETS
-    clang
-    check-all
-    check-llvm
-    check-clang
-    test-suite
-    stage3
-    stage3-clang
-    stage3-check-all
-    stage3-check-llvm
-    stage3-check-clang
-    stage3-install
-    stage3-test-suite CACHE STRING "")
-endif()
+    stage2-check-clang CACHE STRING "")
-# Stage 1 Options
-set(STAGE1_PROJECTS "clang")
-set(STAGE1_RUNTIMES "")
+  # Configuration for stage2-instrumented
+  set(BOOTSTRAP_CLANG_ENABLE_BOOTSTRAP ON CACHE STRING "")
+  # This enables the build targets for the final stage which is called stage2.
+  set(BOOTSTRAP_CLANG_BOOTSTRAP_TARGETS ${LLVM_RELEASE_FINAL_STAGE_TARGETS} CACHE STRING "")
+  set(BOOTSTRAP_LLVM_BUILD_INSTRUMENTED IR CACHE STRING "")
+  set(BOOTSTRAP_LLVM_ENABLE_RUNTIMES "compiler-rt" CACHE STRING "")
+  set(BOOTSTRAP_LLVM_ENABLE_PROJECTS "clang;lld" CACHE STRING "")
-if (LLVM_RELEASE_ENABLE_PGO)
-  list(APPEND STAGE1_PROJECTS "lld")
-  list(APPEND STAGE1_RUNTIMES "compiler-rt")
+else()
+  if (LLVM_RELEASE_ENABLE_LTO)
+    list(APPEND STAGE1_PROJECTS "lld")
+  endif()
+  # Any targets added here will be given the target name stage2-${target}, so
+  # if you want to run them you can just use:
+  # ninja -C $BUILDDIR stage2-${target}
+  set(CLANG_BOOTSTRAP_TARGETS ${LLVM_RELEASE_FINAL_STAGE_TARGETS} CACHE STRING "")
 endif()
+# Stage 1 Common Config
 set(LLVM_ENABLE_RUNTIMES ${STAGE1_RUNTIMES} CACHE STRING "")
 set(LLVM_ENABLE_PROJECTS ${STAGE1_PROJECTS} CACHE STRING "")
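The two helper functions added by this patch choose which prefixed cache variables receive a value, depending on whether PGO is enabled. As a rough illustration only, the naming logic can be modeled in C++ as below. The function names `finalStageVars` and `instrumentAndFinalStageVars` are hypothetical and exist only for this sketch; the real logic lives in the CMake cache file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of set_final_stage_var: returns the name of the cache
// variable that would be set for the final stage.
std::vector<std::string> finalStageVars(const std::string &name, bool pgo) {
  assert(!name.empty() && "variable name must not be empty");
  // PGO builds have three stages, so the final stage is two bootstraps deep.
  if (pgo)
    return {"BOOTSTRAP_BOOTSTRAP_" + name};
  // Non-PGO builds have two stages, so the final stage is one bootstrap deep.
  return {"BOOTSTRAP_" + name};
}

// Hypothetical model of set_instrument_and_final_stage_var: BOOTSTRAP_<name>
// is always set (final stage without PGO, stage2-instrumented with PGO), and
// PGO builds additionally set BOOTSTRAP_BOOTSTRAP_<name> for the final stage.
std::vector<std::string> instrumentAndFinalStageVars(const std::string &name,
                                                     bool pgo) {
  assert(!name.empty() && "variable name must not be empty");
  std::vector<std::string> vars = {"BOOTSTRAP_" + name};
  if (pgo)
    vars.push_back("BOOTSTRAP_BOOTSTRAP_" + name);
  return vars;
}
```

Under this model, a setting such as LLVM_ENABLE_LTO reaches one stage in a non-PGO build and two stages in a PGO build.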
[llvm-branch-commits] [clang] b7e2397 - [CMake][Release] Enable CMAKE_POSITION_INDEPENDENT_CODE (#90139)
Author: Tom Stellard
Date: 2024-05-08T19:47:50-07:00
New Revision: b7e2397c54b7cddac8fa188e68073f78e895a57a

URL: https://github.com/llvm/llvm-project/commit/b7e2397c54b7cddac8fa188e68073f78e895a57a
DIFF: https://github.com/llvm/llvm-project/commit/b7e2397c54b7cddac8fa188e68073f78e895a57a.diff

LOG: [CMake][Release] Enable CMAKE_POSITION_INDEPENDENT_CODE (#90139)

Set this in the cache file directly instead of via the test-release.sh
script so that the release builds can be reproduced with just the cache
file.

(cherry picked from commit 53ff002c6f7ec64a75ab0990b1314cc6b4bb67cf)

Added:

Modified:
    clang/cmake/caches/Release.cmake
    llvm/utils/release/test-release.sh

Removed:

diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake
index c164d5497275f..c0bfcbdfc1c2a 100644
--- a/clang/cmake/caches/Release.cmake
+++ b/clang/cmake/caches/Release.cmake
@@ -82,6 +82,7 @@ set(LLVM_ENABLE_PROJECTS ${STAGE1_PROJECTS} CACHE STRING "")
 # stage2-instrumented and Final Stage Config:
 # Options that need to be set in both the instrumented stage (if we are doing
 # a pgo build) and the final stage.
+set_instrument_and_final_stage_var(CMAKE_POSITION_INDEPENDENT_CODE "ON" STRING)
 set_instrument_and_final_stage_var(LLVM_ENABLE_LTO "${LLVM_RELEASE_ENABLE_LTO}" STRING)
 if (LLVM_RELEASE_ENABLE_LTO)
   set_instrument_and_final_stage_var(LLVM_ENABLE_LLD "ON" BOOL)

diff --git a/llvm/utils/release/test-release.sh b/llvm/utils/release/test-release.sh
index 4314b565e11b0..050004aa08c49 100755
--- a/llvm/utils/release/test-release.sh
+++ b/llvm/utils/release/test-release.sh
@@ -353,8 +353,7 @@ function build_with_cmake_cache() {
   env CC="$c_compiler" CXX="$cxx_compiler" \
   cmake -G "$generator" -B $CMakeBuildDir -S $SrcDir/llvm \
       -C $SrcDir/clang/cmake/caches/Release.cmake \
-      -DCLANG_BOOTSTRAP_PASSTHROUGH="CMAKE_POSITION_INDEPENDENT_CODE;LLVM_LIT_ARGS" \
-      -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
+      -DCLANG_BOOTSTRAP_PASSTHROUGH="LLVM_LIT_ARGS" \
       -DLLVM_LIT_ARGS="-j $NumJobs $LitVerbose" \
       $ExtraConfigureFlags 2>&1 | tee $LogDir/llvm.configure-$Flavor.log

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] Backport some fixes for building the release binaries (PR #91095)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91095
[llvm-branch-commits] [llvm] [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595) (PR #90719)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/90719 >From 58e44d3c6f67d5402ec38913d4262b94e73ac123 Mon Sep 17 00:00:00 2001 From: David Stuttard Date: Wed, 1 May 2024 11:37:13 +0100 Subject: [PATCH] [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595) Code to determine if a waitcnt is required before a barrier instruction only considered S_BARRIER. gfx12 adds barrier_signal/wait so need to enhance the existing code to look for a barrier start (which is just an S_BARRIER for earlier architectures). --- llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp | 2 +- llvm/lib/Target/AMDGPU/SIInstrInfo.h | 11 ++ .../CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll | 2 ++ .../AMDGPU/llvm.amdgcn.s.barrier.wait.ll | 22 +++ 4 files changed, 36 insertions(+), 1 deletion(-) diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp index 6ecb1c8bf6e1d..7a3198612f86f 100644 --- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp +++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp @@ -1832,7 +1832,7 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI, // not, we need to ensure the subtarget is capable of backing off barrier // instructions in case there are any outstanding memory operations that may // cause an exception. Otherwise, insert an explicit S_WAITCNT 0 here. 
- if (MI.getOpcode() == AMDGPU::S_BARRIER && + if (TII->isBarrierStart(MI.getOpcode()) && !ST->hasAutoWaitcntBeforeBarrier() && !ST->supportsBackOffBarrier()) { Wait = Wait.combined( AMDGPU::Waitcnt::allZero(ST->hasExtendedWaitCounts(), ST->hasVscnt())); diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h index 1c9dacc09f815..626d903c0c695 100644 --- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h +++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h @@ -908,6 +908,17 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo { return MI.getDesc().TSFlags & SIInstrFlags::IsNeverUniform; } + // Check to see if opcode is for a barrier start. Pre gfx12 this is just the + // S_BARRIER, but after support for S_BARRIER_SIGNAL* / S_BARRIER_WAIT we want + // to check for the barrier start (S_BARRIER_SIGNAL*) + bool isBarrierStart(unsigned Opcode) const { +return Opcode == AMDGPU::S_BARRIER || + Opcode == AMDGPU::S_BARRIER_SIGNAL_M0 || + Opcode == AMDGPU::S_BARRIER_SIGNAL_ISFIRST_M0 || + Opcode == AMDGPU::S_BARRIER_SIGNAL_IMM || + Opcode == AMDGPU::S_BARRIER_SIGNAL_ISFIRST_IMM; + } + static bool doesNotReadTiedSource(const MachineInstr &MI) { return MI.getDesc().TSFlags & SIInstrFlags::TiedSourceNotRead; } diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll index a7d3115af29bf..47c021769aa56 100644 --- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll +++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll @@ -96,6 +96,7 @@ define amdgpu_kernel void @test_barrier(ptr addrspace(1) %out, i32 %size) #0 { ; VARIANT4-NEXT:s_wait_kmcnt 0x0 ; VARIANT4-NEXT:v_xad_u32 v1, v0, -1, s2 ; VARIANT4-NEXT:global_store_b32 v3, v0, s[0:1] +; VARIANT4-NEXT:s_wait_storecnt 0x0 ; VARIANT4-NEXT:s_barrier_signal -1 ; VARIANT4-NEXT:s_barrier_wait -1 ; VARIANT4-NEXT:v_ashrrev_i32_e32 v2, 31, v1 @@ -142,6 +143,7 @@ define amdgpu_kernel void @test_barrier(ptr addrspace(1) %out, i32 %size) #0 { ; VARIANT6-NEXT:v_dual_mov_b32 
v4, s1 :: v_dual_mov_b32 v3, s0 ; VARIANT6-NEXT:v_sub_nc_u32_e32 v1, s2, v0 ; VARIANT6-NEXT:global_store_b32 v5, v0, s[0:1] +; VARIANT6-NEXT:s_wait_storecnt 0x0 ; VARIANT6-NEXT:s_barrier_signal -1 ; VARIANT6-NEXT:s_barrier_wait -1 ; VARIANT6-NEXT:v_ashrrev_i32_e32 v2, 31, v1 diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll index 4ab5e97964a85..38a34ec6daf73 100644 --- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll +++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll @@ -12,6 +12,7 @@ define amdgpu_kernel void @test1_s_barrier_signal(ptr addrspace(1) %out) #0 { ; GCN-NEXT:v_sub_nc_u32_e32 v0, v1, v0 ; GCN-NEXT:s_wait_kmcnt 0x0 ; GCN-NEXT:global_store_b32 v3, v2, s[0:1] +; GCN-NEXT:s_wait_storecnt 0x0 ; GCN-NEXT:s_barrier_signal -1 ; GCN-NEXT:s_barrier_wait -1 ; GCN-NEXT:global_store_b32 v3, v0, s[0:1] @@ -28,6 +29,7 @@ define amdgpu_kernel void @test1_s_barrier_signal(ptr addrspace(1) %out) #0 { ; GLOBAL-ISEL-NEXT:v_sub_nc_u32_e32 v0, v1, v0 ; GLOBAL-ISEL-NEXT:s_wait_kmcnt 0x0 ; GLOBAL-ISEL-NEXT:global_store_b32 v3, v2, s[0:1] +; GLOBAL-ISEL-NEXT:s_wait_storecnt 0x0 ; GLOBAL-ISEL-NEXT:s_barrier_signal -1 ; GLOBAL-ISEL-NEXT:s_barrier_wait -1 ; GLOBAL-ISEL-NEXT:global_store_b32 v3, v0, s[0:1] @@ -56,6 +58,7 @@ define amdgpu_k
[llvm-branch-commits] [llvm] 58e44d3 - [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595)
Author: David Stuttard Date: 2024-05-08T20:08:59-07:00 New Revision: 58e44d3c6f67d5402ec38913d4262b94e73ac123 URL: https://github.com/llvm/llvm-project/commit/58e44d3c6f67d5402ec38913d4262b94e73ac123 DIFF: https://github.com/llvm/llvm-project/commit/58e44d3c6f67d5402ec38913d4262b94e73ac123.diff LOG: [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595) Code to determine if a waitcnt is required before a barrier instruction only considered S_BARRIER. gfx12 adds barrier_signal/wait so need to enhance the existing code to look for a barrier start (which is just an S_BARRIER for earlier architectures). Added: Modified: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.h llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll Removed: diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp index 6ecb1c8bf6e1d..7a3198612f86f 100644 --- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp +++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp @@ -1832,7 +1832,7 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI, // not, we need to ensure the subtarget is capable of backing off barrier // instructions in case there are any outstanding memory operations that may // cause an exception. Otherwise, insert an explicit S_WAITCNT 0 here. 
- if (MI.getOpcode() == AMDGPU::S_BARRIER && + if (TII->isBarrierStart(MI.getOpcode()) && !ST->hasAutoWaitcntBeforeBarrier() && !ST->supportsBackOffBarrier()) { Wait = Wait.combined( AMDGPU::Waitcnt::allZero(ST->hasExtendedWaitCounts(), ST->hasVscnt())); diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h index 1c9dacc09f815..626d903c0c695 100644 --- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h +++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h @@ -908,6 +908,17 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo { return MI.getDesc().TSFlags & SIInstrFlags::IsNeverUniform; } + // Check to see if opcode is for a barrier start. Pre gfx12 this is just the + // S_BARRIER, but after support for S_BARRIER_SIGNAL* / S_BARRIER_WAIT we want + // to check for the barrier start (S_BARRIER_SIGNAL*) + bool isBarrierStart(unsigned Opcode) const { +return Opcode == AMDGPU::S_BARRIER || + Opcode == AMDGPU::S_BARRIER_SIGNAL_M0 || + Opcode == AMDGPU::S_BARRIER_SIGNAL_ISFIRST_M0 || + Opcode == AMDGPU::S_BARRIER_SIGNAL_IMM || + Opcode == AMDGPU::S_BARRIER_SIGNAL_ISFIRST_IMM; + } + static bool doesNotReadTiedSource(const MachineInstr &MI) { return MI.getDesc().TSFlags & SIInstrFlags::TiedSourceNotRead; } diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll index a7d3115af29bf..47c021769aa56 100644 --- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll +++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll @@ -96,6 +96,7 @@ define amdgpu_kernel void @test_barrier(ptr addrspace(1) %out, i32 %size) #0 { ; VARIANT4-NEXT:s_wait_kmcnt 0x0 ; VARIANT4-NEXT:v_xad_u32 v1, v0, -1, s2 ; VARIANT4-NEXT:global_store_b32 v3, v0, s[0:1] +; VARIANT4-NEXT:s_wait_storecnt 0x0 ; VARIANT4-NEXT:s_barrier_signal -1 ; VARIANT4-NEXT:s_barrier_wait -1 ; VARIANT4-NEXT:v_ashrrev_i32_e32 v2, 31, v1 @@ -142,6 +143,7 @@ define amdgpu_kernel void @test_barrier(ptr addrspace(1) %out, i32 %size) #0 { ; VARIANT6-NEXT:v_dual_mov_b32 
v4, s1 :: v_dual_mov_b32 v3, s0 ; VARIANT6-NEXT:v_sub_nc_u32_e32 v1, s2, v0 ; VARIANT6-NEXT:global_store_b32 v5, v0, s[0:1] +; VARIANT6-NEXT:s_wait_storecnt 0x0 ; VARIANT6-NEXT:s_barrier_signal -1 ; VARIANT6-NEXT:s_barrier_wait -1 ; VARIANT6-NEXT:v_ashrrev_i32_e32 v2, 31, v1 diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll index 4ab5e97964a85..38a34ec6daf73 100644 --- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll +++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll @@ -12,6 +12,7 @@ define amdgpu_kernel void @test1_s_barrier_signal(ptr addrspace(1) %out) #0 { ; GCN-NEXT:v_sub_nc_u32_e32 v0, v1, v0 ; GCN-NEXT:s_wait_kmcnt 0x0 ; GCN-NEXT:global_store_b32 v3, v2, s[0:1] +; GCN-NEXT:s_wait_storecnt 0x0 ; GCN-NEXT:s_barrier_signal -1 ; GCN-NEXT:s_barrier_wait -1 ; GCN-NEXT:global_store_b32 v3, v0, s[0:1] @@ -28,6 +29,7 @@ define amdgpu_kernel void @test1_s_barrier_signal(ptr addrspace(1) %out) #0 { ; GLOBAL-ISEL-NEXT:v_sub_nc_u32_e32 v0, v1, v0 ; GLOBAL-ISEL-NEXT:s_wait_kmcnt 0x0 ; GLOBAL-ISEL-NEXT:global_store_b32 v3, v2, s[0:1] +; GLOBAL-ISEL-NEXT:s_wait_storecnt 0x0 ; GLOBAL-ISEL-NEXT:s_barrier_signal -1 ; GLOBAL-ISEL-NEXT:s_b
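The change above widens the "is this a barrier?" check from a single opcode to the set of opcodes that start a barrier. A minimal self-contained sketch of that decision follows. The `Opcode` enum and the `SubtargetModel` struct are hypothetical stand-ins for `AMDGPU::` opcodes and the subtarget queries named in the diff; they are not the real LLVM types.

```cpp
// Hypothetical stand-in for the AMDGPU opcode enum; the values are arbitrary.
enum class Opcode {
  S_BARRIER,
  S_BARRIER_SIGNAL_M0,
  S_BARRIER_SIGNAL_ISFIRST_M0,
  S_BARRIER_SIGNAL_IMM,
  S_BARRIER_SIGNAL_ISFIRST_IMM,
  S_BARRIER_WAIT,
  S_NOP
};

// Hypothetical stand-in for the two subtarget queries used by the guard.
struct SubtargetModel {
  bool autoWaitcntBeforeBarrier;
  bool backOffBarrier;
};

// Mirrors the predicate added to SIInstrInfo.h: S_BARRIER on older targets,
// any S_BARRIER_SIGNAL* variant once split barriers are available.
constexpr bool isBarrierStart(Opcode op) {
  return op == Opcode::S_BARRIER || op == Opcode::S_BARRIER_SIGNAL_M0 ||
         op == Opcode::S_BARRIER_SIGNAL_ISFIRST_M0 ||
         op == Opcode::S_BARRIER_SIGNAL_IMM ||
         op == Opcode::S_BARRIER_SIGNAL_ISFIRST_IMM;
}

// Mirrors the updated condition in generateWaitcntInstBefore: a full wait is
// forced only before a barrier start, and only when the subtarget neither
// waits automatically nor supports backing off the barrier.
constexpr bool needsFullWaitBeforeBarrier(Opcode op, const SubtargetModel &st) {
  return isBarrierStart(op) && !st.autoWaitcntBeforeBarrier &&
         !st.backOffBarrier;
}
```

Note that `S_BARRIER_WAIT` is deliberately not a barrier start in this model, which matches the updated tests where the added `s_wait_storecnt 0x0` appears before `s_barrier_signal` rather than before `s_barrier_wait`.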
[llvm-branch-commits] [llvm] [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595) (PR #90719)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/90719
[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106) (PR #91118)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91118 >From 047cd915b86a4f35543ad4e691953aaa5a91c4fe Mon Sep 17 00:00:00 2001 From: Phoebe Wang Date: Sun, 5 May 2024 18:40:27 +0800 Subject: [PATCH] [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106) With KNL/KNC being deprecated, we don't need to care about such no VLX cases anymore. We may remove such patterns in the future. Fixes #90844 (cherry picked from commit 7963d9a2b3c20561278a85b19e156e013231342c) --- llvm/lib/Target/X86/X86ISelLowering.cpp | 4 ++- llvm/lib/Target/X86/X86InstrAVX512.td | 42 - llvm/test/CodeGen/X86/pr90844.ll| 19 +++ 3 files changed, 43 insertions(+), 22 deletions(-) create mode 100644 llvm/test/CodeGen/X86/pr90844.ll diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp index 71fc6b5047eaa..c572b27fe401e 100644 --- a/llvm/lib/Target/X86/X86ISelLowering.cpp +++ b/llvm/lib/Target/X86/X86ISelLowering.cpp @@ -29841,7 +29841,9 @@ static SDValue LowerRotate(SDValue Op, const X86Subtarget &Subtarget, return R; // AVX512 implicitly uses modulo rotation amounts. - if (Subtarget.hasAVX512() && 32 <= EltSizeInBits) { + if ((Subtarget.hasVLX() || + (Subtarget.hasAVX512() && Subtarget.hasEVEX512())) && + 32 <= EltSizeInBits) { // Attempt to rotate by immediate. if (IsCstSplat) { unsigned RotOpc = IsROTL ? X86ISD::VROTLI : X86ISD::VROTRI; diff --git a/llvm/lib/Target/X86/X86InstrAVX512.td b/llvm/lib/Target/X86/X86InstrAVX512.td index bb5e22c714279..0564f2167d8ee 100644 --- a/llvm/lib/Target/X86/X86InstrAVX512.td +++ b/llvm/lib/Target/X86/X86InstrAVX512.td @@ -814,7 +814,7 @@ defm : vextract_for_size_lowering<"VEXTRACTF64x4Z", v32f16_info, v16f16x_info, // A 128-bit extract from bits [255:128] of a 512-bit vector should use a // smaller extract to enable EVEX->VEX. 
-let Predicates = [NoVLX] in { +let Predicates = [NoVLX, HasEVEX512] in { def : Pat<(v2i64 (extract_subvector (v8i64 VR512:$src), (iPTR 2))), (v2i64 (VEXTRACTI128rr (v4i64 (EXTRACT_SUBREG (v8i64 VR512:$src), sub_ymm)), @@ -3068,7 +3068,7 @@ def : Pat<(Narrow.KVT (and Narrow.KRC:$mask, addr:$src2, (X86cmpm_imm_commute timm:$cc)), Narrow.KRC)>; } -let Predicates = [HasAVX512, NoVLX] in { +let Predicates = [HasAVX512, NoVLX, HasEVEX512] in { defm : axv512_icmp_packed_cc_no_vlx_lowering; defm : axv512_icmp_packed_cc_no_vlx_lowering; @@ -3099,7 +3099,7 @@ let Predicates = [HasAVX512, NoVLX] in { defm : axv512_cmp_packed_cc_no_vlx_lowering<"VCMPPD", v2f64x_info, v8f64_info>; } -let Predicates = [HasBWI, NoVLX] in { +let Predicates = [HasBWI, NoVLX, HasEVEX512] in { defm : axv512_icmp_packed_cc_no_vlx_lowering; defm : axv512_icmp_packed_cc_no_vlx_lowering; @@ -3493,7 +3493,7 @@ multiclass mask_move_lowering; defm : mask_move_lowering<"VMOVDQA32Z", v4i32x_info, v16i32_info>; defm : mask_move_lowering<"VMOVAPSZ", v8f32x_info, v16f32_info>; @@ -3505,7 +3505,7 @@ let Predicates = [HasAVX512, NoVLX] in { defm : mask_move_lowering<"VMOVDQA64Z", v4i64x_info, v8i64_info>; } -let Predicates = [HasBWI, NoVLX] in { +let Predicates = [HasBWI, NoVLX, HasEVEX512] in { defm : mask_move_lowering<"VMOVDQU8Z", v16i8x_info, v64i8_info>; defm : mask_move_lowering<"VMOVDQU8Z", v32i8x_info, v64i8_info>; @@ -4998,8 +4998,8 @@ defm VPMINUD : avx512_binop_rm_vl_d<0x3B, "vpminud", umin, defm VPMINUQ : avx512_binop_rm_vl_q<0x3B, "vpminuq", umin, SchedWriteVecALU, HasAVX512, 1>, T8; -// PMULLQ: Use 512bit version to implement 128/256 bit in case NoVLX. -let Predicates = [HasDQI, NoVLX] in { +// PMULLQ: Use 512bit version to implement 128/256 bit in case NoVLX, HasEVEX512. 
+let Predicates = [HasDQI, NoVLX, HasEVEX512] in { def : Pat<(v4i64 (mul (v4i64 VR256X:$src1), (v4i64 VR256X:$src2))), (EXTRACT_SUBREG (VPMULLQZrr @@ -5055,7 +5055,7 @@ multiclass avx512_min_max_lowering { sub_xmm)>; } -let Predicates = [HasAVX512, NoVLX] in { +let Predicates = [HasAVX512, NoVLX, HasEVEX512] in { defm : avx512_min_max_lowering<"VPMAXUQZ", umax>; defm : avx512_min_max_lowering<"VPMINUQZ", umin>; defm : avx512_min_max_lowering<"VPMAXSQZ", smax>; @@ -6032,7 +6032,7 @@ defm VPSRL : avx512_shift_types<0xD2, 0xD3, 0xD1, "vpsrl", X86vsrl, SchedWriteVecShift>; // Use 512bit VPSRA/VPSRAI version to implement v2i64/v4i64 in case NoVLX. -let Predicates = [HasAVX512, NoVLX] in { +let Predicates = [HasAVX512, NoVLX, HasEVEX512] in { def : Pat<(v4i64 (X86vsra (v4i64 VR256X:$src1), (v2i64 VR128X:$src2))), (EXTRACT_SUBREG (v8i64 (VPSRAQZrr @@ -6161,14 +6161,14 @@ defm VPSRLV : avx512_var_shift_types<0x45, "vpsrlv", X86vsrlv, SchedWriteVarVecS defm VPRORV : avx512_var_shift_types<0x14, "vprorv", r
[llvm-branch-commits] [llvm] 047cd91 - [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106)
Author: Phoebe Wang Date: 2024-05-08T20:10:38-07:00 New Revision: 047cd915b86a4f35543ad4e691953aaa5a91c4fe URL: https://github.com/llvm/llvm-project/commit/047cd915b86a4f35543ad4e691953aaa5a91c4fe DIFF: https://github.com/llvm/llvm-project/commit/047cd915b86a4f35543ad4e691953aaa5a91c4fe.diff LOG: [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106) With KNL/KNC being deprecated, we don't need to care about such no VLX cases anymore. We may remove such patterns in the future. Fixes #90844 (cherry picked from commit 7963d9a2b3c20561278a85b19e156e013231342c) Added: llvm/test/CodeGen/X86/pr90844.ll Modified: llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86InstrAVX512.td Removed: diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp index 71fc6b5047eaa..c572b27fe401e 100644 --- a/llvm/lib/Target/X86/X86ISelLowering.cpp +++ b/llvm/lib/Target/X86/X86ISelLowering.cpp @@ -29841,7 +29841,9 @@ static SDValue LowerRotate(SDValue Op, const X86Subtarget &Subtarget, return R; // AVX512 implicitly uses modulo rotation amounts. - if (Subtarget.hasAVX512() && 32 <= EltSizeInBits) { + if ((Subtarget.hasVLX() || + (Subtarget.hasAVX512() && Subtarget.hasEVEX512())) && + 32 <= EltSizeInBits) { // Attempt to rotate by immediate. if (IsCstSplat) { unsigned RotOpc = IsROTL ? X86ISD::VROTLI : X86ISD::VROTRI; diff --git a/llvm/lib/Target/X86/X86InstrAVX512.td b/llvm/lib/Target/X86/X86InstrAVX512.td index bb5e22c714279..0564f2167d8ee 100644 --- a/llvm/lib/Target/X86/X86InstrAVX512.td +++ b/llvm/lib/Target/X86/X86InstrAVX512.td @@ -814,7 +814,7 @@ defm : vextract_for_size_lowering<"VEXTRACTF64x4Z", v32f16_info, v16f16x_info, // A 128-bit extract from bits [255:128] of a 512-bit vector should use a // smaller extract to enable EVEX->VEX. 
-let Predicates = [NoVLX] in { +let Predicates = [NoVLX, HasEVEX512] in { def : Pat<(v2i64 (extract_subvector (v8i64 VR512:$src), (iPTR 2))), (v2i64 (VEXTRACTI128rr (v4i64 (EXTRACT_SUBREG (v8i64 VR512:$src), sub_ymm)), @@ -3068,7 +3068,7 @@ def : Pat<(Narrow.KVT (and Narrow.KRC:$mask, addr:$src2, (X86cmpm_imm_commute timm:$cc)), Narrow.KRC)>; } -let Predicates = [HasAVX512, NoVLX] in { +let Predicates = [HasAVX512, NoVLX, HasEVEX512] in { defm : axv512_icmp_packed_cc_no_vlx_lowering; defm : axv512_icmp_packed_cc_no_vlx_lowering; @@ -3099,7 +3099,7 @@ let Predicates = [HasAVX512, NoVLX] in { defm : axv512_cmp_packed_cc_no_vlx_lowering<"VCMPPD", v2f64x_info, v8f64_info>; } -let Predicates = [HasBWI, NoVLX] in { +let Predicates = [HasBWI, NoVLX, HasEVEX512] in { defm : axv512_icmp_packed_cc_no_vlx_lowering; defm : axv512_icmp_packed_cc_no_vlx_lowering; @@ -3493,7 +3493,7 @@ multiclass mask_move_lowering; defm : mask_move_lowering<"VMOVDQA32Z", v4i32x_info, v16i32_info>; defm : mask_move_lowering<"VMOVAPSZ", v8f32x_info, v16f32_info>; @@ -3505,7 +3505,7 @@ let Predicates = [HasAVX512, NoVLX] in { defm : mask_move_lowering<"VMOVDQA64Z", v4i64x_info, v8i64_info>; } -let Predicates = [HasBWI, NoVLX] in { +let Predicates = [HasBWI, NoVLX, HasEVEX512] in { defm : mask_move_lowering<"VMOVDQU8Z", v16i8x_info, v64i8_info>; defm : mask_move_lowering<"VMOVDQU8Z", v32i8x_info, v64i8_info>; @@ -4998,8 +4998,8 @@ defm VPMINUD : avx512_binop_rm_vl_d<0x3B, "vpminud", umin, defm VPMINUQ : avx512_binop_rm_vl_q<0x3B, "vpminuq", umin, SchedWriteVecALU, HasAVX512, 1>, T8; -// PMULLQ: Use 512bit version to implement 128/256 bit in case NoVLX. -let Predicates = [HasDQI, NoVLX] in { +// PMULLQ: Use 512bit version to implement 128/256 bit in case NoVLX, HasEVEX512. 
+let Predicates = [HasDQI, NoVLX, HasEVEX512] in { def : Pat<(v4i64 (mul (v4i64 VR256X:$src1), (v4i64 VR256X:$src2))), (EXTRACT_SUBREG (VPMULLQZrr @@ -5055,7 +5055,7 @@ multiclass avx512_min_max_lowering { sub_xmm)>; } -let Predicates = [HasAVX512, NoVLX] in { +let Predicates = [HasAVX512, NoVLX, HasEVEX512] in { defm : avx512_min_max_lowering<"VPMAXUQZ", umax>; defm : avx512_min_max_lowering<"VPMINUQZ", umin>; defm : avx512_min_max_lowering<"VPMAXSQZ", smax>; @@ -6032,7 +6032,7 @@ defm VPSRL : avx512_shift_types<0xD2, 0xD3, 0xD1, "vpsrl", X86vsrl, SchedWriteVecShift>; // Use 512bit VPSRA/VPSRAI version to implement v2i64/v4i64 in case NoVLX. -let Predicates = [HasAVX512, NoVLX] in { +let Predicates = [HasAVX512, NoVLX, HasEVEX512] in { def : Pat<(v4i64 (X86vsra (v4i64 VR256X:$src1), (v2i64 VR128X:$src2))), (EXTRACT_SUBREG (v8i64 (VPSRAQZrr @@ -6161,14 +6161,14 @@ defm VPSRLV : avx512_var_shift_types<0x45, "vpsrlv", X86vsrlv, SchedWriteVarVecS defm VPRORV : avx512_var_sh
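The patch makes two kinds of feature checks stricter: the rotate lowering guard in X86ISelLowering.cpp, and the TableGen predicate lists that widen 128/256-bit operations to 512-bit instructions when VLX is missing. A small sketch of both conditions follows. `X86FeatureModel` and the two function names are hypothetical and only model the boolean logic visible in the diff, not the real `X86Subtarget` API.

```cpp
// Hypothetical stand-in for the subtarget feature bits used in the diff.
struct X86FeatureModel {
  bool avx512;  // models Subtarget.hasAVX512()
  bool vlx;     // models Subtarget.hasVLX()
  bool evex512; // models Subtarget.hasEVEX512()
};

// Mirrors the updated LowerRotate guard: AVX512 rotates are usable either
// directly at 128/256 bits (VLX), or by going through 512-bit registers,
// which requires both AVX512 and EVEX512.
constexpr bool canUseAVX512Rotate(const X86FeatureModel &f,
                                  unsigned eltSizeInBits) {
  return (f.vlx || (f.avx512 && f.evex512)) && 32 <= eltSizeInBits;
}

// Models `Predicates = [HasAVX512, NoVLX, HasEVEX512]`: patterns that widen a
// narrow vector to a 512-bit instruction now also require EVEX512.
constexpr bool canWidenTo512(const X86FeatureModel &f) {
  return f.avx512 && !f.vlx && f.evex512;
}
```

The case the fix targets is AVX512 without VLX and without EVEX512: before the change the first condition reduced to `hasAVX512()` alone, so 512-bit instructions could be selected on a target that disallows them.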
[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106) (PR #91118)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91118
[llvm-branch-commits] [llvm] release/18.x: [AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972) (PR #91126)
tstellar wrote:

@marcauberer You can just manually create a pull request against the release/18.x branch with the fixes.

https://github.com/llvm/llvm-project/pull/91126
[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91425 >From 2fc32a278e4fd46c6dd085845e69e84c321a3f75 Mon Sep 17 00:00:00 2001 From: Phoebe Wang Date: Mon, 6 May 2024 10:59:44 +0800 Subject: [PATCH 1/2] [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) AVX doesn't provide 16-bit BROADCAST instruction. Fixes #91005 --- llvm/lib/Target/X86/X86ISelLowering.cpp | 2 +- llvm/test/CodeGen/X86/pr91005.ll| 39 + 2 files changed, 40 insertions(+), 1 deletion(-) create mode 100644 llvm/test/CodeGen/X86/pr91005.ll diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp index c572b27fe401e..3e4ecab8443a9 100644 --- a/llvm/lib/Target/X86/X86ISelLowering.cpp +++ b/llvm/lib/Target/X86/X86ISelLowering.cpp @@ -7295,7 +7295,7 @@ static SDValue lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp, // With pattern matching, the VBROADCAST node may become a VMOVDDUP. if (ScalarSize == 32 || (ScalarSize == 64 && (IsGE256 || Subtarget.hasVLX())) || -CVT == MVT::f16 || +(CVT == MVT::f16 && Subtarget.hasAVX2()) || (OptForSize && (ScalarSize == 64 || Subtarget.hasAVX2( { const Constant *C = nullptr; if (ConstantSDNode *CI = dyn_cast(Ld)) diff --git a/llvm/test/CodeGen/X86/pr91005.ll b/llvm/test/CodeGen/X86/pr91005.ll new file mode 100644 index 0..97fd1ce456882 --- /dev/null +++ b/llvm/test/CodeGen/X86/pr91005.ll @@ -0,0 +1,39 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 +; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+f16c < %s | FileCheck %s + +define void @PR91005(ptr %0) minsize { +; CHECK-LABEL: PR91005: +; CHECK: # %bb.0: +; CHECK-NEXT:xorl %eax, %eax +; CHECK-NEXT:testb %al, %al +; CHECK-NEXT:je .LBB0_2 +; CHECK-NEXT: # %bb.1: +; CHECK-NEXT:vbroadcastss {{.*#+}} xmm0 = [31744,31744,31744,31744] +; CHECK-NEXT:vpcmpeqw %xmm0, %xmm0, %xmm0 +; CHECK-NEXT:vpinsrw $0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm1 +; CHECK-NEXT:vpand %xmm1, 
%xmm0, %xmm0 +; CHECK-NEXT:vcvtph2ps %xmm0, %xmm0 +; CHECK-NEXT:vpxor %xmm1, %xmm1, %xmm1 +; CHECK-NEXT:vmulss %xmm1, %xmm0, %xmm0 +; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0 +; CHECK-NEXT:vmovd %xmm0, %eax +; CHECK-NEXT:movw %ax, (%rdi) +; CHECK-NEXT: .LBB0_2: # %common.ret +; CHECK-NEXT:retq + %2 = bitcast <2 x half> poison to <2 x i16> + %3 = icmp eq <2 x i16> %2, + br i1 poison, label %4, label %common.ret + +common.ret: ; preds = %4, %1 + ret void + +4:; preds = %1 + %5 = select <2 x i1> %3, <2 x half> , <2 x half> zeroinitializer + %6 = fmul <2 x half> %5, zeroinitializer + %7 = fsub <2 x half> %6, zeroinitializer + %8 = extractelement <2 x half> %7, i64 0 + store half %8, ptr %0, align 2 + br label %common.ret +} + +declare <2 x half> @llvm.fabs.v2f16(<2 x half>) >From 4d284b853f26a6cb848028720163561cabf63d95 Mon Sep 17 00:00:00 2001 From: Phoebe Wang Date: Wed, 8 May 2024 10:59:31 +0800 Subject: [PATCH 2/2] Fix difference with LLVM 18 release --- llvm/test/CodeGen/X86/pr91005.ll | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/llvm/test/CodeGen/X86/pr91005.ll b/llvm/test/CodeGen/X86/pr91005.ll index 97fd1ce456882..16b78bf1e7e17 100644 --- a/llvm/test/CodeGen/X86/pr91005.ll +++ b/llvm/test/CodeGen/X86/pr91005.ll @@ -8,12 +8,13 @@ define void @PR91005(ptr %0) minsize { ; CHECK-NEXT:testb %al, %al ; CHECK-NEXT:je .LBB0_2 ; CHECK-NEXT: # %bb.1: -; CHECK-NEXT:vbroadcastss {{.*#+}} xmm0 = [31744,31744,31744,31744] -; CHECK-NEXT:vpcmpeqw %xmm0, %xmm0, %xmm0 -; CHECK-NEXT:vpinsrw $0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm1 -; CHECK-NEXT:vpand %xmm1, %xmm0, %xmm0 +; CHECK-NEXT:vpcmpeqw {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0 +; CHECK-NEXT:vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0 +; CHECK-NEXT:vpextrw $0, %xmm0, %eax +; CHECK-NEXT:movzwl %ax, %eax +; CHECK-NEXT:vmovd %eax, %xmm0 ; CHECK-NEXT:vcvtph2ps %xmm0, %xmm0 -; CHECK-NEXT:vpxor %xmm1, %xmm1, %xmm1 +; CHECK-NEXT:vxorps %xmm1, %xmm1, %xmm1 ; CHECK-NEXT:vmulss %xmm1, %xmm0, 
%xmm0 ; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0 ; CHECK-NEXT:vmovd %xmm0, %eax ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91425 >From dfc89f89ed14ebf22effe9dd9605608a975c4ed8 Mon Sep 17 00:00:00 2001 From: Phoebe Wang Date: Mon, 6 May 2024 10:59:44 +0800 Subject: [PATCH] [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) AVX doesn't provide 16-bit BROADCAST instruction. Fixes #91005 --- llvm/lib/Target/X86/X86ISelLowering.cpp | 2 +- llvm/test/CodeGen/X86/pr91005.ll| 40 + 2 files changed, 41 insertions(+), 1 deletion(-) create mode 100644 llvm/test/CodeGen/X86/pr91005.ll diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp index c572b27fe401e..3e4ecab8443a9 100644 --- a/llvm/lib/Target/X86/X86ISelLowering.cpp +++ b/llvm/lib/Target/X86/X86ISelLowering.cpp @@ -7295,7 +7295,7 @@ static SDValue lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp, // With pattern matching, the VBROADCAST node may become a VMOVDDUP. if (ScalarSize == 32 || (ScalarSize == 64 && (IsGE256 || Subtarget.hasVLX())) || -CVT == MVT::f16 || +(CVT == MVT::f16 && Subtarget.hasAVX2()) || (OptForSize && (ScalarSize == 64 || Subtarget.hasAVX2( { const Constant *C = nullptr; if (ConstantSDNode *CI = dyn_cast(Ld)) diff --git a/llvm/test/CodeGen/X86/pr91005.ll b/llvm/test/CodeGen/X86/pr91005.ll new file mode 100644 index 0..16b78bf1e7e17 --- /dev/null +++ b/llvm/test/CodeGen/X86/pr91005.ll @@ -0,0 +1,40 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 +; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+f16c < %s | FileCheck %s + +define void @PR91005(ptr %0) minsize { +; CHECK-LABEL: PR91005: +; CHECK: # %bb.0: +; CHECK-NEXT:xorl %eax, %eax +; CHECK-NEXT:testb %al, %al +; CHECK-NEXT:je .LBB0_2 +; CHECK-NEXT: # %bb.1: +; CHECK-NEXT:vpcmpeqw {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0 +; CHECK-NEXT:vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0 +; CHECK-NEXT:vpextrw $0, %xmm0, %eax +; CHECK-NEXT:movzwl %ax, %eax +; 
CHECK-NEXT:vmovd %eax, %xmm0 +; CHECK-NEXT:vcvtph2ps %xmm0, %xmm0 +; CHECK-NEXT:vxorps %xmm1, %xmm1, %xmm1 +; CHECK-NEXT:vmulss %xmm1, %xmm0, %xmm0 +; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0 +; CHECK-NEXT:vmovd %xmm0, %eax +; CHECK-NEXT:movw %ax, (%rdi) +; CHECK-NEXT: .LBB0_2: # %common.ret +; CHECK-NEXT:retq + %2 = bitcast <2 x half> poison to <2 x i16> + %3 = icmp eq <2 x i16> %2, + br i1 poison, label %4, label %common.ret + +common.ret: ; preds = %4, %1 + ret void + +4:; preds = %1 + %5 = select <2 x i1> %3, <2 x half> , <2 x half> zeroinitializer + %6 = fmul <2 x half> %5, zeroinitializer + %7 = fsub <2 x half> %6, zeroinitializer + %8 = extractelement <2 x half> %7, i64 0 + store half %8, ptr %0, align 2 + br label %common.ret +} + +declare <2 x half> @llvm.fabs.v2f16(<2 x half>)
[llvm-branch-commits] [llvm] dfc89f8 - [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125)
Author: Phoebe Wang Date: 2024-05-08T20:14:03-07:00 New Revision: dfc89f89ed14ebf22effe9dd9605608a975c4ed8 URL: https://github.com/llvm/llvm-project/commit/dfc89f89ed14ebf22effe9dd9605608a975c4ed8 DIFF: https://github.com/llvm/llvm-project/commit/dfc89f89ed14ebf22effe9dd9605608a975c4ed8.diff LOG: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) AVX doesn't provide 16-bit BROADCAST instruction. Fixes #91005 Added: llvm/test/CodeGen/X86/pr91005.ll Modified: llvm/lib/Target/X86/X86ISelLowering.cpp Removed: diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp index c572b27fe401e..3e4ecab8443a9 100644 --- a/llvm/lib/Target/X86/X86ISelLowering.cpp +++ b/llvm/lib/Target/X86/X86ISelLowering.cpp @@ -7295,7 +7295,7 @@ static SDValue lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp, // With pattern matching, the VBROADCAST node may become a VMOVDDUP. if (ScalarSize == 32 || (ScalarSize == 64 && (IsGE256 || Subtarget.hasVLX())) || -CVT == MVT::f16 || +(CVT == MVT::f16 && Subtarget.hasAVX2()) || (OptForSize && (ScalarSize == 64 || Subtarget.hasAVX2( { const Constant *C = nullptr; if (ConstantSDNode *CI = dyn_cast(Ld)) diff --git a/llvm/test/CodeGen/X86/pr91005.ll b/llvm/test/CodeGen/X86/pr91005.ll new file mode 100644 index 0..16b78bf1e7e17 --- /dev/null +++ b/llvm/test/CodeGen/X86/pr91005.ll @@ -0,0 +1,40 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 +; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+f16c < %s | FileCheck %s + +define void @PR91005(ptr %0) minsize { +; CHECK-LABEL: PR91005: +; CHECK: # %bb.0: +; CHECK-NEXT:xorl %eax, %eax +; CHECK-NEXT:testb %al, %al +; CHECK-NEXT:je .LBB0_2 +; CHECK-NEXT: # %bb.1: +; CHECK-NEXT:vpcmpeqw {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0 +; CHECK-NEXT:vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0 +; CHECK-NEXT:vpextrw $0, %xmm0, %eax +; CHECK-NEXT:movzwl %ax, %eax +; CHECK-NEXT:vmovd %eax, %xmm0 +; 
CHECK-NEXT:vcvtph2ps %xmm0, %xmm0 +; CHECK-NEXT:vxorps %xmm1, %xmm1, %xmm1 +; CHECK-NEXT:vmulss %xmm1, %xmm0, %xmm0 +; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0 +; CHECK-NEXT:vmovd %xmm0, %eax +; CHECK-NEXT:movw %ax, (%rdi) +; CHECK-NEXT: .LBB0_2: # %common.ret +; CHECK-NEXT:retq + %2 = bitcast <2 x half> poison to <2 x i16> + %3 = icmp eq <2 x i16> %2, + br i1 poison, label %4, label %common.ret + +common.ret: ; preds = %4, %1 + ret void + +4:; preds = %1 + %5 = select <2 x i1> %3, <2 x half> , <2 x half> zeroinitializer + %6 = fmul <2 x half> %5, zeroinitializer + %7 = fsub <2 x half> %6, zeroinitializer + %8 = extractelement <2 x half> %7, i64 0 + store half %8, ptr %0, align 2 + br label %common.ret +} + +declare <2 x half> @llvm.fabs.v2f16(<2 x half>)
[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91425
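The one-line change in lowerBuildVectorAsBroadcast discussed in this thread can be read as the predicate below. This is a standalone sketch rather than LLVM code: `BroadcastQuery` and `mayUseBroadcastLoad` are hypothetical names, the fields merely stand in for the `ScalarSize`, `CVT`, and `Subtarget` queries seen in the diff, and the closing parentheses of the last clause are an assumed reconstruction because the archived diff lost them.

```cpp
// Hypothetical stand-in for the values consulted by the patched condition.
struct BroadcastQuery {
  unsigned ScalarSize; // element size in bits
  bool IsF16;          // models "CVT == MVT::f16" from the diff
  bool IsGE256;        // models the IsGE256 local from the diff
  bool HasVLX;         // models Subtarget.hasVLX()
  bool HasAVX2;        // models Subtarget.hasAVX2()
  bool OptForSize;     // models the OptForSize local from the diff
};

// Mirrors the condition after the patch: an f16 element only qualifies when
// AVX2 is available, since plain AVX has no 16-bit broadcast instruction.
bool mayUseBroadcastLoad(const BroadcastQuery &Q) {
  return Q.ScalarSize == 32 ||
         (Q.ScalarSize == 64 && (Q.IsGE256 || Q.HasVLX)) ||
         (Q.IsF16 && Q.HasAVX2) ||
         (Q.OptForSize && (Q.ScalarSize == 64 || Q.HasAVX2));
}
```

Under this reading, the pr91005.ll configuration (f16 elements, minsize, F16C but no AVX2) no longer takes the broadcast path, whereas the pre-patch condition accepted every f16 element unconditionally.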
[llvm-branch-commits] [llvm] release/18.x: [SelectionDAG] Mark frame index as "aliased" at argument copy elison (PR #91035)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91035 >From f5f572f54b32f6ff3ae450fa421ed6d478f09ec8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B6rn=20Pettersson?= Date: Tue, 23 Apr 2024 13:49:18 +0200 Subject: [PATCH] [SelectionDAG] Mark frame index as "aliased" at argument copy elison (#89712) This is a fix for miscompiles reported in https://github.com/llvm/llvm-project/issues/89060 After argument copy elison the IR value for the eliminated alloca is aliasing with the fixed stack object. This patch is making sure that we mark the fixed stack object as being aliased with IR values to avoid that for example schedulers are reordering accesses to the fixed stack object. This could otherwise happen when there is a mix of MemOperands refering the shared fixed stack slow via both the IR value for the elided alloca, and via a fixed stack pseudo source value (as would be the case when lowering the arguments). (cherry picked from commit d8b253be56b3e9073b3e59123cf2da0bcde20c63) --- llvm/include/llvm/CodeGen/MachineFrameInfo.h | 7 .../SelectionDAG/SelectionDAGBuilder.cpp | 3 +- llvm/test/CodeGen/Hexagon/arg-copy-elison.ll | 39 +++ 3 files changed, 48 insertions(+), 1 deletion(-) create mode 100644 llvm/test/CodeGen/Hexagon/arg-copy-elison.ll diff --git a/llvm/include/llvm/CodeGen/MachineFrameInfo.h b/llvm/include/llvm/CodeGen/MachineFrameInfo.h index 7d11d63d4066f..c35faac09c4d9 100644 --- a/llvm/include/llvm/CodeGen/MachineFrameInfo.h +++ b/llvm/include/llvm/CodeGen/MachineFrameInfo.h @@ -697,6 +697,13 @@ class MachineFrameInfo { return Objects[ObjectIdx+NumFixedObjects].isAliased; } + /// Set "maybe pointed to by an LLVM IR value" for an object. + void setIsAliasedObjectIndex(int ObjectIdx, bool IsAliased) { +assert(unsigned(ObjectIdx+NumFixedObjects) < Objects.size() && + "Invalid Object Idx!"); +Objects[ObjectIdx+NumFixedObjects].isAliased = IsAliased; + } + /// Returns true if the specified index corresponds to an immutable object. 
bool isImmutableObjectIndex(int ObjectIdx) const { // Tail calling functions can clobber their function arguments. diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 5ce1013f30fd1..7406a8ac1611d 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -10888,7 +10888,7 @@ static void tryToElideArgumentCopy( } // Perform the elision. Delete the old stack object and replace its only use - // in the variable info map. Mark the stack object as mutable. + // in the variable info map. Mark the stack object as mutable and aliased. LLVM_DEBUG({ dbgs() << "Eliding argument copy from " << Arg << " to " << *AI << '\n' << " Replacing frame index " << OldIndex << " with " << FixedIndex @@ -10896,6 +10896,7 @@ static void tryToElideArgumentCopy( }); MFI.RemoveStackObject(OldIndex); MFI.setIsImmutableObjectIndex(FixedIndex, false); + MFI.setIsAliasedObjectIndex(FixedIndex, true); AllocaIndex = FixedIndex; ArgCopyElisionFrameIndexMap.insert({OldIndex, FixedIndex}); for (SDValue ArgVal : ArgVals) diff --git a/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll b/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll new file mode 100644 index 0..f0c30c301f446 --- /dev/null +++ b/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll @@ -0,0 +1,39 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 +; RUN: llc -mtriple hexagon-- -o - %s | FileCheck %s + +; Reproducer for https://github.com/llvm/llvm-project/issues/89060 +; +; Problem was a bug in argument copy elison. Given that the %alloca is +; eliminated, the same frame index will be used for accessing %alloca and %a +; on the fixed stack. 
Care must be taken when setting up +; MachinePointerInfo/MemOperands for those accesses to either make sure that +; we always refer to the fixed stack slot the same way (not using the +; ir.alloca name), or make sure that we still detect that they alias each +; other if using different kinds of MemOperands to identify the same fixed +; stack entry. +; +define i32 @f(i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 %q1, i32 %a, i32 %q2) { +; CHECK-LABEL: f: +; CHECK: .cfi_startproc +; CHECK-NEXT: // %bb.0: +; CHECK-NEXT:{ +; CHECK-NEXT: r0 = memw(r29+#36) +; CHECK-NEXT: r1 = memw(r29+#28) +; CHECK-NEXT:} +; CHECK-NEXT:{ +; CHECK-NEXT: r0 = sub(r1,r0) +; CHECK-NEXT: r2 = memw(r29+#32) +; CHECK-NEXT: memw(r29+#32) = ##666 +; CHECK-NEXT:} +; CHECK-NEXT:{ +; CHECK-NEXT: r0 = xor(r0,r2) +; CHECK-NEXT: jumpr r31 +; CHECK-NEXT:} + %alloca = alloca i32 + store i32 %a, ptr %alloca ; Should be elided. + store i32 666, ptr %alloca + %x = sub i32 %q1, %q2 + %y = xor i32 %x, %a
[llvm-branch-commits] [llvm] f5f572f - [SelectionDAG] Mark frame index as "aliased" at argument copy elison (#89712)
Author: Björn Pettersson Date: 2024-05-08T20:16:03-07:00 New Revision: f5f572f54b32f6ff3ae450fa421ed6d478f09ec8 URL: https://github.com/llvm/llvm-project/commit/f5f572f54b32f6ff3ae450fa421ed6d478f09ec8 DIFF: https://github.com/llvm/llvm-project/commit/f5f572f54b32f6ff3ae450fa421ed6d478f09ec8.diff LOG: [SelectionDAG] Mark frame index as "aliased" at argument copy elison (#89712) This is a fix for miscompiles reported in https://github.com/llvm/llvm-project/issues/89060 After argument copy elison the IR value for the eliminated alloca is aliasing with the fixed stack object. This patch is making sure that we mark the fixed stack object as being aliased with IR values to avoid that for example schedulers are reordering accesses to the fixed stack object. This could otherwise happen when there is a mix of MemOperands refering the shared fixed stack slow via both the IR value for the elided alloca, and via a fixed stack pseudo source value (as would be the case when lowering the arguments). (cherry picked from commit d8b253be56b3e9073b3e59123cf2da0bcde20c63) Added: llvm/test/CodeGen/Hexagon/arg-copy-elison.ll Modified: llvm/include/llvm/CodeGen/MachineFrameInfo.h llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Removed: diff --git a/llvm/include/llvm/CodeGen/MachineFrameInfo.h b/llvm/include/llvm/CodeGen/MachineFrameInfo.h index 7d11d63d4066f..c35faac09c4d9 100644 --- a/llvm/include/llvm/CodeGen/MachineFrameInfo.h +++ b/llvm/include/llvm/CodeGen/MachineFrameInfo.h @@ -697,6 +697,13 @@ class MachineFrameInfo { return Objects[ObjectIdx+NumFixedObjects].isAliased; } + /// Set "maybe pointed to by an LLVM IR value" for an object. + void setIsAliasedObjectIndex(int ObjectIdx, bool IsAliased) { +assert(unsigned(ObjectIdx+NumFixedObjects) < Objects.size() && + "Invalid Object Idx!"); +Objects[ObjectIdx+NumFixedObjects].isAliased = IsAliased; + } + /// Returns true if the specified index corresponds to an immutable object. 
bool isImmutableObjectIndex(int ObjectIdx) const { // Tail calling functions can clobber their function arguments. diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 5ce1013f30fd1..7406a8ac1611d 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -10888,7 +10888,7 @@ static void tryToElideArgumentCopy( } // Perform the elision. Delete the old stack object and replace its only use - // in the variable info map. Mark the stack object as mutable. + // in the variable info map. Mark the stack object as mutable and aliased. LLVM_DEBUG({ dbgs() << "Eliding argument copy from " << Arg << " to " << *AI << '\n' << " Replacing frame index " << OldIndex << " with " << FixedIndex @@ -10896,6 +10896,7 @@ static void tryToElideArgumentCopy( }); MFI.RemoveStackObject(OldIndex); MFI.setIsImmutableObjectIndex(FixedIndex, false); + MFI.setIsAliasedObjectIndex(FixedIndex, true); AllocaIndex = FixedIndex; ArgCopyElisionFrameIndexMap.insert({OldIndex, FixedIndex}); for (SDValue ArgVal : ArgVals) diff --git a/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll b/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll new file mode 100644 index 0..f0c30c301f446 --- /dev/null +++ b/llvm/test/CodeGen/Hexagon/arg-copy-elison.ll @@ -0,0 +1,39 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 +; RUN: llc -mtriple hexagon-- -o - %s | FileCheck %s + +; Reproducer for https://github.com/llvm/llvm-project/issues/89060 +; +; Problem was a bug in argument copy elison. Given that the %alloca is +; eliminated, the same frame index will be used for accessing %alloca and %a +; on the fixed stack. 
Care must be taken when setting up +; MachinePointerInfo/MemOperands for those accesses to either make sure that +; we always refer to the fixed stack slot the same way (not using the +; ir.alloca name), or make sure that we still detect that they alias each +; other if using different kinds of MemOperands to identify the same fixed +; stack entry. +; +define i32 @f(i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 %q1, i32 %a, i32 %q2) { +; CHECK-LABEL: f: +; CHECK: .cfi_startproc +; CHECK-NEXT: // %bb.0: +; CHECK-NEXT:{ +; CHECK-NEXT: r0 = memw(r29+#36) +; CHECK-NEXT: r1 = memw(r29+#28) +; CHECK-NEXT:} +; CHECK-NEXT:{ +; CHECK-NEXT: r0 = sub(r1,r0) +; CHECK-NEXT: r2 = memw(r29+#32) +; CHECK-NEXT: memw(r29+#32) = ##666 +; CHECK-NEXT:} +; CHECK-NEXT:{ +; CHECK-NEXT: r0 = xor(r0,r2) +; CHECK-NEXT: jumpr r31 +; CHECK-NEXT:} + %alloca = alloca i32 + store i32 %a, ptr %alloca ; Should be elided. + store i32 666, ptr %alloca + %x = sub i32 %q
[llvm-branch-commits] [llvm] release/18.x: [SelectionDAG] Mark frame index as "aliased" at argument copy elison (PR #91035)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91035
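The new `setIsAliasedObjectIndex` setter in this thread indexes with `ObjectIdx+NumFixedObjects`, which is easier to follow with a toy model. The sketch below is not `llvm::MachineFrameInfo`: `ToyFrameInfo` and `ToyStackObject` are hypothetical, and the model assumes that fixed stack objects receive negative indices and are stored at the front of the object vector, which is what the index arithmetic in the diff suggests.

```cpp
#include <cassert>
#include <vector>

// Toy model only; the flag names mirror the diff, the types are made up.
struct ToyStackObject {
  bool isImmutable = true;
  bool isAliased = false;
};

class ToyFrameInfo {
  std::vector<ToyStackObject> Objects;
  unsigned NumFixedObjects = 0;

  // Translate a (possibly negative) frame index into a vector position.
  unsigned slot(int ObjectIdx) const {
    assert(unsigned(ObjectIdx + NumFixedObjects) < Objects.size() &&
           "Invalid Object Idx!");
    return unsigned(ObjectIdx + NumFixedObjects);
  }

public:
  // Assumed convention: fixed objects are numbered -1, -2, ... and are
  // inserted at the front, so existing indices keep pointing at their object.
  int createFixedObject() {
    Objects.insert(Objects.begin(), ToyStackObject{});
    return -static_cast<int>(++NumFixedObjects);
  }

  bool isAliasedObjectIndex(int ObjectIdx) const {
    return Objects[slot(ObjectIdx)].isAliased;
  }

  // Counterpart of the setter added by the patch.
  void setIsAliasedObjectIndex(int ObjectIdx, bool IsAliased) {
    Objects[slot(ObjectIdx)].isAliased = IsAliased;
  }
};
```

In the patch, the equivalent call is made on the fixed index that replaces the elided alloca, so that later passes treat accesses through the IR value and through the fixed-stack pseudo source value as potentially overlapping.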
[llvm-branch-commits] [llvm] release/18.x: [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622) (PR #91034)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91034 >From bce9393291a2daa8006d1da629aa2765e00f4e70 Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Tue, 23 Apr 2024 14:38:45 +0100 Subject: [PATCH] [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622) As well as flipping the sense of the bit, GFX12 moved it from bit 0 to bit 1 in the encoded simm16 operand. (cherry picked from commit e0a763c490d8ef58dca867e0ef834978ccf8e17d) --- llvm/lib/Target/AMDGPU/SOPInstructions.td| 2 +- llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll | 10 +++--- 2 files changed, 4 insertions(+), 8 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/SOPInstructions.td b/llvm/lib/Target/AMDGPU/SOPInstructions.td index ae5ef0541929b..5762efde73f02 100644 --- a/llvm/lib/Target/AMDGPU/SOPInstructions.td +++ b/llvm/lib/Target/AMDGPU/SOPInstructions.td @@ -1786,7 +1786,7 @@ def : GCNPat< let SubtargetPredicate = isNotGFX12Plus in def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 0))>; let SubtargetPredicate = isGFX12Plus in - def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 1))>; + def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 2))>; // The first 10 bits of the mode register are the core FP mode on all // subtargets. 
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll index 08c77148f6ae1..433fefa434988 100644 --- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll +++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll @@ -5,14 +5,10 @@ ; GCN-LABEL: {{^}}test_wait_event: ; GFX11: s_wait_event 0x0 -; GFX12: s_wait_event 0x1 +; GFX12: s_wait_event 0x2 -define amdgpu_ps void @test_wait_event() #0 { +define amdgpu_ps void @test_wait_event() { entry: - call void @llvm.amdgcn.s.wait.event.export.ready() #0 + call void @llvm.amdgcn.s.wait.event.export.ready() ret void } - -declare void @llvm.amdgcn.s.wait.event.export.ready() #0 - -attributes #0 = { nounwind }
[llvm-branch-commits] [llvm] bce9393 - [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622)
Author: Jay Foad Date: 2024-05-08T20:17:31-07:00 New Revision: bce9393291a2daa8006d1da629aa2765e00f4e70 URL: https://github.com/llvm/llvm-project/commit/bce9393291a2daa8006d1da629aa2765e00f4e70 DIFF: https://github.com/llvm/llvm-project/commit/bce9393291a2daa8006d1da629aa2765e00f4e70.diff LOG: [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622) As well as flipping the sense of the bit, GFX12 moved it from bit 0 to bit 1 in the encoded simm16 operand. (cherry picked from commit e0a763c490d8ef58dca867e0ef834978ccf8e17d) Added: Modified: llvm/lib/Target/AMDGPU/SOPInstructions.td llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll Removed: diff --git a/llvm/lib/Target/AMDGPU/SOPInstructions.td b/llvm/lib/Target/AMDGPU/SOPInstructions.td index ae5ef0541929b..5762efde73f02 100644 --- a/llvm/lib/Target/AMDGPU/SOPInstructions.td +++ b/llvm/lib/Target/AMDGPU/SOPInstructions.td @@ -1786,7 +1786,7 @@ def : GCNPat< let SubtargetPredicate = isNotGFX12Plus in def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 0))>; let SubtargetPredicate = isGFX12Plus in - def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 1))>; + def : GCNPat <(int_amdgcn_s_wait_event_export_ready), (S_WAIT_EVENT (i16 2))>; // The first 10 bits of the mode register are the core FP mode on all // subtargets. 
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll index 08c77148f6ae1..433fefa434988 100644 --- a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll +++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll @@ -5,14 +5,10 @@ ; GCN-LABEL: {{^}}test_wait_event: ; GFX11: s_wait_event 0x0 -; GFX12: s_wait_event 0x1 +; GFX12: s_wait_event 0x2 -define amdgpu_ps void @test_wait_event() #0 { +define amdgpu_ps void @test_wait_event() { entry: - call void @llvm.amdgcn.s.wait.event.export.ready() #0 + call void @llvm.amdgcn.s.wait.event.export.ready() ret void } - -declare void @llvm.amdgcn.s.wait.event.export.ready() #0 - -attributes #0 = { nounwind }
[llvm-branch-commits] [llvm] release/18.x: [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622) (PR #91034)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91034
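The encoding fix in this thread comes down to which bit of the simm16 operand carries export_ready. As a rough illustration, the sketch below reproduces the two immediates used by the patched patterns; `exportReadyImm` is a hypothetical helper, and only the values 0 (before GFX12) and 2 (GFX12 and later) come from the patch itself.

```cpp
#include <cstdint>

// Hypothetical helper: returns the simm16 immediate that the patched
// patterns select for int_amdgcn_s_wait_event_export_ready.
constexpr std::uint16_t exportReadyImm(bool IsGFX12Plus) {
  // Per the commit message, GFX12 moved the bit from position 0 to
  // position 1 and flipped its sense, so the "wait" encoding is 1 << 1.
  constexpr unsigned GFX12ExportReadyBit = 1;
  return IsGFX12Plus ? static_cast<std::uint16_t>(1u << GFX12ExportReadyBit)
                     : static_cast<std::uint16_t>(0);
}
```

The previous GFX12 pattern used `(i16 1)`, which sets bit 0 and therefore does not match the relocated bit described in the commit message.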
[llvm-branch-commits] [llvm] release/18.x: [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT (PR #90827)
tstellar wrote: @aemerson Did you submit a new pull request with a fix? https://github.com/llvm/llvm-project/pull/90827
[llvm-branch-commits] [clang] release/18.x: [clang-format] Don't remove parentheses of fold expressions (#91045) (PR #91165)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91165 >From 0abb89a80f5c0736843950dc39f2454ab312b319 Mon Sep 17 00:00:00 2001 From: Owen Pan Date: Sun, 5 May 2024 21:33:41 -0700 Subject: [PATCH] [clang-format] Don't remove parentheses of fold expressions (#91045) Fixes #90966. (cherry picked from commit db0ed5533368414b1c4e1c884eef651c66359da2) --- clang/lib/Format/UnwrappedLineParser.cpp | 7 ++- clang/unittests/Format/FormatTest.cpp| 9 + 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/clang/lib/Format/UnwrappedLineParser.cpp b/clang/lib/Format/UnwrappedLineParser.cpp index a6eb18bb2b322..f70affb732a0d 100644 --- a/clang/lib/Format/UnwrappedLineParser.cpp +++ b/clang/lib/Format/UnwrappedLineParser.cpp @@ -2510,6 +2510,7 @@ bool UnwrappedLineParser::parseParens(TokenType AmpAmpTokenType) { assert(FormatTok->is(tok::l_paren) && "'(' expected."); auto *LeftParen = FormatTok; bool SeenEqual = false; + bool MightBeFoldExpr = false; const bool MightBeStmtExpr = Tokens->peekNextToken()->is(tok::l_brace); nextToken(); do { @@ -2521,7 +2522,7 @@ bool UnwrappedLineParser::parseParens(TokenType AmpAmpTokenType) { parseChildBlock(); break; case tok::r_paren: - if (!MightBeStmtExpr && !Line->InMacroBody && + if (!MightBeStmtExpr && !MightBeFoldExpr && !Line->InMacroBody && Style.RemoveParentheses > FormatStyle::RPS_Leave) { const auto *Prev = LeftParen->Previous; const auto *Next = Tokens->peekNextToken(); @@ -2564,6 +2565,10 @@ bool UnwrappedLineParser::parseParens(TokenType AmpAmpTokenType) { parseBracedList(); } break; +case tok::ellipsis: + MightBeFoldExpr = true; + nextToken(); + break; case tok::equal: SeenEqual = true; if (Style.isCSharp() && FormatTok->is(TT_FatArrow)) diff --git a/clang/unittests/Format/FormatTest.cpp b/clang/unittests/Format/FormatTest.cpp index 88877e53d014c..923128672c316 100644 --- a/clang/unittests/Format/FormatTest.cpp +++ b/clang/unittests/Format/FormatTest.cpp @@ -26894,8 +26894,14 @@ 
TEST_F(FormatTest, RemoveParentheses) { "if ((({ a; })))\n" " b;", Style); + verifyFormat("static_assert((std::is_constructible_v && ...));", + "static_assert(((std::is_constructible_v && ...)));", + Style); verifyFormat("return (0);", "return (((0)));", Style); verifyFormat("return (({ 0; }));", "return ((({ 0; })));", Style); + verifyFormat("return ((... && std::is_convertible_v));", + "return (((... && std::is_convertible_v)));", + Style); Style.RemoveParentheses = FormatStyle::RPS_ReturnStatement; verifyFormat("#define Return0 return (0);", Style); @@ -26903,6 +26909,9 @@ TEST_F(FormatTest, RemoveParentheses) { verifyFormat("co_return 0;", "co_return ((0));", Style); verifyFormat("return 0;", "return (((0)));", Style); verifyFormat("return ({ 0; });", "return ((({ 0; })));", Style); + verifyFormat("return (... && std::is_convertible_v);", + "return (((... && std::is_convertible_v)));", + Style); verifyFormat("inline decltype(auto) f() {\n" " if (a) {\n" "return (a);\n" ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] 0abb89a - [clang-format] Don't remove parentheses of fold expressions (#91045)
Author: Owen Pan Date: 2024-05-09T12:16:32-07:00 New Revision: 0abb89a80f5c0736843950dc39f2454ab312b319 URL: https://github.com/llvm/llvm-project/commit/0abb89a80f5c0736843950dc39f2454ab312b319 DIFF: https://github.com/llvm/llvm-project/commit/0abb89a80f5c0736843950dc39f2454ab312b319.diff LOG: [clang-format] Don't remove parentheses of fold expressions (#91045) Fixes #90966. (cherry picked from commit db0ed5533368414b1c4e1c884eef651c66359da2) Added: Modified: clang/lib/Format/UnwrappedLineParser.cpp clang/unittests/Format/FormatTest.cpp Removed: diff --git a/clang/lib/Format/UnwrappedLineParser.cpp b/clang/lib/Format/UnwrappedLineParser.cpp index a6eb18bb2b322..f70affb732a0d 100644 --- a/clang/lib/Format/UnwrappedLineParser.cpp +++ b/clang/lib/Format/UnwrappedLineParser.cpp @@ -2510,6 +2510,7 @@ bool UnwrappedLineParser::parseParens(TokenType AmpAmpTokenType) { assert(FormatTok->is(tok::l_paren) && "'(' expected."); auto *LeftParen = FormatTok; bool SeenEqual = false; + bool MightBeFoldExpr = false; const bool MightBeStmtExpr = Tokens->peekNextToken()->is(tok::l_brace); nextToken(); do { @@ -2521,7 +2522,7 @@ bool UnwrappedLineParser::parseParens(TokenType AmpAmpTokenType) { parseChildBlock(); break; case tok::r_paren: - if (!MightBeStmtExpr && !Line->InMacroBody && + if (!MightBeStmtExpr && !MightBeFoldExpr && !Line->InMacroBody && Style.RemoveParentheses > FormatStyle::RPS_Leave) { const auto *Prev = LeftParen->Previous; const auto *Next = Tokens->peekNextToken(); @@ -2564,6 +2565,10 @@ bool UnwrappedLineParser::parseParens(TokenType AmpAmpTokenType) { parseBracedList(); } break; +case tok::ellipsis: + MightBeFoldExpr = true; + nextToken(); + break; case tok::equal: SeenEqual = true; if (Style.isCSharp() && FormatTok->is(TT_FatArrow)) diff --git a/clang/unittests/Format/FormatTest.cpp b/clang/unittests/Format/FormatTest.cpp index 88877e53d014c..923128672c316 100644 --- a/clang/unittests/Format/FormatTest.cpp +++ b/clang/unittests/Format/FormatTest.cpp @@ -26894,8 
+26894,14 @@ TEST_F(FormatTest, RemoveParentheses) { "if ((({ a; })))\n" " b;", Style); + verifyFormat("static_assert((std::is_constructible_v && ...));", + "static_assert(((std::is_constructible_v && ...)));", + Style); verifyFormat("return (0);", "return (((0)));", Style); verifyFormat("return (({ 0; }));", "return ((({ 0; })));", Style); + verifyFormat("return ((... && std::is_convertible_v));", + "return (((... && std::is_convertible_v)));", + Style); Style.RemoveParentheses = FormatStyle::RPS_ReturnStatement; verifyFormat("#define Return0 return (0);", Style); @@ -26903,6 +26909,9 @@ TEST_F(FormatTest, RemoveParentheses) { verifyFormat("co_return 0;", "co_return ((0));", Style); verifyFormat("return 0;", "return (((0)));", Style); verifyFormat("return ({ 0; });", "return ((({ 0; })));", Style); + verifyFormat("return (... && std::is_convertible_v);", + "return (((... && std::is_convertible_v)));", + Style); verifyFormat("inline decltype(auto) f() {\n" " if (a) {\n" "return (a);\n" ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/18.x: [clang-format] Don't remove parentheses of fold expressions (#91045) (PR #91165)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91165
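For context on why the clang-format fix in this thread matters: in C++17 the parentheses around a fold expression are part of its grammar, so a formatter that strips them produces code that no longer compiles. The snippet below is a minimal illustration; `allConvertible` is a made-up name and is not taken from the clang-format tests.

```cpp
#include <type_traits>

// True when every type in From converts implicitly to To.
template <typename To, typename... From>
constexpr bool allConvertible() {
  // The enclosing parentheses belong to the fold expression itself;
  // writing "return ... && std::is_convertible_v<From, To>;" is ill-formed.
  return (... && std::is_convertible_v<From, To>);
}

static_assert(allConvertible<double, int, float>(),
              "int and float both convert to double");
```

This corresponds to the `return (... && std::is_convertible_v);` expectation added for `RPS_ReturnStatement`, where redundant outer parentheses are removed but the innermost pair is kept.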
[llvm-branch-commits] [llvm] release/18.x: [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180) (PR #91286)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91286

>From 4a28f8e3c625e168c1cb9203150e3dc6495bb0fa Mon Sep 17 00:00:00 2001
From: Nikita Popov
Date: Tue, 7 May 2024 09:47:28 +0900
Subject: [PATCH] [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180)

For inbounds GEPs, if the source pointer is non-null, the result must
also be non-null. However, this does not hold for non-inbounds GEPs.

Fixes https://github.com/llvm/llvm-project/issues/91177.

(cherry picked from commit f34d30cdae0f59698f660d5cc8fb993fb3441064)
---
 llvm/lib/Transforms/IPO/FunctionAttrs.cpp     |  7 +-
 .../Transforms/FunctionAttrs/nocapture.ll     |  2 +-
 llvm/test/Transforms/FunctionAttrs/nonnull.ll | 23 +++
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
index 7ebf265e17ba1..27c411250d53c 100644
--- a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
@@ -1186,10 +1186,15 @@ static bool isReturnNonNull(Function *F, const SCCNodeSet &SCCNodes,
       switch (RVI->getOpcode()) {
       // Extend the analysis by looking upwards.
       case Instruction::BitCast:
-      case Instruction::GetElementPtr:
       case Instruction::AddrSpaceCast:
         FlowsToReturn.insert(RVI->getOperand(0));
         continue;
+      case Instruction::GetElementPtr:
+        if (cast<GetElementPtrInst>(RVI)->isInBounds()) {
+          FlowsToReturn.insert(RVI->getOperand(0));
+          continue;
+        }
+        return false;
       case Instruction::Select: {
         SelectInst *SI = cast<SelectInst>(RVI);
         FlowsToReturn.insert(SI->getTrueValue());
diff --git a/llvm/test/Transforms/FunctionAttrs/nocapture.ll b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
index 3d483f671b1af..8d6f6a7c73f80 100644
--- a/llvm/test/Transforms/FunctionAttrs/nocapture.ll
+++ b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
@@ -197,7 +197,7 @@ declare i32 @__gxx_personality_v0(...)
 define ptr @lookup_bit(ptr %q, i32 %bitno) readnone nounwind {
 ; FNATTRS: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
-; FNATTRS-LABEL: define nonnull ptr @lookup_bit
+; FNATTRS-LABEL: define ptr @lookup_bit
 ; FNATTRS-SAME: (ptr [[Q:%.*]], i32 [[BITNO:%.*]]) #[[ATTR0]] {
 ; FNATTRS-NEXT:[[TMP:%.*]] = ptrtoint ptr [[Q]] to i32
 ; FNATTRS-NEXT:[[TMP2:%.*]] = lshr i32 [[TMP]], [[BITNO]]
diff --git a/llvm/test/Transforms/FunctionAttrs/nonnull.ll b/llvm/test/Transforms/FunctionAttrs/nonnull.ll
index d9bdb6298ed0f..ec5545b969e55 100644
--- a/llvm/test/Transforms/FunctionAttrs/nonnull.ll
+++ b/llvm/test/Transforms/FunctionAttrs/nonnull.ll
@@ -905,26 +905,26 @@ define i1 @parent8(ptr %a, ptr %bogus1, ptr %b) personality ptr @esfp{
 ; FNATTRS-SAME: ptr nonnull [[A:%.*]], ptr nocapture readnone [[BOGUS1:%.*]], ptr nonnull [[B:%.*]]) #[[ATTR7]] personality ptr @esfp {
 ; FNATTRS-NEXT: entry:
 ; FNATTRS-NEXT:invoke void @use2nonnull(ptr [[A]], ptr [[B]])
-; FNATTRS-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
+; FNATTRS-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
 ; FNATTRS: cont:
 ; FNATTRS-NEXT:[[NULL_CHECK:%.*]] = icmp eq ptr [[B]], null
 ; FNATTRS-NEXT:ret i1 [[NULL_CHECK]]
 ; FNATTRS: exc:
 ; FNATTRS-NEXT:[[LP:%.*]] = landingpad { ptr, i32 }
-; FNATTRS-NEXT:filter [0 x ptr] zeroinitializer
+; FNATTRS-NEXT:filter [0 x ptr] zeroinitializer
 ; FNATTRS-NEXT:unreachable
 ;
 ; ATTRIBUTOR-LABEL: define i1 @parent8(
 ; ATTRIBUTOR-SAME: ptr nonnull [[A:%.*]], ptr nocapture nofree readnone [[BOGUS1:%.*]], ptr nonnull [[B:%.*]]) #[[ATTR8]] personality ptr @esfp {
 ; ATTRIBUTOR-NEXT: entry:
 ; ATTRIBUTOR-NEXT:invoke void @use2nonnull(ptr nonnull [[A]], ptr nonnull [[B]])
-; ATTRIBUTOR-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
+; ATTRIBUTOR-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
 ; ATTRIBUTOR: cont:
 ; ATTRIBUTOR-NEXT:[[NULL_CHECK:%.*]] = icmp eq ptr [[B]], null
 ; ATTRIBUTOR-NEXT:ret i1 [[NULL_CHECK]]
 ; ATTRIBUTOR: exc:
 ; ATTRIBUTOR-NEXT:[[LP:%.*]] = landingpad { ptr, i32 }
-; ATTRIBUTOR-NEXT:filter [0 x ptr] zeroinitializer
+; ATTRIBUTOR-NEXT:filter [0 x ptr] zeroinitializer
 ; ATTRIBUTOR-NEXT:unreachable
 ;
@@ -1415,5 +1415,20 @@ define void @PR43833_simple(ptr %0, i32 %1) {
   br i1 %11, label %7, label %8
 }

+define ptr @pr91177_non_inbounds_gep(ptr nonnull %arg) {
+; FNATTRS-LABEL: define ptr @pr91177_non_inbounds_gep(
+; FNATTRS-SAME: ptr nonnull readnone [[ARG:%.*]]) #[[ATTR0]] {
+; FNATTRS-NEXT:[[RES:%.*]] = getelementptr i8, ptr [[ARG]], i64 -8
+; FNATTRS-NEXT:ret ptr [[RES]]
+;
+; ATTRIBUTOR-LABEL: define ptr @pr91177_non_inbounds_gep(
+; ATTRIBUTOR-SAME: ptr nofree nonnull readnone [[ARG:%.*]]) #[[ATTR0]] {
+; ATTRIBUTOR-NEXT:[[RES:%.*]] = getelementptr i8, ptr [[ARG]], i64 -8
+; ATTRIBUT
[llvm-branch-commits] [llvm] 4a28f8e - [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180)
Author: Nikita Popov
Date: 2024-05-09T12:18:19-07:00
New Revision: 4a28f8e3c625e168c1cb9203150e3dc6495bb0fa

URL: https://github.com/llvm/llvm-project/commit/4a28f8e3c625e168c1cb9203150e3dc6495bb0fa
DIFF: https://github.com/llvm/llvm-project/commit/4a28f8e3c625e168c1cb9203150e3dc6495bb0fa.diff

LOG: [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180)

For inbounds GEPs, if the source pointer is non-null, the result must
also be non-null. However, this does not hold for non-inbounds GEPs.

Fixes https://github.com/llvm/llvm-project/issues/91177.

(cherry picked from commit f34d30cdae0f59698f660d5cc8fb993fb3441064)

Added:

Modified:
    llvm/lib/Transforms/IPO/FunctionAttrs.cpp
    llvm/test/Transforms/FunctionAttrs/nocapture.ll
    llvm/test/Transforms/FunctionAttrs/nonnull.ll

Removed:

diff --git a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
index 7ebf265e17ba1..27c411250d53c 100644
--- a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
@@ -1186,10 +1186,15 @@ static bool isReturnNonNull(Function *F, const SCCNodeSet &SCCNodes,
       switch (RVI->getOpcode()) {
       // Extend the analysis by looking upwards.
       case Instruction::BitCast:
-      case Instruction::GetElementPtr:
       case Instruction::AddrSpaceCast:
         FlowsToReturn.insert(RVI->getOperand(0));
         continue;
+      case Instruction::GetElementPtr:
+        if (cast<GetElementPtrInst>(RVI)->isInBounds()) {
+          FlowsToReturn.insert(RVI->getOperand(0));
+          continue;
+        }
+        return false;
       case Instruction::Select: {
         SelectInst *SI = cast<SelectInst>(RVI);
         FlowsToReturn.insert(SI->getTrueValue());
diff --git a/llvm/test/Transforms/FunctionAttrs/nocapture.ll b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
index 3d483f671b1af..8d6f6a7c73f80 100644
--- a/llvm/test/Transforms/FunctionAttrs/nocapture.ll
+++ b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
@@ -197,7 +197,7 @@ declare i32 @__gxx_personality_v0(...)
 define ptr @lookup_bit(ptr %q, i32 %bitno) readnone nounwind {
 ; FNATTRS: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
-; FNATTRS-LABEL: define nonnull ptr @lookup_bit
+; FNATTRS-LABEL: define ptr @lookup_bit
 ; FNATTRS-SAME: (ptr [[Q:%.*]], i32 [[BITNO:%.*]]) #[[ATTR0]] {
 ; FNATTRS-NEXT:[[TMP:%.*]] = ptrtoint ptr [[Q]] to i32
 ; FNATTRS-NEXT:[[TMP2:%.*]] = lshr i32 [[TMP]], [[BITNO]]
diff --git a/llvm/test/Transforms/FunctionAttrs/nonnull.ll b/llvm/test/Transforms/FunctionAttrs/nonnull.ll
index d9bdb6298ed0f..ec5545b969e55 100644
--- a/llvm/test/Transforms/FunctionAttrs/nonnull.ll
+++ b/llvm/test/Transforms/FunctionAttrs/nonnull.ll
@@ -905,26 +905,26 @@ define i1 @parent8(ptr %a, ptr %bogus1, ptr %b) personality ptr @esfp{
 ; FNATTRS-SAME: ptr nonnull [[A:%.*]], ptr nocapture readnone [[BOGUS1:%.*]], ptr nonnull [[B:%.*]]) #[[ATTR7]] personality ptr @esfp {
 ; FNATTRS-NEXT: entry:
 ; FNATTRS-NEXT:invoke void @use2nonnull(ptr [[A]], ptr [[B]])
-; FNATTRS-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
+; FNATTRS-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
 ; FNATTRS: cont:
 ; FNATTRS-NEXT:[[NULL_CHECK:%.*]] = icmp eq ptr [[B]], null
 ; FNATTRS-NEXT:ret i1 [[NULL_CHECK]]
 ; FNATTRS: exc:
 ; FNATTRS-NEXT:[[LP:%.*]] = landingpad { ptr, i32 }
-; FNATTRS-NEXT:filter [0 x ptr] zeroinitializer
+; FNATTRS-NEXT:filter [0 x ptr] zeroinitializer
 ; FNATTRS-NEXT:unreachable
 ;
 ; ATTRIBUTOR-LABEL: define i1 @parent8(
 ; ATTRIBUTOR-SAME: ptr nonnull [[A:%.*]], ptr nocapture nofree readnone [[BOGUS1:%.*]], ptr nonnull [[B:%.*]]) #[[ATTR8]] personality ptr @esfp {
 ; ATTRIBUTOR-NEXT: entry:
 ; ATTRIBUTOR-NEXT:invoke void @use2nonnull(ptr nonnull [[A]], ptr nonnull [[B]])
-; ATTRIBUTOR-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
+; ATTRIBUTOR-NEXT:to label [[CONT:%.*]] unwind label [[EXC:%.*]]
 ; ATTRIBUTOR: cont:
 ; ATTRIBUTOR-NEXT:[[NULL_CHECK:%.*]] = icmp eq ptr [[B]], null
 ; ATTRIBUTOR-NEXT:ret i1 [[NULL_CHECK]]
 ; ATTRIBUTOR: exc:
 ; ATTRIBUTOR-NEXT:[[LP:%.*]] = landingpad { ptr, i32 }
-; ATTRIBUTOR-NEXT:filter [0 x ptr] zeroinitializer
+; ATTRIBUTOR-NEXT:filter [0 x ptr] zeroinitializer
 ; ATTRIBUTOR-NEXT:unreachable
 ;
@@ -1415,5 +1415,20 @@ define void @PR43833_simple(ptr %0, i32 %1) {
   br i1 %11, label %7, label %8
 }

+define ptr @pr91177_non_inbounds_gep(ptr nonnull %arg) {
+; FNATTRS-LABEL: define ptr @pr91177_non_inbounds_gep(
+; FNATTRS-SAME: ptr nonnull readnone [[ARG:%.*]]) #[[ATTR0]] {
+; FNATTRS-NEXT:[[RES:%.*]] = getelementptr i8, ptr [[ARG]], i64 -8
+; FNATTRS-NEXT:ret ptr [[RES]]
+;
+; ATTRIBUTOR-LABEL: define ptr @pr91177_non_inbounds_gep(
+; ATTRIBUTOR-SAME: ptr nofree nonnull readnone [[ARG
[llvm-branch-commits] [llvm] release/18.x: [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180) (PR #91286)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91286 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
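The reasoning behind this fix can be illustrated with a minimal sketch in plain C++ (not LLVM code). It assumes a non-inbounds byte GEP behaves like wrapping integer arithmetic on the address; the helper name gepNoInbounds and the address 8 are hypothetical and chosen to mirror the `i64 -8` offset in the new test.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model (an assumption, not an LLVM API): a non-inbounds
// `getelementptr i8, ptr %p, i64 %off` treated as wrapping arithmetic
// on the numeric address.
constexpr std::uint64_t gepNoInbounds(std::uint64_t addr, std::int64_t off) {
  return addr + static_cast<std::uint64_t>(off);
}

// A non-null source address of 8 combined with offset -8 gives address 0,
// so "source is non-null" does not imply "result is non-null".
static_assert(gepNoInbounds(8, -8) == 0, "non-null base can reach null");
```

With inbounds, a result outside the underlying object is poison, which is what lets the analysis keep looking through inbounds GEPs only.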
[llvm-branch-commits] [llvm] release/18.x: [AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972) (PR #91580)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91580

>From 1ffa00a01e5db29cb5c1b2cb6baed8e2b9381f81 Mon Sep 17 00:00:00 2001
From: Marc Auberer
Date: Thu, 28 Mar 2024 23:08:38 +0100
Subject: [PATCH 1/2] [AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972)

Fixes #86917

`FCMP_TRUE` and `FCMP_FALSE` were previously not considered and we ended
up in an llvm_unreachable assertion.
---
 .../AArch64/GISel/AArch64GlobalISelUtils.cpp |   6 ++
 .../CodeGen/AArch64/GlobalISel/select.mir    |  20
 .../AArch64/neon-compare-instructions.ll     | 101 ++
 3 files changed, 127 insertions(+)

diff --git a/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp b/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp
index 92db89cc0915b..80fe4bcb8b58f 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp
@@ -147,6 +147,12 @@ void AArch64GISelUtils::changeFCMPPredToAArch64CC(
   case CmpInst::FCMP_UNE:
     CondCode = AArch64CC::NE;
     break;
+  case CmpInst::FCMP_TRUE:
+    CondCode = AArch64CC::AL;
+    break;
+  case CmpInst::FCMP_FALSE:
+    CondCode = AArch64CC::NV;
+    break;
   }
 }

diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select.mir
index 60cddbf794bc7..ae78d4be0f88a 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select.mir
@@ -183,6 +183,14 @@ registers:
   - { id: 5, class: gpr }
   - { id: 6, class: gpr }
   - { id: 7, class: gpr }
+  - { id: 8, class: fpr }
+  - { id: 9, class: gpr }
+  - { id: 10, class: fpr }
+  - { id: 11, class: gpr }
+  - { id: 12, class: gpr }
+  - { id: 13, class: gpr }
+  - { id: 14, class: gpr }
+  - { id: 15, class: gpr }

 # CHECK: body:
 # CHECK:nofpexcept FCMPSrr %0, %0, implicit-def $nzcv
@@ -209,6 +217,18 @@ body: |
     %7(s32) = G_ANYEXT %5
     $w0 = COPY %7(s32)

+    %8(s32) = COPY $s0
+    %9(s32) = G_FCMP floatpred(true), %8, %8
+    %12(s8) = G_TRUNC %9(s32)
+    %14(s32) = G_ANYEXT %12
+    $w0 = COPY %14(s32)
+
+    %10(s64) = COPY $d0
+    %11(s32) = G_FCMP floatpred(false), %10, %10
+    %13(s8) = G_TRUNC %11(s32)
+    %15(s32) = G_ANYEXT %13
+    $w0 = COPY %15(s32)
+
 ...
 ---

diff --git a/llvm/test/CodeGen/AArch64/neon-compare-instructions.ll b/llvm/test/CodeGen/AArch64/neon-compare-instructions.ll
index 765c81e26e13c..c4c00f8e97942 100644
--- a/llvm/test/CodeGen/AArch64/neon-compare-instructions.ll
+++ b/llvm/test/CodeGen/AArch64/neon-compare-instructions.ll
@@ -2870,6 +2870,107 @@ define <2 x i64> @fcmune2xdouble(<2 x double> %A, <2 x double> %B) {
   ret <2 x i64> %tmp4
 }

+define <2 x i32> @fcmal2xfloat(<2 x float> %A, <2 x float> %B) {
+; CHECK-SD-LABEL: fcmal2xfloat:
+; CHECK-SD: // %bb.0:
+; CHECK-SD-NEXT:movi v0.2d, #0x
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: fcmal2xfloat:
+; CHECK-GI: // %bb.0:
+; CHECK-GI-NEXT:movi v0.2s, #1
+; CHECK-GI-NEXT:shl v0.2s, v0.2s, #31
+; CHECK-GI-NEXT:sshr v0.2s, v0.2s, #31
+; CHECK-GI-NEXT:ret
+  %tmp3 = fcmp true <2 x float> %A, %B
+  %tmp4 = sext <2 x i1> %tmp3 to <2 x i32>
+  ret <2 x i32> %tmp4
+}
+
+define <4 x i32> @fcmal4xfloat(<4 x float> %A, <4 x float> %B) {
+; CHECK-SD-LABEL: fcmal4xfloat:
+; CHECK-SD: // %bb.0:
+; CHECK-SD-NEXT:movi v0.2d, #0x
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: fcmal4xfloat:
+; CHECK-GI: // %bb.0:
+; CHECK-GI-NEXT:mov w8, #1 // =0x1
+; CHECK-GI-NEXT:fmov s0, w8
+; CHECK-GI-NEXT:mov v1.16b, v0.16b
+; CHECK-GI-NEXT:mov v1.h[1], v0.h[0]
+; CHECK-GI-NEXT:mov v0.h[1], v0.h[0]
+; CHECK-GI-NEXT:ushll v1.4s, v1.4h, #0
+; CHECK-GI-NEXT:ushll v0.4s, v0.4h, #0
+; CHECK-GI-NEXT:mov v1.d[1], v0.d[0]
+; CHECK-GI-NEXT:shl v0.4s, v1.4s, #31
+; CHECK-GI-NEXT:sshr v0.4s, v0.4s, #31
+; CHECK-GI-NEXT:ret
+  %tmp3 = fcmp true <4 x float> %A, %B
+  %tmp4 = sext <4 x i1> %tmp3 to <4 x i32>
+  ret <4 x i32> %tmp4
+}
+define <2 x i64> @fcmal2xdouble(<2 x double> %A, <2 x double> %B) {
+; CHECK-SD-LABEL: fcmal2xdouble:
+; CHECK-SD: // %bb.0:
+; CHECK-SD-NEXT:movi v0.2d, #0x
+; CHECK-SD-NEXT:ret
+;
+; CHECK-GI-LABEL: fcmal2xdouble:
+; CHECK-GI: // %bb.0:
+; CHECK-GI-NEXT:adrp x8, .LCPI221_0
+; CHECK-GI-NEXT:ldr q0, [x8, :lo12:.LCPI221_0]
+; CHECK-GI-NEXT:shl v0.2d, v0.2d, #63
+; CHECK-GI-NEXT:sshr v0.2d, v0.2d, #63
+; CHECK-GI-NEXT:ret
+  %tmp3 = fcmp true <2 x double> %A, %B
+  %tmp4 = sext <2 x i1> %tmp3 to <2 x i64>
+  ret <2 x i64> %tmp4
+}
+
+define <2 x i32> @fcmnv2xfloat(<2 x float> %A, <2 x float> %B) {
+; CHECK-LABEL: fcmnv2xfloat:
+; CHECK: // %bb.0:
+; CHECK-NEXT:movi v0.2d, #
+; CHECK-NEXT:ret
+  %tmp3 = fcmp false <2 x float> %A, %B
+  %tmp4 = sext <2 x i1> %tmp3 to <2 x i3
[llvm-branch-commits] [llvm] release/18.x: [AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972) (PR #91580)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91580
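The shape of this fix can be sketched in self-contained C++. The enums Pred and CC and the function toCondCode are hypothetical stand-ins for the LLVM predicate and AArch64 condition-code types, with only a few predicates listed; the point is that the always-true and always-false predicates get their own cases (AL and NV, as in the patch) instead of falling through to an unreachable default.

```cpp
#include <cassert>

// Hypothetical stand-ins for a few CmpInst predicates and AArch64
// condition codes; the real enums have many more members.
enum class Pred { FCMP_FALSE, FCMP_OEQ, FCMP_UNE, FCMP_TRUE };
enum class CC { EQ, NE, AL, NV, Invalid };

// Map a floating-point predicate to a condition code.
inline CC toCondCode(Pred P) {
  switch (P) {
  case Pred::FCMP_OEQ:
    return CC::EQ;
  case Pred::FCMP_UNE:
    return CC::NE;
  case Pred::FCMP_TRUE: // always true: "always" condition
    return CC::AL;
  case Pred::FCMP_FALSE: // always false: "never" condition
    return CC::NV;
  }
  return CC::Invalid; // only reached for values outside the enum
}
```

Listing every enumerator explicitly also lets compilers warn when a new predicate is added without a matching case.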
[llvm-branch-commits] [llvm] release/18.x: [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705) (PR #90805)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/90805

>From d9a7e5179a89624f23d8d6993e7e9ec8887063fc Mon Sep 17 00:00:00 2001
From: Nikita Popov
Date: Thu, 2 May 2024 09:38:09 +0900
Subject: [PATCH] [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705)

Vectors are always tightly packed, and elements of non-byte-sized
usually do not have a well-defined (byte) offset.

Fixes https://github.com/llvm/llvm-project/issues/90695.

(cherry picked from commit d484c4d3501a7ff3d00a6e0cfad026a3b01d320c)
---
 .../CodeGen/InterleavedLoadCombinePass.cpp    |  3 +++
 .../interleaved-load-combine-pr90695.ll       | 19 +++
 2 files changed, 22 insertions(+)
 create mode 100644 llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll

diff --git a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
index f2d5c3c867c2d..bbb0b654dc67b 100644
--- a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
+++ b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
@@ -877,6 +877,9 @@ struct VectorInfo {
     if (LI->isAtomic())
       return false;

+    if (!DL.typeSizeEqualsStoreSize(Result.VTy->getElementType()))
+      return false;
+
     // Get the base polynomial
     computePolynomialFromPointer(*LI->getPointerOperand(), Offset, BasePtr, DL);

diff --git a/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
new file mode 100644
index 0..ee75b3a083f71
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=interleaved-load-combine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-windows-gnu"
+
+; Make sure we don't crash on loads of vectors of non-byte-sized types.
+define <4 x i1> @test(ptr %p) {
+; CHECK-LABEL: define <4 x i1> @test(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT:[[LOAD:%.*]] = load <2 x i1>, ptr [[P]], align 1
+; CHECK-NEXT:[[SHUF:%.*]] = shufflevector <2 x i1> [[LOAD]], <2 x i1> zeroinitializer, <4 x i32>
+; CHECK-NEXT:ret <4 x i1> [[SHUF]]
+;
+entry:
+  %load = load <2 x i1>, ptr %p, align 1
+  %shuf = shufflevector <2 x i1> %load, <2 x i1> zeroinitializer, <4 x i32>
+  ret <4 x i1> %shuf
+}
[llvm-branch-commits] [llvm] d9a7e51 - [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705)
Author: Nikita Popov
Date: 2024-05-09T12:48:52-07:00
New Revision: d9a7e5179a89624f23d8d6993e7e9ec8887063fc

URL: https://github.com/llvm/llvm-project/commit/d9a7e5179a89624f23d8d6993e7e9ec8887063fc
DIFF: https://github.com/llvm/llvm-project/commit/d9a7e5179a89624f23d8d6993e7e9ec8887063fc.diff

LOG: [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705)

Vectors are always tightly packed, and elements of non-byte-sized
usually do not have a well-defined (byte) offset.

Fixes https://github.com/llvm/llvm-project/issues/90695.

(cherry picked from commit d484c4d3501a7ff3d00a6e0cfad026a3b01d320c)

Added:
    llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll

Modified:
    llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp

Removed:

diff --git a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
index f2d5c3c867c2d..bbb0b654dc67b 100644
--- a/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
+++ b/llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
@@ -877,6 +877,9 @@ struct VectorInfo {
     if (LI->isAtomic())
       return false;

+    if (!DL.typeSizeEqualsStoreSize(Result.VTy->getElementType()))
+      return false;
+
     // Get the base polynomial
     computePolynomialFromPointer(*LI->getPointerOperand(), Offset, BasePtr, DL);

diff --git a/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
new file mode 100644
index 0..ee75b3a083f71
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/interleaved-load-combine-pr90695.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=interleaved-load-combine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-windows-gnu"
+
+; Make sure we don't crash on loads of vectors of non-byte-sized types.
+define <4 x i1> @test(ptr %p) {
+; CHECK-LABEL: define <4 x i1> @test(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT:[[LOAD:%.*]] = load <2 x i1>, ptr [[P]], align 1
+; CHECK-NEXT:[[SHUF:%.*]] = shufflevector <2 x i1> [[LOAD]], <2 x i1> zeroinitializer, <4 x i32>
+; CHECK-NEXT:ret <4 x i1> [[SHUF]]
+;
+entry:
+  %load = load <2 x i1>, ptr %p, align 1
+  %shuf = shufflevector <2 x i1> %load, <2 x i1> zeroinitializer, <4 x i32>
+  ret <4 x i1> %shuf
+}
[llvm-branch-commits] [llvm] release/18.x: [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705) (PR #90805)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/90805
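The condition this patch bails out on can be modelled with a small C++ sketch. It assumes the check amounts to comparing a type's bit width with that width rounded up to whole bytes; the helpers storeSizeInBits and sizeEqualsStoreSize are hypothetical and work on plain bit counts rather than on LLVM types.

```cpp
#include <cassert>
#include <cstdint>

// Store size in bits: the bit width rounded up to a whole number of bytes.
constexpr std::uint64_t storeSizeInBits(std::uint64_t bits) {
  return (bits + 7) / 8 * 8;
}

// Hypothetical model of the check: true only for byte-sized element types.
constexpr bool sizeEqualsStoreSize(std::uint64_t bits) {
  return bits == storeSizeInBits(bits);
}

// An i1 element occupies 1 bit but stores as 8, so it is rejected;
// i8 and i32 elements pass.
static_assert(!sizeEqualsStoreSize(1), "i1 is not byte-sized");
static_assert(sizeEqualsStoreSize(8), "i8 is byte-sized");
static_assert(sizeEqualsStoreSize(32), "i32 is byte-sized");
```

Under this model the `<2 x i1>` load in the new test fails the check, so the pass leaves it alone instead of computing byte offsets for its elements.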
[llvm-branch-commits] [llvm] [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595) (PR #90719)
tstellar wrote: @jayfoad (or anyone else), if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90719
[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106) (PR #91118)
tstellar wrote: @phoebewang (or anyone else), if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/91118
[llvm-branch-commits] [llvm] release/18.x: [SelectionDAG] Mark frame index as "aliased" at argument copy elison (PR #91035)
tstellar wrote: @AtariDreams (or anyone else), if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/91035
[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)
tstellar wrote: @phoebewang (or anyone else), if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/91425
[llvm-branch-commits] [llvm] release/18.x: [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622) (PR #91034)
tstellar wrote: @AtariDreams (or anyone else), if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/91034
[llvm-branch-commits] [clang] release/18.x: [clang-format] Don't remove parentheses of fold expressions (#91045) (PR #91165)
tstellar wrote: @owenca (or anyone else), if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/91165
[llvm-branch-commits] [llvm] release/18.x: [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180) (PR #91286)
tstellar wrote: @nikic (or anyone else), if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/91286
[llvm-branch-commits] [llvm] release/18.x: [InterleavedLoadCombine] Bail out on non-byte-sized vector element type (#90705) (PR #90805)
tstellar wrote: @nikic (or anyone else), if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/90805
[llvm-branch-commits] [llvm] release/18.x: [AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972) (PR #91580)
tstellar wrote: @marcauberer (or anyone else), if you would like to add a note about this fix in the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/91580
[llvm-branch-commits] [llvm] release/18.x: [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT - manual merge (PR #91672)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/91672

>From 7dbd266e89a70e96a747d8dd4aa5c6abfde15b2c Mon Sep 17 00:00:00 2001
From: Amara Emerson
Date: Thu, 7 Mar 2024 15:38:33 -0800
Subject: [PATCH] [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT

We should moreElements <3 x s1> to <4 x s1> before we try to widen the
element, otherwise we end up with a <3 x s21> nonsense type.

(cherry picked from commit a01e9ce86f4c1bc9af819902db9f287b6d23f54f)

Test has been changed from original commit due to a fallback in a G_BITCAST.
Added abort=2 so we can see partial legalization and check no crash.
---
 .../AArch64/GISel/AArch64LegalizerInfo.cpp    |  1 +
 .../GlobalISel/legalize-insert-vector-elt.mir | 69 ++-
 2 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 4b9d549e79114..de3c89e925a2a 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -877,6 +877,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
   getActionDefinitionsBuilder(G_INSERT_VECTOR_ELT)
       .legalIf(typeInSet(0, {v16s8, v8s8, v8s16, v4s16, v4s32, v2s32, v2s64}))
+      .moreElementsToNextPow2(0)
       .widenVectorEltsToVectorMinSize(0, 64);

   getActionDefinitionsBuilder(G_BUILD_VECTOR)
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
index 6f6cf2cc165b9..563d3d3e26edf 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
@@ -1,5 +1,5 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=aarch64-linux-gnu -O0 -run-pass=legalizer %s -o - -global-isel-abort=1 | FileCheck %s
+# RUN: llc -mtriple=aarch64-linux-gnu -O0 -run-pass=legalizer %s -o - -global-isel-abort=2 | FileCheck %s
 ---
 name: pr63826_v2s16
 body: |
@@ -216,3 +216,70 @@ body: |
     $q0 = COPY %2(<2 x s64>)
     RET_ReallyLR
 ...
+---
+name: v3s8_crash
+body: |
+  ; CHECK-LABEL: name: v3s8_crash
+  ; CHECK: bb.0:
+  ; CHECK-NEXT: successors: %bb.1(0x8000)
+  ; CHECK-NEXT: liveins: $w1, $w2, $w3, $x0
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $w2
+  ; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $w3
+  ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32)
+  ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(<3 x s8>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>)
+  ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+  ; CHECK-NEXT: [[DEF:%[0-9]+]]:_(s8) = G_IMPLICIT_DEF
+  ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s8) = G_CONSTANT i8 0
+  ; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<3 x s8>) = G_BUILD_VECTOR [[C1]](s8), [[DEF]](s8), [[DEF]](s8)
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT: successors: %bb.1(0x8000)
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+  ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s8) = G_CONSTANT i8 0
+  ; CHECK-NEXT: [[IVEC:%[0-9]+]]:_(<3 x s8>) = G_INSERT_VECTOR_ELT [[TRUNC]], [[C3]](s8), [[C2]](s64)
+  ; CHECK-NEXT: [[SHUF:%[0-9]+]]:_(<12 x s8>) = G_SHUFFLE_VECTOR [[IVEC]](<3 x s8>), [[BUILD_VECTOR1]], shufflemask(0, 3, 3, 3, 1, 3, 3, 3, 2, 3, 3, 3)
+  ; CHECK-NEXT: [[BITCAST:%[0-9]+]]:_(<3 x s32>) = G_BITCAST [[SHUF]](<12 x s8>)
+  ; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[BITCAST]](<3 x s32>)
+  ; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
+  ; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[UV]](s32), [[UV1]](s32), [[UV2]](s32), [[DEF1]](s32)
+  ; CHECK-NEXT: [[UITOFP:%[0-9]+]]:_(<4 x s32>) = G_UITOFP [[BUILD_VECTOR2]](<4 x s32>)
+  ; CHECK-NEXT: [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[UITOFP]](<4 x s32>)
+  ; CHECK-NEXT: [[BUILD_VECTOR3:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[UV3]](s32), [[UV4]](s32), [[UV5]](s32)
+  ; CHECK-NEXT: [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[BUILD_VECTOR3]](<3 x s32>)
+  ; CHECK-NEXT: G_STORE [[UV7]](s32), [[COPY]](p0) :: (store (s32), align 16)
+  ; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
+  ; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
+  ; CHECK-NEXT: G_STORE [[UV8]](s32), [[PTR_ADD]](p0) :: (store (s32) into unknown-address + 4)
+  ; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CON
[llvm-branch-commits] [llvm] 7dbd266 - [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT
Author: Amara Emerson
Date: 2024-05-10T11:50:51-07:00
New Revision: 7dbd266e89a70e96a747d8dd4aa5c6abfde15b2c

URL: https://github.com/llvm/llvm-project/commit/7dbd266e89a70e96a747d8dd4aa5c6abfde15b2c
DIFF: https://github.com/llvm/llvm-project/commit/7dbd266e89a70e96a747d8dd4aa5c6abfde15b2c.diff

LOG: [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT

We should moreElements <3 x s1> to <4 x s1> before we try to widen the
element, otherwise we end up with a <3 x s21> nonsense type.

(cherry picked from commit a01e9ce86f4c1bc9af819902db9f287b6d23f54f)

Test has been changed from original commit due to a fallback in a G_BITCAST.
Added abort=2 so we can see partial legalization and check no crash.

Added:

Modified:
    llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
    llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir

Removed:

diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 4b9d549e79114..de3c89e925a2a 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -877,6 +877,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
   getActionDefinitionsBuilder(G_INSERT_VECTOR_ELT)
       .legalIf(typeInSet(0, {v16s8, v8s8, v8s16, v4s16, v4s32, v2s32, v2s64}))
+      .moreElementsToNextPow2(0)
       .widenVectorEltsToVectorMinSize(0, 64);

   getActionDefinitionsBuilder(G_BUILD_VECTOR)
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
index 6f6cf2cc165b9..563d3d3e26edf 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir
@@ -1,5 +1,5 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=aarch64-linux-gnu -O0 -run-pass=legalizer %s -o - -global-isel-abort=1 | FileCheck %s
+# RUN: llc -mtriple=aarch64-linux-gnu -O0 -run-pass=legalizer %s -o - -global-isel-abort=2 | FileCheck %s
 ---
 name: pr63826_v2s16
 body: |
@@ -216,3 +216,70 @@ body: |
     $q0 = COPY %2(<2 x s64>)
     RET_ReallyLR
 ...
+---
+name: v3s8_crash
+body: |
+  ; CHECK-LABEL: name: v3s8_crash
+  ; CHECK: bb.0:
+  ; CHECK-NEXT: successors: %bb.1(0x8000)
+  ; CHECK-NEXT: liveins: $w1, $w2, $w3, $x0
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $w2
+  ; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $w3
+  ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32)
+  ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(<3 x s8>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>)
+  ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+  ; CHECK-NEXT: [[DEF:%[0-9]+]]:_(s8) = G_IMPLICIT_DEF
+  ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s8) = G_CONSTANT i8 0
+  ; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<3 x s8>) = G_BUILD_VECTOR [[C1]](s8), [[DEF]](s8), [[DEF]](s8)
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT: successors: %bb.1(0x8000)
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+  ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s8) = G_CONSTANT i8 0
+  ; CHECK-NEXT: [[IVEC:%[0-9]+]]:_(<3 x s8>) = G_INSERT_VECTOR_ELT [[TRUNC]], [[C3]](s8), [[C2]](s64)
+  ; CHECK-NEXT: [[SHUF:%[0-9]+]]:_(<12 x s8>) = G_SHUFFLE_VECTOR [[IVEC]](<3 x s8>), [[BUILD_VECTOR1]], shufflemask(0, 3, 3, 3, 1, 3, 3, 3, 2, 3, 3, 3)
+  ; CHECK-NEXT: [[BITCAST:%[0-9]+]]:_(<3 x s32>) = G_BITCAST [[SHUF]](<12 x s8>)
+  ; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[BITCAST]](<3 x s32>)
+  ; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
+  ; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[UV]](s32), [[UV1]](s32), [[UV2]](s32), [[DEF1]](s32)
+  ; CHECK-NEXT: [[UITOFP:%[0-9]+]]:_(<4 x s32>) = G_UITOFP [[BUILD_VECTOR2]](<4 x s32>)
+  ; CHECK-NEXT: [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[UITOFP]](<4 x s32>)
+  ; CHECK-NEXT: [[BUILD_VECTOR3:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[UV3]](s32), [[UV4]](s32), [[UV5]](s32)
+  ; CHECK-NEXT: [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[BUILD_VECTOR3]](<3 x s32>)
+  ; CHECK-NEXT: G_STORE [[UV7]](s32), [[COPY]](p0) :: (store (s32), align 16)
+  ; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
+  ; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
[llvm-branch-commits] [llvm] release/18.x: [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT - manual merge (PR #91672)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91672
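The arithmetic behind the "<3 x s21> nonsense type" in this commit message can be sketched in plain C++. This assumes the widening rule picks an element width of minimum vector size divided by element count, which is what the commit message suggests; nextPow2 and widenedEltBits are hypothetical helpers, not the legalizer's actual implementation.

```cpp
#include <cassert>

// Round an element count up to the next power of two (3 -> 4).
constexpr unsigned nextPow2(unsigned n, unsigned p = 1) {
  return p >= n ? p : nextPow2(n, p * 2);
}

// Hypothetical model (an assumption): widen each element so that the
// whole vector reaches minBits, using integer division.
constexpr unsigned widenedEltBits(unsigned numElts, unsigned minBits) {
  return minBits / numElts;
}

// Widening <3 x s1> directly to a 64-bit vector gives 64 / 3 = 21 bits.
static_assert(widenedEltBits(3, 64) == 21, "<3 x s21>");
// Padding to 4 elements first gives 64 / 4 = 16 bits.
static_assert(widenedEltBits(nextPow2(3), 64) == 16, "<4 x s16>");
```

Ordering the rules so that the element count becomes a power of two before the widening step keeps the division exact, which matches placing `.moreElementsToNextPow2(0)` ahead of `.widenVectorEltsToVectorMinSize(0, 64)` in the patch.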