In pr123010 we have a case where we should be getting a single slliw,
but instead we get a 3-insn sequence.
As noted in the BZ we had this before combine:
(insn 6 3 13 2 (set (reg:DI 137)
(sign_extend:DI (ashift:SI (subreg/s/u:SI (reg/v:DI 135 [ a ]) 0)
(const_int 1 [0x1])))) "j.c":10:14 312 {ashlsi3_extend}
(expr_list:REG_DEAD (reg/v:DI 135 [ a ])
(nil)))
Which is exactly what we want. That's a single slliw instruction. Then
combine generates this:
(insn 6 3 13 2 (parallel [
(set (reg:DI 137)
(sign_extract:DI (reg:DI 139 [ a ])
(const_int 31 [0x1f])
(const_int 0 [0])))
(clobber (scratch:DI))
]) "j.c":10:14 333 {*extractdi3}
(expr_list:REG_DEAD (reg:DI 139 [ a ])
(nil)))
(insn 13 6 14 2 (set (reg/i:DI 10 a0)
(ashift:DI (reg:DI 137)
(const_int 1 [0x1]))) "j.c":11:1 297 {ashldi3}
(expr_list:REG_DEAD (reg:DI 137)
(nil)))
Which is due to a define_insn_and_split mis-behaving a bit.
The first approach was to define an insn for the case where we left
shift a sign extracted bitfield where the sign bit of the bitfield gets
shifted into bit 31. Theory being this might be a reasonably common
occurrence and having a pattern for it might be useful (and there's a
similar pattern one could write for a small number of zero extended
fields getting shifted left as well). That turns out to be a problem
though as the sign extension is obfuscated making it harder to track the
state of the sign bits and thus harder to eliminate later sign
extensions. Those regressions can be fixed, but doing so requires the
revamp of the sign/zero extension patterns to eliminate several of the
define_insn_and_splits. Larger change than I really want to do right now.
We could also throttle back the most problematic define_insn_and_split.
It's likely viable, though probably not the best use of time given my
desire to clean up the define_insn_and_splits, including this one.
We can also recognize this case and simplify it. Essentially when we
have (ashift (sign_extract ...)) recognize the case when we're
extracting a bitfield starting at bit 0 and the bit field is shifted
such that the sign bit of the bitfield moves into bit position 7, 15 or
31. In that case we can simplify it to (sign_extend (ashift ...)).
This patch takes the last of those three approaches. Bootstrapped and
regression tested on x86_64, and riscv (BPI and Pioneer) as well as
going through all the embedded targets without regressions. I was
somewhat worried about loongarch due to a pattern in its machine
description, but the two tests added with that pattern still pass, so it
seems OK too.
Pushing to the trunk.
Jeff
commit bde7b018d4557538cdc8bef57ab073f3ec0d25b8
Author: Jeff Law <[email protected]>
Date: Sun Jan 4 09:30:01 2026 -0700
[PR target/123010] Simplify shift of sign extracted field to a sign extending shift
In pr123010 we have a case where we should be getting a single slliw, but
instead we get a 3-insn sequence.
As noted in the BZ we had this before combine:
> (insn 6 3 13 2 (set (reg:DI 137)
> (sign_extend:DI (ashift:SI (subreg/s/u:SI (reg/v:DI 135 [ a ]) 0)
> (const_int 1 [0x1])))) "j.c":10:14 312 {ashlsi3_extend}
> (expr_list:REG_DEAD (reg/v:DI 135 [ a ])
> (nil)))
Which is exactly what we want. That's a single slliw instruction. Then
combine generates this:
> (insn 6 3 13 2 (parallel [
> (set (reg:DI 137)
> (sign_extract:DI (reg:DI 139 [ a ])
> (const_int 31 [0x1f])
> (const_int 0 [0])))
> (clobber (scratch:DI))
> ]) "j.c":10:14 333 {*extractdi3}
> (expr_list:REG_DEAD (reg:DI 139 [ a ])
> (nil)))
> (insn 13 6 14 2 (set (reg/i:DI 10 a0)
> (ashift:DI (reg:DI 137)
> (const_int 1 [0x1]))) "j.c":11:1 297 {ashldi3}
> (expr_list:REG_DEAD (reg:DI 137)
> (nil)))
Which is due to a define_insn_and_split mis-behaving a bit.
The first approach was to define an insn for the case where we left shift a
sign extracted bitfield where the sign bit of the bitfield gets shifted into
bit 31. Theory being this might be a reasonably common occurrence and having
a pattern for it might be useful (and there's a similar pattern one could
write for a small number of zero extended fields getting shifted left as
well). That turns out to be a problem though as the sign extension is
obfuscated making it harder to track the state of the sign bits and thus
harder to eliminate later sign extensions. Those regressions can be fixed,
but doing so requires the revamp of the sign/zero extension patterns to
eliminate several of the define_insn_and_splits. Larger change than I really
want to do right now.
We could also throttle back the most problematic define_insn_and_split.
It's likely viable, though probably not the best use of time given my desire
to clean up the define_insn_and_splits, including this one.
We can also recognize this case and simplify it. Essentially when we have
(ashift (sign_extract ...)) recognize the case when we're extracting a
bitfield starting at bit 0 and the bitfield is shifted such that the sign
bit of the bitfield moves into bit position 7, 15 or 31. In that case we can
simplify it to (sign_extend (ashift ...)).
This patch takes the last of those three approaches. Bootstrapped and
regression tested on x86_64, and riscv (BPI and Pioneer) as well as going
through all the embedded targets without regressions. I was somewhat worried
about loongarch due to a pattern in its machine description, but the two
tests added with that pattern still pass, so it seems OK too.
PR target/123010
gcc/
	* simplify-rtx.cc (simplify_binary_operation_1, case ASHIFT): Simplify
	case where a left shift of the sign extracted field can be turned into
	a sign extension of a left shift.
gcc/testsuite
	* gcc.target/riscv/pr123010.c: New test.
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index b9843bd3078..75655787202 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -4793,6 +4793,43 @@ simplify_ashift:
rtx new_op1 = immed_wide_int_const (c, int_mode);
return simplify_gen_binary (MULT, int_mode, op0, new_op1);
}
+
+ /* If we're shifting left a signed bitfield extraction and the
+ shift count + bitfield size is a natural integral mode and
+ the field starts at offset 0 (counting from the LSB), then
+ this can be simplified to a sign extension of a left shift.
+
+ Some ISAs (RISC-V 64-bit) have inherent support for such
+ instructions and it's better for various optimizations to
+ express as a SIGN_EXTEND rather than a shifted SIGN_EXTRACT. */
+ if (GET_CODE (op0) == SIGN_EXTRACT
+ && REG_P (XEXP (op0, 0))
+ /* The size of the bitfield, the location of the bitfield and
+ shift count must be CONST_INTs. */
+ && CONST_INT_P (op1)
+ && CONST_INT_P (XEXP (op0, 1))
+ && CONST_INT_P (XEXP (op0, 2)))
+ {
+ int size = INTVAL (op1) + INTVAL (XEXP (op0, 1));
+ machine_mode smaller_mode;
+ /* Now we need to verify the size of the bitfield plus the shift
+ count is an integral mode and smaller than MODE. This is a
+ requirement for using SIGN_EXTEND. We also need to verify the
+ field starts at bit location 0 and that the subreg lowpart also
+ starts at zero. */
+ if (int_mode_for_size (size, size).exists (&smaller_mode)
+ && mode > smaller_mode
+ && (subreg_lowpart_offset (smaller_mode, mode).to_constant ()
+ == UINTVAL (XEXP (op0, 2)))
+ && XEXP (op0, 2) == CONST0_RTX (mode))
+ {
+ /* Everything passed. So we just need to get the subreg of the
+ original input, shift it and sign extend the result. */
+ rtx op = gen_lowpart (smaller_mode, XEXP (op0, 0));
+ rtx x = gen_rtx_ASHIFT (smaller_mode, op, op1);
+ return gen_rtx_SIGN_EXTEND (mode, x);
+ }
+ }
goto canonicalize_shift;
case LSHIFTRT:
diff --git a/gcc/testsuite/gcc.target/riscv/pr123010.c b/gcc/testsuite/gcc.target/riscv/pr123010.c
new file mode 100644
index 00000000000..f7fe5a4484d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr123010.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target rv64 } } */
+/* { dg-options "-O3 -march=rv64gc -mabi=lp64" { target rv64 } } */
+#include <stdint-gcc.h>
+
+#define N 2
+
+uint32_t mulu(uint32_t a) {
+ return a * N;
+}
+
+int32_t muls(int32_t a) {
+ return a * N;
+}
+
+/* { dg-final { scan-assembler-times "slliw\t" 2 { target rv64 } } } */
+/* { dg-final { scan-assembler-not "sr\[al\]i\t" } } */
+