Uros Bizjak <[email protected]> writes:
> On Sat, Aug 2, 2025 at 8:56 PM H.J. Lu <[email protected]> wrote:
>>
>> On Fri, Aug 1, 2025 at 10:32 PM Uros Bizjak <[email protected]> wrote:
>> >
>> > On Sat, Aug 2, 2025 at 3:22 AM H.J. Lu <[email protected]> wrote:
>> > >
>> > > After
>> > >
>> > > commit 965564eafb721f8000013a3112f1bba8d8fae32b
>> > > Author: Richard Sandiford <[email protected]>
>> > > Date: Tue Jul 29 15:58:34 2025 +0100
>> > >
>> > > simplify-rtx: Simplify subregs of logic ops
>> > >
>> > > combine generates
>> > >
>> > > (set (zero_extract:SI (reg/v:SI 101 [ a ])
>> > > (const_int 8 [0x8])
>> > > (const_int 8 [0x8]))
>> > > (not:SI (sign_extract:SI (reg:SI 107 [ b ])
>> > > (const_int 8 [0x8])
>> > > (const_int 8 [0x8]))))
>> > >
>> > > instead of
>> > >
>> > > (set (zero_extract:SI (reg/v:SI 101 [ a ])
>> > > (const_int 8 [0x8])
>> > > (const_int 8 [0x8]))
>> > > (subreg:SI (not:QI (subreg:QI (sign_extract:SI (reg:SI 107 [ b ])
>> > > (const_int 8 [0x8])
>> > > (const_int 8 [0x8])) 0)) 0))
>> > >
>> > > Add *one_cmplqi_ext<mode>_2 to support the new pattern.
>> > >
>> > > PR target/121306
>> > > * config/i386/i386.md (*one_cmplqi_ext<mode>_2): New.
>> >
>> > Why not just change the old pattern? I'd expect that the old form is
>> > now obsolete.
>> >
>>
>> *one_cmplqi_ext<mode>_1 is still needed. Otherwise combine will
>> fail to match this instruction:
>>
>> (set (zero_extract:SI (reg/v:SI 102 [ a ])
>> (const_int 8 [0x8])
>> (const_int 8 [0x8]))
>> (subreg:SI (not:QI (subreg:QI (reg:SI 105 [ _2 ]) 0)) 0))
>
> Do you perhaps have a testcase that still shows this combination?
> Perhaps the author of the simplification (CC'd) would be interested in
> this missing case.
>
> I was under the impression that the new simplification should always
> trigger, there is no point in having it if the backend still has to
> provide simplified and non-simplified patterns.
>
> There are also other similar logic patterns in i386.md that would have
> to be amended.
Sorry, I hadn't realised that there were still unfixed regressions
from that patch. I suppose if we wanted to avoid two patterns here,
we'd need to extend the pre-existing word_mode folds to support
subword modes too (for !WORD_REGISTER_OPERATIONS). The attached
untested patch does that, but I expect it would have similar
knock-on effects. I'll give it a spin overnight on x86 anyway
just to see what happens.
Richard
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index c723a07f06b..0374fec1d9c 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -8333,27 +8333,33 @@ simplify_context::simplify_subreg (machine_mode
outermode, rtx op,
return XEXP (XEXP (op, 0), 0);
}
- /* Attempt to simplify WORD_MODE SUBREGs of bitwise expressions. */
- if (outermode == word_mode
- && (GET_CODE (op) == IOR || GET_CODE (op) == XOR || GET_CODE (op) == AND)
- && SCALAR_INT_MODE_P (innermode))
- {
- rtx op0 = simplify_subreg (outermode, XEXP (op, 0), innermode, byte);
- rtx op1 = simplify_subreg (outermode, XEXP (op, 1), innermode, byte);
- if (op0 && op1)
- return simplify_gen_binary (GET_CODE (op), outermode, op0, op1);
- }
-
- /* Attempt to simplify WORD_MODE SUBREGs of unary bitwise expression. */
- if (outermode == word_mode && GET_CODE (op) == NOT
- && SCALAR_INT_MODE_P (innermode))
+ /* Attempt to simplify WORD_MODE and sub-WORD_MODE SUBREGs of bitwise
+ expressions. */
+ scalar_int_mode int_outermode;
+ if (is_a<scalar_int_mode> (outermode, &int_outermode)
+ && (WORD_REGISTER_OPERATIONS
+ ? int_outermode == word_mode
+ : GET_MODE_PRECISION (int_outermode) <= BITS_PER_WORD)
+ && SCALAR_INT_MODE_P (innermode)
+ && (GET_CODE (op) == IOR
+ || GET_CODE (op) == XOR
+ || GET_CODE (op) == AND
+ || GET_CODE (op) == NOT))
{
rtx op0 = simplify_subreg (outermode, XEXP (op, 0), innermode, byte);
if (op0)
- return simplify_gen_unary (GET_CODE (op), outermode, op0, outermode);
+ {
+ if (GET_CODE (op) == NOT)
+ return simplify_gen_unary (GET_CODE (op), outermode,
+ op0, outermode);
+
+ rtx op1 = simplify_subreg (outermode, XEXP (op, 1), innermode, byte);
+ if (op1)
+ return simplify_gen_binary (GET_CODE (op), outermode, op0, op1);
+ }
}
- scalar_int_mode int_outermode, int_innermode;
+ scalar_int_mode int_innermode;
if (is_a <scalar_int_mode> (outermode, &int_outermode)
&& is_a <scalar_int_mode> (innermode, &int_innermode)
&& known_eq (byte, subreg_lowpart_offset (int_outermode, int_innermode)))
--
2.43.0