Re: [PATCH] match.pd: Optimize (A ^ B) & C ^ B to (A & C)|(B & ~C) late in pipeline

Segher Boessenkool Mon, 15 Dec 2025 09:35:23 -0800

Hi!

On Thu, Dec 11, 2025 at 12:30:43AM +0530, Kishan Parmar wrote:
> The reason for rejecting INTEGER_CST is to avoid regressing rs6000 single
> rotate-and-insert/mask instruction generation.


That code is tuned a lot to work well with everything else we have.  If
the basics are changed again now, more complex things that build on it
will have to be adjusted again as well.

> When @2 is a constant (a mask), the existing RTL infrastructure
> (simplify-rtx) handles the canonical XOR-form well.

Please don't call it canonical.  It very much is not.
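
For reference, the two forms named in the subject line compute the same
value: each result bit comes from A where C has a 1 and from B where C
has a 0.  A minimal C sketch of that identity follows; the function names
are made up for illustration and are not GCC identifiers.

```c
#include <stdint.h>

/* XOR form: ((A ^ B) & C) ^ B.
   Where a bit of C is 0 the AND clears it and the final XOR restores B;
   where a bit of C is 1 the two XORs with B cancel and A is left.  */
static uint32_t select_xor_form(uint32_t a, uint32_t b, uint32_t c)
{
  return ((a ^ b) & c) ^ b;
}

/* IOR form: (A & C) | (B & ~C).
   The two operands of the IOR never have a bit in common, which is why
   this form reads directly as "insert A into B under mask C".  */
static uint32_t select_ior_form(uint32_t a, uint32_t b, uint32_t c)
{
  return (a & c) | (b & ~c);
}
```

Both functions return 0x5555AAAA for A = 0xAAAAAAAA, B = 0x55555555,
C = 0x0000FFFF.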

> Specifically, the combiner can match the sequence (rotate -> xor -> and ->
> xor) and merge it into a rotate + (rotate-and-insert/mask) instruction.
>
> If we force the IOR form (A & C) | (B & ~C) in GIMPLE for constants, the
> RTL combiner fails to match the single rotate-and-insert pattern.
> It often greedily simplifies the (ROTATE + AND) part into a simple logical
> shift (lshiftrt), breaking the sequence.

That is the same machine instruction though (just an rlwinm).  Why is
one way of writing it not recognised?  Is this another thing where
subregs of shifts are not properly simplified?
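
As an illustration of why the two spellings are the same operation: on
32-bit values, a logical shift right by n equals a rotate left by 32 - n
followed by an AND that clears the top n bits.  The sketch below only
shows that arithmetic, assuming 0 <= n <= 31; the helper names are
hypothetical, and it says nothing about how the RTL in question actually
looks.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is taken modulo 32).  */
static uint32_t rotl32(uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Logical shift right written as rotate-and-mask, for 0 <= n <= 31.
   The rotate brings the wanted bits into place and wraps the low n bits
   around to the top; the mask then clears those wrapped bits.  */
static uint32_t lshiftrt_via_rotate_and(uint32_t x, unsigned n)
{
  return rotl32(x, 32 - n) & (0xFFFFFFFFu >> n);
}
```

For x = 0x12345678 and n = 8, the rotate gives 0x78123456 and the mask
0x00FFFFFF leaves 0x00123456, which is x >> 8.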

> Attempting to match this new sequence in the combiner is difficult because
> the middle-end attempts to simplify the shift/mask operation into a
> ZERO_EXTRACT.

If it does that, that is a serious bug.

> Since the RS6000 backend does not generally accept ZERO_EXTRACT
> for these integer operations, the match fails, and we lose the rlwimi kind of 
> instructions (regressing to 3 separate instructions).

We do not have any instructions that can implement ZERO_EXTRACT things,
only things that are a million times more generic (and useful!).  Writing
some cases as ZERO_EXTRACT means we would need exponentially more insn
patterns.
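
To make the "more generic" point concrete: extracting an unsigned
bit-field is one special case of rotate-then-AND-with-mask, namely the
case where the mask is a run of low-order ones.  The C sketch below
assumes little-endian bit numbering (pos counts from the least
significant bit), 0 <= pos <= 31, 1 <= len <= 32, and a field that does
not run past bit 31.  The names are hypothetical and this is not how GCC
or the rs6000 backend represents any of it.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is taken modulo 32).  */
static uint32_t rotl32(uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Generic operation: rotate left, then AND with an arbitrary mask.  */
static uint32_t rotate_and_mask(uint32_t x, unsigned rot, uint32_t mask)
{
  return rotl32(x, rot) & mask;
}

/* Unsigned bit-field extract of len bits starting at bit pos, written as
   one instance of rotate_and_mask: rotate the field down to bit 0 and
   keep only the low len bits.  */
static uint32_t extract_field(uint32_t x, unsigned len, unsigned pos)
{
  uint32_t low_ones = (len == 32) ? 0xFFFFFFFFu : ((1u << len) - 1);
  return rotate_and_mask(x, (32 - pos) & 31, low_ones);
}
```

The generic form also covers cases a plain field extract does not, such
as leaving the field at a nonzero position in the result:
rotate_and_mask(0x12345678, 8, 0x00FF0000) gives 0x00560000.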

Nowhere in GCC is ZERO_EXTRACT required, so thankfully this is all a
moot point.


Segher
