Re: [PATCH] match.pd: simplify "shift + reg EQ|NE reg"

Daniel Henrique Barboza Wed, 07 Jan 2026 13:47:43 -0800



On 1/7/2026 11:44 AM, Jeffrey Law wrote:



On 1/6/2026 3:13 PM, Andrew Pinski wrote:

On Tue, Jan 6, 2026 at 1:46 PM Daniel Barboza
<[email protected]> wrote:

Add a transformation for a nested lshift/rshift inside a plus thatcompares for

equality with the same operand of the plus. In other words:

((a OP b) + c EQ|NE c) ? x : y

becomes, for OP = (lshift, rshift) and in a scenario without overflows:

a !=/== 0 ? x : y

I think we want the transformation even if it is used outside of acond_expr.

Also the above is valid even for types that wrap; just not valid for
types that trap on overflow.

As we already have a pattern for `a + C == C`:
```
/* For equality, this is also true with wrapping overflow.  */
(for op (eq ne)
  (simplify
   (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1))
   (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
        && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
            || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))

&& (CONSTANT_CLASS_P (@0) || (single_use (@2) && single_use(@3)))

        && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2))
        && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1)))
    (op @0 { build_zero_cst (TREE_TYPE (@0)); })))
```

The problem is the above does not work as there is an single_use check
on the plus. The single use check was there since the original match
pattern was added.
I am not sure if we should add a special case where in the above
pattern @0 is a shift.
Though changing that will have to wait for GCC 17 I think.

I tried variation of that pattern, taking into account the lshift andremoving the single_use and constant_class_p conditionals:


/* Do the same but with lshift|rshift as @0.  */
(for op (eq ne)
 (simplify

(op:c (nop_convert?@3 (plus:c@2 (lshift @0 @4) (convert1? @1)))(convert2? @1))

  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
       && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
           || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
       && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2))
       && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1)))
   (op @0 { build_zero_cst (TREE_TYPE (@0)); }))))


And I got the following 'optimized' dump:


long int frob (long int x, long int y, long int z)
{
  long int ret;
  long int _1;

;; basic block 2, loop depth 0, count 1073741824 (estimated locally,freq 1.0000), maybe hot

;;    prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED)

;; pred: ENTRY [always] count:1073741824 (estimated locally,freq 1.0000) (FALLTHRU,EXECUTABLE)

  if (y_3(D) == 0)
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]

;; succ: 4 [50.0% (guessed)] count:536870912 (estimatedlocally, freq 0.5000) (TRUE_VALUE,EXECUTABLE);; 3 [50.0% (guessed)] count:536870912 (estimatedlocally, freq 0.5000) (FALSE_VALUE,EXECUTABLE)

;; basic block 3, loop depth 0, count 536870912 (estimated locally,freq 0.5000), maybe hot

;;    prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)

;; pred: 2 [50.0% (guessed)] count:536870912 (estimatedlocally, freq 0.5000) (FALSE_VALUE,EXECUTABLE)

  _1 = y_3(D) << 2;
  ret_5 = _1 + z_4(D);

;; succ: 4 [always] count:536870912 (estimated locally, freq0.5000) (FALLTHRU,EXECUTABLE)

;; basic block 4, loop depth 0, count 1073741824 (estimated locally,freq 1.0000), maybe hot

;;    prev block 3, next block 1, flags: (NEW, REACHABLE, VISITED)

;; pred: 3 [always] count:536870912 (estimated locally, freq0.5000) (FALLTHRU,EXECUTABLE);; 2 [50.0% (guessed)] count:536870912 (estimatedlocally, freq 0.5000) (TRUE_VALUE,EXECUTABLE)

  # ret_2 = PHI <ret_5(3), y_3(D)(2)>
  # VUSE <.MEM_6(D)>
  return ret_2;

;; succ: EXIT [always] count:1073741824 (estimated locally,freq 1.0000) (EXECUTABLE) test.c:8:10


}


With the pattern I sent this is the 'optimized' dump I get:


long int frob (long int x, long int y, long int z)
{
  long int ret;
  long int _1;
  _Bool _7;
  long int _8;

;; basic block 2, loop depth 0, count 1073741824 (estimated locally,freq 1.0000), maybe hot

;;    prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED)

;; pred: ENTRY [always] count:1073741824 (estimated locally,freq 1.0000) (FALLTHRU,EXECUTABLE)

  _1 = y_3(D) << 2;
  ret_5 = _1 + z_4(D);
  _7 = y_3(D) == 0;
  _8 = _7 ? 0 : ret_5;
  # VUSE <.MEM_6(D)>
  return _8;

;; succ: EXIT [always] count:1073741824 (estimated locally,freq 1.0000) (EXECUTABLE) test.c:8:10

I'm not sure why we're having this difference given that the patternsare quite similar.

By the way my initial idea for upstreaming was to split the pattern intwo. One like this:


X << N EQ|NE 0 -> X EQ|NE 0

And another one like the original I sent but returning the full lshiftcomparison instead of doing the reduction, i.e.:


(a OP b) + c EQ|NE c -> (a OP b) EQ|NE 0


But I got a lot of trouble with errors like:

Analyzing compilation unit
Performing interprocedural optimizations

<*free_lang_data> {heap 1028k} <visibility> {heap 1028k}<build_ssa_passes> {heap 1028k} <targetclone> {heap 1440k} <opt_lo

cal_passes> {heap 1440k}during GIMPLE pass: ccp
dump file: test.c.038t.ccp1
test.c: In function ‘frob’:
test.c:9:1: internal compiler error: in decompose, at wide-int.h:1049
    9 | }
      | ^
0x2ecc34f internal_error(char const*, ...)
        ../../gcc/diagnostic-global-context.cc:787
0xf0790f fancy_abort(char const*, int, char const*)
        ../../gcc/diagnostics/context.cc:1805

0xa5b276 wi::int_traits<generic_wide_int<wide_int_ref_storage<false,false> > >::decompose(long*, unsigned int, generic_wide

_int<wide_int_ref_storage<false, false> > const&)
        ../../gcc/wide-int.h:1049

0xa5d419 wi::int_traits<generic_wide_int<wide_int_storage>>::decompose(long*, unsigned int, generic_wide_int<wide_int_stora

ge> const&)
        ../../gcc/tree.h:3906

0xa5d419 wide_int_ref_storage<true,false>::wide_int_ref_storage<generic_wide_int<wide_int_storage>>(generic_wide_int<wide_

int_storage> const&, unsigned int)
        ../../gcc/wide-int.h:1099

0xa5d419 generic_wide_int<wide_int_ref_storage<true, false>>::generic_wide_int<generic_wide_int<wide_int_storage> >(generic

_wide_int<wide_int_storage> const&, unsigned int)
        ../../gcc/wide-int.h:855

My guess is that I was messing up the relation between the 'type'precision and the precision used in build_zero_cst(). I gave up and

sent the full pattern, but I wonder if splitting it would be better.


Thanks,

Daniel

Jeff/Richard,
   What are your thoughts on this? I know Richard had thoughts on some
of the single_use in the match patterns before but I have not tracked
them that well.
I had no idea we had that match.pd pattern; clearly what Daniel is doingis closely related.
The single_use in those patterns comes up here:
https://gcc.gnu.org/pipermail/gcc-patches/2017-October/484606.html
Looks like single use is in there because of interactions with anotherpattern.
jeff

Re: [PATCH] match.pd: simplify "shift + reg EQ|NE reg"

Reply via email to