On 1/7/2026 11:44 AM, Jeffrey Law wrote:


On 1/6/2026 3:13 PM, Andrew Pinski wrote:
On Tue, Jan 6, 2026 at 1:46 PM Daniel Barboza
<[email protected]> wrote:
Add a transformation for a nested lshift/rshift inside a plus that compares for
equality with the same operand of the plus. In other words:

((a OP b) + c EQ|NE c) ? x : y

becomes, for OP = (lshift, rshift) and in a scenario without overflows:

a !=/== 0 ? x : y
I think we want the transformation even if it is used outside of a cond_expr.
Also the above is valid even for types that wrap; just not valid for
types that trap on overflow.

As we already have a pattern for `a + C == C`:
```
/* For equality, this is also true with wrapping overflow.  */
(for op (eq ne)
  (simplify
   (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1))
   (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
        && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
            || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
        && (CONSTANT_CLASS_P (@0) || (single_use (@2) && single_use (@3)))
        && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2))
        && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1)))
    (op @0 { build_zero_cst (TREE_TYPE (@0)); })))
```

The problem is the above does not work as there is an single_use check
on the plus. The single use check was there since the original match
pattern was added.
I am not sure if we should add a special case where in the above
pattern @0 is a shift.
Though changing that will have to wait for GCC 17 I think.


I tried variation of that pattern, taking into account the lshift and removing the single_use and constant_class_p conditionals:

/* Do the same but with lshift|rshift as @0.  */
(for op (eq ne)
 (simplify
(op:c (nop_convert?@3 (plus:c@2 (lshift @0 @4) (convert1? @1))) (convert2? @1))
  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
       && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
           || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
       && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2))
       && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1)))
   (op @0 { build_zero_cst (TREE_TYPE (@0)); }))))


And I got the following 'optimized' dump:


long int frob (long int x, long int y, long int z)
{
  long int ret;
  long int _1;

;; basic block 2, loop depth 0, count 1073741824 (estimated locally, freq 1.0000), maybe hot
;;    prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED)
;; pred: ENTRY [always] count:1073741824 (estimated locally, freq 1.0000) (FALLTHRU,EXECUTABLE)
  if (y_3(D) == 0)
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]
;; succ: 4 [50.0% (guessed)] count:536870912 (estimated locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE) ;; 3 [50.0% (guessed)] count:536870912 (estimated locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE)

;; basic block 3, loop depth 0, count 536870912 (estimated locally, freq 0.5000), maybe hot
;;    prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)
;; pred: 2 [50.0% (guessed)] count:536870912 (estimated locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE)
  _1 = y_3(D) << 2;
  ret_5 = _1 + z_4(D);
;; succ: 4 [always] count:536870912 (estimated locally, freq 0.5000) (FALLTHRU,EXECUTABLE)

;; basic block 4, loop depth 0, count 1073741824 (estimated locally, freq 1.0000), maybe hot
;;    prev block 3, next block 1, flags: (NEW, REACHABLE, VISITED)
;; pred: 3 [always] count:536870912 (estimated locally, freq 0.5000) (FALLTHRU,EXECUTABLE) ;; 2 [50.0% (guessed)] count:536870912 (estimated locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE)
  # ret_2 = PHI <ret_5(3), y_3(D)(2)>
  # VUSE <.MEM_6(D)>
  return ret_2;
;; succ: EXIT [always] count:1073741824 (estimated locally, freq 1.0000) (EXECUTABLE) test.c:8:10

}


With the pattern I sent this is the 'optimized' dump I get:


long int frob (long int x, long int y, long int z)
{
  long int ret;
  long int _1;
  _Bool _7;
  long int _8;

;; basic block 2, loop depth 0, count 1073741824 (estimated locally, freq 1.0000), maybe hot
;;    prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED)
;; pred: ENTRY [always] count:1073741824 (estimated locally, freq 1.0000) (FALLTHRU,EXECUTABLE)
  _1 = y_3(D) << 2;
  ret_5 = _1 + z_4(D);
  _7 = y_3(D) == 0;
  _8 = _7 ? 0 : ret_5;
  # VUSE <.MEM_6(D)>
  return _8;
;; succ: EXIT [always] count:1073741824 (estimated locally, freq 1.0000) (EXECUTABLE) test.c:8:10


I'm not sure why we're having this difference given that the patterns are quite similar.

By the way my initial idea for upstreaming was to split the pattern in two. One like this:

X << N EQ|NE 0 -> X EQ|NE 0

And another one like the original I sent but returning the full lshift comparison instead of doing the reduction, i.e.:

(a OP b) + c EQ|NE c -> (a OP b) EQ|NE 0


But I got a lot of trouble with errors like:

Analyzing compilation unit
Performing interprocedural optimizations
<*free_lang_data> {heap 1028k} <visibility> {heap 1028k} <build_ssa_passes> {heap 1028k} <targetclone> {heap 1440k} <opt_lo
cal_passes> {heap 1440k}during GIMPLE pass: ccp
dump file: test.c.038t.ccp1
test.c: In function ‘frob’:
test.c:9:1: internal compiler error: in decompose, at wide-int.h:1049
    9 | }
      | ^
0x2ecc34f internal_error(char const*, ...)
        ../../gcc/diagnostic-global-context.cc:787
0xf0790f fancy_abort(char const*, int, char const*)
        ../../gcc/diagnostics/context.cc:1805
0xa5b276 wi::int_traits<generic_wide_int<wide_int_ref_storage<false, false> > >::decompose(long*, unsigned int, generic_wide
_int<wide_int_ref_storage<false, false> > const&)
        ../../gcc/wide-int.h:1049
0xa5d419 wi::int_traits<generic_wide_int<wide_int_storage> >::decompose(long*, unsigned int, generic_wide_int<wide_int_stora
ge> const&)
        ../../gcc/tree.h:3906
0xa5d419 wide_int_ref_storage<true, false>::wide_int_ref_storage<generic_wide_int<wide_int_storage> >(generic_wide_int<wide_
int_storage> const&, unsigned int)
        ../../gcc/wide-int.h:1099
0xa5d419 generic_wide_int<wide_int_ref_storage<true, false> >::generic_wide_int<generic_wide_int<wide_int_storage> >(generic
_wide_int<wide_int_storage> const&, unsigned int)
        ../../gcc/wide-int.h:855

My guess is that I was messing up the relation between the 'type' precision and the precision used in build_zero_cst(). I gave up and
sent the full pattern, but I wonder if splitting it would be better.


Thanks,

Daniel





Jeff/Richard,
   What are your thoughts on this? I know Richard had thoughts on some
of the single_use in the match patterns before but I have not tracked
them that well.

I had no idea we had that match.pd pattern; clearly what Daniel is doing is closely related.

The single_use in those patterns comes up here:
https://gcc.gnu.org/pipermail/gcc-patches/2017-October/484606.html

Looks like single use is in there because of interactions with another pattern.

jeff






Reply via email to