[Bug tree-optimization/119402] [14/15 Regression] `((-bool) & _6) & (~_6)` is not optimized to 0 on some targets since r14-5673

2025-04-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119402

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #5 from Andrew Pinski  ---
```
(simplify
 (bit_and:c (mult:c zero_one_valued_p @0) @1)
 (with { bool wascmp; }
  (if (bitwise_inverted_equal_p (@0, @1, wascmp))
   { build_zero_cst (type); })))
```
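
The fold above rests on the fact that, for b in {0, 1}, b * x is either 0 or x, and x & ~x is always 0. A minimal C sketch that brute-force checks this over a small range (the function names are hypothetical and only for illustration; this is not GCC code):

```c
#include <assert.h>
#include <stdbool.h>

/* The shape the pattern matches: (b * x) & ~x, with b restricted to 0 or 1.  */
static unsigned mult_and_not(bool b, unsigned x)
{
    return ((unsigned) b * x) & ~x;
}

/* Returns 1 if the expression is 0 for every tested input, 0 otherwise.  */
static int check_folds_to_zero(void)
{
    for (unsigned b = 0; b <= 1; b++)
        for (unsigned x = 0; x < 1024; x++)
            if (mult_and_not(b, x) != 0)
                return 0;
    return 1;
}
```

Calling check_folds_to_zero() from a small test driver should return 1, which is the value the simplification would substitute at compile time.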


Or maybe just
```
(simplify
 (bit_and:c (mult:c zero_one_valued_p@2 @0) @1)
 (mult @2 (bit_and @0 @1)))
```

Still deciding if we want ! on the bit_and or not.


That is enough, and it even allows optimizing:
```
unsigned x(_Bool iftmp, unsigned _6)
{
  return (iftmp * _6) & (_6);
}
```

This should most likely also be handled inside re-association, by turning
zero_one_valued_p*b into (-zero_one_valued_p) & b so that re-association can
apply as well, so that:
```
unsigned x(_Bool iftmp, unsigned _6, unsigned _7)
{
  return (iftmp * _6) & ((_7) & ~_6);
}
```

Is optimized too.
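
The re-association idea depends on b * x and (-b) & x being the same value when b is 0 or 1, since -b is then either 0 or all-ones in unsigned arithmetic. Once everything is a chain of bitwise ANDs, _6 & ~_6 can meet and cancel. A hedged C sketch of both facts (function names are hypothetical and not taken from GCC):

```c
#include <assert.h>
#include <stdbool.h>

/* Multiply form: b * x with b in {0, 1}.  */
static unsigned select_by_mult(bool b, unsigned x)
{
    return (unsigned) b * x;
}

/* Mask form: (-b) & x; 0u - b is 0 or all-ones in unsigned arithmetic.  */
static unsigned select_by_mask(bool b, unsigned x)
{
    return (0u - (unsigned) b) & x;
}

/* The three-operand example from the comment above.  */
static unsigned reassoc_case(bool iftmp, unsigned _6, unsigned _7)
{
    return ((unsigned) iftmp * _6) & (_7 & ~_6);
}

/* Returns 1 if the two forms agree and the example is always 0.  */
static int check_reassoc(void)
{
    for (unsigned b = 0; b <= 1; b++)
        for (unsigned x = 0; x < 64; x++) {
            if (select_by_mult(b, x) != select_by_mask(b, x))
                return 0;
            for (unsigned y = 0; y < 64; y++)
                if (reassoc_case(b, x, y) != 0)
                    return 0;
        }
    return 1;
}
```

This only demonstrates that the transformation is value-preserving on the tested inputs; whether the canonical form should be the multiply or the mask is the open design question in the comment.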

[Bug tree-optimization/119402] [14/15 Regression] `((-bool) & _6) & (~_6)` is not optimized to 0 on some targets since r14-5673

2025-03-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119402

--- Comment #4 from Andrew Pinski  ---
(In reply to Tamar Christina from comment #3)
> 
> Seems like it's better to handle this at the GIMPLE level like we do today
> for the z case.

Yes, I agree. I was originally going to file this as an enhancement request
for doing it at the GIMPLE level, but then I noticed the difference on aarch64
between the versions, so I thought I would record that part too.

[Bug tree-optimization/119402] [14/15 Regression] `((-bool) & _6) & (~_6)` is not optimized to 0 on some targets since r14-5673

2025-03-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119402

--- Comment #3 from Tamar Christina  ---
(In reply to Jakub Jelinek from comment #2)
> Started with r14-5673-g33c2b70dbabc02788caabcbc66b7baeafeb95bcf
> With -O2 -mtune=generic it is fine even on the current trunk.

Seems like it's due to missing foldings.

The GIMPLE is the same, but when expanded the cost model now correctly tells
the backend that a multiply isn't that expensive.

So we now expand to:

(insn 7 6 8 (set (reg:SI 109 [ _2 ])
(mult:SI (reg/v:SI 106 [ iftmpD.4457 ])
(reg/v:SI 107 [ _6D.4458 ]))) "/app/example.c":4:26 -1
 (nil))

vs previous:

(insn 7 6 8 (set (reg:SI 109)
(neg:SI (reg/v:SI 106 [ iftmpD.4457 ]))) "/app/example.c":4:26 -1
 (nil))

(insn 8 7 9 (set (reg:SI 110 [ _2 ])
(and:SI (reg:SI 109)
(reg/v:SI 107 [ _6D.4458 ]))) "/app/example.c":4:26 -1
 (nil))

Which then causes:

Trying 8 -> 9:
8: r110:SI=~r107:SI
  REG_DEAD r107:SI
9: r108:SI=r109:SI&r110:SI
  REG_DEAD r110:SI
  REG_DEAD r109:SI
Successfully matched this instruction:
(set (reg:SI 108 [ _4 ])
(and:SI (not:SI (reg/v:SI 107 [ _6D.4458 ]))
(reg:SI 109 [ _2 ])))
allowing combination of insns 8 and 9
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i3 9: r108:SI=~r107:SI&r109:SI
  REG_DEAD r107:SI
  REG_DEAD r109:SI
deferring rescan insn with uid = 9.

Trying 7 -> 9:
7: r109:SI=r106:SI*r107:SI
  REG_DEAD r106:SI
9: r108:SI=~r107:SI&r109:SI
  REG_DEAD r107:SI
  REG_DEAD r109:SI
Failed to match this instruction:
(set (reg:SI 108 [ _4 ])
(and:SI (mult:SI (reg/v:SI 106 [ iftmpD.4457 ])
(reg/v:SI 107 [ _6D.4458 ]))
(not:SI (reg/v:SI 107 [ _6D.4458 ]))))

vs previous:

Trying 9 -> 10:
9: r111:SI=~r107:SI
  REG_DEAD r107:SI
   10: r108:SI=r110:SI&r111:SI
  REG_DEAD r111:SI
  REG_DEAD r110:SI
Successfully matched this instruction:
(set (reg:SI 108 [ _4 ])
(and:SI (not:SI (reg/v:SI 107 [ _6D.4458 ]))
(reg:SI 110 [ _2 ])))
allowing combination of insns 9 and 10
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 9.
modifying insn i3 10: r108:SI=~r107:SI&r110:SI
  REG_DEAD r107:SI
  REG_DEAD r110:SI
deferring rescan insn with uid = 10.

Trying 8 -> 10:
8: r110:SI=r109:SI&r107:SI
  REG_DEAD r109:SI
   10: r108:SI=~r107:SI&r110:SI
  REG_DEAD r107:SI
  REG_DEAD r110:SI
Successfully matched this instruction:
(set (reg:SI 108 [ _4 ])
(const_int 0 [0]))
allowing combination of insns 8 and 10
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 8.
deferring deletion of insn with uid = 18.
deferring deletion of insn with uid = 3.
deferring deletion of insn with uid = 2.
deferring deletion of insn with uid = 7.
modifying insn i3 10: r108:SI=0
deferring rescan insn with uid = 10.

Seems like it's better to handle this at the GIMPLE level like we do today for
the z case.