[Bug tree-optimization/112566] Some ctz/popcount/parity/ffs optimizations

2023-11-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112566

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:6dd4c703be17fa5dd56136d067e7fdc4a61584b3

commit r14-5557-g6dd4c703be17fa5dd56136d067e7fdc4a61584b3
Author: Jakub Jelinek 
Date:   Fri Nov 17 15:10:51 2023 +0100

match.pd: Optimize ctz/popcount/parity/ffs on extended argument [PR112566]

ctz(ext(X)) is the same as ctz(X) in the UB on zero case (or could be also
in the 2 argument case on large BITINT_TYPE by preserving the argument, not
implemented in this patch),
popcount(zext(X)) is the same as popcount(X),
parity(zext(X)) is the same as parity(X),
parity(sext(X)) is the same as parity(X) provided the bit difference between
the extended and unextended types is even,
ffs(ext(X)) is the same as ffs(X).
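
As a rough illustration of the identities listed above (not part of the patch or its testsuite), the following sketch compares the GCC builtins on an int argument against the same builtins on its zero- and sign-extended long long value. The helper names ext_identities_hold and check_ext_identities are made up for this example, and it assumes long long is wider than int by an even number of bits (32 on common targets).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 if the identities hold for X.
   Assumes the width difference between long long and int is even.  */
static int ext_identities_hold(int x)
{
    unsigned int ux = (unsigned int)x;
    unsigned long long zx = ux;                               /* zext(X) */
    unsigned long long sx = (unsigned long long)(long long)x; /* sext(X) */
    int ok = 1;

    /* Zero extension only adds zero bits.  */
    ok &= __builtin_popcountll(zx) == __builtin_popcount(ux);
    ok &= __builtin_parityll(zx) == __builtin_parity(ux);
    /* Sign extension adds an even number of copies of the sign bit.  */
    ok &= __builtin_parityll(sx) == __builtin_parity(ux);
    /* ffs is defined for zero, so no guard is needed.  */
    ok &= __builtin_ffsll((long long)zx) == __builtin_ffs(x);
    ok &= __builtin_ffsll((long long)x) == __builtin_ffs(x);
    /* ctz is undefined for zero, so only compare for nonzero X.  */
    if (x != 0) {
        ok &= __builtin_ctzll(zx) == __builtin_ctz(ux);
        ok &= __builtin_ctzll(sx) == __builtin_ctz(ux);
    }
    return ok;
}

/* Checks a few sample values, including the sign-bit edge cases.  */
static void check_ext_identities(void)
{
    static const int samples[] = { 0, 1, -1, 0x1234, -8, INT_MAX, INT_MIN };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(ext_identities_hold(samples[i]));
}
```

Note that popcount(sext(X)) is deliberately absent: sign extension of a negative value adds set bits, so only the parity survives, and only when the number of added bits is even.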

The following patch optimizes those in match.pd if those are beneficial
(always in the large BITINT_TYPE case, or if the narrower type has optab
and the wider doesn't, or the wider is larger than word and narrower is
one of the standard argument sizes (tested just int and long long, as
long is on most targets same bitsize as one of those two)).

Joseph in the PR mentioned that ctz(narrow(X)) is the same as ctz(X)
if UB on 0, but that can be handled incrementally (and would need different
decisions when it is profitable).
And clz(zext(X)) is clz(X) + bit_difference, but not sure we want to change
that in match.pd at all, perhaps during insn selection?
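
The clz relation mentioned above can be checked with a small sketch like the one below. It is only an illustration, not something the patch implements; clz_via_narrow and check_clz_zext are hypothetical names, and the argument must be nonzero because __builtin_clz is undefined for zero.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: clz of the zero-extended value, computed from the
   narrow clz plus the width difference.  X must be nonzero.  */
static int clz_via_narrow(unsigned int x)
{
    int bit_difference =
        (int)((sizeof(unsigned long long) - sizeof(unsigned int)) * CHAR_BIT);
    return __builtin_clz(x) + bit_difference;
}

/* Compares against clz on the zero-extended value for a few samples.  */
static void check_clz_zext(void)
{
    static const unsigned int samples[] = { 1u, 2u, 0x1234u, 0x80000000u, UINT_MAX };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        unsigned long long wide = samples[i];   /* zext(X) */
        assert(__builtin_clzll(wide) == clz_via_narrow(samples[i]));
    }
}
```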

2023-11-17  Jakub Jelinek  

PR tree-optimization/112566
PR tree-optimization/83171
* match.pd (ctz(ext(X)) -> ctz(X), popcount(zext(X)) -> popcount(X),
parity(ext(X)) -> parity(X), ffs(ext(X)) -> ffs(X)): New simplifications.
(__builtin_ffs (X) == 0 -> X == 0): Use FFS rather than
BUILT_IN_FFS BUILT_IN_FFSL BUILT_IN_FFSLL BUILT_IN_FFSIMAX.

* gcc.dg/pr112566-1.c: New test.
* gcc.dg/pr112566-2.c: New test.
* gcc.target/i386/pr78057.c (foo): Pass another long long argument
and use it in __builtin_ia32_*zcnt_u64 instead of the int one.

2023-11-16 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112566

--- Comment #4 from joseph at codesourcery dot com  ---
On Thu, 16 Nov 2023, jakub at gcc dot gnu.org via Gcc-bugs wrote:

> ctz(ext(x)) == ctz(x) if UB on zero,

In one direction, this should also be true for a narrowing conversion 
(changing ctz(narrow(x)) to ctz(x) might remove UB if x is nonzero but 
narrows to zero, but won't introduce UB, or change the result if narrow(x) 
is nonzero).
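
The one-directional nature of the narrowing case can be sketched as follows. This is an illustration under the assumption that unsigned int is narrower than unsigned long long; ctz_narrow_matches_wide is a hypothetical helper that returns -1 when the narrowed value is zero, since calling __builtin_ctz on it would be undefined.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if ctz(narrow(X)) equals ctz(X),
   0 if they differ, and -1 if narrow(X) is zero (ctz would be UB).  */
static int ctz_narrow_matches_wide(unsigned long long x)
{
    unsigned int narrow = (unsigned int)x;   /* narrow(X): drops high bits */
    if (narrow == 0)
        return -1;
    return __builtin_ctz(narrow) == __builtin_ctzll(x);
}

static void check_ctz_narrow(void)
{
    /* Lowest set bit survives the narrowing, so both results agree.  */
    assert(ctz_narrow_matches_wide(0x100000010ULL) == 1);
    assert(ctz_narrow_matches_wide(1ULL) == 1);
    /* X is nonzero but narrows to zero: ctz(narrow(X)) would be UB,
       while ctz(X) is well defined.  */
    assert(ctz_narrow_matches_wide(1ULL << 40) == -1);
    assert(__builtin_ctzll(1ULL << 40) == 40);
}
```

Replacing ctz(narrow(X)) with ctz(X) is therefore safe, while the reverse replacement could introduce undefined behavior.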

2023-11-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112566

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83171

--- Comment #3 from Andrew Pinski  ---
> popcount(zext(x)) == popcount(x),
That is PR 83171.

2023-11-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112566

Jakub Jelinek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-11-16
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Created attachment 56605
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56605&action=edit
gcc14-pr112566.patch

Untested implementation.

2023-11-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112566

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jsm28 at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
The idea came from looking at Joseph's proposed stdbit.h implementation, whose
type-generic macros sometimes just use the long long version
unconditionally.
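
To make the motivation concrete, here is a minimal sketch of such a macro. It is not taken from the proposed stdbit.h implementation; MY_COUNT_ONES and check_count_ones are hypothetical names, and the macro is assumed to be used with unsigned arguments only, so the implicit conversion to unsigned long long is a zero extension.

```c
#include <assert.h>

/* Hypothetical type-generic macro that always uses the long long builtin.
   Intended for unsigned arguments only: a negative signed argument would
   be sign-extended and give a different count.  */
#define MY_COUNT_ONES(x) ((unsigned int)__builtin_popcountll(x))

static void check_count_ones(void)
{
    unsigned char  c = 0xA5;        /* 4 bits set */
    unsigned short s = 0xFFFF;      /* 16 bits set */
    unsigned int   u = 0x80000001u; /* 2 bits set */

    assert(MY_COUNT_ONES(c) == 4u);
    assert(MY_COUNT_ONES(s) == 16u);
    assert(MY_COUNT_ONES(u) == (unsigned int)__builtin_popcount(u));
}
```

With a macro of this shape, every narrow argument reaches the builtin as popcount(zext(X)), which is the pattern the simplifications discussed in this bug are meant to turn back into the narrower operation where that is cheaper.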