On Tue, Nov 8, 2022 at 12:02 PM Michael Collison <colli...@rivosinc.com> wrote:
>
> This patch transforms (cond (and (x , 0x1) == 0), y, (z op y)) into
> (-(and (x , 0x1)) & z ) op y, where op is a '^' or a '|'. It also
> transforms (cond (and (x , 0x1) != 0), (z op y), y ) into (-(and (x ,
> 0x1)) & z ) op y.
>
> Matching these patterns allows GCC to generate branchless code for one of
> the functions in coremark.
>
> Bootstrapped and tested on x86 and RISC-V. Okay?

This seems like a (much) reduced (simplified?) version of
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html .
I have not had time for the last year to go through the comments on
that patch and resubmit it though.
It seems like you are aiming for one specific case in coremark rather
than a more generic fix too.
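
For reference, the identity behind the transform can be checked with a
small C sketch. This is only an illustration, not part of the patch: the
function names and the sample values are made up here, and it assumes
unsigned operands so that -(x & 1) is either 0 or all ones.

```c
#include <assert.h>

/* Branchy form from the patch description: (x & 1) == 0 ? y : (z ^ y). */
static unsigned xor_branchy(unsigned x, unsigned y, unsigned z)
{
    return ((x & 1u) == 0) ? y : (z ^ y);
}

/* Branchless form: -(x & 1) is 0 or all ones, so it masks z in or out. */
static unsigned xor_branchless(unsigned x, unsigned y, unsigned z)
{
    return (-(x & 1u) & z) ^ y;
}

/* Same pair for the '|' variant. */
static unsigned ior_branchy(unsigned x, unsigned y, unsigned z)
{
    return ((x & 1u) == 0) ? y : (z | y);
}

static unsigned ior_branchless(unsigned x, unsigned y, unsigned z)
{
    return (-(x & 1u) & z) | y;
}

/* Count the sampled inputs on which the two forms disagree.
   The sample values are arbitrary, chosen only for illustration. */
static int count_mismatches(void)
{
    static const unsigned samples[] = {
        0u, 1u, 2u, 3u, 0x55u, 0xffu, 0x80000000u, 0xffffffffu
    };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    int mismatches = 0;

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++) {
                unsigned x = samples[i], y = samples[j], z = samples[k];
                if (xor_branchy(x, y, z) != xor_branchless(x, y, z))
                    mismatches++;
                if (ior_branchy(x, y, z) != ior_branchless(x, y, z))
                    mismatches++;
            }
    return mismatches;
}

/* Aborts if any sampled input shows a difference. */
static void check_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_equivalence() from a small driver should pass silently. It
only samples a handful of values, so it is a sanity check of the
identity, not a proof.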

Thanks,
Andrew Pinski

>
> Michael.
>
> 2022-11-08  Michael Collison  <colli...@rivosinc.com>
>
>      * match.pd ((cond (and (x , 0x1) == 0), y, (z op y) )
>      -> (-(and (x , 0x1)) & z ) op y)
>
> 2022-11-08  Michael Collison  <colli...@rivosinc.com>
>
>      * gcc.dg/tree-ssa/branchless-cond.c: New test.
>
> ---
>   gcc/match.pd                                  | 22 ++++++++++++++++
>   .../gcc.dg/tree-ssa/branchless-cond.c         | 26 +++++++++++++++++++
>   2 files changed, 48 insertions(+)
>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 194ba8f5188..722f517ac6d 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3486,6 +3486,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>     (cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
>     (max @2 @1))
>
> +/* (cond (and (x , 0x1) == 0), y, (z ^ y) ) -> (-(and (x , 0x1)) & z ) ^ y */
> +(for op (bit_xor bit_ior)
> + (simplify
> +  (cond (eq (bit_and @0 integer_onep@1)
> +            integer_zerop)
> +        @2
> +        (op:c @3 @2))
> +  (if (INTEGRAL_TYPE_P (type)
> +       && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
> +       (op (bit_and (negate (convert:type (bit_and @0 @1))) @3) @2))))
> +
> +/* (cond (and (x , 0x1) != 0), (z ^ y), y ) -> (-(and (x , 0x1)) & z ) ^ y */
> +(for op (bit_xor bit_ior)
> + (simplify
> +  (cond (ne (bit_and @0 integer_onep@1)
> +            integer_zerop)
> +    (op:c @3 @2)
> +        @2)
> +  (if (INTEGRAL_TYPE_P (type)
> +       && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
> +       (op (bit_and (negate (convert:type (bit_and @0 @1))) @3) @2))))
> +
>   /* Simplifications of shift and rotates.  */
>
>   (for rotate (lrotate rrotate)
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> new file mode 100644
> index 00000000000..68087ae6568
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +int f1(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) == 0) ? y : z ^ y;
> +}
> +
> +int f2(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) != 0) ? z ^ y : y;
> +}
> +
> +int f3(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) == 0) ? y : z | y;
> +}
> +
> +int f4(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) != 0) ? z | y : y;
> +}
> +
> +/* { dg-final { scan-tree-dump-times " -" 4 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " & " 8 "optimized" } } */
> +/* { dg-final { scan-tree-dump-not "if" "optimized" } } */
> --
> 2.34.1
>
>
>
>
