Hi Richard, > On 7 Nov 2024, at 6:10 PM, Richard Biener <[email protected]> wrote: > > External email: Use caution opening links or attachments > > > On Tue, 5 Nov 2024, Soumya AR wrote: > >> >> >>> On 29 Oct 2024, at 7:16 PM, Richard Biener <[email protected]> wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> On Mon, 28 Oct 2024, Soumya AR wrote: >>> >>>> This patch transforms the following POW calls to equivalent LDEXP calls, as >>>> discussed in PR57492: >>>> >>>> powi (2.0, i) -> ldexp (1.0, i) >>>> >>>> a * powi (2.0, i) -> ldexp (a, i) >>>> >>>> 2.0 * powi (2.0, i) -> ldexp (1.0, i + 1) >>>> >>>> pow (powof2, i) -> ldexp (1.0, i * log2 (powof2)) >>>> >>>> powof2 * pow (2, i) -> ldexp (1.0, i + log2 (powof2)) >>> >>> For the multiplication cases why not handle powof2 * ldexp (1., i) >>> to ldexp (1., i + log2 (powof2)) and a * ldexp (1., i) -> ldexp (a, i) >>> instead? exp2 * ldexp (1., i) is another candidate. >>> >>> So please split out the multiplication parts. >>> >>> + /* Simplify pow (powof2, i) to ldexp (1, i * log2 (powof2)). */ >>> >>> the below pattern handles POWI, not POW. >>> >>> + (simplify >>> + (POWI REAL_CST@0 @1) >>> + (with { HOST_WIDE_INT tmp = 0; >>> + tree integer_arg1 = NULL_TREE; } >>> + (if (integer_valued_real_p (@0) >>> + && real_isinteger (&TREE_REAL_CST (@0), &tmp) >>> + && integer_pow2p (integer_arg1 = build_int_cst (integer_type_node, >>> tmp))) >>> >>> && tmp > 0 >>> && pow2p_hwi (tmp) >>> >>> + (LDEXP { build_one_cst (type); } >>> + (mult @1 { build_int_cst (integer_type_node, >>> + tree_log2 (integer_arg1)); }))))) >>> >>> build_int_cst (integer_type_node, exact_log2 (tmp)) >>> >>> + /* Simplify powi (2.0, i) to ldexp (1, i). */ >>> + (simplify >>> + (POWI REAL_CST@0 @1) >>> + (if (real_equal (TREE_REAL_CST_PTR (@0), &dconst2)) >>> + (LDEXP { build_one_cst (type); } @1))) >>> + >>> >>> You'll have a duplicate pattern here, instead merge them. 2.0 >>> is power-of-two so I wonder why the pattern is needed. >> >> Thanks for the feedback! >> >> I've merged the extra case that was specifically checking for 2.0. >> >> Like you suggested, I also added two ldexp specific transforms: >> >> • powof2 * ldexp (x, i) -> ldexp (x, i + log2 (powof2)) >> • a * ldexp(1., i) -> ldexp (a, i) >> >> Regarding your suggestion to also fold exp2, a conversion like >> exp2 (x) -> ldexp (1., x) or exp2 (x) * ldexp (y, i) -> ldexp (y, i + x) is a >> bit tricky because we'd have to cast it to an integer before passing it to >> ldexp. >> >> real_to_integer only works for constants, which isn't helpful here as exp2 >> (CST) >> becomes a power of 2 anyway and matches with the above patterns. >> >> We'll have to explicitly convert it for non constants and I'm not sure if >> that >> is worth it for this patch. >> >> Let me know what you think. > > + (simplify > + (mult:c REAL_CST@0 (POWI REAL_CST@1 @2)) > + (with { HOST_WIDE_INT tmp = 0; > + tree integer_arg1 = NULL_TREE; } > + (if (integer_valued_real_p (@0) > + && real_equal (TREE_REAL_CST_PTR (@1), &dconst2) > + && real_isinteger (&TREE_REAL_CST (@0), &tmp) > + && integer_pow2p (integer_arg1 = build_int_cst (integer_type_node, > tmp))) > > as said you can use tmp > 0 && pow2p_hwi (tmp) instead of > integer_pow2p and build_int_cst. This applies to all patterns.
Thanks for the feedback! Updated the patch with this modification.
> + (if (integer_valued_real_p (@0)
> + && real_isinteger (&TREE_REAL_CST (@0), &tmp)
>
> also is a redundant check, real_isinteger gives you the answer already.
>
> + (mult @1 {build_int_cst (integer_type_node,
> + tree_log2 (integer_arg1)); })))))
>
> and use exact_log2 (tmp) instead of tree_log2 (integer_arg1).
Fixed.
>
> + /* Simplify a * powi (2.0, i) to ldexp (a, i). */
> + (simplify
> + (mult:c @0 (POWI REAL_CST@1 @2))
> + (if (real_equal (TREE_REAL_CST_PTR (@1), &dconst2))
> + (LDEXP @0 @2)))
> ...
> + /* Simplify powi (2.0, i) to ldexp (1, i). */
> + (simplify
> + (POWI REAL_CST@0 @1)
> + (if (real_equal (TREE_REAL_CST_PTR (@0), &dconst2))
> + (LDEXP { build_one_cst (type); } @1)))
>
> you can expect the first to never hit as we should replace
> a * powi (2.0, i) with a * ldexp (1, i) first. So you can
> drop the first pattern.
Makes sense, apologies for the oversight. Fixed this.
Best,
Soumya
> Richard.
>
>
>> Best,
>> Soumya
>>
>>
>>
>>> Richard.
>>>
>>>>
>>>> This is especially helpful for SVE architectures as LDEXP calls can be
>>>> implemented using the FSCALE instruction, as seen in the following patch:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2024-September/664160.html
>>>>
>>>> SPEC2017 was run with this patch, while there are no noticeable
>>>> improvements,
>>>> there are no non-noise regressions either.
>>>>
>>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no
>>>> regression.
>>>> OK for mainline?
>>>>
>>>> Signed-off-by: Soumya AR <[email protected]>
>>>>
>>>> gcc/ChangeLog:
>>>> PR target/57492
>>>> * match.pd: Added patterns to fold certain calls to pow to ldexp.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>> PR target/57492
>>>> * gcc.dg/tree-ssa/pow-to-ldexp.c: New test.
>>>>
>>>>
>>>
>>> --
>>> Richard Biener <[email protected]>
>>> SUSE Software Solutions Germany GmbH,
>>> Frankenstrasse 146, 90461 Nuernberg, Germany;
>>> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
>>
>>
>>
>
> --
> Richard Biener <[email protected]>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
v3-0001-Match-Fold-pow-calls-to-ldexp-when-possible-PR57492.patch
Description: v3-0001-Match-Fold-pow-calls-to-ldexp-when-possible-PR57492.patch
