Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]

Xionghu Luo via Gcc-patches Sun, 11 Jul 2021 18:25:44 -0700

On 2021/7/10 02:40, will schmidt wrote:
> On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote:
>> Gentle ping ^2, thanks.
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
>>
>>
>> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote:
>>> Tested SPEC2017 with -Ofast on P8LE for this patch: 511.povray_r +1.14%,
>>> 526.blender_r +1.72%, no obvious changes to the others.
> 
> Ok.
> 
>>>
>>>
>>> On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote:
>>>> Gentle ping, thanks.
>>>>
>>>>
>>>> On 2021/4/16 15:10, Xiong Hu Luo wrote:
>>>>> fmod/fmodf and remainder/remainderf can be expanded inline instead of
>>>>> calling the library when building with fast-math, which is much faster.
>>>>>
>>>>> fmodf:
>>>>>        fdivs   f0,f1,f2
>>>>>        friz    f0,f0
>>>>>        fnmsubs f1,f2,f0,f1
>>>>>
>>>>> remainderf:
>>>>>        fdivs   f0,f1,f2
>>>>>        frin    f0,f0
>>>>>        fnmsubs f1,f2,f0,f1
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 2021-04-16  Xionghu Luo  <luo...@linux.ibm.com>
>>>>>
>>>>>      PR target/97142
> 
> That PR is "Bug 97142 - __builtin_fmod not optimized on POWER".
> 
> OK.
> 
> 
>>>>>      * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
>>>>>      (remainder<mode>3): Likewise.
> 
> 
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> 2021-04-16  Xionghu Luo  <luo...@linux.ibm.com>
>>>>>
>>>>>      PR target/97142
>>>>>      * gcc.target/powerpc/pr97142.c: New test.
> 
> Ok.
> 
>>>>> ---
>>>>>    gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
>>>>>    gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
>>>>>    2 files changed, 66 insertions(+)
>>>>>    create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>>
>>>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>>>>> index a1315523fec..7e0e94e6ba4 100644
>>>>> --- a/gcc/config/rs6000/rs6000.md
>>>>> +++ b/gcc/config/rs6000/rs6000.md
>>>>> @@ -4902,6 +4902,42 @@ (define_insn "fre<sd>"
>>>>>      [(set_attr "type" "fp")
>>>>>       (set_attr "isa" "*,<Fisa>")])
>>>>> +(define_expand "fmod<mode>3"
>>>>> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 1 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>>>>> +  "TARGET_HARD_FLOAT
>>>>> +  && TARGET_FPRND
>>>>> +  && flag_unsafe_math_optimizations"
>>>>> +{
>>>>> +  rtx div = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
>>>>> +
>>>>> +  rtx friz = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_btrunc<mode>2 (friz, div));
>>>>> +
>>>>> +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
>>>>> +  DONE;
>>>>> + })
>>>>> +
>>>>> +(define_expand "remainder<mode>3"
>>>>> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 1 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>>>>> +  "TARGET_HARD_FLOAT
>>>>> +  && TARGET_FPRND
>>>>> +  && flag_unsafe_math_optimizations"
>>>>> +{
>>>>> +  rtx div = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
>>>>> +
>>>>> +  rtx frin = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_round<mode>2 (frin, div));
>>>>> +
>>>>> +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
>>>>> +  DONE;
>>>>> + })
> 
> I notice the pattern of arguments to the final emit
> is op[0],op[2],fri*,op[1]
> while the description comment suggests the generated instruction
> will be fnmsubs  f1,f2,f0,f1  ;
> 
> I don't see any rearranging in the nfms<mode>4 expansions, but
> presumably this is correct and just a cosmetic nit that catches my eye.


From the ISA,

fnmsub FRT,FRA,FRC,FRB

The operation
FRT ← - ( [(FRA) × (FRC)] - (FRB) )
is performed.

 fmodf:
       fdivs   f0,f1,f2
       friz    f0,f0
       fnmsubs f1,f2,f0,f1

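Then, with f0 holding the friz-truncated quotient, the ASM means:

f1 = - (f2 * f0 - f1) = - (f2 * trunc (f1/f2) - f1) = f1 - f2 * trunc (f1/f2)

So operands[2], fri* and operands[1] map to FRA, FRC and FRB respectively,
and f1 is set to the mod result.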

> 
> Ok.
> 
> 
>>>>> +
>>>>>    (define_insn "*rsqrt<mode>2"
>>>>>      [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
>>>>>        (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
>>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> new file mode 100644
>>>>> index 00000000000..48f25ca5b5b
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> @@ -0,0 +1,30 @@
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-Ofast" } */
>>>>> +
>>>>> +#include <math.h>
>>>>> +
>>>>> +float test1 (float x, float y)
>>>>> +{
>>>>> +  return fmodf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test2 (double x, double y)
>>>>> +{
>>>>> +  return fmod (x, y);
>>>>> +}
>>>>> +
>>>>> +float test3 (float x, float y)
>>>>> +{
>>>>> +  return remainderf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test4 (double x, double y)
>>>>> +{
>>>>> +  return remainder (x, y);
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
> 
> 
> Ok.
> I'd be tempted to add scan-assembler checks for the fdivs,fri*,fnmsubs
> instructions as well.
> I defer to others on that, of course.. :-)

Thanks, I will add the checks below:

diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
index 48f25ca5b5b..081ab40b4c0 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr97142.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -27,4 +27,11 @@ double test4 (double x, double y)
 /* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
 /* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
 /* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */
+

> 
> lgtm,
> thanks
> -Will
> 
> 
> 
>>>>> +
>>>>>
>>
>>
> 

-- 
Thanks,
Xionghu