[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2024-07-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #18 from GCC Commits  ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:589865a8e4f6bd26c622ea0ee0a38565a0d42e80

commit r15-1752-g589865a8e4f6bd26c622ea0ee0a38565a0d42e80
Author: Roger Sayle 
Date:   Mon Jul 1 12:21:20 2024 +0100

testsuite: Fix -m32 gcc.target/i386/pr102464-vrndscaleph.c on RedHat.

This patch fixes the 4 FAILs of gcc.target/i386/pr102464-vrndscaleph.c
with --target_board='unix{-m32}' on RedHat 7.x.  The issue is that this
AVX512 test includes the system math.h, and on older systems this provides
inline versions of floor, ceil and rint (for the 387).  The workaround
is to define __NO_MATH_INLINES before #include <math.h> (or alternatively
use __builtin_floor, __builtin_ceil, etc.).

2024-07-01  Roger Sayle  

gcc/testsuite/ChangeLog
PR middle-end/102464
* gcc.target/i386/pr102464-vrndscaleph.c: Define __NO_MATH_INLINES
to resolve FAILs with -m32 on older RedHat systems.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2022-06-27 Thread jbeulich at suse dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #17 from jbeulich at suse dot com ---
Largely the same is actually true for the RNDSCALEPH test added for the PR
here.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2022-06-27 Thread jbeulich at suse dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

jbeulich at suse dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jbeulich at suse dot com

--- Comment #16 from jbeulich at suse dot com ---
(In reply to Hongtao.liu from comment #15)
> Fixed in GCC12.

Only almost: the new FMA testcase there fails for i?86-*-*. I don't think even
the few uses of VFMA* actually match the expectations. The majority of the
operations are carried out in the FPU anyway, despite -mfpmath=sse.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2022-02-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

Hongtao.liu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #15 from Hongtao.liu  ---
Fixed in GCC12.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-11-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #14 from CVS Commits  ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:b879d40a17ec0409f1a2cd9ab6134bb28f53eea8

commit r12-5079-gb879d40a17ec0409f1a2cd9ab6134bb28f53eea8
Author: liuhongt 
Date:   Thu Nov 4 16:05:45 2021 +0800

Simplify (trunc)MAX/MIN((extend)a, (extend)b) to MAX/MIN(a,b)

a and b have the same type as the truncation type, which has less
precision than the extend type.
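
The reason the simplification is safe can be sketched in plain C. This is an
illustration under stated assumptions, not the match.pd pattern itself: float
and double stand in for _Float16 and float so that no AVX512-FP16 support is
needed, a ternary stands in for MAX_EXPR, NaN inputs are ignored, and both
function names are hypothetical.

```c
/* Original shape: extend both operands, take the maximum in the wide
   type, then truncate.  Extension is exact and order-preserving, so
   the wide maximum is always one of the two original narrow values
   and the final truncation cannot round.  */
float max_via_wide(float a, float b)
{
    double wa = (double) a;
    double wb = (double) b;
    double m = wa > wb ? wa : wb;
    return (float) m;
}

/* Simplified shape: the same maximum taken directly in the narrow type.  */
float max_narrow(float a, float b)
{
    return a > b ? a : b;
}
```

For non-NaN inputs the two functions return the same value, which is the
property the rewrite depends on.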

gcc/ChangeLog:

PR target/102464
* match.pd: Simplify (trunc)fmax/fmin((extend)a, (extend)b) to
MAX/MIN(a,b)

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr102464-maxmin.c: New test.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-11-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #13 from CVS Commits  ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:a1f7ead09cd41d32e2fe902eb32e587c36e7

commit r12-4985-ga1f7ead09cd41d32e2fe902eb32e587c36e7
Author: liuhongt 
Date:   Mon Nov 8 09:32:17 2021 +0800

Add !HONOR_SNANS to simplification: (trunc)copysign((extend)a, (extend)b) to
copysign (a, b).

> Note that this is not safe with -fsignaling-nans, so needs to be disabled
> for that option (if there isn't already logic somewhere with that
> effect), because the extend will convert a signaling NaN to quiet (raising
> "invalid"), but copysign won't, so this transformation could result in a
> signaling NaN being wrongly returned when the original code would never
> have returned a signaling NaN.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com
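
The two shapes being compared can be sketched as follows. As an assumption
for portability, float and double stand in for _Float16 and float; all
function names are hypothetical. The helpers at the end only make it possible
to inspect the NaN quiet bit on a given machine; whether the widening
conversion actually quiets a signaling NaN depends on the hardware and the
compiler flags, so the sketch does not assert it.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Original shape: widen, copysign in the wide type, narrow again.
   On IEEE hardware the widening of a signaling NaN is expected to
   produce a quiet NaN and raise "invalid".  */
float copysign_via_wide(float a, float b)
{
    return (float) copysign((double) a, (double) b);
}

/* Simplified shape: copysign only touches the sign bit, so a
   signaling NaN in a would pass through unchanged.  */
float copysign_narrow(float a, float b)
{
    return copysignf(a, b);
}

/* Raw IEEE-754 bits of a float, for inspecting sign and quiet bits.  */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* A float signaling NaN: exponent all ones, quiet bit (bit 22) clear,
   non-zero payload.  */
float make_snan(void)
{
    uint32_t u = 0x7fa00000u;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

For ordinary values both functions agree; comparing
float_bits (copysign_via_wide (make_snan (), -1.0f)) & 0x00400000 with the
same expression for copysign_narrow shows on a given target whether the quiet
bit differs, which is the case the !HONOR_SNANS guard excludes.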

gcc/ChangeLog

PR target/102464
* match.pd (Simplification (trunc)copysign((extend)a, (extend)b)
to .COPYSIGN (a, b)): Add !HONOR_SNANS.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-11-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #12 from CVS Commits  ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:2ad1e8081f4797a99a96b513ffe14c7305e9b3d8

commit r12-4984-g2ad1e8081f4797a99a96b513ffe14c7305e9b3d8
Author: liuhongt 
Date:   Mon Nov 8 09:19:29 2021 +0800

[Gimple] Simplify (trunc)fma ((extend)a, (extend)b, (extend)c) to IFN_FMA
(a,b, c).

a, b and c have the same type as the truncation type, which has less
precision than the extend type. The optimization is guarded by
flag_unsafe_math_optimizations.
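
Why the guard is needed can be written out as a short worked comparison. The
notation is introduced here and is not from the commit: \circ_p denotes
rounding to a p-bit significand, with p = 11 for _Float16 and p = 24 for
float.

```latex
\begin{align*}
\text{original:}   &\quad \circ_{11}\bigl(\circ_{24}(a \cdot b + c)\bigr) \\
\text{simplified:} &\quad \circ_{11}(a \cdot b + c)
\end{align*}
```

The original form rounds twice, the simplified form once. The two results can
differ when the inner rounding to 24 bits lands exactly on a tie between two
11-bit neighbours (double rounding), so the rewrite can change the result and
is only valid when the user has opted in. For sqrt, a commonly cited
sufficient condition for double rounding to be harmless is that the wide
precision p' satisfies p' >= 2p + 2, which holds here (24 >= 2 * 11 + 2); the
exact value a * b + c of an fma can need more bits than that bound covers.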

gcc/ChangeLog:
PR target/102464
* match.pd: Simplify
(trunc)fma ((extend)a, (extend)b, (extend)c) to IFN_FMA (a, b,
c) under flag_unsafe_math_optimizations.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr102464-fma.c: New test.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-11-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #11 from CVS Commits  ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:22ce7382fccc15ce2355306b3f5be7afc00f81f4

commit r12-4881-g22ce7382fccc15ce2355306b3f5be7afc00f81f4
Author: liuhongt 
Date:   Wed Nov 3 16:07:34 2021 +0800

Simplify (trunc)copysign((extend)a, (extend)b) to .COPYSIGN (a,b).

a and b have the same type as the truncation type, which has less
precision than the extend type.

gcc/ChangeLog:

PR target/102464
* match.pd: simplify (trunc)copysign((extend)a, (extend)b) to
.COPYSIGN (a,b) when a and b are same type as the truncation
type and has less precision than extend type.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr102464-copysign-1.c: New test.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-10-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #10 from CVS Commits  ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:84bcefdaf6d95e08cd980965098961289215

commit r12-4780-g84bcefdaf6d95e08cd980965098961289215
Author: liuhongt 
Date:   Mon Oct 25 15:20:35 2021 +0800

Enable vectorization for _Float16 floor/ceil/trunc/nearbyint/rint
operations.

gcc/ChangeLog:

PR target/102464
* config/i386/i386-builtin-types.def (V8HF_FTYPE_V8HF): New
function type.
(V16HF_FTYPE_V16HF): Ditto.
(V32HF_FTYPE_V32HF): Ditto.
(V8HF_FTYPE_V8HF_ROUND): Ditto.
(V16HF_FTYPE_V16HF_ROUND): Ditto.
(V32HF_FTYPE_V32HF_ROUND): Ditto.
* config/i386/i386-builtin.def (IX86_BUILTIN_FLOORPH,
IX86_BUILTIN_CEILPH, IX86_BUILTIN_TRUNCPH,
IX86_BUILTIN_FLOORPH256, IX86_BUILTIN_CEILPH256,
IX86_BUILTIN_TRUNCPH256, IX86_BUILTIN_FLOORPH512,
IX86_BUILTIN_CEILPH512, IX86_BUILTIN_TRUNCPH512): New builtin.
* config/i386/i386-builtins.c
(ix86_builtin_vectorized_function): Enable vectorization for
HFmode FLOOR/CEIL/TRUNC operation.
* config/i386/i386-expand.c (ix86_expand_args_builtin): Handle
new builtins.
* config/i386/sse.md (rint<mode>2, nearbyint<mode>2): Extend
to vector HFmodes.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr102464-vrndscaleph.c: New test.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-10-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #9 from CVS Commits  ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:1a07bc9cda77b1211e95ae295b30e46c0d9ee222

commit r12-4651-g1a07bc9cda77b1211e95ae295b30e46c0d9ee222
Author: liuhongt 
Date:   Mon Oct 25 10:51:33 2021 +0800

Simplify (_Float16) sqrtf((float) a) to .SQRT(a) when a is a _Float16
value.

Similar for sqrt/sqrtl.

gcc/ChangeLog:

PR target/102464
* match.pd: Simplify (_Float16) sqrtf((float) a) to .SQRT(a)
when direct_internal_fn_supported_p, similar for sqrt/sqrtl.

gcc/testsuite/ChangeLog:

PR target/102464
* gcc.target/i386/pr102464-sqrtph.c: New test.
* gcc.target/i386/pr102464-sqrtsh.c: New test.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-10-18 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #8 from Hongtao.liu  ---
(In reply to Richard Biener from comment #3)
> There's related optimizations in convert () which should ideally move to
> match.pd

When I tried to move the convert () logic to match.pd, I found some
mismatches. There are 3 cases:
1. math functions are transformed under condition "optimize"
2. math functions are transformed under condition "optimize &&
flag_unsafe_math_optimizations"
3. math functions are transformed under condition "optimize &&
flag_unsafe_math_optimizations && !flag_errno_math"
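
The three cases can be modelled as a small predicate. This is a sketch only:
the enum, its labels and narrowing_allowed are hypothetical names, the
parameters merely mirror the GCC flags mentioned in the list, and the third
case is read here as additionally requiring that errno handling is off.

```c
#include <stdbool.h>

/* Hypothetical labels for the three guard conditions listed above.  */
enum narrowing_case {
    CASE_OPTIMIZE_ONLY,    /* case 1: "optimize" alone is enough       */
    CASE_UNSAFE_MATH,      /* case 2: also needs unsafe math           */
    CASE_UNSAFE_NO_ERRNO   /* case 3: unsafe math and no errno setting */
};

/* Returns true when a function in the given case may be narrowed.
   optimize, unsafe_math and errno_math stand for the GCC variables
   optimize, flag_unsafe_math_optimizations and flag_errno_math.  */
bool narrowing_allowed(enum narrowing_case c, bool optimize,
                       bool unsafe_math, bool errno_math)
{
    if (!optimize)
        return false;
    switch (c) {
    case CASE_OPTIMIZE_ONLY:
        return true;
    case CASE_UNSAFE_MATH:
        return unsafe_math;
    case CASE_UNSAFE_NO_ERRNO:
        return unsafe_math && !errno_math;
    }
    return false;
}
```

Under this model a case 1 function is narrowed even with errno handling
enabled, for example narrowing_allowed (CASE_OPTIMIZE_ONLY, true, false, true)
is true.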

logb falls under case 1, which means it can be transformed without
!flag_errno_math. But according to DEF_C99_BUILTIN (BUILT_IN_LOGB,
"logb", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO), !flag_errno_math
is needed, and the transformation will be prevented by
gimple-match-head.c:maybe_push_res_to_seq:

  /* We can't and should not emit calls to non-const functions.  */
  if (!(flags_from_decl_or_type (decl) & ECF_CONST))
    return NULL;


/* fabsl (extend(x)) -> extend(fabsf(x)), etc., if x is a float.  */
(for froms (BUILT_IN_FABS BUILT_IN_FABSL
            BUILT_IN_LOGB BUILT_IN_LOGBL)
     tos (BUILT_IN_FABSF BUILT_IN_FABSF
          BUILT_IN_LOGBF BUILT_IN_LOGBF)
 (simplify
  (froms (convert float_value_p@0))
  (if (optimize && canonicalize_math_p ()
       && mathfn_built_in (TREE_TYPE (@0), froms))
   (convert (tos @0)))))

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-10-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #7 from CVS Commits  ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:613196462a62a28de8414b9023ec2be9a29ac3dc

commit r12-4242-g613196462a62a28de8414b9023ec2be9a29ac3dc
Author: liuhongt 
Date:   Fri Sep 24 19:17:42 2021 +0800

Simplify (_Float16) ceil ((double) x) to .CEIL (x) when available.

gcc/ChangeLog:

PR target/102464
* config/i386/i386.c (ix86_optab_supported_p):
Return true for HFmode.
* match.pd: Simplify (_Float16) ceil ((double) x) to
__builtin_ceilf16 (a) when a is _Float16 type and
direct_internal_fn_supported_p.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr102464.c: New test.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-09-24 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #6 from Hongtao.liu  ---
(In reply to Hongtao.liu from comment #5)
> (gdb) p direct_internal_fn_supported_p (IFN_CEIL, type, OPTIMIZE_FOR_BOTH)
> $110 = false
> 
> (gdb) p direct_internal_fn_supported_p (IFN_SQRT, type, OPTIMIZE_FOR_BOTH)
> $111 = true
> 
> hmm, why?

Hmm, because in ix86_optab_supported_p we have

    case rint_optab:
      if (SSE_FLOAT_MODE_P (mode1)
          && TARGET_SSE_MATH
          && !flag_trapping_math
          && !TARGET_SSE4_1)
        return opt_type == OPTIMIZE_FOR_SPEED;
      return true;

    case floor_optab:
    case ceil_optab:
    case btrunc_optab:
      if (SSE_FLOAT_MODE_P (mode1)
          && TARGET_SSE_MATH
          && !flag_trapping_math
          && TARGET_SSE4_1)
        return true;
      return opt_type == OPTIMIZE_FOR_SPEED;
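
The effect of the floor/ceil/btrunc branch on the gdb query can be reproduced
with a stand-alone model. This is a sketch, not GCC code: ceil_supported is a
hypothetical function whose boolean parameters stand for the macros and flags
in the quoted snippet, and it assumes that HFmode does not satisfy
SSE_FLOAT_MODE_P, which is what the debugger output suggests.

```c
#include <stdbool.h>

/* Same enumerator names as the GCC type used in the quoted code.  */
enum optimization_type {
    OPTIMIZE_FOR_SPEED,
    OPTIMIZE_FOR_BOTH,
    OPTIMIZE_FOR_SIZE
};

/* Model of the floor_optab/ceil_optab/btrunc_optab branch.  */
bool ceil_supported(bool sse_float_mode,  /* SSE_FLOAT_MODE_P (mode1) */
                    bool sse_math,        /* TARGET_SSE_MATH          */
                    bool trapping_math,   /* flag_trapping_math       */
                    bool sse4_1,          /* TARGET_SSE4_1            */
                    enum optimization_type opt_type)
{
    if (sse_float_mode && sse_math && !trapping_math && sse4_1)
        return true;
    /* Every other mode is only reported as supported when
       optimizing purely for speed.  */
    return opt_type == OPTIMIZE_FOR_SPEED;
}
```

With sse_float_mode false, the call with OPTIMIZE_FOR_BOTH returns false no
matter what the other inputs are, matching the $110 = false result for
IFN_CEIL. The sqrt optab has no such branch in the quoted code, which would
explain why IFN_SQRT reports true.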

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-09-24 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #5 from Hongtao.liu  ---
(gdb) p direct_internal_fn_supported_p (IFN_CEIL, type, OPTIMIZE_FOR_BOTH)
$110 = false

(gdb) p direct_internal_fn_supported_p (IFN_SQRT, type, OPTIMIZE_FOR_BOTH)
$111 = true

hmm, why?

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-09-23 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #4 from joseph at codesourcery dot com  ---
Note that for fma this would only be valid with
-funsafe-math-optimizations.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-09-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #3 from Richard Biener  ---
There's related optimizations in convert () which should ideally move to
match.pd

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-09-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-09-23
           Keywords|                            |internal-improvement

--- Comment #2 from Andrew Pinski  ---
Confirmed.

I don't think fabs and fma need to be internal functions, as there are
already tree codes for them.

[Bug middle-end/102464] Miss optimization for (_Float16) sqrtf ((float) f16)

2021-09-22 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102464

--- Comment #1 from Hongtao.liu  ---
A similar optimization also applies to
fma
fmax/fmin
fabs
ldexp
ceil
floor
trunc
round
rint
nearbyint
copysign

since AVX512-FP16 has corresponding instructions.