> -----Original Message-----
> From: Xi Ruoyao <xry...@xry111.site>
> Sent: Saturday, November 25, 2023 10:43 AM
> To: Jakub Jelinek <ja...@redhat.com>; Segher Boessenkool
> <seg...@kernel.crashing.org>; David Edelsohn <dje....@gmail.com>
> Cc: gcc-patches@gcc.gnu.org; Tamar Christina <tamar.christ...@arm.com>;
> Andrew Pinski <apin...@marvell.com>
> Subject: Re: [PATCH] rs6000: Canonicalize copysign (x, -1) back to -abs (x) in
> the backend [PR112606]
> 
> On Sat, 2023-11-25 at 11:17 +0100, Jakub Jelinek wrote:
> > The middle-end has been changed quite recently to canonicalize -abs
> > (x) to copysign (x, -1) rather than the other way around.
> > While I agree with that at GIMPLE level, since it matches the GIMPLE
> > goal of as few operations as possible for a canonical form (-abs (x)
> > is 2 GIMPLE statements, copysign (x, -1) is just one), I must say I
> > don't really like that being done on RTL as well (or at least not
> > canonicalizing (COPYSIGN x, negative) back to (NEG (ABS x))), because
> > on most targets most of floating point constants need to be loaded
> > from memory, there are a few exceptions but -1 is often not one of them.
> 
> On LoongArch fneg+fabs is even slower than loading a -1 from mem then do
> copysign for some micro-architectural reason I don't know.  (FWIW on
> LoongArch with LSX, the fastest way may be directly setting the sign bit with
> LSX vbitseti instruction - it will also set the sign bits for "junk" elements
> in the high bits of the vector register but there is no harm.)
> 
> Can we make a target hook to control this?

There already is.  I have been looking into this and this is the situation:

For the C99 versions of copysign, expand_COPYSIGN has optimized expansions in
place.  One of the hooks there forces it to abs/neg.  There is also code in
place for when the target prefers integer expansion over the floating-point
one, etc.
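
To make the two kinds of lowering concrete, here is a minimal sketch in plain
C, not the actual expander code.  The function names are hypothetical, and the
integer variant assumes IEEE 754 binary32 floats:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Floating-point lowering: copysign (x, -1) written as a negated
   absolute value.  Hypothetical helper, for illustration only.  */
float neg_abs_fp (float x)
{
  return -fabsf (x);
}

/* Integer lowering: set the sign bit directly.  This assumes float is
   IEEE 754 binary32, so the sign lives in bit 31.  */
float neg_abs_int (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);   /* reinterpret without aliasing issues */
  bits |= UINT32_C (0x80000000);     /* force the sign bit on */
  memcpy (&x, &bits, sizeof x);
  return x;
}
```

Both functions should agree with copysignf (x, -1.0f) for ordinary inputs;
which of the two is cheaper depends on the target.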

There are several problems with it though.  IFN expansions don't go through
expand_COPYSIGN, so copysignf (x, -1.f) and IFN_COPYSIGN (x, -1.f) are not
treated the same, even though operationally they are.

The expansion also doesn't work for vector types, i.e. it's only done for
types which have a C99 version of copysign.
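
As an illustration of where a vector-typed copysign can come from, consider a
loop like the one below.  This is only a sketch with a hypothetical function
name; whether the loop is actually vectorized, and into what, depends on the
target and the flags used:

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: store -|src[i]| into dst[i] for each element.
   Per element, -fabsf (x) is the form the middle-end now canonicalizes
   to copysign (x, -1); if the loop is vectorized, that operation is on
   a vector type, which has no C99 copysign function behind it.  */
void neg_abs_array (float *restrict dst, const float *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = -fabsf (src[i]);
}
```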

match.pd has an unofficial "canonicalized" form for integer copysign, and
expand_COPYSIGN expands to a different one.  So most targets deal efficiently
with the form match.pd generates, but not with expand_COPYSIGN's.

All the optimizations only happen if the target does not implement the
copysign optab.  Once you do, it's all up to you.

So even if we use expand_COPYSIGN for scalar expansions of IFN_COPYSIGN, PPC
would still need to reject the -1 case, or remove the optab and use combine to
form the copysign instruction.

However, the issue here is that IFNs at the moment only support direct
expansion.  That is, you need an optab to get the rewriting done.  So you have
a catch-22, unlike the C99 versions, which have a libcall fallback.

I have a patch locally that adds support for non-direct IFN expansions by
providing hooks that a target can implement should it want to handle expansion
or control which optimizations happen.  The patch also treats IFN and C99
copysign the same, in that since we know we can always lower them to either
equivalent integer or FP operations, we always allow the rewriting.

This allows most targets to just be able to remove the copysign optab
implementation and get the same or better code than before.

It's not a terribly big patch, but I missed the stage 1 deadline and was
unsure whether it's suitable for stage 3.  It does fix years of copysign issues
once and for all though.

If maintainers want to see the patch I can finish regtesting and post it next
week.

Cheers,
Tamar

> 
> --
> Xi Ruoyao <xry...@xry111.site>
> School of Aerospace Science and Technology, Xidian University
