[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2023-09-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #26 from Andrew Pinski  ---
Created attachment 55893
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55893=edit
Here is my idea around the patch for prototype for doing the constant prop idea

This is like the one in comment #13 but handles it in phiopt rather than
forwprop.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-08-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-05-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #25 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:f1c777f40aa0b6941efc7440495a8d7e0cc2a1bb

commit r12-964-gf1c777f40aa0b6941efc7440495a8d7e0cc2a1bb
Author: Jakub Jelinek 
Date:   Fri May 21 10:39:50 2021 +0200

tree-optimization: Improve spaceship_replacement [PR94589]

On Wed, May 19, 2021 at 01:30:31PM -0400, Jason Merrill via Gcc-patches
wrote:
> Here, when genericizing lexicographical_compare_three_way, we haven't yet
> walked the operands, so (a == a) still sees ADDR_EXPR , but this is
after
> we've changed the type of a to REFERENCE_TYPE.  When we try to fold (a ==
a)
> by constexpr evaluation, the constexpr code doesn't understand trying to
> take the address of a reference, and we end up crashing.
>
> Fixed by avoiding constexpr evaluation in genericize_spaceship, by using
> fold_build2 instead of build_new_op on scalar operands.  Class operands
> should have been expanded during parsing.

Unfortunately this slightly changed the IL and spaceship_replacement no
longer pattern matches it.

Here are 3 improvements that make it match:

1) as mentioned in the comment above spaceship_replacement, for
   strong_ordering, we are pattern matching something like:
   x == y ? 0 : x < y ? -1 : 1;
   and for partial_ordering
   x == y ? 0 : x < y ? -1 : x > y ? 1 : 2;
   but given the == comparison done first and the other comparisons only
   if == was false, we actually don't care if the other comparisons
   are < vs. <= (or > vs. >=), provided the operands of the comparison
   are the same; we know == is false when doing those and < vs. <= or
   > vs. >= have the same behavior for NaNs too
2) when y is an integral constant, we should treat x < 5 equivalently
   to x <= 4 etc.
3) the code punted if cond2_phi_edge wasn't a EDGE_TRUE_VALUE edge, but
   as the new IL shows, that isn't really needed; given 1) that
   > and >= are equivalent in the code, any of swapping the comparison
   operands, changing L[TE]_EXPR to G[TE]_EXPR or vice versa or
   swapping the EDGE_TRUE_VALUE / EDGE_FALSE_VALUE bits on the edges
   reverses one of the two comparisons

2021-05-21  Jakub Jelinek  

PR tree-optimization/94589
* tree-ssa-phiopt.c (spaceship_replacement): For integral rhs1 and
rhs2, treat x <= 4 equivalently to x < 5 etc.  In cmp1 and cmp2 (if
not the same as cmp3) treat <= the same as < and >= the same as >.
Don't require that cond2_phi_edge is true edge, instead take
false/true edges into account based on cmp1/cmp2 comparison kinds.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-05-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #24 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:2b536797f7e43c55072a3215735f5833f1d6d218

commit r12-935-g2b536797f7e43c55072a3215735f5833f1d6d218
Author: Jakub Jelinek 
Date:   Thu May 20 09:10:38 2021 +0200

phiopt: Simplify (X & Y) == X -> (X & ~Y) == 0 even in presence of integral
conversions [PR94589]

On Wed, May 19, 2021 at 10:15:53AM +0200, Christophe Lyon via Gcc-patches
wrote:
> After this update, the test fails on arm and aarch64: according to the
> logs, the optimization is still performed 14 times.

Seems this is because
  if (change
  && !flag_syntax_only
  && (load_extend_op (TYPE_MODE (TREE_TYPE (and0)))
  == ZERO_EXTEND))
{
  tree uns = unsigned_type_for (TREE_TYPE (and0));
  and0 = fold_convert_loc (loc, uns, and0);
  and1 = fold_convert_loc (loc, uns, and1);
}
in fold-const.c adds on these targets extra casts that prevent the
optimizations.

2021-05-20  Jakub Jelinek  

PR tree-optimization/94589
* match.pd ((X & Y) == X -> (X & ~Y) == 0): Simplify even in
presence
of integral conversions.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-05-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #23 from Jakub Jelinek  ---
Fixed.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-05-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #22 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:65061ea287a80cfb214e402cfd2373a14bfec95a

commit r12-864-g65061ea287a80cfb214e402cfd2373a14bfec95a
Author: Jakub Jelinek 
Date:   Tue May 18 10:08:51 2021 +0200

phiopt: Optimize partial_ordering spaceship >= 0 -ffinite-math-only
[PR94589]

As mentioned earlier, spaceship_replacement didn't optimize
partial_ordering
>= 0 comparisons, because the possible values are -1, 0, 1, 2 and the
>= comparison is implemented as (res & 1) == res to choose the 0 and 1
cases from that.  As we optimize that only with -ffinite-math-only, the
2 case is assumed not to happen and my earlier match.pd change optimizes
(res & 1) == res into (res & ~1) == 0, so this patch pattern matches
that case and handles it like res >= 0.

2021-05-18  Jakub Jelinek  

PR tree-optimization/94589
* tree-ssa-phiopt.c (spaceship_replacement): Pattern match
phi result used in (res & ~1) == 0 comparison as res >= 0 as
res == 2 would be UB with -ffinite-math-only.

* g++.dg/opt/pr94589-2.C: Adjust scan-tree-dump count from 14 to
12.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-05-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #21 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:f5f1838435400b837c8677c53a611e2dc6d56442

commit r12-733-gf5f1838435400b837c8677c53a611e2dc6d56442
Author: Jakub Jelinek 
Date:   Wed May 12 09:46:03 2021 +0200

match.pd: Optimize (x & y) == x into (x & ~y) == 0 [PR94589]

> Somewhere in RTL (_M_value&1)==_M_value is turned into (_M_value&-2)==0,
> that could be worth doing already in GIMPLE.

Apparently it is
  /* Simplify eq/ne (and/ior x y) x/y) for targets with a BICS instruction
or
 constant folding if x/y is a constant.  */
  if ((code == EQ || code == NE)
  && (op0code == AND || op0code == IOR)
  && !side_effects_p (op1)
  && op1 != CONST0_RTX (cmp_mode))
{
  /* Both (eq/ne (and x y) x) and (eq/ne (ior x y) y) simplify to
 (eq/ne (and (not y) x) 0).  */
...
  /* Both (eq/ne (and x y) y) and (eq/ne (ior x y) x) simplify to
 (eq/ne (and (not x) y) 0).  */
Yes, doing that on GIMPLE for the case where the not argument is constant
would simplify the phiopt follow-up (it would be single imm use then).

On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote:
> We can probably do it in 2 steps, first something like
>
> (for cmp (eq ne)
>  (simplify
>   (cmp (bit_and:c @0 @1) @0)
>   (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })))
>
> to get rid of the double use, and then simplify X==0 to X<=~C if C is a
> mask 111...000 (I thought we already had a function to detect such masks,
or
> the 000...111, but I can't find them anymore).

Ok, here is the first step then.

2021-05-12  Jakub Jelinek  
Marc Glisse  

PR tree-optimization/94589
* match.pd ((X & Y) == X -> (X & ~Y) == 0,
(X | Y) == Y -> (X & ~Y) == 0): New GIMPLE simplifications.

* gcc.dg/tree-ssa/pr94589-1.c: New test.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-05-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #20 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:ad96c867e173c1ebcfc201b201adac5095683a08

commit r12-559-gad96c867e173c1ebcfc201b201adac5095683a08
Author: Jakub Jelinek 
Date:   Thu May 6 10:15:40 2021 +0200

phiopt: Optimize (x <=> y) cmp z [PR94589]

genericize_spaceship genericizes i <=> j to approximately
({ int c; if (i == j) c = 0; else if (i < j) c = -1; else c = 1; c; })
for strong ordering and
({ int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1;
else c = 2; c; })
for partial ordering.
The C++ standard supports then == or != comparisons of that against
strong/partial ordering enums, or />= comparisons of <=> result
against literal 0.

In some cases we already optimize that but in many cases we keep performing
all the 2 or 3 comparisons, compute the spaceship value and then compare
that.

The following patch recognizes those patterns if the <=> operands are
integral types or floating point (the latter only for -ffast-math) and
optimizes it to the single comparison that is needed (plus adds debug stmts
if needed for the spaceship result).

There is one thing I'd like to address in a follow-up: the pr94589-2.C
testcase should be matching just 12 times each, but runs
into operator>=(partial_ordering, unspecified) being defined as
(_M_value&1)==_M_value
rather than _M_value>=0.  When not honoring NaNs, the 2 case should be
unreachable and so (_M_value&1)==_M_value is then equivalent to
_M_value>=0,
but is not a single use but two uses.  I'll need to pattern match that case
specially.

2021-05-06  Jakub Jelinek  

PR tree-optimization/94589
* tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Call
spaceship_replacement.
(cond_only_block_p, spaceship_replacement): New functions.

* gcc.dg/pr94589-1.c: New test.
* gcc.dg/pr94589-2.c: New test.
* gcc.dg/pr94589-3.c: New test.
* gcc.dg/pr94589-4.c: New test.
* g++.dg/opt/pr94589-1.C: New test.
* g++.dg/opt/pr94589-2.C: New test.
* g++.dg/opt/pr94589-3.C: New test.
* g++.dg/opt/pr94589-4.C: New test.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-05-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #50719|0   |1
is obsolete||
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #19 from Jakub Jelinek  ---
Created attachment 50741
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50741=edit
gcc12-pr94589.patch

Full untested patch.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #50710|0   |1
is obsolete||

--- Comment #18 from Jakub Jelinek  ---
Created attachment 50719
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50719=edit
gcc12-pr94589-wip.patch

Updated patch.  This already does the optimizations, though needs testcase
coverage and perhaps something better for debug uses of the PHI_RESULT besides
the single non-debug use.
And, it won't work together with the above mentioned forwprop patch, I'm afraid
it would need to virtually undo that change but it would be hard.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #17 from Jakub Jelinek  ---
Created attachment 50710
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50710=edit
gcc12-pr94589-wip.patch

WIP patch that just matches those spaceship comparisons followed by single use
comparison of that with -1/0/1, but doesn't yet perform any of the
optimizations.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #16 from Jakub Jelinek  ---
Writing a phiopt patch now.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #15 from Jakub Jelinek  ---
C testcase (though surely we need C++ one too because with C++ there are
aggregates and inline functions involved):

int f1 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a == 0; }
int f2 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a < 0; }
int f3 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a > 0; }
int f4 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a != 0; }
int f5 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a <= 0; }
int f6 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a >= 0; }
int f7 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a == -1; }
int f8 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a != -1; }
int f9 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a == 1; }
int f10 (int x, int y) { int a = x == y ? 0 : x < y ? -1 : 1; return a != 1; }
int f11 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a ==
0; }
int f12 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a < 0;
}
int f13 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a > 0;
}
int f14 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a !=
0; }
int f15 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a <=
0; }
int f16 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a >=
0; }
int f17 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a ==
-1; }
int f18 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a !=
-1; }
int f19 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a ==
1; }
int f20 (float x, float y) { int a = x == y ? 0 : x < y ? -1 : 1; return a !=
1; }

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #14 from Richard Biener  ---
#include 
bool k(int a, int b){
  auto c = (a <=> b);
  return c>0;
}

Produces

   [local count: 1073741824]:
  if (a_1(D) == b_3(D))
goto ; [34.00%]
  else
goto ; [66.00%]

   [local count: 708669601]:
  if (a_1(D) < b_3(D))
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 354334801]:

   [local count: 1073741824]:
  # prephitmp_8 = PHI <0(2), 0(3), 1(4)>
  return prephitmp_8;

from PRE (or the proposed patch) where it is not matched further.  This kind
of patters could be handled by phiopt.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #13 from Richard Biener  ---
Created attachment 50707
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50707=edit
prototype

For reference this is the prototype patch I mentioned.  I wasn't entirely happy
and wanted to explore the ??? in the commit message.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread glisse at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #12 from Marc Glisse  ---
(In reply to rguent...@suse.de from comment #11)

> For PR7
> I have prototyped a forwprop patch to try constant folding
> stmts with all-constant PHIs, thus in this case c$_M_value_2 > 0,
> when there's only a single use of it

Maybe we could handle any case where trying to fold the single use (counting
x*x as a single use of x) with each possible value satisfies is_gimple_val (or
whatever the condition is to be allowed in a PHI, and without introducing a use
of a ssa_name before it is defined), so that things like PHI & X would
simplify. But the constant case is indeed the most important, and should allow
the optimization in this PR before the vectorizer using reassoc1.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #11 from rguenther at suse dot de  ---
On Thu, 29 Apr 2021, glisse at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589
> 
> --- Comment #7 from Marc Glisse  ---
> Some key steps in the optimization:
> PRE turns PHI<-1,0,1> > 0 into PHI<0,0,1>
> reassoc then combines the operations (it didn't in gcc-10)
> forwprop+phiopt cleans up (i>0)!=0?1:0 into just i>0.
> 
> Having to wait until phiopt4 to get the simplified form is still very 
> long, and most likely causes missed optimizations in earlier passes. But 
> nice progress!

Agreed - requiring PRE (and thus -O2) is also less than optimal for
such a core feature.  But the IL we get is simply awkward ;)

ifcombine/phiopt2 see

   [local count: 1073741824]:
  if (i_1(D) == 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870913]:
  if (i_1(D) < 0)
goto ; [41.00%]
  else
goto ; [59.00%]

   [local count: 316753840]:

   [local count: 1073741824]:
  # c$_M_value_2 = PHI <0(2), -1(3), 1(4)>
  _4 = c$_M_value_2 > 0;

I guess that's what we should try to work with.  For PR7
I have prototyped a forwprop patch to try constant folding
stmts with all-constant PHIs, thus in this case c$_M_value_2 > 0,
when there's only a single use of it (that basically does what
PRE later does but earlier and for a very small subset of cases).

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #10 from Jonathan Wakely  ---
(In reply to Marc Glisse from comment #8)
> PR96480 would be my guess.

Yes

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #9 from Jonathan Wakely  ---
(In reply to Richard Biener from comment #6)
> Not sure what fixed it there.

Seems to be r11-2593

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread glisse at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #8 from Marc Glisse  ---
PR96480 would be my guess.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread glisse at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #7 from Marc Glisse  ---
Some key steps in the optimization:
PRE turns PHI<-1,0,1> > 0 into PHI<0,0,1>
reassoc then combines the operations (it didn't in gcc-10)
forwprop+phiopt cleans up (i>0)!=0?1:0 into just i>0.

Having to wait until phiopt4 to get the simplified form is still very long, and
most likely causes missed optimizations in earlier passes. But nice progress!

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Richard Biener  changed:

   What|Removed |Added

  Known to work||11.1.0

--- Comment #6 from Richard Biener  ---
Note the original testcase is optimized with GCC 11

   [local count: 1073741824]:
  _2 = i_1(D) > 0;
  return _2;

but not on the GCC 10 branch.  Not sure what fixed it there.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

--- Comment #5 from Jonathan Wakely  ---
(In reply to Marc Glisse from comment #0)
> For most comparisons @ we do optimize (i<=>0)@0 to just i@0, but not for >
> and <=. Spaceship operator<=> is very painful to use, but I expect we will
> end up seeing a lot of it with C++20, and comparing its result with 0 is
> almost the only way to use its output, so it seems important to optimize
> this common case.

Maybe even common enough to transform it in the FE.


(In reply to Victor Khimenko from comment #2)
> Note that gcc looks bad even on the example from Microsoft's blog post:
> 
> https://godbolt.org/z/Jc7TcN


Right, this doesn't only affect (i<=>0)@0 but also (i<=>j)@0 which is very
common, because it's what the FE synthesizes for operator@ when the type only
defines operator<=>


https://godbolt.org/z/19dM8PdaM

#include  

struct X
{
int i = 0;
auto operator<=>(const X&) const = default;
};

bool lt(X l, X r) { return l once is easier than defining all of < > <= >= but the code is bad:

lt(X, X):
xor eax, eax
cmp edi, esi
je  .L2
setge   al
lea eax, [rax-1+rax]
.L2:
shr al, 7
ret
lt(Y, Y):
cmp edi, esi
setlal
ret

It seems like it should be possible for the FE to recognize that in this
trivial case the defaulted <=> is just comparing integers, and therefore the
synthesized op< could be transformed from (l.i <=> r.i) < 0 to simply l.i < r.i

The FE is not required to synthesize exactly (l.i<=>r.i) @ 0 as long as the
result is correct, so l.i @ r.i would be OK (and avoids the poor codegen until
the missed-optimization gets done).

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Jonathan Wakely  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=94006

--- Comment #4 from Jonathan Wakely  ---
PR 100322 shows that this missed-optimization causes a regression for code
using std::chrono::duration types. Since C++20 their comparisons use
operator<=> and so produce much worse code than the same source compiled as
C++17.

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2021-04-29 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Jonathan Wakely  changed:

   What|Removed |Added

 CC||aaron at aarongraham dot com

--- Comment #3 from Jonathan Wakely  ---
*** Bug 100322 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2020-06-05 Thread khim at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Victor Khimenko  changed:

   What|Removed |Added

 CC||khim at google dot com

--- Comment #2 from Victor Khimenko  ---
Note that gcc looks bad even on the example from Microsoft's blog post:

https://godbolt.org/z/Jc7TcN

The fact that MSVC also looks bad on example from Microsoft is not really
relevant, it's MSVC, after all.

Blogpost itself is here:
https://devblogs.microsoft.com/cppblog/simplify-your-code-with-rocket-science-c20s-spaceship-operator/

[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0

2020-04-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2020-04-14
 Ever confirmed|0   |1
 CC||rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
Hmm.  I would have said phiopt but then there's the missing opportunity to
handle PHIs as COND_EXPRs and simplify them via match.pd patterns.  In this
case simplify

 i_1(D) != 0 ? i_1(D) >= 0 : 0

that could be forwprops job then.  The only pecularity is when the PHI
isn't the only thing controlled by the condition, thus the control flow
not vanishing with replacing the PHI.  In this case we'd turn the code into

   [local count: 1073741824]:
  if (i_1(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870913]:
  _2 = i_1(D) >= 0;

   [local count: 1073741824]:
  prephitmp_6 = i_1(D) > 0;
  return prephitmp_6;

and leave the rest to DCE.  Obviously if the other two compares are not
dead afterwards this isn't going to be a good idea.

And CFG pattern matching is done by phiopt - which for the actual
simplification could resort to match.pd simplifying a COND_EXPR as well.
And yeah, I think there's a dup about PHIs and "folding".