On Thu, Jul 13, 2023 at 2:54 AM Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> The following makes sure that FP x > y ? x : y style max/min operations
> are if-converted at the GIMPLE level. While we can neither match
> it to MAX_EXPR nor .FMAX as both have different semantics with IEEE
> than the ternary ?: operation we can make sure to maintain this form
> as a COND_EXPR so backends have the chance to match this to instructions
> their ISA offers.
>
> The patch does this in phiopt where we recognize min/max and instead
> of giving up when we have to honor NaNs we alter the generated code
> to a COND_EXPR.
>
> This resolves PR88540 and we can then SLP vectorize the min operation
> for its testcase. It also resolves part of the regressions observed
> with the change matching bit-inserts of bit-field-refs to vec_perm.
>
> Expansion from a COND_EXPR rather than from compare-and-branch
> regresses gcc.target/i386/pr54855-13.c and gcc.target/i386/pr54855-9.c
> by producing extra moves while the corresponding min/max operations
> are now already synthesized by RTL expansion, register selection
> isn't optimal. This can be also provoked without this change by
> altering the operand order in the source.
>
> It regresses gcc.target/i386/pr110170.c where we end up CSEing the
> condition which makes RTL expansion no longer produce the min/max
> directly and code generation is obfuscated enough to confuse
> RTL if-conversion.
>
> It also regresses gcc.target/i386/ssefp-[12].c where oddly one
> variant isn't if-converted and ix86_expand_fp_movcc doesn't
> match directly (the FP constants get expanded twice). A fix
> could be in emit_conditional_move where both prepare_cmp_insn
> and emit_conditional_move_1 force the constants to (different)
> registers.
>
> Otherwise bootstrapped and tested on x86_64-unknown-linux-gnu.
>
>         PR tree-optimization/88540
>         * tree-ssa-phiopt.cc (minmax_replacement): Do not give up
>         with NaNs but handle the simple case by if-converting to a
>         COND_EXPR.

One thing I was thinking about adding to phiopt is having the last
pass do the conversion to COND_EXPR if the target supports a
conditional move for that expression. That should fix this one, right?
This was one of the things I was working towards with the move to
match-and-simplify, too.

Thanks,
Andrew

>
>         * gcc.target/i386/pr88540.c: New testcase.
>         * gcc.target/i386/pr54855-12.c: Adjust.
>         * gcc.target/i386/pr54855-13.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/i386/pr54855-12.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr54855-13.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr88540.c    | 10 ++++++++++
>  gcc/tree-ssa-phiopt.cc                     | 21 ++++++++++++++++-----
>  4 files changed, 28 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88540.c
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr54855-12.c b/gcc/testsuite/gcc.target/i386/pr54855-12.c
> index 2f8af392c83..09e8ab8ae39 100644
> --- a/gcc/testsuite/gcc.target/i386/pr54855-12.c
> +++ b/gcc/testsuite/gcc.target/i386/pr54855-12.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -mavx512fp16" } */
> -/* { dg-final { scan-assembler-times "vmaxsh\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-times "vm\[ai\]\[nx\]sh\[ \\t\]" 1 } } */
>  /* { dg-final { scan-assembler-not "vcomish\[ \\t\]" } } */
>  /* { dg-final { scan-assembler-not "vmovsh\[ \\t\]" { target { ! ia32 } } } } */
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr54855-13.c b/gcc/testsuite/gcc.target/i386/pr54855-13.c
> index 87b4f459a5a..a4f25066f81 100644
> --- a/gcc/testsuite/gcc.target/i386/pr54855-13.c
> +++ b/gcc/testsuite/gcc.target/i386/pr54855-13.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -mavx512fp16" } */
> -/* { dg-final { scan-assembler-times "vmaxsh\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-times "vm\[ai\]\[nx\]sh\[ \\t\]" 1 } } */
>  /* { dg-final { scan-assembler-not "vcomish\[ \\t\]" } } */
>  /* { dg-final { scan-assembler-not "vmovsh\[ \\t\]" { target { ! ia32 } } } } */
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr88540.c b/gcc/testsuite/gcc.target/i386/pr88540.c
> new file mode 100644
> index 00000000000..b927d0c57d5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88540.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2" } */
> +
> +void test(double* __restrict d1, double* __restrict d2, double* __restrict d3)
> +{
> +  for (int n = 0; n < 2; ++n)
> +    d3[n] = d1[n] < d2[n] ? d1[n] : d2[n];
> +}
> +
> +/* { dg-final { scan-assembler "minpd" } } */
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index 467c9fd108a..13ee486831d 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -1580,10 +1580,6 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_
>
>    tree type = TREE_TYPE (PHI_RESULT (phi));
>
> -  /* The optimization may be unsafe due to NaNs. */
> -  if (HONOR_NANS (type) || HONOR_SIGNED_ZEROS (type))
> -    return false;
> -
>    gcond *cond = as_a <gcond *> (*gsi_last_bb (cond_bb));
>    enum tree_code cmp = gimple_cond_code (cond);
>    tree rhs = gimple_cond_rhs (cond);
> @@ -1770,6 +1766,9 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_
>        else
>          return false;
>      }
> +  else if (HONOR_NANS (type) || HONOR_SIGNED_ZEROS (type))
> +    /* The optimization may be unsafe due to NaNs. */
> +    return false;
>    else if (middle_bb != alt_middle_bb && threeway_p)
>      {
>        /* Recognize the following case:
> @@ -2103,7 +2102,19 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_
>    /* Emit the statement to compute min/max. */
>    gimple_seq stmts = NULL;
>    tree phi_result = PHI_RESULT (phi);
> -  result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, arg1);
> +
> +  /* When we can't use a MIN/MAX_EXPR still make sure the expression
> +     stays in a form to be recognized by ISA that map to IEEE x > y ? x : y
> +     semantics (that's not IEEE max semantics). */
> +  if (HONOR_NANS (type) || HONOR_SIGNED_ZEROS (type))
> +    {
> +      result = gimple_build (&stmts, cmp, boolean_type_node,
> +                             gimple_cond_lhs (cond), rhs);
> +      result = gimple_build (&stmts, COND_EXPR, TREE_TYPE (phi_result),
> +                             result, arg_true, arg_false);
> +    }
> +  else
> +    result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, arg1);
>
>    gsi = gsi_last_bb (cond_bb);
>    gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
> --
> 2.35.3
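
For readers following the thread, here is a small sketch of why the quoted patch cannot simply emit MAX_EXPR or .FMAX when NaNs and signed zeros must be honored. It is not GCC code: ternary_max, fmax_like and check_ternary_vs_fmax are hypothetical names made up for this illustration, and fmax_like is only a hand-written model of the C rule that fmax returns the other argument when exactly one argument is a NaN.

```c
#include <assert.h>
#include <math.h>

/* The source-level form the patch keeps as a COND_EXPR: the second
   operand is returned whenever the comparison is false, which
   includes every comparison involving a NaN.  */
static double
ternary_max (double x, double y)
{
  return x > y ? x : y;
}

/* Hand-written model (an assumption for illustration, not GCC code)
   of the NaN rule of C's fmax: if exactly one operand is a NaN,
   return the other one.  */
static double
fmax_like (double x, double y)
{
  if (isnan (x))
    return y;
  if (isnan (y))
    return x;
  return x > y ? x : y;
}

/* Checks the cases where the two forms agree and where they differ.  */
void
check_ternary_vs_fmax (void)
{
  double qnan = NAN;

  /* Ordinary operands: both forms agree.  */
  assert (ternary_max (1.0, 2.0) == 2.0);
  assert (fmax_like (1.0, 2.0) == 2.0);

  /* NaN in the first operand: both return the second operand.  */
  assert (ternary_max (qnan, 1.0) == 1.0);
  assert (fmax_like (qnan, 1.0) == 1.0);

  /* NaN in the second operand: the ternary returns the NaN, while
     the fmax-like rule returns the number.  */
  assert (isnan (ternary_max (1.0, qnan)));
  assert (fmax_like (1.0, qnan) == 1.0);

  /* Signed zeros compare equal, so the ternary returns its second
     operand, whatever its sign.  */
  assert (signbit (ternary_max (0.0, -0.0)));
  assert (!signbit (ternary_max (-0.0, 0.0)));
}
```

The "return the second operand when the comparison is false" behaviour is, as far as I know, also how the SSE min/max instructions such as minpd treat NaNs and equal zeros, which would explain why the new pr88540.c testcase can scan for minpd once the COND_EXPR form survives to the backend.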