On Wed, Oct 28, 2020 at 4:18 AM Alexandre Oliva <ol...@adacore.com> wrote: > > On Oct 27, 2020, Richard Biener <richard.guent...@gmail.com> wrote: > > > For trapping math SRC may be a constant? Better be safe > > and guard against TREE_CODE (src) != SSA_NAME. > > *nod* > > > You also want to guard against SSA_NAME_OCCURS_IN_ABNORMAL_PHI (src) > > since you cannot generally propagate or move uses of those. > > What if I don't propagate or move them?
That's fine then. > In my first cut at this change, I figured it (looked like it) took half > a brain to implement it, and shut down the other half, only making minor > tweaks to a copy of execute_cse_sincos_1. > > Your response was a wake-up call to many issues I hadn't considered but > should have, and that it looks like sincos doesn't either, so I ended up > rewriting the cse_conv function to use and propagate a preexisting > dominating def, instead of inserting one at a point where there wasn't > one. That's because any such conv might trap, unless an earlier one > didn't. > > > Any reason you are not replacing all uses via replace_uses_by > > and removing the old conversion stmts? OK, removing might > > wreck the upthread iterator so replacing with GIMPLE_NOP > > is the usual trick then. > > Removing them would be fine, I was just thoughtlessly mirroring the > cse_sincos_1 behavior that I'd based it on. > > > Now, I put in code to leave conversion results alone if > SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs), but I wonder if it would work to > replace uses thereof, or to propagate uses thereof, as long as I turned > such conversion results that would be deleted into copies. I.e., given: > > _2 = (double) _1; > _4 = sin (_2); > ... > _3 = (double) _1; // occurs in abnormal phi below > _5 = cos (_3); > ... > _6 = PHI <..., _3 (abnormal), ...>; > > Wouldn't this be ok to turn into: > > _2 = (double) _1; > _4 = sin (_2); > ... > _3 = _2; > _5 = cos (_2); > ... > _6 = PHI <..., _3 (abnormal), ...>; > > ? Yes. > > Anyway, here's a patch that does *not* assume it would work, and skips > defs used in abnormal phis instead. > > > sincos, however, may mess with them, and even introduce/move a trapping > call to a place where there wasn't any, even if none of the preexisting > calls were going to be executed. The only thing I fixed there was a > plain return within a FOR_EACH_IMM_USE_STMT that I introduced recently. > Though it's unlikely to ever hit, I understand it's wrong per the > current API. > > BTW, any reason why we are not (yet?) using something like: > > #define FOR_EACH_IMM_USE_STMT(STMT, ITER, SSAVAR) \ > for (auto_end_imm_use_stmt_traverse auto_end \ > ((((STMT) = first_imm_use_stmt (&(ITER), (SSAVAR))), \ > &(ITER))); \ > !end_imm_use_stmt_p (&(ITER)); \ > (void) ((STMT) = next_imm_use_stmt (&(ITER)))) Just laziness. Or rather last time I remembered this I tried to do it more fancy via range-for but didn't get very far due to the nested iteration ... > > Anyway, here's what I'm testing now. Bootstrap succeeded, regression > testing underway. Ok to install if it succeeds? OK. Thanks, Richard. > > CSE conversions within sincos > > From: Alexandre Oliva <ol...@adacore.com> > > On platforms in which Aux_[Real_Type] involves non-NOP conversions > (e.g., between single- and double-precision, or between short float > and float), the conversions before the calls are CSEd too late for > sincos to combine calls. > > This patch enables the sincos pass to CSE type casts used as arguments > to eligible calls before looking for other calls using the same > operand. > > > for gcc/ChangeLog > > * tree-ssa-math-opts.c (sincos_stats): Add conv_removed. > (execute_cse_conv_1): New. > (execute_cse_sincos_1): Call it. Fix return within > FOR_EACH_IMM_USE_STMT. > (pass_cse_sincos::execute): Report conv_inserted. > > for gcc/testsuite/ChangeLog > > * gnat.dg/sin_cos.ads: New. > * gnat.dg/sin_cos.adb: New. > * gcc.dg/sin_cos.c: New. > --- > gcc/testsuite/gcc.dg/sin_cos.c | 41 ++++++++++++++ > gcc/testsuite/gnat.dg/sin_cos.adb | 14 +++++ > gcc/testsuite/gnat.dg/sin_cos.ads | 4 + > gcc/tree-ssa-math-opts.c | 107 > +++++++++++++++++++++++++++++++++++++ > 4 files changed, 165 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.dg/sin_cos.c > create mode 100644 gcc/testsuite/gnat.dg/sin_cos.adb > create mode 100644 gcc/testsuite/gnat.dg/sin_cos.ads > > diff --git a/gcc/testsuite/gcc.dg/sin_cos.c b/gcc/testsuite/gcc.dg/sin_cos.c > new file mode 100644 > index 00000000..aa71dca > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/sin_cos.c > @@ -0,0 +1,41 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +/* This maps to essentially the same gimple that is generated for > + gnat.dg/sin_cos.adb, on platforms that use the wraplf variant of > + Ada.Numerics.Aux_Float. The value of EPSILON is not relevant to > + the test, but the test must be there to keep the conversions in > + different BBs long enough to trigger the problem that prevented the > + sincos optimization, because the arguments passed to sin and cos > + didn't get unified into a single SSA_NAME in time for sincos. */ > + > +#include <math.h> > + > +#define EPSILON 3.4526697709225118160247802734375e-4 > + > +static float my_sinf(float x) { > + return (float) sin ((double) x); > +} > + > +static float wrap_sinf(float x) { > + if (fabs (x) < EPSILON) > + return 0; > + return my_sinf (x); > +} > + > +static float my_cosf(float x) { > + return (float) cos ((double) x); > +} > + > +static float wrap_cosf(float x) { > + if (fabs (x) < EPSILON) > + return 1; > + return my_cosf (x); > +} > + > +float my_sin_cos(float x, float *s, float *c) { > + *s = wrap_sinf (x); > + *c = wrap_cosf (x); > +} > + > +/* { dg-final { scan-assembler "sincos\|cexp" { target *-linux-gnu* > *-w64-mingw* *-*-vxworks* } } } */ > diff --git a/gcc/testsuite/gnat.dg/sin_cos.adb > b/gcc/testsuite/gnat.dg/sin_cos.adb > new file mode 100644 > index 00000000..6e18df9 > --- /dev/null > +++ b/gcc/testsuite/gnat.dg/sin_cos.adb > @@ -0,0 +1,14 @@ > +-- { dg-do compile } > +-- { dg-options "-O2 -gnatn" } > + > +with Ada.Numerics.Elementary_Functions; > +use Ada.Numerics.Elementary_Functions; > +package body Sin_Cos is > + procedure Sin_Cos (Angle : T; SinA, CosA : out T) is > + begin > + SinA := Sin (Angle); > + CosA := Cos (Angle); > + end; > +end Sin_Cos; > + > +-- { dg-final { scan-assembler "sincos\|cexp" { target *-linux-gnu* > *-w64-mingw* *-*-vxworks* } } } > diff --git a/gcc/testsuite/gnat.dg/sin_cos.ads > b/gcc/testsuite/gnat.dg/sin_cos.ads > new file mode 100644 > index 00000000..a0eff3d > --- /dev/null > +++ b/gcc/testsuite/gnat.dg/sin_cos.ads > @@ -0,0 +1,4 @@ > +package Sin_Cos is > + subtype T is Float; > + procedure Sin_Cos (Angle : T; SinA, CosA : out T); > +end Sin_Cos; > diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c > index 90dfb98..65c7b34 100644 > --- a/gcc/tree-ssa-math-opts.c > +++ b/gcc/tree-ssa-math-opts.c > @@ -187,6 +187,10 @@ static struct > { > /* Number of cexpi calls inserted. */ > int inserted; > + > + /* Number of conversions removed. */ > + int conv_removed; > + > } sincos_stats; > > static struct > @@ -1099,6 +1103,103 @@ make_pass_cse_reciprocals (gcc::context *ctxt) > return new pass_cse_reciprocals (ctxt); > } > > +/* If NAME is the result of a type conversion, look for other > + equivalent dominating or dominated conversions, and replace all > + uses with the earliest dominating name, removing the redundant > + conversions. Return the prevailing name. */ > + > +static tree > +execute_cse_conv_1 (tree name) > +{ > + if (SSA_NAME_IS_DEFAULT_DEF (name) > + || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name)) > + return name; > + > + gimple *def_stmt = SSA_NAME_DEF_STMT (name); > + > + if (!gimple_assign_cast_p (def_stmt)) > + return name; > + > + tree src = gimple_assign_rhs1 (def_stmt); > + > + if (TREE_CODE (src) != SSA_NAME) > + return name; > + > + imm_use_iterator use_iter; > + gimple *use_stmt; > + > + /* Find the earliest dominating def. */ > + FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, src) > + { > + if (use_stmt == def_stmt > + || !gimple_assign_cast_p (use_stmt)) > + continue; > + > + tree lhs = gimple_assign_lhs (use_stmt); > + > + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs) > + || (gimple_assign_rhs1 (use_stmt) > + != gimple_assign_rhs1 (def_stmt)) > + || !types_compatible_p (TREE_TYPE (name), TREE_TYPE (lhs))) > + continue; > + > + bool use_dominates; > + if (gimple_bb (def_stmt) == gimple_bb (use_stmt)) > + { > + gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt); > + while (!gsi_end_p (gsi) && gsi_stmt (gsi) != def_stmt) > + gsi_next (&gsi); > + use_dominates = !gsi_end_p (gsi); > + } > + else if (dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), > + gimple_bb (def_stmt))) > + use_dominates = false; > + else if (dominated_by_p (CDI_DOMINATORS, gimple_bb (def_stmt), > + gimple_bb (use_stmt))) > + use_dominates = true; > + else > + continue; > + > + if (use_dominates) > + { > + std::swap (name, lhs); > + std::swap (def_stmt, use_stmt); > + } > + } > + > + /* Now go through all uses of SRC again, replacing the equivalent > + dominated conversions. We may replace defs that were not > + dominated by the then-prevailing defs when we first visited > + them. */ > + FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, src) > + { > + if (use_stmt == def_stmt > + || !gimple_assign_cast_p (use_stmt)) > + continue; > + > + tree lhs = gimple_assign_lhs (use_stmt); > + > + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs) > + || (gimple_assign_rhs1 (use_stmt) > + != gimple_assign_rhs1 (def_stmt)) > + || !types_compatible_p (TREE_TYPE (name), TREE_TYPE (lhs))) > + continue; > + > + if (gimple_bb (def_stmt) == gimple_bb (use_stmt) > + || dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), > + gimple_bb (def_stmt))) > + { > + sincos_stats.conv_removed++; > + > + gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt); > + replace_uses_by (lhs, name); > + gsi_remove (&gsi, true); > + } > + } > + > + return name; > +} > + > /* Records an occurrence at statement USE_STMT in the vector of trees > STMTS if it is dominated by *TOP_BB or dominates it or this basic block > is not yet initialized. Returns true if the occurrence was pushed on > @@ -1147,6 +1248,8 @@ execute_cse_sincos_1 (tree name) > int i; > bool cfg_changed = false; > > + name = execute_cse_conv_1 (name); > + > FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, name) > { > if (gimple_code (use_stmt) != GIMPLE_CALL > @@ -1181,7 +1284,7 @@ execute_cse_sincos_1 (tree name) > and, in subsequent rounds, that the built_in type is the same > type, or a compatible type. */ > if (type != t && !types_compatible_p (type, t)) > - return false; > + RETURN_FROM_IMM_USE_STMT (use_iter, false); > } > if (seen_cos + seen_sin + seen_cexpi <= 1) > return false; > @@ -2296,6 +2399,8 @@ pass_cse_sincos::execute (function *fun) > > statistics_counter_event (fun, "sincos statements inserted", > sincos_stats.inserted); > + statistics_counter_event (fun, "conv statements removed", > + sincos_stats.conv_removed); > > return cfg_changed ? TODO_cleanup_cfg : 0; > } > > > > > -- > Alexandre Oliva, happy hacker > https://FSFLA.org/blogs/lxo/ > Free Software Activist > GNU Toolchain Engineer