https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70159

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-03-10
                 CC|                            |rguenth at gcc dot gnu.org
            Version|unknown                     |6.0
     Ever confirmed|0                           |1

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
We already compute nearly everything that is necessary:

Value numbering tmin_13 stmt = tmin_13 = inv_4 * _12;
Match-and-simplified inv_4 * _12 to tmax_11
RHS inv_4 * _12 simplified to tmax_11
Setting value number of tmin_13 to tmax_11 (changed)
...
Value numbering tmax_15 stmt = tmax_15 = inv_4 * _14;
Match-and-simplified inv_4 * _14 to tmin_8
RHS inv_4 * _14 simplified to tmin_8
Setting value number of tmax_15 to tmin_8 (changed)

so we know the equivalence but can't do anything sensible with it yet.

  <bb 5>:
  # tmin_1 = PHI <tmin_8(3), tmin_13(4)>
  # tmax_2 = PHI <tmax_11(3), tmax_15(4)>
  _16 = tmin_1 + tmax_2;

so value-wise this is

  <bb 5>:
  # tmin_1 = PHI <tmin_8, tmax_11>
  # tmax_2 = PHI <tmax_11, tmin_8>
  _16 = tmin_1 + tmax_2;

which we could transform by pattern matching this case and changing the
if (inv >= 0) to always take the path that otherwise has no side-effects.
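
To make the "value-wise" view concrete, here is a toy model of the check
such a pattern match would need.  It is only a sketch and not GCC internals:
struct phi2, phi_args_swapped and check_bb5_example are hypothetical names,
and the integers merely stand in for the value numbers of tmin_8 and tmax_11.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC internals: a two-argument PHI described only by
   the value numbers of its arguments, indexed by incoming edge.  */
struct phi2 {
  int vn[2];
};

/* True if q carries the same two values as p with the edges swapped.  */
bool phi_args_swapped(struct phi2 p, struct phi2 q)
{
  return p.vn[0] == q.vn[1] && p.vn[1] == q.vn[0];
}

/* The bb 5 case from the dump, with 8 and 11 standing in for the
   value numbers of tmin_8 and tmax_11.  */
void check_bb5_example(void)
{
  enum { VN_TMIN_8 = 8, VN_TMAX_11 = 11 };
  struct phi2 tmin_1 = { { VN_TMIN_8, VN_TMAX_11 } };
  struct phi2 tmax_2 = { { VN_TMAX_11, VN_TMIN_8 } };
  assert(phi_args_swapped(tmin_1, tmax_2));
}
```

When the check holds and the only use is a commutative operation such as
the plus in _16, both incoming edges produce the same value, so the
condition no longer matters for that use.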

I'm not sure it really fits into the FRE/PRE elimination phase but at least
it may be easier to do something like phi-opt on the more complex cases in
the VN framework (to avoid the need to compute equivalencies this complex).

Hoisting would surely help as well but it would still need something to
detect the commutatively redundant swap.  That is,

float foo_p(float d, float min, float max, float a)
{
  float tmin;
  float tmax;
  float inv = 1.0f / d;
  if (inv >= 0) {
      tmin = min;
      tmax = max;
  } else {
      tmin = max;
      tmax = min;
  }
  return tmax + tmin;
}

is not optimized either and we retain

  if (inv_4 >= 0.0)
    goto <bb 4>;
  else
    goto <bb 3>;

  <bb 3>:

  <bb 4>:
  # tmin_1 = PHI <min_5(D)(2), max_6(D)(3)>
  # tmax_2 = PHI <max_6(D)(2), min_5(D)(3)>
  _7 = tmin_1 + tmax_2;

until .optimized.  So that part of this PR is independent of the hoisting
issue we have other PRs for.
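
The branch is redundant because float addition is commutative, so both
arms produce min + max.  A minimal sketch that checks this bit for bit on
a handful of inputs follows.  foo_p is copied from above; bits_of,
count_mismatches and check_foo_p_matches_sum are hypothetical helpers, and
NaN inputs are left out on purpose because payload propagation may depend
on operand order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Testcase copied from this comment.  */
float foo_p(float d, float min, float max, float a)
{
  float tmin;
  float tmax;
  float inv = 1.0f / d;
  if (inv >= 0) {
      tmin = min;
      tmax = max;
  } else {
      tmin = max;
      tmax = min;
  }
  return tmax + tmin;
}

/* Hypothetical helper: raw bit pattern of a float.  */
static uint32_t bits_of(float f)
{
  uint32_t u;
  memcpy(&u, &f, sizeof u);
  return u;
}

/* Hypothetical helper: count sample inputs where foo_p differs from
   min + max.  Both signs of d are covered so both arms are taken.  */
int count_mismatches(void)
{
  static const float d[] = { 2.0f, -2.0f, 0.5f, -0.5f, 0.0f, -0.0f };
  static const float v[] = { 0.0f, -0.0f, 1.5f, -3.25f, 1e30f, -1e30f };
  int bad = 0;
  for (size_t i = 0; i < sizeof d / sizeof d[0]; i++)
    for (size_t j = 0; j < sizeof v / sizeof v[0]; j++)
      for (size_t k = 0; k < sizeof v / sizeof v[0]; k++) {
        float expect = v[j] + v[k];
        if (bits_of(foo_p(d[i], v[j], v[k], 0.0f)) != bits_of(expect))
          bad++;
      }
  return bad;
}

void check_foo_p_matches_sum(void)
{
  assert(count_mismatches() == 0);
}
```

Calling check_foo_p_matches_sum() from a small main is enough to run it.
It only samples a few values, so it illustrates the equivalence rather
than proving it.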

(simplify
 (plus (cond @0 @1 @2) (cond @0 @2 @1))
 (plus @1 @2))

would fix it if we'd match PHIs for conds as well.  If we "help" PRE with

float foo_p(float d, float min, float max, float a)
{
  float tmin;
  float tmax;
  float inv = 1.0f / d;
  float bar = min + max;
  if (inv >= 0) {
      tmin = min;
      tmax = max;
  } else {
      tmin = max;
      tmax = min;
  }
  return tmax + tmin + bar;
}

it figures out that tmax + tmin is equal to bar and optimizes this.  That
is something hoisting (as implemented in PRE) should then hopefully do in
one step.  Catching this somewhat earlier would be nice though.
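
For reference, the shape the simplify pattern quoted above targets can be
written at the source level with two conditional expressions on the same
condition.  This is only a sketch using unsigned arithmetic to keep the
comparison exact; swapped_cond_sum, plain_sum and check_rewrite are
hypothetical names, and nothing here says whether a given GCC version
actually folds it.

```c
#include <assert.h>

/* Shape matched by the pattern: both conds test the same @0 and have
   their arms swapped.  */
unsigned swapped_cond_sum(int c, unsigned a, unsigned b)
{
  return (c ? a : b) + (c ? b : a);
}

/* Shape after the rewrite: the condition is gone.  */
unsigned plain_sum(unsigned a, unsigned b)
{
  return a + b;
}

/* Exhaustive check over a small range that the two shapes agree.  */
void check_rewrite(void)
{
  for (int c = 0; c <= 1; c++)
    for (unsigned a = 0; a < 16; a++)
      for (unsigned b = 0; b < 16; b++)
        assert(swapped_cond_sum(c, a, b) == plain_sum(a, b));
}
```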
