On Sat, Dec 7, 2013 at 3:20 AM, Jeff Law <l...@redhat.com> wrote:
> On 12/06/13 03:29, bin.cheng wrote:
>>
>> Hi,
>> Entry pr41488 shows a case in which induction variable can't be recognized
>> or coalesced.  As noted in the bug entry, one possible fix is to teach PRE
>> not to do some specific transformation.  However, it's in nature a scalar
>> evolution issue.  Considering below snippet:
>>
>>     # i_17 = PHI <i_13(5), 0(3)>
>>     # _20 = PHI <_5(5), start_4(D)(3)>
>>     ...
>>     i_13 = i_17 + 1;
>>     _5 = start_4(D) + i_13;
>>
>> Variable _20 appears in the form of PEELED chrec like (start_4, _5)_LOOP,
>> meaning it has value "start_4" in the 1st iteration, then changes to _5 in
>> following iterations.  PEELED chrec is not implemented by GCC right now,
>> it
>> can be simplified into either POLYNOMIAL or PERIOD one.
>
> Right.  PEELED_CHREC was removed from the LNO branch back in 2004,
> presumably before the LNO branch was merged into the mainline.  No reason
> was given.

Maybe we don't have any user for such CHRECs in GCC now, but is's just guessing.

>
> But what you're really discovering here is that _20 is just another standard
> looking IV that appears in the form of a peeled chrec, right?

Exactly, IVOPT handles only polynomial ones, and _20 is one such chrec
only appearing in peeled form.

>
>
>
>  The POLYNOMIAL
>>
>> chrec is processed by GCC after being recognized, as in the examle, _20 is
>> actually {start_4, 1}_LOOP.
>
> Right.  Based on reading the archives, it looks like this stuff is/was
> generated by PRE.  I also suspect jump threading can create them.  There was
> talk of throttling PRE to leave things in a form that the IV analysis could
> more easily digest, but I'm not sure if that was ever completed.

Also could just because it is coded in that way, just as the case I
added in patch.  I found real examples from ggc-page.c in GCC.
But it's always good to cleanup input of an optimization, I presume
that's why Richard tried to move IVOPT later before.

>
> [ snip ]
>
>
>>
>> One point needs to be clarified that I use tree_to_aff_combination_expand
>> in
>> the patch.  Rational is the number of the specific PEELED_CHRECs should be
>> moderate, I also check the equality literally firstly and resort to affine
>> facility if that fails.  By measuring bootstrap of gcc and spec2kint, I
>> collected the number of opportunities caught by this patch like below:
>>                              literal comparison/affine comparison
>> GCC                          ~1200/~250
>> Spec2Kint              93/34
>>
>> I could back trace the ssa chain instead of using affine functions, but
>> that
>> would miss some cases.
>
> I assume tree_to_aff_combination_expand is relatively expensive, thus the
> two approaches, one which avoids tree_to_aff_combination_expand.
Considering the dump of case in the patch:

  <bb 2>:
  _16 = start_4(D) + 1000;
  if (end_6(D) < _16)
    goto <bb 3>;
  else
    goto <bb 6>;

  <bb 3>:
  pretmp_22 = sp_7(D)->data;

  <bb 4>:
  # i_17 = PHI <i_13(5), 1000(3)>
  # _20 = PHI <_5(5), _16(3)>
  _9 = (unsigned int) _20;
  _10 = _9 * 4;
  _11 = pretmp_22 + _10;
  *_11 = 0;
  i_13 = i_17 + -1;
  _5 = start_4(D) + i_13;
  if (_5 > end_6(D))
    goto <bb 5>;
  else
    goto <bb 6>;

  <bb 5>:
  goto <bb 4>;

I have to prove (_16 + -1) equals to (start_4 + 999) for _20, so
either using affine function to back trace the definition of _16, or I
have to back trace ssa_chain manually.  Here I use affine function
because the number of such cases should be moderate and there are more
complicated case in which simple back trace will lose.

Another question, is it acceptable to add an parameter for
tree_to_aff_combination_expand to limit the number of recursive call
for it?  Thus we don't need to expand to leaf node every time.

>
>
> In add_old_iv_candidates, is it the case that the only time
> SSA_NAME_DEF_STMT (def) is another PHI node is when it's one of these affine

I suppose.  Actually IVOPT make an assert on IP_ORIGINAL candidates in
function rewrite_use_nonlinear_expr, like:

  /* An important special case -- if we are asked to express value of
     the original iv by itself, just exit; there is no need to
     introduce a new computation (that might also need casting the
     variable to unsigned and back).  */
  if (cand->pos == IP_ORIGINAL
      && cand->incremented_at == use->stmt)
    {
      enum tree_code stmt_code;

      gcc_assert (is_gimple_assign (use->stmt));
      ......

The only case I can think about involving other kind of phi node is
for merging phi in conditional code like:

LOOP:
    x_1 = phi <x_2, start>
    ....
    if (cond)
       x_3 = x_1 + 1;
    else
       x_4 = x_1 + 1;
    x_2 = phi <x_3, x_4>

Though SCEV knows x_3/x_4 are ssa_names for an iv in different
branches, IVOPT don't handle them this way (not treated as an original
iv).  This is one defect of current implementation of IVOPT, I think.

> ivs appearing in the form of a PEELED_CHREC?  And in that case, we do not
> want to record the possibility of leaving the original IV untouched?  --

IVOPT adds original cand and tries to keep it the original IV is
because it does have an updating statement.  For IVs picked up by this
patch, it doesn't have stepping statement, so no need/way to leaving
it untouched.  It will be represented by candidates for IVs from which
it is peeled.

> Just trying to make sure I understand the code before giving a final
> approval.

I hope this can explain the patch a little bit, and thanks for your reviewing.

Thanks,
bin

>
> jeff



-- 
Best Regards.

Reply via email to