On Thu, Feb 1, 2018 at 5:07 AM, Kugan Vivekanandarajah
<kugan.vivekanandara...@linaro.org> wrote:
> Hi Richard,
>
> On 31 January 2018 at 21:39, Richard Biener <richard.guent...@gmail.com> 
> wrote:
>> On Wed, Jan 31, 2018 at 11:28 AM, Kugan Vivekanandarajah
>> <kugan.vivekanandara...@linaro.org> wrote:
>>> Hi Richard,
>>>
>>> Thanks for the review.
>>> On 25 January 2018 at 20:04, Richard Biener <richard.guent...@gmail.com> 
>>> wrote:
>>>> On Wed, Jan 24, 2018 at 10:56 PM, Kugan Vivekanandarajah
>>>> <kugan.vivekanandara...@linaro.org> wrote:
>>>>> Hi All,
>>>>>
>>>>> Here is a patch for popcount builtin detection similar to LLVM. I
>>>>> would like to queue this for review for next stage 1.
>>>>>
>>>>> 1. This is done part of loop-distribution and effective for -O3 and above.
>>>>> 2. This does not distribute loop to detect popcount (like
>>>>> memcpy/memmove). I dont think that happens in practice. Please correct
>>>>> me if I am wrong.
>>>>
>>>> But then it has no business inside loop distribution but instead is
>>>> doing final value
>>>> replacement, right?  You are pattern-matching the whole loop after all.  I 
>>>> think
>>>> final value replacement would already do the correct thing if you
>>>> teached number of
>>>> iteration analysis that niter for
>>>>
>>>>   <bb 3> [local count: 955630224]:
>>>>   # b_11 = PHI <b_5(5), b_8(6)>
>>>>   _1 = b_11 + -1;
>>>>   b_8 = _1 & b_11;
>>>>   if (b_8 != 0)
>>>>     goto <bb 6>; [89.00%]
>>>>   else
>>>>     goto <bb 8>; [11.00%]
>>>>
>>>>   <bb 6> [local count: 850510900]:
>>>>   goto <bb 3>; [100.00%]
>>>
>>> I am looking into this approach. What should be the scalar evolution
>>> for b_8 (i.e. b & (b -1) in a loop) should be? This is not clear to me
>>> and can this be represented with the scev?
>>
>> No, it's not affine and thus cannot be represented.  You only need the
>> scalar evolution of the counting IV which is already handled and
>> the number of iteration analysis needs to handle the above IV - this
>> is the missing part.
> Thanks for the clarification. I am now matching this loop pattern in
> number_of_iterations_exit when number_of_iterations_exit_assumptions
> fails. If the pattern matches, I am inserting the _builtin_popcount in
> the loop preheater and setting the loop niter with this. This will be
> used by the final value replacement. Is this what you wanted?

No, you shouldn't insert a popcount stmt but instead the niter
GENERIC tree should be a CALL_EXPR to popcount with the
appropriate argument.

> In general, there is also a condition gating the loop like
>
> if (b_4 != 0)
>   goto loop;
> else
>   end:
>
> This of course will not be removed by final value replacement. Since
> popcount (0) is defined, this is redundant and could be removed but
> not removed.

Yeah, that's probably sth for another pass though.  I suppose the
end: case just uses zero in which case you'll have a PHI and you
can optimize

  if (b != 0)
    return popcount (b);
  return 0;

in phiopt.

Richard.

> Thanks,
> Kugan
>
>>
>> Richard.
>>
>>> Thanks,
>>> Kugan
>>>>
>>>> is related to popcount (b_5).
>>>>
>>>> Richard.
>>>>
>>>>> Bootstrapped and regression tested on aarch64-linux-gnu with no new 
>>>>> regressions.
>>>>>
>>>>> Thanks,
>>>>> Kugan
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 2018-01-25  Kugan Vivekanandarajah  <kug...@linaro.org>
>>>>>
>>>>>     PR middle-end/82479
>>>>>     * tree-loop-distribution.c (handle_popcount): New.
>>>>>     (pass_loop_distribution::execute): Use handle_popcount.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> 2018-01-25  Kugan Vivekanandarajah  <kug...@linaro.org>
>>>>>
>>>>>     PR middle-end/82479
>>>>>     * gcc.dg/tree-ssa/popcount.c: New test.

Reply via email to