On Fri, Jun 22, 2018 at 5:11 PM, Kugan Vivekanandarajah
<kugan.vivekanandara...@linaro.org> wrote:
> When we set niter with maybe_zero, currently final_value_relacement
> will not happen due to expression_expensive_p not handling. Patch 1
> adds this.
>
> With that we have the following optimized gimple.
>
>   <bb 2> [local count: 118111601]:
>   if (b_4(D) != 0)
>     goto <bb 3>; [89.00%]
>   else
>     goto <bb 4>; [11.00%]
>
>   <bb 3> [local count: 105119324]:
>   _2 = (unsigned long) b_4(D);
>   _9 = __builtin_popcountl (_2);
>   c_3 = b_4(D) != 0 ? _9 : 1;
>
>   <bb 4> [local count: 118111601]:
>   # c_12 = PHI <c_3(3), 0(2)>
>
> I assume that 1 in  b_4(D) != 0 ? _9 : 1; is OK (?) because when the
No, it doesn't make much sense.  when b_4(D) == 0, the popcount
computed should be 0.  Point is you can never get b_4(D) == 0 with
guard condition in basic block 2.  So the result should simply be:

>   <bb 2> [local count: 118111601]:
>   if (b_4(D) != 0)
>     goto <bb 3>; [89.00%]
>   else
>     goto <bb 4>; [11.00%]
>
>   <bb 3> [local count: 105119324]:
>   _2 = (unsigned long) b_4(D);
>   c_3 = __builtin_popcountl (_2);
>
>   <bb 4> [local count: 118111601]:
>   # c_12 = PHI <c_3(3), 0(2)>

I think this is the code generated if maybe_zero is not set?  which it
should not be set here.
For the same reason, it can be further optimized into:

>   <bb 2> [local count: 118111601]:
>   _2 = (unsigned long) b_4(D);
>   c_12 = __builtin_popcountl (_2);
>

> latch execute zero times for b_4 == 0 means that the body will execute
> ones.
You never get zero times latch here with the aforementioned guard condition.

BTW, I didn't look at following patches which could be wanted optimizations.

Thanks,
bin
>
> The issue here is, since we are checking if (b_4(D) != 0) before
> entering the loop means we don't need to set maybe_zero. Patch 2
> handles this.
>
> With that we have
>   <bb 2> [local count: 118111601]:
>   if (b_4(D) != 0)
>     goto <bb 3>; [89.00%]
>   else
>     goto <bb 4>; [11.00%]
>
>   <bb 3> [local count: 105119324]:
>   _2 = (unsigned long) b_4(D);
>   _9 = __builtin_popcountl (_2);
>
>   <bb 4> [local count: 118111601]:
>   # c_12 = PHI <0(2), _9(3)>
>
> As advised earlier, patch 3 adds phiopt support to remove this.
>
> Bootstrap and regression testing are ongoing.
>
> Is this OK for trunk.
>
> Thanks,
> Kugan

Reply via email to