On Tue, Jun 20, 2017 at 11:20 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Fri, Jun 16, 2017 at 6:15 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> On Fri, Jun 16, 2017 at 11:21 AM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Mon, Jun 12, 2017 at 7:03 PM, Bin Cheng <bin.ch...@arm.com> wrote:
>>>> Hi,
>>>> For now, loop distribution handles variables used outside of loop as 
>>>> reduction.
>>>> This is inaccurate because all partitions contain statement defining 
>>>> induction
>>>> vars.
>>>
>>> But final induction values are usually not used outside of the loop...
>> This is in actuality for induction variable which is used outside of the 
>> loop.
>>>
>>> What is missing is loop distribution trying to change partition order.  In 
>>> fact
>>> we somehow assume we can move a reduction across a detected builtin
>>> (I don't remember if we ever check for validity of that...).
>> Hmm, I am not sure when we can't.  If there is any dependence between
>> builtin/reduction partitions, it should be captured by RDG or PG,
>> otherwise the partitions are independent and can be freely ordered as
>> long as reduction partition is scheduled last?
>>>
>>>> Ideally we should factor out scev-propagation as a standalone interface
>>>> which can be called when necessary.  Before that, this patch simply 
>>>> workarounds
>>>> reduction issue by checking if the statement belongs to all partitions.  
>>>> If yes,
>>>> the reduction must be computed in the last partition no matter how the 
>>>> loop is
>>>> distributed.
>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>
>>> stmt_in_all_partitions is not kept up-to-date during partition merging and 
>>> if
>>> merging makes the reduction partition(s) pass the stmt_in_all_partitions
>>> test your simple workaround doesn't work ...
>> I think it doesn't matter because:
>>   A) it's really workaround for induction variables.  In general,
>> induction variables are included by all partition.
>>   B) After classify partition, we immediately fuses all reduction
>> partitions.  More stmt_in_all_partitions means we are fusing
>> non-reduction partition with reduction partition, so the newly
>> generated (stmt_in_all_partitions) are actually not reduction
>> statements.  The workaround won't work anyway even the bitmap is
>> maintained.
>>>
>>> As written it's a valid optimization but can you please note it's 
>>> limitation in
>>> some comment please?
>> Yeah, I will add comment explaining it.
> Comment added in new version patch.  It also computes bitmap outside
> now, is it OK?

Ok.  Can you add a testcase for this as well please?  I think the
series up to this
is now fully reviewed, I defered 1/n (the new IFN) to the last one
containing the
runtime versioning.  Can you re-post that (you can merge with the IFN patch)
to apply after the series has been applied up to this?

Thanks,
Richard.

> Thanks,
> bin
> 2017-06-07  Bin Cheng  <bin.ch...@arm.com>
>
>     * tree-loop-distribution.c (classify_partition): New parameter and
>     better handle reduction statement.
>     (rdg_build_partitions): Revise comment.
>     (distribute_loop): Compute statements in all partitions and pass it
>     to classify_partition.

Reply via email to