Re: More of a Loop distribution.

Richard Biener Fri, 14 Aug 2015 00:25:24 -0700

On Fri, Aug 14, 2015 at 8:13 AM, Ajit Kumar Agarwal
<ajit.kumar.agar...@xilinx.com> wrote:
>
>
> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Friday, August 14, 2015 11:30 AM
> To: Ajit Kumar Agarwal
> Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli 
> Hunsigida; Nagaraju Mekala
> Subject: RE: More of a Loop distribution.
>
> On August 14, 2015 4:59:07 AM GMT+02:00, Ajit Kumar Agarwal 
> <ajit.kumar.agar...@xilinx.com> wrote:
>>
>>
>>-----Original Message-----
>>From: Richard Biener [mailto:richard.guent...@gmail.com]
>>Sent: Thursday, August 13, 2015 3:23 PM
>>To: Ajit Kumar Agarwal
>>Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta;
>>Vidhumouli Hunsigida; Nagaraju Mekala
>>Subject: Re: More of a Loop distribution.
>>
>>On Thu, Aug 13, 2015 at 9:37 AM, Ajit Kumar Agarwal
>><ajit.kumar.agar...@xilinx.com> wrote:
>>> All:
>>>
>>> Loop distribution considers DDG to decide on distributing the Loops.
>>> The Loops with control statements like IF-THEN-ELSE can also be
>>> Distributed. Instead of Data Dependency Graph, the Control Dependence
>>Graph should be considered in order to distribute the loops In presence
>>of control Statements.
>>>
>>> Also the presence of multiple exits in the Loop can also be
>>considered for Loop distribution transformation.
>>>
>>> Thus the above transformation helps in the Libquantum benchmarks for
>>SPEC 2006.
>>>
>>> There are following articles that looks interesting to me.
>>>
>>> "Loop Distribution in presence of arbitrarily control flow Ken
>>Kennedy et.al."
>>> "Loop Distribution in presence of Multiple Exits Bor-Ming Hsieh
>>etal."
>>>
>>>
>>> I don't think the loop distribution in presence of control flow is
>>implemented in GCC/LLVM.
>>>
>>> I think it is feasible to consider the above for the implementation
>>in GCC.
>>
>>>>It's true that loop distribution does not try to distribute based on
>>any control structure heuristics but it only considers data locality.
>>It does however already >>compute the control dependence graph (and
>>uses it to add control edges to the DDG to properly add data dependence
>>edges to uses of control statements >>necessary in partitions).
>>
>>>>So it should be a matter of specifying the proper set of starting
>>statements it tries separating.
>>
>>Thanks.
>>
>>>>Not sure which kind of distribution you are after, can you give an
>>example?
>>
>>I would like to have a distribution of the loop having control flow.
>>For example
>>
>>For (I = 2 ; I < N; i++)
>>{
>>    If  (condition)
>>       {
>>  S1:         A[i] = ...
>>  S2:        D[i] = A[i-1]...
>>       }
>>}
>>
>>The above loop can be distributed with two loops having one loop with
>>S1  inside IF and another loop with S2 with the IF.
>>The two scenario can be true.
>>
>>1. The condition inside IF have a check on A[i] and is dependent on S1.
>>In this case the distribution is difficult. And the above article From
>>Ken Kennedy et.al does store the partial results of comparison in an
>>temporary array and that array is compared inside the IF Condition.
>>This makes the loop distributed. This is what I was looking for which I
>>found in the above article.
>>
>>2. The condition inside the IF in the above loop is not dependent on
>>the S1 and S2 , and this case the loop can be distributed.
>>
>>In the above two scenario the GCC can't distribute the loop, as the
>>control dependency graph ( control structure ) is not used.
>
>>>The above loop can be distributed by gcc just fine.
>
> Existing  loop distribution implementation in GCC distributes the above loop 
> in both the above scenario's?


GCC cannot do the trick of computing the condition into a temporary
array.  What it does
(and may then reject for dependence reasons) is to have the original
condition (including an
eventual reference to A[i]) in both partitions containing S1 and S2.
So if the exact thing
loos like

  for (i = 0; i < N; i++)
    {
       if (A[i])
         {
            A[i] = ...;
            D[i] = A[i-1] ...;
         }
   }

then I see no dependence that makes distributing into

  for (i = 0; i < N; i++)
    {
       if (A[i])
         D[i] = A[i-1] ...;
    }
 for (i =0; i < N; i++)
    {
       if (A[i])
         A[i] = ...;
    }

invalid.  But as the testcase isn't complete or compilable I can't
actually verify
what GCC does here.  I would think loop distributions cost metric would reject
this distribution because A is accessed in both loops, but as said, only real
code examples will show.

All I wanted to say is the underlying support is there in loop
distribution - where
current loop distribution is lacking is a) determining the desired partitioning
(by determining the stmts to distribute) and b) the cost model that fuses back
partitions that are related.

Oh, and of course CSE comes in the way of both cost modeling and actual
dependences.  Loop distribution doesn't understand to reload an expression
from memory if it sees

   tem = B[i] + C[i];
   A[i] = tem;   (1)
   D[i] = tem;   (2)

then the partitions containing (1) and (2) will both have the tem =
B[i] + C[i] compute
inside them - there isn't any code trying to replace that with A[i] or
D[i] in either
partition.  I think that's the most important thing to improve with
GCCs loop distribution
pass (as it will also fix some of the cost modelling issues).

Richard.

> Thanks & Regards
> Ajit
>
> Richard.
>
>  The
>>advantage of
>>The above loop distribution makes the loop vectorizable which otherwise
>>not possible due to presence of multiple statements inside the IF and
>>Also may not be IF-converted due to presence of multiple statements. If
>>we distribute the loop for the above two scenarios the individual loops
>>
>>in the distributed loop can be vectorized which is otherwise not
>>possible.
>>
>>Thanks & Regards
>>Ajit
>>
>>
>>Richard.
>>
>>> Thanks & Regards
>>> Ajit
>
>

Re: More of a Loop distribution.

Reply via email to