2009/12/17 Zdenek Dvorak <rakd...@kam.mff.cuni.cz>:
> Hi,
>
>> > Is there a way to pass the unroller the maximum number of iterations
>> > of the loop, so that it can decide not to unroll when
>> > the maximum number is small?
>> >
>> > To be more specific, I am referring to the following case:
>> > After the vectorizer decides to peel for alignment
>> > it creates three loops:
>> > [1] scalar loop - the prologue to align
>> >    memory access.
>> > [2] the vectorized loop
>> > [3] scalar loop - the remaining scalar computations.
>> >
>> > If the unroller does not know the number of iterations at compile time,
>> > it unrolls loops with run-time checks in the following way
>> > (taken from loop-unroll.c):
>> >
>> >   for (i = 0; i < n; i++)
>> >     body;
>> >
>> >   ==>
>> >
>> >   i = 0;
>> >   mod = n % 4;
>> >
>> >   switch (mod)
>> >     {
>> >       case 3:
>> >         body; i++;
>> >       case 2:
>> >         body; i++;
>> >       case 1:
>> >         body; i++;
>> >       case 0: ;
>> >     }
>> >
>> >   while (i < n)
>> >     {
>> >       body; i++;
>> >       body; i++;
>> >       body; i++;
>> >       body; i++;
>> >     }
>> >
>> >
>> > The vectorizer knows at compile time the maximum number of iterations
>> > that will be needed for the prologue and the epilogue. In some cases
>> > it seems there is no need to unroll and create redundant loops.
>>
>> You can set niter_max in the niter_desc of simple loops.  There is
>> also nb_iter_bound for all loops.  Of course the
>> issue is that loop information is sometimes destroyed.  It also looks
>> like RTL loop analysis may not re-use this information.
>>
>> Maybe Zdenek knows a better answer.
>
> Currently, there is no reliable way to pass this information to RTL.  The
> best I can come up with (without a significant amount of changes to other
> parts of the compiler) would be to insert code like
>
> if (n > 5)
>   special_abort ();
>
> before the loop in the vectorizer if you know for sure that the loop will
> iterate at most 5 times, use these hints to bound the number of iterations
> in the unroller (we do not do this at the moment, but it should be easy),
> and remove the calls to special_abort and the corresponding branches after
> the unroller.

We do have __builtin_unreachable (), but that is removed at expansion time
already ;)  Also the code does have

 n = orig_n % 4;

already (or the equivalent masking and subtraction),
but I guess the RTL loop analysis doesn't catch that either.
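
To make the shape of such a hint concrete, here is a minimal user-level
sketch in C.  It is not vectorizer or unroller code: the function name
epilogue_sum and its parameters are made up for illustration, and whether
any pass actually derives an iteration bound from either the modulo or the
guard is an assumption, not something established in this thread.

```c
#include <stddef.h>

/* Hypothetical scalar epilogue left over after processing the array
   four elements at a time.  All names here are illustrative only.  */
int
epilogue_sum (const int *a, size_t orig_n)
{
  size_t n = orig_n % 4;        /* remaining iterations: always 0..3 */
  int sum = 0;

  /* Explicit bound hint in the spirit of the special_abort () idea.
     It is redundant with the modulo above; a loop analysis that
     tracked either one could cap the trip count at 3.  */
  if (n > 3)
    __builtin_unreachable ();

  /* Walk the last n elements of the array.  */
  for (size_t i = 0; i < n; i++)
    sum += a[orig_n - n + i];

  return sum;
}
```

With a trip count known to be at most 3, unrolling this loop by 4 with a
run-time check would only add code, which is the case the original
question is about.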

Richard.

> Zdenek
>
