2016-06-16 9:00 GMT+03:00 Jeff Law <l...@redhat.com>:
> On 05/19/2016 01:39 PM, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> This patch introduces changes required to run vectorizer on loop epilogue.
>> This also enables epilogue vectorization using a vector of smaller size.
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2016-05-19  Ilya Enkovich  <ilya.enkov...@intel.com>
>>
>>         * tree-if-conv.c (tree_if_conversion): Make public.
>>         * tree-if-conv.h: New file.
>>         * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Don't
>>         try to enhance alignment for epilogues.
>>         * tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Return
>>         created loop.
>>         * tree-vect-loop.c: Include tree-if-conv.h.
>>         (destroy_loop_vec_info): Preserve LOOP_VINFO_ORIG_LOOP_INFO in
>>         loop->aux.
>>         (vect_analyze_loop_form): Init LOOP_VINFO_ORIG_LOOP_INFO and reset
>>         loop->aux.
>>         (vect_analyze_loop): Reset loop->aux.
>>         (vect_transform_loop): Check if created epilogue should be returned
>>         for further vectorization.  If-convert epilogue if required.
>>         * tree-vectorizer.c (vectorize_loops): Add a queue of loops to
>>         process and insert vectorized loop epilogues into this queue.
>>         * tree-vectorizer.h (vect_do_peeling_for_loop_bound): Return created
>>         loop.
>>         (vect_transform_loop): Return created loop.
>
> As Richi noted, the additional calls into the if-converter are unfortunate.
> I'm not sure how else to avoid them though.  It looks like we can run
> if-conversion on just the epilogue, so maybe that's not too bad.
>
>
>> @@ -1212,8 +1213,8 @@ destroy_loop_vec_info (loop_vec_info loop_vinfo, bool clean_stmts)
>>    destroy_cost_data (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo));
>>    loop_vinfo->scalar_cost_vec.release ();
>>
>> +  loop->aux = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo);
>>    free (loop_vinfo);
>> -  loop->aux = NULL;
>>  }
>
> Hmm, there seems to be a level of indirection I'm missing here.  We're
> smuggling LOOP_VINFO_ORIG_LOOP_INFO around in loop->aux.  Ewww.  I thought
> the whole point of LOOP_VINFO_ORIG_LOOP_INFO was to smuggle the VINFO from
> the original loop to the vectorized epilogue.  What am I missing?  Rather
> than smuggling around in the aux field, is there some inherent reason why we
> can't just copy the info from the original loop directly into
> LOOP_VINFO_ORIG_LOOP_INFO for the vectorized epilogue?

LOOP_VINFO_ORIG_LOOP_INFO is used for several things:
 - mark this loop as epilogue
 - get VF of original loop (required for both mask and nomask modes)
 - get decision about epilogue masking

That's all.  When an epilogue is created it has no LOOP_VINFO.  Also, when we
vectorize a loop we create and destroy its LOOP_VINFO multiple times.  When a
loop has a LOOP_VINFO, loop->aux points to it and the original LOOP_VINFO is in
LOOP_VINFO_ORIG_LOOP_INFO.  When a loop has no LOOP_VINFO associated, I have no
place to bind it to the original loop, and therefore I use the vacant loop->aux
for that.  Any other way to bind an epilogue to its original loop would work
as well.  I just chose loop->aux to avoid new fields and data structures.
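
To make the handoff concrete, here is a minimal standalone sketch of the
scheme.  It is a model only, not the patch itself: VecInfo, Loop, create_info,
destroy_info and is_epilogue are hypothetical stand-ins for loop_vec_info,
struct loop and the create/destroy paths, and the orig member plays the role
of LOOP_VINFO_ORIG_LOOP_INFO.

```cpp
#include <cassert>

// Hypothetical stand-in for loop_vec_info.
struct VecInfo {
  int vf;              // vectorization factor
  bool mask_epilogue;  // decision about epilogue masking
  VecInfo *orig;       // models LOOP_VINFO_ORIG_LOOP_INFO; null for a normal loop
};

// Hypothetical stand-in for struct loop.  While the loop has no info of its
// own, aux holds the original loop's info (or null).
struct Loop {
  void *aux = nullptr;
};

// Models the analysis step: pick up whatever was parked in aux as the
// original info, then point aux at the freshly created info.
VecInfo *create_info(Loop &loop, int vf) {
  VecInfo *info = new VecInfo{vf, false, static_cast<VecInfo *>(loop.aux)};
  loop.aux = info;
  return info;
}

// Models the destroy step: park the original info back in aux instead of
// clearing it, so the next analysis attempt can find it again.
void destroy_info(Loop &loop) {
  VecInfo *info = static_cast<VecInfo *>(loop.aux);
  assert(info != nullptr);
  loop.aux = info->orig;
  delete info;
}

// A loop is treated as an epilogue exactly when it has an original info.
bool is_epilogue(const VecInfo *info) { return info->orig != nullptr; }
```

The point of the sketch is that the link to the original loop survives any
number of create/destroy cycles without adding a new field to the loop.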

>
>> +  /* FORNOW: Currently alias checks are not inherited for epilogues.
>> +     Don't try to vectorize epilogue because it will require
>> +     additional alias checks.  */
>
> Are the alias checks here redundant with the ones done for the original
> loop?  If so won't DOM eliminate them?

I revisited this part recently and thought it should actually be safe to
assume we have no aliasing in the epilogue because it is dominated by the alias
checks of the original loop.  So I prepared a patch to remove this restriction
and avoid generating alias checks for epilogues (so we compute the required
alias checks but don't emit them).  I haven't sent this patch yet.
Do you think it is a valid assumption?
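
For reference, the control flow I have in mind looks roughly like the sketch
below.  It is a scalar model under stated assumptions, not generated code:
ranges_disjoint, vector_step and add_arrays are hypothetical names, and the
widths 8 and 4 are picked only for illustration.  The epilogue is reachable
only through the single runtime check that guards the main vector loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical runtime alias check: true when [a, a+n) and [b, b+n) do not
// overlap.  Addresses are compared as integers.
static bool ranges_disjoint(const int *a, const int *b, std::size_t n) {
  auto pa = reinterpret_cast<std::uintptr_t>(a);
  auto pb = reinterpret_cast<std::uintptr_t>(b);
  std::uintptr_t bytes = n * sizeof(int);
  return pa + bytes <= pb || pb + bytes <= pa;
}

// Models one vector iteration of width W: compute all lanes, then store them.
template <std::size_t W>
static void vector_step(int *a, const int *b, std::size_t i) {
  int tmp[W];
  for (std::size_t k = 0; k < W; ++k) tmp[k] = a[i + k] + b[i + k];
  for (std::size_t k = 0; k < W; ++k) a[i + k] = tmp[k];
}

void add_arrays(int *a, const int *b, std::size_t n) {
  std::size_t i = 0;
  if (ranges_disjoint(a, b, n)) {  // the only alias check that is emitted
    // Main vectorized loop, assumed VF = 8.
    for (; i + 8 <= n; i += 8) vector_step<8>(a, b, i);
    // Vectorized epilogue with a smaller vector, assumed VF = 4.  No second
    // check here: this point is dominated by the check above.
    for (; i + 4 <= n; i += 4) vector_step<4>(a, b, i);
  }
  // Scalar remainder, and the fallback when the ranges may overlap.
  for (; i < n; ++i) a[i] += b[i];
}
```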

>
>
> And something just occurred to me -- is there some inherent reason why SLP
> doesn't vectorize the epilogue, particularly for the cases where we can
> vectorize the epilogue using smaller vectors?  Sorry if you've already
> answered this somewhere or it's a dumb question.

IIUC this may happen only if we unroll the epilogue into a single BB, which
happens only when the epilogue iteration count is known.  Right?

>
>
>
>>
>> +       /* Add new loop to a processing queue.  To make it easier
>> +          to match loop and its epilogue vectorization in dumps
>> +          put new loop as the next loop to process.  */
>> +       if (new_loop)
>> +         {
>> +           loops.safe_insert (i + 1, new_loop->num);
>> +           vect_loops_num = number_of_loops (cfun);
>> +         }
>> +
>
> So just to be clear, the only reason to do this is for dumps -- other than
> processing the loop before its epilogue, there's no other inherently
> necessary ordering of the loops, right?

Right, I don't see other reasons to do it.
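
As an illustration of the resulting order, here is a small standalone model
of the queue.  It uses std::vector in place of the GCC vec, and process_loops
and the vectorize callback are hypothetical names; the callback returns the
number of the created epilogue loop, or -1 when there is none.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Processes loop numbers in order.  When vectorizing a loop produces an
// epilogue, the epilogue is inserted right after the current position, so it
// is handled next and shows up next to its parent in the dump.
std::vector<int> process_loops(std::vector<int> loops,
                               const std::function<int(int)> &vectorize) {
  std::vector<int> order;
  // Index-based iteration: inserting into the vector invalidates iterators
  // but not the index i.
  for (std::size_t i = 0; i < loops.size(); ++i) {
    int num = loops[i];
    order.push_back(num);
    int epilogue = vectorize(num);
    if (epilogue >= 0)
      loops.insert(loops.begin() + static_cast<std::ptrdiff_t>(i + 1),
                   epilogue);
  }
  return order;
}
```

With loops 1 and 2 producing epilogues 3 and 4, the processing order is
1, 3, 2, 4.  Appending to the end instead would give 1, 2, 3, 4, which visits
the same loops but separates each epilogue from its loop in the dump.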

Thanks,
Ilya

>
>
> Jeff
