Thanks, it seems that Cong's idea is exactly what I meant. Is there
a patch I can try? 

Bingfeng


-----Original Message-----
From: Xinliang David Li [mailto:davi...@google.com] 
Sent: 04 February 2014 18:57
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org; Cong Hou
Subject: Re: Merge epilog loop & loop version due to alias/alignment in vectorization?

See also http://gcc.gnu.org/ml/gcc/2013-08/msg00259.html

There are some concerns, but it would be interesting to do some
benchmarking of this.

David

On Tue, Feb 4, 2014 at 8:27 AM, Bingfeng Mei <b...@broadcom.com> wrote:
> Hi,
> One of the biggest issues we have with GCC vectorization is bloated code size.
> For example, the vectorized version is 2.5 times the size of the
> non-vectorized one for the following simple code. One reason is that GCC
> often creates one loop copy because of aliasing/alignment and one epilog
> loop because of the loop iteration constraint.
>
> void foo (int *a, int *b, int N)
> {
>   int i;
>   for (i = 0; i < N; i++)
>   {
>     a[i] = b[i];
>   }
> }
>
> Looking closely, the epilog loop and the alignment/aliasing loop are almost
> identical, differing only in the initial values of some variables entering
> the loop. Can they be merged into one in such situations? If so, any
> suggestions on how to implement it?
>
> ...
>   <bb 7>:
>   # i_39 = PHI <i_47(8), i_50(10)>
>   _41 = (long unsigned int) i_39;
>   _42 = _41 * 4;
>   _43 = a_7(D) + _42;
>   _44 = b_9(D) + _42;
>   _45 = *_44;
>   *_43 = _45;
>   i_47 = i_39 + 1;
>   if (N_4(D) > i_47)
>     goto <bb 8>;
>   else
>     goto <bb 15>;
>
>   <bb 8>:
>   goto <bb 7>;
>
>   <bb 9>:
>   # i_51 = PHI <i_13(6)>
>   tmp.6_56 = (int) ratio_mult_vf.5_38;
>   if (niters.3_34 == ratio_mult_vf.5_38)
>     goto <bb 16>;
>   else
>     goto <bb 10>;
>
>   <bb 10>:
>   # i_50 = PHI <tmp.6_56(9), 0(4)>
>   goto <bb 7>;
>
>   <bb 11>:
>   goto <bb 6>;
>
>   <bb 12>:
>
>   <bb 13>:
>   # i_24 = PHI <0(12), i_32(14)>
>   _26 = (long unsigned int) i_24;
>   _27 = _26 * 4;
>   _28 = a_7(D) + _27;
>   _29 = b_9(D) + _27;
>   _30 = *_29;
>   *_28 = _30;
>   i_32 = i_24 + 1;
>   if (N_4(D) > i_32)
>     goto <bb 14>;
>   else
>     goto <bb 17>;
> ...
>
> Thanks,
> Bingfeng
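
For illustration, the merge asked about above can be sketched at the source
level. This is only a hand-written approximation, not what GCC emits and not
the patch discussed in the thread. The names foo_merged, no_overlap and VF are
hypothetical, the vectorization factor of 4 is an assumption, and the unrolled
body merely stands in for real vector loads and stores. The point is that a
single scalar loop can serve as the epilog (entered with i at the end of the
vector loop, like tmp.6_56 in the dump) and as the alias fallback (entered
with i == 0).

```c
#include <stdint.h>

#define VF 4  /* assumed vectorization factor: 4 ints per vector */

/* Hypothetical runtime alias check: nonzero when the n-element ranges
   starting at a and b do not overlap, so the vector body is safe. */
static int no_overlap(const int *a, const int *b, int n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)n * sizeof(int);
    return pa + bytes <= pb || pb + bytes <= pa;
}

void foo_merged(int *a, int *b, int N)
{
    int i = 0;

    if (N >= VF && no_overlap(a, b, N)) {
        /* Largest multiple of VF not exceeding N
           (plays the role of ratio_mult_vf in the dump). */
        int main_end = N - N % VF;
        for (; i < main_end; i += VF) {
            /* Stand-in for one vector load followed by one vector store. */
            int t0 = b[i], t1 = b[i + 1], t2 = b[i + 2], t3 = b[i + 3];
            a[i] = t0; a[i + 1] = t1; a[i + 2] = t2; a[i + 3] = t3;
        }
    }

    /* Single shared scalar loop: it is the epilog when the vector loop ran
       (i == main_end) and the fallback when the check failed (i == 0). */
    for (; i < N; i++)
        a[i] = b[i];
}
```

With this shape only one scalar copy of the loop body remains, at the cost of
the fallback path sharing its entry with the epilog. Whether that is
profitable in practice would need the benchmarking David mentions.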
