https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101097

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> Hmm, so the difference is that we use loop vect for 'foo' but fail to do
> that for 'bar' and BB vect succeeds.  Disabling loop vect but enabling BB
> vect also produces optimal code for 'foo' (unrolling happens before):
> 
> foo:
> .LFB0:
>         .cfi_startproc
>         vpmovzxwd       (%rsi), %ymm0
>         vpmovzxwd       (%rdi), %ymm1
>         vpaddd  %ymm1, %ymm0, %ymm0
>         vmovdqu %ymm0, (%rdx)
>         vzeroupper
> 
> the key difference in the vectorizer is that BB vect supports different
> vector sizes in the same instance but the loop vectorizer can only use
> a single vector size.
Is there any plan to extend the loop vectorizer to handle different vector
sizes?
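
For readers without the testcase at hand, here is a minimal sketch of the kind of kernel the quoted assembly suggests. It is an assumption reconstructed from the instructions above (two vpmovzxwd loads, one vpaddd, one 256-bit store), not the source attached to the PR; the parameter names and the trip count of 8 are hypothetical. The inputs fill a 128-bit vector (8 x 16-bit) while the results fill a 256-bit vector (8 x 32-bit), which appears to be the mixed-size case described in comment #1.

```c
#include <stdint.h>

/* Hypothetical reconstruction, not the testcase from the PR:
   eight 16-bit inputs are zero-extended to 32 bits and summed.
   Inputs span 128 bits, outputs span 256 bits.  */
void
foo (const uint16_t *a, const uint16_t *b, uint32_t *c)
{
  for (int i = 0; i < 8; i++)
    c[i] = (uint32_t) a[i] + b[i];
}
```

Compiling something like this with -O3 -mavx2, with and without -fno-tree-loop-vectorize, should show whether the loop and BB vectorizers differ in the way described; the exact output will depend on the GCC version.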
