https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #21 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Hongtao.liu from comment #19)
> (In reply to Hongtao.liu from comment #17)
> > (In reply to Hongtao.liu from comment #16)
> > > There're already testcases for vec_extract/vec_set/vec_duplicate, but
> > > those testcases are written under TARGET_AVX512FP16, i'll make a copy of
> > > them and test them w/o avx512fp16.
> > 
> > Also we can relax condition of extendv*hfv*sf and truncv*sfv*hf to
> > avx512vl/f16c so that vect-float16-1.c could be vectorized.
> > 
> > vect-float16-1.c
> > 
> > void
> > foo (_Float16 *__restrict__ a, _Float16 *__restrict__ b,
> >      _Float16 *__restrict__ c)
> > {
> >   for (int i = 0; i < 256; i++)
> >     a[i] = b[i] + c[i];
> > }
> 
> Even w/ support of extend_optab/trunc_optab, veclower still lowers v8hf
> addition to the scalar version. The mismatch is that the vectorizer assumes
> '+/-' is supported by default (w/o checking the optab, it just checks whether
> v8hf is supported in vector_mode_supported_p) and then vectorizes the loop,
> but veclower lowers the vector operation back to scalar, which creates much
> worse code than the non-vectorized version.

I was under the impression that the autovectorizer won't vectorize if
TARGET_VECTORIZE_PREFERRED_SIMD_MODE returns word_mode. Also, the documentation
for TARGET_VECTOR_MODE_SUPPORTED_P claims that only moves are needed.

So, it looks like the middle end is somehow inconsistent here. Adding CC.

> Could veclower try a wider mode for addition? Even if veclower can, vNhfmode
> is better supported under avx512vl or f16c, or else the vectorized code is
> really bad; then why should we support the vector mode under a generic target?

We should use it for parameter passing, moves, inserts, extracts and shuffles.
In the case of VxHF, we can reuse HImode insns for all these operations.
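
To illustrate why HImode insns are sufficient here, below is a minimal C sketch
(not GCC source, and not how the backend implements it). It models a V8HF value
as eight 16-bit bit patterns; the type v8hf_bits and all helper names are
hypothetical, invented for this example. The point it tries to show is that
extract, insert, duplicate and shuffle only move 16-bit lanes around and never
interpret them as floating-point values, so integer element operations of the
same width do the job.

```c
#include <assert.h>
#include <stdint.h>

#define V8HF_LANES 8

/* Hypothetical stand-in for a V8HF vector: each lane holds the raw
   IEEE binary16 bit pattern of one half-precision element.  */
typedef struct { uint16_t lane[V8HF_LANES]; } v8hf_bits;

/* Extract one element: a plain 16-bit read, exactly what a V8HI
   element extract would do.  */
static uint16_t
v8hf_extract (const v8hf_bits *v, unsigned i)
{
  assert (i < V8HF_LANES);
  return v->lane[i];
}

/* Insert one element: a plain 16-bit write.  */
static void
v8hf_insert (v8hf_bits *v, unsigned i, uint16_t bits)
{
  assert (i < V8HF_LANES);
  v->lane[i] = bits;
}

/* Broadcast one 16-bit pattern to every lane.  */
static v8hf_bits
v8hf_duplicate (uint16_t bits)
{
  v8hf_bits r;
  for (unsigned i = 0; i < V8HF_LANES; i++)
    r.lane[i] = bits;
  return r;
}

/* Permute lanes according to SEL; result lane i is source lane SEL[i].  */
static v8hf_bits
v8hf_shuffle (const v8hf_bits *v, const uint8_t sel[V8HF_LANES])
{
  v8hf_bits r;
  for (unsigned i = 0; i < V8HF_LANES; i++)
    {
      assert (sel[i] < V8HF_LANES);
      r.lane[i] = v->lane[sel[i]];
    }
  return r;
}
```

None of these helpers needs half-precision arithmetic support. Arithmetic such
as '+' is different: it has to interpret the bits, which is where the
avx512fp16 or f16c/avx512vl conversion paths discussed above come in.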
