https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
At least on your short testcase, clang doesn't use divps either.
We do support mulv2sf3, addv2sf3 etc., but not divv2sf3, I bet because with
TARGET_MMX_WITH_SSE it would divide by zero in the 3rd and 4th elements.
Perhaps we could insert 1.0f, 1.0f into those elements of the divisor
before using divps?
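
A minimal sketch of that padding idea, written with GCC's generic vector
extensions rather than as an actual divv2sf3 expander. The helper name
div_v2sf and the v4sf typedef are made up for illustration; on x86 with SSE
the 4-lane division would be expected to become a single divps, but that
depends on the compiler and flags.

```c
/* 4 x float vector type (GCC/clang vector extension). */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: divide two 2-element float vectors with one
   4-lane division.  The two unused lanes of the divisor are set to
   1.0f, so they compute 0.0f / 1.0f and cannot raise a spurious
   divide-by-zero or invalid exception. */
static void div_v2sf(const float a[2], const float b[2], float out[2])
{
    v4sf num = { a[0], a[1], 0.0f, 0.0f };
    v4sf den = { b[0], b[1], 1.0f, 1.0f };  /* pad divisor with 1.0f */
    v4sf quot = num / den;                  /* one 4-lane division */

    out[0] = quot[0];
    out[1] = quot[1];
}
```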

Another question is whether we could teach SLP to vectorize even with factors
that are not a power of 2. Say, loads/stores (and with e.g. AVX512 almost
everything) could be done with masked operations, most arithmetic could be
done normally, and we'd just need to watch what values we get in the extra
elements and make sure they don't generate exceptions etc.
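
To illustrate what that could look like for a group of 3 floats, here is a
rough hand-written sketch, not what the SLP vectorizer emits today. The
function name combine3 is hypothetical, and the partial memcpy calls merely
stand in for masked loads/stores; the extra lane is preloaded with values
chosen so the arithmetic on it is harmless.

```c
#include <string.h>

/* 4 x float vector type (GCC/clang vector extension). */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical example: dst[i] = x[i] / y[i] + x[i] for i in 0..2,
   computed in 4-lane vectors.  Only 3 elements are read and written;
   the 4th lane keeps its safe preset values (0.0f and 1.0f). */
static void combine3(float *dst, const float *x, const float *y)
{
    v4sf vx = { 0.0f, 0.0f, 0.0f, 0.0f };
    v4sf vy = { 1.0f, 1.0f, 1.0f, 1.0f };  /* safe divisor in extra lane */

    /* Stand-ins for masked loads: copy only the 3 live elements. */
    memcpy(&vx, x, 3 * sizeof(float));
    memcpy(&vy, y, 3 * sizeof(float));

    v4sf res = vx / vy + vx;  /* extra lane computes 0.0f / 1.0f + 0.0f */

    /* Stand-in for a masked store: write back only 3 elements. */
    memcpy(dst, &res, 3 * sizeof(float));
}
```

For operations like addition the contents of the extra lane matter less, but
for division (or anything that can trap) the padding value has to be chosen
per operation.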
