https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #12)
> (In reply to Jakub Jelinek from comment #10)
> > At least on your short testcase clang doesn't use divps either.
> > We do support mulv2sf3, addv2sf3 etc. but not divv2sf3 I bet because with
> > TARGET_MMX_WITH_SSE it would divide by zero in the 3rd and 4th elts,
> > but perhaps we could insert 1.0f, 1.0f into those elements of the divisor
> > before using divps?
> 
> It could be done, but I was under impression that the sequence to load 1.0f
> into topmost elements nullifies the benefit of operation to divide two

Sure, so perhaps we should somewhat increase the vectorization cost of V2SFmode
division so that we would use it only if it is part of longer sequences?
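
For illustration, the padding idea from comment #10 can be sketched at the source level with GCC's generic vector extensions. This is only a sketch under assumptions, not the proposed backend pattern: the type name v4sf and the helper div_v2sf_padded are hypothetical, and whether the compiler actually emits a single divps for it depends on target and options.

```c
/* 16-byte vector of four floats, using the GCC/Clang vector extension. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: divide two 2-element float vectors using a
   4-element division.  The two unused upper lanes of the divisor are
   filled with 1.0f so that they compute 0.0f / 1.0f instead of
   0.0f / 0.0f, which would raise a floating-point exception.  */
static void div_v2sf_padded(const float a[2], const float b[2], float out[2])
{
    v4sf num = { a[0], a[1], 0.0f, 0.0f };  /* upper lanes: harmless zeros */
    v4sf den = { b[0], b[1], 1.0f, 1.0f };  /* upper lanes: 1.0f padding */
    v4sf q = num / den;                     /* element-wise division */

    /* Only the two low lanes carry the V2SF result. */
    out[0] = q[0];
    out[1] = q[1];
}
```

Building the padded divisor is the extra work Uroš refers to; the cost question is whether that setup is amortized when the division is part of a longer vectorized sequence.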
