https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071

--- Comment #10 from Peter Cordes <peter at cordes dot ca> ---
(In reply to Uroš Bizjak from comment #9)
> There was similar patch for sqrt [1], I think that the approach is
> straightforward, and could be applied to other reg->reg scalar insns as
> well, independently of PR87007 patch.
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2018-05/msg00202.html

Yeah, that looks good.  So I think it's just vcvtss2sd and sd2ss, and
VROUNDSS/SD that aren't done yet.

That patch covers VSQRTSS/SD, VRCPSS, and VRSQRTSS.

It also bizarrely uses it for VMOVSS, which gcc should only emit if it actually
wants to merge (right?).  *If* this part of the patch isn't a bug

-           return "vmovss\t{%1, %0, %0|%0, %0, %1}";
+           return "vmovss\t{%d1, %0|%0, %d1}";

then even better would be vmovaps %1, %0 (which can benefit from
mov-elimination, and doesn't need a port-5-only ALU uop.)  Same for vmovsd of
course.

Reply via email to