https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92651

--- Comment #4 from Hongyu Wang <wwwhhhyyy333 at gmail dot com> ---
(In reply to Richard Biener from comment #2)
> Btw, which variant is actually the fastest for you?   abs expansion doesn't
> do any cost comparison but just uses direct abs, max and then the xor with
> shift as third option (and after that fall back to compare & jump which later
> might be if-converted into cmov).

Actually the xor with shift appears to be the fastest: it improves 525.x264_r
by about 8% compared to the pmaxsd one, and with cmove the improvement is 6.5%.
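
For reference, the three abs variants being compared can be sketched in C
roughly as below. This is only an illustration, not the expander code: the
function names are made up here, and the sketch assumes 32-bit values, an
arithmetic right shift for negative operands (which GCC provides), and
x != INT32_MIN so that negation does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* max variant: abs(x) = max(x, -x); the vectorized form maps to pmaxsd. */
static int32_t abs_max(int32_t x)
{
    int32_t neg = -x;            /* assumes x != INT32_MIN */
    return x > neg ? x : neg;
}

/* xor-with-shift variant: mask is 0 for x >= 0 and -1 for x < 0. */
static int32_t abs_xor_shift(int32_t x)
{
    int32_t mask = x >> 31;      /* assumes arithmetic right shift */
    return (x ^ mask) - mask;
}

/* compare-and-select variant: a candidate for if-conversion into cmov. */
static int32_t abs_select(int32_t x)
{
    return x < 0 ? -x : x;       /* assumes x != INT32_MIN */
}

/* Hypothetical helper: checks that all three variants agree for x. */
static void check_abs_variants(int32_t x)
{
    assert(abs_max(x) == abs_xor_shift(x));
    assert(abs_max(x) == abs_select(x));
}
```

All three compute the same result for every input except INT32_MIN, so the
choice between them is purely a question of cost on the target.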

I don't think this conversion should happen on every cmove instruction,
regardless of how many SSE registers it would use. I think the simplest way to
avoid this is to adjust the cost.
