https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309

Peter Cordes <peter at cordes dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter at cordes dot ca

--- Comment #36 from Peter Cordes <peter at cordes dot ca> ---
Related: a similar case where cmov is a worse choice than a branch: a
threshold condition over an array input that happens to already be sorted,
so the branch predicts well:

https://stackoverflow.com/questions/28875325/gcc-optimization-flag-o3-makes-code-slower-than-o2

GCC with -fprofile-generate / -fprofile-use does correctly decide to use
branches.

GCC7 and later (including current trunk) with -O3 -fno-tree-vectorize
de-optimizes by putting the CMOV on the critical path, instead of using it to
create a zero/non-zero input for the ADD; see PR82666.  If you do allow full
-O3, then vectorization is effective, though.
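
To make the two code-gen shapes concrete, here is a rough C sketch of a
threshold sum written both ways.  The function names and types are made up
for illustration, and this is not the exact code from the linked question.
The compiler is free to turn either form into the other (or into branches),
so the source shape is only a hint at what the asm might look like.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example functions; names are illustrative only. */

/* Select on the running total.  If this becomes a CMOV between "sum" and
   "sum + data[i]", the CMOV is part of the dependency chain through sum
   that is carried from one iteration to the next (ADD, then CMOV). */
int64_t sum_select_total(const int *data, size_t n, int threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int64_t with = sum + data[i];
        sum = (data[i] >= threshold) ? with : sum;
    }
    return sum;
}

/* Select on the addend.  The conditional only produces data[i] or 0,
   which does not depend on sum, so only the ADD is on the chain that is
   carried across iterations. */
int64_t sum_select_addend(const int *data, size_t n, int threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int addend = (data[i] >= threshold) ? data[i] : 0;
        sum += addend;
    }
    return sum;
}
```

Both functions return the same result for any input; the difference is only
in which operation the select feeds.  Comparing the asm for each at -O2,
-O3 -fno-tree-vectorize and full -O3 is probably the easiest way to see what
a given GCC version actually does with them.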
