https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98713
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---

This is specific to x86. There, if the inputs are unpredictable and the results aren't consumed so early that the cmov latency kills performance, cmov sometimes improves performance a lot. On the other hand, if the inputs are predictable, branches are often much faster than cmov. I'm not aware of other architectures where conditional moves are such a mixed bag, e.g. on arm/aarch64 I think using conditional moves is almost always better.
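
To make the trade-off concrete, here is a minimal sketch of the kind of code where the choice matters. The function names and the threshold example are hypothetical and not taken from the testcase in this PR. Both functions compute the same result; the first is written as a branch and the second as a select, which is a candidate for cmov on x86 (or csel on aarch64). The compiler is free to if-convert the first, turn the second back into a branch, or vectorize both, so the source form is only a hint and the generated assembly (e.g. from `gcc -O2 -S`) needs to be checked.

```c
#include <stddef.h>
#include <stdint.h>

/* Branch form: sums the elements greater than threshold.
   Tends to do well when the comparison outcome is predictable
   (e.g. sorted or mostly uniform input). */
int64_t sum_above_branch(const int32_t *v, size_t n, int32_t threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > threshold)
            sum += v[i];
    }
    return sum;
}

/* Select form: same result, written so the condition picks a value
   instead of guarding the addition. This is a candidate for a
   conditional move, which avoids mispredictions on unpredictable
   input but puts the select on the dependency chain of sum. */
int64_t sum_above_select(const int32_t *v, size_t n, int32_t threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int64_t add = (v[i] > threshold) ? v[i] : 0;
        sum += add;
    }
    return sum;
}
```

With random input the branch in the first form mispredicts often, which is where cmov can win on x86. With sorted or otherwise predictable input the branch is nearly free, and the extra latency of the select in the loop-carried chain can make the second form slower.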