On Mon, Apr 27, 2015 at 9:12 AM, Denys Vlasenko <dvlas...@redhat.com> wrote:
>
> It is smaller, but not by much. It is two instructions smaller.

Ehh. That's _half_.

And on the decoding side, it's the difference between 6 bytes that
decode cleanly and can be decoded in parallel with other things
(assuming the 6-byte nop), and 13 bytes that will need at least 2 nops
(unless you want to do lots of prefixes, which is slow on some cores),
_and_ which is likely big enough that you will basically not be
decoding anything else that cycle.

So on the whole, your "smaller, but not by much" is all relative. It's
a relatively big difference.

So if one or two cycles in this code don't matter, then why are we
adding alternate instructions just to avoid a few ALU instructions and
a conditional branch that predicts perfectly? And if it does matter,
then the 6-byte option looks clearly better.

                 Linus