https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560

--- Comment #4 from accelerator0099 at gmail dot com ---
Well, I hope gcc will just generate a mulx instruction on architectures with
BMI2. Let's look at what the AMD64 Architecture Programmer's Manual, Volume 3,
says about it:
"Computes the unsigned product of the specified source operand and the implicit
source operand rDX.
Writes the upper half of the product to the first destination and the lower
half to the second."
So a single mulx can do this. And according to the manual, it takes only 3 or 4
cycles to execute a mulx.
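
For illustration, here is a minimal sketch of the kind of source that maps onto
a single widening multiply: a 64x64 -> 128-bit unsigned product split into its
high and low halves. The name mul64_wide is hypothetical and not taken from the
bug report, and unsigned __int128 is a GCC extension available on x86_64.
Whether the compiler emits mulx or a plain mul for this depends on the target
flags (for example -mbmi2) and on register allocation, so treat it as a sketch
rather than a guaranteed code-generation result.

```c
#include <stdint.h>

/* Hypothetical helper: full 64x64 -> 128-bit unsigned multiply.
 * Returns the low 64 bits and stores the high 64 bits in *hi,
 * which matches the two destinations that mulx writes. */
static inline uint64_t mul64_wide(uint64_t a, uint64_t b, uint64_t *hi)
{
    /* unsigned __int128 is a GCC extension, not ISO C. */
    unsigned __int128 product = (unsigned __int128)a * b;

    *hi = (uint64_t)(product >> 64);  /* upper half of the product */
    return (uint64_t)product;         /* lower half of the product */
}
```

Compiling this with -O2 -mbmi2 and inspecting the assembly is one way to check
which instruction gcc actually picks.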
