On Thu, 10 Dec 2020, Lucas de Almeida via Gcc wrote:

> when performing (int64_t) foo / (int32_t) bar in gcc under x86, a call to
> __divdi3 is always output, even though it seems the use of the idiv
> instruction could be faster.

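IIRC, idiv requires that the quotient fit in 32 bits, while your C code guarantees no such thing. For example, (1LL << 60) / 3 has a quotient far larger than 32 bits, so idiv would raise a divide error (#DE) instead of returning a result.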
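To make the range condition concrete, here is a minimal sketch of the check the compiler would have to prove at compile time. The helper name quotient_fits_int32 is hypothetical and is not part of GCC or libgcc; it only illustrates when a 64-by-32-bit idiv would be safe.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if n / d is defined and the quotient
   fits in int32_t, i.e. a 64-by-32-bit idiv would not fault. */
static bool quotient_fits_int32(int64_t n, int32_t d)
{
    if (d == 0)
        return false;                  /* division by zero */
    if (n == INT64_MIN && d == -1)
        return false;                  /* overflows even in 64 bits */
    int64_t q = n / d;                 /* full 64-bit division */
    return q >= INT32_MIN && q <= INT32_MAX;
}
```

With this check, quotient_fits_int32(1LL << 60, 3) is false, while quotient_fits_int32(100, 3) is true.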

It would be possible to use idiv in some cases, if the compiler could prove that the operands are in the right range, but that is not so easy. If you know it is safe in your case, you can use inline asm to force the use of idiv. The most common case is modular arithmetic: if you know that uint32_t a, b, c, d are all smaller than m (and m != 0), you can compute a*b+c+d in uint64_t, then use div to reduce it modulo m. The sum is at most m*m-1, so the quotient is smaller than m and fits in 32 bits.
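A minimal sketch of that modular-arithmetic case, assuming GCC-style extended asm on x86 or x86-64. The function name muladd_mod is hypothetical, and the fallback branch is only there so the code still builds on other targets. Treat it as an illustration under the stated preconditions, not a vetted implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: (a*b + c + d) mod m.
   Assumes a, b, c, d < m and m != 0, as in the text above. */
static uint32_t muladd_mod(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                           uint32_t m)
{
    assert(m != 0 && a < m && b < m && c < m && d < m);

    /* At most (m-1)^2 + 2(m-1) = m^2 - 1: fits in 64 bits, and the
       quotient by m is below m, so it fits in 32 bits. */
    uint64_t x = (uint64_t)a * b + c + d;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t lo = (uint32_t)x;          /* low half, passed in EAX */
    uint32_t hi = (uint32_t)(x >> 32);  /* high half, passed in EDX */
    uint32_t quot, rem;
    /* div r/m32 divides EDX:EAX by the operand;
       quotient goes to EAX, remainder to EDX. */
    __asm__("divl %4"
            : "=a"(quot), "=d"(rem)
            : "0"(lo), "1"(hi), "rm"(m)
            : "cc");
    (void)quot;                         /* only the remainder is needed */
    return rem;
#else
    return (uint32_t)(x % m);           /* portable fallback */
#endif
}
```

If the preconditions are violated, the quotient may not fit in 32 bits and div faults, which is why the assert is worth keeping in debug builds.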

--
Marc Glisse
