Re: Use of AVX instructions in mpn_mul_1

Marco Bodrato Fri, 17 Jun 2022 05:51:06 -0700

Ciao Thanassis,

On 2022-06-13 23:17, Thanassis Tsiodras wrote:

I had a quick look at the x86_64 assembly implementations of the basic
primitive used in multiplications (mpn_mul_1), and saw this:

...I could not find any use of AVX-integer-related multiplication
instructions.

I am talking about things like "_mm512_mul_epu32", which at first glance
seemed promising (8x32-bit multiplications in one instruction generating
8x64-bit results in one go).

Four 32x32->64 multiplications perform the same multiplication work as
one 64x64->128. But are "8x32-bit multiplications in one instruction"
faster than two 64x64 muls? As you confirm, many additions with carry
propagation are also needed.
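
To make the counting concrete, here is a minimal sketch in C of one
64x64->128 product assembled from four 32x32->64 partial products. The
function names mul64_via_32 and mul64_native are hypothetical (they are
not GMP functions), and unsigned __int128 is assumed to be available as
a GCC/Clang extension; it is used only as a reference for comparison.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GMP: 64x64 -> 128 built from four
   32x32 -> 64 partial products, the kind of product a 32-bit vector
   multiply lane would deliver. */
void mul64_via_32(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    uint64_t a0 = (uint32_t)a, a1 = a >> 32;
    uint64_t b0 = (uint32_t)b, b1 = b >> 32;

    /* The four partial products. */
    uint64_t p00 = a0 * b0;
    uint64_t p01 = a0 * b1;
    uint64_t p10 = a1 * b0;
    uint64_t p11 = a1 * b1;

    /* Middle column: the sum of three values below 2^32 fits in 64 bits,
       and its upper half is the carry into the high word. */
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;

    *lo = (mid << 32) | (uint32_t)p00;
    *hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

/* Reference: let the compiler emit a single widening multiply.
   Assumes the unsigned __int128 extension (GCC/Clang on 64-bit targets). */
void mul64_native(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    unsigned __int128 p = (unsigned __int128)a * b;
    *hi = (uint64_t)(p >> 64);
    *lo = (uint64_t)p;
}
```

Besides the four multiplications, the sketch needs several shifts,
masks and additions just to recombine the columns; this is the extra
carry-propagation work mentioned above.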

So the question is: does using AVX reduce the resources needed for a
multiplication?

I can't see a way to do that optimally. Is that the reason GMP asm code
seems to prefer the simple 64x64 => 128 instructions?  (mul %rcx)
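
For reference, the operation under discussion can be sketched in plain
C as below. This is not GMP's assembly code; mul_1_sketch is a
hypothetical name, 64-bit limbs are assumed, and unsigned __int128 (a
GCC/Clang extension) stands in for the 64x64 => 128 "mul" instruction.
It follows the documented behaviour of mpn_mul_1: multiply an n-limb
number by one limb, store the n low limbs, return the carry-out limb.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference loop, assuming 64-bit limbs.
   Computes {rp, n} = low n limbs of {up, n} * v and returns the
   most significant (carry-out) limb. */
uint64_t mul_1_sketch(uint64_t *rp, const uint64_t *up, size_t n, uint64_t v)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* One widening multiply plus the incoming carry; the sum
           cannot overflow 128 bits. */
        unsigned __int128 t = (unsigned __int128)up[i] * v + carry;
        rp[i] = (uint64_t)t;          /* low limb of the result */
        carry = (uint64_t)(t >> 64);  /* high limb feeds the next step */
    }
    return carry;
}
```

Each iteration depends on the carry from the previous one, so any
vectorised variant has to deal with that dependency chain as well as
with the multiplications themselves.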

If you find an implementation with AVX that is more efficient than our
current implementation, you can contribute it to the project :-)


Ĝis,
m
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
