Hi,
On 2022-02-27 16:52 Marco Bodrato wrote:
On 2022-02-25 17:06 John Gatrell wrote:
I tested using UHWtype in the macro for binvert_limb. On a 64-bit machine
one of my programs gained a 3% speedup. On a 32-bit machine, there was no
Should we use uint_fast8_t, uint_fast16_t, uint_fast32_t for the
different levels, and let the compiler choose? :-D
I tried code with the uint_fast types, but the compiler does not seem to
choose the faster type; the 64-bit type is always used :-(
You should also try storing the 32-bit result in the half type.
I mean: try replacing the following two lines in your code
__inv = 2 * __hinv - __hinv * __hinv * __n; /* 32 */ \
__inv = 2 * __inv - __inv * __inv * __n; /* 64 */ \
with
__hinv = 2 * __hinv - __hinv * __hinv * __n; /* 32 */ \
__inv = 2 * (mp_limb_t)__hinv - (mp_limb_t)__hinv * __hinv * __n; /* 64 */ \
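
To make the idea concrete, here is a minimal standalone sketch, not the actual GMP macro. It assumes a 64-bit limb modelled as uint64_t and a half type modelled as uint32_t. The function name binvert_u64 is hypothetical, and it starts from the seed inv = n (3 correct bits) rather than from a lookup table. Every step up to 32 bits is stored in the half type, and only the last step is widened:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for binvert_limb on a 64-bit limb:
   returns inv such that n * inv == 1 (mod 2^64).  n must be odd.
   Each Newton step inv = 2*inv - inv*inv*n doubles the number of
   correct low bits. */
static uint64_t binvert_u64(uint64_t n)
{
    assert(n & 1);                        /* even n has no inverse mod 2^64 */

    uint32_t hn = (uint32_t)n;            /* low half of n suffices up to 32 bits */
    uint32_t hinv = hn;                   /* n * n == 1 (mod 8): 3 correct bits */

    hinv = 2 * hinv - hinv * hinv * hn;   /*  6 bits */
    hinv = 2 * hinv - hinv * hinv * hn;   /* 12 bits */
    hinv = 2 * hinv - hinv * hinv * hn;   /* 24 bits */
    hinv = 2 * hinv - hinv * hinv * hn;   /* 32 bits, kept in the half type */

    /* Only the final step is carried out at full limb width. */
    return 2 * (uint64_t)hinv - (uint64_t)hinv * hinv * n;   /* 64 bits */
}
```

For example, binvert_u64(3) gives 0xAAAAAAAAAAAAAAAB, and multiplying the result by n gives 1 modulo 2^64 for any odd n. Whether the half-width steps are actually faster depends on the compiler and the target, so this needs measuring.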
Bye,
m
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel