Hi,
On 2021-06-06 22:16, Torbjörn Granlund wrote:
ni...@lysator.liu.se (Niels Möller) writes:
Maybe we should have some macrology for that? Or do all relevant
processors and compilers support efficient cmov these days? I'm sticking
to masking expressions for now.
Let's not trust results from compiler generated code for these things.
The mixture of inline asm and plain code is hard for compilers to deal
with. Very subtle things can make a huge cycle count difference.
Of course, mixing asm and plain code will not leave the compiler much
freedom...
Should we test whether the compiler supports a larger type (e.g. unsigned
__int128) and, if it does, define the common macros add_ssaaaa and
umul_ppmm in terms of it? In that case the compiler should also be able
to optimise across the operations defined in longlong.h.
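To make the idea concrete, here is a minimal sketch of what such definitions might look like. It assumes 64-bit limbs and a compiler that provides unsigned __int128 (detected here through __SIZEOF_INT128__, as GCC and Clang define it). The names limb_t, dlimb_t and self_check are hypothetical and chosen only for illustration; this is not the code currently in longlong.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the compiler offers a 128-bit integer type. */
#ifndef __SIZEOF_INT128__
#error "this sketch needs unsigned __int128"
#endif

typedef uint64_t limb_t;            /* hypothetical single-limb type */
typedef unsigned __int128 dlimb_t;  /* hypothetical double-limb type */

/* (sh,sl) = (ah,al) + (bh,bl), computed modulo 2^128. */
#define add_ssaaaa(sh, sl, ah, al, bh, bl)                          \
  do {                                                              \
    dlimb_t a_ = ((dlimb_t) (limb_t) (ah) << 64) | (limb_t) (al);   \
    dlimb_t b_ = ((dlimb_t) (limb_t) (bh) << 64) | (limb_t) (bl);   \
    dlimb_t s_ = a_ + b_;                                           \
    (sh) = (limb_t) (s_ >> 64);                                     \
    (sl) = (limb_t) s_;                                             \
  } while (0)

/* (ph,pl) = u * v, the full 128-bit product of two 64-bit limbs. */
#define umul_ppmm(ph, pl, u, v)                                     \
  do {                                                              \
    dlimb_t p_ = (dlimb_t) (limb_t) (u) * (limb_t) (v);             \
    (ph) = (limb_t) (p_ >> 64);                                     \
    (pl) = (limb_t) p_;                                             \
  } while (0)

/* Small self-check of both macros on carry and overflow cases. */
static void self_check(void)
{
  limb_t h, l;

  /* The carry out of the low limb must reach the high limb. */
  add_ssaaaa(h, l, 0, UINT64_MAX, 0, 1);
  assert(h == 1 && l == 0);

  /* (2^64 - 1)^2 = 2^128 - 2^65 + 1 */
  umul_ppmm(h, l, UINT64_MAX, UINT64_MAX);
  assert(h == UINT64_MAX - 1 && l == 1);
}
```

Since everything here is plain C, the compiler is free to schedule, combine and constant-fold these operations together with the surrounding code. Whether the generated code actually matches the inline asm versions would still have to be measured per target.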
Bye,
m
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel