Ciao,

On 2021-06-06 22:16, Torbjörn Granlund wrote:
ni...@lysator.liu.se (Niels Möller) writes:

  Maybe we should have some macrology for that? Or do all relevant
  processors and compilers support efficient cmov these days? I'm sticking
  to masking expressions for now.

Let's not trust results from compiler generated code for these things.
The mixture of inline asm and plain code is hard for compilers to deal
with.  Very subtle things can make a huge cycle count difference.

Of course, mixing asm and plain code does not leave the compiler much freedom...

Should we check whether the compiler supports a larger type (e.g. unsigned __int128) and, if it does, define the common macros add_ssaaaa and umul_ppmm in terms of it? In that case the compiler should also be able to optimise across the operations defined in longlong.h.
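
To make the idea concrete, here is a rough sketch of what such definitions might look like. It is only an illustration, not a proposed patch: it assumes a 64-bit limb and a compiler that provides unsigned __int128 (GCC and Clang do on common 64-bit targets), and the names mp_dlimb_t and dlimb_sketch_selfcheck are made up for this example. The real macros would need a configure-time or preprocessor check and would have to follow the limb size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch: 64-bit limbs, and a compiler that
   supports unsigned __int128.  mp_limb_t is a local stand-in here. */
typedef uint64_t mp_limb_t;
typedef unsigned __int128 mp_dlimb_t;   /* hypothetical helper type */

/* (sh,sl) = (ah,al) + (bh,bl), modulo 2^128. */
#define add_ssaaaa(sh, sl, ah, al, bh, bl)                              \
  do {                                                                  \
    mp_dlimb_t a_ = ((mp_dlimb_t) (mp_limb_t) (ah) << 64)               \
                    | (mp_limb_t) (al);                                 \
    mp_dlimb_t b_ = ((mp_dlimb_t) (mp_limb_t) (bh) << 64)               \
                    | (mp_limb_t) (bl);                                 \
    mp_dlimb_t s_ = a_ + b_;                                            \
    (sh) = (mp_limb_t) (s_ >> 64);                                      \
    (sl) = (mp_limb_t) s_;                                              \
  } while (0)

/* (ph,pl) = m0 * m1, the full 128-bit product of two limbs. */
#define umul_ppmm(ph, pl, m0, m1)                                       \
  do {                                                                  \
    mp_dlimb_t p_ = (mp_dlimb_t) (mp_limb_t) (m0) * (mp_limb_t) (m1);   \
    (ph) = (mp_limb_t) (p_ >> 64);                                      \
    (pl) = (mp_limb_t) p_;                                              \
  } while (0)

/* Small self-check (hypothetical helper): exercises the carry out of
   the low limb and the largest possible product. */
int
dlimb_sketch_selfcheck (void)
{
  mp_limb_t h, l;

  add_ssaaaa (h, l, 0, UINT64_MAX, 0, 1);      /* carry into high limb */
  assert (h == 1 && l == 0);

  umul_ppmm (h, l, UINT64_MAX, UINT64_MAX);    /* (2^64-1)^2 */
  assert (h == UINT64_MAX - 1 && l == 1);

  return 1;
}
```

Whether the code generated from this is actually competitive with the asm versions would of course have to be measured per compiler and per CPU.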

Ĝis,
m
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
