https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123787
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Using a library routine and determining what exactly needs to be multiplied
there is completely intentional, multiplication certainly is not a simple
operation and if you multiply say
_BitInt(33000) foo (_BitInt(33000) x) { return 33 * x; }
then it is a very significant difference if you do say 516 multiplications or
over 265000.
And the clang _BitInt code generation is really scalable, consider simple:
_BitInt(10000) foo (_BitInt(10000) x, _BitInt(10000) y) { return x + y; }
_BitInt(10000) bar (_BitInt(10000) x, _BitInt(10000) y) { return x * y; }
time clang -c -O2 -o b.o b.c
real 0m57.498s
user 0m57.130s
sys 0m0.134s
time gcc -c -O2 -o b.o2 b.c
real 0m0.022s
user 0m0.016s
sys 0m0.006s
readelf -Ws b.o
Symbol table '.symtab' contains 5 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS b.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 2 .text
3: 0000000000000000 5698 FUNC GLOBAL DEFAULT 2 foo
4: 0000000000001650 0x74c15 FUNC GLOBAL DEFAULT 2 bar
readelf -Ws b.o2
Symbol table '.symtab' contains 6 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS b.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text
3: 0000000000000000 93 FUNC GLOBAL DEFAULT 1 foo
4: 0000000000000060 43 FUNC GLOBAL DEFAULT 1 bar
5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __mulbitint3
If you do 33000 instead of 10000, I had to kill clang after 12 minutes of
compilation because I didn't want to wait forever.
But even for the 10000 bit multiplication clang emits 467KiB of code for a
single operation.
Note, even __int128 operations aren't constant time, multiplication is, but
division/modulo aren't.
On 64-bit targets _BitInt(128) will be constant time too because it will just
use __int128 multiplication the target has.
If you want a portable constant time multiplication, just use say unsigned
_BitInt(257) and set the most significant bit on both operands.