Re: GMP 6.1.2 t-count_zeros failure on ARM with assertions
> (Related, I wonder what the effect would be of redefining umul_ppmm as
> C expressions involving __uint128_t on compilers that support that).

We do that already for some CPUs, but this has proven to be somewhat fragile, and in unexpected cases leads to libgcc calls. We dare to do that for at least PowerPC-64, MIPS-64, s390x, and Arm64. For alpha, gcc provides an _int_mult_upper, which we use instead.

Apart from better scheduling, making gcc aware of the semantics allows for algebraic optimisations and various foldings.

--
Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs
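For illustration, the idea of expressing umul_ppmm through __uint128_t might look roughly like the sketch below. This is not GMP's actual definition in longlong.h: the function name umul_ppmm_u128 is hypothetical, it is written as a function rather than a macro, and it assumes a 64-bit limb and a compiler that provides __uint128_t (GCC and Clang on 64-bit targets).

```c
#include <stdint.h>

/* Hypothetical helper, not GMP's umul_ppmm macro.
   Computes the full 128-bit product m0 * m1 and stores the high
   64 bits in *ph and the low 64 bits in *pl.
   Assumption: the compiler supports __uint128_t. */
static inline void
umul_ppmm_u128 (uint64_t *ph, uint64_t *pl, uint64_t m0, uint64_t m1)
{
  __uint128_t p = (__uint128_t) m0 * m1;  /* widen first, then multiply */
  *ph = (uint64_t) (p >> 64);             /* high limb */
  *pl = (uint64_t) p;                     /* low limb */
}
```

Because the product is ordinary C arithmetic, the compiler can fold it when an operand is a known constant, which is the kind of optimisation mentioned above. Whether a given compiler and target emit a widening multiply or a libgcc call has to be checked in the generated code.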
Re: GMP 6.1.2 t-count_zeros failure on ARM with assertions
ni...@lysator.liu.se (Niels Möller) writes:

> Using inline asm instead has the drawback that it leaves a little less
> opportunity for the compiler to schedule these instructions optimally.
> No idea if that matters in practice. Since it seems we don't really
> need count_*_zeros to support zero input, is there any advantage in
> using inline asm?

Sure, and that matters chiefly if the instructions have a long latency. (Now, it is quite likely that CLZ and RBIT didn't get described to the compiler scheduler, as they are usually not used.)

> (Related, I wonder what the effect would be of redefining umul_ppmm as
> C expressions involving __uint128_t on compilers that support that).

We do that already for some CPUs, but this has proven to be somewhat fragile, and in unexpected cases leads to libgcc calls.

--
Torbjörn
Please encrypt, key id 0xC8601622
Re: GMP 6.1.2 t-count_zeros failure on ARM with assertions
t...@gmplib.org (Torbjörn Granlund) writes:

> We might define these directly, at least for arm64, to CLZ and RBIT+CLZ,
> respectively, instead of using gcc's builtin semi-defined variants?

Using inline asm instead has the drawback that it leaves a little less opportunity for the compiler to schedule these instructions optimally. No idea if that matters in practice. Since it seems we don't really need count_*_zeros to support zero input, is there any advantage in using inline asm?

(Related, I wonder what the effect would be of redefining umul_ppmm as C expressions involving __uint128_t on compilers that support that.)

/Niels

--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
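To make the two alternatives discussed in this thread concrete, here is a minimal sketch of both. None of it is GMP's actual code: the names clz64 and ctz64 and the COUNT_ZEROS_USE_ASM switch are hypothetical, and a 64-bit limb is assumed. The builtin path relies on gcc's __builtin_clzll and __builtin_ctzll, whose result is undefined for a zero argument, so both helpers require nonzero input. The asm path uses CLZ and RBIT+CLZ on arm64, as proposed in the quoted message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not GMP's count_leading_zeros and
   count_trailing_zeros macros.  Both require x != 0.
   COUNT_ZEROS_USE_ASM is an assumed build switch for this sketch. */

static inline unsigned
clz64 (uint64_t x)
{
  assert (x != 0);
#if defined (__aarch64__) && defined (COUNT_ZEROS_USE_ASM)
  uint64_t n;
  __asm__ ("clz %0, %1" : "=r" (n) : "r" (x));
  return (unsigned) n;
#else
  /* The compiler knows the semantics, so it can schedule and fold this. */
  return (unsigned) __builtin_clzll (x);
#endif
}

static inline unsigned
ctz64 (uint64_t x)
{
  assert (x != 0);
#if defined (__aarch64__) && defined (COUNT_ZEROS_USE_ASM)
  uint64_t r, n;
  __asm__ ("rbit %0, %1" : "=r" (r) : "r" (x));  /* reverse the bits */
  __asm__ ("clz %0, %1" : "=r" (n) : "r" (r));   /* count leading zeros */
  return (unsigned) n;
#else
  return (unsigned) __builtin_ctzll (x);
#endif
}
```

With the builtin path, a call such as clz64 (1) can be evaluated at compile time, which the asm path prevents because the compiler treats the asm statement as opaque.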