Richard Henderson <richard.hender...@linaro.org> writes:
> These builtins came in clang 3.8, but are not present in gcc through
> version 11.  Even in clang the optimization is not ideal except for
> x86_64, but no worse than the hand-coding that we currently do.

Given this statement....

<snip>
> +/**
> + * uadd64_carry - addition with carry-in and carry-out
> + * @x, @y: addends
> + * @pcarry: in-out carry value
> + *
> + * Computes @x + @y + *@pcarry, placing the carry-out back
> + * into *@pcarry and returning the 64-bit sum.
> + */
> +static inline uint64_t uadd64_carry(uint64_t x, uint64_t y, bool *pcarry)
> +{
> +#if __has_builtin(__builtin_addcll)
> +    unsigned long long c = *pcarry;
> +    x = __builtin_addcll(x, y, c, &c);

What happens when unsigned long long isn't the same as uint64_t? Doesn't
C99 only specify a minimum?

> +    *pcarry = c & 1;

Why do we need to clamp it here? Shouldn't the compiler automatically do
that due to the bool?

> +    return x;
> +#else
> +    bool c = *pcarry;
> +    /* This is clang's internal expansion of __builtin_addc. */
> +    c = uadd64_overflow(x, c, &x);
> +    c |= uadd64_overflow(x, y, &x);
> +    *pcarry = c;
> +    return x;
> +#endif

Either way, if you aren't super happy with the compiler's builtin and you
get equivalent code with the unambiguous hand-coded version, then what is
the point of having a builtin leg?

> +}
> +
> +/**
> + * usub64_borrow - subtraction with borrow-in and borrow-out
> + * @x, @y: addends
> + * @pborrow: in-out borrow value
> + *
> + * Computes @x - @y - *@pborrow, placing the borrow-out back
> + * into *@pborrow and returning the 64-bit sum.
> + */
> +static inline uint64_t usub64_borrow(uint64_t x, uint64_t y, bool *pborrow)
> +{
> +#if __has_builtin(__builtin_subcll)
> +    unsigned long long b = *pborrow;
> +    x = __builtin_subcll(x, y, b, &b);
> +    *pborrow = b & 1;
> +    return x;
> +#else
> +    bool b = *pborrow;
> +    b = usub64_overflow(x, b, &x);
> +    b |= usub64_overflow(x, y, &x);
> +    *pborrow = b;
> +    return x;
> +#endif
> +}
> +
>  /* Host type specific sizes of these routines.  */
>
>  #if ULONG_MAX == UINT32_MAX

-- 
Alex Bennée
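
For reference, here is a minimal standalone sketch of the hand-coded leg, so the carry chain can be checked in isolation. The sketch_ names are hypothetical and the overflow helper is only my assumption of what uadd64_overflow does (wrapping add plus a compare), not the actual helper from the tree. The #if guard is likewise an assumption: it shows one way the width question for the builtin leg could be made explicit, and is not something the patch does.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * unsigned long long is only guaranteed to be at least 64 bits wide.
 * A builtin leg that mixes it with uint64_t relies on it being exactly
 * 64 bits, so this sketch refuses to build otherwise (assumed guard).
 */
#if ULLONG_MAX != UINT64_MAX
#error "sketch assumes unsigned long long is exactly 64 bits"
#endif

/* Hypothetical stand-in for uadd64_overflow: wrapping add, report carry. */
static inline bool sketch_uadd64_overflow(uint64_t x, uint64_t y,
                                          uint64_t *ret)
{
    *ret = x + y;       /* unsigned addition wraps modulo 2^64 */
    return *ret < x;    /* the sum wrapped iff it is below an addend */
}

/* Same shape as the #else leg quoted above: x + y + *pcarry. */
static inline uint64_t sketch_uadd64_carry(uint64_t x, uint64_t y,
                                           bool *pcarry)
{
    bool c = *pcarry;

    c = sketch_uadd64_overflow(x, c, &x);   /* add the carry-in first */
    c |= sketch_uadd64_overflow(x, y, &x);  /* at most one step can wrap */
    *pcarry = c;
    return x;
}
```

Chaining two calls, low word first and passing the same bool through, gives a 128-bit add. Because c is a bool in this leg, it can only ever hold 0 or 1, so no explicit masking is needed here.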