https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77776
--- Comment #4 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Matthias Kretz from comment #3)
> Did you consider the error introduced by scaling with __amax? I made sure
> that the division is without error by zeroing the mantissa bits. Here's a
> motivating example that shows an error of 1 ulp otherwise:
> https://godbolt.org/z/_U2K7e

Your "reference" number seems strange. Why not do the computation with double (or long double or mpfr) or use __builtin_hypotf? Note that it changes the value.

How precise is hypot supposed to be? I know it is supposed to try and avoid spurious overflow/underflow, but I am not convinced that it should aim for correct rounding.

(I see that you are using clang in that godbolt link, with gcc I need to mark the global variables with "extern const" to get a similar asm)

> About std::fma, how bad is the performance hit if there's no instruction for
> it?

FMA doesn't seem particularly relevant here.
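
To make the "reference" suggestion concrete, here is a minimal sketch of how a float hypot could be checked against a value computed in double. The names hypot_ref, hypot_scaled and ulp_distance are hypothetical and not taken from the patch or from the godbolt link. hypot_scaled is a plain scale-by-the-maximum variant without the mantissa zeroing described in comment #3, and it does not handle inf/NaN. The double result is rounded a second time when converted to float, so it is a good but not provably correctly rounded reference.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reference value: evaluate in double, then round once to float.
// Each square of a float is exact in double; the sum and sqrt may round.
inline float hypot_ref(float x, float y) {
    const double dx = x, dy = y;
    return static_cast<float>(std::sqrt(dx * dx + dy * dy));
}

// Hypothetical float-only variant that scales by the larger magnitude
// to avoid spurious overflow/underflow. Not the code under review.
inline float hypot_scaled(float x, float y) {
    x = std::fabs(x);
    y = std::fabs(y);
    const float amax = std::fmax(x, y);
    if (amax == 0.0f) return 0.0f;
    const float amin = std::fmin(x, y);
    const float r = amin / amax;  // this division can introduce rounding error
    return amax * std::sqrt(1.0f + r * r);
}

// Distance in ulps between two finite, non-negative floats,
// taken as the difference of their bit patterns.
inline std::uint32_t ulp_distance(float a, float b) {
    assert(std::isfinite(a) && std::isfinite(b) && a >= 0.0f && b >= 0.0f);
    std::uint32_t ia, ib;
    std::memcpy(&ia, &a, sizeof ia);
    std::memcpy(&ib, &b, sizeof ib);
    return ia > ib ? ia - ib : ib - ia;
}
```

Calling ulp_distance(hypot_ref(x, y), hypot_scaled(x, y)) over a set of inputs would show how far the float-only variant drifts from the double-based value. Whether a nonzero result there is a bug depends on the accuracy hypot is actually expected to deliver.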