https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #3 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #2)
> From what I can see, they are certainly not portable.
> E.g. the relying on __int128 rules out various arches (basically all 32-bit
> arches,
> ia32, powerpc 32-bit among others).

For this kind of performance improvement on 64-bit systems, we could
probably introduce an appropriate #ifdef.

Regarding x86 intrinsics, maybe they can be replaced by gcc's vector
extension.

> Not handling exceptions is a show
> stopper too.

Agreed, we should not be replacing the soft-fp that way.

> Guess better time investment would be to improve performance of the soft-fp
> versions.

I'm not sure; I think we could get an appreciable benefit if we only
invoke this kind of routine behind the appropriate sub-flags of
-ffast-math.

For general-purpose code, I see no way around the bottleneck of
querying the processor status on each invocation, and that is a waste
if the program does not care.
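
To illustrate the #ifdef idea, here is a rough sketch, not taken from the
code under discussion. The helper name mul64_hi is hypothetical; the only
thing relied on is that GCC predefines __SIZEOF_INT128__ on targets where
__int128 is available. The fallback branch is an assumed plain
32-bit-halves multiplication, shown only so that the sketch compiles on
32-bit targets too.

```c
#include <stdint.h>

/* Hypothetical helper: high 64 bits of a 64x64-bit unsigned product.  */
static uint64_t
mul64_hi (uint64_t a, uint64_t b)
{
#ifdef __SIZEOF_INT128__
  /* Fast path: only compiled where the target supports __int128.  */
  return (uint64_t) (((unsigned __int128) a * b) >> 64);
#else
  /* Portable fallback: schoolbook multiplication on 32-bit halves.  */
  uint64_t a_lo = (uint32_t) a, a_hi = a >> 32;
  uint64_t b_lo = (uint32_t) b, b_hi = b >> 32;
  uint64_t lo_lo = a_lo * b_lo;
  uint64_t hi_lo = a_hi * b_lo;
  uint64_t lo_hi = a_lo * b_hi;
  uint64_t hi_hi = a_hi * b_hi;
  /* Middle column plus carry from the low product; fits in 64 bits.  */
  uint64_t mid = (lo_lo >> 32) + (uint32_t) hi_lo + (uint32_t) lo_hi;
  return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32);
#endif
}
```

Both branches should give the same result, so callers would not need to
know which one was compiled in; the guard only decides whether the
faster path is used.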