https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95644
--- Comment #11 from Steve Kargl <sgk at troutmask dot apl.washington.edu> ---
On Thu, Mar 04, 2021 at 02:22:46AM +0000, jvdelisle at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95644
>
> --- Comment #10 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
> It is very likely that the gcc optimizers will actually convert the to fma
> machine instructions, but no guarantee.

That's what __builtin_fma() will do.  The second kludge I posted would
still have the layer of indirection of calling one of fma04, fma08,
fma10, or fma16.  Also, note that the kludge declares these as
IMPURE ELEMENTAL because of the BIND(C) stuff.  This is technically
incorrect.

> I don't have much time, but it is likely some of the tricks we used in matmul
> can be used to get this to be "register" implemented .

The correct approach would provide interfaces in the ieee_arithmetic
module so that argument checking can be done.  The implementation
details would be contained in trans-intrinsic.c, where
conv_intrinsic_fma() is called and __builtin_fma is directly emitted.

Another approach, where conv_intrinsic_fma() is unneeded, would be
to register __builtin_fma() as a builtin function with gfortran.  This,
however, requires more work because gfortran currently does not have
a mechanism for registering a three-argument builtin function.
