https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95644
--- Comment #10 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
It is very likely that the gcc optimizers will actually convert these to fma machine instructions, but there is no guarantee. I don't have much time, but it is likely that some of the tricks we used in matmul can be used to get this to be "register" implemented.