https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279
--- Comment #10 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
What we would need for incorporation into gcc is to have several
functions, which would then be called depending on which floating point
options are in force at the time of invocation.
So, let's go through the gcc options, to see what would fit where. Walking
down the options tree, depth first.
From the gcc docs:
'-ffast-math'
Sets the options '-fno-math-errno', '-funsafe-math-optimizations',
'-ffinite-math-only', '-fno-rounding-math', '-fno-signaling-nans',
'-fcx-limited-range' and '-fexcess-precision=fast'.
-fno-math-errno is irrelevant in this context, no need to look at that.
'-funsafe-math-optimizations'
Allow optimizations for floating-point arithmetic that (a) assume
that arguments and results are valid and (b) may violate IEEE or
ANSI standards. When used at link time, it may include libraries
or startup files that change the default FPU control word or other
similar optimizations.
This option is not turned on by any '-O' option since it can result
in incorrect output for programs that depend on an exact
implementation of IEEE or ISO rules/specifications for math
functions. It may, however, yield faster code for programs that do
not require the guarantees of these specifications. Enables
'-fno-signed-zeros', '-fno-trapping-math', '-fassociative-math' and
'-freciprocal-math'.
'-fno-signed-zeros'
Allow optimizations for floating-point arithmetic that ignore the
signedness of zero. IEEE arithmetic specifies the behavior of
distinct +0.0 and -0.0 values, which then prohibits simplification
of expressions such as x+0.0 or 0.0*x (even with
'-ffinite-math-only'). This option implies that the sign of a zero
result isn't significant.
The default is '-fsigned-zeros'.
I don't think this option is relevant.
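For reference, a minimal C sketch of why the quoted text says x+0.0 cannot be simplified under '-fsigned-zeros': with x = -0.0 and round-to-nearest, x + 0.0 is +0.0, so folding the sum to x would flip the sign. The helper names sign_bit_set and add_positive_zero are made up for this illustration, and the code assumes double is IEEE binary64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return 1 if the IEEE sign bit of d is set.
   Assumes double is IEEE binary64. */
static int sign_bit_set(double d)
{
    uint64_t bits;
    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);
    return (int)(bits >> 63);
}

/* Hypothetical helper: compute x + 0.0 at run time.  The volatile
   temporary keeps the compiler from folding the addition away. */
static double add_positive_zero(double x)
{
    volatile double v = x;
    return v + 0.0;
}
```

Calling add_positive_zero(-0.0) returns a zero whose sign bit is clear, while the argument itself has the sign bit set.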
'-fno-trapping-math'
Compile code assuming that floating-point operations cannot
generate user-visible traps. These traps include division by zero,
overflow, underflow, inexact result and invalid operation. This
option requires that '-fno-signaling-nans' be in effect. Setting
this option may allow faster code if one relies on "non-stop" IEEE
arithmetic, for example.
This option should never be turned on by any '-O' option since it
can result in incorrect output for programs that depend on an exact
implementation of IEEE or ISO rules/specifications for math
functions.
The default is '-ftrapping-math'.
Relevant.
'-ffinite-math-only'
Allow optimizations for floating-point arithmetic that assume that
arguments and results are not NaNs or +-Infs.
This option is not turned on by any '-O' option since it can result
in incorrect output for programs that depend on an exact
implementation of IEEE or ISO rules/specifications for math
functions. It may, however, yield faster code for programs that do
not require the guarantees of these specifications.
This does not have further suboptions. Relevant.
'-fassociative-math'
Allow re-association of operands in series of floating-point
operations. This violates the ISO C and C++ language standard by
possibly changing computation result. NOTE: re-ordering may change
the sign of zero as well as ignore NaNs and inhibit or create
underflow or overflow (and thus cannot be used on code that relies
on rounding behavior like '(x + 2**52) - 2**52'. May also reorder
floating-point comparisons and thus may not be used when ordered
comparisons are required. This option requires that both
'-fno-signed-zeros' and '-fno-trapping-math' be in effect.
Moreover, it doesn't make much sense with '-frounding-math'. For
Fortran the option is automatically enabled when both
'-fno-signed-zeros' and '-fno-trapping-math' are in effect.
The default is '-fno-associative-math'.
Not relevant, I think - this influences compiler optimizations.
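As an illustration of the '(x + 2**52) - 2**52' idiom mentioned in the quoted text, here is a small C sketch. It assumes IEEE binary64 doubles, the default round-to-nearest mode and no excess precision; the function name round_via_2p52 is hypothetical. The addition rounds to an integer because values in [2**52, 2**53) are spaced 1 apart, and the subtraction is then exact. If the two operations were re-associated, the result would simply be x.

```c
#include <assert.h>

/* Hypothetical helper: round a small non-negative double to the nearest
   integer (ties to even) using the (x + 2**52) - 2**52 idiom.
   Assumes IEEE binary64 and round-to-nearest. */
static double round_via_2p52(double x)
{
    const double two52 = 4503599627370496.0;   /* 2**52 */
    assert(x >= 0.0 && x < two52);
    /* The volatile store forces the sum to be rounded to double here. */
    volatile double t = x + two52;
    return t - two52;                          /* exact subtraction */
}
```

For example, round_via_2p52(2.5) gives 2.0 and round_via_2p52(3.5) gives 4.0, since ties go to the even neighbour.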
'-freciprocal-math'
Allow the reciprocal of a value to be used instead of dividing by
the value if this enables optimizations. For example 'x / y' can
be replaced with 'x * (1/y)', which is useful if '(1/y)' is subject
to common subexpression elimination. Note that this loses
precision and increases the number of flops operating on the value.
The default is '-fno-reciprocal-math'.
Again, not relevant.
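To make the precision loss noted in the quoted text concrete, a short C sketch, assuming IEEE binary64 doubles and round-to-nearest. The function names are hypothetical. The reciprocal form rounds twice (once for 1/y, once for the product), the direct division only once.

```c
#include <assert.h>

/* Hypothetical helper: plain division, rounded once. */
static double divide_directly(double x, double y)
{
    assert(y != 0.0);
    volatile double vy = y;        /* keep the division at run time */
    return x / vy;
}

/* Hypothetical helper: the x * (1/y) form that -freciprocal-math
   allows the compiler to substitute. */
static double divide_via_reciprocal(double x, double y)
{
    assert(y != 0.0);
    volatile double r = 1.0 / y;   /* first rounding */
    return x * r;                  /* second rounding */
}
```

With x = y = 49.0 the direct division yields exactly 1.0, while the reciprocal form yields a value slightly below 1.0.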
'-frounding-math'
Disable transformations and optimizations that assume default
floating-point rounding behavior. This is round-to-zero for all
floating point to integer conversions, and round-to-nearest for all
other arithmetic truncations. This option should be specified for
programs that change the FP rounding mode dynamically, or that may
be executed with a non-default rounding mode. This option disables
constant folding of floating-point expressions at compile time
(which may be affected by rounding mode) and arithmetic
transformations that are unsafe in the presence of sign-dependent
rounding modes.
The default is '-fno-rounding-math'.
This option is experimental and does not currently guarantee to
disable all GCC optimizations that are affected by rounding mode.
Future versions of GCC may provide finer control of this setting
using C99's 'FENV_ACCESS' pragma. This command-line option will be
used to specify the default state for 'FENV_ACCESS'.
Also no further suboptions. This is relevant.
'-fsignaling-nans'
Compile code assuming that IEEE signaling NaNs may generate
user-visible traps during floating-point operations. Setting this
option disables optimizations that may change the number of
exceptions visible with signaling NaNs. This option implies
'-ftrapping-math'.
This option causes the preprocessor macro '__SUPPORT_SNAN__' to be
defined.
The default is '-fno-signaling-nans'.
This option is experimental and does not currently guarantee to
disable all GCC optimizations that affect signaling NaN behavior.
Also, no further suboptions. Relevant.
-fcx-limited-range is not relevant, and neither is -fexcess-precision=fast.
So, unless I missed something, it should be possible to select different
functions depending on the values of -ftrapping-math, -ffinite-math-only,
-frounding-math and -fsignaling-nans.
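A rough sketch of what such a selection could look like, written in C from the user's side of the compiler. Everything named here (the enums, fp_env_at_compile_time, select_variant, the three variants) is hypothetical and only illustrates the idea. Inside gcc the choice would presumably be made from the compiler's own option state, not from preprocessor macros. The sketch relies on '__FINITE_MATH_ONLY__' and '__SUPPORT_SNAN__', and assumes that '__NO_TRAPPING_MATH__' and '__ROUNDING_MATH__' are predefined by sufficiently recent gcc releases; on compilers without them, the code falls back to "trapping, default rounding".

```c
#include <assert.h>

/* Hypothetical bit flags for the four options that matter here. */
enum fp_env_flags {
    FP_ENV_TRAPPING  = 1 << 0,   /* -ftrapping-math in effect        */
    FP_ENV_NONFINITE = 1 << 1,   /* -ffinite-math-only NOT in effect */
    FP_ENV_ROUNDING  = 1 << 2,   /* -frounding-math in effect        */
    FP_ENV_SNAN      = 1 << 3    /* -fsignaling-nans in effect       */
};

/* Hypothetical set of implementations to choose from. */
enum fp_variant { VARIANT_FAST, VARIANT_DEFAULT, VARIANT_STRICT };

/* Derive the flags from predefined macros of this translation unit.
   Assumption: __NO_TRAPPING_MATH__ and __ROUNDING_MATH__ exist only
   in newer gcc versions; if absent, the defaults are assumed. */
static unsigned fp_env_at_compile_time(void)
{
    unsigned flags = 0;
#ifndef __NO_TRAPPING_MATH__
    flags |= FP_ENV_TRAPPING;
#endif
#if !defined(__FINITE_MATH_ONLY__) || __FINITE_MATH_ONLY__ == 0
    flags |= FP_ENV_NONFINITE;
#endif
#ifdef __ROUNDING_MATH__
    flags |= FP_ENV_ROUNDING;
#endif
#ifdef __SUPPORT_SNAN__
    flags |= FP_ENV_SNAN;
#endif
    return flags;
}

/* Pick a variant: nothing to honor -> fast; only the gcc defaults
   (trapping, non-finite values) -> default; anything else -> strict. */
static enum fp_variant select_variant(unsigned flags)
{
    assert((flags & ~15u) == 0);
    if (flags == 0)
        return VARIANT_FAST;
    if (flags == (FP_ENV_TRAPPING | FP_ENV_NONFINITE))
        return VARIANT_DEFAULT;
    return VARIANT_STRICT;
}
```

A caller would use select_variant(fp_env_at_compile_time()) to decide which implementation to reference. A real design might well need more than three variants, one per meaningful combination of the four options.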
Regarding Fortran's matmul: We use -ffast-math when compiling the library
functions, so any change we make for any of the -ffast-math suboptions
would be picked up there as well.