On Wed, 7 Aug 2019, Joseph Myers wrote:

On Sat, 22 Jun 2019, Marc Glisse wrote:

as discussed in the PR, this seems like a simple enough approach to handle
FENV functionality safely, while keeping it possible to implement
optimizations in the future.

Could you give a high-level description of the implementation approach,

At the GIMPLE level, z = x + y is represented as a function call z = .FENV_PLUS (x, y, options). The floating-point environment (rounding mode, exception flags) is considered to live somewhere in memory (I think it still works if it is a hard register). Unless the options say otherwise, .FENV_PLUS may read and write memory. Very few optimizations can be done on general function calls, so this should avoid unwanted movement or removal. We can still implement some specific optimizations just for those functions.
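
To make that concrete, a minimal sketch (the dump syntax below is only meant to be suggestive, not actual -fdump-tree output):

    double
    f (double x, double y)
    {
      return x + y;
      /* Under FENV_ACCESS this would be gimplified to something like
           _1 = .FENV_PLUS (x, y, 0);
           return _1;
         where the internal call may read and clobber the FP environment
         (treated as memory), so it cannot be folded, CSEd or moved freely.  */
    }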

At the RTL level, the idea is that good back-ends would expand .FENV_PLUS however they want, but the default expansion passes the arguments and the result through an asm volatile, which is opaque to the optimizers and prevents constant propagation, removal, movement, etc.
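
Roughly, the default expansion behaves as if the operation were wrapped like this (the constraints are only illustrative, not what the patch actually emits):

    /* An opaque volatile asm before and after the operation keeps the
       optimizers from folding, CSEing or moving it.  */
    static inline double
    opaque_plus (double x, double y)
    {
      asm volatile ("" : "+g" (x));   /* inputs become unknown values  */
      asm volatile ("" : "+g" (y));
      double z = x + y;
      asm volatile ("" : "+g" (z));   /* result cannot be folded away  */
      return z;
    }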

(The "options" argument is there to avoid having many variants depending on whether we only care about rounding, about exceptions, whether signed zeros may be ignored, etc., with 0 as the strictest, always-safe version. For explicitly rounded operations, as with pragma fenv_round, a different function might be better, since the 0 case would no longer be a safe replacement.)
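
Just to illustrate the idea, the encoding could look something like this (entirely hypothetical names and values, nothing in the patch is implied here):

    /* Hypothetical encoding of the "options" argument: 0 is the strictest
       variant, individual bits relax individual guarantees.  */
    #define FENV_OPT_IGNORE_EXCEPTIONS   1  /* exception flags need not be preserved */
    #define FENV_OPT_IGNORE_ROUNDING     2  /* rounding mode may be assumed default  */
    #define FENV_OPT_IGNORE_SIGNED_ZEROS 4  /* sign of zero may be ignored           */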

and how this design is intended to (eventually) achieve the required
constraints on code movement and removal?  In
<https://gcc.gnu.org/ml/gcc/2013-01/msg00095.html> I listed those
constraints as:

* General calls may set, clear or test exceptions, or manipulate the
rounding mode
(as may asms, depending on their inputs / outputs / clobbers).

If the asm is volatile, this works fine. I'll come back to this below.

* Floating-point operations have the rounding mode as input.  They may set
(but not clear or test) floating-point exception flags.

* Thus in general floating-point operations may not be moved across most
calls (or relevant asms), or values from one side of a call reused for the
same operation with the same inputs appearing on the other side of the
call.

* Statements such as "(void) (a * b);" can't be eliminated because they
may raise exceptions.  (That's purely about exceptions, not rounding
modes.)

I had to set TREE_SIDE_EFFECTS to 1 so the C++ front-end wouldn't remove it prematurely.

(I should add that const function calls should not depend on the rounding
mode, but pure calls may.

That fits perfectly with the idea of treating the FP environment as part of memory.

Also, on some architectures there are explicit
register names for asms to use in inputs / outputs / clobbers to refer to
the floating-point state registers, and asms not referring to those can be
taken not to manipulate floating-point state, but other architectures
don't have such names.  The safe approach for asms would be to assume that
all asms on all architectures can manipulate floating-point state, until
there is a way to declare what the relevant registers are.)

I assume that an asm naming such a register in its constraints is already somehow prevented from moving across function calls? If so, at least GIMPLE seems safe.

For RTL, if those asms were volatile, the default expansion would be fine. If they need not be volatile and somehow manage to cross the pass-through asm, I guess a target hook adding extra inputs/outputs/clobbers to the pass-through asm would work. Or, best, the target would expand the operations to (unspec) insns that explicitly refer to exactly those registers.

(I should also note that DFP has a separate rounding mode from binary FP,
but that is unlikely to affect anything in this patch - although there
might end up being potential minor optimizations from knowing that certain
asms only involve one of the two rounding modes.)

I'd like to handle this incrementally, rather than wait for a mega-patch that
does everything, if that's ok. For instance, I didn't handle vectors in this
first patch because the interaction with vector lowering was not completely
obvious. Plus it may help get others to implement some parts of it ;-)

Are there testcases that could be added initially to demonstrate how this
fixes cases that are currently broken, even if other cases aren't fixed?

Yes. I'll need to look into dg-require-effective-target fenv(_exceptions) to see how to disable those new tests on targets where they are not supported. There are many easy tests that already start working, say computing 1./3 twice with a change of rounding mode in between and checking that the results differ, or computing 1./3, ignoring the result, and checking FE_INEXACT.
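
For example, something along these lines (the dg- directive and whatever option or pragma ends up enabling the FENV-safe code generation are placeholders):

    /* { dg-require-effective-target fenv } */
    #include <fenv.h>

    int
    main (void)
    {
      double a, b;

      fesetround (FE_DOWNWARD);
      a = 1. / 3.;
      fesetround (FE_UPWARD);
      b = 1. / 3.;
      if (a == b)                  /* must differ once rounding is honoured */
        __builtin_abort ();

      fesetround (FE_TONEAREST);
      feclearexcept (FE_INEXACT);
      (void) (1. / 3.);            /* result unused, FE_INEXACT still raised */
      if (!fetestexcept (FE_INEXACT))
        __builtin_abort ();

      return 0;
    }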


On Wed, 7 Aug 2019, Joseph Myers wrote:

On Sun, 23 Jun 2019, Marc Glisse wrote:

For constant expressions, I see a difference between
constexpr double third = 1. / 3.;
which really needs to be done at compile time, and
const double third = 1. / 3.;
which will try to evaluate the rhs as constexpr, but where the program is
still valid if that fails. The second one clearly should refuse to be
evaluated at compile time if we are specifying a dynamic rounding direction.

For C, initializers with static or thread storage duration always use
round-to-nearest and discard exceptions (see F.8.2 and F.8.5).  This is
unaffected by FENV_ACCESS (but *is* affected by FENV_ROUND).

Thanks for the clarification.

Note that C2x adds a pragma fenv_round that specifies a rounding direction for
a region of code, which seems relevant for constant expressions. That pragma
looks hard, but maybe some pieces would be nice to add.

FENV_ROUND (and FENV_DEC_ROUND) shouldn't be that hard, given the

On the glibc side I expect it to be a lot of work; it seems to require correctly rounded versions of all the math functions...

optimizers avoiding code movement that doesn't respect rounding modes
(though I'm only thinking of C here, not C++).  You'd insert appropriate
built-in function calls to save and restore the dynamic rounding modes in
scopes with a constant rounding mode set, taking due care about scopes
being left through goto etc., and restore the mode around calls to
functions that aren't meant to be affected by the constant rounding modes
- you'd also need a built-in function to indicate to make a call that is
affected by the constant rounding modes (and make __builtin_tgmath do that
as well), and to define all the relevant functions as macros using that
built-in function in the standard library headers.  Optimizations for
architectures supporting rounding modes embedded in instructions could
come later.
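
Conceptually, a FENV_ROUND region would be lowered along these lines (using the libm functions as stand-ins for the internal builtins, which as discussed further down must not actually pull in libm, and glossing over scopes left via goto etc.):

    #include <fenv.h>
    #include <math.h>

    double
    g (double a, double b)
    {
      /* Source: under "#pragma STDC FENV_ROUND FE_UPWARD",
             x = a + b;  y = sin (x);
         lowered roughly as:  */
      double x, y;
      int save = fegetround ();     /* stand-in for an internal builtin  */
      fesetround (FE_UPWARD);
      x = a + b;                    /* evaluated in the constant mode    */
      fesetround (save);            /* restore the dynamic mode ...      */
      y = sin (x);                  /* ... around calls not affected by
                                       the constant rounding mode        */
      return y;
    }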

Complications would include:

* <float.h> constants should use hex floats to avoid being affected by the
constant rounding mode (in turn, this may mean disallowing the FENV_ROUND
pragma in C90 mode because of the lack of hex floats there).  If they use
decimal rather than hex they'd need to be very long constants to have
exactly the right value in all rounding modes.

True. I thought that was on the libc side, but no, float.h is indeed provided by GCC, and all the values are provided by the compiler as macros anyway.

I haven't yet looked at the rounding that happens while parsing a literal, and in particular at which pragmas are supposed to affect it (probably not fenv_access, only fenv_round).

It seems that hex floats are accepted even in C89, with a pedwarn that can be disabled with __extension__, although I am not sure whether using __extension__ in __FLT_MAX__ (so that it would no longer be a pure literal) would cause trouble.

We could also have #pragma fenv_round to_nearest (not the exact syntax) in float.h, although the C standard doesn't seem to have a push/pop mechanism to restore fenv_round at the end of the file.

* The built-in functions to change the dynamic rounding mode can't involve
calling fegetround / fesetround, because those are in libm and libm is not
supposed to be required unless you call a function in <math.h>,
<complex.h> or <fenv.h> (simply using a language feature such as a pragma
should not introduce a libm dependency).  So a similar issue applies as
applied with atomic compound assignment for floating-point types: every
target with hardware floating point needs to have its own support for
expanding those built-in functions inline, and relevant tests will FAIL
(or be UNSUPPORTED through the compiler calling sorry () when the pragma
is used) on targets without that support, until it is added.  (And in
cases where the rounding mode is TLS data in libc rather than in
hardware, such as soft-float PowerPC GNU/Linux and maybe some other cases
for DFP, you need new implementation-namespace interfaces there to save /
restore it.)

Honestly, that doesn't seem like a priority. Sure, in the long term it could make sense for strict conformance (and a bit for performance), but having a not-strictly-conforming dependency on libm when using a pragma that is meant for use with fenv.h seems much better than missing the functionality altogether.
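
For what it's worth, if someone does implement it, on x86 I'd expect the inline expansion to boil down to simple MXCSR manipulation. A very rough sketch using the SSE intrinsics for readability (hypothetical helpers; a real implementation would also have to keep the x87 control word in sync):

    #include <xmmintrin.h>

    /* Only to illustrate what the builtins would expand to inline on x86;
       no libm call is involved.  */
    static inline unsigned int
    get_round_x86 (void)
    {
      return (_mm_getcsr () >> 13) & 3;        /* MXCSR.RC, bits 13-14 */
    }

    static inline void
    set_round_x86 (unsigned int rc)            /* rc is a raw RC value, 0-3 */
    {
      _mm_setcsr ((_mm_getcsr () & ~(3u << 13)) | (rc << 13));
    }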

--
Marc Glisse
