[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2023-01-07 Thread glisse at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

--- Comment #12 from Marc Glisse  ---
(In reply to Marc Glisse from comment #11)
> Since I had forgotten where it was, let me write here that it is git branch
> /users/glisse/fenv

Since it became impossible (hooks) to push to that branch a while ago, I should
post somewhere the FIXME file I couldn't push last year:

Looking at LLVM, I notice that my design in the gcc fenv branch seems to be
missing a fundamental piece: it has nothing preventing "normal" operations from
outside from migrating towards the protected region, where they may end up
using an unexpected rounding mode (unprotected doesn't mean any rounding mode,
it means the default one), or setting flags that we will observe.
One idea to prevent this would be to make sure that there are no normal FP
operations in functions that have protected operations (does that mean we
should mark functions? Just checking if there is a protected FP op doesn't work
if we call a function that does the op).
This means that we should turn all FP operations of the function into protected
ones (possibly with more relaxed flags if they are not in the protected
region), and we should also do that whenever inlining mixed functions. And
cross my fingers that the compiler doesn't start using FP ops out of thin air.
Would that be sufficient?

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2022-08-26 Thread glisse at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

--- Comment #11 from Marc Glisse  ---
(In reply to Segher Boessenkool from comment #8)
> Thanks for the pointer, I'll find Marc's work.

Since I had forgotten where it was, let me write here that it is git branch
/users/glisse/fenv

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

--- Comment #10 from Richard Biener  ---
(In reply to jos...@codesourcery.com from comment #9)
> On Tue, 19 Oct 2021, segher at gcc dot gnu.org via Gcc-bugs wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783
> > 
> > --- Comment #8 from Segher Boessenkool  ---
> > (In reply to jos...@codesourcery.com from comment #6)
> > > Generically (and if the command-line options are such that floating-point 
> > > control / status bits are to be respected by optimizations), *any* 
> > > function call might access or modify floating-point control and status 
> > > bits, subject to e.g. const functions not being able to access them, pure 
> > > functions not being able to modify them, functions whose body is known 
> > > having properties based on analysis of that body, built-in functions 
> > > having semantics based on what the compiler knows about those functions.  
> > 
> > If FENV_ACCESS is OFF most of those things can be ignored as well.  But
> > FENV_ACCESS is much too blunt a hammer for most of our uses.
> 
> My recent discussions with Roger Sayle,
> and bug 54192 as referenced therein, may be helpful for more details of 
> how FENV_ACCESS could be split up.  (At present we have -ftrapping-math, 
> on by default, and -frounding-math, off by default.  I suspect that if 
> -ftrapping-math really restricted optimizations enough to avoid all 
> problematic code reordering / removal in the presence of function calls 
> possibly reading and writing exception flags, it would actually inhibit 
> optimization more than a full implementation of -frounding-math would: a 
> full -frounding-math only means that arithmetic *reads* the rounding mode, 
> whereas a full -ftrapping-math means that arithmetic *writes* to the 
> exception flags.)

But one interesting detail is that those writes can be re-ordered when
they are not (synchronously) observed since the exception flags as
produced by arithmetic are "sticky".  That makes mapping their dataflow
to SSA not very precise, you'd have to make arithmetic produce flags
and merge them at use points.

Anyway, I think we can reach a good enough implementation without actually
implementing any data flow by simply restricting what we do to stmts.
It will of course require manual intervention in passes that can break
things rather than having the restriction being visible by data flow that's
checked anyway.
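
As a toy illustration of the "produce flags and merge them at use points"
idea (this is not GCC internals, and every name below is hypothetical),
arithmetic can be modelled as returning its own flag set, with the sets
OR-ed together only where the flags are observed.  Because OR is commutative
and associative, the order in which the operations ran does not matter at the
use point:

```c
/* Toy model only: two made-up flag bits, not the real <fenv.h> macros. */
#define TOY_DIVBYZERO (1u << 0)
#define TOY_INVALID   (1u << 1)

/* Each operation yields a value plus the flags it alone produced. */
struct toy_result {
  double value;
  unsigned flags;
};

/* Division that reports flags instead of writing a global status word.
   Only x/0 (nonzero x) and 0/0 are modelled, and infinities are not treated
   specially; in those cases the value is just a 0.0 placeholder. */
struct toy_result toy_div(double a, double b)
{
  struct toy_result r = { 0.0, 0u };

  if (b == 0.0) {
    if (a == 0.0)
      r.flags = TOY_INVALID;      /* 0/0 */
    else if (a == a)
      r.flags = TOY_DIVBYZERO;    /* non-NaN, nonzero a divided by 0 */
    return r;
  }
  r.value = a / b;
  return r;
}

/* Use point: merge the flags of everything that fed into the observation. */
unsigned toy_merge(unsigned f1, unsigned f2)
{
  return f1 | f2;
}
```

Merging toy_div(1.0, 0.0) and toy_div(0.0, 0.0) gives the same flag set in
either order, which is the property that makes a strict ordering of the
writes unnecessary.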

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-19 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

--- Comment #9 from joseph at codesourcery dot com  ---
On Tue, 19 Oct 2021, segher at gcc dot gnu.org via Gcc-bugs wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783
> 
> --- Comment #8 from Segher Boessenkool  ---
> (In reply to jos...@codesourcery.com from comment #6)
> > Generically (and if the command-line options are such that floating-point 
> > control / status bits are to be respected by optimizations), *any* 
> > function call might access or modify floating-point control and status 
> > bits, subject to e.g. const functions not being able to access them, pure 
> > functions not being able to modify them, functions whose body is known 
> > having properties based on analysis of that body, built-in functions 
> > having semantics based on what the compiler knows about those functions.  
> 
> If FENV_ACCESS is OFF most of those things can be ignored as well.  But
> FENV_ACCESS is much too blunt a hammer for most of our uses.

My recent discussions with Roger Sayle,
and bug 54192 as referenced therein, may be helpful for more details of 
how FENV_ACCESS could be split up.  (At present we have -ftrapping-math, 
on by default, and -frounding-math, off by default.  I suspect that if 
-ftrapping-math really restricted optimizations enough to avoid all 
problematic code reordering / removal in the presence of function calls 
possibly reading and writing exception flags, it would actually inhibit 
optimization more than a full implementation of -frounding-math would: a 
full -frounding-math only means that arithmetic *reads* the rounding mode, 
whereas a full -ftrapping-math means that arithmetic *writes* to the 
exception flags.)

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-19 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

--- Comment #8 from Segher Boessenkool  ---
(In reply to jos...@codesourcery.com from comment #6)
> Generically (and if the command-line options are such that floating-point 
> control / status bits are to be respected by optimizations), *any* 
> function call might access or modify floating-point control and status 
> bits, subject to e.g. const functions not being able to access them, pure 
> functions not being able to modify them, functions whose body is known 
> having properties based on analysis of that body, built-in functions 
> having semantics based on what the compiler knows about those functions.  

If FENV_ACCESS is OFF most of those things can be ignored as well.  But
FENV_ACCESS is much too blunt a hammer for most of our uses.

> And then a subset of asms may similarly access or modify them (based on 
> inputs / outputs / clobbers, but maybe on some architectures existing 
> practice doesn't provide a register name that inputs / outputs / clobbers 
> can use to refer to floating-point state).

Like PowerPC.  But we *do* model vscr (vector status and control register).
It won't be hard to add fpscr.

> Then you'd need something like Marc Glisse's -ffenv-access patches (August 
> 2020) to represent the other side of things, how floating-point operations 
> also access / modify such bits.

Yeah, we need something for normal computational FP insns to clobber (on
PowerPC load/store insns never change the fpscr / fenv, but I bet that is
different on other archs).

Thanks for the pointer, I'll find Marc's work.

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-19 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

--- Comment #7 from Segher Boessenkool  ---
(In reply to Richard Biener from comment #5)
> Even out-of-line does not help if there are visible CSE/association
> opportunities across such call.

Yeah, good point.

> A workaround is to make the out-of-line
> function __attribute__((returns_twice)) which should insert artificial
> control flow preventing such transforms.

Is there anything that guarantees that to work (other than our actual
current implementation)?  It is much more stringent / expensive than we
would want, but if it is the best we can do...

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-18 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

--- Comment #6 from joseph at codesourcery dot com  ---
Generically (and if the command-line options are such that floating-point 
control / status bits are to be respected by optimizations), *any* 
function call might access or modify floating-point control and status 
bits, subject to e.g. const functions not being able to access them, pure 
functions not being able to modify them, functions whose body is known 
having properties based on analysis of that body, built-in functions 
having semantics based on what the compiler knows about those functions.  
And then a subset of asms may similarly access or modify them (based on 
inputs / outputs / clobbers, but maybe on some architectures existing 
practice doesn't provide a register name that inputs / outputs / clobbers 
can use to refer to floating-point state).

Then you'd need something like Marc Glisse's -ffenv-access patches (August 
2020) to represent the other side of things, how floating-point operations 
also access / modify such bits.

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

--- Comment #5 from Richard Biener  ---
Even out-of-line does not help if there are visible CSE/association
opportunities across such a call.  A workaround is to make the out-of-line
function __attribute__((returns_twice)) which should insert artificial control
flow preventing such transforms.
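
A minimal sketch of what that workaround could look like.  The names
set_fp_mode and fake_fp_control are hypothetical, and the volatile variable
merely stands in for a real control-register write (for example mtfsf on
PowerPC).  Whether the attribute keeps blocking such transforms is a property
of the current implementation, not a documented guarantee:

```c
/* Hypothetical stand-in for the FP control register. */
volatile int fake_fp_control;

/* Out-of-line setter.  returns_twice tells GCC that a call may return more
   than once (as setjmp does), so call sites are treated conservatively. */
__attribute__((noinline, returns_twice))
void set_fp_mode(int mode)
{
  fake_fp_control = mode;
}

/* a + b is written twice on purpose: without a barrier the two additions
   are an obvious CSE candidate across the second call. */
double add_in_two_modes(double a, double b)
{
  set_fp_mode(1);
  double x = a + b;
  set_fp_mode(0);
  double y = a + b;
  return x + y;
}
```

Checking the generated assembly (for example with -O2 -S) is the only way to
see whether both additions survive on a given compiler version.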

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

Andrew Pinski  changed:

           What    |Removed |Added
----------------------------------------------------------------------------
           See Also|        |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20785,
                   |        |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678

--- Comment #4 from Andrew Pinski  ---
PR 20785 and bug 34678 come to mind for the generic issue on the gimple and rtl
levels.  There are many other linked bugs on those two too.

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

Segher Boessenkool  changed:

           What    |Removed     |Added
----------------------------------------------------------------------------
   Last reconfirmed|            |2021-10-15
     Ever confirmed|0           |1
             Status|UNCONFIRMED |NEW

--- Comment #3 from Segher Boessenkool  ---
Confirmed.

This is about the control part.  The status part has similar issues as well
but needs opposite ordering; we do not have any ordering right now, that is
the problem.

We have the same issues for vectors with the VSCR.  That one has only one
status bit: SAT, for saturation, and we set that explicitly in all insns that
do set it.  All of those are unusual, done via builtins, etc.  We model a VSCR
register just for this.  It also has one control bit: NJ, "non-java", it
disables strict IEEE arithmetic, which was useful for improved performance on
old cores.  We do not actually order setting that relative to insns that use
that control bit, but I have never actually seen anything set that bit, so the
issue does not practically exist there.

But for FP we need to order setting the control bits relative to any FP
computational insn, and reading the status bits as well.  There currently is
no way in GCC to say this.  It might be best to have a hook to say what
control bits there are, what insns care about which control bits, and what
insns set each of those bits.  And similar for status bits.

Does this sound generic enough, does it serve the needs of all targets?
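
To make the proposal a little more concrete, here is a rough sketch of the
information such a hook might expose and how a pass could consult it.  Nothing
here is an existing GCC interface: the bit names, struct fp_effects, and
fp_may_swap are all hypothetical, and treating exception flags as sticky
(so that two writers commute) is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical bit names; a real target would list its own. */
#define FPBIT_ROUNDING   (1u << 0)   /* control: rounding mode          */
#define FPBIT_EXCEPTIONS (1u << 1)   /* status: sticky exception flags  */

/* What one instruction class does to the FP control/status bits. */
struct fp_effects {
  unsigned reads_control;
  unsigned sets_control;
  unsigned reads_status;
  unsigned sets_status;     /* sticky writes: bits are only ever OR-ed in */
  unsigned clears_status;   /* non-sticky writes, e.g. clearing flags     */
};

/* Conservative check: may two adjacent instructions be swapped? */
bool fp_may_swap(const struct fp_effects *a, const struct fp_effects *b)
{
  /* A control write stays ordered against reads and writes of that bit. */
  if (a->sets_control & (b->reads_control | b->sets_control))
    return false;
  if (b->sets_control & a->reads_control)
    return false;

  /* Status reads and clears stay ordered against any writer of that bit. */
  unsigned a_writes = a->sets_status | a->clears_status;
  unsigned b_writes = b->sets_status | b->clears_status;
  if ((a->reads_status | a->clears_status) & b_writes)
    return false;
  if ((b->reads_status | b->clears_status) & a_writes)
    return false;

  /* Two sticky writers commute; so do instructions that touch nothing. */
  return true;
}

/* Illustrative entries, not taken from any real machine description. */
const struct fp_effects fp_add    = { .reads_control = FPBIT_ROUNDING,
                                      .sets_status = FPBIT_EXCEPTIONS };
const struct fp_effects fp_setrnd = { .sets_control = FPBIT_ROUNDING };
const struct fp_effects fp_testex = { .reads_status = FPBIT_EXCEPTIONS };
const struct fp_effects fp_load   = { 0 };
```

With these entries two additions may be swapped with each other, but an
addition may not cross a rounding-mode write or a read of the exception flags,
while a load may cross anything.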

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-15 Thread pthaugen at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

pthaugen at gcc dot gnu.org changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |pthaugen at gcc dot gnu.org

--- Comment #2 from pthaugen at gcc dot gnu.org ---
I’ll note that an inline asm stmt appears to be a barrier for the scheduler,
but apparently not for other parts of the compiler. For example on the
following code:

double d;
void foo(double *dp, double c)
{
  double e;

  e = c + d;
  asm volatile ("");
  *dp = e + d;
  return;
} 

The scheduling dumps show that the asm volatile has dependencies on all insns
before and after it. But that doesn’t really help because the first addition
stmt gets moved past the asm volatile at expand time.
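
A common way to pin one particular value on one side of such an asm (a
general GCC idiom, not something proposed in this report) is to pass the value
through the asm as an in/out operand.  The sketch below is a variant of the
example above; the name foo_tied and the initial value of d are chosen for
illustration only.  It orders only the operations that depend on e, so it is
not a general barrier for FP code:

```c
double d = 1.5;   /* initial value chosen only for illustration */

void foo_tied(double *dp, double c)
{
  double e;

  e = c + d;
  /* The empty asm claims to read and modify e (in memory), so the first
     addition has to be finished before it and the second addition has to
     start from its output. */
  __asm__ volatile ("" : "+m" (e));
  *dp = e + d;
}
```

An FP operation that does not use e can still be moved across the asm, which
is the gap this bug report is about.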

[Bug target/102783] [powerpc] FPSCR manipulations cannot be relied upon

2021-10-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102783

--- Comment #1 from Andrew Pinski  ---
There are a few other bugs which are very similar to this one.  GCC not
implementing a pragma is one of them.