[Bug tree-optimization/68105] optimizing repeated floating point addition to multiplication
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68105

--- Comment #3 from Marc Glisse ---
(In reply to zboson from comment #0)
> In addition, the following equations are always true even without
> associative math.
>
> 2*a = a + a
> 3*a = a + a + a
> 4*a = a + a + a + a
> 5*a = a + a + a + a + a
>
> It turns out that GCC does simplify a + a + a + a to 4*a but only with
> associative math enabled e.g. with -Ofast when it could do it with -O3.

This "always true" needs some qualification. It is wrong with
-frounding-math, where you have to stop at 3 IIRC.

(In reply to kugan from comment #2)
> Looks like a duplicate of PR63586.

I am not convinced this is completely a dup. The patch
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00368.html
would probably close PR63586, but not this one since all of the testcases
about float in that patch use -ffast-math.
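The rounding-mode caveat in comment #3 can be illustrated with a small sketch. It is not code from the thread. Instead of switching the hardware rounding mode with fesetround (which would also need -frounding-math to be reliable), it emulates round-toward-+infinity in software for positive finite floats; the helper names next_up_pos, round_up, sum_nearest, sum_up and mul_up are made up for this illustration.

```c
#include <float.h>   /* FLT_EPSILON, used for the example input in the note below */
#include <stdint.h>
#include <string.h>

/* Next representable float above f. Only valid for positive finite f. */
static float next_up_pos(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits += 1;   /* adjacent positive floats have adjacent bit patterns */
    memcpy(&f, &bits, sizeof bits);
    return f;
}

/* Emulated round-toward-+infinity of an exact double value to float.
   Assumption: d is positive and the default round-to-nearest mode is
   active for the double -> float conversion. */
static float round_up(double d)
{
    float f = (float)d;
    if ((double)f < d)
        f = next_up_pos(f);
    return f;
}

/* a + a + ... + a (n terms), each addition rounded to nearest. */
float sum_nearest(float a, int n)
{
    float s = a;
    for (int i = 1; i < n; i++)
        s = s + a;
    return s;
}

/* Same chain with every addition rounded upward. For small n the sum of
   the two floats is exact in double, so a single rounding step suffices. */
float sum_up(float a, int n)
{
    float s = a;
    for (int i = 1; i < n; i++)
        s = round_up((double)s + (double)a);
    return s;
}

/* n * a rounded upward. For small n the product is exact in double. */
float mul_up(int n, float a)
{
    return round_up((double)n * (double)a);
}
```

With a = 1.0f + FLT_EPSILON, sum_nearest(a, 4) equals 4.0f * a, and sum_up(a, 3) equals mul_up(3, a), but sum_up(a, 4) is one ulp larger than mul_up(4, a). That matches the "stop at 3" remark for this particular input; it is a single data point, not a proof.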
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68105

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |kugan at gcc dot gnu.org
         Resolution|---                         |DUPLICATE

--- Comment #2 from kugan at gcc dot gnu.org ---
Looks like a duplicate of PR63586.

*** This bug has been marked as a duplicate of bug 63586 ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68105

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-10-27
          Component|c                           |tree-optimization
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener ---
Confirmed. One thing is that the x + x -> 2*x canonicalization GCC has for
this purpose is done in fold-const.c only (probably ok, see below).

Second thing is that the reassociation pass does _not_ do this optimization.
Testcase for that:

float foo (float x)
{
  float tem = x + x;
  float tem2 = tem + x;
  tem = tem2 + x;
  tem2 = tem + x;
  tem = tem2 + x;
  tem2 = tem + x;
  return tem2;
}

Sounds like a beginners project - look at undistribute_ops_list and enhance
it to handle the case of implicit * 1 (and maybe multi-use mults on the way).

Then of course when in a loop we don't have final value replacement for
non-integer types as we use SCEV for that. The SCCP pass needs to be amended
here - I believe there is a duplicate bugreport for this somewhere.

Note that it can't do this w/o -funsafe-math-optimizations as "more accurate"
is only true if you see it mathematically vs. the FP operations. But GCC
has to adhere to IEEE FP rules here. It should be guarded with
-ffp-contract=fast rather than -fassociative-math though.
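To see why the seven-term chain in the testcase from comment #1 cannot simply be rewritten as one multiplication under strict IEEE semantics, here is a small sketch comparing the two forms. It is not from the thread: foo is copied from the testcase, mul7 is a hypothetical name for the candidate replacement, and the results assume IEEE single precision with round-to-nearest and no -ffast-math.

```c
#include <float.h>   /* FLT_EPSILON, used for the example input in the note below */

/* Testcase from comment #1: x added to itself in a chain, seven terms total. */
float foo(float x)
{
    float tem = x + x;
    float tem2 = tem + x;
    tem = tem2 + x;
    tem2 = tem + x;
    tem = tem2 + x;
    tem2 = tem + x;
    return tem2;
}

/* Candidate replacement: a single multiplication, rounded once. */
float mul7(float x)
{
    return 7.0f * x;
}
```

For x = 1.0f both functions return 7.0f. For x = 1.0f + FLT_EPSILON the chain accumulates rounding error at each step and returns 7 + 4*FLT_EPSILON, while the single multiplication returns 7 + 8*FLT_EPSILON. The multiplication is closer to the exact value 7 + 7*FLT_EPSILON, but the result differs from the chained additions, so the rewrite changes observable behavior.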