[Bug other/63155] [4.9/5 Regression] memory hog

2014-10-30 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|4.9.2   |4.9.3

--- Comment #4 from Jakub Jelinek  ---
GCC 4.9.2 has been released.


[Bug other/63155] [4.9/5 Regression] memory hog

2014-09-03 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

--- Comment #3 from Richard Biener  ---
So the issue is that the setjmp argument needs two temporaries:

  D.2832 = Unity.CurrentAbortFrame;
  D.2833 = &Unity.AbortFrame[D.2832];

  :
  D.2834 = _setjmp (D.2833);

and the EH edge going into the _setjmp call has to merge those through
the abnormal dispatcher.  And that way it receives all of them.  Hmm.

Huh.  Without the abnormal dispatcher they should just get default defs
everywhere (but still many PHI nodes).  Maybe that would be more light-weight.


[Bug other/63155] [4.9/5 Regression] memory hog

2014-09-03 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

--- Comment #2 from Richard Biener  ---
I wonder why we need to explicitely represent abnormal PHIs in the dispatcher.
All incoming edges are abnormal and all SSA names have to be coalesced anyway.
Thus we could instead have

  :
/* Not: # _2(ab) = PHI <_17(D)(ab)(2), _1(ab)(6), _1(ab)(7), _3(ab)(11),
_3(ab)(12), _4(ab)(15), _4(ab)(16), _5(ab)(20), _5(ab)(21), _5(ab)(22)> */
  ABNORMAL_DISPATCHER (0);
  _2(ab) = D.12345;

or simply rewrite all must-coalesce vars out-of-SSA?  (or not into SSA
in the first place)

The question is whether accesses to them should be loads/stores (I think so)
and if that will cause other similar issues.

We'd have to factor abnormal edges into a block to a separate forwarder
of course, with a load of all "abnormal" vars.

Anyway, not sure why the gimplify code is disabled for -O0 (or why we
don't re-use formal temps more aggressively as they become anonymous
SSA names later anyway).


[Bug other/63155] [4.9/5 Regression] memory hog

2014-09-03 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-09-03
   Target Milestone|--- |4.9.2
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Clearly caused by the correctness fix for setjmp to wire abnormal edges.

For me it is out-of-ssa which uses too much memory while building
the conflict graph.

We have gigantic PHI nodes here:

_10263(ab) = PHI <_109925(D)(ab)(2),, _10592(ab)(1489)>

it's fast when optimizing.

At -O0 we have a _lot_ more anonymous SSA names.

-O1:

  :
  # _1(ab) = PHI <_1902(3), _2(ab)(5)>
  _1905 = _setjmp (_1(ab));
  if (_1905 == 0)
goto ;
  else
goto ;

  
  # _2(ab) = PHI <_1895(D),  single gigantic PHI

-O0:

  :
  # _1(ab) = PHI <_398164(3), _2(ab)(5)>
  # _632(ab) = PHI <_397532(D)(ab)(3), _633(ab)(5)>
  # _1263(ab) = PHI <_397533(D)(ab)(3), _1264(ab)(5)>
  # _1894(ab) = PHI <_397534(D)(ab)(3), _1895(ab)(5)>
  # _2525(ab) = PHI <_397535(D)(ab)(3), _2526(ab)(5)>
...
  # _396900(ab) = PHI <_398160(D)(ab)(3), _396901(ab)(5)>
  _398165 = _setjmp (_1(ab));
  if (_398165 == 0)
goto ;
  else
goto ;

  
  # _2(ab) = PHI <_397531(D)(ab)(2)...

  # _396901(ab) = PHI <_398160(D)(ab)(2), _3...

gazillion of gigantic PHIs.  And very many PHIs in every block.

It's into-SSA that introduces the difference for the PHI nodes
but already GIMPLIFICATION that introduces very many more
temporaries which is the underlying issue (lookup_tmp_var
!optimize check).

Index: gcc/gimplify.c
===
--- gcc/gimplify.c  (revision 214810)
+++ gcc/gimplify.c  (working copy)
@@ -476,7 +476,7 @@ lookup_tmp_var (tree val, bool is_formal
  block, which means it will go into memory, causing much extra
  work in reload and final and poorer code generation, outweighing
  the extra memory allocation here.  */
-  if (!optimize || !is_formal || TREE_SIDE_EFFECTS (val))
+  if (!is_formal || TREE_SIDE_EFFECTS (val))
 ret = create_tmp_from_val (val);
   else
 {

fixes it (but it means that changing the testcase to use more distinct
user variables would produce the same issue even when optimizing).