[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-08-11 Thread ebotcazou at gcc dot gnu dot org


--- Comment #35 from ebotcazou at gcc dot gnu dot org  2006-08-11 07:17 ---
Jan, I'm assigning it to you since you have already spent a fair amount of time
on it and made significant progress.  Thanks for tackling the hard stuff.


-- 

ebotcazou at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|hubicka at gcc dot gnu dot  |ebotcazou at gcc dot gnu dot
                   |org                         |org
         AssignedTo|unassigned at gcc dot gnu   |hubicka at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-29 Thread patchapp at dberlin dot org


--- Comment #34 from patchapp at dberlin dot org  2006-07-30 05:45 ---
Subject: Bug number PR rtl-optimization/28071

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01221.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-29 Thread hubicka at gcc dot gnu dot org


--- Comment #33 from hubicka at gcc dot gnu dot org  2006-07-29 13:14 ---
Subject: Bug 28071

Author: hubicka
Date: Sat Jul 29 13:14:22 2006
New Revision: 115810

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=115810
Log:

PR rtl-optimization/28071
* cfgrtl.c (rtl_delete_block): Free regsets.
* flow.c (allocate_bb_life_data): Re-use regsets if available.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/cfgrtl.c
trunk/gcc/flow.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-28 Thread hubicka at ucw dot cz


--- Comment #32 from hubicka at ucw dot cz  2006-07-28 09:41 ---
Subject: Re:  [4.1/4.2 regression] A file that can not be compiled in
reasonable time/space

Hi,
I've added this testcase to our memory regression tester (see the
gcc-regression mailing list), so hopefully the quadratic memory
consumption issues will be tracked now.  It would be nice to have a
runtime benchmark variant of the test so that we can track both the
runtime and the compilation time.  It seems to uncover quite interesting
behaviours across the compiler.

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-28 Thread patchapp at dberlin dot org


--- Comment #31 from patchapp at dberlin dot org  2006-07-28 09:30 ---
Subject: Bug number PR rtl-optimization/28071

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01185.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-27 Thread hubicka at gcc dot gnu dot org


--- Comment #30 from hubicka at gcc dot gnu dot org  2006-07-27 17:10 ---
Subject: Bug 28071

Author: hubicka
Date: Thu Jul 27 17:10:07 2006
New Revision: 115779

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=115779
Log:
PR rtl-optimization/28071
* hashtab.c (htab_empty): Clear out n_deleted/n_elements;
downsize the hashtable.

Modified:
trunk/libiberty/ChangeLog
trunk/libiberty/hashtab.c
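
The log entry above can be illustrated with a small sketch.  This is not the
libiberty code: the struct, the names toy_table, toy_table_init and
toy_table_empty, and the TOY_MIN_SLOTS threshold are all hypothetical.  It
only shows the general idea of an "empty" operation that resets both counters
and, when the table had grown large, swaps in a small slot array, presumably
so that later clears and scans stop paying for the old peak size.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MIN_SLOTS 32   /* hypothetical threshold, not taken from libiberty */

/* Hypothetical open-addressing table; only the bookkeeping fields matter.  */
struct toy_table
{
  void **slots;
  size_t size;        /* number of slots allocated */
  size_t n_elements;  /* live plus deleted entries */
  size_t n_deleted;   /* tombstones */
};

/* Allocate SIZE empty slots.  Returns nonzero on success.  */
static int
toy_table_init (struct toy_table *t, size_t size)
{
  t->slots = calloc (size, sizeof *t->slots);
  t->size = t->slots ? size : 0;
  t->n_elements = t->n_deleted = 0;
  return t->slots != NULL;
}

/* Forget all entries.  A table that grew beyond TOY_MIN_SLOTS is replaced
   by a small one; if that allocation fails, or the table is already small,
   the existing slots are cleared in place instead.  */
static void
toy_table_empty (struct toy_table *t)
{
  void **fresh = NULL;
  size_t i;

  if (t->size > TOY_MIN_SLOTS)
    fresh = calloc (TOY_MIN_SLOTS, sizeof *fresh);

  if (fresh != NULL)
    {
      free (t->slots);
      t->slots = fresh;
      t->size = TOY_MIN_SLOTS;
    }
  else
    for (i = 0; i < t->size; i++)
      t->slots[i] = NULL;

  /* Both counters must be reset, otherwise later resize decisions would be
     based on entries that no longer exist.  */
  t->n_elements = t->n_deleted = 0;
}
```

Without the downsizing step, a table that was large once would stay large, and
every later clear would have to touch all of its slots.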


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-27 Thread hubicka at gcc dot gnu dot org


--- Comment #29 from hubicka at gcc dot gnu dot org  2006-07-27 16:03 ---
Subject: Bug 28071

Author: hubicka
Date: Thu Jul 27 16:03:22 2006
New Revision: 115777

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=115777
Log:
PR rtl-optimization/28071
* cselib.c (cselib_process_insn): Don't remove useless values too
often for very large hashtables.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/cselib.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-27 Thread hubicka at gcc dot gnu dot org


--- Comment #28 from hubicka at gcc dot gnu dot org  2006-07-27 16:02 ---
Subject: Bug 28071

Author: hubicka
Date: Thu Jul 27 16:02:27 2006
New Revision: 115776

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=115776
Log:
PR rtl-optimization/28071
* global.c (greg_obstack): New obstack.
(allocate_bb_info): Use it.
(free_bb_info): Likewise.
(modify_reg_pav): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/global.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-27 Thread patchapp at dberlin dot org


--- Comment #27 from patchapp at dberlin dot org  2006-07-27 08:00 ---
Subject: Bug number PR rtl-optimization/28071

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01147.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-27 Thread patchapp at dberlin dot org


--- Comment #26 from patchapp at dberlin dot org  2006-07-27 07:25 ---
Subject: Bug number PR rtl-optimization/28071

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01146.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-27 Thread patchapp at dberlin dot org


--- Comment #25 from patchapp at dberlin dot org  2006-07-27 07:20 ---
Subject: Bug number PR rtl-optimization/28071

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01145.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-27 Thread patchapp at dberlin dot org


--- Comment #24 from patchapp at dberlin dot org  2006-07-27 07:15 ---
Subject: Bug number PR rtl-optimization/28071

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01144.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-26 Thread hubicka at gcc dot gnu dot org


--- Comment #23 from hubicka at gcc dot gnu dot org  2006-07-26 22:52 ---
Subject: Bug 28071

Author: hubicka
Date: Wed Jul 26 22:51:56 2006
New Revision: 115765

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=115765
Log:
PR rtl-optimization/28071
* regmove.c (reg_is_remote_constant_p): Avoid quadratic behaviour.
(reg_set_in_bb, max_reg_computed): New static variables.
(regmove_optimize): Free the new array.
(fixup_match_1): Update call of reg_is_remote_constant_p.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/regmove.c
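
One common way to remove this kind of quadratic behaviour is to replace a
per-query scan of the instruction stream with a table computed once.  The
sketch below is not the regmove.c change: the toy_insn type, the function
names and the sentinel values are all hypothetical, and the model assumes
each instruction sets exactly one register.  It only illustrates how a
per-register "set in which block" array turns each query into a constant-time
lookup.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical IR: each instruction sets one register in one block.  */
struct toy_insn { int block; int dest_reg; };

#define NOT_SET   (-1)   /* register is never set */
#define MANY_BBS  (-2)   /* register is set in more than one block */

/* One pass over all instructions: for each register record the single
   block that sets it, NOT_SET, or MANY_BBS.  The caller frees the result.  */
static int *
compute_reg_set_block (const struct toy_insn *insns, size_t n_insns,
                       int n_regs)
{
  int *set_in = malloc ((size_t) n_regs * sizeof *set_in);
  size_t i;
  int r;

  if (set_in == NULL)
    return NULL;
  for (r = 0; r < n_regs; r++)
    set_in[r] = NOT_SET;
  for (i = 0; i < n_insns; i++)
    {
      int *slot = &set_in[insns[i].dest_reg];
      if (*slot == NOT_SET)
        *slot = insns[i].block;
      else if (*slot != insns[i].block)
        *slot = MANY_BBS;
    }
  return set_in;
}

/* O(1) query: is REG set in exactly one block, and is that block different
   from BLOCK?  Registers set in several blocks conservatively answer no.  */
static int
reg_set_only_elsewhere (const int *set_in, int reg, int block)
{
  return set_in[reg] >= 0 && set_in[reg] != block;
}
```

With n instructions and q queries, this costs on the order of n + q steps,
where rescanning the instructions for every query would cost n * q.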


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-25 Thread patchapp at dberlin dot org


--- Comment #22 from patchapp at dberlin dot org  2006-07-25 18:20 ---
Subject: Bug number PR rtl-optimization/28071

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01083.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-24 Thread hubicka at gcc dot gnu dot org


--- Comment #21 from hubicka at gcc dot gnu dot org  2006-07-24 11:54 ---
OK, some summary ;)

Mainline (after the first three patches) at -O now peaks at 450MB (just because
of the register allocator's conflict matrix; otherwise it is about 150MB).
Still not quite icc's 12 seconds/200MB, but we are out of regression land for
-O relative to 4.0.  I tested 3.0 and it bombs on the testcase; 2.95, however,
compiles it quite smoothly with a 200MB peak, though it needs 6 minutes.

 life analysis         :  25.92 (16%) usr   0.01 ( 0%) sys  26.18 (15%) wall    2565 kB ( 1%) ggc
 inline heuristics     :  15.15 ( 9%) usr   0.01 ( 0%) sys  15.27 ( 9%) wall    1486 kB ( 1%) ggc
 integration           :  21.37 (13%) usr   0.12 ( 5%) sys  21.66 (13%) wall   33445 kB (19%) ggc
 tree SSA to normal    :  27.73 (17%) usr   0.03 ( 1%) sys  27.93 (16%) wall      17 kB ( 0%) ggc
 local alloc           :   7.33 ( 4%) usr   0.03 ( 1%) sys   7.41 ( 4%) wall    1855 kB ( 1%) ggc
 global alloc          :  13.67 ( 8%) usr   0.73 (32%) sys  15.85 ( 9%) wall   14178 kB ( 8%) ggc
 reload CSE regs       :  30.88 (19%) usr   0.04 ( 2%) sys  31.09 (18%) wall    2393 kB ( 1%) ggc
 TOTAL                 : 164.46             2.27           169.53             173593 kB

It would be interesting to see how the dataflow branch scores here after
re-merging from mainline.  Hopefully the integration and register allocation
issues should be tracked there.

The inliner is still quadratic in time because of quadratic split_block and
cgraph_node.  Both can be made linear quite easily (split_block by always
renumbering the smaller area of the block, and cgraph_node by producing
hashtables for nodes with many edges), but I am not sure I want to do that for
4.2.  Inline heuristics might be trickier to speed up.
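
The "renumber the smaller area" idea can be sketched as follows.  This is a
toy model, not GCC code: statements live in an array, owner[i] holds the id of
the block containing statement i, and struct block and split_relabel_smaller
are hypothetical names.  On every split, whichever part is smaller receives
the new id, so only that part has its owner entries rewritten.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a block is the half-open range [start, end) of
   statement indices, and owner[i] is the id of the block holding i.  */
struct block { int id; size_t start, end; };

/* Split B before position POS, giving NEW_ID to whichever part is smaller.
   The new block is stored in OUT, and B keeps the larger part.  Returns
   the number of owner entries rewritten.  */
static size_t
split_relabel_smaller (int *owner, struct block *b, size_t pos, int new_id,
                       struct block *out)
{
  size_t head, tail, i;

  assert (b->start <= pos && pos <= b->end);
  head = pos - b->start;
  tail = b->end - pos;
  out->id = new_id;
  if (tail <= head)
    {
      /* The tail is smaller: it becomes the new block.  */
      out->start = pos;
      out->end = b->end;
      b->end = pos;
    }
  else
    {
      /* The head is smaller: it becomes the new block, B keeps the tail.  */
      out->start = b->start;
      out->end = pos;
      b->start = pos;
    }
  for (i = out->start; i < out->end; i++)
    owner[i] = new_id;
  return out->end - out->start;
}
```

In this model a statement that is relabelled always lands in a block at most
half the size of the one it left, so each statement is relabelled at most
about log2(n) times over any sequence of splits, instead of up to n times when
the tail is always the part that moves.  A real CFG would also have to fix up
edges when the old block identity stays with the tail, which the sketch
ignores.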

I don't know about reload.  Oprofile might be handy ;)

-O2 exposes a problem in PRE that DannyB has a fix for.  Regmove and into-SSA
can also be significantly sped up by the patches I attached; I will commit them
once testing converges.

-O3 turns the testcase into quite a different one (the gigantic basic block is
turned into many basic blocks by inlining the min/max functions).
A few problems are still visible there: FRE consumes an unbounded amount of
memory and we fail to synthesize fmin/fmax operators where we ought to.

If the FRE problem is fixed, I would say this should no longer be considered a
4.2 blocker.

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-24 Thread hubicka at gcc dot gnu dot org


--- Comment #20 from hubicka at gcc dot gnu dot org  2006-07-24 11:28 ---
Subject: Bug 28071

Author: hubicka
Date: Mon Jul 24 11:27:53 2006
New Revision: 115713

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=115713
Log:
PR rtl-optimization/28071
* tree-cfg.c (tree_split_block): Do not allocate new stmt_list nodes.
* tree-iterator.c (tsi_split_statement_list_before): Do not crash when
splitting before first stmt.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-cfg.c
trunk/gcc/tree-iterator.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-24 Thread hubicka at gcc dot gnu dot org


--- Comment #19 from hubicka at gcc dot gnu dot org  2006-07-24 11:24 ---
Subject: Bug 28071

Author: hubicka
Date: Mon Jul 24 11:23:21 2006
New Revision: 115712

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=115712
Log:
PR rtl-optimization/28071
* ipa-inline.c (update_caller_keys): Remove edges that
are no longer inline candidates.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-inline.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-23 Thread patchapp at dberlin dot org


--- Comment #18 from patchapp at dberlin dot org  2006-07-24 00:05 ---
Subject: Bug number PR28071

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01011.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-22 Thread hubicka at ucw dot cz


--- Comment #16 from hubicka at ucw dot cz  2006-07-22 20:51 ---
Subject: Re:  [4.1/4.2 regression] A file that can not be compiled in
reasonable time/space

Hi,
with the attached patch, which saves roughly 10 minutes in the tree-into-ssa
pass, I can compile with -O3 -fno-tree-fre -fno-tree-pre.  This works only
without checking enabled, since we do incredibly deep dominator walks that
run out of stack space, which can be considered a bug too.
TER still manages to enforce a few thousand temporaries with overlapping
live ranges.

The out-of-ssa pass spends most of its time in calculate_live_on_exit
and calculate_live_on_entry, which look rather symmetric to the problem cured
by the attached patch, but I don't see directly how to avoid the
quadratic behaviour there.

Honza

 garbage collection    :   1.22 ( 0%) usr   0.10 ( 1%) sys   8.40 ( 1%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.14 ( 0%) usr   0.03 ( 0%) sys   0.18 ( 0%) wall    1147 kB ( 0%) ggc
 callgraph optimization:   0.07 ( 0%) usr   0.01 ( 0%) sys   0.45 ( 0%) wall     533 kB ( 0%) ggc
 ipa reference         :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 ipa type escape       :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 cfg cleanup           :   3.89 ( 1%) usr   0.01 ( 0%) sys   4.11 ( 0%) wall    1576 kB ( 1%) ggc
 trivially dead code   :   0.46 ( 0%) usr   0.00 ( 0%) sys   0.53 ( 0%) wall       0 kB ( 0%) ggc
 life analysis         :  51.34 ( 9%) usr   2.65 (21%) sys  73.91 ( 5%) wall    2653 kB ( 1%) ggc
 life info update      :  48.97 ( 9%) usr   0.14 ( 1%) sys  50.68 ( 4%) wall     641 kB ( 0%) ggc
 alias analysis        :   0.69 ( 0%) usr   0.00 ( 0%) sys   1.05 ( 0%) wall    4139 kB ( 1%) ggc
 register scan         :   0.41 ( 0%) usr   0.00 ( 0%) sys   0.40 ( 0%) wall       0 kB ( 0%) ggc
 rebuild jump labels   :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   0.37 ( 0%) usr   0.06 ( 0%) sys   0.34 ( 0%) wall     471 kB ( 0%) ggc
 lexical analysis      :   0.01 ( 0%) usr   0.05 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   0.09 ( 0%) usr   0.02 ( 0%) sys   0.18 ( 0%) wall    3207 kB ( 1%) ggc
 inline heuristics     :  14.79 ( 3%) usr   0.02 ( 0%) sys  14.86 ( 1%) wall    1118 kB ( 0%) ggc
 integration           :  17.07 ( 3%) usr   0.22 ( 2%) sys  17.36 ( 1%) wall   79483 kB (27%) ggc
 tree gimplify         :   0.15 ( 0%) usr   0.01 ( 0%) sys   0.17 ( 0%) wall    3341 kB ( 1%) ggc
 tree eh               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    1338 kB ( 0%) ggc
 tree CFG cleanup      :   4.27 ( 1%) usr   0.00 ( 0%) sys   4.27 ( 0%) wall      20 kB ( 0%) ggc
 tree VRP              :   1.26 ( 0%) usr   0.03 ( 0%) sys   1.33 ( 0%) wall      14 kB ( 0%) ggc
 tree copy propagation :   0.85 ( 0%) usr   0.05 ( 0%) sys   0.94 ( 0%) wall     313 kB ( 0%) ggc
 tree store copy prop  :   0.27 ( 0%) usr   0.01 ( 0%) sys   0.28 ( 0%) wall       5 kB ( 0%) ggc
 tree find ref. vars   :   0.16 ( 0%) usr   0.03 ( 0%) sys   0.18 ( 0%) wall   12044 kB ( 4%) ggc
 tree PTA              :   1.55 ( 0%) usr   0.06 ( 0%) sys   1.63 ( 0%) wall      57 kB ( 0%) ggc
 tree alias analysis   :   2.81 ( 0%) usr   0.29 ( 2%) sys   3.10 ( 0%) wall       0 kB ( 0%) ggc
 tree PHI insertion    :   0.57 ( 0%) usr   0.92 ( 7%) sys   1.52 ( 0%) wall    3137 kB ( 1%) ggc
 tree SSA rewrite      :   2.33 ( 0%) usr   0.06 ( 0%) sys   5.02 ( 0%) wall   21592 kB ( 7%) ggc
 tree SSA other        :   0.41 ( 0%) usr   0.16 ( 1%) sys   0.65 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA incremental  :   4.18 ( 1%) usr   0.45 ( 4%) sys   4.72 ( 0%) wall     520 kB ( 0%) ggc
 tree operand scan     :   1.79 ( 0%) usr   0.69 ( 5%) sys  39.97 ( 3%) wall   18374 kB ( 6%) ggc
 dominator optimization:   2.91 ( 1%) usr   0.05 ( 0%) sys   2.99 ( 0%) wall   11155 kB ( 4%) ggc
 tree SRA              :   4.24 ( 1%) usr   0.15 ( 1%) sys   4.51 ( 0%) wall   25568 kB ( 9%) ggc
 tree STORE-CCP        :   0.29 ( 0%) usr   0.01 ( 0%) sys   0.31 ( 0%) wall      18 kB ( 0%) ggc
 tree CCP              :   0.87 ( 0%) usr   0.01 ( 0%) sys   2.39 ( 0%) wall     154 kB ( 0%) ggc
 tree split crit edges :   0.11 ( 0%) usr   0.02 ( 0%) sys   0.14 ( 0%) wall    9284 kB ( 3%) ggc
 tree reassociation    :   0.34 ( 0%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall       0 kB ( 0%) ggc
 tree code sinking     :   0.32 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%) wall       0 kB ( 0%) ggc
 tree linearize phis   :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall       0 kB ( 0%) ggc
 tree forward propagate:   0.10 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE :   1.13 ( 0%) usr   0.00 ( 0%) sys   1.11 ( 0%) wall       0 kB ( 0%) ggc
 tree aggressive DCE   :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.28 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE              :   0.25 ( 0%) usr   0.00 

[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-22 Thread hubicka at ucw dot cz


--- Comment #14 from hubicka at ucw dot cz  2006-07-22 19:30 ---
Subject: Re:  [4.1/4.2 regression] A file that can not be compiled in
reasonable time/space

Hi,
with the attached patch I can cure the regmove quadratic behaviour, and
the time report is not so unreasonable now:

 gnu_dev_major gnu_dev_minor gnu_dev_makedev max min f fx fy fz add addl addr
sub subl subr mul mull mulr divl ipow fi
Analyzing compilation unitPerforming intraprocedural optimizations
Assembling functions:
 max min add addl addr sub subl subr mul mull mulr divl ipow fz fy fx f fi {GC
126177k -> 85112k} {GC 327625k -> 39474k}
Execution times (seconds)
 garbage collection    :   0.83 ( 0%) usr   0.00 ( 0%) sys   0.82 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.16 ( 0%) usr   0.02 ( 1%) sys   0.16 ( 0%) wall    1147 kB ( 0%) ggc
 callgraph optimization:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     533 kB ( 0%) ggc
 ipa reference         :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 ipa type escape       :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 trivially dead code   :   0.45 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%) wall       0 kB ( 0%) ggc
 life analysis         :  21.38 ( 3%) usr   0.02 ( 1%) sys  21.39 ( 3%) wall    1120 kB ( 0%) ggc
 life info update      :   0.54 ( 0%) usr   0.00 ( 0%) sys   0.61 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.87 ( 0%) usr   0.00 ( 0%) sys   0.89 ( 0%) wall    4266 kB ( 1%) ggc
 register scan         :   0.42 ( 0%) usr   0.00 ( 0%) sys   0.40 ( 0%) wall     150 kB ( 0%) ggc
 rebuild jump labels   :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   0.27 ( 0%) usr   0.06 ( 2%) sys   0.36 ( 0%) wall     471 kB ( 0%) ggc
 lexical analysis      :   0.04 ( 0%) usr   0.05 ( 2%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   0.12 ( 0%) usr   0.03 ( 1%) sys   0.17 ( 0%) wall    3207 kB ( 1%) ggc
 inline heuristics     :  15.14 ( 2%) usr   0.01 ( 0%) sys  15.26 ( 2%) wall    1486 kB ( 0%) ggc
 integration           :  21.35 ( 3%) usr   0.12 ( 4%) sys  21.71 ( 3%) wall   33445 kB ( 8%) ggc
 tree gimplify         :   0.18 ( 0%) usr   0.01 ( 0%) sys   0.19 ( 0%) wall    3341 kB ( 1%) ggc
 tree eh               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1338 kB ( 0%) ggc
 tree CFG cleanup      :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      20 kB ( 0%) ggc
 tree VRP              :   0.38 ( 0%) usr   0.01 ( 0%) sys   0.42 ( 0%) wall      11 kB ( 0%) ggc
 tree copy propagation :   0.23 ( 0%) usr   0.01 ( 0%) sys   0.28 ( 0%) wall     222 kB ( 0%) ggc
 tree store copy prop  :   0.11 ( 0%) usr   0.01 ( 0%) sys   0.14 ( 0%) wall       4 kB ( 0%) ggc
 tree find ref. vars   :   0.10 ( 0%) usr   0.01 ( 0%) sys   0.11 ( 0%) wall    8137 kB ( 2%) ggc
 tree PTA              :   1.29 ( 0%) usr   0.04 ( 1%) sys   1.36 ( 0%) wall      57 kB ( 0%) ggc
 tree alias analysis   :   1.89 ( 0%) usr   0.20 ( 7%) sys   2.10 ( 0%) wall       0 kB ( 0%) ggc
 tree PHI insertion    :   1.68 ( 0%) usr   0.01 ( 0%) sys   1.70 ( 0%) wall      18 kB ( 0%) ggc
 tree SSA rewrite      :   0.62 ( 0%) usr   0.04 ( 1%) sys   0.65 ( 0%) wall   17084 kB ( 4%) ggc
 tree SSA other        :   0.48 ( 0%) usr   0.08 ( 3%) sys   0.56 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA incremental  :   1.20 ( 0%) usr   0.00 ( 0%) sys   1.24 ( 0%) wall       0 kB ( 0%) ggc
 tree operand scan     :   1.48 ( 0%) usr   0.34 (11%) sys   1.93 ( 0%) wall   15634 kB ( 4%) ggc
 dominator optimization:   1.05 ( 0%) usr   0.05 ( 2%) sys   1.05 ( 0%) wall    2698 kB ( 1%) ggc
 tree SRA              :   1.05 ( 0%) usr   0.09 ( 3%) sys   1.15 ( 0%) wall   24835 kB ( 6%) ggc
 tree STORE-CCP        :   0.09 ( 0%) usr   0.01 ( 0%) sys   0.11 ( 0%) wall       4 kB ( 0%) ggc
 tree CCP              :   0.51 ( 0%) usr   0.02 ( 1%) sys   0.56 ( 0%) wall     154 kB ( 0%) ggc
 tree reassociation    :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 tree PRE              : 296.46 (45%) usr   0.49 (16%) sys 298.81 (45%) wall   19481 kB ( 5%) ggc
 tree FRE              :   0.96 ( 0%) usr   0.05 ( 2%) sys   1.00 ( 0%) wall    7991 kB ( 2%) ggc
 tree forward propagate:   0.04 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE :   0.54 ( 0%) usr   0.00 ( 0%) sys   0.54 ( 0%) wall       0 kB ( 0%) ggc
 tree aggressive DCE   :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE              :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       8 kB ( 0%) ggc
 tree SSA uncprop      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA to normal    :  27.19 ( 4%) usr   0

[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-22 Thread hubicka at ucw dot cz


--- Comment #12 from hubicka at ucw dot cz  2006-07-22 18:09 ---
Subject: Re:  [4.1/4.2 regression] A file that can not be compiled in
reasonable time/space

Hi,
I am attaching the .optimized dump of this testcase.  It is quite a good
demonstration of how SRA and TER tend to increase register pressure in
code like:


;; Function add (add)

Analyzing Edge Insertions.
add (x, y)
{
  double r$min;

:
  r$min = x.min + y.min;
  <retval>.max = x.max + y.max;
  <retval>.min = r$min;
  return <retval>;

}

;; Function mul (mul)

Analyzing Edge Insertions.
mul (x, y)
{
  double y$min;
  double y$max;
  double x$min;
  double x$max;
  double d;
  double c;
  double b;
  double a;

:
  x$max = x.max;
  x$min = x.min;
  y$max = y.max;
  y$min = y.min;
  a = y$min * x$min;
  b = y$max * x$min;
  c = y$min * x$max;
  d = y$max * x$max;
  <retval>.max = max (max (a, b), max (c, d));
  <retval>.min = min (min (a, b), min (c, d));
  return <retval>;

}
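
For orientation, source code of roughly the following shape would produce
dumps like the two above.  This is a hypothetical reconstruction, not the
testcase itself: the dump only shows fields named min and max and calls to
functions named min and max, so the struct name interval and the exact bodies
are assumptions.

```c
#include <assert.h>

/* Hypothetical: the dump only reveals two double fields, min and max.  */
struct interval { double min, max; };

static double min (double a, double b) { return a < b ? a : b; }
static double max (double a, double b) { return a > b ? a : b; }

/* Interval addition: bounds add component-wise.  */
static struct interval
add (struct interval x, struct interval y)
{
  struct interval r;
  r.min = x.min + y.min;
  r.max = x.max + y.max;
  return r;
}

/* Interval multiplication: the result bounds are the extremes of the four
   endpoint products, which is what keeps a, b, c and d live together.  */
static struct interval
mul (struct interval x, struct interval y)
{
  double a = y.min * x.min;
  double b = y.max * x.min;
  double c = y.min * x.max;
  double d = y.max * x.max;
  struct interval r;
  r.max = max (max (a, b), max (c, d));
  r.min = min (min (a, b), min (c, d));
  return r;
}
```

Once SRA has split each struct argument into scalars such as x$min and x$max,
every field becomes its own temporary, so a long chain of such calls keeps
many doubles live at the same time.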



;; Function fz (fz)

fz (x, y, z)
{

:
  tmp3 = pow (z, 3.7e+1);
  tmp7 = pow (y, 2.0e+0);
  tmp9 = pow (z, 3.6e+1);
  tmp14 = pow (y, 3.0e+0);
  tmp16 = pow (z, 3.5e+1);
...
  tmp3922 = pow (x, 3.8e+1);
  D.17848 = pow (x, 3.9e+1);
  D.17965 = pow (y, 3.9e+1);
  D.17968 = pow (z, 3.9e+1);
  return tmp3 * x * 2.04629333124046830505449179327115416526794433594e+1 * y +
tmp9 * tmp7 * x * 1.63737898728226838329646852798759937286376953125e+2 + tmp16
* tmp14 * x * 3.102825991153964650948182679712772369384765625e+2 + tmp23 *
tmp21 * x * -1.38580890184729059910750947892665863037109375e+3 + tmp30 * tmp28
* x * -4.39080063708386560961116629187017679214477539062e+1 + tmp37 * tmp35 * x
* 1.737348223038549986085854470729827880859375e+4 + tmp44 * tmp42 * x *
-1.069806869373114386689849197864532470703125e+4 + tmp51 * tmp49 * x *
-3.542086638969252817332744598388671875e+4 + tmp58 * tmp56 * x *
-3.091774346229622824466787278652191162109375e+4 + tmp65 * tmp63 * x *
1.568088658621288946889400482177734375e+5 + tmp72 * tmp70 * x *
4.19376520881160162389278411865234375e+5 + tmp79 * tmp77 * x *
2.0111082929561330820433795452117919921875e+5 + tmp86 * tmp84 * x *
-4.337742627231603837572038173675537109375e+5 + tmp93 * tmp91 * x *
-4.829501801337040960788726806640625e+5 + tmp100 * tmp98 * x *
5.32241994551055715419352054595947265625e+5 + tmp107 * tmp105 * x *
1.8250994926701225340366363525390625e+6 + tmp114 * tmp112 * x *
1.6382205795514374040067195892333984375e+6 + tmp121 * tmp119 * x *
1.1912621023960295133292675018310546875e+5 + tmp128 * tmp126 * x *
8.811503159726611338555812835693359375e+5 + tmp135 * tmp133 * x *
2.690164492243868880905210971832275390625e+5 + tmp142 * tmp140 * x *
2.271892026609037420712411403656005859375e+5 + tmp149 * tmp147 * x *
1.795814638975697453133761882781982421875e+5 + tmp156 * tmp154 * x *
-3.94381184819339658133685588836669921875e+5 + tmp163 * tmp161 * x *
7.64450454622797551564872264862060546875e+5 + tmp170 * tmp168 * x *
6.9298171586054741055704653263092041015625e+4 + tmp177 * tmp175 * x *
-3.129066099043917492963373661041259765625e+5 + tmp184 * tmp182 * x *
-4.0792914801556640304625034332275390625e+5 + tmp191 * tmp189 * x *
7.3512920753349564620293676853179931640625e+4 + tmp198 * tmp196 * x *
3.5470695311840399881475605070590972900390625e+3 + tmp205 * tmp203 * x *
-8.8733450804951236932538449764251708984375e+4 + tmp212 * tmp210 * x *
-1.3805889644669676272314973175525665283203125e+4 + tmp219 * tmp217 * x *
-7.54301319902873729006387293338775634765625e+3 + tmp226 * tmp224 * x *
2.23731170493404579246998764574527740478515625e+3 + tmp233 * tmp231 * x *
-3.903765115338947599581779539585113525390625e+2 + tmp240 * tmp238 * x *
4.743319333283892547115101478993892669677734375e+2 + tmp247 * tmp245 * x *
-6.32641294603530113249689748045057058334350585938e+1 + tmp252 * x *
-6.76527508139541300380415123072452843189239501953e+0 * z + tmp258 * x *
-4.51436297228304250772623618104262277483940124512e-1 + tmp263 * x *
2.89405090268957065902100111998151987791061401367e+0 + tmp9 * tmp268 *
-3.7483157190701700756108039058744907379150390625e+2 * y + tmp16 * tmp7 *
tmp268 * 9.276025613194925654170219786465167999267578125e+2 + tmp23 * tmp14 *
tmp268 * 1.35840047018872951412049587815847412109375e+2 + tmp30 * tmp21 *
tmp268 * -3.2681330410168111484381370246410369873046875e+3 + tmp37 * tmp28 *
tmp268 * 2.77737094612259534187614917755126953125e+3 + tmp44 * tmp35 * tmp268 *
2.2773056570869275674340315163135528564453125e+3 + tmp51 * tmp42 * tmp268 *
9.2295963366692260024137794971466064453125e+4 + tmp58 * tmp49 * tmp268 *
-3.049601738325569895096123218536376953125e+5 + tmp65 * tmp56 * tmp268 *
-2.69300746038850047625601291656494140625e+5 + tmp72 * tmp63 * tmp268 *
3.92479526798162725754082202911376953125e+5 + tmp79 * tmp70 * tmp268 *
-1.4348648827185891568660736083984375e+6 + tmp86 * tmp77 * tmp268 *
1.2925352909364881925284862518310546875e+6 + tmp93 * tmp84 * tmp268 *
3.44742843619707785546779632568359375e+6 + tmp100 * tmp91 * tmp268 *
2.2975221813043109141290187835693359375e+6 + tmp107 * tmp98 * tmp268 *
-8.7537045

[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-22 Thread hubicka at ucw dot cz


--- Comment #11 from hubicka at ucw dot cz  2006-07-22 17:12 ---
Subject: Re:  [4.1/4.2 regression] A file that can not be compiled in
reasonable time/space

Hi,
this keeps the inliner from producing quadratically many STMT list nodes, so
inlining is now reasonably fast.  The next offenders are alias info, PRE,
regmove, global alloc and the schedulers.

Index: tree-cfg.c
===
*** tree-cfg.c  (revision 115645)
--- tree-cfg.c  (working copy)
*** tree_redirect_edge_and_branch_force (edg
*** 4158,4164 
  static basic_block
  tree_split_block (basic_block bb, void *stmt)
  {
!   block_stmt_iterator bsi, bsi_tgt;
tree act;
basic_block new_bb;
edge e;
--- 4158,4165 
  static basic_block
  tree_split_block (basic_block bb, void *stmt)
  {
!   block_stmt_iterator bsi;
!   tree_stmt_iterator tsi_tgt;
tree act;
basic_block new_bb;
edge e;
*** tree_split_block (basic_block bb, void *
*** 4192,4204 
}
  }

!   bsi_tgt = bsi_start (new_bb);
!   while (!bsi_end_p (bsi))
! {
!   act = bsi_stmt (bsi);
!   bsi_remove (&bsi, false);
!   bsi_insert_after (&bsi_tgt, act, BSI_NEW_STMT);
! }

return new_bb;
  }
--- 4193,4209 
}
  }

!   if (bsi_end_p (bsi))
! return new_bb;
! 
!   /* Split the statement list - avoid re-creating new containers as this
!  brings ugly quadratic memory consumption in the inliner.  
!  (We are still quadratic since we need to update stmt BB pointers,
!  sadly) */
!   new_bb->stmt_list = tsi_split_statement_list_before (&bsi.tsi);
!   for (tsi_tgt = tsi_start (new_bb->stmt_list);
!        !tsi_end_p (tsi_tgt); tsi_next (&tsi_tgt))
! set_bb_for_stmt (tsi_stmt (tsi_tgt), new_bb);

return new_bb;
  }
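
The idea behind the patch above can be sketched outside of GCC.  This is
only an illustration under simplifying assumptions: the types and the
function name below (stmt_node, block, split_block_before) are hypothetical
stand-ins, not GCC's own data structures.  The tail of a doubly linked
statement list is relinked in constant time, so no new list nodes are
allocated, and the only per-statement work left is updating the owner
pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a statement list and a basic
   block; none of these names are GCC's.  */
struct block;

struct stmt_node
{
  struct stmt_node *prev, *next;
  struct block *owner;          /* plays the role of the stmt's BB pointer */
  int id;
};

struct block
{
  struct stmt_node *head, *tail;
};

/* Move FIRST and everything after it from FROM to the empty block TO.
   The nodes are relinked in place, so no list container is allocated.  */
void
split_block_before (struct block *from, struct block *to,
                    struct stmt_node *first)
{
  struct stmt_node *n;

  assert (to->head == NULL && to->tail == NULL);
  if (first == NULL)
    return;                     /* nothing to move */

  /* Detach the tail segment in constant time.  */
  to->head = first;
  to->tail = from->tail;
  from->tail = first->prev;
  if (first->prev)
    first->prev->next = NULL;
  else
    from->head = NULL;
  first->prev = NULL;

  /* Linear walk over the moved statements only, to fix their owner.  */
  for (n = to->head; n; n = n->next)
    n->owner = to;
}
```

The walk at the end is the part the patch comment calls "still quadratic":
each split is linear in the number of moved statements, so repeated splits
of one huge block can still add up.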


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-22 Thread hubicka at ucw dot cz


--- Comment #10 from hubicka at ucw dot cz  2006-07-22 13:47 ---
Subject: Re:  [4.1/4.2 regression] A file that can not be compiled in
reasonable time/space

Hi,
this patch makes the -O2 case work pretty well on the tree side.  The inliner
expands the code from 8MB to 40MB of GGC memory, which seems under control.
Aliasing peaks at 85MB, which also doesn't seem completely unreasonable.
I will need to give it more testing.  I believe the inliner is always GGC
safe, but it is easy to be mistaken here.
The patch also speeds up the inline heuristic by pruning out the
impossible edges early, making the priority queue smaller.
Also, I am quite curious how the inliner manages to produce 800MB of
garbage...

Honza

Index: ipa-inline.c
===
*** ipa-inline.c(revision 115645)
--- ipa-inline.c(working copy)
*** update_caller_keys (fibheap_t heap, stru
*** 413,418 
--- 413,419 
bitmap updated_nodes)
  {
struct cgraph_edge *edge;
+   const char *failed_reason;

if (!node->local.inlinable || node->local.disregard_inline_limits
|| node->global.inlined_to)
*** update_caller_keys (fibheap_t heap, stru
*** 421,426 
--- 422,441 
  return;
bitmap_set_bit (updated_nodes, node->uid);
node->global.estimated_growth = INT_MIN;
+ 
+   if (!node->local.inlinable)
+ return;
+   /* Prune out edges we won't inline into anymore.  */
+   if (!cgraph_default_inline_p (node, &failed_reason))
+ {
+   for (edge = node->callers; edge; edge = edge->next_caller)
+   if (edge->aux)
+ {
+   fibheap_delete_node (heap, edge->aux);
+   edge->aux = NULL;
+ }
+   return;
+ }

for (edge = node->callers; edge; edge = edge->next_caller)
  if (edge->inline_failed)
Index: tree-inline.c
===
*** tree-inline.c   (revision 115645)
--- tree-inline.c   (working copy)
*** expand_call_inline (basic_block bb, tree
*** 2163,2172 
/* Update callgraph if needed.  */
cgraph_remove_node (cg_edge->callee);

-   /* Declare the 'auto' variables added with this inlined body.  */
-   record_vars (BLOCK_VARS (id->block));
id->block = NULL_TREE;
successfully_inlined = TRUE;

   egress:
input_location = saved_location;
--- 2163,2171 
/* Update callgraph if needed.  */
cgraph_remove_node (cg_edge->callee);

id->block = NULL_TREE;
successfully_inlined = TRUE;
+   ggc_collect ();

   egress:
input_location = saved_location;
*** declare_inline_vars (tree block, tree va
*** 2556,2562 
  {
tree t;
for (t = vars; t; t = TREE_CHAIN (t))
! DECL_SEEN_IN_BIND_EXPR_P (t) = 1;

if (block)
  BLOCK_VARS (block) = chainon (BLOCK_VARS (block), vars);
--- 2555,2567 
  {
tree t;
for (t = vars; t; t = TREE_CHAIN (t))
! {
!   DECL_SEEN_IN_BIND_EXPR_P (t) = 1;
!   gcc_assert (!TREE_STATIC (t) && !TREE_ASM_WRITTEN (t));
!   cfun->unexpanded_var_list =
!   tree_cons (NULL_TREE, t,
!  cfun->unexpanded_var_list);
! }

if (block)
  BLOCK_VARS (block) = chainon (BLOCK_VARS (block), vars);
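
The pruning step in the ipa-inline.c hunk can be illustrated with a small
stand-alone sketch.  Note the swap: the patch deletes individual nodes from
GCC's Fibonacci heap, whereas the sketch below uses a plain array-backed
binary min-heap and rebuilds it after filtering.  All names (cand, heap,
prune_callee, ...) are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical candidate edge: inlining CALLEE into some caller, ordered
   by KEY (smaller is better).  */
struct cand
{
  int key;
  int callee;
};

/* Array-backed binary min-heap over candidates.  */
struct heap
{
  struct cand *v;
  size_t n;
};

/* Push element I down until both children have keys no smaller.  */
static void
sift_down (struct heap *h, size_t i)
{
  for (;;)
    {
      size_t l = 2 * i + 1, r = l + 1, m = i;
      struct cand t;

      if (l < h->n && h->v[l].key < h->v[m].key)
        m = l;
      if (r < h->n && h->v[r].key < h->v[m].key)
        m = r;
      if (m == i)
        return;
      t = h->v[i];
      h->v[i] = h->v[m];
      h->v[m] = t;
      i = m;
    }
}

/* Restore the min-heap property over the whole array.  */
void
heap_build (struct heap *h)
{
  size_t i;

  for (i = h->n / 2; i-- > 0;)
    sift_down (h, i);
}

/* Drop every candidate that would inline CALLEE, e.g. because CALLEE has
   grown past the size limit, then re-establish the heap.  */
void
prune_callee (struct heap *h, int callee)
{
  size_t i, kept = 0;

  for (i = 0; i < h->n; i++)
    if (h->v[i].callee != callee)
      h->v[kept++] = h->v[i];
  h->n = kept;
  heap_build (h);
}

/* Remove and return the best remaining candidate.  */
struct cand
heap_pop (struct heap *h)
{
  struct cand top;

  assert (h->n > 0);
  top = h->v[0];
  h->v[0] = h->v[--h->n];
  sift_down (h, 0);
  return top;
}
```

Pruning early means later pops never have to look at, and reject, edges
that can no longer be inlined, which is what keeps the queue small.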


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-21 Thread raffalli at univ-savoie dot fr


--- Comment #9 from raffalli at univ-savoie dot fr  2006-07-21 22:01 ---
Subject: Re:  [4.1/4.2 regression] A file that
 can not be compiled in reasonable time/space

hubicka at gcc dot gnu dot org wrote:
> --- Comment #8 from hubicka at gcc dot gnu dot org  2006-07-21 21:11 ---
> Hmm,
> the function fi contains 3 calls, and many of the called functions contain
> further calls.
> Since our metric allows each call to be replaced by up to 10 instructions and
> we allow fi to grow twofold, we can end up with 60 instructions in a single
> basic block (in fact we do, with roughly 39 in the inliner metrics).  This is
> still linear growth and the testcase is rather extreme, so I am not sure I
> would declare this an inliner bug (the user has asked for it by declaring stuff
> inline after all ;)
>
> Without inlining we are not behaving much better (I am just running the
> compilation and it is at 900MB), so using 1GB for inlined function bodies
> doesn't seem that unreasonable.  I will try to play with this a bit.
>
> One solution might be to adjust our size estimates to be less aggressive for
> large functions, so that the growth in the actual number of statements is not
> 20-fold at most but some smaller constant, but that is rather ugly.
>
> Honza
>
>
>   
Maybe a look at the assembly code generated by icc, which behaves very
well on this test case, could be useful?

Christophe


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-21 Thread hubicka at gcc dot gnu dot org


--- Comment #8 from hubicka at gcc dot gnu dot org  2006-07-21 21:11 ---
Hmm,
the function fi contains 3 calls, and many of the called functions contain
further calls.
Since our metric allows each call to be replaced by up to 10 instructions and
we allow fi to grow twofold, we can end up with 60 instructions in a single
basic block (in fact we do, with roughly 39 in the inliner metrics).  This is
still linear growth and the testcase is rather extreme, so I am not sure I
would declare this an inliner bug (the user has asked for it by declaring stuff
inline after all ;)

Without inlining we are not behaving much better (I am just running the
compilation and it is at 900MB), so using 1GB for inlined function bodies
doesn't seem that unreasonable.  I will try to play with this a bit.

One solution might be to adjust our size estimates to be less aggressive for
large functions, so that the growth in the actual number of statements is not
20-fold at most but some smaller constant, but that is rather ugly.

Honza
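
The bound discussed in this comment can be written out as a small
calculation.  This is a sketch only: the function name and parameters are
hypothetical, and the figures plugged in are the ones quoted in the comment
as archived here, not measured values.

```c
/* Upper bound on instructions in the caller after inlining, assuming each
   call may be replaced by up to INSNS_PER_CALL instructions and the caller
   may additionally grow by GROWTH_FACTOR.  Linear in CALLS.  */
unsigned long
inline_growth_bound (unsigned long calls, unsigned long insns_per_call,
                     unsigned long growth_factor)
{
  return calls * insns_per_call * growth_factor;
}
```

With the figures from the comment, inline_growth_bound (3, 10, 2) gives 60,
and the per-call factor 10 * 2 is the "20-fold" growth mentioned in the last
paragraph.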


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-07-16 Thread mmitchel at gcc dot gnu dot org


-- 

mmitchel at gcc dot gnu dot org changed:

   What      |Removed     |Added
   ---------------------------------------------
   Priority  |P3          |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-06-19 Thread raffalli at univ-savoie dot fr


--- Comment #7 from raffalli at univ-savoie dot fr  2006-06-19 08:44 ---
Just for comparison: on my Intel dual core 3GHz,

icc compiles it in 15s within 200MB with -O3 (including cpp)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



[Bug rtl-optimization/28071] [4.1/4.2 regression] A file that can not be compiled in reasonable time/space

2006-06-17 Thread rguenth at gcc dot gnu dot org


--- Comment #6 from rguenth at gcc dot gnu dot org  2006-06-17 18:44 ---
Btw, we do not die during inlining, but during optimization, which is confronted
with one gigantic basic block, as all BBs after inlining are merged by fixupcfg
;)

Oh, and we die during RTL optimizations...  but I wonder why we are not able to
free up some memory before that (lower gc params help with this), and we enter
greg with 250MB used and it still wants
cc1: out of memory allocating 1134939624 bytes after a total of 43487232 bytes

So, this is more something for Matz/Vladimir.


-- 

rguenth at gcc dot gnu dot org changed:

   What      |Removed     |Added
   ---------------------------------------------
   CC        |            |matz at suse dot de
   Component |middle-end  |rtl-optimization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071