------- Comment #31 from rguenth at gcc dot gnu dot org  2010-01-26 11:35 -------
Updated timings and memory:

> ~/bin/maxmem2.sh /usr/bin/time gcc-4.5 -S -o /dev/null lgwam.c
32.62user 1.48system 0:34.41elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+333822minor)pagefaults 0swaps
total: 1333745 kB

> ~/bin/maxmem2.sh /usr/bin/time /space/rguenther/install/gcc-4.4.3/bin/gcc -S -o /dev/null lgwam.c
35.01user 1.54system 0:36.89elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+329139minor)pagefaults 0swaps
total: 1306898 kB

> ~/bin/maxmem2.sh /usr/bin/time /space/rguenther/install/gcc-4.3.4/bin/gcc -S -o /dev/null lgwam.c
27.42user 1.83system 0:29.61elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
16inputs+0outputs (0major+374338minor)pagefaults 0swaps
total: 1341721 kB

> ~/bin/maxmem2.sh /usr/bin/time /space/rguenther/install/gcc-4.2.3/bin/gcc -S -o /dev/null lgwam.c
15.33user 0.80system 0:16.31elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
256inputs+0outputs (2major+197427minor)pagefaults 0swaps
total: 598733 kB


time-report for 4.5 trunk (only interesting parts):

 expand                :   6.72 (20%) usr   0.72 (26%) sys   7.45 (20%) wall  140609 kB (35%) ggc
 integrated RA         :   8.56 (25%) usr   0.12 ( 4%) sys   8.76 (24%) wall    5151 kB ( 1%) ggc
 reload                :   7.72 (23%) usr   0.22 ( 8%) sys   7.94 (21%) wall
 TOTAL                 :  34.17             2.79            37.15             402913 kB

Memory usage is still high compared to 4.2 (total 1333745 kB for 4.5
versus 598733 kB for 4.2.3, about 2.2 times as much).
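
For reference, the ratios behind that comparison can be recomputed from
the numbers quoted above. This is only a small arithmetic sketch: the
struct, the array and the function names are made up for illustration,
and the figures are copied from the four maxmem2.sh runs.

```c
#include <assert.h>
#include <stdio.h>

/* One measurement: user CPU seconds and the "total:" memory figure in kB,
   copied from the maxmem2.sh runs quoted above.  */
struct measurement {
    const char *gcc_version;
    double user_seconds;
    double total_kb;
};

const struct measurement runs[] = {
    { "4.5",   32.62, 1333745.0 },
    { "4.4.3", 35.01, 1306898.0 },
    { "4.3.4", 27.42, 1341721.0 },
    { "4.2.3", 15.33,  598733.0 },   /* baseline */
};
const int n_runs = sizeof runs / sizeof runs[0];

/* How many times larger VALUE is than BASELINE.  */
double ratio(double value, double baseline)
{
    assert(baseline > 0.0);
    return value / baseline;
}

/* Print every run relative to the 4.2.3 baseline (the last entry).  */
void print_ratios_vs_baseline(void)
{
    const struct measurement *base = &runs[n_runs - 1];
    for (int i = 0; i < n_runs; i++)
        printf("%-6s user time %.2fx  memory %.2fx\n",
               runs[i].gcc_version,
               ratio(runs[i].user_seconds, base->user_seconds),
               ratio(runs[i].total_kb, base->total_kb));
}
```

Calling print_ratios_vs_baseline() from a main() lists each compiler
relative to 4.2.3; for 4.5 both user time and memory come out a little
above 2x.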

At -O2 the picture is similar (memory peaks at 2.5GB):

Execution times (seconds)
 garbage collection    :   0.64 ( 0%) usr   0.00 ( 0%) sys   0.64 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.31 ( 0%) usr   0.08 ( 1%) sys   0.39 ( 0%) wall   20525 kB ( 3%) ggc
 callgraph optimization:   1.40 ( 1%) usr   0.03 ( 0%) sys   1.46 ( 1%) wall     639 kB ( 0%) ggc
 ipa cp                :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall    9607 kB ( 1%) ggc
 ipa reference         :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   0.33 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%) wall     199 kB ( 0%) ggc
 cfg cleanup           :   0.92 ( 0%) usr   0.00 ( 0%) sys   0.89 ( 0%) wall    1168 kB ( 0%) ggc
 trivially dead code   :   0.92 ( 0%) usr   0.00 ( 0%) sys   0.89 ( 0%) wall       0 kB ( 0%) ggc
 df multiple defs      :   0.86 ( 0%) usr   0.01 ( 0%) sys   0.88 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs      :   2.69 ( 1%) usr   0.96 (12%) sys   3.94 ( 2%) wall       0 kB ( 0%) ggc
 df live regs          :   7.15 ( 3%) usr   0.07 ( 1%) sys   7.13 ( 3%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   3.33 ( 2%) usr   0.06 ( 1%) sys   3.36 ( 1%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   0.85 ( 0%) usr   0.03 ( 0%) sys   0.88 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   6.31 ( 3%) usr   0.05 ( 1%) sys  12.84 ( 6%) wall   18222 kB ( 2%) ggc
 register information  :   3.42 ( 2%) usr   0.00 ( 0%) sys   3.46 ( 2%) wall       0 kB ( 0%) ggc
 alias analysis        :   1.14 ( 1%) usr   0.01 ( 0%) sys   1.17 ( 1%) wall   10447 kB ( 1%) ggc
 alias stmt walking    :   4.39 ( 2%) usr   0.26 ( 3%) sys   4.53 ( 2%) wall       0 kB ( 0%) ggc
 register scan         :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall       7 kB ( 0%) ggc
 rebuild jump labels   :   0.53 ( 0%) usr   0.00 ( 0%) sys   0.51 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   0.64 ( 0%) usr   0.35 ( 4%) sys   0.93 ( 0%) wall   23140 kB ( 3%) ggc
 lexical analysis      :   0.30 ( 0%) usr   0.54 ( 7%) sys   0.88 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   0.69 ( 0%) usr   0.38 ( 5%) sys   1.09 ( 0%) wall   38129 kB ( 5%) ggc
 inline heuristics     :   1.18 ( 1%) usr   0.01 ( 0%) sys   1.15 ( 1%) wall   29832 kB ( 4%) ggc
 integration           :   3.42 ( 2%) usr   0.45 ( 6%) sys   3.66 ( 2%) wall  175322 kB (22%) ggc
 tree gimplify         :   1.08 ( 1%) usr   0.09 ( 1%) sys   1.17 ( 1%) wall  104718 kB (13%) ggc
 tree eh               :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.14 ( 0%) usr   0.01 ( 0%) sys   0.14 ( 0%) wall   11817 kB ( 1%) ggc
 tree CFG cleanup      :   1.23 ( 1%) usr   0.00 ( 0%) sys   1.21 ( 1%) wall      60 kB ( 0%) ggc
 tree VRP              :   3.08 ( 1%) usr   0.01 ( 0%) sys   3.09 ( 1%) wall    6719 kB ( 1%) ggc
 tree copy propagation :   1.22 ( 1%) usr   0.02 ( 0%) sys   1.14 ( 0%) wall     585 kB ( 0%) ggc
 tree find ref. vars   :   0.11 ( 0%) usr   0.01 ( 0%) sys   0.12 ( 0%) wall    6045 kB ( 1%) ggc
 tree PTA              :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.68 ( 0%) wall    2178 kB ( 0%) ggc
 tree PHI insertion    :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall     185 kB ( 0%) ggc
 tree SSA rewrite      :   2.92 ( 1%) usr   0.06 ( 1%) sys   3.04 ( 1%) wall   54148 kB ( 7%) ggc
 tree SSA other        :   0.24 ( 0%) usr   0.08 ( 1%) sys   0.34 ( 0%) wall     589 kB ( 0%) ggc
 tree SSA incremental  :   3.46 ( 2%) usr   0.03 ( 0%) sys   3.50 ( 2%) wall     187 kB ( 0%) ggc
 tree operand scan     :   1.81 ( 1%) usr   0.31 ( 4%) sys   2.40 ( 1%) wall   53424 kB ( 7%) ggc
 dominator optimization:   0.73 ( 0%) usr   0.00 ( 0%) sys   0.73 ( 0%) wall    5666 kB ( 1%) ggc
 tree SRA              :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.02 ( 0%) wall       2 kB ( 0%) ggc
 tree CCP              :   0.95 ( 0%) usr   0.00 ( 0%) sys   0.93 ( 0%) wall     617 kB ( 0%) ggc
 tree PHI const/copy prop:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       8 kB ( 0%) ggc
 tree reassociation    :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall     280 kB ( 0%) ggc
 tree PRE              :   2.02 ( 1%) usr   1.01 (13%) sys   3.02 ( 1%) wall    2092 kB ( 0%) ggc
 tree FRE              :   2.62 ( 1%) usr   1.12 (14%) sys   3.75 ( 2%) wall    1674 kB ( 0%) ggc
 tree code sinking     :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall    1126 kB ( 0%) ggc
 tree linearize phis   :   0.07 ( 0%) usr   0.01 ( 0%) sys   0.09 ( 0%) wall       4 kB ( 0%) ggc
 tree forward propagate:   0.19 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall      75 kB ( 0%) ggc
 tree phiprop          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE :   0.61 ( 0%) usr   0.20 ( 2%) sys   0.91 ( 0%) wall       5 kB ( 0%) ggc
 tree aggressive DCE   :   0.65 ( 0%) usr   0.09 ( 1%) sys   0.80 ( 0%) wall    1486 kB ( 0%) ggc
 tree buildin call DCE :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE              :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall      34 kB ( 0%) ggc
 PHI merge             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      26 kB ( 0%) ggc
 tree loop bounds      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      17 kB ( 0%) ggc
 tree loop invariant motion:   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       4 kB ( 0%) ggc
 scev constant prop    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     112 kB ( 0%) ggc
 complete unrolling    :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall     379 kB ( 0%) ggc
 tree iv optimization  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     473 kB ( 0%) ggc
 tree loop init        :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall     390 kB ( 0%) ggc
 tree copy headers     :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     330 kB ( 0%) ggc
 tree SSA uncprop      :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 tree rename SSA copies:   0.31 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%) wall       0 kB ( 0%) ggc
 dominance frontiers   :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation :   1.26 ( 1%) usr   0.04 ( 0%) sys   1.31 ( 1%) wall       0 kB ( 0%) ggc
 control dependences   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 expand                :  10.92 ( 5%) usr   0.70 ( 9%) sys  11.74 ( 5%) wall  124907 kB (16%) ggc
 forward prop          :   2.08 ( 1%) usr   0.04 ( 0%) sys   2.06 ( 1%) wall    3631 kB ( 0%) ggc
 CSE                   :   3.19 ( 1%) usr   0.00 ( 0%) sys   3.14 ( 1%) wall     512 kB ( 0%) ggc
 dead code elimination :   1.24 ( 1%) usr   0.01 ( 0%) sys   1.25 ( 1%) wall       0 kB ( 0%) ggc
 dead store elim1      :   1.26 ( 1%) usr   0.04 ( 0%) sys   1.33 ( 1%) wall     965 kB ( 0%) ggc
 dead store elim2      :   1.51 ( 1%) usr   0.02 ( 0%) sys   1.54 ( 1%) wall    7466 kB ( 1%) ggc
 loop analysis         :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall     229 kB ( 0%) ggc
 loop invariant motion :   0.10 ( 0%) usr   0.03 ( 0%) sys   0.13 ( 0%) wall       2 kB ( 0%) ggc
 CPROP                 :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall     452 kB ( 0%) ggc
 PRE                   :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall      15 kB ( 0%) ggc
 CSE 2                 :   3.20 ( 1%) usr   0.00 ( 0%) sys   3.22 ( 1%) wall     494 kB ( 0%) ggc
 branch prediction     :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall    2539 kB ( 0%) ggc
 combiner              :   1.42 ( 1%) usr   0.02 ( 0%) sys   1.42 ( 1%) wall   11673 kB ( 1%) ggc
 if-conversion         :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall     338 kB ( 0%) ggc
 regmove               :   0.39 ( 0%) usr   0.00 ( 0%) sys   0.35 ( 0%) wall       0 kB ( 0%) ggc
 integrated RA         :  78.17 (37%) usr   0.46 ( 6%) sys  78.94 (34%) wall    4788 kB ( 1%) ggc
 reload                :  25.79 (12%) usr   0.07 ( 1%) sys  26.09 (11%) wall   26499 kB ( 3%) ggc
 reload CSE regs       :   3.00 ( 1%) usr   0.05 ( 1%) sys   3.08 ( 1%) wall   15550 kB ( 2%) ggc
 thread pro- & epilogue:   1.39 ( 1%) usr   0.01 ( 0%) sys   1.48 ( 1%) wall    1006 kB ( 0%) ggc
 if-conversion 2       :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall     138 kB ( 0%) ggc
 combine stack adjustments:   0.08 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 peephole 2            :   0.67 ( 0%) usr   0.01 ( 0%) sys   0.69 ( 0%) wall    3275 kB ( 0%) ggc
 hard reg cprop        :   1.33 ( 1%) usr   0.02 ( 0%) sys   1.32 ( 1%) wall     979 kB ( 0%) ggc
 scheduling 2          :   6.59 ( 3%) usr   0.05 ( 1%) sys   6.63 ( 3%) wall     294 kB ( 0%) ggc
 machine dep reorg     :   0.69 ( 0%) usr   0.01 ( 0%) sys   0.68 ( 0%) wall      10 kB ( 0%) ggc
 reorder blocks        :   0.34 ( 0%) usr   0.00 ( 0%) sys   0.39 ( 0%) wall    2878 kB ( 0%) ggc
 final                 :   1.63 ( 1%) usr   0.04 ( 0%) sys   1.64 ( 1%) wall     201 kB ( 0%) ggc
 tree if-combine       :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.03 ( 0%) wall       1 kB ( 0%) ggc
 plugin execution      :   0.00 ( 0%) usr   0.03 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 : 214.12             8.02           229.70             793292 kB

At -O1 we never seem to finish early inlining into testsuite, though.
(The loop over all callees calls cgraph_check_inline_limits, which
again loops over all callees, so it looks quadratic - at least because
we do not consider the duplicates in the caller?  We seem to be looping
over edges but check the limits for !one_only.  Why do we not consider
edges individually and avoid calling cgraph_check_inline_limits
with !one_only at all?)
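
The suspected quadratic shape can be illustrated with a toy model. This
is a sketch only, not GCC's code: struct toy_edge, check_all_edges,
check_one_edge and the visit counters are hypothetical names invented
here, and the real cgraph_check_inline_limits does more than sum sizes.
It just counts how many edge visits each strategy needs for a caller
with n callee edges.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a call-graph edge; not GCC's cgraph_edge.  */
struct toy_edge {
    int callee_insns;   /* assumed size estimate of the callee */
};

/* Models a limit check that walks every callee edge of the caller
   (the "!one_only" style).  Returns nonzero if the summed size is
   within LIMIT and counts each edge it touches in *VISITS.  */
int check_all_edges(const struct toy_edge *edges, size_t n, long limit,
                    unsigned long *visits)
{
    assert(visits != NULL);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += edges[i].callee_insns;
        (*visits)++;
    }
    return total <= limit;
}

/* Models a limit check that looks at a single edge only.  */
int check_one_edge(const struct toy_edge *edge, long limit,
                   unsigned long *visits)
{
    assert(visits != NULL);
    (*visits)++;
    return edge->callee_insns <= limit;
}

/* Outer loop over all edges, each iteration rescanning all edges:
   n * n edge visits in total.  */
unsigned long visits_rescanning(const struct toy_edge *edges, size_t n,
                                long limit)
{
    unsigned long visits = 0;
    for (size_t i = 0; i < n; i++)
        (void) check_all_edges(edges, n, limit, &visits);
    return visits;
}

/* Outer loop over all edges, each iteration checking only its own
   edge: n edge visits in total.  */
unsigned long visits_per_edge(const struct toy_edge *edges, size_t n,
                              long limit)
{
    unsigned long visits = 0;
    for (size_t i = 0; i < n; i++)
        (void) check_one_edge(&edges[i], limit, &visits);
    return visits;
}
```

In this model a caller with 100 callee edges costs 100 * 100 = 10000
edge visits with the rescanning check and 100 with the per-edge check.
Whether that matches what early inlining actually does here would need
to be confirmed against the cgraph code.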


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.3/4.5 Regression] gcc    |[4.3/4.4/4.5 Regression] gcc
                   |4.3.1 cannot compile big    |4.3.1 cannot compile big
                   |function                    |function


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37448
