------- Comment #24 from jv244 at cam dot ac dot uk 2008-12-16 14:20 -------
(In reply to comment #23)

Reduced testcase timings at -O0 and -O3. Tree operand scan, anybody?
> time gfortran -O0 -ffree-line-length-512 -c -ftime-report testcase_reduced.f90

Execution times (seconds)
 garbage collection         :   0.51 ( 1%) usr   0.00 ( 0%) sys   0.49 ( 1%) wall       0 kB ( 0%) ggc
 callgraph construction     :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall    4956 kB ( 2%) ggc
 callgraph optimization     :   8.13 (18%) usr   0.20 (16%) sys   8.36 (18%) wall    1280 kB ( 1%) ggc
 cfg cleanup                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier               :   0.48 ( 1%) usr   0.02 ( 2%) sys   0.46 ( 1%) wall       0 kB ( 0%) ggc
 trivially dead code        :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall       0 kB ( 0%) ggc
 df live regs               :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes   :   0.24 ( 1%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall    9445 kB ( 4%) ggc
 register information       :   0.11 ( 0%) usr   0.01 ( 1%) sys   0.12 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis             :   0.10 ( 0%) usr   0.01 ( 1%) sys   0.10 ( 0%) wall    4239 kB ( 2%) ggc
 rebuild jump labels        :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 parser                     :   1.07 ( 2%) usr   0.05 ( 4%) sys   1.12 ( 2%) wall   22673 kB ( 9%) ggc
 inline heuristics          :  16.30 (36%) usr   0.41 (33%) sys  16.75 (36%) wall       0 kB ( 0%) ggc
 tree gimplify              :   0.06 ( 0%) usr   0.01 ( 1%) sys   0.08 ( 0%) wall    6435 kB ( 3%) ggc
 tree CFG construction      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     180 kB ( 0%) ggc
 tree find ref. vars        :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    3231 kB ( 1%) ggc
 tree SSA rewrite           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      63 kB ( 0%) ggc
 tree SSA other             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree operand scan          :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     236 kB ( 0%) ggc
 tree SSA to normal         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA verifier          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree STMT verifier         :   0.20 ( 0%) usr   0.01 ( 1%) sys   0.22 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier         :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 expand                     :  10.86 (24%) usr   0.38 (30%) sys  11.26 (24%) wall  132856 kB (52%) ggc
 integrated RA              :   4.08 ( 9%) usr   0.05 ( 4%) sys   4.13 ( 9%) wall    4604 kB ( 2%) ggc
 reload                     :   1.88 ( 4%) usr   0.07 ( 6%) sys   1.97 ( 4%) wall   59269 kB (23%) ggc
 thread pro- & epilogue     :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall     175 kB ( 0%) ggc
 final                      :   0.63 ( 1%) usr   0.03 ( 2%) sys   0.66 ( 1%) wall    3790 kB ( 1%) ggc
 TOTAL                      :  45.42             1.25            46.73             253684 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

real    0m47.298s
user    0m45.923s
sys     0m1.316s

> time gfortran -march=native -O3 -ffree-line-length-512 -c -ftime-report testcase_reduced.f90

Execution times (seconds)
 garbage collection         :   1.48 ( 1%) usr   0.01 ( 0%) sys   1.50 ( 1%) wall       0 kB ( 0%) ggc
 callgraph construction     :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall    4955 kB ( 1%) ggc
 callgraph optimization     :   6.27 ( 3%) usr   0.15 ( 7%) sys   6.46 ( 4%) wall    2366 kB ( 0%) ggc
 ipa cp                     :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%) wall      34 kB ( 0%) ggc
 cfg cleanup                :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier               :   1.41 ( 1%) usr   0.00 ( 0%) sys   1.33 ( 1%) wall       0 kB ( 0%) ggc
 trivially dead code        :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.66 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs           :   0.69 ( 0%) usr   0.01 ( 0%) sys   0.67 ( 0%) wall       0 kB ( 0%) ggc
 df live regs               :   1.86 ( 1%) usr   0.00 ( 0%) sys   1.86 ( 1%) wall       0 kB ( 0%) ggc
 df live&initialized regs   :   0.93 ( 1%) usr   0.00 ( 0%) sys   0.94 ( 1%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   1.33 ( 1%) usr   0.04 ( 2%) sys   1.38 ( 1%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes   :   0.92 ( 1%) usr   0.00 ( 0%) sys   0.96 ( 1%) wall   13469 kB ( 3%) ggc
 register information       :   0.44 ( 0%) usr   0.00 ( 0%) sys   0.43 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis             :   1.05 ( 1%) usr   0.00 ( 0%) sys   1.05 ( 1%) wall   24068 kB ( 5%) ggc
 register scan              :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall      18 kB ( 0%) ggc
 rebuild jump labels        :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall       0 kB ( 0%) ggc
 parser                     :   1.16 ( 1%) usr   0.03 ( 1%) sys   1.21 ( 1%) wall   22673 kB ( 5%) ggc
 inline heuristics          :  15.83 ( 9%) usr   0.40 (20%) sys  16.25 ( 9%) wall     138 kB ( 0%) ggc
 integration                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     885 kB ( 0%) ggc
 tree gimplify              :   0.06 ( 0%) usr   0.01 ( 0%) sys   0.07 ( 0%) wall    6434 kB ( 1%) ggc
 tree CFG construction      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     179 kB ( 0%) ggc
 tree CFG cleanup           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       8 kB ( 0%) ggc
 tree VRP                   :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall     448 kB ( 0%) ggc
 tree copy propagation      :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall     159 kB ( 0%) ggc
 tree find ref. vars        :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall    3229 kB ( 1%) ggc
 tree PTA                   :   1.29 ( 1%) usr   0.03 ( 1%) sys   1.34 ( 1%) wall     540 kB ( 0%) ggc
 tree alias analysis        :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall      57 kB ( 0%) ggc
 tree call clobbering       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      19 kB ( 0%) ggc
 tree flow sensitive alias  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      60 kB ( 0%) ggc
 tree flow insensitive alias:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree memory partitioning   :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA rewrite           :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    8391 kB ( 2%) ggc
 tree SSA other             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA incremental       :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      21 kB ( 0%) ggc
 tree operand scan          :  98.14 (55%) usr   0.03 ( 1%) sys  98.31 (54%) wall    4048 kB ( 1%) ggc
 dominator optimization     :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      73 kB ( 0%) ggc
 tree CCP                   :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     119 kB ( 0%) ggc
 tree PRE                   :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall      62 kB ( 0%) ggc
 tree FRE                   :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      33 kB ( 0%) ggc
 tree forward propagate     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       3 kB ( 0%) ggc
 tree conservative DCE      :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 tree aggressive DCE        :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall      12 kB ( 0%) ggc
 tree loop init             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      44 kB ( 0%) ggc
 tree SSA to normal         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       4 kB ( 0%) ggc
 tree rename SSA copies     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA verifier          :   0.95 ( 1%) usr   0.00 ( 0%) sys   0.97 ( 1%) wall       0 kB ( 0%) ggc
 tree STMT verifier         :   2.05 ( 1%) usr   0.04 ( 2%) sys   2.11 ( 1%) wall       0 kB ( 0%) ggc
 callgraph verifier         :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 expand                     :  12.80 ( 7%) usr   0.33 (16%) sys  13.09 ( 7%) wall  131225 kB (27%) ggc
 lower subreg               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 forward prop               :   0.91 ( 1%) usr   0.01 ( 0%) sys   0.91 ( 1%) wall    9021 kB ( 2%) ggc
 CSE                        :   2.70 ( 2%) usr   0.01 ( 0%) sys   2.70 ( 1%) wall    8941 kB ( 2%) ggc
 dead code elimination      :   0.37 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall       0 kB ( 0%) ggc
 dead store elim1           :   0.59 ( 0%) usr   0.02 ( 1%) sys   0.62 ( 0%) wall   13140 kB ( 3%) ggc
 dead store elim2           :   0.77 ( 0%) usr   0.00 ( 0%) sys   0.76 ( 0%) wall   13219 kB ( 3%) ggc
 CSE 2                      :   2.04 ( 1%) usr   0.00 ( 0%) sys   2.04 ( 1%) wall    3477 kB ( 1%) ggc
 combiner                   :   0.77 ( 0%) usr   0.00 ( 0%) sys   0.79 ( 0%) wall    6633 kB ( 1%) ggc
 if-conversion              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      50 kB ( 0%) ggc
 regmove                    :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall      28 kB ( 0%) ggc
 integrated RA              :   9.17 ( 5%) usr   0.71 (35%) sys   9.92 ( 5%) wall   25558 kB ( 5%) ggc
 reload                     :   3.36 ( 2%) usr   0.10 ( 5%) sys   3.47 ( 2%) wall  101799 kB (21%) ggc
 reload CSE regs            :   1.76 ( 1%) usr   0.00 ( 0%) sys   1.75 ( 1%) wall   27970 kB ( 6%) ggc
 load CSE after reload      :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 thread pro- & epilogue     :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall     231 kB ( 0%) ggc
 peephole 2                 :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall      29 kB ( 0%) ggc
 rename registers           :   1.03 ( 1%) usr   0.00 ( 0%) sys   1.04 ( 1%) wall       0 kB ( 0%) ggc
 scheduling 2               :   3.82 ( 2%) usr   0.03 ( 1%) sys   3.82 ( 2%) wall   53812 kB (11%) ggc
 machine dep reorg          :   0.41 ( 0%) usr   0.00 ( 0%) sys   0.41 ( 0%) wall       0 kB ( 0%) ggc
 reorder blocks             :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       3 kB ( 0%) ggc
 final                      :   0.64 ( 0%) usr   0.02 ( 1%) sys   0.65 ( 0%) wall    3824 kB ( 1%) ggc
 TOTAL                      : 179.47             2.03           181.70             492168 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

real    3m2.238s
user    2m59.927s
sys     0m2.128s

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474
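For anyone wanting to compare the two -ftime-report listings above without reading them row by row, here is a minimal Python sketch. It is not part of GCC or of this bug report: the helper names `parse_time_report` and `biggest_increases` are hypothetical, and the regular expression assumes the row layout printed in this comment (other GCC versions may format the report differently). The sample strings hold only a few rows copied from the two runs.

```python
import re

# Matches one phase row of the -ftime-report layout shown in this comment, e.g.
#   " tree operand scan : 98.14 (55%) usr 0.03 ( 1%) sys ..."
# The TOTAL row has no "(NN%) usr" part, so it is deliberately not matched.
ROW = re.compile(
    r"^\s*(?P<phase>[^:]+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr"
)

def parse_time_report(text):
    """Return {phase name: user seconds} for every phase row in `text`."""
    times = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            times[m.group("phase")] = float(m.group("usr"))
    return times

def biggest_increases(before, after, top=3):
    """Rank phases by growth in user time between two parsed reports.

    Phases missing from `before` are treated as having taken 0 seconds.
    """
    deltas = {p: after[p] - before.get(p, 0.0) for p in after}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:top]

# A few rows copied from the -O0 and -O3 runs above (not the full reports).
O0_SAMPLE = """
 inline heuristics :  16.30 (36%) usr   0.41 (33%) sys  16.75 (36%) wall       0 kB ( 0%) ggc
 tree operand scan :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     236 kB ( 0%) ggc
 expand            :  10.86 (24%) usr   0.38 (30%) sys  11.26 (24%) wall  132856 kB (52%) ggc
 TOTAL             :  45.42             1.25            46.73             253684 kB
"""
O3_SAMPLE = """
 inline heuristics :  15.83 ( 9%) usr   0.40 (20%) sys  16.25 ( 9%) wall     138 kB ( 0%) ggc
 tree operand scan :  98.14 (55%) usr   0.03 ( 1%) sys  98.31 (54%) wall    4048 kB ( 1%) ggc
 expand            :  12.80 ( 7%) usr   0.33 (16%) sys  13.09 ( 7%) wall  131225 kB (27%) ggc
 TOTAL             : 179.47             2.03           181.70             492168 kB
"""

if __name__ == "__main__":
    o0 = parse_time_report(O0_SAMPLE)
    o3 = parse_time_report(O3_SAMPLE)
    for phase, delta in biggest_increases(o0, o3):
        print(f"{phase:20s} {delta:+8.2f} s usr")
```

With these sample rows, "tree operand scan" ranks first by a wide margin, which matches the 0.02 s versus 98.14 s figures in the reports. Feeding the complete saved outputs of both runs to `parse_time_report` should work the same way, under the layout assumption stated above.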