> Are you building with --enable-checking (the default)?
On AMD I am using François-Xavier's builds. On my G5 I use a patched
version of Fink's info file; the answer is probably in
ConfigureParams: --prefix=%p/lib/gcc4
--enable-languages=c,c++,fortran,objc,java --infodir='${prefix}/share/info'
--with-gmp=%p --with-included-gettext --host=%m-apple-darwin`uname -r|cut -f1
-d.` `if test ! -f /usr/lib/libSystemStubs.a ; then echo -n
"--with-as=%p/lib/odcctools/bin/as --with-ld=%p/lib/odcctools/bin/ld" ; fi`
and, as far as I can tell, it is "no".
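
A quick way to double-check a configure line for a checking option is sketched below. This is only an illustration: the helper name checking_setting is made up for this message, and the sample string is a shortened copy of the ConfigureParams quoted above. When no option is given, the default chosen by the GCC sources applies, which as far as I know differs between release tarballs and development snapshots.

```python
import re

def checking_setting(configure_line):
    """Classify the checking option found in a configure command line.

    Hypothetical helper for illustration; it only inspects the text it is given.
    """
    if re.search(r"--disable-checking\b", configure_line):
        return "disabled"
    m = re.search(r"--enable-checking(?:=(\S+))?", configure_line)
    if m:
        # A bare --enable-checking is equivalent to --enable-checking=yes.
        return m.group(1) or "yes"
    return "not specified (build default applies)"

# Shortened copy of the ConfigureParams quoted above.
params = ("--prefix=%p/lib/gcc4 --enable-languages=c,c++,fortran,objc,java "
          "--with-gmp=%p --with-included-gettext")
print(checking_setting(params))
```

So "no" here means only that nothing was passed explicitly, not that the checks are off.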
> Can you try compiling some of the most-affected files with -ftime-report, ...?
As a quick answer, with -ftime-report on induct.f90 I get:
[karma] lin/source% time gfortran -ftime-report -O3 -ffast-math -funroll-loops induct.f90
Execution times (seconds)
garbage collection : 0.58 ( 1%) usr 0.11 ( 2%) sys 0.71 ( 1%) wall 0 kB ( 0%) ggc
callgraph construction: 0.15 ( 0%) usr 0.02 ( 0%) sys 0.16 ( 0%) wall 645 kB ( 0%) ggc
callgraph optimization: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 194 kB ( 0%) ggc
ipa reference : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 1 kB ( 0%) ggc
ipa pure const : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
ipa type escape : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
cfg construction : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 77 kB ( 0%) ggc
cfg cleanup : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 94 kB ( 0%) ggc
CFG verifier : 0.45 ( 1%) usr 0.09 ( 2%) sys 0.50 ( 1%) wall 0 kB ( 0%) ggc
trivially dead code : 0.12 ( 0%) usr 0.01 ( 0%) sys 0.28 ( 0%) wall 0 kB ( 0%) ggc
life analysis : 0.53 ( 1%) usr 0.02 ( 0%) sys 0.47 ( 1%) wall 505 kB ( 0%) ggc
life info update : 0.11 ( 0%) usr 0.01 ( 0%) sys 0.27 ( 0%) wall 102 kB ( 0%) ggc
alias analysis : 0.32 ( 1%) usr 0.02 ( 0%) sys 0.51 ( 1%) wall 2161 kB ( 2%) ggc
register scan : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 7 kB ( 0%) ggc
rebuild jump labels : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
parser : 0.39 ( 1%) usr 0.04 ( 1%) sys 0.98 ( 2%) wall 3728 kB ( 3%) ggc
integration : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 0.17 ( 0%) usr 0.01 ( 0%) sys 0.15 ( 0%) wall 977 kB ( 1%) ggc
tree CFG construction : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.02 ( 0%) wall 1699 kB ( 1%) ggc
tree CFG cleanup : 0.13 ( 0%) usr 0.04 ( 1%) sys 0.16 ( 0%) wall 327 kB ( 0%) ggc
tree VRP : 0.25 ( 0%) usr 0.08 ( 2%) sys 0.36 ( 1%) wall 2304 kB ( 2%) ggc
tree copy propagation : 1.18 ( 2%) usr 0.31 ( 6%) sys 1.54 ( 2%) wall 542 kB ( 0%) ggc
tree store copy prop : 0.22 ( 0%) usr 0.05 ( 1%) sys 0.28 ( 0%) wall 93 kB ( 0%) ggc
tree find ref. vars : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 347 kB ( 0%) ggc
tree PTA : 0.58 ( 1%) usr 0.01 ( 0%) sys 0.61 ( 1%) wall 185 kB ( 0%) ggc
tree alias analysis : 0.70 ( 1%) usr 0.40 ( 8%) sys 1.24 ( 2%) wall 1747 kB ( 1%) ggc
tree PHI insertion : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 301 kB ( 0%) ggc
tree SSA rewrite : 1.39 ( 2%) usr 0.38 ( 8%) sys 1.81 ( 3%) wall 45605 kB (35%) ggc
tree SSA other : 0.06 ( 0%) usr 0.04 ( 1%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 3.70 ( 6%) usr 0.13 ( 3%) sys 3.82 ( 6%) wall 7379 kB ( 6%) ggc
tree operand scan : 1.20 ( 2%) usr 0.63 (13%) sys 1.93 ( 3%) wall 19607 kB (15%) ggc
dominator optimization: 0.96 ( 2%) usr 0.03 ( 1%) sys 0.96 ( 2%) wall 3806 kB ( 3%) ggc
tree SRA : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree STORE-CCP : 0.16 ( 0%) usr 0.05 ( 1%) sys 0.16 ( 0%) wall 39 kB ( 0%) ggc
tree CCP : 0.17 ( 0%) usr 0.03 ( 1%) sys 0.18 ( 0%) wall 17 kB ( 0%) ggc
tree split crit edges : 0.03 ( 0%) usr 0.02 ( 0%) sys 0.05 ( 0%) wall 1842 kB ( 1%) ggc
tree reassociation : 0.05 ( 0%) usr 0.03 ( 1%) sys 0.05 ( 0%) wall 42 kB ( 0%) ggc
tree PRE : 0.56 ( 1%) usr 0.04 ( 1%) sys 0.58 ( 1%) wall 1264 kB ( 1%) ggc
tree FRE : 0.17 ( 0%) usr 0.01 ( 0%) sys 0.21 ( 0%) wall 1014 kB ( 1%) ggc
tree code sinking : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 9 kB ( 0%) ggc
tree forward propagate: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 4 kB ( 0%) ggc
tree conservative DCE : 0.45 ( 1%) usr 0.00 ( 0%) sys 0.43 ( 1%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 87 kB ( 0%) ggc
PHI merge : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 1368 kB ( 1%) ggc
tree loop bounds : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 210 kB ( 0%) ggc
loop invariant motion : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 12 kB ( 0%) ggc
tree canonical iv : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 109 kB ( 0%) ggc
scev constant prop : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 36 kB ( 0%) ggc
tree loop unswitching : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
complete unrolling : 0.82 ( 1%) usr 0.04 ( 1%) sys 1.07 ( 2%) wall 737 kB ( 1%) ggc
tree iv optimization : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 916 kB ( 1%) ggc
tree loop init : 0.07 ( 0%) usr 0.02 ( 0%) sys 0.13 ( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.02 ( 0%) usr 0.01 ( 0%) sys 0.04 ( 0%) wall 2030 kB ( 2%) ggc
tree SSA uncprop : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 0.20 ( 0%) usr 0.13 ( 3%) sys 0.33 ( 1%) wall 1386 kB ( 1%) ggc
tree rename SSA copies: 0.06 ( 0%) usr 0.12 ( 2%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
tree SSA verifier : 28.82 (50%) usr 1.12 (23%) sys 30.10 (48%) wall 19 kB ( 0%) ggc
tree STMT verifier : 4.78 ( 8%) usr 0.18 ( 4%) sys 4.92 ( 8%) wall 0 kB ( 0%) ggc
callgraph verifier : 0.02 ( 0%) usr 0.02 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
expand : 1.29 ( 2%) usr 0.09 ( 2%) sys 1.43 ( 2%) wall 9358 kB ( 7%) ggc
jump : 0.02 ( 0%) usr 0.01 ( 0%) sys 0.00 ( 0%) wall 18 kB ( 0%) ggc
CSE : 0.63 ( 1%) usr 0.03 ( 1%) sys 0.63 ( 1%) wall 443 kB ( 0%) ggc
loop analysis : 0.26 ( 0%) usr 0.10 ( 2%) sys 0.34 ( 1%) wall 1635 kB ( 1%) ggc
global CSE : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
CPROP 1 : 0.08 ( 0%) usr 0.01 ( 0%) sys 0.07 ( 0%) wall 567 kB ( 0%) ggc
PRE : 0.06 ( 0%) usr 0.02 ( 0%) sys 0.09 ( 0%) wall 355 kB ( 0%) ggc
CPROP 2 : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 269 kB ( 0%) ggc
bypass jumps : 0.09 ( 0%) usr 0.01 ( 0%) sys 0.10 ( 0%) wall 238 kB ( 0%) ggc
web : 0.09 ( 0%) usr 0.02 ( 0%) sys 0.12 ( 0%) wall 203 kB ( 0%) ggc
CSE 2 : 0.41 ( 1%) usr 0.01 ( 0%) sys 0.46 ( 1%) wall 271 kB ( 0%) ggc
branch prediction : 0.04 ( 0%) usr 0.01 ( 0%) sys 0.06 ( 0%) wall 143 kB ( 0%) ggc
flow analysis : 0.00 ( 0%) usr 0.01 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
combiner : 0.33 ( 1%) usr 0.00 ( 0%) sys 0.34 ( 1%) wall 1379 kB ( 1%) ggc
if-conversion : 0.00 ( 0%) usr 0.01 ( 0%) sys 0.02 ( 0%) wall 19 kB ( 0%) ggc
regmove : 0.10 ( 0%) usr 0.01 ( 0%) sys 0.09 ( 0%) wall 3 kB ( 0%) ggc
scheduling : 0.44 ( 1%) usr 0.09 ( 2%) sys 0.45 ( 1%) wall 2689 kB ( 2%) ggc
local alloc : 0.30 ( 1%) usr 0.03 ( 1%) sys 0.32 ( 1%) wall 596 kB ( 0%) ggc
global alloc : 0.92 ( 2%) usr 0.03 ( 1%) sys 0.90 ( 1%) wall 2497 kB ( 2%) ggc
reload CSE regs : 0.33 ( 1%) usr 0.01 ( 0%) sys 0.32 ( 1%) wall 1198 kB ( 1%) ggc
load CSE after reload : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 13 kB ( 0%) ggc
flow 2 : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 200 kB ( 0%) ggc
if-conversion 2 : 0.00 ( 0%) usr 0.01 ( 0%) sys 0.01 ( 0%) wall 4 kB ( 0%) ggc
peephole 2 : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
rename registers : 0.48 ( 1%) usr 0.02 ( 0%) sys 0.50 ( 1%) wall 610 kB ( 0%) ggc
scheduling 2 : 0.40 ( 1%) usr 0.02 ( 0%) sys 0.36 ( 1%) wall 2552 kB ( 2%) ggc
reorder blocks : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 159 kB ( 0%) ggc
shorten branches : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
final : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.16 ( 0%) wall 350 kB ( 0%) ggc
TOTAL : 57.21 4.84 63.35 129840 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --disable-checking to disable checks.
57.300u 4.940s 1:03.76 97.6% 0+0k 8+23io 0pf+0w
where the tree SSA verifier takes half the time. I'll run some checks on AMD,
where I can more easily choose the version of gfortran I am using.
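
To put one number on the verifier passes taken together, here is a rough sketch that pulls the user times out of a few -ftime-report rows and sums those whose name contains "verifier". The function names and the sample string are mine, for illustration only; the rows are copied from the report above and the total (57.21 s user) is taken from its TOTAL line. Only the start of each row is matched, so it should not matter whether the mailer wrapped the rest.

```python
import re

# Matches the start of a -ftime-report row: "<pass name> : <seconds> (<pct>%) usr"
ROW = re.compile(r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr")

def user_times(report):
    """Return {pass name: user seconds} for each row found in the report text."""
    times = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            times[m.group("name")] = float(m.group("usr"))
    return times

def verifier_share(times, total):
    """Fraction of `total` user time spent in passes named '...verifier'."""
    spent = sum(t for name, t in times.items() if "verifier" in name.lower())
    return spent / total

# A few rows copied from the report above (not the full listing).
sample = """\
CFG verifier : 0.45 ( 1%) usr 0.09 ( 2%) sys 0.50 ( 1%) wall
tree SSA verifier : 28.82 (50%) usr 1.12 (23%) sys 30.10 (48%) wall
tree STMT verifier : 4.78 ( 8%) usr 0.18 ( 4%) sys 4.92 ( 8%) wall
callgraph verifier : 0.02 ( 0%) usr 0.02 ( 0%) sys 0.05 ( 0%) wall
expand : 1.29 ( 2%) usr 0.09 ( 2%) sys 1.43 ( 2%) wall
"""

times = user_times(sample)
share = verifier_share(times, total=57.21)  # 57.21 s is the TOTAL user time above
print(f"verifier passes: {share:.0%} of user time")
```

With the four verifier rows above this comes to 34.07 s out of 57.21 s, so roughly 60% of the user time rather than 50% once the STMT, CFG and callgraph verifiers are counted as well.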
Cheers
Dominique