http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #37 from Jakub Jelinek jakub at gcc dot gnu.org 2011-01-27
15:55:57 UTC ---
/usr/src/gcc/objr/gcc/f951 -quiet -ftime-report -fbounds-check -g -O3
-ffast-math -funroll-loops -ftree-vectorize -march=amdfam10 pr45422.f90 2>&1 |
grep ':[
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #38 from Jakub Jelinek jakub at gcc dot gnu.org 2011-01-27
16:02:49 UTC ---
*.gimple dump is roughly the same size between 4.5 and 4.6, but resulting
assembly size is 15MB in 4.5 and 23MB (with only 100KB variation with
-fno-ivopts)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #39 from Richard Guenther rguenth at gcc dot gnu.org 2011-01-27
16:16:48 UTC ---
The size difference is likely from prefetching; it's 1.5MB vs. 1.1MB without
that (-O3 -fbounds-check -ffast-math -funroll-loops). Prefetching usually
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #40 from Richard Guenther rguenth at gcc dot gnu.org 2011-01-27
16:19:26 UTC ---
Btw, when I remove -fbounds-check the sizes are comparable (without
prefetching),
so I guess we are just better in removing bounds checking for 4.6 and
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #41 from Jakub Jelinek jakub at gcc dot gnu.org 2011-01-27
16:28:49 UTC ---
With additional -fno-prefetch-loop-arrays the TOTAL goes down from that 137s to
92.23, and judging from tree dumps between 4.5 and 4.6 we do significantly
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #42 from Richard Guenther rguenth at gcc dot gnu.org 2011-01-27
16:30:52 UTC ---
Comparing -O3 -ffast-math -funroll-loops -fno-inline -fno-partial-inlining
(thus generic arch, without prefetching):
trunk:
df live regs :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
Jakub Jelinek jakub at gcc dot gnu.org changed:
What|Removed |Added
Status|NEW |RESOLVED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #44 from davidxl xinliangli at gmail dot com 2011-01-27 17:33:42
UTC ---
Nice triaging..
David
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #32 from Jakub Jelinek jakub at gcc dot gnu.org 2011-01-25
09:02:57 UTC ---
IMHO for P1 purposes we should just look at compile time regressions from 4.5
here at this point. On the #c1 testcase I get with --enable-checking=release
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #33 from Joost VandeVondele Joost.VandeVondele at pci dot uzh.ch
2011-01-25 09:47:10 UTC ---
I just note that the timings reported by David and Jakub are not for the
compile options I originally reported.
With 4.6 (20110117) I now
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #34 from Jakub Jelinek jakub at gcc dot gnu.org 2011-01-25
09:52:23 UTC ---
-march=native is ambiguous, please see with -v what actually is being used.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #35 from Joost VandeVondele Joost.VandeVondele at pci dot uzh.ch
2011-01-25 10:03:02 UTC ---
(In reply to comment #34)
> -march=native is ambiguous, please see with -v what actually is being used.
This was mentioned in the initial
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #36 from davidxl xinliangli at gmail dot com 2011-01-25 17:28:30
UTC ---
(In reply to comment #35)
> (In reply to comment #34)
> > -march=native is ambiguous, please see with -v what actually is being used.
> This was mentioned in the
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
Jakub Jelinek jakub at gcc dot gnu.org changed:
What|Removed |Added
CC||jakub at gcc dot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #29 from davidxl xinliangli at gmail dot com 2011-01-21 16:27:43
UTC ---
(In reply to comment #28)
> David, any progress with this?
The cost function fix to make sure solution set does not become too big will be
probably very involved
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #30 from davidxl xinliangli at gmail dot com 2011-01-21 19:58:41
UTC ---
(In reply to comment #29)
> (In reply to comment #28)
> > David, any progress with this?
> The cost function fix to make sure solution set does not become too big
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #31 from davidxl xinliangli at gmail dot com 2011-01-21 20:08:11
UTC ---
Comparing this timing with the 4.6 results (164s), it looks like many passes
other than ivopt became slower (e.g. IRA increases from 3.5s to 11s, etc. -- ivopt
only
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
Joost VandeVondele Joost.VandeVondele at pci dot uzh.ch changed:
What|Removed |Added
Last reconfirmed|2010-08-29 09:25:52
--
rguenth at gcc dot gnu dot org changed:
What|Removed |Added
Priority|P3 |P1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #26 from davidxl at gcc dot gnu dot org 2010-08-31 17:45
---
Good observation re. the number of IVs in the final set. This usually points to
some problem/bug in the cost function. I briefly looked at this case -- it
indeed exposes two more bugs in the cost model:
1) the
--- Comment #23 from rguenth at gcc dot gnu dot org 2010-08-30 07:11
---
(In reply to comment #22)
Given the fact that the solution space is really large -- M^N where M is the
number of candidates and N is the number of uses (here M == 70 and N == 48),
and the cost function is
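
For scale, the size of that solution space can be checked directly. This is only a back-of-the-envelope sketch: it takes M as the number of candidates and N as the number of uses, as in the comment above, and the variable names are illustrative rather than anything from the GCC sources.

```python
import math

# Figures quoted in the comment above; names are illustrative only.
candidates = 70  # M: number of IV candidates
uses = 48        # N: number of IV uses

# Each use can pick any candidate, so an exhaustive search
# would have to consider M**N assignments.
search_space = candidates ** uses

# Order of magnitude, computed two ways as a cross-check.
digits = len(str(search_space))
log10_size = uses * math.log10(candidates)

print(f"M^N has {digits} decimal digits (log10 is about {log10_size:.1f})")
```

An exhaustive search is clearly out of the question at this size, which is why the selection has to rely on a heuristic driven by the cost function.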
--- Comment #24 from rguenth at gcc dot gnu dot org 2010-08-30 07:12
---
(In reply to comment #20)
> (In reply to comment #16)
> > adjust summary according to the last timings
> I am surprised to see such big differences between trunk and previous
> releases.
> Compiling this test case
--- Comment #25 from davidxl at gcc dot gnu dot org 2010-08-30 16:41
---
(In reply to comment #24)
> (In reply to comment #20)
> > (In reply to comment #16)
> > > adjust summary according to the last timings
> > I am surprised to see such big differences between trunk and previous
--- Comment #16 from jv244 at cam dot ac dot uk 2010-08-29 06:38 ---
adjust summary according to the last timings
--
jv244 at cam dot ac dot uk changed:
What|Removed |Added
--- Comment #17 from rguenth at gcc dot gnu dot org 2010-08-29 09:25
---
tree iv optimization : 32.57 (20%) usr 0.10 ( 5%) sys 32.73 (20%) wall
322095 kB (18%) ggc
20% is still completely unreasonable for IV optimization.
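
To compare per-pass numbers like these across compiler versions, a line of this shape can be pulled apart with a small script. The sketch below assumes the exact layout of the line quoted above, with the wrapped "ggc" part joined back onto the same line; -ftime-report output differs between GCC releases, so the pattern and the helper name `parse_time_report_line` are illustrative, not a stable interface.

```python
import re

# Pattern for one per-pass line in the layout quoted above:
#   name : usr (pct%) usr  sys (pct%) sys  wall (pct%) wall  N kB (pct%) ggc
TIME_REPORT_LINE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>[\d.]+)\s*\(\s*(?P<usr_pct>\d+)%\)\s*usr\s+"
    r"(?P<sys>[\d.]+)\s*\(\s*(?P<sys_pct>\d+)%\)\s*sys\s+"
    r"(?P<wall>[\d.]+)\s*\(\s*(?P<wall_pct>\d+)%\)\s*wall\s+"
    r"(?P<ggc_kb>\d+)\s*kB\s*\(\s*(?P<ggc_pct>\d+)%\)\s*ggc"
)


def parse_time_report_line(line):
    """Return the fields of one per-pass line as a dict, or None if no match."""
    match = TIME_REPORT_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    result = {"name": fields["name"]}
    for key in ("usr", "sys", "wall"):
        result[key] = float(fields[key])
    for key in ("usr_pct", "sys_pct", "wall_pct", "ggc_kb", "ggc_pct"):
        result[key] = int(fields[key])
    return result


# The line from the comment above, with its two halves joined.
sample = (
    "tree iv optimization : 32.57 (20%) usr 0.10 ( 5%) sys "
    "32.73 (20%) wall 322095 kB (18%) ggc"
)
parsed = parse_time_report_line(sample)
print(parsed["name"], parsed["usr"], parsed["usr_pct"])
```

Running the same parser over the reports from two compiler builds gives per-pass rows that can be diffed directly instead of compared by eye.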
--
rguenth at gcc dot gnu dot org changed:
--- Comment #18 from jv244 at cam dot ac dot uk 2010-08-29 15:07 ---
FYI, these are the 4.5 branch timings:
Execution times (seconds)
garbage collection: 0.47 ( 1%) usr 0.00 ( 0%) sys 0.47 ( 1%) wall
0 kB ( 0%) ggc
callgraph construction: 0.05 ( 0%) usr 0.01 ( 1%)
--- Comment #20 from davidxl at gcc dot gnu dot org 2010-08-30 03:10
---
(In reply to comment #16)
> adjust summary according to the last timings
I am surprised to see such big differences between trunk and previous releases.
Compiling this test case with those options on my
--- Comment #21 from davidxl at gcc dot gnu dot org 2010-08-30 03:19
---
(In reply to comment #17)
> tree iv optimization : 32.57 (20%) usr 0.10 ( 5%) sys 32.73 (20%) wall
> 322095 kB (18%) ggc
> 20% is still completely unreasonable for IV optimization.
There was a patch in