http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #44 from davidxl 2011-01-27 17:33:42
UTC ---
Nice triaging.
David
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
Jakub Jelinek changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #42 from Richard Guenther 2011-01-27
16:30:52 UTC ---
Comparing -O3 -ffast-math -funroll-loops -fno-inline -fno-partial-inlining
(thus generic arch, without prefetching):
trunk:
df live regs : 4.22 ( 6%) usr 0.04 ( 2%)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #41 from Jakub Jelinek 2011-01-27
16:28:49 UTC ---
With additional -fno-prefetch-loop-arrays the TOTAL goes down from that 137s to
92.23, and judging from tree dumps between 4.5 and 4.6 we do significantly more
vectorization too (4.6
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #40 from Richard Guenther 2011-01-27
16:19:26 UTC ---
Btw, when I remove -fbounds-check the sizes are comparable (without
prefetching), so I guess we are just better at removing bounds checking for
4.6 and that triggers size-costly lo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #39 from Richard Guenther 2011-01-27
16:16:48 UTC ---
The size difference is likely from prefetching, it's 1.5MB vs. 1.1MB without
that (-O3 -fbounds-check -ffast-math -funroll-loops). Prefetching usually
causes another set of (then
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #38 from Jakub Jelinek 2011-01-27
16:02:49 UTC ---
*.gimple dump is roughly the same size between 4.5 and 4.6, but resulting
assembly size is 15MB in 4.5 and 23MB (with only < 100KB variation with
-fno-ivopts) in 4.6. -fno-inline doe
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #37 from Jakub Jelinek 2011-01-27
15:55:57 UTC ---
/usr/src/gcc/objr/gcc/f951 -quiet -ftime-report -fbounds-check -g -O3
-ffast-math -funroll-loops -ftree-vectorize -march=amdfam10 pr45422.f90 2>&1 |
grep ':[ ]*[1-9]\|TOTAL'
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #36 from davidxl 2011-01-25 17:28:30
UTC ---
(In reply to comment #35)
> (In reply to comment #34)
> > -march=native is ambiguous, please see with -v what actually is being used.
>
> This was mentioned in the initial comment:
> -marc
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #35 from Joost VandeVondele
2011-01-25 10:03:02 UTC ---
(In reply to comment #34)
> -march=native is ambiguous, please see with -v what actually is being used.
This was mentioned in the initial comment:
-march=k8-sse3 -mcx16 -msahf
-
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #34 from Jakub Jelinek 2011-01-25
09:52:23 UTC ---
-march=native is ambiguous, please see with -v what actually is being used.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #33 from Joost VandeVondele
2011-01-25 09:47:10 UTC ---
I just note that the timings reported by David and Jakub are not for the
compile options I originally reported.
With 4.6 (20110117) I now have
gfortran -c -ftime-report -cpp -
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #32 from Jakub Jelinek 2011-01-25
09:02:57 UTC ---
IMHO for P1 purposes we should just look at compile time regressions from 4.5
here at this point. On the #c1 testcase I get with --enable-checking=release
current trunk and current 4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #31 from davidxl 2011-01-21 20:08:11
UTC ---
Comparing this timing with the 4.6 results (164s), it looks like many passes
other than ivopt became slower (e.g. IRA increases from 3.5s to 11s, etc. -- ivopt
only accounts for a small part of
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #30 from davidxl 2011-01-21 19:58:41
UTC ---
(In reply to comment #29)
> (In reply to comment #28)
> > David, any progress with this?
>
> The cost function fix to make sure solution set does not become too big will be
> probably v
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #29 from davidxl 2011-01-21 16:27:43
UTC ---
(In reply to comment #28)
> David, any progress with this?
The cost function fix to make sure solution set does not become too big will be
probably very involved and won't be available in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment #28
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
Joost VandeVondele changed:
What|Removed |Added
Last reconfirmed|2010-08-29 09:25:52 |2011-01-17 9:25:52
--- Comment #27 f
--
rguenth at gcc dot gnu dot org changed:
What|Removed |Added
Priority|P3 |P1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #26 from davidxl at gcc dot gnu dot org 2010-08-31 17:45
---
Good observation re. the number of IVs in the final set. This usually points to
some problem/bug in the cost function. I briefly looked at this case -- it
indeed exposes two more bugs in the cost model:
1) the com
--- Comment #25 from davidxl at gcc dot gnu dot org 2010-08-30 16:41
---
(In reply to comment #24)
> (In reply to comment #20)
> > (In reply to comment #16)
> > > adjust summary according to the last timings
> > >
> >
> > I am surprised to see such big differences between trunk and pr
--- Comment #24 from rguenth at gcc dot gnu dot org 2010-08-30 07:12
---
(In reply to comment #20)
> (In reply to comment #16)
> > adjust summary according to the last timings
> >
>
> I am surprised to see such big differences between trunk and previous
> releases.
> Compiling this t
--- Comment #23 from rguenth at gcc dot gnu dot org 2010-08-30 07:11
---
(In reply to comment #22)
> Given the fact that the solution space is really large -- M^N where M is the
> number of candidates and N is the number of uses (here M == 70 and N == 48),
> and the cost function is co
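
To give a feel for the M^N figure quoted above, here is a minimal Python sketch that evaluates it for the quoted values (70 candidates, 48 uses). The function name `search_space_size` is hypothetical and chosen only for illustration; it is not part of GCC, and the sketch assumes each use independently picks one candidate, which is how the M^N count reads.

```python
import math


def search_space_size(candidates: int, uses: int) -> int:
    """Number of assignments when each use picks one of the candidates."""
    return candidates ** uses


# Values quoted in the comment: M == 70 candidates, N == 48 uses.
size = search_space_size(70, 48)

# Python integers are arbitrary precision, so the value is exact.
digits = len(str(size))
magnitude = math.log10(size)

print(f"70^48 has {digits} decimal digits (log10 is about {magnitude:.2f})")
```

An exhaustive walk over a space of that size is out of the question, which fits the discussion of the cost function and the size of the solution set in the neighbouring comments.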
--- Comment #21 from davidxl at gcc dot gnu dot org 2010-08-30 03:19
---
(In reply to comment #17)
> tree iv optimization : 32.57 (20%) usr 0.10 ( 5%) sys 32.73 (20%) wall
> 322095 kB (18%) ggc
>
>
> 20% is still completely unreasonable for IV optimization.
>
There was a patc
--- Comment #20 from davidxl at gcc dot gnu dot org 2010-08-30 03:10
---
(In reply to comment #16)
> adjust summary according to the last timings
>
I am surprised to see such big differences between trunk and previous releases.
Compiling this test case with those options on my cor
--- Comment #18 from jv244 at cam dot ac dot uk 2010-08-29 15:07 ---
FYI, these are the 4.5 branch timings:
Execution times (seconds)
garbage collection: 0.47 ( 1%) usr 0.00 ( 0%) sys 0.47 ( 1%) wall
0 kB ( 0%) ggc
callgraph construction: 0.05 ( 0%) usr 0.01 ( 1%)
--- Comment #17 from rguenth at gcc dot gnu dot org 2010-08-29 09:25
---
tree iv optimization : 32.57 (20%) usr 0.10 ( 5%) sys 32.73 (20%) wall
322095 kB (18%) ggc
20% is still completely unreasonable for IV optimization.
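
For comparing pass timings such as the one above across compiler versions, a small parser can help. The following is a rough Python sketch that assumes each -ftime-report entry sits on a single line in the layout quoted in this comment (name, then usr/sys/wall seconds with percentages, then ggc kB with a percentage). The names `TIME_REPORT_RE` and `parse_time_report_line` are hypothetical, and other GCC versions may print a different layout.

```python
import re

# Matches one -ftime-report entry in the layout quoted above, e.g.
# "tree iv optimization : 32.57 (20%) usr 0.10 ( 5%) sys 32.73 (20%) wall 322095 kB (18%) ggc"
TIME_REPORT_RE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*(?P<usr_pct>\d+)%\)\s*usr\s+"
    r"(?P<sys>\d+\.\d+)\s*\(\s*(?P<sys_pct>\d+)%\)\s*sys\s+"
    r"(?P<wall>\d+\.\d+)\s*\(\s*(?P<wall_pct>\d+)%\)\s*wall\s+"
    r"(?P<ggc_kb>\d+)\s*kB\s*\(\s*(?P<ggc_pct>\d+)%\)\s*ggc\s*$"
)


def parse_time_report_line(line: str):
    """Return a dict of the fields in one entry, or None if it does not match."""
    m = TIME_REPORT_RE.match(line)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "usr": float(m.group("usr")),
        "sys": float(m.group("sys")),
        "wall": float(m.group("wall")),
        "usr_pct": int(m.group("usr_pct")),
        "ggc_kb": int(m.group("ggc_kb")),
    }


entry = parse_time_report_line(
    "tree iv optimization : 32.57 (20%) usr 0.10 ( 5%) sys "
    "32.73 (20%) wall 322095 kB (18%) ggc"
)
print(entry)
```

Lines that do not follow this layout (headers, blank lines, wrapped entries) return None, so wrapped entries would need to be joined before parsing.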
--
rguenth at gcc dot gnu dot org changed:
--- Comment #16 from jv244 at cam dot ac dot uk 2010-08-29 06:38 ---
adjust summary according to the last timings
--
jv244 at cam dot ac dot uk changed:
What|Removed |Added
---