http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
Richard Guenther changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #26 from Dominique d'Humieres
2011-09-22 15:25:48 UTC ---
AFAICT this PR has been fixed for some time. Here are the results I get on
x86_64-apple-darwin10 (Core2Duo 2.53 GHz, 3 MB cache, 4 GB RAM) at revision
179079:
Compile options :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #25 from Dominique d'Humieres
2011-02-16 18:38:19 UTC ---
AFAICT the patch in http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00973.html
seems to fix most of the fatigue.f90 problems:
At revision 170178 without the patch, I get
[macboo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #24 from Dominique d'Humieres
2011-01-24 18:16:47 UTC ---
(In reply to comment #22)
> That FRE pass should be after pass_sra_early (certainly after
> pass_build_ealias).
Moving pass_fre after pass_sra_early does not fix the failures
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
Jack Howarth changed:
What|Removed |Added
CC||howarth at nitro dot
|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #22 from Richard Guenther 2011-01-24
14:07:14 UTC ---
(In reply to comment #15)
> Enabling early FRE
> Index: passes.c
> ===
> --- passes.c (revision 169136)
> +++ pas
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #21 from Dominique d'Humieres
2011-01-24 09:29:00 UTC ---
I have regtested my working tree (with other patches) with the patch in comment
#15 and got 180 new failures (likely 90 for both -m32 and -m64), but I have not
checked that car
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #20 from Dominique d'Humieres
2011-01-23 23:20:34 UTC ---
> This makes hookes_law estimate to be 91 instructions, so -finline-limit=183
> should be enough.
With the patch in comment #19, I rather find a threshold of -finline-limit=25
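
A side note on the -finline-limit values quoted in this thread: per the GCC manual, -finline-limit=n sets both max-inline-insns-single and max-inline-insns-auto to n/2, which is where the "91 instructions, so 183" reasoning comes from. Below is a minimal Python sketch of that arithmetic only. The function name inline_limit_params is hypothetical, integer division is assumed, and the size comparison the inliner actually performs is not modeled.

```python
def inline_limit_params(n):
    """Return the --param values implied by -finline-limit=n.

    Assumption: both parameters are set to n/2 using integer division,
    as described in the GCC manual for -finline-limit.
    """
    half = n // 2
    return {
        "max-inline-insns-single": half,
        "max-inline-insns-auto": half,
    }


# Limits mentioned in the thread (comments #13, #17 and #20).
for limit in (183, 300, 380):
    print(limit, inline_limit_params(limit))
```

With n = 183 both parameters come out as 91, matching the size estimate quoted for hookes_law.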
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
Jan Hubicka changed:
What|Removed |Added
Last reconfirmed||2011-01-23 15:59:30
--- Comment #19 from Ja
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
Jan Hubicka changed:
What|Removed |Added
Last reconfirmed|2011-01-23 15:59:30 |
CC|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #17 from Dominique d'Humieres
2011-01-23 19:38:30 UTC ---
With the patch in comment #15 and -finline-limit=300, I get
Date & Time : 23 Jan 2011 20:1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #16 from Jan Hubicka 2011-01-23
17:57:58 UTC ---
Also w/o inlining hookes_law but with inlining perdida (by using the
large-function-growth parameter only and the patch above), I get a 30% speedup,
not 50% as with inlining both, but it seems
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #15 from Jan Hubicka 2011-01-23
17:56:31 UTC ---
Enabling early FRE
Index: passes.c
===
--- passes.c (revision 169136)
+++ passes.c (working copy)
@@ -760,6 +760,7
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #14 from Dominique d'Humieres
2011-01-23 17:04:07 UTC ---
After removing the comments, generalized_hookes_law reads
function generalized_hookes_law (strain_tensor, lambda, mu) result
(stress_tensor)
!
real (kind = LONGrea
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #13 from Jan Hubicka 2011-01-23
16:45:23 UTC ---
OK, the slowdown goes away when both hookes_law and perdida are inlined.
The first needs -finline-limit=380; the second needs large-function-growth=1000
(or large increase of inline limi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
Jan Hubicka changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #11 from Jan Hubicka 2011-01-08
20:08:26 UTC ---
Does --param hot-bb-frequency-fraction=10 work here?
This is weird!-( I have done the following profiling and it shows that -flto
prevents the inlining of __perdida_m_MOD_perdida,
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #10 from Dominique d'Humieres
2010-09-30 17:28:19 UTC ---
(In reply to comment #8)
> Using -fno-inline-functions, the program recovers the speed of the no-LTO
> version.
This does not work on powerpc-apple-darwin9:
[karma] lin/test%
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #9 from Dominique d'Humieres 2010-09-29
20:27:36 UTC ---
(In reply to comment #8)
> Using -fno-inline-functions, the program recovers the speed of the no-LTO
> version.
This is weird!-( I have done the following profiling and it show
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #8 from Tobias Burnus 2010-09-28
14:57:34 UTC ---
Using -fno-inline-functions, the program recovers the speed of the no-LTO
version.
Notes from #gcc:
(dominiq) For fatigue the key for speed-up is inlining of
generalized_hookes_law an
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #7 from Joost VandeVondele
2010-09-28 14:19:38 UTC ---
(In reply to comment #6)
> No, -fdump-tree-all works
great... I forgot to look in /tmp, and -save-temps also works fine.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #6 from Richard Guenther 2010-09-28
14:07:54 UTC ---
(In reply to comment #5)
> (In reply to comment #4)
> > Sure. As with all performance related bugs this needs analysis and is
> > unlikely an "LTO" problem - LTO does not (not-)opt
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #5 from Joost VandeVondele
2010-09-28 13:58:18 UTC ---
(In reply to comment #4)
> Sure. As with all performance related bugs this needs analysis and is
> unlikely an "LTO" problem - LTO does not (not-)optimize, optimization
> passes
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #4 from Richard Guenther 2010-09-28
13:38:58 UTC ---
(In reply to comment #3)
> (In reply to comment #2)
> > For single-file programs -fwhole-program and -flto should be basically
> > equivalent if the Frontend provides correctly merg
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
Tobias Burnus changed:
What|Removed |Added
CC||hubicka at gcc dot gnu.org
--- Comment #3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
--- Comment #2 from Richard Guenther 2010-09-27
10:48:33 UTC ---
For single-file programs -fwhole-program and -flto should be basically
equivalent if the Frontend provides correctly merged decls. I suppose
it does not and thus we do less inlinin
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45810
Joost VandeVondele changed:
What|Removed |Added
CC||Joost.VandeVondele at pci