https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
Richard Biener changed:
What|Removed |Added
Last reconfirmed|2019-03-05 00:00:00 |2024-2-20
--- Comment #62 from Richard
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #61 from Richard Biener ---
r12-7592, first testcase, x86_64:
-O0: 6s, 1GB
-O1: 264s, 1.4GB
callgraph ipa passes : 30.47 ( 12%)
alias stmt walking : 67.44 ( 26%)
tree loop invariant motion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #60 from Richard Biener ---
PRE is all find_base_term exploding ...
The LIM case is all store_motion (), which is quadratic and the only
user of the quadratic-in-memory all_refs_stored_in_loop. The latter
would be reasonably easy to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
Richard Biener changed:
What|Removed |Added
Last reconfirmed|2009-03-01 11:39:34 |2019-3-5
--- Comment #59 from Richard B
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
Steven Bosscher changed:
What|Removed |Added
Status|ASSIGNED|NEW
CC|steven at gcc d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #57 from rguenther at suse dot de ---
On Wed, 17 Feb 2016, sergstesh at yahoo dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
>
> --- Comment #56 from Sergei Steshenko ---
> "-O2 ... and 820MB peak memory use" v
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #56 from Sergei Steshenko ---
"-O2 ... and 820MB peak memory use" vs "-O3 ... and 700MB peak memory use" -
according to my common sense -O3 is stronger than -O2 optimization, and one
should expect greater memory use.
So, can the abov
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #55 from Richard Biener ---
Current GCC 6 numbers:
-O1 -g
var-tracking dataflow : 89.37 (60%) usr 0.09 (23%) sys 89.53 (59%) wall
11542 kB ( 3%) ggc
var-tracking emit : 47.70 (32%) usr 0.05 (13%) sys 47.81 (32%)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #54 from Richard Biener ---
For the original testcase on trunk we get at -O1
tree loop invariant motion: 37.20 (16%) usr 0.02 ( 1%) sys 37.20 (16%)
wall 12127 kB ( 1%) ggc
dead store elim1: 17.42 ( 7%) usr 0.04 ( 2%
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #53 from Richard Biener ---
With the PR59802 and PR38518 fixes on trunk I see for the 2nd testcase at -O2
PRE : 19.23 (35%) usr 0.01 ( 1%) sys 19.22 (34%) wall
1421 kB ( 0%) ggc
combiner:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #52 from Richard Biener 2013-03-26
10:09:52 UTC ---
(In reply to comment #51)
> > (struct mem_ref): Replace mem member with ao_ref typed member.
>
> RTL gcse (-O2) suffers from the same slowness in its dependence tests.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #51 from Richard Biener 2013-03-22
14:00:25 UTC ---
> (struct mem_ref): Replace mem member with ao_ref typed member.
RTL gcse (-O2) suffers from the same slowness in its dependence tests. Caching
ao_ref instead of just
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Co
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #49 from Richard Biener 2013-03-18
08:43:08 UTC ---
Author: rguenth
Date: Mon Mar 18 08:42:57 2013
New Revision: 196768
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=196768
Log:
2013-03-18 Richard Biener
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #48 from Richard Biener 2013-03-15
16:06:33 UTC ---
Removing all the caching (dep/indep_loop and dep/indep_ref) makes things
faster ... (they have a hit rate of 5% resp. 6.2% only)
Fastest timing so far:
tree loop invarian
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #47 from Richard Biener 2013-03-12
14:05:10 UTC ---
With caching affine-combination compute and ao_ref compute I have it down to
tree loop invariant motion: 596.91 (79%) usr 0.73 (29%) sys 599.77 (78%)
wall 31135 kB ( 3%
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #46 from Richard Biener 2013-03-12
10:46:45 UTC ---
Created attachment 29649
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29649
symmetry in reference testing
Exploit symmetry in reference testing.
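As a rough illustration of the idea named in the attachment title (not the attached patch itself): when a pairwise reference test is symmetric, each unordered pair only needs to be tested once, which roughly halves the number of tests. The sketch below assumes a hypothetical `struct ref_range` (a half-open byte range) and a simple overlap test standing in for the real dependence check; neither name comes from GCC.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory reference: a half-open byte range.
   This is an illustration only, not a GCC data structure. */
struct ref_range {
    size_t start;
    size_t end;
};

/* Symmetric test: ranges_overlap (a, b) == ranges_overlap (b, a). */
static bool ranges_overlap(const struct ref_range *a, const struct ref_range *b)
{
    return a->start < b->end && b->start < a->end;
}

/* Count conflicting unordered pairs. Because the test is symmetric, the
   inner loop starts at i + 1, so n * (n - 1) / 2 tests are run instead of
   n * (n - 1). The number of tests performed is stored in *tests_run. */
size_t count_conflicts_symmetric(const struct ref_range *refs, size_t n,
                                 size_t *tests_run)
{
    size_t conflicts = 0;
    size_t tests = 0;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            tests++;
            if (ranges_overlap(&refs[i], &refs[j]))
                conflicts++;
        }
    }
    if (tests_run)
        *tests_run = tests;
    return conflicts;
}
```

The pass is still quadratic in the number of references; exploiting symmetry only improves the constant factor.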
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #45 from Steven Bosscher 2013-03-11
09:40:18 UTC ---
Patches posted:
* Restrict GIMPLE loop invariant code motion of loop-invariant loads and
stores to loops with fewer memory references than a certain maximum that
is contro
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #44 from Steven Bosscher 2013-03-09
17:25:46 UTC ---
Created attachment 29628
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29628
Re-use store register if possible
This patch resolves the issue mentioned in comment #43
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #43 from Steven Bosscher 2013-03-09
14:57:52 UTC ---
The problem with combine is only "collateral damage" from what dse1 is
doing to this function. It's loading stored values into registers and
replacing re-loads with those re
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #42 from rguenther at suse dot de
2013-03-08 09:22:39 UTC ---
On Thu, 7 Mar 2013, steven at gcc dot gnu.org wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
>
> --- Comment #38 from Steven Bosscher 2013-03-07
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #41 from Richard Biener 2013-03-08
09:13:53 UTC ---
Created attachment 29616
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29616
make LIM work per outermost loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #40 from Richard Biener 2013-03-08
09:12:38 UTC ---
(In reply to comment #31)
> (In reply to comment #30)
> Hmm, RTL PRE isn't really mine either, but I probably know it as well as
> anyone else, so I will have a look. It's pr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #39 from Steven Bosscher 2013-03-07
23:18:48 UTC ---
Memory usage is still pathetic. Some stats:
mem stats from /proc/self/statm on *entry* of pass:
pass (#) size resident
*warn_unused_re
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #38 from Steven Bosscher 2013-03-07
22:15:39 UTC ---
Created attachment 29612
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29612
Punt on loops with more memory references than LIM can handle
For the LIM problem, I'm t
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #37 from Sergei Steshenko 2013-03-07
21:47:52 UTC ---
(In reply to comment #35)
> (In reply to comment #34)
> > Memory consumption appears to be the same as with -O2.
>
> Can you measure the peak memory with time?
>
> /u
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #36 from Steven Bosscher 2013-03-07
17:33:28 UTC ---
(In reply to comment #29)
> Yeah, one of my minor TLC patches. Most of the excessive memory
> usage for regular testcases can be fixed by doing LIM on
> all siblings of the
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #35 from Steven Bosscher 2013-03-07
17:30:58 UTC ---
(In reply to comment #34)
> Memory consumption appears to be the same as with -O2.
Can you measure the peak memory with time?
/usr/bin/time -f 'real=%e user=%U system=%
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #34 from Sergei Steshenko 2013-03-07
17:13:42 UTC ---
Somehow, with -O3 LLVM clang works a little bit faster than with -O2 - 54
minutes instead of 58 minutes, though this might be a random variation:
"
sergei@amdam2:~/gcc_bu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #33 from rguenther at suse dot de
2013-03-07 10:14:53 UTC ---
On Thu, 7 Mar 2013, steven at gcc dot gnu.org wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
>
> Steven Bosscher changed:
>
>What
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #32 from Sergei Steshenko 2013-03-07
10:13:40 UTC ---
(In reply to comment #26)
> (In reply to comment #23)
> > FYI, the original file (
> > http://gcc.gnu.org/bugzilla/attachment.cgi?id=17377 )
> > can be compiled with 'clan
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
Steven Bosscher changed:
What|Removed |Added
Status|NEW |ASSIGNED
AssignedTo|un
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #30 from rguenther at suse dot de
2013-03-07 08:52:52 UTC ---
On Thu, 7 Mar 2013, steven at gcc dot gnu.org wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
>
> --- Comment #27 from Steven Bosscher 2013-03-07
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #29 from rguenther at suse dot de
2013-03-07 08:47:35 UTC ---
On Thu, 7 Mar 2013, steven at gcc dot gnu.org wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
>
> --- Comment #25 from Steven Bosscher 2013-03-07
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #28 from rguenther at suse dot de
2013-03-07 08:44:28 UTC ---
On Wed, 6 Mar 2013, steven at gcc dot gnu.org wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
>
> Steven Bosscher changed:
>
>What
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #27 from Steven Bosscher 2013-03-07
08:09:59 UTC ---
Compilation finished after ~3 hours and consuming at least 3GB (from top - I
forgot to use memmax2...).
Winners in the "geez, I'm slow for this test case" list:
PRE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #26 from Steven Bosscher 2013-03-07
00:26:56 UTC ---
(In reply to comment #23)
> FYI, the original file ( http://gcc.gnu.org/bugzilla/attachment.cgi?id=17377 )
> can be compiled with 'clang', albeit slowly:
...
> Memory consu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #25 from Steven Bosscher 2013-03-07
00:08:26 UTC ---
(In reply to comment #24)
> (In reply to comment #22)
> > 4.8.0 -O2 (terminated after 9 minutes waiting, LIM being the offender, I
> > suspect domwalk ...) >2.5GB
> >
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
Steven Bosscher changed:
What|Removed |Added
CC||steven at gcc dot gnu.org
---
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #23 from Sergei Steshenko 2013-03-06
16:49:51 UTC ---
FYI, the original file ( http://gcc.gnu.org/bugzilla/attachment.cgi?id=17377 )
can be compiled with 'clang', albeit slowly:
"
sergei@amdam2:~/gcc_bug> time ~/AFSWD/instal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #22 from Richard Biener 2013-03-06
11:38:07 UTC ---
4.7.2 -O0 25s 2189981kB
integrated RA : 8.96 (35%) usr 0.89 (28%) sys 9.89 (34%) wall
206439 kB (16%) ggc
reload : 2.98 (12%) usr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
Steven Bosscher changed:
What|Removed |Added
CC|gcc-bugs at gcc dot gnu.org |
Blocks|
--- Comment #20 from rguenth at gcc dot gnu dot org 2009-03-17 13:09
---
Btw, it looks like internal IRA data structures can be shrunk considerably on
64-bit platforms by avoiding padding between pointer and integer members.
--
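A minimal sketch of the padding effect described in comment #20, using made-up structures rather than any actual IRA type: on a typical LP64 ABI (8-byte pointers, 4-byte int), interleaving pointer and int members forces padding after each int, while grouping members of the same size avoids it. The struct and function names below are hypothetical.

```c
#include <stddef.h>

/* Hypothetical layout, not an actual IRA structure: each int is followed
   by a pointer, so on LP64 the compiler inserts 4 bytes of padding after
   'a' and 4 bytes of tail padding after 'b'. */
struct interleaved {
    void *first;
    int a;
    void *second;
    int b;
};

/* Same members, with the pointers first and the two ints adjacent, so the
   ints share one 8-byte slot and no padding is needed on LP64. */
struct grouped {
    void *first;
    void *second;
    int a;
    int b;
};

size_t interleaved_size(void) { return sizeof(struct interleaved); }
size_t grouped_size(void) { return sizeof(struct grouped); }
```

On x86_64 Linux this gives 32 bytes for the interleaved layout and 24 bytes for the grouped one; on ABIs where pointers and int have the same size the two layouts are the same size.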
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #19 from rguenth at gcc dot gnu dot org 2009-03-17 12:58
---
For trunk -O1 I see
CPU: AMD64 processors, speed 1000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
of 0x00 (No unit mask) count 10
samples %symbol na
--- Comment #18 from rguenth at gcc dot gnu dot org 2009-03-17 12:42
---
Vlad, for the second testcase I see
-O0:
expand: 0.78 (19%) usr 0.04 ( 5%) sys 0.83 (16%) wall
44335 kB (49%) ggc
integrated RA : 1.05 (25%) usr 0.03 ( 4%) sys 1.13 (22%) w
--- Comment #17 from rguenth at gcc dot gnu dot org 2009-03-17 11:05
---
Created an attachment (id=17476)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17476&action=view)
the other testcase
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
--- Comment #16 from sergstesh at yahoo dot com 2009-03-03 14:15 ---
(In reply to comment #15)
> Subject: Re: Segmentation fault with -O1, out of memory
> with -O2
>
> On Tue, 3 Mar 2009, sergstesh at yahoo dot com wrote:
>
> > --- Comment #14 from sergstesh at yahoo dot com 200
--- Comment #15 from rguenther at suse dot de 2009-03-03 13:48 ---
Subject: Re: Segmentation fault with -O1, out of memory
with -O2
On Tue, 3 Mar 2009, sergstesh at yahoo dot com wrote:
> --- Comment #14 from sergstesh at yahoo dot com 2009-03-03 13:36 ---
> 'spiral' has pro
--- Comment #14 from sergstesh at yahoo dot com 2009-03-03 13:36 ---
'spiral' has produced another testcase which segfaults with -O2 - the original
testcase segfaults with -O1.
The testcase, though it has half the points in terms of FFT, is big as a file:
-rw-r--r-- 1 sergei users 165641
--
pinskia at gcc dot gnu dot org changed:
What|Removed |Added
Severity|major |normal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39326
50 matches