https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Jakub Jelinek changed:
What|Removed |Added
Target Milestone|5.5 |6.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Jakub Jelinek changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Richard Biener changed:
What|Removed |Added
Target Milestone|5.4 |5.5
--- Comment #65 from Richard Biener
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Jeffrey A. Law changed:
What|Removed |Added
CC||law at redhat dot com
Summar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Richard Biener changed:
What|Removed |Added
Priority|P1 |P2
--- Comment #56 from Richard Biener
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #55 from Jan Hubicka ---
Author: hubicka
Date: Fri Apr 3 18:09:13 2015
New Revision: 221859
URL: https://gcc.gnu.org/viewcvs?rev=221859&root=gcc&view=rev
Log:
PR ipa/65076
* ipa-inline.c (edge_badness): Add combined size to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #54 from Jan Hubicka ---
I have a full set of Firefox Talos benchmarks with inline-unit-growth bumped back
to 30 (by accident I did not test the default value, but I am running it now). We
now get back the GCC 4.9 performance on dromaeo_dom/
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #53 from Jan Hubicka ---
This patch makes the denominator use the resulting function size (not the uninlined
time as 4.9 did, but it brings the resulting fraction closer to 4.9 style):
Index: ../../gcc/ipa-inline.c
=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #52 from Jan Hubicka ---
$ time /aux/hubicka/trunk-install/bin/g++ -Ofast -fpermissive --param
large-function-insns=1 tramp3d-v4.ii -w ; ./a.out -n 3
real 0m34.232s
user 0m33.729s
sys 0m0.532s
i = 1 t = 0.00209225 dt =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #51 from Markus Trippelsdorf ---
(In reply to rguent...@suse.de from comment #48)
>
> Maybe we regressed optimizing GCC itself? (does not bootstrapping
> but compiling gcc 5 with gcc 4.9 improve things?)
No, gcc configured with "--d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #49 from Jan Hubicka ---
I did some experiments about the increase of early inlining insns:
- Early optimizers of both 4.9 and mainline process 9819 functions.
- At release_ssa time, the statement count is 8%
- at ipa-cp, we have
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #48 from rguenther at suse dot de ---
On Tue, 31 Mar 2015, trippels at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
>
> --- Comment #46 from Markus Trippelsdorf ---
> (In reply to Jan Hubicka from com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #47 from rguenther at suse dot de ---
On Tue, 31 Mar 2015, hubicka at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
>
> --- Comment #42 from Jan Hubicka ---
> Sorry, accidental message.
>
> It is 69->
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #46 from Markus Trippelsdorf ---
(In reply to Jan Hubicka from comment #45)
> > Like Richard wrote in comment 38 it is "phase opt and generate" that
> > regresses
>
> Yes, but is it regression because of one specific pass shown later
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #45 from Jan Hubicka ---
> Like Richard wrote in comment 38 it is "phase opt and generate" that regresses
Yes, but is it a regression because of one specific pass shown later, or is it
just a cumulative effect of many little slowdowns?
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #44 from Markus Trippelsdorf ---
(In reply to Jan Hubicka from comment #43)
> Markus, can you reproduce some consistent growth in -ftime-report for one of
> passes? Given that code size difference is solved (please try to double
> che
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #43 from Jan Hubicka ---
Markus, can you reproduce some consistent growth in -ftime-report for one of the
passes? Given that the code size difference is solved (please try to double check
that, we may have slightly different revisions of tram
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #42 from Jan Hubicka ---
Sorry, accidental message.
It is 69->80.5s between 141127.61083 and 150113.26056 (tester was down)
66->69s between 141123.15275 and 141124.01653
60->64 between 140807.80282 and 140808.66762
Now t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #41 from Jan Hubicka ---
OK. I can actually look it up in raw files.
It is: 69s->80s between
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #40 from Jan Hubicka ---
-O3 graph http://gcc.opensuse.org/c++bench/tramp3d/split-build.html seems to
show 3 bigger increases recently. Can we get the revisions for those?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #39 from Jan Hubicka ---
Hi, yep, -Os or flatten is unchanged. It seems something regressed with -O3
inline decisions, but it is somewhat hard to pinpoint. I am on my way to
Victoria, so I will do more only tonight.
https://gcc.gnu.org/
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #38 from Richard Biener ---
It looks like compile time with -Dleafify=flatten is basically unchanged. So it
is definitely different inlining decisions for tramp3d-v4.cpp. Maybe we inline
a lot more early now (due to early-insn param cha
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #37 from Stupachenko Evgeny ---
(In reply to Richard Biener from comment #31)
> No negative effects seen. Update on the regression? P3->P1 before
> willfully downgrading later...
Compiled with "-Ofast -flto -funroll-loops -m32" and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #36 from Markus Trippelsdorf ---
(In reply to Markus Trippelsdorf from comment #35)
> Firefox LTO buildtime on ppc64le: 5:18.12 total vs. 4:48.85 total = 6.25%
Please ignore the Firefox buildtime comparison. It was a measuring error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #35 from Markus Trippelsdorf ---
POWER8 : 23.424 vs. 20.676 = 11.7316%
Firefox LTO buildtime on ppc64le: 5:18.12 total vs. 4:48.85 total = 6.25%
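As a rough check of the POWER8 figure above: the quoted percentage appears to be the time difference taken relative to the first (slower) number. The sketch below reproduces that arithmetic. The function name is hypothetical, and which compiler produced which timing is an assumption not stated in this message.

```python
def relative_difference_percent(first_s: float, second_s: float) -> float:
    """Return how much shorter second_s is than first_s, as a percent of first_s."""
    return (first_s - second_s) / first_s * 100.0


# POWER8 timings quoted in the message above, in seconds.
# Assumption: the first value is the slower build, the second the faster one.
power8 = relative_difference_percent(23.424, 20.676)
print(f"POWER8: {power8:.4f}%")
```

Dividing by the second number instead would give a somewhat larger percentage, so it helps to state the baseline when comparing compile times.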
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #34 from Markus Trippelsdorf ---
(In reply to rguent...@suse.de from comment #33)
> On Tue, 31 Mar 2015, trippels at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
> >
> > --- Comment #32 from Markus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #33 from rguenther at suse dot de ---
On Tue, 31 Mar 2015, trippels at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
>
> --- Comment #32 from Markus Trippelsdorf ---
> (In reply to Richard Biener from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #32 from Markus Trippelsdorf ---
(In reply to Richard Biener from comment #31)
> No negative effects seen. Update on the regression? P3->P1 before
> willfully downgrading later...
It depends on the target machine. On amdfam10 it is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Richard Biener changed:
What|Removed |Added
Priority|P3 |P1
--- Comment #31 from Richard Biener
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #30 from Jan Hubicka ---
Author: hubicka
Date: Mon Mar 30 02:00:56 2015
New Revision: 221769
URL: https://gcc.gnu.org/viewcvs?rev=221769&root=gcc&view=rev
Log:
PR ipa/65076
* ipa-inline.c (edge_badness): Base denominator on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #29 from Jan Hubicka ---
I also tested with -Os, and compile times seem about the same as for 4.9, modulo
noise.
The following one-liner brings the instruction and function counts in the final
binary to the same values as in 4.9:
Index: ipa-inline.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #28 from Markus Trippelsdorf ---
Yes, it is still 15% on my machine, too:
markus@x4 ~ % time g++ -w -Ofast tramp3d-v4.cpp
g++ -w -Ofast tramp3d-v4.cpp 25.45s user 0.33s system 99% cpu 25.832 total
(At least this is still faster tha
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #27 from Jan Hubicka ---
Unfortunately, for me it went down from about 18% to 15%, so it is still a
regression. One quantitative parameter I can measure is the increase in the number
of functions in the resulting binary from 1030 to 1140. I will try
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #26 from Richard Biener ---
So how is the compile-time regression now?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #25 from Jan Hubicka ---
The memory use report:
rtl.c:317 (copy_rtx)9610680: 1.6% 0:
0.0% 0: 0.0% 0: 0.0% 401870
tree.c:8281 (build_method_type_directly)28395
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #24 from Jan Hubicka ---
Author: hubicka
Date: Fri Mar 27 04:02:28 2015
New Revision: 221719
URL: https://gcc.gnu.org/viewcvs?rev=221719&root=gcc&view=rev
Log:
PR ipa/65076
* passes.def: Add pass_nothrow.
* ipa-pure-cons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #23 from Jan Hubicka ---
Also, with early-inlining-insns=11 the statement count is smaller for mainline
(compared to 4.9) until the bswap pass; it grows in PRE (by about 1%) and
then it continues growing with subsequent passes. So
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Jan Hubicka changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #21 from Jan Hubicka ---
Actually, looking at the code, I do not think we want the full pure/const pass
(which builds loops and attempts to prove finiteness). We only want to run
nothrow discovery, which is a lot cheaper, and perhaps we want t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Jan Hubicka changed:
What|Removed |Added
CC||enkovich.gnu at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #19 from Richard Biener ---
I've meanwhile disabled pass_chkp_instrumentation_passes by default.
The fixup_cfg in pass_build_ssa_passes doesn't do anything on tramp3d
(and bootstrap/testing shows no issue removing it either).
Adding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #18 from rguenther at suse dot de ---
On Wed, 25 Mar 2015, hubicka at ucw dot cz wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
>
> --- Comment #17 from Jan Hubicka ---
> > > Even though the inline decisions does not s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #17 from Jan Hubicka ---
> > Even though the inline decisions does not seem to be changed considerably
> > (at least on tramp3d).
>
> Yeah, clobbers don't account for anything for size/inline estimates
> (well, I hope so!).
Yep, the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #16 from Jan Hubicka ---
The chkp stuff is IMO a bit problematic. I was thinking about cutting the
optimization queue but was always hesitant to do so because of the cache
locality and other implications. I am not sure if that was consi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment #15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Richard Biener changed:
What|Removed |Added
CC||rguenth at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #13 from Richard Biener ---
(In reply to Jan Hubicka from comment #10)
> I can re-confirm the 16% compile time regression. I went through some
> compare.
>
> $ wc -l *.ssa
> 299231 tramp3d-v4.ii.015t.ssa
> $ wc -l ../5/*.ssa
> 33111
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #12 from Jan Hubicka ---
Also the number of statements is about the same at .cfg dump, so it is .ssa
that introduces all the differences. Why?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #11 from Jan Hubicka ---
Sorry, the number of clobbers drops at DSE1, not during ehcleanup2, I just
messed up my grep.
I tried the following patch:
Index: passes.def
=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #10 from Jan Hubicka ---
I can re-confirm the 16% compile time regression. I went through some compare.
$ wc -l *.ssa
299231 tramp3d-v4.ii.015t.ssa
$ wc -l ../5/*.ssa
331115 ../5/tramp3d-v4.ii.018t.ssa
so as a lame compare, we alre
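The line-count comparison above can be sketched in a few lines of Python. This is only an illustration of the `wc -l` comparison: the helper names are hypothetical, and treating the 299231-line dump as the older baseline is an assumption based on the `../5/` path of the other dump.

```python
from pathlib import Path


def count_lines(path: str) -> int:
    """Count newline characters in a file, which is what `wc -l` reports."""
    total = 0
    with Path(path).open("rb") as handle:
        # Read in 1 MiB chunks so large dump files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            total += chunk.count(b"\n")
    return total


def growth_percent(old: int, new: int) -> float:
    """Percent growth from old to new, relative to old."""
    return (new - old) / old * 100.0


# Line counts quoted in the message above.
ssa_growth = growth_percent(299231, 331115)
print(f".ssa dump growth: {ssa_growth:.1f}%")
```

With real dump files at hand, `count_lines` could be called on each `.ssa` file and the two results passed to `growth_percent`.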
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #9 from Jan Hubicka ---
Concerning Comment #7, I do not think the sreal refactoring screwed things up.
Sreals are not high in the profile and the generated code is not worse
(performance-wise). It is not better, but it is not a surprise -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #8 from Richard Biener ---
The "culprit" basically refactors things and in the process screws
code-generation with sreals?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #7 from Richard Biener ---
(In reply to Richard Biener from comment #6)
> I wonder where the main _int_malloc load comes from.
To answer this question myself, 84% of the _int_malloc load comes from calling
malloc, of which 55% comes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #6 from Richard Biener ---
I wonder where the main _int_malloc load comes from.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #5 from Markus Trippelsdorf ---
Perf shows:
Overhead Command Shared Object Symbol
2.45% cc1plus libc-2.21.90.so [.] _int_malloc
1.88% cc1plus cc1plus [.] bitmap_find_bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #4 from Markus Trippelsdorf ---
markus@x4 ~ % g++ -ftime-report -Ofast -w tramp3d-v4.cpp
Execution times (seconds)
phase setup : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
1419 kB ( 0%) ggc
phase parsing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #3 from Jan Hubicka ---
Yep, I looked into this somewhat while preparing the patch. With the new metric we
manage to do a lot more inlining before hitting the limits. This is a kind of
positive effect - clearly the inliner does things that pays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
--- Comment #2 from Markus Trippelsdorf ---
(In reply to Richard Biener from comment #1)
> So it's either time spent in the inliner (unlikely, though the patch has an
> extra update_callee_keys call) or different (early) inlining decisions.
>
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Richard Biener changed:
What|Removed |Added
Target Milestone|--- |5.0
--- Comment #1 from Richard Biener