> Please also prioritize fixing P1s and avoid pushing in risky
> fixes for P2s that might end up causing regressions. We are still
> seeing way too many changes in the IPA area (hi Honza!).
Hello :)
GCC 5 is indeed a busy release from the IPA point of view. Here is a quick summary
of what is still on m
> As is shown here:
>
> https://gcc.gnu.org/ml/gcc-testresults/2015-03/msg03014.html.
Hello,
I believe this should be another symptom of the bug fixed by r221718.
The earlier revision exposed a problem that the edge may be moved from
indirect edges to direct edges and the outer walk will end up confu
> I believe this should be another symptom of bug fixed by r221718.
> The earlier revision exposed a problem that the edge may be moved from
> indirect edges to direct edges and the outer walk will end up confused.
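The hazard described in the quoted text (an edge moving from one list to another while an outer loop is still walking the first list) can be sketched with a toy model. This is only an illustration of the general problem, not GCC's cgraph code and not the actual fix in r221718; the names `Edge`, `Node`, `make_direct`, `naive_walk` and `safe_walk` are all hypothetical.

```python
class Edge:
    """A toy call edge living in a singly linked list (hypothetical model)."""
    def __init__(self, name, resolvable=False):
        self.name = name
        self.resolvable = resolvable  # True if the indirect call can be made direct
        self.next = None


class Node:
    """Toy caller holding separate lists of indirect and direct edges."""
    def __init__(self):
        self.indirect = None
        self.direct = None


def link(*edges):
    """Chain the given edges together and return the list head."""
    for a, b in zip(edges, edges[1:]):
        a.next = b
    return edges[0]


def build():
    node = Node()
    node.direct = link(Edge("d1"))
    node.indirect = link(Edge("i1"), Edge("i2", resolvable=True), Edge("i3"))
    return node


def make_direct(node, edge):
    """Unlink EDGE from the indirect list and push it onto the direct list."""
    if node.indirect is edge:
        node.indirect = edge.next
    else:
        prev = node.indirect
        while prev.next is not edge:
            prev = prev.next
        prev.next = edge.next
    edge.next = node.direct
    node.direct = edge
    edge.resolvable = False


def naive_walk(node):
    """Walk the indirect list, reading e.next only after processing e."""
    visited = []
    e = node.indirect
    while e is not None:
        visited.append(e.name)
        if e.resolvable:
            make_direct(node, e)
        e = e.next  # after a move this points into the direct list
    return visited


def safe_walk(node):
    """Same walk, but remember the successor before the edge can move."""
    visited = []
    e = node.indirect
    while e is not None:
        nxt = e.next
        visited.append(e.name)
        if e.resolvable:
            make_direct(node, e)
        e = nxt
    return visited


print(naive_walk(build()))  # the walk wanders into the direct list and skips i3
print(safe_walk(build()))   # every indirect edge is visited exactly once
```

In the naive version the walker follows the moved edge into the other list, which is one way an outer walk can "end up confused"; saving the successor first is one common way to avoid that.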
https://gcc.gnu.org/ml/gcc-testresults/2015-03/msg03059.html reports success
with
> which shows how the global objects initialization keeps things live.
> Early optimization turns it into
>
> (static initializers for t.C) ()
> {
> :
> NotUsedObject._vptr.CObject = &MEM[(void *)&_ZTV7CObject + 16B];
> return;
>
> }
>
> but we don't have any pass removing stores to global
> > which shows how the global objects initialization keeps things live.
> > Early optimization turns it into
> >
> > (static initializers for t.C) ()
> > {
> > :
> > NotUsedObject._vptr.CObject = &MEM[(void *)&_ZTV7CObject + 16B];
> > return;
> >
> > }
> >
> > but we don't have any pass r
> On Tue, Apr 7, 2015 at 9:45 AM, Ilya Palachev wrote:
> > In the mentioned README file it is said that " In order to collect this
> > profile, you will need to have an Intel CPU that have last branch record
> > (LBR) support." Is this information obsolete? Chrome Canary builds use
> > AutoFDO for
> LBR is used for both cfg edge profiling and indirect call Target value
> profiling.
I see, that makes sense ;) I guess if we want to support profile collection
on targets w/o this feature we could still use one of the algorithms that
try to guess edge profile from BB profile.
Honza
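As a rough illustration of what recovering an edge profile from a BB profile can look like, here is a minimal sketch based only on flow conservation: if all but one edge into (or out of) a block is known, the last one follows from the block count. It is an assumption-laden toy, not any particular published algorithm and not what AutoFDO or GCC does; the function name `infer_edge_counts` and the example CFG are made up, and edges that flow conservation alone cannot determine are simply left out (a real algorithm would have to guess those).

```python
def infer_edge_counts(bb_count, edges):
    """Derive edge counts from basic-block counts where flow conservation
    determines them uniquely.

    bb_count: dict mapping block name -> execution count
    edges:    list of (src, dst) pairs
    Assumes the entry block has no predecessors and the exit block has no
    successors, so their missing side is never used.
    """
    known = {}
    changed = True
    while changed:
        changed = False
        for block, count in bb_count.items():
            # side 0: edges leaving the block, side 1: edges entering it
            for side in (0, 1):
                incident = [e for e in edges if e[side] == block]
                unknown = [e for e in incident if e not in known]
                if len(unknown) == 1:
                    rest = sum(known[e] for e in incident if e in known)
                    known[unknown[0]] = count - rest
                    changed = True
    return known


# A simple diamond: entry -> cond -> {then, else} -> join -> exit
bb_count = {"entry": 100, "cond": 100, "then": 30,
            "else": 70, "join": 100, "exit": 100}
edges = [("entry", "cond"), ("cond", "then"), ("cond", "else"),
         ("then", "join"), ("else", "join"), ("join", "exit")]

for edge, count in sorted(infer_edge_counts(bb_count, edges).items()):
    print(edge, count)
```

On this diamond every edge is determined by the block counts; with more complicated control flow some edges stay ambiguous, which is where the guessing comes in.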
> On Fri, Apr 10, 2015 at 11:18:39AM -0400, Trevor Saunders wrote:
> > On Fri, Apr 10, 2015 at 03:59:19PM +0200, Toon Moene wrote:
> > > Like this:
> > >
> > > https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg01086.html
> > >
> > > ODR rears its head again ...
> >
> > huh, why is c/c-lang.h ge
> > The patch applied cleanly - this is what I got as a result:
> >
> > https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg01450.html
> >
> > I hope this is useful.
>
> ok, so the problem would seem to be graphite-scop-detection.c is
> including front end specific headers. Can you put a #error i
> You can use dump_gcov to show a text version of the profile dump and
> check if the profile data makes sense. If your program is just a very
> tight single loop, the current implementation in trunk may not yield
> good results because it does not have discriminator support. Try the
> google-4_9 b
> > It converts with the attached patches, but there's still some problem
> > parsing the data:
> >
> > % ./create_gcov -binary loop -gcov_version 1 -gcov loop.gcda -gcov_version
> > 0x500e
> > % gcc50 -O2 -fprofile-use loop.c
> > loop.c:1:0: warning: '/home/andi/src/autofdo/loop.gcda' is version
Hi,
I am adding Vladimir and Richard to CC. I tried to solve a similar problem
with FP math years ago by having -mfpmath=sse,i387. The idea was to allow
use of i387 registers when SSE ones run out and possibly also model the fact
that Pentium4 had faster i387 additions than SSE additions. I also ha
Hi,
> Hello all,
>
> With gcc, does the fact that some branch results in a C++ exception
> affect the performance of a function when that exception branch
> isn't entered? In other words, does the presence of a throw affect
> the optimizer in any way?
EH handling is implemented in a way minimizin
> > Yes, it will. But it's not well tuned at all. I will start tuning it
> > if I have free cycles. It would be great if opensource community can
> > also contribute to this tuning effort.
>
> If you could outline portions of code which needs tuning, rewriting, that
> will help get started in thi
Hello,
as I mentioned yesterday on IRC, adding a check that only complete types
have TYPE_BINFO defined triggers on this type:
unsigned SI
size
unit size
align 32 symtab 0 alias set -1 canonical type 0x76d260a8
precision 32 min max
v
Hello,
A reminder about this year's Cauldron (http://gcc.gnu.org/wiki/cauldron2015).
The webpage was just updated with a list of accepted talks and BoFs as well
as a brief review of the schedule to let you plan your travel.
With 19 very relevant talks and 10 BoFs we will have a busy schedule.
> Honza,
>
> At the Cauldron meeting last week I mentioned that I wasn't able to compile
> our "small" weather forecasting program with LTO.
>
> In the mean time I have read some bug reports with "multiple prevailing ..."
> errors, which made me try linking with the 'gold' linker - that worked.
>
Hello,
while running benchmarks for inliner tuning I also ran benchmarks
comparing -O2 and -O2 -ftree-vectorize -ftree-slp-vectorize using Martin
Liska's LNT setup (https://lnt.opensuse.org/). The results are
summarized below, but you can also see the colorful table produced
by Martin's LNT magic
> > Note that I benchmarked -ftree-slp-vectorize separately before and
> > results were hit/miss, so perhaps enabling only -ftree-vectorize would
> > give better compile time tradeoffs. I was worried of partial memory
> > stalls, but I will benchmark it and also benchmark difference between
> > cost
> On Mon, Jan 07, 2019 at 09:29:09AM +0100, Richard Biener wrote:
> > On Sun, 6 Jan 2019, Jan Hubicka wrote:
> > > Even though it is late in release cycle I wonder if we can do that for
> > > GCC 9? Performance of vectorization is very architecture specific, I
>
> On Wed, Jan 9, 2019 at 12:48 PM Richard Biener
> wrote:
> >
> > On Wed, Jan 9, 2019 at 10:46 AM Joern Wolfgang Rennecke
> > wrote:
> > >
> > > We've been running builds/regression tests for GCC 8.2 configured with
> > > --enable-checking=all, and have observed some failures related to
> > > gar
> Hi,
>
> On Mon, 15 Apr 2019, Martin Liška wrote:
>
> > There's a similar comparison that I did for the official openSUSE gcc
> > packages. gcc8 is built with PGO, while the gcc9 package is built in 2
> > different configurations: PGO, LTO, PGO+LTO (LTO used for FE in stage4,
> > for generato
>
> At least allow it to be built as part of the normal build like GMP,
> etc. are done.
> And include it in downloading using contrib/download_prerequisites
> like the libraries are done.
An annoying detail is that zstd builds with cmake, not autotools.
Honza
>
> Thanks,
> Andrew Pinski
>
> >
> >
> On 6/20/19 12:58 PM, Thomas Koenig wrote:
> > On 20.06.19 at 11:07, Martin Liška wrote:
> >> On the contrary, decompression
> >> of zstd with zlib will end with:
> >> lto1: internal compiler error: compressed stream: data error
> >
> > So generating object files on one system and trying to read
> > >> I should have been clearer about Darwin:
> > >>
> > >> collect2 is required because it wraps the calling of lto-wrapper and ld.
> > >>
> > >> FWIW Darwin also passes all the “-frepo” testcases, however, I’m not
> > >> aware of anyone actually
> > >> using case #2 from Jonathan’s post.
> > >
> On 6/21/19 2:34 PM, Richard Biener wrote:
> > On Fri, Jun 21, 2019 at 12:20 PM Martin Liška wrote:
> >>
> >> Hi.
> >>
> >> The patch is about a new ELF section that will contain information
> >> about LTO version. And for the future, used compression will be stored
> >> here. The patch removes s
> Hello.
> I have already sent a patch for expanding roundeven for i386 with
> relevant doubts. I also was regression testing with
> make -k check
> after successful bootstrap build with reverting my patches. Turns out
> do-check fails without any patches applied, Is it ok to do anyways for
> appli
>
> It's useful on targets without COMDAT support. Are there any such
> that we care about at this point?
>
> If the problem is the combination with LTO, why not just prohibit that?
The problem is that at the collect2 time we want to decide whether to
hold stderr/stdout of the linker. The issue
>
> > On 27 Jun 2019, at 19:21, Jan Hubicka wrote:
> >
> >>
> >> It's useful on targets without COMDAT support. Are there any such
> >> that we care about at this point?
> >>
> >> If the problem is the combination with LTO
times (such as ones
> > initialized in initialization functions).
> [ ... ]
> Just a note, I suspect most of the development community is currently
> focused on stage3 bugfixing rather than new code development. So
> replies may be limited.
>
> Jan Hubicka is probably the
> There doesn't seem to be a way to compare types at LTO time. The functions
> same_type_p and comptypes are front end only if I'm not totally confused
> (which is quite possible) and type_hash_eq doesn't seem to apply for
> structure types. Please, any advice would be welcome.
At LTO time it is b
> Hello.
>
> I've been working on a patch that would cope with target and optimization
> (read PerFunction)
> in a proper way. I came to following test-case (slightly modified
> ./gcc/testsuite/gcc.c-torture/execute/alias-1.c):
>
> int val;
>
> int *ptr = &val;
> float *ptr2 = &val;
>
> stati
Hi,
we are very pleased to invite you all to the GNU Tools Cauldron on 8-10 September
2017. This year we will meet again in Prague, at Charles University. Details
are here:
https://gcc.gnu.org/wiki/cauldron2017
As usual, please register (capacity is limited), send abstracts and ask
administrivia
> On 05/04/2017 08:31 AM, Jeff Law wrote:
> >On 05/04/2017 07:26 AM, Дмитрий Дьяченко wrote:
> >>Fedora 26 x86_64
> >>r247595
> >>
> >>~/src/gcc_current/configure --prefix=/usr/local/gcc_current
> >>--enable-static --enable-checking=no --enable-languages=c,c++,lto
> >>--enable-plugin --disable-mult
> > On 05/04/2017 08:31 AM, Jeff Law wrote:
> > >On 05/04/2017 07:26 AM, Дмитрий Дьяченко wrote:
> > >>Fedora 26 x86_64
> > >>r247595
> > >>
> > >>~/src/gcc_current/configure --prefix=/usr/local/gcc_current
> > >>--enable-static --enable-checking=no --enable-languages=c,c++,lto
> > >>--enable-plugi
> > I wonder how that patch can cause mismatches. Does it reproduce on one of
> > compile farm machines (my x86-64 bootstrap works fine so does ia64 on
> > terbium
> > after fixing the gcc 4.1 issue yesterday).
> > It would be great to have -fdump-ipa-inline-details dumps of the mismatching
> > run
Hi,
I found the problem. It was pretty obvious - we compute the sum of times twice:
once when computing statement sizes and a second time by summing the summaries.
Because sreal is not distributive, this leads to different results.
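To see why summing the same times along two different routes can disagree, here is a small sketch with a deliberately tiny mantissa. The helper `round_to_bits` is a hypothetical stand-in for a limited-precision real type; it is not GCC's sreal implementation, and the 4-bit precision and the sample times are assumptions chosen only to make the effect visible.

```python
import math


def round_to_bits(x, bits=4):
    """Round x to `bits` significant binary digits (toy limited-precision real)."""
    if x == 0:
        return 0.0
    mantissa, exponent = math.frexp(x)  # x == mantissa * 2**exponent
    return math.ldexp(round(mantissa * (1 << bits)), exponent - bits)


def add(a, b, bits=4):
    """Addition that rounds its result, as a limited-precision type would."""
    return round_to_bits(a + b, bits)


def running_total(times):
    """Route 1: accumulate every statement time into one running total."""
    total = 0.0
    for t in times:
        total = add(total, t)
    return total


def sum_of_summaries(groups):
    """Route 2: sum each group first, then add up the per-group summaries."""
    total = 0.0
    for group in groups:
        total = add(total, running_total(group))
    return total


times = [32.0] + [1.0] * 8
print(running_total(times))                   # small times are rounded away one by one
print(sum_of_summaries([[32.0], [1.0] * 8]))  # small times survive as a group
```

Both routes add up the same nine numbers, but with rounding after every step the grouping changes the result, so two computations that are equal on paper need not match.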
I have committed the following patch. Incrementally I will drop the
code d
Hello,
I would like to remind you of the upcoming GNU Tools Cauldron meeting in Prague
September 8-10. https://gcc.gnu.org/wiki/cauldron2017 (in 5 weeks!)
At present we have 83 registered participants and the room capacity
is a bit over 100 seats. If you want to participate, please regist
> Hi!
>
> As Honza told me recently, it has been proposed by Martin -- I don't know
> which one ;-) -- and certainly makes sense, to have another OMP
> (Offloading and Multi Processing) BoF at the GNU Tools Cauldron 2017,
> which is currently scheduled for Saturday, 11:00 to 11:45. That is, a
> g
> On Wed, Sep 13, 2017 at 3:46 PM, Jakub Jelinek wrote:
> > On Wed, Sep 13, 2017 at 03:41:19PM +0200, Richard Biener wrote:
> >> On its own -O3 doesn't add much (some loop opts and slightly more
> >> aggressive inlining/unrolling), so whatever it does we
> >> should consider doing at -O2 eventuall
> On Wed, Sep 13, 2017 at 3:21 AM, Michael Clark wrote:
> >
> >> On 13 Sep 2017, at 1:15 PM, Michael Clark wrote:
> >>
> >> - https://rv8.io/bench#optimisation
> >> - https://rv8.io/bench#executable-file-sizes
> >>
> >> -O2 is 98% perf of -O3 on x86-64
> >> -Os is 81% perf of -O3 on x86-64
> >>
>
> >I don't see static profile prediction to be very useful here to find
> >"really
> >hot code" - neither in current implementation or future. The problem of
> >-O2 is that we kind of know that only 10% of code somewhere matters for
> >performance but we have no way to reliably identify it.
>
> It
Hello,
all but one of the videos from this year's Cauldron have been edited and are now linked
from https://gcc.gnu.org/wiki/cauldron2017 (the plugins BoF will appear by the end
of the week).
I would also like to update the page with links to slides. If someone beats me
on this and adds some or all of them as attach
On 2017-10-28 09:28, Jeff Law wrote:
Jan,
What's the purpose behind calling vrp_meet and
extract_range_from_unary_expr from within the IPA passes?
AFAICT that is not safe to do. Various paths through those routines
will access static objects within tree-vrp.c which may not be
initialized whe
On 2018-02-05 18:44, Richard Biener wrote:
On February 5, 2018 12:26:58 PM GMT+01:00, Allan Sandfeld Jensen
wrote:
Hello GCC
In trying to make it possible to use LTO for distro-builds of Qt, I
have again
hit the problem of static libraries. In Qt in general we for LTO rely
on a
library bound
Hello,
> On Fri, Mar 2, 2018 at 10:24 AM, Hrishikesh Kulkarni
> wrote:
> > Hello everyone,
> >
> >
> > Thanks for your suggestions and engaging response.
> >
> > Based on the feedback I think that the scope of this project comprises of
> > following three indicative actions:
> >
> >
> > 1. Creatin
nhancement.
I would agree here - dumping pass summaries would be nice but we already have
that more or less. All IPA passes dump their summary into the beginning of their
dump file and I find that relatively sufficient to deal with mostly because
summaries are quite simple. It is much harder to deal w
> On Tue, Mar 6, 2018 at 11:12 AM, Martin Liška wrote:
> > Hello.
> >
> > Many significant changes have landed in mainline and will be released as GCC
> > 8.1.
> > I decided to use various GCC configs we have and test how there
> > configuration differ
> > in size and also binary size.
> >
> > Th
e
> for the cgraph in the file, etc.
>
> Basically while there's a lot of dumping infrastructure in GCC
> it may not always fit the needs of a LTO IL dumping tool 1:1
> and may need refactoring enhancement.
>
> Richard.
>
> >
> > Thanks,
> >
> > Hrishikes
> On Tue, Mar 6, 2018 at 4:02 PM, Jan Hubicka wrote:
> >> On Tue, Mar 6, 2018 at 2:30 PM, Hrishikesh Kulkarni
> >> wrote:
> >> > Hi,
> >> >
> >> > Thank you Richard and Honza for the suggestions. If I understand
> >> > c
Hello,
I have also re-done most of my firefox testing similar to the ones I published at
http://hubicka.blogspot.cz/2014/04/linktime-optimization-in-gcc-2-firefox.html
(thanks to Martin Liska who got LTO builds to work again)
I am attaching statistics on binary sizes. Interesting is that for firefox
> On 03/21/2018 10:26 AM, Richard Biener wrote:
> >On Tue, Mar 20, 2018 at 8:57 PM, Martin Liška wrote:
> >>Hi.
> >>
> >>I did similar stats for postgresql server, more precisely for pgbench:
> >>pgbench -s100 & 10 runs of pgbench -t1 -v
> >
> >Without looking at the benchmark probably only be
> On 05/30/2018 12:27 PM, Gerald Pfeifer wrote:
> >On Wed, 30 May 2018, Martin Sebor wrote:
> >>I think your r260956 is missing the following hunk:
> >
> >If this fixes the bootstrap for you (also ran into this myself
> >just now), can you please go ahead and commit?
> >
> >We can always sort out t
>
> Hi,
>
> I'm in the process of changing the vectorizer to consider all
> vector sizes as advertised by targetm.autovectorize_vector_sizes
> and to decide which one to use based on its cost model.
>
> I expect that to make sense for example when choosing between
> AVX128 and AVX256 since the l
> On Thu, Feb 17, 2011 at 08:35:26AM +, Jan Beulich wrote:
> > >>> On 16.02.11 at 21:04, "H. Peter Anvin" wrote:
> > > On 02/16/2011 11:22 AM, H.J. Lu wrote:
> > >> Hi,
> > >>
> > >> I updated x32 psABI draft to version 0.2 to change x32 library path
> > >> from lib32 to libx32 since lib32 i
> > According to Mozilla folks however REL+RELA scheme used by EABI leads
> > to significantly smaller libxul.so size
> >
> > According to http://glandium.org/blog/?p=1177 the difference is about 4-5MB
> > (out of approximately 20-30MB shared lib)
>
> This is orthogonal to x32 psABI.
Understood.
> On Thu, Feb 17, 2011 at 04:44:53PM +0100, Jan Hubicka wrote:
> > > > According to Mozilla folks however REL+RELA scheme used by EABI leads
> > > > to significantly smaller libxul.so size
> > > >
> > > > According to http://glandium.org/
> On Fri, Feb 18, 2011 at 12:11 AM, Jan Beulich wrote:
> >>>> On 17.02.11 at 18:59, "H.J. Lu" wrote:
> >> On Thu, Feb 17, 2011 at 8:11 AM, Jan Beulich wrote:
> >>>>>> On 17.02.11 at 16:49, "H.J. Lu" wrote:
> >>>
Hi,
I've copied mainline to the pretty-ipa branch, killing all the changes that were left
unmerged there.
I intend to use it for IPA and LTO related development to be merged at the next
stage1, immediately
for the inliner related cleanups we seem to be accumulating on the mailing
list.
I would like the
>With release of Xcode 3.2.6/4.0 this week, an unfortunate change was made
> to
> the darwin assembler which effectively breaks LTO support for darwin. The
> design
> of LTO on darwin was based on the fact that mach-o object files tolerated
> additional
> sections as long as they didn't con
>
> On Mar 13, 2011, at 8:38 AM, Jack Howarth wrote:
>
> > On Sun, Mar 13, 2011 at 12:39:26PM +0100, Jan Hubicka wrote:
> >>> With release of Xcode 3.2.6/4.0 this week, an unfortunate change was
> >>> made to
> >>> the darwin assembler w
> On Wed, 23 Mar 2011, Diego Novillo wrote:
>
> > Over at the PPH branch we are starting to re-use the LTO streaming
> > routines to save front end trees. Clearly, there are things that need
> > to be extended and/or replaced since LTO streaming assumes that we are
> > in GIMPLE. However, there
> Hi,
>
> On Mon, 18 Apr 2011, H.J. Lu wrote:
>
> > LTO bootstrap has been broken for more than a month:
>
> I was LTO bootstrapping yesterday just fine. You're talking about
> bootstrap-profiled.
I just fixed at least the inliner profiledbootstrap ICE. I will try to
profiledltobootstrap ne
> On Tue, Apr 19, 2011 at 7:01 AM, Richard Guenther
> wrote:
> > On Tue, Apr 19, 2011 at 3:51 PM, Eric Botcazou
> > wrote:
> >>> Hmpf. Strange. I've bootstrapped with all languages except Ada
> >>> yesterday, with gold as plugin-ld.
> >>
> >> GNU ld (with plugins) for me, but --enable-checking
> ../libdecnumber/libdecnumber.a ../libcpp/libcpp.a
> ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -lmpc -lmpfr
> -lgmp -rdynamic -ldl -L../zlib -lz
> lto1: internal compiler error: vector VEC(tree,base) index domain error,
> in evaulate_conditions_for_edge at ipa-inline-analys
> On Mon, May 30, 2011 at 9:16 PM, Diego Novillo wrote:
> > The new routines lto_output_int_in_range and lto_input_int_in_range do
> > not seem to be working right. In the pph branch, we have an LTO_tags
> > enum with a range [0 - 351]. This is causing two things:
> >
> > - The writer gets out o
> On Wed, Jun 22, 2011 at 3:25 PM, Ian Lance Taylor wrote:
> > "H.J. Lu" writes:
> >
> >> Apparently, there is no GCC maintainer for Linux/x86 platform. I have
> >> been working on GCC, as well as binutils and C libraries, for Linux/x86
> >> over 20 years. I ported GCC, binutils and the C libra
Hi,
> The worst part is that test coverage for this feature is
> extremely poor. It's very difficult to tell if any cleanup
> in this area is likely to introduce more bugs than it fixes.
>
> After 3 days fighting with this code, I had a bit of a
> cathartic whine on IRC. I got two votes to just
> On 07/25/2011 06:42 AM, Xinliang David Li wrote:
>> FYI the performance impact of this option with SPEC06 (built with
>> google_46 compiler and measured on a core2 box). The base line number
>> is FDO, and ref number is FDO + reorder_with_partitioning.
>>
>> xalancbmk improves> 3.5%
>> perlben
> In xalancbmk, with the partition option, most of object files have
> nonzero size cold sections generated. The text size of the binary is
> increased to 3572728 bytes from 3466790 bytes. Profiling the program
> using the training input shows the following differences. With
> partitioning, number
> On Wed, Aug 3, 2011 at 2:06 PM, Jan Hubicka wrote:
> >> In xalancbmk, with the partition option, most of object files have
> >> nonzero size cold sections generated. The text size of the binary is
> >> increased to 3572728 bytes from 3466790 bytes. Profiling the p
Also on the original topic, I know that Mozilla folks experimented with this
switch (and I do expect it should make a noticeable reduction in the hot section
footprint that is important for them). They are not using it at the moment
because of problems with their bug reporting tool not being able to do
Did you try using FDO with -Os? FDO should make hot code parts
optimized similar to -O3 but leave other pieces optimized for size.
Using FDO with -O3 gives you the opposite, cold portions optimized
for size while the rest is optimized for speed.
FDO with -Os still optimizes for size, even in hot
> +Mark who has done size optimization tuning with FDO.
>
> On Thu, Aug 4, 2011 at 7:05 AM, Mike Hommey wrote:
> > Hi,
> >
> > We (Mozilla) are trying to get the best of the ARM toolchain for our
> > Android build. I recently built an Android Native-code Development Kit
> > with GCC 4.6.1 and bin
On Fri 05 Aug 2011 09:32:05 AM CEST, Richard Guenther wrote:
On Thu, Aug 4, 2011 at 8:42 PM, Jan Hubicka wrote:
Did you try using FDO with -Os? FDO should make hot code parts
optimized similar to -O3 but leave other pieces optimized for size.
Using FDO with -O3 gives you the opposite
On Fri 05 Aug 2011 07:49:49 PM CEST, Xinliang David Li wrote:
On Fri, Aug 5, 2011 at 12:32 AM, Richard Guenther
wrote:
On Thu, Aug 4, 2011 at 8:42 PM, Jan Hubicka wrote:
Did you try using FDO with -Os? FDO should make hot code parts
optimized similar to -O3 but leave other pieces
> >
> > In a way I like the current scheme since it is simple and extending it
> > should IMO have some good reason. We could refine -Os behaviour without
> > changing current predicates to optimize for speed in
> > a) functions declared as "hot" by user and BBs in them that are not proved
> > cold
> On Fri, Aug 5, 2011 at 3:24 PM, Jan Hubicka wrote:
> >> >
> >> > In a way I like the current scheme since it is simple and extending it
> >> > should IMO have some good reason. We could refine -Os behaviour without
> >> > changing current pred
> On 09/09/2011 03:09 AM, Jakub Jelinek wrote:
>> Status
>> ==
>>
>> The trunk is in Stage 1, which, if we follow roughly the 4.6
>> release schedule, should end around end of October.
>> At this point I'd like to gather the status of the various
>> development branches that haven't been merged
> I'm committing the following test case that displays the bug. It does
> in fact pass with mainline, and does in fact fail with gcc 4.4.0.
>
> I spent two days trying to come up with some cleaner way to fix this bug
> than the inlinable flag you pass around, but to no avail. The only
> thing
> Could I convince you to have a look at the transactional-memory branch
> test libitm/testsuite/libitm.c++/eh-1.C? I'm getting
>
> z.c:36:1: error: edge void f1()->void* __cxa_allocate_exception(long
> unsigned int) has no corresponding call_stmt
> D.2114_4 = __cxa_allocate_exception (4);
>
>
> This merge brings in unit-at-a-time gimplification, so it needed some
> tweaking. Mostly, it helped to find out some GENERIC that was leaking
> into the streamer. A pleasant side-effect of the unit-at-a-time
> gimplification is that not every function is gimplified, so there is
> less gunk to p
> > struct cgraph_edge *edge = cgraph_edge (id->src_node,
> > orig_stmt);
> POINT_A
> > int flags;
> >
> > switch (id->transform_call_graph_edges)
> >{
> > case CB_CGE_DUPLICATE:
> >if (edge)
> >
> On 07/28/2009 10:44 AM, Richard Henderson wrote:
> >I guess I'll poke at cleaning this up today. I've got to
> >familiarize myself with how virtual clones work...
>
> The virtual clones that ipa-cp makes seems to be easy.
>
> My thought here is that since (virtual) clones don't
> have actual bo
> New constructs:
>
> { exc_ptr, filter } = EH_LANDING_PAD;
>
> Placeholder for the landing-pad rtl. Has 2 outputs
> for both the exception pointer and the filter.
Hmm, EH_LANDING_PAD will still need to be somewhat special (as moving it
across eh edge or something will change beha
Hi,
sorry for jumping in late, I had relatively urgent things to work on
and didn't have much time to think this over.
I am still having some problems understanding the plans on critical edge
splitting.
> EXC_PTR_EXPR and FILTER_EXPR will be expanded to take the EH
> region number as a parameter.
> On 08/10/2009 08:20 AM, Michael Matz wrote:
> >It's not that they _create_ side-effects, but they depend on some.
>
> Ah, fair enough. I hadn't actually thought that all through.
>
> >Btw, it's really wonderful that someone tackles EH-on-gimple ;-)
>
> I hadn't been planning on it, but my tra
>
> Yes. Although I'm streamlining things even more now. I've eliminated
> the "global" variables that store the excptr/filter, and instead each
> individual use location is asking for what it needs locally.
>
> Further, the actual landing pad itself is *not* generated in gimple.
> I had too ma
> On Wed, Sep 16, 2009 at 12:39 AM, Oliver Kellogg
> wrote:
> > Hi,
> >
> > Looking at ggc-page.c:175ff.,
> >
> > static const size_t extra_order_size_table[] = {
> > sizeof (struct var_ann_d),
> > sizeof (struct tree_decl_non_common),
> > sizeof (struct tree_field_decl),
> > sizeof (struct tr
> L.S.,
>
> On our weather forecasting code (compiled with -O3 -flto and linked with
> -O3 -flto -fwhole-program) I get a speedup of 65 seconds per time step
> in the model integration vs. 75 seconds with -O3 alone.
There is a bug that disables -fwhole-program in LTO compilations.
I hope to g
> > L.S.,
> >
> > On our weather forecasting code (compiled with -O3 -flto and linked with
> > -O3 -flto -fwhole-program) I get a speedup of 65 seconds per time step
> > in the model integration vs. 75 seconds with -O3 alone.
>
> There is bug making -fwhole-program disabled with LTO compilation
> > 2
> > _ZNKSt8ios_base6getlocEv
> > std::ios_base::getloc() const
> > version status: incompatible
> > GLIBCXX_3.4
> > type: function
> > status: added
> >
> > Are there very recent inlining changes?
>
> Yes.
This might be the patch I committed this morning. It would help if you
could just send m
> Anyway, as regards *which* specific functions are not inlined, I would
> say all the functions which break the ABI test as newly exported symbols
> should be checked, like the above, 'std::ios_base::getloc() const'. I'm
> attaching below a complete list, from my libstdc++.log, but I would
> guess
> Btw, that new comdat behavior is very well reasonable. In
> whole-program mode it should be the old one though.
It is another effect of the patch that in whole-program mode we make all
comdat functions static except for those having their address taken (so the
address must be the same from all modules).
I was
Hi,
thanks for the report! It is actually more promising than I've
expected. A while ago I did similar tests with whole-program and
--combine and we didn't get very consistent performance results (I also saw
code size reductions). I guess geomaverage will go down for specint
after vpr/gcc/perlbmk/g
> Richard, Jan,
>
> I'm confused. Consider this symbol:
>
> W
> _ZN9__gnu_cxx8__detail9__find_ifIPSt4pairIPNS_16bitmap_allocatorIcE12_Alloc_blockES6_ENS0_12_Functor_RefINS0_12_Ffit_finderIS6_EET_SD_SD_T0_
> version status: incompatible
> GLIBCXX_3.4
> type: function
> status:
> On Sat, Oct 10, 2009 at 12:55 PM, Toon Moene wrote:
> > Gcc's man page says:
> >
> > -finline-functions-called-once
> > Consider all "static" functions called once for inlining into
> > their caller even if they are not marked "inline". If a call
> > to a giv
> On Sat, Oct 10, 2009 at 02:31:25PM +0200, Jan Hubicka wrote:
> > My solution would be probably to pass -fdump-ipa-inline parameter to lto
> > compilation and read the log. It lists the inlining decisions and if
> > something is not inlined, you get dump of reason why.
> Richard Guenther wrote:
>
> >It'll be in /tmp and named after the first object file, in your case it
> >will
> >be ccGGS24.o.047i.inline (because the first object file will be a
> >tempfile). A minor inconvenience that maybe is going to be fixed.
>
> Now that Richard has pointed out to me whe
> On Wed, Nov 4, 2009 at 8:19 PM, Toon Moene wrote:
> > Jan,
> >
> > I had some time to study the example I sent you a couple of weeks ago.
> >
> > According to visible inspection of the source code, there are 5 functions
> > (subroutines in Fortran parlance) that are called once:
> >
> > MAIN c