https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62077

--- Comment #46 from Sven C. Dack <sven.c.dack at virginmedia dot com> ---
> ...
> > > >                 avg           stdev          %
> > > > default:    282.86s    0.56s, 0.20%    100.00% (base)
> > > > profiled:   255.76s    0.72s, 0.28%    +10.60%
> > > > lto:        282.80s    0.16s, 0.06%     +0.02%
> > > > lto-plugin: 285.41s    0.49s, 0.17%     -0.89%
> ...
> Not in my experience. As you've found out LTO is good for reducing 
> binary sizes, but it really needs PGO to make sensible decisions.

These are the new results (5 runs). I have dropped "lto-plugin" and added
"profiled-lto":

                  avg          stdev           %
default:      282.13s    0.71s, 0.25%    100.00% (base)
profiled:     255.29s    0.80s, 0.31%    +10.52%
lto:          281.71s    0.70s, 0.25%     +0.15%
profiled-lto: 251.38s    0.49s, 0.19%    +12.23%

gmp, mpfr and mpc have been compiled without LTO.
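
For anyone who wants to check the "%" column: it appears to be the speedup
relative to the default build, i.e. (base avg / avg - 1) * 100, and the
second stdev figure appears to be stdev / avg * 100. That reading is an
assumption inferred from the numbers above, not something stated in this
report. A minimal Python sketch with hypothetical helper names, using only
the rounded averages from the table:

```python
# Sketch only: the formulas are inferred from the table, not confirmed.
# Inputs are the rounded averages and standard deviations in seconds.

def speedup_percent(base_avg, avg):
    """Speedup relative to the base build, in percent."""
    return (base_avg / avg - 1.0) * 100.0

def relative_stdev_percent(stdev, avg):
    """Standard deviation as a percentage of the average."""
    return stdev / avg * 100.0

# (avg, stdev) per build configuration, copied from the table.
results = {
    "default": (282.13, 0.71),
    "profiled": (255.29, 0.80),
    "lto": (281.71, 0.70),
    "profiled-lto": (251.38, 0.49),
}

base_avg = results["default"][0]
for name, (avg, stdev) in results.items():
    print(f"{name + ':':<14}{avg:8.2f}s"
          f"{relative_stdev_percent(stdev, avg):8.2f}%"
          f"{speedup_percent(base_avg, avg):+9.2f}%")
```

Because the inputs are already rounded, the last digit can differ from the
table by 0.01 (the "profiled" row, for example).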

> BTW if you have TCMalloc installed on your machine, appending 
> "POSTSTAGE1_LDFLAGS += -l/usr/lib/libtcmalloc.so.4" to
> config/bootstrap-lto.mk
> may give you an additional 8-10% speed boost (at least for big C++ projects,
> I haven't measured kernel build times).

This is going off topic, and I do not mean any disrespect, but I did not use
TCMalloc in the test runs. You may want to look at "lockless malloc". It seems
to be better all round, because TCMalloc requires a very specific additional
library (libunwind-0.99-beta) and also has problems with glibc and locking.
"lockless malloc" is free of locking, very small, available as GPLv3 open
source and said to be faster than TCMalloc. See here:

http://locklessinc.com/benchmarks_allocator.shtml
