On Thu, Feb 16, 2012 at 11:12:06AM -0500, Bill Nottingham wrote:
> > Another common mistake when people compare the speed of GCC and LLVM
> > is to use -O2 for both compilers.  But the truth is that -O1 of
> > GCC is -O2 of LLVM in terms of code generation quality.  The
> > compiler speed of GCC with -O1 is the same as for LLVM with -O2.
> > You can find the latest comparison of LLVM and GCC on
> > http://vmakarov.fedorapeople.org/spec/ (see 2011 comparison at the
> > bottom of the left frame).
> 
> Speaking of potential magic bullets... is there any reason
> we don't enable auto-vectorization by default (with -O3, or with the
> assorted -f/-m flags?)

Auto-vectorization is enabled by default for -O3 if the chosen CPU
supports vector instructions (i.e. on x86_64 always, on i?86 only with -msse
or, better, -msse2; -mavx can make a big difference over -msse2 for both),
or it can be enabled manually (-O2 -ftree-vectorize).
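
As a rough illustration of those flags, here is a minimal C sketch of a loop
the vectorizer can usually handle.  The file name vec_demo.c and the function
saxpy are made up for this example, and -std=c99 and -fopt-info-vec are my
additions rather than anything from this thread (the latter only exists in
newer GCC releases).

```c
/* vec_demo.c (hypothetical file name): a simple loop for trying the
 * vectorizer flags discussed above.
 *
 *   gcc -std=c99 -O3 -c vec_demo.c                   vectorizer on via -O3
 *   gcc -std=c99 -O2 -ftree-vectorize -c vec_demo.c  enabled manually at -O2
 *   gcc -std=c99 -O3 -mavx -c vec_demo.c             allow AVX instructions
 *
 * On newer GCC releases, adding -fopt-info-vec asks the compiler to
 * report which loops it vectorized.
 */
#include <stddef.h>

/* Computes y[i] += a * x[i] for i in [0, n).
 * 'restrict' promises that x and y do not overlap, so the compiler does
 * not need a runtime overlap check before using vector instructions. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Whether a given loop is actually vectorized depends on the GCC version and the
target, so it is worth checking the compiler's report or the generated
assembly rather than assuming.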
Enabling it by default isn't a magic bullet: I believe most of the distro
code is cold code where -O3 or even -O2 -ftree-vectorize would enlarge the
code size too much, increase the cache footprint and not be a win in the end.
For performance-sensitive code, sure, enabling -O3 or -O2 -ftree-vectorize
is desirable, even better when accompanied by PGO (build with
-fprofile-generate, run the resulting application on some training data
(benchmark, testsuite, ...), then rebuild with -fprofile-use).
For just -O3 or -O2 -ftree-vectorize we could perhaps have some knob in
the spec files to request those extra flags; PGO really requires
some work from the packager (but e.g. bash/grep/awk, perhaps perl/python
etc. would definitely improve; gcc itself is already built with PGO).
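
The three PGO steps can be sketched on a toy program as follows.  This is only
an illustration under my own assumptions: the file name pgo_demo.c, the
function count_lines, the PGO_DEMO_MAIN macro and the input file training.txt
are all hypothetical, and a real package would use its own benchmark or
testsuite as training data.

```c
/* pgo_demo.c (hypothetical): toy workload for trying GCC's PGO flags.
 *
 * 1. Instrumented build:
 *      gcc -std=c99 -O2 -fprofile-generate -DPGO_DEMO_MAIN pgo_demo.c -o pgo_demo
 * 2. Training run, which writes .gcda profile data on exit:
 *      ./pgo_demo < training.txt
 * 3. Rebuild using the recorded profile:
 *      gcc -std=c99 -O2 -fprofile-use -DPGO_DEMO_MAIN pgo_demo.c -o pgo_demo
 */
#include <stdio.h>
#include <stddef.h>

/* Counts newline characters in buf[0..len).  How often the branch is
 * taken depends on the input, which is what the training run records. */
size_t count_lines(const char *buf, size_t len)
{
    size_t lines = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\n')
            lines++;
    return lines;
}

#ifdef PGO_DEMO_MAIN
/* Driver: reads stdin in chunks and prints the total line count. */
int main(void)
{
    char buf[4096];
    size_t n, total = 0;

    while ((n = fread(buf, 1, sizeof buf, stdin)) > 0)
        total += count_lines(buf, n);
    printf("%zu\n", total);
    return 0;
}
#endif
```

The training input should resemble real usage; a profile gathered from
unrepresentative data can steer the optimizer the wrong way.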

> - Is it not stable enough?

It is pretty stable.  Of course, -O2 -g is the usual default, so -O3 is
somewhat less tested than that, but many LFS distros are sometimes
built with -O3.

> - Does it not take effect often enough?

It takes effect pretty often and is improving in that, though of course
in many cases it could take effect even more often than now.  On the other
hand, the current cost model is not very precise and sometimes vectorization
slows things down instead of speeding them up, sometimes on some CPUs only.
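
One way to check whether vectorization helps a particular loop is to build it
both ways and time it on the CPUs you care about.  The sketch below is a
hypothetical candidate for such a test (the file name cost_demo.c and the
function row_sums are invented here): when cols is very small, the setup and
remainder handling around a vectorized inner loop may outweigh the gain.  I am
not claiming GCC gets this particular case wrong; it only shows the kind of
loop worth measuring.

```c
/* cost_demo.c (hypothetical): inner loop whose trip count is only known
 * at run time.  Build both variants and time them with your own harness:
 *
 *   gcc -std=c99 -O3 -c cost_demo.c
 *   gcc -std=c99 -O3 -fno-tree-vectorize -c cost_demo.c
 */
#include <stddef.h>

/* Sums each row of a rows x cols matrix stored in row-major order and
 * writes the result for row r to out[r].  The compiler cannot see how
 * large cols is, so its cost model has to guess whether vector code for
 * the inner loop pays off. */
void row_sums(size_t rows, size_t cols,
              const int *restrict m, int *restrict out)
{
    for (size_t r = 0; r < rows; r++) {
        int s = 0;
        for (size_t c = 0; c < cols; c++)
            s += m[r * cols + c];
        out[r] = s;
    }
}
```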

        Jakub
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
