> For peak, FDO is the most effective option. It can boost performance
> by 7-10% depending on the program. The options you suggested probably
> won't make too big a dent.  -funroll-loops can hurt performance
> without profiling.  More aggressive inlining, ipa-cp, unswitching etc

-funroll-loops overall was a 2.2% win on SPECint, -funroll-all-loops 2.5%, last
time I noted down the SPECint results for this (that was in 2003, heh :)
http://www.ucw.cz/~hubicka/papers/amd64/node4.html
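
For anyone following along who has not looked at the transformation itself,
here is a rough hand-written sketch in C of what unrolling a loop by 4 amounts
to. The function names are made up for illustration, and this is only an
approximation of the idea, not what GCC actually emits for -funroll-loops; the
real pass picks its own unroll factor and handles the leftover iterations its
own way.

```c
#include <stddef.h>

/* Plain loop: one add and one loop test per element. */
long sum_simple(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4 (hypothetical illustration, not compiler output):
   one loop test per four elements, plus a cleanup loop for the tail. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)  /* remaining 0..3 elements */
        s += a[i];
    return s;
}
```

Both functions return the same result; the unrolled one trades larger code for
fewer branches, which is why the effect can go either way depending on the
program.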

> enabled by O3 may help a little if there is any. -ffast-math won't
> help for integer benchmarks other than eon.  Traditionally, O3 helps
> FP performance because of the loop transformation enabled, but this
> won't be the case for gcc for now.

Function inlining definitely helps. -O3 also implies vectorization and other stuff.

Honza
> 
> Thanks,
> 
> David
> 
> On Mon, Nov 15, 2010 at 4:29 AM, Andrey Belevantsev <a...@ispras.ru> wrote:
> > Hello,
> >
> > On 14.11.2010 0:08, Xinliang David Li wrote:
> >>
> >> I re-measured the performance difference using trunk gcc and trunk
> >> clang/llvm on a core-2 box.  -fno-strict-aliasing is added to gcc
> >> because clang/llvm's type based aliasing is incomplete and not
> >> enabled by default. I also added -fomit-frame-pointer to clang/llvm as
> >> this is gcc's default. The base option is -O2.
> >
> > It would be very interesting to compare also peak numbers, i.e. with LTO and
> > strict aliasing enabled, as well as -O3 and -ffast-math/-funroll-loops,
> > similar to Vlad's or OpenSUSE's options.  Can you try to measure these?
> > Maybe you can also run SPEC2k6, if there is enough machine resources, but
> > that's probably asking too much...
> >
> > Andrey
> >
> >
