> For peak, FDO is the most effective option. It can boost performance
> by 7-10% depending on the program. The options you suggested probably
> won't make too big a dent. -funroll-loops can hurt performance
> without profiling. More aggressive inlining, ipa-cp, unswitching etc

-funroll-loops overall was a 2.2% win on SPECint, and -funroll-all-loops 2.5%,
the last time I noted down the SPECint results for this (that was in 2003, heh :)
http://www.ucw.cz/~hubicka/papers/amd64/node4.html

> enabled by O3 may help a little if there is any. -ffast-math won't
> help for integer benchmarks other than eon. Traditionally, O3 helps
> FP performance because of the loop transformation enabled, but this
> won't be the case for gcc for now. Function inlining definitely helps.

-O3 also implies vectorization and other stuff.

Honza

>
> Thanks,
>
> David
>
> On Mon, Nov 15, 2010 at 4:29 AM, Andrey Belevantsev <a...@ispras.ru> wrote:
> > Hello,
> >
> > On 14.11.2010 0:08, Xinliang David Li wrote:
> >>
> >> I re-measured the performance difference using trunk gcc and trunk
> >> clang/llvm on a core-2 box. -fno-strict-aliasing is added to gcc
> >> because clang/llvm's type based aliasing is not complete and not
> >> enabled by default. I also added -fomit-frame-pointer to clang/llvm as
> >> this is gcc's default. The base option is -O2.
> >
> > It would be very interesting to compare also peak numbers, i.e. with LTO and
> > strict aliasing enabled, as well as -O3 and -ffast-math/-funroll-loops,
> > similar to Vlad's or OpenSUSE's options. Can you try to measure these?
> > Maybe you can also run SPEC2k6, if there are enough machine resources, but
> > that's probably asking too much...
> >
> > Andrey
> >
> >