For peak numbers, FDO (feedback-directed optimization) is the most effective option. It can boost performance by 7-10%, depending on the program. The options you suggested probably won't make much of a dent. -funroll-loops can hurt performance when used without profile data. The more aggressive inlining, ipa-cp, loop unswitching, etc. enabled by -O3 may help a little, if at all. -ffast-math won't help the integer benchmarks other than eon. Traditionally, -O3 helps FP performance because of the loop transformations it enables, but that is not the case for gcc at the moment.
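For anyone who has not used FDO with gcc, here is a rough sketch of the usual three-step flow (instrumented build, training run, profile-guided rebuild). It only assembles the command lines. The helper name `fdo_commands`, the file names `foo.c` and `train.in`, and the `-O2` base are made up for illustration and are not the setup used for the measurements in this thread. `-fprofile-generate` and `-fprofile-use` are the standard gcc options.

```python
import shlex


def fdo_commands(sources, output="bench", base_flags=("-O2",), train_args=()):
    """Return the three command lines of a gcc FDO build (hypothetical helper)."""
    sources = list(sources)
    base = ["gcc", *base_flags]

    # Step 1: instrumented build; the resulting binary writes .gcda
    # profile files when it runs.
    instrument = base + ["-fprofile-generate", *sources, "-o", output]

    # Step 2: training run on a representative input to collect the profile.
    train = ["./" + output, *train_args]

    # Step 3: rebuild, letting gcc use the collected profile.
    optimize = base + ["-fprofile-use", *sources, "-o", output]

    return [instrument, train, optimize]


if __name__ == "__main__":
    # File names are placeholders, not part of any real benchmark setup.
    for cmd in fdo_commands(["foo.c"], train_args=["train.in"]):
        print(shlex.join(cmd))
```

The training input should resemble the real workload. Options such as -funroll-loops tend to behave better in step 3, where gcc has profile data to decide which loops are hot.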
Thanks,

David

On Mon, Nov 15, 2010 at 4:29 AM, Andrey Belevantsev <a...@ispras.ru> wrote:
> Hello,
>
> On 14.11.2010 0:08, Xinliang David Li wrote:
>>
>> I re-measured the performance difference using trunk gcc and trunk
>> clang/llvm on a core-2 box. -fno-strict-aliasing is added to gcc
>> because clang/llvm's type based aliasing is not incomplete and not
>> enabled by default. I also added -fomit-frame-pointer to clang/llvm as
>> this is gcc's default. The base option is -O2.
>
> It would be very interesting to compare also peak numbers, i.e. with LTO and
> strict aliasing enabled, as well as -O3 and -ffast-math/-funroll-loops,
> similar to Vlad's or OpenSUSE's options. Can you try to measure these?
> Maybe you can also run SPEC2k6, if there is enough machine resources, but
> that's probably asking too much...
>
> Andrey
>