On Thu, Feb 16, 2012 at 11:12:06AM -0500, Bill Nottingham wrote:
> > The another usual mistake when people compare speed of GCC and LLVM
> > is to use -O2 for the both compilers.  But the true is that -O1 of
> > GCC is -O2 of LLVM with the point of code generation quality.  The
> > compiler speed of GCC with -O1 is the same as for LLVM with -O2.
> > You can find the latest comparison of LLVM and GCC on
> > http://vmakarov.fedorapeople.org/spec/ (see 2011 comparison at the
> > bottom of the left frame).
>
> Speaking of potential magic bullets... is there any reason
> we don't enable auto-vectorization by default (with -O3, or with the
> assorted -f/-m flags?)

Auto-vectorization is enabled by default at -O3 if the chosen CPU supports
vector instructions (i.e. always on x86_64; on i?86 only with -msse or,
better, -msse2; -mavx can make a big difference over -msse2 on both), or
it can be enabled manually (-O2 -ftree-vectorize).

Enabling it by default isn't a magic bullet.  I believe most of the distro
code is cold code, where -O3 or even -O2 -ftree-vectorize would enlarge the
code size too much and increase the cache footprint, and so would not be a
win in the end.  For performance-sensitive code, sure, enabling -O3 or
-O2 -ftree-vectorize is desirable, even better when accompanied by PGO
(build with -fprofile-generate, run the resulting application on some
training data (benchmark, testsuite, ...), then rebuild with -fprofile-use).
For just -O3 or -O2 -ftree-vectorize we could perhaps have some knob in the
spec files to request those extra flags; PGO really requires some work from
the packager (but e.g. bash/grep/awk, perhaps perl/python etc. would
definitely improve; gcc itself is already built with PGO).

> - Is it not stable enough?

It is pretty stable.  Of course -O2 -g is the usual default, so -O3 is
somewhat less tested than that, but many LFS distros are sometimes built
with -O3.

> - Does it not take effect often enough?

It takes effect pretty often and is improving in that respect, though of
course in many cases it could take effect even more often than it does now.
On the other hand, the current cost model is not very precise, and sometimes
vectorization slows things down instead of speeding them up, sometimes only
on some CPUs.

	Jakub
--
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
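
For illustration only, here is a minimal sketch of the kind of loop the
vectorizer targets, with the flag combinations mentioned above shown as
build commands in the leading comment.  The file name saxpy.c, the function
saxpy() and the DEMO_MAIN macro are hypothetical names made up for this
example, and whether the loop actually gets vectorized depends on the GCC
version and the target CPU.

```c
/*
 * saxpy.c: hypothetical example file, not taken from any package.
 *
 * Possible builds (file and program names are placeholders):
 *   gcc -std=c99 -O3 -DDEMO_MAIN saxpy.c -o saxpy
 *   gcc -std=c99 -O2 -ftree-vectorize -DDEMO_MAIN saxpy.c -o saxpy
 *
 * PGO in three steps:
 *   gcc -std=c99 -O2 -fprofile-generate -DDEMO_MAIN saxpy.c -o saxpy
 *   ./saxpy      (training run; writes profile data next to the objects)
 *   gcc -std=c99 -O2 -fprofile-use -DDEMO_MAIN saxpy.c -o saxpy
 */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* y[i] += a * x[i]: a simple counted loop with no cross-iteration
   dependency, which makes it a typical vectorization candidate. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    size_t i;

    for (i = 0; i < n; i++)
        y[i] += a * x[i];
}

#ifdef DEMO_MAIN
int main(void)
{
    float x[8], y[8];
    size_t i;

    for (i = 0; i < 8; i++) {
        x[i] = (float)i;
        y[i] = 1.0f;
    }

    saxpy(8, 2.0f, x, y);

    /* 1 + 2 * 3 is exactly representable in float, so == is safe here. */
    assert(y[3] == 7.0f);
    printf("y[7] = %.1f\n", y[7]);
    return 0;
}
#endif
```

A toy training run like this one says nothing about real workloads; for an
actual package the training step would be the benchmark or testsuite run
described above.  On GCC 4.8 and later, adding -fopt-info-vec to the build
line makes the compiler report which loops it vectorized.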