I'm wiping this whole message since I'm not about to go and dwell on the finer points of delivery here. :)
Now, I for one would be -VERY- interested in seeing a same-system comparison between the flags -Os, -O2 and -O3 (with the same -march=) on a single system. I realize this takes a lot of time in many cases, and for some applications it's not even a good comparison (a lot of apps will override with their own flags, either for stability or for other performance reasons. Xine, Mplayer, Xfree, gcc and so on all do this. And they are right in doing it.)

For one, I suspect that Celeron and Duron would do better with either -Os or -O2 than with -O3, mainly due to their rather small cache. In fact I also suspect that P4s would gain from -O2 in comparison to -O3, unless you use the "far faster" RAM settings available. This is another issue that really must be documented when dealing with optimizations: the importance of RAM, and the specifics of transfers between RAM -> CPU and cache.

For Athlon / XP there is quite a large on-die cache (128/512 K for newer versions), which probably means that -O3 would do better there than -O2 or -Os. But this is an area where testing would be very welcome instead of my speculations.

As regards -ffast-math: it makes a hell of a difference in math-related things, but all the applications whose developers think it is safe and should be used already enable it (mplayer, ffmpeg, xine... check the compile logs ;)

It's very interesting that you don't remember whether you were using prelink or not, nor whether you use the Gentoo kernel's specific grsecurity patch or not (which I think will have some, albeit minimal, impact).

Another thing to add, which a lot of people seem to be confused about, is "Gentoo's default optimization". This is -O2 -pipe. Yes, it is, go look in /etc/make.globals. The -O3 settings are recommendations and examples; genflags should handle this even better these days.

Some more points:

Comparing compilations is perhaps de facto practice, but it is generally a useless comparison, as gcc overrides all optimizations in most places (check the sources of gcc or the ebuilds yourself).
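A minimal sketch of how the same-system -Os / -O2 / -O3 comparison could be scripted. Everything here is an assumption for illustration: the helper names, the `bench.c` source file, and the choice of taking the fastest of a few runs are not from any existing tool. It only builds one binary per flag with the same -march= and times each one.

```python
import subprocess
import time

# The three optimization levels under discussion.
OPT_LEVELS = ("-Os", "-O2", "-O3")


def build_commands(source, march, levels=OPT_LEVELS, compiler="gcc"):
    """Return one (binary_name, argv) pair per optimization level.

    Every command uses the same -march= so that only the -O flag varies.
    """
    commands = []
    for level in levels:
        binary = "bench" + level.replace("-", "_")  # e.g. "bench_O2"
        argv = [compiler, level, "-march=" + march, "-pipe",
                source, "-o", binary]
        commands.append((binary, argv))
    return commands


def best_wall_time(argv, repeats=5):
    """Run argv `repeats` times; return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare(source, march):
    """Compile `source` once per level, then time each resulting binary."""
    results = {}
    for binary, argv in build_commands(source, march):
        subprocess.run(argv, check=True)          # compile step
        results[binary] = best_wall_time(["./" + binary])
    return results
```

Calling `compare("bench.c", "athlon-xp")` on the machine under test would return a dict of timings keyed by binary name. As noted above, packages that override CFLAGS themselves would need to be benchmarked through their own build systems instead.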
It's almost impossible to optimize gcc, and if you want fast compiles, I'd suggest gcc-2.95.3 derivatives (still installable).

-fomit-frame-pointer generally gives some boost, but it's nothing I run on my own systems (I debug a lot ;)

//Spider

--
begin .signature
This is a .signature virus! Please copy me into your .signature!
See Microsoft KB Article Q265230 for more information.
end