Hi Robo, I'm really glad to see that someone else is with me on this kind of test.
Your test is pretty cool, and running it on a real-world program like a game
makes it a proper benchmark. A possible explanation: optimizing for size can be
a lot faster on machines with a slow or small cache, or with already good
branch prediction (IIRC, I read somewhere years ago that the Pentium MMX has
good branch prediction, at least better than Cyrix and AMD at that time).
This may be the cause of the "why?"s in your message :+) Mixing some
optimizations makes the code bigger, and sometimes with these mixes the
compiler doesn't use the same pseudo-random branch prediction in the code.

Again, it would be really nice to see this kind of benchmark make our Gentoo
faster than light :+)

Salu2.
Javier Villavicencio.

On Tue, 4 Nov 2003 11:36:30 +0100 (Central Europe Standard Time)
Robo Cernansky <[EMAIL PROTECTED]> wrote:

>
> This is results of simple CFLAGS test. Maybe it will be useful for someone so
> I post it here. I was wonder if results for old processors will be same as for
> the fast ones so I made this simple test.
> You can compare this test with Javier Villavicencio's
> (http://article.gmane.org/gmane.linux.gentoo.user/51881).
>
> I was compiling gnuchess (http://www.gnu.org/software/chess/chess.html) with
> various CFLAGS settings. For each compiled gnuchess I ran these commands:
>
>     ./gnuchess
>     depth 8
>     go
>     quit
>
> and wrote down duration of "go" command (gnuchess prints it). Following numbers
> are average values of five (or more) values.
>
> Each test was performed two times. One in X with many processes running
> (mostly sleeping). Second in console with minimum of processes and with niced
> priority (nice -18) of gnuchess.
>
> Values are in seconds. In parentheses is place (best is on 1st place, worst on
> 14th place). Lines marked with "X" are values from test in X environment and
> "C" lines are values from console test.
>
>
> Here are the results:
>
> 1. Without optimizations
> X 22.12 (14)
> C 17.65 (14)
> [slowest of course - 14th place]
>
> 2. "Basic" O2 test
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer):
> X 16.73 (8, 9)
> C 13.29 (6)
> [much better]
>
> 3. Changed to O3
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer):
> X 17.23 (12)
> C 15.25 (13)
> [slower than O2 - same result as in Javier Villavicencio's test]
>
> 4. Trying O3 being faster
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer
> -falign-functions=4 -falign-jumps=4):
> X 16.62 (7)
> C 13.50 (7)
> [almost same as "basic" O2]
>
> 5. Don't give up with O3
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer
> -falign-functions=4 -falign-jumps=4
> -fforce-addr):
> X 17.09 (11)
> C 14.39 (12)
> [bad results :-(]
>
> 6. Piece of O3 in O2 (O3 implies -frename-registers)
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
> -frename-registers):
> X 16.44 (4)
> C 13.19 (5)
> [pretty fast - as in Javier Villavicencio's test]
>
> 7. Trying something else
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
> -falign-functions=4 -falign-jumps=4):
> X 16.25 (1)
> C 13.11 (3, 4)
> [bingo! first place in X environment; also close to second and first place in
> console]
>
> 8. Combination of two previous
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
> -frename-registers
> -falign-functions=4 -falign-jumps=4):
> X 16.73 (8, 9)
> C 14.03 (10)
> [aargh! slow; why???]
>
> 9. Trying -fforce-addr
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
> -fforce-addr):
> X 18.33 (13)
> C 13.74 (9)
> [slower - like with O3; Javier Villavicencio's test shows same results;
> this is much slower in X (almost last place) - why?]
>
> 10. Just for a record
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
> -fforce-addr
> -frename-registers):
> X 17.02 (10)
> C 13.51 (8)
> [little bit faster]
>
> 11. Let's see "clean" O2
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2):
> X 16.58 (6)
> C 14.14 (11)
> [much slower in console (11th place)]
>
> 12. And "clean" Os
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os):
> X 16.28 (2)
> C 13.11 (3, 4)
> [great! I didn't expect this]
>
> 13. Let's play with Os
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os -fomit-frame-pointer):
> X 16.49 (5)
> C 13.03 (2)
> [hmm, strange - faster in console and slower in X]
>
> 14. Go Os Go!
> (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os -fomit-frame-pointer
> -falign-functions=4 -falign-jumps=4):
> X 16.31 (3)
> C 12.99 (1)
> [fastest in console! very close to first and second place in X; this is
> different result than Bill Kenworthy got
> (see http://article.gmane.org/gmane.linux.gentoo.user/50998)]
>
>
> Note that this test is specific to one task in one application. Effect can be
> different for whole system (see different effects of some options when
> application is running in X (system with many processes) and in console
> (minimum processes)).
>
> Machine specs:
>
> $ uname -rmip
> 2.4.20-gentoo-r7 i586 Pentium MMX GenuineIntel
>
> Kernel is compiled with preemptive multitasking.
>
> Processor: Pentium 166 MMX, RAM: 64MB
>
>
> Robert.
>
>
> --
> Robert Cernansky
> E-mail: [EMAIL PROTECTED]
> Jabber: [EMAIL PROTECTED]
>
>
> --
> [EMAIL PROTECTED] mailing list
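
P.S. The placements in the quoted results can be re-derived from the averaged
timings. What follows is only a minimal Python sketch, not part of the original
test procedure: the names placements, X_TIMES and C_TIMES are made up for
illustration, and the numbers are copied from the results above. Equal times
share consecutive places, which mirrors the "(8, 9)" and "(3, 4)" notation.

```python
# Averaged "go" durations in seconds, keyed by test number (copied from the post).
X_TIMES = {1: 22.12, 2: 16.73, 3: 17.23, 4: 16.62, 5: 17.09, 6: 16.44, 7: 16.25,
           8: 16.73, 9: 18.33, 10: 17.02, 11: 16.58, 12: 16.28, 13: 16.49, 14: 16.31}
C_TIMES = {1: 17.65, 2: 13.29, 3: 15.25, 4: 13.50, 5: 14.39, 6: 13.19, 7: 13.11,
           8: 14.03, 9: 13.74, 10: 13.51, 11: 14.14, 12: 13.11, 13: 13.03, 14: 12.99}


def placements(times):
    """Map each test number to the list of places it occupies (1 = fastest).

    Tests with identical times share a run of consecutive places.
    """
    order = sorted(times, key=lambda test: times[test])
    places = {}
    i = 0
    while i < len(order):
        # Extend j to cover every test tied with the one at position i.
        j = i
        while j + 1 < len(order) and times[order[j + 1]] == times[order[i]]:
            j += 1
        shared = list(range(i + 1, j + 2))
        for test in order[i:j + 1]:
            places[test] = shared
        i = j + 1
    return places


if __name__ == "__main__":
    x_places = placements(X_TIMES)
    c_places = placements(C_TIMES)
    for test in sorted(X_TIMES):
        print(f"{test:2d}  X {X_TIMES[test]:5.2f} {x_places[test]}"
              f"  C {C_TIMES[test]:5.2f} {c_places[test]}")
```

Running it lists each test with its X and console time and the place(s) it
takes, so the ranking can be checked against the figures in parentheses.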