Hi Robo, I'm really glad to see that someone else is with me on this kind of test.

About your test: pretty cool, and with a real-world program like a game, that's a
real benchmark. It helps to explain that optimizing for size is a lot faster on
machines with a slow or small cache, or with already good branch prediction (years
ago, IIRC, I read somewhere that the Pentium MMX has good branch prediction, at
least better than Cyrix and AMD at that time).
This may be the cause of the "why!?"s in your message :+) Some optimizations, when
mixed, make bigger code, and sometimes with these mixes the compiler doesn't use the
same pseudo-random branch prediction on the code.
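
To put a rough number on the size-versus-speed point, here is a minimal Python
sketch that groups the timings from the quoted table below by -O level and
averages them. It is only an illustration: the helper name mean_by_level is made
up, and each group mixes different extra flags, so the averages are a crude
summary and not a new measurement.

```python
from statistics import mean

# (optimization level, X seconds, console seconds) for tests 1-14,
# copied from the quoted results table.
results = [
    ("none", 22.12, 17.65),
    ("-O2", 16.73, 13.29),
    ("-O3", 17.23, 15.25),
    ("-O3", 16.62, 13.50),
    ("-O3", 17.09, 14.39),
    ("-O2", 16.44, 13.19),
    ("-O2", 16.25, 13.11),
    ("-O2", 16.73, 14.03),
    ("-O2", 18.33, 13.74),
    ("-O2", 17.02, 13.51),
    ("-O2", 16.58, 14.14),
    ("-Os", 16.28, 13.11),
    ("-Os", 16.49, 13.03),
    ("-Os", 16.31, 12.99),
]


def mean_by_level(rows):
    """Return {level: (mean X seconds, mean console seconds)}."""
    groups = {}
    for level, x_secs, console_secs in rows:
        groups.setdefault(level, []).append((x_secs, console_secs))
    return {
        level: (mean(x for x, _ in vals), mean(c for _, c in vals))
        for level, vals in groups.items()
    }


if __name__ == "__main__":
    for level, (x_avg, c_avg) in mean_by_level(results).items():
        print(f"{level:5s} X {x_avg:6.2f}  C {c_avg:6.2f}")
```

On these numbers the -Os group has the lowest average in both environments, with
-O2 next and -O3 after that.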

Again, it would be really nice to see this kind of benchmark make our Gentoo
faster than light :+)

Salu2.

Javier Villavicencio.


On Tue, 4 Nov 2003 11:36:30 +0100 (Central Europe Standard Time)
Robo Cernansky <[EMAIL PROTECTED]> wrote:

> 
> These are the results of a simple CFLAGS test. Maybe they will be useful for
> someone, so I am posting them here. I was wondering whether the results for old
> processors would be the same as for the fast ones, so I made this simple test.
> You can compare this test with Javier Villavicencio's
> (http://article.gmane.org/gmane.linux.gentoo.user/51881).
> 
> I was compiling gnuchess (http://www.gnu.org/software/chess/chess.html) with
> various CFLAGS settings. For each compiled gnuchess I ran these commands:
> 
> ./gnuchess
> depth 8
> go
> quit
> 
> and wrote down the duration of the "go" command (gnuchess prints it). The
> following numbers are averages of five (or more) runs.
> 
> Each test was performed twice: once in X with many processes running
> (mostly sleeping), and once in the console with a minimum of processes and
> with gnuchess at niced priority (nice -18).
> 
> Values are in seconds. The place is given in parentheses (best is 1st place,
> worst is 14th place). Lines marked with "X" are values from the test in the X
> environment and "C" lines are values from the console test.
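
For anyone who wants to recompute the places, here is a minimal Python sketch of
the ranking described above, where tests with equal times share the places they
span (for example "8, 9"). The function name places is hypothetical; the timings
are copied from the table below.

```python
# Test number -> seconds, copied from the results table.
x_times = {
    1: 22.12, 2: 16.73, 3: 17.23, 4: 16.62, 5: 17.09, 6: 16.44, 7: 16.25,
    8: 16.73, 9: 18.33, 10: 17.02, 11: 16.58, 12: 16.28, 13: 16.49, 14: 16.31,
}
console_times = {
    1: 17.65, 2: 13.29, 3: 15.25, 4: 13.50, 5: 14.39, 6: 13.19, 7: 13.11,
    8: 14.03, 9: 13.74, 10: 13.51, 11: 14.14, 12: 13.11, 13: 13.03, 14: 12.99,
}


def places(times):
    """Map each test to the list of places it occupies (fastest is place 1).

    Tests with identical times share every place in their tied range.
    """
    ordered = sorted(times, key=times.get)
    result = {}
    i = 0
    while i < len(ordered):
        # Extend j past every test tied with the one at position i.
        j = i
        while j < len(ordered) and times[ordered[j]] == times[ordered[i]]:
            j += 1
        shared = list(range(i + 1, j + 1))
        for test in ordered[i:j]:
            result[test] = shared
        i = j
    return result


if __name__ == "__main__":
    x_places = places(x_times)
    c_places = places(console_times)
    for test in sorted(x_times):
        x = ", ".join(map(str, x_places[test]))
        c = ", ".join(map(str, c_places[test]))
        print(f"{test:2d}. X {x_times[test]:.2f} ({x})  C {console_times[test]:.2f} ({c})")
```

Sorting by time and walking the tied runs keeps the shared places explicit
instead of breaking ties arbitrarily.
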
> 
> 
> Here are the results:
> 
>  1. Without optimizations
>   X 22.12 (14)
>   C 17.65 (14)
>   [slowest of course - 14th place]
> 
>  2. "Basic" O2 test
>    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer):
>   X 16.73 (8, 9)
>   C 13.29 (6)
>   [much better]
> 
>  3. Changed to O3
>    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer):
>   X 17.23 (12)
>   C 15.25 (13)
>   [slower than O2 - same result as in Javier Villavicencio's test]
> 
>  4. Trying to make O3 faster
>    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer
>     -falign-functions=4 -falign-jumps=4):
>   X 16.62 (7)
>   C 13.50 (7)
>   [almost same as "basic" O2]
> 
>  5. Not giving up on O3
>    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer
>     -falign-functions=4 -falign-jumps=4
>     -fforce-addr):
>   X 17.09 (11)
>   C 14.39 (12)
>   [bad results :-(]
> 
>  6. Piece of O3 in O2 (O3 implies -frename-registers)
>    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
>     -frename-registers):
>   X 16.44 (4)
>   C 13.19 (5)
>   [pretty fast - as in Javier Villavicencio's test]
> 
>  7. Trying something else
>    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer 
>     -falign-functions=4 -falign-jumps=4):
>   X 16.25 (1)
>   C 13.11 (3, 4)
>   [bingo! first place in X environment; also close to second and first place in
>    console]
> 
>  8. Combination of two previous
>    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
>     -frename-registers
>     -falign-functions=4 -falign-jumps=4):
>   X 16.73 (8, 9)
>   C 14.03 (10)
>   [aargh! slow; why???]
> 
>  9. Trying -fforce-addr
>    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
>     -fforce-addr):
>   X 18.33 (13)
>   C 13.74 (9)
>   [slower - like with O3; Javier Villavicencio's test shows same results;
>    this is much slower in X (almost last place) - why?]
> 
> 10. Just for the record
>   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
>    -fforce-addr
>    -frename-registers):
>   X 17.02 (10)
>   C 13.51 (8)
>   [little bit faster]
> 
> 11. Let's see "clean" O2
>   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2):
>   X 16.58 (6)
>   C 14.14 (11)
>   [much slower in console (11th place)]
> 
> 12. And "clean" Os
>   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os):
>   X 16.28 (2)
>   C 13.11 (3, 4)
>   [great! I didn't expect this]
> 
> 13. Let's play with Os
>   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os -fomit-frame-pointer):
>   X 16.49 (5)
>   C 13.03 (2)
>   [hmm, strange - faster in console and slower in X]
> 
> 14. Go Os Go!
>   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os -fomit-frame-pointer
>    -falign-functions=4 -falign-jumps=4):
>   X 16.31 (3)
>   C 12.99 (1)
>   [fastest in console! very close to first and second place in X; this is
>    a different result from the one Bill Kenworthy got
>    (see http://article.gmane.org/gmane.linux.gentoo.user/50998)]
> 
> 
> Note that this test is specific to one task in one application. The effect can
> be different for the whole system (see the different effects of some options
> when the application is running in X (a system with many processes) and in the
> console (minimum of processes)).
> 
> Machine specs:
> 
> $ uname -rmip
> 2.4.20-gentoo-r7 i586 Pentium MMX GenuineIntel
> 
> Kernel is compiled with preemptive multitasking.
> 
> Processor: Pentium 166 MMX, RAM: 64MB
> 
> 
> Robert.
> 
> 
> -- 
> Robert Cernansky
> E-mail: [EMAIL PROTECTED]
> Jabber: [EMAIL PROTECTED]
> 
> 
> --
> [EMAIL PROTECTED] mailing list
> 
> 
> 


