Re: [gentoo-user] Simple CFLAGS test on Pentium MMX

2003-11-04 Thread William Kenworthy
Did you note the size of the binaries?  Something I neglected to do with
the tests I did.
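
One way to record binary sizes alongside the timings is sketched below. It is only an illustration: the function name `size_report` and the file names in the usage note are hypothetical, and it reports the on-disk size via `os.path.getsize`, which includes symbols and debug info unless the binary was stripped.

```python
import os


def size_report(paths):
    """Return (path, size_in_bytes, ratio_to_smallest) tuples, smallest first.

    `paths` is any iterable of file names; the names are up to the caller.
    """
    sizes = sorted((os.path.getsize(p), p) for p in paths)
    if not sizes:
        return []
    smallest = sizes[0][0]
    report = []
    for size, path in sizes:
        # Guard against a zero-byte file so the ratio never divides by zero.
        ratio = size / smallest if smallest else float("inf")
        report.append((path, size, ratio))
    return report
```

For example, `size_report(["povray-O2", "povray-O3", "povray-Os"])` (hypothetical file names) would list the three builds from smallest to largest.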

BillK

On Wed, 2003-11-05 at 06:34, Dennis Freise wrote:
> > > I forgot to tell version of gcc - it is 3.2.3.
> >
> > Ah! I was just about to ask you that!!
> > I hope you will consider reporting the results should you change gcc
> > versions. I'm given to understand this can make quite a big difference.
> 
> I've done a quick test with gcc-3.3.2 and povray 3.50.
> I only tested the -O levels... here is what I found:
> 
> povray compiled with -O3: took ~55 secs to render picture
> povray compiled with -O2: took ~50 secs to render picture
> povray compiled with -Os: took ~68 secs to render picture
> 
> -frename-registers and -finline-functions both did no good, producing slower
> executables. However, this was really a quick test. The CPU was a Pentium MMX
> at 233 MHz with 256 MB of RAM.
> 
> Greetings, Dennis
> 
> 
> --
> [EMAIL PROTECTED] mailing list
-- 
William Kenworthy <[EMAIL PROTECTED]>


--
[EMAIL PROTECTED] mailing list



Re: [gentoo-user] Simple CFLAGS test on Pentium MMX

2003-11-04 Thread Dennis Freise
> > I forgot to tell version of gcc - it is 3.2.3.
>
> Ah! I was just about to ask you that!!
> I hope you will consider reporting the results should you change gcc
> versions. I'm given to understand this can make quite a big difference.

I've done a quick test with gcc-3.3.2 and povray 3.50.
I only tested the -O levels... here is what I found:

povray compiled with -O3: took ~55 secs to render picture
povray compiled with -O2: took ~50 secs to render picture
povray compiled with -Os: took ~68 secs to render picture

-frename-registers and -finline-functions both did no good, producing slower
executables. However, this was really a quick test. The CPU was a Pentium MMX
at 233 MHz with 256 MB of RAM.
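
To put the three render times on a common scale, the relative slowdown against the fastest build can be worked out as below. This is just arithmetic on the approximate figures quoted above; the helper name `slowdown_vs_best` is made up for the illustration.

```python
# Render times in seconds, taken from the approximate figures reported above.
render_secs = {"-O2": 50, "-O3": 55, "-Os": 68}


def slowdown_vs_best(times):
    """Map each flag to its extra time, in percent, relative to the fastest."""
    best = min(times.values())
    return {flag: round(100.0 * (t - best) / best, 1) for flag, t in times.items()}


for flag, pct in sorted(slowdown_vs_best(render_secs).items(), key=lambda kv: kv[1]):
    print(f"{flag}: +{pct}%")
```

With these numbers, -O3 comes out about 10% slower than -O2 and -Os about 36% slower. Since the inputs are rounded to the second, the percentages are only rough.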

Greetings, Dennis





Re: [gentoo-user] Simple CFLAGS test on Pentium MMX

2003-11-04 Thread Stroller
On Nov 4, 2003, at 1:58 pm, Robo Cernansky wrote:

> On Tue, 4 Nov 2003 11:36:30 +0100 (Central Europe Standard Time) Robo
> Cernansky <[EMAIL PROTECTED]> wrote:
>
> RC>
> RC> This is results of simple CFLAGS test. Maybe it will be useful for someone so
> [...]
> RC> I was compiling gnuchess (http://www.gnu.org/software/chess/chess.html) with
> RC> various CFLAGS settings. For each compiled gnuchess I ran these
> [...]
>
> I forgot to tell version of gcc - it is 3.2.3.

Ah! I was just about to ask you that!!
I hope you will consider reporting the results should you change gcc
versions. I'm given to understand this can make quite a big difference.

Stroller.



Re: [gentoo-user] Simple CFLAGS test on Pentium MMX

2003-11-04 Thread Robo Cernansky
On Tue, 4 Nov 2003 11:36:30 +0100 (Central Europe Standard Time) Robo Cernansky 
<[EMAIL PROTECTED]> wrote:

RC> 
RC> This is results of simple CFLAGS test. Maybe it will be useful for someone so
[...]
RC> I was compiling gnuchess (http://www.gnu.org/software/chess/chess.html) with
RC> various CFLAGS settings. For each compiled gnuchess I ran these
[...]

I forgot to mention the version of gcc - it is 3.2.3.

Robert.


-- 
Robert Cernansky
E-mail: [EMAIL PROTECTED]
Jabber: [EMAIL PROTECTED]





Re: [gentoo-user] Simple CFLAGS test on Pentium MMX

2003-11-04 Thread William Kenworthy
Interesting.  Most of the machines I use have around 1 GB of RAM and 2 to 4
GB of swap, so it looks like -Os creates a real loss on systems with
RAM to spare (see http://wdk.dyndns.org/flags.png - pick -Os!), but
gains when RAM is short.

Some people are saying (no figures though) that -Os helps on a desktop
system with responsiveness.

-falign-functions=4 created a slight loss, =8 or =16 a bit more, but
=32 gained some (32-bit addressing?).

The more I test, the more I come down on the side of using some basic flags
for the system, compiling desktop stuff with -Os (if I can confirm it
does work), and then building specific apps with the best flags for
performance.  Examples here are zip/gzip/bzip, mysql, gimp: basically things
that run a lot and for a long time, where long-term speed is required.

One point to make about running in X and console: running an application
in an xterm, gnome-terminal, text console or frame-buffer console all
produced different results when tested.  So to be valid, the tests need
to be run as close as possible to the way you intend to use the program.

The golden rule is "test, test, and don't accept someone else's flags
without testing".
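
As a rough aid to that rule, the sketch below compares two sets of run times measured in the same environment and checks whether the gap between them is larger than the run-to-run noise. The function names and the "twice the noise" threshold are assumptions made for illustration, not a proper statistical test, and each list needs at least two measurements.

```python
import statistics


def compare(samples_a, samples_b):
    """Compare two lists of run times (seconds) measured in the same environment.

    Returns (mean_a, mean_b, gap, noise), where `gap` is mean_b - mean_a and
    `noise` is the larger of the two sample standard deviations.
    """
    mean_a = statistics.mean(samples_a)
    mean_b = statistics.mean(samples_b)
    noise = max(statistics.stdev(samples_a), statistics.stdev(samples_b))
    return mean_a, mean_b, mean_b - mean_a, noise


def looks_significant(samples_a, samples_b):
    """Crude rule of thumb (an assumption): the difference counts as real
    only if it exceeds twice the noise."""
    _, _, gap, noise = compare(samples_a, samples_b)
    return abs(gap) > 2 * noise
```

Feeding it timings taken in an xterm and again on a text console, separately, keeps the comparison within one environment at a time.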

BillK



On Tue, 2003-11-04 at 18:36, Robo Cernansky wrote:
> These are the results of a simple CFLAGS test. Maybe they will be useful for
> someone, so I am posting them here. I was wondering whether the results for old
> processors would be the same as for the fast ones, so I made this simple test.
> You can compare this test with Javier Villavicencio's
> (http://article.gmane.org/gmane.linux.gentoo.user/51881).
> 
> I compiled gnuchess (http://www.gnu.org/software/chess/chess.html) with
> various CFLAGS settings. For each build I ran these commands:






Re: [gentoo-user] Simple CFLAGS test on Pentium MMX

2003-11-04 Thread Javier Villavicencio
Hi Robo, I'm really glad to see that someone else is with me on these kinds of tests.

About your test: pretty cool, and with a real-world program like a game, that's a 
benchmark. Just to explain: optimizing for size is a lot faster on machines with a 
slow or small cache, or with already good branch prediction (IIRC I read somewhere 
years ago that the Pentium MMX has good branch prediction, at least better than Cyrix 
and AMD at that time).
This may be the cause of the "why!?"s in your message :+) Some optimizations, when 
mixed, make bigger code, and sometimes with these mixes the compiler doesn't use the 
same pseudo-random branch prediction in the code.

Again, it would be really nice to see this kind of benchmark make our Gentoo 
faster than light :+)

Salu2.

Javier Villavicencio.


On Tue, 4 Nov 2003 11:36:30 +0100 (Central Europe Standard Time)
Robo Cernansky <[EMAIL PROTECTED]> wrote:

> 
> These are the results of a simple CFLAGS test. Maybe they will be useful for
> someone, so I am posting them here. I was wondering whether the results for old
> processors would be the same as for the fast ones, so I made this simple test.
> You can compare this test with Javier Villavicencio's
> (http://article.gmane.org/gmane.linux.gentoo.user/51881).
> 
> I compiled gnuchess (http://www.gnu.org/software/chess/chess.html) with
> various CFLAGS settings. For each build I ran these commands:

[gentoo-user] Simple CFLAGS test on Pentium MMX

2003-11-04 Thread Robo Cernansky

These are the results of a simple CFLAGS test. Maybe they will be useful for
someone, so I am posting them here. I was wondering whether the results for old
processors would be the same as for the fast ones, so I made this simple test.
You can compare this test with Javier Villavicencio's
(http://article.gmane.org/gmane.linux.gentoo.user/51881).

I compiled gnuchess (http://www.gnu.org/software/chess/chess.html) with
various CFLAGS settings. For each build I ran these commands:

./gnuchess
depth 8
go
quit

and wrote down the duration of the "go" command (gnuchess prints it). The
following numbers are averages of five (or more) values.
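
The procedure above could be automated along these lines. This is a sketch under assumptions, not the script used for the numbers in this post: it assumes gnuchess reads its commands from standard input when run non-interactively, and it measures the wall-clock time of the whole process instead of the duration that gnuchess prints for "go", so absolute values would differ. The names `timed_run` and `average_time` are made up for illustration.

```python
import statistics
import subprocess
import time

# The commands typed at the gnuchess prompt in the test described above.
GNUCHESS_INPUT = "depth 8\ngo\nquit\n"


def timed_run(cmd, stdin_text):
    """Run `cmd`, feed it `stdin_text`, and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        cmd,
        input=stdin_text,
        text=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=True,  # raise if the program exits with an error
    )
    return time.perf_counter() - start


def average_time(cmd, stdin_text, runs=5):
    """Average the elapsed time over `runs` repetitions (five, as in the test)."""
    return statistics.mean(timed_run(cmd, stdin_text) for _ in range(runs))
```

Calling `average_time(["./gnuchess"], GNUCHESS_INPUT)` from the directory holding a freshly built binary would give one averaged figure per CFLAGS setting.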

Each test was performed twice: once in X with many processes running
(mostly sleeping), and once in the console with a minimum of processes and
with gnuchess at niced priority (nice -18).

Values are in seconds. The place is given in parentheses (1st place is best,
14th place is worst). Lines marked with "X" are values from the test in the X
environment and "C" lines are values from the console test.


Here are the results:

 1. Without optimizations
    X 22.12 (14)
    C 17.65 (14)
    [slowest of course - 14th place]

 2. "Basic" O2 test
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer):
    X 16.73 (8, 9)
    C 13.29 (6)
    [much better]

 3. Changed to O3
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer):
    X 17.23 (12)
    C 15.25 (13)
    [slower than O2 - same result as in Javier Villavicencio's test]

 4. Trying to make O3 faster
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer
     -falign-functions=4 -falign-jumps=4):
    X 16.62 (7)
    C 13.50 (7)
    [almost the same as "basic" O2]

 5. Not giving up on O3
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer
     -falign-functions=4 -falign-jumps=4
     -fforce-addr):
    X 17.09 (11)
    C 14.39 (12)
    [bad results :-(]

 6. Piece of O3 in O2 (O3 implies -frename-registers)
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
     -frename-registers):
    X 16.44 (4)
    C 13.19 (5)
    [pretty fast - as in Javier Villavicencio's test]

 7. Trying something else
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
     -falign-functions=4 -falign-jumps=4):
    X 16.25 (1)
    C 13.11 (3, 4)
    [bingo! first place in the X environment; also close to second and first
     place in console]

 8. Combination of the two previous
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
     -frename-registers
     -falign-functions=4 -falign-jumps=4):
    X 16.73 (8, 9)
    C 14.03 (10)
    [aargh! slow; why???]

 9. Trying -fforce-addr
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
     -fforce-addr):
    X 18.33 (13)
    C 13.74 (9)
    [slower - like with O3; Javier Villavicencio's test shows the same results;
     this is much slower in X (almost last place) - why?]

10. Just for the record
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
     -fforce-addr
     -frename-registers):
    X 17.02 (10)
    C 13.51 (8)
    [a little bit faster]

11. Let's see "clean" O2
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2):
    X 16.58 (6)
    C 14.14 (11)
    [much slower in console (11th place)]

12. And "clean" Os
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os):
    X 16.28 (2)
    C 13.11 (3, 4)
    [great! I didn't expect this]

13. Let's play with Os
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os -fomit-frame-pointer):
    X 16.49 (5)
    C 13.03 (2)
    [hmm, strange - faster in console and slower in X]

14. Go Os Go!
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os -fomit-frame-pointer
     -falign-functions=4 -falign-jumps=4):
    X 16.31 (3)
    C 12.99 (1)
    [fastest in console! very close to first and second place in X; this is
     a different result than Bill Kenworthy got
     (see http://article.gmane.org/gmane.linux.gentoo.user/50998)]


Note that this test is specific to one task in one application. The effect can
be different for a whole system (see the different effects of some options when
the application is running in X (a system with many processes) and in the
console (minimum processes)).
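
For anyone who wants to re-check the places or look at the X versus console gap, here is a small sketch that recomputes both from the timings listed above. The numbers are copied from the results; the function names `places` and `x_over_console` are only illustrative.

```python
# (X seconds, console seconds) for tests 1-14, copied from the results above.
RESULTS = {
    1: (22.12, 17.65), 2: (16.73, 13.29), 3: (17.23, 15.25), 4: (16.62, 13.50),
    5: (17.09, 14.39), 6: (16.44, 13.19), 7: (16.25, 13.11), 8: (16.73, 14.03),
    9: (18.33, 13.74), 10: (17.02, 13.51), 11: (16.58, 14.14), 12: (16.28, 13.11),
    13: (16.49, 13.03), 14: (16.31, 12.99),
}


def places(times):
    """Map test number -> list of places it occupies (tied tests share places)."""
    ordered = sorted(times.values())
    result = {}
    for test, t in times.items():
        first = ordered.index(t) + 1   # best place for this time
        count = ordered.count(t)       # number of tests tied on this time
        result[test] = list(range(first, first + count))
    return result


def x_over_console(results):
    """Map test number -> ratio of the X time to the console time."""
    return {test: x / c for test, (x, c) in results.items()}


x_places = places({k: v[0] for k, v in RESULTS.items()})
c_places = places({k: v[1] for k, v in RESULTS.items()})
ratios = x_over_console(RESULTS)
```

Run as is, `x_places[2]` and `x_places[8]` both come out as `[8, 9]`, matching the shared place in the list, and the X-to-console ratio ranges from about 1.13 (test 3) to about 1.33 (test 9).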

Machine specs:

$ uname -rmip
2.4.20-gentoo-r7 i586 Pentium MMX GenuineIntel

Kernel is compiled with preemptive multitasking.

Processor: Pentium 166 MMX, RAM: 64MB


Robert.


-- 
Robert Cernansky
E-mail: [EMAIL PROTECTED]
Jabber: [EMAIL PROTECTED]

