Re: [gentoo-user] Simple CFLAGS test on Pentium MMX
Did you note the size of the binaries? That is something I neglected to do with the tests I did.

BillK

On Wed, 2003-11-05 at 06:34, Dennis Freise wrote:
> > > I forgot to tell version of gcc - it is 3.2.3.
> >
> > Ah! I was just about to ask you that!!
> > I hope you will consider reporting the results should you change gcc
> > versions. I'm given to understand this can make quite a big difference.
>
> I've done some quick test with gcc-3.3.2 and povray 3.50.
> I only did test the -O things... what I found out:
>
> povray compiled with -O3: took ~55 secs to render picture
> povray compiled with -O2: took ~50 secs to render picture
> povray compiled with -Os: took ~68 secs to render picture
>
> -frename-registers and -finline-functions both did no good, making slower
> executables. However, this was really a quick test. CPU was a pentium-mmx
> 233 mhz with 256mb ram.
>
> Greetings, Dennis

--
William Kenworthy <[EMAIL PROTECTED]>

--
[EMAIL PROTECTED] mailing list
Re: [gentoo-user] Simple CFLAGS test on Pentium MMX
> > I forgot to tell version of gcc - it is 3.2.3.
>
> Ah! I was just about to ask you that!!
> I hope you will consider reporting the results should you change gcc
> versions. I'm given to understand this can make quite a big difference.

I've done some quick tests with gcc-3.3.2 and povray 3.50. I only tested the -O levels. Here is what I found:

povray compiled with -O3: took ~55 secs to render picture
povray compiled with -O2: took ~50 secs to render picture
povray compiled with -Os: took ~68 secs to render picture

-frename-registers and -finline-functions both did no good; each made the executable slower. However, this was really a quick test. The CPU was a Pentium MMX 233 MHz with 256 MB RAM.

Greetings,
Dennis
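To put the three povray figures above on a common scale, the render times can be expressed relative to the -O2 build. This is only a minimal Python sketch: the numbers are the approximate timings quoted in the message, and the names `render_secs` and `relative_to` are made up for illustration.

```python
# Approximate render times in seconds, as quoted in the message above.
render_secs = {"-O3": 55.0, "-O2": 50.0, "-Os": 68.0}


def relative_to(baseline, timings):
    """Return each timing as a percentage change versus the baseline flag."""
    base = timings[baseline]
    return {flag: round((secs - base) / base * 100.0, 1)
            for flag, secs in timings.items()}


slowdown = relative_to("-O2", render_secs)

# Print the fastest build first.
for flag, pct in sorted(slowdown.items(), key=lambda item: item[1]):
    print(f"{flag}: {pct:+.1f}% vs -O2")
```

On these figures -O3 comes out about 10% slower than -O2 and -Os about 36% slower. Since each number comes from a single quick run, the percentages are only as reliable as the timings behind them.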
Re: [gentoo-user] Simple CFLAGS test on Pentium MMX
On Nov 4, 2003, at 1:58 pm, Robo Cernansky wrote:

> On Tue, 4 Nov 2003 11:36:30 +0100 (Central Europe Standard Time)
> Robo Cernansky <[EMAIL PROTECTED]> wrote:
>
> RC>
> RC> This is results of simple CFLAGS test. Maybe it will be useful for someone so
> [...]
> RC> I was compiling gnuchess (http://www.gnu.org/software/chess/chess.html) with
> RC> various CFLAGS settings. For each compiled gnuchess I ran these
> [...]
>
> I forgot to tell version of gcc - it is 3.2.3.

Ah! I was just about to ask you that!!

I hope you will consider reporting the results should you change gcc versions. I'm given to understand this can make quite a big difference.

Stroller.
Re: [gentoo-user] Simple CFLAGS test on Pentium MMX
On Tue, 4 Nov 2003 11:36:30 +0100 (Central Europe Standard Time)
Robo Cernansky <[EMAIL PROTECTED]> wrote:

RC>
RC> This is results of simple CFLAGS test. Maybe it will be useful for someone so
[...]
RC> I was compiling gnuchess (http://www.gnu.org/software/chess/chess.html) with
RC> various CFLAGS settings. For each compiled gnuchess I ran these
[...]

I forgot to mention the version of gcc: it is 3.2.3.

Robert.

--
Robert Cernansky
E-mail: [EMAIL PROTECTED]
Jabber: [EMAIL PROTECTED]
Re: [gentoo-user] Simple CFLAGS test on Pentium MMX
Interesting. Most of the machines I use have around 1 GB of RAM and 2 to 4 GB of swap, so -Os looks like it creates a real loss on systems with RAM to spare (see http://wdk.dyndns.org/flags.png - pick -Os!), but gains when RAM is short. Some people are saying (no figures though) that -Os helps responsiveness on a desktop system.

-falign-functions=4 created a slight loss, =8 or =16 a bit more, but =32 gained some (32-bit addressing?).

The more I test, the more I come down on the side of using some basic flags for the system, compiling desktop stuff with -Os (if I can confirm it does work), and then compiling specific apps with the best flags for performance. Examples here are zip/gzip/bzip, mysql, gimp: basically things that run a lot and for a long time, where long-term speed is required.

One point to make about running in X and console: running an application in an xterm, gnome-terminal, text console or frame-buffer console all produced different results when tested. So to be valid, the tests need to be done as close as possible to the way you intend to use the program.

The golden rule is "test, test, and don't accept someone else's flags without testing".

BillK

On Tue, 2003-11-04 at 18:36, Robo Cernansky wrote:
> This is results of simple CFLAGS test. Maybe it will be useful for someone so
> I post it here. I was wonder if results for old processors will be same as for
> the fast ones so I made this simple test.
> You can compare this test with Javier Villavicencio's
> (http://article.gmane.org/gmane.linux.gentoo.user/51881)).
>
> I was compiling gnuchess (http://www.gnu.org/software/chess/chess.html) with
> various CFLAGS settings. For each compiled gnuchess I ran these commands:
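The "test, test" rule above is easier to follow with a small harness that repeats a run and reports the spread. The following is a hedged Python sketch, not the method anyone in this thread used: `time_command` is a made-up helper, and it measures the wall-clock time of the whole process, whereas the original post read the duration that gnuchess prints for its "go" command.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, stdin_text="", runs=5):
    """Run argv `runs` times, feeding it stdin_text, and return the
    wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, input=stdin_text, text=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
                       check=True)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in workload so the sketch runs anywhere. For a real test, pass
    # the program under test instead, e.g. ["./gnuchess"] together with
    # stdin_text="depth 8\ngo\nquit\n".
    samples = time_command([sys.executable, "-c", "sum(range(10**6))"])
    print(f"mean {statistics.mean(samples):.3f}s  "
          f"stdev {statistics.stdev(samples):.3f}s  "
          f"over {len(samples)} runs")
```

Reporting the standard deviation next to the mean helps show whether a difference between two flag sets is larger than the run-to-run noise. As noted above, the harness should be started from the same kind of terminal or console in which the program will normally run.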
Re: [gentoo-user] Simple CFLAGS test on Pentium MMX
Hi Robo,

I'm really glad to see that someone else is with me on these kinds of tests.

About your test: pretty cool, and with a real-world program like a game, that's a benchmark. Just to explain: optimizing for size is a lot faster on machines with a slow or small cache, or with already good branch prediction (years ago, IIRC, I read somewhere that the pentium-mmx has good branch prediction, at least better than Cyrix and AMD at that moment). This may be the cause of the "why!?"s in your message :+) Some optimizations, when mixed, make bigger code, and sometimes with these mixes the compiler doesn't use the same pseudo-random branch prediction on code.

Again, it would be really nice to see this kind of benchmark make our Gentoo faster than light :+)

Salu2.
Javier Villavicencio.

On Tue, 4 Nov 2003 11:36:30 +0100 (Central Europe Standard Time)
Robo Cernansky <[EMAIL PROTECTED]> wrote:
>
> This is results of simple CFLAGS test. Maybe it will be useful for someone so
> I post it here. I was wonder if results for old processors will be same as for
> the fast ones so I made this simple test.
> [...]
[gentoo-user] Simple CFLAGS test on Pentium MMX
These are the results of a simple CFLAGS test. Maybe they will be useful for someone, so I am posting them here. I was wondering whether the results for old processors would be the same as for fast ones, so I made this simple test.
You can compare this test with Javier Villavicencio's
(http://article.gmane.org/gmane.linux.gentoo.user/51881).

I compiled gnuchess (http://www.gnu.org/software/chess/chess.html) with
various CFLAGS settings. For each compiled gnuchess I ran these commands:

    ./gnuchess
    depth 8
    go
    quit

and wrote down the duration of the "go" command (gnuchess prints it). The following numbers are averages of five (or more) runs.

Each test was performed twice: once in X with many processes running (mostly sleeping), and once in the console with a minimum of processes and with gnuchess at niced priority (nice -18).

Values are in seconds. The place is given in parentheses (best is 1st place, worst is 14th place). Lines marked "X" are values from the test in the X environment and "C" lines are values from the console test.


Here are the results:

1. Without optimizations
   X 22.12 (14)
   C 17.65 (14)
   [slowest of course - 14th place]

2. "Basic" O2 test
   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer):
   X 16.73 (8, 9)
   C 13.29 (6)
   [much better]

3. Changed to O3
   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer):
   X 17.23 (12)
   C 15.25 (13)
   [slower than O2 - same result as in Javier Villavicencio's test]

4. Trying to make O3 faster
   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer
    -falign-functions=4 -falign-jumps=4):
   X 16.62 (7)
   C 13.50 (7)
   [almost same as "basic" O2]

5. Not giving up on O3
   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O3 -fomit-frame-pointer
    -falign-functions=4 -falign-jumps=4
    -fforce-addr):
   X 17.09 (11)
   C 14.39 (12)
   [bad results :-(]

6. Piece of O3 in O2 (O3 implies -frename-registers)
   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
    -frename-registers):
   X 16.44 (4)
   C 13.19 (5)
   [pretty fast - as in Javier Villavicencio's test]

7. Trying something else
   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
    -falign-functions=4 -falign-jumps=4):
   X 16.25 (1)
   C 13.11 (3, 4)
   [bingo! first place in X environment; also close to second and first place in
    console]

8. Combination of the two previous
   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
    -frename-registers
    -falign-functions=4 -falign-jumps=4):
   X 16.73 (8, 9)
   C 14.03 (10)
   [aargh! slow; why???]

9. Trying -fforce-addr
   (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
    -fforce-addr):
   X 18.33 (13)
   C 13.74 (9)
   [slower - like with O3; Javier Villavicencio's test shows same results;
    this is much slower in X (almost last place) - why?]

10. Just for the record
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2 -fomit-frame-pointer
     -fforce-addr
     -frename-registers):
    X 17.02 (10)
    C 13.51 (8)
    [little bit faster]

11. Let's see "clean" O2
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -O2):
    X 16.58 (6)
    C 14.14 (11)
    [much slower in console (11th place)]

12. And "clean" Os
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os):
    X 16.28 (2)
    C 13.11 (3, 4)
    [great! I didn't expect this]

13. Let's play with Os
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os -fomit-frame-pointer):
    X 16.49 (5)
    C 13.03 (2)
    [hmm, strange - faster in console and slower in X]

14. Go Os Go!
    (-mcpu=pentium-mmx -march=pentium-mmx -mmmx -Os -fomit-frame-pointer
     -falign-functions=4 -falign-jumps=4):
    X 16.31 (3)
    C 12.99 (1)
    [fastest in console! very close to first and second place in X; this is
     a different result than Bill Kenworthy got
     (see http://article.gmane.org/gmane.linux.gentoo.user/50998)]


Note that this test is specific to one task in one application. The effect can be different for a whole system (see the different effects of some options when the application is running in X (a system with many processes) and in the console (minimum processes)).

Machine specs:

$ uname -rmip
2.4.20-gentoo-r7 i586 Pentium MMX GenuineIntel

The kernel is compiled with preemptive multitasking.
Processor: Pentium 166 MMX, RAM: 64MB

Robert.

--
Robert Cernansky
E-mail: [EMAIL PROTECTED]
Jabber: [EMAIL PROTECTED]
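For anyone who wants to re-check the places or compare builds numerically, the averages posted above can be ranked with a few lines of Python. This is only a sketch for illustration: the timings are copied from the results list, while the names `results` and `places` are made up here and are not part of the original test.

```python
# Average "go" durations in seconds, copied from the results above:
# test number -> (X environment, console).
results = {
    1: (22.12, 17.65), 2: (16.73, 13.29), 3: (17.23, 15.25),
    4: (16.62, 13.50), 5: (17.09, 14.39), 6: (16.44, 13.19),
    7: (16.25, 13.11), 8: (16.73, 14.03), 9: (18.33, 13.74),
    10: (17.02, 13.51), 11: (16.58, 14.14), 12: (16.28, 13.11),
    13: (16.49, 13.03), 14: (16.31, 12.99),
}


def places(times):
    """Map test number -> list of places it occupies (ties share places)."""
    ordered = sorted(times, key=times.get)
    return {
        test: [pos + 1 for pos, other in enumerate(ordered)
               if times[other] == secs]
        for test, secs in times.items()
    }


x_places = places({test: x for test, (x, _) in results.items()})
c_places = places({test: c for test, (_, c) in results.items()})

baseline_console = results[1][1]  # unoptimized build, console run
for test in sorted(results):
    x, c = results[test]
    print(f"{test:2d}  X {x:5.2f} {x_places[test]}  "
          f"C {c:5.2f} {c_places[test]}  "
          f"console speedup {baseline_console / c:.2f}x")
```

The computed places agree with the ones given in parentheses above, including the shared places (8, 9) in X and (3, 4) in the console. By these numbers the fastest console build (test 14) is roughly 1.36 times faster than the unoptimized one.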