This is interesting. I emerged povray and did as you did, but I couldn't find the benchmark.ini you talk about, so I just ran the command you used, in the directory with the file you used. This is the result:

real 0m2.568s
user 0m2.220s
sys 0m0.030s

I've got an Athlon XP 1800+ and a GeForce2 integrated GPU :-)

By the way, you are saying it isn't recommended to compile gentoo with those flags. I have compiled everything with this:

    -march=athlon-xp -O3 -pipe -mmmx -msse -m3dnow -mfpmath=sse,387
    -fexpensive-optimizations -fstack-protector -fomit-frame-pointer
    -funroll-loops -fforce-addr -falign-functions=4 -frerun-loop-opt
    -frerun-cse-after-loop -maccumulate-outgoing-args -fprefetch-loop-arrays

and it's stable and really fast ;)

The thing with -fprofile-arcs is interesting, I will give it a shot! I wonder too, what are your system specs?

On Tue, 2003-10-28 at 22:17, Javier Villavicencio wrote:
> These are the results of benchmarking gcc optimizations by compiling povray
> (www.povray.org) and rendering skyvase.pov with the benchmark.ini from the
> unofficial benchmarks pages.
> Of course the timings aren't very accurate (I should have used a more
> time-consuming render, but I liked this one). To be fairer with the results,
> I ran each compilation again and again (more than 20 times) and posted here
> the -fastest- of those timings (of the 20 runs, the fastest one, for each
> compilation). I used the "time" command because I didn't like the accuracy
> of the povray timing (it shows only seconds, not milliseconds).
> (Also read the part about branch probabilities. If there is a way to add this
> to gentoo, then gentoo will run faster than WARP13 :+)
>
> Commandline: "time nice -n -20 povray skyvase.pov" (using benchmark.ini)
>
> CFLAGS= -O3 -march=athlon-xp -fomit-frame-pointer
> real 0m3.156s
> user 0m2.996s
> sys 0m0.161s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -fomit-frame-pointer
> real 0m3.002s
> user 0m2.846s
> sys 0m0.157s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -finline-functions -fomit-frame-pointer   <- flag that -O3 adds
> real 0m3.197s
> user 0m3.039s
> sys 0m0.158s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -fomit-frame-pointer   <- flag that -O3 adds
>                                                                         ! this is the fast one !
> real 0m2.993s
> user 0m2.834s
> sys 0m0.159s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -mpreferred-stack-boundary=2 \   <- slower ?
>         -fomit-frame-pointer
> real 0m3.326s
> user 0m3.158s
> sys 0m0.168s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -mpreferred-stack-boundary=4 \   <- RTFM, implied default
>         -fomit-frame-pointer
> real 0m2.996s
> user 0m2.834s
> sys 0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -mpreferred-stack-boundary=8 \   <- I already RTFM, slower, ok.
>         -fomit-frame-pointer
> real 0m3.021s
> user 0m2.860s
> sys 0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- I didn't add -mpreferred... because it is implied
>         -fomit-frame-pointer                                       <- Now -malign-double: FASTER!
> real 0m2.959s
> user 0m2.802s
> sys 0m0.158s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- almost same as before, new flag implied
>         -m96bit-long-double -fomit-frame-pointer
> real 0m2.982s
> user 0m2.802s
> sys 0m0.181s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- 128bit long double slower.
>         -m128bit-long-double -fomit-frame-pointer
> real 0m3.018s
> user 0m2.858s
> sys 0m0.161s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- almost the same as without -mmmx, implied?
>         -mmmx -fomit-frame-pointer
> real 0m2.969s
> user 0m2.802s
> sys 0m0.167s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- again, maybe implied?
>         -mmmx -msse -fomit-frame-pointer
> real 0m2.965s
> user 0m2.803s
> sys 0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- no noticeable effect yet,
>         -mmmx -msse -m3dnow -fomit-frame-pointer                   <- maybe implied?
> real 0m2.962s
> user 0m2.803s
> sys 0m0.159s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- what happens without mmx?
>         -msse -m3dnow -fomit-frame-pointer                         <- nothing :+/
> real 0m2.964s
> user 0m2.802s
> sys 0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- and without sse?
>         -m3dnow -fomit-frame-pointer                               <- bah, nothing :+/
> real 0m2.974s
> user 0m2.805s
> sys 0m0.169s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- i was reading the info...
>         -mno-push-args -fomit-frame-pointer                        <- and I found this... not too much, and I don't like it :+P.
> real 0m2.972s
> user 0m2.804s
> sys 0m0.168s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- this implies the last one, well, see what happens..
>         -maccumulate-outgoing-args -fomit-frame-pointer            <- faster, but bigger code size. (not a lot of space here)
> real 0m2.969s
> user 0m2.799s
> sys 0m0.170s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- Huh, faster, huh.
>         -maccumulate-outgoing-args -mno-align-stringops \
>         -fomit-frame-pointer
> real 0m2.948s
> user 0m2.781s
> sys 0m0.168s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- again i'm reading the info...
>         -maccumulate-outgoing-args -mno-align-stringops \          <- 17ms slower. bah.
>         -minline-all-stringops -fomit-frame-pointer
>
> real 0m2.968s
> user 0m2.798s
> sys 0m0.170s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- -fforce-mem in -O2...
>         -maccumulate-outgoing-args -mno-align-stringops \          <- what about -fforce-addr?
>         -fforce-addr -fomit-frame-pointer                          <- mbu. slower.
> real 0m3.132s
> user 0m2.970s
> sys 0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- -fbranch-count-reg is enabled with -O2
>         -maccumulate-outgoing-args -mno-align-stringops \          <- what happens disabling this?
>         -fno-branch-count-reg -fomit-frame-pointer                 <- uhm, it's enabled for a good reason (:+P)
> real 0m2.958s
> user 0m2.794s
> sys 0m0.164s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- slow like hell.
>         -maccumulate-outgoing-args -mno-align-stringops \
>         -fmove-all-movables -freduce-all-givs -freduce-all-givs -fomit-frame-pointer
>
> real 0m3.198s
> user 0m3.038s
> sys 0m0.160s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- this one generates imprecise math code
>         -maccumulate-outgoing-args -mno-align-stringops \          <- but not so imprecise ;+P
>         -ffast-math -fomit-frame-pointer
> real 0m3.043s
> user 0m2.881s
> sys 0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- let's play with -mfpmath
>         -maccumulate-outgoing-args -mno-align-stringops \          <- sse: slower
>         -mfpmath=sse -fomit-frame-pointer
> real 0m3.048s
> user 0m2.890s
> sys 0m0.158s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- 387:
>         -maccumulate-outgoing-args -mno-align-stringops \          <- mmm, better...
>         -mfpmath=387 -fomit-frame-pointer
> real 0m2.952s
> user 0m2.788s
> sys 0m0.164s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- sse,387:
>         -maccumulate-outgoing-args -mno-align-stringops \          <- b00, slower...
>         -mfpmath=sse,387 -fomit-frame-pointer
> real 0m3.104s
> user 0m2.941s
> sys 0m0.163s
> *********************************************************************
> ************************branch probabilities*****************************
> *********************************************************************
> This is the end of the CFLAGS that gentoo can take. The following works this way:
> You first compile a program with -fprofile-arcs, then run the program for a while.
> While you do this, the program runs slow as hell, but don't worry: it is recording
> information about branch probabilities next to your compiled code. (Without this,
> GCC has to guess the branch probabilities; with it, the instrumented program writes
> its branch flow to a .da file with the same name as the .c/.o file being executed,
> so DON'T delete your directory with the source code.)
> After compiling with -fprofile-arcs and running the program, you recompile it
> with -fbranch-probabilities. The compiler then reads the branch data from the
> generated .da files and optimizes for the most commonly taken, and most
> time-consuming, code paths. Just look at what happens:
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- now, the real part. profiling.
>         -maccumulate-outgoing-args -mno-align-stringops \          <- first we compile with -fprofile-arcs
>         -mfpmath=387 -fprofile-arcs -fomit-frame-pointer           <- (compile with -p and use gprof to see nice stats)
> real 0m4.048s
> user 0m3.882s
> sys 0m0.166s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- now, gcc is using the profiled data
>         -maccumulate-outgoing-args -mno-align-stringops \          <- what can be faster than this??
>         -mfpmath=387 -fbranch-probabilities -fomit-frame-pointer   <- :+)
>
> real 0m2.900s
> user 0m2.733s
> sys 0m0.167s
> *********************************************************************
>
> --
> [EMAIL PROTECTED] mailing list

--
Regards, Redeeman

()  ascii ribbon campaign - against html e-mail
/\                        - against microsoft attachments
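
P.S. The "run it more than 20 times and keep the fastest" method from the quoted mail can be scripted instead of done by hand. What follows is only a rough sketch, not what Javier actually used: min_of and best_of are made-up helper names, it assumes GNU date (for the %N nanosecond format), and it records wall-clock time only (the "real" figure), not user/sys.

```shell
#!/bin/sh
# Sketch only: min_of and best_of are hypothetical helpers, not part of
# povray or time(1). Assumes GNU date, which supports %N (nanoseconds).

# Print the smallest integer from a list given one per line on stdin.
min_of() {
    sort -n | head -n 1
}

# Run the given command RUNS times (default 20) and print the fastest
# wall-clock time in milliseconds.
best_of() {
    runs=${RUNS:-20}
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s%N)              # nanoseconds since the epoch
        "$@" >/dev/null 2>&1             # discard the command's own output
        end=$(date +%s%N)
        echo $(( (end - start) / 1000000 ))
        i=$((i + 1))
    done | min_of
}
```

With povray and skyvase.pov in the current directory, something like "RUNS=20 best_of povray skyvase.pov" should print the best time in milliseconds. Keeping the minimum rather than the average follows the quoted mail's choice of posting the fastest run.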