this is interesting, so i emerged povray and did like you, but i
couldn't find the benchmark.ini you talk about, so i just ran the command
you used, in the dir with the file you use, and this is the result:

real    0m2.568s
user    0m2.220s
sys     0m0.030s

i've got an athlon xp 1800+, and a geforce2 integrated GPU :-)

and btw, you are saying it isn't recommended to compile gentoo with those
flags; i have compiled everything with this:

-march=athlon-xp -O3 -pipe -mmmx -msse -m3dnow -mfpmath=sse,387
-fexpensive-optimizations -fstack-protector -fomit-frame-pointer
-funroll-loops -fforce-addr -falign-functions=4 -frerun-loop-opt
-frerun-cse-after-loop -maccumulate-outgoing-args -fprefetch-loop-arrays

and it's stable and really fast ;)
the thing with -fprofile-arcs is interesting, i will give it a shot!
i wonder too, what are your system specs?

On Tue, 2003-10-28 at 22:17, Javier Villavicencio wrote:
> These are the results of benchmarking gcc optimizations by compiling povray
> (www.povray.org) and using the benchmark.ini and the skyvase.pov from the
> unofficial benchmarks pages.
> Of course the timings aren't very accurate (I should have used a more
> time-consuming render, but I liked this one). To be fairer with the results,
> I ran each compilation again and again (more than 20 times) and posted here
> the -fastest- of those timings (the fastest of the 20 runs for each
> compilation). I used the "time" command because I didn't like the accuracy
> of povray's own timing (it shows only seconds, not milliseconds).
> (Also read the part about branch probabilities: if there is a way to add this
> to gentoo, then gentoo will run faster than WARP13 :+)
> 
> Commandline: "time nice -n -20 povray skyvase.pov" (using benchmark.ini)
> 
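
The "run it 20+ times and keep the fastest" method can be scripted. A minimal sketch, assuming GNU date (for %N) and awk are available; fastest_of is a hypothetical helper name, not something shipped with povray, and it measures wall-clock time only (the "real" line of time):

```sh
#!/bin/sh
# fastest_of N CMD [ARGS...]
# Run CMD N times and print the fastest wall-clock time in seconds.
# Hypothetical helper; assumes GNU date (%N) and awk.
fastest_of() {
    runs=$1
    shift
    best=""
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s.%N)
        "$@" > /dev/null 2>&1
        end=$(date +%s.%N)
        elapsed=$(awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f", e - s }')
        # Keep the smallest time seen so far.
        if [ -z "$best" ] || awk -v a="$elapsed" -v b="$best" 'BEGIN { exit !(a < b) }'; then
            best=$elapsed
        fi
        i=$((i + 1))
    done
    echo "$best"
}
```

Usage would look like "fastest_of 20 povray skyvase.pov"; prefixing the command with "nice -n -20" as in the mail needs root.
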
> CFLAGS= -O3 -march=athlon-xp -fomit-frame-pointer
> real    0m3.156s
> user    0m2.996s
> sys     0m0.161s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -fomit-frame-pointer
> real    0m3.002s
> user    0m2.846s
> sys     0m0.157s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -finline-functions -fomit-frame-pointer   <- -O3 adds this flag
> real    0m3.197s
> user    0m3.039s
> sys     0m0.158s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -fomit-frame-pointer   <- -O3 adds this flag ! this is the fast one !
> real    0m2.993s
> user    0m2.834s
> sys     0m0.159s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -mpreferred-stack-boundary=2 \ <- slower?
>       -fomit-frame-pointer
> real    0m3.326s
> user    0m3.158s
> sys     0m0.168s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -mpreferred-stack-boundary=4 \ <- RTFM, implied default
>       -fomit-frame-pointer
> real    0m2.996s
> user    0m2.834s
> sys     0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -mpreferred-stack-boundary=8 \ <- I already RTFM, slower, ok.
>       -fomit-frame-pointer
> real    0m3.021s
> user    0m2.860s
> sys     0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \      <- I didn't add -mpreferred... because it's implied
>       -fomit-frame-pointer                                            <- now with -malign-double: FASTER!
> real    0m2.959s
> user    0m2.802s
> sys     0m0.158s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \      <- almost same as before, new flag implied
>       -m96bit-long-double -fomit-frame-pointer
> real    0m2.982s
> user    0m2.802s
> sys     0m0.181s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \      <- 128bit long double slower.
>       -m128bit-long-double -fomit-frame-pointer
> real    0m3.018s
> user    0m2.858s
> sys     0m0.161s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \      <- almost the same as without -mmmx, implied?
>       -mmmx -fomit-frame-pointer
> real    0m2.969s
> user    0m2.802s
> sys     0m0.167s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \      <- again, maybe implied?
>       -mmmx -msse -fomit-frame-pointer
> real    0m2.965s
> user    0m2.803s
> sys     0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \      <- no noticeable effect yet,
>       -mmmx -msse -m3dnow -fomit-frame-pointer                        <- maybe implied?
> real    0m2.962s
> user    0m2.803s
> sys     0m0.159s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- what happens without mmx?
>       -msse -m3dnow -fomit-frame-pointer                              <- nothing :+/
> real    0m2.964s
> user    0m2.802s
> sys     0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- and without sse?
>       -m3dnow -fomit-frame-pointer                                    <- bah, nothing :+/
> real    0m2.974s
> user    0m2.805s
> sys     0m0.169s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- I was reading the info...
>       -mno-push-args -fomit-frame-pointer                             <- and I found this... not too much, and I don't like it :+P.
> real    0m2.972s
> user    0m2.804s
> sys     0m0.168s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- this implies the last one, well, see what happens..
>       -maccumulate-outgoing-args -fomit-frame-pointer                 <- faster, but bigger code size. (not a lot of space here)
> real    0m2.969s
> user    0m2.799s
> sys     0m0.170s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- Huh, faster, huh.
>       -maccumulate-outgoing-args -mno-align-stringops \
>       -fomit-frame-pointer
> real    0m2.948s
> user    0m2.781s
> sys     0m0.168s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- again I'm reading the info...
>       -maccumulate-outgoing-args -mno-align-stringops \               <- 17ms slower. bah.
>       -minline-all-stringops -fomit-frame-pointer
> real    0m2.968s
> user    0m2.798s
> sys     0m0.170s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- -fforce-mem is in -O2...
>       -maccumulate-outgoing-args -mno-align-stringops \               <- what about -fforce-addr?
>       -fforce-addr -fomit-frame-pointer                               <- mbu. slower.
> real    0m3.132s
> user    0m2.970s
> sys     0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- -fbranch-count-reg is enabled with -O2
>       -maccumulate-outgoing-args -mno-align-stringops \               <- what happens disabling this?
>       -fno-branch-count-reg -fomit-frame-pointer                      <- uhm, it's enabled for a good reason (:+P)
> real    0m2.958s
> user    0m2.794s
> sys     0m0.164s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- slow like hell.
>       -maccumulate-outgoing-args -mno-align-stringops \
>       -fmove-all-movables -freduce-all-givs -freduce-all-givs -fomit-frame-pointer
> real    0m3.198s
> user    0m3.038s
> sys     0m0.160s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- this one generates imprecise math code
>       -maccumulate-outgoing-args -mno-align-stringops \               <- but not so imprecise ;+P
>       -ffast-math -fomit-frame-pointer
> real    0m3.043s
> user    0m2.881s
> sys     0m0.162s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- let's play with -mfpmath
>       -maccumulate-outgoing-args -mno-align-stringops \               <- sse: slower
>       -mfpmath=sse -fomit-frame-pointer
> real    0m3.048s
> user    0m2.890s
> sys     0m0.158s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- 387:
>       -maccumulate-outgoing-args -mno-align-stringops \               <- mmm, better...
>       -mfpmath=387 -fomit-frame-pointer
> real    0m2.952s
> user    0m2.788s
> sys     0m0.164s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- sse,387:
>       -maccumulate-outgoing-args -mno-align-stringops \               <- b00, slower...
>       -mfpmath=sse,387 -fomit-frame-pointer
> real    0m3.104s
> user    0m2.941s
> sys     0m0.163s
> *********************************************************************
> ************************branch probabilities*****************************
> *********************************************************************
> This is the end of the CFLAGS that gentoo can take. What follows works this way:
> You first compile a program with -fprofile-arcs, then run the program for a while.
> When you do this, the program runs slower than hell, but don't worry: it's recording
> information about branch probabilities alongside your already compiled code.
> (Without this, GCC can only guess which way each branch goes; with it, the
> instrumented program writes the branch flow to a .da file with the same name as
> the .c/.o file being executed, so DON'T delete your directory with the source code.)
> After compiling with -fprofile-arcs and running the compiled program, you have to
> recompile it with -fbranch-probabilities, and the compiler will read the branch data
> from the already generated .da files and optimize the code for the most commonly
> taken, and most time-consuming, paths. Just look at what happens:
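
The two-pass build described above can be sketched roughly like this. It is only an illustration for a single C file, not how povray or portage is actually built: pgo_build and BASE_CFLAGS are hypothetical names, gcc is assumed to be on the PATH, and newer GCC releases name the data file .gcda rather than .da.

```sh
#!/bin/sh
# pgo_build SRC OUT
# Two-pass build of one C file using recorded branch data.
# Hypothetical helper; BASE_CFLAGS is an assumed variable with the usual flags.
BASE_CFLAGS="${BASE_CFLAGS:--O2 -fomit-frame-pointer}"

pgo_build() {
    src=$1
    out=$2
    # Pass 1: instrumented build; the binary counts branch arcs while it runs.
    gcc $BASE_CFLAGS -fprofile-arcs -c "$src" -o "${out}.o" || return 1
    gcc -fprofile-arcs "${out}.o" -o "$out" || return 1
    # Training run: writes the profile data file next to the object file,
    # so the build directory must be kept.
    "./$out" > /dev/null || return 1
    # Pass 2: rebuild so GCC can read the recorded branch counts.
    gcc $BASE_CFLAGS -fbranch-probabilities -c "$src" -o "${out}.o" || return 1
    gcc "${out}.o" -o "$out"
}
```

The training run should resemble real use (for povray, rendering a typical scene), since the second pass optimizes for whatever paths that run exercised.
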
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- now, the real part. profiling.
>       -maccumulate-outgoing-args -mno-align-stringops \               <- first we compile with -fprofile-arcs
>       -mfpmath=387 -fprofile-arcs -fomit-frame-pointer                <- (compile with -p and use gprof to see nice stats)
> real    0m4.048s
> user    0m3.882s
> sys     0m0.166s
> *********************************************************************
> CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \     <- now, gcc is using the profiled data
>       -maccumulate-outgoing-args -mno-align-stringops \               <- what can be faster than this?? :+)
>       -mfpmath=387 -fbranch-probabilities -fomit-frame-pointer
> real    0m2.900s
> user    0m2.733s
> sys     0m0.167s
> *********************************************************************
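
To put the last result in perspective, the relative gain can be worked out from the "real" times quoted above. A small sketch, assuming awk is available; speedup_pct is a hypothetical helper name:

```sh
#!/bin/sh
# speedup_pct BASE NEW
# Print how much faster NEW is than BASE, as a percentage of BASE.
speedup_pct() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

speedup_pct 3.156 2.900   # -O3 baseline vs. profiled build: prints 8.1
speedup_pct 2.952 2.900   # same flags without profile data vs. with: prints 1.8
```

So on this render the profile data itself accounts for roughly 2% of the run time; most of the rest of the gain over plain -O3 comes from the flag selection.
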
> 
> --
> [EMAIL PROTECTED] mailing list
-- 
Regards, Redeeman
()  ascii ribbon campaign - against html e-mail 
/\                        - against microsoft attachments

