Hello,

I have noticed what may be a similar problem. When a function is passed a constant pointer, GCC versions after 4.0 do not inline the function, whereas GCC 3 does.
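To make the pattern concrete, here is a minimal sketch of the two shapes I mean. All names (FACTOR, scale_out, scale_ret, sum_out, sum_ret) are made up for illustration and are not taken from any benchmark. Both variants compute the same value; the only difference is whether the result goes through a pointer or comes back by value. Whether GCC inlines either one depends on the version and flags, so treat this as something to measure, not a guarantee.

```c
/* Illustrative constant only; not taken from any real benchmark. */
static const double FACTOR = 0.5;

/* Variant 1: the result is written through a pointer parameter. */
void scale_out(double x, double *out)
{
    *out = x * FACTOR;
}

/* Variant 2: the same computation, returned by value. */
double scale_ret(double x)
{
    return x * FACTOR;
}

/* Caller for variant 1: a small loop, where inlining the callee matters. */
double sum_out(int n)
{
    double acc = 0.0;
    double tmp = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        scale_out((double)i, &tmp);
        acc += tmp;
    }
    return acc;
}

/* Caller for variant 2: identical loop using the by-value function. */
double sum_ret(int n)
{
    double acc = 0.0;
    int i;

    for (i = 0; i < n; i++)
        acc += scale_ret((double)i);
    return acc;
}
```

One way to check what the compiler did is to build with "gcc -O2 -S" and look at the generated assembly for sum_out and sum_ret: if a call to scale_out or scale_ret is still there, that function was not inlined at that call site.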
The effect can be seen in the old Whetstone benchmark, which runs faster when built with GCC 3.4.0:

http://www.netlib.org/benchmark/whetstone.c

As soon as the parameter double *Z is changed to double Z, the function is inlined and runs faster on GCC > 4. With GCC 3.4.0 it is always inlined.

void P3(double X, double Y, double *Z)
{
    double X1, Y1;

    X1 = X;
    Y1 = Y;
    X1 = T * (X1 + Y1);
    Y1 = T * (X1 + Y1);
    *Z = (X1 + Y1) / T2;
}

> Hi everyone,
>
> I've been looking at adding some code to a performance-critical section
> of OpenBIOS, and I'm quite confused by how some of the changes I am
> making are affecting the overall performance.
>
> For a benchmark, I am using a recursive fibonacci function to test the
> effect of any changes that I have made. The test machine is an Intel
> Core 2 x86 running under a 64-bit Debian Lenny installation.
>
> Firstly, here are the benchmark results running from the unmodified SVN
> source tree:
>
>
> bu...@zeno:~/src/openbios/openbios-devel$ time echo "28 fib-rec u. bye"
> | ./obj-x86/openbios-unix ./obj-x86/openbios-x86.dict
> Welcome to OpenBIOS v1.0 built on Nov 9 2009 17:12
> Type 'help' for detailed information
>
> [unix] Booting default not supported.
>
>> 28 fib-rec u. bye 6197ecb
> Farewell!
>
> ok
>
> real 0m37.946s
> user 0m37.178s
> sys 0m0.020s
>
>
> I then add a simple C function pointer below to the top of the forth
> kernel file (which is currently not referenced by any other code changes):
>
>
> void (*debughook) (void);
>
>
> If I then re-build and re-run the same benchmark, the results now look
> like this:
>
>
> bu...@zeno:~/src/openbios/openbios-devel$ time echo "28 fib-rec u. bye"
> | ./obj-x86/openbios-unix ./obj-x86/openbios-x86.dict
> Welcome to OpenBIOS v1.0 built on Nov 9 2009 17:17
> Type 'help' for detailed information
>
> [unix] Booting default not supported.
>
>> 28 fib-rec u. bye 6197ecb
> Farewell!
>
> ok
>
> real 0m52.564s
> user 0m52.027s
> sys 0m0.012s
>
>
> So I'm really confused as to how adding a simple function pointer in the
> global declaration section (without even adding any code to reference
> it) suddenly incurs an extra 40% overhead? Can anyone explain why this
> is, and/or point me to any suitable gcc optimisation guides?
>
> For reference, the gcc compiler is gcc 4.3.2 under Debian Lenny and the
> compile flags are:
>
> -Os -g -Wall -Wredundant-decls -Wshadow -Wpointer-arith
> -Wstrict-prototypes -Wmissing-declarations -Wundef -Wendif-labels
> -Wstrict-aliasing -Wwrite-strings -Wmissing-prototypes
>
>
> Many thanks,
>
> Mark.
>

Regards