Hello 

I may have noticed a similar problem. When a function takes a constant
pointer, GCC versions newer than 4.0 do not inline it, whereas GCC 3 does.

This can be seen with the old Whetstone benchmark, which runs faster when
built with GCC 3.4.0.

http://www.netlib.org/benchmark/whetstone.c

As soon as the parameter double *Z is changed to double Z, the function is
inlined and the benchmark runs faster with GCC newer than 4.0.
With GCC 3.4.0 it is always inlined.

void
P3(double X, double Y, double *Z)
{
    double X1, Y1;

    X1 = X;
    Y1 = Y;
    X1 = T * (X1 + Y1);
    Y1 = T * (X1 + Y1);
    *Z  = (X1 + Y1) / T2;
}
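
One way to check the inlining behaviour in isolation is sketched below. It is
only a rough test case, not code from whetstone.c: the names P3_ptr, P3_val
and p3_loop are made up for illustration, the values of T and T2 are assumed
so that the file stands alone, and P3_val returns the result by value, which
is not exactly the same change as turning double *Z into double Z.

```c
#include <assert.h>

/* Globals that P3 reads in whetstone.c. The values are assumed here only
   so that this sketch is self-contained. */
static double T  = 0.499975;
static double T2 = 2.0;

/* Form used in the benchmark: the result is stored through a pointer. */
static void P3_ptr(double X, double Y, double *Z)
{
    double X1 = X, Y1 = Y;

    X1 = T * (X1 + Y1);
    Y1 = T * (X1 + Y1);
    *Z = (X1 + Y1) / T2;
}

/* Hypothetical variant: same arithmetic, result returned by value. */
static double P3_val(double X, double Y)
{
    double X1 = X, Y1 = Y;

    X1 = T * (X1 + Y1);
    Y1 = T * (X1 + Y1);
    return (X1 + Y1) / T2;
}

/* Externally visible caller, so the generated assembly shows whether the
   two helpers were inlined or are still reached through call instructions. */
double p3_loop(int n)
{
    double z = 0.0, sum = 0.0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        P3_ptr(1.0, -1.0, &z);
        sum += z;
        sum += P3_val(1.0, -1.0);
    }
    return sum;
}
```

Compiling this with each GCC version using -S and looking for call
instructions inside p3_loop should show whether the pointer form is inlined.
Declaring the helpers inline and adding -Winline may also make GCC report
when it declines to inline them.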




> Hi everyone,
> 
> I've been looking at adding some code to a performance-critical section 
> of OpenBIOS, and I'm quite confused by how some of the changes I am 
> making are affecting the overall performance.
> 
> For a benchmark, I am using a recursive Fibonacci function to test the 
> effect of any changes that I have made. The test machine is an Intel 
> Core 2 x86 running under a 64-bit Debian Lenny installation.
> 
> Firstly, here are the benchmark results running from the unmodified SVN 
> source tree:
> 
> 
> bu...@zeno:~/src/openbios/openbios-devel$ time echo "28 fib-rec u. bye" 
> | ./obj-x86/openbios-unix ./obj-x86/openbios-x86.dict
> Welcome to OpenBIOS v1.0 built on Nov 9 2009 17:12
>   Type 'help' for detailed information
> 
> [unix] Booting default not supported.
> 
>> 28 fib-rec u. bye 6197ecb
> Farewell!
> 
>  ok
> 
> real    0m37.946s
> user    0m37.178s
> sys     0m0.020s
> 
> 
> I then add the simple C function pointer below to the top of the Forth 
> kernel file (it is currently not referenced by any other code):
> 
> 
> void (*debughook) (void);
> 
> 
> If I then re-build and re-run the same benchmark, the results now look 
> like this:
> 
> 
> bu...@zeno:~/src/openbios/openbios-devel$ time echo "28 fib-rec u. bye" 
> | ./obj-x86/openbios-unix ./obj-x86/openbios-x86.dict
> Welcome to OpenBIOS v1.0 built on Nov 9 2009 17:17
>   Type 'help' for detailed information
> 
> [unix] Booting default not supported.
> 
>> 28 fib-rec u. bye 6197ecb
> Farewell!
> 
>  ok
> 
> real    0m52.564s
> user    0m52.027s
> sys     0m0.012s
> 
> 
> So I'm really confused as to how adding a simple function pointer in the 
> global declaration section (without even adding any code to reference 
> it) suddenly incurs an extra 40% overhead? Can anyone explain why this 
> is, and/or point me to any suitable gcc optimisation guides?
> 
> For reference, the gcc compiler is gcc 4.3.2 under Debian Lenny and the 
> compile flags are:
> 
> -Os -g -Wall -Wredundant-decls -Wshadow -Wpointer-arith
> -Wstrict-prototypes -Wmissing-declarations -Wundef -Wendif-labels 
> -Wstrict-aliasing -Wwrite-strings -Wmissing-prototypes
> 
> 
> Many thanks,
> 
> Mark.
> 
Regards