Andrea Arcangeli wrote:

> For x86 all function parameters are passed across the stack (little endian
> so the first push refer the last argument) while the retval is returned in
> eax if the function returns 32bit. If the function return a large struct
> it should be returned the pointer to the struct. I don't know the
> details, if somebody would know the details I could be interested too
> (just for curiosity).

On x86, structs are returned by transparent reference, i.e. the caller
allocates space for the return value and passes a pointer to it as a
hidden argument. IOW,

        struct foo bar(void)

is implemented as if it were

        void bar(struct foo *result)
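
To make the equivalence concrete, here is a minimal sketch in C. The
layout of `struct foo' and the helpers `bar_lowered' and `foo_sum' are
made up for illustration; the exact code a given compiler emits will
differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, too large to fit in eax. */
struct foo {
    int a, b, c, d;
};

/* What the programmer writes: the struct is returned by value. */
struct foo bar(void)
{
    struct foo f = { 1, 2, 3, 4 };
    return f;
}

/* Roughly what the compiler implements: the caller owns the storage
 * and passes its address; the callee fills it in. */
void bar_lowered(struct foo *result)
{
    assert(result != NULL);
    result->a = 1;
    result->b = 2;
    result->c = 3;
    result->d = 4;
}

/* Hypothetical helper so that the two forms can be compared. */
int foo_sum(const struct foo *f)
{
    return f->a + f->b + f->c + f->d;
}
```

A caller would write either `struct foo x = bar();' or
`struct foo y; bar_lowered(&y);' and end up with the same contents.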

> Sure you have not to preserve eax inside the function call, since it
> has to contain the retval...

The callee is free to clobber eax, ecx and edx; under the usual i386
convention it must preserve ebx, esi, edi and ebp (I don't know about
the FPU state).

> BTW, Glynn some mail ago you said that the x86 would _not_ perform better
> a lot passing arguments to the function through registers, and this is not
> true since for example the eax register has not to be preserved at all.

You don't have to preserve the scratch registers (eax, ecx, edx) across
a function call. However, you do need to preserve parameters for as
long as you wish to use them.

Given that registers are scarce on the x86, and many instructions need
specific registers (e.g. multiply uses eax/edx, the autoincrementing
instructions use esi and edi, etc), in many cases you will end up just
saving the registers in memory so that you can reuse them.

> It would be nice to pass the last parameter of the function call in
> the eax register and the other parameters across the stack as usual. 
> I think it would help a lot in performance. I'll try to discover
> the improvement.

If the function performs a multiply, or calls another function, you're
just going to have to save eax in memory anyhow, so it may as well
start off there.
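
A small sketch of the case I mean, using gcc's `regparm' attribute
(which exists only for 32-bit x86; the `REGPARM1' macro and the
function names are made up here, and the macro expands to nothing on
other targets). This is not the snipped test program.

```c
#include <assert.h>

/* Hypothetical macro: request regparm(1) only where gcc supports it. */
#if defined(__GNUC__) && defined(__i386__)
#define REGPARM1 __attribute__((regparm(1)))
#else
#define REGPARM1
#endif

/* Any called function returns its int result in eax. */
int helper(int x)
{
    return x * 3;
}

/* With regparm(1), t arrives in eax. Since helper() clobbers eax with
 * its return value, t has to be moved somewhere else (a stack slot,
 * or a callee-saved register that itself must be pushed) before the
 * call, because it is still needed afterwards. */
REGPARM1 int scaled_sum(int t)
{
    int h = helper(t);
    return t + h;
}

/* Usage: the result is the same whichever convention is in effect. */
void check_scaled_sum(void)
{
    assert(scaled_sum(5) == 20);
}
```

Comparing the output of `gcc -m32 -O2 -S' with and without the
attribute should show where t ends up being kept across the call.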

> ...<Some time passed>...
> 
> Here the example:

[snipped]

> andrea@dragon:/tmp$ gcc regparm-pentium.c -O2

What is the difference if you use `-O3' (which enables the
`-finline-functions' option)?

> andrea@dragon:/tmp$ ./a.out
> Fast latency: 1007, normal latency 1307
> 
> Using eax for passing the first argument we get a bit improvement. Sure
> this example is the best to get nice numbers from regparm(1), but the
> improvement exists in every case (except for `function(void)' of course).
> So really only history reasons are sucking performance on x86...

Trivial functions would typically benefit most from a register-based
calling convention. But in practice, a function as trivial as `return
++t' wouldn't be a separate function. And the simple functions that do
exist in a real world program would often be inlined. So this example
doesn't really tell us how much difference a register-based calling
convention would make to real world programs.
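
For illustration, a sketch of the two situations (all names here are
made up, and this is not the snipped benchmark): the same trivial
function once forced to remain a real call, and once left for the
compiler to inline.

```c
#include <assert.h>

/* gcc's noinline attribute forces a real call; empty elsewhere. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Out-of-line version: every use pays for the calling convention. */
NOINLINE unsigned inc_call(unsigned t)
{
    return ++t;
}

/* Inlinable version: at -O2 or above the call is likely to disappear,
 * so the calling convention no longer matters. */
static inline unsigned inc_inline(unsigned t)
{
    return ++t;
}

unsigned run_calls(unsigned n)
{
    unsigned i, t = 0;
    for (i = 0; i < n; i++)
        t = inc_call(t);
    return t;
}

unsigned run_inline(unsigned n)
{
    unsigned i, t = 0;
    for (i = 0; i < n; i++)
        t = inc_inline(t);
    return t;
}

/* Usage: both loops compute the same value; only the cost differs. */
void check_runs(void)
{
    assert(run_calls(1000) == run_inline(1000));
}
```

Timing only `run_calls' measures the call overhead in isolation, which
is the best case for a register-based convention.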

-- 
Glynn Clements <[EMAIL PROTECTED]>
