On 2007-02-16, David Brown <[email protected]> wrote:

>> I finally settled on -O1. It generates the smallest frame sizes
>> (about 25% smaller than -O2) with almost no code size penalty
>> compared to -O2.  Frame sizes blow up pretty badly using -O3
>> because functions that are only called once get inlined.
>
> This should only be the case if you have, as you say, been overly clever 
> and confused the code flow analysis.  Otherwise things will work out 
> much the same even if some functions are inlined - the increase in frame 
> size in the calling function is offset by avoiding a new frame when the 
> function is called.

Not always.

Consider the case where there are three functions, A, B, C.

With -O1, each has a stack frame size of 16 bytes.

A calls B once, but calls C multiple times.

With -O1, the max stack depth is 32 (A's frame plus B's or C's).

With -O2, the call to B gets inlined, but not the calls to C.

Now the stack frame for A is 32 (its own 16 plus B's 16), and C
is still 16.  The max stack depth is 48.
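
To make the arithmetic concrete, here is a rough sketch of that model in C.
It assumes the callees are leaf functions and that an inlined callee adds
its whole frame to the caller's frame with no slot sharing, which is the
pessimistic case described above. The names worst_case_depth and
struct callee are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One direct callee of the function being analysed (hypothetical model). */
struct callee {
    unsigned frame_bytes;  /* frame size when compiled out of line */
    int inlined;           /* non-zero if the compiler inlined the call */
};

/*
 * Worst-case stack depth for a caller and its leaf callees.
 * Assumption: an inlined callee's frame is simply added to the caller's
 * frame. Out-of-line callees are on the stack one at a time, so only the
 * largest of them counts.
 */
unsigned worst_case_depth(unsigned caller_frame,
                          const struct callee *callees, size_t n)
{
    unsigned frame = caller_frame;
    unsigned deepest_call = 0;

    assert(callees != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (callees[i].inlined)
            frame += callees[i].frame_bytes;
        else if (callees[i].frame_bytes > deepest_call)
            deepest_call = callees[i].frame_bytes;
    }
    return frame + deepest_call;
}
```

With A, B and C at 16 bytes each, this gives 32 when nothing is inlined and
48 when only B is inlined, matching the numbers above.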

> The frame size should be the maximum of the space needed for
> the different branches, rather than the sum, so you are only 
> going to waste significant space if you have a large frame
> function that is inlined, while other branches call large
> frame functions.

Right. Any time the compiler inlines some of the calls (with
non-zero frame sizes) made by a given function but not others,
the total stack depth will increase.

> You can always get around this sort of thing by using the
> "noinline" function attribute - it is better to be explicit in
> your source code than relying on the compiler flags chosen
> when compiling.

Good point.  There's nothing to prevent a future compiler
version from deciding to inline a function at -O1.
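
For the A/B/C case, that might look something like the sketch below. It
uses GCC's __attribute__((noinline)); the function names and the 16-byte
scratch buffers are only stand-ins for real locals, not code from the
actual project.

```c
#include <assert.h>

/* B is called once. Without the attribute the compiler may inline it
 * and fold its locals into A's frame. */
__attribute__((noinline)) static int b_once(int x)
{
    volatile char scratch[16];  /* stand-in for B's locals */
    scratch[0] = (char)x;
    return scratch[0] + 1;
}

/* C is called several times and is kept out of line as well. */
__attribute__((noinline)) static int c_many(int x)
{
    volatile char scratch[16];  /* stand-in for C's locals */
    scratch[0] = (char)x;
    return scratch[0] * 2;
}

/* A: since B and C never run at the same time, the worst-case depth
 * stays at A's frame plus the larger of B's and C's. */
int a(int x)
{
    int r;

    assert(x >= 0 && x < 16);  /* keep the values within char range */

    r = b_once(x);
    r = c_many(r);
    r = c_many(r);
    return r;
}
```

Marking the functions in the source keeps the frame layout independent of
whichever -O level the build happens to use.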

-- 
Grant Edwards                   grante             Yow!  Will this
                                  at               never-ending series of
                               visi.com            PLEASURABLE EVENTS never
                                                   cease?

