Re: Non-optimal stack usage with C++ temporaries

2011-05-12 Thread Richard Guenther
On Wed, May 11, 2011 at 10:15 PM, Matt Fischer wrote:
> I've noticed some behavior with g++ that seems strange to me.  I don't
> know if there's some technicality in the C++ standard that requires
> this, or if it's just a limitation of the optimization code, but it
> seemed strange so I thought I'd see if anybody could shed more light
> on it.
>
> Here's a test program that illustrates the behavior:
>
> struct Foo {
>    char buf[256];
>    Foo() {} // suppress automatically-generated constructor code for clarity
>    ~Foo() {}
> };
>
> void func0(const Foo &);
> void func1(const Foo &);
> void func2(const Foo &);
> void func3(const Foo &);
>
> void f()
> {
>    func0(Foo());
>    func1(Foo());
>    func2(Foo());
>    func3(Foo());
> }
>
> Compiling with -O2 and "-fno-stack-protector -fno-exceptions" for
> clarity, on g++ 4.4.3, gives the following:
>
>  :
>   0:   55                      push   %ebp
>   1:   89 e5                   mov    %esp,%ebp
>   3:   81 ec 18 04 00 00       sub    $0x418,%esp
>   9:   8d 85 f8 fb ff ff       lea    -0x408(%ebp),%eax
>   f:   89 04 24                mov    %eax,(%esp)
>  12:   e8 fc ff ff ff          call   13 <_Z1fv+0x13>
>  17:   8d 85 f8 fc ff ff       lea    -0x308(%ebp),%eax
>  1d:   89 04 24                mov    %eax,(%esp)
>  20:   e8 fc ff ff ff          call   21 <_Z1fv+0x21>
>  25:   8d 85 f8 fd ff ff       lea    -0x208(%ebp),%eax
>  2b:   89 04 24                mov    %eax,(%esp)
>  2e:   e8 fc ff ff ff          call   2f <_Z1fv+0x2f>
>  33:   8d 85 f8 fe ff ff       lea    -0x108(%ebp),%eax
>  39:   89 04 24                mov    %eax,(%esp)
>  3c:   e8 fc ff ff ff          call   3d <_Z1fv+0x3d>
>  41:   c9                      leave
>  42:   c3                      ret
>
> The function makes four function calls, each of which constructs a
> temporary for the parameter.  The compiler dutifully allocates stack
> space to construct these, but it seems to allocate separate stack
> space for each of the temporaries.  This seems unnecessary--since
> their lifetimes don't overlap, the same stack space could be used for
> each of them.  The real-life code I adapted this example from had a
> fairly large number of temporaries strewn throughout it, each of which
> was quite large, so this behavior caused the generated function to
> use up a pretty substantial amount of stack, for what seems like no
> good reason.
>
> My question is, is this expected behavior?  My understanding of the
> C++ standard is that each of those temporaries goes away at the
> semicolon, so it seems like they have non-overlapping lifetimes, but I
> know there are some exceptions to that rule.  Could someone comment on
> whether this is an actual bug, or required for some reason by the
> standard, or just behavior that not enough people have run into
> problems with?

It's a missed optimization and not easy to fix.

Richard.

> Thanks,
> Matt
>


Non-optimal stack usage with C++ temporaries

2011-05-11 Thread Matt Fischer
I've noticed some behavior with g++ that seems strange to me.  I don't
know if there's some technicality in the C++ standard that requires
this, or if it's just a limitation of the optimization code, but it
seemed strange so I thought I'd see if anybody could shed more light
on it.

Here's a test program that illustrates the behavior:

struct Foo {
   char buf[256];
   Foo() {} // suppress automatically-generated constructor code for clarity
   ~Foo() {}
};

void func0(const Foo &);
void func1(const Foo &);
void func2(const Foo &);
void func3(const Foo &);

void f()
{
   func0(Foo());
   func1(Foo());
   func2(Foo());
   func3(Foo());
}

Compiling with -O2 and "-fno-stack-protector -fno-exceptions" for
clarity, on g++ 4.4.3, gives the following:

 :
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   81 ec 18 04 00 00       sub    $0x418,%esp
   9:   8d 85 f8 fb ff ff       lea    -0x408(%ebp),%eax
   f:   89 04 24                mov    %eax,(%esp)
  12:   e8 fc ff ff ff          call   13 <_Z1fv+0x13>
  17:   8d 85 f8 fc ff ff       lea    -0x308(%ebp),%eax
  1d:   89 04 24                mov    %eax,(%esp)
  20:   e8 fc ff ff ff          call   21 <_Z1fv+0x21>
  25:   8d 85 f8 fd ff ff       lea    -0x208(%ebp),%eax
  2b:   89 04 24                mov    %eax,(%esp)
  2e:   e8 fc ff ff ff          call   2f <_Z1fv+0x2f>
  33:   8d 85 f8 fe ff ff       lea    -0x108(%ebp),%eax
  39:   89 04 24                mov    %eax,(%esp)
  3c:   e8 fc ff ff ff          call   3d <_Z1fv+0x3d>
  41:   c9                      leave
  42:   c3                      ret

The function makes four function calls, each of which constructs a
temporary for the parameter.  The compiler dutifully allocates stack
space to construct these, but it seems to allocate separate stack
space for each of the temporaries.  This seems unnecessary--since
their lifetimes don't overlap, the same stack space could be used for
each of them.  The real-life code I adapted this example from had a
fairly large number of temporaries strewn throughout it, each of which
was quite large, so this behavior caused the generated function to
use up a pretty substantial amount of stack, for what seems like no
good reason.
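
For reference, the listing reserves 0x418 bytes and hands out four
buffers whose offsets are 0x100 (256) bytes apart, one per temporary.
One possible way to sidestep this, sketched here under assumptions and
not verified on g++ 4.4.3, is to build each temporary inside a small
out-of-line helper, so that it lives in the helper's frame and that
frame is released on return.  The helper call_with_temp, the NOINLINE
macro, the stub bodies of func0..func3, the counter, and self_check are
all hypothetical names added for illustration; __attribute__((noinline))
is a GCC/Clang extension.

```cpp
#include <cassert>

struct Foo {
    char buf[256];
    Foo() {}
    ~Foo() {}
};

// Hypothetical stub definitions so the sketch links on its own;
// the original program only declares these functions.
static int calls = 0;
void func0(const Foo &) { ++calls; }
void func1(const Foo &) { ++calls; }
void func2(const Foo &) { ++calls; }
void func3(const Foo &) { ++calls; }

// GCC/Clang extension (an assumption about the toolchain); it asks the
// compiler to keep the helper out of line.
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

// Hypothetical helper: the temporary is constructed in this function's
// frame, which goes away when the helper returns.
static NOINLINE void call_with_temp(void (*fn)(const Foo &))
{
    fn(Foo());
}

void f()
{
    call_with_temp(func0);
    call_with_temp(func1);
    call_with_temp(func2);
    call_with_temp(func3);
}

// Hypothetical self-check: all four callees still run exactly once.
void self_check()
{
    calls = 0;
    f();
    assert(calls == 4);
}
```

The trade-off is an extra call per temporary, and whether f() itself
ends up with a smaller frame still depends on the compiler and flags,
so it is worth checking the disassembly again after a change like this.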

My question is, is this expected behavior?  My understanding of the
C++ standard is that each of those temporaries goes away at the
semicolon, so it seems like they have non-overlapping lifetimes, but I
know there are some exceptions to that rule.  Could someone comment on
whether this is an actual bug, or required for some reason by the
standard, or just behavior that not enough people have run into
problems with?
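
The lifetime rule in question can be checked with a small trace, given
here as a sketch rather than anything taken from the real code.  A
temporary is destroyed at the end of the full expression that created
it, and the best-known exception is a temporary bound directly to a
const reference, which lives as long as the reference.  Tracked, use,
trace, and the three functions below are hypothetical names.

```cpp
#include <cassert>
#include <string>

static std::string trace;   // records events in the order they happen

struct Tracked {
    char buf[256];
    Tracked()  { trace += "C"; }   // constructed
    ~Tracked() { trace += "D"; }   // destroyed
};

void use(const Tracked &) { trace += "u"; }

// Same shape as f() in the test program: each temporary is destroyed
// before the next statement starts, so the lifetimes do not overlap.
void three_statements()
{
    use(Tracked());
    use(Tracked());
    use(Tracked());
}

// The exception: binding a temporary to a const reference extends its
// lifetime to that of the reference, so here two objects are alive at
// the same time and could not share storage.
void extended()
{
    const Tracked &r = Tracked();
    use(r);
    use(Tracked());
}

void run_demo()
{
    trace.clear();
    three_statements();
    assert(trace == "CuDCuDCuD");

    trace.clear();
    extended();
    assert(trace == "CuCuDD");
}
```

In f() none of the temporaries is bound to a named reference, so the
first pattern applies and the four objects are never alive at once.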

Thanks,
Matt