Re: [PATCH] i386: Don't use frame pointer without stack access

Richard Sandiford Wed, 09 Aug 2017 00:53:36 -0700

Richard Biener <richard.guent...@gmail.com> writes:
> On August 8, 2017 7:36:35 PM GMT+02:00, Richard Sandiford
> <richard.sandif...@linaro.org> wrote:
>>Richard Sandiford <richard.sandif...@linaro.org> writes:
>>> Richard Biener <richard.guent...@gmail.com> writes:
>>>> On August 8, 2017 6:38:30 PM GMT+02:00, "H.J. Lu"
>><hjl.to...@gmail.com> wrote:
>>>>>On Mon, Aug 7, 2017 at 1:05 PM, Richard Sandiford
>>>>><richard.sandif...@linaro.org> wrote:
>>>>>> Arjan van de Ven <ar...@linux.intel.com> writes:
>>>>>>> On 8/7/2017 8:43 AM, Jakub Jelinek wrote:
>>>>>>>> On Mon, Aug 07, 2017 at 08:39:24AM -0700, H.J. Lu wrote:
>>>>>>>>> When Linux/x86-64 kernel is compiled with
>>-fno-omit-frame-pointer.
>>>>>>>>> this optimization removes more than 730
>>>>>>>>>
>>>>>>>>> pushq %rbp
>>>>>>>>> movq %rsp, %rbp
>>>>>>>>> popq %rbp
>>>>>>>>
>>>>>>>> If you don't want the frame pointer, why are you compiling with
>>>>>>>> -fno-omit-frame-pointer?  Are you going to add
>>>>>>>> -fforce-no-omit-frame-pointer or something similar so that
>>people
>>>>>can
>>>>>>>> actually get what they are asking for?  This doesn't really make
>>>>>sense.
>>>>>>>> It is perfectly fine to omit frame pointer by default, when it
>>>>>isn't
>>>>>>>> required for something, but if the user asks for it, we
>>shouldn't
>>>>>ignore his
>>>>>>>> request.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> wanting a framepointer is very nice and desired...  ... but if
>>the
>>>>>>> optimizer/ins scheduler moves instructions outside of the frame'd
>>>>>>> portion, (it does it for cases like below as well), the value is
>>>>>>> already negative for these functions that don't have stack use.
>>>>>>>
>>>>>>> <MPIDU_Sched_are_pending@@Base>:
>>>>>>> mov    all_schedules@@Base-0x38460,%rax
>>>>>>> push   %rbp
>>>>>>> mov    %rsp,%rbp
>>>>>>> pop    %rbp
>>>>>>> cmpq   $0x0,(%rax)
>>>>>>> setne  %al
>>>>>>> movzbl %al,%eax
>>>>>>> retq
>>>>>>
>>>>>> Yeah, and it could be even weirder for big single-block functions.
>>>>>> I think GCC has been doing this kind of scheduling of prologue and
>>>>>> epilogue instructions for a while, so there hasn*t really been a
>>>>>> guarantee which parts of the function will have a new FP and which
>>>>>> will still have the old one.
>>>>>>
>>>>>> Also, with an arbitrarily-picked host compiler (GCC 6.3.1),
>>>>>shrink-wrapping
>>>>>> kicks in when the following is compiled with -O3
>>>>>-fno-omit-frame-pointer:
>>>>>>
>>>>>>     void f (int *);
>>>>>>     void
>>>>>>     g (int *x)
>>>>>>     {
>>>>>>       for (int i = 0; i < 1000; ++i)
>>>>>>         x[i] += 1;
>>>>>>       if (x[0])
>>>>>>         {
>>>>>>           int temp;
>>>>>>           f (&temp);
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>> so only the block with the call to f sets up FP.  The relatively
>>>>>> long-running loop runs with the caller's FP.
>>>>>>
>>>>>> I hope we can go for a target-independent position that what HJ*s
>>>>>> patch does is OK...
>>>>>>
>>>>>
>>>>>In light of this,  I am resubmitting my patch.  I added 3 more
>>>>>testcases
>>>>>and also handle:
>>>>>
>>>>>typedef int v8si __attribute__ ((vector_size (32)));
>>>>>
>>>>>void
>>>>>foo (v8si *out_start, v8si *out_end, v8si *regions)
>>>>>{
>>>>>    v8si base = regions[3];
>>>>>    *out_start = base;
>>>>>    *out_end = base;
>>>>>}
>>>>>
>>>>>OK for trunk?
>>>>
>>>> The invoker specified -fno-omit-frame-pointer, why did you eliminate
>>it?
>>>> I'd argue it's OK when neither -f nor -fno- is explicitly specified
>>>> irrespective of the default in case we document the change but an
>>>> explicit -fno- is pretty clear.
>>>
>>> I don't buy that we're ignoring the user.  -fomit-frame-pointer says
>>> that, when you're creating a frame, it's OK not to set up the frame
>>> pointer.  Forcing it off means that if you create a frame, you need
>>> to set up the frame pointer too.  But it doesn't say anything about
>>> whether the frame itself is needed.  I.e. it's
>>-fno-omit-frame*-pointer*
>>> rather than -fno-omit-frame.
>
> Isn't that a bit splitting hairs if you look at (past) history?


I guess it would have been splitting hairs in the days when they
amounted to the same thing, i.e. when there was no behaviour that
would match "-fomit-frame" and when the prologue and epilogue were
glued to the start and end of the function.  But that was quite a
long time ago.  Shrink-wrapping at least means that omitting the frame
and omitting the frame pointer are different things, and it seems
fair that -fomit-frame-pointer has followed the natural meaning.

> You could also interpret -fno-omit-frame-pointer as obviously forcing a
> frame as otherwise there's nothing to omit...

But applying that kind of interpretation to something like
-maccumulate-outgoing-args would make inlining all calls within a
function invalid, since there'd no longer be arguments to accumulate.

I think this kind of disagreement just emphasises that if we really
need a "always emit a prologue at the very start, an epilogue at the
very end, and always use a frame pointer" option, we should add it
and document exactly what the guarantees are.  I don't think
-fno-omit-frame-pointer should be it, since as the replies earlier in
the thread said, the natural meaning of that option has its uses too.

Thanks,
Richard

>
>>> It seems like the responses have been treating it more like
>>> a combination of:
>>>
>>> -fno-shrink-wrapping
>>> -fno-omit-frame-pointer
>>> the equivalent of the old textual prologues and epilogues
>>>
>>> but the positive option -fomit-frame-pointer doesn't have any effect
>>> on the last two.
>>
>>er, you know what I mean :-)  It doesn't have any effect on
>>-fshrink-wrapping or the textual-style prologues and epilogues.
>
> True.  But I think people do not appreciate new options too much if
> existing ones worked in the past...
>
> Richard.

Re: [PATCH] i386: Don't use frame pointer without stack access

Reply via email to