On Tue, Dec 13, 2016 at 6:34 AM, Josh Poimboeuf <jpoim...@redhat.com> wrote:
> On Mon, Dec 12, 2016 at 05:05:11PM -0600, Josh Poimboeuf wrote:
>> On Mon, Dec 12, 2016 at 11:33:54PM +0100, Borislav Petkov wrote:
>> > On Mon, Dec 12, 2016 at 04:11:47PM -0600, Josh Poimboeuf wrote:
>> > > Yes, please.
>> >
>> > Attached.
>>
>> Thanks, I was able to recreate.  Will take a look tomorrow.
>
> Figured it out.  Your config has CONFIG_PARAVIRT=n, which convinces gcc
> to create the following preamble for x86_64_start_kernel():
>
>   0000000000000124 <x86_64_start_kernel>:
>    124: 4c 8d 54 24 08          lea    0x8(%rsp),%r10
>    129: 48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
>    12d: 41 ff 72 f8             pushq  -0x8(%r10)
>    131: 55                      push   %rbp
>    132: 48 89 e5                mov    %rsp,%rbp
>
> It's an unusual pattern which aligns rsp (though in this case it's
> already aligned) and saves the start_cpu() return address again on the
> stack before storing the frame pointer.

Um, what?  I can reproduce it -- I get:

0000000000000124 <x86_64_start_kernel>:
 124:   4c 8d 54 24 08          lea    0x8(%rsp),%r10
 129:   48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
 12d:   41 ff 72 f8             pushq  -0x8(%r10)
 131:   55                      push   %rbp
 132:   48 89 e5                mov    %rsp,%rbp
 135:   41 57                   push   %r15
 137:   41 56                   push   %r14
 139:   41 55                   push   %r13
 13b:   41 54                   push   %r12
 13d:   41 52                   push   %r10
 13f:   53                      push   %rbx
 140:   48 83 ec 10             sub    $0x10,%rsp

...

and the epilog looks like:

 29c:   58                      pop    %rax
 29d:   5a                      pop    %rdx
 29e:   5b                      pop    %rbx
 29f:   41 5a                   pop    %r10
 2a1:   41 5c                   pop    %r12
 2a3:   41 5d                   pop    %r13
 2a5:   41 5e                   pop    %r14
 2a7:   41 5f                   pop    %r15
 2a9:   5d                      pop    %rbp
 2aa:   49 8d 62 f8             lea    -0x8(%r10),%rsp
 2ae:   c3                      retq

This is, I think, *terrible* code.  It makes it entirely impossible
for the CPU to look through the retq until the instruction right
before it finishes because there's no way the CPU can tell what rsp is
until the instruction right before the retq.  And it's saving and
restoring an entire extra register (r10) instead of just using rbp for
this purpose.  *And* the extra copy of the return address seems
totally useless except for unwinding.

This does indeed depend on CONFIG_PARAVIRT, but I'm not seeing what
changes.  Presumably something related to what happens in the
function?

I want to file a GCC bug, though.  This code sucks.

>
> The unwinder assumes the last stack frame header is at a certain offset,
> but the above code breaks that assumption.  I still need to think about
> the best way to fix it.

Have a dummy written-in-asm top-of-the-stack function?  Or recognize
the end by the final saved RBP?

--Andy

Reply via email to