On Sun, May 22, 2016 at 1:59 PM, Andy Lutomirski <l...@amacapital.net> wrote: > cc: Josh Poimboeuf: do you care about the exact stack layout of the > bottom of the stack of an inactive task? > > On May 21, 2016 9:05 AM, "Brian Gerst" <brge...@gmail.com> wrote: >> >> Move the low-level context switch code to an out-of-line asm stub instead of >> using complex inline asm. This allows constructing a new stack frame for the >> child process to make it seamlessly flow to ret_from_fork without an extra >> test and branch in __switch_to(). It also improves code generation for >> __schedule() by using the C calling convention instead of clobbering all >> registers. > > I like the concept a lot. > >> >> Signed-off-by: Brian Gerst <brge...@gmail.com> >> --- >> arch/x86/entry/entry_32.S | 38 ++++++++++ >> arch/x86/entry/entry_64.S | 42 +++++++++++- >> arch/x86/include/asm/processor.h | 3 - >> arch/x86/include/asm/switch_to.h | 137 >> ++++++------------------------------- >> arch/x86/include/asm/thread_info.h | 2 - >> arch/x86/kernel/asm-offsets.c | 6 ++ >> arch/x86/kernel/asm-offsets_32.c | 5 ++ >> arch/x86/kernel/asm-offsets_64.c | 5 ++ >> arch/x86/kernel/process_32.c | 8 ++- >> arch/x86/kernel/process_64.c | 7 +- >> arch/x86/kernel/smpboot.c | 1 - >> 11 files changed, 124 insertions(+), 130 deletions(-) >> >> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S >> index ee6fea0..05e5340 100644 >> --- a/arch/x86/entry/entry_32.S >> +++ b/arch/x86/entry/entry_32.S >> @@ -204,6 +204,44 @@ >> POP_GS_EX >> .endm >> >> +/* >> + * %eax: prev task >> + * %edx: next task >> + */ >> +ENTRY(__switch_to_asm) >> + /* >> + * Save callee-saved registers >> + * This must match the order in struct fork_frame >> + * Frame pointer must be last for get_wchan >> + */ >> + pushl %ebx >> + pushl %edi >> + pushl %esi >> + pushl %ebp >> + >> + /* switch stack */ >> + movl %esp, TASK_threadsp(%eax) >> + movl TASK_threadsp(%edx), %esp >> + >> +#ifdef CONFIG_CC_STACKPROTECTOR >> + movl TASK_stack_canary(%edx), %ebx >> + movl %ebx, PER_CPU_VAR(stack_canary)+stack_canary_offset >> +#endif >> + >> + /* restore callee-saved registers */ >> + popl %ebp >> + popl %esi >> + popl %edi >> + popl %ebx > > This is highly, highly magical. eax and edx are prev and next, and:
What is so magical about the standard C calling convention (regarm(3) in the 32-bit case)? This just passes them right though to __switch_to(). >> + >> + jmp __switch_to > > leaves prev in eax. This works, but it might be worth a comment. Not quite, __switch_to() returns 'last', not 'prev'. The previous task when this is called is not the same task when the thread wakes up. >> +END(__switch_to_asm) > >> /* >> + * %rdi: prev task >> + * %rsi: next task >> + */ >> +ENTRY(__switch_to_asm) >> + /* >> + * Save callee-saved registers >> + * This must match the order in struct fork_frame >> + * Frame pointer must be last for get_wchan >> + */ >> + pushq %rbx >> + pushq %r12 >> + pushq %r13 >> + pushq %r14 >> + pushq %r15 >> + pushq %rbp >> + >> + /* switch stack */ >> + movq %rsp, TASK_threadsp(%rdi) >> + movq TASK_threadsp(%rsi), %rsp >> + >> +#ifdef CONFIG_CC_STACKPROTECTOR >> + movq TASK_stack_canary(%rsi), %rbx >> + movq %rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset >> +#endif >> + >> + /* restore callee-saved registers */ >> + popq %rbp >> + popq %r15 >> + popq %r14 >> + popq %r13 >> + popq %r12 >> + popq %rbx >> + >> + jmp __switch_to > > Ditto with the magic here. > >> +struct fork_frame { >> + unsigned long bp; >> +#ifdef CONFIG_X86_64 >> + unsigned long r15; >> + unsigned long r14; >> + unsigned long r13; >> + unsigned long r12; >> +#else >> + unsigned long si; >> + unsigned long di; >> +#endif >> + unsigned long bx; >> + unsigned long ret_addr; >> + struct pt_regs regs; >> +}; > > This, like the old implementation, is very much geared to the current > implementation of fork. Can you split it up: > > struct inactive_task_frame { > unsigned long bp; > ... > unsigned long ret_addr; > }; > > /* fork works by setting up the child stack so that switch_to will > land at ret_from_fork with sp pointing at pt_regs */ > struct fork_frame { > struct inactive_task_frame switch_frame; > struct pt_regs regs; > }; > > Then, if and when someone wants to fork into a different type of > context, they can reuse this. Also, a future improved unwinder can > use inactive_task_frame directly to kick off its unwind. Sounds reasonable. -- Brian Gerst