On Mon, May 23, 2016 at 11:36 AM, Andy Lutomirski <[email protected]> wrote:
> On Mon, May 23, 2016 at 8:23 AM, Josh Poimboeuf <[email protected]> wrote:
>> On Sat, May 21, 2016 at 12:04:51PM -0400, Brian Gerst wrote:
>>> --- a/arch/x86/entry/entry_64.S
>>> +++ b/arch/x86/entry/entry_64.S
>>> @@ -405,37 +405,29 @@ END(__switch_to_asm)
>>>   * A newly forked process directly context switches into this address.
>>>   *
>>>   * rax: prev task we switched from
>>> + * rbx: kernel thread func
>>> + * r12: kernel thread arg
>>>   */
>>>  ENTRY(ret_from_fork)
>>>       movq    %rax, %rdi
>>>       call    schedule_tail                   /* rdi: 'prev' task parameter */
>>>
>>> -     testb   $3, CS(%rsp)                    /* from kernel_thread? */
>>> +     testq   %rbx, %rbx                      /* from kernel_thread? */
>>>       jnz     1f
>>>
>>> -     /*
>>> -      * We came from kernel_thread.  This code path is quite twisted, and
>>> -      * someone should clean it up.
>>> -      *
>>> -      * copy_thread_tls stashes the function pointer in RBX and the
>>> -      * parameter to be passed in RBP.  The called function is permitted
>>> -      * to call do_execve and thereby jump to user mode.
>>> -      */
>>> -     movq    RBP(%rsp), %rdi
>>> -     call    *RBX(%rsp)
>>> -     movq    %rax, RAX(%rsp)
>>> -
>>> -     /*
>>> -      * Fall through as though we're exiting a syscall.  This makes a
>>> -      * twisted sort of sense if we just called do_execve.
>>> -      */
>>> -
>>> -1:
>>> +2:
>>>       movq    %rsp, %rdi
>>>       call    syscall_return_slowpath /* returns with IRQs disabled */
>>>       TRACE_IRQS_ON                   /* user mode is traced as IRQS on */
>>>       SWAPGS
>>>       jmp     restore_regs_and_iret
>>> +
>>> +1:
>>> +     /* kernel thread */
>>> +     movq    %r12, %rdi
>>> +     call    *%rbx
>>> +     movq    %rax, RAX(%rsp)
>>> +     jmp     2b
>>>  END(ret_from_fork)
>>
>> It seems really surprising that a kernel thread would be returning to
>> user space.  It would probably be a good idea to preserve the existing
>> comments about that.
>>
>
> Agreed.
>
> Which reminds me: at some point, on top of this series, we should
> consider either having multiple variants of ret_from_fork or otherwise
> generalizing the code.  If and when we implement CPL3 for *kernel*
> code (SGX and UEFI come to mind as possible use cases), we probably
> won't want to go through syscall_return_slowpath.

I don't understand what you mean by CPL3 kernel code.  Do you mean
something like the VDSO where the kernel maps the code into userspace?
Why would you want to do this?

--
Brian Gerst
