On Sat, Mar 05, 2016 at 09:52:20PM -0800, Andy Lutomirski wrote:
> Due to a blatant design error, SYSENTER doesn't clear TF.  As a result,
> if a user does SYSENTER with TF set, we will single-step through the
> kernel until something clears TF.  There is absolutely nothing we can
> do to prevent this short of turning off SYSENTER [1].
> 
> Simplify the handling considerably with two changes:
> 
> 1. We already sanitize EFLAGS in SYSENTER to clear NT and AC.  We can
>    add TF to that list of flags to sanitize with no overhead whatsoever.
> 
> 2. Teach do_debug to ignore single-step traps in the SYSENTER prologue.
> 
> That's all we need to do.
> 
> Don't get too excited -- our handling is still buggy on 32-bit
> kernels.  There's nothing wrong with the SYSENTER code itself, but
> the #DB prologue has a clever fixup for traps on the very first
> instruction of entry_SYSENTER_32, and the fixup doesn't work quite
> correctly.  The next two patches will fix that.
> 
> [1] We could probably prevent it by forcing BTF on at all times and
>     making sure we clear TF before any branches in the SYSENTER
>     code.  Needless to say, this is a bad idea.
> 
> Signed-off-by: Andy Lutomirski <l...@kernel.org>
> ---
>  arch/x86/entry/entry_32.S        | 42 ++++++++++++++++++++++----------
>  arch/x86/entry/entry_64_compat.S |  9 ++++++-
>  arch/x86/include/asm/proto.h     | 15 ++++++++++--
>  arch/x86/kernel/traps.c          | 52 
> +++++++++++++++++++++++++++++++++-------
>  4 files changed, 94 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index a8c3424c3392..7700cf695d8c 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -287,7 +287,26 @@ need_resched:
>  END(resume_kernel)
>  #endif
>  
> -     # SYSENTER  call handler stub
> +GLOBAL(__begin_SYSENTER_singlestep_region)
> +/*
> + * All code from here through __end_SYSENTER_singlestep_region is subject
> + * to being single-stepped if a user program sets TF and executes SYSENTER.
> + * There is absolutely nothing that we can do to prevent this from happening
> + * (thanks Intel!).  To keep our handling of this situation as simple as
> + * possible, we handle TF just like AC and NT, except that our #DB handler
> + * will ignore all of the single-step traps generated in this range.
> + */
> +
> +#ifdef CONFIG_XEN
> +/*
> + * Xen doesn't set %esp to be precisely what the normal SYSENTER
> + * entry point expects, so fix it up before using the normal path.
> + */
> +ENTRY(xen_sysenter_target)
> +     addl    $5*4, %esp                      /* remove xen-provided frame */
> +     jmp     sysenter_past_esp
> +#endif
> +
>  ENTRY(entry_SYSENTER_32)
>       movl    TSS_sysenter_sp0(%esp), %esp
>  sysenter_past_esp:

Can you do this below (ontop of your diff) and get rid of those
__{begin,end}_SYSENTER_singlestep_region and __end_entry_SYSENTER_compat
globals and use the entry_SYSENTER_{32|compat} symbols instead?

>From a quick scan, kallsyms can give you the symbol size too so that you
can compute where it ends:

readelf -a vmlinux | grep entry_SYSENTER
 55454: ffffffff8170aff0    99 FUNC    GLOBAL DEFAULT    1 entry_SYSENTER_compat
 62596: ffffffff8170b053     0 NOTYPE  GLOBAL DEFAULT    1 
__end_entry_SYSENTER_comp

0xffffffff8170aff0 + 99 = 0xffffffff8170b053

---
Index: b/arch/x86/entry/entry_32.S
===================================================================
--- a/arch/x86/entry/entry_32.S 2016-03-06 17:47:10.059733163 +0100
+++ b/arch/x86/entry/entry_32.S 2016-03-06 17:46:52.235733410 +0100
@@ -297,18 +297,13 @@ GLOBAL(__begin_SYSENTER_singlestep_regio
  * will ignore all of the single-step traps generated in this range.
  */
 
+ENTRY(entry_SYSENTER_32)
 #ifdef CONFIG_XEN
-/*
- * Xen doesn't set %esp to be precisely what the normal SYSENTER
- * entry point expects, so fix it up before using the normal path.
- */
-ENTRY(xen_sysenter_target)
        addl    $5*4, %esp                      /* remove xen-provided frame */
        jmp     sysenter_past_esp
-#endif
-
-ENTRY(entry_SYSENTER_32)
+#else
        movl    TSS_sysenter_sp0(%esp), %esp
+#endif
 sysenter_past_esp:
        pushl   $__USER_DS              /* pt_regs->ss */
        pushl   %ebp                    /* pt_regs->sp (stashed in bp) */

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

Reply via email to