* Stas Sergeev <[EMAIL PROTECTED]> wrote:

> ENTRY(sysenter_entry)
>       movl TSS_sysenter_esp0(%esp),%esp
> sysenter_past_esp:
> -     sti
>       pushl $(__USER_DS)
>       pushl %ebp
> +     sti

ah, yes, sysenter. SYSENTER creates a degenerate 'small' stackframe with 
an esp0 that is missing the 5 entry words relative to the normal entry 
(int80 or irq) esp0 stackframe. These 5 words are: xss, esp, eflags, 
xcs, eip. The sysenter code sets them up manually.

now if an interrupt hits after the sti but before those 5 words are 
pushed, it will set up a 'same privilege level' stackframe, which has 
only eip/xcs/eflags, i.e. no esp/xss. If upon irq-return we then examine 
the stack due to your patch, it will be an incorrect stackframe -> kaboom.
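
to make the two frame shapes concrete, here is a minimal C sketch of the 
layouts described above. It is an illustration only, not kernel code: the 
struct and function names are made up for this example, and only the word 
names (xss, esp, eflags, xcs, eip) come from the text. Fields are listed 
from the lowest stack address upwards, i.e. in reverse push order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not a kernel structure: the 5 entry words present
 * after a normal entry (int80 or irq) from user mode, lowest address first. */
struct full_entry_frame {
    uint32_t eip;
    uint32_t xcs;
    uint32_t eflags;
    uint32_t esp;   /* only present when the privilege level changed */
    uint32_t xss;   /* only present when the privilege level changed */
};

/* Hypothetical model: the 3 words of a 'same privilege level' frame. */
struct same_priv_frame {
    uint32_t eip;
    uint32_t xcs;
    uint32_t eflags;
};

/* Size of each frame type, counted in 32-bit words. */
static size_t full_frame_words(void)
{
    return sizeof(struct full_entry_frame) / sizeof(uint32_t);
}

static size_t same_priv_frame_words(void)
{
    return sizeof(struct same_priv_frame) / sizeof(uint32_t);
}

/* Words that code expecting a full frame would read beyond the end of a
 * same-privilege frame: these are the esp and xss slots. */
static size_t missing_words(void)
{
    return full_frame_words() - same_priv_frame_words();
}

/* Sanity check of the model: esp and xss sit directly above eflags. */
static void check_frame_model(void)
{
    assert(offsetof(struct full_entry_frame, esp) ==
           sizeof(struct same_priv_frame));
    assert(missing_words() == 2);
}
```

any test that reads the esp/xss slots of a frame that is really of the 
3-word kind is looking at whatever happens to sit above it on the stack.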

your patch doesn't remove the condition, it only removes the crash, 
because it adds the 2 words of space that are needed - but the information 
relied on by your irq-return test is still bogus. At this point i'd 
suggest removing the ESP patch altogether.

the correct solution is to always let the sysenter path set up a full 
and correct stackframe before enabling irqs, and thus before allowing 
preemption (see the attached patch). This was a nasty bug in waiting. 
(I have not made this conditional on CONFIG_PREEMPT, to keep it simple 
and because the impact on irq latency is small and predictable. There's 
no runtime overhead.)
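
the effect of where the sti sits can be sketched with a toy C model. This 
is only an illustration under simplifying assumptions: the helper names are 
hypothetical, the push list is copied from the patch, and the model ignores 
the fact that sti only takes effect after the following instruction.

```c
#include <assert.h>
#include <stddef.h>

/* The five pushes the sysenter path does by hand, in program order.
 * The comments name the entry word each push provides. */
static const char *const sysenter_pushes[] = {
    "pushl $(__USER_DS)",     /* xss    */
    "pushl %ebp",             /* esp    */
    "pushfl",                 /* eflags */
    "pushl $(__USER_CS)",     /* xcs    */
    "pushl $SYSENTER_RETURN", /* eip    */
};

enum {
    FULL_FRAME_WORDS = sizeof(sysenter_pushes) / sizeof(sysenter_pushes[0])
};

/* Hypothetical helper: given how many of the pushes run before the sti,
 * return how many pushes are still outstanding when irqs get enabled.
 * Any non-zero result means an irq can see an incomplete entry frame. */
static int pushes_left_at_sti(int pushes_before_sti)
{
    assert(pushes_before_sti >= 0 && pushes_before_sti <= FULL_FRAME_WORDS);
    return FULL_FRAME_WORDS - pushes_before_sti;
}

/* Non-zero only if the full 5-word frame exists before irqs are enabled. */
static int frame_complete_at_sti(int pushes_before_sti)
{
    return pushes_left_at_sti(pushes_before_sti) == 0;
}
```

in this model the original code corresponds to 0 pushes before the sti, the 
quoted patch to 2, and the patch below to all 5, which is the only case 
where the frame is complete when irqs are enabled.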

so i think with the help of Stas the mystery has been fully explained 
and solved. Linus?

        Ingo

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>

--- linux/arch/i386/kernel/entry.S.orig
+++ linux/arch/i386/kernel/entry.S
@@ -179,12 +179,17 @@ need_resched:
 ENTRY(sysenter_entry)
        movl TSS_sysenter_esp0(%esp),%esp
 sysenter_past_esp:
-       sti
+       #
+       # irqs are disabled: set up an entry stackframe without
+       # allowing irqs to potentially preempt us with an
+       # incomplete entry frame!
+       #
        pushl $(__USER_DS)
        pushl %ebp
        pushfl
        pushl $(__USER_CS)
        pushl $SYSENTER_RETURN
+       sti
 
 /*
  * Load the potential sixth argument from user stack.