On Thu, Jul 11, 2019 at 11:20 PM Jan Kiszka <jan.kis...@siemens.com> wrote:
>
> On 11.07.19 22:48, Richard Weinberger wrote:
> > On Thu, Jul 11, 2019 at 8:30 PM Jan Kiszka <jan.kis...@siemens.com> wrote:
> >>
> >> On 11.07.19 12:25, Richard Weinberger wrote:
> >>> On Thu, Jul 11, 2019 at 12:21 PM Jan Kiszka <jan.kis...@siemens.com> wrote:
> >>>> Can't reproduce so far, even with a while-true loop. Can you share your .config?
> >>>
> >>> Sure, see attachment.
> >>>
> >>
> >> This seems to fix the issue here:
> >>
> >> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> >> index 119fd66d111e..8f647c208cf2 100644
> >> --- a/arch/x86/entry/entry_64.S
> >> +++ b/arch/x86/entry/entry_64.S
> >> @@ -997,8 +997,8 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt
> >>  \skip_label:
> >>         UNWIND_HINT_REGS
> >>         DISABLE_INTERRUPTS(CLBR_ANY)
> >> -       testl   %ebx, %ebx      /* %ebx: return to kernel mode */
> >> -       jnz     retint_kernel_early
> >> +       testb   $3, CS(%rsp)
> >> +       jz      retint_kernel_early
> >>         jmp     retint_user_early
> >>         .endif
> >>  1001:
> >>
> >> Tests welcome!
> >
> > With that change I can no longer trigger the crash.
>
> Perfect.
>
> > Can you please give more context? I'd like to understand the problem.
> >
>
> We were basing the decision whether to switch GS on return or not on a stale
> register (ebx). That register used to contain the information, but that changed
> with "x86/entry/64: Remove %ebx handling from error_entry/exit". This caused CPU
> state corruptions under certain conditions, apparently only when dealing with
> #DB exceptions, not with the way more frequent #PF.

Ah! Upstream b3681dd548d0 ("x86/entry/64: Remove %ebx handling from
error_entry/exit") moved the user-vs-kernel information from ebx to the
saved CS. Now things make sense again. :-)
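
For anyone following along, here is how I read the new
"testb $3, CS(%rsp)" / "jz retint_kernel_early" pair from the patch,
written as C. It is only a sketch: struct saved_frame and
returns_to_user() are made-up names for illustration, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the saved register frame. */
struct saved_frame {
	uint64_t cs;	/* code segment selector saved on exception entry */
};

/*
 * The low two bits of a segment selector hold the requested privilege
 * level: 0 means the interrupted context was kernel mode, non-zero
 * (3 on Linux) means user mode.
 */
static bool returns_to_user(const struct saved_frame *f)
{
	return (f->cs & 3) != 0;
}
```

With the selector values I would expect on x86-64 Linux (0x10 for kernel
code, 0x33 for user code; treat these as an assumption), the first gives
false and the second true. The point is that the decision comes from the
saved frame itself and not from a scratch register that may be stale.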

Thanks for the quick fix and the explanation!

-- 
Thanks,
//richard
