On Mon, 24 Feb 2014, H. Peter Anvin wrote:

> On 02/24/2014 11:30 AM, Peter Zijlstra wrote:
> > On Mon, Feb 24, 2014 at 02:13:29PM -0500, Steven Rostedt wrote:
> >> Ah, and x86_64 saves off the cr2 register when entering NMI and restores
> >> it before returning. But it seems to be missing from the i386 code.
> > 
> > arch/x86/kernel/nmi.c:
> > 
> > #define nmi_nesting_preprocess(regs)                                \
> >     do {                                                            \
> >             if (this_cpu_read(nmi_state) != NMI_NOT_RUNNING) {      \
> >                     this_cpu_write(nmi_state, NMI_LATCHED);         \
> >                     return;                                         \
> >             }                                                       \
> >             this_cpu_write(nmi_state, NMI_EXECUTING);               \
> >             this_cpu_write(nmi_cr2, read_cr2());                    \
> >     } while (0);                                                    \
> >     nmi_restart:
> > 
> > #define nmi_nesting_postprocess()                                   \
> >     do {                                                            \
> >             if (unlikely(this_cpu_read(nmi_cr2) != read_cr2()))     \
> >                     write_cr2(this_cpu_read(nmi_cr2));              \
> >             if (this_cpu_dec_return(nmi_state))                     \
> >                     goto nmi_restart;                               \
> >     } while (0)
> > 
> > That very much looks like saving/restoring CR2 to me.
> > 
> > FWIW; I hate how the x86_64 and i386 versions of this NMI nesting magic
> > are so completely different.
> 
> Is there any way that nmi_cr2 can end up getting overwritten by multiple
> nestings of some kind?  I would have thought it would have made more
> sense to spill cr2 onto the stack after the stack has been properly set up.

So how can I help with debugging this?

While the missing cr2 issue made debugging frustrating, I find the other 
aspects of the bug more serious:

  1.  Programs that are doing valid memory accesses can segfault,
      and worse:
  2.  This bug can cause an instant reboot of the system (I assume that
      with the right combination of memory accesses it somehow causes
      a triple fault?)

#2 is why I spent all of this time tracking this down: I couldn't leave a
machine fuzzing overnight without it rebooting.

Vince
