> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Monday, December 14, 2015 9:37 PM
>
> On 14/12/2015 14:27, Gonglei (Arei) wrote:
> >>
> >> On 14/12/2015 13:49, Gonglei (Arei) wrote:
> >>>>>>> This patch introduces an NMI disable bit handler to fix the
> >>>>>>> problem and make the emulated CMOS like the real hardware.
> >>>>>
> >>>>> I think that this only works with -machine kernel_irqchip=off, however.
> >>> IIRC, the kernel_irqchip is disabled by default, and we used the
> >>> default value.
> >>
> >> No, it's enabled by default.
> >>
> >
> > Okay, yes, I looked at the source code again. That means kmod does the
> > NMI injection work, and the NMI will not pass through the QEMU side. So,
> > you think this patch cannot block NMI injection when kernel_irqchip=on?
>
> I am not sure. It depends on which NMIs are blocked by the bit. For
> example, the IOAPIC can deliver NMIs, and they wouldn't be blocked.
>
> Do you have any documentation, to see whether they can actually happen on
> emulated hardware? I guess we support the TCO watchdog, so yes.
>
Yes, the watchdog is one case, and we have another case that needs to use an
NMI to tell the guest to do something when the guest's CPU is stuck or
something like that.

I can also invoke the QMP command "inject-nmi" while SeaBIOS is trying to
disable NMIs by calling rtc_read() or rtc_write().
After the NMI injection, the guest reboots:

[2015-12-14 16:41:57] In resume (status=0)
[2015-12-14 16:41:57] In 32bit resume
[2015-12-14 16:41:57] =====Attempting a hard reboot====
[2015-12-14 16:41:58] SeaBIOS (version rel-1.8.1-0-g4adadbd-20151214_135833-linux-jAPTBr)
[snip]

So, I think we should handle those scenarios, just like the real hardware.

> > Maybe we should pass the nmi_disable bit to kmod when
> > kernel_irqchip=on, right?
>
> Yes, that's the idea.
>
That means I have much more work to do.

> But first of all, I've read the thread you linked, and I couldn't find the
> place where it says that the root cause is NMIs.
>
That's completely true. I don't have direct proof, but I think I have
eliminated all possible causes except NMIs.
Of course, if you find any other clues, please let me know.

The most troublesome thing is that I couldn't reproduce this problem. :(

Thanks,
-Gonglei
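
For readers following the thread, here is a minimal sketch of the behaviour being discussed: on PC hardware, bit 7 of a write to the CMOS index port (0x70) acts as the NMI disable bit, and the low 7 bits select the CMOS register. The model below latches that bit and uses it to gate a userspace NMI injection. It is not the patch itself and not QEMU code: `struct cmos_model`, `cmos_index_write()` and `cmos_try_inject_nmi()` are hypothetical names, and dropping a blocked NMI (rather than keeping it pending) is an assumption made for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

#define CMOS_NMI_DISABLE 0x80 /* bit 7 of a write to port 0x70 */
#define CMOS_INDEX_MASK  0x7f /* low 7 bits select the CMOS register */

/* Hypothetical device state for illustration; not QEMU's RTC state. */
struct cmos_model {
    uint8_t index;       /* currently selected CMOS register */
    bool nmi_disabled;   /* latched copy of bit 7 from the last 0x70 write */
    unsigned delivered;  /* NMIs that reached the guest */
    unsigned blocked;    /* NMIs dropped while the bit was set (assumption) */
};

/* Guest write to the index port: latch the register index and the NMI bit. */
static void cmos_index_write(struct cmos_model *s, uint8_t val)
{
    s->index = val & CMOS_INDEX_MASK;
    s->nmi_disabled = (val & CMOS_NMI_DISABLE) != 0;
}

/*
 * Userspace NMI source (for example a monitor command): deliver the NMI
 * only when the guest has not set the disable bit. Returns true if the
 * NMI was delivered.
 */
static bool cmos_try_inject_nmi(struct cmos_model *s)
{
    if (s->nmi_disabled) {
        s->blocked++;
        return false;
    }
    s->delivered++;
    return true;
}
```

A gate like this only sees NMIs that are injected from userspace. As noted above, with kernel_irqchip=on the injection is done by the kernel module, so the latched bit would have to be passed down to it before it could have any effect there.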