On Tue, Sep 11, 2018 at 10:17:44AM +0200, Takashi Iwai wrote:
> [ seems like my previous post didn't go out properly; if you have
>   already received it, please discard this one ]

Sorry, I got it, it's just in my large queue :(

> Hi Rafael, Greg,
> 
> James Wang reported on SUSE bugzilla that his machine spews many
> AMD-Vi errors at reboot like:
> 
> [  154.907879] systemd-shutdown[1]: Detaching loop devices.
> [  154.954583] kvm: exiting hardware virtualization
> [  154.999953] usb 5-2: USB disconnect, device number 2
> [  155.025278] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.081360] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.136778] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.191772] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.247055] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.302614] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.358996] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.392155] usb 4-2: new full-speed USB device number 2 using ohci-pci
> [  155.413752] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.413762] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.560307] ohci-pci 0000:00:12.1: AMD-Vi: Event logged [IO_PAGE_FAULT 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.616039] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.667843] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.719497] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.772697] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.823919] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.875490] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.927258] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  155.979318] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  156.031813] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  156.084293] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.1 
> domain=0x0006 address=0x0000000000000080 flags=0x0020]
> [  156.272157] reboot: Restarting system
> [  156.290316] reboot: machine restart
> 
> And, James bisected and spotted that it's introduced by the commit
> 722e5f2b1eec ("driver core: Partially revert "driver core: correct
> device's shutdown order"").  Reverting the commit fixes the problem.
> 
> He also mentioned an Uncorrectable Machine Check Exception seen at
> shutdown, which no longer appears after the revert.  (Though
> it's not clear whether it's really related.)
> 
> The errors are clearly related to the USB device (a KVM device,
> IIRC), and the errors are not seen if the USB device is disconnected.
> 
> We experienced this at first with SLE15 kernel (4.12 with backports),
> but later the same issue was confirmed on 4.18.y and 4.19-rc2.  Also,
> it's confirmed that revert works on the upstream kernels, too.
> 
> Does this hit your radar?

Ugh, no, I haven't heard of this before.  Rafael?

So the revert fixes some machines, but others need the patch; this
isn't going to be fun :(

greg k-h
