Resuming this old thread about an unfixed bug, which was introduced in qemu-4.2:
qemu ends up in piix_ide_reset from pci_unplug_disks. This was not the case prior 4.2, the removed call to qemu_register_reset(piix3_reset, d) in ee358e919e385fdc79d59d0d47b4a81e349cd5c9 did apparently nothing. In my debugging (with v8.0.0) it turned out the three pci_set_word causes the domU to hang. In fact, it is just the last one: pci_set_byte(pci_conf + 0x20, 0x01); /* BMIBA: 20-23h */ It changes the value from 0xc121 to 0x1. The question is: what does this do in practice? Starting with recent qemu (like 7.2), the domU sometimes proceeds with these messages: [ 1.631161] uhci_hcd 0000:00:01.2: host system error, PCI problems? [ 1.634965] uhci_hcd 0000:00:01.2: host controller process error, something bad happened! [ 1.634965] uhci_hcd 0000:00:01.2: host controller halted, very bad! [ 1.634965] uhci_hcd 0000:00:01.2: HC died; cleaning up Loading basic drivers...[ 2.398048] Disabling IRQ #23 Is anyone familiar enough with PIIX3 and knows how these devices are interacting? 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02) 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01) 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03) 00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01) 00:03.0 VGA compatible controller: Cirrus Logic GD 5446 Thanks, Olaf On Thu, Mar 25, Paolo Bonzini wrote: > On 25/03/21 12:12, Olaf Hering wrote: > > Am Mon, 22 Mar 2021 18:09:17 -0400 > > schrieb John Snow <js...@redhat.com>: > > > > > My understanding is that XEN has some extra disks that it unplugs when > > > it later figures out it doesn't need them. How exactly this works is > > > something I've not looked into too closely. > > > > It has no extra disks, why would it? > > > > I assume each virtualization variant has some sort of unplug if it has to > > support guests that lack PV/virtio/enlightened/whatever drivers. > > No, it's Xen only and really should be legacy. Ideally one would just have > devices supported at all levels from firmware to kernel. > > > > So if these IDE devices have been "unplugged" already, we avoid > > > resetting them here. What about this reset causes the bug you describe > > > in the commit message? > > > > > > Does this reset now happen earlier/later as compared to what it did > > > prior to ee358e91 ? > > > > Prior this commit, piix_ide_reset was only called when the entire > > emulated machine was reset. Like: never. With this commit, > > piix_ide_reset will be called from pci_piix3_xen_ide_unplug. For some > > reason it confuses the emulated USB hardware. Why it does confused > > it, no idea. > > > I wonder what the purpose of the qdev_reset_all() call really is. It > > is 10 years old. It might be stale. > > piix_ide_reset is only calling ide_bus_reset, and from there ide_reset and > bmdma_reset. All of these functions do just two things: reset internal > registers and ensure pending I/O is completed or canceled. The latter is > indeed unnecessary; drain/flush/detach is already done before the call to > qdev_reset_all. > > But the fact that it breaks USB is weird. That's the part that needs to be > debugged, because changing IDE to unbreak USB needs an explanation even if > it's the right thing to do. > > If you don't want to debug it, removing the qdev_reset_all call might do the > job; you'll have to see what the Xen maintainers think of it. But if you > don't debug the USB issue now, it will come back later almost surely. > > Paolo
signature.asc
Description: Digital signature