On Tue, Sep 10, 2019 at 10:08:20AM -0400, John Snow wrote:
> 
> 
> On 9/10/19 9:58 AM, Michael S. Tsirkin wrote:
> > On Tue, Sep 10, 2019 at 09:50:41AM -0400, John Snow wrote:
> >>
> >>
> >> On 9/10/19 3:04 AM, Michael S. Tsirkin wrote:
> >>> On Tue, Sep 10, 2019 at 01:18:37AM +0800, andychiu wrote:
> >>>> If Windows 10 guests have enabled the 'turn off hard disk after
> >>>> idle' option in power settings, and the guest has a SATA disk
> >>>> plugged in, the SATA disk will be turned off after the specified
> >>>> idle time. If the guest is live migrated or saved/loaded with its
> >>>> SATA disk turned off, the following error will occur:
> >>>>
> >>>> qemu-system-x86_64: AHCI: Failed to start FIS receive engine: bad FIS
> >>>> receive buffer address
> >>>> qemu-system-x86_64: Failed to load ich9_ahci:ahci
> >>>> qemu-system-x86_64: error while loading state for instance 0x0 of device
> >>>> '0000:00:1a.0/ich9_ahci'
> >>>> qemu-system-x86_64: load of migration failed: Operation not permitted
> >>>>
> >>>> Trace logs show that a while after Windows 10 turns off a SATA
> >>>> disk (IDE disks don't exhibit this behavior), it disables the
> >>>> PCI_COMMAND_MASTER flag of the PCI device containing the AHCI
> >>>> device. When the disk is turned back on, the PCI_COMMAND_MASTER
> >>>> flag is restored first. But if the guest is migrated or
> >>>> saved/loaded while the disk is off, the post_load callback of the
> >>>> AHCI device, ahci_state_post_load(), will fail at
> >>>> ahci_cond_start_engines() if the MemoryRegion
> >>>> pci_dev->bus_master_enable_region is not enabled, with pci_dev
> >>>> pointing to the PCIDevice struct containing the AHCI device.
> >>>>
> >>>> This patch enables pci_dev->bus_master_enable_region before calling
> >>>> ahci_cond_start_engines() in ahci_state_post_load(), and restores
> >>>> the MemoryRegion to its original state afterwards.
> >>>>
> >>>> Signed-off-by: andychiu <andyc...@synology.com>
> >>>
> >>> Poking at PCI device internals like this seems fragile. And
> >>> force-enabling bus mastering can lead to unpleasantness like
> >>> corrupted guest memory, unhandled interrupts, etc. E.g. it's quite
> >>> reasonable, spec-wise, for the guest to move things around in
> >>> memory while bus mastering is off.
> >>>
> >>> Can you teach AHCI that the region being disabled during migration
> >>> is OK, and recover from it?
> >>
> >> That's what I'm wondering.
> >>
> >> I could try to just disable the FIS RX engine if the mapping fails,
> >> but that will require a change to guest-visible state.
> >>
> >> My hunch, though, is that when Windows re-enables the device it will
> >> need to re-program the address registers anyway, so it might cope
> >> well with the FIS RX bit getting switched off.
> >>
> >> (I'm wondering if it isn't a mistake that QEMU is trying to re-map
> >> this address in the first place. Is it legal that the PCI device has
> >> bus mastering disabled but we've held on to a mapping?
> >
> > If you are poking at guest memory while bus mastering is off, then
> > most likely yes, it's a mistake.
> >
> >> Should there be some callback where AHCI knows to invalidate
> >> mappings at that point...?)
> >
> > ATM the callback is the config write: you check
> >
> >     proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER
> >
> > and if it's disabled, invalidate the mapping.
> >
> > virtio at least has code that pokes at
> > proxy->pci_dev.config[PCI_COMMAND] too; I'm quite open to a function
> > along the lines of pci_is_bus_master_enabled() that will do this.
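To be concrete, the helper I have in mind would be something along
these lines (a minimal sketch; pci_is_bus_master_enabled() doesn't
exist in the tree today, the name is just the suggestion from above):

/* Sketch against QEMU's hw/pci/pci.h: true iff the guest currently
 * allows the device to master the bus, i.e. to initiate DMA. */
static inline bool pci_is_bus_master_enabled(PCIDevice *dev)
{
    /* The same config-space check that virtio currently open-codes. */
    return dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER;
}

Callers like AHCI could then test this instead of poking at
pci_dev->config directly.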
> Well, that's not a callback. I don't think it's right to check the
> PCI_COMMAND register *every* time AHCI does anything at all to see if
> its mappings are still valid.
> 
> AHCI makes a mapping *once* when FIS RX is turned on, and it unmaps it
> when FIS RX is turned off. It assumes the mapping remains valid the
> whole time in between. When we migrate, it checks whether it was
> running, and performs the mappings again to re-boot the state machine.
> 
> What I'm asking is: what are the implications of a guest disabling
> PCI_COMMAND_MASTER? (I don't know PCI as well as you do.)
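> To make the FIS RX idea from earlier concrete, the recovery could look
> roughly like this (untested sketch; ahci_cond_start_engines() and
> PORT_CMD_FIS_RX are the names in our ahci.c, but the error-handling
> shape here is hypothetical):
> 
> static int ahci_state_post_load(void *opaque, int version_id)
> {
>     AHCIState *s = opaque;
>     int i;
> 
>     for (i = 0; i < s->ports; i++) {
>         AHCIDevice *ad = &s->dev[i];
> 
>         /* If the FIS receive buffer can't be mapped (e.g. because bus
>          * mastering is off), clear the FIS RX enable bit instead of
>          * failing the whole migration; the guest has to re-program the
>          * address registers when it re-enables the device anyway. */
>         if (ahci_cond_start_engines(ad) != 0) {
>             ad->port_regs.cmd &= ~PORT_CMD_FIS_RX;
>         }
>     }
>     return 0;
> }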
The implication is that the device must not initiate any reads or
writes: no memory or I/O accesses, and no MSIs (an MSI is just a memory
write). INTx is unaffected. Writes into device memory are unaffected.
Whether reads from device memory are affected kind of depends, but
probably not. Whether the device caches anything internally has nothing
to do with PCI_COMMAND_MASTER, and the PCI spec says nothing about it.
Windows uses PCI_COMMAND_MASTER to emulate surprise removal, so
there's that.

> What should that mean for the AHCI state machine?
> 
> Does this *necessarily* invalidate the mappings?
> (In which case -- it's an error that AHCI held on to them after
> Windows disabled the card, even if AHCI isn't being engaged by the
> guest anymore. Essentially, we were turned off but didn't clean up a
> dangling pointer, and we need the event that tells us to clean up the
> dangling mapping.)

It does not have to, but it must stop memory accesses through the
mappings.
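The event would again be the config write. Roughly (a sketch only:
pci_default_write_config() is the existing hook, while the ICH9_AHCI()
cast and the ahci_clear_mappings() helper are hypothetical here):

static void pci_ich9_ahci_write_config(PCIDevice *dev, uint32_t addr,
                                       uint32_t val, int len)
{
    AHCIPCIState *d = ICH9_AHCI(dev);

    pci_default_write_config(dev, addr, val, len);

    /* With bus mastering off the device must not touch guest memory,
     * so drop any FIS/CLB mappings AHCI still holds instead of keeping
     * dangling pointers around (the helper is assumed idempotent). */
    if (!(dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
        ahci_clear_mappings(&d->ahci);
    }
}

The device's config_write callback would be pointed at this wrapper in
the ich9-ahci class init code.

-- 
MST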