On Tue, Sep 10, 2019 at 01:18:37AM +0800, andychiu wrote: > If Windows 10 guests have enabled 'turn off hard disk after idle' > option in power settings, and the guest has a SATA disk plugged in, > the SATA disk will be turned off after a specified idle time. > If the guest is live migrated or saved/loaded with its SATA disk > turned off, the following error will occur: > > qemu-system-x86_64: AHCI: Failed to start FIS receive engine: bad FIS receive > buffer address > qemu-system-x86_64: Failed to load ich9_ahci:ahci > qemu-system-x86_64: error while loading state for instance 0x0 of device > '0000:00:1a.0/ich9_ahci' > qemu-system-x86_64: load of migration failed: Operation not permitted > > Observation from trace logs shows that a while after Windows 10 turns off > a SATA disk (IDE disks don't have the following behavior), > it will disable the PCI_COMMAND_MASTER flag of the pci device containing > the ahci device. When the the disk is turning back on, > the PCI_COMMAND_MASTER flag will be restored first. > But if the guest is migrated or saved/loaded while the disk is off, > the post_load callback of ahci device, ahci_state_post_load(), will fail > at ahci_cond_start_engines() if the MemoryRegion > pci_dev->bus_master_enable_region is not enabled, with pci_dev pointing > to the PCIDevice struct containing the ahci device. > > This patch enables pci_dev->bus_master_enable_region before calling > ahci_cond_start_engines() in ahci_state_post_load(), and restore the > MemoryRegion to its original state afterwards. > > Signed-off-by: andychiu <andyc...@synology.com>
Poking at PCI device internals like this seems fragile. And force enabling bus master can lead to unpleasantness like corrupting guest memory, unhandled interrupts, etc. E.g. it's quite reasonable, spec-wise, for the guest to move thing in memory around while bus mastering is off. Can you teach ahci that region being disabled during migration is ok, and recover from it? > --- > hw/ide/ahci.c | 53 ++++++++++++++++++++++++++++++++++++----------------- > 1 file changed, 36 insertions(+), 17 deletions(-) > > diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c > index d45393c..83f8c30 100644 > --- a/hw/ide/ahci.c > +++ b/hw/ide/ahci.c > @@ -1649,33 +1649,52 @@ static const VMStateDescription vmstate_ahci_device = > { > }, > }; > > +static int ahci_state_load_engines(AHCIState *s, AHCIDevice *ad) > +{ > + AHCIPortRegs *pr = &ad->port_regs; > + DeviceState *dev_state = s->container; > + PCIDevice *pci_dev = (PCIDevice *) object_dynamic_cast(OBJECT(dev_state), > + TYPE_PCI_DEVICE); > + bool pci_bus_master_enabled = pci_dev->bus_master_enable_region.enabled; > + > + if (!(pr->cmd & PORT_CMD_START) && (pr->cmd & PORT_CMD_LIST_ON)) { > + error_report("AHCI: DMA engine should be off, but status bit " > + "indicates it is still running."); > + return -1; > + } > + if (!(pr->cmd & PORT_CMD_FIS_RX) && (pr->cmd & PORT_CMD_FIS_ON)) { > + error_report("AHCI: FIS RX engine should be off, but status bit " > + "indicates it is still running."); > + return -1; > + } > + > + memory_region_set_enabled(&pci_dev->bus_master_enable_region, true); > + > + /* > + * After a migrate, the DMA/FIS engines are "off" and > + * need to be conditionally restarted > + */ > + pr->cmd &= ~(PORT_CMD_LIST_ON | PORT_CMD_FIS_ON); > + if (ahci_cond_start_engines(ad) != 0) { > + return -1; > + } > + memory_region_set_enabled(&pci_dev->bus_master_enable_region, > + pci_bus_master_enabled); > + > + return 0; > +} > + > static int ahci_state_post_load(void *opaque, int version_id) > { > int i, j; > struct AHCIDevice *ad; > NCQTransferState *ncq_tfs; > - AHCIPortRegs *pr; > AHCIState *s = opaque; > > for (i = 0; i < s->ports; i++) { > ad = &s->dev[i]; > - pr = &ad->port_regs; > - > - if (!(pr->cmd & PORT_CMD_START) && (pr->cmd & PORT_CMD_LIST_ON)) { > - error_report("AHCI: DMA engine should be off, but status bit " > - "indicates it is still running."); > - return -1; > - } > - if (!(pr->cmd & PORT_CMD_FIS_RX) && (pr->cmd & PORT_CMD_FIS_ON)) { > - error_report("AHCI: FIS RX engine should be off, but status bit " > - "indicates it is still running."); > - return -1; > - } > > - /* After a migrate, the DMA/FIS engines are "off" and > - * need to be conditionally restarted */ > - pr->cmd &= ~(PORT_CMD_LIST_ON | PORT_CMD_FIS_ON); > - if (ahci_cond_start_engines(ad) != 0) { > + if (ahci_state_load_engines(s, ad)) { > return -1; > } > > -- > 2.7.4