On Mon, 6 Jul 2015 12:11:06 +1000 Alexey Kardashevskiy <a...@ozlabs.ru> wrote:
> sPAPR IOMMU is managing two copies of an TCE table: > 1) a guest view of the table - this is what emulated devices use and > this is where H_GET_TCE reads from; > 2) a hardware TCE table - only present if there is at least one vfio-pci > device on a PHB; it is updated via a memory listener on a PHB address > space which forwards map/unmap requests to vfio-pci IOMMU host driver. > > At the moment presence of vfio-pci devices on a bus affect the way > the guest view table is allocated. If there is no vfio-pci on a PHB > and the host kernel supports KVM acceleration of H_PUT_TCE, a table > is allocated in KVM. However, if there is vfio-pci and we do yet not > support KVM acceleration for these, the table has to be allocated > by the userspace. > > When vfio-pci device is hotplugged and there were no vfio-pci devices > already, the guest view table could have been allocated by KVM which > means that H_PUT_TCE is handled by the host kernel and since we > do not support vfio-pci in KVM, the hardware table will not be updated. > > This reallocates the guest view table in QEMU if the first vfio-pci > device has just been plugged. spapr_tce_realloc_userspace() handles this. I wonder whether it would help to improve the readability of the code later if you put the description of the function into the code instead of the commit message? > This replays all the mappings to make sure that the tables are in sync. > This will not have a visible effect though as for a new device > the guest kernel will allocate-and-map new addresses and therefore > existing mappings from emulated devices will not be used by vfio-pci > devices. > > This adds calls to spapr_phb_dma_capabilities_update() in PCI hotplug > hooks. > > Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > --- ... > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > index 76c988f..d1fa157 100644 > --- a/hw/ppc/spapr_pci.c > +++ b/hw/ppc/spapr_pci.c > @@ -827,6 +827,43 @@ int spapr_phb_dma_reset(sPAPRPHBState *sphb) > return 0; > } > > +static int spapr_phb_hotplug_dma_sync(sPAPRPHBState *sphb) > +{ > + int ret = 0, i; > + bool had_vfio = sphb->has_vfio; > + sPAPRTCETable *tcet; > + > + spapr_phb_dma_capabilities_update(sphb); > + > + if (!had_vfio && sphb->has_vfio) { if (had_vfio || !sphb->has_vfio) { return 0; } ... and then you can save one level of indentation for the following for-loop. > + for (i = 0; i < SPAPR_PCI_DMA_MAX_WINDOWS; ++i) { > + tcet = spapr_tce_find_by_liobn(SPAPR_PCI_LIOBN(sphb->index, i)); > + if (!tcet || !tcet->enabled) { > + continue; > + } > + if (tcet->fd >= 0) { > + /* > + * We got first vfio-pci device on accelerated table. > + * VFIO acceleration is not possible. > + * Reallocate table in userspace and replay mappings. > + */ > + ret = spapr_tce_realloc_userspace(tcet, true); > + trace_spapr_pci_dma_realloc_update(tcet->liobn, ret); > + } else { > + /* There was no acceleration, so just replay mappings. */ > + ret = spapr_tce_replay(tcet); > + trace_spapr_pci_dma_update(tcet->liobn, ret); > + } > + if (ret) { > + break; > + } > + } > + return ret; > + } > + > + return 0; > +} > + > /* Macros to operate with address in OF binding to PCI */ > #define b_x(x, p, l) (((x) & ((1<<(l))-1)) << (p)) > #define b_n(x) b_x((x), 31, 1) /* 0 if relocatable */ ... > @@ -1130,6 +1174,9 @@ static void spapr_phb_remove_pci_device_cb(DeviceState > *dev, void *opaque) > */ > pci_device_reset(PCI_DEVICE(dev)); > object_unparent(OBJECT(dev)); > + > + /* Actual VFIO device release happens from RCU so postpone DMA update */ > + call_rcu1(&((sPAPRPHBState *)opaque)->rcu, spapr_phb_remove_sync_dma); Too much brackets again for my taste ;-) > } > Thomas