On Wed, Apr 27, 2016 at 07:14:15PM +1000, Alexey Kardashevskiy wrote:
> On 04/27/2016 04:39 PM, David Gibson wrote:
> >On Thu, Apr 21, 2016 at 02:22:01PM +1000, Alexey Kardashevskiy wrote:
> >>On 04/21/2016 01:59 PM, David Gibson wrote:
> >>>On Wed, Apr 20, 2016 at 07:15:15PM +1000, Alexey Kardashevskiy wrote:
> >>>>On 04/07/2016 10:40 AM, David Gibson wrote:
> >>>>>On Mon, Apr 04, 2016 at 07:33:43PM +1000, Alexey Kardashevskiy wrote:
> >>>>>>The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU -
> >>>>>>a guest view of the table and a hardware TCE table. If there is no VFIO
> >>>>>>presense in the address space, then just the guest view is used, if
> >>>>>>this is the case, it is allocated in the KVM. However since there is no
> >>>>>>support yet for VFIO in KVM TCE hypercalls, when we start using VFIO,
> >>>>>>we need to move the guest view from KVM to the userspace; and we need
> >>>>>>to do this for every IOMMU on a bus with VFIO devices.
> >>>>>>
> >>>>>>This adds vfio_start/vfio_stop callbacks in MemoryRegionIOMMUOps to
> >>>>>>notifiy IOMMU about changing environment so it can reallocate the table
> >>>>>>to/from KVM or (when available) hook the IOMMU groups with the logical
> >>>>>>bus (LIOBN) in the KVM.
> >>>>>>
> >>>>>>This removes explicit spapr_tce_set_need_vfio() call from PCI hotplug
> >>>>>>path as the new callbacks do this better - they notify IOMMU at
> >>>>>>the exact moment when the configuration is changed, and this also
> >>>>>>includes the case of PCI hot unplug.
> >>>>>>
> >>>>>>As there can be multiple containers attached to the same PHB/LIOBN,
> >>>>>>this replaces the @need_vfio flag in sPAPRTCETable with the counter
> >>>>>>of VFIO users.
> >>>>>>
> >>>>>>Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
> >>>>>
> >>>>>This looks correct, but there's one remaining ugly.
> >>>>>
> >>>>>>---
> >>>>>>Changes:
> >>>>>>v15:
> >>>>>>* s/need_vfio/vfio-Users/g
> >>>>>>---
> >>>>>> hw/ppc/spapr_iommu.c   | 30 ++++++++++++++++++++----------
> >>>>>> hw/ppc/spapr_pci.c     |  6 ------
> >>>>>> hw/vfio/common.c       |  9 +++++++++
> >>>>>> include/exec/memory.h  |  4 ++++
> >>>>>> include/hw/ppc/spapr.h |  2 +-
> >>>>>> 5 files changed, 34 insertions(+), 17 deletions(-)
> >>>>>>
> >>>>>>diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> >>>>>>index c945dba..ea09414 100644
> >>>>>>--- a/hw/ppc/spapr_iommu.c
> >>>>>>+++ b/hw/ppc/spapr_iommu.c
> >>>>>>@@ -155,6 +155,16 @@ static uint64_t spapr_tce_get_page_sizes(MemoryRegion *iommu)
> >>>>>>     return 1ULL << tcet->page_shift;
> >>>>>> }
> >>>>>>
> >>>>>>+static void spapr_tce_vfio_start(MemoryRegion *iommu)
> >>>>>>+{
> >>>>>>+    spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), true);
> >>>>>>+}
> >>>>>>+
> >>>>>>+static void spapr_tce_vfio_stop(MemoryRegion *iommu)
> >>>>>>+{
> >>>>>>+    spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), false);
> >>>>>>+}
> >>>>>>+
> >>>>>> static void spapr_tce_table_do_enable(sPAPRTCETable *tcet);
> >>>>>> static void spapr_tce_table_do_disable(sPAPRTCETable *tcet);
> >>>>>>
> >>>>>>@@ -239,6 +249,8 @@ static const VMStateDescription vmstate_spapr_tce_table = {
> >>>>>> static MemoryRegionIOMMUOps spapr_iommu_ops = {
> >>>>>>     .translate = spapr_tce_translate_iommu,
> >>>>>>     .get_page_sizes = spapr_tce_get_page_sizes,
> >>>>>>+    .vfio_start = spapr_tce_vfio_start,
> >>>>>>+    .vfio_stop = spapr_tce_vfio_stop,
> >>>>>
> >>>>>Ok, so AFAICT these callbacks are called whenever a VFIO context is
> >>>>>added / removed from the gIOMMU's address space, and it's up to the
> >>>>>gIOMMU code to ref count that to see if there are any current vfio
> >>>>>users. That makes "vfio_start" and "vfio_stop" not great names.
> >>>>>
> >>>>>But.. better than changing the names would be to move the refcounting
> >>>>>to the generic code if you can manage it, so the individual gIOMMU
> >>>>>backends don't need to - they just told when they need to start / stop
> >>>>>providing VFIO support.
> >>>>
> >>>>Everything is manageable...
> >>>>
> >>>>This referencing is needed for the case of >=2 containers so
> >>>>2xvfio_listener_region_add will create 2xVFIOGuestIOMMU as they are per
> >>>>VFIOContainer so VFIOGuestIOMMU is not the right place for the reference
> >>>>counting, VFIOAddressSpace seems to be that place (=> add list of IOMMU MRs
> >>>>with refcounter). Or even IOMMU MR. Or move VFIOGuestIOMMU list from
> >>>>VFIOContainer to VFIOAddressSpace and then gIOMMU can handle
> >>>>refcounting?
> >>>
> >>>I'm having a lot of trouble parsing that. I think the ref parsing has
> >>>to be per-giommu (because individual giommus could, in theory, be
> >>>mapped or unmapped from an address space).
> >>
> >>
> >>Example 1.
> >>POWER8, no DDW, one QEMU PHB, 2 IOMMU groups, table sharing so just 1
> >>container, one TCE table (aka gIOMMU), one TCE table in KVM, no reference
> >>counting needed at all, simple.
> >>
> >>Example 2.
> >>POWER7, no DDW, one QEMU PHB, 2 IOMMU groups, no table sharing so there are
> >>2 containers but still one IOMMU MR which is added to each container so
> >>there are 2 gIOMMU objects. And there is still one TCE table in KVM (which
> >>is a guest view). Where do I put the reference counter which will count that
> >>there are 2 gIOMMUs per KVM TCE table in this example?
> >
> >Ah.. I'd forgotten that the gIOMMU object is per guest IOMMU window
> >*and* per container, not just per guest IOMMU window.
> >
> >Ultimately it's the code implementing the guest side IOMMU which needs
> >to know if it is supporting VFIO or not, so in generic terms that
> >means per IOMMU-type MemoryRegion.
> >
> >Essentially you need to count the number of VFIOGuestIOMMU objects
> >associated with each (gIOMMU) MemoryRegion, and notify the
> >MemoryRegion if that changes from zero to non-zero or vice versa.
> >
> >I'd prefer if we can maintain that count from just the VFIO code and
> >just notify the gIOMMU code on zero / non-zero changes. But I guess
> >we'd need approval from Paolo to add that count to the MemoryRegion.
>
>
> Why MR? I could wrap MR to "VFIOIOMMUMR", add a counter and keep a list of
> these VFIOIOMMUMRs in VFIOAddressSpace.

Ah, yes I guess we could. It's just kinda ugly to have to keep another
object with the same lifetime around for one extra counter.

> I am adding Paolo, just for the case :)
>
>
> >The fallback would be similar to what you have - instead the
> >MemoryRegion gets notified whenever a VFIOGuestIOMMU is attached or
> >removed, and the MR (i.e. the guest side IOMMU code) has to maintain
> >the count itself.
> >
> >

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
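
For illustration only, here is a minimal sketch of the counting scheme
discussed in this thread: a small wrapper kept per IOMMU memory region that
counts attached VFIO users and notifies the guest IOMMU code only when the
count goes from zero to non-zero or back. Every name in it (GuestIOMMU,
IOMMURegionRef, iommu_region_ref_get/put, giommu_vfio_start/stop) is
hypothetical; none of this is taken from the patch or from the QEMU tree.

```c
#include <assert.h>

/* Hypothetical stand-in for the guest side IOMMU code (e.g. a TCE table). */
typedef struct GuestIOMMU {
    int vfio_active;   /* 1 while at least one VFIO user is attached */
    int start_calls;   /* how many times "start" was signalled */
    int stop_calls;    /* how many times "stop" was signalled */
} GuestIOMMU;

/* Assumed notification hooks; the real callbacks may look different. */
static void giommu_vfio_start(GuestIOMMU *g)
{
    g->vfio_active = 1;
    g->start_calls++;
}

static void giommu_vfio_stop(GuestIOMMU *g)
{
    g->vfio_active = 0;
    g->stop_calls++;
}

/* Hypothetical wrapper: one per IOMMU region, owned by the generic code. */
typedef struct IOMMURegionRef {
    GuestIOMMU *giommu;
    unsigned users;    /* number of attached per-container objects */
} IOMMURegionRef;

/* Called when a container attaches to the region. */
static void iommu_region_ref_get(IOMMURegionRef *r)
{
    if (r->users++ == 0) {
        giommu_vfio_start(r->giommu);   /* zero -> non-zero */
    }
}

/* Called when a container detaches from the region. */
static void iommu_region_ref_put(IOMMURegionRef *r)
{
    assert(r->users > 0);
    if (--r->users == 0) {
        giommu_vfio_stop(r->giommu);    /* non-zero -> zero */
    }
}
```

In the "Example 2" case quoted above (two containers sharing one IOMMU MR),
two get calls followed by two put calls would signal start once and stop
once, so the backend never has to keep its own count.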