> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Friday, September 13, 2019 11:48 PM
>
> On Thu, 12 Sep 2019 23:00:03 +0000
> "Tian, Kevin" <kevin.t...@intel.com> wrote:
>
> > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > Sent: Thursday, September 12, 2019 10:41 PM
> > >
> > > On Tue, 3 Sep 2019 06:57:27 +0000
> > > "Tian, Kevin" <kevin.t...@intel.com> wrote:
> > >
> > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > Sent: Saturday, August 31, 2019 12:33 AM
> > > > >
> > > > > On Fri, 30 Aug 2019 08:06:32 +0000
> > > > > "Tian, Kevin" <kevin.t...@intel.com> wrote:
> > > > >
> > > > > > > From: Tian, Kevin
> > > > > > > Sent: Friday, August 30, 2019 3:26 PM
> > > > > > >
> > > > > > [...]
> > > > > > > > How does QEMU handle the fact that IOVAs are potentially dynamic
> > > > > > > > while performing the live portion of a migration? For example,
> > > > > > > > each time a guest driver calls dma_map_page() or dma_unmap_page(),
> > > > > > > > a MemoryRegionSection pops in or out of the AddressSpace for the
> > > > > > > > device (I'm assuming a vIOMMU where the device AddressSpace is not
> > > > > > > > system_memory). I don't see any QEMU code that intercepts that
> > > > > > > > change in the AddressSpace such that the IOVA dirty pfns could be
> > > > > > > > recorded and translated to GFNs. The vendor driver can't track
> > > > > > > > these beyond getting an unmap notification since it only knows the
> > > > > > > > IOVA pfns, which can be re-used with different GFN backing. Once
> > > > > > > > the DMA mapping is torn down, it seems those dirty pfns are lost
> > > > > > > > in the ether. If this works in QEMU, please help me find the code
> > > > > > > > that handles it.
> > > > > > >
> > > > > > > I'm curious about this part too. Interestingly, I didn't find any
> > > > > > > log_sync callback registered by emulated devices in Qemu. Looks like
> > > > > > > dirty pages by emulated DMAs are recorded in some implicit way. But
> > > > > > > KVM always reports dirty pages in GFN instead of IOVA, regardless of
> > > > > > > the presence of a vIOMMU. If Qemu also tracks dirty pages in GFN for
> > > > > > > emulated DMAs (translation can be done when the DMA happens), then
> > > > > > > we don't need to worry about the transient mapping from IOVA to GFN.
> > > > > > > Along this way we also want a GFN-based dirty bitmap to be reported
> > > > > > > through VFIO, similar to what KVM does. For vendor drivers, this
> > > > > > > means translating from IOVA to HVA to GFN when tracking DMA
> > > > > > > activities on VFIO devices. IOVA->HVA is provided by VFIO. For
> > > > > > > HVA->GFN, it could be provided by KVM but I'm not sure whether it's
> > > > > > > exposed now.
> > > > > > >
> > > > > >
> > > > > > HVA->GFN can be done through hva_to_gfn_memslot in kvm_host.h.
> > > > >
> > > > > I thought it was bad enough that we have vendor drivers that depend on
> > > > > KVM, but designing a vfio interface that only supports a KVM interface
> > > > > is even more undesirable. I also note without comment that
> > > > > gfn_to_memslot() is a GPL symbol. Thanks,
> > > >
> > > > Yes, it is bad, but sometimes inevitable. If you recall our discussions
> > > > from 3 years back (when discussing the 1st mdev framework), there were
> > > > similar hypervisor dependencies in GVT-g, e.g. querying gpa->hpa when
> > > > creating some shadow structures. gpa->hpa is definitely hypervisor-
> > > > specific knowledge, which is easy in KVM (gpa->hva->hpa) but needs a
> > > > hypercall in Xen. But VFIO already makes an assumption based on the
> > > > KVM-only flavor when implementing vfio_{un}pin_page_external.
> > >
> > > Where's the KVM assumption there? The MAP_DMA ioctl takes an IOVA and
> > > HVA. When an mdev vendor driver calls vfio_pin_pages(), we GUP the HVA
> > > to get an HPA and provide an array of HPA pfns back to the caller. The
> > > other vGPU mdev vendor manages to make use of this without KVM... the
> > > KVM interface used by GVT-g is GPL-only.
> >
> > To be clear, it's an assumption about host-based hypervisors, e.g. KVM.
> > GUP is a perfect example, which doesn't work for Xen since DomU's memory
> > doesn't belong to Dom0. VFIO in Dom0 has to find the HPA through
> > Xen-specific hypercalls.
>
> VFIO does not assume a hypervisor at all. Yes, it happens to work well
> with a host-based hypervisor like KVM where we can simply use GUP, but
> I'd hardly call using the standard mechanism to pin a user page and get
> the pfn within the Linux kernel a KVM assumption. The fact that Dom0
> Xen requires work here while KVM does not is not an equivalency to VFIO
> assuming KVM. Thanks,
>
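For reference, the type1 MAP_DMA path described above is purely an IOVA/HVA
interface from userspace's point of view. A minimal sketch of the call
(map_dma is a made-up helper name; container_fd, iova, hva and size are
placeholders the caller supplies, and error handling is omitted) might look
like:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Map 'size' bytes of the process mapping at 'hva' so the device sees
 * them at 'iova'.  The kernel pins the pages and programs the IOMMU;
 * no hypervisor interface is involved in this step.
 */
static int map_dma(int container_fd, uint64_t iova, void *hva, uint64_t size)
{
        struct vfio_iommu_type1_dma_map map;

        memset(&map, 0, sizeof(map));
        map.argsz = sizeof(map);
        map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
        map.vaddr = (uint64_t)(uintptr_t)hva;   /* host virtual address */
        map.iova  = iova;                       /* device's view of the buffer */
        map.size  = size;

        return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}

When an mdev vendor driver later calls vfio_pin_pages() on some of these
IOVAs, the host pfns are resolved through that same user mapping, which is
why the flow works without any KVM involvement.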
Agree, thanks for clarification.

Thanks
Kevin
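As a footnote on the hva_to_gfn_memslot() suggestion earlier in the thread,
a rough, KVM-specific sketch of the HVA->GFN step a vendor driver would need
is below. example_hva_to_gfn() is not an existing kernel API; it assumes the
two-argument kvm_for_each_memslot() of kernels from this era and that the
caller already holds the appropriate srcu read lock around the memslot walk.

#include <linux/kvm_host.h>

/*
 * Hypothetical helper: walk the VM's memslots to translate a host
 * virtual address into a guest frame number via hva_to_gfn_memslot()
 * from kvm_host.h.  This is exactly the KVM-internal dependency being
 * debated above, shown only to illustrate the translation.
 */
static int example_hva_to_gfn(struct kvm *kvm, unsigned long hva, gfn_t *gfn)
{
        struct kvm_memslots *slots = kvm_memslots(kvm);
        struct kvm_memory_slot *memslot;

        kvm_for_each_memslot(memslot, slots) {
                unsigned long start = memslot->userspace_addr;
                unsigned long size = memslot->npages << PAGE_SHIFT;

                if (hva >= start && hva < start + size) {
                        *gfn = hva_to_gfn_memslot(hva, memslot);
                        return 0;
                }
        }

        return -EFAULT; /* hva is not covered by any memslot */
}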