> From: Tian, Kevin
> Sent: Friday, August 30, 2019 3:26 PM
> [...]
> > How does QEMU handle the fact that IOVAs are potentially dynamic while
> > performing the live portion of a migration? For example, each time a
> > guest driver calls dma_map_page() or dma_unmap_page(), a
> > MemoryRegionSection pops in or out of the AddressSpace for the device
> > (I'm assuming a vIOMMU where the device AddressSpace is not
> > system_memory). I don't see any QEMU code that intercepts that change
> > in the AddressSpace such that the IOVA dirty pfns could be recorded and
> > translated to GFNs. The vendor driver can't track these beyond getting
> > an unmap notification, since it only knows the IOVA pfns, which can be
> > re-used with different GFN backing. Once the DMA mapping is torn down,
> > it seems those dirty pfns are lost in the ether. If this works in QEMU,
> > please help me find the code that handles it.
>
> I'm curious about this part too. Interestingly, I didn't find any log_sync
> callback registered by emulated devices in Qemu. It looks like dirty pages
> from emulated DMAs are recorded in some implicit way. But KVM always
> reports dirty pages in GFN instead of IOVA, regardless of the presence of
> a vIOMMU. If Qemu also tracks dirty pages in GFN for emulated DMAs
> (translation can be done when the DMA happens), then we don't need to
> worry about the transient mapping from IOVA to GFN. Along this way we
> also want a GFN-based dirty bitmap to be reported through VFIO,
> similar to what KVM does. For vendor drivers, this means translating
> from IOVA to HVA to GFN when tracking DMA activities on VFIO
> devices. IOVA->HVA is provided by VFIO. For HVA->GFN, it can be
> provided by KVM, but I'm not sure whether it's exposed now.
HVA->GFN can be done through hva_to_gfn_memslot() in kvm_host.h.

The above flow works for software-tracked dirty mechanisms, e.g. in KVMGT, where a GFN-based 'dirty' flag is marked when a guest page is mapped into the device MMU. The IOVA->HPA->GFN translation is done at that time, so the recorded GFN is immune to later IOVA->GFN changes.

When the hardware IOMMU supports the D-bit in second-level translation (e.g. VT-d rev 3.0), there are two scenarios:

1) nested translation: the guest manages the first-level translation (IOVA->GPA) and the host manages the second-level translation (GPA->HPA). The second level is not affected by guest mapping operations, so the IOMMU driver can retrieve GFN-based dirty pages by directly scanning the second-level structures upon request from user space.

2) shadowed translation (IOVA->HPA) in the second level: in this case the dirty information is tied to the IOVA, so the IOMMU driver is expected to maintain an internal GFN-based dirty bitmap. Upon any IOVA->GPA change notification from VFIO, the IOMMU driver should flush the dirty status of the affected second-level entries into that internal bitmap; here again an IOVA->HVA->GPA translation is required for GFN-based recording. When user space queries the dirty bitmap, the IOMMU driver flushes the latest second-level dirty status into the internal bitmap, which is then copied to user space.

Given the trickiness of 2), we aim to enable 1) in the intel-iommu driver.

Thanks
Kevin
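P.S. a rough user-space sketch of the flush flow in 2), in case it helps. Everything here is a hypothetical simplification: struct vfio_mapping and struct memslot stand in for the real struct vfio_dma and struct kvm_memory_slot, and the d_bits parameter stands in for reading (and clearing) the hardware D-bits of the affected second-level entries. Only the arithmetic in hva_to_gfn() mirrors hva_to_gfn_memslot() in kvm_host.h.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Hypothetical stand-in for one IOVA range in VFIO's mapping database
 * (the real kernel type is struct vfio_dma). */
struct vfio_mapping {
    uint64_t iova;   /* guest IO virtual address of the range */
    uint64_t vaddr;  /* userspace HVA backing the range */
    uint64_t size;
};

/* Hypothetical stand-in for struct kvm_memory_slot, keeping only the
 * fields used by the HVA->GFN translation. */
struct memslot {
    uint64_t base_gfn;
    uint64_t userspace_addr; /* HVA where the slot starts */
    uint64_t npages;
};

/* IOVA->HVA: what VFIO's mapping database provides. */
static uint64_t iova_to_hva(const struct vfio_mapping *m, uint64_t iova)
{
    return m->vaddr + (iova - m->iova);
}

/* HVA->GFN: same arithmetic as hva_to_gfn_memslot() in kvm_host.h. */
static uint64_t hva_to_gfn(const struct memslot *s, uint64_t hva)
{
    return s->base_gfn + ((hva - s->userspace_addr) >> PAGE_SHIFT);
}

/* On an IOVA->GPA change notification from VFIO, flush the D-bit state
 * of the affected second-level entries into the internal GFN-based
 * bitmap before the IOVA->GFN association is lost. d_bits simulates
 * the harvested D-bits, one bit per page of the range. */
static void flush_dirty_range(const struct vfio_mapping *m,
                              const struct memslot *s,
                              uint64_t d_bits,
                              uint64_t *gfn_bitmap)
{
    for (uint64_t pg = 0; pg < (m->size >> PAGE_SHIFT); pg++) {
        if (d_bits & (1ULL << pg)) {
            uint64_t hva = iova_to_hva(m, m->iova + (pg << PAGE_SHIFT));
            uint64_t gfn = hva_to_gfn(s, hva);
            gfn_bitmap[gfn / 64] |= 1ULL << (gfn % 64);
        }
    }
}
```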