From: Zhenzhong Duan <[email protected]>

If a VFIO device in the guest switches from the passthrough (PT) domain
to the block domain, the whole memory address space is unmapped. However,
a NULL iotlb entry is passed to unmap_bitmap, so the bitmap query does
not happen and dirty pages are lost.

Construct an iotlb entry with iova = gpa for unmap_bitmap so that it can
set the dirty bits correctly.

For an IOMMU address space, still pass a NULL iotlb because VFIO doesn't
know the actual mappings in the guest. It is the vIOMMU's responsibility
to send the actual unmapping notifications, e.g.,
vtd_address_space_unmap_in_dirty_tracking().

Signed-off-by: Zhenzhong Duan <[email protected]>
Tested-by: Giovannio Cabiddu <[email protected]>
Reviewed-by: Yi Liu <[email protected]>
Link: https://lore.kernel.org/qemu-devel/[email protected]
Signed-off-by: Cédric Le Goater <[email protected]>
---
 hw/vfio/listener.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
index 62699cb772d786c1510318dff73973ef4d297177..813621f22f8b5ec284388f9c5f719525ec5f282c 100644
--- a/hw/vfio/listener.c
+++ b/hw/vfio/listener.c
@@ -713,14 +713,34 @@ static void vfio_listener_region_del(MemoryListener *listener,
 
     if (try_unmap) {
         bool unmap_all = false;
+        IOMMUTLBEntry entry = {}, *iotlb = NULL;
 
         if (int128_eq(llsize, int128_2_64())) {
             assert(!iova);
             unmap_all = true;
             llsize = int128_zero();
         }
+
+        /*
+         * Fake an IOTLB entry for identity mapping which is needed by dirty
+         * tracking when switch out of PT domain. In fact, in unmap_bitmap,
+         * only translated_addr field is used to set dirty bitmap.
+         *
+         * Note: When switch into PT domain from DMA domain, the whole IOMMU
+         * MR is deleted without iotlb, before that happen, we depend on
+         * vIOMMU to send unmap notification with accurate iotlb entry to
+         * VFIO. See vtd_address_space_unmap_in_dirty_tracking() for example,
+         * it is triggered during switching to block domain because vtd does
+         * not support direct switching from DMA to PT domain.
+         */
+        if (global_dirty_tracking && memory_region_is_ram(section->mr)) {
+            entry.iova = iova;
+            entry.translated_addr = iova;
+            iotlb = &entry;
+        }
+
         ret = vfio_container_dma_unmap(bcontainer, iova, int128_get64(llsize),
-                                       NULL, unmap_all);
+                                       iotlb, unmap_all);
         if (ret) {
             error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
                          "0x%"HWADDR_PRIx") = %d (%s)",
-- 
2.52.0
