On 22/07/2024 09:58, Joao Martins wrote: > On 22/07/2024 07:05, Duan, Zhenzhong wrote: >> >> >>> -----Original Message----- >>> From: Joao Martins <joao.m.mart...@oracle.com> >>> Subject: [PATCH v5 09/13] vfio/iommufd: Probe and request hwpt dirty >>> tracking capability >>> >>> In preparation to using the dirty tracking UAPI, probe whether the IOMMU >>> supports dirty tracking. This is done via the data stored in >>> hiod::caps::hw_caps initialized from GET_HW_INFO. >>> >>> Qemu doesn't know if VF dirty tracking is supported when allocating >>> hardware pagetable in iommufd_cdev_autodomains_get(). This is because >>> VFIODevice migration state hasn't been initialized *yet* hence it can't pick >>> between VF dirty tracking vs IOMMU dirty tracking. So, if IOMMU supports >>> dirty tracking it always creates HWPTs with >>> IOMMU_HWPT_ALLOC_DIRTY_TRACKING >>> even if later on VFIOMigration decides to use VF dirty tracking instead. >> >> I thought there is no overhead for HWPT with IOMMU_HWPT_ALLOC_DIRTY_TRACKING >> vs. HWPT without IOMMU_HWPT_ALLOC_DIRTY_TRACKING if we don't enable dirty >> tracking. Right? >> > > Correct. > >>> >>> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com> >>> --- >>> include/hw/vfio/vfio-common.h | 1 + >>> hw/vfio/iommufd.c | 19 +++++++++++++++++++ >>> 2 files changed, 20 insertions(+) >>> >>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio- >>> common.h >>> index 4e44b26d3c45..7e530c7869dc 100644 >>> --- a/include/hw/vfio/vfio-common.h >>> +++ b/include/hw/vfio/vfio-common.h >>> @@ -97,6 +97,7 @@ typedef struct IOMMUFDBackend IOMMUFDBackend; >>> >>> typedef struct VFIOIOASHwpt { >>> uint32_t hwpt_id; >>> + uint32_t hwpt_flags; >>> QLIST_HEAD(, VFIODevice) device_list; >>> QLIST_ENTRY(VFIOIOASHwpt) next; >>> } VFIOIOASHwpt; >>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c >>> index bb44d948c735..2e5c207bbca0 100644 >>> --- a/hw/vfio/iommufd.c >>> +++ b/hw/vfio/iommufd.c >>> @@ -110,6 +110,11 @@ static void >>> iommufd_cdev_unbind_and_disconnect(VFIODevice *vbasedev) >>> iommufd_backend_disconnect(vbasedev->iommufd); >>> } >>> >>> +static bool iommufd_hwpt_dirty_tracking(VFIOIOASHwpt *hwpt) >>> +{ >>> + return hwpt && hwpt->hwpt_flags & >>> IOMMU_HWPT_ALLOC_DIRTY_TRACKING; >>> +} >>> + >>> static int iommufd_cdev_getfd(const char *sysfs_path, Error **errp) >>> { >>> ERRP_GUARD(); >>> @@ -246,6 +251,17 @@ static bool >>> iommufd_cdev_autodomains_get(VFIODevice *vbasedev, >>> } >>> } >>> >>> + /* >>> + * This is quite early and VFIO Migration state isn't yet fully >>> + * initialized, thus rely only on IOMMU hardware capabilities as to >>> + * whether IOMMU dirty tracking is going to be requested. Later >>> + * vfio_migration_realize() may decide to use VF dirty tracking >>> + * instead. >>> + */ >>> + if (vbasedev->hiod->caps.hw_caps & >>> IOMMU_HW_CAP_DIRTY_TRACKING) { >> >> Looks there is still reference to hw_caps, then would suggest to bring back >> the NEW CAP. >> > Ah, but below helper is checking for GET_HW_INFO stuff, and not hwpt flags > gioven that we haven't allocated a hwpt yet. > > While I could place this check into a helper it would only have an user. I > will > need below helper iommufd_hwpt_dirty_tracking() in another patch, so this is a > bit of a one off check only (unless we want a new helper for cosmetic > purposes) > >>> + flags = IOMMU_HWPT_ALLOC_DIRTY_TRACKING; >>> + } >>> + >>> if (!iommufd_backend_alloc_hwpt(iommufd, vbasedev->devid, >>> container->ioas_id, flags, >>> IOMMU_HWPT_DATA_NONE, 0, NULL, >>> @@ -255,6 +271,7 @@ static bool >>> iommufd_cdev_autodomains_get(VFIODevice *vbasedev, >>> >>> hwpt = g_malloc0(sizeof(*hwpt)); >>> hwpt->hwpt_id = hwpt_id; >>> + hwpt->hwpt_flags = flags; >>> QLIST_INIT(&hwpt->device_list); >>> >>> ret = iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt->hwpt_id, errp); >>> @@ -267,6 +284,8 @@ static bool >>> iommufd_cdev_autodomains_get(VFIODevice *vbasedev, >>> vbasedev->hwpt = hwpt; >>> QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next); >>> QLIST_INSERT_HEAD(&container->hwpt_list, hwpt, next); >>> + container->bcontainer.dirty_pages_supported |= >>> + iommufd_hwpt_dirty_tracking(hwpt); >> >> If there is at least one hwpt without dirty tracking, shouldn't we make >> bcontainer.dirty_pages_supported false? >>
Missed this comment. We could set to false but the generic container abstraction is utilizing this to let the ioctls() of the individual backend to go through to the defined callback, and that's why I set to true. At that is really the only effect of this patch. By the time we reach to patch 12 (which is what really enables live migration with IOMMU automatically), the IOMMUFD dirty tracking is only called 1) when not one of the VF doesn't support device dirty tracking [only if you're using IOMMUFD backend], and finally 2) that no VF/mdev has added the migration blocker which essentially looks at the HWPT flags (as opposed to the container attribute). Joao