On 22/07/2024 15:09, Joao Martins wrote: > On 22/07/2024 09:58, Joao Martins wrote: >> On 22/07/2024 07:05, Duan, Zhenzhong wrote: >>> >>> >>>> -----Original Message----- >>>> From: Joao Martins <joao.m.mart...@oracle.com> >>>> Subject: [PATCH v5 09/13] vfio/iommufd: Probe and request hwpt dirty >>>> tracking capability >>>> >>>> In preparation to using the dirty tracking UAPI, probe whether the IOMMU >>>> supports dirty tracking. This is done via the data stored in >>>> hiod::caps::hw_caps initialized from GET_HW_INFO. >>>> >>>> Qemu doesn't know if VF dirty tracking is supported when allocating >>>> hardware pagetable in iommufd_cdev_autodomains_get(). This is because >>>> VFIODevice migration state hasn't been initialized *yet* hence it can't >>>> pick >>>> between VF dirty tracking vs IOMMU dirty tracking. So, if IOMMU supports >>>> dirty tracking it always creates HWPTs with >>>> IOMMU_HWPT_ALLOC_DIRTY_TRACKING >>>> even if later on VFIOMigration decides to use VF dirty tracking instead. >>> >>> I thought there is no overhead for HWPT with >>> IOMMU_HWPT_ALLOC_DIRTY_TRACKING vs. HWPT without >>> IOMMU_HWPT_ALLOC_DIRTY_TRACKING if we don't enable dirty tracking. Right? >>> >> >> Correct. >> >>>> >>>> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com> >>>> --- >>>> include/hw/vfio/vfio-common.h | 1 + >>>> hw/vfio/iommufd.c | 19 +++++++++++++++++++ >>>> 2 files changed, 20 insertions(+) >>>> >>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio- >>>> common.h >>>> index 4e44b26d3c45..7e530c7869dc 100644 >>>> --- a/include/hw/vfio/vfio-common.h >>>> +++ b/include/hw/vfio/vfio-common.h >>>> @@ -97,6 +97,7 @@ typedef struct IOMMUFDBackend IOMMUFDBackend; >>>> >>>> typedef struct VFIOIOASHwpt { >>>> uint32_t hwpt_id; >>>> + uint32_t hwpt_flags; >>>> QLIST_HEAD(, VFIODevice) device_list; >>>> QLIST_ENTRY(VFIOIOASHwpt) next; >>>> } VFIOIOASHwpt; >>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c >>>> index bb44d948c735..2e5c207bbca0 100644 >>>> --- a/hw/vfio/iommufd.c >>>> +++ b/hw/vfio/iommufd.c >>>> @@ -110,6 +110,11 @@ static void >>>> iommufd_cdev_unbind_and_disconnect(VFIODevice *vbasedev) >>>> iommufd_backend_disconnect(vbasedev->iommufd); >>>> } >>>> >>>> +static bool iommufd_hwpt_dirty_tracking(VFIOIOASHwpt *hwpt) >>>> +{ >>>> + return hwpt && hwpt->hwpt_flags & >>>> IOMMU_HWPT_ALLOC_DIRTY_TRACKING; >>>> +} >>>> + >>>> static int iommufd_cdev_getfd(const char *sysfs_path, Error **errp) >>>> { >>>> ERRP_GUARD(); >>>> @@ -246,6 +251,17 @@ static bool >>>> iommufd_cdev_autodomains_get(VFIODevice *vbasedev, >>>> } >>>> } >>>> >>>> + /* >>>> + * This is quite early and VFIO Migration state isn't yet fully >>>> + * initialized, thus rely only on IOMMU hardware capabilities as to >>>> + * whether IOMMU dirty tracking is going to be requested. Later >>>> + * vfio_migration_realize() may decide to use VF dirty tracking >>>> + * instead. >>>> + */ >>>> + if (vbasedev->hiod->caps.hw_caps & >>>> IOMMU_HW_CAP_DIRTY_TRACKING) { >>> >>> Looks there is still reference to hw_caps, then would suggest to bring back >>> the NEW CAP. >>> >> Ah, but below helper is checking for GET_HW_INFO stuff, and not hwpt flags >> given that we haven't allocated a hwpt yet. >> >> While I could place this check into a helper it would only have an user. I >> will >> need below helper iommufd_hwpt_dirty_tracking() in another patch, so this is >> a >> bit of a one off check only (unless we want a new helper for cosmetic >> purposes) >> >>>> + flags = IOMMU_HWPT_ALLOC_DIRTY_TRACKING; >>>> + } >>>> + >>>> if (!iommufd_backend_alloc_hwpt(iommufd, vbasedev->devid, >>>> container->ioas_id, flags, >>>> IOMMU_HWPT_DATA_NONE, 0, NULL, >>>> @@ -255,6 +271,7 @@ static bool >>>> iommufd_cdev_autodomains_get(VFIODevice *vbasedev, >>>> >>>> hwpt = g_malloc0(sizeof(*hwpt)); >>>> hwpt->hwpt_id = hwpt_id; >>>> + hwpt->hwpt_flags = flags; >>>> QLIST_INIT(&hwpt->device_list); >>>> >>>> ret = iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt->hwpt_id, errp); >>>> @@ -267,6 +284,8 @@ static bool >>>> iommufd_cdev_autodomains_get(VFIODevice *vbasedev, >>>> vbasedev->hwpt = hwpt; >>>> QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next); >>>> QLIST_INSERT_HEAD(&container->hwpt_list, hwpt, next); >>>> + container->bcontainer.dirty_pages_supported |= >>>> + iommufd_hwpt_dirty_tracking(hwpt); >>> >>> If there is at least one hwpt without dirty tracking, shouldn't we make >>> bcontainer.dirty_pages_supported false? >>> > > Missed this comment. We could set to false but the generic container > abstraction > is utilizing this to let the ioctls() of the individual backend to go through > to > the defined callback, and that's why I set to true. > Let me rephrase, I meant: "(...) utilizing this to let the individual backend container callbacks of dirty tracking to go through, and that's why I set to true."
> And that is really the only effect of this patch. By the time we reach to > patch > 12 (which is what really enables live migration with IOMMU automatically), the > IOMMUFD dirty tracking is only called 1) when not one of the VF doesn't > support > device dirty tracking [only if you're using IOMMUFD backend], and finally 2) > that no VF/mdev has added the migration blocker which essentially looks at the > HWPT flags (as opposed to the container attribute). > > Joao >