On Fri, Apr 26, 2024 at 03:10:14AM +0000, Duan, Zhenzhong wrote:
> 
> 
> >-----Original Message-----
> >From: Cédric Le Goater <c...@redhat.com>
> >Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >compatibility check with host IOMMU cap/ecap
> >
> >On 4/25/24 10:46, Duan, Zhenzhong wrote:
> >> Hi Cédric,
> >>
> >>> -----Original Message-----
> >>> From: Cédric Le Goater <c...@redhat.com>
> >>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >>> compatibility check with host IOMMU cap/ecap
> >>>
> >>> Hello Zhenzhong,
> >>>
> >>> On 4/18/24 10:42, Duan, Zhenzhong wrote:
> >>>> Hi Cédric,
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Cédric Le Goater <c...@redhat.com>
> >>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >>>>> compatibility check with host IOMMU cap/ecap
> >>>>>
> >>>>> Hello Zhenzhong
> >>>>>
> >>>>> On 4/17/24 11:24, Duan, Zhenzhong wrote:
> >>>>>>
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Cédric Le Goater <c...@redhat.com>
> >>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >>>>>>> compatibility check with host IOMMU cap/ecap
> >>>>>>>
> >>>>>>> On 4/17/24 06:21, Duan, Zhenzhong wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Cédric Le Goater <c...@redhat.com>
> >>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to do
> >>>>>>>>> compatibility check with host IOMMU cap/ecap
> >>>>>>>>>
> >>>>>>>>> Hello,
> >>>>>>>>>
> >>>>>>>>> On 4/16/24 09:09, Duan, Zhenzhong wrote:
> >>>>>>>>>> Hi Cédric,
> >>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: Cédric Le Goater <c...@redhat.com>
> >>>>>>>>>>> Subject: Re: [PATCH v2 3/5] intel_iommu: Add a framework to
> >do
> >>>>>>>>>>> compatibility check with host IOMMU cap/ecap
> >>>>>>>>>>>
> >>>>>>>>>>> On 4/8/24 10:44, Zhenzhong Duan wrote:
> >>>>>>>>>>>> From: Yi Liu <yi.l....@intel.com>
> >>>>>>>>>>>>
> >>>>>>>>>>>> If check fails, the host side device(either vfio or vdpa device)
> >>> should
> >>>>>>> not
> >>>>>>>>>>>> be passed to guest.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Implementation details for different backends will be in
> >>> following
> >>>>>>>>> patches.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Signed-off-by: Yi Liu <yi.l....@intel.com>
> >>>>>>>>>>>> Signed-off-by: Yi Sun <yi.y....@linux.intel.com>
> >>>>>>>>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com>
> >>>>>>>>>>>> ---
> >>>>>>>>>>>>        hw/i386/intel_iommu.c | 35
> >>>>>>>>>>> +++++++++++++++++++++++++++++++++++
> >>>>>>>>>>>>        1 file changed, 35 insertions(+)
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> >>>>>>>>>>>> index 4f84e2e801..a49b587c73 100644
> >>>>>>>>>>>> --- a/hw/i386/intel_iommu.c
> >>>>>>>>>>>> +++ b/hw/i386/intel_iommu.c
> >>>>>>>>>>>> @@ -35,6 +35,7 @@
> >>>>>>>>>>>>        #include "sysemu/kvm.h"
> >>>>>>>>>>>>        #include "sysemu/dma.h"
> >>>>>>>>>>>>        #include "sysemu/sysemu.h"
> >>>>>>>>>>>> +#include "sysemu/iommufd.h"
> >>>>>>>>>>>>        #include "hw/i386/apic_internal.h"
> >>>>>>>>>>>>        #include "kvm/kvm_i386.h"
> >>>>>>>>>>>>        #include "migration/vmstate.h"
> >>>>>>>>>>>> @@ -3819,6 +3820,32 @@ VTDAddressSpace
> >>>>>>>>>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
> >>>>>>>>>>>>            return vtd_dev_as;
> >>>>>>>>>>>>        }
> >>>>>>>>>>>>
> >>>>>>>>>>>> +static int vtd_check_legacy_hdev(IntelIOMMUState *s,
> >>>>>>>>>>>> +                                 HostIOMMUDevice *hiod,
> >>>>>>>>>>>> +                                 Error **errp)
> >>>>>>>>>>>> +{
> >>>>>>>>>>>> +    return 0;
> >>>>>>>>>>>> +}
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +static int vtd_check_iommufd_hdev(IntelIOMMUState *s,
> >>>>>>>>>>>> +                                  HostIOMMUDevice *hiod,
> >>>>>>>>>>>> +                                  Error **errp)
> >>>>>>>>>>>> +{
> >>>>>>>>>>>> +    return 0;
> >>>>>>>>>>>> +}
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +static int vtd_check_hdev(IntelIOMMUState *s,
> >>>>>>>>> VTDHostIOMMUDevice
> >>>>>>>>>>> *vtd_hdev,
> >>>>>>>>>>>> +                          Error **errp)
> >>>>>>>>>>>> +{
> >>>>>>>>>>>> +    HostIOMMUDevice *hiod = vtd_hdev->dev;
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +    if (object_dynamic_cast(OBJECT(hiod),
> >>> TYPE_HIOD_IOMMUFD))
> >>>>> {
> >>>>>>>>>>>> +        return vtd_check_iommufd_hdev(s, hiod, errp);
> >>>>>>>>>>>> +    }
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +    return vtd_check_legacy_hdev(s, hiod, errp);
> >>>>>>>>>>>> +}
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I think we should be using the .get_host_iommu_info() class
> >>> handler
> >>>>>>>>>>> instead. Can we refactor the code slightly to avoid this check on
> >>>>>>>>>>> the type ?
> >>>>>>>>>>
> >>>>>>>>>> There is some difficulty ini avoiding this check, the behavior of
> >>>>>>>>> vtd_check_legacy_hdev
> >>>>>>>>>> and vtd_check_iommufd_hdev are different especially after
> >>> nesting
> >>>>>>>>> support introduced.
> >>>>>>>>>> vtd_check_iommufd_hdev() has much wider check over
> >cap/ecap
> >>> bits
> >>>>>>>>> besides aw_bits.
> >>>>>>>>>
> >>>>>>>>> I think it is important to fully separate the vIOMMU model from
> >the
> >>>>>>>>> host IOMMU backing device.
> >>>>>
> >>>>> This comment is true for the structures also.
> >>>>>
> >>>>>>>>> Could we introduce a new HostIOMMUDeviceClass
> >>>>>>>>> handler .check_hdev() handler, which would
> >>>>> call .get_host_iommu_info() ?
> >>>>>
> >>>>> This means that HIOD_LEGACY_INFO and HIOD_IOMMUFD_INFO
> >should
> >>> be
> >>>>> a common structure 'HostIOMMUDeviceInfo' holding all attributes
> >>>>> for the different backends. Each .get_host_iommu_info()
> >implementation
> >>>>> would translate the specific host iommu device data presentation
> >>>>> into the common 'HostIOMMUDeviceInfo', this is true for
> >host_aw_bits.
> >>>>
> >>>> I see, it's just not easy to define the unified elements in
> >>> HostIOMMUDeviceInfo
> >>>> so that they maps to bits or fields in host return IOMMU info.
> >>>
> >>> The proposal is adding a vIOMMU <-> HostIOMMUDevice interface and a
> >>> new
> >>> API needs to be completely defined for it. The IOMMU backend
> >>> implementation
> >>> could be anything, legacy, iommufd, iommufd v2, some other framework
> >>> and
> >>> the vIOMMU shouldn't be aware of its implementation.
> >>>
> >>> Exposing the kernel structures as done below should be avoided because
> >>> they are part of the QEMU <-> kernel IOMMUFD interface.
> >>>
> >>>
> >>>> Different platform returned host IOMMU info is platform specific.
> >>>> For vtd and siommu:
> >>>>
> >>>> struct iommu_hw_info_vtd {
> >>>>           __u32 flags;
> >>>>           __u32 __reserved;
> >>>>           __aligned_u64 cap_reg;
> >>>>           __aligned_u64 ecap_reg;
> >>>> };
> >>>>
> >>>> struct iommu_hw_info_arm_smmuv3 {
> >>>>          __u32 flags;
> >>>>          __u32 __reserved;
> >>>>          __u32 idr[6];
> >>>>          __u32 iidr;
> >>>>          __u32 aidr;
> >>>> };
> >>>>
> >>>> I can think of two kinds of declaration of HostIOMMUDeviceInfo:
> >>>>
> >>>> struct HostIOMMUDeviceInfo {
> >>>>       uint8_t aw_bits;
> >>>>       enum iommu_hw_info_type type;
> >>>>       union {
> >>>>           struct iommu_hw_info_vtd vtd;
> >>>>           struct iommu_hw_info_arm_smmuv3;
> >>>>           ......
> >>>>       } data;
> >>>> }
> >>>>
> >>>> or
> >>>>
> >>>> struct HostIOMMUDeviceInfo {
> >>>>       uint8_t aw_bits;
> >>>>       enum iommu_hw_info_type type;
> >>>>       __u32 flags;
> >>>>       __aligned_u64 cap_reg;
> >>>>       __aligned_u64 ecap_reg;
> >>>>       __u32 idr[6];
> >>>>       __u32 iidr;
> >>>>       __u32 aidr;
> >>>>      ......
> >>>> }
> >>>>
> >>>> Not clear if any is your expected format.
> >>>>
> >>>>> 'type' could be handled the same way, with a 'HostIOMMUDeviceInfo'
> >>>>> type attribute and host iommu device type definitions, or as you
> >>>>> suggested with a QOM interface. This is more complex however. In
> >>>>> this case, I would suggest to implement a .compatible() handler to
> >>>>> compare the host iommu device type with the vIOMMU type.
> >>>>>
> >>>>> The resulting check_hdev routine would look something like :
> >>>>>
> >>>>> static int vtd_check_hdev(IntelIOMMUState *s,
> >VTDHostIOMMUDevice
> >>>>> *vtd_hdev,
> >>>>>                             Error **errp)
> >>>>> {
> >>>>>       HostIOMMUDevice *hiod = vtd_hdev->dev;
> >>>>>       HostIOMMUDeviceClass *hiodc =
> >>>>> HOST_IOMMU_DEVICE_GET_CLASS(hiod);
> >>>>>       HostIOMMUDevice info;
> >>>>>       int host_aw_bits, ret;
> >>>>>
> >>>>>       ret = hiodc->get_host_iommu_info(hiod, &info, sizeof(info), errp);
> >>>>>       if (ret) {
> >>>>>           return ret;
> >>>>>       }
> >>>>>
> >>>>>       ret = hiodc->is_compatible(hiod, VIOMMU_INTERFACE(s));
> >>>>>       if (ret) {
> >>>>>           return ret;
> >>>>>       }
> >>>>>
> >>>>>       if (s->aw_bits > info.aw_bits) {
> >>>>>           error_setg(errp, "aw-bits %d > host aw-bits %d",
> >>>>>                      s->aw_bits, info.aw_bits);
> >>>>>           return -EINVAL;
> >>>>>       }
> >>>>> }
> >>>>>
> >>>>> and the HostIOMMUDeviceClass::is_compatible() handler would call a
> >>>>> vIOMMUInterface::compatible() handler simply returning
> >>>>> IOMMU_HW_INFO_TYPE_INTEL_VTD. How does that sound ?
> >>>>
> >>>> Not quite get what HostIOMMUDeviceClass::is_compatible() does.
> >>>
> >>> HostIOMMUDeviceClass::is_compatible() calls in the host IOMMU
> >backend
> >>> to determine which IOMMU types are exposed by the host, then calls the
> >>> vIOMMUInterface::compatible() handler to do the compare. API is to be
> >>> defined.
> >>>
> >>> As a refinement, we could introduce in the vIOMMU <->
> >HostIOMMUDevice
> >>> interface capabilities, or features, to check more precisely the level
> >>> of compatibility between the vIOMMU and the host IOMMU device. This
> >is
> >>> similar to what is done between QEMU and KVM.
> >>>
> >>> If you think this is too complex, include type in HostIOMMUDeviceInfo.
> >>>
> >>>> Currently legacy and IOMMUFD host device has different check logic,
> >how
> >>> it can help
> >>>> in merging vtd_check_legacy_hdev() and vtd_check_iommufd_hdev()
> >into
> >>> a single vtd_check_hdev()?
> >>>
> >>> IMHO, IOMMU shouldn't be aware of the IOMMU backend
> >implementation,
> >>> but
> >>> if you think the Intel vIOMMU should access directly the iommufd
> >backend
> >>> when available, then we should drop this proposal and revisit the design
> >>> to take a different approach.
> >>
> >> I implemented a draft following your suggestions so we could explore
> >further.
> >> See
> >https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_pre
> >q_v3_tmp
> >>
> >> In this draft, it uses .check_cap() to query HOST_IOMMU_DEVICE_CAP_xxx
> >> just like KVM CAPs.
> >> A common HostIOMMUDeviceCaps structure is introduced to be used by
> >> both legacy and iommufd backend.
> >>
> >> It indeed is cleaner. Only problem is I failed to implement .compatible()
> >> as all the check could go ahead by just calling check_cap().
> >> Could you help a quick check to see if I misunderstood any of your
> >suggestion?
> >
> >Thanks for the changes. It looks cleaner and simpler ! Some comments,
> >
> >* HostIOMMUDeviceIOMMUFDClass seems useless as it is empty. I don't
> >   remember if you told me already you had plans for future changes.
> >   Sorry about that if this is the case. I forgot.
> 
> Never mind😊, reason is:
> 
> In nesting series 
> https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting_rfcv2/
> This commit 
> https://github.com/yiliu1765/qemu/commit/581fc900aa296988eaa48abee6d68d3670faf8c9
> implement [at|de]tach_hwpt handlers.
> 
> So I add an extra layer of abstract HostIOMMUDeviceIOMMUFDClass to define
> [at|de]tach_hwpt handlers.
> 
> >
> >* I would use the 'host_iommu_device_' prefix for external routines
> >   which are part of the HostIOMMUDevice API and use 'hiod_' for
> >   internal routines where it makes sense, to limit the name length for
> >   instance.
> 
> Good idea, will do.
> 
> >
> >* I would rename HOST_IOMMU_DEVICE_CAP_IOMMUFD_V1 to
> >   HOST_IOMMU_DEVICE_CAP_IOMMUFD. I mentioned IOMMUFD v2 as a
> >   theoretical example of a different IOMMU interface. I don't think we
> >   need to anticipate yet :)
> 
> Will do.
> 
> >
> >* HostIOMMUDeviceCaps is using 'enum iommu_hw_info_type' from
> >   'linux/iommufd.h', that's not my preferred choice but I won't
> >   object. The result looks good.
> 
> Ok, will keep it for now. We can change when you want in future.
> 
> >
> >* HostIOMMUDevice now has a realize() routine to query the host IOMMU
> >   capability for later usage. This is a good idea. However, you could
> >   change the return value to bool and avoid the ERRP_GUARD() prologue.
> 
> Will do.
> 
> >
> >* Beware of :
> >
> >     struct Range {
> >         /*
> >          * Do not access members directly, use the functions!
> >          * A non-empty range has @lob <= @upb.
> >          * An empty range has @lob == @upb + 1.
> >          */
> >         uint64_t lob;        /* inclusive lower bound */
> >         uint64_t upb;        /* inclusive upper bound */
> >     };
> 
> I remember😊, will add the change in formal version.
> 
> >
> >
> >* I think you could introduce a new VFIOIOMMUClass attribute. Let's
> >   call it VFIOIOMMUClass::hiod_typename. The creation of
> >HostIOMMUDevice
> >   would become generic and you could move :
> >
> >     hiod=
> >HOST_IOMMU_DEVICE(object_new(TYPE_HOST_IOMMU_DEVICE_LEGACY_V
> >FIO));
> >     HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp);
> >     if (*errp) {
> >         object_unref(hiod);
> >         return -EINVAL;
> >     }
> >     vbasedev->hiod = hiod;
> >
> >   at the end of vfio_attach_device().
> 
> Good suggestion! Will do.
> 
> Thanks
> Zhenzhong


So I'm expecting v3 of this.


Reply via email to