Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
Hi Kevin,

On 2022/3/29 16:42, Tian, Kevin wrote:
>> From: Lu Baolu
>> Sent: Tuesday, March 29, 2022 1:38 PM
>>
>> Some of the interfaces in the IOMMU core require that only a single
>> kernel device driver controls the device in the IOMMU group. The
>> existing method is to check the device count in the IOMMU group in
>> the interfaces. This is unreliable because any device added to the
>> IOMMU group later breaks this assumption without notifying the driver
>> using the interface. This adds iommu_group_singleton_lockdown() that
>> checks the requirement and locks down the IOMMU group with only a
>> single device driver bound.
>>
>> Signed-off-by: Lu Baolu
>> ---
>>  drivers/iommu/iommu.c | 30 ++++++++++++++++++------------
>>  1 file changed, 18 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 0c42ece25854..9fb8a5b4491e 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -48,6 +48,7 @@ struct iommu_group {
>>  	struct list_head entry;
>>  	unsigned int owner_cnt;
>>  	void *owner;
>> +	bool singleton_lockdown;
>>  };
>>
>>  struct group_device {
>> @@ -968,15 +969,16 @@ void iommu_group_remove_device(struct device *dev)
>>  }
>>  EXPORT_SYMBOL_GPL(iommu_group_remove_device);
>>
>> -static int iommu_group_device_count(struct iommu_group *group)
>> +/* Callers should hold the group->mutex. */
>> +static bool iommu_group_singleton_lockdown(struct iommu_group *group)
>>  {
>> -	struct group_device *entry;
>> -	int ret = 0;
>> -
>> -	list_for_each_entry(entry, &group->devices, list)
>> -		ret++;
>> +	if (group->owner_cnt != 1 ||
>> +	    group->domain != group->default_domain ||
>> +	    group->owner)
>> +		return false;
>
> Curious why there will be a case where the group uses default_domain
> while still having an owner? I have the impression that owner is used
> for userspace DMA where a different domain is used.

You are right. The default domain is automatically detached when user
ownership is claimed. As long as an owner is claimed, the group can
only use an empty or user-specified domain.

>> +	group->singleton_lockdown = true;
>>
>> -	return ret;
>> +	return true;
>>  }
>
> btw I'm not sure whether this is what SVA requires. IIRC the problem
> with SVA is because the PASID TLP prefix is not counted in PCI packet
> routing, thus a DMA target address with PASID might be treated as P2P
> if the address falls into the MMIO BAR of other devices in the group.
> This is why the original code needs to strictly apply SVA in a group
> containing a single device, instead of a group attached by a single
> driver, unless we want to reserve those MMIO ranges in CPU VA space.

You are right. But I don't think the IOMMU core is able to guarantee
the above in a platform/device-agnostic way. Or any suggestions? I
guess this should be somewhat off-loaded to the device driver which
knows the details of the device. The device driver should know this
and guarantee it before calling
iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA).

This patch itself just replaces the existing
"iommu_group_device_count(group) != 1" logic with a new one based on
the group ownership logistics. The former is obviously not friendly to
devices hot joined afterward.

Jean can correct me if my memory is wrong.

Thanks Kevin.

Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
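The ownership-based check being discussed can be sketched outside the kernel. Below is a hypothetical userspace model of the patch's predicate; the struct and field names mirror `struct iommu_group` for readability, but none of this is kernel code (in the kernel the check runs under group->mutex):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's domain/owner pointers. */
struct group_model {
	unsigned int owner_cnt;     /* drivers that claimed ownership */
	const void *domain;         /* currently attached domain */
	const void *default_domain; /* the group's default domain */
	const void *owner;          /* userspace owner token, if any */
	bool singleton_lockdown;
};

/* Mirrors the proposed iommu_group_singleton_lockdown(): lock the
 * group down only when exactly one owner exists, the default domain
 * is the attached domain, and no userspace owner claimed the group. */
static bool group_singleton_lockdown(struct group_model *g)
{
	if (g->owner_cnt != 1 ||
	    g->domain != g->default_domain ||
	    g->owner)
		return false;
	g->singleton_lockdown = true;
	return true;
}
```

As Kevin notes above, this predicate captures "single driver bound", which is weaker than "single device in the group" that SVA's P2P-routing concern actually needs.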
Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
On 2022/3/30 5:38, Jacob Pan wrote:
>> +static struct iommu_domain *
>> +iommu_sva_alloc_domain(struct device *dev, struct mm_struct *mm)
>> +{
>> +	struct bus_type *bus = dev->bus;
>> +	struct iommu_sva_cookie *cookie;
>> +	struct iommu_domain *domain;
>> +	void *curr;
>> +
>> +	if (!bus || !bus->iommu_ops)
>> +		return NULL;
>> +
>> +	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
>> +	if (!cookie)
>> +		return NULL;
>> +
>> +	domain = bus->iommu_ops->domain_alloc(IOMMU_DOMAIN_SVA);
>> +	if (!domain)
>> +		goto err_domain_alloc;
>> +
>> +	cookie->mm = mm;
>> +	cookie->pasid = mm->pasid;
>
> How do you manage the mm life cycle? Do you require the caller to take
> an mm reference? Or should this be limited to the current mm?

The existing sva_bind() interface requires the caller to take the mm
reference before calling it. So the mm is safe during sva bind() and
unbind().

drivers/iommu/iommu.c:

/**
 * iommu_sva_bind_device() - Bind a process address space to a device
 * @dev: the device
 * @mm: the mm to bind, caller must hold a reference to it
 * @drvdata: opaque data pointer to pass to bind callback
 *
 * Create a bond between device and address space, allowing the device
 * to access the mm using the returned PASID. If a bond already exists
 * between @device and @mm, it is returned and an additional reference
 * is taken. Caller must call iommu_sva_unbind_device() to release each
 * reference.
 *
 * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called
 * first, to initialize the required SVA features.
 *
 * On error, returns an ERR_PTR value.
 */
struct iommu_sva *
iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)

Best regards,
baolu
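The "caller must hold a reference" contract quoted above can be illustrated with a toy refcount model. All names here are hypothetical; the real kernel pins the mm with mmget()/mmput() on struct mm_struct, which this sketch only imitates:

```c
#include <assert.h>

/* Toy model of the mm lifetime rule: the caller takes a reference
 * before bind() and drops it only after unbind(), so the mm stays
 * live for the whole bond. */
struct mm_model {
	int refs;  /* models mm->mm_users */
	int bonds; /* outstanding SVA bonds */
};

static void mm_get(struct mm_model *mm) { mm->refs++; }
static void mm_put(struct mm_model *mm) { mm->refs--; }

static int sva_bind(struct mm_model *mm)
{
	/* The bind path relies on the caller's reference; binding a
	 * dead mm would be a use-after-free in real code. */
	if (mm->refs <= 0)
		return -1;
	mm->bonds++;
	return 0;
}

static void sva_unbind(struct mm_model *mm)
{
	mm->bonds--;
}
```

The point of the model: bind() itself takes no mm reference, so the interface is only safe if every caller follows the documented get-before-bind, put-after-unbind ordering.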
Re: [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu
Hi Jacob,

On 2022/3/30 5:00, Jacob Pan wrote:
> Hi BaoLu,
>
> On Tue, 29 Mar 2022 13:37:50 +0800, Lu Baolu wrote:
>> Use this field to save the pasid/ssid bits that a device is able to
>> support with its IOMMU hardware. It is a generic attribute of a
>> device and lifting it into the per-device dev_iommu struct makes it
>> possible to allocate a PASID for the device without calls into the
>> IOMMU drivers. Any iommu driver which supports PASID related features
>> should set this field before features are enabled on the devices.
>>
>> Signed-off-by: Lu Baolu
>> ---
>>  include/linux/iommu.h                       | 1 +
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 ++
>>  drivers/iommu/intel/iommu.c                 | 5 ++++-
>>  3 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 6ef2df258673..36f43af0af53 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -368,6 +368,7 @@ struct dev_iommu {
>>  	struct iommu_fwspec *fwspec;
>>  	struct iommu_device *iommu_dev;
>>  	void *priv;
>> +	unsigned int pasid_bits;
>
> pasid_width? The PCI spec uses "Max PASID Width".

My understanding is that this field represents "the pasid bits that the
device is able to use with its IOMMU". This field considers the
capabilities of both the device and the IOMMU. This is the reason why I
put it in the per-device iommu object and initialize it in the iommu
probe_device() callback.

Best regards,
baolu
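The "capabilities of both the device and the IOMMU" point can be checked numerically. A sketch with a generic find-last-set helper, mimicking the `fls(pci_max_pasids(pdev)) - 1` expression from the patch; `effective_pasid_bits()` and the cap-by-IOMMU step are this sketch's own additions, not code from the series:

```c
#include <assert.h>

/* Generic find-last-set, like the kernel's fls(): 1-based position of
 * the highest set bit, with fls(0) == 0. */
static int fls_bit(unsigned int x)
{
	int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* pci_max_pasids() reports the number of PASIDs (a power of two), so
 * the device's usable width in bits is fls(max_pasids) - 1. The value
 * actually programmable is further capped by what the IOMMU supports. */
static unsigned int effective_pasid_bits(unsigned int max_pasids,
					 unsigned int iommu_pasid_bits)
{
	unsigned int dev_bits = fls_bit(max_pasids) - 1;

	return dev_bits < iommu_pasid_bits ? dev_bits : iommu_pasid_bits;
}
```

For example, a device advertising 2^20 PASIDs behind an IOMMU that supports only 16 PASID bits would end up with 16 usable bits.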
Re: [PATCH RFC v2 03/11] iommu/sva: Add iommu_domain type for SVA
Hi BaoLu, On Tue, 29 Mar 2022 13:37:52 +0800, Lu Baolu wrote: > Add a new iommu domain type IOMMU_DOMAIN_SVA to represent an I/O page > table which is shared from CPU host VA. Add some helpers to get and > put an SVA domain and implement SVA domain life cycle management. > > Signed-off-by: Lu Baolu > --- > include/linux/iommu.h | 7 +++ > drivers/iommu/iommu-sva-lib.h | 10 > drivers/iommu/iommu-sva-lib.c | 89 +++ > 3 files changed, 106 insertions(+) > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h > index 36f43af0af53..29c4c2edd706 100644 > --- a/include/linux/iommu.h > +++ b/include/linux/iommu.h > @@ -42,6 +42,7 @@ struct notifier_block; > struct iommu_sva; > struct iommu_fault_event; > struct iommu_dma_cookie; > +struct iommu_sva_cookie; > > /* iommu fault flags */ > #define IOMMU_FAULT_READ 0x0 > @@ -64,6 +65,9 @@ struct iommu_domain_geometry { > #define __IOMMU_DOMAIN_PT(1U << 2) /* Domain is identity mapped > */ #define __IOMMU_DOMAIN_DMA_FQ(1U << 3) /* DMA-API uses > flush queue*/ > +#define __IOMMU_DOMAIN_SHARED(1U << 4) /* Page table shared > from CPU */ +#define __IOMMU_DOMAIN_HOST_VA (1U << 5) /* Host > CPU virtual address */ + > /* > * This are the possible domain-types > * > @@ -86,6 +90,8 @@ struct iommu_domain_geometry { > #define IOMMU_DOMAIN_DMA_FQ (__IOMMU_DOMAIN_PAGING |\ >__IOMMU_DOMAIN_DMA_API | \ >__IOMMU_DOMAIN_DMA_FQ) > +#define IOMMU_DOMAIN_SVA (__IOMMU_DOMAIN_SHARED |\ > + __IOMMU_DOMAIN_HOST_VA) > > struct iommu_domain { > unsigned type; > @@ -95,6 +101,7 @@ struct iommu_domain { > void *handler_token; > struct iommu_domain_geometry geometry; > struct iommu_dma_cookie *iova_cookie; > + struct iommu_sva_cookie *sva_cookie; > }; > > static inline bool iommu_is_dma_domain(struct iommu_domain *domain) > diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h > index 8909ea1094e3..1a71218b07f5 100644 > --- a/drivers/iommu/iommu-sva-lib.h > +++ b/drivers/iommu/iommu-sva-lib.h > @@ -10,6 +10,7 @@ > > int 
iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t > max); struct mm_struct *iommu_sva_find(ioasid_t pasid); > +struct mm_struct *iommu_sva_domain_mm(struct iommu_domain *domain); > > /* I/O Page fault */ > struct device; > @@ -26,6 +27,8 @@ int iopf_queue_flush_dev(struct device *dev); > struct iopf_queue *iopf_queue_alloc(const char *name); > void iopf_queue_free(struct iopf_queue *queue); > int iopf_queue_discard_partial(struct iopf_queue *queue); > +bool iommu_sva_domain_get_user(struct iommu_domain *domain); > +void iommu_sva_domain_put_user(struct iommu_domain *domain); > > #else /* CONFIG_IOMMU_SVA */ > static inline int iommu_queue_iopf(struct iommu_fault *fault, void > *cookie) @@ -63,5 +66,12 @@ static inline int > iopf_queue_discard_partial(struct iopf_queue *queue) { > return -ENODEV; > } > + > +static inline bool iommu_sva_domain_get_user(struct iommu_domain *domain) > +{ > + return false; > +} > + > +static inline void iommu_sva_domain_put_user(struct iommu_domain > *domain) { } #endif /* CONFIG_IOMMU_SVA */ > #endif /* _IOMMU_SVA_LIB_H */ > diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c > index 106506143896..78820be23f15 100644 > --- a/drivers/iommu/iommu-sva-lib.c > +++ b/drivers/iommu/iommu-sva-lib.c > @@ -3,12 +3,21 @@ > * Helpers for IOMMU drivers implementing SVA > */ > #include > +#include > +#include > #include > > #include "iommu-sva-lib.h" > > static DEFINE_MUTEX(iommu_sva_lock); > static DECLARE_IOASID_SET(iommu_sva_pasid); > +static DEFINE_XARRAY_ALLOC(sva_domain_array); > + > +struct iommu_sva_cookie { > + struct mm_struct *mm; > + ioasid_t pasid; > + refcount_t users; > +}; > > /** > * iommu_sva_alloc_pasid - Allocate a PASID for the mm > @@ -69,3 +78,83 @@ struct mm_struct *iommu_sva_find(ioasid_t pasid) > return ioasid_find(_sva_pasid, pasid, __mmget_not_zero); > } > EXPORT_SYMBOL_GPL(iommu_sva_find); > + > +static struct iommu_domain * > +iommu_sva_alloc_domain(struct device *dev, struct 
mm_struct *mm) > +{ > + struct bus_type *bus = dev->bus; > + struct iommu_sva_cookie *cookie; > + struct iommu_domain *domain; > + void *curr; > + > + if (!bus || !bus->iommu_ops) > + return NULL; > + > + cookie = kzalloc(sizeof(*cookie), GFP_KERNEL); > + if (!cookie) > + return NULL; > + > + domain = bus->iommu_ops->domain_alloc(IOMMU_DOMAIN_SVA); > + if (!domain) > + goto err_domain_alloc; > + > + cookie->mm = mm; > + cookie->pasid = mm->pasid; How do you manage the mm life cycle? do you require caller take mm reference? Or this should be limited to the current mm? > + refcount_set(>users,
Re: [PATCH RFC v2 01/11] iommu: Add pasid_bits field in struct dev_iommu
Hi BaoLu,

On Tue, 29 Mar 2022 13:37:50 +0800, Lu Baolu wrote:
> Use this field to save the pasid/ssid bits that a device is able to
> support with its IOMMU hardware. It is a generic attribute of a device
> and lifting it into the per-device dev_iommu struct makes it possible
> to allocate a PASID for device without calls into the IOMMU drivers.
> Any iommu driver which suports PASID related features should set this
> field before features are enabled on the devices.
>
> Signed-off-by: Lu Baolu
> ---
>  include/linux/iommu.h                       | 1 +
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 ++
>  drivers/iommu/intel/iommu.c                 | 5 ++++-
>  3 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 6ef2df258673..36f43af0af53 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -368,6 +368,7 @@ struct dev_iommu {
>  	struct iommu_fwspec *fwspec;
>  	struct iommu_device *iommu_dev;
>  	void *priv;
> +	unsigned int pasid_bits;

pasid_width? The PCI spec uses "Max PASID Width".

>  };
>
>  int iommu_device_register(struct iommu_device *iommu,
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 627a3ed5ee8f..afc63fce6107 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2681,6 +2681,8 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
>  	    smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
>  		master->stall_enabled = true;
>
> +	dev->iommu->pasid_bits = master->ssid_bits;
> +
>  	return &smmu->iommu;
>
>  err_free_master:
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 6f7485c44a4b..c1b91bce1530 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4587,8 +4587,11 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
>  	if (pasid_supported(iommu)) {
>  		int features = pci_pasid_features(pdev);
>
> -		if (features >= 0)
> +		if (features >= 0) {
>  			info->pasid_supported = features | 1;
> +			dev->iommu->pasid_bits =
> +				fls(pci_max_pasids(pdev)) - 1;
> +		}
>  	}
>
>  	if (info->ats_supported && ecap_prs(iommu->ecap) &&

Thanks,
Jacob
Re: [PATCH v2 7/8] iommu/vt-d: Delete supervisor/kernel SVA
Hi Kevin, On Fri, 18 Mar 2022 06:16:58 +, "Tian, Kevin" wrote: > > From: Jacob Pan > > Sent: Tuesday, March 15, 2022 1:07 PM > > > > In-kernel DMA with PASID should use DMA API now, remove supervisor > > PASID > > SVA support. Remove special cases in bind mm and page request service. > > > > Signed-off-by: Jacob Pan > > so you removed all the references to SVM_FLAG_SUPERVISOR_MODE > but the definition is still kept in include/linux/intel-svm.h... > Good catch, will remove. > > --- > > drivers/iommu/intel/svm.c | 42 --- > > 1 file changed, 8 insertions(+), 34 deletions(-) > > > > diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c > > index 2c53689da461..37d6218f173b 100644 > > --- a/drivers/iommu/intel/svm.c > > +++ b/drivers/iommu/intel/svm.c > > @@ -516,11 +516,10 @@ static void intel_svm_free_pasid(struct mm_struct > > *mm) > > > > static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu, > >struct device *dev, > > - struct mm_struct *mm, > > - unsigned int flags) > > + struct mm_struct *mm) > > { > > struct device_domain_info *info = get_domain_info(dev); > > - unsigned long iflags, sflags; > > + unsigned long iflags, sflags = 0; > > struct intel_svm_dev *sdev; > > struct intel_svm *svm; > > int ret = 0; > > @@ -533,16 +532,13 @@ static struct iommu_sva > > *intel_svm_bind_mm(struct intel_iommu *iommu, > > > > svm->pasid = mm->pasid; > > svm->mm = mm; > > - svm->flags = flags; > > INIT_LIST_HEAD_RCU(>devs); > > > > - if (!(flags & SVM_FLAG_SUPERVISOR_MODE)) { > > - svm->notifier.ops = _mmuops; > > - ret = mmu_notifier_register(>notifier, > > mm); > > - if (ret) { > > - kfree(svm); > > - return ERR_PTR(ret); > > - } > > + svm->notifier.ops = _mmuops; > > + ret = mmu_notifier_register(>notifier, mm); > > + if (ret) { > > + kfree(svm); > > + return ERR_PTR(ret); > > } > > > > ret = pasid_private_add(svm->pasid, svm); > > @@ -583,8 +579,6 @@ static struct iommu_sva *intel_svm_bind_mm(struct > > intel_iommu *iommu, > > } > > > > /* Setup 
the pasid table: */ > > - sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ? > > - PASID_FLAG_SUPERVISOR_MODE : 0; > > sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ? > > PASID_FLAG_FL5LP : 0; > > spin_lock_irqsave(>lock, iflags); > > ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm- > > >pasid, > > @@ -957,7 +951,7 @@ static irqreturn_t prq_event_thread(int irq, void > > *d) > > * to unbind the mm while any page faults are > > outstanding. > > */ > > svm = pasid_private_find(req->pasid); > > - if (IS_ERR_OR_NULL(svm) || (svm->flags & > > SVM_FLAG_SUPERVISOR_MODE)) > > + if (IS_ERR_OR_NULL(svm)) > > goto bad_req; > > } > > > > @@ -1011,29 +1005,9 @@ static irqreturn_t prq_event_thread(int irq, void > > *d) > > struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct > > *mm, void *drvdata) > > { > > struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL); > > - unsigned int flags = 0; > > struct iommu_sva *sva; > > int ret; > > > > - if (drvdata) > > - flags = *(unsigned int *)drvdata; > > - > > - if (flags & SVM_FLAG_SUPERVISOR_MODE) { > > - if (!ecap_srs(iommu->ecap)) { > > - dev_err(dev, "%s: Supervisor PASID not > > supported\n", > > - iommu->name); > > - return ERR_PTR(-EOPNOTSUPP); > > - } > > - > > - if (mm) { > > - dev_err(dev, "%s: Supervisor PASID with user > > provided mm\n", > > - iommu->name); > > - return ERR_PTR(-EINVAL); > > - } > > - > > - mm = _mm; > > - } > > - > > mutex_lock(_mutex); > > ret = intel_svm_alloc_pasid(dev, mm, flags); > > if (ret) { > > -- > > 2.25.1 > Thanks, Jacob ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 6/8] dmaengine: idxd: Use DMA API for in-kernel DMA with PASID
Hi Kevin, On Fri, 18 Mar 2022 06:10:40 +, "Tian, Kevin" wrote: > > From: Jacob Pan > > Sent: Tuesday, March 15, 2022 1:07 PM > > > > The current in-kernel supervisor PASID support is based on the SVM/SVA > > machinery in SVA lib. The binding between a kernel PASID and kernel > > mapping has many flaws. See discussions in the link below. > > > > This patch enables in-kernel DMA by switching from SVA lib to the > > standard DMA mapping APIs. Since both DMA requests with and without > > PASIDs are mapped identically, there is no change to how DMA APIs are > > used after the kernel PASID is enabled. > > > > Link: https://lore.kernel.org/linux- > > iommu/20210511194726.gp1002...@nvidia.com/ > > Signed-off-by: Jacob Pan > > --- > > drivers/dma/idxd/idxd.h | 1 - > > drivers/dma/idxd/init.c | 34 +- > > drivers/dma/idxd/sysfs.c | 7 --- > > 3 files changed, 9 insertions(+), 33 deletions(-) > > > > diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h > > index da72eb15f610..a09ab4a6e1c1 100644 > > --- a/drivers/dma/idxd/idxd.h > > +++ b/drivers/dma/idxd/idxd.h > > @@ -276,7 +276,6 @@ struct idxd_device { > > struct idxd_wq **wqs; > > struct idxd_engine **engines; > > > > - struct iommu_sva *sva; > > unsigned int pasid; > > > > int num_groups; > > diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c > > index 08a5f4310188..5d1f8dd4abf6 100644 > > --- a/drivers/dma/idxd/init.c > > +++ b/drivers/dma/idxd/init.c > > @@ -16,6 +16,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include "../dmaengine.h" > > @@ -466,36 +467,22 @@ static struct idxd_device *idxd_alloc(struct > > pci_dev *pdev, struct idxd_driver_d > > > > static int idxd_enable_system_pasid(struct idxd_device *idxd) > > idxd_enable_pasid_dma() since system pasid is a confusing term now? > Or just remove the idxd specific wrappers and have the caller to call > iommu_enable_pasid_dma() directly given the simple logic here. > agreed, will do. 
> > { > > - int flags; > > - unsigned int pasid; > > - struct iommu_sva *sva; > > + u32 pasid; > > + int ret; > > > > - flags = SVM_FLAG_SUPERVISOR_MODE; > > - > > - sva = iommu_sva_bind_device(>pdev->dev, NULL, ); > > - if (IS_ERR(sva)) { > > - dev_warn(>pdev->dev, > > -"iommu sva bind failed: %ld\n", PTR_ERR(sva)); > > - return PTR_ERR(sva); > > - } > > - > > - pasid = iommu_sva_get_pasid(sva); > > - if (pasid == IOMMU_PASID_INVALID) { > > - iommu_sva_unbind_device(sva); > > - return -ENODEV; > > + ret = iommu_enable_pasid_dma(>pdev->dev, ); > > + if (ret) { > > + dev_err(>pdev->dev, "No DMA PASID %d\n", ret); > > + return ret; > > } > > - > > - idxd->sva = sva; > > idxd->pasid = pasid; > > - dev_dbg(>pdev->dev, "system pasid: %u\n", pasid); > > + > > return 0; > > } > > > > static void idxd_disable_system_pasid(struct idxd_device *idxd) > > { > > - > > - iommu_sva_unbind_device(idxd->sva); > > - idxd->sva = NULL; > > + iommu_disable_pasid_dma(>pdev->dev, idxd->pasid); > > } > > > > static int idxd_probe(struct idxd_device *idxd) > > @@ -524,10 +511,7 @@ static int idxd_probe(struct idxd_device *idxd) > > } else { > > dev_warn(dev, "Unable to turn on SVA > > feature.\n"); } > > - } else if (!sva) { > > - dev_warn(dev, "User forced SVA off via module > > param.\n"); > > why removing above 2 lines? they are related to a module param thus > not affected by the logic in this series. > This should be in a separate patch. I consulted with Dave, sva module param is not needed anymore. Thanks for pointing it out. 
> > } > > - > > idxd_read_caps(idxd); > > idxd_read_table_offsets(idxd); > > > > diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c > > index 7e19ab92b61a..fde6656695ba 100644 > > --- a/drivers/dma/idxd/sysfs.c > > +++ b/drivers/dma/idxd/sysfs.c > > @@ -839,13 +839,6 @@ static ssize_t wq_name_store(struct device *dev, > > if (strlen(buf) > WQ_NAME_SIZE || strlen(buf) == 0) > > return -EINVAL; > > > > - /* > > -* This is temporarily placed here until we have SVM support > > for > > -* dmaengine. > > -*/ > > - if (wq->type == IDXD_WQT_KERNEL && device_pasid_enabled(wq- > > >idxd)) > > - return -EOPNOTSUPP; > > - > > memset(wq->name, 0, WQ_NAME_SIZE + 1); > > strncpy(wq->name, buf, WQ_NAME_SIZE); > > strreplace(wq->name, '\n', '\0'); > > -- > > 2.25.1 > Thanks, Jacob ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [GIT PULL] dma-mapping updates for Linux 5.18
The pull request you sent on Tue, 29 Mar 2022 15:50:24 +0200:

> git://git.infradead.org/users/hch/dma-mapping.git tags/dma-mapping-5.18

has been merged into torvalds/linux.git:

https://git.kernel.org/torvalds/c/9ae2a143081fa8fba5042431007b33d9a855b7a2

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
Re: [PATCH V3 1/2] Documentation: x86: Add documentation for AMD IOMMU
On 3/28/22 10:28, Alex Deucher wrote:
> +How is IOVA generated?
> +----------------------
> +
> +Well behaved drivers call dma_map_*() calls before sending command to
> +device that needs to perform DMA. Once DMA is completed and mapping is
> +no longer required, driver performs dma_unmap_*() calls to unmap the
> +region.
> +
> +Fault reporting
> +---------------
> +
> +When errors are reported, the IOMMU signals via an interrupt. The fault
> +reason and device that caused it is printed on the console.

Just scanning this, it looks *awfully* generic. Is anything in here
AMD-specific? Should this be in an AMD-specific file?
[GIT PULL] dma-mapping updates for Linux 5.18
The following changes since commit 0280e3c58f92b2fe0e8fbbdf8d386449168de4a8: Merge tag 'nfs-for-5.17-1' of git://git.linux-nfs.org/projects/anna/linux-nfs (2022-01-25 20:16:03 +0200) are available in the Git repository at: git://git.infradead.org/users/hch/dma-mapping.git tags/dma-mapping-5.18 for you to fetch changes up to 8ddde07a3d285a0f3cec14924446608320fdc013: dma-mapping: benchmark: extract a common header file for map_benchmark definition (2022-03-10 07:41:14 +0100) dma-mapping updates for Linux 5.18 - do not zero buffer in set_memory_decrypted (Kirill A. Shutemov) - fix return value of dma-debug __setup handlers (Randy Dunlap) - swiotlb cleanups (Robin Murphy) - remove most remaining users of the pci-dma-compat.h API (Christophe JAILLET) - share the ABI header for the DMA map_benchmark with userspace (Tian Tao) - update the maintainer for DMA MAPPING BENCHMARK (Xiang Chen) - remove CONFIG_DMA_REMAP (me) Christoph Hellwig (1): dma-mapping: remove CONFIG_DMA_REMAP Christophe JAILLET (5): alpha: Remove usage of the deprecated "pci-dma-compat.h" API agp/intel: Remove usage of the deprecated "pci-dma-compat.h" API sparc: Remove usage of the deprecated "pci-dma-compat.h" API rapidio/tsi721: Remove usage of the deprecated "pci-dma-compat.h" API media: v4l2-pci-skeleton: Remove usage of the deprecated "pci-dma-compat.h" API Kirill A. 
Shutemov (1): swiotlb: do not zero buffer in set_memory_decrypted() Randy Dunlap (1): dma-debug: fix return value of __setup handlers Robin Murphy (3): swiotlb: simplify debugfs setup swiotlb: tidy up includes swiotlb: simplify array allocation Tian Tao (1): dma-mapping: benchmark: extract a common header file for map_benchmark definition Xiang Chen (1): MAINTAINERS: update maintainer list of DMA MAPPING BENCHMARK MAINTAINERS | 2 +- arch/alpha/include/asm/floppy.h | 7 ++- arch/alpha/kernel/pci_iommu.c | 12 ++-- arch/arm/Kconfig| 2 +- arch/sparc/kernel/ioport.c | 2 +- arch/xtensa/Kconfig | 2 +- drivers/char/agp/intel-gtt.c| 26 - drivers/iommu/dma-iommu.c | 14 ++--- drivers/rapidio/devices/tsi721.c| 8 +-- include/linux/map_benchmark.h | 31 ++ kernel/dma/Kconfig | 7 +-- kernel/dma/Makefile | 2 +- kernel/dma/debug.c | 4 +- kernel/dma/direct.c | 18 +++--- kernel/dma/map_benchmark.c | 25 +--- kernel/dma/swiotlb.c| 76 - samples/v4l/v4l2-pci-skeleton.c | 2 +- tools/testing/selftests/dma/dma_map_benchmark.c | 25 +--- 18 files changed, 105 insertions(+), 160 deletions(-) create mode 100644 include/linux/map_benchmark.h ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 0/2] Fix coherence for VMbus and PCI pass-thru devices in Hyper-V VM
On Thu, Mar 24, 2022 at 09:14:50AM -0700, Michael Kelley wrote: > Hyper-V VMs have VMbus synthetic devices and PCI pass-thru devices that are > added > dynamically via the VMbus protocol and are not represented in the ACPI DSDT. > Only > the top level VMbus node exists in the DSDT. As such, on ARM64 these devices > don't > pick up coherence information and default to not hardware coherent. This > results > in extra software coherence management overhead since the synthetic devices > are > always hardware coherent. PCI pass-thru devices are also hardware coherent in > all > current usage scenarios. > > Fix this by propagating coherence information from the top level VMbus node in > the DSDT to all VMbus synthetic devices and PCI pass-thru devices. While > smaller > granularity of control would be better, basing on the VMbus node in the DSDT > gives as escape path if a future scenario arises with devices that are not > hardware coherent. > > Changes since v2: > * Move coherence propagation for VMbus synthetic devices to a separate > .dma_configure function instead of the .probe fucntion [Robin Murphy] > > Changes since v1: > * Use device_get_dma_attr() instead of acpi_get_dma_attr(), eliminating the > need to export acpi_get_dma_attr() [Robin Murphy] > * Use arch_setup_dma_ops() to set device coherence [Robin Murphy] > * Move handling of missing _CCA to vmbus_acpi_add() so it is only done once > * Rework handling of PCI devices so existing code in pci_dma_configure() > just works > > Michael Kelley (2): > Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device > PCI: hv: Propagate coherence from VMbus device to PCI device Patch 2 will not be very useful without patch 1 so I've applied the whole series to hyperv-fixes. Thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH RFC 04/12] kernel/user: Allow user::locked_vm to be usable for iommufd
On Tue, Mar 29, 2022 at 12:59:52PM +0800, Jason Wang wrote:
> vDPA has a backend feature negotiation, then actually, userspace can
> tell vDPA to go with the new accounting approach. Not sure RDMA can do
> the same.

A security feature userspace can ask to turn off is not really a
security feature.

Jason
Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
On Tue, Mar 29, 2022 at 08:42:13AM +0000, Tian, Kevin wrote:
> btw I'm not sure whether this is what SVA requires. IIRC the problem
> with SVA is because PASID TLP prefix is not counted in PCI packet
> routing thus a DMA target address with PASID might be treated as P2P
> if the address falls into the MMIO BAR of other devices in the group.
> This is why the original code needs to strictly apply SVA in a group
> containing a single device, instead of a group attached by a single
> driver, unless we want to reserve those MMIO ranges in CPU VA space.

I think it is not such a good idea to mix up group with this test.

Here you want to say that all TLPs from the RID route to the host
bridge - ie ACS is on/etc. This is subtly different from a group with
a single device. Specifically it is an immutable property of the
fabric and doesn't change after hot plug events.

ie if we have a singleton group that doesn't have ACS and someone
hotplugs in another device on a bridge, then our SVA is completely
broken and we get data corruption.

Testing the group size is inherently the wrong test to make.

Jason
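Kevin's P2P concern boils down to a range check: without ACS forcing all TLPs up to the host bridge, a PASID-tagged DMA whose untranslated address lands inside a peer device's BAR window can be claimed as peer-to-peer. The overlap test that the "reserve those MMIO ranges in CPU VA space" option would need can be sketched as follows (hypothetical types; the kernel's closest analogue is its reserved-region machinery):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct mmio_window {
	uint64_t start;
	uint64_t end; /* inclusive */
};

/* True if any CPU VA in [va, va + len) collides with a peer BAR
 * window; SVA would have to keep such ranges out of the process
 * address space (or rely on ACS making the routing unambiguous). */
static bool va_hits_peer_bar(uint64_t va, uint64_t len,
			     const struct mmio_window *w, int n)
{
	for (int i = 0; i < n; i++)
		if (va <= w[i].end && va + len - 1 >= w[i].start)
			return true;
	return false;
}
```

As Jason points out, the real question is not whether such a check can be written but whether the property it guards is immutable: a hotplugged sibling adds a new window after the fact, which no one-time group-size test can anticipate.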
Re: [PATCH V3 2/2] Documentation: x86: Clarify Intel IOMMU documentation
On 29/03/22 00.28, Alex Deucher wrote:
> Based on feedback from Robin on the initial AMD IOMMU documentation,
> fix up the Intel documentation to clarify IOMMU vs device and modern
> DMA API.

Maybe we can squash this into [1/2]?

--
An old man doll... just what I always wanted! - Clara
Re: [PATCH RFC 11/12] iommufd: vfio container FD ioctl compatibility
On 2022/3/24 06:51, Alex Williamson wrote: On Fri, 18 Mar 2022 14:27:36 -0300 Jason Gunthorpe wrote: iommufd can directly implement the /dev/vfio/vfio container IOCTLs by mapping them into io_pagetable operations. Doing so allows the use of iommufd by symliking /dev/vfio/vfio to /dev/iommufd. Allowing VFIO to SET_CONTAINER using a iommufd instead of a container fd is a followup series. Internally the compatibility API uses a normal IOAS object that, like vfio, is automatically allocated when the first device is attached. Userspace can also query or set this IOAS object directly using the IOMMU_VFIO_IOAS ioctl. This allows mixing and matching new iommufd only features while still using the VFIO style map/unmap ioctls. While this is enough to operate qemu, it is still a bit of a WIP with a few gaps to be resolved: - Only the TYPE1v2 mode is supported where unmap cannot punch holes or split areas. The old mode can be implemented with a new operation to split an iopt_area into two without disturbing the iopt_pages or the domains, then unmapping a whole area as normal. - Resource limits rely on memory cgroups to bound what userspace can do instead of the module parameter dma_entry_limit. - VFIO P2P is not implemented. Avoiding the follow_pfn() mis-design will require some additional work to properly expose PFN lifecycle between VFIO and iommfd - Various components of the mdev API are not completed yet - Indefinite suspend of SW access (VFIO_DMA_MAP_FLAG_VADDR) is not implemented. - The 'dirty tracking' is not implemented - A full audit for pedantic compatibility details (eg errnos, etc) has not yet been done - powerpc SPAPR is left out, as it is not connected to the iommu_domain framework. My hope is that SPAPR will be moved into the iommu_domain framework as a special HW specific type and would expect power to support the generic interface through a normal iommu_domain. 
> My overall question here would be whether we can actually achieve a
> compatibility interface that has sufficient feature transparency that we
> can dump vfio code in favor of this interface, or will there be enough
> niche use cases that we need to keep type1 and vfio containers around
> through a deprecation process? The locked memory differences for one
> seem like something that libvirt wouldn't want hidden, and we have
> questions regarding support for vaddr hijacking and different ideas on
> how to implement dirty page tracking, not to mention the missing
> features that are currently well used, like p2p mappings, coherency
> tracking, mdev, etc. It seems like quite an endeavor to fill all these
> gaps, while at the same time QEMU will be working to move to use
> iommufd directly in order to gain all the new features.

Hi Alex,

Jason hasn't included the vfio changes for adapting to iommufd, but
they are in this branch
(https://github.com/luxis1999/iommufd/commits/iommufd-v5.17-rc6). Eric
and I are working on adding iommufd support in QEMU as well. For the
new QEMU to run on an old kernel, QEMU is supposed to support both the
legacy group/container interface and the latest device/iommufd
interface. We've got some draft code toward this direction
(https://github.com/luxis1999/qemu/commits/qemu-for-5.17-rc4-vm). It
works for both the legacy group/container and the device/iommufd path.
It's just for reference so far; Eric and I will have a further sync on
it.

> Where do we focus attention? Is symlinking device files our proposal to
> userspace, and is that something achievable, or do we want to use this
> compatibility interface as a means to test the interface and allow
> userspace to make use of it for transition, if their use cases allow it,
> perhaps eventually performing the symlink after deprecation and eventual
> removal of the vfio container and type1 code? Thanks,

I'm sure it is possible that one day the group/container interface will
be removed from the kernel. Perhaps this will happen when SPAPR is
supported by iommufd. But how about QEMU? Should QEMU keep backward
compatibility forever, or may QEMU one day also remove the
group/container path and hence become unable to work on old kernels?

--
Regards,
Yi Liu

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()
> From: Lu Baolu
> Sent: Tuesday, March 29, 2022 1:38 PM
>
> Some of the interfaces in the IOMMU core require that only a single
> kernel device driver controls the device in the IOMMU group. The
> existing method is to check the device count in the IOMMU group in
> the interfaces. This is unreliable because any device added to the
> IOMMU group later breaks this assumption without notifying the driver
> using the interface. This adds iommu_group_singleton_lockdown() that
> checks the requirement and locks down the IOMMU group with only a
> single device driver bound.
>
> Signed-off-by: Lu Baolu
> ---
>  drivers/iommu/iommu.c | 30 ++++++++++++++++++------------
>  1 file changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 0c42ece25854..9fb8a5b4491e 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -48,6 +48,7 @@ struct iommu_group {
>  	struct list_head entry;
>  	unsigned int owner_cnt;
>  	void *owner;
> +	bool singleton_lockdown;
>  };
>
>  struct group_device {
> @@ -968,15 +969,16 @@ void iommu_group_remove_device(struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(iommu_group_remove_device);
>
> -static int iommu_group_device_count(struct iommu_group *group)
> +/* Callers should hold the group->mutex. */
> +static bool iommu_group_singleton_lockdown(struct iommu_group *group)
>  {
> -	struct group_device *entry;
> -	int ret = 0;
> -
> -	list_for_each_entry(entry, &group->devices, list)
> -		ret++;
> +	if (group->owner_cnt != 1 ||
> +	    group->domain != group->default_domain ||
> +	    group->owner)
> +		return false;

Curious why there will be a case where the group uses default_domain
while still having an owner? I have the impression that owner is used
for userspace DMA, where a different domain is used.

> +	group->singleton_lockdown = true;
>
> -	return ret;
> +	return true;
>  }

btw I'm not sure whether this is what SVA requires.
IIRC the problem with SVA is that the PASID TLP prefix is not
considered in PCI packet routing, thus a DMA target address with a
PASID might be treated as P2P if the address falls into the MMIO BAR
of another device in the group. This is why the original code needs to
strictly apply SVA to a group containing a single device, instead of a
group attached by a single driver, unless we want to reserve those
MMIO ranges in the CPU VA space.

Jean can correct me if my memory is wrong.

Thanks,
Kevin