[RFC PATCH v2 10/10] vfio/type1: Attach domain for mdev group
When attaching a domain to a group of mediated devices which all have the domain type attribute set to ATTACH_PARENT, we should attach the domain to the parent PCI device instead of the mdev device itself. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Sanjay Kumar Signed-off-by: Lu Baolu --- drivers/vfio/vfio_iommu_type1.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 64bf55b91de1..c4231df44304 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -1411,6 +1411,18 @@ static int vfio_mdev_domain_type(struct device *dev, void *data) return -EINVAL; } +static int vfio_parent_bus_type(struct device *dev, void *data) +{ + struct bus_type **bus = data; + + if (*bus && *bus != dev->parent->bus) + return -EINVAL; + + *bus = dev->parent->bus; + + return 0; +} + static int vfio_iommu_type1_attach_group(void *iommu_data, struct iommu_group *iommu_group) { @@ -1458,6 +1470,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, enum mdev_domain_type type = 0; symbol_put(mdev_bus_type); + mdev_bus = NULL; /* Determine the domain type: */ ret = iommu_group_for_each_dev(iommu_group, &type, @@ -1479,7 +1492,14 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, return 0; case DOMAIN_TYPE_ATTACH_PARENT: - /* FALLTHROUGH */ + bus = NULL; + group->attach_parent = true; + /* Set @bus to bus type of the parent: */ + ret = iommu_group_for_each_dev(iommu_group, &bus, + vfio_parent_bus_type); + if (ret) + goto out_free; + break; default: ret = -EINVAL; goto out_free; -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC PATCH v2 06/10] iommu/vt-d: Return ID associated with an auxiliary domain
This adds support for returning the default PASID associated with an auxiliary domain. The PCI device bound to this domain should use this value as the PASID for all DMA requests from the subset of the device that is isolated and protected by this domain. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Lu Baolu --- drivers/iommu/intel-iommu.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 784bd496f316..1ba39f120a38 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5544,6 +5544,14 @@ static void intel_iommu_disable_auxd(struct device *dev) spin_unlock_irqrestore(&device_domain_lock, flags); } +static int intel_iommu_auxd_id(struct iommu_domain *domain) +{ + struct dmar_domain *dmar_domain = to_dmar_domain(domain); + + return dmar_domain->default_pasid > 0 ? + dmar_domain->default_pasid : -EINVAL; +} + const struct iommu_ops intel_iommu_ops = { .capable= intel_iommu_capable, .domain_alloc = intel_iommu_domain_alloc, @@ -5560,6 +5568,7 @@ const struct iommu_ops intel_iommu_ops = { .device_group = pci_device_group, .enable_auxd= intel_iommu_enable_auxd, .disable_auxd = intel_iommu_disable_auxd, + .auxd_id= intel_iommu_auxd_id, .pgsize_bitmap = INTEL_IOMMU_PGSIZES, }; -- 2.17.1
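The way the generic iommu_auxiliary_id() wrapper from patch 01 and the vt-d callback above fit together can be sketched in plain userspace C. This is only an illustration under stated assumptions: every fake_* name is hypothetical, the ops table is cut down to the one callback of interest, and the extra NULL check on ops is a choice of the sketch rather than something the patch does.

```c
#include <errno.h>
#include <stddef.h>

struct fake_domain;

/* Hypothetical, cut-down ops table: only the auxd_id callback is modelled. */
struct fake_iommu_ops {
	int (*auxd_id)(struct fake_domain *domain);
};

/* Hypothetical stand-in for an iommu domain. */
struct fake_domain {
	const struct fake_iommu_ops *ops;
	int default_pasid;	/* <= 0 means no PASID has been allocated */
};

/* Generic wrapper: dispatch to the driver callback, else -EINVAL. */
static int fake_auxiliary_id(struct fake_domain *domain)
{
	if (domain->ops && domain->ops->auxd_id)
		return domain->ops->auxd_id(domain);

	return -EINVAL;
}

/* Driver callback: same "positive PASID or -EINVAL" rule as the patch. */
static int fake_vtd_auxd_id(struct fake_domain *domain)
{
	return domain->default_pasid > 0 ?
		domain->default_pasid : -EINVAL;
}

/* One driver that implements the callback, and one that does not. */
static const struct fake_iommu_ops fake_vtd_ops = {
	.auxd_id = fake_vtd_auxd_id,
};
static const struct fake_iommu_ops fake_legacy_ops = {
	.auxd_id = NULL,
};
```

A caller only gets a usable ID when both conditions hold: the driver provides the callback and the domain already has a default PASID. In every other case it sees -EINVAL and cannot tell the two failure reasons apart.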
[RFC PATCH v2 07/10] vfio/mdev: Add mediated device domain type
A parent device might create different types of mediated devices. For example, a mediated device could be created by the parent device with full isolation and protection provided by the IOMMU. One use case is found on Intel platforms, where a mediated device is an assignable subset of a PCI device and all DMA requests on its behalf are tagged with a PASID. Since the IOMMU supports PASID-granular translations (scalable mode in vt-d 3.0), such a mediated device can be individually protected and isolated by the IOMMU.

This patch defines the domain types of a mediated device and allows the parent driver to specify this attribute when a mediated device is being created. The following types are defined:

* DOMAIN_TYPE_NO_IOMMU - No IOMMU support is needed. All isolation and protection are handled by the parent device driver through its callbacks, using a device-specific mechanism.
* DOMAIN_TYPE_ATTACH_PARENT - The IOMMU can isolate and protect this mediated device, and an isolation domain should be attached to the parent device.

This also reserves a place in the mdev private data structure to save the iommu domain, and adds interfaces to store and retrieve the domain. The following APIs are introduced:

* mdev_set/get_domain_type(type) - Set or query the domain type of a mediated device. The parent device driver should set the domain type (or keep the default, DOMAIN_TYPE_NO_IOMMU) during mediated device creation.
* mdev_set/get_domain(domain) - An iommu domain that has been attached to the parent device in order to protect and isolate the mediated device is kept in the mdev data structure and can be retrieved later.

Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Suggested-by: Kevin Tian Signed-off-by: Lu Baolu --- drivers/vfio/mdev/mdev_core.c| 36 drivers/vfio/mdev/mdev_private.h | 2 ++ include/linux/mdev.h | 26 +++ 3 files changed, 64 insertions(+) diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c index 0212f0ee8aea..d45a829c5b11 100644 --- a/drivers/vfio/mdev/mdev_core.c +++ b/drivers/vfio/mdev/mdev_core.c @@ -390,6 +390,42 @@ int mdev_device_remove(struct device *dev, bool force_remove) return 0; } +int mdev_set_domain_type(struct device *dev, enum mdev_domain_type type) +{ + struct mdev_device *mdev = to_mdev_device(dev); + + mdev->domain_type = type; + + return 0; +} +EXPORT_SYMBOL(mdev_set_domain_type); + +enum mdev_domain_type mdev_get_domain_type(struct device *dev) +{ + struct mdev_device *mdev = to_mdev_device(dev); + + return mdev->domain_type; +} +EXPORT_SYMBOL(mdev_get_domain_type); + +int mdev_set_domain(struct device *dev, void *domain) +{ + struct mdev_device *mdev = to_mdev_device(dev); + + mdev->domain = domain; + + return 0; +} +EXPORT_SYMBOL(mdev_set_domain); + +void *mdev_get_domain(struct device *dev) +{ + struct mdev_device *mdev = to_mdev_device(dev); + + return mdev->domain; +} +EXPORT_SYMBOL(mdev_get_domain); + static int __init mdev_init(void) { return mdev_bus_register(); diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h index b5819b7d7ef7..fd9e33fbd6e5 100644 --- a/drivers/vfio/mdev/mdev_private.h +++ b/drivers/vfio/mdev/mdev_private.h @@ -34,6 +34,8 @@ struct mdev_device { struct list_head next; struct kobject *type_kobj; bool active; + int domain_type; + void *domain; }; #define to_mdev_device(dev)container_of(dev, struct mdev_device, dev) diff --git a/include/linux/mdev.h b/include/linux/mdev.h index b6e048e1045f..3224587bda1e 100644 --- a/include/linux/mdev.h +++ b/include/linux/mdev.h @@ -15,6 +15,32 @@ struct mdev_device; +enum mdev_domain_type { + DOMAIN_TYPE_NO_IOMMU, /* Don't 
need any IOMMU support. +* All isolation and protection +* are handled by the parent +* device driver with a device +* specific mechanism. +*/ + DOMAIN_TYPE_ATTACH_PARENT, /* IOMMU can isolate and protect +* the mdev, and the isolation +* domain should be attached to +* the parent device. +*/ +}; + +/* + * Called by the parent device driver to set the domain type. + * By default, the domain type is set to DOMAIN_TYPE_NO_IOMMU. + */ +int mdev_set_domain_type(struct device *dev, enum mdev_domain_type type); + +/* Check the domain type. */ +enum mdev_domain_type mdev_get_domain_type(struct device *dev); + +int mdev_set_domain(struct device *dev, void *domain); +void *mdev_get_domain(struct device *dev); +
[RFC PATCH v2 09/10] vfio/type1: Determine domain type of an mdev group
This adds support for determining the domain type of a group of mediated devices according to the domain type attribute of each device. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Sanjay Kumar Signed-off-by: Lu Baolu --- drivers/vfio/vfio_iommu_type1.c | 47 ++--- 1 file changed, 43 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 89e2e6123223..64bf55b91de1 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -1390,6 +1390,27 @@ static void vfio_iommu_detach_group(struct vfio_domain *domain, iommu_detach_group(domain->domain, group->iommu_group); } +static int vfio_mdev_domain_type(struct device *dev, void *data) +{ + enum mdev_domain_type new, *old = data; + enum mdev_domain_type (*fn)(struct device *dev); + + fn = symbol_get(mdev_get_domain_type); + if (fn) { + new = fn(dev); + symbol_put(mdev_get_domain_type); + + if (*old && *old != new) + return -EINVAL; + + *old = new; + + return 0; + } + + return -EINVAL; +} + static int vfio_iommu_type1_attach_group(void *iommu_data, struct iommu_group *iommu_group) { @@ -1433,9 +1454,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, mdev_bus = symbol_get(mdev_bus_type); - if (mdev_bus) { - if ((bus == mdev_bus) && !iommu_present(bus)) { - symbol_put(mdev_bus_type); + if (mdev_bus && bus == mdev_bus) { + enum mdev_domain_type type = 0; + + symbol_put(mdev_bus_type); + + /* Determine the domain type: */ + ret = iommu_group_for_each_dev(iommu_group, &type, + vfio_mdev_domain_type); + if (ret) + goto out_free; + + switch (type) { + case DOMAIN_TYPE_NO_IOMMU: if (!iommu->external_domain) { INIT_LIST_HEAD(&domain->group_list); iommu->external_domain = domain; @@ -1445,11 +1476,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, list_add(&group->next, &iommu->external_domain->group_list); mutex_unlock(&iommu->lock); + return 0; + case DOMAIN_TYPE_ATTACH_PARENT: + /* FALLTHROUGH */ + default: + ret = -EINVAL; + 
goto out_free; } - symbol_put(mdev_bus_type); } + if (mdev_bus) + symbol_put(mdev_bus_type); + domain->domain = iommu_domain_alloc(bus); if (!domain->domain) { ret = -EIO; -- 2.17.1
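The group-wide check performed by vfio_mdev_domain_type() can be sketched in userspace C as below. This is a minimal illustration and not the patch's code: struct fake_mdev and group_domain_type() are hypothetical names, and the sketch tracks an explicit "seen" flag instead of reusing the current type value as its own "not set yet" marker. That choice matters because DOMAIN_TYPE_NO_IOMMU is the zero value of the enum. Rejecting an empty group is likewise an assumption of the sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the enum from the series; everything else here is hypothetical. */
enum mdev_domain_type {
	DOMAIN_TYPE_NO_IOMMU,
	DOMAIN_TYPE_ATTACH_PARENT,
};

/* Hypothetical stand-in for one mediated device in an iommu group. */
struct fake_mdev {
	enum mdev_domain_type domain_type;
};

/*
 * Store the common domain type in *out and return 0 if every device in
 * the group agrees. Return -EINVAL if the group is empty or mixes types.
 */
static int group_domain_type(const struct fake_mdev *devs, size_t n,
			     enum mdev_domain_type *out)
{
	bool seen = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (seen && devs[i].domain_type != *out)
			return -EINVAL;

		*out = devs[i].domain_type;
		seen = true;
	}

	return seen ? 0 : -EINVAL;
}
```

With the explicit flag, a group that starts with a NO_IOMMU device and then contains an ATTACH_PARENT device is rejected in the same way as the reverse ordering.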
[RFC PATCH v2 08/10] vfio/type1: Add domain at(de)taching group helpers
If a domain is attaching to a group which includes the mediated devices, it should attach to the mdev parent of each mdev. This adds a helper for attaching domain to group, no matter a PCI physical device or mediated devices which are derived from a PCI physical device. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Lu Baolu --- drivers/vfio/vfio_iommu_type1.c | 77 ++--- 1 file changed, 70 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index d9fd3188615d..89e2e6123223 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -91,6 +91,9 @@ struct vfio_dma { struct vfio_group { struct iommu_group *iommu_group; struct list_headnext; + boolattach_parent; /* An mdev group with domain +* attached to parent +*/ }; /* @@ -1327,6 +1330,66 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group *group, phys_addr_t *base) return ret; } +static int vfio_mdev_set_aux_domain(struct device *dev, + struct iommu_domain *domain) +{ + int (*fn)(struct device *dev, void *domain); + int ret; + + fn = symbol_get(mdev_set_domain); + if (fn) { + ret = fn(dev, domain); + symbol_put(mdev_set_domain); + + return ret; + } + + return -EINVAL; +} + +static int vfio_attach_aux_domain(struct device *dev, void *data) +{ + struct iommu_domain *domain = data; + int ret; + + ret = vfio_mdev_set_aux_domain(dev, domain); + if (ret) + return ret; + + return iommu_attach_device(domain, dev->parent); +} + +static int vfio_detach_aux_domain(struct device *dev, void *data) +{ + struct iommu_domain *domain = data; + + vfio_mdev_set_aux_domain(dev, NULL); + iommu_detach_device(domain, dev->parent); + + return 0; +} + +static int vfio_iommu_attach_group(struct vfio_domain *domain, + struct vfio_group *group) +{ + if (group->attach_parent) + return iommu_group_for_each_dev(group->iommu_group, + domain->domain, + vfio_attach_aux_domain); + else + return iommu_attach_group(domain->domain, 
group->iommu_group); +} + +static void vfio_iommu_detach_group(struct vfio_domain *domain, + struct vfio_group *group) +{ + if (group->attach_parent) + iommu_group_for_each_dev(group->iommu_group, domain->domain, +vfio_detach_aux_domain); + else + iommu_detach_group(domain->domain, group->iommu_group); +} + static int vfio_iommu_type1_attach_group(void *iommu_data, struct iommu_group *iommu_group) { @@ -1402,7 +1465,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, goto out_domain; } - ret = iommu_attach_group(domain->domain, iommu_group); + ret = vfio_iommu_attach_group(domain, group); if (ret) goto out_domain; @@ -1434,8 +1497,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, list_for_each_entry(d, >domain_list, next) { if (d->domain->ops == domain->domain->ops && d->prot == domain->prot) { - iommu_detach_group(domain->domain, iommu_group); - if (!iommu_attach_group(d->domain, iommu_group)) { + vfio_iommu_detach_group(domain, group); + if (!vfio_iommu_attach_group(d, group)) { list_add(>next, >group_list); iommu_domain_free(domain->domain); kfree(domain); @@ -1443,7 +1506,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, return 0; } - ret = iommu_attach_group(domain->domain, iommu_group); + ret = vfio_iommu_attach_group(domain, group); if (ret) goto out_domain; } @@ -1469,7 +1532,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, return 0; out_detach: - iommu_detach_group(domain->domain, iommu_group); + vfio_iommu_detach_group(domain, group); out_domain: iommu_domain_free(domain->domain); out_free: @@ -1560,7 +1623,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data, if (!group) continue; - iommu_detach_group(domain->domain, iommu_group); +
[RFC PATCH v2 05/10] iommu/vt-d: Attach/detach domains in auxiliary mode
When multiple domains per device have been enabled by the device driver, the device will tag all DMA traffic out of a subset of the device with the default PASID of the domain, and the IOMMU should translate the DMA requests at PASID granularity. This extends the intel_iommu_attach/detach_device() ops to manage PASID-granular translation structures when the device driver has enabled multiple domains per device. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Sanjay Kumar Signed-off-by: Lu Baolu --- drivers/iommu/intel-iommu.c | 132 +++- include/linux/intel-iommu.h | 10 +++ 2 files changed, 139 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 3606d25bc40c..784bd496f316 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2502,6 +2502,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu, info->iommu = iommu; info->pasid_table = NULL; info->auxd_enabled = 0; + INIT_LIST_HEAD(&info->auxiliary_domains); if (dev && dev_is_pci(dev)) { struct pci_dev *pdev = to_pci_dev(info->dev); @@ -5036,6 +5037,124 @@ static void intel_iommu_domain_free(struct iommu_domain *domain) domain_exit(to_dmar_domain(domain)); } +/* + * Check whether a @domain will be attached to the @dev in the + * auxiliary mode. 
+ */ +static inline bool +is_device_attach_aux_domain(struct device *dev, struct iommu_domain *domain) +{ + struct device_domain_info *info = dev->archdata.iommu; + + return info && info->auxd_enabled && + domain->type == IOMMU_DOMAIN_UNMANAGED; +} + +static void auxiliary_link_device(struct dmar_domain *domain, + struct device *dev) +{ + struct device_domain_info *info = dev->archdata.iommu; + + assert_spin_locked(&device_domain_lock); + if (WARN_ON(!info)) + return; + + domain->auxd_refcnt++; + list_add(&domain->auxd, &info->auxiliary_domains); +} + +static void auxiliary_unlink_device(struct dmar_domain *domain, + struct device *dev) +{ + struct device_domain_info *info = dev->archdata.iommu; + + assert_spin_locked(&device_domain_lock); + if (WARN_ON(!info)) + return; + + list_del(&domain->auxd); + domain->auxd_refcnt--; + + if (!domain->auxd_refcnt && domain->default_pasid > 0) + intel_pasid_free_id(domain->default_pasid); +} + +static int domain_add_dev_auxd(struct dmar_domain *domain, + struct device *dev) +{ + int ret; + u8 bus, devfn; + unsigned long flags; + struct intel_iommu *iommu; + + iommu = device_to_iommu(dev, &bus, &devfn); + if (!iommu) + return -ENODEV; + + spin_lock_irqsave(&device_domain_lock, flags); + if (domain->default_pasid <= 0) { + domain->default_pasid = intel_pasid_alloc_id(domain, PASID_MIN, + intel_pasid_get_dev_max_id(dev), GFP_ATOMIC); + if (domain->default_pasid < 0) { + pr_err("Can't allocate default pasid\n"); + ret = -ENODEV; + goto pasid_failed; + } + } + + spin_lock(&iommu->lock); + ret = domain_attach_iommu(domain, iommu); + if (ret) + goto attach_failed; + + /* Setup the PASID entry for mediated devices: */ + ret = intel_pasid_setup_second_level(iommu, domain, dev, +domain->default_pasid, false); + if (ret) + goto table_failed; + spin_unlock(&iommu->lock); + + auxiliary_link_device(domain, dev); + + spin_unlock_irqrestore(&device_domain_lock, flags); + + return 0; + +table_failed: + domain_detach_iommu(domain, iommu); +attach_failed: + spin_unlock(&iommu->lock); + if (!domain->auxd_refcnt && 
domain->default_pasid > 0) + intel_pasid_free_id(domain->default_pasid); +pasid_failed: + spin_unlock_irqrestore(&device_domain_lock, flags); + + return ret; +} + +static void domain_remove_dev_aux(struct dmar_domain *domain, + struct device *dev) +{ + struct device_domain_info *info; + struct intel_iommu *iommu; + unsigned long flags; + + spin_lock_irqsave(&device_domain_lock, flags); + info = dev->archdata.iommu; + iommu = info->iommu; + + intel_pasid_tear_down_second_level(iommu, domain, + dev, domain->default_pasid); + + auxiliary_unlink_device(domain, dev); + + spin_lock(&iommu->lock); + domain_detach_iommu(domain, iommu); + spin_unlock(&iommu->lock); + + spin_unlock_irqrestore(&device_domain_lock, flags); +} + static int intel_iommu_attach_device(struct iommu_domain *domain,
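The lifetime rule that the helpers above apply to the default PASID (allocate on the first auxiliary attach, share on later attaches, free when the last device is unlinked) can be sketched in userspace C. All fake_* names and the tiny PASID range are hypothetical, locking is left out, and resetting default_pasid to 0 after the free is a choice of this sketch, not something shown in the patch.

```c
#include <errno.h>

#define FAKE_PASID_MIN 1	/* hypothetical stand-in for PASID_MIN */
#define FAKE_PASID_MAX 4	/* hypothetical, kept tiny on purpose */

/* Hypothetical allocator state: one in-use flag per PASID value. */
static int pasid_in_use[FAKE_PASID_MAX + 1];

static int fake_pasid_alloc(void)
{
	int id;

	for (id = FAKE_PASID_MIN; id <= FAKE_PASID_MAX; id++) {
		if (!pasid_in_use[id]) {
			pasid_in_use[id] = 1;
			return id;
		}
	}

	return -ENOSPC;
}

static void fake_pasid_free(int id)
{
	pasid_in_use[id] = 0;
}

struct fake_domain {
	int default_pasid;	/* <= 0 means "not allocated" */
	int auxd_refcnt;	/* devices attached in auxiliary mode */
};

/* The first attach allocates the default PASID, later ones reuse it. */
static int fake_aux_attach(struct fake_domain *d)
{
	if (d->default_pasid <= 0) {
		int id = fake_pasid_alloc();

		if (id < 0)
			return -ENODEV;
		d->default_pasid = id;
	}

	d->auxd_refcnt++;
	return 0;
}

/* The last detach releases the PASID so that it can be reused. */
static void fake_aux_detach(struct fake_domain *d)
{
	d->auxd_refcnt--;

	if (!d->auxd_refcnt && d->default_pasid > 0) {
		fake_pasid_free(d->default_pasid);
		d->default_pasid = 0;
	}
}
```

Two attaches on the same domain leave it with a refcount of 2 and a single PASID. The PASID only returns to the allocator after the second detach.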
[RFC PATCH v2 04/10] iommu/vt-d: Enable/disable multiple domains per device
Add iommu ops for enabling and disabling multiple domains for a device. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Lu Baolu --- drivers/iommu/intel-iommu.c | 36 include/linux/intel-iommu.h | 1 + 2 files changed, 37 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 891ae70e7bf2..3606d25bc40c 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2501,6 +2501,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu, info->domain = domain; info->iommu = iommu; info->pasid_table = NULL; + info->auxd_enabled = 0; if (dev && dev_is_pci(dev)) { struct pci_dev *pdev = to_pci_dev(info->dev); @@ -5384,6 +5385,39 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev) } #endif /* CONFIG_INTEL_IOMMU_SVM */ +static int intel_iommu_enable_auxd(struct device *dev) +{ + struct device_domain_info *info; + struct dmar_domain *domain; + unsigned long flags; + + if (!scalable_mode_support()) + return -ENODEV; + + domain = get_valid_domain_for_dev(dev); + if (!domain) + return -ENODEV; + + spin_lock_irqsave(&device_domain_lock, flags); + info = dev->archdata.iommu; + info->auxd_enabled = 1; + spin_unlock_irqrestore(&device_domain_lock, flags); + + return 0; +} + +static void intel_iommu_disable_auxd(struct device *dev) +{ + struct device_domain_info *info; + unsigned long flags; + + spin_lock_irqsave(&device_domain_lock, flags); + info = dev->archdata.iommu; + if (!WARN_ON(!info)) + info->auxd_enabled = 0; + spin_unlock_irqrestore(&device_domain_lock, flags); +} + const struct iommu_ops intel_iommu_ops = { .capable= intel_iommu_capable, .domain_alloc = intel_iommu_domain_alloc, @@ -5398,6 +5432,8 @@ const struct iommu_ops intel_iommu_ops = { .get_resv_regions = intel_iommu_get_resv_regions, .put_resv_regions = intel_iommu_put_resv_regions, .device_group = pci_device_group, + .enable_auxd= intel_iommu_enable_auxd, + .disable_auxd = intel_iommu_disable_auxd, .pgsize_bitmap = 
INTEL_IOMMU_PGSIZES, }; diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index b34cf8b887a0..15981245796e 100644 --- a/include/linux/intel-iommu.h +++ b/include/linux/intel-iommu.h @@ -487,6 +487,7 @@ struct device_domain_info { u8 pri_enabled:1; u8 ats_supported:1; u8 ats_enabled:1; + u8 auxd_enabled:1; /* Multiple domains per device */ u8 ats_qdep; struct device *dev; /* it's NULL for PCIe-to-PCI bridge */ struct intel_iommu *iommu; /* IOMMU used by this device */ -- 2.17.1
[RFC PATCH v2 03/10] iommu/amd: Add default branch in amd_iommu_capable()
Otherwise, there will be a build warning: drivers/iommu/amd_iommu.c:3083:2: warning: enumeration value 'IOMMU_CAP_AUX_DOMAIN' not handled in switch [-Wswitch] There is no functional change. Signed-off-by: Lu Baolu --- drivers/iommu/amd_iommu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 4e04fff23977..237ae6db4cfd 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -3077,6 +3077,8 @@ static bool amd_iommu_capable(enum iommu_cap cap) return (irq_remapping_enabled == 1); case IOMMU_CAP_NOEXEC: return false; + default: + break; } return false; -- 2.17.1
[RFC PATCH v2 01/10] iommu: Add APIs for multiple domains per device
Sharing a physical PCI device at a finer granularity is becoming a consensus in the industry, and IOMMU vendors are working to support such sharing as well as possible. A common requirement across these efforts is finer-granularity DMA isolation, for security reasons. With finer-granularity DMA isolation, all DMA requests to or from a subset of a physical PCI device can be protected by the IOMMU. As a result, software needs to be able to attach multiple domains to a physical PCI device. One example of such a usage model is Intel Scalable IOV [1] [2]. The Intel vt-d 3.0 spec [3] introduces the scalable mode, which enables PASID-granularity DMA isolation.

This adds the APIs to support multiple domains per device. To ease discussion, we call it 'a domain in auxiliary mode', or simply 'auxiliary domain', when multiple domains are attached to a physical device.

The APIs include:

* iommu_capable(IOMMU_CAP_AUX_DOMAIN) - Represents the ability to support multiple domains per device.
* iommu_en(dis)able_aux_domain(struct device *dev) - Enable/disable the multiple domains capability for the device referenced by @dev.
* iommu_auxiliary_id(struct iommu_domain *domain) - Return the ID used for finer-granularity DMA translation. For the Intel Scalable IOV usage model, this is a PASID. A device that supports Scalable IOV needs to write this ID to a device register so that DMA requests are tagged with the right PASID prefix.

Many people were involved in the discussions of this design: Kevin Tian, Liu Yi L, Ashok Raj, Sanjay Kumar, Alex Williamson and Jean-Philippe Brucker. Some of the discussions can be found here [4].

[1] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification [2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf [3] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification [4] https://lkml.org/lkml/2018/7/26/4 Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Suggested-by: Kevin Tian Signed-off-by: Lu Baolu --- drivers/iommu/iommu.c | 29 + include/linux/iommu.h | 13 + 2 files changed, 42 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 8c15c5980299..2c6faf417dd5 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2014,3 +2014,32 @@ int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids) return 0; } EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids); + +int iommu_enable_aux_domain(struct device *dev) +{ + const struct iommu_ops *ops = dev->bus->iommu_ops; + + if (ops && ops->enable_auxd) + return ops->enable_auxd(dev); + + return -EINVAL; +} +EXPORT_SYMBOL_GPL(iommu_enable_aux_domain); + +void iommu_disable_aux_domain(struct device *dev) +{ + const struct iommu_ops *ops = dev->bus->iommu_ops; + + if (ops && ops->disable_auxd) + ops->disable_auxd(dev); +} +EXPORT_SYMBOL_GPL(iommu_disable_aux_domain); + +int iommu_auxiliary_id(struct iommu_domain *domain) +{ + if (domain->ops->auxd_id) + return domain->ops->auxd_id(domain); + + return -EINVAL; +} +EXPORT_SYMBOL_GPL(iommu_auxiliary_id); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 87994c265bf5..ffd20b315bee 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -101,6 +101,8 @@ enum iommu_cap { transactions */ IOMMU_CAP_INTR_REMAP, /* IOMMU supports interrupt isolation */ IOMMU_CAP_NOEXEC, /* IOMMU_NOEXEC flag */ + IOMMU_CAP_AUX_DOMAIN, /* IOMMU supports multiple domains per + device */ }; /* @@ -185,6 +187,9 @@ struct iommu_resv_region { * @domain_get_windows: Return the number of windows for a domain * @of_xlate: add OF 
master IDs to iommu grouping * @pgsize_bitmap: bitmap of all possible supported page sizes + * @enable_auxd: enable multiple domains per device support + * @disable_auxd: disable multiple domains per device support + * @auxd_id: return the id of an auxiliary domain */ struct iommu_ops { bool (*capable)(enum iommu_cap); @@ -231,6 +236,10 @@ struct iommu_ops { int (*of_xlate)(struct device *dev, struct of_phandle_args *args); bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev); + int (*enable_auxd)(struct device *dev); + void (*disable_auxd)(struct device *dev); + int (*auxd_id)(struct iommu_domain *domain); + unsigned long pgsize_bitmap; }; @@ -400,6 +409,10 @@ void iommu_fwspec_free(struct device *dev); int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids); const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode); +int
[RFC PATCH v2 00/10] vfio/mdev: IOMMU aware mediated device
Hi,

The Mediated Device (mdev) is a framework for fine-grained physical device sharing across isolated domains. Currently the mdev framework is designed to be independent of the platform IOMMU support. As a result, DMA isolation relies on the mdev parent device in a vendor-specific way.

There are several cases where a mediated device could be protected and isolated by the platform IOMMU. For example, Intel vt-d rev3.0 [1] introduces a new translation mode called 'scalable mode', which enables PASID-granular translations. The vt-d scalable mode is the key ingredient for Scalable I/O Virtualization [2] [3], which allows sharing a device at the smallest possible granularity (ADI - Assignable Device Interface).

A mediated device backed by an ADI could be protected and isolated by the IOMMU since 1) the parent device supports tagging a unique PASID to all DMA traffic out of the mediated device; and 2) the DMA translation unit (IOMMU) supports PASID-granular translation. We can apply IOMMU protection and isolation to this kind of device just as we do with an assignable PCI device.

In order to distinguish the IOMMU-capable mediated devices from those which still need to rely on parent devices, this patch set adds a domain type attribute to each mdev.

enum mdev_domain_type {
	DOMAIN_TYPE_NO_IOMMU,		/* Don't need any IOMMU support.
					 * All isolation and protection
					 * are handled by the parent
					 * device driver with a device
					 * specific mechanism.
					 */
	DOMAIN_TYPE_ATTACH_PARENT,	/* IOMMU can isolate and protect
					 * the mdev, and the isolation
					 * domain should be attached to
					 * the parent device.
					 */
};

The mdev parent device driver can opt in to IOMMU-capable mdevs by invoking the interface below within its @create callback when the device is created:

int mdev_set_domain_type(struct device *dev, enum mdev_domain_type type);

In vfio_iommu_type1_attach_group(), a domain allocated through iommu_domain_alloc() will be attached to the mdev parent device if the mdev devices in the group are all of type ATTACH_PARENT; otherwise, the dummy external domain will be used and all DMA isolation and protection are left to the parent driver.

On the IOMMU side, a basic requirement is to allow attaching multiple domains to a PCI device if the device advertises the capability and the IOMMU hardware supports finer-granularity translations than the normal PCI Source ID based translation.

For ease of discussion, we call it 'a domain in auxiliary mode', or simply 'an auxiliary domain', when a domain is attached to a device for finer-granularity translations (than the Source ID based one). But we need to keep in mind that this doesn't mean there are two types of domains. The same domain could be bound to one device for Source ID based translation and to another device for finer-granularity translation at the same time.

The APIs below are introduced in the IOMMU glue for device drivers to use the finer-granularity translation.

* iommu_capable(IOMMU_CAP_AUX_DOMAIN) - Represents the ability of the IOMMU hardware to support multiple domains per device (a.k.a. finer-granularity translations).
* iommu_en(dis)able_aux_domain(struct device *dev) - Enable/disable the multiple domains capability for the device referenced by @dev.
* iommu_auxiliary_id(struct iommu_domain *domain) - Return the index value used for finer-granularity DMA translation. The specific device driver needs to feed the hardware with this value, so that the hardware device can issue DMA transactions tagged with it.

This patch series extends both the IOMMU and vfio components to support mdev device passthrough when the device can be isolated and protected by the IOMMU units. The first part of this series (PATCH 1/10 ~ 6/10) adds the interfaces and implementation of multiple domains per device. The second part (PATCH 7/10 ~ 10/10) adds the domain type attribute to each mdev, determines the domain type according to the attribute when attaching a group in the vfio type1 iommu module, and binds an auxiliary domain to a group whose mediated devices all require their own domain.

This patch series depends on a patch set posted here [4] for discussion, which adds support for scalable mode in the Intel IOMMU driver.

References:
[1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
[4] https://lkml.org/lkml/2018/8/30/27

Best
[RFC PATCH v2 02/10] iommu/vt-d: Add multiple domains per device query
Answer the IOMMU_CAP_AUX_DOMAIN capability query made through iommu_capable(): return true if all active IOMMUs support scalable mode, and false otherwise. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Lu Baolu --- drivers/iommu/intel-iommu.c | 31 +-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 3e49d4029058..891ae70e7bf2 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5193,12 +5193,39 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain, return phys; } +static inline bool scalable_mode_support(void) +{ + struct dmar_drhd_unit *drhd; + struct intel_iommu *iommu; + bool ret = true; + + rcu_read_lock(); + for_each_active_iommu(iommu, drhd) { + if (!sm_supported(iommu)) { + ret = false; + break; + } + } + rcu_read_unlock(); + + return ret; +} + static bool intel_iommu_capable(enum iommu_cap cap) { - if (cap == IOMMU_CAP_CACHE_COHERENCY) + switch (cap) { + case IOMMU_CAP_CACHE_COHERENCY: return domain_update_iommu_snooping(NULL) == 1; - if (cap == IOMMU_CAP_INTR_REMAP) + case IOMMU_CAP_INTR_REMAP: return irq_remapping_enabled == 1; + case IOMMU_CAP_AUX_DOMAIN: + return scalable_mode_support(); + case IOMMU_CAP_NOEXEC: + /* FALLTHROUGH */ + default: + pr_info("Unsupported capability query %d\n", cap); + break; + } return false; } -- 2.17.1
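The "every active unit must support it" rule in scalable_mode_support() can be sketched in userspace C as follows. struct fake_iommu and fake_scalable_mode_support() are hypothetical names, the array stands in for the DRHD unit list, and RCU protection is left out. As in the patch, the result starts out true, so a system with no active units reports support.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one IOMMU unit. */
struct fake_iommu {
	bool active;		/* skipped by the walk when false */
	bool sm_supported;	/* unit supports scalable mode */
};

/*
 * Return false as soon as one active unit lacks scalable mode.
 * Inactive units are ignored. With no active unit the answer is true.
 */
static bool fake_scalable_mode_support(const struct fake_iommu *units,
				       size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!units[i].active)
			continue;
		if (!units[i].sm_supported)
			return false;
	}

	return true;
}
```

The capability is therefore reported platform-wide rather than per device: one active legacy-mode unit is enough to make the IOMMU_CAP_AUX_DOMAIN query return false for every caller.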
[PATCH v2 12/12] iommu/vt-d: Remove deferred invalidation
Deferred invalidation is an ECS specific feature. It will not be supported when IOMMU works in scalable mode. As we deprecated the ECS support, remove deferred invalidation and cleanup the code. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Lu Baolu Reviewed-by: Ashok Raj --- drivers/iommu/intel-iommu.c | 1 - drivers/iommu/intel-svm.c | 45 - include/linux/intel-iommu.h | 8 --- 3 files changed, 54 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index e378a383d4f4..3e49d4029058 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -1722,7 +1722,6 @@ static void free_dmar_iommu(struct intel_iommu *iommu) if (pasid_supported(iommu)) { if (ecap_prs(iommu->ecap)) intel_svm_finish_prq(iommu); - intel_svm_exit(iommu); } #endif } diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index fa5a19d83795..3f5ed33c56f0 100644 --- a/drivers/iommu/intel-svm.c +++ b/drivers/iommu/intel-svm.c @@ -31,15 +31,8 @@ static irqreturn_t prq_event_thread(int irq, void *d); -struct pasid_state_entry { - u64 val; -}; - int intel_svm_init(struct intel_iommu *iommu) { - struct page *pages; - int order; - if (cpu_feature_enabled(X86_FEATURE_GBPAGES) && !cap_fl1gp_support(iommu->cap)) return -EINVAL; @@ -48,39 +41,6 @@ int intel_svm_init(struct intel_iommu *iommu) !cap_5lp_support(iommu->cap)) return -EINVAL; - /* Start at 2 because it's defined as 2^(1+PSS) */ - iommu->pasid_max = 2 << ecap_pss(iommu->ecap); - - /* Eventually I'm promised we will get a multi-level PASID table -* and it won't have to be physically contiguous. Until then, -* limit the size because 8MiB contiguous allocations can be hard -* to come by. The limit of 0x2, which is 1MiB for each of -* the PASID and PASID-state tables, is somewhat arbitrary. 
*/ - if (iommu->pasid_max > 0x2) - iommu->pasid_max = 0x2; - - order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max); - if (ecap_dis(iommu->ecap)) { - pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order); - if (pages) - iommu->pasid_state_table = page_address(pages); - else - pr_warn("IOMMU: %s: Failed to allocate PASID state table\n", - iommu->name); - } - - return 0; -} - -int intel_svm_exit(struct intel_iommu *iommu) -{ - int order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max); - - if (iommu->pasid_state_table) { - free_pages((unsigned long)iommu->pasid_state_table, order); - iommu->pasid_state_table = NULL; - } - return 0; } @@ -214,11 +174,6 @@ static void intel_flush_svm_range(struct intel_svm *svm, unsigned long address, { struct intel_svm_dev *sdev; - /* Try deferred invalidate if available */ - if (svm->iommu->pasid_state_table && - !cmpxchg64(>iommu->pasid_state_table[svm->pasid].val, 0, 1ULL << 63)) - return; - rcu_read_lock(); list_for_each_entry_rcu(sdev, >devs, list) intel_flush_svm_range_dev(svm, sdev, address, pages, ih, gl); diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index 30e2bbfbbd50..b34cf8b887a0 100644 --- a/include/linux/intel-iommu.h +++ b/include/linux/intel-iommu.h @@ -457,15 +457,8 @@ struct intel_iommu { struct iommu_flush flush; #endif #ifdef CONFIG_INTEL_IOMMU_SVM - /* These are large and need to be contiguous, so we allocate just -* one for now. We'll maybe want to rethink that if we truly give -* devices away to userspace processes (e.g. for DPDK) and don't -* want to trust that userspace will use *only* the PASID it was -* told to. But while it's all driver-arbitrated, we're fine. 
*/ - struct pasid_state_entry *pasid_state_table; struct page_req_dsc *prq; unsigned char prq_name[16];/* Name for PRQ interrupt */ - u32 pasid_max; #endif struct q_inval *qi;/* Queued invalidation info */ u32 *iommu_state; /* Store iommu states between suspend and resume.*/ @@ -579,7 +572,6 @@ void iommu_flush_write_buffer(struct intel_iommu *iommu); #ifdef CONFIG_INTEL_IOMMU_SVM int intel_svm_init(struct intel_iommu *iommu); -int intel_svm_exit(struct intel_iommu *iommu); extern int intel_svm_enable_prq(struct intel_iommu *iommu); extern int intel_svm_finish_prq(struct intel_iommu *iommu); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 11/12] iommu/vt-d: Shared virtual address in scalable mode
This patch enables the current SVA (Shared Virtual Address) implementation to work in the scalable mode. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Sanjay Kumar Signed-off-by: Lu Baolu Reviewed-by: Ashok Raj --- drivers/iommu/intel-iommu.c | 40 +--- drivers/iommu/intel-pasid.c | 2 +- drivers/iommu/intel-pasid.h | 1 - drivers/iommu/intel-svm.c | 57 +++ include/linux/dma_remapping.h | 9 +- 5 files changed, 20 insertions(+), 89 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index d854b17033a4..e378a383d4f4 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5279,18 +5279,6 @@ static void intel_iommu_put_resv_regions(struct device *dev, } #ifdef CONFIG_INTEL_IOMMU_SVM -static inline unsigned long intel_iommu_get_pts(struct device *dev) -{ - int pts, max_pasid; - - max_pasid = intel_pasid_get_dev_max_id(dev); - pts = find_first_bit((unsigned long *)_pasid, MAX_NR_PASID_BITS); - if (pts < 5) - return 0; - - return pts - 5; -} - int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev) { struct device_domain_info *info; @@ -5318,37 +5306,11 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd ctx_lo = context[0].lo; - sdev->did = domain->iommu_did[iommu->seq_id]; + sdev->did = FLPT_DEFAULT_DID; sdev->sid = PCI_DEVID(info->bus, info->devfn); if (!(ctx_lo & CONTEXT_PASIDE)) { - if (iommu->pasid_state_table) - context[1].hi = (u64)virt_to_phys(iommu->pasid_state_table); - context[1].lo = (u64)virt_to_phys(info->pasid_table->table) | - intel_iommu_get_pts(sdev->dev); - - wmb(); - /* CONTEXT_TT_MULTI_LEVEL and CONTEXT_TT_DEV_IOTLB are both -* extended to permit requests-with-PASID if the PASIDE bit -* is set. which makes sense. For CONTEXT_TT_PASS_THROUGH, -* however, the PASIDE bit is ignored and requests-with-PASID -* are unconditionally blocked. Which makes less sense. 
-* So convert from CONTEXT_TT_PASS_THROUGH to one of the new -* "guest mode" translation types depending on whether ATS -* is available or not. Annoyingly, we can't use the new -* modes *unless* PASIDE is set. */ - if ((ctx_lo & CONTEXT_TT_MASK) == (CONTEXT_TT_PASS_THROUGH << 2)) { - ctx_lo &= ~CONTEXT_TT_MASK; - if (info->ats_supported) - ctx_lo |= CONTEXT_TT_PT_PASID_DEV_IOTLB << 2; - else - ctx_lo |= CONTEXT_TT_PT_PASID << 2; - } ctx_lo |= CONTEXT_PASIDE; - if (iommu->pasid_state_table) - ctx_lo |= CONTEXT_DINVE; - if (info->pri_supported) - ctx_lo |= CONTEXT_PRS; context[0].lo = ctx_lo; wmb(); iommu->flush.flush_context(iommu, sdev->did, sdev->sid, diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c index c921426d7b64..a24a11bae03e 100644 --- a/drivers/iommu/intel-pasid.c +++ b/drivers/iommu/intel-pasid.c @@ -283,7 +283,7 @@ static inline void pasid_clear_entry(struct pasid_entry *pe) WRITE_ONCE(pe->val[7], 0); } -void intel_pasid_clear_entry(struct device *dev, int pasid) +static void intel_pasid_clear_entry(struct device *dev, int pasid) { struct pasid_entry *pe; diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h index ee5ac3d2ac22..9f628db9db41 100644 --- a/drivers/iommu/intel-pasid.h +++ b/drivers/iommu/intel-pasid.h @@ -50,7 +50,6 @@ void intel_pasid_free_table(struct device *dev); struct pasid_table *intel_pasid_get_table(struct device *dev); int intel_pasid_get_dev_max_id(struct device *dev); struct pasid_entry *intel_pasid_get_entry(struct device *dev, int pasid); -void intel_pasid_clear_entry(struct device *dev, int pasid); int intel_pasid_setup_first_level(struct intel_iommu *iommu, struct mm_struct *mm, struct device *dev, diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index a06ed098e928..fa5a19d83795 100644 --- a/drivers/iommu/intel-svm.c +++ b/drivers/iommu/intel-svm.c @@ -29,10 +29,6 @@ #include "intel-pasid.h" -#define PASID_ENTRY_P BIT_ULL(0) -#define PASID_ENTRY_FLPM_5LP BIT_ULL(9) 
-#define PASID_ENTRY_SREBIT_ULL(11) - static irqreturn_t prq_event_thread(int irq, void *d); struct pasid_state_entry { @@ -248,20 +244,6 @@ static void intel_invalidate_range(struct mmu_notifier *mn,
[PATCH v2 10/12] iommu/vt-d: Add first level page table interface
This adds the interfaces to setup or tear down the structures for first level page table translation. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Sanjay Kumar Signed-off-by: Lu Baolu Reviewed-by: Ashok Raj --- drivers/iommu/intel-pasid.c | 89 + drivers/iommu/intel-pasid.h | 7 +++ include/linux/intel-iommu.h | 1 + 3 files changed, 97 insertions(+) diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c index edcea1d8b9fc..c921426d7b64 100644 --- a/drivers/iommu/intel-pasid.c +++ b/drivers/iommu/intel-pasid.c @@ -10,6 +10,7 @@ #define pr_fmt(fmt)"DMAR: " fmt #include +#include #include #include #include @@ -377,6 +378,26 @@ static inline void pasid_set_page_snoop(struct pasid_entry *pe, bool value) pasid_set_bits(>val[1], 1 << 23, value); } +/* + * Setup the First Level Page table Pointer field (Bit 140~191) + * of a scalable mode PASID entry. + */ +static inline void +pasid_set_flptr(struct pasid_entry *pe, u64 value) +{ + pasid_set_bits(>val[2], VTD_PAGE_MASK, value); +} + +/* + * Setup the First Level Paging Mode field (Bit 130~131) of a + * scalable mode PASID entry. + */ +static inline void +pasid_set_flpm(struct pasid_entry *pe, u64 value) +{ + pasid_set_bits(>val[2], GENMASK_ULL(3, 2), value << 2); +} + static void pasid_based_pasid_cache_invalidation(struct intel_iommu *iommu, int did, int pasid) @@ -445,6 +466,74 @@ static void tear_down_one_pasid_entry(struct intel_iommu *iommu, pasid_based_dev_iotlb_cache_invalidation(iommu, dev, pasid); } +/* + * Set up the scalable mode pasid table entry for first only + * translation type. 
+ */ +int intel_pasid_setup_first_level(struct intel_iommu *iommu, + struct mm_struct *mm, + struct device *dev, + u16 did, int pasid) +{ + struct pasid_entry *pte; + + if (!ecap_flts(iommu->ecap)) { + pr_err("No first level translation support on %s\n", + iommu->name); + return -EINVAL; + } + + pte = intel_pasid_get_entry(dev, pasid); + if (WARN_ON(!pte)) + return -EINVAL; + + pasid_clear_entry(pte); + + /* Setup the first level page table pointer: */ + if (mm) { + pasid_set_flptr(pte, (u64)__pa(mm->pgd)); + } else { + pasid_set_sre(pte); + pasid_set_flptr(pte, (u64)__pa(init_mm.pgd)); + } + +#ifdef CONFIG_X86 + if (cpu_feature_enabled(X86_FEATURE_LA57)) + pasid_set_flpm(pte, 1); +#endif /* CONFIG_X86 */ + + pasid_set_domain_id(pte, did); + pasid_set_address_width(pte, iommu->agaw); + pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap)); + + /* Setup Present and PASID Granular Transfer Type: */ + pasid_set_translation_type(pte, 1); + pasid_set_present(pte); + + if (!ecap_coherent(iommu->ecap)) + clflush_cache_range(pte, sizeof(*pte)); + + if (cap_caching_mode(iommu->cap)) { + pasid_based_pasid_cache_invalidation(iommu, did, pasid); + pasid_based_iotlb_cache_invalidation(iommu, did, pasid); + } else { + iommu_flush_write_buffer(iommu); + } + + return 0; +} + +/* + * Tear down the scalable mode pasid table entry for first only + * translation type. + */ +void intel_pasid_tear_down_first_level(struct intel_iommu *iommu, + struct device *dev, + u16 did, int pasid) +{ + tear_down_one_pasid_entry(iommu, dev, did, pasid); +} + /* * Set up the scalable mode pasid table entry for second only or * passthrough translation type. 
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h index 948cd3a25976..ee5ac3d2ac22 100644 --- a/drivers/iommu/intel-pasid.h +++ b/drivers/iommu/intel-pasid.h @@ -51,6 +51,13 @@ struct pasid_table *intel_pasid_get_table(struct device *dev); int intel_pasid_get_dev_max_id(struct device *dev); struct pasid_entry *intel_pasid_get_entry(struct device *dev, int pasid); void intel_pasid_clear_entry(struct device *dev, int pasid); +int intel_pasid_setup_first_level(struct intel_iommu *iommu, + struct mm_struct *mm, + struct device *dev, + u16 did, int pasid); +void intel_pasid_tear_down_first_level(struct intel_iommu *iommu, + struct device *dev, + u16 did, int pasid); int intel_pasid_setup_second_level(struct intel_iommu *iommu, struct dmar_domain *domain, struct device *dev, int pasid, diff --git a/include/linux/intel-iommu.h
[PATCH v2 09/12] iommu/vt-d: Setup context and enable RID2PASID support
This patch enables the translation for requests without PASID in the scalable mode by setting up the root and context entries. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Sanjay Kumar Signed-off-by: Lu Baolu Reviewed-by: Ashok Raj --- drivers/iommu/intel-iommu.c | 109 ++-- drivers/iommu/intel-pasid.h | 1 + include/linux/intel-iommu.h | 1 + 3 files changed, 95 insertions(+), 16 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 33642dd3d6ba..d854b17033a4 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -1219,6 +1219,8 @@ static void iommu_set_root_entry(struct intel_iommu *iommu) unsigned long flag; addr = virt_to_phys(iommu->root_entry); + if (sm_supported(iommu)) + addr |= DMA_RTADDR_SMT; raw_spin_lock_irqsave(>register_lock, flag); dmar_writeq(iommu->reg + DMAR_RTADDR_REG, addr); @@ -1940,6 +1942,55 @@ static void domain_exit(struct dmar_domain *domain) free_domain_mem(domain); } +/* + * Get the PASID directory size for scalable mode context entry. + * Value of X in the PDTS field of a scalable mode context entry + * indicates PASID directory with 2^(X + 7) entries. + */ +static inline unsigned long context_get_sm_pds(struct pasid_table *table) +{ + int pds, max_pde; + + max_pde = table->max_pasid >> PASID_PDE_SHIFT; + pds = find_first_bit((unsigned long *)_pde, MAX_NR_PASID_BITS); + if (pds < 7) + return 0; + + return pds - 7; +} + +/* + * Set the RID_PASID field of a scalable mode context entry. The + * IOMMU hardware will use the PASID value set in this field for + * DMA translations of DMA requests without PASID. + */ +static inline void +context_set_sm_rid2pasid(struct context_entry *context, unsigned long pasid) +{ + context->hi |= pasid & ((1 << 20) - 1); +} + +/* + * Set the DTE(Device-TLB Enable) field of a scalable mode context + * entry. 
+ */ +static inline void context_set_sm_dte(struct context_entry *context) +{ + context->lo |= (1 << 2); +} + +/* + * Set the PRE(Page Request Enable) field of a scalable mode context + * entry. + */ +static inline void context_set_sm_pre(struct context_entry *context) +{ + context->lo |= (1 << 4); +} + +/* Convert value to context PASID directory size field coding. */ +#define context_pdts(pds) (((pds) & 0x7) << 9) + static int domain_context_mapping_one(struct dmar_domain *domain, struct intel_iommu *iommu, struct pasid_table *table, @@ -1998,9 +2049,7 @@ static int domain_context_mapping_one(struct dmar_domain *domain, } pgd = domain->pgd; - context_clear_entry(context); - context_set_domain_id(context, did); /* * Skip top levels of page tables for iommu which has less agaw @@ -2013,25 +2062,54 @@ static int domain_context_mapping_one(struct dmar_domain *domain, if (!dma_pte_present(pgd)) goto out_unlock; } + } - info = iommu_support_dev_iotlb(domain, iommu, bus, devfn); - if (info && info->ats_supported) - translation = CONTEXT_TT_DEV_IOTLB; - else - translation = CONTEXT_TT_MULTI_LEVEL; + if (sm_supported(iommu)) { + unsigned long pds; + + WARN_ON(!table); + + /* Setup the PASID DIR pointer: */ + pds = context_get_sm_pds(table); + context->lo = (u64)virt_to_phys(table->table) | + context_pdts(pds); + + /* Setup the RID_PASID field: */ + context_set_sm_rid2pasid(context, PASID_RID2PASID); - context_set_address_root(context, virt_to_phys(pgd)); - context_set_address_width(context, iommu->agaw); - } else { /* -* In pass through mode, AW must be programmed to -* indicate the largest AGAW value supported by -* hardware. And ASR is ignored by hardware. 
+* Setup the Device-TLB enable bit and Page request +* Enable bit: */ - context_set_address_width(context, iommu->msagaw); + info = iommu_support_dev_iotlb(domain, iommu, bus, devfn); + if (info && info->ats_supported) + context_set_sm_dte(context); + if (info && info->pri_supported) + context_set_sm_pre(context); + } else { + context_set_domain_id(context, did); + + if (translation != CONTEXT_TT_PASS_THROUGH) { + info = iommu_support_dev_iotlb(domain, iommu, + bus,
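The PASID directory size coding described in this patch (a value X meaning 2^(X + 7) directory entries) can be checked with a small standalone sketch. The helper names `sm_pds()` and `pdts_field()` are hypothetical, and the sketch assumes `max_pasid` is a nonzero power of two; it uses a plain log2 loop rather than the kernel's bit-search helper.

```c
#include <assert.h>
#include <stdint.h>

#define PASID_PDE_SHIFT 6   /* from the series: one directory entry per 64 PASIDs */

/*
 * Compute X such that the PASID directory holds 2^(X + 7) entries.
 * Assumption: max_pasid is a nonzero power of two.
 */
static unsigned int sm_pds(uint32_t max_pasid)
{
	uint32_t max_pde = max_pasid >> PASID_PDE_SHIFT;
	unsigned int log2_pde = 0;

	assert(max_pasid != 0 && (max_pasid & (max_pasid - 1)) == 0);

	/* log2 of the number of directory entries */
	while (max_pde > 1) {
		max_pde >>= 1;
		log2_pde++;
	}

	/* Directories smaller than 2^7 entries are coded as 0. */
	return log2_pde < 7 ? 0 : log2_pde - 7;
}

/* Place the 3-bit size code at bits 9..11, as context_pdts() does. */
static uint64_t pdts_field(unsigned int pds)
{
	return ((uint64_t)pds & 0x7) << 9;
}
```

For a 20-bit PASID space, `sm_pds(1u << 20)` gives 7, i.e. a directory of 2^14 entries, each covering 64 PASIDs.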
[PATCH v2 08/12] iommu/vt-d: Pass pasid table to context mapping
Pass the pasid table to the context mapping functions so that pasid related info, such as the pasid table and the maximum pasid value, can be used when setting up the scalable mode context. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Lu Baolu Reviewed-by: Ashok Raj --- drivers/iommu/intel-iommu.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index c3bf2ccf094d..33642dd3d6ba 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -1942,6 +1942,7 @@ static void domain_exit(struct dmar_domain *domain) static int domain_context_mapping_one(struct dmar_domain *domain, struct intel_iommu *iommu, + struct pasid_table *table, u8 bus, u8 devfn) { u16 did = domain->iommu_did[iommu->seq_id]; @@ -2064,6 +2065,7 @@ static int domain_context_mapping_one(struct dmar_domain *domain, struct domain_context_mapping_data { struct dmar_domain *domain; struct intel_iommu *iommu; + struct pasid_table *table; }; static int domain_context_mapping_cb(struct pci_dev *pdev, @@ -2072,25 +2074,31 @@ static int domain_context_mapping_cb(struct pci_dev *pdev, struct domain_context_mapping_data *data = opaque; return domain_context_mapping_one(data->domain, data->iommu, - PCI_BUS_NUM(alias), alias & 0xff); + data->table, PCI_BUS_NUM(alias), + alias & 0xff); } static int domain_context_mapping(struct dmar_domain *domain, struct device *dev) { + struct domain_context_mapping_data data; + struct pasid_table *table; struct intel_iommu *iommu; u8 bus, devfn; - struct domain_context_mapping_data data; iommu = device_to_iommu(dev, &bus, &devfn); if (!iommu) return -ENODEV; + table = intel_pasid_get_table(dev); + if (!dev_is_pci(dev)) - return domain_context_mapping_one(domain, iommu, bus, devfn); + return domain_context_mapping_one(domain, iommu, table, + bus, devfn); data.domain = domain; data.iommu = iommu; + data.table = table; return pci_for_each_dma_alias(to_pci_dev(dev), &domain_context_mapping_cb, &data); -- 2.17.1
[PATCH v2 07/12] iommu/vt-d: Setup pasid entry for RID2PASID support
When the scalable mode is enabled, there is no second level page translation pointer in the context entry any more (for DMA requests without PASID). Instead, a new RID2PASID field is introduced in the context entry. Software can choose any PASID value to set RID2PASID and then set up the translation in the corresponding PASID entry. Upon receiving a DMA request without PASID, hardware first looks at this RID2PASID field and then treats the request as a request with the PASID value specified in the RID2PASID field. Though software is allowed to use any PASID for RID2PASID, we always use PASID 0 as a design decision. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Sanjay Kumar Signed-off-by: Lu Baolu Reviewed-by: Ashok Raj --- drivers/iommu/intel-iommu.c | 20 drivers/iommu/intel-pasid.h | 1 + 2 files changed, 21 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index de6b909bb47a..c3bf2ccf094d 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2475,12 +2475,27 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu, dev->archdata.iommu = info; if (dev && dev_is_pci(dev) && sm_supported(iommu)) { + bool pass_through; + ret = intel_pasid_alloc_table(dev); if (ret) { __dmar_remove_one_dev_info(info); spin_unlock_irqrestore(&device_domain_lock, flags); return NULL; } + + /* Setup the PASID entry for requests without PASID: */ + pass_through = hw_pass_through && domain_type_is_si(domain); + spin_lock(&iommu->lock); + ret = intel_pasid_setup_second_level(iommu, domain, dev, +PASID_RID2PASID, +pass_through); + spin_unlock(&iommu->lock); + if (ret) { + __dmar_remove_one_dev_info(info); + spin_unlock_irqrestore(&device_domain_lock, flags); + return NULL; + } } spin_unlock_irqrestore(&device_domain_lock, flags); @@ -4846,6 +4861,11 @@ static void __dmar_remove_one_dev_info(struct device_domain_info *info) iommu = info->iommu; if (info->dev) { + if (dev_is_pci(info->dev) &&
sm_supported(iommu)) + intel_pasid_tear_down_second_level(iommu, + info->domain, info->dev, + PASID_RID2PASID); + iommu_disable_dev_iotlb(info); domain_context_clear(iommu, info->dev); intel_pasid_free_table(info->dev); diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h index 85b158a1826a..dda578b8f18e 100644 --- a/drivers/iommu/intel-pasid.h +++ b/drivers/iommu/intel-pasid.h @@ -10,6 +10,7 @@ #ifndef __INTEL_PASID_H #define __INTEL_PASID_H +#define PASID_RID2PASID0x0 #define PASID_MIN 0x1 #define PASID_MAX 0x10 #define PASID_PTE_MASK 0x3F -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 05/12] iommu/vt-d: Reserve a domain id for FL and PT modes
The VT-d spec rev3.0 (section 6.2.3.1) requires that each pasid entry for first-level or pass-through translation be programmed with a domain id different from those used for second-level or nested translation. It is recommended that software use the same domain id for all first-only and pass-through translations. This patch reserves a domain id for first-level and pass-through translations. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Cc: Sanjay Kumar Signed-off-by: Lu Baolu --- drivers/iommu/intel-iommu.c | 10 ++ drivers/iommu/intel-pasid.h | 6 ++ 2 files changed, 16 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 93cde957adc7..562da10bf93e 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -1643,6 +1643,16 @@ static int iommu_init_domains(struct intel_iommu *iommu) */ set_bit(0, iommu->domain_ids); + /* +* Vt-d spec rev3.0 (section 6.2.3.1) requires that each pasid +* entry for first-level or pass-through translation modes should +* be programmed with a domain id different from those used for +* second-level or nested translation. We reserve a domain id for +* this purpose. +*/ + if (sm_supported(iommu)) + set_bit(FLPT_DEFAULT_DID, iommu->domain_ids); + return 0; } diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h index 12f480c2bb8b..03c1612d173c 100644 --- a/drivers/iommu/intel-pasid.h +++ b/drivers/iommu/intel-pasid.h @@ -17,6 +17,12 @@ #define PDE_PFN_MASK PAGE_MASK #define PASID_PDE_SHIFT 6 +/* + * Domain ID reserved for pasid entries programmed for first-level + * only and pass-through transfer modes. + */ +#define FLPT_DEFAULT_DID 1 + struct pasid_dir_entry { u64 val; }; -- 2.17.1
[PATCH v2 06/12] iommu/vt-d: Add second level page table interface
This adds the interfaces to setup or tear down the structures for second level page table translations. This includes types of second level only translation and pass through. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Sanjay Kumar Signed-off-by: Lu Baolu Reviewed-by: Ashok Raj --- drivers/iommu/intel-iommu.c | 2 +- drivers/iommu/intel-pasid.c | 246 drivers/iommu/intel-pasid.h | 7 + include/linux/intel-iommu.h | 3 + 4 files changed, 257 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 562da10bf93e..de6b909bb47a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -1232,7 +1232,7 @@ static void iommu_set_root_entry(struct intel_iommu *iommu) raw_spin_unlock_irqrestore(>register_lock, flag); } -static void iommu_flush_write_buffer(struct intel_iommu *iommu) +void iommu_flush_write_buffer(struct intel_iommu *iommu) { u32 val; unsigned long flag; diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c index d6e90cd5b062..edcea1d8b9fc 100644 --- a/drivers/iommu/intel-pasid.c +++ b/drivers/iommu/intel-pasid.c @@ -9,6 +9,7 @@ #define pr_fmt(fmt)"DMAR: " fmt +#include #include #include #include @@ -291,3 +292,248 @@ void intel_pasid_clear_entry(struct device *dev, int pasid) pasid_clear_entry(pe); } + +static inline void pasid_set_bits(u64 *ptr, u64 mask, u64 bits) +{ + u64 old; + + old = READ_ONCE(*ptr); + WRITE_ONCE(*ptr, (old & ~mask) | bits); +} + +/* + * Setup the DID(Domain Identifier) field (Bit 64~79) of scalable mode + * PASID entry. + */ +static inline void +pasid_set_domain_id(struct pasid_entry *pe, u64 value) +{ + pasid_set_bits(>val[1], GENMASK_ULL(15, 0), value); +} + +/* + * Setup the SLPTPTR(Second Level Page Table Pointer) field (Bit 12~63) + * of a scalable mode PASID entry. 
+ */ +static inline void +pasid_set_address_root(struct pasid_entry *pe, u64 value) +{ + pasid_set_bits(>val[0], VTD_PAGE_MASK, value); +} + +/* + * Setup the AW(Address Width) field (Bit 2~4) of a scalable mode PASID + * entry. + */ +static inline void +pasid_set_address_width(struct pasid_entry *pe, u64 value) +{ + pasid_set_bits(>val[0], GENMASK_ULL(4, 2), value << 2); +} + +/* + * Setup the PGTT(PASID Granular Translation Type) field (Bit 6~8) + * of a scalable mode PASID entry. + */ +static inline void +pasid_set_translation_type(struct pasid_entry *pe, u64 value) +{ + pasid_set_bits(>val[0], GENMASK_ULL(8, 6), value << 6); +} + +/* + * Enable fault processing by clearing the FPD(Fault Processing + * Disable) field (Bit 1) of a scalable mode PASID entry. + */ +static inline void pasid_set_fault_enable(struct pasid_entry *pe) +{ + pasid_set_bits(>val[0], 1 << 1, 0); +} + +/* + * Setup the SRE(Supervisor Request Enable) field (Bit 128) of a + * scalable mode PASID entry. + */ +static inline void pasid_set_sre(struct pasid_entry *pe) +{ + pasid_set_bits(>val[2], 1 << 0, 1); +} + +/* + * Setup the P(Present) field (Bit 0) of a scalable mode PASID + * entry. + */ +static inline void pasid_set_present(struct pasid_entry *pe) +{ + pasid_set_bits(>val[0], 1 << 0, 1); +} + +/* + * Setup Page Walk Snoop bit (Bit 87) of a scalable mode PASID + * entry. 
+ */ +static inline void pasid_set_page_snoop(struct pasid_entry *pe, bool value) +{ + pasid_set_bits(&pe->val[1], 1 << 23, value << 23); +} + +static void +pasid_based_pasid_cache_invalidation(struct intel_iommu *iommu, +int did, int pasid) +{ + struct qi_desc desc; + + desc.qw0 = QI_PC_DID(did) | QI_PC_PASID_SEL | QI_PC_PASID(pasid); + desc.qw1 = 0; + desc.qw2 = 0; + desc.qw3 = 0; + + qi_submit_sync(&desc, iommu); +} + +static void +pasid_based_iotlb_cache_invalidation(struct intel_iommu *iommu, +u16 did, u32 pasid) +{ + struct qi_desc desc; + + desc.qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) | + QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | QI_EIOTLB_TYPE; + desc.qw1 = 0; + desc.qw2 = 0; + desc.qw3 = 0; + + qi_submit_sync(&desc, iommu); +} + +static void +pasid_based_dev_iotlb_cache_invalidation(struct intel_iommu *iommu, +struct device *dev, int pasid) +{ + struct device_domain_info *info; + u16 sid, qdep, pfsid; + + info = dev->archdata.iommu; + if (!info || !info->ats_enabled) + return; + + sid = info->bus << 8 | info->devfn; + qdep = info->ats_qdep; + pfsid = info->pfsid; + + qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT); +} + +static void tear_down_one_pasid_entry(struct intel_iommu *iommu, + struct device *dev, u16 did, + int pasid) +{ +
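The field setters in this patch all follow one read-modify-write pattern: clear the bits selected by a mask, then OR in the new value, which must already be shifted to the field's position. The sketch below illustrates that pattern on a plain 64-bit word; `field_set()` and `field_put()` are hypothetical helpers, and unlike the patch's helper this version also masks the new value.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the bits selected by mask with the matching bits of value. */
static uint64_t field_set(uint64_t word, uint64_t mask, uint64_t value)
{
	return (word & ~mask) | (value & mask);
}

/* Shift a small field value to its bit position, then apply the mask. */
static uint64_t field_put(uint64_t word, uint64_t mask,
			  unsigned int shift, uint64_t field)
{
	assert(shift < 64);
	return field_set(word, mask, field << shift);
}
```

For example, `field_put(0, 0x7ULL << 6, 6, 1)` writes the value 1 into a three-bit field at bits 6 to 8, while `field_set(0, 1ULL << 23, 1)` leaves bit 23 clear because the value was not shifted into place.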
[PATCH v2 03/12] iommu/vt-d: Move page table helpers into header
Move the page table helpers into the header so that they can also be used in other source files. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Lu Baolu Reviewed-by: Ashok Raj --- drivers/iommu/intel-iommu.c | 43 - include/linux/intel-iommu.h | 43 + 2 files changed, 43 insertions(+), 43 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index b0da4f765274..93cde957adc7 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -315,49 +315,6 @@ static inline void context_clear_entry(struct context_entry *context) context->hi = 0; } -/* - * 0: readable - * 1: writable - * 2-6: reserved - * 7: super page - * 8-10: available - * 11: snoop behavior - * 12-63: Host physcial address - */ -struct dma_pte { - u64 val; -}; - -static inline void dma_clear_pte(struct dma_pte *pte) -{ - pte->val = 0; -} - -static inline u64 dma_pte_addr(struct dma_pte *pte) -{ -#ifdef CONFIG_64BIT - return pte->val & VTD_PAGE_MASK; -#else - /* Must have a full atomic 64-bit read */ - return __cmpxchg64(&pte->val, 0ULL, 0ULL) & VTD_PAGE_MASK; -#endif -} - -static inline bool dma_pte_present(struct dma_pte *pte) -{ - return (pte->val & 3) != 0; -} - -static inline bool dma_pte_superpage(struct dma_pte *pte) -{ - return (pte->val & DMA_PTE_LARGE_PAGE); -} - -static inline int first_pte_in_page(struct dma_pte *pte) -{ - return !((unsigned long)pte & ~VTD_PAGE_MASK); -} - /* * This domain is a statically identity mapping domain. * 1. This domain creats a static 1:1 mapping to all usable memory. 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index 2173ae35f1dc..41791903a5e3 100644 --- a/include/linux/intel-iommu.h +++ b/include/linux/intel-iommu.h @@ -501,6 +501,49 @@ static inline void __iommu_flush_cache( clflush_cache_range(addr, size); } +/* + * 0: readable + * 1: writable + * 2-6: reserved + * 7: super page + * 8-10: available + * 11: snoop behavior + * 12-63: Host physcial address + */ +struct dma_pte { + u64 val; +}; + +static inline void dma_clear_pte(struct dma_pte *pte) +{ + pte->val = 0; +} + +static inline u64 dma_pte_addr(struct dma_pte *pte) +{ +#ifdef CONFIG_64BIT + return pte->val & VTD_PAGE_MASK; +#else + /* Must have a full atomic 64-bit read */ + return __cmpxchg64(&pte->val, 0ULL, 0ULL) & VTD_PAGE_MASK; +#endif +} + +static inline bool dma_pte_present(struct dma_pte *pte) +{ + return (pte->val & 3) != 0; +} + +static inline bool dma_pte_superpage(struct dma_pte *pte) +{ + return (pte->val & DMA_PTE_LARGE_PAGE); +} + +static inline int first_pte_in_page(struct dma_pte *pte) +{ + return !((unsigned long)pte & ~VTD_PAGE_MASK); +} + extern struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev *dev); extern int dmar_find_matched_atsr_unit(struct pci_dev *dev); -- 2.17.1
[PATCH v2 02/12] iommu/vt-d: Manage scalable mode PASID tables
In scalable mode, the pasid structure is a two-level table consisting of a pasid directory table and a pasid table. Any pasid entry can be located from a pasid value as follows:

   1
   9                       6 5                     0
    .-----------------------.-------.
    |              PASID    |       |
    '-----------------------'-------'    .-------------.
             |                    |      |             |
             |                    |      |             |
             |                    |      |             |
             |     .-----------.  |      .-------------.
             |     |           |  |----->| PASID Entry |
             |     |           |  |      '-------------'
             |     |           |  |Plus  |             |
             |     .-----------.  |      |             |
             |---->| DIR Entry |-------->|             |
             |     '-----------'         '-------------'
.---------.  |Plus |           |
| Context |  |     |           |
|  Entry  |------->|           |
'---------'        '-----------'

This patch changes the pasid table APIs to support the scalable mode PASID directory and PASID table. It also adds a helper to get the PASID table entry for a given pasid value. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Cc: Liu Yi L Signed-off-by: Sanjay Kumar Signed-off-by: Lu Baolu Reviewed-by: Ashok Raj --- drivers/iommu/intel-iommu.c | 2 +- drivers/iommu/intel-pasid.c | 72 - drivers/iommu/intel-pasid.h | 10 +- drivers/iommu/intel-svm.c | 6 +--- 4 files changed, 74 insertions(+), 16 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 5845edf4dcf9..b0da4f765274 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2507,7 +2507,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu, if (dev) dev->archdata.iommu = info; - if (dev && dev_is_pci(dev) && info->pasid_supported) { + if (dev && dev_is_pci(dev) && sm_supported(iommu)) { ret = intel_pasid_alloc_table(dev); if (ret) { __dmar_remove_one_dev_info(info); diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c index fe95c9bd4d33..d6e90cd5b062 100644 --- a/drivers/iommu/intel-pasid.c +++ b/drivers/iommu/intel-pasid.c @@ -127,8 +127,7 @@ int intel_pasid_alloc_table(struct device *dev) int ret, order; info = dev->archdata.iommu; - if (WARN_ON(!info || !dev_is_pci(dev) || - !info->pasid_supported || info->pasid_table)) + if (WARN_ON(!info || !dev_is_pci(dev) || info->pasid_table)) return -EINVAL; /* DMA alias device already has a pasid table, use it: */ @@ -143,8 +142,9 @@ int intel_pasid_alloc_table(struct device
*dev) return -ENOMEM; INIT_LIST_HEAD(_table->dev); - size = sizeof(struct pasid_entry); + size = sizeof(struct pasid_dir_entry); count = min_t(int, pci_max_pasids(to_pci_dev(dev)), intel_pasid_max_id); + count >>= PASID_PDE_SHIFT; order = get_order(size * count); pages = alloc_pages_node(info->iommu->node, GFP_ATOMIC | __GFP_ZERO, @@ -154,7 +154,7 @@ int intel_pasid_alloc_table(struct device *dev) pasid_table->table = page_address(pages); pasid_table->order = order; - pasid_table->max_pasid = count; + pasid_table->max_pasid = count << PASID_PDE_SHIFT; attach_out: device_attach_pasid_table(info, pasid_table); @@ -162,14 +162,33 @@ int intel_pasid_alloc_table(struct device *dev) return 0; } +/* Get PRESENT bit of a PASID directory entry. */ +static inline bool +pasid_pde_is_present(struct pasid_dir_entry *pde) +{ + return READ_ONCE(pde->val) & PASID_PTE_PRESENT; +} + +/* Get PASID table from a PASID directory entry. */ +static inline struct pasid_entry * +get_pasid_table_from_pde(struct pasid_dir_entry *pde) +{ + if (!pasid_pde_is_present(pde)) + return NULL; + + return phys_to_virt(READ_ONCE(pde->val) & PDE_PFN_MASK); +} + void intel_pasid_free_table(struct device *dev) { struct device_domain_info *info; struct pasid_table *pasid_table; + struct pasid_dir_entry *dir; + struct pasid_entry *table; + int i, max_pde; info = dev->archdata.iommu; - if (!info || !dev_is_pci(dev) || - !info->pasid_supported || !info->pasid_table) + if (!info || !dev_is_pci(dev) || !info->pasid_table) return; pasid_table = info->pasid_table; @@ -178,6 +197,14 @@ void intel_pasid_free_table(struct device *dev) if (!list_empty(_table->dev)) return; + /* Free scalable mode PASID directory tables: */ + dir = pasid_table->table; + max_pde = pasid_table->max_pasid >> PASID_PDE_SHIFT; + for (i = 0; i <
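The two-level lookup described in the commit message can be sketched in a few lines of C. This is only an illustration: the shift value of 6 and the 20-bit PASID width are assumptions read off the bit positions in the diagram (19..6 and 5..0), and the helper names are hypothetical, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, read off the bit positions in the diagram (19:6 and 5:0). */
#define PASID_BITS		20u
#define PASID_PDE_SHIFT		6u
#define PASID_PTE_MASK		((1u << PASID_PDE_SHIFT) - 1u)

/* Index of the directory entry that covers @pasid (bits 19:6). */
static inline uint32_t pasid_dir_index(uint32_t pasid)
{
	assert(pasid < (1u << PASID_BITS));
	return pasid >> PASID_PDE_SHIFT;
}

/* Index of the entry inside the PASID table the directory points to (bits 5:0). */
static inline uint32_t pasid_table_index(uint32_t pasid)
{
	assert(pasid < (1u << PASID_BITS));
	return pasid & PASID_PTE_MASK;
}

/* Number of directory entries needed to cover @max_pasid PASIDs. */
static inline uint32_t pasid_dir_entries(uint32_t max_pasid)
{
	return max_pasid >> PASID_PDE_SHIFT;
}
```

Under these assumptions each directory entry covers 64 PASIDs, which would explain why the patch sizes the directory with `count >>= PASID_PDE_SHIFT` and restores `max_pasid` with the opposite shift.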
[PATCH v2 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support
Intel vt-d spec rev3.0 requires software to use 256-bit descriptors in
invalidation queue. As the spec reads in section 6.5.2:

    Remapping hardware supporting Scalable Mode Translations
    (ECAP_REG.SMTS=1) allow software to additionally program the width of
    the descriptors (128-bits or 256-bits) that will be written into the
    Queue. Software should setup the Invalidation Queue for 256-bit
    descriptors before programming remapping hardware for scalable-mode
    translation as 128-bit descriptors are treated as invalid descriptors
    (see Table 21 in Section 6.5.2.10) in scalable-mode.

This patch adds 256-bit invalidation descriptor support if the hardware
presents scalable mode capability.

Cc: Ashok Raj
Cc: Jacob Pan
Cc: Kevin Tian
Cc: Liu Yi L
Signed-off-by: Sanjay Kumar
Signed-off-by: Lu Baolu
---
 drivers/iommu/dmar.c                | 83 +++--
 drivers/iommu/intel-svm.c           | 76 --
 drivers/iommu/intel_irq_remapping.c |  6 ++-
 include/linux/intel-iommu.h         |  7 ++-
 4 files changed, 113 insertions(+), 59 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index d9c748b6f9e4..b1429fa2cf29 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1160,6 +1160,7 @@ static int qi_check_fault(struct intel_iommu *iommu, int index)
 	int head, tail;
 	struct q_inval *qi = iommu->qi;
 	int wait_index = (index + 1) % QI_LENGTH;
+	int shift = DMAR_IQ_SHIFT + !!ecap_smts(iommu->ecap);
 
 	if (qi->desc_status[wait_index] == QI_ABORT)
 		return -EAGAIN;
@@ -1173,13 +1174,15 @@ static int qi_check_fault(struct intel_iommu *iommu, int index)
 	 */
 	if (fault & DMA_FSTS_IQE) {
 		head = readl(iommu->reg + DMAR_IQH_REG);
-		if ((head >> DMAR_IQ_SHIFT) == index) {
+		if ((head >> shift) == index) {
+			struct qi_desc *desc = qi->desc + head;
+
 			pr_err("VT-d detected invalid descriptor: "
 				"low=%llx, high=%llx\n",
-				(unsigned long long)qi->desc[index].low,
-				(unsigned long long)qi->desc[index].high);
-			memcpy(&qi->desc[index], &qi->desc[wait_index],
-					sizeof(struct qi_desc));
+				(unsigned long long)desc->qw0,
+				(unsigned long long)desc->qw1);
+			memcpy(desc, qi->desc + (wait_index << shift),
+			       1 << shift);
 			writel(DMA_FSTS_IQE, iommu->reg + DMAR_FSTS_REG);
 			return -EINVAL;
 		}
@@ -1191,10 +1194,10 @@ static int qi_check_fault(struct intel_iommu *iommu, int index)
 	 */
 	if (fault & DMA_FSTS_ITE) {
 		head = readl(iommu->reg + DMAR_IQH_REG);
-		head = ((head >> DMAR_IQ_SHIFT) - 1 + QI_LENGTH) % QI_LENGTH;
+		head = ((head >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
 		head |= 1;
 		tail = readl(iommu->reg + DMAR_IQT_REG);
-		tail = ((tail >> DMAR_IQ_SHIFT) - 1 + QI_LENGTH) % QI_LENGTH;
+		tail = ((tail >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
 
 		writel(DMA_FSTS_ITE, iommu->reg + DMAR_FSTS_REG);
 
@@ -1222,15 +1225,14 @@ int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
 {
 	int rc;
 	struct q_inval *qi = iommu->qi;
-	struct qi_desc *hw, wait_desc;
+	int offset, shift, length;
+	struct qi_desc wait_desc;
 	int wait_index, index;
 	unsigned long flags;
 
 	if (!qi)
 		return 0;
 
-	hw = qi->desc;
-
 restart:
 	rc = 0;
 
@@ -1243,16 +1245,21 @@ int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
 
 	index = qi->free_head;
 	wait_index = (index + 1) % QI_LENGTH;
+	shift = DMAR_IQ_SHIFT + !!ecap_smts(iommu->ecap);
+	length = 1 << shift;
 
 	qi->desc_status[index] = qi->desc_status[wait_index] = QI_IN_USE;
 
-	hw[index] = *desc;
-
-	wait_desc.low = QI_IWD_STATUS_DATA(QI_DONE) |
+	offset = index << shift;
+	memcpy(qi->desc + offset, desc, length);
+	wait_desc.qw0 = QI_IWD_STATUS_DATA(QI_DONE) |
 			QI_IWD_STATUS_WRITE | QI_IWD_TYPE;
-	wait_desc.high = virt_to_phys(&qi->desc_status[wait_index]);
+	wait_desc.qw1 = virt_to_phys(&qi->desc_status[wait_index]);
+	wait_desc.qw2 = 0;
+	wait_desc.qw3 = 0;
 
-	hw[wait_index] = wait_desc;
+	offset = wait_index << shift;
+	memcpy(qi->desc + offset, &wait_desc, length);
 
 	qi->free_head = (qi->free_head + 2) % QI_LENGTH;
 	qi->free_cnt -= 2;
@@ -1261,7 +1268,7 @@ int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
 	 * update the HW tail register indicating the presence of
 	 * new descriptors.
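The shift arithmetic this patch relies on can be modelled outside the kernel. The sketch below assumes that `DMAR_IQ_SHIFT` is 4 (a 128-bit descriptor is 16 bytes) and that the head and tail register values behave as byte offsets into the queue; the helper names are hypothetical and only mirror the expressions in the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: a legacy descriptor is 128 bits = 16 bytes = 1 << 4. */
#define DMAR_IQ_SHIFT	4

/* log2 of the descriptor size in bytes: 4 for 128-bit, 5 for 256-bit. */
static inline int qi_desc_shift(bool scalable_mode)
{
	return DMAR_IQ_SHIFT + (scalable_mode ? 1 : 0);
}

/* Byte offset of queue slot @index from the start of the queue. */
static inline size_t qi_desc_offset(int index, bool scalable_mode)
{
	return (size_t)index << qi_desc_shift(scalable_mode);
}

/* Slot index encoded in a head/tail register value (treated as a byte offset). */
static inline int qi_reg_to_index(unsigned int reg, bool scalable_mode)
{
	return (int)(reg >> qi_desc_shift(scalable_mode));
}
```

With this model, slot 3 sits at byte 48 in a 128-bit queue and at byte 96 in a 256-bit queue, which is why the patch replaces array indexing (`hw[index]`) with `memcpy()` at a computed byte offset.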
[PATCH v2 00/12] iommu/vt-d: Add scalable mode support
Hi,

Intel vt-d rev3.0 [1] introduces a new translation mode called 'scalable
mode', which enables PASID-granular translations for first level, second
level, nested and pass-through modes. The vt-d scalable mode is the key
ingredient to enable Scalable I/O Virtualization (Scalable IOV) [2] [3],
which allows sharing a device at the smallest possible granularity (ADI -
Assignable Device Interface). It also includes all the capabilities
required to enable Shared Virtual Addressing (SVA). As a result, the
previous Extended Context (ECS) mode is deprecated (no production hardware
ever implemented ECS).

Each scalable mode pasid table entry is 64 bytes in length, with fields
that point to the first level page table and the second level page table.
The PGTT (Pasid Granular Translation Type) field is used by hardware to
determine the translation type.

[Diagram: a scalable mode PASID entry drawn as eight 64-bit words (0 to 7).
The FLPTR field in word 2 points to the 1st level page table; the SLPTR
field in word 0 points to the 2nd level page table; the PGTT field in
word 0 selects the translation type.]

PASID Granular Translation Type (PGTT):
  001b: 1st level translation only
  010b: 2nd level translation only
  011b: Nested translation
  100b: Pass through

This patch series adds scalable mode support to the Intel IOMMU driver.
It makes all the Intel IOMMU features work in scalable mode. The changes
are all confined to the Intel IOMMU driver, as this is purely an internal
format change.

References:
[1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf

Change log:
v1->v2:
 - Rebase all patches on top of v4.19-rc1;
 - Add 256-bit invalidation descriptor support;
 - Reserve a domain id for first level and pass-through usage so that
   hardware caches entries more efficiently;
 - Various code refinements.

Lu Baolu (12):
  iommu/vt-d: Enumerate the scalable mode capability
  iommu/vt-d: Manage scalable mode PASID tables
  iommu/vt-d: Move page table helpers into header
  iommu/vt-d: Add 256-bit invalidation descriptor support
  iommu/vt-d: Reserve a domain id for FL and PT modes
  iommu/vt-d: Add second level page table interface
  iommu/vt-d: Setup pasid entry for RID2PASID support
  iommu/vt-d: Pass pasid table to context mapping
  iommu/vt-d: Setup context and enable RID2PASID support
  iommu/vt-d: Add first level page table interface
  iommu/vt-d: Shared virtual address in scalable mode
  iommu/vt-d: Remove deferred invalidation

 .../admin-guide/kernel-parameters.txt |  12 +-
 drivers/iommu/dmar.c                  |  83 ++--
 drivers/iommu/intel-iommu.c           | 305 ++---
 drivers/iommu/intel-pasid.c           | 409 +-
 drivers/iommu/intel-pasid.h           |  33 +-
 drivers/iommu/intel-svm.c             | 170 +++-
 drivers/iommu/intel_irq_remapping.c   |   6 +-
 include/linux/dma_remapping.h         |   9 +-
 include/linux/intel-iommu.h           |  64 ++-
 9 files changed, 764 insertions(+), 327 deletions(-)

--
2.17.1
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
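As a rough illustration of how the PGTT encodings listed in the cover letter map onto page-table usage, here is a small C sketch. The enum and helper names are hypothetical, not the driver's, and the value used for second-level-only translation (010b) follows my reading of the VT-d 3.0 specification, so treat it as an assumption.

```c
#include <stdbool.h>

/* Illustrative names; the encodings follow the PGTT list in the cover letter. */
enum pgtt_type {
	PGTT_FL_ONLY = 0x1,	/* 001b: 1st level translation only */
	PGTT_SL_ONLY = 0x2,	/* 010b: 2nd level translation only (assumed) */
	PGTT_NESTED  = 0x3,	/* 011b: nested translation */
	PGTT_PT      = 0x4,	/* 100b: pass through */
};

/* True if @pgtt is one of the four encodings above. */
static inline bool pgtt_is_valid(unsigned int pgtt)
{
	return pgtt >= PGTT_FL_ONLY && pgtt <= PGTT_PT;
}

/* True if hardware would walk the first level page table (FLPTR). */
static inline bool pgtt_walks_first_level(unsigned int pgtt)
{
	return pgtt == PGTT_FL_ONLY || pgtt == PGTT_NESTED;
}

/* True if hardware would walk the second level page table (SLPTR). */
static inline bool pgtt_walks_second_level(unsigned int pgtt)
{
	return pgtt == PGTT_SL_ONLY || pgtt == PGTT_NESTED;
}
```

Nested translation is the only type that consults both pointers, and pass-through consults neither.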
[PATCH v2 01/12] iommu/vt-d: Enumerate the scalable mode capability
The Intel vt-d spec rev3.0 introduces a new translation mode called
scalable mode, which enables PASID-granular translations for first level,
second level, nested and pass-through modes. At the same time, the previous
Extended Context (ECS) mode is deprecated (no production hardware ever
implemented ECS).

This patch adds enumeration for Scalable Mode and removes the deprecated
ECS enumeration. It provides a boot time option to disable scalable mode
even if the hardware claims to support it.

Cc: Ashok Raj
Cc: Jacob Pan
Cc: Kevin Tian
Cc: Liu Yi L
Signed-off-by: Sanjay Kumar
Signed-off-by: Lu Baolu
Reviewed-by: Ashok Raj
---
 .../admin-guide/kernel-parameters.txt | 12 ++--
 drivers/iommu/intel-iommu.c           | 64 +--
 include/linux/intel-iommu.h           |  1 +
 3 files changed, 24 insertions(+), 53 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 9871e649ffef..5b971306a114 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1668,12 +1668,12 @@
 			By default, super page will be supported if Intel IOMMU
 			has the capability. With this option, super page will
 			not be supported.
-		ecs_off [Default Off]
-			By default, extended context tables will be supported if
-			the hardware advertises that it has support both for the
-			extended tables themselves, and also PASID support. With
-			this option set, extended tables will not be used even
-			on hardware which claims to support them.
+		sm_off [Default Off]
+			By default, scalable mode will be supported if the
+			hardware advertises that it has support for the scalable
+			mode translation. With this option set, scalable mode
+			will not be used even on hardware which claims to support
+			it.
 		tboot_noforce [Default Off]
 			Do not force the Intel IOMMU enabled under tboot.
 			By default, tboot will force Intel IOMMU on, which
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 5f3f10cf9d9d..5845edf4dcf9 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -430,38 +430,16 @@ static int dmar_map_gfx = 1;
 static int dmar_forcedac;
 static int intel_iommu_strict;
 static int intel_iommu_superpage = 1;
-static int intel_iommu_ecs = 1;
-static int intel_iommu_pasid28;
+static int intel_iommu_sm = 1;
 static int iommu_identity_mapping;
 
 #define IDENTMAP_ALL		1
 #define IDENTMAP_GFX		2
 #define IDENTMAP_AZALIA		4
 
-/* Broadwell and Skylake have broken ECS support — normal so-called "second
- * level" translation of DMA requests-without-PASID doesn't actually happen
- * unless you also set the NESTE bit in an extended context-entry. Which of
- * course means that SVM doesn't work because it's trying to do nested
- * translation of the physical addresses it finds in the process page tables,
- * through the IOVA->phys mapping found in the "second level" page tables.
- *
- * The VT-d specification was retroactively changed to change the definition
- * of the capability bits and pretend that Broadwell/Skylake never happened...
- * but unfortunately the wrong bit was changed. It's ECS which is broken, but
- * for some reason it was the PASID capability bit which was redefined (from
- * bit 28 on BDW/SKL to bit 40 in future).
- *
- * So our test for ECS needs to eschew those implementations which set the old
- * PASID capabiity bit 28, since those are the ones on which ECS is broken.
- * Unless we are working around the 'pasid28' limitations, that is, by putting
- * the device into passthrough mode for normal DMA and thus masking the bug.
- */
-#define ecs_enabled(iommu)	(intel_iommu_ecs && ecap_ecs(iommu->ecap) && \
-				 (intel_iommu_pasid28 || !ecap_broken_pasid(iommu->ecap)))
-/* PASID support is thus enabled if ECS is enabled and *either* of the old
- * or new capability bits are set. */
-#define pasid_enabled(iommu)	(ecs_enabled(iommu) &&			\
-				 (ecap_pasid(iommu->ecap) || ecap_broken_pasid(iommu->ecap)))
+#define sm_supported(iommu)	(intel_iommu_sm && ecap_smts((iommu)->ecap))
+#define pasid_supported(iommu)	(sm_supported(iommu) &&			\
+				 ecap_pasid((iommu)->ecap))
 
 int intel_iommu_gfx_mapped;
 EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
@@ -541,15 +519,9 @@ static int __init intel_iommu_setup(char *str)
 		} else if (!strncmp(str, "sp_off", 6)) {
 			pr_info("Disable supported super page\n");
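The gating logic of the two new macros can be modelled in plain C as below. This is a sketch, not the driver's code: the bit positions are assumptions (bit 40 for PASID is the one the removed comment mentions, and bit 43 for SMTS is my recollection of the extended capability register layout), and the `sm_on` parameter stands in for the `intel_iommu_sm` variable that `sm_off` would clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in the extended capability register. */
#define ECAP_SMTS_BIT	43	/* scalable mode translation support (assumption) */
#define ECAP_PASID_BIT	40	/* PASID support, per the removed comment */

static inline bool ecap_bit(uint64_t ecap, unsigned int bit)
{
	return (ecap >> bit) & 1;
}

/* Scalable mode is used only if not disabled at boot AND advertised by hardware. */
static inline bool sm_supported(bool sm_on, uint64_t ecap)
{
	return sm_on && ecap_bit(ecap, ECAP_SMTS_BIT);
}

/* PASID support now depends on scalable mode rather than on ECS. */
static inline bool pasid_supported(bool sm_on, uint64_t ecap)
{
	return sm_supported(sm_on, ecap) && ecap_bit(ecap, ECAP_PASID_BIT);
}
```

One consequence visible in the model: hardware that advertises PASID but not scalable mode reports no PASID support, and booting with `sm_off` disables both.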
Re: [Patch v15 4/5] dt-bindings: arm-smmu: Add bindings for qcom, smmu-v2
On Wed, Aug 29, 2018 at 6:23 AM Vivek Gautam wrote: > > On Wed, Aug 29, 2018 at 2:05 PM Vivek Gautam > wrote: > > > > Hi Rob, > > > > > > On 8/29/2018 2:04 AM, Rob Herring wrote: > > > On Mon, Aug 27, 2018 at 04:25:50PM +0530, Vivek Gautam wrote: > > >> Add bindings doc for Qcom's smmu-v2 implementation. > > >> > > >> Signed-off-by: Vivek Gautam > > >> Reviewed-by: Tomasz Figa > > >> Tested-by: Srinivas Kandagatla > > >> --- > > >> > > >> Changes since v14: > > >> - This is a new patch added in v15 after noticing the new > > >> checkpatch warning for separate dt-bindings doc. > > >> - This patch also addresses comments given by Rob and Robin to add > > >> a list of valid values of '' in "qcom,-smmu-v2" > > >> compatible string. > > >> > > >> .../devicetree/bindings/iommu/arm,smmu.txt | 47 > > >> ++ > > >> 1 file changed, 47 insertions(+) > > >> > > >> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt > > >> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt > > >> index 8a6ffce12af5..52198a539606 100644 > > >> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt > > >> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt > > >> @@ -17,10 +17,24 @@ conditions. > > >> "arm,mmu-401" > > >> "arm,mmu-500" > > >> "cavium,smmu-v2" > > >> +"qcom,-smmu-v2", "qcom,smmu-v2" > > > The v2 in the compatible string is kind of redundant unless the SoC has > > > other SMMU types. > > > > sdm845 has smmu-v2, and smmu-500 [1]. > > > > >> > > >> depending on the particular implementation and/or the > > >> version of the architecture implemented. > > >> > > >> + A number of Qcom SoCs use qcom,smmu-v2 version of the > > >> IP. > > >> + "qcom,-smmu-v2" represents a soc specific > > >> compatible > > >> + string that should be present along with the > > >> "qcom,smmu-v2" > > >> + to facilitate SoC specific clocks/power connections > > >> and to > > >> + address specific bug fixes. 
> > >> + '' string in "qcom,-smmu-v2" should be one > > >> of the > > >> + following: > > >> + msm8996 - for msm8996 Qcom SoC. > > >> + sdm845 - for sdm845 Qcom Soc. > > > Rather than all this prose, it would be simpler to just add 2 lines with > > > the full compatibles rather than . The thing is not going to > > > work when/if we move bindings to json-schema also. > > > > then we keep adding > > "qcom,msm8996-smmu-v2", "qcom,smmu-v2" > > "qcom,msm8998-smmu-v2", "qcom,smmu-v2" > > "qcom,sdm845-smmu-v2", "qcom,smmu-v2", > > and from [1] > > "qcom,sdm845-smmu-500", "arm,mmu-500", etc. > > for each SoCs? > > How about following diff on top of this patch? > > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt > b/Documentation/devicetree/bindings/iommu/arm,smmu.txt > index 52198a539606..5e6c04876533 100644 > --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt > @@ -17,23 +17,18 @@ conditions. > "arm,mmu-401" > "arm,mmu-500" > "cavium,smmu-v2" > -"qcom,-smmu-v2", "qcom,smmu-v2" > +"qcom,smmu-v2" > >depending on the particular implementation and/or the >version of the architecture implemented. > > - A number of Qcom SoCs use qcom,smmu-v2 version of the IP. > - "qcom,-smmu-v2" represents a soc specific compatible > - string that should be present along with the "qcom,smmu-v2" > - to facilitate SoC specific clocks/power connections and to > - address specific bug fixes. > - '' string in "qcom,-smmu-v2" should be one of the > - following: > - msm8996 - for msm8996 Qcom SoC. > - sdm845 - for sdm845 Qcom Soc. > - > - An example string would be - > - "qcom,msm8996-smmu-v2", "qcom,smmu-v2". > + Qcom SoCs using qcom,smmu-v2 must have soc specific > + compatible string attached to "qcom,smmu-v2" to take care > + of SoC specific clocks/power connections and to address > + specific bug fixes. 
> + Precisely, it should be one of the following: > + "qcom,msm8996-smmu-v2", "qcom,smmu-v2", > + "qcom,sdm845-smmu-v2", "qcom,smmu-v2". We don't need an explanation of why we need specific compatibles in each binding document (though maybe we need a better explanation somewhere). We just need to know what are valid
Re: various dma_mask fixups
On Wed, Aug 29, 2018 at 08:23:58AM +0200, Christoph Hellwig wrote:
> Fix warnings and regressions from requiring a dma mask.

With this series applied, I see the following in my sh4 boot tests.

sb 1-1: new full-speed USB device number 2 using sm501-usb
sm501-usb sm501-usb: OHCI Unrecoverable Error, disabled
sm501-usb sm501-usb: HC died; cleaning up

This is a persistent problem. Reverting upstream commit 2f606da7823
("mfd: sm501: Set coherent_dma_mask when creating subdevices") does not
make a difference. The problem is gone if I do not apply 'driver core:
initialize a default DMA mask for platform device'.

On the plus side, the sparc warnings are gone.

Guenter
[PATCH] dma-mapping: fix return type of dma_set_max_seg_size()
The function dma_set_max_seg_size() can return either 0 on success or
-EIO on error. Change its return type from unsigned int to int to
capture this.

Signed-off-by: Niklas Söderlund
---
 include/linux/dma-mapping.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 1db6a6b46d0d3dbd..669cde2fa8723ac5 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -674,8 +674,7 @@ static inline unsigned int dma_get_max_seg_size(struct device *dev)
 	return SZ_64K;
 }
 
-static inline unsigned int dma_set_max_seg_size(struct device *dev,
-						unsigned int size)
+static inline int dma_set_max_seg_size(struct device *dev, unsigned int size)
 {
 	if (dev->dma_parms) {
 		dev->dma_parms->max_segment_size = size;
--
2.18.0
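Why the return type matters can be shown with a small user-space model. The struct and function names below are hypothetical stand-ins, not the kernel's; the point is only that a negative errno pushed through an `unsigned int` return type turns into a large positive value, so a caller's `ret < 0` check can never fire.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the device's DMA parameters. */
struct dma_parms_model {
	unsigned int max_segment_size;
};

/* Old shape: the error code is squeezed through an unsigned return type. */
static unsigned int set_max_seg_size_unsigned(struct dma_parms_model *p,
					      unsigned int size)
{
	if (p) {
		p->max_segment_size = size;
		return 0;
	}
	return (unsigned int)-EIO;	/* becomes a large positive value */
}

/* New shape: a signed return type keeps the error negative. */
static int set_max_seg_size_int(struct dma_parms_model *p, unsigned int size)
{
	if (p) {
		p->max_segment_size = size;
		return 0;
	}
	return -EIO;
}
```

Callers that only test the result for non-zero behave the same with either signature; the difference shows up for callers that test for a negative value or propagate the error code to their own caller.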
Re: [PATCH 1/2] kernel/dma/direct: take DMA offset into account in dma_direct_supported
On Fri, Aug 24, 2018 at 12:11:23PM +0100, Robin Murphy wrote:
> On 24/08/18 07:53, Christoph Hellwig wrote:
>> When a device has a DMA offset the dma capable result will change due
>> to the difference between the physical and DMA address. Take that into
>> account.
>
> The "phys_to_dma(..., DMA_BIT_MASK(...))" idiom always looks like a glaring
> error at first glance, but this whole function is fairly unintuitive
> anyway, and ultimately I think the change does work out to be correct.
>
> It might be nicer if we could reference max_zone_pfns[] for a bit more
> clarity, but I guess that's not arch-independent.

Not just arch specific, but also local variables. There is
arch_zone_lowest_possible_pfn in page_alloc.c, but that gets discarded
after init.

That being said this is the right direction and I'll look into it
for 4.20 or later.
Re: [Patch v15 4/5] dt-bindings: arm-smmu: Add bindings for qcom, smmu-v2
On Wed, Aug 29, 2018 at 2:05 PM Vivek Gautam wrote: > > Hi Rob, > > > On 8/29/2018 2:04 AM, Rob Herring wrote: > > On Mon, Aug 27, 2018 at 04:25:50PM +0530, Vivek Gautam wrote: > >> Add bindings doc for Qcom's smmu-v2 implementation. > >> > >> Signed-off-by: Vivek Gautam > >> Reviewed-by: Tomasz Figa > >> Tested-by: Srinivas Kandagatla > >> --- > >> > >> Changes since v14: > >> - This is a new patch added in v15 after noticing the new > >> checkpatch warning for separate dt-bindings doc. > >> - This patch also addresses comments given by Rob and Robin to add > >> a list of valid values of '' in "qcom,-smmu-v2" > >> compatible string. > >> > >> .../devicetree/bindings/iommu/arm,smmu.txt | 47 > >> ++ > >> 1 file changed, 47 insertions(+) > >> > >> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt > >> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt > >> index 8a6ffce12af5..52198a539606 100644 > >> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt > >> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt > >> @@ -17,10 +17,24 @@ conditions. > >> "arm,mmu-401" > >> "arm,mmu-500" > >> "cavium,smmu-v2" > >> +"qcom,-smmu-v2", "qcom,smmu-v2" > > The v2 in the compatible string is kind of redundant unless the SoC has > > other SMMU types. > > sdm845 has smmu-v2, and smmu-500 [1]. > > >> > >> depending on the particular implementation and/or the > >> version of the architecture implemented. > >> > >> + A number of Qcom SoCs use qcom,smmu-v2 version of the > >> IP. > >> + "qcom,-smmu-v2" represents a soc specific > >> compatible > >> + string that should be present along with the > >> "qcom,smmu-v2" > >> + to facilitate SoC specific clocks/power connections and > >> to > >> + address specific bug fixes. > >> + '' string in "qcom,-smmu-v2" should be one of > >> the > >> + following: > >> + msm8996 - for msm8996 Qcom SoC. > >> + sdm845 - for sdm845 Qcom Soc. 
> > Rather than all this prose, it would be simpler to just add 2 lines with > > the full compatibles rather than . The thing is not going to > > work when/if we move bindings to json-schema also. > > then we keep adding > "qcom,msm8996-smmu-v2", "qcom,smmu-v2" > "qcom,msm8998-smmu-v2", "qcom,smmu-v2" > "qcom,sdm845-smmu-v2", "qcom,smmu-v2", > and from [1] > "qcom,sdm845-smmu-500", "arm,mmu-500", etc. > for each SoCs? How about following diff on top of this patch? diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt index 52198a539606..5e6c04876533 100644 --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt @@ -17,23 +17,18 @@ conditions. "arm,mmu-401" "arm,mmu-500" "cavium,smmu-v2" -"qcom,-smmu-v2", "qcom,smmu-v2" +"qcom,smmu-v2" depending on the particular implementation and/or the version of the architecture implemented. - A number of Qcom SoCs use qcom,smmu-v2 version of the IP. - "qcom,-smmu-v2" represents a soc specific compatible - string that should be present along with the "qcom,smmu-v2" - to facilitate SoC specific clocks/power connections and to - address specific bug fixes. - '' string in "qcom,-smmu-v2" should be one of the - following: - msm8996 - for msm8996 Qcom SoC. - sdm845 - for sdm845 Qcom Soc. - - An example string would be - - "qcom,msm8996-smmu-v2", "qcom,smmu-v2". + Qcom SoCs using qcom,smmu-v2 must have soc specific + compatible string attached to "qcom,smmu-v2" to take care + of SoC specific clocks/power connections and to address + specific bug fixes. + Precisely, it should be one of the following: + "qcom,msm8996-smmu-v2", "qcom,smmu-v2", + "qcom,sdm845-smmu-v2", "qcom,smmu-v2". Thanks! Best regards Vivek -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. 
is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [Patch v15 4/5] dt-bindings: arm-smmu: Add bindings for qcom,smmu-v2
Hi Rob,


On 8/29/2018 2:04 AM, Rob Herring wrote:
> On Mon, Aug 27, 2018 at 04:25:50PM +0530, Vivek Gautam wrote:
>> Add bindings doc for Qcom's smmu-v2 implementation.
>>
>> Signed-off-by: Vivek Gautam
>> Reviewed-by: Tomasz Figa
>> Tested-by: Srinivas Kandagatla
>> ---
>>
>> Changes since v14:
>>  - This is a new patch added in v15 after noticing the new
>>    checkpatch warning for separate dt-bindings doc.
>>  - This patch also addresses comments given by Rob and Robin to add
>>    a list of valid values of '<soc>' in "qcom,<soc>-smmu-v2"
>>    compatible string.
>>
>>  .../devicetree/bindings/iommu/arm,smmu.txt | 47 ++
>>  1 file changed, 47 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> index 8a6ffce12af5..52198a539606 100644
>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> @@ -17,10 +17,24 @@ conditions.
>>                          "arm,mmu-401"
>>                          "arm,mmu-500"
>>                          "cavium,smmu-v2"
>> +                        "qcom,<soc>-smmu-v2", "qcom,smmu-v2"
> The v2 in the compatible string is kind of redundant unless the SoC has
> other SMMU types.

sdm845 has smmu-v2, and smmu-500 [1].

>>
>>                    depending on the particular implementation and/or the
>>                    version of the architecture implemented.
>>
>> +                  A number of Qcom SoCs use qcom,smmu-v2 version of the IP.
>> +                  "qcom,<soc>-smmu-v2" represents a soc specific compatible
>> +                  string that should be present along with the "qcom,smmu-v2"
>> +                  to facilitate SoC specific clocks/power connections and to
>> +                  address specific bug fixes.
>> +                  '<soc>' string in "qcom,<soc>-smmu-v2" should be one of the
>> +                  following:
>> +                  msm8996 - for msm8996 Qcom SoC.
>> +                  sdm845 - for sdm845 Qcom Soc.
> Rather than all this prose, it would be simpler to just add 2 lines with
> the full compatibles rather than <soc>. The thing is not going to
> work when/if we move bindings to json-schema also.

then we keep adding
     "qcom,msm8996-smmu-v2", "qcom,smmu-v2"
     "qcom,msm8998-smmu-v2", "qcom,smmu-v2"
     "qcom,sdm845-smmu-v2", "qcom,smmu-v2",
and from [1]
     "qcom,sdm845-smmu-500", "arm,mmu-500", etc.
for each SoCs?

>> +
>> +                  An example string would be -
>> +                  "qcom,msm8996-smmu-v2", "qcom,smmu-v2".
>> +
>>  - reg           : Base address and size of the SMMU.
>>
>>  - #global-interrupts : The number of global interrupts exposed by the
>> @@ -71,6 +85,22 @@ conditions.
>>                    or using stream matching with #iommu-cells = <2>, and
>>                    may be ignored if present in such cases.
>>
>> +- clock-names:    List of the names of clocks input to the device. The
>> +                  required list depends on particular implementation and
>> +                  is as follows:
>> +                  - for "qcom,smmu-v2":
>> +                    - "bus": clock required for downstream bus access and
>> +                             for the smmu ptw,
>> +                    - "iface": clock required to access smmu's registers
>> +                               through the TCU's programming interface.
>> +                  - unspecified for other implementations.
>> +
>> +- clocks:         Specifiers for all clocks listed in the clock-names property,
>> +                  as per generic clock bindings.
>> +
>> +- power-domains:  Specifiers for power domains required to be powered on for
>> +                  the SMMU to operate, as per generic power domain bindings.
>> +
>>  ** Deprecated properties:
>>
>>  - mmu-masters (deprecated in favour of the generic "iommus" binding) :
>> @@ -137,3 +167,20 @@ conditions.
>>                  iommu-map = <0 0 0x400>;
>>                  ...
>>          };
>> +
>> +        /* Qcom's arm,smmu-v2 implementation */
>> +        smmu4: iommu {
> Needs a unit-address.

I went in symmetry with another example in this file for 'smmu1'.
I will add the address here.
And if you would like, I can squash a change for 'smmu1' too in this patch,
although that will be trivial.

[1] https://patchwork.kernel.org/patch/10565291/

Best regards
Vivek

[snip]
Re: various dma_mask fixups
On Wed, Aug 29, 2018 at 8:24 AM Christoph Hellwig wrote:

> Fix warnings and regressions from requiring a dma mask.

Applied all three patches and took a few ARM systems for a test ride:
Tested-by: Linus Walleij

Yours,
Linus Walleij
[PATCH 1/3] driver core: initialize a default DMA mask for platform device
We still treat devices without a DMA mask as defaulting to 32 bits for
both masks, but a few releases ago we started warning about such cases,
as they require special cases to work around this sloppiness. Add a
dma_mask field to struct platform_device so that we can initialize the
dma_mask pointer in struct device and initialize both masks to 32 bits
by default. Architectures can still override this in
arch_setup_pdev_archdata if needed.

Note that the code looks a little odd with the various conditionals
because we have to support platform_device structures that are
statically allocated.

Reported-by: Guenter Roeck
Signed-off-by: Christoph Hellwig
---
 drivers/base/platform.c         | 15 +--
 include/linux/platform_device.h |  1 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index dff82a3c2caa..baf4b06cf2d9 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -225,6 +225,17 @@ struct platform_object {
 	char name[];
 };
 
+static void setup_pdev_archdata(struct platform_device *pdev)
+{
+	if (!pdev->dev.coherent_dma_mask)
+		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+	if (!pdev->dma_mask)
+		pdev->dma_mask = DMA_BIT_MASK(32);
+	if (!pdev->dev.dma_mask)
+		pdev->dev.dma_mask = &pdev->dma_mask;
+	arch_setup_pdev_archdata(pdev);
+}
+
 /**
  * platform_device_put - destroy a platform device
  * @pdev: platform device to free
@@ -271,7 +282,7 @@ struct platform_device *platform_device_alloc(const char *name, int id)
 		pa->pdev.id = id;
 		device_initialize(&pa->pdev.dev);
 		pa->pdev.dev.release = platform_device_release;
-		arch_setup_pdev_archdata(&pa->pdev);
+		setup_pdev_archdata(&pa->pdev);
 	}
 
 	return pa ? &pa->pdev : NULL;
@@ -472,7 +483,7 @@ EXPORT_SYMBOL_GPL(platform_device_del);
 int platform_device_register(struct platform_device *pdev)
 {
 	device_initialize(&pdev->dev);
-	arch_setup_pdev_archdata(pdev);
+	setup_pdev_archdata(pdev);
 	return platform_device_add(pdev);
 }
 EXPORT_SYMBOL_GPL(platform_device_register);
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 1a9f38f27f65..d9dc4883d5fb 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -25,6 +25,7 @@ struct platform_device {
 	int		id;
 	bool		id_auto;
 	struct device	dev;
+	u64		dma_mask;
 	u32		num_resources;
 	struct resource	*resource;
 
--
2.18.0
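The "only fill in what is unset" behaviour that the commit message calls a little odd can be modelled in user space. The struct names below are simplified, hypothetical stand-ins for `struct device` and `struct platform_device`, and `effective_dma_mask()` is a helper invented for this sketch; `DMA_BIT_MASK()` is written the way I recall the kernel defining it.

```c
#include <stdint.h>

#define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Simplified stand-ins for struct device and struct platform_device. */
struct device_model {
	uint64_t *dma_mask;		/* streaming mask, stored elsewhere */
	uint64_t coherent_dma_mask;
};

struct pdev_model {
	struct device_model dev;
	uint64_t dma_mask;		/* storage added by the patch */
};

/* Fill in 32-bit defaults, but never overwrite values that are already set. */
static void setup_default_dma_masks(struct pdev_model *pdev)
{
	if (!pdev->dev.coherent_dma_mask)
		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
	if (!pdev->dma_mask)
		pdev->dma_mask = DMA_BIT_MASK(32);
	if (!pdev->dev.dma_mask)
		pdev->dev.dma_mask = &pdev->dma_mask;
}

/* Streaming mask seen after setup, for a device preset to @preset (0 = unset). */
static uint64_t effective_dma_mask(uint64_t preset)
{
	struct pdev_model pdev = { .dma_mask = preset };

	setup_default_dma_masks(&pdev);
	return *pdev.dev.dma_mask;
}
```

A statically allocated device that already carries, say, a 24-bit mask keeps it, while a device with nothing set ends up with a 32-bit mask and a valid `dma_mask` pointer.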
various dma_mask fixups
Fix warnings and regressions from requiring a dma mask.
[PATCH 3/3] of/platform: initialise AMBA default DMA masks
From: Linus Walleij

Commit a5516219b102 ("of/platform: Initialise default DMA masks")
sets up the coherent_dma_mask of platform devices created from the
device tree, but fails to do the same for AMBA (PrimeCell) devices.

This leads to a regression in kernel v4.19-rc1 triggering the
WARN_ON_ONCE(dev && !dev->coherent_dma_mask) in dma_alloc_attrs().

This regresses the PL111 DRM driver in drivers/gpu/drm/pl111 that uses
the AMBA PrimeCell to instantiate the frame buffer device, as it cannot
allocate a chunk of coherent memory anymore due to the missing mask.

Fixes: a5516219b102 ("of/platform: Initialise default DMA masks")
Signed-off-by: Linus Walleij
[hch: added a comment, and dropped a conditional that can't be true]
Signed-off-by: Christoph Hellwig
---
 drivers/of/platform.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 7ba90c290a42..6c59673933e9 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -241,6 +241,10 @@ static struct amba_device *of_amba_device_create(struct device_node *node,
 	if (!dev)
 		goto err_clear_flag;
 
+	/* AMBA devices only support a single DMA mask */
+	dev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+	dev->dev.dma_mask = &dev->dev.coherent_dma_mask;
+
 	/* setup generic device info */
 	dev->dev.of_node = of_node_get(node);
 	dev->dev.fwnode = &node->fwnode;
--
2.18.0
[PATCH 2/3] sparc: set a default 32-bit dma mask for OF devices
This keeps the historic default behavior for devices without a DMA mask,
but removes the warning about a missing DMA mask when doing DMA without
one.

Reported-by: Meelis Roos
Tested-by: Meelis Roos
Signed-off-by: Christoph Hellwig
---
 arch/sparc/kernel/of_device_32.c | 5 +
 arch/sparc/kernel/of_device_64.c | 4
 2 files changed, 9 insertions(+)

diff --git a/arch/sparc/kernel/of_device_32.c b/arch/sparc/kernel/of_device_32.c
index 3641a294ed54..7f3dec7e1e78 100644
--- a/arch/sparc/kernel/of_device_32.c
+++ b/arch/sparc/kernel/of_device_32.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -381,6 +382,10 @@ static struct platform_device * __init scan_one_device(struct device_node *dp,
 	else
 		dev_set_name(&op->dev, "%08x", dp->phandle);
 
+	op->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+	op->dev.dma_mask = &op->dma_mask;
+	op->dma_mask = DMA_BIT_MASK(32);
+
 	if (of_device_register(op)) {
 		printk("%s: Could not register of device.\n",
 		       dp->full_name);
diff --git a/arch/sparc/kernel/of_device_64.c b/arch/sparc/kernel/of_device_64.c
index 44e4d4435bed..d94c31822da1 100644
--- a/arch/sparc/kernel/of_device_64.c
+++ b/arch/sparc/kernel/of_device_64.c
@@ -2,6 +2,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -675,6 +676,9 @@ static struct platform_device * __init scan_one_device(struct device_node *dp,
 		dev_set_name(&op->dev, "root");
 	else
 		dev_set_name(&op->dev, "%08x", dp->phandle);
+	op->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+	op->dev.dma_mask = &op->dma_mask;
+	op->dma_mask = DMA_BIT_MASK(32);
 
 	if (of_device_register(op)) {
 		printk("%s: Could not register of device.\n",
--
2.18.0