Re: [PATCH 6/9] iommu/dma-iommu.c: Convert to use vm_insert_range
On Sat, Nov 24, 2018 at 3:04 AM Matthew Wilcox wrote: > > On Fri, Nov 23, 2018 at 05:23:06PM +, Robin Murphy wrote: > > On 15/11/2018 15:49, Souptick Joarder wrote: > > > Convert to use vm_insert_range() to map range of kernel > > > memory to user vma. > > > > > > Signed-off-by: Souptick Joarder > > > Reviewed-by: Matthew Wilcox > > > --- > > > drivers/iommu/dma-iommu.c | 12 ++-- > > > 1 file changed, 2 insertions(+), 10 deletions(-) > > > > > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > > > index d1b0475..69c66b1 100644 > > > --- a/drivers/iommu/dma-iommu.c > > > +++ b/drivers/iommu/dma-iommu.c > > > @@ -622,17 +622,9 @@ struct page **iommu_dma_alloc(struct device *dev, > > > size_t size, gfp_t gfp, > > > int iommu_dma_mmap(struct page **pages, size_t size, struct > > > vm_area_struct *vma) > > > { > > > - unsigned long uaddr = vma->vm_start; > > > - unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT; > > > - int ret = -ENXIO; > > > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > > > - for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) { > > > - ret = vm_insert_page(vma, uaddr, pages[i]); > > > - if (ret) > > > - break; > > > - uaddr += PAGE_SIZE; > > > - } > > > - return ret; > > > + return vm_insert_range(vma, vma->vm_start, pages, count); > > > > AFAICS, vm_insert_range() doesn't respect vma->vm_pgoff, so doesn't this > > break partial mmap()s of a large buffer? (which I believe can be a thing) > > Whoops. That should have been: > > return vm_insert_range(vma, vma->vm_start, pages + vma->vm_pgoff, count); > > I suppose. Matthew, does the patch "drm/rockchip/rockchip_drm_gem.c: Convert to use vm_insert_range" also need to address the same issue? > > Although arguably we should respect vm_pgoff inside vm_insert_region() > and then callers automatically get support for vm_pgoff without having > to think about it ...
although we should then also pass in the length > of the pages array to avoid pages being mapped in which aren't part of > the allocated array. > > Hm. More thought required. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] mm: Replace all open encodings for NUMA_NO_NODE
On Fri, 23 Nov 2018 15:24:16 +0530 Anshuman Khandual wrote: > At present there are multiple places where invalid node number is encoded > as -1. Even though implicitly understood it is always better to have macros > in there. Replace these open encodings for an invalid node number with the > global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like > 'invalid node' from various places redirecting them to a common definition. > > ... > > Build tested this with multiple cross compiler options like alpha, sparc, > arm64, x86, powerpc, powerpc64le etc with their default config which might > not have compile tested all driver related changes. I will appreciate > folks giving this a test in their respective build environment. > > All these places for replacement were found by running the following grep > patterns on the entire kernel code. Please let me know if this might have > missed some instances. This might also have replaced some false positives. > I will appreciate suggestions, inputs and review. > > 1. git grep "nid == -1" > 2. git grep "node == -1" > 3. git grep "nid = -1" > 4. git grep "node = -1" The build testing is good, but I worry that some of the affected files don't clearly have numa.h in their include paths, for the NUMA_NO_NODE definition. The first thing I looked at is arch/powerpc/include/asm/pci-bridge.h. Maybe it somehow manages to include numa.h via some nested include, but if so, is that reliable across all config combinations and as code evolves? So I think that the patch should have added an explicit include of numa.h, especially in cases where the affected file previously had no references to any of the things which numa.h defines.
Re: [PATCH v5 5/7] iommu: Add virtio-iommu driver
On Thu, Nov 22, 2018 at 07:37:59PM +, Jean-Philippe Brucker wrote: > The virtio IOMMU is a para-virtualized device, allowing to send IOMMU > requests such as map/unmap over virtio transport without emulating page > tables. This implementation handles ATTACH, DETACH, MAP and UNMAP > requests. > > The bulk of the code transforms calls coming from the IOMMU API into > corresponding virtio requests. Mappings are kept in an interval tree > instead of page tables. > > Signed-off-by: Jean-Philippe Brucker > --- > MAINTAINERS | 7 + > drivers/iommu/Kconfig | 11 + > drivers/iommu/Makefile| 1 + > drivers/iommu/virtio-iommu.c | 916 ++ > include/uapi/linux/virtio_ids.h | 1 + > include/uapi/linux/virtio_iommu.h | 104 > 6 files changed, 1040 insertions(+) > create mode 100644 drivers/iommu/virtio-iommu.c > create mode 100644 include/uapi/linux/virtio_iommu.h > > diff --git a/MAINTAINERS b/MAINTAINERS > index 1689dcfec800..3d8550c76f4a 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -15946,6 +15946,13 @@ S: Maintained > F: drivers/virtio/virtio_input.c > F: include/uapi/linux/virtio_input.h > > +VIRTIO IOMMU DRIVER > +M: Jean-Philippe Brucker > +L: virtualizat...@lists.linux-foundation.org > +S: Maintained > +F: drivers/iommu/virtio-iommu.c > +F: include/uapi/linux/virtio_iommu.h > + > VIRTUAL BOX GUEST DEVICE DRIVER > M: Hans de Goede > M: Arnd Bergmann > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig > index bf2bbfa2a399..db5f2b8c23f5 100644 > --- a/drivers/iommu/Kconfig > +++ b/drivers/iommu/Kconfig > @@ -464,4 +464,15 @@ config QCOM_IOMMU > help > Support for IOMMU on certain Qualcomm SoCs. > > +config VIRTIO_IOMMU > + bool "Virtio IOMMU driver" > + depends on VIRTIO=y > + select IOMMU_API > + select INTERVAL_TREE > + select ARM_DMA_USE_IOMMU if ARM > + help > + Para-virtualised IOMMU driver with virtio. > + > + Say Y here if you intend to run this kernel as a guest. 
> + > endif # IOMMU_SUPPORT > diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile > index 5481e5fe1f95..bd7e55751d09 100644 > --- a/drivers/iommu/Makefile > +++ b/drivers/iommu/Makefile > @@ -36,3 +36,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o > obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o > obj-$(CONFIG_S390_IOMMU) += s390-iommu.o > obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o > +obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c > new file mode 100644 > index ..7540dab9c8dc > --- /dev/null > +++ b/drivers/iommu/virtio-iommu.c > @@ -0,0 +1,916 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Virtio driver for the paravirtualized IOMMU > + * > + * Copyright (C) 2018 Arm Limited > + */ > + > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +#define MSI_IOVA_BASE0x800 > +#define MSI_IOVA_LENGTH 0x10 > + > +#define VIOMMU_REQUEST_VQ0 > +#define VIOMMU_NR_VQS1 > + > +struct viommu_dev { > + struct iommu_device iommu; > + struct device *dev; > + struct virtio_device*vdev; > + > + struct ida domain_ids; > + > + struct virtqueue*vqs[VIOMMU_NR_VQS]; > + spinlock_t request_lock; > + struct list_headrequests; > + > + /* Device configuration */ > + struct iommu_domain_geometrygeometry; > + u64 pgsize_bitmap; > + u8 domain_bits; > +}; > + > +struct viommu_mapping { > + phys_addr_t paddr; > + struct interval_tree_node iova; > + u32 flags; > +}; > + > +struct viommu_domain { > + struct iommu_domain domain; > + struct viommu_dev *viommu; > + struct mutexmutex; /* protects viommu pointer */ > + unsigned intid; > + > + spinlock_t mappings_lock; > + struct rb_root_cached mappings; > + > + unsigned long nr_endpoints; > +}; > + > +struct viommu_endpoint { > + struct viommu_dev *viommu; > + 
struct viommu_domain*vdomain; > +}; > + > +struct viommu_request { > + struct list_headlist; > + void*writeback; > + unsigned intwrite_offset; > + unsigned intlen; > + charbuf[]; > +}; > +
Re: [PATCH v5 5/7] iommu: Add virtio-iommu driver
On Thu, Nov 22, 2018 at 07:37:59PM +, Jean-Philippe Brucker wrote: > The virtio IOMMU is a para-virtualized device, allowing to send IOMMU > requests such as map/unmap over virtio transport without emulating page > tables. This implementation handles ATTACH, DETACH, MAP and UNMAP > requests. > > The bulk of the code transforms calls coming from the IOMMU API into > corresponding virtio requests. Mappings are kept in an interval tree > instead of page tables. > > Signed-off-by: Jean-Philippe Brucker > --- > MAINTAINERS | 7 + > drivers/iommu/Kconfig | 11 + > drivers/iommu/Makefile| 1 + > drivers/iommu/virtio-iommu.c | 916 ++ > include/uapi/linux/virtio_ids.h | 1 + > include/uapi/linux/virtio_iommu.h | 104 > 6 files changed, 1040 insertions(+) > create mode 100644 drivers/iommu/virtio-iommu.c > create mode 100644 include/uapi/linux/virtio_iommu.h > > diff --git a/MAINTAINERS b/MAINTAINERS > index 1689dcfec800..3d8550c76f4a 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -15946,6 +15946,13 @@ S: Maintained > F: drivers/virtio/virtio_input.c > F: include/uapi/linux/virtio_input.h > > +VIRTIO IOMMU DRIVER > +M: Jean-Philippe Brucker > +L: virtualizat...@lists.linux-foundation.org > +S: Maintained > +F: drivers/iommu/virtio-iommu.c > +F: include/uapi/linux/virtio_iommu.h > + > VIRTUAL BOX GUEST DEVICE DRIVER > M: Hans de Goede > M: Arnd Bergmann > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig > index bf2bbfa2a399..db5f2b8c23f5 100644 > --- a/drivers/iommu/Kconfig > +++ b/drivers/iommu/Kconfig > @@ -464,4 +464,15 @@ config QCOM_IOMMU > help > Support for IOMMU on certain Qualcomm SoCs. > > +config VIRTIO_IOMMU > + bool "Virtio IOMMU driver" > + depends on VIRTIO=y > + select IOMMU_API > + select INTERVAL_TREE > + select ARM_DMA_USE_IOMMU if ARM > + help > + Para-virtualised IOMMU driver with virtio. > + > + Say Y here if you intend to run this kernel as a guest. > + Given it is arm specific right now, shouldn't this depend on ARM? E.g. 
there's a hack for x86 right now. > endif # IOMMU_SUPPORT > diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile > index 5481e5fe1f95..bd7e55751d09 100644 > --- a/drivers/iommu/Makefile > +++ b/drivers/iommu/Makefile > @@ -36,3 +36,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o > obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o > obj-$(CONFIG_S390_IOMMU) += s390-iommu.o > obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o > +obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c > new file mode 100644 > index ..7540dab9c8dc > --- /dev/null > +++ b/drivers/iommu/virtio-iommu.c > @@ -0,0 +1,916 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Virtio driver for the paravirtualized IOMMU > + * > + * Copyright (C) 2018 Arm Limited > + */ > + > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +#define MSI_IOVA_BASE0x800 > +#define MSI_IOVA_LENGTH 0x10 > + > +#define VIOMMU_REQUEST_VQ0 > +#define VIOMMU_NR_VQS1 > + > +struct viommu_dev { > + struct iommu_device iommu; > + struct device *dev; > + struct virtio_device*vdev; > + > + struct ida domain_ids; > + > + struct virtqueue*vqs[VIOMMU_NR_VQS]; > + spinlock_t request_lock; > + struct list_headrequests; > + > + /* Device configuration */ > + struct iommu_domain_geometrygeometry; > + u64 pgsize_bitmap; > + u8 domain_bits; > +}; > + > +struct viommu_mapping { > + phys_addr_t paddr; > + struct interval_tree_node iova; > + u32 flags; > +}; > + > +struct viommu_domain { > + struct iommu_domain domain; > + struct viommu_dev *viommu; > + struct mutexmutex; /* protects viommu pointer */ > + unsigned intid; > + > + spinlock_t mappings_lock; > + struct rb_root_cached mappings; > + > + unsigned long nr_endpoints; > +}; > + > +struct viommu_endpoint { > + 
struct viommu_dev *viommu; > + struct viommu_domain*vdomain; > +}; > + > +struct viommu_request { > + struct list_headlist; > + void*writeback; > + unsigned intwrite_off
Re: [PATCH 6/9] iommu/dma-iommu.c: Convert to use vm_insert_range
On Fri, Nov 23, 2018 at 05:23:06PM +, Robin Murphy wrote: > On 15/11/2018 15:49, Souptick Joarder wrote: > > Convert to use vm_insert_range() to map range of kernel > > memory to user vma. > > > > Signed-off-by: Souptick Joarder > > Reviewed-by: Matthew Wilcox > > --- > > drivers/iommu/dma-iommu.c | 12 ++-- > > 1 file changed, 2 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > > index d1b0475..69c66b1 100644 > > --- a/drivers/iommu/dma-iommu.c > > +++ b/drivers/iommu/dma-iommu.c > > @@ -622,17 +622,9 @@ struct page **iommu_dma_alloc(struct device *dev, > > size_t size, gfp_t gfp, > > int iommu_dma_mmap(struct page **pages, size_t size, struct > > vm_area_struct *vma) > > { > > - unsigned long uaddr = vma->vm_start; > > - unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT; > > - int ret = -ENXIO; > > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > > - for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) { > > - ret = vm_insert_page(vma, uaddr, pages[i]); > > - if (ret) > > - break; > > - uaddr += PAGE_SIZE; > > - } > > - return ret; > > + return vm_insert_range(vma, vma->vm_start, pages, count); > AFAICS, vm_insert_range() doesn't respect vma->vm_pgoff, so doesn't this > break partial mmap()s of a large buffer? (which I believe can be a thing) Whoops. That should have been: return vm_insert_range(vma, vma->vm_start, pages + vma->vm_pgoff, count); I suppose. Although arguably we should respect vm_pgoff inside vm_insert_region() and then callers automatically get support for vm_pgoff without having to think about it ... although we should then also pass in the length of the pages array to avoid pages being mapped in which aren't part of the allocated array. Hm. More thought required.
Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
Hi Will, On 2018-11-23 6:27 pm, Will Deacon wrote: Hi John, On Tue, Nov 20, 2018 at 10:25:16AM +0100, Christoph Hellwig wrote: On Mon, Nov 19, 2018 at 03:22:13PM -0800, John Stultz wrote: + sg->dma_address = dma_addr; sg_dma_len(sg) = sg->length; } I know Robin has already replied with more detailed info, but just to close the loop as I'm finally home, applying this patch didn't seem to help with the IO hangs I'm seeing w/ HiKey960. If Robin's observation is right this should fix the problem for you: Please could you give this diff a try and let us know whether the problem persists with your board? This is actually the exact same change that I ended up with in my fixes[1], which John has indeed confirmed. (sorry, it's the thing I was telling you about the other day, but I neglected to include you on cc when I sent the patches out the following morning) Robin. [1] https://lore.kernel.org/lkml/cover.1542812807.git.robin.mur...@arm.com/
Re: [RESEND PATCH v17 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device
On Thu, Nov 22, 2018 at 05:32:24PM +0530, Vivek Gautam wrote: > Hi Will, > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > > > On Fri, Nov 16, 2018 at 04:54:27PM +0530, Vivek Gautam wrote: > > > From: Sricharan R > > > > > > The smmu device probe/remove and add/remove master device callbacks > > > gets called when the smmu is not linked to its master, that is without > > > the context of the master device. So calling runtime apis in those places > > > separately. > > > Global locks are also initialized before enabling runtime pm as the > > > runtime_resume() calls device_reset() which does tlb_sync_global() > > > that ultimately requires locks to be initialized. > > > > > > Signed-off-by: Sricharan R > > > [vivek: Cleanup pm runtime calls] > > > Signed-off-by: Vivek Gautam > > > Reviewed-by: Tomasz Figa > > > Tested-by: Srinivas Kandagatla > > > Reviewed-by: Robin Murphy > > > --- > > > drivers/iommu/arm-smmu.c | 101 > > > ++- > > > 1 file changed, 91 insertions(+), 10 deletions(-) > > > > Given that you're doing the get/put in the TLBI ops unconditionally: > > > > > static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain) > > > { > > > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > > > + struct arm_smmu_device *smmu = smmu_domain->smmu; > > > > > > - if (smmu_domain->tlb_ops) > > > + if (smmu_domain->tlb_ops) { > > > + arm_smmu_rpm_get(smmu); > > > smmu_domain->tlb_ops->tlb_flush_all(smmu_domain); > > > + arm_smmu_rpm_put(smmu); > > > + } > > > } > > > > > > static void arm_smmu_iotlb_sync(struct iommu_domain *domain) > > > { > > > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > > > + struct arm_smmu_device *smmu = smmu_domain->smmu; > > > > > > - if (smmu_domain->tlb_ops) > > > + if (smmu_domain->tlb_ops) { > > > + arm_smmu_rpm_get(smmu); > > > smmu_domain->tlb_ops->tlb_sync(smmu_domain); > > > + arm_smmu_rpm_put(smmu); > > > + } > > > > Why do you need them around the map/unmap calls as well? 
> > We still have .tlb_add_flush path? Ok, so we could add the ops around that as well. Right now, we've got the runtime pm hooks crossing two parts of the API. Will
Re: [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant
On Fri, Nov 23, 2018 at 03:06:29PM +0530, Vivek Gautam wrote: > On Fri, Nov 23, 2018 at 2:52 PM Tomasz Figa wrote: > > On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam > > wrote: > > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote: > > > > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, > > > > > ARM_SMMU_V1_64K, GENERIC_SMMU); > > > > > ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); > > > > > ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); > > > > > > > > > > +static const char * const qcom_smmuv2_clks[] = { > > > > > + "bus", "iface", > > > > > +}; > > > > > + > > > > > +static const struct arm_smmu_match_data qcom_smmuv2 = { > > > > > + .version = ARM_SMMU_V2, > > > > > + .model = QCOM_SMMUV2, > > > > > + .clks = qcom_smmuv2_clks, > > > > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), > > > > > +}; > > > > > > > > These seems redundant if we go down the route proposed by Thor, where we > > > > just pull all of the clocks out of the device-tree. In which case, why > > > > do we need this match_data at all? > > > > > > Which is better? Driver relying solely on the device tree to tell > > > which all clocks > > > are required to be enabled, > > > or, the driver deciding itself based on the platform's match data, > > > that it should > > > have X, Y, & Z clocks that should be supplied from the device tree. > > > > The former would simplify the driver, but would also make it > > impossible to spot mistakes in DT, which would ultimately surface out > > as very hard to debug bugs (likely complete system lockups). > > Thanks. > Yea, this is how I understand things presently. Relying on device tree > puts the things out of driver's control. But it also has the undesirable effect of having to update the driver code whenever we want to add support for a new SMMU implementation. 
If we do this all in the DT, as Thor is trying to do, then older kernels will work well with new hardware. > Hi Will, > Am I unable to understand the intentions here for Thor's clock-fetch > design change? I'm having trouble parsing your question, sorry. Please work with Thor so that we have a single way to get the clock information. My preference is to take it from the firmware, for the reason I stated above. Will
Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
Hi John, On Tue, Nov 20, 2018 at 10:25:16AM +0100, Christoph Hellwig wrote: > On Mon, Nov 19, 2018 at 03:22:13PM -0800, John Stultz wrote: > > > + sg->dma_address = dma_addr; > > > sg_dma_len(sg) = sg->length; > > > } > > > > I know Robin has already replied with more detailed info, but just to > > close the loop as I'm finally home, applying this patch didn't seem to > > help with the IO hangs I'm seeing w/ HiKey960. > > If Robin's observation is right this should fix the problem for you: Please could you give this diff a try and let us know whether the problem persists with your board? Thanks, Will > diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h > index bd73e7a91410..1833f0c1fba0 100644 > --- a/include/linux/dma-direct.h > +++ b/include/linux/dma-direct.h > @@ -5,7 +5,7 @@ > #include > #include > > -#define DIRECT_MAPPING_ERROR 0 > +#define DIRECT_MAPPING_ERROR (~(dma_addr_t)0x0) > > #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA > #include
Re: [PATCH 6/9] iommu/dma-iommu.c: Convert to use vm_insert_range
On 15/11/2018 15:49, Souptick Joarder wrote: Convert to use vm_insert_range() to map range of kernel memory to user vma. Signed-off-by: Souptick Joarder Reviewed-by: Matthew Wilcox --- drivers/iommu/dma-iommu.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index d1b0475..69c66b1 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -622,17 +622,9 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp, int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma) { - unsigned long uaddr = vma->vm_start; - unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT; - int ret = -ENXIO; + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; - for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) { - ret = vm_insert_page(vma, uaddr, pages[i]); - if (ret) - break; - uaddr += PAGE_SIZE; - } - return ret; + return vm_insert_range(vma, vma->vm_start, pages, count); AFAICS, vm_insert_range() doesn't respect vma->vm_pgoff, so doesn't this break partial mmap()s of a large buffer? (which I believe can be a thing) Robin. } static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
Re: [PATCH v3] iommu/dma: Use NUMA aware memory allocations in __iommu_dma_alloc_pages()
On 21/11/2018 16:57, Will Deacon wrote: On Wed, Nov 21, 2018 at 04:47:48PM +, John Garry wrote: On 21/11/2018 16:07, Will Deacon wrote: On Wed, Nov 21, 2018 at 10:54:10PM +0800, John Garry wrote: From: Ganapatrao Kulkarni Change function __iommu_dma_alloc_pages() to allocate pages for DMA from respective device NUMA node. The ternary operator which would be for alloc_pages_node() is tidied along with this. We also include a change to use kvzalloc() for kzalloc()/vzalloc() combination. Signed-off-by: Ganapatrao Kulkarni [JPG: Added kvzalloc(), drop pages ** being device local, tidied ternary operator] Signed-off-by: John Garry Weird, you're missing a diffstat here. Anyway, the patch looks fine to me, but it would be nice if you could justify the change with some numbers. Do you actually see an improvement from this change? Hi Will, Ah, I missed adding my comments explaining the motivation. It would be better in the commit log. Anyway, here's the snippet: " ... as mentioned in [3], dma_alloc_coherent() uses the locality information from the device - as in direct DMA - so this patch is just applying this same policy. [3] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1692998.html" Yes, please add this to the commit log. Sure, I did have some numbers to show improvement in some scenarios when I tested this a while back which I'll dig out. However I would say that some scenarios will improve and the opposite for others with this change, considering different conditions in which DMA memory may be used. Well, if you can show that it's useful in some cases and not catastrophic in others, then I think shooting for parity with direct DMA is a reasonable justification for the change. So I have done some more testing with our SoC crypto engine, using the tcrypt module. The reason I used this device was because we can utilise a local device per socket in the system, unlike other DMAing devices, which generally only exist on a single socket (for us, anyway).
Overall the results aren't brilliant - as expected - but show a general marginal improvement. Here's some figures:

Summary:
  Average diff  +0.9%
  Max diff      +1.5%
  Min diff      +0.2%

Test                                      Ops/second
                                          before  after  diff %
async ecb(aes) encrypt
test 0 (128 bit key, 16 byte blocks):     68717   69057  0.5
test 1 (128 bit key, 64 byte blocks):     72633   73163  0.7
test 2 (128 bit key, 256 byte blocks):    71475   72076  0.8
test 3 (128 bit key, 1024 byte blocks):   66819   67467  1.0
test 4 (128 bit key, 8192 byte blocks):   38237   38495  0.7
test 5 (192 bit key, 16 byte blocks):     70273   71079  1.2
test 6 (192 bit key, 64 byte blocks):     72455   73292  1.2
test 7 (192 bit key, 256 byte blocks):    71085   71876  1.1
test 8 (192 bit key, 1024 byte blocks):   65891   66576  1.0
test 9 (192 bit key, 8192 byte blocks):   34846   35061  0.6
test 10 (256 bit key, 16 byte blocks):    72927   73762  1.2
test 11 (256 bit key, 64 byte blocks):    72361   73207  1.2
test 12 (256 bit key, 256 byte blocks):   70907   71602  1.0
test 13 (256 bit key, 1024 byte blocks):  65035   65653  1.0
test 14 (256 bit key, 8192 byte blocks):  32835   32998  0.5
async ecb(aes) decrypt
test 0 (128 bit key, 16 byte blocks):     68384   69130  1.1
test 1 (128 bit key, 64 byte blocks):     72645   73133  0.7
test 2 (128 bit key, 256 byte blocks):    71523   71912  0.5
test 3 (128 bit key, 1024 byte blocks):   66902   67258  0.5
test 4 (128 bit key, 8192 byte blocks):   38260   38434  0.5
test 5 (192 bit key, 16 byte blocks):     70301   70816  0.7
test 6 (192 bit key, 64 byte blocks):     72473   73064  0.8
test 7 (192 bit key, 256 byte blocks):    71106   71663  0.8
test 8 (192 bit key, 1024 byte blocks):   65915   66363  0.7
test 9 (192 bit key, 8192 byte blocks):   34876   35006  0.4
test 10 (256 bit key, 16 byte blocks):    72969   73519  0.8
test 11 (256 bit key, 64 byte blocks):    72404   72925  0.7
test 12 (256 bit key, 256 byte blocks):   70861   71350  0.7
test 13 (256 bit key, 1024 byte blocks):  65074   65485  0.6
test 14 (256 bit key, 8192 byte blocks):  32861   32974  0.3
async cbc(aes) encrypt
test 0 (128 bit key, 16 byte blocks):     58306   59131  1.4
test 1 (128 bit key, 64 byte blocks):     61647   62565  1.5
test 2 (128 bit key, 256 byte blocks):    60841   61666  1.4
test 3 (128 bit key, 1024 byte blocks):   57503   58204  1.2
test 4 (128 bit key, 8192 byte blocks):   34760   35055  0.9
test 5 (192 bit key, 16 byte blocks):     59684   60452  1.3
test 6 (192 bit key, 64 byte blocks):     61705   62516  1.3
test 7 (192 bit key, 256 byte blocks):    60678   61426  1.2
test 8 (192 bit key, 1024 byte blocks):   56805   57487  1.2
test 9 (192 bit key, 8192 byte blocks):   31836   32093  0.8
test 10 (256 bit key, 16 byte blocks):    61961   62785  1.3
test 11 (256 bit key, 64 byte blocks):    61584   62427  1.4
test 12 (256 bit key, 256 byte blocks):   60407   61246  1.4
test 13 (256 bit key, 1024
[git pull] IOMMU Fixes for Linux v4.20-rc3
Hi Linus, The following changes since commit 651022382c7f8da46cb4872a545ee1da6d097d2a: Linux 4.20-rc1 (2018-11-04 15:37:52 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git iommu-fixes-v4.20-rc3 for you to fetch changes up to 829383e183728dec7ed9150b949cd6de64127809: iommu/vt-d: Use memunmap to free memremap (2018-11-22 17:02:21 +0100) IOMMU Fixes for Linux v4.20-rc3 Including: - Two fixes for the Intel VT-d driver to fix a NULL-ptr dereference and an unbalance in an allocate/free path (allocated with memremap, freed with iounmap) - Fix for a crash in the Renesas IOMMU driver - Fix for the Advanced Virtual Interrupt Controller (AVIC) code in the AMD IOMMU driver Filippo Sironi (1): amd/iommu: Fix Guest Virtual APIC Log Tail Address Register Geert Uytterhoeven (1): iommu/ipmmu-vmsa: Fix crash on early domain free Lu Baolu (1): iommu/vt-d: Fix NULL pointer dereference in prq_event_thread() Pan Bian (1): iommu/vt-d: Use memunmap to free memremap drivers/iommu/amd_iommu_init.c | 3 ++- drivers/iommu/intel-iommu.c| 2 +- drivers/iommu/intel-svm.c | 2 +- drivers/iommu/ipmmu-vmsa.c | 3 +++ 4 files changed, 7 insertions(+), 3 deletions(-) Please pull. Thanks, Joerg
Re: [PATCH v4 8/8] vfio/type1: Handle different mdev isolation type
Hi Lu, On 11/5/18 8:34 AM, Lu Baolu wrote: > This adds the support to determine the isolation type > of a mediated device group by checking whether it has > an iommu device. If an iommu device exists, an iommu > domain will be allocated and then attached to the iommu > device. Otherwise, keep the same behavior as it is. > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Cc: Liu Yi L > Signed-off-by: Sanjay Kumar > Signed-off-by: Lu Baolu > Signed-off-by: Liu Yi L > --- > drivers/vfio/vfio_iommu_type1.c | 48 - > 1 file changed, 42 insertions(+), 6 deletions(-) > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index 178264b330e7..eed26129f58c 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -1427,13 +1427,40 @@ static void vfio_iommu_detach_group(struct > vfio_domain *domain, > iommu_detach_group(domain->domain, group->iommu_group); > } > > +static bool vfio_bus_is_mdev(struct bus_type *bus) > +{ > + struct bus_type *mdev_bus; > + bool ret = false; > + > + mdev_bus = symbol_get(mdev_bus_type); > + if (mdev_bus) { > + ret = (bus == mdev_bus); > + symbol_put(mdev_bus_type); > + } > + > + return ret; > +} > + > +static int vfio_mdev_iommu_device(struct device *dev, void *data) > +{ > + struct device **old = data, *new; > + > + new = vfio_mdev_get_iommu_device(dev); > + if (*old && *old != new) if !new can't you return -EINVAL as well? 
> + return -EINVAL; > + > + *old = new; > + > + return 0; > +} > + > static int vfio_iommu_type1_attach_group(void *iommu_data, >struct iommu_group *iommu_group) > { > struct vfio_iommu *iommu = iommu_data; > struct vfio_group *group; > struct vfio_domain *domain, *d; > - struct bus_type *bus = NULL, *mdev_bus; > + struct bus_type *bus = NULL; > int ret; > bool resv_msi, msi_remap; > phys_addr_t resv_msi_base; > @@ -1468,11 +1495,18 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, > if (ret) > goto out_free; > > - mdev_bus = symbol_get(mdev_bus_type); > + if (vfio_bus_is_mdev(bus)) { > + struct device *iommu_device = NULL; > > - if (mdev_bus) { > - if ((bus == mdev_bus) && !iommu_present(bus)) { > - symbol_put(mdev_bus_type); > + group->mdev_group = true; > + > + /* Determine the isolation type */ > + ret = iommu_group_for_each_dev(iommu_group, &iommu_device, > +vfio_mdev_iommu_device); > + if (ret) > + goto out_free; > + > + if (!iommu_device) { > if (!iommu->external_domain) { > INIT_LIST_HEAD(&domain->group_list); > iommu->external_domain = domain; > @@ -1482,9 +1516,11 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, > list_add(&group->next, >&iommu->external_domain->group_list); > mutex_unlock(&iommu->lock); > + extra new line > return 0; > } > - symbol_put(mdev_bus_type); > + > + bus = iommu_device->bus; > } > > domain->domain = iommu_domain_alloc(bus); > Thanks Eric
Re: [PATCH v4 7/8] vfio/type1: Add domain at(de)taching group helpers
Hi Lu, On 11/5/18 8:34 AM, Lu Baolu wrote: > This adds helpers to attach or detach a domain to a > group. This will replace iommu_attach_group() which > only works for pci devices. s/pci/non mdev? > > If a domain is attaching to a group which includes the > mediated devices, it should attach to the iommu device > (a pci device which represents the mdev in iommu scope) > instead. The added helper supports attaching domain to > groups for both pci and mdev devices. > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Signed-off-by: Lu Baolu > Signed-off-by: Liu Yi L > --- > drivers/vfio/vfio_iommu_type1.c | 114 ++-- > 1 file changed, 107 insertions(+), 7 deletions(-) > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index d9fd3188615d..178264b330e7 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -91,6 +91,7 @@ struct vfio_dma { > struct vfio_group { > struct iommu_group *iommu_group; > struct list_headnext; > + boolmdev_group; /* An mdev group */ > }; > > /* > @@ -1327,6 +1328,105 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group > *group, phys_addr_t *base) > return ret; > } > > +static struct device *vfio_mdev_get_iommu_device(struct device *dev) > +{ > + struct device *(*fn)(struct device *dev); > + struct device *iommu_parent; > + > + fn = symbol_get(mdev_get_iommu_device); > + if (fn) { > + iommu_parent = fn(dev); > + symbol_put(mdev_get_iommu_device); > + > + return iommu_parent; > + } > + > + return NULL; > +} > + > +static int vfio_mdev_set_domain(struct device *dev, struct iommu_domain > *domain) > +{ > + int (*fn)(struct device *dev, void *domain); > + int ret; > + > + fn = symbol_get(mdev_set_iommu_domain); > + if (fn) { > + ret = fn(dev, domain); > + symbol_put(mdev_set_iommu_domain); > + > + return ret; > + } > + > + return -EINVAL; > +} > + > +static int vfio_mdev_attach_domain(struct device *dev, void *data) > +{ > + struct iommu_domain *domain = data; > + struct 
device *iommu_device; > + int ret; > + > + ret = vfio_mdev_set_domain(dev, domain); > + if (ret) > + return ret; > + > + iommu_device = vfio_mdev_get_iommu_device(dev); > + if (iommu_device) { > + bool aux_mode = false; > + > + iommu_get_dev_attr(iommu_device, > +IOMMU_DEV_ATTR_AUXD_ENABLED, &aux_mode); Don't you need to test the returned value before using aux_mode? > + if (aux_mode) > + return iommu_attach_device_aux(domain, iommu_device); > + else > + return iommu_attach_device(domain, iommu_device); If for some reason the above ops fail, don't you want to call vfio_mdev_set_domain(dev, NULL)? > + } > + > + return -EINVAL; > +} > + > +static int vfio_mdev_detach_domain(struct device *dev, void *data) > +{ > + struct iommu_domain *domain = data; > + struct device *iommu_device; > + > + vfio_mdev_set_domain(dev, NULL); > + iommu_device = vfio_mdev_get_iommu_device(dev); > + if (iommu_device) { > + bool aux_mode = false; > + > + iommu_get_dev_attr(iommu_device, > +IOMMU_DEV_ATTR_AUXD_ENABLED, &aux_mode); same here > + if (aux_mode) > + iommu_detach_device_aux(domain, iommu_device); > + else > + iommu_detach_device(domain, iommu_device); > + } > + > + return 0; > +} > + > +static int vfio_iommu_attach_group(struct vfio_domain *domain, > +struct vfio_group *group) > +{ > + if (group->mdev_group) > + return iommu_group_for_each_dev(group->iommu_group, > + domain->domain, > + vfio_mdev_attach_domain); > + else > + return iommu_attach_group(domain->domain, group->iommu_group); > +} > + > +static void vfio_iommu_detach_group(struct vfio_domain *domain, > + struct vfio_group *group) > +{ > + if (group->mdev_group) > + iommu_group_for_each_dev(group->iommu_group, domain->domain, > + vfio_mdev_detach_domain); > + else > + iommu_detach_group(domain->domain, group->iommu_group); > +} > + > static int vfio_iommu_type1_attach_group(void *iommu_data, >struct iommu_group *iommu_group) > { > @@ -1402,7 +1502,7 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, > goto 
out_domain; > } > > - ret = iommu_attach_group(domain->domain, iommu_group); > + ret = vfio_iommu_attach_group(domain, group); >
Re: remove the ->mapping_error method from dma_map_ops V2
On Fri, Nov 23, 2018 at 02:03:13PM +0100, Joerg Roedel wrote: > On Fri, Nov 23, 2018 at 11:01:55AM +0000, Russell King - ARM Linux wrote: > > Yuck. So, if we have a 4GB non-PAE 32-bit system, or a PAE system > > where we have valid memory across the 4GB boundary and no IOMMU, > > we have to reserve the top 4K page in the first 4GB of RAM? > > But that is only needed when dma_addr_t is 32bit anyway, no? You said: But we can easily work around that by reserving the top 4k of the first 4GB of IOVA address space in the allocator, no? Then these values are never returned as valid DMA handles. To me, your proposal equates to this in code: int dma_error(dma_addr_t addr) { return (addr & ~(dma_addr_t)0xfff) == 0xfffff000 ? (s32)addr : 0; } Hence, the size of dma_addr_t would be entirely irrelevant. This makes dma_addr_t values in the range of 0xfffff000 to 0xffffffff special, irrespective of whether dma_addr_t is 32-bit or 64-bit. If that's not what you meant, then I'm afraid your statement wasn't worded very well - so please can you re-word to state more precisely what your proposal is? > > Rather than inventing magic cookies like this, I'd much rather we > > sanitised the API so that we have functions that return success or > > an error code, rather than trying to shoe-horn some kind of magic > > error codes into dma_addr_t and subtly break systems in the process. > > Sure, but it has the obvious downside that we need to touch every driver > that uses these functions, and those are probably a lot of drivers. So we have two options: - change the interface - subtly break drivers In any case, I disagree that we need to touch every driver. Today, drivers just do: if (dma_mapping_error(dev, addr)) and, because dma_mapping_error() returns a boolean, they only care about the true/falseness. If we're going to start returning error codes, then there's no point just returning error codes unless we have some way for drivers to use them. 
Yes, the simple way would be to make dma_mapping_error() translate addr to an error code, and that maintains compatibility with existing drivers. If, instead, we revamped the DMA API, and had the *new* mapping functions which returned an error code, then the existing mapping functions can be implemented as part of compatibility rather trivially: dma_addr_t dma_map_single(...) { dma_addr_t addr; int error; error = dma_map_single_error(..., &addr); if (error) addr = DMA_MAPPING_ERROR; return addr; } which means that if drivers want access to the error code, they use dma_map_single_error(), meanwhile existing drivers just get on with life as it _currently_ is, with the cookie-based all-ones error code - without introducing any of this potential breakage. Remember, existing drivers would need modification in _any_ case to make use of a returned error code, so the argument that we'd need to touch every driver doesn't really stand up. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: remove the ->mapping_error method from dma_map_ops V2
On Fri, Nov 23, 2018 at 11:01:55AM +0000, Russell King - ARM Linux wrote: > Yuck. So, if we have a 4GB non-PAE 32-bit system, or a PAE system > where we have valid memory across the 4GB boundary and no IOMMU, > we have to reserve the top 4K page in the first 4GB of RAM? But that is only needed when dma_addr_t is 32bit anyway, no? > Rather than inventing magic cookies like this, I'd much rather we > sanitised the API so that we have functions that return success or > an error code, rather than trying to shoe-horn some kind of magic > error codes into dma_addr_t and subtly break systems in the process. Sure, but it has the obvious downside that we need to touch every driver that uses these functions, and those are probably a lot of drivers. Regards, Joerg
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Jean-Philippe, On Wed, Nov 21, 2018 at 07:05:13PM +0000, Jean-Philippe Brucker wrote: > For the moment though, I think we should allow device drivers to use the > DMA-API at the same time as SVA. Yeah, that makes sense. > If a device driver has to map a management ring buffer for example, > it's much nicer to use dma_alloc as usual rather than have them write > their own DMA allocation routines. Which is another reason to > implement 2) above: the DMA-API would still act on the default_domain, > and attaching an "extended" domain augments it with SVA features. > Detaching from the device doesn't require copying mappings back to the > default domain. Does that sound OK? Yes, as I just wrote to Kevin, this sounds good. > struct io_mm *io_mm; > struct iommu_domain *domain; > > domain = iommu_alloc_ext_domain(bus); > > /* Set an mm-exit callback to disable PASIDs. More attributes > could be added later to change the capabilities of the ext > domain */ > iommu_domain_set_attr(domain, DOMAIN_ATTR_MM_EXIT_CB, &mm_exit); > > /* Fails if the device doesn't support this domain type */ > iommu_attach_device(domain, dev); > > /* Same as SVA v3, except a domain instead of dev as argument */ > io_mm = iommu_sva_bind_mm(domain, current->mm, ...); > > /* on success, install the PASID in the device */ > pasid = io_mm->pasid; > > /* Do more work */ > > iommu_sva_unbind_mm(io_mm); > iommu_detach_device(domain, dev); Okay, we can still discuss the naming and a few details, but that goes in the right direction. One open question is, for example, where the pasid-allocator comes into play. As it looks, the amdgpu driver needs it even without an iommu enabled. Regards, Joerg
Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables
On Fri 23-11-18 13:23:41, Vlastimil Babka wrote: > On 11/22/18 9:23 AM, Christoph Hellwig wrote: [...] > > But I do agree with the sentiment of not wanting to spread GFP_DMA32 > > further into the slab allocator. > > I don't see a problem with GFP_DMA32 for custom caches. Generic > kmalloc() would be worse, since it would have to create a new array of > kmalloc caches. But that's already ruled out due to the alignment. Yes that makes quite a lot of sense to me. We do not really need generic support. Just make sure that if somebody creates a GFP_DMA32-restricted cache, then allow allocating restricted memory from that. Is there any fundamental reason that this wouldn't be possible? -- Michal Hocko SUSE Labs
Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables
On 11/22/18 9:23 AM, Christoph Hellwig wrote: > On Wed, Nov 21, 2018 at 10:26:26PM +0000, Robin Murphy wrote: >> TBH, if this DMA32 stuff is going to be contentious we could possibly just >> rip out the offending kmem_cache - it seemed like good practice for the >> use-case, but provided kzalloc(SZ_1K, gfp | GFP_DMA32) can be relied upon to >> give the same 1KB alignment and chance of succeeding as the equivalent >> kmem_cache_alloc(), then we could quite easily make do with that instead. > > Neither is the slab support for kmalloc, nor do kmalloc allocations > have useful alignment apparently (at least if you use slub debug). Is this also true for caches created by kmem_cache_create(), that debugging options can result in not respecting the alignment passed to kmem_cache_create()? That would be rather bad, IMHO. > But I do agree with the sentiment of not wanting to spread GFP_DMA32 > further into the slab allocator. I don't see a problem with GFP_DMA32 for custom caches. Generic kmalloc() would be worse, since it would have to create a new array of kmalloc caches. But that's already ruled out due to the alignment. > I think you want a simple genalloc allocator for this rather special > use case. I would prefer if slab could support it, as it doesn't have to preallocate. OTOH if the allocations are GFP_ATOMIC as suggested later in the thread, and need to always succeed, then preallocation could be better, and thus maybe genalloc.
Re: [PATCH v2 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging
On 11/22/18 2:20 AM, Nicolas Boichat wrote: > On Thu, Nov 22, 2018 at 2:02 AM Michal Hocko wrote: >> >> On Wed 21-11-18 16:46:38, Will Deacon wrote: >>> On Sun, Nov 11, 2018 at 05:03:41PM +0800, Nicolas Boichat wrote: >>> >>> It's a bit grotty that GFP_DMA32 doesn't just map to GFP_DMA on 32-bit >>> architectures, since then we wouldn't need this #ifdeffery afaict. >> >> But GFP_DMA32 should map to GFP_KERNEL on 32b, no? Or what exactly is >> going on in here? > > GFP_DMA32 will fail due to check_slab_flags (aka GFP_SLAB_BUG_MASK > before patch 1/3 of this series)... But yes, it may be neater if there > was transparent remapping of GFP_DMA32/SLAB_CACHE_DMA32 to > GFP_DMA/SLAB_CACHE_DMA on 32-bit arch... I don't know about ARM, but AFAIK on x86 DMA means within first 4MB of physical memory, and DMA32 means within first 4GB. It doesn't matter if the CPU is running in 32bit or 64bit mode. But, when it runs 32bit, the kernel can direct map less than 4GB anyway, which means it doesn't need the extra DMA32 zone, i.e. GFP_KERNEL can only get you memory that's also acceptable for GFP_DMA32. But, DMA is still DMA, i.e. first 4MB. Remapping GFP_DMA32 to GFP_DMA on x86 wouldn't work, as the GFP_DMA32 allocations would then only use those 4MB and exhaust it very fast. >> -- >> Michal Hocko >> SUSE Labs
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Kevin, On Thu, Nov 22, 2018 at 08:39:19AM +0000, Tian, Kevin wrote: > I agree special action needs to be taken for everything else (other than > DMA-API), but the point that I didn't get is why the action must be based > on a new SVA-type domain, instead of extending the default domain with SVA > capability (including related paging structures which are accessed through > new SVA APIs). In the latter case domain-wise attribute (e.g. IRQ mapping) > is naturally shared between capabilities (DMA-API and SVA). There is no > need to create cross-domain connections as two options that you listed > below. > > Can you help elaborate more about the motivation behind proposal? Yeah, thinking more about it, there are no real reasons against allowing aux and sva bindings on the default domain. It allows the host to use a device through the DMA-API and assign parts of it to a guest or a user-space process, for example.

> |-default domain (DMA-API)
> |-sva domain1 (SVA)
> |-sva domain2 (SVA)
> |-...
> |-sva domainN (AUX, guest DMA-API)
> | |- sva domainN1 (AUX, guest SVA)
> | |- sva domainN2 (AUX, guest SVA)
> | |-...
> |-sva domainM (AUX, guest DMA-API)
> | |- sva domainM1 (AUX, guest SVA)
> | |- sva domainM2 (AUX, guest SVA)
> | |-...

In my proposal the sva-domains are not sub-domains of the default domain; they exist on the same level. A device is detached from its default domain and attached to an sva-domain. > means same domain can be default on deviceA while being aux on deviceB > (when assigning pci and mdev to same VM). As already written in another mail, this raises a couple of questions, like what happens when the aux-domain itself has other aux-domains bound to it? Regards, Joerg
Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections
On Fri 2018-11-23 11:40:48, Sergey Senozhatsky wrote: > On (11/22/18 11:16), Peter Zijlstra wrote: > > > So maybe we need to switch debug objects print-outs to _always_ > > > printk_deferred(). Debug objects can be used in code which cannot > > > do direct printk() - timekeeping is just one example. > > > > No, printk_deferred() is a disease, it needs to be eradicated, not > > spread around. > > deadlock-free printk() is deferred, but OK. The best solution would be lockless console drivers. Sigh. > Another idea then: > > --- > > diff --git a/lib/debugobjects.c b/lib/debugobjects.c > index 70935ed91125..3928c2b2f77c 100644 > --- a/lib/debugobjects.c > +++ b/lib/debugobjects.c > @@ -323,10 +323,13 @@ static void debug_print_object(struct debug_obj *obj, > char *msg) > void *hint = descr->debug_hint ? > descr->debug_hint(obj->object) : NULL; > limit++; > + > + bust_spinlocks(1); > WARN(1, KERN_ERR "ODEBUG: %s %s (active state %u) " >"object type: %s hint: %pS\n", > msg, obj_states[obj->state], obj->astate, > descr->name, hint); > + bust_spinlocks(0); > } > debug_objects_warnings++; > } > > --- > > This should make serial consoles re-entrant. > So printk->console_driver_write() hopefully will not deadlock. Is the re-entrance safe? Some risk might be acceptable in Oops/panic situations. It is much less acceptable for random warnings. Best Regards, Petr
Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections
On Thu 2018-11-22 15:29:35, Waiman Long wrote: > On 11/22/2018 11:02 AM, Petr Mladek wrote: > > Anyway, I wonder what was the primary motivation for this patch. > > Was it the system hang? Or was it lockdep report about nesting > > two terminal locks: db->lock, pool_lock with logbuf_lock? > > The primary motivation was to make sure that printk() won't be called > while holding either db->lock or pool_lock in the debugobjects code. In > the determination of which locks can be made terminal, I focused on > local spinlocks that won't cause boundary to an unrelated subsystem as > it will greatly complicate the analysis. Then please mention this as the real reason in the commit message. The reason that printk might take too long in IRQ context does not explain why we need the patch. printk() is called in IRQ context in many other locations. I do not see why this place should be special. The delay is the price for the chance to see the problem. > I didn't realize that it fixed a hang problem that I was seeing until I > did bisection to find out that it was caused by the patch that cause the > debugobjects splat in the first place a few days ago. I was comparing > the performance status of the pre and post patched kernel. The pre-patch > kernel failed to boot in the one of my test systems, but the > post-patched kernel could. I narrowed it down to this patch as the fix > for the hang problem. The hang is a mystery. My guess is that there is a race somewhere. The patch might have reduced the race window but it probably did not fix the race itself. Best Regards, Petr
Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections
On Thu 2018-11-22 14:57:02, Waiman Long wrote: > On 11/21/2018 09:04 PM, Sergey Senozhatsky wrote: > > On (11/21/18 11:49), Waiman Long wrote: > > [..] > >>> case ODEBUG_STATE_ACTIVE: > >>> - debug_print_object(obj, "init"); > >>> state = obj->state; > >>> raw_spin_unlock_irqrestore(&db->lock, flags); > >>> + debug_print_object(obj, "init"); > >>> debug_object_fixup(descr->fixup_init, addr, state); > >>> return; > >>> > >>> case ODEBUG_STATE_DESTROYED: > >>> - debug_print_object(obj, "init"); > >>> + debug_printobj = true; > >>> break; > >>> default: > >>> break; > >>> } > >>> > >>> raw_spin_unlock_irqrestore(&db->lock, flags); > >>> + if (debug_chkstack) > >>> + debug_object_is_on_stack(addr, onstack); > >>> + if (debug_printobj) > >>> + debug_print_object(obj, "init"); > >>> > > [..] > >> As a side note, one of the test systems that I used generated a > >> debugobjects splat in the bootup process and the system hanged > >> afterward. Applying this patch alone fix the hanging problem and the > >> system booted up successfully. So it is not really a good idea to call > >> printk() while holding a raw spinlock. > > Right, I like this patch. > > And I think that we, maybe, can go even further. > > > > Some serial consoles call mod_timer(). So what we could have with the > > debug objects enabled was > > > > mod_timer() > > lock_timer_base() > > debug_activate() > >printk() > > call_console_drivers() > > foo_console() > > mod_timer() > >lock_timer_base() << deadlock > > > > That's one possible scenario. The other one can involve console's > > IRQ handler, uart port spinlock, mod_timer, debug objects, printk, > > and an eventual deadlock on the uart port spinlock. This one can > > be mitigated with printk_safe. But mod_timer() deadlock will require > > a different fix. > > > > So maybe we need to switch debug objects print-outs to _always_ > > printk_deferred(). Debug objects can be used in code which cannot > > do direct printk() - timekeeping is just one example. 
> > > > -ss > > Actually, I don't think that was the cause of the hang. The debugobjects > splat was caused by debug_object_is_on_stack(), below was the output: > > [ 6.890048] ODEBUG: object (ptrval) is NOT on stack > (ptrval), but annotated. > [ 6.891000] WARNING: CPU: 28 PID: 1 at lib/debugobjects.c:369 > __debug_object_init.cold.11+0x51/0x2d6 > [ 6.891000] Modules linked in: > [ 6.891000] CPU: 28 PID: 1 Comm: swapper/0 Not tainted > 4.18.0-41.el8.bz1651764_cgroup_debug.x86_64+debug #1 > [ 6.891000] Hardware name: HPE ProLiant DL120 Gen10/ProLiant DL120 > Gen10, BIOS U36 11/14/2017 > [ 6.891000] RIP: 0010:__debug_object_init.cold.11+0x51/0x2d6 > [ 6.891000] Code: ea 03 80 3c 02 00 0f 85 85 02 00 00 49 8b 54 24 18 > 48 89 de 4c 89 44 24 10 48 c7 c7 00 ce 22 94 e8 73 18 62 ff 4c 8b 44 24 > 10 <0f> 0b e9 60 db ff ff 41 83 c4 01 b8 ff ff 37 00 44 89 25 ce 46 f9 > [ 6.891000] RSP: :880104187960 EFLAGS: 00010086 > [ 6.891000] RAX: 0050 RBX: 9764c570 RCX: > > [ 6.891000] RDX: RSI: RDI: > 880104178ca8 > [ 6.891000] RBP: 110020830f34 R08: 8807ce68a1d0 R09: > fbfff2923554 > [ 6.891000] R10: fbfff2923554 R11: 9491aaa3 R12: > 880104178000 > [ 6.891000] R13: 96c809b8 R14: a370 R15: > 8807ce68a1c0 > [ 6.891000] FS: () GS:8807d420() > knlGS: > [ 6.891000] CS: 0010 DS: ES: CR0: 80050033 > [ 6.891000] CR2: CR3: 00028de16001 CR4: > 007606e0 > [ 6.891000] DR0: DR1: DR2: > > [ 6.891000] DR3: DR6: fffe0ff0 DR7: > 0400 > [ 6.891000] PKRU: > [ 6.891000] Call Trace: > [ 6.891000] ? debug_object_fixup+0x30/0x30 > [ 6.891000] ? _raw_spin_unlock_irqrestore+0x4b/0x60 > [ 6.891000] ? __lockdep_init_map+0x12f/0x510 > [ 6.891000] ? __lockdep_init_map+0x12f/0x510 > [ 6.891000] virt_efi_get_next_variable+0xa2/0x160 > [ 6.891000] efivar_init+0x1c4/0x6d7 > [ 6.891000] ? efivar_ssdt_setup+0x3b/0x3b > [ 6.891000] ? efivar_entry_iter+0x120/0x120 > [ 6.891000] ? find_held_lock+0x3a/0x1c0 > [ 6.891000] ? lock_downgrade+0x5e0/0x5e0 > [ 6.891000] ? kmsg_dump_rewind_nolock+0xd9/0xd9 > [ 6.891000] ? 
_raw_spin_unlock_irqrestore+0x4b/0x60 > [ 6.891000] ? trace_hardirqs_on_caller+0x381/0x570 > [ 6.891000] ? efivar_ssdt_iter+0x1f4/0x1f4 > [ 6.891000] efisubsys_init+0x1be/0x4ae > [ 6.891000] ? kernfs_get.part.8+0x4c/0x60 > [ 6.891000] ? efivar_ssdt_iter+0x1f4/0x1f4 > [ 6.891000] ? __kernfs_create_file+0x235/0x2e0 > [ 6.891000] ? e
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote: > Can you please elaborate a bit more about the concept of subdomains? > From my point of view, an aux-domain is a normal un-managed domain which > has a PASID and could be attached to any ADIs through the aux-domain > specific attach/detach APIs. It could also be attached to a device with > the existing domain_attach/detach_device() APIs at the same time, hence > mdev and pci devices in a vfio container could share a domain. Okay, let's think a bit about having aux-specific attach/detach functions, in the end I don't insist on my proposal as long as the IOMMU-API extensions are clean, consistent, and have well defined semantics. If we have aux-domain specific attach/detach functions like iommu_aux_domain_attach/detach(), what happens when the primary domain of the device is changed with iommu_attach/detach()? 1) Will the aux-domains stay in place? If yes, how does this work in setups where the pasid-bound page-tables are guest-owned and translated through the primary domain page-tables? 2) Will the aux-domains be unbound too? In that case, if the primary domain is re-used, will the aux-domains be implicitly bound too when iommu_device_attach() is called? 3) Do we just disallow changing the primary domain through that interface as long as there are aux-domains or mm_structs bound to the device? Using option 2) or 3) would mean that the aux-domains are still bound to the primary domain, but that is not reflected in the API. Further, if an aux-domain is just a normal domain (struct iommu_domain), what happens when a domain that was used as a primary domain and has bound aux-domains to it, is bound with iommu_aux_domain_attach() to another domain? As you can see, this design decision raises a lot of questions, but maybe we can work it out and find a solution we all agree on. Regards, Joerg
Re: [PATCH] mm: Replace all open encodings for NUMA_NO_NODE
On 11/23/2018 04:06 PM, David Hildenbrand wrote: > On 23.11.18 10:54, Anshuman Khandual wrote: >> At present there are multiple places where invalid node number is encoded >> as -1. Even though implicitly understood it is always better to have macros >> in there. Replace these open encodings for an invalid node number with the >> global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like >> 'invalid node' from various places redirecting them to a common definition. >> >> Signed-off-by: Anshuman Khandual >> --- >> >> Changes in V1: >> >> - Dropped OCFS2 changes per Joseph >> - Dropped media/video drivers changes per Hans >> >> RFC - https://patchwork.kernel.org/patch/10678035/ >> >> Build tested this with multiple cross compiler options like alpha, sparc, >> arm64, x86, powerpc, powerpc64le etc with their default config which might >> not have compiled tested all driver related changes. I will appreciate >> folks giving this a test in their respective build environment. >> >> All these places for replacement were found by running the following grep >> patterns on the entire kernel code. Please let me know if this might have >> missed some instances. This might also have replaced some false positives. >> I will appreciate suggestions, inputs and review. >> >> 1. git grep "nid == -1" >> 2. git grep "node == -1" >> 3. git grep "nid = -1" >> 4. git grep "node = -1" > > Hopefully you found most users :) I hope so :) > > Did you check if some are encoded into function calls? f(-1, ...) Not really. Just wondering how do we even search for it. There might be higher level functions passing down -1 to core MM. If you have some instances in mind which need replacement I will accommodate them. > > Reviewed-by: David Hildenbrand Thanks for the review.
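For the `f(-1, ...)` cases David asks about, plain `git grep` is weak because it matches text, not call sites. Coccinelle matches structurally, so a semantic patch can both find and convert such callers. A sketch (the two function names below are only examples, not an exhaustive list, and the rule set would need one entry per node-taking function):

```
@@
expression E1, E2;
@@
(
- alloc_pages_node(-1, E1, E2)
+ alloc_pages_node(NUMA_NO_NODE, E1, E2)
|
- kmalloc_node(E1, E2, -1)
+ kmalloc_node(E1, E2, NUMA_NO_NODE)
)
```

Something like `spatch --sp-file numa_no_node.cocci --in-place <dir>` would then apply it treewide; the usual caveat is reviewing the diff, since a literal -1 argument is not always a node id.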
Re: remove the ->mapping_error method from dma_map_ops V2
On Fri, Nov 23, 2018 at 11:49:18AM +0100, Joerg Roedel wrote: > On Thu, Nov 22, 2018 at 05:52:15PM +, Robin Murphy wrote: > > Unfortunately, with things like the top-down IOVA allocator, and 32-bit > > systems in general, "the top 4095" values may well still be valid addresses > > - we're relying on a 1-byte mapping of the very top byte of memory/IOVA > > space being sufficiently ridiculous that no real code would ever do that, > > but even a 4-byte mapping of the top 4 bytes is within the realms of the > > plausible (I've definitely seen the USB layer make 8-byte mappings from any > > old offset within a page, for example). > > But we can easily work around that by reserving the top 4k of the first > 4GB of IOVA address space in the allocator, no? Then these values are > never returned as valid DMA handles. Yuck. So, if we have a 4GB non-PAE 32-bit system, or a PAE system where we have valid memory across the 4GB boundary and no IOMMU, we have to reserve the top 4K page in the first 4GB of RAM? Linus' IS_DMA_ERR() solution is way more preferable, but unfortunately it still falls short, because it knocks out the top 4K of every DMA capable bus. A DMA capable bus may involve some translation of the address (eg, by simple offsetting) which means that we'd need to potentially knock out multiple pages from the page allocator for each of those pages that correspond to the top 4K of any DMA capable bus. Knowing that at the right time at boot will be fun! It also sounds wasteful to me. Rather than inventing magic cookies like this, I'd much rather we sanitised the API so that we have functions that return success or an error code, rather than trying to shoe-horn some kind of magic error codes into dma_addr_t and subtly break systems in the process. 
-- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH v4 4/8] iommu/vt-d: Attach/detach domains in auxiliary mode
Hi Lu, On 11/5/18 8:34 AM, Lu Baolu wrote: > When multiple domains per device has been enabled by the > device driver, the device will tag the default PASID for > the domain to all DMA traffics out of the subset of this > device; and the IOMMU should translate the DMA requests > in PASID granularity. > > This extends the intel_iommu_attach/detach_device() ops > to support managing PASID granular translation structures > when the device driver has enabled multiple domains per > device. > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Signed-off-by: Sanjay Kumar > Signed-off-by: Lu Baolu > Signed-off-by: Liu Yi L > --- > drivers/iommu/intel-iommu.c | 192 +++- > include/linux/intel-iommu.h | 10 ++ > 2 files changed, 180 insertions(+), 22 deletions(-) > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 2c86ac71c774..a61b25ad0d3b 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -2477,6 +2477,7 @@ static struct dmar_domain > *dmar_insert_one_dev_info(struct intel_iommu *iommu, > info->iommu = iommu; > info->pasid_table = NULL; > info->auxd_enabled = 0; > + INIT_LIST_HEAD(&info->auxiliary_domains); > > if (dev && dev_is_pci(dev)) { > struct pci_dev *pdev = to_pci_dev(info->dev); > @@ -5010,35 +5011,134 @@ static void intel_iommu_domain_free(struct > iommu_domain *domain) > domain_exit(to_dmar_domain(domain)); > } > > -static int intel_iommu_attach_device(struct iommu_domain *domain, > - struct device *dev) > +/* > + * Check whether a @domain will be attached to the @dev in the > + * auxiliary mode. 
> + */ > +static inline bool > +is_device_attach_aux_domain(struct device *dev, struct iommu_domain *domain) > { > - struct dmar_domain *dmar_domain = to_dmar_domain(domain); > - struct intel_iommu *iommu; > - int addr_width; > - u8 bus, devfn; > + struct device_domain_info *info = dev->archdata.iommu; > > - if (device_is_rmrr_locked(dev)) { > - dev_warn(dev, "Device is ineligible for IOMMU domain attach due > to platform RMRR requirement. Contact your platform vendor.\n"); > - return -EPERM; > - } > + return info && info->auxd_enabled && > + domain->type == IOMMU_DOMAIN_UNMANAGED; > +} > > - /* normally dev is not mapped */ > - if (unlikely(domain_context_mapped(dev))) { > - struct dmar_domain *old_domain; > +static void auxiliary_link_device(struct dmar_domain *domain, > + struct device *dev) > +{ > + struct device_domain_info *info = dev->archdata.iommu; > > - old_domain = find_domain(dev); > - if (old_domain) { > - rcu_read_lock(); > - dmar_remove_one_dev_info(old_domain, dev); > - rcu_read_unlock(); > + assert_spin_locked(&device_domain_lock); > + if (WARN_ON(!info)) > + return; > > - if (!domain_type_is_vm_or_si(old_domain) && > - list_empty(&old_domain->devices)) > - domain_exit(old_domain); > + domain->auxd_refcnt++; > + list_add(&domain->auxd, &info->auxiliary_domains); > +} > + > +static void auxiliary_unlink_device(struct dmar_domain *domain, > + struct device *dev) > +{ > + struct device_domain_info *info = dev->archdata.iommu; > + > + assert_spin_locked(&device_domain_lock); > + if (WARN_ON(!info)) > + return; > + > + list_del(&domain->auxd); > + domain->auxd_refcnt--; > + > + if (!domain->auxd_refcnt && domain->default_pasid > 0) > + intel_pasid_free_id(domain->default_pasid); > +} > + > +static int domain_add_dev_auxd(struct dmar_domain *domain, > +struct device *dev) > +{ > + int ret; > + u8 bus, devfn; > + unsigned long flags; > + struct intel_iommu *iommu; > + > + iommu = device_to_iommu(dev, &bus, &devfn); > + if (!iommu) > + return -ENODEV; > + 
> + spin_lock_irqsave(&device_domain_lock, flags); > + if (domain->default_pasid <= 0) { > + domain->default_pasid = intel_pasid_alloc_id(domain, PASID_MIN, > + pci_max_pasids(to_pci_dev(dev)), GFP_ATOMIC); > + if (domain->default_pasid < 0) { > + pr_err("Can't allocate default pasid\n"); > + ret = -ENODEV; > + goto pasid_failed; > } > } > > + spin_lock(&iommu->lock); You may comment your nested lock policy somewhere. > + ret = domain_attach_iommu(domain, iommu); > + if (ret) > + goto attach_failed; > + > + /* Setup the PASID entry for mediated devices: */ > + ret = intel_pasid_setup_second_level(iommu, domain, dev, > +
Re: [PATCH v4 1/8] iommu: Add APIs for multiple domains per device
Hi Lu, On 11/5/18 8:34 AM, Lu Baolu wrote: > Sharing a physical PCI device in a finer-granularity way > is becoming a consensus in the industry. IOMMU vendors > are also engaging efforts to support such sharing as well > as possible. Among the efforts, the capability of support > finer-granularity DMA isolation is a common requirement > due to the security consideration. With finer-granularity > DMA isolation, all DMA requests out of or to a subset of > a physical PCI device can be protected by the IOMMU. As a > result, there is a request in software to attach multiple > domains to a physical PCI device. One example of such use > model is the Intel Scalable IOV [1] [2]. The Intel vt-d > 3.0 spec [3] introduces the scalable mode which enables > PASID granularity DMA isolation. > > This adds the APIs to support multiple domains per device. > In order to ease the discussions, we call it 'a domain in > auxiliary mode' or simply 'auxiliary domain' when multiple > domains are attached to a physical device. > > The APIs includes: > > * iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY) > - Represents the ability of supporting multiple domains > per device. > > * iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLED) > - Checks whether the device identified by @dev is working > in auxiliary mode. > > * iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE) > - Enables the multiple domains capability for the device > referenced by @dev. > > * iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE) > - Disables the multiple domains capability for the device > referenced by @dev. > > * iommu_attach_device_aux(domain, dev) > - Attaches @domain to @dev in the auxiliary mode. Multiple > domains could be attached to a single device in the > auxiliary mode with each domain representing an isolated > address space for an assignable subset of the device. > > * iommu_detach_device_aux(domain, dev) > - Detach @domain which has been attached to @dev in the > auxiliary mode. 
> > * iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID) > - Return ID used for finer-granularity DMA translation. > For the Intel Scalable IOV usage model, this will be > a PASID. The device which supports Scalalbe IOV needs s/Scalalbe/Scalable > to writes this ID to the device register so that DMA s/writes/write > requests could be tagged with a right PASID prefix. This is not crystal clear to me as the intel implementation returns the default PASID and not the PASID of the aux domain. > > Many people involved in discussions of this design. > > Kevin Tian > Liu Yi L > Ashok Raj > Sanjay Kumar > Jacob Pan > Alex Williamson > Jean-Philippe Brucker > > and some discussions can be found here [4]. > > [1] > https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification > [2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf > [3] > https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification > [4] https://lkml.org/lkml/2018/7/26/4 > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Cc: Liu Yi L > Suggested-by: Kevin Tian > Suggested-by: Jean-Philippe Brucker > Signed-off-by: Lu Baolu > --- > drivers/iommu/iommu.c | 52 +++ > include/linux/iommu.h | 52 +++ > 2 files changed, 104 insertions(+) > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index edbdf5d6962c..0b7c96d1425e 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -2030,3 +2030,55 @@ int iommu_fwspec_add_ids(struct device *dev, u32 *ids, > int num_ids) > return 0; > } > EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids); > + > +/* > + * Generic interfaces to get or set per device IOMMU attributions. 
> + */ > +int iommu_get_dev_attr(struct device *dev, enum iommu_dev_attr attr, void > *data) > +{ > + const struct iommu_ops *ops = dev->bus->iommu_ops; > + > + if (ops && ops->get_dev_attr) > + return ops->get_dev_attr(dev, attr, data); > + > + return -EINVAL; > +} > +EXPORT_SYMBOL_GPL(iommu_get_dev_attr); > + > +int iommu_set_dev_attr(struct device *dev, enum iommu_dev_attr attr, void > *data) > +{ > + const struct iommu_ops *ops = dev->bus->iommu_ops; > + > + if (ops && ops->set_dev_attr) > + return ops->set_dev_attr(dev, attr, data); > + > + return -EINVAL; > +} > +EXPORT_SYMBOL_GPL(iommu_set_dev_attr); > + > +/* > + * APIs to attach/detach a domain to/from a device in the > + * auxiliary mode. > + */ > +int iommu_attach_device_aux(struct iommu_domain *domain, struct device *dev) > +{ > + int ret = -ENODEV; > + > + if (domain->ops->attach_dev_aux) > + ret = domain->ops->attach_dev_aux(domain, dev); > + > + if (!ret) > + trace_attach_device_to_domain(dev); > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(iommu_atta
Re: [PATCH v4 2/8] iommu/vt-d: Add multiple domains per device query
Hi, On 11/5/18 8:34 AM, Lu Baolu wrote: > Add the response to IOMMU_DEV_ATTR_AUXD_CAPABILITY capability query > through iommu_get_dev_attr(). commit title: Advertise auxiliary domain capability? > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Signed-off-by: Lu Baolu > Signed-off-by: Liu Yi L > --- > drivers/iommu/intel-iommu.c | 38 + > 1 file changed, 38 insertions(+) > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 5e149d26ea9b..298f7a3fafe8 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -5167,6 +5167,24 @@ static phys_addr_t intel_iommu_iova_to_phys(struct > iommu_domain *domain, > return phys; > } > > +static inline bool scalable_mode_support(void) > +{ > + struct dmar_drhd_unit *drhd; > + struct intel_iommu *iommu; > + bool ret = true; > + > + rcu_read_lock(); > + for_each_active_iommu(iommu, drhd) { > + if (!sm_supported(iommu)) { > + ret = false; > + break; > + } > + } > + rcu_read_unlock(); > + > + return ret; > +} > + > static bool intel_iommu_capable(enum iommu_cap cap) > { > if (cap == IOMMU_CAP_CACHE_COHERENCY) > @@ -5331,6 +5349,25 @@ struct intel_iommu *intel_svm_device_to_iommu(struct > device *dev) > } > #endif /* CONFIG_INTEL_IOMMU_SVM */ > > +static int intel_iommu_get_dev_attr(struct device *dev, > + enum iommu_dev_attr attr, void *data) > +{ > + int ret = 0; > + bool *auxd_capable; nit: could be local to the case as other cases may use other datatypes. 
> + > + switch (attr) { > + case IOMMU_DEV_ATTR_AUXD_CAPABILITY: > + auxd_capable = data; > + *auxd_capable = scalable_mode_support(); > + break; > + default: > + ret = -EINVAL; > + break; > + } > + > + return ret; > +} > + > const struct iommu_ops intel_iommu_ops = { > .capable= intel_iommu_capable, > .domain_alloc = intel_iommu_domain_alloc, > @@ -5345,6 +5382,7 @@ const struct iommu_ops intel_iommu_ops = { > .get_resv_regions = intel_iommu_get_resv_regions, > .put_resv_regions = intel_iommu_put_resv_regions, > .device_group = pci_device_group, > + .get_dev_attr = intel_iommu_get_dev_attr, > .pgsize_bitmap = INTEL_IOMMU_PGSIZES, > }; > > Thanks Eric
Re: remove the ->mapping_error method from dma_map_ops V2
On Thu, Nov 22, 2018 at 05:52:15PM +, Robin Murphy wrote: > Unfortunately, with things like the top-down IOVA allocator, and 32-bit > systems in general, "the top 4095" values may well still be valid addresses > - we're relying on a 1-byte mapping of the very top byte of memory/IOVA > space being sufficiently ridiculous that no real code would ever do that, > but even a 4-byte mapping of the top 4 bytes is within the realms of the > plausible (I've definitely seen the USB layer make 8-byte mappings from any > old offset within a page, for example). But we can easily work around that by reserving the top 4k of the first 4GB of IOVA address space in the allocator, no? Then these values are never returned as valid DMA handles. Regards, Joerg
Re: [PATCH] mm: Replace all open encodings for NUMA_NO_NODE
On 23.11.18 10:54, Anshuman Khandual wrote: > At present there are multiple places where invalid node number is encoded > as -1. Even though implicitly understood it is always better to have macros > in there. Replace these open encodings for an invalid node number with the > global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like > 'invalid node' from various places redirecting them to a common definition. > > Signed-off-by: Anshuman Khandual > --- > > Changes in V1: > > - Dropped OCFS2 changes per Joseph > - Dropped media/video drivers changes per Hans > > RFC - https://patchwork.kernel.org/patch/10678035/ > > Build tested this with multiple cross compiler options like alpha, sparc, > arm64, x86, powerpc, powerpc64le etc with their default config which might > not have compiled tested all driver related changes. I will appreciate > folks giving this a test in their respective build environment. > > All these places for replacement were found by running the following grep > patterns on the entire kernel code. Please let me know if this might have > missed some instances. This might also have replaced some false positives. > I will appreciate suggestions, inputs and review. > > 1. git grep "nid == -1" > 2. git grep "node == -1" > 3. git grep "nid = -1" > 4. git grep "node = -1" Hopefully you found most users :) Did you check if some are encoded into function calls? f(-1, ...) Reviewed-by: David Hildenbrand -- Thanks, David / dhildenb
[PATCH] mm: Replace all open encodings for NUMA_NO_NODE
At present there are multiple places where invalid node number is encoded as -1. Even though implicitly understood it is always better to have macros in there. Replace these open encodings for an invalid node number with the global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like 'invalid node' from various places redirecting them to a common definition. Signed-off-by: Anshuman Khandual --- Changes in V1: - Dropped OCFS2 changes per Joseph - Dropped media/video drivers changes per Hans RFC - https://patchwork.kernel.org/patch/10678035/ Build tested this with multiple cross compiler options like alpha, sparc, arm64, x86, powerpc, powerpc64le etc with their default config which might not have compiled tested all driver related changes. I will appreciate folks giving this a test in their respective build environment. All these places for replacement were found by running the following grep patterns on the entire kernel code. Please let me know if this might have missed some instances. This might also have replaced some false positives. I will appreciate suggestions, inputs and review. 1. git grep "nid == -1" 2. git grep "node == -1" 3. git grep "nid = -1" 4. 
git grep "node = -1" arch/alpha/include/asm/topology.h | 2 +- arch/ia64/kernel/numa.c | 2 +- arch/ia64/mm/discontig.c | 6 +++--- arch/ia64/sn/kernel/io_common.c | 2 +- arch/powerpc/include/asm/pci-bridge.h | 2 +- arch/powerpc/kernel/paca.c| 2 +- arch/powerpc/kernel/pci-common.c | 2 +- arch/powerpc/mm/numa.c| 14 +++--- arch/powerpc/platforms/powernv/memtrace.c | 4 ++-- arch/sparc/kernel/auxio_32.c | 2 +- arch/sparc/kernel/pci_fire.c | 2 +- arch/sparc/kernel/pci_schizo.c| 2 +- arch/sparc/kernel/pcic.c | 6 +++--- arch/sparc/kernel/psycho_common.c | 2 +- arch/sparc/kernel/sbus.c | 2 +- arch/sparc/mm/init_64.c | 6 +++--- arch/sparc/prom/init_32.c | 2 +- arch/sparc/prom/init_64.c | 4 ++-- arch/sparc/prom/tree_32.c | 12 ++-- arch/sparc/prom/tree_64.c | 18 +- arch/x86/include/asm/pci.h| 2 +- arch/x86/kernel/apic/x2apic_uv_x.c| 6 +++--- arch/x86/kernel/smpboot.c | 2 +- arch/x86/platform/olpc/olpc_dt.c | 16 drivers/block/mtip32xx/mtip32xx.c | 4 ++-- drivers/dma/dmaengine.c | 3 ++- drivers/infiniband/hw/hfi1/affinity.c | 2 +- drivers/infiniband/hw/hfi1/init.c | 2 +- drivers/iommu/dmar.c | 4 ++-- drivers/iommu/intel-iommu.c | 2 +- drivers/misc/sgi-xp/xpc_uv.c | 2 +- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++-- init/init_task.c | 2 +- kernel/kthread.c | 2 +- kernel/sched/fair.c | 15 --- lib/cpumask.c | 2 +- mm/huge_memory.c | 12 ++-- mm/hugetlb.c | 2 +- mm/ksm.c | 2 +- mm/memory.c | 6 +++--- mm/memory_hotplug.c | 12 ++-- mm/mempolicy.c| 2 +- mm/page_alloc.c | 4 ++-- mm/page_ext.c | 2 +- net/core/pktgen.c | 2 +- net/qrtr/qrtr.c | 2 +- tools/perf/bench/numa.c | 6 +++--- 47 files changed, 109 insertions(+), 107 deletions(-) diff --git a/arch/alpha/include/asm/topology.h b/arch/alpha/include/asm/topology.h index e6e13a8..f6dc89c 100644 --- a/arch/alpha/include/asm/topology.h +++ b/arch/alpha/include/asm/topology.h @@ -29,7 +29,7 @@ static const struct cpumask *cpumask_of_node(int node) { int cpu; - if (node == -1) + if (node == NUMA_NO_NODE) return cpu_all_mask; 
cpumask_clear(&node_to_cpumask_map[node]); diff --git a/arch/ia64/kernel/numa.c b/arch/ia64/kernel/numa.c index 92c3762..1315da6 100644 --- a/arch/ia64/kernel/numa.c +++ b/arch/ia64/kernel/numa.c @@ -74,7 +74,7 @@ void __init build_cpu_to_node_map(void) cpumask_clear(&node_to_cpu_mask[node]); for_each_possible_early_cpu(cpu) { - node = -1; + node = NUMA_NO_NODE; for (i = 0; i < NR_CPUS; ++i) if (cpu_physical_id(cpu) == node_cpuid[i].phys_id) { node = node_cpuid[i].nid; diff --git a/arch/i
Re: [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant
Hi Tomasz, On Fri, Nov 23, 2018 at 2:52 PM Tomasz Figa wrote: > > Hi Vivek, Will, > > On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam > wrote: > > > > Hi Will, > > > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > > > > > [+Thor] > > > > > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote: > > > > qcom,smmu-v2 is an arm,smmu-v2 implementation with specific > > > > clock and power requirements. > > > > On msm8996, multiple cores, viz. mdss, video, etc. use this > > > > smmu. On sdm845, this smmu is used with gpu. > > > > Add bindings for the same. > > > > > > > > Signed-off-by: Vivek Gautam > > > > Reviewed-by: Rob Herring > > > > Reviewed-by: Tomasz Figa > > > > Tested-by: Srinivas Kandagatla > > > > Reviewed-by: Robin Murphy > > > > --- > > > > drivers/iommu/arm-smmu.c | 13 + > > > > 1 file changed, 13 insertions(+) > > > > > > > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > > > > index 2098c3141f5f..d315ca637097 100644 > > > > --- a/drivers/iommu/arm-smmu.c > > > > +++ b/drivers/iommu/arm-smmu.c > > > > @@ -120,6 +120,7 @@ enum arm_smmu_implementation { > > > > GENERIC_SMMU, > > > > ARM_MMU500, > > > > CAVIUM_SMMUV2, > > > > + QCOM_SMMUV2, > > > > }; > > > > > > > > struct arm_smmu_s2cr { > > > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, > > > > GENERIC_SMMU); > > > > ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); > > > > ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); > > > > > > > > +static const char * const qcom_smmuv2_clks[] = { > > > > + "bus", "iface", > > > > +}; > > > > + > > > > +static const struct arm_smmu_match_data qcom_smmuv2 = { > > > > + .version = ARM_SMMU_V2, > > > > + .model = QCOM_SMMUV2, > > > > + .clks = qcom_smmuv2_clks, > > > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), > > > > +}; > > > > > > These seems redundant if we go down the route proposed by Thor, where we > > > just pull all of the clocks out of the device-tree. 
In which case, why > > do we need this match_data at all? > > Which is better? Driver relying solely on the device tree to tell > > which all clocks > > are required to be enabled, > > or, the driver deciding itself based on the platform's match data, > > that it should > > have X, Y, & Z clocks that should be supplied from the device tree. > > The former would simplify the driver, but would also make it > impossible to spot mistakes in DT, which would ultimately surface out > as very hard to debug bugs (likely complete system lockups). Thanks. Yea, this is how I understand things presently. Relying on device tree puts the things out of driver's control. Hi Will, Am I unable to understand the intentions here for Thor's clock-fetch design change? > > For qcom_smmuv2, I believe we're eventually going to end up with > platform-specific quirks anyway, so specifying the clocks too wouldn't > hurt. Given that, I'd recommend sticking to the latter, i.e. what this > patch does. > > Best regards, > Tomasz Best regards Vivek > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v2 09/17] debugobjects: Make object hash locks nestable terminal locks
On Thu 2018-11-22 15:17:52, Waiman Long wrote: > On 11/22/2018 10:33 AM, Petr Mladek wrote: > > On Mon 2018-11-19 13:55:18, Waiman Long wrote: > >> By making the object hash locks nestable terminal locks, we can avoid > >> a bunch of unnecessary lockdep validations as well as saving space > >> in the lockdep tables. > > Please, explain which terminal lock might be nested. > > So the db_lock is made to be nestable that it can allow acquisition of > pool_lock (a regular terminal lock) underneath it. Please, mention this in the commit message. Best Regards, Petr
Re: [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant
Hi Vivek, Will, On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam wrote: > > Hi Will, > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > > > [+Thor] > > > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote: > > > qcom,smmu-v2 is an arm,smmu-v2 implementation with specific > > > clock and power requirements. > > > On msm8996, multiple cores, viz. mdss, video, etc. use this > > > smmu. On sdm845, this smmu is used with gpu. > > > Add bindings for the same. > > > > > > Signed-off-by: Vivek Gautam > > > Reviewed-by: Rob Herring > > > Reviewed-by: Tomasz Figa > > > Tested-by: Srinivas Kandagatla > > > Reviewed-by: Robin Murphy > > > --- > > > drivers/iommu/arm-smmu.c | 13 + > > > 1 file changed, 13 insertions(+) > > > > > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > > > index 2098c3141f5f..d315ca637097 100644 > > > --- a/drivers/iommu/arm-smmu.c > > > +++ b/drivers/iommu/arm-smmu.c > > > @@ -120,6 +120,7 @@ enum arm_smmu_implementation { > > > GENERIC_SMMU, > > > ARM_MMU500, > > > CAVIUM_SMMUV2, > > > + QCOM_SMMUV2, > > > }; > > > > > > struct arm_smmu_s2cr { > > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, > > > GENERIC_SMMU); > > > ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); > > > ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); > > > > > > +static const char * const qcom_smmuv2_clks[] = { > > > + "bus", "iface", > > > +}; > > > + > > > +static const struct arm_smmu_match_data qcom_smmuv2 = { > > > + .version = ARM_SMMU_V2, > > > + .model = QCOM_SMMUV2, > > > + .clks = qcom_smmuv2_clks, > > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), > > > +}; > > > > These seems redundant if we go down the route proposed by Thor, where we > > just pull all of the clocks out of the device-tree. In which case, why > > do we need this match_data at all? > > Which is better? 
Driver relying solely on the device tree to tell > which all clocks > are required to be enabled, > or, the driver deciding itself based on the platform's match data, > that it should > have X, Y, & Z clocks that should be supplied from the device tree. The former would simplify the driver, but would also make it impossible to spot mistakes in DT, which would ultimately surface out as very hard to debug bugs (likely complete system lockups). For qcom_smmuv2, I believe we're eventually going to end up with platform-specific quirks anyway, so specifying the clocks too wouldn't hurt. Given that, I'd recommend sticking to the latter, i.e. what this patch does. Best regards, Tomasz
Re: [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant
Hi Will, On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > [+Thor] > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote: > > qcom,smmu-v2 is an arm,smmu-v2 implementation with specific > > clock and power requirements. > > On msm8996, multiple cores, viz. mdss, video, etc. use this > > smmu. On sdm845, this smmu is used with gpu. > > Add bindings for the same. > > > > Signed-off-by: Vivek Gautam > > Reviewed-by: Rob Herring > > Reviewed-by: Tomasz Figa > > Tested-by: Srinivas Kandagatla > > Reviewed-by: Robin Murphy > > --- > > drivers/iommu/arm-smmu.c | 13 + > > 1 file changed, 13 insertions(+) > > > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > > index 2098c3141f5f..d315ca637097 100644 > > --- a/drivers/iommu/arm-smmu.c > > +++ b/drivers/iommu/arm-smmu.c > > @@ -120,6 +120,7 @@ enum arm_smmu_implementation { > > GENERIC_SMMU, > > ARM_MMU500, > > CAVIUM_SMMUV2, > > + QCOM_SMMUV2, > > }; > > > > struct arm_smmu_s2cr { > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, > > GENERIC_SMMU); > > ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); > > ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); > > > > +static const char * const qcom_smmuv2_clks[] = { > > + "bus", "iface", > > +}; > > + > > +static const struct arm_smmu_match_data qcom_smmuv2 = { > > + .version = ARM_SMMU_V2, > > + .model = QCOM_SMMUV2, > > + .clks = qcom_smmuv2_clks, > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), > > +}; > > These seems redundant if we go down the route proposed by Thor, where we > just pull all of the clocks out of the device-tree. In which case, why > do we need this match_data at all? Which is better? Driver relying solely on the device tree to tell which all clocks are required to be enabled, or, the driver deciding itself based on the platform's match data, that it should have X, Y, & Z clocks that should be supplied from the device tree. 
Thanks Vivek > > Will > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v5 0/7] Add virtio-iommu driver
Hi Jean, On 11/22/18 8:37 PM, Jean-Philippe Brucker wrote: > Implement the virtio-iommu driver, following specification v0.9 [1]. > > Since v4 [2] I fixed the issues reported by Eric, and added Reviewed-by > from Eric and Rob. Thanks! > > I changed the specification to fix one inconsistency discussed in v4. > That the device fills the probe buffer with zeroes is now a "SHOULD" > instead of a "MAY", since it's the only way for the driver to know if > the device wrote the status. Existing devices already do this. In > addition the device now needs to fill the three padding bytes at the > tail with zeroes. > > You can find Linux driver and kvmtool device on branches > virtio-iommu/v0.9 [3]. I also lightly tested with Eric's latest QEMU > device [4]. > > [1] Virtio-iommu specification v0.9, sources, pdf and diff from v0.8 > git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9 > http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf > > http://jpbrucker.net/virtio-iommu/spec/diffs/virtio-iommu-pdf-diff-v0.8-v0.9.pdf > > [2] [PATCH v4 0/7] Add virtio-iommu driver > > https://lists.linuxfoundation.org/pipermail/iommu/2018-November/031074.html > > [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9 > git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9 > > [4] [RFC v9 00/17] VIRTIO-IOMMU device > https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html > > Jean-Philippe Brucker (7): > dt-bindings: virtio-mmio: Add IOMMU description > dt-bindings: virtio: Add virtio-pci-iommu node > of: Allow the iommu-map property to omit untranslated devices > PCI: OF: Initialize dev->fwnode appropriately > iommu: Add virtio-iommu driver > iommu/virtio: Add probe request > iommu/virtio: Add event queue > > .../devicetree/bindings/virtio/iommu.txt | 66 + > .../devicetree/bindings/virtio/mmio.txt | 30 + > MAINTAINERS |7 + > drivers/iommu/Kconfig | 11 + > drivers/iommu/Makefile|1 + > drivers/iommu/virtio-iommu.c | 1157 + > drivers/of/base.c | 10 +- > 
drivers/pci/of.c |7 + > include/uapi/linux/virtio_ids.h |1 + > include/uapi/linux/virtio_iommu.h | 161 +++ > 10 files changed, 1448 insertions(+), 3 deletions(-) > create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt > create mode 100644 drivers/iommu/virtio-iommu.c > create mode 100644 include/uapi/linux/virtio_iommu.h > for the whole series Tested-by: Eric Auger Thanks Eric
Re: [PATCH v5 5/7] iommu: Add virtio-iommu driver
Hi Jean,

On 11/22/18 8:37 PM, Jean-Philippe Brucker wrote:
> The virtio IOMMU is a para-virtualized device, allowing to send IOMMU
> requests such as map/unmap over virtio transport without emulating page
> tables. This implementation handles ATTACH, DETACH, MAP and UNMAP
> requests.
>
> The bulk of the code transforms calls coming from the IOMMU API into
> corresponding virtio requests. Mappings are kept in an interval tree
> instead of page tables.
>
> Signed-off-by: Jean-Philippe Brucker

Reviewed-by: Eric Auger

Thanks

Eric

> ---
>  MAINTAINERS                       |   7 +
>  drivers/iommu/Kconfig             |  11 +
>  drivers/iommu/Makefile            |   1 +
>  drivers/iommu/virtio-iommu.c      | 916 ++
>  include/uapi/linux/virtio_ids.h   |   1 +
>  include/uapi/linux/virtio_iommu.h | 104
>  6 files changed, 1040 insertions(+)
>  create mode 100644 drivers/iommu/virtio-iommu.c
>  create mode 100644 include/uapi/linux/virtio_iommu.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1689dcfec800..3d8550c76f4a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -15946,6 +15946,13 @@ S: Maintained
>  F: drivers/virtio/virtio_input.c
>  F: include/uapi/linux/virtio_input.h
>
> +VIRTIO IOMMU DRIVER
> +M: Jean-Philippe Brucker
> +L: virtualizat...@lists.linux-foundation.org
> +S: Maintained
> +F: drivers/iommu/virtio-iommu.c
> +F: include/uapi/linux/virtio_iommu.h
> +
>  VIRTUAL BOX GUEST DEVICE DRIVER
>  M: Hans de Goede
>  M: Arnd Bergmann
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index bf2bbfa2a399..db5f2b8c23f5 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -464,4 +464,15 @@ config QCOM_IOMMU
>  	help
>  	  Support for IOMMU on certain Qualcomm SoCs.
>
> +config VIRTIO_IOMMU
> +	bool "Virtio IOMMU driver"
> +	depends on VIRTIO=y
> +	select IOMMU_API
> +	select INTERVAL_TREE
> +	select ARM_DMA_USE_IOMMU if ARM
> +	help
> +	  Para-virtualised IOMMU driver with virtio.
> +
> +	  Say Y here if you intend to run this kernel as a guest.
> +
>  endif # IOMMU_SUPPORT
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 5481e5fe1f95..bd7e55751d09 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -36,3 +36,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
>  obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
> +obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> new file mode 100644
> index ..7540dab9c8dc
> --- /dev/null
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -0,0 +1,916 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Virtio driver for the paravirtualized IOMMU
> + *
> + * Copyright (C) 2018 Arm Limited
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include
> +
> +#define MSI_IOVA_BASE	0x800
> +#define MSI_IOVA_LENGTH	0x10
> +
> +#define VIOMMU_REQUEST_VQ	0
> +#define VIOMMU_NR_VQS	1
> +
> +struct viommu_dev {
> +	struct iommu_device		iommu;
> +	struct device			*dev;
> +	struct virtio_device		*vdev;
> +
> +	struct ida			domain_ids;
> +
> +	struct virtqueue		*vqs[VIOMMU_NR_VQS];
> +	spinlock_t			request_lock;
> +	struct list_head		requests;
> +
> +	/* Device configuration */
> +	struct iommu_domain_geometry	geometry;
> +	u64				pgsize_bitmap;
> +	u8				domain_bits;
> +};
> +
> +struct viommu_mapping {
> +	phys_addr_t			paddr;
> +	struct interval_tree_node	iova;
> +	u32				flags;
> +};
> +
> +struct viommu_domain {
> +	struct iommu_domain		domain;
> +	struct viommu_dev		*viommu;
> +	struct mutex			mutex; /* protects viommu pointer */
> +	unsigned int			id;
> +
> +	spinlock_t			mappings_lock;
> +	struct rb_root_cached		mappings;
> +
> +	unsigned long			nr_endpoints;
> +};
> +
> +struct viommu_endpoint {
> +	struct viommu_dev		*viommu;
> +	struct viommu_domain		*vdomain;
> +};
> +
> +struct viommu_request {
> +	struct list_head		list;
> +	void				*writeback;
> +	unsigned int			write_offset;
> +	unsigned int			len;
> +	char
Re: [PATCH v3] drm/rockchip: update cursors asynchronously through atomic.
On Fri, 2018-11-23 at 11:27 +0900, Tomasz Figa wrote:
>
> The point here is not about setting and resetting the plane->fb
> pointer. It's about what happens inside drm_atomic_set_fb_for_plane().
>
> It calls drm_framebuffer_get() for the new fb and drm_framebuffer_put()
> for the old fb. As a result, if the fb changes, the old fb, which had
> its reference count incremented in the atomic commit that set it to the
> plane before, has its reference count decremented. Moreover, if the new
> reference count becomes 0, drm_framebuffer_put() will immediately free
> the buffer.
>
> Freeing a buffer when the hardware is still scanning out of it isn't a
> good idea, is it?

No, it's not. But the board I submitted the patch for doesn't have
anything like hot-swappable RAM. The RAM access is still going to work;
it might just display something it shouldn't, for example if that
framebuffer got reused by something else and filled with new data in
that very small window.

But yes, I agree the best solution would be not to release the buffer
until the next vblank. Perhaps a good solution would be for the DRM API
to have the concept of a deferred release, meaning the put() call would
just add the framebuffer to a list that the DRM core could walk during
the vblank. That might be better than every single driver trying to
work up a custom solution.

> The vc4 driver seems to be able to program the hardware to switch the
> scanout to the new buffer immediately:
>
> https://elixir.bootlin.com/linux/v4.20-rc3/source/drivers/gpu/drm/vc4/vc4_plane.c#L794
>
> Although I wonder if there isn't still a tiny race there - the hardware
> may have just started refilling the FIFO from the old address. Still,
> if the FIFO is small, the FIFO refill operation may be much shorter
> than it takes for the kernel code to actually free the buffer. Eric and
> Michael, could you confirm?
I don't have those boards anymore, and I don't have access to any
technical documentation on the GPU, so I can't really add much here.
Eric can probably provide the best information.

I submitted the patch because I was working on arm64 support for fun
and was becoming very annoyed by desktop lockups for long periods of
time in the desktop environment of my choice, due to the driver being
flooded with cursor animation updates. I sent the patch to Eric, who
was kind enough to review it and suggest some improvements.