Re: [PATCH 6/9] iommu/dma-iommu.c: Convert to use vm_insert_range
On Sat, Nov 24, 2018 at 3:04 AM Matthew Wilcox wrote: > > On Fri, Nov 23, 2018 at 05:23:06PM +, Robin Murphy wrote: > > On 15/11/2018 15:49, Souptick Joarder wrote: > > > Convert to use vm_insert_range() to map range of kernel > > > memory to user vma. > > > > > > Signed-off-by: Souptick Joarder > > > Reviewed-by: Matthew Wilcox > > > --- > > > drivers/iommu/dma-iommu.c | 12 ++-- > > > 1 file changed, 2 insertions(+), 10 deletions(-) > > > > > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > > > index d1b0475..69c66b1 100644 > > > --- a/drivers/iommu/dma-iommu.c > > > +++ b/drivers/iommu/dma-iommu.c > > > @@ -622,17 +622,9 @@ struct page **iommu_dma_alloc(struct device *dev, > > > size_t size, gfp_t gfp, > > > int iommu_dma_mmap(struct page **pages, size_t size, struct > > > vm_area_struct *vma) > > > { > > > - unsigned long uaddr = vma->vm_start; > > > - unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT; > > > - int ret = -ENXIO; > > > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > > > - for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) { > > > - ret = vm_insert_page(vma, uaddr, pages[i]); > > > - if (ret) > > > - break; > > > - uaddr += PAGE_SIZE; > > > - } > > > - return ret; > > > + return vm_insert_range(vma, vma->vm_start, pages, count); > > > > AFAICS, vm_insert_range() doesn't respect vma->vm_pgoff, so doesn't this > > break partial mmap()s of a large buffer? (which I believe can be a thing) > > Whoops. That should have been: > > return vm_insert_range(vma, vma->vm_start, pages + vma->vm_pgoff, count); > > I suppose. Matthew, does the patch "drm/rockchip/rockchip_drm_gem.c: Convert to use vm_insert_range" also need to address the same issue? > > Although arguably we should respect vm_pgoff inside vm_insert_region() > and then callers automatically get support for vm_pgoff without having > to think about it ...
although we should then also pass in the length > of the pages array to avoid pages being mapped in which aren't part of > the allocated array. > > Hm. More thought required. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] mm: Replace all open encodings for NUMA_NO_NODE
On Fri, 23 Nov 2018 15:24:16 +0530 Anshuman Khandual wrote: > At present there are multiple places where invalid node number is encoded > as -1. Even though implicitly understood it is always better to have macros > in there. Replace these open encodings for an invalid node number with the > global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like > 'invalid node' from various places redirecting them to a common definition. > > ... > > Build tested this with multiple cross compiler options like alpha, sparc, > arm64, x86, powerpc, powerpc64le etc with their default config which might > not have compile tested all driver related changes. I will appreciate > folks giving this a test in their respective build environment. > > All these places for replacement were found by running the following grep > patterns on the entire kernel code. Please let me know if this might have > missed some instances. This might also have replaced some false positives. > I will appreciate suggestions, inputs and review. > > 1. git grep "nid == -1" > 2. git grep "node == -1" > 3. git grep "nid = -1" > 4. git grep "node = -1" The build testing is good, but I worry that some of the affected files don't clearly have numa.h in their include paths, for the NUMA_NO_NODE definition. The first thing I looked at is arch/powerpc/include/asm/pci-bridge.h. Maybe it somehow manages to include numa.h via some nested include, but if so, is that reliable across all config combinations and as code evolves? So I think that the patch should have added an explicit include of numa.h, especially in cases where the affected file previously had no references to any of the things which numa.h defines.
Re: [PATCH v5 5/7] iommu: Add virtio-iommu driver
On Thu, Nov 22, 2018 at 07:37:59PM +, Jean-Philippe Brucker wrote: > The virtio IOMMU is a para-virtualized device, allowing to send IOMMU > requests such as map/unmap over virtio transport without emulating page > tables. This implementation handles ATTACH, DETACH, MAP and UNMAP > requests. > > The bulk of the code transforms calls coming from the IOMMU API into > corresponding virtio requests. Mappings are kept in an interval tree > instead of page tables. > > Signed-off-by: Jean-Philippe Brucker > --- > MAINTAINERS | 7 + > drivers/iommu/Kconfig | 11 + > drivers/iommu/Makefile| 1 + > drivers/iommu/virtio-iommu.c | 916 ++ > include/uapi/linux/virtio_ids.h | 1 + > include/uapi/linux/virtio_iommu.h | 104 > 6 files changed, 1040 insertions(+) > create mode 100644 drivers/iommu/virtio-iommu.c > create mode 100644 include/uapi/linux/virtio_iommu.h > > diff --git a/MAINTAINERS b/MAINTAINERS > index 1689dcfec800..3d8550c76f4a 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -15946,6 +15946,13 @@ S: Maintained > F: drivers/virtio/virtio_input.c > F: include/uapi/linux/virtio_input.h > > +VIRTIO IOMMU DRIVER > +M: Jean-Philippe Brucker > +L: virtualizat...@lists.linux-foundation.org > +S: Maintained > +F: drivers/iommu/virtio-iommu.c > +F: include/uapi/linux/virtio_iommu.h > + > VIRTUAL BOX GUEST DEVICE DRIVER > M: Hans de Goede > M: Arnd Bergmann > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig > index bf2bbfa2a399..db5f2b8c23f5 100644 > --- a/drivers/iommu/Kconfig > +++ b/drivers/iommu/Kconfig > @@ -464,4 +464,15 @@ config QCOM_IOMMU > help > Support for IOMMU on certain Qualcomm SoCs. > > +config VIRTIO_IOMMU > + bool "Virtio IOMMU driver" > + depends on VIRTIO=y > + select IOMMU_API > + select INTERVAL_TREE > + select ARM_DMA_USE_IOMMU if ARM > + help > + Para-virtualised IOMMU driver with virtio. > + > + Say Y here if you intend to run this kernel as a guest. 
> + > endif # IOMMU_SUPPORT > diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile > index 5481e5fe1f95..bd7e55751d09 100644 > --- a/drivers/iommu/Makefile > +++ b/drivers/iommu/Makefile > @@ -36,3 +36,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o > obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o > obj-$(CONFIG_S390_IOMMU) += s390-iommu.o > obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o > +obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c > new file mode 100644 > index ..7540dab9c8dc > --- /dev/null > +++ b/drivers/iommu/virtio-iommu.c > @@ -0,0 +1,916 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Virtio driver for the paravirtualized IOMMU > + * > + * Copyright (C) 2018 Arm Limited > + */ > + > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +#define MSI_IOVA_BASE0x800 > +#define MSI_IOVA_LENGTH 0x10 > + > +#define VIOMMU_REQUEST_VQ0 > +#define VIOMMU_NR_VQS1 > + > +struct viommu_dev { > + struct iommu_device iommu; > + struct device *dev; > + struct virtio_device*vdev; > + > + struct ida domain_ids; > + > + struct virtqueue*vqs[VIOMMU_NR_VQS]; > + spinlock_t request_lock; > + struct list_headrequests; > + > + /* Device configuration */ > + struct iommu_domain_geometrygeometry; > + u64 pgsize_bitmap; > + u8 domain_bits; > +}; > + > +struct viommu_mapping { > + phys_addr_t paddr; > + struct interval_tree_node iova; > + u32 flags; > +}; > + > +struct viommu_domain { > + struct iommu_domain domain; > + struct viommu_dev *viommu; > + struct mutexmutex; /* protects viommu pointer */ > + unsigned intid; > + > + spinlock_t mappings_lock; > + struct rb_root_cached mappings; > + > + unsigned long nr_endpoints; > +}; > + > +struct viommu_endpoint { > + struct viommu_dev *viommu; > + 
struct viommu_domain*vdomain; > +}; > + > +struct viommu_request { > + struct list_headlist; > + void*writeback; > + unsigned intwrite_offset; > + unsigned intlen; > + charbuf[]; > +}; > +
Re: [PATCH v5 5/7] iommu: Add virtio-iommu driver
On Thu, Nov 22, 2018 at 07:37:59PM +, Jean-Philippe Brucker wrote: > The virtio IOMMU is a para-virtualized device, allowing to send IOMMU > requests such as map/unmap over virtio transport without emulating page > tables. This implementation handles ATTACH, DETACH, MAP and UNMAP > requests. > > The bulk of the code transforms calls coming from the IOMMU API into > corresponding virtio requests. Mappings are kept in an interval tree > instead of page tables. > > Signed-off-by: Jean-Philippe Brucker > --- > MAINTAINERS | 7 + > drivers/iommu/Kconfig | 11 + > drivers/iommu/Makefile| 1 + > drivers/iommu/virtio-iommu.c | 916 ++ > include/uapi/linux/virtio_ids.h | 1 + > include/uapi/linux/virtio_iommu.h | 104 > 6 files changed, 1040 insertions(+) > create mode 100644 drivers/iommu/virtio-iommu.c > create mode 100644 include/uapi/linux/virtio_iommu.h > > diff --git a/MAINTAINERS b/MAINTAINERS > index 1689dcfec800..3d8550c76f4a 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -15946,6 +15946,13 @@ S: Maintained > F: drivers/virtio/virtio_input.c > F: include/uapi/linux/virtio_input.h > > +VIRTIO IOMMU DRIVER > +M: Jean-Philippe Brucker > +L: virtualizat...@lists.linux-foundation.org > +S: Maintained > +F: drivers/iommu/virtio-iommu.c > +F: include/uapi/linux/virtio_iommu.h > + > VIRTUAL BOX GUEST DEVICE DRIVER > M: Hans de Goede > M: Arnd Bergmann > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig > index bf2bbfa2a399..db5f2b8c23f5 100644 > --- a/drivers/iommu/Kconfig > +++ b/drivers/iommu/Kconfig > @@ -464,4 +464,15 @@ config QCOM_IOMMU > help > Support for IOMMU on certain Qualcomm SoCs. > > +config VIRTIO_IOMMU > + bool "Virtio IOMMU driver" > + depends on VIRTIO=y > + select IOMMU_API > + select INTERVAL_TREE > + select ARM_DMA_USE_IOMMU if ARM > + help > + Para-virtualised IOMMU driver with virtio. > + > + Say Y here if you intend to run this kernel as a guest. > + Given it is arm specific right now, shouldn't this depend on ARM? E.g. 
there's a hack for x86 right now. > endif # IOMMU_SUPPORT > diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile > index 5481e5fe1f95..bd7e55751d09 100644 > --- a/drivers/iommu/Makefile > +++ b/drivers/iommu/Makefile > @@ -36,3 +36,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o > obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o > obj-$(CONFIG_S390_IOMMU) += s390-iommu.o > obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o > +obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c > new file mode 100644 > index ..7540dab9c8dc > --- /dev/null > +++ b/drivers/iommu/virtio-iommu.c > @@ -0,0 +1,916 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Virtio driver for the paravirtualized IOMMU > + * > + * Copyright (C) 2018 Arm Limited > + */ > + > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +#define MSI_IOVA_BASE0x800 > +#define MSI_IOVA_LENGTH 0x10 > + > +#define VIOMMU_REQUEST_VQ0 > +#define VIOMMU_NR_VQS1 > + > +struct viommu_dev { > + struct iommu_device iommu; > + struct device *dev; > + struct virtio_device*vdev; > + > + struct ida domain_ids; > + > + struct virtqueue*vqs[VIOMMU_NR_VQS]; > + spinlock_t request_lock; > + struct list_headrequests; > + > + /* Device configuration */ > + struct iommu_domain_geometrygeometry; > + u64 pgsize_bitmap; > + u8 domain_bits; > +}; > + > +struct viommu_mapping { > + phys_addr_t paddr; > + struct interval_tree_node iova; > + u32 flags; > +}; > + > +struct viommu_domain { > + struct iommu_domain domain; > + struct viommu_dev *viommu; > + struct mutexmutex; /* protects viommu pointer */ > + unsigned intid; > + > + spinlock_t mappings_lock; > + struct rb_root_cached mappings; > + > + unsigned long nr_endpoints; > +}; > + > +struct viommu_endpoint { > + 
struct viommu_dev *viommu; > + struct viommu_domain*vdomain; > +}; > + > +struct viommu_request { > + struct list_headlist; > + void*writeback; > + unsigned intwrite_off
Re: [PATCH 6/9] iommu/dma-iommu.c: Convert to use vm_insert_range
On Fri, Nov 23, 2018 at 05:23:06PM +, Robin Murphy wrote: > On 15/11/2018 15:49, Souptick Joarder wrote: > > Convert to use vm_insert_range() to map range of kernel > > memory to user vma. > > > > Signed-off-by: Souptick Joarder > > Reviewed-by: Matthew Wilcox > > --- > > drivers/iommu/dma-iommu.c | 12 ++-- > > 1 file changed, 2 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > > index d1b0475..69c66b1 100644 > > --- a/drivers/iommu/dma-iommu.c > > +++ b/drivers/iommu/dma-iommu.c > > @@ -622,17 +622,9 @@ struct page **iommu_dma_alloc(struct device *dev, > > size_t size, gfp_t gfp, > > int iommu_dma_mmap(struct page **pages, size_t size, struct > > vm_area_struct *vma) > > { > > - unsigned long uaddr = vma->vm_start; > > - unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT; > > - int ret = -ENXIO; > > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > > - for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) { > > - ret = vm_insert_page(vma, uaddr, pages[i]); > > - if (ret) > > - break; > > - uaddr += PAGE_SIZE; > > - } > > - return ret; > > + return vm_insert_range(vma, vma->vm_start, pages, count); > AFAICS, vm_insert_range() doesn't respect vma->vm_pgoff, so doesn't this > break partial mmap()s of a large buffer? (which I believe can be a thing) Whoops. That should have been: return vm_insert_range(vma, vma->vm_start, pages + vma->vm_pgoff, count); I suppose. Although arguably we should respect vm_pgoff inside vm_insert_region() and then callers automatically get support for vm_pgoff without having to think about it ... although we should then also pass in the length of the pages array to avoid pages being mapped in which aren't part of the allocated array. Hm. More thought required.
Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
Hi Will, On 2018-11-23 6:27 pm, Will Deacon wrote: Hi John, On Tue, Nov 20, 2018 at 10:25:16AM +0100, Christoph Hellwig wrote: On Mon, Nov 19, 2018 at 03:22:13PM -0800, John Stultz wrote: + sg->dma_address = dma_addr; sg_dma_len(sg) = sg->length; } I know Robin has already replied with more detailed info, but just to close the loop as I'm finally home, applying this patch didn't seem to help with the IO hangs I'm seeing w/ HiKey960. If Robin's observation is right this should fix the problem for you: Please could you give this diff a try and let us know whether the problem persists with your board? This is actually the exact same change that I ended up with in my fixes[1], which John has indeed confirmed. (sorry, it's the thing I was telling you about the other day, but I neglected to include you on cc when I sent the patches out the following morning) Robin. [1] https://lore.kernel.org/lkml/cover.1542812807.git.robin.mur...@arm.com/
Re: [RESEND PATCH v17 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device
On Thu, Nov 22, 2018 at 05:32:24PM +0530, Vivek Gautam wrote: > Hi Will, > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > > > On Fri, Nov 16, 2018 at 04:54:27PM +0530, Vivek Gautam wrote: > > > From: Sricharan R > > > > > > The smmu device probe/remove and add/remove master device callbacks > > > gets called when the smmu is not linked to its master, that is without > > > the context of the master device. So calling runtime apis in those places > > > separately. > > > Global locks are also initialized before enabling runtime pm as the > > > runtime_resume() calls device_reset() which does tlb_sync_global() > > > that ultimately requires locks to be initialized. > > > > > > Signed-off-by: Sricharan R > > > [vivek: Cleanup pm runtime calls] > > > Signed-off-by: Vivek Gautam > > > Reviewed-by: Tomasz Figa > > > Tested-by: Srinivas Kandagatla > > > Reviewed-by: Robin Murphy > > > --- > > > drivers/iommu/arm-smmu.c | 101 > > > ++- > > > 1 file changed, 91 insertions(+), 10 deletions(-) > > > > Given that you're doing the get/put in the TLBI ops unconditionally: > > > > > static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain) > > > { > > > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > > > + struct arm_smmu_device *smmu = smmu_domain->smmu; > > > > > > - if (smmu_domain->tlb_ops) > > > + if (smmu_domain->tlb_ops) { > > > + arm_smmu_rpm_get(smmu); > > > smmu_domain->tlb_ops->tlb_flush_all(smmu_domain); > > > + arm_smmu_rpm_put(smmu); > > > + } > > > } > > > > > > static void arm_smmu_iotlb_sync(struct iommu_domain *domain) > > > { > > > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > > > + struct arm_smmu_device *smmu = smmu_domain->smmu; > > > > > > - if (smmu_domain->tlb_ops) > > > + if (smmu_domain->tlb_ops) { > > > + arm_smmu_rpm_get(smmu); > > > smmu_domain->tlb_ops->tlb_sync(smmu_domain); > > > + arm_smmu_rpm_put(smmu); > > > + } > > > > Why do you need them around the map/unmap calls as well? 
> > We still have .tlb_add_flush path? Ok, so we could add the ops around that as well. Right now, we've got the runtime pm hooks crossing two parts of the API. Will
Re: [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant
On Fri, Nov 23, 2018 at 03:06:29PM +0530, Vivek Gautam wrote: > On Fri, Nov 23, 2018 at 2:52 PM Tomasz Figa wrote: > > On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam > > wrote: > > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote: > > > > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, > > > > > ARM_SMMU_V1_64K, GENERIC_SMMU); > > > > > ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); > > > > > ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); > > > > > > > > > > +static const char * const qcom_smmuv2_clks[] = { > > > > > + "bus", "iface", > > > > > +}; > > > > > + > > > > > +static const struct arm_smmu_match_data qcom_smmuv2 = { > > > > > + .version = ARM_SMMU_V2, > > > > > + .model = QCOM_SMMUV2, > > > > > + .clks = qcom_smmuv2_clks, > > > > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), > > > > > +}; > > > > > > > > These seems redundant if we go down the route proposed by Thor, where we > > > > just pull all of the clocks out of the device-tree. In which case, why > > > > do we need this match_data at all? > > > > > > Which is better? Driver relying solely on the device tree to tell > > > which all clocks > > > are required to be enabled, > > > or, the driver deciding itself based on the platform's match data, > > > that it should > > > have X, Y, & Z clocks that should be supplied from the device tree. > > > > The former would simplify the driver, but would also make it > > impossible to spot mistakes in DT, which would ultimately surface out > > as very hard to debug bugs (likely complete system lockups). > > Thanks. > Yea, this is how I understand things presently. Relying on device tree > puts the things out of driver's control. But it also has the undesirable effect of having to update the driver code whenever we want to add support for a new SMMU implementation. 
If we do this all in the DT, as Thor is trying to do, then older kernels will work well with new hardware. > Hi Will, > Am I unable to understand the intentions here for Thor's clock-fetch > design change? I'm having trouble parsing your question, sorry. Please work with Thor so that we have a single way to get the clock information. My preference is to take it from the firmware, for the reason I stated above. Will
Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
Hi John, On Tue, Nov 20, 2018 at 10:25:16AM +0100, Christoph Hellwig wrote: > On Mon, Nov 19, 2018 at 03:22:13PM -0800, John Stultz wrote: > > > + sg->dma_address = dma_addr; > > > sg_dma_len(sg) = sg->length; > > > } > > > > I know Robin has already replied with more detailed info, but just to > > close the loop as I'm finally home, applying this patch didn't seem to > > help with the IO hangs I'm seeing w/ HiKey960. > > If Robin's observation is right this should fix the problem for you: Please could you give this diff a try and let us know whether the problem persists with your board? Thanks, Will > diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h > index bd73e7a91410..1833f0c1fba0 100644 > --- a/include/linux/dma-direct.h > +++ b/include/linux/dma-direct.h > @@ -5,7 +5,7 @@ > #include > #include > > -#define DIRECT_MAPPING_ERROR 0 > +#define DIRECT_MAPPING_ERROR (~(dma_addr_t)0x0) > > #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA > #include
Re: [PATCH 6/9] iommu/dma-iommu.c: Convert to use vm_insert_range
On 15/11/2018 15:49, Souptick Joarder wrote: Convert to use vm_insert_range() to map range of kernel memory to user vma. Signed-off-by: Souptick Joarder Reviewed-by: Matthew Wilcox --- drivers/iommu/dma-iommu.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index d1b0475..69c66b1 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -622,17 +622,9 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp, int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma) { - unsigned long uaddr = vma->vm_start; - unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT; - int ret = -ENXIO; + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; - for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) { - ret = vm_insert_page(vma, uaddr, pages[i]); - if (ret) - break; - uaddr += PAGE_SIZE; - } - return ret; + return vm_insert_range(vma, vma->vm_start, pages, count); AFAICS, vm_insert_range() doesn't respect vma->vm_pgoff, so doesn't this break partial mmap()s of a large buffer? (which I believe can be a thing) Robin. } static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
Re: [PATCH v3] iommu/dma: Use NUMA aware memory allocations in __iommu_dma_alloc_pages()
On 21/11/2018 16:57, Will Deacon wrote: On Wed, Nov 21, 2018 at 04:47:48PM +, John Garry wrote: On 21/11/2018 16:07, Will Deacon wrote: On Wed, Nov 21, 2018 at 10:54:10PM +0800, John Garry wrote: From: Ganapatrao Kulkarni Change function __iommu_dma_alloc_pages() to allocate pages for DMA from respective device NUMA node. The ternary operator which would be for alloc_pages_node() is tidied along with this. We also include a change to use kvzalloc() for kzalloc()/vzalloc() combination. Signed-off-by: Ganapatrao Kulkarni [JPG: Added kvzalloc(), drop pages ** being device local, tidied ternary operator] Signed-off-by: John Garry Weird, you're missing a diffstat here. Anyway, the patch looks fine to me, but it would be nice if you could justify the change with some numbers. Do you actually see an improvement from this change? Hi Will, Ah, I missed adding my comments explaining the motivation. It would be better in the commit log. Anyway, here's the snippet: " ... as mentioned in [3], dma_alloc_coherent() uses the locality information from the device - as in direct DMA - so this patch is just applying this same policy. [3] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1692998.html" Yes, please add this to the commit log. Sure, I did have some numbers to show improvement in some scenarios when I tested this a while back which I'll dig out. However I would say that some scenarios will improve and the opposite for others with this change, considering different conditions in which DMA memory may be used. Well, if you can show that it's useful in some cases and not catastrophic in others, then I think shooting for parity with direct DMA is a reasonable justification for the change. So I have done some more testing with our SoC crypto engine, using the tcrypt module. The reason I used this device was because we can utilise a local device per socket in the system, unlike other DMAing devices, which generally only exist on a single socket (for us, anyway).
Overall the results aren't brilliant - as expected - but show a general marginal improvement. Here's some figures:

Summary:
  Average diff  +0.9%
  Max diff      +1.5%
  Min diff      +0.2%

Test                                      Ops/second
                                          before  after  diff %
async ecb(aes) encrypt
test 0 (128 bit key, 16 byte blocks):     68717   69057  0.5
test 1 (128 bit key, 64 byte blocks):     72633   73163  0.7
test 2 (128 bit key, 256 byte blocks):    71475   72076  0.8
test 3 (128 bit key, 1024 byte blocks):   66819   67467  1.0
test 4 (128 bit key, 8192 byte blocks):   38237   38495  0.7
test 5 (192 bit key, 16 byte blocks):     70273   71079  1.2
test 6 (192 bit key, 64 byte blocks):     72455   73292  1.2
test 7 (192 bit key, 256 byte blocks):    71085   71876  1.1
test 8 (192 bit key, 1024 byte blocks):   65891   66576  1.0
test 9 (192 bit key, 8192 byte blocks):   34846   35061  0.6
test 10 (256 bit key, 16 byte blocks):    72927   73762  1.2
test 11 (256 bit key, 64 byte blocks):    72361   73207  1.2
test 12 (256 bit key, 256 byte blocks):   70907   71602  1.0
test 13 (256 bit key, 1024 byte blocks):  65035   65653  1.0
test 14 (256 bit key, 8192 byte blocks):  32835   32998  0.5
async ecb(aes) decrypt
test 0 (128 bit key, 16 byte blocks):     68384   69130  1.1
test 1 (128 bit key, 64 byte blocks):     72645   73133  0.7
test 2 (128 bit key, 256 byte blocks):    71523   71912  0.5
test 3 (128 bit key, 1024 byte blocks):   66902   67258  0.5
test 4 (128 bit key, 8192 byte blocks):   38260   38434  0.5
test 5 (192 bit key, 16 byte blocks):     70301   70816  0.7
test 6 (192 bit key, 64 byte blocks):     72473   73064  0.8
test 7 (192 bit key, 256 byte blocks):    71106   71663  0.8
test 8 (192 bit key, 1024 byte blocks):   65915   66363  0.7
test 9 (192 bit key, 8192 byte blocks):   34876   35006  0.4
test 10 (256 bit key, 16 byte blocks):    72969   73519  0.8
test 11 (256 bit key, 64 byte blocks):    72404   72925  0.7
test 12 (256 bit key, 256 byte blocks):   70861   71350  0.7
test 13 (256 bit key, 1024 byte blocks):  65074   65485  0.6
test 14 (256 bit key, 8192 byte blocks):  32861   32974  0.3
async cbc(aes) encrypt
test 0 (128 bit key, 16 byte blocks):     58306   59131  1.4
test 1 (128 bit key, 64 byte blocks):     61647   62565  1.5
test 2 (128 bit key, 256 byte blocks):    60841   61666  1.4
test 3 (128 bit key, 1024 byte blocks):   57503   58204  1.2
test 4 (128 bit key, 8192 byte blocks):   34760   35055  0.9
test 5 (192 bit key, 16 byte blocks):     59684   60452  1.3
test 6 (192 bit key, 64 byte blocks):     61705   62516  1.3
test 7 (192 bit key, 256 byte blocks):    60678   61426  1.2
test 8 (192 bit key, 1024 byte blocks):   56805   57487  1.2
test 9 (192 bit key, 8192 byte blocks):   31836   32093  0.8
test 10 (256 bit key, 16 byte blocks):    61961   62785  1.3
test 11 (256 bit key, 64 byte blocks):    61584   62427  1.4
test 12 (256 bit key, 256 byte blocks):   60407   61246  1.4
test 13 (256 bit key, 1024
[git pull] IOMMU Fixes for Linux v4.20-rc3
Hi Linus, The following changes since commit 651022382c7f8da46cb4872a545ee1da6d097d2a: Linux 4.20-rc1 (2018-11-04 15:37:52 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git iommu-fixes-v4.20-rc3 for you to fetch changes up to 829383e183728dec7ed9150b949cd6de64127809: iommu/vt-d: Use memunmap to free memremap (2018-11-22 17:02:21 +0100) IOMMU Fixes for Linux v4.20-rc3 Including: - Two fixes for the Intel VT-d driver to fix a NULL-ptr dereference and an unbalance in an allocate/free path (allocated with memremap, freed with iounmap) - Fix for a crash in the Renesas IOMMU driver - Fix for the Advanced Virtual Interrupt Controller (AVIC) code in the AMD IOMMU driver Filippo Sironi (1): amd/iommu: Fix Guest Virtual APIC Log Tail Address Register Geert Uytterhoeven (1): iommu/ipmmu-vmsa: Fix crash on early domain free Lu Baolu (1): iommu/vt-d: Fix NULL pointer dereference in prq_event_thread() Pan Bian (1): iommu/vt-d: Use memunmap to free memremap drivers/iommu/amd_iommu_init.c | 3 ++- drivers/iommu/intel-iommu.c| 2 +- drivers/iommu/intel-svm.c | 2 +- drivers/iommu/ipmmu-vmsa.c | 3 +++ 4 files changed, 7 insertions(+), 3 deletions(-) Please pull. Thanks, Joerg
Re: [PATCH v4 8/8] vfio/type1: Handle different mdev isolation type
Hi Lu, On 11/5/18 8:34 AM, Lu Baolu wrote: > This adds the support to determine the isolation type > of a mediated device group by checking whether it has > an iommu device. If an iommu device exists, an iommu > domain will be allocated and then attached to the iommu > device. Otherwise, keep the same behavior as it is. > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Cc: Liu Yi L > Signed-off-by: Sanjay Kumar > Signed-off-by: Lu Baolu > Signed-off-by: Liu Yi L > --- > drivers/vfio/vfio_iommu_type1.c | 48 - > 1 file changed, 42 insertions(+), 6 deletions(-) > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index 178264b330e7..eed26129f58c 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -1427,13 +1427,40 @@ static void vfio_iommu_detach_group(struct > vfio_domain *domain, > iommu_detach_group(domain->domain, group->iommu_group); > } > > +static bool vfio_bus_is_mdev(struct bus_type *bus) > +{ > + struct bus_type *mdev_bus; > + bool ret = false; > + > + mdev_bus = symbol_get(mdev_bus_type); > + if (mdev_bus) { > + ret = (bus == mdev_bus); > + symbol_put(mdev_bus_type); > + } > + > + return ret; > +} > + > +static int vfio_mdev_iommu_device(struct device *dev, void *data) > +{ > + struct device **old = data, *new; > + > + new = vfio_mdev_get_iommu_device(dev); > + if (*old && *old != new) if !new can't you return -EINVAL as well? 
> + return -EINVAL; > + > + *old = new; > + > + return 0; > +} > + > static int vfio_iommu_type1_attach_group(void *iommu_data, >struct iommu_group *iommu_group) > { > struct vfio_iommu *iommu = iommu_data; > struct vfio_group *group; > struct vfio_domain *domain, *d; > - struct bus_type *bus = NULL, *mdev_bus; > + struct bus_type *bus = NULL; > int ret; > bool resv_msi, msi_remap; > phys_addr_t resv_msi_base; > @@ -1468,11 +1495,18 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, > if (ret) > goto out_free; > > - mdev_bus = symbol_get(mdev_bus_type); > + if (vfio_bus_is_mdev(bus)) { > + struct device *iommu_device = NULL; > > - if (mdev_bus) { > - if ((bus == mdev_bus) && !iommu_present(bus)) { > - symbol_put(mdev_bus_type); > + group->mdev_group = true; > + > + /* Determine the isolation type */ > + ret = iommu_group_for_each_dev(iommu_group, &iommu_device, > +vfio_mdev_iommu_device); > + if (ret) > + goto out_free; > + > + if (!iommu_device) { > if (!iommu->external_domain) { > INIT_LIST_HEAD(&domain->group_list); > iommu->external_domain = domain; > @@ -1482,9 +1516,11 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, > list_add(&group->next, >&iommu->external_domain->group_list); > mutex_unlock(&iommu->lock); > + extra new line > return 0; > } > - symbol_put(mdev_bus_type); > + > + bus = iommu_device->bus; > } > > domain->domain = iommu_domain_alloc(bus); > Thanks Eric
Re: [PATCH v4 7/8] vfio/type1: Add domain at(de)taching group helpers
Hi Lu, On 11/5/18 8:34 AM, Lu Baolu wrote: > This adds helpers to attach or detach a domain to a > group. This will replace iommu_attach_group() which > only works for pci devices. s/pci/non mdev? > > If a domain is attaching to a group which includes the > mediated devices, it should attach to the iommu device > (a pci device which represents the mdev in iommu scope) > instead. The added helper supports attaching domain to > groups for both pci and mdev devices. > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Signed-off-by: Lu Baolu > Signed-off-by: Liu Yi L > --- > drivers/vfio/vfio_iommu_type1.c | 114 ++-- > 1 file changed, 107 insertions(+), 7 deletions(-) > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index d9fd3188615d..178264b330e7 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -91,6 +91,7 @@ struct vfio_dma { > struct vfio_group { > struct iommu_group *iommu_group; > struct list_headnext; > + boolmdev_group; /* An mdev group */ > }; > > /* > @@ -1327,6 +1328,105 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group > *group, phys_addr_t *base) > return ret; > } > > +static struct device *vfio_mdev_get_iommu_device(struct device *dev) > +{ > + struct device *(*fn)(struct device *dev); > + struct device *iommu_parent; > + > + fn = symbol_get(mdev_get_iommu_device); > + if (fn) { > + iommu_parent = fn(dev); > + symbol_put(mdev_get_iommu_device); > + > + return iommu_parent; > + } > + > + return NULL; > +} > + > +static int vfio_mdev_set_domain(struct device *dev, struct iommu_domain > *domain) > +{ > + int (*fn)(struct device *dev, void *domain); > + int ret; > + > + fn = symbol_get(mdev_set_iommu_domain); > + if (fn) { > + ret = fn(dev, domain); > + symbol_put(mdev_set_iommu_domain); > + > + return ret; > + } > + > + return -EINVAL; > +} > + > +static int vfio_mdev_attach_domain(struct device *dev, void *data) > +{ > + struct iommu_domain *domain = data; > + struct 
device *iommu_device; > + int ret; > + > + ret = vfio_mdev_set_domain(dev, domain); > + if (ret) > + return ret; > + > + iommu_device = vfio_mdev_get_iommu_device(dev); > + if (iommu_device) { > + bool aux_mode = false; > + > + iommu_get_dev_attr(iommu_device, > +IOMMU_DEV_ATTR_AUXD_ENABLED, &aux_mode); Don't you need to test the returned value before using aux_mode? > + if (aux_mode) > + return iommu_attach_device_aux(domain, iommu_device); > + else > + return iommu_attach_device(domain, iommu_device); If for some reason the above ops fail, don't you want to call vfio_mdev_set_domain(dev, NULL)? > + } > + > + return -EINVAL; > +} > + > +static int vfio_mdev_detach_domain(struct device *dev, void *data) > +{ > + struct iommu_domain *domain = data; > + struct device *iommu_device; > + > + vfio_mdev_set_domain(dev, NULL); > + iommu_device = vfio_mdev_get_iommu_device(dev); > + if (iommu_device) { > + bool aux_mode = false; > + > + iommu_get_dev_attr(iommu_device, > +IOMMU_DEV_ATTR_AUXD_ENABLED, &aux_mode); same here > + if (aux_mode) > + iommu_detach_device_aux(domain, iommu_device); > + else > + iommu_detach_device(domain, iommu_device); > + } > + > + return 0; > +} > + > +static int vfio_iommu_attach_group(struct vfio_domain *domain, > +struct vfio_group *group) > +{ > + if (group->mdev_group) > + return iommu_group_for_each_dev(group->iommu_group, > + domain->domain, > + vfio_mdev_attach_domain); > + else > + return iommu_attach_group(domain->domain, group->iommu_group); > +} > + > +static void vfio_iommu_detach_group(struct vfio_domain *domain, > + struct vfio_group *group) > +{ > + if (group->mdev_group) > + iommu_group_for_each_dev(group->iommu_group, domain->domain, > + vfio_mdev_detach_domain); > + else > + iommu_detach_group(domain->domain, group->iommu_group); > +} > + > static int vfio_iommu_type1_attach_group(void *iommu_data, >struct iommu_group *iommu_group) > { > @@ -1402,7 +1502,7 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, > goto 
out_domain; > } > > - ret = iommu_attach_group(domain->domain, iommu_group); > + ret = vfio_iommu_attach_group(domain, group); >
Re: remove the ->mapping_error method from dma_map_ops V2
On Fri, Nov 23, 2018 at 02:03:13PM +0100, Joerg Roedel wrote: > On Fri, Nov 23, 2018 at 11:01:55AM +0000, Russell King - ARM Linux wrote: > > Yuck. So, if we have a 4GB non-PAE 32-bit system, or a PAE system > > where we have valid memory across the 4GB boundary and no IOMMU, > > we have to reserve the top 4K page in the first 4GB of RAM? > > But that is only needed when dma_addr_t is 32bit anyway, no? You said: But we can easily work around that by reserving the top 4k of the first 4GB of IOVA address space in the allocator, no? Then these values are never returned as valid DMA handles. To me, your proposal equates to this in code: int dma_error(dma_addr_t addr) { return (addr & ~(dma_addr_t)0xfff) == 0xfffff000 ? (s32)addr : 0; } Hence, the size of dma_addr_t would be entirely irrelevant. This makes dma_addr_t values in the range of 0xfffff000 to 0xffffffff special, irrespective of whether dma_addr_t is 32-bit or 64-bit. If that's not what you meant, then I'm afraid your statement wasn't worded very well - so please can you re-word to state more precisely what your proposal is? > > Rather than inventing magic cookies like this, I'd much rather we > > sanitised the API so that we have functions that return success or > > an error code, rather than trying to shoe-horn some kind of magic > > error codes into dma_addr_t and subtly break systems in the process. > > Sure, but it has the obvious downside that we need to touch every driver > that uses these functions, and those are probably a lot of drivers. So we have two options: - change the interface - subtly break drivers In any case, I disagree that we need to touch every driver. Today, drivers just do: if (dma_mapping_error(dev, addr)) and, because dma_mapping_error() returns a boolean, they only care about the true/falseness. If we're going to start returning error codes, then there's no point just returning error codes unless we have some way for drivers to use them. 
Yes, the simple way would be to make dma_mapping_error() translate addr to an error code, and that maintains compatibility with existing drivers. If, instead, we revamped the DMA API, and had the *new* mapping functions which returned an error code, then the existing mapping functions can be implemented as part of compatibility rather trivially: dma_addr_t dma_map_single(...) { dma_addr_t addr; int error; error = dma_map_single_error(..., &addr); if (error) addr = DMA_MAPPING_ERROR; return addr; } which means that if drivers want access to the error code, they use dma_map_single_error(), meanwhile existing drivers just get on with life as it _currently_ is, with the cookie-based all-ones error code - without introducing any of this potential breakage. Remember, existing drivers would need modification in _any_ case to make use of a returned error code, so the argument that we'd need to touch every driver doesn't really stand up. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: remove the ->mapping_error method from dma_map_ops V2
On Fri, Nov 23, 2018 at 11:01:55AM +0000, Russell King - ARM Linux wrote: > Yuck. So, if we have a 4GB non-PAE 32-bit system, or a PAE system > where we have valid memory across the 4GB boundary and no IOMMU, > we have to reserve the top 4K page in the first 4GB of RAM? But that is only needed when dma_addr_t is 32bit anyway, no? > Rather than inventing magic cookies like this, I'd much rather we > sanitised the API so that we have functions that return success or > an error code, rather than trying to shoe-horn some kind of magic > error codes into dma_addr_t and subtly break systems in the process. Sure, but it has the obvious downside that we need to touch every driver that uses these functions, and those are probably a lot of drivers. Regards, Joerg
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Jean-Philippe, On Wed, Nov 21, 2018 at 07:05:13PM +0000, Jean-Philippe Brucker wrote: > For the moment though, I think we should allow device drivers to use the > DMA-API at the same time as SVA. Yeah, that makes sense. > If a device driver has to map a management ring buffer for example, > it's much nicer to use dma_alloc as usual rather than have them write > their own DMA allocation routines. Which is another reason to > implement 2) above: the DMA-API would still act on the default_domain, > and attaching an "extended" domain augments it with SVA features. > Detaching from the device doesn't require copying mappings back to the > default domain. Does that sound OK? Yes, as I just wrote to Kevin, this sounds good. > struct io_mm *io_mm; > struct iommu_domain *domain; > > domain = iommu_alloc_ext_domain(bus); > > /* Set an mm-exit callback to disable PASIDs. More attributes > could be added later to change the capabilities of the ext > domain */ > iommu_domain_set_attr(domain, DOMAIN_ATTR_MM_EXIT_CB, &mm_exit); > > /* Fails if the device doesn't support this domain type */ > iommu_attach_device(domain, dev); > > /* Same as SVA v3, except a domain instead of dev as argument */ > io_mm = iommu_sva_bind_mm(domain, current->mm, ...); > > /* on success, install the PASID in the device */ > pasid = io_mm->pasid; > > /* Do more work */ > > iommu_sva_unbind_mm(io_mm); > iommu_detach_device(domain, dev); Okay, we can still discuss the naming and a few details, but that goes in the right direction. One open question is, for example, where the pasid-allocator comes into play. As it looks, the amdgpu driver needs it even without an iommu enabled. Regards, Joerg
Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables
On Fri 23-11-18 13:23:41, Vlastimil Babka wrote: > On 11/22/18 9:23 AM, Christoph Hellwig wrote: [...] > > But I do agree with the sentiment of not wanting to spread GFP_DMA32 > > further into the slab allocator. > > I don't see a problem with GFP_DMA32 for custom caches. Generic > kmalloc() would be worse, since it would have to create a new array of > kmalloc caches. But that's already ruled out due to the alignment. Yes that makes quite a lot of sense to me. We do not really need generic support. Just make sure that if somebody creates a GFP_DMA32-restricted cache, then allow allocating restricted memory from that. Is there any fundamental reason that this wouldn't be possible? -- Michal Hocko SUSE Labs
Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables
On 11/22/18 9:23 AM, Christoph Hellwig wrote: > On Wed, Nov 21, 2018 at 10:26:26PM +0000, Robin Murphy wrote: >> TBH, if this DMA32 stuff is going to be contentious we could possibly just >> rip out the offending kmem_cache - it seemed like good practice for the >> use-case, but provided kzalloc(SZ_1K, gfp | GFP_DMA32) can be relied upon to >> give the same 1KB alignment and chance of succeeding as the equivalent >> kmem_cache_alloc(), then we could quite easily make do with that instead. > > Neither is the slab support for kmalloc, nor do kmalloc allocations > have useful alignment apparently (at least if you use slub debug). Is this also true for caches created by kmem_cache_create(), that debugging options can result in not respecting the alignment passed to kmem_cache_create()? That would be rather bad, IMHO. > But I do agree with the sentiment of not wanting to spread GFP_DMA32 > further into the slab allocator. I don't see a problem with GFP_DMA32 for custom caches. Generic kmalloc() would be worse, since it would have to create a new array of kmalloc caches. But that's already ruled out due to the alignment. > I think you want a simple genalloc allocator for this rather special > use case. I would prefer if slab could support it, as it doesn't have to preallocate. OTOH if the allocations are GFP_ATOMIC as suggested later in the thread, and need to always succeed, then preallocation could be better, and thus maybe genalloc.
Re: [PATCH v2 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging
On 11/22/18 2:20 AM, Nicolas Boichat wrote: > On Thu, Nov 22, 2018 at 2:02 AM Michal Hocko wrote: >> >> On Wed 21-11-18 16:46:38, Will Deacon wrote: >>> On Sun, Nov 11, 2018 at 05:03:41PM +0800, Nicolas Boichat wrote: >>> >>> It's a bit grotty that GFP_DMA32 doesn't just map to GFP_DMA on 32-bit >>> architectures, since then we wouldn't need this #ifdeffery afaict. >> >> But GFP_DMA32 should map to GFP_KERNEL on 32b, no? Or what exactly is >> going on in here? > > GFP_DMA32 will fail due to check_slab_flags (aka GFP_SLAB_BUG_MASK > before patch 1/3 of this series)... But yes, it may be neater if there > was transparent remapping of GFP_DMA32/SLAB_CACHE_DMA32 to > GFP_DMA/SLAB_CACHE_DMA on 32-bit arch... I don't know about ARM, but AFAIK on x86 DMA means within first 4MB of physical memory, and DMA32 means within first 4GB. It doesn't matter if the CPU is running in 32bit or 64bit mode. But, when it runs 32bit, the kernel can direct map less than 4GB anyway, which means it doesn't need the extra DMA32 zone, i.e. GFP_KERNEL can only get you memory that's also acceptable for GFP_DMA32. But, DMA is still DMA, i.e. first 4MB. Remapping GFP_DMA32 to GFP_DMA on x86 wouldn't work, as the GFP_DMA32 allocations would then only use those 4MB and exhaust it very fast. >> -- >> Michal Hocko >> SUSE Labs
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Kevin, On Thu, Nov 22, 2018 at 08:39:19AM +0000, Tian, Kevin wrote: > I agree special action needs to be taken for everything else (other than > DMA-API), but the point that I didn't get is why the action must be based > on a new SVA-type domain, instead of extending the default domain with SVA > capability (including related paging structures which are accessed through > new SVA APIs). In the latter case domain-wise attribute (e.g. IRQ mapping) > is naturally shared between capabilities (DMA-API and SVA). There is no > need to create cross-domain connections as two options that you listed > below. > > Can you help elaborate more about the motivation behind proposal? Yeah, thinking more about it, there are no real reasons against allowing aux and sva bindings on the default domain. It allows the host to use a device through the DMA-API and assign parts of it to a guest or a user-space process, for example.

> |-default domain (DMA-API)
> |-sva domain1 (SVA)
> |-sva domain2 (SVA)
> |-...
> |-sva domainN (AUX, guest DMA-API)
> | |- sva domainN1 (AUX, guest SVA)
> | |- sva domainN2 (AUX, guest SVA)
> | |-...
> |-sva domainM (AUX, guest DMA-API)
> | |- sva domainM1 (AUX, guest SVA)
> | |- sva domainM2 (AUX, guest SVA)
> | |-...

In my proposal the sva-domains are not sub-domains of the default domain; they exist on the same level. A device is detached from its default domain and attached to an sva-domain. > means same domain can be default on deviceA while being aux on deviceB > (when assigning pci and mdev to same VM). As already written in another mail, this raises a couple of questions, like what happens when the aux-domain itself has other aux-domains bound to it? Regards, Joerg
Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections
On Fri 2018-11-23 11:40:48, Sergey Senozhatsky wrote: > On (11/22/18 11:16), Peter Zijlstra wrote: > > > So maybe we need to switch debug objects print-outs to _always_ > > > printk_deferred(). Debug objects can be used in code which cannot > > > do direct printk() - timekeeping is just one example. > > > > No, printk_deferred() is a disease, it needs to be eradicated, not > > spread around. > > deadlock-free printk() is deferred, but OK. The best solution would be lockless console drivers. Sigh. > Another idea then: > > --- > > diff --git a/lib/debugobjects.c b/lib/debugobjects.c > index 70935ed91125..3928c2b2f77c 100644 > --- a/lib/debugobjects.c > +++ b/lib/debugobjects.c > @@ -323,10 +323,13 @@ static void debug_print_object(struct debug_obj *obj, > char *msg) > void *hint = descr->debug_hint ? > descr->debug_hint(obj->object) : NULL; > limit++; > + > + bust_spinlocks(1); > WARN(1, KERN_ERR "ODEBUG: %s %s (active state %u) " >"object type: %s hint: %pS\n", > msg, obj_states[obj->state], obj->astate, > descr->name, hint); > + bust_spinlocks(0); > } > debug_objects_warnings++; > } > > --- > > This should make serial consoles re-entrant. > So printk->console_driver_write() hopefully will not deadlock. Is the re-entrance safe? Some risk might be acceptable in Oops/panic situations. It is much less acceptable for random warnings. Best Regards, Petr
Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections
On Thu 2018-11-22 15:29:35, Waiman Long wrote: > On 11/22/2018 11:02 AM, Petr Mladek wrote: > > Anyway, I wonder what was the primary motivation for this patch. > > Was it the system hang? Or was it lockdep report about nesting > > two terminal locks: db->lock, pool_lock with logbuf_lock? > > The primary motivation was to make sure that printk() won't be called > while holding either db->lock or pool_lock in the debugobjects code. In > the determination of which locks can be made terminal, I focused on > local spinlocks that won't cause boundary to an unrelated subsystem as > it will greatly complicate the analysis. Then please mention this as the real reason in the commit message. The reason that printk might take too long in IRQ context does not explain why we need the patch. printk() is called in IRQ context in many other locations. I do not see why this place should be special. The delay is the price for the chance to see the problem. > I didn't realize that it fixed a hang problem that I was seeing until I > did bisection to find out that it was caused by the patch that cause the > debugobjects splat in the first place a few days ago. I was comparing > the performance status of the pre and post patched kernel. The pre-patch > kernel failed to boot in the one of my test systems, but the > post-patched kernel could. I narrowed it down to this patch as the fix > for the hang problem. The hang is a mystery. My guess is that there is a race somewhere. The patch might have reduced the race window but it probably did not fix the race itself. Best Regards, Petr
Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock critical sections
On Thu 2018-11-22 14:57:02, Waiman Long wrote: > On 11/21/2018 09:04 PM, Sergey Senozhatsky wrote: > > On (11/21/18 11:49), Waiman Long wrote: > > [..] > >>> case ODEBUG_STATE_ACTIVE: > >>> - debug_print_object(obj, "init"); > >>> state = obj->state; > >>> raw_spin_unlock_irqrestore(&db->lock, flags); > >>> + debug_print_object(obj, "init"); > >>> debug_object_fixup(descr->fixup_init, addr, state); > >>> return; > >>> > >>> case ODEBUG_STATE_DESTROYED: > >>> - debug_print_object(obj, "init"); > >>> + debug_printobj = true; > >>> break; > >>> default: > >>> break; > >>> } > >>> > >>> raw_spin_unlock_irqrestore(&db->lock, flags); > >>> + if (debug_chkstack) > >>> + debug_object_is_on_stack(addr, onstack); > >>> + if (debug_printobj) > >>> + debug_print_object(obj, "init"); > >>> > > [..] > >> As a side note, one of the test systems that I used generated a > >> debugobjects splat in the bootup process and the system hanged > >> afterward. Applying this patch alone fix the hanging problem and the > >> system booted up successfully. So it is not really a good idea to call > >> printk() while holding a raw spinlock. > > Right, I like this patch. > > And I think that we, maybe, can go even further. > > > > Some serial consoles call mod_timer(). So what we could have with the > > debug objects enabled was > > > > mod_timer() > > lock_timer_base() > > debug_activate() > >printk() > > call_console_drivers() > > foo_console() > > mod_timer() > >lock_timer_base() << deadlock > > > > That's one possible scenario. The other one can involve console's > > IRQ handler, uart port spinlock, mod_timer, debug objects, printk, > > and an eventual deadlock on the uart port spinlock. This one can > > be mitigated with printk_safe. But mod_timer() deadlock will require > > a different fix. > > > > So maybe we need to switch debug objects print-outs to _always_ > > printk_deferred(). Debug objects can be used in code which cannot > > do direct printk() - timekeeping is just one example. 
> > > > -ss > > Actually, I don't think that was the cause of the hang. The debugobjects > splat was caused by debug_object_is_on_stack(), below was the output: > > [ 6.890048] ODEBUG: object (ptrval) is NOT on stack > (ptrval), but annotated. > [ 6.891000] WARNING: CPU: 28 PID: 1 at lib/debugobjects.c:369 > __debug_object_init.cold.11+0x51/0x2d6 > [ 6.891000] Modules linked in: > [ 6.891000] CPU: 28 PID: 1 Comm: swapper/0 Not tainted > 4.18.0-41.el8.bz1651764_cgroup_debug.x86_64+debug #1 > [ 6.891000] Hardware name: HPE ProLiant DL120 Gen10/ProLiant DL120 > Gen10, BIOS U36 11/14/2017 > [ 6.891000] RIP: 0010:__debug_object_init.cold.11+0x51/0x2d6 > [ 6.891000] Code: ea 03 80 3c 02 00 0f 85 85 02 00 00 49 8b 54 24 18 > 48 89 de 4c 89 44 24 10 48 c7 c7 00 ce 22 94 e8 73 18 62 ff 4c 8b 44 24 > 10 <0f> 0b e9 60 db ff ff 41 83 c4 01 b8 ff ff 37 00 44 89 25 ce 46 f9 > [ 6.891000] RSP: :880104187960 EFLAGS: 00010086 > [ 6.891000] RAX: 0050 RBX: 9764c570 RCX: > > [ 6.891000] RDX: RSI: RDI: > 880104178ca8 > [ 6.891000] RBP: 110020830f34 R08: 8807ce68a1d0 R09: > fbfff2923554 > [ 6.891000] R10: fbfff2923554 R11: 9491aaa3 R12: > 880104178000 > [ 6.891000] R13: 96c809b8 R14: a370 R15: > 8807ce68a1c0 > [ 6.891000] FS: () GS:8807d420() > knlGS: > [ 6.891000] CS: 0010 DS: ES: CR0: 80050033 > [ 6.891000] CR2: CR3: 00028de16001 CR4: > 007606e0 > [ 6.891000] DR0: DR1: DR2: > > [ 6.891000] DR3: DR6: fffe0ff0 DR7: > 0400 > [ 6.891000] PKRU: > [ 6.891000] Call Trace: > [ 6.891000] ? debug_object_fixup+0x30/0x30 > [ 6.891000] ? _raw_spin_unlock_irqrestore+0x4b/0x60 > [ 6.891000] ? __lockdep_init_map+0x12f/0x510 > [ 6.891000] ? __lockdep_init_map+0x12f/0x510 > [ 6.891000] virt_efi_get_next_variable+0xa2/0x160 > [ 6.891000] efivar_init+0x1c4/0x6d7 > [ 6.891000] ? efivar_ssdt_setup+0x3b/0x3b > [ 6.891000] ? efivar_entry_iter+0x120/0x120 > [ 6.891000] ? find_held_lock+0x3a/0x1c0 > [ 6.891000] ? lock_downgrade+0x5e0/0x5e0 > [ 6.891000] ? kmsg_dump_rewind_nolock+0xd9/0xd9 > [ 6.891000] ? 
_raw_spin_unlock_irqrestore+0x4b/0x60 > [ 6.891000] ? trace_hardirqs_on_caller+0x381/0x570 > [ 6.891000] ? efivar_ssdt_iter+0x1f4/0x1f4 > [ 6.891000] efisubsys_init+0x1be/0x4ae > [ 6.891000] ? kernfs_get.part.8+0x4c/0x60 > [ 6.891000] ? efivar_ssdt_iter+0x1f4/0x1f4 > [ 6.891000] ? __kernfs_create_file+0x235/0x2e0 > [ 6.891000] ? e
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote: > Can you please elaborate a bit more about the concept of subdomains? > From my point of view, an aux-domain is a normal un-managed domain which > has a PASID and could be attached to any ADIs through the aux-domain > specific attach/detach APIs. It could also be attached to a device with > the existing domain_attach/detach_device() APIs at the same time, hence > mdev and pci devices in a vfio container could share a domain. Okay, let's think a bit about having aux-specific attach/detach functions, in the end I don't insist on my proposal as long as the IOMMU-API extensions are clean, consistent, and have well defined semantics. If we have aux-domain specific attach/detach functions like iommu_aux_domain_attach/detach(), what happens when the primary domain of the device is changed with iommu_attach/detach()? 1) Will the aux-domains stay in place? If yes, how does this work in setups where the pasid-bound page-tables are guest-owned and translated through the primary domain page-tables? 2) Will the aux-domains be unbound too? In that case, if the primary domain is re-used, will the aux-domains be implicitly bound too when iommu_device_attach() is called? 3) Do we just disallow changing the primary domain through that interface as long as there are aux-domains or mm_structs bound to the device? Using option 2) or 3) would mean that the aux-domains are still bound to the primary domain, but that is not reflected in the API. Further, if an aux-domain is just a normal domain (struct iommu_domain), what happens when a domain that was used as a primary domain and has bound aux-domains to it, is bound with iommu_aux_domain_attach() to another domain? As you can see, this design decision raises a lot of questions, but maybe we can work it out and find a solution we all agree on. Regards, Joerg
Re: [PATCH] mm: Replace all open encodings for NUMA_NO_NODE
On 11/23/2018 04:06 PM, David Hildenbrand wrote: > On 23.11.18 10:54, Anshuman Khandual wrote: >> At present there are multiple places where invalid node number is encoded >> as -1. Even though implicitly understood it is always better to have macros >> in there. Replace these open encodings for an invalid node number with the >> global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like >> 'invalid node' from various places redirecting them to a common definition. >> >> Signed-off-by: Anshuman Khandual >> --- >> >> Changes in V1: >> >> - Dropped OCFS2 changes per Joseph >> - Dropped media/video drivers changes per Hans >> >> RFC - https://patchwork.kernel.org/patch/10678035/ >> >> Build tested this with multiple cross compiler options like alpha, sparc, >> arm64, x86, powerpc, powerpc64le etc with their default config which might >> not have compiled tested all driver related changes. I will appreciate >> folks giving this a test in their respective build environment. >> >> All these places for replacement were found by running the following grep >> patterns on the entire kernel code. Please let me know if this might have >> missed some instances. This might also have replaced some false positives. >> I will appreciate suggestions, inputs and review. >> >> 1. git grep "nid == -1" >> 2. git grep "node == -1" >> 3. git grep "nid = -1" >> 4. git grep "node = -1" > > Hopefully you found most users :) I hope so :) > > Did you check if some are encoded into function calls? f(-1, ...) Not really. Just wondering how do we even search for it. There might be higher level functions passing down -1 to core MM. If you have some instances in mind which need replacement I will accommodate them. > > Reviewed-by: David Hildenbrand Thanks for the review.
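For the `f(-1, ...)` cases David asks about, plain `git grep` is weak because it matches text, not call sites. Coccinelle matches structurally, so a semantic patch can both find and convert such callers. A sketch (the two function names below are only examples, not an exhaustive list, and the rule set would need one entry per node-taking function):

```
@@
expression E1, E2;
@@
(
- alloc_pages_node(-1, E1, E2)
+ alloc_pages_node(NUMA_NO_NODE, E1, E2)
|
- kmalloc_node(E1, E2, -1)
+ kmalloc_node(E1, E2, NUMA_NO_NODE)
)
```

Something like `spatch --sp-file numa_no_node.cocci --in-place <dir>` would then apply it treewide; the usual caveat is reviewing the diff, since a literal -1 argument is not always a node id.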
Re: remove the ->mapping_error method from dma_map_ops V2
On Fri, Nov 23, 2018 at 11:49:18AM +0100, Joerg Roedel wrote: > On Thu, Nov 22, 2018 at 05:52:15PM +, Robin Murphy wrote: > > Unfortunately, with things like the top-down IOVA allocator, and 32-bit > > systems in general, "the top 4095" values may well still be valid addresses > > - we're relying on a 1-byte mapping of the very top byte of memory/IOVA > > space being sufficiently ridiculous that no real code would ever do that, > > but even a 4-byte mapping of the top 4 bytes is within the realms of the > > plausible (I've definitely seen the USB layer make 8-byte mappings from any > > old offset within a page, for example). > > But we can easily work around that by reserving the top 4k of the first > 4GB of IOVA address space in the allocator, no? Then these values are > never returned as valid DMA handles. Yuck. So, if we have a 4GB non-PAE 32-bit system, or a PAE system where we have valid memory across the 4GB boundary and no IOMMU, we have to reserve the top 4K page in the first 4GB of RAM? Linus' IS_DMA_ERR() solution is way more preferable, but unfortunately it still falls short, because it knocks out the top 4K of every DMA capable bus. A DMA capable bus may involve some translation of the address (eg, by simple offsetting) which means that we'd need to potentially knock out multiple pages from the page allocator for each of those pages that correspond to the top 4K of any DMA capable bus. Knowing that at the right time at boot will be fun! It also sounds wasteful to me. Rather than inventing magic cookies like this, I'd much rather we sanitised the API so that we have functions that return success or an error code, rather than trying to shoe-horn some kind of magic error codes into dma_addr_t and subtly break systems in the process. 
-- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH v4 4/8] iommu/vt-d: Attach/detach domains in auxiliary mode
Hi Lu, On 11/5/18 8:34 AM, Lu Baolu wrote: > When multiple domains per device has been enabled by the > device driver, the device will tag the default PASID for > the domain to all DMA traffics out of the subset of this > device; and the IOMMU should translate the DMA requests > in PASID granularity. > > This extends the intel_iommu_attach/detach_device() ops > to support managing PASID granular translation structures > when the device driver has enabled multiple domains per > device. > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Signed-off-by: Sanjay Kumar > Signed-off-by: Lu Baolu > Signed-off-by: Liu Yi L > --- > drivers/iommu/intel-iommu.c | 192 +++- > include/linux/intel-iommu.h | 10 ++ > 2 files changed, 180 insertions(+), 22 deletions(-) > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 2c86ac71c774..a61b25ad0d3b 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -2477,6 +2477,7 @@ static struct dmar_domain > *dmar_insert_one_dev_info(struct intel_iommu *iommu, > info->iommu = iommu; > info->pasid_table = NULL; > info->auxd_enabled = 0; > + INIT_LIST_HEAD(&info->auxiliary_domains); > > if (dev && dev_is_pci(dev)) { > struct pci_dev *pdev = to_pci_dev(info->dev); > @@ -5010,35 +5011,134 @@ static void intel_iommu_domain_free(struct > iommu_domain *domain) > domain_exit(to_dmar_domain(domain)); > } > > -static int intel_iommu_attach_device(struct iommu_domain *domain, > - struct device *dev) > +/* > + * Check whether a @domain will be attached to the @dev in the > + * auxiliary mode. 
> + */ > +static inline bool > +is_device_attach_aux_domain(struct device *dev, struct iommu_domain *domain) > { > - struct dmar_domain *dmar_domain = to_dmar_domain(domain); > - struct intel_iommu *iommu; > - int addr_width; > - u8 bus, devfn; > + struct device_domain_info *info = dev->archdata.iommu; > > - if (device_is_rmrr_locked(dev)) { > - dev_warn(dev, "Device is ineligible for IOMMU domain attach due > to platform RMRR requirement. Contact your platform vendor.\n"); > - return -EPERM; > - } > + return info && info->auxd_enabled && > + domain->type == IOMMU_DOMAIN_UNMANAGED; > +} > > - /* normally dev is not mapped */ > - if (unlikely(domain_context_mapped(dev))) { > - struct dmar_domain *old_domain; > +static void auxiliary_link_device(struct dmar_domain *domain, > + struct device *dev) > +{ > + struct device_domain_info *info = dev->archdata.iommu; > > - old_domain = find_domain(dev); > - if (old_domain) { > - rcu_read_lock(); > - dmar_remove_one_dev_info(old_domain, dev); > - rcu_read_unlock(); > + assert_spin_locked(&device_domain_lock); > + if (WARN_ON(!info)) > + return; > > - if (!domain_type_is_vm_or_si(old_domain) && > - list_empty(&old_domain->devices)) > - domain_exit(old_domain); > + domain->auxd_refcnt++; > + list_add(&domain->auxd, &info->auxiliary_domains); > +} > + > +static void auxiliary_unlink_device(struct dmar_domain *domain, > + struct device *dev) > +{ > + struct device_domain_info *info = dev->archdata.iommu; > + > + assert_spin_locked(&device_domain_lock); > + if (WARN_ON(!info)) > + return; > + > + list_del(&domain->auxd); > + domain->auxd_refcnt--; > + > + if (!domain->auxd_refcnt && domain->default_pasid > 0) > + intel_pasid_free_id(domain->default_pasid); > +} > + > +static int domain_add_dev_auxd(struct dmar_domain *domain, > +struct device *dev) > +{ > + int ret; > + u8 bus, devfn; > + unsigned long flags; > + struct intel_iommu *iommu; > + > + iommu = device_to_iommu(dev, &bus, &devfn); > + if (!iommu) > + return -ENODEV; > + 
> + spin_lock_irqsave(&device_domain_lock, flags); > + if (domain->default_pasid <= 0) { > + domain->default_pasid = intel_pasid_alloc_id(domain, PASID_MIN, > + pci_max_pasids(to_pci_dev(dev)), GFP_ATOMIC); > + if (domain->default_pasid < 0) { > + pr_err("Can't allocate default pasid\n"); > + ret = -ENODEV; > + goto pasid_failed; > } > } > > + spin_lock(&iommu->lock); You may comment your nested lock policy somewhere. > + ret = domain_attach_iommu(domain, iommu); > + if (ret) > + goto attach_failed; > + > + /* Setup the PASID entry for mediated devices: */ > + ret = intel_pasid_setup_second_level(iommu, domain, dev, > +
Re: [PATCH v4 1/8] iommu: Add APIs for multiple domains per device
Hi Lu, On 11/5/18 8:34 AM, Lu Baolu wrote: > Sharing a physical PCI device in a finer-granularity way > is becoming a consensus in the industry. IOMMU vendors > are also engaging efforts to support such sharing as well > as possible. Among the efforts, the capability of support > finer-granularity DMA isolation is a common requirement > due to the security consideration. With finer-granularity > DMA isolation, all DMA requests out of or to a subset of > a physical PCI device can be protected by the IOMMU. As a > result, there is a request in software to attach multiple > domains to a physical PCI device. One example of such use > model is the Intel Scalable IOV [1] [2]. The Intel vt-d > 3.0 spec [3] introduces the scalable mode which enables > PASID granularity DMA isolation. > > This adds the APIs to support multiple domains per device. > In order to ease the discussions, we call it 'a domain in > auxiliary mode' or simply 'auxiliary domain' when multiple > domains are attached to a physical device. > > The APIs includes: > > * iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY) > - Represents the ability of supporting multiple domains > per device. > > * iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLED) > - Checks whether the device identified by @dev is working > in auxiliary mode. > > * iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE) > - Enables the multiple domains capability for the device > referenced by @dev. > > * iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE) > - Disables the multiple domains capability for the device > referenced by @dev. > > * iommu_attach_device_aux(domain, dev) > - Attaches @domain to @dev in the auxiliary mode. Multiple > domains could be attached to a single device in the > auxiliary mode with each domain representing an isolated > address space for an assignable subset of the device. > > * iommu_detach_device_aux(domain, dev) > - Detach @domain which has been attached to @dev in the > auxiliary mode. 
> > * iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID) > - Return ID used for finer-granularity DMA translation. > For the Intel Scalable IOV usage model, this will be > a PASID. The device which supports Scalalbe IOV needs s/Scalalbe/Scalable > to writes this ID to the device register so that DMA s/writes/write > requests could be tagged with a right PASID prefix. This is not crystal clear to me as the intel implementation returns the default PASID and not the PASID of the aux domain. > > Many people involved in discussions of this design. > > Kevin Tian > Liu Yi L > Ashok Raj > Sanjay Kumar > Jacob Pan > Alex Williamson > Jean-Philippe Brucker > > and some discussions can be found here [4]. > > [1] > https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification > [2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf > [3] > https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification > [4] https://lkml.org/lkml/2018/7/26/4 > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Cc: Liu Yi L > Suggested-by: Kevin Tian > Suggested-by: Jean-Philippe Brucker > Signed-off-by: Lu Baolu > --- > drivers/iommu/iommu.c | 52 +++ > include/linux/iommu.h | 52 +++ > 2 files changed, 104 insertions(+) > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index edbdf5d6962c..0b7c96d1425e 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -2030,3 +2030,55 @@ int iommu_fwspec_add_ids(struct device *dev, u32 *ids, > int num_ids) > return 0; > } > EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids); > + > +/* > + * Generic interfaces to get or set per device IOMMU attributions. 
> + */ > +int iommu_get_dev_attr(struct device *dev, enum iommu_dev_attr attr, void > *data) > +{ > + const struct iommu_ops *ops = dev->bus->iommu_ops; > + > + if (ops && ops->get_dev_attr) > + return ops->get_dev_attr(dev, attr, data); > + > + return -EINVAL; > +} > +EXPORT_SYMBOL_GPL(iommu_get_dev_attr); > + > +int iommu_set_dev_attr(struct device *dev, enum iommu_dev_attr attr, void > *data) > +{ > + const struct iommu_ops *ops = dev->bus->iommu_ops; > + > + if (ops && ops->set_dev_attr) > + return ops->set_dev_attr(dev, attr, data); > + > + return -EINVAL; > +} > +EXPORT_SYMBOL_GPL(iommu_set_dev_attr); > + > +/* > + * APIs to attach/detach a domain to/from a device in the > + * auxiliary mode. > + */ > +int iommu_attach_device_aux(struct iommu_domain *domain, struct device *dev) > +{ > + int ret = -ENODEV; > + > + if (domain->ops->attach_dev_aux) > + ret = domain->ops->attach_dev_aux(domain, dev); > + > + if (!ret) > + trace_attach_device_to_domain(dev); > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(iommu_atta
Re: [PATCH v4 2/8] iommu/vt-d: Add multiple domains per device query
Hi, On 11/5/18 8:34 AM, Lu Baolu wrote: > Add the response to IOMMU_DEV_ATTR_AUXD_CAPABILITY capability query > through iommu_get_dev_attr(). commit title: Advertise auxiliary domain capability? > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Signed-off-by: Lu Baolu > Signed-off-by: Liu Yi L > --- > drivers/iommu/intel-iommu.c | 38 + > 1 file changed, 38 insertions(+) > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 5e149d26ea9b..298f7a3fafe8 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -5167,6 +5167,24 @@ static phys_addr_t intel_iommu_iova_to_phys(struct > iommu_domain *domain, > return phys; > } > > +static inline bool scalable_mode_support(void) > +{ > + struct dmar_drhd_unit *drhd; > + struct intel_iommu *iommu; > + bool ret = true; > + > + rcu_read_lock(); > + for_each_active_iommu(iommu, drhd) { > + if (!sm_supported(iommu)) { > + ret = false; > + break; > + } > + } > + rcu_read_unlock(); > + > + return ret; > +} > + > static bool intel_iommu_capable(enum iommu_cap cap) > { > if (cap == IOMMU_CAP_CACHE_COHERENCY) > @@ -5331,6 +5349,25 @@ struct intel_iommu *intel_svm_device_to_iommu(struct > device *dev) > } > #endif /* CONFIG_INTEL_IOMMU_SVM */ > > +static int intel_iommu_get_dev_attr(struct device *dev, > + enum iommu_dev_attr attr, void *data) > +{ > + int ret = 0; > + bool *auxd_capable; nit: could be local to the case as other cases may use other datatypes. 
> + > + switch (attr) { > + case IOMMU_DEV_ATTR_AUXD_CAPABILITY: > + auxd_capable = data; > + *auxd_capable = scalable_mode_support(); > + break; > + default: > + ret = -EINVAL; > + break; > + } > + > + return ret; > +} > + > const struct iommu_ops intel_iommu_ops = { > .capable= intel_iommu_capable, > .domain_alloc = intel_iommu_domain_alloc, > @@ -5345,6 +5382,7 @@ const struct iommu_ops intel_iommu_ops = { > .get_resv_regions = intel_iommu_get_resv_regions, > .put_resv_regions = intel_iommu_put_resv_regions, > .device_group = pci_device_group, > + .get_dev_attr = intel_iommu_get_dev_attr, > .pgsize_bitmap = INTEL_IOMMU_PGSIZES, > }; > > Thanks Eric
Re: remove the ->mapping_error method from dma_map_ops V2
On Thu, Nov 22, 2018 at 05:52:15PM +, Robin Murphy wrote: > Unfortunately, with things like the top-down IOVA allocator, and 32-bit > systems in general, "the top 4095" values may well still be valid addresses > - we're relying on a 1-byte mapping of the very top byte of memory/IOVA > space being sufficiently ridiculous that no real code would ever do that, > but even a 4-byte mapping of the top 4 bytes is within the realms of the > plausible (I've definitely seen the USB layer make 8-byte mappings from any > old offset within a page, for example). But we can easily work around that by reserving the top 4k of the first 4GB of IOVA address space in the allocator, no? Then these values are never returned as valid DMA handles. Regards, Joerg
Re: [PATCH] mm: Replace all open encodings for NUMA_NO_NODE
On 23.11.18 10:54, Anshuman Khandual wrote: > At present there are multiple places where invalid node number is encoded > as -1. Even though implicitly understood it is always better to have macros > in there. Replace these open encodings for an invalid node number with the > global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like > 'invalid node' from various places redirecting them to a common definition. > > Signed-off-by: Anshuman Khandual > --- > > Changes in V1: > > - Dropped OCFS2 changes per Joseph > - Dropped media/video drivers changes per Hans > > RFC - https://patchwork.kernel.org/patch/10678035/ > > Build tested this with multiple cross compiler options like alpha, sparc, > arm64, x86, powerpc, powerpc64le etc with their default config which might > not have compiled tested all driver related changes. I will appreciate > folks giving this a test in their respective build environment. > > All these places for replacement were found by running the following grep > patterns on the entire kernel code. Please let me know if this might have > missed some instances. This might also have replaced some false positives. > I will appreciate suggestions, inputs and review. > > 1. git grep "nid == -1" > 2. git grep "node == -1" > 3. git grep "nid = -1" > 4. git grep "node = -1" Hopefully you found most users :) Did you check if some are encoded into function calls? f(-1, ...) Reviewed-by: David Hildenbrand -- Thanks, David / dhildenb
[PATCH] mm: Replace all open encodings for NUMA_NO_NODE
At present there are multiple places where invalid node number is encoded as -1. Even though implicitly understood it is always better to have macros in there. Replace these open encodings for an invalid node number with the global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like 'invalid node' from various places redirecting them to a common definition. Signed-off-by: Anshuman Khandual --- Changes in V1: - Dropped OCFS2 changes per Joseph - Dropped media/video drivers changes per Hans RFC - https://patchwork.kernel.org/patch/10678035/ Build tested this with multiple cross compiler options like alpha, sparc, arm64, x86, powerpc, powerpc64le etc with their default config which might not have compiled tested all driver related changes. I will appreciate folks giving this a test in their respective build environment. All these places for replacement were found by running the following grep patterns on the entire kernel code. Please let me know if this might have missed some instances. This might also have replaced some false positives. I will appreciate suggestions, inputs and review. 1. git grep "nid == -1" 2. git grep "node == -1" 3. git grep "nid = -1" 4. 
git grep "node = -1" arch/alpha/include/asm/topology.h | 2 +- arch/ia64/kernel/numa.c | 2 +- arch/ia64/mm/discontig.c | 6 +++--- arch/ia64/sn/kernel/io_common.c | 2 +- arch/powerpc/include/asm/pci-bridge.h | 2 +- arch/powerpc/kernel/paca.c| 2 +- arch/powerpc/kernel/pci-common.c | 2 +- arch/powerpc/mm/numa.c| 14 +++--- arch/powerpc/platforms/powernv/memtrace.c | 4 ++-- arch/sparc/kernel/auxio_32.c | 2 +- arch/sparc/kernel/pci_fire.c | 2 +- arch/sparc/kernel/pci_schizo.c| 2 +- arch/sparc/kernel/pcic.c | 6 +++--- arch/sparc/kernel/psycho_common.c | 2 +- arch/sparc/kernel/sbus.c | 2 +- arch/sparc/mm/init_64.c | 6 +++--- arch/sparc/prom/init_32.c | 2 +- arch/sparc/prom/init_64.c | 4 ++-- arch/sparc/prom/tree_32.c | 12 ++-- arch/sparc/prom/tree_64.c | 18 +- arch/x86/include/asm/pci.h| 2 +- arch/x86/kernel/apic/x2apic_uv_x.c| 6 +++--- arch/x86/kernel/smpboot.c | 2 +- arch/x86/platform/olpc/olpc_dt.c | 16 drivers/block/mtip32xx/mtip32xx.c | 4 ++-- drivers/dma/dmaengine.c | 3 ++- drivers/infiniband/hw/hfi1/affinity.c | 2 +- drivers/infiniband/hw/hfi1/init.c | 2 +- drivers/iommu/dmar.c | 4 ++-- drivers/iommu/intel-iommu.c | 2 +- drivers/misc/sgi-xp/xpc_uv.c | 2 +- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++-- init/init_task.c | 2 +- kernel/kthread.c | 2 +- kernel/sched/fair.c | 15 --- lib/cpumask.c | 2 +- mm/huge_memory.c | 12 ++-- mm/hugetlb.c | 2 +- mm/ksm.c | 2 +- mm/memory.c | 6 +++--- mm/memory_hotplug.c | 12 ++-- mm/mempolicy.c| 2 +- mm/page_alloc.c | 4 ++-- mm/page_ext.c | 2 +- net/core/pktgen.c | 2 +- net/qrtr/qrtr.c | 2 +- tools/perf/bench/numa.c | 6 +++--- 47 files changed, 109 insertions(+), 107 deletions(-) diff --git a/arch/alpha/include/asm/topology.h b/arch/alpha/include/asm/topology.h index e6e13a8..f6dc89c 100644 --- a/arch/alpha/include/asm/topology.h +++ b/arch/alpha/include/asm/topology.h @@ -29,7 +29,7 @@ static const struct cpumask *cpumask_of_node(int node) { int cpu; - if (node == -1) + if (node == NUMA_NO_NODE) return cpu_all_mask; 
cpumask_clear(&node_to_cpumask_map[node]); diff --git a/arch/ia64/kernel/numa.c b/arch/ia64/kernel/numa.c index 92c3762..1315da6 100644 --- a/arch/ia64/kernel/numa.c +++ b/arch/ia64/kernel/numa.c @@ -74,7 +74,7 @@ void __init build_cpu_to_node_map(void) cpumask_clear(&node_to_cpu_mask[node]); for_each_possible_early_cpu(cpu) { - node = -1; + node = NUMA_NO_NODE; for (i = 0; i < NR_CPUS; ++i) if (cpu_physical_id(cpu) == node_cpuid[i].phys_id) { node = node_cpuid[i].nid; diff --git a/arch/i
Re: [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant
Hi Tomasz, On Fri, Nov 23, 2018 at 2:52 PM Tomasz Figa wrote: > > Hi Vivek, Will, > > On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam > wrote: > > > > Hi Will, > > > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > > > > > [+Thor] > > > > > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote: > > > > qcom,smmu-v2 is an arm,smmu-v2 implementation with specific > > > > clock and power requirements. > > > > On msm8996, multiple cores, viz. mdss, video, etc. use this > > > > smmu. On sdm845, this smmu is used with gpu. > > > > Add bindings for the same. > > > > > > > > Signed-off-by: Vivek Gautam > > > > Reviewed-by: Rob Herring > > > > Reviewed-by: Tomasz Figa > > > > Tested-by: Srinivas Kandagatla > > > > Reviewed-by: Robin Murphy > > > > --- > > > > drivers/iommu/arm-smmu.c | 13 + > > > > 1 file changed, 13 insertions(+) > > > > > > > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > > > > index 2098c3141f5f..d315ca637097 100644 > > > > --- a/drivers/iommu/arm-smmu.c > > > > +++ b/drivers/iommu/arm-smmu.c > > > > @@ -120,6 +120,7 @@ enum arm_smmu_implementation { > > > > GENERIC_SMMU, > > > > ARM_MMU500, > > > > CAVIUM_SMMUV2, > > > > + QCOM_SMMUV2, > > > > }; > > > > > > > > struct arm_smmu_s2cr { > > > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, > > > > GENERIC_SMMU); > > > > ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); > > > > ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); > > > > > > > > +static const char * const qcom_smmuv2_clks[] = { > > > > + "bus", "iface", > > > > +}; > > > > + > > > > +static const struct arm_smmu_match_data qcom_smmuv2 = { > > > > + .version = ARM_SMMU_V2, > > > > + .model = QCOM_SMMUV2, > > > > + .clks = qcom_smmuv2_clks, > > > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), > > > > +}; > > > > > > These seems redundant if we go down the route proposed by Thor, where we > > > just pull all of the clocks out of the device-tree. 
In which case, why > > do we need this match_data at all? > > Which is better? Driver relying solely on the device tree to tell > > which all clocks > > are required to be enabled, > > or, the driver deciding itself based on the platform's match data, > > that it should > > have X, Y, & Z clocks that should be supplied from the device tree. > > The former would simplify the driver, but would also make it > impossible to spot mistakes in DT, which would ultimately surface out > as very hard to debug bugs (likely complete system lockups). Thanks. Yea, this is how I understand things presently. Relying on device tree puts the things out of driver's control. Hi Will, Am I unable to understand the intentions here for Thor's clock-fetch design change? > > For qcom_smmuv2, I believe we're eventually going to end up with > platform-specific quirks anyway, so specifying the clocks too wouldn't > hurt. Given that, I'd recommend sticking to the latter, i.e. what this > patch does. > > Best regards, > Tomasz Best regards Vivek > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v2 09/17] debugobjects: Make object hash locks nestable terminal locks
On Thu 2018-11-22 15:17:52, Waiman Long wrote: > On 11/22/2018 10:33 AM, Petr Mladek wrote: > > On Mon 2018-11-19 13:55:18, Waiman Long wrote: > >> By making the object hash locks nestable terminal locks, we can avoid > >> a bunch of unnecessary lockdep validations as well as saving space > >> in the lockdep tables. > > Please, explain which terminal lock might be nested. > > So the db_lock is made to be nestable that it can allow acquisition of > pool_lock (a regular terminal lock) underneath it. Please, mention this in the commit message. Best Regards, Petr
Re: [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant
Hi Vivek, Will, On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam wrote: > > Hi Will, > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > > > [+Thor] > > > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote: > > > qcom,smmu-v2 is an arm,smmu-v2 implementation with specific > > > clock and power requirements. > > > On msm8996, multiple cores, viz. mdss, video, etc. use this > > > smmu. On sdm845, this smmu is used with gpu. > > > Add bindings for the same. > > > > > > Signed-off-by: Vivek Gautam > > > Reviewed-by: Rob Herring > > > Reviewed-by: Tomasz Figa > > > Tested-by: Srinivas Kandagatla > > > Reviewed-by: Robin Murphy > > > --- > > > drivers/iommu/arm-smmu.c | 13 + > > > 1 file changed, 13 insertions(+) > > > > > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > > > index 2098c3141f5f..d315ca637097 100644 > > > --- a/drivers/iommu/arm-smmu.c > > > +++ b/drivers/iommu/arm-smmu.c > > > @@ -120,6 +120,7 @@ enum arm_smmu_implementation { > > > GENERIC_SMMU, > > > ARM_MMU500, > > > CAVIUM_SMMUV2, > > > + QCOM_SMMUV2, > > > }; > > > > > > struct arm_smmu_s2cr { > > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, > > > GENERIC_SMMU); > > > ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); > > > ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); > > > > > > +static const char * const qcom_smmuv2_clks[] = { > > > + "bus", "iface", > > > +}; > > > + > > > +static const struct arm_smmu_match_data qcom_smmuv2 = { > > > + .version = ARM_SMMU_V2, > > > + .model = QCOM_SMMUV2, > > > + .clks = qcom_smmuv2_clks, > > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), > > > +}; > > > > These seems redundant if we go down the route proposed by Thor, where we > > just pull all of the clocks out of the device-tree. In which case, why > > do we need this match_data at all? > > Which is better? 
Driver relying solely on the device tree to tell > which all clocks > are required to be enabled, > or, the driver deciding itself based on the platform's match data, > that it should > have X, Y, & Z clocks that should be supplied from the device tree. The former would simplify the driver, but would also make it impossible to spot mistakes in DT, which would ultimately surface out as very hard to debug bugs (likely complete system lockups). For qcom_smmuv2, I believe we're eventually going to end up with platform-specific quirks anyway, so specifying the clocks too wouldn't hurt. Given that, I'd recommend sticking to the latter, i.e. what this patch does. Best regards, Tomasz
Re: [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant
Hi Will, On Wed, Nov 21, 2018 at 11:09 PM Will Deacon wrote: > > [+Thor] > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote: > > qcom,smmu-v2 is an arm,smmu-v2 implementation with specific > > clock and power requirements. > > On msm8996, multiple cores, viz. mdss, video, etc. use this > > smmu. On sdm845, this smmu is used with gpu. > > Add bindings for the same. > > > > Signed-off-by: Vivek Gautam > > Reviewed-by: Rob Herring > > Reviewed-by: Tomasz Figa > > Tested-by: Srinivas Kandagatla > > Reviewed-by: Robin Murphy > > --- > > drivers/iommu/arm-smmu.c | 13 + > > 1 file changed, 13 insertions(+) > > > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > > index 2098c3141f5f..d315ca637097 100644 > > --- a/drivers/iommu/arm-smmu.c > > +++ b/drivers/iommu/arm-smmu.c > > @@ -120,6 +120,7 @@ enum arm_smmu_implementation { > > GENERIC_SMMU, > > ARM_MMU500, > > CAVIUM_SMMUV2, > > + QCOM_SMMUV2, > > }; > > > > struct arm_smmu_s2cr { > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, > > GENERIC_SMMU); > > ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); > > ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); > > > > +static const char * const qcom_smmuv2_clks[] = { > > + "bus", "iface", > > +}; > > + > > +static const struct arm_smmu_match_data qcom_smmuv2 = { > > + .version = ARM_SMMU_V2, > > + .model = QCOM_SMMUV2, > > + .clks = qcom_smmuv2_clks, > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), > > +}; > > These seems redundant if we go down the route proposed by Thor, where we > just pull all of the clocks out of the device-tree. In which case, why > do we need this match_data at all? Which is better? Driver relying solely on the device tree to tell which all clocks are required to be enabled, or, the driver deciding itself based on the platform's match data, that it should have X, Y, & Z clocks that should be supplied from the device tree. 
Thanks Vivek > > Will > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v5 0/7] Add virtio-iommu driver
Hi Jean, On 11/22/18 8:37 PM, Jean-Philippe Brucker wrote: > Implement the virtio-iommu driver, following specification v0.9 [1]. > > Since v4 [2] I fixed the issues reported by Eric, and added Reviewed-by > from Eric and Rob. Thanks! > > I changed the specification to fix one inconsistency discussed in v4. > That the device fills the probe buffer with zeroes is now a "SHOULD" > instead of a "MAY", since it's the only way for the driver to know if > the device wrote the status. Existing devices already do this. In > addition the device now needs to fill the three padding bytes at the > tail with zeroes. > > You can find Linux driver and kvmtool device on branches > virtio-iommu/v0.9 [3]. I also lightly tested with Eric's latest QEMU > device [4]. > > [1] Virtio-iommu specification v0.9, sources, pdf and diff from v0.8 > git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9 > http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf > > http://jpbrucker.net/virtio-iommu/spec/diffs/virtio-iommu-pdf-diff-v0.8-v0.9.pdf > > [2] [PATCH v4 0/7] Add virtio-iommu driver > > https://lists.linuxfoundation.org/pipermail/iommu/2018-November/031074.html > > [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9 > git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9 > > [4] [RFC v9 00/17] VIRTIO-IOMMU device > https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html > > Jean-Philippe Brucker (7): > dt-bindings: virtio-mmio: Add IOMMU description > dt-bindings: virtio: Add virtio-pci-iommu node > of: Allow the iommu-map property to omit untranslated devices > PCI: OF: Initialize dev->fwnode appropriately > iommu: Add virtio-iommu driver > iommu/virtio: Add probe request > iommu/virtio: Add event queue > > .../devicetree/bindings/virtio/iommu.txt | 66 + > .../devicetree/bindings/virtio/mmio.txt | 30 + > MAINTAINERS |7 + > drivers/iommu/Kconfig | 11 + > drivers/iommu/Makefile|1 + > drivers/iommu/virtio-iommu.c | 1157 + > drivers/of/base.c | 10 +- > 
drivers/pci/of.c |7 + > include/uapi/linux/virtio_ids.h |1 + > include/uapi/linux/virtio_iommu.h | 161 +++ > 10 files changed, 1448 insertions(+), 3 deletions(-) > create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt > create mode 100644 drivers/iommu/virtio-iommu.c > create mode 100644 include/uapi/linux/virtio_iommu.h > for the whole series Tested-by: Eric Auger Thanks Eric
Re: [PATCH v5 5/7] iommu: Add virtio-iommu driver
Hi Jean,

On 11/22/18 8:37 PM, Jean-Philippe Brucker wrote:
> The virtio IOMMU is a para-virtualized device, allowing to send IOMMU
> requests such as map/unmap over virtio transport without emulating page
> tables. This implementation handles ATTACH, DETACH, MAP and UNMAP
> requests.
>
> The bulk of the code transforms calls coming from the IOMMU API into
> corresponding virtio requests. Mappings are kept in an interval tree
> instead of page tables.
>
> Signed-off-by: Jean-Philippe Brucker

Reviewed-by: Eric Auger

Thanks

Eric

> ---
>  MAINTAINERS                       |   7 +
>  drivers/iommu/Kconfig             |  11 +
>  drivers/iommu/Makefile            |   1 +
>  drivers/iommu/virtio-iommu.c      | 916 ++
>  include/uapi/linux/virtio_ids.h   |   1 +
>  include/uapi/linux/virtio_iommu.h | 104
>  6 files changed, 1040 insertions(+)
>  create mode 100644 drivers/iommu/virtio-iommu.c
>  create mode 100644 include/uapi/linux/virtio_iommu.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1689dcfec800..3d8550c76f4a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -15946,6 +15946,13 @@ S: Maintained
>  F: drivers/virtio/virtio_input.c
>  F: include/uapi/linux/virtio_input.h
>
> +VIRTIO IOMMU DRIVER
> +M: Jean-Philippe Brucker
> +L: virtualizat...@lists.linux-foundation.org
> +S: Maintained
> +F: drivers/iommu/virtio-iommu.c
> +F: include/uapi/linux/virtio_iommu.h
> +
>  VIRTUAL BOX GUEST DEVICE DRIVER
>  M: Hans de Goede
>  M: Arnd Bergmann
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index bf2bbfa2a399..db5f2b8c23f5 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -464,4 +464,15 @@ config QCOM_IOMMU
>  	help
>  	  Support for IOMMU on certain Qualcomm SoCs.
>
> +config VIRTIO_IOMMU
> +	bool "Virtio IOMMU driver"
> +	depends on VIRTIO=y
> +	select IOMMU_API
> +	select INTERVAL_TREE
> +	select ARM_DMA_USE_IOMMU if ARM
> +	help
> +	  Para-virtualised IOMMU driver with virtio.
> +
> +	  Say Y here if you intend to run this kernel as a guest.
> +
>  endif # IOMMU_SUPPORT
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 5481e5fe1f95..bd7e55751d09 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -36,3 +36,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
>  obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
> +obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> new file mode 100644
> index ..7540dab9c8dc
> --- /dev/null
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -0,0 +1,916 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Virtio driver for the paravirtualized IOMMU
> + *
> + * Copyright (C) 2018 Arm Limited
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include
> +
> +#define MSI_IOVA_BASE	0x800
> +#define MSI_IOVA_LENGTH	0x10
> +
> +#define VIOMMU_REQUEST_VQ	0
> +#define VIOMMU_NR_VQS	1
> +
> +struct viommu_dev {
> +	struct iommu_device		iommu;
> +	struct device			*dev;
> +	struct virtio_device		*vdev;
> +
> +	struct ida			domain_ids;
> +
> +	struct virtqueue		*vqs[VIOMMU_NR_VQS];
> +	spinlock_t			request_lock;
> +	struct list_head		requests;
> +
> +	/* Device configuration */
> +	struct iommu_domain_geometry	geometry;
> +	u64				pgsize_bitmap;
> +	u8				domain_bits;
> +};
> +
> +struct viommu_mapping {
> +	phys_addr_t			paddr;
> +	struct interval_tree_node	iova;
> +	u32				flags;
> +};
> +
> +struct viommu_domain {
> +	struct iommu_domain		domain;
> +	struct viommu_dev		*viommu;
> +	struct mutex			mutex; /* protects viommu pointer */
> +	unsigned int			id;
> +
> +	spinlock_t			mappings_lock;
> +	struct rb_root_cached		mappings;
> +
> +	unsigned long			nr_endpoints;
> +};
> +
> +struct viommu_endpoint {
> +	struct viommu_dev		*viommu;
> +	struct viommu_domain		*vdomain;
> +};
> +
> +struct viommu_request {
> +	struct list_head		list;
> +	void				*writeback;
> +	unsigned int			write_offset;
> +	unsigned int			len;
> +	char
Re: [PATCH v3] drm/rockchip: update cursors asynchronously through atomic.
On Fri, 2018-11-23 at 11:27 +0900, Tomasz Figa wrote:
>
> The point here is not about setting and resetting the plane->fb
> pointer. It's about what happens inside drm_atomic_set_fb_for_plane().
>
> It calls drm_framebuffer_get() for the new fb and drm_framebuffer_put()
> for the old fb. As a result, if the fb changes, the old fb, which had
> its reference count incremented in the atomic commit that set it to the
> plane before, has its reference count decremented. Moreover, if the new
> reference count becomes 0, drm_framebuffer_put() will immediately free
> the buffer.
>
> Freeing a buffer when the hardware is still scanning out of it isn't a
> good idea, is it?

No, it's not. But the board I submitted the patch for doesn't have
anything like hot-swappable RAM. The RAM access is still going to work;
it might just display something it shouldn't, for example if that
framebuffer got reused by something else and filled with new data in
that very small window.

But yes, I agree the best solution would be not to release the buffer
until the next vblank. Perhaps a good solution would be for the DRM API
to have the concept of a deferred release, meaning the put() call would
just add the framebuffer to a list that the DRM core could walk during
the vblank. That might be better than every single driver trying to
work up a custom solution.

> The vc4 driver seems to be able to program the hardware to switch the
> scanout to the new buffer immediately:
>
> https://elixir.bootlin.com/linux/v4.20-rc3/source/drivers/gpu/drm/vc4/vc4_plane.c#L794
>
> Although I wonder if there isn't still a tiny race there - the hardware
> may have just started refilling the FIFO from the old address. Still,
> if the FIFO is small, the FIFO refill operation may be much shorter
> than it takes for the kernel code to actually free the buffer. Eric and
> Michael, could you confirm?
I don't have those boards anymore, and I don't have access to any
technical documentation on the GPU, so I can't really add much here.
Eric can probably provide the best information.

I submitted the patch because I was working on arm64 support for fun
and was becoming very annoyed by desktop lockups for long periods of
time in the desktop environment of my choice, due to the driver being
flooded with cursor animation updates. I sent the patch to Eric, who
was kind enough to review it and suggest some improvements.