Re: [PATCH] iommu/iova: Update cached node pointer when current node fails to get any free IOVA
Hi Robin,

On Mon, Jul 30, 2018 at 12:40 PM, Ganapatrao Kulkarni wrote:
> On Fri, Jul 27, 2018 at 9:48 PM, Robin Murphy wrote:
>> On 27/07/18 13:56, Ganapatrao Kulkarni wrote:
>> [...]
>
> did you get any chance to look into this issue?
> i am waiting for your suggestion/patch for this issue!

I got as far as [1], but I wasn't sure how much I liked it, since it
still seems a little invasive for such a specific case (plus I can't
remember if it's actually been debugged or not). I think in the end I
started wondering whether it's even worth bothering with the 32-bit
optimisation for PCIe devices - 4 extra bytes worth of TLP is surely a
lot less significant than every transaction taking up to 50% more bus
cycles was for legacy PCI.

>>> how about tracking the previous attempt to get a 32bit range iova and
>>> avoiding further attempts if it failed, then resuming attempts once a
>>> replenish happens.
>>> Created a patch for the same [2]
>>
>> Ooh, that's a much neater implementation of essentially the same
>> concept - now why couldn't I think of that? :)
>>
>> Looks like it should be possible to make it entirely self-contained
>> too, since alloc_iova() is in a position to both test and update the
>> flag based on the limit_pfn passed in.
>
> is the patch below any better? testing with this diff looks ok, shall
> i send a formal patch with your Acked-by?
>
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 83fe262..abb15d6 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -56,6 +56,7 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>  	iovad->granule = granule;
>  	iovad->start_pfn = start_pfn;
>  	iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad));
> +	iovad->free_32bit_pfns = true;
>  	iovad->flush_cb = NULL;
>  	iovad->fq = NULL;
>  	iovad->anchor.pfn_lo = iovad->anchor.pfn_hi = IOVA_ANCHOR;
> @@ -139,8 +140,10 @@ __cached_rbnode_delete_update(struct iova_domain *iovad, struct iova *free)
>  
>  	cached_iova = rb_entry(iovad->cached32_node, struct iova, node);
>  	if (free->pfn_hi < iovad->dma_32bit_pfn &&
> -	    free->pfn_lo >= cached_iova->pfn_lo)
> +	    free->pfn_lo >= cached_iova->pfn_lo) {
>  		iovad->cached32_node = rb_next(&free->node);
> +		iovad->free_32bit_pfns = true;
> +	}
>  
>  	cached_iova = rb_entry(iovad->cached_node, struct iova, node);
>  	if (free->pfn_lo >= cached_iova->pfn_lo)
> @@ -290,6 +293,10 @@ alloc_iova(struct iova_domain *iovad, unsigned long size,
>  	struct iova *new_iova;
>  	int ret;
>  
> +	if (limit_pfn < iovad->dma_32bit_pfn &&
> +	    !iovad->free_32bit_pfns)
> +		return NULL;
> +
>  	new_iova = alloc_iova_mem();
>  	if (!new_iova)
>  		return NULL;
> @@ -299,6 +306,8 @@ alloc_iova(struct iova_domain *iovad, unsigned long size,
>  
>  	if (ret) {
>  		free_iova_mem(new_iova);
> +		if (limit_pfn < iovad->dma_32bit_pfn)
> +			iovad->free_32bit_pfns = false;
>  		return NULL;
>  	}
>  
> diff --git a/include/linux/iova.h b/include/linux/iova.h
> index 928442d..3810ba9 100644
> --- a/include/linux/iova.h
> +++ b/include/linux/iova.h
> @@ -96,6 +96,7 @@ struct iova_domain {
>  					   flush-queues */
>  	atomic_t fq_timer_on;		/* 1 when timer is active, 0
>  					   when not */
> +	bool	free_32bit_pfns;
>  };
>  
>  static inline unsigned long iova_size(struct iova *iova)
> --
> 2.9.4

>> Robin.
>>> [2]
>>> https://github.com/gpkulkarni/linux/commit/e2343a3e1f55cdeb5694103dd354bcb881dc65c3
>>> note, the testing of this patch is in progress.

Robin.

[1] http://www.linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=a8e0e4af10ebebb3669750e05bf0028e5bd6afe8

>>> thanks
>>> Ganapat

> thanks
> Ganapat

thanks
Ganapat
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] PCI: call dma_debug_add_bus for pci_bus_type in common code
[+cc Joerg]

On Mon, Jul 30, 2018 at 09:38:42AM +0200, Christoph Hellwig wrote:
> There is nothing arch specific about PCI or dma-debug, so move this
> call to common code just after registering the bus type.

I assume that previously, even if the user set CONFIG_DMA_API_DEBUG=y,
we only got PCI DMA debug on powerpc, sh, and x86. And after this
patch, we'll get PCI DMA debug on *all* arches? If that's true, I'll
add a comment to that effect to the commit log, since that new
functionality might be of interest to other arches.

> Signed-off-by: Christoph Hellwig
> ---
>  arch/powerpc/kernel/dma.c | 3 ---
>  arch/sh/drivers/pci/pci.c | 2 --
>  arch/x86/kernel/pci-dma.c | 3 ---
>  drivers/pci/pci-driver.c  | 2 +-
>  4 files changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
> index 155170d70324..dbfc7056d7df 100644
> --- a/arch/powerpc/kernel/dma.c
> +++ b/arch/powerpc/kernel/dma.c
> @@ -357,9 +357,6 @@ EXPORT_SYMBOL_GPL(dma_get_required_mask);
>  
>  static int __init dma_init(void)
>  {
> -#ifdef CONFIG_PCI
> -	dma_debug_add_bus(&pci_bus_type);
> -#endif
>  #ifdef CONFIG_IBMVIO
>  	dma_debug_add_bus(&vio_bus_type);
>  #endif
> diff --git a/arch/sh/drivers/pci/pci.c b/arch/sh/drivers/pci/pci.c
> index e5b7437ab4af..8256626bc53c 100644
> --- a/arch/sh/drivers/pci/pci.c
> +++ b/arch/sh/drivers/pci/pci.c
> @@ -160,8 +160,6 @@ static int __init pcibios_init(void)
>  	for (hose = hose_head; hose; hose = hose->next)
>  		pcibios_scanbus(hose);
>  
> -	dma_debug_add_bus(&pci_bus_type);
> -
>  	pci_initialized = 1;
>  
>  	return 0;
> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index ab5d9dd668d2..43f58632f123 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -155,9 +155,6 @@ static int __init pci_iommu_init(void)
>  {
>  	struct iommu_table_entry *p;
>  
> -#ifdef CONFIG_PCI
> -	dma_debug_add_bus(&pci_bus_type);
> -#endif
>  	x86_init.iommu.iommu_init();
>  
>  	for (p = __iommu_table; p < __iommu_table_end; p++) {
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 6792292b5fc7..bef17c3fca67 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1668,7 +1668,7 @@ static int __init pci_driver_init(void)
>  	if (ret)
>  		return ret;
>  #endif
> -
> +	dma_debug_add_bus(&pci_bus_type);
>  	return 0;
>  }
>  postcore_initcall(pci_driver_init);
> --
> 2.18.0
Re: [PATCH] sparc: use generic dma_noncoherent_ops
Hi Christoph.

On Mon, Jul 30, 2018 at 06:17:23PM +0200, Christoph Hellwig wrote:
> Switch to the generic noncoherent direct mapping implementation.
>
> This removes the previous sync_single_for_device implementation, which
> looks bogus given that no syncing is happening in the similar but more
> important map_single case.
>
> Signed-off-by: Christoph Hellwig
> Acked-by: Sam Ravnborg
> ---
>  Makefile                             |   2 +-
>  arch/sparc/Kconfig                   |   2 +
>  arch/sparc/include/asm/dma-mapping.h |   5 +-
>  arch/sparc/kernel/ioport.c           | 193 +--
>  4 files changed, 36 insertions(+), 166 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 85f3481a56d6..8a3fd0c4a76e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -368,7 +368,7 @@ HOST_LOADLIBES := $(HOST_LFS_LIBS)
>  # Make variables (CC, etc...)
>  AS		= $(CROSS_COMPILE)as
>  LD		= $(CROSS_COMPILE)ld
> -CC		= $(CROSS_COMPILE)gcc
> +CC		= sparc64-linux-gnu-gcc-8
>  CPP		= $(CC) -E
>  AR		= $(CROSS_COMPILE)ar
>  NM		= $(CROSS_COMPILE)nm

This sneaked in by accident...

	Sam
[PATCH 20/20] powerpc/dma: remove dma_nommu_mmap_coherent
The remaining implementation for coherent caches is functionally
identical to the default provided in common code.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/include/asm/dma-mapping.h |  7 -------
 arch/powerpc/kernel/dma-iommu.c        |  1 -
 arch/powerpc/kernel/dma.c              | 13 -------------
 arch/powerpc/platforms/pseries/vio.c   |  1 -
 4 files changed, 22 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index 879c4efba785..e62e23aa3714 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -18,13 +18,6 @@
 #include
 #include
 
-/* Some dma direct funcs must be visible for use in other dma_ops */
-extern int dma_nommu_mmap_coherent(struct device *dev,
-				   struct vm_area_struct *vma,
-				   void *cpu_addr, dma_addr_t handle,
-				   size_t size, unsigned long attrs);
-
-
 static inline unsigned long device_to_mask(struct device *dev)
 {
 	if (dev->dma_mask && *dev->dma_mask)
diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index f9fe2080ceb9..bf5234e1f71b 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -114,7 +114,6 @@ int dma_iommu_mapping_error(struct device *dev, dma_addr_t dma_addr)
 struct dma_map_ops dma_iommu_ops = {
 	.alloc			= dma_iommu_alloc_coherent,
 	.free			= dma_iommu_free_coherent,
-	.mmap			= dma_nommu_mmap_coherent,
 	.map_sg			= dma_iommu_map_sg,
 	.unmap_sg		= dma_iommu_unmap_sg,
 	.dma_supported		= dma_iommu_dma_supported,
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 08b12cbd7abf..5b71c9d1b8cc 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -70,18 +70,6 @@ static void dma_nommu_free_coherent(struct device *dev, size_t size,
 	iommu_free_coherent(iommu, size, vaddr, dma_handle);
 }
 
-int dma_nommu_mmap_coherent(struct device *dev, struct vm_area_struct *vma,
-			    void *cpu_addr, dma_addr_t handle, size_t size,
-			    unsigned long attrs)
-{
-	unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
-
-	return remap_pfn_range(vma, vma->vm_start,
-			       pfn + vma->vm_pgoff,
-			       vma->vm_end - vma->vm_start,
-			       vma->vm_page_prot);
-}
-
 /* note: needs to be called arch_get_required_mask for dma-noncoherent.c */
 u64 arch_get_required_mask(struct device *dev)
 {
@@ -98,7 +86,6 @@ u64 arch_get_required_mask(struct device *dev)
 const struct dma_map_ops dma_nommu_ops = {
 	.alloc			= dma_nommu_alloc_coherent,
 	.free			= dma_nommu_free_coherent,
-	.mmap			= dma_nommu_mmap_coherent,
 	.map_sg			= dma_direct_map_sg,
 	.map_page		= dma_direct_map_page,
 	.get_required_mask	= arch_get_required_mask,
diff --git a/arch/powerpc/platforms/pseries/vio.c b/arch/powerpc/platforms/pseries/vio.c
index 49e04ec19238..51d564313bd0 100644
--- a/arch/powerpc/platforms/pseries/vio.c
+++ b/arch/powerpc/platforms/pseries/vio.c
@@ -618,7 +618,6 @@ static u64 vio_dma_get_required_mask(struct device *dev)
 static const struct dma_map_ops vio_dma_mapping_ops = {
 	.alloc             = vio_dma_iommu_alloc_coherent,
 	.free              = vio_dma_iommu_free_coherent,
-	.mmap		   = dma_nommu_mmap_coherent,
 	.map_sg            = vio_dma_iommu_map_sg,
 	.unmap_sg          = vio_dma_iommu_unmap_sg,
 	.map_page          = vio_dma_iommu_map_page,
--
2.18.0
[PATCH 18/20] powerpc/dma-noncoherent: use generic dma_noncoherent_ops
The generic dma-noncoherent code provides all that is needed by powerpc.

Note that the cache maintenance in the existing code is a bit odd as it
implements both the sync_to_device and sync_to_cpu callouts, but never
flushes caches when unmapping. This patch keeps both directions around,
which will lead to more flushing than the previous implementation.
Someone more familiar with the affected CPUs should eventually take a
look and optimize the cache flush handling if needed.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/Kconfig                   |  2 +-
 arch/powerpc/include/asm/dma-mapping.h | 29 ----------
 arch/powerpc/kernel/dma.c              | 59 +++---
 arch/powerpc/kernel/pci-common.c       |  5 ++-
 arch/powerpc/kernel/setup-common.c     |  4 ++
 arch/powerpc/mm/dma-noncoherent.c      | 52 +--
 arch/powerpc/platforms/44x/warp.c      |  2 +-
 arch/powerpc/platforms/Kconfig.cputype |  6 ++-
 8 files changed, 60 insertions(+), 99 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index bbfa6a8df4da..33c6017ffce6 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -129,7 +129,7 @@ config PPC
 	# Please keep this list sorted alphabetically.
 	#
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
-	select ARCH_HAS_DMA_SET_COHERENT_MASK
+	select ARCH_HAS_DMA_SET_COHERENT_MASK	if !NOT_COHERENT_CACHE
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index f0bf7ac2686c..879c4efba785 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -19,40 +19,11 @@
 #include
 
 /* Some dma direct funcs must be visible for use in other dma_ops */
-extern void *__dma_nommu_alloc_coherent(struct device *dev, size_t size,
-					dma_addr_t *dma_handle, gfp_t flag,
-					unsigned long attrs);
-extern void __dma_nommu_free_coherent(struct device *dev, size_t size,
-				      void *vaddr, dma_addr_t dma_handle,
-				      unsigned long attrs);
 extern int dma_nommu_mmap_coherent(struct device *dev,
 				   struct vm_area_struct *vma,
 				   void *cpu_addr, dma_addr_t handle,
 				   size_t size, unsigned long attrs);
 
-#ifdef CONFIG_NOT_COHERENT_CACHE
-/*
- * DMA-consistent mapping functions for PowerPCs that don't support
- * cache snooping. These allocate/free a region of uncached mapped
- * memory space for use with DMA devices. Alternatively, you could
- * allocate the space "normally" and use the cache management functions
- * to ensure it is consistent.
- */
-struct device;
-extern void __dma_sync(void *vaddr, size_t size, int direction);
-extern void __dma_sync_page(struct page *page, unsigned long offset,
-			    size_t size, int direction);
-extern unsigned long __dma_get_coherent_pfn(unsigned long cpu_addr);
-
-#else /* ! CONFIG_NOT_COHERENT_CACHE */
-/*
- * Cache coherent cores.
- */
-
-#define __dma_sync(addr, size, rw)		((void)0)
-#define __dma_sync_page(pg, off, sz, rw)	((void)0)
-
-#endif /* ! CONFIG_NOT_COHERENT_CACHE */
 
 static inline unsigned long device_to_mask(struct device *dev)
 {
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 2b90a403cdac..b2e88075b2ea 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -36,12 +36,7 @@ static void *dma_nommu_alloc_coherent(struct device *dev, size_t size,
 	 * we can really use the direct ops
 	 */
 	if (dma_direct_supported(dev, dev->coherent_dma_mask))
-#ifdef CONFIG_NOT_COHERENT_CACHE
-		return __dma_nommu_alloc_coherent(dev, size, dma_handle,
-						  flag, attrs);
-#else
 		return dma_direct_alloc(dev, size, dma_handle, flag, attrs);
-#endif
 
 	/* Ok we can't ... do we have an iommu ? If not, fail */
 	iommu = get_iommu_table_base(dev);
@@ -62,12 +57,7 @@ static void dma_nommu_free_coherent(struct device *dev, size_t size,
 
 	/* See comments in dma_nommu_alloc_coherent() */
 	if (dma_direct_supported(dev, dev->coherent_dma_mask))
-#ifdef CONFIG_NOT_COHERENT_CACHE
-		return __dma_nommu_free_coherent(dev, size, vaddr, dma_handle,
-						 attrs);
-#else
 		return dma_direct_free(dev, size, vaddr, dma_handle, attrs);
-#endif
 
 	/* Maybe we used an iommu ... */
 	iommu = get_iommu_table_base(dev);
@@ -84,14 +74,8 @@ int dma_nommu_mmap_coherent(struct device *dev, struct vm_area_struct *vma,
 			    void *cpu_addr, dma_addr_t handle, size_t size,
[PATCH 19/20] powerpc/dma: use the generic dma-direct map_page and map_sg routines
These are identical except for the additional error checking, so migrate
to the common code, and wire up the get_mapping_error method as well.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/kernel/dma.c | 32
 1 file changed, 4 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index b2e88075b2ea..08b12cbd7abf 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -82,21 +82,6 @@ int dma_nommu_mmap_coherent(struct device *dev, struct vm_area_struct *vma,
 			       vma->vm_page_prot);
 }
 
-static int dma_nommu_map_sg(struct device *dev, struct scatterlist *sgl,
-			    int nents, enum dma_data_direction direction,
-			    unsigned long attrs)
-{
-	struct scatterlist *sg;
-	int i;
-
-	for_each_sg(sgl, sg, nents, i) {
-		sg->dma_address = phys_to_dma(dev, sg_phys(sg));
-		sg->dma_length = sg->length;
-	}
-
-	return nents;
-}
-
 /* note: needs to be called arch_get_required_mask for dma-noncoherent.c */
 u64 arch_get_required_mask(struct device *dev)
 {
@@ -110,24 +95,15 @@ u64 arch_get_required_mask(struct device *dev)
 	return mask;
 }
 
-static inline dma_addr_t dma_nommu_map_page(struct device *dev,
-					    struct page *page,
-					    unsigned long offset,
-					    size_t size,
-					    enum dma_data_direction dir,
-					    unsigned long attrs)
-{
-	return phys_to_dma(dev, page_to_phys(page)) + offset;
-}
-
 const struct dma_map_ops dma_nommu_ops = {
 	.alloc			= dma_nommu_alloc_coherent,
 	.free			= dma_nommu_free_coherent,
 	.mmap			= dma_nommu_mmap_coherent,
-	.map_sg			= dma_nommu_map_sg,
-	.dma_supported		= dma_direct_supported,
-	.map_page		= dma_nommu_map_page,
+	.map_sg			= dma_direct_map_sg,
+	.map_page		= dma_direct_map_page,
 	.get_required_mask	= arch_get_required_mask,
+	.dma_supported		= dma_direct_supported,
+	.mapping_error		= dma_direct_mapping_error,
 };
 
 #ifndef CONFIG_NOT_COHERENT_CACHE
--
2.18.0
[PATCH 17/20] powerpc/dma-swiotlb: use generic swiotlb_dma_ops
These are identical to the arch specific ones, so remove them.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/include/asm/dma-direct.h |  4
 arch/powerpc/include/asm/swiotlb.h    |  2 --
 arch/powerpc/kernel/dma-swiotlb.c     | 28 ++-
 arch/powerpc/sysdev/fsl_pci.c         |  2 +-
 4 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-direct.h b/arch/powerpc/include/asm/dma-direct.h
index 0fba19445ae8..657f84ddb20d 100644
--- a/arch/powerpc/include/asm/dma-direct.h
+++ b/arch/powerpc/include/asm/dma-direct.h
@@ -30,4 +30,8 @@ static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr)
 		return daddr - PCI_DRAM_OFFSET;
 	return daddr - dev->archdata.dma_offset;
 }
+
+u64 swiotlb_powerpc_get_required(struct device *dev);
+#define swiotlb_get_required_mask swiotlb_powerpc_get_required
+
 #endif /* ASM_POWERPC_DMA_DIRECT_H */
diff --git a/arch/powerpc/include/asm/swiotlb.h b/arch/powerpc/include/asm/swiotlb.h
index f65ecf57b66c..1d8c1da26ab3 100644
--- a/arch/powerpc/include/asm/swiotlb.h
+++ b/arch/powerpc/include/asm/swiotlb.h
@@ -13,8 +13,6 @@
 #include
 
-extern const struct dma_map_ops powerpc_swiotlb_dma_ops;
-
 extern unsigned int ppc_swiotlb_enable;
 
 int __init swiotlb_setup_bus_notifier(void);
diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index 25986fcd1e5e..0c269de61f39 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -24,7 +24,7 @@
 
 unsigned int ppc_swiotlb_enable;
 
-static u64 swiotlb_powerpc_get_required(struct device *dev)
+u64 swiotlb_powerpc_get_required(struct device *dev)
 {
 	u64 end, mask, max_direct_dma_addr = dev->archdata.max_direct_dma_addr;
 
@@ -38,30 +38,6 @@ static u64 swiotlb_powerpc_get_required(struct device *dev)
 	return mask;
 }
 
-/*
- * At the moment, all platforms that use this code only require
- * swiotlb to be used if we're operating on HIGHMEM. Since
- * we don't ever call anything other than map_sg, unmap_sg,
- * map_page, and unmap_page on highmem, use normal dma_ops
- * for everything else.
- */
-const struct dma_map_ops powerpc_swiotlb_dma_ops = {
-	.alloc = dma_direct_alloc,
-	.free = dma_direct_free,
-	.mmap = dma_nommu_mmap_coherent,
-	.map_sg = swiotlb_map_sg_attrs,
-	.unmap_sg = swiotlb_unmap_sg_attrs,
-	.dma_supported = swiotlb_dma_supported,
-	.map_page = swiotlb_map_page,
-	.unmap_page = swiotlb_unmap_page,
-	.sync_single_for_cpu = swiotlb_sync_single_for_cpu,
-	.sync_single_for_device = swiotlb_sync_single_for_device,
-	.sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
-	.sync_sg_for_device = swiotlb_sync_sg_for_device,
-	.mapping_error = swiotlb_dma_mapping_error,
-	.get_required_mask = swiotlb_powerpc_get_required,
-};
-
 void pci_dma_dev_setup_swiotlb(struct pci_dev *pdev)
 {
 	struct pci_controller *hose;
@@ -88,7 +64,7 @@ static int ppc_swiotlb_bus_notify(struct notifier_block *nb,
 
 	/* May need to bounce if the device can't address all of DRAM */
 	if ((dma_get_mask(dev) + 1) < memblock_end_of_DRAM())
-		set_dma_ops(dev, &powerpc_swiotlb_dma_ops);
+		set_dma_ops(dev, &swiotlb_dma_ops);
 
 	return NOTIFY_DONE;
 }
diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 918be816b097..daf44bc0108d 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -118,7 +118,7 @@ static void setup_swiotlb_ops(struct pci_controller *hose)
 {
 	if (ppc_swiotlb_enable) {
 		hose->controller_ops.dma_dev_setup = pci_dma_dev_setup_swiotlb;
-		set_pci_dma_ops(&powerpc_swiotlb_dma_ops);
+		set_pci_dma_ops(&swiotlb_dma_ops);
 	}
 }
 #else
--
2.18.0
[PATCH 15/20] powerpc/dma: remove the unused unmap_page and unmap_sg methods
These methods are optional to start with, no need to implement no-op
versions.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/kernel/dma.c | 16
 1 file changed, 16 deletions(-)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 511a4972560d..2cfc45acbb52 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -178,12 +178,6 @@ static int dma_nommu_map_sg(struct device *dev, struct scatterlist *sgl,
 	return nents;
 }
 
-static void dma_nommu_unmap_sg(struct device *dev, struct scatterlist *sg,
-			       int nents, enum dma_data_direction direction,
-			       unsigned long attrs)
-{
-}
-
 static u64 dma_nommu_get_required_mask(struct device *dev)
 {
 	u64 end, mask;
@@ -209,14 +203,6 @@ static inline dma_addr_t dma_nommu_map_page(struct device *dev,
 	return phys_to_dma(dev, page_to_phys(page)) + offset;
 }
 
-static inline void dma_nommu_unmap_page(struct device *dev,
-					dma_addr_t dma_address,
-					size_t size,
-					enum dma_data_direction direction,
-					unsigned long attrs)
-{
-}
-
 #ifdef CONFIG_NOT_COHERENT_CACHE
 static inline void dma_nommu_sync_sg(struct device *dev,
 		struct scatterlist *sgl, int nents,
@@ -242,10 +228,8 @@ const struct dma_map_ops dma_nommu_ops = {
 	.free			= dma_nommu_free_coherent,
 	.mmap			= dma_nommu_mmap_coherent,
 	.map_sg			= dma_nommu_map_sg,
-	.unmap_sg		= dma_nommu_unmap_sg,
 	.dma_supported		= dma_direct_supported,
 	.map_page		= dma_nommu_map_page,
-	.unmap_page		= dma_nommu_unmap_page,
 	.get_required_mask	= dma_nommu_get_required_mask,
 #ifdef CONFIG_NOT_COHERENT_CACHE
 	.sync_single_for_cpu	= dma_nommu_sync_single,
--
2.18.0
[PATCH 16/20] powerpc/dma: use dma_direct_{alloc,free}
These provide the same functionality as the existing helpers, but do it
simpler, and also allow the (optional) use of CMA.

Note that the swiotlb code now calls into the dma_direct code directly,
given that it doesn't work with noncoherent caches at all, and isn't
called when we have an iommu either, so the iommu special case in
dma_nommu_alloc_coherent isn't required for swiotlb.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/include/asm/pgtable.h |  1 -
 arch/powerpc/kernel/dma-swiotlb.c  |  4 +-
 arch/powerpc/kernel/dma.c          | 78 --
 arch/powerpc/mm/mem.c              | 19
 4 files changed, 11 insertions(+), 91 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 14c79a7dc855..123de4958d2e 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -38,7 +38,6 @@ extern unsigned long empty_zero_page[];
 extern pgd_t swapper_pg_dir[];
 
 void limit_zone_pfn(enum zone_type zone, unsigned long max_pfn);
-int dma_pfn_limit_to_zone(u64 pfn_limit);
 extern void paging_init(void);
 
 /*
diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index f6e0701c5303..25986fcd1e5e 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -46,8 +46,8 @@ static u64 swiotlb_powerpc_get_required(struct device *dev)
  * for everything else.
  */
 const struct dma_map_ops powerpc_swiotlb_dma_ops = {
-	.alloc = __dma_nommu_alloc_coherent,
-	.free = __dma_nommu_free_coherent,
+	.alloc = dma_direct_alloc,
+	.free = dma_direct_free,
 	.mmap = dma_nommu_mmap_coherent,
 	.map_sg = swiotlb_map_sg_attrs,
 	.unmap_sg = swiotlb_unmap_sg_attrs,
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 2cfc45acbb52..2b90a403cdac 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -26,75 +26,6 @@
  * can set archdata.dma_data to an unsigned long holding the offset. By
  * default the offset is PCI_DRAM_OFFSET.
  */
-
-static u64 __maybe_unused get_pfn_limit(struct device *dev)
-{
-	u64 pfn = (dev->coherent_dma_mask >> PAGE_SHIFT) + 1;
-	struct dev_archdata __maybe_unused *sd = &dev->archdata;
-
-#ifdef CONFIG_SWIOTLB
-	if (sd->max_direct_dma_addr && dev->dma_ops == &powerpc_swiotlb_dma_ops)
-		pfn = min_t(u64, pfn, sd->max_direct_dma_addr >> PAGE_SHIFT);
-#endif
-
-	return pfn;
-}
-
-#ifndef CONFIG_NOT_COHERENT_CACHE
-void *__dma_nommu_alloc_coherent(struct device *dev, size_t size,
-				 dma_addr_t *dma_handle, gfp_t flag,
-				 unsigned long attrs)
-{
-	void *ret;
-	struct page *page;
-	int node = dev_to_node(dev);
-#ifdef CONFIG_FSL_SOC
-	u64 pfn = get_pfn_limit(dev);
-	int zone;
-
-	/*
-	 * This code should be OK on other platforms, but we have drivers that
-	 * don't set coherent_dma_mask. As a workaround we just ifdef it. This
-	 * whole routine needs some serious cleanup.
-	 */
-
-	zone = dma_pfn_limit_to_zone(pfn);
-	if (zone < 0) {
-		dev_err(dev, "%s: No suitable zone for pfn %#llx\n",
-			__func__, pfn);
-		return NULL;
-	}
-
-	switch (zone) {
-	case ZONE_DMA:
-		flag |= GFP_DMA;
-		break;
-#ifdef CONFIG_ZONE_DMA32
-	case ZONE_DMA32:
-		flag |= GFP_DMA32;
-		break;
-#endif
-	};
-#endif /* CONFIG_FSL_SOC */
-
-	page = alloc_pages_node(node, flag, get_order(size));
-	if (page == NULL)
-		return NULL;
-	ret = page_address(page);
-	memset(ret, 0, size);
-	*dma_handle = phys_to_dma(dev, __pa(ret));
-
-	return ret;
-}
-
-void __dma_nommu_free_coherent(struct device *dev, size_t size,
-			       void *vaddr, dma_addr_t dma_handle,
-			       unsigned long attrs)
-{
-	free_pages((unsigned long)vaddr, get_order(size));
-}
-#endif /* !CONFIG_NOT_COHERENT_CACHE */
-
 static void *dma_nommu_alloc_coherent(struct device *dev, size_t size,
 				      dma_addr_t *dma_handle, gfp_t flag,
 				      unsigned long attrs)
@@ -105,8 +36,12 @@ static void *dma_nommu_alloc_coherent(struct device *dev, size_t size,
 	 * we can really use the direct ops
 	 */
 	if (dma_direct_supported(dev, dev->coherent_dma_mask))
+#ifdef CONFIG_NOT_COHERENT_CACHE
 		return __dma_nommu_alloc_coherent(dev, size, dma_handle,
 						  flag, attrs);
+#else
+		return dma_direct_alloc(dev, size, dma_handle, flag, attrs);
+#endif
 
 	/* Ok we can't ... do we have an iommu ? If not, fail */
 	iommu = get_iommu_table_base(dev);
@@ -127,8 +62,13 @@ static void dma_nommu_free_co
[PATCH 14/20] powerpc/dma: replace dma_nommu_dma_supported with dma_direct_supported
The ppc32 case of dma_nommu_dma_supported already was a no-op, and the
64-bit case came to the same conclusion as dma_direct_supported, so
replace it with the generic version.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/Kconfig      |  1 +
 arch/powerpc/kernel/dma.c | 28 +++-
 2 files changed, 4 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index f9cae7edd735..bbfa6a8df4da 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -158,6 +158,7 @@ config PPC
 	select CLONE_BACKWARDS
 	select DCACHE_WORD_ACCESS	if PPC64 && CPU_LITTLE_ENDIAN
 	select DYNAMIC_FTRACE		if FUNCTION_TRACER
+	select DMA_DIRECT_OPS
 	select EDAC_ATOMIC_SCRUB
 	select EDAC_SUPPORT
 	select GENERIC_ATOMIC64		if PPC32
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 3487de83bb37..511a4972560d 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -40,28 +40,6 @@ static u64 __maybe_unused get_pfn_limit(struct device *dev)
 	return pfn;
 }
 
-static int dma_nommu_dma_supported(struct device *dev, u64 mask)
-{
-#ifdef CONFIG_PPC64
-	u64 limit = phys_to_dma(dev, (memblock_end_of_DRAM() - 1));
-
-	/* Limit fits in the mask, we are good */
-	if (mask >= limit)
-		return 1;
-
-#ifdef CONFIG_FSL_SOC
-	/* Freescale gets another chance via ZONE_DMA/ZONE_DMA32, however
-	 * that will have to be refined if/when they support iommus
-	 */
-	return 1;
-#endif
-	/* Sorry ... */
-	return 0;
-#else
-	return 1;
-#endif
-}
-
 #ifndef CONFIG_NOT_COHERENT_CACHE
 void *__dma_nommu_alloc_coherent(struct device *dev, size_t size,
 				 dma_addr_t *dma_handle, gfp_t flag,
@@ -126,7 +104,7 @@ static void *dma_nommu_alloc_coherent(struct device *dev, size_t size,
 	/* The coherent mask may be smaller than the real mask, check if
 	 * we can really use the direct ops
 	 */
-	if (dma_nommu_dma_supported(dev, dev->coherent_dma_mask))
+	if (dma_direct_supported(dev, dev->coherent_dma_mask))
 		return __dma_nommu_alloc_coherent(dev, size, dma_handle,
 						  flag, attrs);
 
@@ -148,7 +126,7 @@ static void dma_nommu_free_coherent(struct device *dev, size_t size,
 	struct iommu_table *iommu;
 
 	/* See comments in dma_nommu_alloc_coherent() */
-	if (dma_nommu_dma_supported(dev, dev->coherent_dma_mask))
+	if (dma_direct_supported(dev, dev->coherent_dma_mask))
 		return __dma_nommu_free_coherent(dev, size, vaddr, dma_handle,
 						 attrs);
 	/* Maybe we used an iommu ... */
@@ -265,7 +243,7 @@ const struct dma_map_ops dma_nommu_ops = {
 	.mmap			= dma_nommu_mmap_coherent,
 	.map_sg			= dma_nommu_map_sg,
 	.unmap_sg		= dma_nommu_unmap_sg,
-	.dma_supported		= dma_nommu_dma_supported,
+	.dma_supported		= dma_direct_supported,
 	.map_page		= dma_nommu_map_page,
 	.unmap_page		= dma_nommu_unmap_page,
 	.get_required_mask	= dma_nommu_get_required_mask,
--
2.18.0
[PATCH 13/20] powerpc/dma: remove get_dma_offset
Just fold the calculation into __phys_to_dma/__dma_to_phys as those
are the only places that should know about it.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/include/asm/dma-direct.h  |  8 ++--
 arch/powerpc/include/asm/dma-mapping.h | 16
 2 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-direct.h b/arch/powerpc/include/asm/dma-direct.h
index 7702875aabb7..0fba19445ae8 100644
--- a/arch/powerpc/include/asm/dma-direct.h
+++ b/arch/powerpc/include/asm/dma-direct.h
@@ -19,11 +19,15 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
 
 static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
-	return paddr + get_dma_offset(dev);
+	if (!dev)
+		return paddr + PCI_DRAM_OFFSET;
+	return paddr + dev->archdata.dma_offset;
 }
 
 static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr)
 {
-	return daddr - get_dma_offset(dev);
+	if (!dev)
+		return daddr - PCI_DRAM_OFFSET;
+	return daddr - dev->archdata.dma_offset;
 }
 #endif /* ASM_POWERPC_DMA_DIRECT_H */
diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index dacd0f93f2b2..f0bf7ac2686c 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -80,22 +80,6 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 	return NULL;
 }
 
-/*
- * get_dma_offset()
- *
- * Get the dma offset on configurations where the dma address can be determined
- * from the physical address by looking at a simple offset. Direct dma and
- * swiotlb use this function, but it is typically not used by implementations
- * with an iommu.
- */
-static inline dma_addr_t get_dma_offset(struct device *dev)
-{
-	if (dev)
-		return dev->archdata.dma_offset;
-
-	return PCI_DRAM_OFFSET;
-}
-
 static inline void set_dma_offset(struct device *dev, dma_addr_t off)
 {
 	if (dev)
--
2.18.0
[PATCH 12/20] powerpc/dma: use phys_to_dma instead of get_dma_offset
Use the standard portable helper instead of the powerpc specific one,
which is about to go away.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/kernel/dma-swiotlb.c |  5 ++---
 arch/powerpc/kernel/dma.c         | 12 ++--
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index 88f3963ca30f..f6e0701c5303 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -11,7 +11,7 @@
  *
  */
 
-#include
+#include
 #include
 #include
 #include
@@ -31,9 +31,8 @@ static u64 swiotlb_powerpc_get_required(struct device *dev)
 	end = memblock_end_of_DRAM();
 	if (max_direct_dma_addr && end > max_direct_dma_addr)
 		end = max_direct_dma_addr;
-	end += get_dma_offset(dev);
 
-	mask = 1ULL << (fls64(end) - 1);
+	mask = 1ULL << (fls64(phys_to_dma(dev, end)) - 1);
 	mask += mask - 1;
 
 	return mask;
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index eceaa92e6986..3487de83bb37 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -6,7 +6,7 @@
  */
 
 #include
-#include
+#include
 #include
 #include
 #include
@@ -43,7 +43,7 @@ static u64 __maybe_unused get_pfn_limit(struct device *dev)
 static int dma_nommu_dma_supported(struct device *dev, u64 mask)
 {
 #ifdef CONFIG_PPC64
-	u64 limit = get_dma_offset(dev) + (memblock_end_of_DRAM() - 1);
+	u64 limit = phys_to_dma(dev, (memblock_end_of_DRAM() - 1));
 
 	/* Limit fits in the mask, we are good */
 	if (mask >= limit)
@@ -104,7 +104,7 @@ void *__dma_nommu_alloc_coherent(struct device *dev, size_t size,
 		return NULL;
 	ret = page_address(page);
 	memset(ret, 0, size);
-	*dma_handle = __pa(ret) + get_dma_offset(dev);
+	*dma_handle = phys_to_dma(dev, __pa(ret));
 
 	return ret;
 }
@@ -188,7 +188,7 @@ static int dma_nommu_map_sg(struct device *dev, struct scatterlist *sgl,
 	int i;
 
 	for_each_sg(sgl, sg, nents, i) {
-		sg->dma_address = sg_phys(sg) + get_dma_offset(dev);
+		sg->dma_address = phys_to_dma(dev, sg_phys(sg));
 		sg->dma_length = sg->length;
 
 		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
@@ -210,7 +210,7 @@ static u64 dma_nommu_get_required_mask(struct device *dev)
 {
 	u64 end, mask;
 
-	end = memblock_end_of_DRAM() + get_dma_offset(dev);
+	end = phys_to_dma(dev, memblock_end_of_DRAM());
 
 	mask = 1ULL << (fls64(end) - 1);
 	mask += mask - 1;
@@ -228,7 +228,7 @@ static inline dma_addr_t dma_nommu_map_page(struct device *dev,
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		__dma_sync_page(page, offset, size, dir);
 
-	return page_to_phys(page) + offset + get_dma_offset(dev);
+	return phys_to_dma(dev, page_to_phys(page)) + offset;
 }
 
 static inline void dma_nommu_unmap_page(struct device *dev,
--
2.18.0
[PATCH 10/20] powerpc/dma-noncoherent: don't disable irqs over kmap_atomic
The requirement to disable local irqs over kmap_atomic is long gone, so remove those calls. Signed-off-by: Christoph Hellwig --- arch/powerpc/mm/dma-noncoherent.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c index 382528475433..d1c16456abac 100644 --- a/arch/powerpc/mm/dma-noncoherent.c +++ b/arch/powerpc/mm/dma-noncoherent.c @@ -357,12 +357,10 @@ static inline void __dma_sync_page_highmem(struct page *page, { size_t seg_size = min((size_t)(PAGE_SIZE - offset), size); size_t cur_size = seg_size; - unsigned long flags, start, seg_offset = offset; + unsigned long start, seg_offset = offset; int nr_segs = 1 + ((size - seg_size) + PAGE_SIZE - 1)/PAGE_SIZE; int seg_nr = 0; - local_irq_save(flags); - do { start = (unsigned long)kmap_atomic(page + seg_nr) + seg_offset; @@ -378,8 +376,6 @@ static inline void __dma_sync_page_highmem(struct page *page, cur_size += seg_size; seg_offset = 0; } while (seg_nr < nr_segs); - - local_irq_restore(flags); } #endif /* CONFIG_HIGHMEM */ -- 2.18.0
[PATCH 11/20] powerpc/dma: split the two __dma_alloc_coherent implementations
The implementation for the CONFIG_NOT_COHERENT_CACHE case doesn't share any code with the one for systems with coherent caches. Split it off and merge it with the helpers in dma-noncoherent.c that have no other callers. Signed-off-by: Christoph Hellwig --- arch/powerpc/include/asm/dma-mapping.h | 5 - arch/powerpc/kernel/dma.c | 14 ++ arch/powerpc/mm/dma-noncoherent.c | 15 +++ arch/powerpc/platforms/44x/warp.c | 2 +- 4 files changed, 10 insertions(+), 26 deletions(-) diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h index f2a4a7142b1e..dacd0f93f2b2 100644 --- a/arch/powerpc/include/asm/dma-mapping.h +++ b/arch/powerpc/include/asm/dma-mapping.h @@ -39,9 +39,6 @@ extern int dma_nommu_mmap_coherent(struct device *dev, * to ensure it is consistent. */ struct device; -extern void *__dma_alloc_coherent(struct device *dev, size_t size, - dma_addr_t *handle, gfp_t gfp); -extern void __dma_free_coherent(size_t size, void *vaddr); extern void __dma_sync(void *vaddr, size_t size, int direction); extern void __dma_sync_page(struct page *page, unsigned long offset, size_t size, int direction); @@ -52,8 +49,6 @@ extern unsigned long __dma_get_coherent_pfn(unsigned long cpu_addr); * Cache coherent cores.
*/ -#define __dma_alloc_coherent(dev, gfp, size, handle) NULL -#define __dma_free_coherent(size, addr)((void)0) #define __dma_sync(addr, size, rw) ((void)0) #define __dma_sync_page(pg, off, sz, rw) ((void)0) diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c index 3939589aab04..eceaa92e6986 100644 --- a/arch/powerpc/kernel/dma.c +++ b/arch/powerpc/kernel/dma.c @@ -62,18 +62,12 @@ static int dma_nommu_dma_supported(struct device *dev, u64 mask) #endif } +#ifndef CONFIG_NOT_COHERENT_CACHE void *__dma_nommu_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t flag, unsigned long attrs) { void *ret; -#ifdef CONFIG_NOT_COHERENT_CACHE - ret = __dma_alloc_coherent(dev, size, dma_handle, flag); - if (ret == NULL) - return NULL; - *dma_handle += get_dma_offset(dev); - return ret; -#else struct page *page; int node = dev_to_node(dev); #ifdef CONFIG_FSL_SOC @@ -113,19 +107,15 @@ void *__dma_nommu_alloc_coherent(struct device *dev, size_t size, *dma_handle = __pa(ret) + get_dma_offset(dev); return ret; -#endif } void __dma_nommu_free_coherent(struct device *dev, size_t size, void *vaddr, dma_addr_t dma_handle, unsigned long attrs) { -#ifdef CONFIG_NOT_COHERENT_CACHE - __dma_free_coherent(size, vaddr); -#else free_pages((unsigned long)vaddr, get_order(size)); -#endif } +#endif /* !CONFIG_NOT_COHERENT_CACHE */ static void *dma_nommu_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t flag, diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c index d1c16456abac..cfc48a253707 100644 --- a/arch/powerpc/mm/dma-noncoherent.c +++ b/arch/powerpc/mm/dma-noncoherent.c @@ -29,7 +29,7 @@ #include #include #include -#include +#include #include #include @@ -151,8 +151,8 @@ static struct ppc_vm_region *ppc_vm_region_find(struct ppc_vm_region *head, unsi * Allocate DMA-coherent memory space and return both the kernel remapped * virtual and bus address for that space. 
*/ -void * -__dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp) +void *__dma_nommu_alloc_coherent(struct device *dev, size_t size, + dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs) { struct page *page; struct ppc_vm_region *c; @@ -223,7 +223,7 @@ __dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *handle, gfp_t /* * Set the "dma handle" */ - *handle = page_to_phys(page); + *dma_handle = phys_to_dma(dev, page_to_phys(page)); do { SetPageReserved(page); @@ -249,12 +249,12 @@ __dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *handle, gfp_t no_page: return NULL; } -EXPORT_SYMBOL(__dma_alloc_coherent); /* * free a page as defined by the above mapping. */ -void __dma_free_coherent(size_t size, void *vaddr) +void __dma_nommu_free_coherent(struct device *dev, size_t size, void *vaddr, + dma_addr_t dma_handle, unsigned long attrs) { struct ppc_vm_region *c; unsigned long flags, addr; @@ -309,7 +309,6 @@ void __dma_free_coherent(size_t size, void *vaddr) __func__, vaddr); dump_stack(); } -
[PATCH 09/20] powerpc/dma: remove the unused ISA_DMA_THRESHOLD export
Signed-off-by: Christoph Hellwig --- arch/powerpc/kernel/setup_32.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c index 74457485574b..3c2d093f74c7 100644 --- a/arch/powerpc/kernel/setup_32.c +++ b/arch/powerpc/kernel/setup_32.c @@ -55,7 +55,6 @@ unsigned long ISA_DMA_THRESHOLD; unsigned int DMA_MODE_READ; unsigned int DMA_MODE_WRITE; -EXPORT_SYMBOL(ISA_DMA_THRESHOLD); EXPORT_SYMBOL(DMA_MODE_READ); EXPORT_SYMBOL(DMA_MODE_WRITE); -- 2.18.0
[PATCH 08/20] powerpc/dma: remove the unused dma_nommu_ops export
Signed-off-by: Christoph Hellwig --- arch/powerpc/kernel/dma.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c index dbfc7056d7df..3939589aab04 100644 --- a/arch/powerpc/kernel/dma.c +++ b/arch/powerpc/kernel/dma.c @@ -286,7 +286,6 @@ const struct dma_map_ops dma_nommu_ops = { .sync_sg_for_device = dma_nommu_sync_sg, #endif }; -EXPORT_SYMBOL(dma_nommu_ops); int dma_set_coherent_mask(struct device *dev, u64 mask) { -- 2.18.0
[PATCH 07/20] powerpc/dma: remove the unused ARCH_HAS_DMA_MMAP_COHERENT define
Signed-off-by: Christoph Hellwig --- arch/powerpc/include/asm/dma-mapping.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h index 8fa394520af6..f2a4a7142b1e 100644 --- a/arch/powerpc/include/asm/dma-mapping.h +++ b/arch/powerpc/include/asm/dma-mapping.h @@ -112,7 +112,5 @@ extern int dma_set_mask(struct device *dev, u64 dma_mask); extern u64 __dma_get_required_mask(struct device *dev); -#define ARCH_HAS_DMA_MMAP_COHERENT - #endif /* __KERNEL__ */ #endif /* _ASM_DMA_MAPPING_H */ -- 2.18.0
[PATCH 06/20] dma-noncoherent: add an optional arch hook for ->get_required_mask
This is needed for powerpc for now. Hopefully we can come up with a clean generic implementation mid-term. Signed-off-by: Christoph Hellwig --- include/linux/dma-noncoherent.h | 6 ++ kernel/dma/Kconfig | 4 kernel/dma/noncoherent.c| 1 + 3 files changed, 11 insertions(+) diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h index 10b2654d549b..61394c6e56df 100644 --- a/include/linux/dma-noncoherent.h +++ b/include/linux/dma-noncoherent.h @@ -17,6 +17,12 @@ int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma, #define arch_dma_mmap NULL #endif /* CONFIG_DMA_NONCOHERENT_MMAP */ +#ifdef CONFIG_DMA_NONCOHERENT_GET_REQUIRED +u64 arch_get_required_mask(struct device *dev); +#else +#define arch_get_required_mask NULL +#endif + #ifdef CONFIG_DMA_NONCOHERENT_CACHE_SYNC void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size, enum dma_data_direction direction); diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig index 9bd54304446f..b523104d6199 100644 --- a/kernel/dma/Kconfig +++ b/kernel/dma/Kconfig @@ -36,6 +36,10 @@ config DMA_NONCOHERENT_MMAP bool depends on DMA_NONCOHERENT_OPS +config DMA_NONCOHERENT_GET_REQUIRED + bool + depends on DMA_NONCOHERENT_OPS + config DMA_NONCOHERENT_CACHE_SYNC bool depends on DMA_NONCOHERENT_OPS diff --git a/kernel/dma/noncoherent.c b/kernel/dma/noncoherent.c index 79e9a757387f..cf4ffbe4a09d 100644 --- a/kernel/dma/noncoherent.c +++ b/kernel/dma/noncoherent.c @@ -98,5 +98,6 @@ const struct dma_map_ops dma_noncoherent_ops = { .dma_supported = dma_direct_supported, .mapping_error = dma_direct_mapping_error, .cache_sync = arch_dma_cache_sync, + .get_required_mask = arch_get_required_mask, }; EXPORT_SYMBOL(dma_noncoherent_ops); -- 2.18.0
[PATCH 05/20] swiotlb: allow the architecture to provide a get_required_mask hook
For now this allows consolidating the powerpc code. In the long run we should grow a generic implementation of dma_get_required_mask that returns the dma mask required to avoid bounce buffering. Signed-off-by: Christoph Hellwig --- kernel/dma/swiotlb.c | 4 1 file changed, 4 insertions(+) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 904541055792..1bb420244753 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -1084,5 +1084,9 @@ const struct dma_map_ops swiotlb_dma_ops = { .map_page = swiotlb_map_page, .unmap_page = swiotlb_unmap_page, .dma_supported = dma_direct_supported, +#ifdef swiotlb_get_required_mask + .get_required_mask = swiotlb_get_required_mask, +#endif + }; EXPORT_SYMBOL(swiotlb_dma_ops); -- 2.18.0
[PATCH 04/20] ia64: remove get_required_mask implementation
ia64 can use the generic implementation in general, and SN2 can just override it in the dma_map_ops now. Signed-off-by: Christoph Hellwig --- arch/ia64/include/asm/dma-mapping.h | 2 -- arch/ia64/include/asm/machvec.h | 7 --- arch/ia64/include/asm/machvec_init.h | 1 - arch/ia64/include/asm/machvec_sn2.h | 2 -- arch/ia64/pci/pci.c | 26 -- arch/ia64/sn/pci/pci_dma.c | 4 ++-- 6 files changed, 2 insertions(+), 40 deletions(-) diff --git a/arch/ia64/include/asm/dma-mapping.h b/arch/ia64/include/asm/dma-mapping.h index 76e4d6632d68..522745ae67bb 100644 --- a/arch/ia64/include/asm/dma-mapping.h +++ b/arch/ia64/include/asm/dma-mapping.h @@ -10,8 +10,6 @@ #include #include -#define ARCH_HAS_DMA_GET_REQUIRED_MASK - extern const struct dma_map_ops *dma_ops; extern struct ia64_machine_vector ia64_mv; extern void set_iommu_machvec(void); diff --git a/arch/ia64/include/asm/machvec.h b/arch/ia64/include/asm/machvec.h index 267f4f170191..5133739966bc 100644 --- a/arch/ia64/include/asm/machvec.h +++ b/arch/ia64/include/asm/machvec.h @@ -44,7 +44,6 @@ typedef void ia64_mv_kernel_launch_event_t(void); /* DMA-mapping interface: */ typedef void ia64_mv_dma_init (void); -typedef u64 ia64_mv_dma_get_required_mask (struct device *); typedef const struct dma_map_ops *ia64_mv_dma_get_ops(struct device *); /* @@ -127,7 +126,6 @@ extern void machvec_tlb_migrate_finish (struct mm_struct *); # define platform_global_tlb_purgeia64_mv.global_tlb_purge # define platform_tlb_migrate_finish ia64_mv.tlb_migrate_finish # define platform_dma_initia64_mv.dma_init -# define platform_dma_get_required_mask ia64_mv.dma_get_required_mask # define platform_dma_get_ops ia64_mv.dma_get_ops # define platform_irq_to_vector ia64_mv.irq_to_vector # define platform_local_vector_to_irq ia64_mv.local_vector_to_irq @@ -171,7 +169,6 @@ struct ia64_machine_vector { ia64_mv_global_tlb_purge_t *global_tlb_purge; ia64_mv_tlb_migrate_finish_t *tlb_migrate_finish; ia64_mv_dma_init *dma_init; - ia64_mv_dma_get_required_mask 
*dma_get_required_mask; ia64_mv_dma_get_ops *dma_get_ops; ia64_mv_irq_to_vector *irq_to_vector; ia64_mv_local_vector_to_irq *local_vector_to_irq; @@ -211,7 +208,6 @@ struct ia64_machine_vector { platform_global_tlb_purge, \ platform_tlb_migrate_finish,\ platform_dma_init, \ - platform_dma_get_required_mask, \ platform_dma_get_ops, \ platform_irq_to_vector, \ platform_local_vector_to_irq, \ @@ -286,9 +282,6 @@ extern const struct dma_map_ops *dma_get_ops(struct device *); #ifndef platform_dma_get_ops # define platform_dma_get_ops dma_get_ops #endif -#ifndef platform_dma_get_required_mask -# define platform_dma_get_required_mask ia64_dma_get_required_mask -#endif #ifndef platform_irq_to_vector # define platform_irq_to_vector__ia64_irq_to_vector #endif diff --git a/arch/ia64/include/asm/machvec_init.h b/arch/ia64/include/asm/machvec_init.h index 2b32fd06b7c6..2aafb69a3787 100644 --- a/arch/ia64/include/asm/machvec_init.h +++ b/arch/ia64/include/asm/machvec_init.h @@ -4,7 +4,6 @@ extern ia64_mv_send_ipi_t ia64_send_ipi; extern ia64_mv_global_tlb_purge_t ia64_global_tlb_purge; -extern ia64_mv_dma_get_required_mask ia64_dma_get_required_mask; extern ia64_mv_irq_to_vector __ia64_irq_to_vector; extern ia64_mv_local_vector_to_irq __ia64_local_vector_to_irq; extern ia64_mv_pci_get_legacy_mem_t ia64_pci_get_legacy_mem; diff --git a/arch/ia64/include/asm/machvec_sn2.h b/arch/ia64/include/asm/machvec_sn2.h index ece9fa85be88..b5153d300289 100644 --- a/arch/ia64/include/asm/machvec_sn2.h +++ b/arch/ia64/include/asm/machvec_sn2.h @@ -55,7 +55,6 @@ extern ia64_mv_readb_t __sn_readb_relaxed; extern ia64_mv_readw_t __sn_readw_relaxed; extern ia64_mv_readl_t __sn_readl_relaxed; extern ia64_mv_readq_t __sn_readq_relaxed; -extern ia64_mv_dma_get_required_mask sn_dma_get_required_mask; extern ia64_mv_dma_initsn_dma_init; extern ia64_mv_migrate_t sn_migrate; extern ia64_mv_kernel_launch_event_t sn_kernel_launch_event; @@ -100,7 +99,6 @@ extern ia64_mv_pci_fixup_bus_t sn_pci_fixup_bus; 
#define platform_pci_get_legacy_memsn_pci_get_legacy_mem #define platform_pci_legacy_read sn_pci_legacy_read #define platform_pci_legacy_write sn_pci_legacy_write -#define platform_dma_get_required_mask sn_dma_get_required_mask #define platform_dma_init sn_dma_init #define platform_migrate sn_migrate #define platform_kernel_launch_eventsn_kernel_launch_event diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c index 7ccc64d5fe3e..5d71800df431 100644 --- a/arch/ia64/pci/pci.c +++ b/arch/ia64/
[PATCH 02/20] kernel/dma/direct: refine dma_direct_alloc zone selection
We need to take the DMA offset and encryption bit into account when selecting a zone. Add a helper that does so and use it. Signed-off-by: Christoph Hellwig --- kernel/dma/direct.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index d32d4f0d2c0c..c2c1df8827f2 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -58,6 +58,14 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size) return addr + size - 1 <= dev->coherent_dma_mask; } +static bool dma_coherent_below(struct device *dev, u64 mask) +{ + dma_addr_t addr = force_dma_unencrypted() ? + __phys_to_dma(dev, mask) : phys_to_dma(dev, mask); + + return dev->coherent_dma_mask <= addr; +} + void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs) { @@ -70,9 +78,9 @@ void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp &= ~__GFP_ZERO; /* GFP_DMA32 and GFP_DMA are no ops without the corresponding zones: */ - if (dev->coherent_dma_mask <= DMA_BIT_MASK(ARCH_ZONE_DMA_BITS)) + if (dma_coherent_below(dev, DMA_BIT_MASK(ARCH_ZONE_DMA_BITS))) gfp |= GFP_DMA; - if (dev->coherent_dma_mask <= DMA_BIT_MASK(32) && !(gfp & GFP_DMA)) + if (dma_coherent_below(dev, DMA_BIT_MASK(32)) && !(gfp & GFP_DMA)) gfp |= GFP_DMA32; again: @@ -92,14 +100,14 @@ void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, page = NULL; if (IS_ENABLED(CONFIG_ZONE_DMA32) && - dev->coherent_dma_mask < DMA_BIT_MASK(64) && + dma_coherent_below(dev, DMA_BIT_MASK(64)) && !(gfp & (GFP_DMA32 | GFP_DMA))) { gfp |= GFP_DMA32; goto again; } if (IS_ENABLED(CONFIG_ZONE_DMA) && - dev->coherent_dma_mask < DMA_BIT_MASK(32) && + dma_coherent_below(dev, DMA_BIT_MASK(32)) && !(gfp & GFP_DMA)) { gfp = (gfp & ~GFP_DMA32) | GFP_DMA; goto again; -- 2.18.0
[PATCH 01/20] kernel/dma/direct: take DMA offset into account in dma_direct_supported
When a device has a DMA offset the dma capable result will change due to the difference between the physical and DMA address. Take that into account. Signed-off-by: Christoph Hellwig --- kernel/dma/direct.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 8be8106270c2..d32d4f0d2c0c 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -167,7 +167,7 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents, int dma_direct_supported(struct device *dev, u64 mask) { #ifdef CONFIG_ZONE_DMA - if (mask < DMA_BIT_MASK(ARCH_ZONE_DMA_BITS)) + if (mask < phys_to_dma(dev, DMA_BIT_MASK(ARCH_ZONE_DMA_BITS))) return 0; #else /* @@ -176,14 +176,14 @@ int dma_direct_supported(struct device *dev, u64 mask) * memory, or by providing a ZONE_DMA32. If neither is the case, the * architecture needs to use an IOMMU instead of the direct mapping. */ - if (mask < DMA_BIT_MASK(32)) + if (mask < phys_to_dma(dev, DMA_BIT_MASK(32))) return 0; #endif /* * Various PCI/PCIe bridges have broken support for > 32bit DMA even * if the device itself might support it. */ - if (dev->dma_32bit_limit && mask > DMA_BIT_MASK(32)) + if (dev->dma_32bit_limit && mask > phys_to_dma(dev, DMA_BIT_MASK(32))) return 0; return 1; } -- 2.18.0
[PATCH 03/20] dma-mapping: make the get_required_mask method available unconditionally
This saves some duplication for ia64. In the long run this method will need some additional work including moving over to kernel/dma, but that will require some additional prep work, so let's do this minimal change for now. Signed-off-by: Christoph Hellwig --- drivers/base/platform.c | 11 ++- include/linux/dma-mapping.h | 2 -- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/base/platform.c b/drivers/base/platform.c index dff82a3c2caa..921ddb0c051b 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -1180,7 +1180,7 @@ int __init platform_bus_init(void) } #ifndef ARCH_HAS_DMA_GET_REQUIRED_MASK -u64 dma_get_required_mask(struct device *dev) +static u64 default_dma_get_required_mask(struct device *dev) { u32 low_totalram = ((max_pfn - 1) << PAGE_SHIFT); u32 high_totalram = ((max_pfn - 1) >> (32 - PAGE_SHIFT)); @@ -1198,6 +1198,15 @@ u64 dma_get_required_mask(struct device *dev) } return mask; } + +u64 dma_get_required_mask(struct device *dev) +{ + const struct dma_map_ops *ops = get_dma_ops(dev); + + if (ops->get_required_mask) + return ops->get_required_mask(dev); + return default_dma_get_required_mask(dev); +} EXPORT_SYMBOL_GPL(dma_get_required_mask); #endif diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index f9cc309507d9..00d3065e1510 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -130,9 +130,7 @@ struct dma_map_ops { enum dma_data_direction direction); int (*mapping_error)(struct device *dev, dma_addr_t dma_addr); int (*dma_supported)(struct device *dev, u64 mask); -#ifdef ARCH_HAS_DMA_GET_REQUIRED_MASK u64 (*get_required_mask)(struct device *dev); -#endif }; extern const struct dma_map_ops dma_direct_ops; -- 2.18.0
use generic DMA mapping code in powerpc
Hi all, this series switches the powerpc port to use the generic swiotlb and noncoherent dma ops, and to use more generic code for the coherent direct mapping, as well as removing dead code.
[PATCH] sparc: use generic dma_noncoherent_ops
Switch to the generic noncoherent direct mapping implementation. This removes the previous sync_single_for_device implementation, which looks bogus given that no syncing is happening in the similar but more important map_single case. Signed-off-by: Christoph Hellwig Acked-by: Sam Ravnborg --- Makefile | 2 +- arch/sparc/Kconfig | 2 + arch/sparc/include/asm/dma-mapping.h | 5 +- arch/sparc/kernel/ioport.c | 193 +-- 4 files changed, 36 insertions(+), 166 deletions(-) diff --git a/Makefile b/Makefile index 85f3481a56d6..8a3fd0c4a76e 100644 --- a/Makefile +++ b/Makefile @@ -368,7 +368,7 @@ HOST_LOADLIBES := $(HOST_LFS_LIBS) # Make variables (CC, etc...) AS = $(CROSS_COMPILE)as LD = $(CROSS_COMPILE)ld -CC = $(CROSS_COMPILE)gcc +CC = sparc64-linux-gnu-gcc-8 CPP= $(CC) -E AR = $(CROSS_COMPILE)ar NM = $(CROSS_COMPILE)nm diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index 0f535debf802..79f29c67291a 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -48,6 +48,8 @@ config SPARC config SPARC32 def_bool !64BIT + select ARCH_HAS_SYNC_DMA_FOR_CPU + select DMA_NONCOHERENT_OPS select GENERIC_ATOMIC64 select CLZ_TAB select HAVE_UID16 diff --git a/arch/sparc/include/asm/dma-mapping.h b/arch/sparc/include/asm/dma-mapping.h index 12ae33daf52f..e17566376934 100644 --- a/arch/sparc/include/asm/dma-mapping.h +++ b/arch/sparc/include/asm/dma-mapping.h @@ -7,7 +7,6 @@ #include extern const struct dma_map_ops *dma_ops; -extern const struct dma_map_ops pci32_dma_ops; extern struct bus_type pci_bus_type; @@ -15,11 +14,11 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus) { #ifdef CONFIG_SPARC_LEON if (sparc_cpu_model == sparc_leon) - return &pci32_dma_ops; + return &dma_noncoherent_ops; #endif #if defined(CONFIG_SPARC32) && defined(CONFIG_PCI) if (bus == &pci_bus_type) - return &pci32_dma_ops; + return &dma_noncoherent_ops; #endif return dma_ops; } diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c index cca9134cfa7d..6799c93c9f27 
100644 --- a/arch/sparc/kernel/ioport.c +++ b/arch/sparc/kernel/ioport.c @@ -38,6 +38,7 @@ #include #include #include +#include #include #include @@ -434,42 +435,41 @@ arch_initcall(sparc_register_ioport); /* Allocate and map kernel buffer using consistent mode DMA for a device. * hwdev should be valid struct pci_dev pointer for PCI devices. */ -static void *pci32_alloc_coherent(struct device *dev, size_t len, - dma_addr_t *pba, gfp_t gfp, - unsigned long attrs) +void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, + gfp_t gfp, unsigned long attrs) { - unsigned long len_total = PAGE_ALIGN(len); + unsigned long len_total = PAGE_ALIGN(size); void *va; struct resource *res; int order; - if (len == 0) { + if (size == 0) { return NULL; } - if (len > 256*1024) { /* __get_free_pages() limit */ + if (size > 256*1024) { /* __get_free_pages() limit */ return NULL; } order = get_order(len_total); va = (void *) __get_free_pages(gfp, order); if (va == NULL) { - printk("pci_alloc_consistent: no %ld pages\n", len_total>>PAGE_SHIFT); + printk("%s: no %ld pages\n", __func__, len_total>>PAGE_SHIFT); goto err_nopages; } if ((res = kzalloc(sizeof(struct resource), GFP_KERNEL)) == NULL) { - printk("pci_alloc_consistent: no core\n"); + printk("%s: no core\n", __func__); goto err_nomem; } if (allocate_resource(&_sparc_dvma, res, len_total, _sparc_dvma.start, _sparc_dvma.end, PAGE_SIZE, NULL, NULL) != 0) { - printk("pci_alloc_consistent: cannot occupy 0x%lx", len_total); + printk("%s: cannot occupy 0x%lx", __func__, len_total); goto err_nova; } srmmu_mapiorange(0, virt_to_phys(va), res->start, len_total); - *pba = virt_to_phys(va); /* equals virt_to_bus (R.I.P.) for us. */ + *dma_handle = virt_to_phys(va); return (void *) res->start; err_nova: @@ -481,184 +481,53 @@ static void *pci32_alloc_coherent(struct device *dev, size_t len, } /* Free and unmap a consistent DMA buffer. 
- * cpu_addr is what was returned from pci_alloc_consistent, - * size must be the same as what as passed into pci_alloc_consistent, - * and likewise dma_addr must be the same as what *dma_addrp was set to. + * cpu_addr is what was returned arch_dma_alloc, size must
Re: [PATCH 2/3] dmapool: improve scalability of dma_pool_free
On 07/27/2018 05:27 PM, Tony Battersby wrote: > On 07/27/2018 03:38 PM, Tony Battersby wrote: >> But the bigger problem is that my first patch adds another list_head to >> the dma_page for the avail_page_link to make allocations faster. I >> suppose we could make the lists singly-linked instead of doubly-linked >> to save space. >> > I managed to redo my dma_pool_alloc() patch to make avail_page_list > singly-linked instead of doubly-linked. But the problem with making > either list singly-linked is that it would no longer be possible to call > pool_free_page() any time other than dma_pool_destroy() without scanning > the lists to remove the page from them, which would make pruning > arbitrary free pages slower (adding back a O(n^2)). But the current > code doesn't do that anyway, and in fact it has a comment in > dma_pool_free() to "resist the temptation" to prune free pages. And yet > it seems like it might be reasonable for someone to add such code in the > future if there are a whole lot of free pages, so I am hesitant to make > it more difficult. > > So my question is: when I post v2 of the patchset, should I send the > doubly-linked version or the singly-linked version, in anticipation that > someone else might want to take it further and move everything into > struct page as you suggest? > Over the weekend I came up with a better solution. Instead of having the page listed in two singly-linked lists at the same time, move the page between two doubly-linked lists. One list is dedicated for pages that have all blocks allocated, and one list is for pages that have some blocks free. Since the page is only in one list at a time, it only needs one set of list pointers. I also implemented the code to make the offset 16-bit, while ignoring the offset for cases where it is not needed (where it would overflow anyway). So now I have an implementation that eliminates struct dma_page. I will post it once it is ready for review. 
Re: [PATCH] PCI: call dma_debug_add_bus for pci_bus_type in common code
On Mon, 30 Jul 2018, Christoph Hellwig wrote: > There is nothing arch specific about PCI or dma-debug, so move this > call to common code just after registering the bus type. > > Signed-off-by: Christoph Hellwig Acked-by: Thomas Gleixner
Re: use the generic dma-noncoherent code for sh V2
Hi Rob, CC Guennadi On Fri, Jul 27, 2018 at 6:21 PM Rob Landley wrote: > On 07/24/2018 03:21 PM, Christoph Hellwig wrote: > > On Tue, Jul 24, 2018 at 02:01:42PM +0200, Christoph Hellwig wrote: > >> can you review these patches to switch sh to use the generic > >> dma-noncoherent code? All the requirements are in mainline already > >> and we've switched various architectures over to it already. > > > > Ok, there is one more issue with this version. Wait for a new one > > tomorrow. > > Speaking of DMA: > > I'm trying to wire up DMAEngine to an sh7760 board that uses platform data > (and > fix the smc91x.c driver to use DMAEngine without #ifdef arm), so I've been > reading through all that stuff, but the docs seem kinda... thin? > > Is there something I should have read other than > Documentation/driver-model/platform.txt, > Documentation/dmaegine/{provider,client}.txt, then trying to picking through > the > source code and the sh7760 hardware pdf? (And watching the youtube video of > Laurent Pinchart's 2014 ELC talk on DMA, Maxime Ripard's 2015 ELC overview of > DMAEngine, the Xilinx video on DMAEngine...) > > At first I thought the SH_DMAE could initialize itself, but the probe function > needs platform data, and although arch/sh/kernel/cpu/sh4a/setup-sh7722.c looks > _kind_ of like a model I can crib from: > > A) "make ARCH=sh se7722_defconfig" results in a config with SH_DMA > disabled??!? > (This is why I use miniconfig instead of defconfig format, I'm assuming that's > bit rot?) > > B) That platform data is supplying sh_dmae_slave_config preallocating slave > channels to devices? (Does it have to? The docs gave me the impression the > driver would dynamically request them and devices could even share. Wasn't > that > sort of the point of DMAEngine? Can my new board data _not_ do that? What's > the > minimum amount of micromanaging I have to do?) 
> > C) It's full of stuff like setting ts_low_shift to CHCR_TS_LOW_SHIFT where > both > grepping Documentation and Google "dmaengine ts_low_shift" are unhelpful. > > What I'd really like is a "hello world" version of DMAEngine somewhere I can > build and run on a supported qemu target, to set up _one_ channel with a block > device or something using it. I can't tell what's optional, or what the > minimal > version of this looks like. I have no experience with DMA on SH, only with DMA on (DT-based) Renesas ARM SoCs. But I believe the DMA engines are somewhat similar. I don't know if all pieces to support DMA were ever upstreamed. See e.g. commit 219fb0c1436e4893 ("serial: sh-sci: Remove the platform data dma slave rx/tx channel IDs"). Perhaps Guennadi knows/remembers? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: use the generic dma-noncoherent code for sh V2
On Fri, Jul 27, 2018 at 11:20:21AM -0500, Rob Landley wrote: > Speaking of DMA: Which really has nothing to do with the dma mapping code, which also means I can't help you much unfortunately. That being said sh is the last pending of the initial dma-noncoherent conversions, so I'd greatly appreciate if we could get this reviewed and merged for the 4.19 merge window.
Re: [PATCH] sparc: use generic dma_noncoherent_ops
On Fri, Jul 27, 2018 at 11:05:48PM +0200, Sam Ravnborg wrote:
> Hi Christoph.
>
> Some observations below - nitpick and bikeshedding only.
>
> The parameter of phys_addr_t is sometimes renamed
> to use the same name as in the original prototype (good),
> and sometimes uses the old name (bad).
> This makes it inconsistent, as the local name changes in the
> different functions even though they represent the same thing.

I'll change it.

> You can add my:
> Acked-by: Sam Ravnborg

I will resend it with your nitpicks addressed and your ack added.

> > +void *arch_dma_alloc(struct device *dev, size_t len, dma_addr_t *pba,
> > +		gfp_t gfp, unsigned long attrs)
>
> This function was renamed in ee664a9252d24 and is now renamed again.
> The printk statements should be updated to use arch_dma_alloc.

I've switched the printk statements to use __func__ instead, not that I plan
for another rename any time soon. I've also updated the comments referring
to this function to use the right name.
[PATCH] PCI: call dma_debug_add_bus for pci_bus_type in common code
There is nothing arch specific about PCI or dma-debug, so move this call
to common code just after registering the bus type.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/kernel/dma.c | 3 ---
 arch/sh/drivers/pci/pci.c | 2 --
 arch/x86/kernel/pci-dma.c | 3 ---
 drivers/pci/pci-driver.c  | 2 +-
 4 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 155170d70324..dbfc7056d7df 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -357,9 +357,6 @@ EXPORT_SYMBOL_GPL(dma_get_required_mask);

 static int __init dma_init(void)
 {
-#ifdef CONFIG_PCI
-	dma_debug_add_bus(&pci_bus_type);
-#endif
 #ifdef CONFIG_IBMVIO
 	dma_debug_add_bus(&vio_bus_type);
 #endif

diff --git a/arch/sh/drivers/pci/pci.c b/arch/sh/drivers/pci/pci.c
index e5b7437ab4af..8256626bc53c 100644
--- a/arch/sh/drivers/pci/pci.c
+++ b/arch/sh/drivers/pci/pci.c
@@ -160,8 +160,6 @@ static int __init pcibios_init(void)
 	for (hose = hose_head; hose; hose = hose->next)
 		pcibios_scanbus(hose);

-	dma_debug_add_bus(&pci_bus_type);
-
 	pci_initialized = 1;

 	return 0;

diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index ab5d9dd668d2..43f58632f123 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -155,9 +155,6 @@ static int __init pci_iommu_init(void)
 {
 	struct iommu_table_entry *p;

-#ifdef CONFIG_PCI
-	dma_debug_add_bus(&pci_bus_type);
-#endif
 	x86_init.iommu.iommu_init();

 	for (p = __iommu_table; p < __iommu_table_end; p++) {

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 6792292b5fc7..bef17c3fca67 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1668,7 +1668,7 @@ static int __init pci_driver_init(void)
 	if (ret)
 		return ret;
 #endif
-
+	dma_debug_add_bus(&pci_bus_type);
 	return 0;
 }
 postcore_initcall(pci_driver_init);
--
2.18.0
[PATCH] powerpc: do not redefine NEED_DMA_MAP_STATE
kernel/dma/Kconfig already defines NEED_DMA_MAP_STATE, just select it
from PPC64 and NOT_COHERENT_CACHE instead.

Signed-off-by: Christoph Hellwig
---
 arch/powerpc/Kconfig                   | 3 ---
 arch/powerpc/platforms/Kconfig.cputype | 2 ++
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9f2b75fe2c2d..f9cae7edd735 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -884,9 +884,6 @@ config ZONE_DMA
 	bool
 	default y

-config NEED_DMA_MAP_STATE
-	def_bool (PPC64 || NOT_COHERENT_CACHE)
-
 config GENERIC_ISA_DMA
 	bool
 	depends on ISA_DMA_API

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index e6a1de521319..a2578bf8d560 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -3,6 +3,7 @@ config PPC64
 	bool "64-bit kernel"
 	default n
 	select ZLIB_DEFLATE
+	select NEED_DMA_MAP_STATE
 	help
 	  This option selects whether a 32-bit or a 64-bit kernel
 	  will be built.
@@ -386,6 +387,7 @@ config NOT_COHERENT_CACHE
 	depends on 4xx || PPC_8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
 	default n if PPC_47x
 	default y
+	select NEED_DMA_MAP_STATE

 config CHECK_CACHE_COHERENCY
 	bool
--
2.18.0
[PATCH] iommu: remove the ->map_sg indirection
All iommu drivers use the default_iommu_map_sg implementation, and there is
no good reason to ever override it. Just expose it as iommu_map_sg directly
and remove the indirection, especially in our post-Spectre world where
indirect calls are horribly expensive.

Signed-off-by: Christoph Hellwig
---
 drivers/iommu/amd_iommu.c      |  1 -
 drivers/iommu/arm-smmu-v3.c    |  1 -
 drivers/iommu/arm-smmu.c       |  1 -
 drivers/iommu/exynos-iommu.c   |  1 -
 drivers/iommu/intel-iommu.c    |  1 -
 drivers/iommu/iommu.c          |  6 +++---
 drivers/iommu/ipmmu-vmsa.c     |  1 -
 drivers/iommu/msm_iommu.c      |  1 -
 drivers/iommu/mtk_iommu.c      |  1 -
 drivers/iommu/mtk_iommu_v1.c   |  1 -
 drivers/iommu/omap-iommu.c     |  1 -
 drivers/iommu/qcom_iommu.c     |  1 -
 drivers/iommu/rockchip-iommu.c |  1 -
 drivers/iommu/tegra-gart.c     |  1 -
 drivers/iommu/tegra-smmu.c     |  1 -
 include/linux/iommu.h          | 16 ++--
 16 files changed, 5 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 596b95c50051..a23c6a4014a5 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3192,7 +3192,6 @@ const struct iommu_ops amd_iommu_ops = {
 	.detach_dev = amd_iommu_detach_device,
 	.map = amd_iommu_map,
 	.unmap = amd_iommu_unmap,
-	.map_sg = default_iommu_map_sg,
 	.iova_to_phys = amd_iommu_iova_to_phys,
 	.add_device = amd_iommu_add_device,
 	.remove_device = amd_iommu_remove_device,

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 1d647104bccc..f1dc294f8e08 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1997,7 +1997,6 @@ static struct iommu_ops arm_smmu_ops = {
 	.attach_dev		= arm_smmu_attach_dev,
 	.map			= arm_smmu_map,
 	.unmap			= arm_smmu_unmap,
-	.map_sg			= default_iommu_map_sg,
 	.flush_iotlb_all	= arm_smmu_iotlb_sync,
 	.iotlb_sync		= arm_smmu_iotlb_sync,
 	.iova_to_phys		= arm_smmu_iova_to_phys,

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index f7a96bcf94a6..644fd7ec8ac7 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1562,7 +1562,6 @@ static struct iommu_ops arm_smmu_ops = {
 	.attach_dev		= arm_smmu_attach_dev,
 	.map			= arm_smmu_map,
 	.unmap			= arm_smmu_unmap,
-	.map_sg			= default_iommu_map_sg,
 	.flush_iotlb_all	= arm_smmu_iotlb_sync,
 	.iotlb_sync		= arm_smmu_iotlb_sync,
 	.iova_to_phys		= arm_smmu_iova_to_phys,

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 85879cfec52f..19e55cf6a9dd 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1332,7 +1332,6 @@ static const struct iommu_ops exynos_iommu_ops = {
 	.detach_dev = exynos_iommu_detach_device,
 	.map = exynos_iommu_map,
 	.unmap = exynos_iommu_unmap,
-	.map_sg = default_iommu_map_sg,
 	.iova_to_phys = exynos_iommu_iova_to_phys,
 	.device_group = generic_device_group,
 	.add_device = exynos_iommu_add_device,

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 115ff26e9ced..6bfa3bd5a174 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5366,7 +5366,6 @@ const struct iommu_ops intel_iommu_ops = {
 	.detach_dev = intel_iommu_detach_device,
 	.map		= intel_iommu_map,
 	.unmap = intel_iommu_unmap,
-	.map_sg = default_iommu_map_sg,
 	.iova_to_phys = intel_iommu_iova_to_phys,
 	.add_device = intel_iommu_add_device,
 	.remove_device = intel_iommu_remove_device,

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 63b37563db7e..edf1a19c5755 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1637,8 +1637,8 @@ size_t iommu_unmap_fast(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_unmap_fast);

-size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
-		struct scatterlist *sg, unsigned int nents, int prot)
+size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
+		    struct scatterlist *sg, unsigned int nents, int prot)
 {
 	struct scatterlist *s;
 	size_t mapped = 0;
@@ -1678,7 +1678,7 @@ size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,

 	return 0;
 }
-EXPORT_SYMBOL_GPL(default_iommu_map_sg);
+EXPORT_SYMBOL_GPL(iommu_map_sg);

 int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
 			       phys_addr_t paddr, u64 size, int prot)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.
Re: [PATCH] iommu/iova: Update cached node pointer when current node fails to get any free IOVA
On Fri, Jul 27, 2018 at 9:48 PM, Robin Murphy wrote:
> On 27/07/18 13:56, Ganapatrao Kulkarni wrote:
> [...]
>>>> did you get any chance to look into this issue?
>>>> i am waiting for your suggestion/patch for this issue!
>>>
>>> I got as far as [1], but I wasn't sure how much I liked it, since it still
>>> seems a little invasive for such a specific case (plus I can't remember if
>>> it's actually been debugged or not). I think in the end I started wondering
>>> whether it's even worth bothering with the 32-bit optimisation for PCIe
>>> devices - 4 extra bytes worth of TLP is surely a lot less significant than
>>> every transaction taking up to 50% more bus cycles was for legacy PCI.
>>
>> how about tracking a previous failed attempt to get a 32-bit range iova
>> and avoiding further attempts until a replenish happens?
>> Created a patch for the same [2]
>
> Ooh, that's a much neater implementation of essentially the same concept -
> now why couldn't I think of that? :)
>
> Looks like it should be possible to make it entirely self-contained too,
> since alloc_iova() is in a position to both test and update the flag based
> on the limit_pfn passed in.

is the below patch much better?
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 83fe262..abb15d6 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -56,6 +56,7 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 	iovad->granule = granule;
 	iovad->start_pfn = start_pfn;
 	iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad));
+	iovad->free_32bit_pfns = true;
 	iovad->flush_cb = NULL;
 	iovad->fq = NULL;
 	iovad->anchor.pfn_lo = iovad->anchor.pfn_hi = IOVA_ANCHOR;
@@ -139,8 +140,10 @@ __cached_rbnode_delete_update(struct iova_domain *iovad, struct iova *free)

 	cached_iova = rb_entry(iovad->cached32_node, struct iova, node);
 	if (free->pfn_hi < iovad->dma_32bit_pfn &&
-	    free->pfn_lo >= cached_iova->pfn_lo)
+	    free->pfn_lo >= cached_iova->pfn_lo) {
 		iovad->cached32_node = rb_next(&free->node);
+		iovad->free_32bit_pfns = true;
+	}

 	cached_iova = rb_entry(iovad->cached_node, struct iova, node);
 	if (free->pfn_lo >= cached_iova->pfn_lo)
@@ -290,6 +293,10 @@ alloc_iova(struct iova_domain *iovad, unsigned long size,
 	struct iova *new_iova;
 	int ret;

+	if (limit_pfn < iovad->dma_32bit_pfn &&
+	    !iovad->free_32bit_pfns)
+		return NULL;
+
 	new_iova = alloc_iova_mem();
 	if (!new_iova)
 		return NULL;
@@ -299,6 +306,8 @@ alloc_iova(struct iova_domain *iovad, unsigned long size,

 	if (ret) {
 		free_iova_mem(new_iova);
+		if (limit_pfn < iovad->dma_32bit_pfn)
+			iovad->free_32bit_pfns = false;
 		return NULL;
 	}

diff --git a/include/linux/iova.h b/include/linux/iova.h
index 928442d..3810ba9 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -96,6 +96,7 @@ struct iova_domain {
 						   flush-queues */
 	atomic_t	fq_timer_on;		/* 1 when timer is active, 0
 						   when not */
+	bool		free_32bit_pfns;
 };

 static inline unsigned long iova_size(struct iova *iova)
--
2.9.4

> Robin.
>
>> [2]
>> https://github.com/gpkulkarni/linux/commit/e2343a3e1f55cdeb5694103dd354bcb881dc65c3
>> note, the testing of this patch is in progress.
>>
>>> Robin.
>>> [1]
>>> http://www.linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=a8e0e4af10ebebb3669750e05bf0028e5bd6afe8
>>
>> thanks
>> Ganapat

thanks
Ganapat