Re: [PATCH] ia64: fix barrier placement for write* / dma mapping

2018-07-31 Thread okaya

+ my new email

On 2018-07-31 10:20, Christoph Hellwig wrote:

> memory-barriers.txt has been updated with the following requirement.
>
> "When using writel(), a prior wmb() is not needed to guarantee that the
> cache coherent memory writes have completed before writing to the MMIO
> region."
>
> The current writeX() and iowriteX() implementations on ia64 are not
> satisfying this requirement as the barrier is after the register write.

I asked Tony Luck this question before. If I remember right,
his answer was:

The CPU guarantees that outstanding writes are flushed when a register
write instruction is executed, so an additional barrier instruction is
not needed.

> This adds the missing memory barriers, and instead drops them from the
> dma sync routine where they are misplaced (and were missing in the
> more important map/unmap cases anyway).
>
> All this doesn't affect the SN2 platform, which already has barrier
> in the I/O accessors, and none in dma mapping (but then again
> swiotlb doesn't have any either).
>
> Signed-off-by: Christoph Hellwig
> ---

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] ia64: fix barrier placement for write* / dma mapping

2018-07-31 Thread Christoph Hellwig
memory-barriers.txt has been updated with the following requirement.

"When using writel(), a prior wmb() is not needed to guarantee that the
cache coherent memory writes have completed before writing to the MMIO
region."

The current writeX() and iowriteX() implementations on ia64 are not
satisfying this requirement as the barrier is after the register write.

This adds the missing memory barriers, and instead drops them from the
dma sync routine where they are misplaced (and were missing in the
more important map/unmap cases anyway).

All this doesn't affect the SN2 platform, which already has barrier
in the I/O accessors, and none in dma mapping (but then again
swiotlb doesn't have any either).

Signed-off-by: Christoph Hellwig 
---
 arch/ia64/hp/common/sba_iommu.c     |  4 ----
 arch/ia64/include/asm/dma-mapping.h |  5 -----
 arch/ia64/include/asm/io.h          |  5 +++++
 arch/ia64/kernel/machvec.c          | 16 ----------------
 arch/ia64/kernel/pci-dma.c          |  5 -----
 5 files changed, 5 insertions(+), 30 deletions(-)

diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index ee5b652d320a..e8da6503ed2f 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -2207,10 +2207,6 @@ const struct dma_map_ops sba_dma_ops = {
.unmap_page = sba_unmap_page,
.map_sg = sba_map_sg_attrs,
.unmap_sg   = sba_unmap_sg_attrs,
-   .sync_single_for_cpu= machvec_dma_sync_single,
-   .sync_sg_for_cpu= machvec_dma_sync_sg,
-   .sync_single_for_device = machvec_dma_sync_single,
-   .sync_sg_for_device = machvec_dma_sync_sg,
.dma_supported  = sba_dma_supported,
.mapping_error  = sba_dma_mapping_error,
 };
diff --git a/arch/ia64/include/asm/dma-mapping.h b/arch/ia64/include/asm/dma-mapping.h
index 76e4d6632d68..2b8cd4a6d958 100644
--- a/arch/ia64/include/asm/dma-mapping.h
+++ b/arch/ia64/include/asm/dma-mapping.h
@@ -16,11 +16,6 @@ extern const struct dma_map_ops *dma_ops;
 extern struct ia64_machine_vector ia64_mv;
 extern void set_iommu_machvec(void);
 
-extern void machvec_dma_sync_single(struct device *, dma_addr_t, size_t,
-   enum dma_data_direction);
-extern void machvec_dma_sync_sg(struct device *, struct scatterlist *, int,
-   enum dma_data_direction);
-
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
return platform_dma_get_ops(NULL);
diff --git a/arch/ia64/include/asm/io.h b/arch/ia64/include/asm/io.h
index fb0651961e2c..ba5523b67eaf 100644
--- a/arch/ia64/include/asm/io.h
+++ b/arch/ia64/include/asm/io.h
@@ -22,6 +22,7 @@
 
 #include 
 #include 
+#include 
 
 /* We don't use IO slowdowns on the ia64, but.. */
 #define __SLOW_DOWN_IO do { } while (0)
@@ -345,24 +346,28 @@ ___ia64_readq (const volatile void __iomem *addr)
 static inline void
 __writeb (unsigned char val, volatile void __iomem *addr)
 {
+   mb();
*(volatile unsigned char __force *) addr = val;
 }
 
 static inline void
 __writew (unsigned short val, volatile void __iomem *addr)
 {
+   mb();
*(volatile unsigned short __force *) addr = val;
 }
 
 static inline void
 __writel (unsigned int val, volatile void __iomem *addr)
 {
+   mb();
*(volatile unsigned int __force *) addr = val;
 }
 
 static inline void
 __writeq (unsigned long val, volatile void __iomem *addr)
 {
+   mb();
*(volatile unsigned long __force *) addr = val;
 }
 
diff --git a/arch/ia64/kernel/machvec.c b/arch/ia64/kernel/machvec.c
index 7bfe98859911..1b604d02250b 100644
--- a/arch/ia64/kernel/machvec.c
+++ b/arch/ia64/kernel/machvec.c
@@ -73,19 +73,3 @@ machvec_timer_interrupt (int irq, void *dev_id)
 {
 }
 EXPORT_SYMBOL(machvec_timer_interrupt);
-
-void
-machvec_dma_sync_single(struct device *hwdev, dma_addr_t dma_handle, size_t size,
-   enum dma_data_direction dir)
-{
-   mb();
-}
-EXPORT_SYMBOL(machvec_dma_sync_single);
-
-void
-machvec_dma_sync_sg(struct device *hwdev, struct scatterlist *sg, int n,
-   enum dma_data_direction dir)
-{
-   mb();
-}
-EXPORT_SYMBOL(machvec_dma_sync_sg);
diff --git a/arch/ia64/kernel/pci-dma.c b/arch/ia64/kernel/pci-dma.c
index 3c2884bef3d4..2512aa3029f5 100644
--- a/arch/ia64/kernel/pci-dma.c
+++ b/arch/ia64/kernel/pci-dma.c
@@ -55,11 +55,6 @@ void __init pci_iommu_alloc(void)
 {
dma_ops = &intel_dma_ops;
 
-   intel_dma_ops.sync_single_for_cpu = machvec_dma_sync_single;
-   intel_dma_ops.sync_sg_for_cpu = machvec_dma_sync_sg;
-   intel_dma_ops.sync_single_for_device = machvec_dma_sync_single;
-   intel_dma_ops.sync_sg_for_device = machvec_dma_sync_sg;
-
/*
 * The order of these functions is important for
 * fall-back/fail-over reasons
-- 
2.18.0


barriers vs I/O and DMA for ia64

2018-07-31 Thread Christoph Hellwig
Hi all,

please review these patches carefully - ia64 currently seems to be
the odd one out in terms of barrier placement for DMA and I/O and
this patch tries to resolve it.  But I don't have any IA64 hardware
nor do I know the architecture too well, so don't blindly trust me.
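
For reviewers who want the ordering question in concrete form, here is a
minimal user-space sketch of it, not the ia64 code itself. It assumes a
C11 fence can stand in for mb(); demo_ring, demo_doorbell, demo_writel()
and demo_post() are hypothetical names invented for illustration. The
point is only that the fence sits before the register store, so the
descriptor writes are ordered ahead of the doorbell write without the
caller adding a wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor living in cache coherent memory. */
struct demo_desc {
	uint32_t addr;
	uint32_t len;
};

static struct demo_desc demo_ring[4];
/* Hypothetical stand-in for a device doorbell register. */
static volatile uint32_t demo_doorbell;

/*
 * Models an accessor with the barrier placed before the register
 * store. The C11 fence is an assumption standing in for mb().
 */
static inline void demo_writel(uint32_t val, volatile uint32_t *reg)
{
	atomic_thread_fence(memory_order_seq_cst);
	*reg = val;
}

/* Driver-side pattern: fill the descriptor, then ring the doorbell. */
static uint32_t demo_post(uint32_t slot, uint32_t addr, uint32_t len)
{
	assert(slot < 4);
	demo_ring[slot].addr = addr;
	demo_ring[slot].len = len;
	/* No explicit wmb() here: the accessor is expected to order it. */
	demo_writel(slot + 1, &demo_doorbell);
	return demo_doorbell;
}
```

With the barrier after the store instead, the caller would have to add
its own barrier between filling the descriptor and calling the accessor,
which is what the memory-barriers.txt text says should not be needed.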


Re: [PATCH 1/3] hexagon: remove the sync_single_for_cpu DMA operation

2018-07-31 Thread Christoph Hellwig
On Wed, Jul 25, 2018 at 06:39:27AM +0200, Christoph Hellwig wrote:
> On Tue, Jul 24, 2018 at 10:29:48PM -0500, Richard Kuo wrote:
> > Patch series looks good.  Definitely appreciate the cleanup.
> > 
> > I can take it through my tree, or if not:
> > 
> > Acked-by: Richard Kuo 
> 
> Please take it through your tree, thanks!

I haven't seen it in linux-next yet, do you still plan to take it?

Otherwise I'll merge it in the dma-mapping tree.


Re: use the generic dma-noncoherent code for sh V2

2018-07-31 Thread Arnd Bergmann
On Mon, Jul 30, 2018 at 11:06 AM, Christoph Hellwig  wrote:
> On Fri, Jul 27, 2018 at 11:20:21AM -0500, Rob Landley wrote:
>> Speaking of DMA:
>
> Which really has nothing to do with the dma mapping code, which
> also means I can't help you much unfortunately.
>
> That being said sh is the last one pending of the initial dma-noncoherent
> conversion, I'd greatly appreciate if we could get this reviewed and
> merged for the 4.19 merge window.

I've spent 30 minutes looking through your submission to find something
wrong with it now, but it all looks fine, the only criticism would be that
some of the changelogs could provide a little more background.

The original implementation seems odd in some places, but your
new version resolves the few concerns I had (like mixing up
phys and dma addresses), and I didn't see anything that should
change behavior.

I hope that helps.

  Arnd


Re: use the generic dma-noncoherent code for sh V2

2018-07-31 Thread Arnd Bergmann
On Fri, Jul 27, 2018 at 6:20 PM, Rob Landley  wrote:
> On 07/24/2018 03:21 PM, Christoph Hellwig wrote:
>> On Tue, Jul 24, 2018 at 02:01:42PM +0200, Christoph Hellwig wrote:
>>> Hi all,
>>>
>>> can you review these patches to switch sh to use the generic
>>> dma-noncoherent code?  All the requirements are in mainline already
>>> and we've switched various architectures over to it already.
>>
>> Ok, there is one more issue with this version.   Wait for a new one
>> tomorrow.
>
> Speaking of DMA:
>
> I'm trying to wire up DMAEngine to an sh7760 board that uses platform data
> (and fix the smc91x.c driver to use DMAEngine without #ifdef arm), so I've
> been reading through all that stuff, but the docs seem kinda... thin?
>
> Is there something I should have read other than
> Documentation/driver-model/platform.txt,
> Documentation/dmaengine/{provider,client}.txt, then trying to pick through
> the source code and the sh7760 hardware pdf? (And watching the youtube video
> of Laurent Pinchart's 2014 ELC talk on DMA, Maxime Ripard's 2015 ELC overview
> of DMAEngine, the Xilinx video on DMAEngine...)
>
> At first I thought the SH_DMAE could initialize itself, but the probe function
> needs platform data, and although arch/sh/kernel/cpu/sh4a/setup-sh7722.c looks
> _kind_ of like a model I can crib from:

> B) That platform data is supplying sh_dmae_slave_config preallocating slave
> channels to devices? (Does it have to? The docs gave me the impression the
> driver would dynamically request them and devices could even share. Wasn't
> that sort of the point of DMAEngine? Can my new board data _not_ do that?
> What's the minimum amount of micromanaging I have to do?)

The thing here is that arch/sh is way behind on the API use, and it
has prevented us from cleaning up drivers as well. A slave driver
should have to just call dma_request_chan() with a constant
string to identify its channel rather than going two different ways
depending on whether it's used with DT or platform data.

If you hack on it, please convert the dmaengine platform data to use
a dma_slave_map array to pass the data into the dmaengine driver,
mapping the settings from a (pdev-name, channel-id) tuple to a pointer
that describes the channel configuration rather than having the
mapping from a numerical slave_id to a struct sh_dmae_slave_config
in the setup files. It should be a fairly mechanical conversion.
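
To make the (pdev-name, channel-id) lookup concrete, here is a small
user-space model of it. This is only a sketch: struct demo_slave_map is
shaped like the kernel's struct dma_slave_map (devname, slave, param),
but struct demo_chan_cfg, the "demo-serial.0" device name, the request
ids and demo_find_param() are all made up for illustration and are not
taken from any sh setup file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-channel configuration; the field is illustrative. */
struct demo_chan_cfg {
	unsigned int request_id;
};

/* Shaped like struct dma_slave_map: device name, channel name, data. */
struct demo_slave_map {
	const char *devname;
	const char *slave;
	void *param;
};

static struct demo_chan_cfg demo_serial_tx = { .request_id = 1 };
static struct demo_chan_cfg demo_serial_rx = { .request_id = 2 };

/* Board data: one entry per (device name, channel name) pair. */
static const struct demo_slave_map demo_map[] = {
	{ "demo-serial.0", "tx", &demo_serial_tx },
	{ "demo-serial.0", "rx", &demo_serial_rx },
};

/* Returns the configuration for a pair, or NULL if none is listed. */
static void *demo_find_param(const char *devname, const char *slave)
{
	size_t i;

	assert(devname && slave);
	for (i = 0; i < sizeof(demo_map) / sizeof(demo_map[0]); i++) {
		if (!strcmp(demo_map[i].devname, devname) &&
		    !strcmp(demo_map[i].slave, slave))
			return demo_map[i].param;
	}
	return NULL;
}
```

In this model the slave driver only needs to know its channel name
("tx" or "rx"); the numeric id stays inside the board data.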

The other part I noticed is arch/sh/drivers/dma/*, which appears to
be entirely unused, and should probably be removed.

 Arnd


Re: [PATCH 08/20] powerpc/dma: remove the unused dma_nommu_ops export

2018-07-31 Thread Christoph Hellwig
It turns out cxl actually uses it.  So for now skip this patch,
although random code in drivers messing with dma ops will need to
be sorted out sooner or later.


Re: [PATCH v2 0/7] Stop losing firmware-set DMA masks

2018-07-31 Thread Robin Murphy

Hi Arnd,

On 29/07/18 13:32, Arnd Bergmann wrote:

> On Tue, Jul 24, 2018 at 12:16 AM, Robin Murphy wrote:
>
>> Whilst the common firmware code invoked by dma_configure() initialises
>> devices' DMA masks according to limitations described by the respective
>> properties ("dma-ranges" for OF and _DMA/IORT for ACPI), the nature of
>> the dma_set_mask() API leads to that information getting lost when
>> well-behaved drivers probe and set a 64-bit mask, since in general
>> there's no way to tell the difference between a firmware-described mask
>> (which should be respected) and whatever default may have come from the
>> bus code (which should be replaced outright). This can break DMA on
>> systems with certain IOMMU topologies (e.g. [1]) where the IOMMU driver
>> only knows its maximum supported address size, not how many of those
>> address bits might actually be wired up between any of its input
>> interfaces and the associated DMA master devices. Similarly, some PCIe
>> root complexes only have a 32-bit native interface on their host bridge,
>> which leads to the same DMA-address-truncation problem in systems with a
>> larger physical memory map and RAM above 4GB (e.g. [2]).
>>
>> These patches attempt to deal with this in the simplest way possible by
>> generalising the specific quirk for 32-bit bridges into an arbitrary
>> mask which can then also be plumbed into the firmware code. In the
>> interest of being minimally invasive, I've only included a point fix
>> for the IOMMU issue as seen on arm64 - there may be further tweaks
>> needed in DMA ops (e.g. in arch/arm/ and other OF users) to catch all
>> possible incarnations of this problem, but at least any that I'm not
>> fixing here have always been broken. It is also noteworthy that
>> of_dma_get_range() has never worked properly for the way PCI host
>> bridges are passed into of_dma_configure() - I'll be working on
>> further patches to sort that out once this part is done.
>
> Thanks a lot for working on this, this has bugged me for many years,
> and I've discussed possible solutions with lots of people over time.
>
> I /think/ all your patches are good, but I'm currently travelling and don't
> have a chance to review the resulting overall implementation.
> Could you summarize what happens in the following corner cases of
> DT dma-ranges after your changes (with a driver not setting a mask,
> setting a 64-bit mask and setting a 24-bit mask, respectively)?
>
> a) a device with no dma-ranges property anywhere in its parents
> b) a device with a 64-bit dma-ranges translation in its parent
>    but none in its grandparent
> c) a device with no dma-ranges in its parent but a 64-bit mask
>    in its grandparent
> d) a device with a 24-bit mask in its parent.


In terms of the actual dma-ranges parsing, nothing should be changed by 
these patches, so the weirdness and inconsistency that I'm pretty sure 
exists for some of those cases will still be there for the moment - I'm 
starting on actually fixing of_dma_get_range() now.


The effect after these patches is that a device with a "valid" (per the 
current of_dma_get_range() implementation) dma-ranges translation gets 
its bus_dma_mask set to cover the given range, whereas a device with no
valid dma-ranges effectively gets a 32-bit bus_dma_mask. That's slightly 
different from the ACPI default behaviour, due to subtle spec 
differences, but I think it's in line with what you've proposed before 
for DT, and it's certainly still flexible if anyone has a different 
view. The bus_dma_mask in itself should also be low-impact, since it 
will only currently be enforced in the generic dma-direct and iommu-dma 
paths, so the likes of powerpc shouldn't see any change at all just yet.
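
As a rough illustration of how a bus mask can cap whatever mask the
driver sets, here is a user-space sketch. It assumes that a zero bus
mask means "no bus limit" and that the smaller of the two masks wins;
demo_bit_mask(), demo_effective_mask() and demo_dma_capable() are
hypothetical helpers, not the dma-direct or iommu-dma code.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low 'bits' bits set, for 1 <= bits <= 64. */
static uint64_t demo_bit_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * The driver-set mask is capped by the bus mask when one is present.
 * Treating bus_mask == 0 as "no limit" is an assumption of this sketch.
 */
static uint64_t demo_effective_mask(uint64_t dev_mask, uint64_t bus_mask)
{
	if (bus_mask && bus_mask < dev_mask)
		return bus_mask;
	return dev_mask;
}

/* True if [addr, addr + size) fits entirely under the effective mask. */
static int demo_dma_capable(uint64_t addr, uint64_t size,
			    uint64_t dev_mask, uint64_t bus_mask)
{
	if (size == 0 || addr + size - 1 < addr)
		return 0;	/* empty or wrapping range */
	return addr + size - 1 <= demo_effective_mask(dev_mask, bus_mask);
}
```

Under these assumptions a driver that sets a 64-bit mask behind a 32-bit
bus limit is still held to 32 bits, while a driver that sets a 24-bit
mask keeps its own, tighter limit.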


Robin.


Re: [PATCH] PCI: call dma_debug_add_bus for pci_bus_type in common code

2018-07-31 Thread Bjorn Helgaas
On Mon, Jul 30, 2018 at 09:38:42AM +0200, Christoph Hellwig wrote:
> There is nothing arch specific about PCI or dma-debug, so move this
> call to common code just after registering the bus type.
> 
> Signed-off-by: Christoph Hellwig 

Applied with acks from Thomas and Michael to pci/misc for v4.19, thanks!

> ---
>  arch/powerpc/kernel/dma.c | 3 ---
>  arch/sh/drivers/pci/pci.c | 2 --
>  arch/x86/kernel/pci-dma.c | 3 ---
>  drivers/pci/pci-driver.c  | 2 +-
>  4 files changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
> index 155170d70324..dbfc7056d7df 100644
> --- a/arch/powerpc/kernel/dma.c
> +++ b/arch/powerpc/kernel/dma.c
> @@ -357,9 +357,6 @@ EXPORT_SYMBOL_GPL(dma_get_required_mask);
>  
>  static int __init dma_init(void)
>  {
> -#ifdef CONFIG_PCI
> - dma_debug_add_bus(&pci_bus_type);
> -#endif
>  #ifdef CONFIG_IBMVIO
>   dma_debug_add_bus(&vio_bus_type);
>  #endif
> diff --git a/arch/sh/drivers/pci/pci.c b/arch/sh/drivers/pci/pci.c
> index e5b7437ab4af..8256626bc53c 100644
> --- a/arch/sh/drivers/pci/pci.c
> +++ b/arch/sh/drivers/pci/pci.c
> @@ -160,8 +160,6 @@ static int __init pcibios_init(void)
>   for (hose = hose_head; hose; hose = hose->next)
>   pcibios_scanbus(hose);
>  
> - dma_debug_add_bus(&pci_bus_type);
> -
>   pci_initialized = 1;
>  
>   return 0;
> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index ab5d9dd668d2..43f58632f123 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -155,9 +155,6 @@ static int __init pci_iommu_init(void)
>  {
>   struct iommu_table_entry *p;
>  
> -#ifdef CONFIG_PCI
> - dma_debug_add_bus(&pci_bus_type);
> -#endif
>   x86_init.iommu.iommu_init();
>  
>   for (p = __iommu_table; p < __iommu_table_end; p++) {
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 6792292b5fc7..bef17c3fca67 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1668,7 +1668,7 @@ static int __init pci_driver_init(void)
>   if (ret)
>   return ret;
>  #endif
> -
> + dma_debug_add_bus(&pci_bus_type);
>   return 0;
>  }
>  postcore_initcall(pci_driver_init);
> -- 
> 2.18.0
> 


Re: [PATCH] powerpc: do not redefine NEED_DMA_MAP_STATE

2018-07-31 Thread Michael Ellerman
Christoph Hellwig  writes:

> kernel/dma/Kconfig already defines NEED_DMA_MAP_STATE, just select it
> from PPC64 and NOT_COHERENT_CACHE instead.
>
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/powerpc/Kconfig   | 3 ---
>  arch/powerpc/platforms/Kconfig.cputype | 2 ++
>  2 files changed, 2 insertions(+), 3 deletions(-)

Thanks.

I did this instead:

commit 870771ae76010c5e42ee8e0278f5823e46e96e3f (HEAD -> next-test)
Author: Christoph Hellwig 
AuthorDate: Mon Jul 30 09:37:21 2018 +0200
Commit: Michael Ellerman 
CommitDate: Tue Jul 31 20:43:57 2018 +1000

powerpc: Do not redefine NEED_DMA_MAP_STATE

kernel/dma/Kconfig already defines NEED_DMA_MAP_STATE, just select it
from CONFIG_PPC using the same condition as an if guard.

Signed-off-by: Christoph Hellwig 
[mpe: Move it under PPC]
Signed-off-by: Michael Ellerman 

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 5eb4d969afbf..ee38fce075ee 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -226,6 +226,7 @@ config PPC
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select MODULES_USE_ELF_RELA
+   select NEED_DMA_MAP_STATE   if PPC64 || NOT_COHERENT_CACHE
select NEED_SG_DMA_LENGTH
select NO_BOOTMEM
select OF
@@ -885,9 +886,6 @@ config ZONE_DMA
bool
default y
 
-config NEED_DMA_MAP_STATE
-   def_bool (PPC64 || NOT_COHERENT_CACHE)
-
 config GENERIC_ISA_DMA
bool
depends on ISA_DMA_API


cheers


Re: [PATCH] PCI: call dma_debug_add_bus for pci_bus_type in common code

2018-07-31 Thread Michael Ellerman
Christoph Hellwig  writes:

> There is nothing arch specific about PCI or dma-debug, so move this
> call to common code just after registering the bus type.
>
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/powerpc/kernel/dma.c | 3 ---
>  arch/sh/drivers/pci/pci.c | 2 --
>  arch/x86/kernel/pci-dma.c | 3 ---
>  drivers/pci/pci-driver.c  | 2 +-
>  4 files changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
> index 155170d70324..dbfc7056d7df 100644
> --- a/arch/powerpc/kernel/dma.c
> +++ b/arch/powerpc/kernel/dma.c
> @@ -357,9 +357,6 @@ EXPORT_SYMBOL_GPL(dma_get_required_mask);
>  
>  static int __init dma_init(void)
>  {
> -#ifdef CONFIG_PCI
> - dma_debug_add_bus(&pci_bus_type);
> -#endif
>  #ifdef CONFIG_IBMVIO
>   dma_debug_add_bus(&vio_bus_type);
>  #endif

Acked-by: Michael Ellerman  (powerpc)

cheers


Re: [PATCH] PCI: call dma_debug_add_bus for pci_bus_type in common code

2018-07-31 Thread Christoph Hellwig
On Mon, Jul 30, 2018 at 04:17:13PM -0500, Bjorn Helgaas wrote:
> [+cc Joerg]
> 
> On Mon, Jul 30, 2018 at 09:38:42AM +0200, Christoph Hellwig wrote:
> > There is nothing arch specific about PCI or dma-debug, so move this
> > call to common code just after registering the bus type.
> 
> I assume that previously, even if the user set CONFIG_DMA_API_DEBUG=y
> we only got PCI DMA debug on powerpc, sh, and x86.  And after this
> patch, we'll get PCI DMA debug on *all* arches?

Yes.  Note that this only covers the actual bus related part, that
is warning about outstanding dma mappings on unload.  The rest of the
dma api debugging already is entirely generic.
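
A toy user-space model of that bus-related check, purely to illustrate
the idea: count outstanding mappings per device and complain if any are
left when the driver goes away. struct demo_device and the demo_*()
helpers are hypothetical and do not mirror the dma-debug implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical device with a count of live DMA mappings. */
struct demo_device {
	const char *name;
	int outstanding;
};

static void demo_map(struct demo_device *dev)
{
	assert(dev);
	dev->outstanding++;
}

static void demo_unmap(struct demo_device *dev)
{
	assert(dev);
	if (dev->outstanding > 0)
		dev->outstanding--;
}

/*
 * Called when the driver is unloaded: warn about leaked mappings and
 * return how many were still outstanding.
 */
static int demo_unload_check(const struct demo_device *dev)
{
	assert(dev);
	if (dev->outstanding)
		fprintf(stderr, "%s: %d DMA mapping(s) still outstanding\n",
			dev->name, dev->outstanding);
	return dev->outstanding;
}
```

A driver that maps twice and unmaps once would trigger the warning with
a count of one in this model.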


[PATCH v3] sparc: use generic dma_noncoherent_ops

2018-07-31 Thread Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation.

This removes the previous sync_single_for_device implementation, which
looks bogus given that no syncing is happening in the similar but more
important map_single case.

Signed-off-by: Christoph Hellwig 
Acked-by: Sam Ravnborg 
---

Changes since v2:
 - remove incorrect hunk to set the sparc cross compiler

Changes since v1:
 - clean up various tidbits
 - add Ack from Sam

 arch/sparc/Kconfig   |   2 +
 arch/sparc/include/asm/dma-mapping.h |   5 +-
 arch/sparc/kernel/ioport.c   | 193 +--
 3 files changed, 35 insertions(+), 165 deletions(-)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 0f535debf802..79f29c67291a 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -48,6 +48,8 @@ config SPARC
 
 config SPARC32
def_bool !64BIT
+   select ARCH_HAS_SYNC_DMA_FOR_CPU
+   select DMA_NONCOHERENT_OPS
select GENERIC_ATOMIC64
select CLZ_TAB
select HAVE_UID16
diff --git a/arch/sparc/include/asm/dma-mapping.h b/arch/sparc/include/asm/dma-mapping.h
index 12ae33daf52f..e17566376934 100644
--- a/arch/sparc/include/asm/dma-mapping.h
+++ b/arch/sparc/include/asm/dma-mapping.h
@@ -7,7 +7,6 @@
 #include 
 
 extern const struct dma_map_ops *dma_ops;
-extern const struct dma_map_ops pci32_dma_ops;
 
 extern struct bus_type pci_bus_type;
 
@@ -15,11 +14,11 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
 #ifdef CONFIG_SPARC_LEON
if (sparc_cpu_model == sparc_leon)
-   return &pci32_dma_ops;
+   return &dma_noncoherent_ops;
 #endif
 #if defined(CONFIG_SPARC32) && defined(CONFIG_PCI)
if (bus == &pci_bus_type)
-   return &pci32_dma_ops;
+   return &dma_noncoherent_ops;
 #endif
return dma_ops;
 }
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index cca9134cfa7d..6799c93c9f27 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -434,42 +435,41 @@ arch_initcall(sparc_register_ioport);
 /* Allocate and map kernel buffer using consistent mode DMA for a device.
  * hwdev should be valid struct pci_dev pointer for PCI devices.
  */
-static void *pci32_alloc_coherent(struct device *dev, size_t len,
- dma_addr_t *pba, gfp_t gfp,
- unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+   gfp_t gfp, unsigned long attrs)
 {
-   unsigned long len_total = PAGE_ALIGN(len);
+   unsigned long len_total = PAGE_ALIGN(size);
void *va;
struct resource *res;
int order;
 
-   if (len == 0) {
+   if (size == 0) {
return NULL;
}
-   if (len > 256*1024) {   /* __get_free_pages() limit */
+   if (size > 256*1024) {  /* __get_free_pages() limit */
return NULL;
}
 
order = get_order(len_total);
va = (void *) __get_free_pages(gfp, order);
if (va == NULL) {
-   printk("pci_alloc_consistent: no %ld pages\n", len_total>>PAGE_SHIFT);
+   printk("%s: no %ld pages\n", __func__, len_total>>PAGE_SHIFT);
goto err_nopages;
}
 
if ((res = kzalloc(sizeof(struct resource), GFP_KERNEL)) == NULL) {
-   printk("pci_alloc_consistent: no core\n");
+   printk("%s: no core\n", __func__);
goto err_nomem;
}
 
if (allocate_resource(&_sparc_dvma, res, len_total,
_sparc_dvma.start, _sparc_dvma.end, PAGE_SIZE, NULL, NULL) != 0) {
-   printk("pci_alloc_consistent: cannot occupy 0x%lx", len_total);
+   printk("%s: cannot occupy 0x%lx", __func__, len_total);
goto err_nova;
}
srmmu_mapiorange(0, virt_to_phys(va), res->start, len_total);
 
-   *pba = virt_to_phys(va); /* equals virt_to_bus (R.I.P.) for us. */
+   *dma_handle = virt_to_phys(va);
return (void *) res->start;
 
 err_nova:
@@ -481,184 +481,53 @@ static void *pci32_alloc_coherent(struct device *dev, size_t len,
 }
 
 /* Free and unmap a consistent DMA buffer.
- * cpu_addr is what was returned from pci_alloc_consistent,
- * size must be the same as what as passed into pci_alloc_consistent,
- * and likewise dma_addr must be the same as what *dma_addrp was set to.
+ * cpu_addr is what was returned from arch_dma_alloc, size must be the same as
+ * what was passed into arch_dma_alloc, and likewise dma_addr must be the same
+ * as what *dma_handle was set to.
  *
  * References to the memory and mappings associated with cpu_addr/dma_addr
  * past this call are illegal.
  */
-static void pci32_free_coherent(struct device *dev, size_t n, void *p,
-  

Re: [PATCH] PCI: call dma_debug_add_bus for pci_bus_type in common code

2018-07-31 Thread Joerg Roedel
On Mon, Jul 30, 2018 at 04:17:13PM -0500, Bjorn Helgaas wrote:
> [+cc Joerg]
> 
> On Mon, Jul 30, 2018 at 09:38:42AM +0200, Christoph Hellwig wrote:
> > There is nothing arch specific about PCI or dma-debug, so move this
> > call to common code just after registering the bus type.
> 
> I assume that previously, even if the user set CONFIG_DMA_API_DEBUG=y
> we only got PCI DMA debug on powerpc, sh, and x86.  And after this
> patch, we'll get PCI DMA debug on *all* arches?
> 
> If that's true, I'll add a comment to that effect to the commitlog
> since that new functionality might be of interest to other arches.

There should be implicit support for dma-debug for all arches that use
the generic dma_ops code. The dma_debug_add_bus() function just adds the
reporting of pending dma-allocations on driver-unload for a device. 

Regards,

Joerg



Re: use the generic dma-noncoherent code for sh V3

2018-07-31 Thread Christoph Hellwig
On Tue, Jul 31, 2018 at 03:06:13PM +0900, Yoshinori Sato wrote:
> On Wed, 25 Jul 2018 18:40:38 +0900,
> Christoph Hellwig wrote:
> > 
> > Hi all,
> > 
> > can you review these patches to switch sh to use the generic
> > dma-noncoherent code?  All the requirements are in mainline already
> > and we've switched various architectures over to it already.
> > 
> > Changes since V2:
> >  - drop a now obsolete export
> > 
> > Changes since V1:
> >  - fixed two stupid compile errors and verified them using a local
> >cross toolchain instead of the 0day buildbot
> 
> Acked-by: Yoshinori Sato 

Do you want to pull this in through the sh tree?  If not I'd be happy
to take it through the dma mapping tree.


Re: use the generic dma-noncoherent code for sh V3

2018-07-31 Thread Yoshinori Sato
On Wed, 25 Jul 2018 18:40:38 +0900,
Christoph Hellwig wrote:
> 
> Hi all,
> 
> can you review these patches to switch sh to use the generic
> dma-noncoherent code?  All the requirements are in mainline already
> and we've switched various architectures over to it already.
> 
> Changes since V2:
>  - drop a now obsolete export
> 
> Changes since V1:
>  - fixed two stupid compile errors and verified them using a local
>cross toolchain instead of the 0day buildbot

Acked-by: Yoshinori Sato 

-- 
Yosinori Sato