Re: [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI

2018-06-15 Thread Ricardo Neri
On Fri, Jun 15, 2018 at 11:19:09AM +0200, Thomas Gleixner wrote:
> On Thu, 14 Jun 2018, Ricardo Neri wrote:
> > On Wed, Jun 13, 2018 at 11:40:00AM +0200, Thomas Gleixner wrote:
> > > On Tue, 12 Jun 2018, Ricardo Neri wrote:
> > > > @@ -183,6 +184,8 @@ static irqreturn_t hardlockup_detector_irq_handler(int irq, void *data)
> > > >     if (!(hdata->flags & HPET_DEV_PERI_CAP))
> > > >         kick_timer(hdata);
> > > >  
> > > > +   pr_err("This interrupt should not have happened. Ensure delivery mode is NMI.\n");
> > > 
> > > Eeew.
> > 
> > If you don't mind me asking, what is the problem with this error message?
> 
> The problem is not the error message. The problem is the abuse of
> request_irq() and the fact that this irq handler function exists in the
> first place for something which is NMI based.

I wanted to add this handler in case the interrupt was not configured correctly
to be delivered as NMI (e.g., not supported by the hardware). I see your point.
Perhaps this is not needed. There is code in place to complain when an interrupt
that nobody was expecting happens.

> 
> > > And in case the HPET does not support periodic mode, this reprograms
> > > the timer on every NMI, which means that while perf is running the watchdog
> > > will never ever detect anything.
> > 
> > Yes. I see that this is wrong. With MSI interrupts, as far as I can
> > see, there is no way to make sure that the HPET timer caused the NMI.
> > Perhaps the only option is to use an IO APIC interrupt and read the
> > interrupt status register.
> > 
> > > Aside of that, reading TWO HPET registers for every NMI is insane. HPET
> > > access is horribly slow, so any high frequency perf monitoring will take a
> > > massive performance hit.
> > 
> > If an IO APIC interrupt is used, only one HPET register (the status register)
> > would need to be read for every NMI. Would that be more acceptable? Otherwise,
> > there is no way to determine if the HPET caused the NMI.
> 
> You need level trigger for the HPET status register to be useful at all
> because in edge mode the interrupt status bits read always 0.

Indeed.

> 
> That means you have to fiddle with the IOAPIC acknowledge magic from NMI
> context. Brilliant idea. If the NMI hits in the middle of a regular
> io_apic_read() then the interrupted code will end up with the wrong index
> register. Not to talk about the fun which the affinity rotation from NMI
> context would bring.
> 
> Do not even think about using IOAPIC and level for this.

OK. I will stay away from it and focus on MSI.
> 
> > Alternatively, there could be a counter that skips reading the HPET status
> > register (and the detection of hardlockups) for every X NMIs. This would
> > reduce the overall frequency of HPET register reads.
> 
> Great plan. So if the watchdog is the only NMI (because perf is off) then
> you delay the watchdog detection by that count.

OK. This was a bad idea. Then, is it acceptable to have one read of an HPET
register per NMI, just to check in the status register whether the HPET timer
caused the NMI?

Thanks and BR,
Ricardo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
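
The index-register hazard raised in the thread above (an NMI landing in the
middle of a regular io_apic_read()) can be modelled in plain user-space C.
This is only a sketch under stated assumptions: `struct fake_ioapic` and all
function names are hypothetical stand-ins for an index/data register pair and
are not kernel code.

```c
#include <stdint.h>

/* Hypothetical model of an indirect register file: select an index, then read. */
struct fake_ioapic {
    uint32_t index;    /* selects which register the data window exposes */
    uint32_t regs[4];  /* backing registers */
};

static void select_reg(struct fake_ioapic *io, uint32_t reg)
{
    io->index = reg;
}

static uint32_t read_window(const struct fake_ioapic *io)
{
    return io->regs[io->index];
}

/* A complete two-step read performed from (simulated) NMI context. */
static uint32_t nmi_read(struct fake_ioapic *io, uint32_t reg)
{
    select_reg(io, reg);
    return read_window(io);
}

/*
 * A regular read that may be interrupted between its two steps.
 * nmi_reg < 0 means no NMI arrives; otherwise the simulated NMI reads
 * register nmi_reg and leaves the index pointing there.
 */
static uint32_t interrupted_read(struct fake_ioapic *io, uint32_t reg, int nmi_reg)
{
    select_reg(io, reg);
    if (nmi_reg >= 0)
        (void)nmi_read(io, (uint32_t)nmi_reg);  /* clobbers the index */
    return read_window(io);                     /* may now read the wrong register */
}
```

With registers holding 0x10..0x13, a read of register 1 returns 0x11 when
undisturbed, but returns 0x13 if the simulated NMI reads register 3 in
between. The interrupted code has no way to notice, which is why the index
register must not be touched from NMI context.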


Re: [RFC PATCH 20/23] watchdog/hardlockup/hpet: Rotate interrupt among all monitored CPUs

2018-06-15 Thread Ricardo Neri
On Fri, Jun 15, 2018 at 12:29:06PM +0200, Thomas Gleixner wrote:
> On Thu, 14 Jun 2018, Ricardo Neri wrote:
> > On Wed, Jun 13, 2018 at 11:48:09AM +0200, Thomas Gleixner wrote:
> > > On Tue, 12 Jun 2018, Ricardo Neri wrote:
> > > > +   /* There are no CPUs to monitor. */
> > > > +   if (!cpumask_weight(>monitored_mask))
> > > > +   return NMI_HANDLED;
> > > > +
> > > > inspect_for_hardlockups(regs);
> > > >  
> > > > +   /*
> > > > +* Target a new CPU. Keep trying until we find a monitored CPU. 
> > > > CPUs
> > > > +* are addded and removed to this mask at cpu_up() and 
> > > > cpu_down(),
> > > > +* respectively. Thus, the interrupt should be able to be moved 
> > > > to
> > > > +* the next monitored CPU.
> > > > +*/
> > > > +   spin_lock(_data->lock);
> > > 
> > > Yuck. Taking a spinlock from NMI ...
> > 
> > I am sorry. I will look into other options for locking. Do you think rcu_lock
> > would help in this case? I need this locking because the set of CPUs being
> > monitored changes as CPUs come online and offline.
> 
> Sure, but you _cannot_ take any locks in NMI context which are also taken
> in !NMI context. And RCU will not help either. How so? The NMI can hit
> exactly before the CPU bit is cleared and then the CPU goes down. So RCU
> _cannot_ protect anything.
> 
> All you can do there is make sure that the TIMn_CONF is only ever accessed
> in !NMI code. Then you can stop the timer _before_ a CPU goes down and make
> sure that any NMI which is still in flight has finished. After that you can
> fiddle with the CPU mask and restart the timer. Be aware that this is going
> to be more corner case handling than actual functionality.

Thanks for the suggestion. It makes sense to stop the timer when updating the
CPU mask. In this manner the timer will not cause any NMI.
> 
> > > > +   for_each_cpu_wrap(cpu, >monitored_mask, 
> > > > smp_processor_id() + 1) {
> > > > +   if (!irq_set_affinity(hld_data->irq, cpumask_of(cpu)))
> > > > +   break;
> > > 
> > > ... and then calling into generic interrupt code which will take even more
> > > locks is completely broken.
> > 
> > I will look into reworking how the destination of the interrupt is set.
> 
> You have to consider two cases:
> 
>  1) !remapped mode:
> 
> That's reasonably simple because you just have to deal with the HPET
> TIMERn_PROCMSG_ROUT register. But then you need to do this directly and
> not through any of the existing interrupt facilities.

Indeed, there is no need to use the generic interrupt facilities to set affinity;
I am dealing with an NMI anyway.
> 
>  2) remapped mode:
> 
> That's way more complex as you _cannot_ ever do anything which touches
> the IOMMU and the related tables.
> 
> So you'd need to reserve an IOMMU remapping entry for each CPU upfront,
> store the resulting value for the HPET TIMERn_PROCMSG_ROUT register in
> per cpu storage and just modify that one from NMI.
> 
> Though there might be subtle side effects involved, which are related to
> the acknowledge part. You need to talk to the IOMMU wizards first.

I see. I will look into the code and prototype something that makes sense for
the IOMMU maintainers.

> 
> All in all, the idea itself is interesting, but the envisioned approach of
> round robin and no fast accessible NMI reason detection is going to create
> more problems than it solves.

I see it more clearly now.

Thanks and BR,
Ricardo
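
The suggestion above (precompute one routing value per CPU outside NMI
context, then only pick and write a stored value from the NMI) can be
sketched in user-space C. Everything here is an assumption made for
illustration: the names, the 8-CPU limit, the plain bitmask standing in for a
cpumask, and the placeholder route values, which do not reflect any real HPET
or IOMMU register format.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 8  /* hypothetical small CPU count */

/* Hypothetical per-CPU route values, filled in once outside NMI context. */
static uint32_t route_for_cpu[NR_CPUS_SKETCH];

/* Runs in !NMI context. The values are placeholders, not a real encoding. */
static void precompute_routes(void)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        route_for_cpu[cpu] = 0xA000u | cpu;
}

/*
 * Return the next set bit in 'mask' after 'cur', wrapping around, or -1 if
 * the mask is empty. Takes no locks and calls into nothing else.
 */
static int next_monitored_cpu(uint32_t mask, unsigned int cur)
{
    for (unsigned int i = 1; i <= NR_CPUS_SKETCH; i++) {
        unsigned int cpu = (cur + i) % NR_CPUS_SKETCH;

        if (mask & (1u << cpu))
            return (int)cpu;
    }
    return -1;
}

/* What the NMI path would do: choose the next CPU and fetch its stored value. */
static int pick_route(uint32_t mask, unsigned int cur, uint32_t *route)
{
    int cpu = next_monitored_cpu(mask, cur);

    if (cpu < 0)
        return -1;
    *route = route_for_cpu[cpu];
    return cpu;
}
```

The sketch passes the mask by value, which only mirrors the real constraint
if the mask cannot change while an NMI is running. That is what stopping the
timer before updating the CPU mask, as discussed above, is meant to
guarantee.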


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-15 Thread Ricardo Neri
On Fri, Jun 15, 2018 at 09:01:02AM +0100, Julien Thierry wrote:
> Hi Ricardo,
> 
> On 15/06/18 03:12, Ricardo Neri wrote:
> >On Wed, Jun 13, 2018 at 11:06:25AM +0100, Marc Zyngier wrote:
> >>On 13/06/18 10:20, Thomas Gleixner wrote:
> >>>On Wed, 13 Jun 2018, Julien Thierry wrote:
> >>>>On 13/06/18 09:34, Peter Zijlstra wrote:
> >>>>>On Tue, Jun 12, 2018 at 05:57:23PM -0700, Ricardo Neri wrote:
> >>>>>>diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> >>>>>>index 5426627..dbc5e02 100644
> >>>>>>--- a/include/linux/interrupt.h
> >>>>>>+++ b/include/linux/interrupt.h
> >>>>>>@@ -61,6 +61,8 @@
> >>>>>>   *                interrupt handler after suspending interrupts. For system
> >>>>>>   *                wakeup devices users need to implement wakeup detection in
> >>>>>>   *                their interrupt handlers.
> >>>>>>+ * IRQF_DELIVER_AS_NMI - Configure interrupt to be delivered as non-maskable, if
> >>>>>>+ *                supported by the chip.
> >>>>>>   */
> >>>>>
> >>>>>NAK on the first 6 patches. You really _REALLY_ don't want to expose
> >>>>>NMIs to this level.
> >>>>>
> >>>>
> >>>>I've been working on something similar on the arm64 side, and effectively the
> >>>>one thing that might be common to arm64 and intel is the interface to set an
> >>>>interrupt as NMI. So I guess it would be nice to agree on the right approach
> >>>>for this.
> >>>>
> >>>>The way I did it was by introducing a new irq_state and letting the irqchip
> >>>>driver handle most of the work (if it supports that state):
> >>>>
> >>>>https://lkml.org/lkml/2018/5/25/181
> >>>>
> >>>>This has not been ACKed nor NAKed. So I am just asking whether this is a more
> >>>>suitable approach, and if not, are there any suggestions on how to do this?
> >>>
> >>>I really didn't pay attention to that as it's buried in the GIC/ARM series
> >>>which is usually Marc's playground.
> >>
> >>I'm working my way through it ATM now that I have some brain cycles back.
> >>
> >>>Adding NMI delivery support at low level architecture irq chip level is
> >>>perfectly fine, but the exposure of that needs to be restricted very
> >>>much. Adding it to the generic interrupt control interfaces is not going to
> >>>happen. That's doomed to begin with and a complete abuse of the interface
> >>>as the handler can not ever be used for that.
> >>
> >>I can only agree with that. Allowing random drivers to use request_irq()
> >>to make anything an NMI ultimately turns it into a complete mess ("hey,
> >>NMI is *faster*, let's use that"), and a potential source of horrible
> >>deadlocks.
> >>
> >>What I'd find more palatable is a way for an irqchip to be able to
> >>prioritize some interrupts based on a set of architecturally-defined
> >>requirements, and a separate NMI requesting/handling framework that is
> >>separate from the IRQ API, as the overall requirements are likely to be
> >>completely different.
> >>
> >>It shouldn't have to be nearly as complex as the IRQ API, and require
> >>much stricter requirements in terms of what you can do there (flow
> >>handling should definitely be different).
> >
> >Marc, Julien, do you plan to actively work on this? Would you mind keeping
> >me in the loop? I also need this work for this watchdog. In the meantime,
> >I will go through Julien's patches and try to adapt it to my work.
> 
> We are going to work on this and of course your input is most welcome to
> make sure we have an interface usable across different architectures.

Great! Thanks! I will keep an eye on future versions of your "arm64: provide
pseudo NMI with GICv3" series.
> 
> In my patches, I'm not sure there is much to adapt to your work as most of
> it is arch specific (although I won't say no to another pair of eyes looking
> at them). From what I've seen of your patches, the point where we converge
> is the need for some code to be able to tell the irqchip "I want that
> particular interrupt line to be treated/setup as an NMI".

Indeed, there has to be a generic way for the irqchip to announce that it
supports configuring an interrupt as NMI... and a way to actually configure
it.

> 
> We'll make sure to keep you in the loop for discussions/suggestions on this.

Thank you!

Thanks and BR,
Ricardo


[PATCH 25/25] MIPS: remove unneeded includes from dma-mapping.h

2018-06-15 Thread Christoph Hellwig
Keep this file as light as possible as it gets pulled into every
driver using dma mapping APIs.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/include/asm/dma-mapping.h | 8 
 arch/mips/mti-malta/malta-setup.c   | 1 +
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/mips/include/asm/dma-mapping.h b/arch/mips/include/asm/dma-mapping.h
index 143250986e17..1c6e0c8ef483 100644
--- a/arch/mips/include/asm/dma-mapping.h
+++ b/arch/mips/include/asm/dma-mapping.h
@@ -2,14 +2,6 @@
 #ifndef _ASM_DMA_MAPPING_H
 #define _ASM_DMA_MAPPING_H
 
-#include 
-#include 
-#include 
-
-#ifndef CONFIG_SGI_IP27 /* Kludge to fix 2.6.39 build for IP27 */
-#include 
-#endif
-
 extern const struct dma_map_ops jazz_dma_ops;
 extern const struct dma_map_ops mips_swiotlb_ops;
 
diff --git a/arch/mips/mti-malta/malta-setup.c b/arch/mips/mti-malta/malta-setup.c
index 4d5cdfeee3db..7cb7d5a42087 100644
--- a/arch/mips/mti-malta/malta-setup.c
+++ b/arch/mips/mti-malta/malta-setup.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
-- 
2.17.1



[PATCH 24/25] MIPS: remove the old dma-default implementation

2018-06-15 Thread Christoph Hellwig
Now unused.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig                             |   5 +-
 arch/mips/include/asm/dma-mapping.h           |   3 -
 .../include/asm/mach-generic/dma-coherence.h  |  73
 arch/mips/mm/Makefile                         |   1 -
 arch/mips/mm/dma-default.c                    | 379 --
 5 files changed, 1 insertion(+), 460 deletions(-)
 delete mode 100644 arch/mips/include/asm/mach-generic/dma-coherence.h
 delete mode 100644 arch/mips/mm/dma-default.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 7f7edb2b4fcd..3e2a2b49287f 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -77,9 +77,6 @@ config MIPS
select SYSCTL_EXCEPTION_TRACE
select VIRT_TO_BUS
 
-config MIPS_DMA_DEFAULT
-   bool
-
 menu "Machine selection"
 
 choice
@@ -1118,7 +1115,7 @@ config DMA_NONCOHERENT
select NEED_DMA_MAP_STATE
select DMA_NONCOHERENT_MMAP
select DMA_NONCOHERENT_CACHE_SYNC
-   select DMA_NONCOHERENT_OPS if !MIPS_DMA_DEFAULT
+   select DMA_NONCOHERENT_OPS
 
 config SYS_HAS_EARLY_PRINTK
bool
diff --git a/arch/mips/include/asm/dma-mapping.h b/arch/mips/include/asm/dma-mapping.h
index caf97f739897..143250986e17 100644
--- a/arch/mips/include/asm/dma-mapping.h
+++ b/arch/mips/include/asm/dma-mapping.h
@@ -11,7 +11,6 @@
 #endif
 
 extern const struct dma_map_ops jazz_dma_ops;
-extern const struct dma_map_ops mips_default_dma_map_ops;
 extern const struct dma_map_ops mips_swiotlb_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
@@ -20,8 +19,6 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
        return &jazz_dma_ops;
 #elif defined(CONFIG_SWIOTLB)
        return &mips_swiotlb_ops;
-#elif defined(CONFIG_MIPS_DMA_DEFAULT)
-       return &mips_default_dma_map_ops;
 #elif defined(CONFIG_DMA_NONCOHERENT_OPS)
        return &dma_noncoherent_ops;
 #else
diff --git a/arch/mips/include/asm/mach-generic/dma-coherence.h b/arch/mips/include/asm/mach-generic/dma-coherence.h
deleted file mode 100644
index 8ad7a40ca786..
--- a/arch/mips/include/asm/mach-generic/dma-coherence.h
+++ /dev/null
@@ -1,73 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2006  Ralf Baechle 
- *
- */
-#ifndef __ASM_MACH_GENERIC_DMA_COHERENCE_H
-#define __ASM_MACH_GENERIC_DMA_COHERENCE_H
-
-struct device;
-
-static inline dma_addr_t plat_map_dma_mem(struct device *dev, void *addr,
-   size_t size)
-{
-   return virt_to_phys(addr);
-}
-
-static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
-   struct page *page)
-{
-   return page_to_phys(page);
-}
-
-static inline unsigned long plat_dma_addr_to_phys(struct device *dev,
-   dma_addr_t dma_addr)
-{
-   return dma_addr;
-}
-
-static inline void plat_unmap_dma_mem(struct device *dev, dma_addr_t dma_addr,
-   size_t size, enum dma_data_direction direction)
-{
-}
-
-static inline int plat_dma_supported(struct device *dev, u64 mask)
-{
-   /*
-* we fall back to GFP_DMA when the mask isn't all 1s,
-* so we can't guarantee allocations that must be
-* within a tighter range than GFP_DMA..
-*/
-   if (mask < DMA_BIT_MASK(24))
-   return 0;
-
-   return 1;
-}
-
-static inline int plat_device_is_coherent(struct device *dev)
-{
-#ifdef CONFIG_DMA_PERDEV_COHERENT
-   return dev->archdata.dma_coherent;
-#else
-   switch (coherentio) {
-   default:
-   case IO_COHERENCE_DEFAULT:
-   return hw_coherentio;
-   case IO_COHERENCE_ENABLED:
-   return 1;
-   case IO_COHERENCE_DISABLED:
-   return 0;
-   }
-#endif
-}
-
-#ifndef plat_post_dma_flush
-static inline void plat_post_dma_flush(struct device *dev)
-{
-}
-#endif
-
-#endif /* __ASM_MACH_GENERIC_DMA_COHERENCE_H */
diff --git a/arch/mips/mm/Makefile b/arch/mips/mm/Makefile
index c6146c3805dc..6922f393af19 100644
--- a/arch/mips/mm/Makefile
+++ b/arch/mips/mm/Makefile
@@ -17,7 +17,6 @@ obj-$(CONFIG_32BIT)   += ioremap.o pgtable-32.o
 obj-$(CONFIG_64BIT)+= pgtable-64.o
 obj-$(CONFIG_HIGHMEM)  += highmem.o
 obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
-obj-$(CONFIG_MIPS_DMA_DEFAULT) += dma-default.o
 obj-$(CONFIG_DMA_NONCOHERENT)  += dma-noncoherent.o
 obj-$(CONFIG_SWIOTLB)  += dma-swiotlb.o
 
diff --git a/arch/mips/mm/dma-default.c b/arch/mips/mm/dma-default.c
deleted file mode 100644
index 10b56e8a2076..
--- a/arch/mips/mm/dma-default.c
+++ /dev/null
@@ -1,379 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2000  Ani Joshi 
- * Copyright (C) 2000, 2001, 06 

Re: [PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache

2018-06-15 Thread Jordan Crouse
On Fri, Jun 15, 2018 at 05:52:32PM +0100, Will Deacon wrote:
> Hi Vivek,
> 
> On Fri, Jun 15, 2018 at 04:23:29PM +0530, Vivek Gautam wrote:
> > Qualcomm SoCs have an additional level of cache called as
> > System cache or Last level cache[1]. This cache sits right
> > before the DDR, and is tightly coupled with the memory
> > controller.
> > The cache is available to all the clients present in the
> > SoC system. The clients request their slices from this system
> > cache, make it active, and can then start using it. For these
> > clients with smmu, to start using the system cache for
> > dma buffers and related page tables [2], few of the memory
> > attributes need to be set accordingly.
> > This change makes the related memory Outer-Shareable, and
> > updates the MAIR with necessary protection.
> > 
> > The MAIR attribute requirements are:
> > Inner Cacheability = 0
> > Outer Cacheability = 1, Write-Back Write Allocate
> > Outer Shareability = 1
> 
> Hmm, so is this cache coherent with the CPU or not? Why aren't normal
> non-cacheable mappings allocated in the LLC by default?
> 
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index f7a96bcf94a6..8058e7205034 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -249,6 +249,7 @@ struct arm_smmu_domain {
> >     struct mutex            init_mutex; /* Protects smmu pointer */
> >     spinlock_t              cb_lock; /* Serialises ATS1* ops and TLB syncs */
> >     struct iommu_domain     domain;
> > +   bool                    has_sys_cache;
> >  };
> >  
> >  struct arm_smmu_option_prop {
> > @@ -862,6 +863,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> >  
> >     if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
> >         pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
> > +   if (smmu_domain->has_sys_cache)
> > +       pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
> >  
> >     smmu_domain->smmu = smmu;
> >     pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> > @@ -1477,6 +1480,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> >     case DOMAIN_ATTR_NESTING:
> >         *(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
> >         return 0;
> > +   case DOMAIN_ATTR_USE_SYS_CACHE:
> > +       *((int *)data) = smmu_domain->has_sys_cache;
> > +       return 0;
> 
> I really don't like exposing this to clients directly like this,
> particularly as there aren't any in-tree users. I would prefer that we
> provide a way for the io-pgtable code to have its MAIR values overridden
> so that all non-coherent DMA ends up using the system cache.

FWIW here is a future in-tree user for LLC:

https://patchwork.freedesktop.org/series/40545/

Specifically:

https://patchwork.freedesktop.org/patch/212400/

Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache

2018-06-15 Thread Will Deacon
Hi Vivek,

On Fri, Jun 15, 2018 at 04:23:29PM +0530, Vivek Gautam wrote:
> Qualcomm SoCs have an additional level of cache called as
> System cache or Last level cache[1]. This cache sits right
> before the DDR, and is tightly coupled with the memory
> controller.
> The cache is available to all the clients present in the
> SoC system. The clients request their slices from this system
> cache, make it active, and can then start using it. For these
> clients with smmu, to start using the system cache for
> dma buffers and related page tables [2], few of the memory
> attributes need to be set accordingly.
> This change makes the related memory Outer-Shareable, and
> updates the MAIR with necessary protection.
> 
> The MAIR attribute requirements are:
> Inner Cacheability = 0
> Outer Cacheability = 1, Write-Back Write Allocate
> Outer Shareability = 1

Hmm, so is this cache coherent with the CPU or not? Why aren't normal
non-cacheable mappings allocated in the LLC by default?

> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index f7a96bcf94a6..8058e7205034 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -249,6 +249,7 @@ struct arm_smmu_domain {
>     struct mutex            init_mutex; /* Protects smmu pointer */
>     spinlock_t              cb_lock; /* Serialises ATS1* ops and TLB syncs */
>     struct iommu_domain     domain;
> +   bool                    has_sys_cache;
>  };
>  
>  struct arm_smmu_option_prop {
> @@ -862,6 +863,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>  
>     if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
>         pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
> +   if (smmu_domain->has_sys_cache)
> +       pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
>  
>     smmu_domain->smmu = smmu;
>     pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> @@ -1477,6 +1480,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>     case DOMAIN_ATTR_NESTING:
>         *(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
>         return 0;
> +   case DOMAIN_ATTR_USE_SYS_CACHE:
> +       *((int *)data) = smmu_domain->has_sys_cache;
> +       return 0;

I really don't like exposing this to clients directly like this,
particularly as there aren't any in-tree users. I would prefer that we
provide a way for the io-pgtable code to have its MAIR values overridden
so that all non-coherent DMA ends up using the system cache.

Will
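
For reference, the attribute combination listed in the patch description
(inner non-cacheable, outer write-back with write-allocate) can be composed
into a MAIR attribute byte as sketched below. This follows the ARMv8-A
MAIR_ELx layout, where the high nibble of each attribute byte describes the
outer policy and the low nibble the inner policy. The macro and function
names are hypothetical, and the quoted portion of the patch does not show
which value or MAIR index it actually uses, so treat this as an illustration
only.

```c
#include <stdint.h>

/* ARMv8-A MAIR_ELx nibble encodings for Normal memory. */
#define MAIR_NC          0x4u  /* 0b0100: Non-cacheable */
#define MAIR_WB_NT       0xcu  /* 0b11RW: Write-Back Non-transient, hints below */
#define MAIR_ALLOC_READ  0x2u  /* R bit: Read-Allocate hint */
#define MAIR_ALLOC_WRITE 0x1u  /* W bit: Write-Allocate hint */

/* Build one attribute byte: outer policy in bits 7..4, inner in bits 3..0. */
static inline uint8_t mair_attr(uint8_t outer, uint8_t inner)
{
    return (uint8_t)(((outer & 0xfu) << 4) | (inner & 0xfu));
}

/* Store an attribute byte at index idx (0..7) of a 64-bit MAIR value. */
static inline uint64_t mair_set(uint64_t mair, unsigned int idx, uint8_t attr)
{
    mair &= ~(0xffULL << (idx * 8));
    return mair | ((uint64_t)attr << (idx * 8));
}
```

Under these encodings, outer write-back with write-allocate only and inner
non-cacheable gives 0xd4; adding the read-allocate hint as well gives 0xf4.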


[PATCH 15/25] MIPS: IP27: use dma_direct_ops

2018-06-15 Thread Christoph Hellwig
IP27 is coherent and has a reasonably direct mapping, just with a little
per-bus offset added into the dma address.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig                             |  2 +-
 .../include/asm/mach-ip27/dma-coherence.h     | 70 ---
 arch/mips/pci/pci-ip27.c                      | 14
 3 files changed, 15 insertions(+), 71 deletions(-)
 delete mode 100644 arch/mips/include/asm/mach-ip27/dma-coherence.h

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 6247bb7f8244..8bf378651d74 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -682,11 +682,11 @@ config SGI_IP22
 
 config SGI_IP27
bool "SGI IP27 (Origin200/2000)"
+   select ARCH_HAS_PHYS_TO_DMA
select FW_ARC
select FW_ARC64
select BOOT_ELF64
select DEFAULT_SGI_PARTITION
-   select MIPS_DMA_DEFAULT
select SYS_HAS_EARLY_PRINTK
select HW_HAS_PCI
select NR_CPUS_DEFAULT_64
diff --git a/arch/mips/include/asm/mach-ip27/dma-coherence.h b/arch/mips/include/asm/mach-ip27/dma-coherence.h
deleted file mode 100644
index 04d862020ac9..
--- a/arch/mips/include/asm/mach-ip27/dma-coherence.h
+++ /dev/null
@@ -1,70 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2006  Ralf Baechle 
- *
- */
-#ifndef __ASM_MACH_IP27_DMA_COHERENCE_H
-#define __ASM_MACH_IP27_DMA_COHERENCE_H
-
-#include 
-
-#define pdev_to_baddr(pdev, addr) \
-   (BRIDGE_CONTROLLER(pdev->bus)->baddr + (addr))
-#define dev_to_baddr(dev, addr) \
-   pdev_to_baddr(to_pci_dev(dev), (addr))
-
-struct device;
-
-static inline dma_addr_t plat_map_dma_mem(struct device *dev, void *addr,
-   size_t size)
-{
-   dma_addr_t pa = dev_to_baddr(dev, virt_to_phys(addr));
-
-   return pa;
-}
-
-static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
-   struct page *page)
-{
-   dma_addr_t pa = dev_to_baddr(dev, page_to_phys(page));
-
-   return pa;
-}
-
-static inline unsigned long plat_dma_addr_to_phys(struct device *dev,
-   dma_addr_t dma_addr)
-{
-   return dma_addr & ~(0xffUL << 56);
-}
-
-static inline void plat_unmap_dma_mem(struct device *dev, dma_addr_t dma_addr,
-   size_t size, enum dma_data_direction direction)
-{
-}
-
-static inline int plat_dma_supported(struct device *dev, u64 mask)
-{
-   /*
-* we fall back to GFP_DMA when the mask isn't all 1s,
-* so we can't guarantee allocations that must be
-* within a tighter range than GFP_DMA..
-*/
-   if (mask < DMA_BIT_MASK(24))
-   return 0;
-
-   return 1;
-}
-
-static inline void plat_post_dma_flush(struct device *dev)
-{
-}
-
-static inline int plat_device_is_coherent(struct device *dev)
-{
-   return 1;   /* IP27 non-coherent mode is unsupported */
-}
-
-#endif /* __ASM_MACH_IP27_DMA_COHERENCE_H */
diff --git a/arch/mips/pci/pci-ip27.c b/arch/mips/pci/pci-ip27.c
index 0f09eafa5e3a..65b48d41a229 100644
--- a/arch/mips/pci/pci-ip27.c
+++ b/arch/mips/pci/pci-ip27.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -182,6 +183,19 @@ int pcibios_plat_dev_init(struct pci_dev *dev)
return 0;
 }
 
+dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+   struct pci_dev *pdev = to_pci_dev(dev);
+   struct bridge_controller *bc = BRIDGE_CONTROLLER(pdev->bus);
+
+   return bc->baddr + paddr;
+}
+
+phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr & ~(0xffUL << 56);
+}
+
 /*
  * Device might live on a subordinate PCI bus. XXX Walk up the chain of buses
  * to find the slot number in sense of the bridge device register.
-- 
2.17.1
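
The address translation this patch moves into __phys_to_dma() and
__dma_to_phys() can be tried out in user space with the sketch below. The
types and names are hypothetical stand-ins (there is no PCI device or bridge
controller here), and the base address used in the usage note is made up.

```c
#include <stdint.h>

typedef uint64_t dma_addr_sketch_t;
typedef uint64_t phys_addr_sketch_t;

/* Hypothetical stand-in for the bridge controller's per-bus base address. */
struct bridge_sketch {
    uint64_t baddr;
};

/* Mirrors the patch: DMA address = per-bus base + physical address. */
static dma_addr_sketch_t sketch_phys_to_dma(const struct bridge_sketch *bc,
                                            phys_addr_sketch_t paddr)
{
    return bc->baddr + paddr;
}

/* Mirrors the patch: clear bits 63..56 to recover the physical address. */
static phys_addr_sketch_t sketch_dma_to_phys(dma_addr_sketch_t dma_addr)
{
    return dma_addr & ~(0xffULL << 56);
}
```

With a made-up base of 0xa1 << 56, physical address 0x1000 maps to
0xa100000000001000 and back to 0x1000. The round trip only holds when the
base occupies nothing but the top eight bits and the physical address stays
below bit 56, which is what the mask in __dma_to_phys() implies.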



[PATCH 23/25] MIPS: bmips: use generic dma noncoherent ops

2018-06-15 Thread Christoph Hellwig
Provide phys_to_dma/dma_to_phys helpers and the special
arch_sync_dma_for_cpu_all hook; everything else is generic.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig                             |  3 +-
 arch/mips/bmips/dma.c                         | 32 ++-
 arch/mips/include/asm/bmips.h                 | 16 --
 .../include/asm/mach-bmips/dma-coherence.h    | 54 ---
 4 files changed, 21 insertions(+), 84 deletions(-)
 delete mode 100644 arch/mips/include/asm/mach-bmips/dma-coherence.h

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index f30ef932451f..7f7edb2b4fcd 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -214,6 +214,8 @@ config ATH79
 
 config BMIPS_GENERIC
bool "Broadcom Generic BMIPS kernel"
+   select ARCH_HAS_SYNC_DMA_FOR_CPU_ALL
+   select ARCH_HAS_PHYS_TO_DMA
select BOOT_RAW
select NO_EXCEPT_FILL
select USE_OF
@@ -226,7 +228,6 @@ config BMIPS_GENERIC
select BCM7120_L2_IRQ
select BRCMSTB_L2_IRQ
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select DMA_NONCOHERENT
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_LITTLE_ENDIAN
diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
index 6dec30842b2f..3d13c77c125f 100644
--- a/arch/mips/bmips/dma.c
+++ b/arch/mips/bmips/dma.c
@@ -17,7 +17,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 /*
  * BCM338x has configurable address translation windows which allow the
@@ -40,7 +40,7 @@ static struct bmips_dma_range *bmips_dma_ranges;
 
 #define FLUSH_RAC  0x100
 
-static dma_addr_t bmips_phys_to_dma(struct device *dev, phys_addr_t pa)
+dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t pa)
 {
struct bmips_dma_range *r;
 
@@ -52,17 +52,7 @@ static dma_addr_t bmips_phys_to_dma(struct device *dev, phys_addr_t pa)
return pa;
 }
 
-dma_addr_t plat_map_dma_mem(struct device *dev, void *addr, size_t size)
-{
-   return bmips_phys_to_dma(dev, virt_to_phys(addr));
-}
-
-dma_addr_t plat_map_dma_mem_page(struct device *dev, struct page *page)
-{
-   return bmips_phys_to_dma(dev, page_to_phys(page));
-}
-
-unsigned long plat_dma_addr_to_phys(struct device *dev, dma_addr_t dma_addr)
+phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dma_addr)
 {
struct bmips_dma_range *r;
 
@@ -74,6 +64,22 @@ unsigned long plat_dma_addr_to_phys(struct device *dev, dma_addr_t dma_addr)
return dma_addr;
 }
 
+void arch_sync_dma_for_cpu_all(struct device *dev)
+{
+   void __iomem *cbr = BMIPS_GET_CBR();
+   u32 cfg;
+
+   if (boot_cpu_type() != CPU_BMIPS3300 &&
+   boot_cpu_type() != CPU_BMIPS4350 &&
+   boot_cpu_type() != CPU_BMIPS4380)
+   return;
+
+   /* Flush stale data out of the readahead cache */
+   cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
+   __raw_writel(cfg | 0x100, cbr + BMIPS_RAC_CONFIG);
+   __raw_readl(cbr + BMIPS_RAC_CONFIG);
+}
+
 static int __init bmips_init_dma_ranges(void)
 {
struct device_node *np =
diff --git a/arch/mips/include/asm/bmips.h b/arch/mips/include/asm/bmips.h
index b3e2975f83d3..bf6a8afd7ad2 100644
--- a/arch/mips/include/asm/bmips.h
+++ b/arch/mips/include/asm/bmips.h
@@ -123,22 +123,6 @@ static inline void bmips_write_zscm_reg(unsigned int offset, unsigned long data)
barrier();
 }
 
-static inline void bmips_post_dma_flush(struct device *dev)
-{
-   void __iomem *cbr = BMIPS_GET_CBR();
-   u32 cfg;
-
-   if (boot_cpu_type() != CPU_BMIPS3300 &&
-   boot_cpu_type() != CPU_BMIPS4350 &&
-   boot_cpu_type() != CPU_BMIPS4380)
-   return;
-
-   /* Flush stale data out of the readahead cache */
-   cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
-   __raw_writel(cfg | 0x100, cbr + BMIPS_RAC_CONFIG);
-   __raw_readl(cbr + BMIPS_RAC_CONFIG);
-}
-
 #endif /* !defined(__ASSEMBLY__) */
 
 #endif /* _ASM_BMIPS_H */
diff --git a/arch/mips/include/asm/mach-bmips/dma-coherence.h b/arch/mips/include/asm/mach-bmips/dma-coherence.h
deleted file mode 100644
index d29781f02285..
--- a/arch/mips/include/asm/mach-bmips/dma-coherence.h
+++ /dev/null
@@ -1,54 +0,0 @@
-/*
- * Copyright (C) 2006 Ralf Baechle 
- * Copyright (C) 2009 Broadcom Corporation
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- */
-
-#ifndef __ASM_MACH_BMIPS_DMA_COHERENCE_H
-#define __ASM_MACH_BMIPS_DMA_COHERENCE_H
-
-#include 
-#include 
-#include 
-
-struct device;
-
-extern dma_addr_t plat_map_dma_mem(struct device *dev, void *addr, size_t 

[PATCH 22/25] dma-noncoherent: add a arch_sync_dma_for_cpu_all hook

2018-06-15 Thread Christoph Hellwig
The MIPS bmips platform needs a global flush when transferring ownership
back to the CPU.  Add a hook for that to the dma-noncoherent
implementation.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-noncoherent.h | 8 
 lib/dma-noncoherent.c   | 8 ++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h
index 10b2654d549b..a0aa00cc909d 100644
--- a/include/linux/dma-noncoherent.h
+++ b/include/linux/dma-noncoherent.h
@@ -44,4 +44,12 @@ static inline void arch_sync_dma_for_cpu(struct device *dev,
 }
 #endif /* ARCH_HAS_SYNC_DMA_FOR_CPU */
 
+#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL
+void arch_sync_dma_for_cpu_all(struct device *dev);
+#else
+static inline void arch_sync_dma_for_cpu_all(struct device *dev)
+{
+}
+#endif /* CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL */
+
 #endif /* _LINUX_DMA_NONCOHERENT_H */
diff --git a/lib/dma-noncoherent.c b/lib/dma-noncoherent.c
index 79e9a757387f..031fe235d958 100644
--- a/lib/dma-noncoherent.c
+++ b/lib/dma-noncoherent.c
@@ -49,11 +49,13 @@ static int dma_noncoherent_map_sg(struct device *dev, struct scatterlist *sgl,
return nents;
 }
 
-#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU
+#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
+defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
 static void dma_noncoherent_sync_single_for_cpu(struct device *dev,
dma_addr_t addr, size_t size, enum dma_data_direction dir)
 {
arch_sync_dma_for_cpu(dev, dma_to_phys(dev, addr), size, dir);
+   arch_sync_dma_for_cpu_all(dev);
 }
 
 static void dma_noncoherent_sync_sg_for_cpu(struct device *dev,
@@ -64,6 +66,7 @@ static void dma_noncoherent_sync_sg_for_cpu(struct device *dev,
 
for_each_sg(sgl, sg, nents, i)
arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
+   arch_sync_dma_for_cpu_all(dev);
 }
 
 static void dma_noncoherent_unmap_page(struct device *dev, dma_addr_t addr,
@@ -89,7 +92,8 @@ const struct dma_map_ops dma_noncoherent_ops = {
.sync_sg_for_device = dma_noncoherent_sync_sg_for_device,
.map_page   = dma_noncoherent_map_page,
.map_sg = dma_noncoherent_map_sg,
-#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU
+#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
+defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
.sync_single_for_cpu= dma_noncoherent_sync_single_for_cpu,
.sync_sg_for_cpu= dma_noncoherent_sync_sg_for_cpu,
.unmap_page = dma_noncoherent_unmap_page,
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
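
For readers following the hook added in 22/25, here is a minimal userspace sketch of the same pattern: a per-range sync followed by one optional global flush, with an empty inline fallback when the architecture does not opt in. Everything prefixed demo_ or DEMO_ is a hypothetical stand-in invented for this illustration, not a kernel symbol, and the counters exist only so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stands in for a Kconfig symbol selected by the platform. */
#define DEMO_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL 1

struct demo_device {
	const char *name;
};

static int demo_range_syncs;    /* number of per-range syncs issued */
static int demo_global_flushes; /* number of global flushes issued */

/* Per-range cache maintenance, modelled as a counter. */
static void demo_sync_dma_for_cpu(struct demo_device *dev,
		unsigned long paddr, size_t size)
{
	(void)dev;
	(void)paddr;
	(void)size;
	demo_range_syncs++;
}

#ifdef DEMO_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL
/* Platform opted in: one global flush after all ranges were synced. */
static void demo_sync_dma_for_cpu_all(struct demo_device *dev)
{
	(void)dev;
	demo_global_flushes++;
}
#else
/* Platform did not opt in: the call compiles away to nothing. */
static inline void demo_sync_dma_for_cpu_all(struct demo_device *dev)
{
	(void)dev;
}
#endif

/* Sync every scatterlist entry, then flush globally exactly once. */
static void demo_sync_sg_for_cpu(struct demo_device *dev,
		const unsigned long *paddrs, const size_t *lens, int nents)
{
	int i;

	assert(nents >= 0);
	for (i = 0; i < nents; i++)
		demo_sync_dma_for_cpu(dev, paddrs[i], lens[i]);
	demo_sync_dma_for_cpu_all(dev);
}
```

The point of the split is that the global flush runs once per sync call rather than once per scatterlist entry, which is where the two call sites in lib/dma-noncoherent.c place it.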


[PATCH 21/25] MIPS: jazz: split dma mapping operations from dma-default

2018-06-15 Thread Christoph Hellwig
Jazz actually has a very basic IOMMU, so split the ops into a separate
implementation from the generic default support (which is about to go
away anyway).

Signed-off-by: Christoph Hellwig 
---
 arch/mips/include/asm/dma-mapping.h   |   5 +-
 .../include/asm/mach-jazz/dma-coherence.h |  60 
 arch/mips/jazz/Kconfig|   3 -
 arch/mips/jazz/jazzdma.c  | 141 +-
 4 files changed, 144 insertions(+), 65 deletions(-)
 delete mode 100644 arch/mips/include/asm/mach-jazz/dma-coherence.h

diff --git a/arch/mips/include/asm/dma-mapping.h b/arch/mips/include/asm/dma-mapping.h
index e32a7b439816..caf97f739897 100644
--- a/arch/mips/include/asm/dma-mapping.h
+++ b/arch/mips/include/asm/dma-mapping.h
@@ -10,12 +10,15 @@
 #include 
 #endif
 
+extern const struct dma_map_ops jazz_dma_ops;
 extern const struct dma_map_ops mips_default_dma_map_ops;
 extern const struct dma_map_ops mips_swiotlb_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
-#ifdef CONFIG_SWIOTLB
+#if defined(CONFIG_MACH_JAZZ)
+   return &jazz_dma_ops;
+#elif defined(CONFIG_SWIOTLB)
    return &mips_swiotlb_ops;
 #elif defined(CONFIG_MIPS_DMA_DEFAULT)
    return &mips_default_dma_map_ops;
diff --git a/arch/mips/include/asm/mach-jazz/dma-coherence.h b/arch/mips/include/asm/mach-jazz/dma-coherence.h
deleted file mode 100644
index dc347c25c343..
--- a/arch/mips/include/asm/mach-jazz/dma-coherence.h
+++ /dev/null
@@ -1,60 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2006  Ralf Baechle 
- */
-#ifndef __ASM_MACH_JAZZ_DMA_COHERENCE_H
-#define __ASM_MACH_JAZZ_DMA_COHERENCE_H
-
-#include 
-
-struct device;
-
-static inline dma_addr_t plat_map_dma_mem(struct device *dev, void *addr, size_t size)
-{
-   return vdma_alloc(virt_to_phys(addr), size);
-}
-
-static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
-   struct page *page)
-{
-   return vdma_alloc(page_to_phys(page), PAGE_SIZE);
-}
-
-static inline unsigned long plat_dma_addr_to_phys(struct device *dev,
-   dma_addr_t dma_addr)
-{
-   return vdma_log2phys(dma_addr);
-}
-
-static inline void plat_unmap_dma_mem(struct device *dev, dma_addr_t dma_addr,
-   size_t size, enum dma_data_direction direction)
-{
-   vdma_free(dma_addr);
-}
-
-static inline int plat_dma_supported(struct device *dev, u64 mask)
-{
-   /*
-* we fall back to GFP_DMA when the mask isn't all 1s,
-* so we can't guarantee allocations that must be
-* within a tighter range than GFP_DMA..
-*/
-   if (mask < DMA_BIT_MASK(24))
-   return 0;
-
-   return 1;
-}
-
-static inline void plat_post_dma_flush(struct device *dev)
-{
-}
-
-static inline int plat_device_is_coherent(struct device *dev)
-{
-   return 0;
-}
-
-#endif /* __ASM_MACH_JAZZ_DMA_COHERENCE_H */
diff --git a/arch/mips/jazz/Kconfig b/arch/mips/jazz/Kconfig
index d3ae3e0356f6..06838f80a5d7 100644
--- a/arch/mips/jazz/Kconfig
+++ b/arch/mips/jazz/Kconfig
@@ -3,7 +3,6 @@ config ACER_PICA_61
bool "Support for Acer PICA 1 chipset"
depends on MACH_JAZZ
select DMA_NONCOHERENT
-   select MIPS_DMA_DEFAULT
select SYS_SUPPORTS_LITTLE_ENDIAN
help
  This is a machine with a R4400 133/150 MHz CPU. To compile a Linux
@@ -15,7 +14,6 @@ config MIPS_MAGNUM_4000
bool "Support for MIPS Magnum 4000"
depends on MACH_JAZZ
select DMA_NONCOHERENT
-   select MIPS_DMA_DEFAULT
select SYS_SUPPORTS_BIG_ENDIAN
select SYS_SUPPORTS_LITTLE_ENDIAN
help
@@ -28,7 +26,6 @@ config OLIVETTI_M700
bool "Support for Olivetti M700-10"
depends on MACH_JAZZ
select DMA_NONCOHERENT
-   select MIPS_DMA_DEFAULT
select SYS_SUPPORTS_LITTLE_ENDIAN
help
  This is a machine with a R4000 100 MHz CPU. To compile a Linux
diff --git a/arch/mips/jazz/jazzdma.c b/arch/mips/jazz/jazzdma.c
index d626a9a391cc..446fc8c92e1e 100644
--- a/arch/mips/jazz/jazzdma.c
+++ b/arch/mips/jazz/jazzdma.c
@@ -16,6 +16,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -86,6 +88,7 @@ static int __init vdma_init(void)
printk(KERN_INFO "VDMA: R4030 DMA pagetables initialized.\n");
return 0;
 }
+arch_initcall(vdma_init);
 
 /*
  * Allocate DMA pagetables using a simple first-fit algorithm
@@ -556,4 +559,140 @@ int vdma_get_enable(int channel)
return enable;
 }
 
-arch_initcall(vdma_init);
+static void *jazz_dma_alloc(struct device *dev, size_t size,
+   dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+{
+   void *ret;
+
+   ret = dma_direct_alloc(dev, size, dma_handle, gfp, attrs);
+   if (!ret)
+   return NULL;
+
+

[PATCH 19/25] MIPS: IP32: use generic dma noncoherent ops

2018-06-15 Thread Christoph Hellwig
Provide phys_to_dma/dma_to_phys helpers, everything else is generic.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig |  2 +-
 .../include/asm/mach-ip32/dma-coherence.h | 92 ---
 arch/mips/sgi-ip32/Makefile   |  2 +-
 arch/mips/sgi-ip32/ip32-dma.c | 37 
 4 files changed, 39 insertions(+), 94 deletions(-)
 delete mode 100644 arch/mips/include/asm/mach-ip32/dma-coherence.h
 create mode 100644 arch/mips/sgi-ip32/ip32-dma.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index e192e484b1b8..8e84d14c17fe 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -725,6 +725,7 @@ config SGI_IP28
 
 config SGI_IP32
bool "SGI IP32 (O2)"
+   select ARCH_HAS_PHYS_TO_DMA
select FW_ARC
select FW_ARC32
select BOOT_ELF32
@@ -733,7 +734,6 @@ config SGI_IP32
select DMA_NONCOHERENT
select HW_HAS_PCI
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select R5000_CPU_SCACHE
select RM7000_CPU_SCACHE
select SYS_HAS_CPU_R5000
diff --git a/arch/mips/include/asm/mach-ip32/dma-coherence.h b/arch/mips/include/asm/mach-ip32/dma-coherence.h
deleted file mode 100644
index 7bdf212587a0..
--- a/arch/mips/include/asm/mach-ip32/dma-coherence.h
+++ /dev/null
@@ -1,92 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2006  Ralf Baechle 
- *
- */
-#ifndef __ASM_MACH_IP32_DMA_COHERENCE_H
-#define __ASM_MACH_IP32_DMA_COHERENCE_H
-
-#include 
-
-struct device;
-
-/*
- * Few notes.
- * 1. CPU sees memory as two chunks: 0-256M@0x0, and the rest @0x4000+256M
- * 2. PCI sees memory as one big chunk @0x0 (or we could use 0x4000 for
- *native-endian)
- * 3. All other devices see memory as one big chunk at 0x4000
- * 4. Non-PCI devices will pass NULL as struct device*
- *
- * Thus we translate differently, depending on device.
- */
-
-#define RAM_OFFSET_MASK 0x3fffUL
-
-static inline dma_addr_t plat_map_dma_mem(struct device *dev, void *addr,
-   size_t size)
-{
-   dma_addr_t pa = virt_to_phys(addr) & RAM_OFFSET_MASK;
-
-   if (dev == NULL)
-   pa += CRIME_HI_MEM_BASE;
-
-   return pa;
-}
-
-static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
-   struct page *page)
-{
-   dma_addr_t pa;
-
-   pa = page_to_phys(page) & RAM_OFFSET_MASK;
-
-   if (dev == NULL)
-   pa += CRIME_HI_MEM_BASE;
-
-   return pa;
-}
-
-/* This is almost certainly wrong but it's what dma-ip32.c used to use */
-static inline unsigned long plat_dma_addr_to_phys(struct device *dev,
-   dma_addr_t dma_addr)
-{
-   unsigned long addr = dma_addr & RAM_OFFSET_MASK;
-
-   if (dma_addr >= 256*1024*1024)
-   addr += CRIME_HI_MEM_BASE;
-
-   return addr;
-}
-
-static inline void plat_unmap_dma_mem(struct device *dev, dma_addr_t dma_addr,
-   size_t size, enum dma_data_direction direction)
-{
-}
-
-static inline int plat_dma_supported(struct device *dev, u64 mask)
-{
-   /*
-* we fall back to GFP_DMA when the mask isn't all 1s,
-* so we can't guarantee allocations that must be
-* within a tighter range than GFP_DMA..
-*/
-   if (mask < DMA_BIT_MASK(24))
-   return 0;
-
-   return 1;
-}
-
-static inline void plat_post_dma_flush(struct device *dev)
-{
-}
-
-static inline int plat_device_is_coherent(struct device *dev)
-{
-   return 0;   /* IP32 is non-coherent */
-}
-
-#endif /* __ASM_MACH_IP32_DMA_COHERENCE_H */
diff --git a/arch/mips/sgi-ip32/Makefile b/arch/mips/sgi-ip32/Makefile
index 60f0227425e7..4745cd94df11 100644
--- a/arch/mips/sgi-ip32/Makefile
+++ b/arch/mips/sgi-ip32/Makefile
@@ -4,4 +4,4 @@
 #
 
 obj-y  += ip32-berr.o ip32-irq.o ip32-platform.o ip32-setup.o ip32-reset.o \
-  crime.o ip32-memory.o
+  crime.o ip32-memory.o ip32-dma.o
diff --git a/arch/mips/sgi-ip32/ip32-dma.c b/arch/mips/sgi-ip32/ip32-dma.c
new file mode 100644
index ..fa7b17cb5385
--- /dev/null
+++ b/arch/mips/sgi-ip32/ip32-dma.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2006  Ralf Baechle 
+ */
+#include 
+#include 
+
+/*
+ * Few notes.
+ * 1. CPU sees memory as two chunks: 0-256M@0x0, and the rest @0x4000+256M
+ * 2. PCI sees memory as one big chunk @0x0 (or we could use 0x4000 for
+ *native-endian)
+ * 3. All other devices see memory as one big chunk at 0x4000
+ * 4. Non-PCI devices will pass NULL as struct device*
+ *
+ * Thus we translate differently, depending on device.
+ */
+
+#define RAM_OFFSET_MASK 0x3fffUL
+
+dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+   dma_addr_t dma_addr = paddr & RAM_OFFSET_MASK;
+
+   if (!dev)
+   dma_addr 

[PATCH 12/25] MIPS: loongson: untangle dma implementations

2018-06-15 Thread Christoph Hellwig
Only loongson-3 is DMA coherent and uses swiotlb.  So move the dma
address translation stubs directly to the loongson-3 code, and remove
a few Kconfig indirections.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig|  2 +-
 arch/mips/loongson64/Kconfig |  5 -
 arch/mips/loongson64/common/Makefile |  5 -
 arch/mips/loongson64/loongson-3/Makefile |  2 +-
 .../{common/dma-swiotlb.c => loongson-3/dma.c}   | 16 
 5 files changed, 6 insertions(+), 24 deletions(-)
 rename arch/mips/loongson64/{common/dma-swiotlb.c => loongson-3/dma.c} (68%)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index bc8893063609..aae92a7b6a9c 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -453,7 +453,6 @@ config MACH_LOONGSON32
 
 config MACH_LOONGSON64
bool "Loongson-2/3 family of machines"
-   select ARCH_HAS_PHYS_TO_DMA
select SYS_SUPPORTS_ZBOOT
help
  This enables the support of Loongson-2/3 family of machines.
@@ -1388,6 +1387,7 @@ choice
 config CPU_LOONGSON3
bool "Loongson 3 CPU"
depends on SYS_HAS_CPU_LOONGSON3
+   select ARCH_HAS_PHYS_TO_DMA
select CPU_SUPPORTS_64BIT_KERNEL
select CPU_SUPPORTS_HIGHMEM
select CPU_SUPPORTS_HUGEPAGES
diff --git a/arch/mips/loongson64/Kconfig b/arch/mips/loongson64/Kconfig
index dbd2a9f9f9a9..a785bf8da3f3 100644
--- a/arch/mips/loongson64/Kconfig
+++ b/arch/mips/loongson64/Kconfig
@@ -93,7 +93,6 @@ config LOONGSON_MACH3X
select LOONGSON_MC146818
select ZONE_DMA32
select LEFI_FIRMWARE_INTERFACE
-   select PHYS48_TO_HT40
help
Generic Loongson 3 family machines utilize the 3A/3B revision
of Loongson processor and RS780/SBX00 chipset.
@@ -132,10 +131,6 @@ config LOONGSON_UART_BASE
default y
depends on EARLY_PRINTK || SERIAL_8250
 
-config PHYS48_TO_HT40
-   bool
-   default y if CPU_LOONGSON3
-
 config LOONGSON_MC146818
bool
default n
diff --git a/arch/mips/loongson64/common/Makefile b/arch/mips/loongson64/common/Makefile
index 8235ac7eac95..684624f61f5a 100644
--- a/arch/mips/loongson64/common/Makefile
+++ b/arch/mips/loongson64/common/Makefile
@@ -25,8 +25,3 @@ obj-$(CONFIG_CS5536) += cs5536/
 #
 
 obj-$(CONFIG_SUSPEND) += pm.o
-
-#
-# Big Memory (SWIOTLB) Support
-#
-obj-$(CONFIG_SWIOTLB) += dma-swiotlb.o
diff --git a/arch/mips/loongson64/loongson-3/Makefile b/arch/mips/loongson64/loongson-3/Makefile
index 44bc1482158b..b5a0c2fa5446 100644
--- a/arch/mips/loongson64/loongson-3/Makefile
+++ b/arch/mips/loongson64/loongson-3/Makefile
@@ -1,7 +1,7 @@
 #
 # Makefile for Loongson-3 family machines
 #
-obj-y  += irq.o cop2-ex.o platform.o acpi_init.o
+obj-y  += irq.o cop2-ex.o platform.o acpi_init.o dma.o
 
 obj-$(CONFIG_SMP)  += smp.o
 
diff --git a/arch/mips/loongson64/common/dma-swiotlb.c b/arch/mips/loongson64/loongson-3/dma.c
similarity index 68%
rename from arch/mips/loongson64/common/dma-swiotlb.c
rename to arch/mips/loongson64/loongson-3/dma.c
index a4f554bf1232..5e86635f71db 100644
--- a/arch/mips/loongson64/common/dma-swiotlb.c
+++ b/arch/mips/loongson64/loongson-3/dma.c
@@ -5,26 +5,18 @@
 
 dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
-   long nid;
-#ifdef CONFIG_PHYS48_TO_HT40
/* We extract 2bit node id (bit 44~47, only bit 44~45 used now) from
 * Loongson-3's 48bit address space and embed it into 40bit */
-   nid = (paddr >> 44) & 0x3;
-   paddr = ((nid << 44) ^ paddr) | (nid << 37);
-#endif
-   return paddr;
+   long nid = (paddr >> 44) & 0x3;
+   return ((nid << 44) ^ paddr) | (nid << 37);
 }
 
 phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr)
 {
-   long nid;
-#ifdef CONFIG_PHYS48_TO_HT40
/* We extract 2bit node id (bit 44~47, only bit 44~45 used now) from
 * Loongson-3's 48bit address space and embed it into 40bit */
-   nid = (daddr >> 37) & 0x3;
-   daddr = ((nid << 37) ^ daddr) | (nid << 44);
-#endif
-   return daddr;
+   long nid = (daddr >> 37) & 0x3;
+   return ((nid << 37) ^ daddr) | (nid << 44);
 }
 
 void __init plat_swiotlb_setup(void)
-- 
2.17.1
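
As an aside on the address translation that 12/25 moves into loongson-3/dma.c, the node-id shuffle can be checked in isolation. The following is a userspace sketch, not the kernel code: the demo_ names are invented here, uint64_t replaces phys_addr_t/dma_addr_t, and it assumes the non-node part of the address stays below bit 37 so the relocated node id does not collide with it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Move the 2-bit node id from bits 44-45 of the physical address
 * down to bits 37-38, so the result fits a 40-bit bus address.
 */
static uint64_t demo_phys_to_dma(uint64_t paddr)
{
	uint64_t nid = (paddr >> 44) & 0x3;

	/* XOR clears the node id at bit 44, OR re-inserts it at bit 37. */
	return ((nid << 44) ^ paddr) | (nid << 37);
}

/* Inverse: move the node id from bits 37-38 back up to bits 44-45. */
static uint64_t demo_dma_to_phys(uint64_t daddr)
{
	uint64_t nid = (daddr >> 37) & 0x3;

	return ((nid << 37) ^ daddr) | (nid << 44);
}

/* Helper for the demo: true if the translation round-trips. */
static int demo_round_trips(uint64_t paddr)
{
	assert((paddr & ~((3ULL << 44) | ((1ULL << 37) - 1))) == 0);
	return demo_dma_to_phys(demo_phys_to_dma(paddr)) == paddr;
}
```

With node id 2 and offset 0x1000, the physical address 0x200000001000 maps to the bus address 0x4000001000 and back again.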



[PATCH 14/25] MIPS: use dma_direct_ops for coherent I/O

2018-06-15 Thread Christoph Hellwig
Switch the simple cache coherent architectures that don't require any
DMA address translation to dma_direct_ops.

We'll soon use at least parts of the direct DMA ops implementation for
all platforms, so select the symbol globally.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig   | 15 +--
 arch/mips/include/asm/dma-mapping.h |  2 +-
 2 files changed, 2 insertions(+), 15 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index aae92a7b6a9c..6247bb7f8244 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -16,6 +16,7 @@ config MIPS
select BUILDTIME_EXTABLE_SORT
select CLONE_BACKWARDS
select CPU_PM if CPU_IDLE
+   select DMA_DIRECT_OPS
select GENERIC_ATOMIC64 if !64BIT
select GENERIC_CLOCKEVENTS
select GENERIC_CMOS_UPDATE
@@ -568,7 +569,6 @@ config NEC_MARKEINS
bool "NEC EMMA2RH Mark-eins board"
select SOC_EMMA2RH
select HW_HAS_PCI
-   select MIPS_DMA_DEFAULT
help
  This enables support for the NEC Electronics Mark-eins boards.
 
@@ -582,14 +582,12 @@ config MACH_VR41XX
 
 config NXP_STB220
bool "NXP STB220 board"
-   select MIPS_DMA_DEFAULT
select SOC_PNX833X
help
 Support for NXP Semiconductors STB220 Development Board.
 
 config NXP_STB225
bool "NXP 225 board"
-   select MIPS_DMA_DEFAULT
select SOC_PNX833X
select SOC_PNX8335
help
@@ -767,7 +765,6 @@ config SGI_IP32
 config SIBYTE_CRHINE
bool "Sibyte BCM91120C-CRhine"
select BOOT_ELF32
-   select MIPS_DMA_DEFAULT
select SIBYTE_BCM1120
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -777,7 +774,6 @@ config SIBYTE_CRHINE
 config SIBYTE_CARMEL
bool "Sibyte BCM91120x-Carmel"
select BOOT_ELF32
-   select MIPS_DMA_DEFAULT
select SIBYTE_BCM1120
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -787,7 +783,6 @@ config SIBYTE_CARMEL
 config SIBYTE_CRHONE
bool "Sibyte BCM91125C-CRhone"
select BOOT_ELF32
-   select MIPS_DMA_DEFAULT
select SIBYTE_BCM1125
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -798,7 +793,6 @@ config SIBYTE_CRHONE
 config SIBYTE_RHONE
bool "Sibyte BCM91125E-Rhone"
select BOOT_ELF32
-   select MIPS_DMA_DEFAULT
select SIBYTE_BCM1125H
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -809,7 +803,6 @@ config SIBYTE_SWARM
bool "Sibyte BCM91250A-SWARM"
select BOOT_ELF32
select HAVE_PATA_PLATFORM
-   select MIPS_DMA_DEFAULT
select SIBYTE_SB1250
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -822,7 +815,6 @@ config SIBYTE_LITTLESUR
bool "Sibyte BCM91250C2-LittleSur"
select BOOT_ELF32
select HAVE_PATA_PLATFORM
-   select MIPS_DMA_DEFAULT
select SIBYTE_SB1250
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -833,7 +825,6 @@ config SIBYTE_LITTLESUR
 config SIBYTE_SENTOSA
bool "Sibyte BCM91250E-Sentosa"
select BOOT_ELF32
-   select MIPS_DMA_DEFAULT
select SIBYTE_SB1250
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -843,7 +834,6 @@ config SIBYTE_SENTOSA
 config SIBYTE_BIGSUR
bool "Sibyte BCM91480B-BigSur"
select BOOT_ELF32
-   select MIPS_DMA_DEFAULT
select NR_CPUS_DEFAULT_4
select SIBYTE_BCM1x80
select SWAP_IO_SPACE
@@ -964,7 +954,6 @@ config NLM_XLR_BOARD
select SYS_HAS_CPU_XLR
select SYS_SUPPORTS_SMP
select HW_HAS_PCI
-   select MIPS_DMA_DEFAULT
select SWAP_IO_SPACE
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_64BIT_KERNEL
@@ -991,7 +980,6 @@ config NLM_XLP_BOARD
select SYS_HAS_CPU_XLP
select SYS_SUPPORTS_SMP
select HW_HAS_PCI
-   select MIPS_DMA_DEFAULT
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_64BIT_KERNEL
select PHYS_ADDR_T_64BIT
@@ -1017,7 +1005,6 @@ config MIPS_PARAVIRT
bool "Para-Virtualized guest system"
select CEVT_R4K
select CSRC_R4K
-   select MIPS_DMA_DEFAULT
select SYS_SUPPORTS_64BIT_KERNEL
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_BIG_ENDIAN
diff --git a/arch/mips/include/asm/dma-mapping.h b/arch/mips/include/asm/dma-mapping.h
index eaf3d9054104..7c0d4f0ccaa0 100644
--- a/arch/mips/include/asm/dma-mapping.h
+++ b/arch/mips/include/asm/dma-mapping.h
@@ -20,7 +20,7 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 #elif defined(CONFIG_MIPS_DMA_DEFAULT)
    return &mips_default_dma_map_ops;
 #else
-   return NULL;
+   return &dma_direct_ops;
 #endif
 }
 
-- 
2.17.1



[PATCH 20/25] MIPS: ath25: use generic dma noncoherent ops

2018-06-15 Thread Christoph Hellwig
Provide phys_to_dma/dma_to_phys helpers only if PCI support is
enabled, everything else is generic.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig |  1 -
 arch/mips/ath25/Kconfig   |  1 +
 .../include/asm/mach-ath25/dma-coherence.h| 71 ---
 arch/mips/pci/pci-ar2315.c| 24 +++
 4 files changed, 25 insertions(+), 72 deletions(-)
 delete mode 100644 arch/mips/include/asm/mach-ath25/dma-coherence.h

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 8e84d14c17fe..f30ef932451f 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -181,7 +181,6 @@ config ATH25
select DMA_NONCOHERENT
select IRQ_MIPS_CPU
select IRQ_DOMAIN
-   select MIPS_DMA_DEFAULT
select SYS_HAS_CPU_MIPS32_R1
select SYS_SUPPORTS_BIG_ENDIAN
select SYS_SUPPORTS_32BIT_KERNEL
diff --git a/arch/mips/ath25/Kconfig b/arch/mips/ath25/Kconfig
index 7070b4bcd01d..2c1dfd06c366 100644
--- a/arch/mips/ath25/Kconfig
+++ b/arch/mips/ath25/Kconfig
@@ -12,6 +12,7 @@ config SOC_AR2315
 config PCI_AR2315
bool "Atheros AR2315 PCI controller support"
depends on SOC_AR2315
+   select ARCH_HAS_PHYS_TO_DMA
select HW_HAS_PCI
select PCI
default y
diff --git a/arch/mips/include/asm/mach-ath25/dma-coherence.h b/arch/mips/include/asm/mach-ath25/dma-coherence.h
deleted file mode 100644
index 124755d4f079..
--- a/arch/mips/include/asm/mach-ath25/dma-coherence.h
+++ /dev/null
@@ -1,71 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2006  Ralf Baechle 
- * Copyright (C) 2007  Felix Fietkau 
- *
- */
-#ifndef __ASM_MACH_ATH25_DMA_COHERENCE_H
-#define __ASM_MACH_ATH25_DMA_COHERENCE_H
-
-#include 
-
-/*
- * We need some arbitrary non-zero value to be programmed to the BAR1 register
- * of PCI host controller to enable DMA. The same value should be used as the
- * offset to calculate the physical address of DMA buffer for PCI devices.
- */
-#define AR2315_PCI_HOST_SDRAM_BASEADDR 0x2000
-
-static inline dma_addr_t ath25_dev_offset(struct device *dev)
-{
-#ifdef CONFIG_PCI
-   extern struct bus_type pci_bus_type;
-
-   if (dev && dev->bus == &pci_bus_type)
-   return AR2315_PCI_HOST_SDRAM_BASEADDR;
-#endif
-   return 0;
-}
-
-static inline dma_addr_t
-plat_map_dma_mem(struct device *dev, void *addr, size_t size)
-{
-   return virt_to_phys(addr) + ath25_dev_offset(dev);
-}
-
-static inline dma_addr_t
-plat_map_dma_mem_page(struct device *dev, struct page *page)
-{
-   return page_to_phys(page) + ath25_dev_offset(dev);
-}
-
-static inline unsigned long
-plat_dma_addr_to_phys(struct device *dev, dma_addr_t dma_addr)
-{
-   return dma_addr - ath25_dev_offset(dev);
-}
-
-static inline void
-plat_unmap_dma_mem(struct device *dev, dma_addr_t dma_addr, size_t size,
-  enum dma_data_direction direction)
-{
-}
-
-static inline int plat_dma_supported(struct device *dev, u64 mask)
-{
-   return 1;
-}
-
-static inline int plat_device_is_coherent(struct device *dev)
-{
-   return 0;
-}
-
-static inline void plat_post_dma_flush(struct device *dev)
-{
-}
-
-#endif /* __ASM_MACH_ATH25_DMA_COHERENCE_H */
diff --git a/arch/mips/pci/pci-ar2315.c b/arch/mips/pci/pci-ar2315.c
index b4fa6413c4e5..c539d0d2b0cf 100644
--- a/arch/mips/pci/pci-ar2315.c
+++ b/arch/mips/pci/pci-ar2315.c
@@ -149,6 +149,13 @@
 #define AR2315_PCI_HOST_SLOT   3
 #define AR2315_PCI_HOST_DEVID  ((0xff18 << 16) | PCI_VENDOR_ID_ATHEROS)
 
+/*
+ * We need some arbitrary non-zero value to be programmed to the BAR1 register
+ * of PCI host controller to enable DMA. The same value should be used as the
+ * offset to calculate the physical address of DMA buffer for PCI devices.
+ */
+#define AR2315_PCI_HOST_SDRAM_BASEADDR 0x2000
+
 /* ??? access BAR */
 #define AR2315_PCI_HOST_MBAR0  0x1000
 /* RAM access BAR */
@@ -167,6 +174,23 @@ struct ar2315_pci_ctrl {
struct resource io_res;
 };
 
+static inline dma_addr_t ar2315_dev_offset(struct device *dev)
+{
+   if (dev && dev_is_pci(dev))
+   return AR2315_PCI_HOST_SDRAM_BASEADDR;
+   return 0;
+}
+
+dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+   return paddr + ar2315_dev_offset(dev);
+}
+
+phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr - ar2315_dev_offset(dev);
+}
+
 static inline struct ar2315_pci_ctrl *ar2315_pci_bus_to_apc(struct pci_bus *bus)
 {
struct pci_controller *hose = bus->sysdata;
-- 
2.17.1



[PATCH 18/25] MIPS: loongson64: use generic dma noncoherent ops

2018-06-15 Thread Christoph Hellwig
Provide phys_to_dma/dma_to_phys helpers, everything else is generic.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig |  1 +
 .../asm/mach-loongson64/dma-coherence.h   | 69 ---
 arch/mips/loongson64/Kconfig  |  2 -
 arch/mips/loongson64/common/Makefile  |  1 +
 arch/mips/loongson64/common/dma.c | 16 +
 5 files changed, 18 insertions(+), 71 deletions(-)
 delete mode 100644 arch/mips/include/asm/mach-loongson64/dma-coherence.h
 create mode 100644 arch/mips/loongson64/common/dma.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 326bd73bc5bf..e192e484b1b8 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1827,6 +1827,7 @@ config CPU_LOONGSON2
select CPU_SUPPORTS_64BIT_KERNEL
select CPU_SUPPORTS_HIGHMEM
select CPU_SUPPORTS_HUGEPAGES
+   select ARCH_HAS_PHYS_TO_DMA
 
 config CPU_LOONGSON1
bool
diff --git a/arch/mips/include/asm/mach-loongson64/dma-coherence.h b/arch/mips/include/asm/mach-loongson64/dma-coherence.h
deleted file mode 100644
index 651dd2eb3ee5..
--- a/arch/mips/include/asm/mach-loongson64/dma-coherence.h
+++ /dev/null
@@ -1,69 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2006, 07  Ralf Baechle 
- * Copyright (C) 2007 Lemote, Inc. & Institute of Computing Technology
- * Author: Fuxin Zhang, zhan...@lemote.com
- *
- */
-#ifndef __ASM_MACH_LOONGSON64_DMA_COHERENCE_H
-#define __ASM_MACH_LOONGSON64_DMA_COHERENCE_H
-
-#ifdef CONFIG_SWIOTLB
-#include 
-#endif
-
-struct device;
-
-static inline dma_addr_t plat_map_dma_mem(struct device *dev, void *addr,
- size_t size)
-{
-   return virt_to_phys(addr) | 0x8000;
-}
-
-static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
-  struct page *page)
-{
-   return page_to_phys(page) | 0x8000;
-}
-
-static inline unsigned long plat_dma_addr_to_phys(struct device *dev,
-   dma_addr_t dma_addr)
-{
-#if defined(CONFIG_CPU_LOONGSON2F) && defined(CONFIG_64BIT)
-   return (dma_addr > 0x8fff) ? dma_addr : (dma_addr & 0x0fff);
-#else
-   return dma_addr & 0x7fff;
-#endif
-}
-
-static inline void plat_unmap_dma_mem(struct device *dev, dma_addr_t dma_addr,
-   size_t size, enum dma_data_direction direction)
-{
-}
-
-static inline int plat_dma_supported(struct device *dev, u64 mask)
-{
-   /*
-* we fall back to GFP_DMA when the mask isn't all 1s,
-* so we can't guarantee allocations that must be
-* within a tighter range than GFP_DMA..
-*/
-   if (mask < DMA_BIT_MASK(24))
-   return 0;
-
-   return 1;
-}
-
-static inline int plat_device_is_coherent(struct device *dev)
-{
-   return 0;
-}
-
-static inline void plat_post_dma_flush(struct device *dev)
-{
-}
-
-#endif /* __ASM_MACH_LOONGSON64_DMA_COHERENCE_H */
diff --git a/arch/mips/loongson64/Kconfig b/arch/mips/loongson64/Kconfig
index a785bf8da3f3..c865b4b9b775 100644
--- a/arch/mips/loongson64/Kconfig
+++ b/arch/mips/loongson64/Kconfig
@@ -13,7 +13,6 @@ config LEMOTE_FULOONG2E
select CSRC_R4K
select SYS_HAS_CPU_LOONGSON2E
select DMA_NONCOHERENT
-   select MIPS_DMA_DEFAULT
select BOOT_ELF32
select BOARD_SCACHE
select HW_HAS_PCI
@@ -45,7 +44,6 @@ config LEMOTE_MACH2F
select CS5536
select CSRC_R4K if ! MIPS_EXTERNAL_TIMER
select DMA_NONCOHERENT
-   select MIPS_DMA_DEFAULT
select GENERIC_ISA_DMA_SUPPORT_BROKEN
select HAVE_CLK
select HW_HAS_PCI
diff --git a/arch/mips/loongson64/common/Makefile b/arch/mips/loongson64/common/Makefile
index 684624f61f5a..57ee03022941 100644
--- a/arch/mips/loongson64/common/Makefile
+++ b/arch/mips/loongson64/common/Makefile
@@ -6,6 +6,7 @@
 obj-y += setup.o init.o cmdline.o env.o time.o reset.o irq.o \
 bonito-irq.o mem.o machtype.o platform.o serial.o
 obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_CPU_LOONGSON2) += dma.o
 
 #
 # Serial port support
diff --git a/arch/mips/loongson64/common/dma.c b/arch/mips/loongson64/common/dma.c
new file mode 100644
index ..95ede4b0fbbb
--- /dev/null
+++ b/arch/mips/loongson64/common/dma.c
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+
+dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+   return paddr | 0x8000;
+}
+
+phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dma_addr)
+{
+#if defined(CONFIG_CPU_LOONGSON2F) && defined(CONFIG_64BIT)
+   if (dma_addr > 0x8fff)
+   return dma_addr;
+#endif
+   return dma_addr & 0x0fff;
+}
-- 
2.17.1


[PATCH 13/25] MIPS: loongson: remove loongson-3 handling from dma-coherence.h

2018-06-15 Thread Christoph Hellwig
Loongson3 is dma coherent and uses swiotlb, so it will never use any
of these helpers.

Signed-off-by: Christoph Hellwig 
---
 .../include/asm/mach-loongson64/dma-coherence.h  | 16 +---
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/arch/mips/include/asm/mach-loongson64/dma-coherence.h b/arch/mips/include/asm/mach-loongson64/dma-coherence.h
index b8825a7d1279..651dd2eb3ee5 100644
--- a/arch/mips/include/asm/mach-loongson64/dma-coherence.h
+++ b/arch/mips/include/asm/mach-loongson64/dma-coherence.h
@@ -20,29 +20,19 @@ struct device;
 static inline dma_addr_t plat_map_dma_mem(struct device *dev, void *addr,
  size_t size)
 {
-#ifdef CONFIG_CPU_LOONGSON3
-   return __phys_to_dma(dev, virt_to_phys(addr));
-#else
return virt_to_phys(addr) | 0x8000;
-#endif
 }
 
 static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
   struct page *page)
 {
-#ifdef CONFIG_CPU_LOONGSON3
-   return __phys_to_dma(dev, page_to_phys(page));
-#else
return page_to_phys(page) | 0x8000;
-#endif
 }
 
 static inline unsigned long plat_dma_addr_to_phys(struct device *dev,
dma_addr_t dma_addr)
 {
-#if defined(CONFIG_CPU_LOONGSON3) && defined(CONFIG_64BIT)
-   return __dma_to_phys(dev, dma_addr);
-#elif defined(CONFIG_CPU_LOONGSON2F) && defined(CONFIG_64BIT)
+#if defined(CONFIG_CPU_LOONGSON2F) && defined(CONFIG_64BIT)
return (dma_addr > 0x8fff) ? dma_addr : (dma_addr & 0x0fff);
 #else
return dma_addr & 0x7fff;
@@ -69,11 +59,7 @@ static inline int plat_dma_supported(struct device *dev, u64 mask)
 
 static inline int plat_device_is_coherent(struct device *dev)
 {
-#ifdef CONFIG_DMA_NONCOHERENT
return 0;
-#else
-   return 1;
-#endif /* CONFIG_DMA_NONCOHERENT */
 }
 
 static inline void plat_post_dma_flush(struct device *dev)
-- 
2.17.1



[PATCH 16/25] MIPS: move coherentio setup to setup.c

2018-06-15 Thread Christoph Hellwig
We want to be able to use it even when not building dma-default.c
in the near future.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/kernel/setup.c   | 24 
 arch/mips/mm/dma-default.c | 23 ---
 2 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 2c96c0c68116..3d4524309b5c 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1055,3 +1056,26 @@ static int __init debugfs_mips(void)
 }
 arch_initcall(debugfs_mips);
 #endif
+
+#if defined(CONFIG_DMA_MAYBE_COHERENT) && !defined(CONFIG_DMA_PERDEV_COHERENT)
+/* User defined DMA coherency from command line. */
+enum coherent_io_user_state coherentio = IO_COHERENCE_DEFAULT;
+EXPORT_SYMBOL_GPL(coherentio);
+int hw_coherentio = 0; /* Actual hardware supported DMA coherency setting. */
+
+static int __init setcoherentio(char *str)
+{
+   coherentio = IO_COHERENCE_ENABLED;
+   pr_info("Hardware DMA cache coherency (command line)\n");
+   return 0;
+}
+early_param("coherentio", setcoherentio);
+
+static int __init setnocoherentio(char *str)
+{
+   coherentio = IO_COHERENCE_DISABLED;
+   pr_info("Software DMA cache coherency (command line)\n");
+   return 0;
+}
+early_param("nocoherentio", setnocoherentio);
+#endif
diff --git a/arch/mips/mm/dma-default.c b/arch/mips/mm/dma-default.c
index 2db6c2a6f964..10b56e8a2076 100644
--- a/arch/mips/mm/dma-default.c
+++ b/arch/mips/mm/dma-default.c
@@ -24,29 +24,6 @@
 
 #include 
 
-#if defined(CONFIG_DMA_MAYBE_COHERENT) && !defined(CONFIG_DMA_PERDEV_COHERENT)
-/* User defined DMA coherency from command line. */
-enum coherent_io_user_state coherentio = IO_COHERENCE_DEFAULT;
-EXPORT_SYMBOL_GPL(coherentio);
-int hw_coherentio = 0; /* Actual hardware supported DMA coherency setting. */
-
-static int __init setcoherentio(char *str)
-{
-   coherentio = IO_COHERENCE_ENABLED;
-   pr_info("Hardware DMA cache coherency (command line)\n");
-   return 0;
-}
-early_param("coherentio", setcoherentio);
-
-static int __init setnocoherentio(char *str)
-{
-   coherentio = IO_COHERENCE_DISABLED;
-   pr_info("Software DMA cache coherency (command line)\n");
-   return 0;
-}
-early_param("nocoherentio", setnocoherentio);
-#endif
-
 static inline struct page *dma_addr_to_page(struct device *dev,
dma_addr_t dma_addr)
 {
-- 
2.17.1



[PATCH 17/25] MIPS: use generic dma noncoherent ops for simple noncoherent platforms

2018-06-15 Thread Christoph Hellwig
Convert everything not overriding dma-coherence.h to the generic
noncoherent ops.  The new dma-noncoherent.c file duplicates a lot of
the code in dma-default.c, but that file will be gone by the end of
this series.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig                   |  24 +---
 arch/mips/include/asm/dma-mapping.h |   2 +
 arch/mips/loongson32/Kconfig        |   2 -
 arch/mips/mm/Makefile               |   1 +
 arch/mips/mm/dma-noncoherent.c      | 208 
 arch/mips/pic32/Kconfig             |   1 -
 arch/mips/txx9/Kconfig              |   1 -
 arch/mips/vr41xx/Kconfig            |   5 -
 8 files changed, 216 insertions(+), 28 deletions(-)
 create mode 100644 arch/mips/mm/dma-noncoherent.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 8bf378651d74..326bd73bc5bf 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -101,7 +101,6 @@ config MIPS_GENERIC
select IRQ_MIPS_CPU
select LIBFDT
select MIPS_CPU_SCACHE
-   select MIPS_DMA_DEFAULT
select MIPS_GIC
select MIPS_L1_CACHE_SHIFT_7
select NO_EXCEPT_FILL
@@ -145,7 +144,6 @@ config MIPS_ALCHEMY
select CEVT_R4K
select CSRC_R4K
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select DMA_MAYBE_COHERENT   # Au1000,1500,1100 aren't, rest is
select SYS_HAS_CPU_MIPS32_R1
select SYS_SUPPORTS_32BIT_KERNEL
@@ -161,7 +159,6 @@ config AR7
select CEVT_R4K
select CSRC_R4K
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select NO_EXCEPT_FILL
select SWAP_IO_SPACE
select SYS_HAS_CPU_MIPS32_R1
@@ -204,7 +201,6 @@ config ATH79
select COMMON_CLK
select CLKDEV_LOOKUP
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select MIPS_MACHINE
select SYS_HAS_CPU_MIPS32_R2
select SYS_HAS_EARLY_PRINTK
@@ -262,7 +258,6 @@ config BCM47XX
select HW_HAS_PCI
select IRQ_MIPS_CPU
select SYS_HAS_CPU_MIPS32_R1
-   select MIPS_DMA_DEFAULT
select NO_EXCEPT_FILL
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_LITTLE_ENDIAN
@@ -286,7 +281,6 @@ config BCM63XX
select SYNC_R4K
select DMA_NONCOHERENT
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_BIG_ENDIAN
select SYS_HAS_EARLY_PRINTK
@@ -309,7 +303,6 @@ config MIPS_COBALT
select I8259
select IRQ_MIPS_CPU
select IRQ_GT641XX
-   select MIPS_DMA_DEFAULT
select PCI_GT64XXX_PCI0
select PCI
select SYS_HAS_CPU_NEVADA
@@ -330,7 +323,6 @@ config MACH_DECSTATION
select CPU_R4000_WORKAROUNDS if 64BIT
select CPU_R4400_WORKAROUNDS if 64BIT
select DMA_NONCOHERENT
-   select MIPS_DMA_DEFAULT
select NO_IOPORT_MAP
select IRQ_MIPS_CPU
select SYS_HAS_CPU_R3000
@@ -390,7 +382,6 @@ config MACH_INGENIC
select SYS_SUPPORTS_ZBOOT_UART16550
select DMA_NONCOHERENT
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select PINCTRL
select GPIOLIB
select COMMON_CLK
@@ -405,7 +396,6 @@ config LANTIQ
select IRQ_MIPS_CPU
select CEVT_R4K
select CSRC_R4K
-   select MIPS_DMA_DEFAULT
select SYS_HAS_CPU_MIPS32_R1
select SYS_HAS_CPU_MIPS32_R2
select SYS_SUPPORTS_BIG_ENDIAN
@@ -433,7 +423,6 @@ config LASAT
select SYS_HAS_EARLY_PRINTK
select HW_HAS_PCI
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select PCI_GT64XXX_PCI0
select MIPS_NILE4
select R5000_CPU_SCACHE
@@ -479,7 +468,6 @@ config MACH_PISTACHIO
select LIBFDT
select MFD_SYSCON
select MIPS_CPU_SCACHE
-   select MIPS_DMA_DEFAULT
select MIPS_GIC
select PINCTRL
select REGULATOR
@@ -512,7 +500,6 @@ config MIPS_MALTA
select GENERIC_ISA_DMA
select HAVE_PCSPKR_PLATFORM
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select MIPS_GIC
select HW_HAS_PCI
select I8253
@@ -607,7 +594,6 @@ config PMC_MSP
select SYS_SUPPORTS_BIG_ENDIAN
select SYS_SUPPORTS_MIPS16
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select SERIAL_8250
select SERIAL_8250_CONSOLE
select USB_EHCI_BIG_ENDIAN_MMIO
@@ -625,7 +611,6 @@ config RALINK
select BOOT_RAW
select DMA_NONCOHERENT
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select USE_OF
select SYS_HAS_CPU_MIPS32_R1
select SYS_HAS_CPU_MIPS32_R2
@@ -652,7 +637,6 @@ config SGI_IP22
select I8259
select IP22_CPU_SCACHE
select IRQ_MIPS_CPU
-   select MIPS_DMA_DEFAULT
select GENERIC_ISA_DMA_SUPPORT_BROKEN
select SGI_HAS_I8042
select SGI_HAS_INDYDOG
@@ -713,7 +697,6 @@ config SGI_IP28

[PATCH 05/25] MIPS: Octeon: refactor swiotlb code

2018-06-15 Thread Christoph Hellwig
Share a common set of swiotlb operations, and instead branch out in
__phys_to_dma/__dma_to_phys for the PCI vs non-PCI case.  Also use const
structures for the PCI methods so that attackers can't use them as
exploit vectors.
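
As a rough illustration of the dispatch described above, here is a minimal
userspace sketch. It is not the Octeon code: struct device is reduced to a
hypothetical is_pci flag, and demo_ops with DEMO_PCI_OFFSET is a made-up
translation standing in for the per-BAR-type tables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

/* Hypothetical stand-in for struct device: it only records the bus type. */
struct device {
	bool is_pci;
};

/* Translation hooks, mirroring the shape of the table in the patch. */
struct xlate_ops {
	dma_addr_t (*phys_to_dma)(struct device *dev, phys_addr_t paddr);
	phys_addr_t (*dma_to_phys)(struct device *dev, dma_addr_t daddr);
};

/* Made-up offset, for illustration only. */
#define DEMO_PCI_OFFSET 0x80000000ull

static dma_addr_t demo_phys_to_dma(struct device *dev, phys_addr_t paddr)
{
	(void)dev;
	return paddr + DEMO_PCI_OFFSET;
}

static phys_addr_t demo_dma_to_phys(struct device *dev, dma_addr_t daddr)
{
	(void)dev;
	return daddr - DEMO_PCI_OFFSET;
}

/* const: the function pointers are not writable through this object. */
static const struct xlate_ops demo_ops = {
	.phys_to_dma = demo_phys_to_dma,
	.dma_to_phys = demo_dma_to_phys,
};

/* Selected once at init time; only this pointer is ever assigned. */
static const struct xlate_ops *pci_ops = &demo_ops;

dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
{
	if (dev && dev->is_pci)
		return pci_ops->phys_to_dma(dev, paddr);
	return paddr;	/* non-PCI devices use a 1:1 mapping */
}

phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr)
{
	if (dev && dev->is_pci)
		return pci_ops->dma_to_phys(dev, daddr);
	return daddr;
}
```

With this shape the dma_map_ops themselves no longer need to be wrapped per
platform, which is what lets the swiotlb operations be shared.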

Signed-off-by: Christoph Hellwig 
---
 arch/mips/cavium-octeon/dma-octeon.c  | 161 --
 .../asm/mach-cavium-octeon/dma-coherence.h|   2 -
 arch/mips/pci/pci-octeon.c|   2 -
 3 files changed, 71 insertions(+), 94 deletions(-)

diff --git a/arch/mips/cavium-octeon/dma-octeon.c 
b/arch/mips/cavium-octeon/dma-octeon.c
index e5d00c79bd26..7f0c9f926b6e 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -23,10 +23,16 @@
 #include 
 
 #ifdef CONFIG_PCI
+#include 
 #include 
 #include 
 #include 
 
+struct octeon_dma_map_ops {
+   dma_addr_t (*phys_to_dma)(struct device *dev, phys_addr_t paddr);
+   phys_addr_t (*dma_to_phys)(struct device *dev, dma_addr_t daddr);
+};
+
 static dma_addr_t octeon_hole_phys_to_dma(phys_addr_t paddr)
 {
if (paddr >= CVMX_PCIE_BAR1_PHYS_BASE && paddr < 
(CVMX_PCIE_BAR1_PHYS_BASE + CVMX_PCIE_BAR1_PHYS_SIZE))
@@ -60,6 +66,11 @@ static phys_addr_t octeon_gen1_dma_to_phys(struct device 
*dev, dma_addr_t daddr)
return daddr;
 }
 
+static const struct octeon_dma_map_ops octeon_gen1_ops = {
+   .phys_to_dma = octeon_gen1_phys_to_dma,
+   .dma_to_phys = octeon_gen1_dma_to_phys,
+};
+
 static dma_addr_t octeon_gen2_phys_to_dma(struct device *dev, phys_addr_t 
paddr)
 {
return octeon_hole_phys_to_dma(paddr);
@@ -70,6 +81,11 @@ static phys_addr_t octeon_gen2_dma_to_phys(struct device 
*dev, dma_addr_t daddr)
return octeon_hole_dma_to_phys(daddr);
 }
 
+static const struct octeon_dma_map_ops octeon_gen2_ops = {
+   .phys_to_dma = octeon_gen2_phys_to_dma,
+   .dma_to_phys = octeon_gen2_dma_to_phys,
+};
+
 static dma_addr_t octeon_big_phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
if (paddr >= 0x41000ull && paddr < 0x42000ull)
@@ -92,6 +108,11 @@ static phys_addr_t octeon_big_dma_to_phys(struct device 
*dev, dma_addr_t daddr)
return daddr;
 }
 
+static const struct octeon_dma_map_ops octeon_big_ops = {
+   .phys_to_dma = octeon_big_phys_to_dma,
+   .dma_to_phys = octeon_big_dma_to_phys,
+};
+
 static dma_addr_t octeon_small_phys_to_dma(struct device *dev,
   phys_addr_t paddr)
 {
@@ -120,6 +141,32 @@ static phys_addr_t octeon_small_dma_to_phys(struct device 
*dev,
return daddr;
 }
 
+static const struct octeon_dma_map_ops octeon_small_ops = {
+   .phys_to_dma = octeon_small_phys_to_dma,
+   .dma_to_phys = octeon_small_dma_to_phys,
+};
+
+static const struct octeon_dma_map_ops *octeon_pci_dma_ops;
+
+void __init octeon_pci_dma_init(void)
+{
+   switch (octeon_dma_bar_type) {
+   case OCTEON_DMA_BAR_TYPE_PCIE:
+   octeon_pci_dma_ops = &octeon_gen1_ops;
+   break;
+   case OCTEON_DMA_BAR_TYPE_PCIE2:
+   octeon_pci_dma_ops = &octeon_gen2_ops;
+   break;
+   case OCTEON_DMA_BAR_TYPE_BIG:
+   octeon_pci_dma_ops = &octeon_big_ops;
+   break;
+   case OCTEON_DMA_BAR_TYPE_SMALL:
+   octeon_pci_dma_ops = &octeon_small_ops;
+   break;
+   default:
+   BUG();
+   }
+}
 #endif /* CONFIG_PCI */
 
 static dma_addr_t octeon_dma_map_page(struct device *dev, struct page *page,
@@ -165,57 +212,37 @@ static void *octeon_dma_alloc_coherent(struct device 
*dev, size_t size,
return ret;
 }
 
-static dma_addr_t octeon_unity_phys_to_dma(struct device *dev, phys_addr_t 
paddr)
-{
-   return paddr;
-}
-
-static phys_addr_t octeon_unity_dma_to_phys(struct device *dev, dma_addr_t 
daddr)
-{
-   return daddr;
-}
-
-struct octeon_dma_map_ops {
-   const struct dma_map_ops dma_map_ops;
-   dma_addr_t (*phys_to_dma)(struct device *dev, phys_addr_t paddr);
-   phys_addr_t (*dma_to_phys)(struct device *dev, dma_addr_t daddr);
-};
-
 dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
-   struct octeon_dma_map_ops *ops = container_of(get_dma_ops(dev),
- struct octeon_dma_map_ops,
- dma_map_ops);
-
-   return ops->phys_to_dma(dev, paddr);
+#ifdef CONFIG_PCI
+   if (dev && dev_is_pci(dev))
+   return octeon_pci_dma_ops->phys_to_dma(dev, paddr);
+#endif
+   return paddr;
 }
 
 phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr)
 {
-   struct octeon_dma_map_ops *ops = container_of(get_dma_ops(dev),
- struct octeon_dma_map_ops,
- dma_map_ops);
-
-   return ops->dma_to_phys(dev, daddr);
+#ifdef CONFIG_PCI
+   if (dev && dev_is_pci(dev))
+ 

[PATCH 01/25] MIPS: remove a dead ifdef from mach-ath25/dma-coherence.h

2018-06-15 Thread Christoph Hellwig
ath25 is always non-coherent, so keeping these ifdefs doesn't make any sense.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Paul Burton 
---
 arch/mips/include/asm/mach-ath25/dma-coherence.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/mips/include/asm/mach-ath25/dma-coherence.h 
b/arch/mips/include/asm/mach-ath25/dma-coherence.h
index d5defdde32db..124755d4f079 100644
--- a/arch/mips/include/asm/mach-ath25/dma-coherence.h
+++ b/arch/mips/include/asm/mach-ath25/dma-coherence.h
@@ -61,12 +61,7 @@ static inline int plat_dma_supported(struct device *dev, u64 
mask)
 
 static inline int plat_device_is_coherent(struct device *dev)
 {
-#ifdef CONFIG_DMA_COHERENT
-   return 1;
-#endif
-#ifdef CONFIG_DMA_NONCOHERENT
return 0;
-#endif
 }
 
 static inline void plat_post_dma_flush(struct device *dev)
-- 
2.17.1



[PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache

2018-06-15 Thread Vivek Gautam
Qualcomm SoCs have an additional level of cache called the
system cache or last level cache [1]. This cache sits right
before the DDR, and is tightly coupled with the memory
controller.
The cache is available to all the clients present in the
SoC system. The clients request their slices from this system
cache, make them active, and can then start using them. For
clients behind an SMMU to start using the system cache for
DMA buffers and related page tables [2], a few of the memory
attributes need to be set accordingly.
This change makes the related memory Outer-Shareable, and
updates the MAIR with the necessary protection.

The MAIR attribute requirements are:
Inner Cacheability = 0
Outer Cacheability = 1, Write-Back Write Allocate
Outer Shareability = 1
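
For reference, a small sketch of how these requirements map onto the 0xf4
attribute byte and MAIR index 3 that the patch defines. The helper names
(mair_attr, mair_set_attr) are illustrative and do not exist in
io-pgtable-arm.c; the nibble values follow the ARMv8 MAIR encoding, where
bits [7:4] describe the outer policy and bits [3:0] the inner one.

```c
#include <stdint.h>

/* Each MAIR attribute field is 8 bits wide, so index n starts at bit 8*n. */
#define MAIR_ATTR_SHIFT(n)	((n) << 3)

/* Nibble encodings for normal memory in the ARMv8 MAIR format. */
#define MAIR_NIBBLE_NC		0x4u	/* Non-cacheable */
#define MAIR_NIBBLE_WB_RWA	0xfu	/* Write-Back, Read/Write-Allocate */

/* Build one attribute byte: outer policy in [7:4], inner policy in [3:0]. */
uint8_t mair_attr(uint8_t outer, uint8_t inner)
{
	return (uint8_t)(((outer & 0xf) << 4) | (inner & 0xf));
}

/* Return mair with the 8-bit slot at index idx replaced by attr. */
uint64_t mair_set_attr(uint64_t mair, unsigned int idx, uint8_t attr)
{
	uint64_t mask = 0xffull << MAIR_ATTR_SHIFT(idx);

	return (mair & ~mask) | ((uint64_t)attr << MAIR_ATTR_SHIFT(idx));
}
```

mair_attr(MAIR_NIBBLE_WB_RWA, MAIR_NIBBLE_NC) yields 0xf4, i.e. outer
write-back cacheable with inner non-cacheable, and placing it at index 3
fills bits [31:24] of the register value.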

This change is a realisation of the following changes
from downstream msm-4.9:
iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT
iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT

[1] https://patchwork.kernel.org/patch/10422531/
[2] https://patchwork.kernel.org/patch/10302791/

Signed-off-by: Vivek Gautam 
---
 drivers/iommu/arm-smmu.c       | 14 ++
 drivers/iommu/io-pgtable-arm.c | 24 +++-
 drivers/iommu/io-pgtable.h     |  4 
 include/linux/iommu.h          |  4 
 4 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index f7a96bcf94a6..8058e7205034 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -249,6 +249,7 @@ struct arm_smmu_domain {
struct mutex init_mutex; /* Protects smmu pointer */
spinlock_t  cb_lock; /* Serialises ATS1* ops and TLB syncs */
struct iommu_domain domain;
+   bool has_sys_cache;
 };
 
 struct arm_smmu_option_prop {
@@ -862,6 +863,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
 
if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
+   if (smmu_domain->has_sys_cache)
+   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
 
smmu_domain->smmu = smmu;
pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
@@ -1477,6 +1480,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case DOMAIN_ATTR_NESTING:
*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
return 0;
+   case DOMAIN_ATTR_USE_SYS_CACHE:
+   *((int *)data) = smmu_domain->has_sys_cache;
+   return 0;
default:
return -ENODEV;
}
@@ -1506,6 +1512,14 @@ static int arm_smmu_domain_set_attr(struct iommu_domain 
*domain,
smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
break;
+   case DOMAIN_ATTR_USE_SYS_CACHE:
+   if (smmu_domain->smmu) {
+   ret = -EPERM;
+   goto out_unlock;
+   }
+   if (*((int *)data))
+   smmu_domain->has_sys_cache = true;
+   break;
default:
ret = -ENODEV;
}
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 010a254305dd..b2aee1828524 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -169,9 +169,11 @@
 #define ARM_LPAE_MAIR_ATTR_DEVICE  0x04
 #define ARM_LPAE_MAIR_ATTR_NC  0x44
 #define ARM_LPAE_MAIR_ATTR_WBRWA   0xff
+#define ARM_LPAE_MAIR_ATTR_SYS_CACHE   0xf4
 #define ARM_LPAE_MAIR_ATTR_IDX_NC  0
 #define ARM_LPAE_MAIR_ATTR_IDX_CACHE   1
 #define ARM_LPAE_MAIR_ATTR_IDX_DEV 2
+#define ARM_LPAE_MAIR_ATTR_IDX_SYS_CACHE   3
 
 /* IOPTE accessors */
 #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
@@ -442,6 +444,10 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct 
arm_lpae_io_pgtable *data,
else if (prot & IOMMU_CACHE)
pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
+   else if (prot & IOMMU_SYS_CACHE)
+   pte |= (ARM_LPAE_MAIR_ATTR_IDX_SYS_CACHE
+   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
+
} else {
pte = ARM_LPAE_PTE_HAP_FAULT;
if (prot & IOMMU_READ)
@@ -771,7 +777,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
u64 reg;
struct arm_lpae_io_pgtable *data;
 
-   if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA))
+   if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
+   IO_PGTABLE_QUIRK_SYS_CACHE))
return NULL;
 
data = arm_lpae_alloc_pgtable(cfg);
@@ -779,9 +786,14 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
return 

switch mips to use the generic dma map ops v2

2018-06-15 Thread Christoph Hellwig
Hi all,

this huge series does a deep cleaning of the mips dma mapping code and
moves most architectures over to use the generic dma_direct_ops or
dma_noncoherent_ops.  The Jazz architecture grows a new dma_map_ops
tailored to its bare bones iommu implementation, and the swiotlb code
used by Loongson-3 and Octeon is merged into a single implementation,
pending further unification with the generic swiotlb_ops in another
step.
Note that all this has been compile tested only, and I've probably
missed even that for some platforms.

A git tree is available here:

git://git.infradead.org/users/hch/misc.git mips-direct-ops

Gitweb:


http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/mips-direct-ops


Changes since v1:
 - addressed review comments from Paul Burton


[PATCH 09/25] MIPS: make the default mips dma implementation optional

2018-06-15 Thread Christoph Hellwig
Octeon and loongson64 already don't use it at all, and we're going to
migrate more platforms away from it.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig                   | 40 +
 arch/mips/include/asm/dma-mapping.h |  4 ++-
 arch/mips/jazz/Kconfig              |  3 +++
 arch/mips/loongson32/Kconfig        |  2 ++
 arch/mips/loongson64/Kconfig        |  2 ++
 arch/mips/mm/Makefile               |  3 ++-
 arch/mips/pic32/Kconfig             |  1 +
 arch/mips/txx9/Kconfig              |  1 +
 arch/mips/vr41xx/Kconfig            |  5 
 9 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 89be9f97da4e..bc8893063609 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -76,6 +76,9 @@ config MIPS
select SYSCTL_EXCEPTION_TRACE
select VIRT_TO_BUS
 
+config MIPS_DMA_DEFAULT
+   bool
+
 menu "Machine selection"
 
 choice
@@ -97,6 +100,7 @@ config MIPS_GENERIC
select IRQ_MIPS_CPU
select LIBFDT
select MIPS_CPU_SCACHE
+   select MIPS_DMA_DEFAULT
select MIPS_GIC
select MIPS_L1_CACHE_SHIFT_7
select NO_EXCEPT_FILL
@@ -140,6 +144,7 @@ config MIPS_ALCHEMY
select CEVT_R4K
select CSRC_R4K
select IRQ_MIPS_CPU
+   select MIPS_DMA_DEFAULT
select DMA_MAYBE_COHERENT   # Au1000,1500,1100 aren't, rest is
select SYS_HAS_CPU_MIPS32_R1
select SYS_SUPPORTS_32BIT_KERNEL
@@ -155,6 +160,7 @@ config AR7
select CEVT_R4K
select CSRC_R4K
select IRQ_MIPS_CPU
+   select MIPS_DMA_DEFAULT
select NO_EXCEPT_FILL
select SWAP_IO_SPACE
select SYS_HAS_CPU_MIPS32_R1
@@ -177,6 +183,7 @@ config ATH25
select DMA_NONCOHERENT
select IRQ_MIPS_CPU
select IRQ_DOMAIN
+   select MIPS_DMA_DEFAULT
select SYS_HAS_CPU_MIPS32_R1
select SYS_SUPPORTS_BIG_ENDIAN
select SYS_SUPPORTS_32BIT_KERNEL
@@ -196,6 +203,7 @@ config ATH79
select COMMON_CLK
select CLKDEV_LOOKUP
select IRQ_MIPS_CPU
+   select MIPS_DMA_DEFAULT
select MIPS_MACHINE
select SYS_HAS_CPU_MIPS32_R2
select SYS_HAS_EARLY_PRINTK
@@ -222,6 +230,7 @@ config BMIPS_GENERIC
select BCM7120_L2_IRQ
select BRCMSTB_L2_IRQ
select IRQ_MIPS_CPU
+   select MIPS_DMA_DEFAULT
select DMA_NONCOHERENT
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_LITTLE_ENDIAN
@@ -252,6 +261,7 @@ config BCM47XX
select HW_HAS_PCI
select IRQ_MIPS_CPU
select SYS_HAS_CPU_MIPS32_R1
+   select MIPS_DMA_DEFAULT
select NO_EXCEPT_FILL
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_LITTLE_ENDIAN
@@ -275,6 +285,7 @@ config BCM63XX
select SYNC_R4K
select DMA_NONCOHERENT
select IRQ_MIPS_CPU
+   select MIPS_DMA_DEFAULT
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_BIG_ENDIAN
select SYS_HAS_EARLY_PRINTK
@@ -297,6 +308,7 @@ config MIPS_COBALT
select I8259
select IRQ_MIPS_CPU
select IRQ_GT641XX
+   select MIPS_DMA_DEFAULT
select PCI_GT64XXX_PCI0
select PCI
select SYS_HAS_CPU_NEVADA
@@ -317,6 +329,7 @@ config MACH_DECSTATION
select CPU_R4000_WORKAROUNDS if 64BIT
select CPU_R4400_WORKAROUNDS if 64BIT
select DMA_NONCOHERENT
+   select MIPS_DMA_DEFAULT
select NO_IOPORT_MAP
select IRQ_MIPS_CPU
select SYS_HAS_CPU_R3000
@@ -376,6 +389,7 @@ config MACH_INGENIC
select SYS_SUPPORTS_ZBOOT_UART16550
select DMA_NONCOHERENT
select IRQ_MIPS_CPU
+   select MIPS_DMA_DEFAULT
select PINCTRL
select GPIOLIB
select COMMON_CLK
@@ -390,6 +404,7 @@ config LANTIQ
select IRQ_MIPS_CPU
select CEVT_R4K
select CSRC_R4K
+   select MIPS_DMA_DEFAULT
select SYS_HAS_CPU_MIPS32_R1
select SYS_HAS_CPU_MIPS32_R2
select SYS_SUPPORTS_BIG_ENDIAN
@@ -417,6 +432,7 @@ config LASAT
select SYS_HAS_EARLY_PRINTK
select HW_HAS_PCI
select IRQ_MIPS_CPU
+   select MIPS_DMA_DEFAULT
select PCI_GT64XXX_PCI0
select MIPS_NILE4
select R5000_CPU_SCACHE
@@ -463,6 +479,7 @@ config MACH_PISTACHIO
select LIBFDT
select MFD_SYSCON
select MIPS_CPU_SCACHE
+   select MIPS_DMA_DEFAULT
select MIPS_GIC
select PINCTRL
select REGULATOR
@@ -495,6 +512,7 @@ config MIPS_MALTA
select GENERIC_ISA_DMA
select HAVE_PCSPKR_PLATFORM
select IRQ_MIPS_CPU
+   select MIPS_DMA_DEFAULT
select MIPS_GIC
select HW_HAS_PCI
select I8253
@@ -551,6 +569,7 @@ config NEC_MARKEINS
bool "NEC EMMA2RH Mark-eins board"
select SOC_EMMA2RH
select HW_HAS_PCI
+   select MIPS_DMA_DEFAULT
help
  This enables support for the 

[PATCH 10/25] MIPS: Octeon: remove mips dma-default stubs

2018-06-15 Thread Christoph Hellwig
Octeon doesn't use the dma-default code, and now doesn't build it either,
so these stubs can be removed.

Signed-off-by: Christoph Hellwig 
---
 .../asm/mach-cavium-octeon/dma-coherence.h| 48 ---
 1 file changed, 48 deletions(-)

diff --git a/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h 
b/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h
index c0254c72d97b..66eee98b8b8d 100644
--- a/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h
+++ b/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h
@@ -4,11 +4,6 @@
  * for more details.
  *
  * Copyright (C) 2006  Ralf Baechle 
- *
- *
- * Similar to mach-generic/dma-coherence.h except
- * plat_device_is_coherent hard coded to return 1.
- *
  */
 #ifndef __ASM_MACH_CAVIUM_OCTEON_DMA_COHERENCE_H
 #define __ASM_MACH_CAVIUM_OCTEON_DMA_COHERENCE_H
@@ -18,49 +13,6 @@
 struct device;
 
 extern void octeon_pci_dma_init(void);
-
-static inline dma_addr_t plat_map_dma_mem(struct device *dev, void *addr,
-   size_t size)
-{
-   BUG();
-   return 0;
-}
-
-static inline dma_addr_t plat_map_dma_mem_page(struct device *dev,
-   struct page *page)
-{
-   BUG();
-   return 0;
-}
-
-static inline unsigned long plat_dma_addr_to_phys(struct device *dev,
-   dma_addr_t dma_addr)
-{
-   BUG();
-   return 0;
-}
-
-static inline void plat_unmap_dma_mem(struct device *dev, dma_addr_t dma_addr,
-   size_t size, enum dma_data_direction direction)
-{
-   BUG();
-}
-
-static inline int plat_dma_supported(struct device *dev, u64 mask)
-{
-   BUG();
-   return 0;
-}
-
-static inline int plat_device_is_coherent(struct device *dev)
-{
-   return 1;
-}
-
-static inline void plat_post_dma_flush(struct device *dev)
-{
-}
-
 extern char *octeon_swiotlb;
 
 #endif /* __ASM_MACH_CAVIUM_OCTEON_DMA_COHERENCE_H */
-- 
2.17.1



[PATCH 07/25] MIPS: consolidate the swiotlb implementations

2018-06-15 Thread Christoph Hellwig
Octeon and Loongson share exactly the same code, so move it into a common
implementation, and use that implementation directly from get_arch_dma_ops.

Also provide the expected dma-direct.h helpers directly instead of
delegating to platform dma-coherence.h headers.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/cavium-octeon/dma-octeon.c          | 61 
 arch/mips/include/asm/dma-direct.h            | 17 -
 arch/mips/include/asm/dma-mapping.h           |  5 ++
 .../asm/mach-cavium-octeon/dma-coherence.h    | 11 ---
 .../asm/mach-loongson64/dma-coherence.h       | 10 ---
 arch/mips/loongson64/common/dma-swiotlb.c     | 71 +--
 arch/mips/mm/Makefile                         |  1 +
 arch/mips/mm/dma-swiotlb.c                    | 61 
 8 files changed, 84 insertions(+), 153 deletions(-)
 create mode 100644 arch/mips/mm/dma-swiotlb.c

diff --git a/arch/mips/cavium-octeon/dma-octeon.c 
b/arch/mips/cavium-octeon/dma-octeon.c
index 7f0c9f926b6e..236833be6fbe 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -11,7 +11,6 @@
  * Copyright (C) 2010 Cavium Networks, Inc.
  */
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -169,49 +168,6 @@ void __init octeon_pci_dma_init(void)
 }
 #endif /* CONFIG_PCI */
 
-static dma_addr_t octeon_dma_map_page(struct device *dev, struct page *page,
-   unsigned long offset, size_t size, enum dma_data_direction direction,
-   unsigned long attrs)
-{
-   dma_addr_t daddr = swiotlb_map_page(dev, page, offset, size,
-   direction, attrs);
-   mb();
-
-   return daddr;
-}
-
-static int octeon_dma_map_sg(struct device *dev, struct scatterlist *sg,
-   int nents, enum dma_data_direction direction, unsigned long attrs)
-{
-   int r = swiotlb_map_sg_attrs(dev, sg, nents, direction, attrs);
-   mb();
-   return r;
-}
-
-static void octeon_dma_sync_single_for_device(struct device *dev,
-   dma_addr_t dma_handle, size_t size, enum dma_data_direction direction)
-{
-   swiotlb_sync_single_for_device(dev, dma_handle, size, direction);
-   mb();
-}
-
-static void octeon_dma_sync_sg_for_device(struct device *dev,
-   struct scatterlist *sg, int nelems, enum dma_data_direction direction)
-{
-   swiotlb_sync_sg_for_device(dev, sg, nelems, direction);
-   mb();
-}
-
-static void *octeon_dma_alloc_coherent(struct device *dev, size_t size,
-   dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
-{
-   void *ret = swiotlb_alloc(dev, size, dma_handle, gfp, attrs);
-
-   mb();
-
-   return ret;
-}
-
 dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
 #ifdef CONFIG_PCI
@@ -230,21 +186,6 @@ phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t 
daddr)
return daddr;
 }
 
-static const struct dma_map_ops octeon_swiotlb_ops = {
-   .alloc  = octeon_dma_alloc_coherent,
-   .free   = swiotlb_free,
-   .map_page   = octeon_dma_map_page,
-   .unmap_page = swiotlb_unmap_page,
-   .map_sg = octeon_dma_map_sg,
-   .unmap_sg   = swiotlb_unmap_sg_attrs,
-   .sync_single_for_cpu= swiotlb_sync_single_for_cpu,
-   .sync_single_for_device = octeon_dma_sync_single_for_device,
-   .sync_sg_for_cpu= swiotlb_sync_sg_for_cpu,
-   .sync_sg_for_device = octeon_dma_sync_sg_for_device,
-   .mapping_error  = swiotlb_dma_mapping_error,
-   .dma_supported  = swiotlb_dma_supported
-};
-
 char *octeon_swiotlb;
 
 void __init plat_swiotlb_setup(void)
@@ -307,6 +248,4 @@ void __init plat_swiotlb_setup(void)
 
if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs, 1) == -ENOMEM)
panic("Cannot allocate SWIOTLB buffer");
-
-   mips_dma_map_ops = &octeon_swiotlb_ops;
 }
diff --git a/arch/mips/include/asm/dma-direct.h 
b/arch/mips/include/asm/dma-direct.h
index f32f15530aba..b5c240806e1b 100644
--- a/arch/mips/include/asm/dma-direct.h
+++ b/arch/mips/include/asm/dma-direct.h
@@ -1 +1,16 @@
-#include 
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _MIPS_DMA_DIRECT_H
+#define _MIPS_DMA_DIRECT_H 1
+
+static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t 
size)
+{
+   if (!dev->dma_mask)
+   return false;
+
+   return addr + size - 1 <= *dev->dma_mask;
+}
+
+dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr);
+phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr);
+
+#endif /* _MIPS_DMA_DIRECT_H */
diff --git a/arch/mips/include/asm/dma-mapping.h 
b/arch/mips/include/asm/dma-mapping.h
index 886e75a383f2..ebcce3e22297 100644
--- a/arch/mips/include/asm/dma-mapping.h
+++ b/arch/mips/include/asm/dma-mapping.h
@@ -11,10 +11,15 @@
 #endif
 
 extern const struct dma_map_ops *mips_dma_map_ops;
+extern const struct dma_map_ops mips_swiotlb_ops;
 
 static 

[PATCH 08/25] MIPS: remove the mips_dma_map_ops indirection

2018-06-15 Thread Christoph Hellwig
And use mips_default_dma_map_ops directly.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/include/asm/dma-mapping.h | 4 ++--
 arch/mips/mm/dma-default.c  | 6 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/mips/include/asm/dma-mapping.h 
b/arch/mips/include/asm/dma-mapping.h
index ebcce3e22297..f24b052ec740 100644
--- a/arch/mips/include/asm/dma-mapping.h
+++ b/arch/mips/include/asm/dma-mapping.h
@@ -10,7 +10,7 @@
 #include 
 #endif
 
-extern const struct dma_map_ops *mips_dma_map_ops;
+extern const struct dma_map_ops mips_default_dma_map_ops;
 extern const struct dma_map_ops mips_swiotlb_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
@@ -18,7 +18,7 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
 #ifdef CONFIG_SWIOTLB
return &mips_swiotlb_ops;
 #else
-   return mips_dma_map_ops;
+   return &mips_default_dma_map_ops;
 #endif
 }
 
diff --git a/arch/mips/mm/dma-default.c b/arch/mips/mm/dma-default.c
index f9fef0028ca2..2db6c2a6f964 100644
--- a/arch/mips/mm/dma-default.c
+++ b/arch/mips/mm/dma-default.c
@@ -384,7 +384,7 @@ static void mips_dma_cache_sync(struct device *dev, void 
*vaddr, size_t size,
__dma_sync_virtual(vaddr, size, direction);
 }
 
-static const struct dma_map_ops mips_default_dma_map_ops = {
+const struct dma_map_ops mips_default_dma_map_ops = {
.alloc = mips_dma_alloc_coherent,
.free = mips_dma_free_coherent,
.mmap = mips_dma_mmap,
@@ -399,6 +399,4 @@ static const struct dma_map_ops mips_default_dma_map_ops = {
.dma_supported = mips_dma_supported,
.cache_sync = mips_dma_cache_sync,
 };
-
-const struct dma_map_ops *mips_dma_map_ops = &mips_default_dma_map_ops;
-EXPORT_SYMBOL(mips_dma_map_ops);
+EXPORT_SYMBOL(mips_default_dma_map_ops);
-- 
2.17.1



[PATCH 11/25] MIPS: Octeon: move swiotlb declarations out of dma-coherence.h

2018-06-15 Thread Christoph Hellwig
No need to pull them into a global header.

Signed-off-by: Christoph Hellwig 
---
 .../asm/mach-cavium-octeon/dma-coherence.h | 18 --
 arch/mips/include/asm/octeon/pci-octeon.h  |  3 +++
 arch/mips/pci/pci-octeon.c |  2 --
 arch/mips/pci/pcie-octeon.c|  2 --
 4 files changed, 3 insertions(+), 22 deletions(-)
 delete mode 100644 arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h

diff --git a/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h 
b/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h
deleted file mode 100644
index 66eee98b8b8d..
--- a/arch/mips/include/asm/mach-cavium-octeon/dma-coherence.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2006  Ralf Baechle 
- */
-#ifndef __ASM_MACH_CAVIUM_OCTEON_DMA_COHERENCE_H
-#define __ASM_MACH_CAVIUM_OCTEON_DMA_COHERENCE_H
-
-#include 
-
-struct device;
-
-extern void octeon_pci_dma_init(void);
-extern char *octeon_swiotlb;
-
-#endif /* __ASM_MACH_CAVIUM_OCTEON_DMA_COHERENCE_H */
diff --git a/arch/mips/include/asm/octeon/pci-octeon.h 
b/arch/mips/include/asm/octeon/pci-octeon.h
index 1884609741a8..b12d9a3fbfb6 100644
--- a/arch/mips/include/asm/octeon/pci-octeon.h
+++ b/arch/mips/include/asm/octeon/pci-octeon.h
@@ -63,4 +63,7 @@ enum octeon_dma_bar_type {
  */
 extern enum octeon_dma_bar_type octeon_dma_bar_type;
 
+void octeon_pci_dma_init(void);
+extern char *octeon_swiotlb;
+
 #endif
diff --git a/arch/mips/pci/pci-octeon.c b/arch/mips/pci/pci-octeon.c
index a20697df3539..5017d5843c5a 100644
--- a/arch/mips/pci/pci-octeon.c
+++ b/arch/mips/pci/pci-octeon.c
@@ -21,8 +21,6 @@
 #include 
 #include 
 
-#include 
-
 #define USE_OCTEON_INTERNAL_ARBITER
 
 /*
diff --git a/arch/mips/pci/pcie-octeon.c b/arch/mips/pci/pcie-octeon.c
index 87ba86bd8696..9cc5905860ef 100644
--- a/arch/mips/pci/pcie-octeon.c
+++ b/arch/mips/pci/pcie-octeon.c
@@ -94,8 +94,6 @@ union cvmx_pcie_address {
 
 static int cvmx_pcie_rc_initialize(int pcie_port);
 
-#include 
-
 /**
  * Return the Core virtual base address for PCIe IO access. IOs are
  * read/written as an offset from this address.
-- 
2.17.1



[PATCH 06/25] MIPS: loongson: remove loongson_dma_supported

2018-06-15 Thread Christoph Hellwig
swiotlb_dma_supported will always return true for a mask large enough to
cover the DMA addresses for all physical memory, which is the right
thing to do for swiotlb based dma ops.  This function returned false
if the mask was bigger than a firmware set dma_mask_bits that apparently
can be either 32 or 64, and which seems completely buggy if it actually
is not 64, as the false return negates the whole point of swiotlb.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/loongson64/common/dma-swiotlb.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/mips/loongson64/common/dma-swiotlb.c b/arch/mips/loongson64/common/dma-swiotlb.c
index 6a739f8ae110..a5e50f2ec301 100644
--- a/arch/mips/loongson64/common/dma-swiotlb.c
+++ b/arch/mips/loongson64/common/dma-swiotlb.c
@@ -56,13 +56,6 @@ static void loongson_dma_sync_sg_for_device(struct device *dev,
mb();
 }
 
-static int loongson_dma_supported(struct device *dev, u64 mask)
-{
-   if (mask > DMA_BIT_MASK(loongson_sysconf.dma_mask_bits))
-   return 0;
-   return swiotlb_dma_supported(dev, mask);
-}
-
 dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
long nid;
@@ -99,7 +92,7 @@ static const struct dma_map_ops loongson_dma_map_ops = {
.sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
.sync_sg_for_device = loongson_dma_sync_sg_for_device,
.mapping_error = swiotlb_dma_mapping_error,
-   .dma_supported = loongson_dma_supported,
+   .dma_supported = swiotlb_dma_supported,
 };
 
 void __init plat_swiotlb_setup(void)
-- 
2.17.1
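
To make the commit message above concrete, here is a minimal userspace sketch, not the kernel code. DMA_BIT_MASK() has the same shape as the kernel macro; bounce_end and model_swiotlb_dma_supported() are hypothetical stand-ins for the swiotlb test that the mask covers the bounce buffer, and old_dma_supported() mirrors the shape of the removed helper.

```c
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(): a mask with the low n bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-in: highest bus address of the swiotlb bounce buffer. */
static const uint64_t bounce_end = 0x0fffffffULL;

/* Simplified model of the swiotlb check: the mask must cover the bounce buffer. */
static int model_swiotlb_dma_supported(uint64_t mask)
{
	return bounce_end <= mask;
}

/* Shape of the removed helper: reject any mask wider than the firmware value. */
static int old_dma_supported(uint64_t mask, unsigned int dma_mask_bits)
{
	if (mask > DMA_BIT_MASK(dma_mask_bits))
		return 0;
	return model_swiotlb_dma_supported(mask);
}
```

With dma_mask_bits set to 32, a device that asks for a full 64-bit mask is refused by the old helper even though bounce buffering could serve it, which is the behaviour the commit message calls buggy.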



[PATCH 03/25] MIPS: remove CONFIG_DMA_COHERENT

2018-06-15 Thread Christoph Hellwig
We can just check for !CONFIG_DMA_NONCOHERENT instead and simplify things
a lot.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Paul Burton 
---
 arch/mips/Kconfig| 16 
 arch/mips/include/asm/dma-coherence.h|  6 +++---
 arch/mips/include/asm/mach-generic/kmalloc.h |  3 +--
 arch/mips/mti-malta/malta-setup.c|  4 ++--
 arch/mips/sibyte/Kconfig |  1 -
 5 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 3f9deec70b92..89be9f97da4e 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -665,7 +665,6 @@ config SGI_IP27
select FW_ARC64
select BOOT_ELF64
select DEFAULT_SGI_PARTITION
-   select DMA_COHERENT
select SYS_HAS_EARLY_PRINTK
select HW_HAS_PCI
select NR_CPUS_DEFAULT_64
@@ -742,7 +741,6 @@ config SGI_IP32
 config SIBYTE_CRHINE
bool "Sibyte BCM91120C-CRhine"
select BOOT_ELF32
-   select DMA_COHERENT
select SIBYTE_BCM1120
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -752,7 +750,6 @@ config SIBYTE_CRHINE
 config SIBYTE_CARMEL
bool "Sibyte BCM91120x-Carmel"
select BOOT_ELF32
-   select DMA_COHERENT
select SIBYTE_BCM1120
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -762,7 +759,6 @@ config SIBYTE_CARMEL
 config SIBYTE_CRHONE
bool "Sibyte BCM91125C-CRhone"
select BOOT_ELF32
-   select DMA_COHERENT
select SIBYTE_BCM1125
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -773,7 +769,6 @@ config SIBYTE_CRHONE
 config SIBYTE_RHONE
bool "Sibyte BCM91125E-Rhone"
select BOOT_ELF32
-   select DMA_COHERENT
select SIBYTE_BCM1125H
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -783,7 +778,6 @@ config SIBYTE_RHONE
 config SIBYTE_SWARM
bool "Sibyte BCM91250A-SWARM"
select BOOT_ELF32
-   select DMA_COHERENT
select HAVE_PATA_PLATFORM
select SIBYTE_SB1250
select SWAP_IO_SPACE
@@ -796,7 +790,6 @@ config SIBYTE_SWARM
 config SIBYTE_LITTLESUR
bool "Sibyte BCM91250C2-LittleSur"
select BOOT_ELF32
-   select DMA_COHERENT
select HAVE_PATA_PLATFORM
select SIBYTE_SB1250
select SWAP_IO_SPACE
@@ -808,7 +801,6 @@ config SIBYTE_LITTLESUR
 config SIBYTE_SENTOSA
bool "Sibyte BCM91250E-Sentosa"
select BOOT_ELF32
-   select DMA_COHERENT
select SIBYTE_SB1250
select SWAP_IO_SPACE
select SYS_HAS_CPU_SB1
@@ -818,7 +810,6 @@ config SIBYTE_SENTOSA
 config SIBYTE_BIGSUR
bool "Sibyte BCM91480B-BigSur"
select BOOT_ELF32
-   select DMA_COHERENT
select NR_CPUS_DEFAULT_4
select SIBYTE_BCM1x80
select SWAP_IO_SPACE
@@ -895,7 +886,6 @@ config CAVIUM_OCTEON_SOC
select CEVT_R4K
select ARCH_HAS_PHYS_TO_DMA
select PHYS_ADDR_T_64BIT
-   select DMA_COHERENT
select SYS_SUPPORTS_64BIT_KERNEL
select SYS_SUPPORTS_BIG_ENDIAN
select EDAC_SUPPORT
@@ -944,7 +934,6 @@ config NLM_XLR_BOARD
select PHYS_ADDR_T_64BIT
select SYS_SUPPORTS_BIG_ENDIAN
select SYS_SUPPORTS_HIGHMEM
-   select DMA_COHERENT
select NR_CPUS_DEFAULT_32
select CEVT_R4K
select CSRC_R4K
@@ -972,7 +961,6 @@ config NLM_XLP_BOARD
select SYS_SUPPORTS_BIG_ENDIAN
select SYS_SUPPORTS_LITTLE_ENDIAN
select SYS_SUPPORTS_HIGHMEM
-   select DMA_COHERENT
select NR_CPUS_DEFAULT_32
select CEVT_R4K
select CSRC_R4K
@@ -991,7 +979,6 @@ config MIPS_PARAVIRT
bool "Para-Virtualized guest system"
select CEVT_R4K
select CSRC_R4K
-   select DMA_COHERENT
select SYS_SUPPORTS_64BIT_KERNEL
select SYS_SUPPORTS_32BIT_KERNEL
select SYS_SUPPORTS_BIG_ENDIAN
@@ -1117,9 +1104,6 @@ config DMA_PERDEV_COHERENT
bool
select DMA_MAYBE_COHERENT
 
-config DMA_COHERENT
-   bool
-
 config DMA_NONCOHERENT
bool
select NEED_DMA_MAP_STATE
diff --git a/arch/mips/include/asm/dma-coherence.h b/arch/mips/include/asm/dma-coherence.h
index 72d0eab02afc..8eda48748ed5 100644
--- a/arch/mips/include/asm/dma-coherence.h
+++ b/arch/mips/include/asm/dma-coherence.h
@@ -21,10 +21,10 @@ enum coherent_io_user_state {
 extern enum coherent_io_user_state coherentio;
 extern int hw_coherentio;
 #else
-#ifdef CONFIG_DMA_COHERENT
-#define coherentio IO_COHERENCE_ENABLED
-#else
+#ifdef CONFIG_DMA_NONCOHERENT
 #define coherentio IO_COHERENCE_DISABLED
+#else
+#define coherentio IO_COHERENCE_ENABLED
 #endif
 #define hw_coherentio  0
 #endif /* CONFIG_DMA_MAYBE_COHERENT */
diff --git a/arch/mips/include/asm/mach-generic/kmalloc.h b/arch/mips/include/asm/mach-generic/kmalloc.h
index 74207c7bd00d..649a98338886 100644
--- a/arch/mips/include/asm/mach-generic/kmalloc.h
+++ 

[PATCH 04/25] MIPS: Octeon: unexport __phys_to_dma and __dma_to_phys

2018-06-15 Thread Christoph Hellwig
These functions are just low-level helpers for the swiotlb and dma-direct
implementations, and should never be used by drivers.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Paul Burton 
---
 arch/mips/cavium-octeon/dma-octeon.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
index 7b335ab21697..e5d00c79bd26 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -190,7 +189,6 @@ dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 
return ops->phys_to_dma(dev, paddr);
 }
-EXPORT_SYMBOL(__phys_to_dma);
 
 phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr)
 {
@@ -200,7 +198,6 @@ phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr)
 
return ops->dma_to_phys(dev, daddr);
 }
-EXPORT_SYMBOL(__dma_to_phys);
 
 static struct octeon_dma_map_ops octeon_linear_dma_map_ops = {
.dma_map_ops = {
-- 
2.17.1



[PATCH 02/25] MIPS: simplify CONFIG_DMA_NONCOHERENT ifdefs

2018-06-15 Thread Christoph Hellwig
CONFIG_DMA_MAYBE_COHERENT already selects CONFIG_DMA_NONCOHERENT, so we
can remove the extra conditions.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Paul Burton 
---
 arch/mips/include/asm/io.h | 4 ++--
 arch/mips/mm/c-r4k.c   | 4 ++--
 arch/mips/mm/cache.c   | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/mips/include/asm/io.h b/arch/mips/include/asm/io.h
index a7d0b836f2f7..6d6bdc6a48eb 100644
--- a/arch/mips/include/asm/io.h
+++ b/arch/mips/include/asm/io.h
@@ -588,7 +588,7 @@ static inline void memcpy_toio(volatile void __iomem *dst, const void *src, int
  *
  * This API used to be exported; it now is for arch code internal use only.
  */
-#if defined(CONFIG_DMA_NONCOHERENT) || defined(CONFIG_DMA_MAYBE_COHERENT)
+#ifdef CONFIG_DMA_NONCOHERENT
 
 extern void (*_dma_cache_wback_inv)(unsigned long start, unsigned long size);
 extern void (*_dma_cache_wback)(unsigned long start, unsigned long size);
@@ -607,7 +607,7 @@ extern void (*_dma_cache_inv)(unsigned long start, unsigned long size);
 #define dma_cache_inv(start,size)  \
do { (void) (start); (void) (size); } while (0)
 
-#endif /* CONFIG_DMA_NONCOHERENT || CONFIG_DMA_MAYBE_COHERENT */
+#endif /* CONFIG_DMA_NONCOHERENT */
 
 /*
  * Read a 32-bit register that requires a 64-bit read cycle on the bus.
diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
index e12dfa48b478..b83ecfb2fbfc 100644
--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -830,7 +830,7 @@ static void r4k_flush_icache_user_range(unsigned long start, unsigned long end)
return __r4k_flush_icache_range(start, end, true);
 }
 
-#if defined(CONFIG_DMA_NONCOHERENT) || defined(CONFIG_DMA_MAYBE_COHERENT)
+#ifdef CONFIG_DMA_NONCOHERENT
 
 static void r4k_dma_cache_wback_inv(unsigned long addr, unsigned long size)
 {
@@ -904,7 +904,7 @@ static void r4k_dma_cache_inv(unsigned long addr, unsigned long size)
bc_inv(addr, size);
__sync();
 }
-#endif /* CONFIG_DMA_NONCOHERENT || CONFIG_DMA_MAYBE_COHERENT */
+#endif /* CONFIG_DMA_NONCOHERENT */
 
 struct flush_cache_sigtramp_args {
struct mm_struct *mm;
diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
index 0d3c656feba0..70a523151ff3 100644
--- a/arch/mips/mm/cache.c
+++ b/arch/mips/mm/cache.c
@@ -56,7 +56,7 @@ EXPORT_SYMBOL_GPL(local_flush_data_cache_page);
 EXPORT_SYMBOL(flush_data_cache_page);
 EXPORT_SYMBOL(flush_icache_all);
 
-#if defined(CONFIG_DMA_NONCOHERENT) || defined(CONFIG_DMA_MAYBE_COHERENT)
+#ifdef CONFIG_DMA_NONCOHERENT
 
 /* DMA cache operations. */
 void (*_dma_cache_wback_inv)(unsigned long start, unsigned long size);
@@ -65,7 +65,7 @@ void (*_dma_cache_inv)(unsigned long start, unsigned long size);
 
 EXPORT_SYMBOL(_dma_cache_wback_inv);
 
-#endif /* CONFIG_DMA_NONCOHERENT || CONFIG_DMA_MAYBE_COHERENT */
+#endif /* CONFIG_DMA_NONCOHERENT */
 
 /*
  * We could optimize the case where the cache argument is not BCACHE but
-- 
2.17.1
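
The reasoning in this commit message can be checked mechanically. The sketch below is a userspace model only: apply_select() is a hypothetical stand-in for the Kconfig "select" relation stated above, and the two guard functions model the old and new preprocessor conditions.

```c
#include <stdbool.h>

/* Kconfig "select": enabling MAYBE_COHERENT forces NONCOHERENT on. */
static bool apply_select(bool noncoherent, bool maybe_coherent)
{
	return noncoherent || maybe_coherent;
}

/* Old condition: defined(NONCOHERENT) || defined(MAYBE_COHERENT). */
static bool old_guard(bool noncoherent, bool maybe_coherent)
{
	return noncoherent || maybe_coherent;
}

/* New condition: only NONCOHERENT is tested. */
static bool new_guard(bool noncoherent)
{
	return noncoherent;
}

/* True if both guards agree on every configuration Kconfig can produce. */
static bool guards_equivalent(void)
{
	for (int nc = 0; nc <= 1; nc++) {
		for (int mc = 0; mc <= 1; mc++) {
			bool eff_nc = apply_select(nc, mc);

			if (old_guard(eff_nc, mc) != new_guard(eff_nc))
				return false;
		}
	}
	return true;
}
```

The only input on which the two guards could differ is MAYBE_COHERENT set with NONCOHERENT clear, and the select makes that configuration impossible.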



Re: [RFC PATCH 20/23] watchdog/hardlockup/hpet: Rotate interrupt among all monitored CPUs

2018-06-15 Thread Thomas Gleixner
On Thu, 14 Jun 2018, Ricardo Neri wrote:
> On Wed, Jun 13, 2018 at 11:48:09AM +0200, Thomas Gleixner wrote:
> > On Tue, 12 Jun 2018, Ricardo Neri wrote:
> > > + /* There are no CPUs to monitor. */
> > > + if (!cpumask_weight(>monitored_mask))
> > > + return NMI_HANDLED;
> > > +
> > >   inspect_for_hardlockups(regs);
> > >  
> > > + /*
> > > +  * Target a new CPU. Keep trying until we find a monitored CPU. CPUs
> > > +  * are added and removed to this mask at cpu_up() and cpu_down(),
> > > +  * respectively. Thus, the interrupt should be able to be moved to
> > > +  * the next monitored CPU.
> > > +  */
> > > + spin_lock(&hld_data->lock);
> > 
> > Yuck. Taking a spinlock from NMI ...
> 
> I am sorry. I will look into other options for locking. Do you think rcu_lock
> would help in this case? I need this locking because the CPUs being monitored
> changes as CPUs come online and offline.

Sure, but you _cannot_ take any locks in NMI context which are also taken
in !NMI context. And RCU will not help either. How so? The NMI can hit
exactly before the CPU bit is cleared and then the CPU goes down. So RCU
_cannot_ protect anything.

All you can do there is make sure that the TIMn_CONF is only ever accessed
in !NMI code. Then you can stop the timer _before_ a CPU goes down and make
sure that any NMI which is still in flight has finished. After that you can
fiddle with the CPU mask and restart the timer. Be aware that this is going
to be more corner case handling than actual functionality.

> > > + for_each_cpu_wrap(cpu, >monitored_mask, smp_processor_id() + 1) {
> > > + if (!irq_set_affinity(hld_data->irq, cpumask_of(cpu)))
> > > + break;
> > 
> > ... and then calling into generic interrupt code which will take even more
> > locks is completely broken.
> 
> I will look into reworking how the destination of the interrupt is set.

You have to consider two cases:

 1) !remapped mode:

That's reasonably simple because you just have to deal with the HPET
TIMERn_PROCMSG_ROUT register. But then you need to do this directly and
not through any of the existing interrupt facilities.

 2) remapped mode:

That's way more complex as you _cannot_ ever do anything which touches
the IOMMU and the related tables.

So you'd need to reserve an IOMMU remapping entry for each CPU upfront,
store the resulting value for the HPET TIMERn_PROCMSG_ROUT register in
per cpu storage and just modify that one from NMI.

Though there might be subtle side effects involved, which are related to
the acknowledge part. You need to talk to the IOMMU wizards first.

All in all, the idea itself is interesting, but the envisioned approach of
round robin and no fast accessible NMI reason detection is going to create
more problems than it solves.

This all could have been avoided if Intel hadn't decided to reuse the APIC
timer registers for the TSC deadline timer. If both would be available we'd
have a CPU local fast accessible watchdog timer when TSC deadline is used
for general timer purposes. But why am I complaining? I've resigned to the
fact that timers are designed^Wcobbled together by janitors long ago.

Thanks,

tglx
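
As a rough illustration of the "precompute per CPU, then only write one register from NMI" idea described above, here is a userspace sketch under stated assumptions. NR_CPUS, route_val[], monitored[], fake_procmsg_rout and both function names are hypothetical, and the routing-value encoding is invented; a real implementation would derive the values from the MSI or IOMMU setup and would still have to deal with the acknowledge side effects mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical, small for illustration */

/* Filled in from normal (!NMI) context only; the NMI path just reads it. */
static uint64_t route_val[NR_CPUS];
static bool monitored[NR_CPUS];

/* Stand-in for the memory-mapped HPET TIMERn_PROCMSG_ROUT register. */
static volatile uint64_t fake_procmsg_rout;

/*
 * !NMI context, timer stopped: precompute one routing value per CPU.
 * The encoding is invented; in remapped mode it would come from an
 * IOMMU remapping entry reserved upfront.
 */
static void prepare_routes(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		route_val[cpu] = 0xabcd0000ULL | cpu;
}

/*
 * NMI context: no locks and no calls into generic interrupt code.
 * Returns the new target CPU, or -1 if nothing is monitored.
 */
static int nmi_rotate_target(unsigned int this_cpu)
{
	for (unsigned int i = 1; i <= NR_CPUS; i++) {
		unsigned int cpu = (this_cpu + i) % NR_CPUS;

		if (monitored[cpu]) {
			fake_procmsg_rout = route_val[cpu];
			return (int)cpu;
		}
	}
	return -1;
}
```

In this model monitored[] may only change while the timer is stopped and no NMI is in flight, which is the ordering requirement spelled out in the mail.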


Re: [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI

2018-06-15 Thread Thomas Gleixner
On Thu, 14 Jun 2018, Ricardo Neri wrote:
> On Wed, Jun 13, 2018 at 11:40:00AM +0200, Thomas Gleixner wrote:
> > On Tue, 12 Jun 2018, Ricardo Neri wrote:
> > > @@ -183,6 +184,8 @@ static irqreturn_t 
> > > hardlockup_detector_irq_handler(int irq, void *data)
> > >   if (!(hdata->flags & HPET_DEV_PERI_CAP))
> > >   kick_timer(hdata);
> > >  
> > > > + pr_err("This interrupt should not have happened. Ensure delivery mode is NMI.\n");
> > 
> > Eeew.
> 
> If you don't mind me asking. What is the problem with this error message?

The problem is not the error message. The problem is the abuse of
request_irq() and the fact that this irq handler function exists in the
first place for something which is NMI based.

> > And in case that the HPET does not support periodic mode this reprograms
> > the timer on every NMI which means that while perf is running the watchdog
> > will never ever detect anything.
> 
> Yes. I see that this is wrong. With MSI interrupts, as far as I can
> see, there is not a way to make sure that the HPET timer caused the NMI.
> Perhaps the only option is to use an IO APIC interrupt and read the
> interrupt status register.
> 
> > Aside of that, reading TWO HPET registers for every NMI is insane. HPET
> > access is horribly slow, so any high frequency perf monitoring will take a
> > massive performance hit.
> 
> If an IO APIC interrupt is used, only one HPET register (the status register)
> would need to be read for every NMI. Would that be more acceptable? Otherwise,
> there is no way to determine if the HPET caused the NMI.

You need level trigger for the HPET status register to be useful at all
because in edge mode the interrupt status bits always read 0.

That means you have to fiddle with the IOAPIC acknowledge magic from NMI
context. Brilliant idea. If the NMI hits in the middle of a regular
io_apic_read() then the interrupted code will end up with the wrong index
register. Not to talk about the fun which the affinity rotation from NMI
context would bring.

Do not even think about using IOAPIC and level for this.

> Alternatively, there could be a counter that skips reading the HPET status
> register (and the detection of hardlockups) for every X NMIs. This would
> reduce the overall frequency of HPET register reads.

Great plan. So if the watchdog is the only NMI (because perf is off) then
you delay the watchdog detection by that count.

You cannot do a time based check either, because time might be corrupted and
then you end up in lala land as well.

Thanks,

tglx
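
The index register problem described above can be modelled in a few lines. This is a toy userspace sketch, not the real IOAPIC code: struct fake_ioapic, fake_io_apic_read() and nmi_touches_ioapic() are hypothetical names, and the NMI is simulated by a callback invoked between the index write and the data read.

```c
#include <stdint.h>

/* Toy IOAPIC: one index register selects which register the data window shows. */
struct fake_ioapic {
	uint32_t index;
	uint32_t regs[4];
};

typedef void (*nmi_fn)(struct fake_ioapic *io);

/* An NMI handler that performs its own indexed access and leaves index changed. */
static void nmi_touches_ioapic(struct fake_ioapic *io)
{
	io->index = 3;
	(void)io->regs[io->index];
}

/*
 * Two-step indexed read. 'nmi', if non-null, models an NMI arriving between
 * the index write and the data read.
 */
static uint32_t fake_io_apic_read(struct fake_ioapic *io, uint32_t reg, nmi_fn nmi)
{
	io->index = reg;
	if (nmi)
		nmi(io);
	return io->regs[io->index];
}
```

Without the simulated NMI a read of register 1 returns that register's value; with it, the same call silently returns the contents of register 3, because the interrupted code resumes with an index it did not write.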


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-15 Thread Julien Thierry

Hi Ricardo,

On 15/06/18 03:12, Ricardo Neri wrote:
> On Wed, Jun 13, 2018 at 11:06:25AM +0100, Marc Zyngier wrote:
> > On 13/06/18 10:20, Thomas Gleixner wrote:
> > > On Wed, 13 Jun 2018, Julien Thierry wrote:
> > > > On 13/06/18 09:34, Peter Zijlstra wrote:
> > > > > On Tue, Jun 12, 2018 at 05:57:23PM -0700, Ricardo Neri wrote:
> > > > > > diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> > > > > > index 5426627..dbc5e02 100644
> > > > > > --- a/include/linux/interrupt.h
> > > > > > +++ b/include/linux/interrupt.h
> > > > > > @@ -61,6 +61,8 @@
> > > > > >   *   interrupt handler after suspending interrupts. For system
> > > > > >   *   wakeup devices users need to implement wakeup detection in
> > > > > >   *   their interrupt handlers.
> > > > > > + * IRQF_DELIVER_AS_NMI - Configure interrupt to be delivered as non-maskable, if
> > > > > > + *   supported by the chip.
> > > > > >   */
> > > > >
> > > > > NAK on the first 6 patches. You really _REALLY_ don't want to expose
> > > > > NMIs to this level.
> > > >
> > > > I've been working on something similar on arm64 side, and effectively the one
> > > > thing that might be common to arm64 and intel is the interface to set an
> > > > interrupt as NMI. So I guess it would be nice to agree on the right approach
> > > > for this.
> > > >
> > > > The way I did it was by introducing a new irq_state and let the irqchip driver
> > > > handle most of the work (if it supports that state):
> > > >
> > > > https://lkml.org/lkml/2018/5/25/181
> > > >
> > > > This has not been ACKed nor NAKed. So I am just asking whether this is a more
> > > > suitable approach, and if not, are there any suggestions on how to do this?
> > >
> > > I really didn't pay attention to that as it's buried in the GIC/ARM series
> > > which is usually Marc's playground.
> >
> > I'm working my way through it ATM now that I have some brain cycles back.
> >
> > > Adding NMI delivery support at low level architecture irq chip level is
> > > perfectly fine, but the exposure of that needs to be restricted very
> > > much. Adding it to the generic interrupt control interfaces is not going to
> > > happen. That's doomed to begin with and a complete abuse of the interface
> > > as the handler can not ever be used for that.
> >
> > I can only agree with that. Allowing random drivers to use request_irq()
> > to make anything an NMI ultimately turns it into a complete mess ("hey,
> > NMI is *faster*, let's use that"), and a potential source of horrible
> > deadlocks.
> >
> > What I'd find more palatable is a way for an irqchip to be able to
> > prioritize some interrupts based on a set of architecturally-defined
> > requirements, and a separate NMI requesting/handling framework that is
> > separate from the IRQ API, as the overall requirements are likely to
> > be completely different.
> >
> > It shouldn't have to be nearly as complex as the IRQ API, and require
> > much stricter requirements in terms of what you can do there (flow
> > handling should definitely be different).
>
> Marc, Julien, do you plan to actively work on this? Would you mind keeping
> me in the loop? I also need this work for this watchdog. In the meantime,
> I will go through Julien's patches and try to adapt it to my work.

We are going to work on this and of course your input is most welcome to 
make sure we have an interface usable across different architectures.


In my patches, I'm not sure there is much to adapt to your work as most 
of it is arch specific (although I won't say no to another pair of eyes 
looking at them). From what I've seen of your patches, the point where 
we converge is the need for some code to be able to tell the irqchip "I 
want that particular interrupt line to be treated/setup as an NMI".


We'll make sure to keep you in the loop for discussions/suggestions on this.

Thanks,

--
Julien Thierry
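
To picture the kind of narrow interface discussed in this thread (a way to tell the irqchip that one line should be set up as an NMI, kept apart from request_irq()), here is a purely hypothetical userspace sketch. None of these names exist in the kernel; struct fake_irqchip, struct fake_nmi_line, fake_request_nmi() and the return codes are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*nmi_handler_t)(void *data);

/* Hypothetical irqchip model: nmi_setup is NULL when the chip cannot do NMIs. */
struct fake_irqchip {
	int (*nmi_setup)(unsigned int line);
};

struct fake_nmi_line {
	const struct fake_irqchip *chip;
	nmi_handler_t handler;
	void *data;
};

enum { NMI_REQ_OK = 0, NMI_REQ_UNSUPPORTED = -1, NMI_REQ_BUSY = -2 };

/* Dedicated request path: no sharing, and the chip may refuse the setup. */
static int fake_request_nmi(struct fake_nmi_line *line, unsigned int hwline,
			    nmi_handler_t handler, void *data)
{
	if (!line->chip || !line->chip->nmi_setup)
		return NMI_REQ_UNSUPPORTED;
	if (line->handler)
		return NMI_REQ_BUSY;	/* NMI lines are never shared in this model */
	if (line->chip->nmi_setup(hwline))
		return NMI_REQ_UNSUPPORTED;
	line->handler = handler;
	line->data = data;
	return NMI_REQ_OK;
}

/* Helpers for trying the sketch out. */
static int always_ok(unsigned int line) { (void)line; return 0; }
static void count_nmi(void *data) { ++*(int *)data; }
```

The point of the model is only that the decision sits with the irqchip and that the entry point is distinct from the generic IRQ request path; how flow handling and priorities would work is left open, as it is in the discussion.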