Re: [PATCH 17/20] powerpc/dma-swiotlb: use generic swiotlb_dma_ops
On Thu, 2018-08-09 at 10:54 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote:
> > These are identical to the arch specific ones, so remove them.
> >
> > Signed-off-by: Christoph Hellwig
>
> Acked-by: Benjamin Herrenschmidt

Note: We will still need to implement some custom variant of this for
our secure VMs ... Basically we'll need to use the existing bounce
buffering as-is, but the condition will be different: it won't be
whether the address is below a certain limit, it will be *always*.

Cheers,
Ben.

> > ---
> > arch/powerpc/include/asm/dma-direct.h | 4
> > arch/powerpc/include/asm/swiotlb.h | 2 --
> > arch/powerpc/kernel/dma-swiotlb.c | 28 ++-
> > arch/powerpc/sysdev/fsl_pci.c | 2 +-
> > 4 files changed, 7 insertions(+), 29 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/dma-direct.h
> > b/arch/powerpc/include/asm/dma-direct.h
> > index 0fba19445ae8..657f84ddb20d 100644
> > --- a/arch/powerpc/include/asm/dma-direct.h
> > +++ b/arch/powerpc/include/asm/dma-direct.h
> > @@ -30,4 +30,8 @@ static inline phys_addr_t __dma_to_phys(struct device
> > *dev, dma_addr_t daddr)
> > return daddr - PCI_DRAM_OFFSET;
> > return daddr - dev->archdata.dma_offset;
> > }
> > +
> > +u64 swiotlb_powerpc_get_required(struct device *dev);
> > +#define swiotlb_get_required_mask swiotlb_powerpc_get_required
> > +
> > #endif /* ASM_POWERPC_DMA_DIRECT_H */
> > diff --git a/arch/powerpc/include/asm/swiotlb.h
> > b/arch/powerpc/include/asm/swiotlb.h
> > index f65ecf57b66c..1d8c1da26ab3 100644
> > --- a/arch/powerpc/include/asm/swiotlb.h
> > +++ b/arch/powerpc/include/asm/swiotlb.h
> > @@ -13,8 +13,6 @@
> >
> > #include
> >
> > -extern const struct dma_map_ops powerpc_swiotlb_dma_ops;
> > -
> > extern unsigned int ppc_swiotlb_enable;
> > int __init swiotlb_setup_bus_notifier(void);
> >
> > diff --git a/arch/powerpc/kernel/dma-swiotlb.c
> > b/arch/powerpc/kernel/dma-swiotlb.c
> > index 25986fcd1e5e..0c269de61f39 100644
> > --- a/arch/powerpc/kernel/dma-swiotlb.c
> > +++ b/arch/powerpc/kernel/dma-swiotlb.c
> > @@ -24,7 +24,7 @@
> >
> > unsigned int ppc_swiotlb_enable;
> >
> > -static u64 swiotlb_powerpc_get_required(struct device *dev)
> > +u64 swiotlb_powerpc_get_required(struct device *dev)
> > {
> > u64 end, mask, max_direct_dma_addr = dev->archdata.max_direct_dma_addr;
> >
> > @@ -38,30 +38,6 @@ static u64 swiotlb_powerpc_get_required(struct device
> > *dev)
> > return mask;
> > }
> >
> > -/*
> > - * At the moment, all platforms that use this code only require
> > - * swiotlb to be used if we're operating on HIGHMEM. Since
> > - * we don't ever call anything other than map_sg, unmap_sg,
> > - * map_page, and unmap_page on highmem, use normal dma_ops
> > - * for everything else.
> > - */
> > -const struct dma_map_ops powerpc_swiotlb_dma_ops = {
> > - .alloc = dma_direct_alloc,
> > - .free = dma_direct_free,
> > - .mmap = dma_nommu_mmap_coherent,
> > - .map_sg = swiotlb_map_sg_attrs,
> > - .unmap_sg = swiotlb_unmap_sg_attrs,
> > - .dma_supported = swiotlb_dma_supported,
> > - .map_page = swiotlb_map_page,
> > - .unmap_page = swiotlb_unmap_page,
> > - .sync_single_for_cpu = swiotlb_sync_single_for_cpu,
> > - .sync_single_for_device = swiotlb_sync_single_for_device,
> > - .sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
> > - .sync_sg_for_device = swiotlb_sync_sg_for_device,
> > - .mapping_error = swiotlb_dma_mapping_error,
> > - .get_required_mask = swiotlb_powerpc_get_required,
> > -};
> > -
> > void pci_dma_dev_setup_swiotlb(struct pci_dev *pdev)
> > {
> > struct pci_controller *hose;
> > @@ -88,7 +64,7 @@ static int ppc_swiotlb_bus_notify(struct notifier_block
> > *nb,
> >
> > /* May need to bounce if the device can't address all of DRAM */
> > if ((dma_get_mask(dev) + 1) < memblock_end_of_DRAM())
> > - set_dma_ops(dev, &powerpc_swiotlb_dma_ops);
> > + set_dma_ops(dev, &swiotlb_dma_ops);
> >
> > return NOTIFY_DONE;
> > }
> > diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
> > index 918be816b097..daf44bc0108d 100644
> > --- a/arch/powerpc/sysdev/fsl_pci.c
> > +++ b/arch/powerpc/sysdev/fsl_pci.c
> > @@ -118,7 +118,7 @@ static void setup_swiotlb_ops(struct pci_controller
> > *hose)
> > {
> > if (ppc_swiotlb_enable) {
> > hose->controller_ops.dma_dev_setup = pci_dma_dev_setup_swiotlb;
> > - set_pci_dma_ops(&powerpc_swiotlb_dma_ops);
> > + set_pci_dma_ops(&swiotlb_dma_ops);
> > }
> > }
> > #else
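For illustration, the "bounce always" variant Ben describes would keep the
swiotlb machinery but drop the address check; a minimal sketch, in which the
function name and hook point are made up (this is not the eventual secure-VM
implementation) and error handling is elided:

static dma_addr_t secure_vm_map_page(struct device *dev, struct page *page,
				     unsigned long offset, size_t size,
				     enum dma_data_direction dir,
				     unsigned long attrs)
{
	phys_addr_t phys = page_to_phys(page) + offset;
	phys_addr_t tlb_addr;

	/* Unconditionally bounce: never hand the device the original buffer. */
	tlb_addr = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start),
					  phys, size, dir, attrs);
	return __phys_to_dma(dev, tlb_addr);
}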
Re: [PATCH 1/1] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
On 2018/8/8 18:12, Will Deacon wrote:
> Hi Thunder,
>
> On Mon, Aug 06, 2018 at 08:31:29PM +0800, Zhen Lei wrote:
>> The condition "(int)(VAL - sync_idx) >= 0" to break loop in function
>> __arm_smmu_sync_poll_msi requires that sync_idx must be increased
>> monotonously according to the sequence of the CMDs in the cmdq.
>>
>> But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not protected
>> by spinlock, so the following scenarios may appear:
>> cpu0                    cpu1
>> msidata=0
>>                         msidata=1
>>                         insert cmd1
>> insert cmd0
>>                         smmu execute cmd1
>> smmu execute cmd0
>> poll timeout, because msidata=1 is overridden by
>> cmd0, that means VAL=0, sync_idx=1.
>
> Oh yuck, you're right! We probably want a CC stable on this. Did you see
> this go wrong in practice?

It's just misreported and makes the caller wait for a long time, until
TIMEOUT. It rarely happens, because any other CMD_SYNC during the waiting
period will break it.

>
> One comment on your patch...
>
>> Signed-off-by: Zhen Lei
>> ---
>> drivers/iommu/arm-smmu-v3.c | 7 +++
>> 1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 1d64710..4810f61 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -566,7 +566,7 @@ struct arm_smmu_device {
>>
>> int gerr_irq;
>> int combined_irq;
>> -atomic_t sync_nr;
>> +u32 sync_nr;
>>
>> unsigned long ias; /* IPA */
>> unsigned long oas; /* PA */
>> @@ -836,7 +836,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct
>> arm_smmu_cmdq_ent *ent)
>> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS,
>> CMDQ_SYNC_0_CS_SEV);
>> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
>> cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR,
>> ARM_SMMU_MEMATTR_OIWB);
>> -cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
>> cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
>> break;
>> default:
>> @@ -947,7 +946,6 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct
>> arm_smmu_device *smmu)
>> struct arm_smmu_cmdq_ent ent = {
>> .opcode = CMDQ_OP_CMD_SYNC,
>> .sync = {
>> -.msidata = atomic_inc_return_relaxed(&smmu->sync_nr),
>> .msiaddr = virt_to_phys(&smmu->sync_count),
>> },
>> };
>> @@ -955,6 +953,8 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct
>> arm_smmu_device *smmu)
>> arm_smmu_cmdq_build_cmd(cmd, &ent);
>>
>> spin_lock_irqsave(&smmu->cmdq.lock, flags);
>> +ent.sync.msidata = ++smmu->sync_nr;
>> +cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent.sync.msidata);
>
> I really don't like splitting this out from building the rest of the
> command. Can you just move the call to arm_smmu_cmdq_build_cmd into the
> critical section, please?

OK. I had considered that before, but worried it would increase contention
on the spinlock. In addition, I will append an optimization patch: of
adjacent CMD_SYNCs, we only need one.

>
> Thanks,
>
> Will
>
> .
>

--
Thanks!
Best Regards
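For reference, the direction Will asks for would look roughly like the
sketch below: msidata is assigned and the whole command (with the MSIDATA
field kept in arm_smmu_cmdq_build_cmd()) is built inside the critical
section, so MSIDATA values land in the command queue in allocation order.
This is a sketch against the driver's existing helpers, not the final patch:

static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
{
	u64 cmd[CMDQ_ENT_DWORDS];
	unsigned long flags;
	struct arm_smmu_cmdq_ent ent = {
		.opcode = CMDQ_OP_CMD_SYNC,
		.sync	= { .msiaddr = virt_to_phys(&smmu->sync_count) },
	};

	spin_lock_irqsave(&smmu->cmdq.lock, flags);
	ent.sync.msidata = ++smmu->sync_nr;	/* ordered with insertion */
	arm_smmu_cmdq_build_cmd(cmd, &ent);	/* build under the lock */
	arm_smmu_cmdq_insert_cmd(smmu, cmd);
	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);

	return __arm_smmu_sync_poll_msi(smmu, ent.sync.msidata);
}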
Re: [RFC PATCH 2/7] iommu: Add share domain interface in iommu for spimdev
On Wed, Aug 08, 2018 at 11:13:54AM +0200, Joerg Roedel wrote:
> Date: Wed, 8 Aug 2018 11:13:54 +0200
> From: Joerg Roedel
> To: Kenneth Lee
> CC: Jonathan Corbet , Herbert Xu
> , "David S . Miller" ,
> Alex Williamson , Kenneth Lee
> , Hao Fang , Zhou Wang
> , Zaibo Xu , Philippe
> Ombredanne , Greg Kroah-Hartman
> , Thomas Gleixner ,
> linux-...@vger.kernel.org, linux-ker...@vger.kernel.org,
> linux-cry...@vger.kernel.org, iommu@lists.linux-foundation.org,
> k...@vger.kernel.org, linux-accelerat...@lists.ozlabs.org, Lu Baolu
> , Sanjay Kumar ,
> linux...@huawei.com
> Subject: Re: [RFC PATCH 2/7] iommu: Add share domain interface in iommu for
> spimdev
> User-Agent: NeoMutt/20170421 (1.8.2)
> Message-ID: <20180808091354.ppqgineql3puf...@8bytes.org>
>
> On Wed, Aug 01, 2018 at 06:22:16PM +0800, Kenneth Lee wrote:
> > From: Kenneth Lee
> >
> > This patch add sharing interface for a iommu_group. The new interface:
> >
> > iommu_group_share_domain()
> > iommu_group_unshare_domain()
> >
> > can be used by some virtual iommu_group (such as iommu_group for spimdev)
> > to share their parent's iommu_group.
> >
> > When the domain of the group is shared, it cannot be changed before
> > unshared. In the future, notification can be added if update is required.
>
> From the description or the patch I don't really get why this is needed.
> Please update the description and make a case for this API addition.
>

Yes, I will add more description in the next version. The idea here is:

1. iommu_domain is the setting of the IOMMU (the unit) of an iommu_group.
2. The iommu_group originally uses a default domain or a created one
(e.g. by vfio) as its effective domain. There is only one user for a
domain. If a device is bound to the vfio driver, it uses the newly
created domain, and if it is unbound from the vfio driver, it is switched
back to the default domain.
3. But for spimdev, we create an mdev which shares its parent device's
IOMMU. That means the parent device (along with its iommu group) will
continue to use its iommu_domain, and the mdev will use the same one at
the same time.
4. This API is to tell the iommu_group that the domain is shared by the
spimdev, and it should not be changed before it is unshared.

Cheers

>
> Joerg

--
-Kenneth(Hisilicon)
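For illustration, the semantics described in points 1-4 could look roughly
like the sketch below, where domain_shared_ref is a hypothetical field on
struct iommu_group and this is not the posted patch itself: a shared domain
is pinned by a reference count, and domain changes would be refused while
the count is non-zero.

struct iommu_domain *iommu_group_share_domain(struct iommu_group *group)
{
	struct iommu_domain *domain;

	mutex_lock(&group->mutex);
	group->domain_shared_ref++;		/* pin the current domain */
	domain = group->domain;
	mutex_unlock(&group->mutex);

	return domain;
}

void iommu_group_unshare_domain(struct iommu_group *group)
{
	mutex_lock(&group->mutex);
	WARN_ON(!group->domain_shared_ref);
	group->domain_shared_ref--;		/* allow changes again */
	mutex_unlock(&group->mutex);
}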
Re: [PATCH 20/20] powerpc/dma: remove dma_nommu_mmap_coherent
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote: > The remaining implementation for coherent caches is functionally > identical to the default provided in common code. > > Signed-off-by: Christoph Hellwig Acked-by: Benjamin Herrenschmidt > --- > arch/powerpc/include/asm/dma-mapping.h | 7 --- > arch/powerpc/kernel/dma-iommu.c| 1 - > arch/powerpc/kernel/dma.c | 13 - > arch/powerpc/platforms/pseries/vio.c | 1 - > 4 files changed, 22 deletions(-) > > diff --git a/arch/powerpc/include/asm/dma-mapping.h > b/arch/powerpc/include/asm/dma-mapping.h > index 879c4efba785..e62e23aa3714 100644 > --- a/arch/powerpc/include/asm/dma-mapping.h > +++ b/arch/powerpc/include/asm/dma-mapping.h > @@ -18,13 +18,6 @@ > #include > #include > > -/* Some dma direct funcs must be visible for use in other dma_ops */ > -extern int dma_nommu_mmap_coherent(struct device *dev, > - struct vm_area_struct *vma, > - void *cpu_addr, dma_addr_t handle, > - size_t size, unsigned long attrs); > - > - > static inline unsigned long device_to_mask(struct device *dev) > { > if (dev->dma_mask && *dev->dma_mask) > diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c > index f9fe2080ceb9..bf5234e1f71b 100644 > --- a/arch/powerpc/kernel/dma-iommu.c > +++ b/arch/powerpc/kernel/dma-iommu.c > @@ -114,7 +114,6 @@ int dma_iommu_mapping_error(struct device *dev, > dma_addr_t dma_addr) > struct dma_map_ops dma_iommu_ops = { > .alloc = dma_iommu_alloc_coherent, > .free = dma_iommu_free_coherent, > - .mmap = dma_nommu_mmap_coherent, > .map_sg = dma_iommu_map_sg, > .unmap_sg = dma_iommu_unmap_sg, > .dma_supported = dma_iommu_dma_supported, > diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c > index 08b12cbd7abf..5b71c9d1b8cc 100644 > --- a/arch/powerpc/kernel/dma.c > +++ b/arch/powerpc/kernel/dma.c > @@ -70,18 +70,6 @@ static void dma_nommu_free_coherent(struct device *dev, > size_t size, > iommu_free_coherent(iommu, size, vaddr, dma_handle); > } > > -int dma_nommu_mmap_coherent(struct device *dev, struct vm_area_struct *vma, > - void *cpu_addr, dma_addr_t handle, size_t size, > - unsigned long attrs) > -{ > - unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr)); > - > - return remap_pfn_range(vma, vma->vm_start, > -pfn + vma->vm_pgoff, > -vma->vm_end - vma->vm_start, > -vma->vm_page_prot); > -} > - > /* note: needs to be called arch_get_required_mask for dma-noncoherent.c */ > u64 arch_get_required_mask(struct device *dev) > { > @@ -98,7 +86,6 @@ u64 arch_get_required_mask(struct device *dev) > const struct dma_map_ops dma_nommu_ops = { > .alloc = dma_nommu_alloc_coherent, > .free = dma_nommu_free_coherent, > - .mmap = dma_nommu_mmap_coherent, > .map_sg = dma_direct_map_sg, > .map_page = dma_direct_map_page, > .get_required_mask = arch_get_required_mask, > diff --git a/arch/powerpc/platforms/pseries/vio.c > b/arch/powerpc/platforms/pseries/vio.c > index 49e04ec19238..51d564313bd0 100644 > --- a/arch/powerpc/platforms/pseries/vio.c > +++ b/arch/powerpc/platforms/pseries/vio.c > @@ -618,7 +618,6 @@ static u64 vio_dma_get_required_mask(struct device *dev) > static const struct dma_map_ops vio_dma_mapping_ops = { > .alloc = vio_dma_iommu_alloc_coherent, > .free = vio_dma_iommu_free_coherent, > - .mmap = dma_nommu_mmap_coherent, > .map_sg= vio_dma_iommu_map_sg, > .unmap_sg = vio_dma_iommu_unmap_sg, > .map_page = vio_dma_iommu_map_page, ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
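For context, the common-code default that replaces the removed helper does
essentially the same remap; a simplified sketch of that behaviour (the
in-tree dma_common_mmap also checks dma_mmap_from_dev_coherent() and the
vma bounds first):

static int coherent_mmap_sketch(struct vm_area_struct *vma, void *cpu_addr)
{
	unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));

	/* The same remap_pfn_range() call the powerpc helper used to make. */
	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
			       vma->vm_end - vma->vm_start,
			       vma->vm_page_prot);
}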
Re: [PATCH 18/20] powerpc/dma-noncoherent: use generic dma_noncoherent_ops
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote: > The generic dma-noncoherent code provides all that is needed by powerpc. > > Note that the cache maintainance in the existing code is a bit odd > as it implements both the sync_to_device and sync_to_cpu callouts, > but never flushes caches when unmapping. This patch keeps both > directions arounds, which will lead to more flushing than the previous > implementation. Someone more familar with the required CPUs should > eventually take a look and optimize the cache flush handling if needed. The original code looks bogus indeed. I think we got away with it because those older CPUs wouldn't speculate or prefetch aggressively enough (or at all) so the flush on map was sufficient, the stuff wouldn't come back into the cache. But safe is better than sorry, so ... tentative Ack, I do need to try to dig one of these things to test, which might take a while. Acked-by: Benjamin Herrenschmidt > Signed-off-by: Christoph Hellwig > --- > arch/powerpc/Kconfig | 2 +- > arch/powerpc/include/asm/dma-mapping.h | 29 - > arch/powerpc/kernel/dma.c | 59 +++--- > arch/powerpc/kernel/pci-common.c | 5 ++- > arch/powerpc/kernel/setup-common.c | 4 ++ > arch/powerpc/mm/dma-noncoherent.c | 52 +-- > arch/powerpc/platforms/44x/warp.c | 2 +- > arch/powerpc/platforms/Kconfig.cputype | 6 ++- > 8 files changed, 60 insertions(+), 99 deletions(-) > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index bbfa6a8df4da..33c6017ffce6 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -129,7 +129,7 @@ config PPC > # Please keep this list sorted alphabetically. > # > select ARCH_HAS_DEVMEM_IS_ALLOWED > - select ARCH_HAS_DMA_SET_COHERENT_MASK > + select ARCH_HAS_DMA_SET_COHERENT_MASK if !NOT_COHERENT_CACHE > select ARCH_HAS_ELF_RANDOMIZE > select ARCH_HAS_FORTIFY_SOURCE > select ARCH_HAS_GCOV_PROFILE_ALL > diff --git a/arch/powerpc/include/asm/dma-mapping.h > b/arch/powerpc/include/asm/dma-mapping.h > index f0bf7ac2686c..879c4efba785 100644 > --- a/arch/powerpc/include/asm/dma-mapping.h > +++ b/arch/powerpc/include/asm/dma-mapping.h > @@ -19,40 +19,11 @@ > #include > > /* Some dma direct funcs must be visible for use in other dma_ops */ > -extern void *__dma_nommu_alloc_coherent(struct device *dev, size_t size, > - dma_addr_t *dma_handle, gfp_t flag, > - unsigned long attrs); > -extern void __dma_nommu_free_coherent(struct device *dev, size_t size, > -void *vaddr, dma_addr_t dma_handle, > -unsigned long attrs); > extern int dma_nommu_mmap_coherent(struct device *dev, > struct vm_area_struct *vma, > void *cpu_addr, dma_addr_t handle, > size_t size, unsigned long attrs); > > -#ifdef CONFIG_NOT_COHERENT_CACHE > -/* > - * DMA-consistent mapping functions for PowerPCs that don't support > - * cache snooping. These allocate/free a region of uncached mapped > - * memory space for use with DMA devices. Alternatively, you could > - * allocate the space "normally" and use the cache management functions > - * to ensure it is consistent. > - */ > -struct device; > -extern void __dma_sync(void *vaddr, size_t size, int direction); > -extern void __dma_sync_page(struct page *page, unsigned long offset, > - size_t size, int direction); > -extern unsigned long __dma_get_coherent_pfn(unsigned long cpu_addr); > - > -#else /* ! CONFIG_NOT_COHERENT_CACHE */ > -/* > - * Cache coherent cores. > - */ > - > -#define __dma_sync(addr, size, rw) ((void)0) > -#define __dma_sync_page(pg, off, sz, rw) ((void)0) > - > -#endif /* ! 
CONFIG_NOT_COHERENT_CACHE */ > > static inline unsigned long device_to_mask(struct device *dev) > { > diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c > index 2b90a403cdac..b2e88075b2ea 100644 > --- a/arch/powerpc/kernel/dma.c > +++ b/arch/powerpc/kernel/dma.c > @@ -36,12 +36,7 @@ static void *dma_nommu_alloc_coherent(struct device *dev, > size_t size, >* we can really use the direct ops >*/ > if (dma_direct_supported(dev, dev->coherent_dma_mask)) > -#ifdef CONFIG_NOT_COHERENT_CACHE > - return __dma_nommu_alloc_coherent(dev, size, dma_handle, > -flag, attrs); > -#else > return dma_direct_alloc(dev, size, dma_handle, flag, attrs); > -#endif > > /* Ok we can't ... do we have an iommu ? If not, fail */ > iommu = get_iommu_table_base(dev); > @@ -62,12 +57,7 @@ static void dma_nommu_free_coherent(struct device *dev, > size_t size, > > /* See comments in dma_nommu_al
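For context, the generic dma-noncoherent path being adopted has roughly the
shape sketched below (simplified, not the exact in-tree code): cache
maintenance runs on both map (for the device) and unmap (for the CPU),
which is the extra flushing the changelog mentions.

static dma_addr_t nc_map_page_sketch(struct device *dev, struct page *page,
				     unsigned long offset, size_t size,
				     enum dma_data_direction dir)
{
	phys_addr_t paddr = page_to_phys(page) + offset;

	arch_sync_dma_for_device(dev, paddr, size, dir);  /* clean/flush */
	return phys_to_dma(dev, paddr);
}

static void nc_unmap_page_sketch(struct device *dev, dma_addr_t addr,
				 size_t size, enum dma_data_direction dir)
{
	/* The unmap-time maintenance the old powerpc code never did. */
	arch_sync_dma_for_cpu(dev, dma_to_phys(dev, addr), size, dir);
}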
Re: [PATCH 17/20] powerpc/dma-swiotlb: use generic swiotlb_dma_ops
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote: > These are identical to the arch specific ones, so remove them. > > Signed-off-by: Christoph Hellwig Acked-by: Benjamin Herrenschmidt > --- > arch/powerpc/include/asm/dma-direct.h | 4 > arch/powerpc/include/asm/swiotlb.h| 2 -- > arch/powerpc/kernel/dma-swiotlb.c | 28 ++- > arch/powerpc/sysdev/fsl_pci.c | 2 +- > 4 files changed, 7 insertions(+), 29 deletions(-) > > diff --git a/arch/powerpc/include/asm/dma-direct.h > b/arch/powerpc/include/asm/dma-direct.h > index 0fba19445ae8..657f84ddb20d 100644 > --- a/arch/powerpc/include/asm/dma-direct.h > +++ b/arch/powerpc/include/asm/dma-direct.h > @@ -30,4 +30,8 @@ static inline phys_addr_t __dma_to_phys(struct device *dev, > dma_addr_t daddr) > return daddr - PCI_DRAM_OFFSET; > return daddr - dev->archdata.dma_offset; > } > + > +u64 swiotlb_powerpc_get_required(struct device *dev); > +#define swiotlb_get_required_mask swiotlb_powerpc_get_required > + > #endif /* ASM_POWERPC_DMA_DIRECT_H */ > diff --git a/arch/powerpc/include/asm/swiotlb.h > b/arch/powerpc/include/asm/swiotlb.h > index f65ecf57b66c..1d8c1da26ab3 100644 > --- a/arch/powerpc/include/asm/swiotlb.h > +++ b/arch/powerpc/include/asm/swiotlb.h > @@ -13,8 +13,6 @@ > > #include > > -extern const struct dma_map_ops powerpc_swiotlb_dma_ops; > - > extern unsigned int ppc_swiotlb_enable; > int __init swiotlb_setup_bus_notifier(void); > > diff --git a/arch/powerpc/kernel/dma-swiotlb.c > b/arch/powerpc/kernel/dma-swiotlb.c > index 25986fcd1e5e..0c269de61f39 100644 > --- a/arch/powerpc/kernel/dma-swiotlb.c > +++ b/arch/powerpc/kernel/dma-swiotlb.c > @@ -24,7 +24,7 @@ > > unsigned int ppc_swiotlb_enable; > > -static u64 swiotlb_powerpc_get_required(struct device *dev) > +u64 swiotlb_powerpc_get_required(struct device *dev) > { > u64 end, mask, max_direct_dma_addr = dev->archdata.max_direct_dma_addr; > > @@ -38,30 +38,6 @@ static u64 swiotlb_powerpc_get_required(struct device *dev) > return mask; > } > > -/* > - * At the moment, all platforms that use this code only require > - * swiotlb to be used if we're operating on HIGHMEM. Since > - * we don't ever call anything other than map_sg, unmap_sg, > - * map_page, and unmap_page on highmem, use normal dma_ops > - * for everything else. 
> - */
> -const struct dma_map_ops powerpc_swiotlb_dma_ops = {
> - .alloc = dma_direct_alloc,
> - .free = dma_direct_free,
> - .mmap = dma_nommu_mmap_coherent,
> - .map_sg = swiotlb_map_sg_attrs,
> - .unmap_sg = swiotlb_unmap_sg_attrs,
> - .dma_supported = swiotlb_dma_supported,
> - .map_page = swiotlb_map_page,
> - .unmap_page = swiotlb_unmap_page,
> - .sync_single_for_cpu = swiotlb_sync_single_for_cpu,
> - .sync_single_for_device = swiotlb_sync_single_for_device,
> - .sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
> - .sync_sg_for_device = swiotlb_sync_sg_for_device,
> - .mapping_error = swiotlb_dma_mapping_error,
> - .get_required_mask = swiotlb_powerpc_get_required,
> -};
> -
> void pci_dma_dev_setup_swiotlb(struct pci_dev *pdev)
> {
> struct pci_controller *hose;
> @@ -88,7 +64,7 @@ static int ppc_swiotlb_bus_notify(struct notifier_block *nb,
>
> /* May need to bounce if the device can't address all of DRAM */
> if ((dma_get_mask(dev) + 1) < memblock_end_of_DRAM())
> - set_dma_ops(dev, &powerpc_swiotlb_dma_ops);
> + set_dma_ops(dev, &swiotlb_dma_ops);
>
> return NOTIFY_DONE;
> }
> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
> index 918be816b097..daf44bc0108d 100644
> --- a/arch/powerpc/sysdev/fsl_pci.c
> +++ b/arch/powerpc/sysdev/fsl_pci.c
> @@ -118,7 +118,7 @@ static void setup_swiotlb_ops(struct pci_controller *hose)
> {
> if (ppc_swiotlb_enable) {
> hose->controller_ops.dma_dev_setup = pci_dma_dev_setup_swiotlb;
> - set_pci_dma_ops(&powerpc_swiotlb_dma_ops);
> + set_pci_dma_ops(&swiotlb_dma_ops);
> }
> }
> #else
Re: [PATCH 16/20] powerpc/dma: use dma_direct_{alloc,free}
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote: > These do the same functionality as the existing helpers, but do it > simpler, and also allow the (optional) use of CMA. > > Note that the swiotlb code now calls into the dma_direct code directly, > given that it doesn't work with noncoherent caches at all, and isn't called > when we have an iommu either, so the iommu special case in > dma_nommu_alloc_coherent isn't required for swiotlb. I am not convinced that this will produce the same results due to the way the zone picking works. As for the interaction with swiotlb, we'll need the FSL guys to have a look. Scott, do you remember what this is about ? > Signed-off-by: Christoph Hellwig > --- > arch/powerpc/include/asm/pgtable.h | 1 - > arch/powerpc/kernel/dma-swiotlb.c | 4 +- > arch/powerpc/kernel/dma.c | 78 -- > arch/powerpc/mm/mem.c | 19 > 4 files changed, 11 insertions(+), 91 deletions(-) > > diff --git a/arch/powerpc/include/asm/pgtable.h > b/arch/powerpc/include/asm/pgtable.h > index 14c79a7dc855..123de4958d2e 100644 > --- a/arch/powerpc/include/asm/pgtable.h > +++ b/arch/powerpc/include/asm/pgtable.h > @@ -38,7 +38,6 @@ extern unsigned long empty_zero_page[]; > extern pgd_t swapper_pg_dir[]; > > void limit_zone_pfn(enum zone_type zone, unsigned long max_pfn); > -int dma_pfn_limit_to_zone(u64 pfn_limit); > extern void paging_init(void); > > /* > diff --git a/arch/powerpc/kernel/dma-swiotlb.c > b/arch/powerpc/kernel/dma-swiotlb.c > index f6e0701c5303..25986fcd1e5e 100644 > --- a/arch/powerpc/kernel/dma-swiotlb.c > +++ b/arch/powerpc/kernel/dma-swiotlb.c > @@ -46,8 +46,8 @@ static u64 swiotlb_powerpc_get_required(struct device *dev) > * for everything else. > */ > const struct dma_map_ops powerpc_swiotlb_dma_ops = { > - .alloc = __dma_nommu_alloc_coherent, > - .free = __dma_nommu_free_coherent, > + .alloc = dma_direct_alloc, > + .free = dma_direct_free, > .mmap = dma_nommu_mmap_coherent, > .map_sg = swiotlb_map_sg_attrs, > .unmap_sg = swiotlb_unmap_sg_attrs, > diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c > index 2cfc45acbb52..2b90a403cdac 100644 > --- a/arch/powerpc/kernel/dma.c > +++ b/arch/powerpc/kernel/dma.c > @@ -26,75 +26,6 @@ > * can set archdata.dma_data to an unsigned long holding the offset. By > * default the offset is PCI_DRAM_OFFSET. > */ > - > -static u64 __maybe_unused get_pfn_limit(struct device *dev) > -{ > - u64 pfn = (dev->coherent_dma_mask >> PAGE_SHIFT) + 1; > - struct dev_archdata __maybe_unused *sd = &dev->archdata; > - > -#ifdef CONFIG_SWIOTLB > - if (sd->max_direct_dma_addr && dev->dma_ops == &powerpc_swiotlb_dma_ops) > - pfn = min_t(u64, pfn, sd->max_direct_dma_addr >> PAGE_SHIFT); > -#endif > - > - return pfn; > -} > - > -#ifndef CONFIG_NOT_COHERENT_CACHE > -void *__dma_nommu_alloc_coherent(struct device *dev, size_t size, > - dma_addr_t *dma_handle, gfp_t flag, > - unsigned long attrs) > -{ > - void *ret; > - struct page *page; > - int node = dev_to_node(dev); > -#ifdef CONFIG_FSL_SOC > - u64 pfn = get_pfn_limit(dev); > - int zone; > - > - /* > - * This code should be OK on other platforms, but we have drivers that > - * don't set coherent_dma_mask. As a workaround we just ifdef it. This > - * whole routine needs some serious cleanup. 
> - */ > - > - zone = dma_pfn_limit_to_zone(pfn); > - if (zone < 0) { > - dev_err(dev, "%s: No suitable zone for pfn %#llx\n", > - __func__, pfn); > - return NULL; > - } > - > - switch (zone) { > - case ZONE_DMA: > - flag |= GFP_DMA; > - break; > -#ifdef CONFIG_ZONE_DMA32 > - case ZONE_DMA32: > - flag |= GFP_DMA32; > - break; > -#endif > - }; > -#endif /* CONFIG_FSL_SOC */ > - > - page = alloc_pages_node(node, flag, get_order(size)); > - if (page == NULL) > - return NULL; > - ret = page_address(page); > - memset(ret, 0, size); > - *dma_handle = phys_to_dma(dev,__pa(ret)); > - > - return ret; > -} > - > -void __dma_nommu_free_coherent(struct device *dev, size_t size, > - void *vaddr, dma_addr_t dma_handle, > - unsigned long attrs) > -{ > - free_pages((unsigned long)vaddr, get_order(size)); > -} > -#endif /* !CONFIG_NOT_COHERENT_CACHE */ > - > static void *dma_nommu_alloc_coherent(struct device *dev, size_t size, > dma_addr_t *dma_handle, gfp_t flag, > unsigned long attrs) > @@ -105,8 +36,12 @@ static void *dma_nommu_alloc_coherent(struct device *dev, > size_t size, >* we can really use the direct ops >*/ > if (dma_direct_supported(dev, dev->coherent_d
Re: [PATCH 15/20] powerpc/dma: remove the unused unmap_page and unmap_sg methods
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote: > These methods are optional to start with, no need to implement no-op > versions. > > Signed-off-by: Christoph Hellwig Acked-by: Benjamin Herrenschmidt > --- > arch/powerpc/kernel/dma.c | 16 > 1 file changed, 16 deletions(-) > > diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c > index 511a4972560d..2cfc45acbb52 100644 > --- a/arch/powerpc/kernel/dma.c > +++ b/arch/powerpc/kernel/dma.c > @@ -178,12 +178,6 @@ static int dma_nommu_map_sg(struct device *dev, struct > scatterlist *sgl, > return nents; > } > > -static void dma_nommu_unmap_sg(struct device *dev, struct scatterlist *sg, > - int nents, enum dma_data_direction direction, > - unsigned long attrs) > -{ > -} > - > static u64 dma_nommu_get_required_mask(struct device *dev) > { > u64 end, mask; > @@ -209,14 +203,6 @@ static inline dma_addr_t dma_nommu_map_page(struct > device *dev, > return phys_to_dma(dev, page_to_phys(page)) + offset; > } > > -static inline void dma_nommu_unmap_page(struct device *dev, > - dma_addr_t dma_address, > - size_t size, > - enum dma_data_direction direction, > - unsigned long attrs) > -{ > -} > - > #ifdef CONFIG_NOT_COHERENT_CACHE > static inline void dma_nommu_sync_sg(struct device *dev, > struct scatterlist *sgl, int nents, > @@ -242,10 +228,8 @@ const struct dma_map_ops dma_nommu_ops = { > .free = dma_nommu_free_coherent, > .mmap = dma_nommu_mmap_coherent, > .map_sg = dma_nommu_map_sg, > - .unmap_sg = dma_nommu_unmap_sg, > .dma_supported = dma_direct_supported, > .map_page = dma_nommu_map_page, > - .unmap_page = dma_nommu_unmap_page, > .get_required_mask = dma_nommu_get_required_mask, > #ifdef CONFIG_NOT_COHERENT_CACHE > .sync_single_for_cpu= dma_nommu_sync_single, ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
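The reason the no-op versions are unnecessary is that the dma-mapping core
already treats these methods as optional; abridged from the era's
include/linux/dma-mapping.h (debug hooks omitted):

static inline void dma_unmap_page_attrs(struct device *dev, dma_addr_t addr,
					size_t size,
					enum dma_data_direction dir,
					unsigned long attrs)
{
	const struct dma_map_ops *ops = get_dma_ops(dev);

	if (ops->unmap_page)	/* a NULL method simply means "nothing to do" */
		ops->unmap_page(dev, addr, size, dir, attrs);
}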
Re: [PATCH 14/20] powerpc/dma: replace dma_nommu_dma_supported with dma_direct_supported
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote: > The ppc32 case of dma_nommu_dma_supported already was a no-op, and the > 64-bit case came to the same conclusion as dma_direct_supported, so > replace it with the generic version. It's not at all equivalent (see my review on your earlier patch) or am I missing something ? - ppc32 always return 1, but dma_direct_supported() will not for devices with a <32-bit mask (and yes ppc32 isn't quite right to do so, it should check against memory size, but in practice it worked as the only limited devices we deal with on systems we still support have a 31-bit limitation) - ppc64 needs to check against the end of DRAM as some devices will fail the check, dma_direct_supported() doesn't seem to be doing that. Also as I mentioned, I'm not sure about the business with ZONE_DMA, and that arbitrary 24-bit limit since our entire memory is in ZONE_DMA but that's a different can of worms I suppose. > Signed-off-by: Christoph Hellwig > --- > arch/powerpc/Kconfig | 1 + > arch/powerpc/kernel/dma.c | 28 +++- > 2 files changed, 4 insertions(+), 25 deletions(-) > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index f9cae7edd735..bbfa6a8df4da 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -158,6 +158,7 @@ config PPC > select CLONE_BACKWARDS > select DCACHE_WORD_ACCESS if PPC64 && CPU_LITTLE_ENDIAN > select DYNAMIC_FTRACE if FUNCTION_TRACER > + select DMA_DIRECT_OPS > select EDAC_ATOMIC_SCRUB > select EDAC_SUPPORT > select GENERIC_ATOMIC64 if PPC32 > diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c > index 3487de83bb37..511a4972560d 100644 > --- a/arch/powerpc/kernel/dma.c > +++ b/arch/powerpc/kernel/dma.c > @@ -40,28 +40,6 @@ static u64 __maybe_unused get_pfn_limit(struct device *dev) > return pfn; > } > > -static int dma_nommu_dma_supported(struct device *dev, u64 mask) > -{ > -#ifdef CONFIG_PPC64 > - u64 limit = phys_to_dma(dev, (memblock_end_of_DRAM() - 1)); > - > - /* Limit fits in the mask, we are good */ > - if (mask >= limit) > - return 1; > - > -#ifdef CONFIG_FSL_SOC > - /* Freescale gets another chance via ZONE_DMA/ZONE_DMA32, however > - * that will have to be refined if/when they support iommus > - */ > - return 1; > -#endif > - /* Sorry ... */ > - return 0; > -#else > - return 1; > -#endif > -} > - > #ifndef CONFIG_NOT_COHERENT_CACHE > void *__dma_nommu_alloc_coherent(struct device *dev, size_t size, > dma_addr_t *dma_handle, gfp_t flag, > @@ -126,7 +104,7 @@ static void *dma_nommu_alloc_coherent(struct device *dev, > size_t size, > /* The coherent mask may be smaller than the real mask, check if >* we can really use the direct ops >*/ > - if (dma_nommu_dma_supported(dev, dev->coherent_dma_mask)) > + if (dma_direct_supported(dev, dev->coherent_dma_mask)) > return __dma_nommu_alloc_coherent(dev, size, dma_handle, > flag, attrs); > > @@ -148,7 +126,7 @@ static void dma_nommu_free_coherent(struct device *dev, > size_t size, > struct iommu_table *iommu; > > /* See comments in dma_nommu_alloc_coherent() */ > - if (dma_nommu_dma_supported(dev, dev->coherent_dma_mask)) > + if (dma_direct_supported(dev, dev->coherent_dma_mask)) > return __dma_nommu_free_coherent(dev, size, vaddr, dma_handle, > attrs); > /* Maybe we used an iommu ... 
*/
> @@ -265,7 +243,7 @@ const struct dma_map_ops dma_nommu_ops = {
> .mmap = dma_nommu_mmap_coherent,
> .map_sg = dma_nommu_map_sg,
> .unmap_sg = dma_nommu_unmap_sg,
> - .dma_supported = dma_nommu_dma_supported,
> + .dma_supported = dma_direct_supported,
> .map_page = dma_nommu_map_page,
> .unmap_page = dma_nommu_unmap_page,
> .get_required_mask = dma_nommu_get_required_mask,
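The check Ben argues for amounts to comparing the mask against the DMA
address of the last byte of RAM rather than against a fixed 32-bit
constant; a sketch of the ppc64 logic being replaced here:

static int dma_supported_vs_dram_sketch(struct device *dev, u64 mask)
{
	u64 limit = phys_to_dma(dev, memblock_end_of_DRAM() - 1);

	return mask >= limit;	/* mask must cover all of (offset) RAM */
}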
Re: [PATCH 13/20] powerpc/dma: remove get_dma_offset
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote: > Just fold the calculation into __phys_to_dma/__dma_to_phys as those are > the only places that should know about it. > > Signed-off-by: Christoph Hellwig Acked-by: Benjamin Herrenschmidt > --- > arch/powerpc/include/asm/dma-direct.h | 8 ++-- > arch/powerpc/include/asm/dma-mapping.h | 16 > 2 files changed, 6 insertions(+), 18 deletions(-) > > diff --git a/arch/powerpc/include/asm/dma-direct.h > b/arch/powerpc/include/asm/dma-direct.h > index 7702875aabb7..0fba19445ae8 100644 > --- a/arch/powerpc/include/asm/dma-direct.h > +++ b/arch/powerpc/include/asm/dma-direct.h > @@ -19,11 +19,15 @@ static inline bool dma_capable(struct device *dev, > dma_addr_t addr, size_t size) > > static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr) > { > - return paddr + get_dma_offset(dev); > + if (!dev) > + return paddr + PCI_DRAM_OFFSET; > + return paddr + dev->archdata.dma_offset; > } > > static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr) > { > - return daddr - get_dma_offset(dev); > + if (!dev) > + return daddr - PCI_DRAM_OFFSET; > + return daddr - dev->archdata.dma_offset; > } > #endif /* ASM_POWERPC_DMA_DIRECT_H */ > diff --git a/arch/powerpc/include/asm/dma-mapping.h > b/arch/powerpc/include/asm/dma-mapping.h > index dacd0f93f2b2..f0bf7ac2686c 100644 > --- a/arch/powerpc/include/asm/dma-mapping.h > +++ b/arch/powerpc/include/asm/dma-mapping.h > @@ -80,22 +80,6 @@ static inline const struct dma_map_ops > *get_arch_dma_ops(struct bus_type *bus) > return NULL; > } > > -/* > - * get_dma_offset() > - * > - * Get the dma offset on configurations where the dma address can be > determined > - * from the physical address by looking at a simple offset. Direct dma and > - * swiotlb use this function, but it is typically not used by implementations > - * with an iommu. > - */ > -static inline dma_addr_t get_dma_offset(struct device *dev) > -{ > - if (dev) > - return dev->archdata.dma_offset; > - > - return PCI_DRAM_OFFSET; > -} > - > static inline void set_dma_offset(struct device *dev, dma_addr_t off) > { > if (dev) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
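With the offset folded in, the two helpers remain strict inverses of each
other; a small usage sketch:

static void dma_offset_roundtrip_sketch(struct device *dev, phys_addr_t paddr)
{
	/* paddr + archdata.dma_offset (or PCI_DRAM_OFFSET without a dev) */
	dma_addr_t daddr = __phys_to_dma(dev, paddr);

	WARN_ON(__dma_to_phys(dev, daddr) != paddr);	/* and back again */
}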
Re: [PATCH 12/20] powerpc/dma: use phys_to_dma instead of get_dma_offset
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote: > Use the standard portable helper instead of the powerpc specific one, > which is about to go away. > > Signed-off-by: Christoph Hellwig Acked-by: Benjamin Herrenschmidt > --- > arch/powerpc/kernel/dma-swiotlb.c | 5 ++--- > arch/powerpc/kernel/dma.c | 12 ++-- > 2 files changed, 8 insertions(+), 9 deletions(-) > > diff --git a/arch/powerpc/kernel/dma-swiotlb.c > b/arch/powerpc/kernel/dma-swiotlb.c > index 88f3963ca30f..f6e0701c5303 100644 > --- a/arch/powerpc/kernel/dma-swiotlb.c > +++ b/arch/powerpc/kernel/dma-swiotlb.c > @@ -11,7 +11,7 @@ > * > */ > > -#include > +#include > #include > #include > #include > @@ -31,9 +31,8 @@ static u64 swiotlb_powerpc_get_required(struct device *dev) > end = memblock_end_of_DRAM(); > if (max_direct_dma_addr && end > max_direct_dma_addr) > end = max_direct_dma_addr; > - end += get_dma_offset(dev); > > - mask = 1ULL << (fls64(end) - 1); > + mask = 1ULL << (fls64(phys_to_dma(dev, end)) - 1); > mask += mask - 1; > > return mask; > diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c > index eceaa92e6986..3487de83bb37 100644 > --- a/arch/powerpc/kernel/dma.c > +++ b/arch/powerpc/kernel/dma.c > @@ -6,7 +6,7 @@ > */ > > #include > -#include > +#include > #include > #include > #include > @@ -43,7 +43,7 @@ static u64 __maybe_unused get_pfn_limit(struct device *dev) > static int dma_nommu_dma_supported(struct device *dev, u64 mask) > { > #ifdef CONFIG_PPC64 > - u64 limit = get_dma_offset(dev) + (memblock_end_of_DRAM() - 1); > + u64 limit = phys_to_dma(dev, (memblock_end_of_DRAM() - 1)); > > /* Limit fits in the mask, we are good */ > if (mask >= limit) > @@ -104,7 +104,7 @@ void *__dma_nommu_alloc_coherent(struct device *dev, > size_t size, > return NULL; > ret = page_address(page); > memset(ret, 0, size); > - *dma_handle = __pa(ret) + get_dma_offset(dev); > + *dma_handle = phys_to_dma(dev,__pa(ret)); > > return ret; > } > @@ -188,7 +188,7 @@ static int dma_nommu_map_sg(struct device *dev, struct > scatterlist *sgl, > int i; > > for_each_sg(sgl, sg, nents, i) { > - sg->dma_address = sg_phys(sg) + get_dma_offset(dev); > + sg->dma_address = phys_to_dma(dev, sg_phys(sg)); > sg->dma_length = sg->length; > > if (attrs & DMA_ATTR_SKIP_CPU_SYNC) > @@ -210,7 +210,7 @@ static u64 dma_nommu_get_required_mask(struct device *dev) > { > u64 end, mask; > > - end = memblock_end_of_DRAM() + get_dma_offset(dev); > + end = phys_to_dma(dev, memblock_end_of_DRAM()); > > mask = 1ULL << (fls64(end) - 1); > mask += mask - 1; > @@ -228,7 +228,7 @@ static inline dma_addr_t dma_nommu_map_page(struct device > *dev, > if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC)) > __dma_sync_page(page, offset, size, dir); > > - return page_to_phys(page) + offset + get_dma_offset(dev); > + return phys_to_dma(dev, page_to_phys(page)) + offset; > } > > static inline void dma_nommu_unmap_page(struct device *dev, ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 11/20] powerpc/dma: split the two __dma_alloc_coherent implementations
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote: > The implemementation for the CONFIG_NOT_COHERENT_CACHE case doesn't share > any code with the one for systems with coherent caches. Split it off > and merge it with the helpers in dma-noncoherent.c that have no other > callers. > > Signed-off-by: Christoph Hellwig Acked-by: Benjamin Herrenschmidt > --- > arch/powerpc/include/asm/dma-mapping.h | 5 - > arch/powerpc/kernel/dma.c | 14 ++ > arch/powerpc/mm/dma-noncoherent.c | 15 +++ > arch/powerpc/platforms/44x/warp.c | 2 +- > 4 files changed, 10 insertions(+), 26 deletions(-) > > diff --git a/arch/powerpc/include/asm/dma-mapping.h > b/arch/powerpc/include/asm/dma-mapping.h > index f2a4a7142b1e..dacd0f93f2b2 100644 > --- a/arch/powerpc/include/asm/dma-mapping.h > +++ b/arch/powerpc/include/asm/dma-mapping.h > @@ -39,9 +39,6 @@ extern int dma_nommu_mmap_coherent(struct device *dev, > * to ensure it is consistent. > */ > struct device; > -extern void *__dma_alloc_coherent(struct device *dev, size_t size, > - dma_addr_t *handle, gfp_t gfp); > -extern void __dma_free_coherent(size_t size, void *vaddr); > extern void __dma_sync(void *vaddr, size_t size, int direction); > extern void __dma_sync_page(struct page *page, unsigned long offset, >size_t size, int direction); > @@ -52,8 +49,6 @@ extern unsigned long __dma_get_coherent_pfn(unsigned long > cpu_addr); > * Cache coherent cores. > */ > > -#define __dma_alloc_coherent(dev, gfp, size, handle) NULL > -#define __dma_free_coherent(size, addr) ((void)0) > #define __dma_sync(addr, size, rw) ((void)0) > #define __dma_sync_page(pg, off, sz, rw) ((void)0) > > diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c > index 3939589aab04..eceaa92e6986 100644 > --- a/arch/powerpc/kernel/dma.c > +++ b/arch/powerpc/kernel/dma.c > @@ -62,18 +62,12 @@ static int dma_nommu_dma_supported(struct device *dev, > u64 mask) > #endif > } > > +#ifndef CONFIG_NOT_COHERENT_CACHE > void *__dma_nommu_alloc_coherent(struct device *dev, size_t size, > dma_addr_t *dma_handle, gfp_t flag, > unsigned long attrs) > { > void *ret; > -#ifdef CONFIG_NOT_COHERENT_CACHE > - ret = __dma_alloc_coherent(dev, size, dma_handle, flag); > - if (ret == NULL) > - return NULL; > - *dma_handle += get_dma_offset(dev); > - return ret; > -#else > struct page *page; > int node = dev_to_node(dev); > #ifdef CONFIG_FSL_SOC > @@ -113,19 +107,15 @@ void *__dma_nommu_alloc_coherent(struct device *dev, > size_t size, > *dma_handle = __pa(ret) + get_dma_offset(dev); > > return ret; > -#endif > } > > void __dma_nommu_free_coherent(struct device *dev, size_t size, > void *vaddr, dma_addr_t dma_handle, > unsigned long attrs) > { > -#ifdef CONFIG_NOT_COHERENT_CACHE > - __dma_free_coherent(size, vaddr); > -#else > free_pages((unsigned long)vaddr, get_order(size)); > -#endif > } > +#endif /* !CONFIG_NOT_COHERENT_CACHE */ > > static void *dma_nommu_alloc_coherent(struct device *dev, size_t size, > dma_addr_t *dma_handle, gfp_t flag, > diff --git a/arch/powerpc/mm/dma-noncoherent.c > b/arch/powerpc/mm/dma-noncoherent.c > index d1c16456abac..cfc48a253707 100644 > --- a/arch/powerpc/mm/dma-noncoherent.c > +++ b/arch/powerpc/mm/dma-noncoherent.c > @@ -29,7 +29,7 @@ > #include > #include > #include > -#include > +#include > #include > > #include > @@ -151,8 +151,8 @@ static struct ppc_vm_region *ppc_vm_region_find(struct > ppc_vm_region *head, unsi > * Allocate DMA-coherent memory space and return both the kernel remapped > * virtual and bus address for that space. 
> */ > -void * > -__dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *handle, > gfp_t gfp) > +void *__dma_nommu_alloc_coherent(struct device *dev, size_t size, > + dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs) > { > struct page *page; > struct ppc_vm_region *c; > @@ -223,7 +223,7 @@ __dma_alloc_coherent(struct device *dev, size_t size, > dma_addr_t *handle, gfp_t > /* >* Set the "dma handle" >*/ > - *handle = page_to_phys(page); > + *dma_handle = phys_to_dma(dev, page_to_phys(page)); > > do { > SetPageReserved(page); > @@ -249,12 +249,12 @@ __dma_alloc_coherent(struct device *dev, size_t size, > dma_addr_t *handle, gfp_t > no_page: > return NULL; > } > -EXPORT_SYMBOL(__dma_alloc_coherent); > > /* > * free a page as defined by the above mapping. > */ > -void __dma_free_coherent(size_t size, void *vaddr) > +void __dma_nommu_free_coherent(struct devi
Re: [PATCH 10/20] powerpc/dma-noncoherent: don't disable irqs over kmap_atomic
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote:
> The requirement to disable local irqs over kmap_atomic is long gone,
> so remove those calls.

Really? I'm trying to verify that, and getting lost in a mess of macros
from hell in the per-cpu stuff, but if you look at our implementation of
kmap_atomic_prot(), all it does is a preempt_disable(), and then it uses
kmap_atomic_idx_push():

	int idx = __this_cpu_inc_return(__kmap_atomic_idx) - 1;

Note the use of __this_cpu_inc_return(), not this_cpu_inc_return(),
i.e. this is the non-interrupt-safe version...

Ben.

> Signed-off-by: Christoph Hellwig
> ---
> arch/powerpc/mm/dma-noncoherent.c | 6 +-
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/dma-noncoherent.c
> b/arch/powerpc/mm/dma-noncoherent.c
> index 382528475433..d1c16456abac 100644
> --- a/arch/powerpc/mm/dma-noncoherent.c
> +++ b/arch/powerpc/mm/dma-noncoherent.c
> @@ -357,12 +357,10 @@ static inline void __dma_sync_page_highmem(struct page
> *page,
> {
> size_t seg_size = min((size_t)(PAGE_SIZE - offset), size);
> size_t cur_size = seg_size;
> - unsigned long flags, start, seg_offset = offset;
> + unsigned long start, seg_offset = offset;
> int nr_segs = 1 + ((size - seg_size) + PAGE_SIZE - 1)/PAGE_SIZE;
> int seg_nr = 0;
>
> - local_irq_save(flags);
> -
> do {
> start = (unsigned long)kmap_atomic(page + seg_nr) + seg_offset;
>
> @@ -378,8 +376,6 @@ static inline void __dma_sync_page_highmem(struct page
> *page,
> cur_size += seg_size;
> seg_offset = 0;
> } while (seg_nr < nr_segs);
> -
> - local_irq_restore(flags);
> }
> #endif /* CONFIG_HIGHMEM */
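For what it's worth, the usual argument that this is still safe is that
kmap_atomic() nests in strict stack order on a CPU: an interrupt that kmaps
must kunmap before returning, so the per-CPU index is restored before the
interrupted increment completes, even though __this_cpu_inc_return() itself
is not IRQ-safe. A sketch of the pattern:

static void copy_page_sketch(struct page *dst, struct page *src)
{
	void *vdst, *vsrc;

	vsrc = kmap_atomic(src);	/* preempt_disable() + idx push */
	vdst = kmap_atomic(dst);	/* nested push on the same CPU */
	memcpy(vdst, vsrc, PAGE_SIZE);
	kunmap_atomic(vdst);		/* pops must come in reverse order */
	kunmap_atomic(vsrc);
}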
Re: [PATCH 09/20] powerpc/dma: remove the unused ISA_DMA_THRESHOLD export
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig

Acked-by: Benjamin Herrenschmidt

> ---
> arch/powerpc/kernel/setup_32.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
> index 74457485574b..3c2d093f74c7 100644
> --- a/arch/powerpc/kernel/setup_32.c
> +++ b/arch/powerpc/kernel/setup_32.c
> @@ -55,7 +55,6 @@ unsigned long ISA_DMA_THRESHOLD;
> unsigned int DMA_MODE_READ;
> unsigned int DMA_MODE_WRITE;
>
> -EXPORT_SYMBOL(ISA_DMA_THRESHOLD);
> EXPORT_SYMBOL(DMA_MODE_READ);
> EXPORT_SYMBOL(DMA_MODE_WRITE);
>
Re: [PATCH 08/20] powerpc/dma: remove the unused dma_nommu_ops export
On Tue, 2018-07-31 at 14:16 +0200, Christoph Hellwig wrote:
> It turns out cxl actually uses it. So for now skip this patch,
> although random code in drivers messing with dma ops will need to
> be sorted out sooner or later.

CXL devices are "special": they bypass the classic iommu in favor of
allowing the device to operate using the main processor page tables,
using an MMU context (so basically the device can use userspace addresses
directly), akin to ATS.

I think the code currently uses the nommu ops as a way to do a simple
kernel mapping for kernel drivers using CXL (not userspace stuff) though.

Ben.
Re: [PATCH 07/20] powerpc/dma: remove the unused ARCH_HAS_DMA_MMAP_COHERENT define
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig

Acked-by: Benjamin Herrenschmidt

> ---
> arch/powerpc/include/asm/dma-mapping.h | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/dma-mapping.h
> b/arch/powerpc/include/asm/dma-mapping.h
> index 8fa394520af6..f2a4a7142b1e 100644
> --- a/arch/powerpc/include/asm/dma-mapping.h
> +++ b/arch/powerpc/include/asm/dma-mapping.h
> @@ -112,7 +112,5 @@ extern int dma_set_mask(struct device *dev, u64 dma_mask);
>
> extern u64 __dma_get_required_mask(struct device *dev);
>
> -#define ARCH_HAS_DMA_MMAP_COHERENT
> -
> #endif /* __KERNEL__ */
> #endif /* _ASM_DMA_MAPPING_H */
Re: [PATCH 02/20] kernel/dma/direct: refine dma_direct_alloc zone selection
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote:
> We need to take the DMA offset and encryption bit into account when selecting
> a zone. Add a helper that takes those into account and use it.

That whole "encryption" stuff seems to be completely specific to the way
x86 does memory encryption, or am I mistaken ? It's not clear to me what
that does in practice and how it relates to DMA mappings.

I'm also not sure about that whole business with ZONE_DMA and
ARCH_ZONE_DMA_BITS... On ppc64, unless you enable swiotlb (which we only
do currently on some embedded platforms), you have all of memory in
ZONE_DMA.

[0.00] Zone ranges:
[0.00] DMA [mem 0x-0x001f]
[0.00] DMA32 empty
[0.00] Normal empty
[0.00] Device empty

I'm not sure how this will work with that dma direct code. I also see a
number of tests against a 64-bit mask rather than the top of memory...

Ben.

> Signed-off-by: Christoph Hellwig
> ---
> kernel/dma/direct.c | 16
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index d32d4f0d2c0c..c2c1df8827f2 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -58,6 +58,14 @@ static bool dma_coherent_ok(struct device *dev,
> phys_addr_t phys, size_t size)
> return addr + size - 1 <= dev->coherent_dma_mask;
> }
>
> +static bool dma_coherent_below(struct device *dev, u64 mask)
> +{
> + dma_addr_t addr = force_dma_unencrypted() ?
> + __phys_to_dma(dev, mask) : phys_to_dma(dev, mask);
> +
> + return dev->coherent_dma_mask <= addr;
> +}
> +
> void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> gfp_t gfp, unsigned long attrs)
> {
> @@ -70,9 +78,9 @@ void *dma_direct_alloc(struct device *dev, size_t size,
> dma_addr_t *dma_handle,
> gfp &= ~__GFP_ZERO;
>
> /* GFP_DMA32 and GFP_DMA are no ops without the corresponding zones: */
> - if (dev->coherent_dma_mask <= DMA_BIT_MASK(ARCH_ZONE_DMA_BITS))
> + if (dma_coherent_below(dev, DMA_BIT_MASK(ARCH_ZONE_DMA_BITS)))
> gfp |= GFP_DMA;
> - if (dev->coherent_dma_mask <= DMA_BIT_MASK(32) && !(gfp & GFP_DMA))
> + if (dma_coherent_below(dev, DMA_BIT_MASK(32) && !(gfp & GFP_DMA)))
> gfp |= GFP_DMA32;
>
> again:
> @@ -92,14 +100,14 @@ void *dma_direct_alloc(struct device *dev, size_t size,
> dma_addr_t *dma_handle,
> page = NULL;
>
> if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
> - dev->coherent_dma_mask < DMA_BIT_MASK(64) &&
> + dma_coherent_below(dev, DMA_BIT_MASK(64)) &&
> !(gfp & (GFP_DMA32 | GFP_DMA))) {
> gfp |= GFP_DMA32;
> goto again;
> }
>
> if (IS_ENABLED(CONFIG_ZONE_DMA) &&
> - dev->coherent_dma_mask < DMA_BIT_MASK(32) &&
> + dma_coherent_below(dev, DMA_BIT_MASK(32)) &&
> !(gfp & GFP_DMA)) {
> gfp = (gfp & ~GFP_DMA32) | GFP_DMA;
> goto again;
Re: [PATCH 01/20] kernel/dma/direct: take DMA offset into account in dma_direct_supported
On Mon, 2018-07-30 at 18:38 +0200, Christoph Hellwig wrote:
> When a device has a DMA offset the dma capable result will change due
> to the difference between the physical and DMA address. Take that into
> account.

The patch in itself makes sense. However, there are a number of things
in that dma_direct.c file that I don't quite get:

- looking more generally at what that function does, I worry about the
switch of ppc32 to this later on:

We do have the occasional device with things like 31-bit DMA limitation.
We know they happen to work because those systems can't have enough
memory to be a problem. This is why our current DMA direct ops in powerpc
just unconditionally return true on ppc32. The test against a full 32-bit
mask here will break them I think.

Thing is, I'm not sure I still have access to one of these things to
test, I'll have to dig (from memory things like b43 wifi). Also those
platforms don't have an iommu.

- What is this trying to achieve ?

	/*
	 * Various PCI/PCIe bridges have broken support for > 32bit DMA even
	 * if the device itself might support it.
	 */
	if (dev->dma_32bit_limit && mask > phys_to_dma(dev, DMA_BIT_MASK(32)))
		return 0;

IE, if the device has a 32-bit limit, we fail an attempt at checking if a
>32-bit mask works ? That doesn't quite seem to be the right thing to do...
Shouldn't this be in dma_set_mask() and just clamp the mask down ?

IE, dma_set_mask() is what a driver uses to establish the device
capability, so it makes sense to have dma_32bit_limit just reduce that
capability, not fail because the device can do more than what the bridge
can. Sorry if I'm a bit confused here.

- How is that file supposed to work on 64-bit platforms ? From what I can
tell, dma_supported() will unconditionally return true if the mask is
32-bit or larger (apart from the above issue). This doesn't look right,
the mask needs to be compared to the max memory address. There are a
bunch of devices out there with masks anywhere between 40 and 64 bits,
and some of these will not work "out of the box" if the offset-adjusted
top of memory is beyond the mask limit. Or am I missing something ?

Cheers,
Ben.

> Signed-off-by: Christoph Hellwig

Reviewed-by: Benjamin Herrenschmidt

> ---
> kernel/dma/direct.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 8be8106270c2..d32d4f0d2c0c 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -167,7 +167,7 @@ int dma_direct_map_sg(struct device *dev, struct
> scatterlist *sgl, int nents,
> int dma_direct_supported(struct device *dev, u64 mask)
> {
> #ifdef CONFIG_ZONE_DMA
> - if (mask < DMA_BIT_MASK(ARCH_ZONE_DMA_BITS))
> + if (mask < phys_to_dma(dev, DMA_BIT_MASK(ARCH_ZONE_DMA_BITS)))
> return 0;
> #else
> /*
> @@ -176,14 +176,14 @@ int dma_direct_supported(struct device *dev, u64 mask)
> * memory, or by providing a ZONE_DMA32. If neither is the case, the
> * architecture needs to use an IOMMU instead of the direct mapping.
> */
> - if (mask < DMA_BIT_MASK(32))
> + if (mask < phys_to_dma(dev, DMA_BIT_MASK(32)))
> return 0;
> #endif
> /*
> * Various PCI/PCIe bridges have broken support for > 32bit DMA even
> * if the device itself might support it.
> */
> - if (dev->dma_32bit_limit && mask > DMA_BIT_MASK(32))
> + if (dev->dma_32bit_limit && mask > phys_to_dma(dev, DMA_BIT_MASK(32)))
> return 0;
> return 1;
> }
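The alternative Ben sketches would clamp in dma_set_mask() instead of
failing dma_supported(); roughly like this (illustrative only, not a
posted patch):

int dma_set_mask_clamp_sketch(struct device *dev, u64 mask)
{
	/* Reduce the capability instead of rejecting the larger mask. */
	if (dev->dma_32bit_limit)
		mask = min_t(u64, mask, phys_to_dma(dev, DMA_BIT_MASK(32)));

	if (!dma_supported(dev, mask))
		return -EIO;

	*dev->dma_mask = mask;
	return 0;
}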
Re: [PATCH 1/2] arm64: dts: stratix10: Add Stratix10 SMMU support
On 25/07/18 19:24, thor.tha...@linux.intel.com wrote: From: Thor Thayer Add SMMU support to the Stratix10 Device Tree which includes adding the SMMU node and adding IOMMU stream ids to the SMMU peripherals. Update bindings. Signed-off-by: Thor Thayer --- This patch is dependent on the patch series "iommu/arm-smmu: Add runtime pm/sleep support" (https://patchwork.ozlabs.org/cover/946160/) --- .../devicetree/bindings/iommu/arm,smmu.txt | 25 ++ arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi | 30 ++ 2 files changed, 55 insertions(+) diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt index 7c71a6ed465a..8e3fe0594e3e 100644 --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt @@ -18,6 +18,7 @@ conditions. "arm,mmu-500" "cavium,smmu-v2" "qcom,-smmu-v2", "qcom,smmu-v2" +"altr,smmu-v2" Can we guarantee that no Altera SoC will ever exist with a different SMMU implementation, configuration, or clock tree? If we must have compatibles for SoC-specific integrations, I'd be a lot happier if they were actually SoC-specific, i.e. at least "altr,stratix10-smmu", or even something like "altr,gx5500-smmu" if there's a chance of new incompatible designs being added to the Stratix 10 family in future. I'm still dubious that we actually need this for MMU-500, though, since we will always need the TCU clock enabled to do anything, and given the difficulty in associating particular TBU clocks with whichever domains might cause allocations into which TBU's TLBs, it seems highly unlikely that it'll ever be feasible to work at a granularity finer than "all of the clocks". And at that point the names don't really matter, and we merely need something like the proposed of_clk_bulk_get()[1], which works fine regardless of how many TBUs and distinct clocks exist for a particular MMU-500 configuration and integration. depending on the particular implementation and/or the version of the architecture implemented. @@ -179,3 +180,27 @@ conditions. <&mmcc SMMU_MDP_AHB_CLK>; clock-names = "bus", "iface"; }; + + /* Stratix10 arm,smmu-v2 implementation */ + smmu5: iommu@fa00 { + compatible = "altr,smmu-v2", "arm,mmu-500", +"arm,smmu-v2"; + reg = <0xfa00 0x4>; + #global-interrupts = <2>; + #iommu-cells = <1>; + clocks = <&clkmgr STRATIX10_L4_MAIN_CLK>; + clock-names = "masters"; This isn't documented as an actual property, and if it also clocks the TCU then I'm not sure it's really the most accurate name. Robin. 
[1] https://patchwork.kernel.org/patch/10427095/ + interrupt-parent = <&intc>; + interrupts = <0 128 4>, /* Global Secure Fault */ + <0 129 4>, /* Global Non-secure Fault */ + /* Non-secure Context Interrupts (32) */ + <0 138 4>, <0 139 4>, <0 140 4>, <0 141 4>, + <0 142 4>, <0 143 4>, <0 144 4>, <0 145 4>, + <0 146 4>, <0 147 4>, <0 148 4>, <0 149 4>, + <0 150 4>, <0 151 4>, <0 152 4>, <0 153 4>, + <0 154 4>, <0 155 4>, <0 156 4>, <0 157 4>, + <0 158 4>, <0 159 4>, <0 160 4>, <0 161 4>, + <0 162 4>, <0 163 4>, <0 164 4>, <0 165 4>, + <0 166 4>, <0 167 4>, <0 168 4>, <0 169 4>; + stream-match-mask = <0x7ff0>; + }; diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi index d033da401c26..e38ca86d48f6 100644 --- a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi +++ b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi @@ -137,6 +137,7 @@ reset-names = "stmmaceth", "stmmaceth-ocp"; clocks = <&clkmgr STRATIX10_EMAC0_CLK>; clock-names = "stmmaceth"; + iommus = <&smmu 1>; status = "disabled"; }; @@ -150,6 +151,7 @@ reset-names = "stmmaceth", "stmmaceth-ocp"; clocks = <&clkmgr STRATIX10_EMAC1_CLK>; clock-names = "stmmaceth"; + iommus = <&smmu 2>; status = "disabled"; }; @@ -163,6 +165,7 @@ reset-names = "stmmaceth", "stmmaceth-ocp"; clocks = <&clkmgr STRATIX10_EMAC2_CLK>; clock-names = "stmmaceth"; + iommus = <&smm
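The name-agnostic approach Robin points to would let the driver grab
however many clocks the integration happens to have, without inventing
names for them; a sketch against the clk_bulk API (a "get all clocks"
helper was only proposed at the time, see [1], so the name used below is
an assumption):

static int smmu_enable_all_clks(struct device *dev)
{
	struct clk_bulk_data *clks;
	int num;

	num = clk_bulk_get_all(dev, &clks);	/* proposed helper, see [1] */
	if (num < 0)
		return num;

	/* Enable every listed clock, whatever the integration calls it. */
	return clk_bulk_prepare_enable(num, clks);
}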
Re: [RFC PATCH 0/7] A General Accelerator Framework, WarpDrive
On Wed, Aug 08, 2018 at 09:08:42AM +0800, Kenneth Lee wrote:
> On Monday, 6 August 2018 at 11:32 PM, Jerome Glisse wrote:
> > On Mon, Aug 06, 2018 at 11:12:52AM +0800, Kenneth Lee wrote:
> > > On Fri, Aug 03, 2018 at 10:39:44AM -0400, Jerome Glisse wrote:
> > > > On Fri, Aug 03, 2018 at 11:47:21AM +0800, Kenneth Lee wrote:
> > > > > On Thu, Aug 02, 2018 at 10:22:43AM -0400, Jerome Glisse wrote:
> > > > > > On Thu, Aug 02, 2018 at 12:05:57PM +0800, Kenneth Lee wrote:
> > > > > > > On Thu, Aug 02, 2018 at 02:33:12AM +0000, Tian, Kevin wrote:
> > > > > > > > > On Wed, Aug 01, 2018 at 06:22:14PM +0800, Kenneth Lee wrote:

[...]

> > > > > > > > > My more general question is do we want to grow VFIO to become
> > > > > > > > > a more generic device driver API. This patchset adds a command
> > > > > > > > > queue concept to it (I don't think it exists today, but I have
> > > > > > > > > not followed VFIO closely).
> > > > > > >
> > > > > > > The thing is, VFIO is the only place to support DMA from user
> > > > > > > land. If we don't put it here, we have to create another similar
> > > > > > > facility to support the same.
> > > > > >
> > > > > > No it is not; network devices, GPUs, block devices, ... they all do
> > > > > > support DMA. The point I am trying to make here is that even in
> > > > >
> > > > > Sorry, wait a minute, are we talking about the same thing? I meant
> > > > > "DMA from user land", not "DMA from the kernel driver". To do that
> > > > > we have to manipulate the IOMMU (unit). I think it can only be done
> > > > > by the default_domain or a vfio domain. Or the user space would
> > > > > have to directly access the IOMMU.
> > > >
> > > > GPUs do DMA in the sense that you pass to the kernel a valid virtual
> > > > address (the kernel driver does all the proper checks) and then you
> > > > can use the GPU to copy from or to that range of virtual addresses.
> > > > Exactly how you want to use this compression engine. It does not rely
> > > > on SVM, but SVM going forward would still be the preferred option.
> > >
> > > No, SVM is not the reason why we rely on Jean's SVM (SVA) series. We
> > > rely on Jean's series because of multi-process (PASID or substream ID)
> > > support.
> > >
> > > But of course, WarpDrive can still benefit from the SVM feature.
> >
> > We are getting sidetracked here. PASID/ID do not require VFIO.
>
> Yes, PASID itself does not require VFIO. But what if:
>
> 1. We support DMA from user space.
> 2. The hardware makes use of the standard IOMMU/SMMU for IO address
>    translation.
> 3. The IOMMU facility is shared by both kernel and user drivers.
> 4. We support PASID with the current IOMMU facility.

I do not see how any of this means it has to be in VFIO. Other devices
do just that. GPU drivers, for instance, share a DMA engine (one that
copies data around) between kernel and user space. Sometimes the kernel
uses it to move things around; evicting some memory to make room for a
new process is the common example. The same DMA engine is often used by
userspace itself during rendering or compute (programs moving things on
their own). So there are already kernel drivers that do all 4 of the
above and are not in VFIO.

> > > > > > your mechanism the userspace must have a specific userspace
> > > > > > driver for each hardware, and thus there are virtually no
> > > > > > differences between having this userspace driver open a device
> > > > > > file in vfio or somewhere else in the device filesystem. This is
> > > > > > just a different path.
> > > > >
> > > > > The basic problem WarpDrive wants to solve is to avoid syscalls.
> > > > > This is important to accelerators. We have some data here:
> > > > > https://www.slideshare.net/linaroorg/progress-and-demonstration-of-wrapdrive-a-accelerator-framework-sfo17317
> > > > > (see page 3)
> > > > >
> > > > > The performance is different on using kernel and user drivers.
> > > >
> > > > Yes, and the example I point to is exactly that. You have a one-time
> > > > setup cost (creating the command buffer, binding the PASID with the
> > > > command buffer, and a couple of other setup steps). Then userspace no
> > > > longer has to do any ioctl to schedule work on the GPU. It is all done
> > > > from userspace, and it uses a doorbell to notify the hardware when it
> > > > should go look at the command buffer for new things to execute.
> > > >
> > > > My point stands on that. You have existing drivers already doing so
> > > > with no new framework, and in your scheme you need a userspace driver
> > > > anyway. So I do not see the value added; using one path or the other
> > > > in the userspace driver is literally one line to change.
> > >
> > > Sorry, I got confused here. I partially agree that the user driver is
> > > redundant with the kernel driver. (But for WarpDrive, the kernel driver
> > > is a full driver including all preparation and setup stuff for the
> > > hardware; the user driver simply sends requests and receives answers.)
> > > Yes, it is just a choice of path. But the user path is faster if the
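The no-syscall fast path Jerome describes above -- a one-time ioctl for
setup, then user-space command submission via a doorbell -- looks
roughly like this from the user side (a minimal sketch; the device
node, ioctl, struct layout, and ring format are all assumed for
illustration, not any real driver's ABI):

#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Hypothetical ABI, for illustration only. */
struct accel_queue_info {
	uint64_t ring_offset;		/* mmap offset of command ring  */
	uint64_t doorbell_offset;	/* mmap offset of doorbell page */
	uint32_t ring_entries;
};
#define ACCEL_CREATE_QUEUE _IOR('A', 0, struct accel_queue_info)

static int submit_one(uint64_t cmd)
{
	struct accel_queue_info qi;
	int fd = open("/dev/accel0", O_RDWR);	/* hypothetical node */

	if (fd < 0 || ioctl(fd, ACCEL_CREATE_QUEUE, &qi))
		return -1;

	/* One-time cost ends here: ring and doorbell are now mapped. */
	uint64_t *ring = mmap(NULL, qi.ring_entries * sizeof(*ring),
			      PROT_READ | PROT_WRITE, MAP_SHARED,
			      fd, qi.ring_offset);
	volatile uint32_t *doorbell = mmap(NULL, 4096, PROT_WRITE,
					   MAP_SHARED, fd,
					   qi.doorbell_offset);

	/* Fast path: no syscall per job. Write the command into the
	 * ring, then ring the doorbell (an MMIO write) so the device
	 * goes and fetches it. */
	uint32_t head = 0;
	ring[head++ % qi.ring_entries] = cmd;
	*doorbell = head;
	return 0;
}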
Re: [PATCH v3 08/10] dmapool: improve accuracy of debug statistics
On 08/08/2018 05:54 AM, Andy Shevchenko wrote:
> On Tue, Aug 7, 2018 at 7:49 PM, Tony Battersby wrote:
>> The "total number of blocks in pool" debug statistic currently does not
>> take the boundary value into account, so it diverges from the "total
>> number of blocks in use" statistic when a boundary is in effect. Add a
>> calculation for the number of blocks per allocation that takes the
>> boundary into account, and use it to replace the inaccurate calculation.
>
>> +	retval->blks_per_alloc =
>> +		(allocation / boundary) * (boundary / size) +
>> +		(allocation % boundary) / size;
>
> If boundary is guaranteed to be a power of 2, this could avoid costly
> divisions (though it's a slow path anyway).

At this point in the function, boundary is guaranteed to be either a
power of 2 or equal to allocation, which might not be a power of 2. Not
worth special-casing a slow path.
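For a concrete sense of the inaccuracy being fixed (numbers chosen for
illustration): with allocation = 4096, boundary = 1024 and size = 96,
the new formula gives (4096/1024) * (1024/96) + (4096 % 1024)/96 =
4 * 10 + 0 = 40 blocks per allocation, because only 10 of the 96-byte
blocks fit inside each 1024-byte boundary segment. A size-only
calculation would report 4096/96 = 42, overcounting by the two blocks
that would straddle a boundary.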
Re: [PATCH 1/1] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
Hi Thunder,

On Mon, Aug 06, 2018 at 08:31:29PM +0800, Zhen Lei wrote:
> The condition "(int)(VAL - sync_idx) >= 0" to break the loop in function
> __arm_smmu_sync_poll_msi requires that sync_idx must be increased
> monotonically according to the sequence of the CMDs in the cmdq.
>
> But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not
> protected by the spinlock, so the following scenario may appear:
>
> cpu0                    cpu1
> msidata=0
>                         msidata=1
>                         insert cmd1
> insert cmd0
>                         smmu execute cmd1
> smmu execute cmd0
>                         poll timeout, because msidata=1 is overridden
>                         by cmd0, which means VAL=0, sync_idx=1.

Oh yuck, you're right! We probably want a CC stable on this. Did you
see this go wrong in practice?

One comment on your patch...

> Signed-off-by: Zhen Lei
> ---
>  drivers/iommu/arm-smmu-v3.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 1d64710..4810f61 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -566,7 +566,7 @@ struct arm_smmu_device {
>
>  	int				gerr_irq;
>  	int				combined_irq;
> -	atomic_t			sync_nr;
> +	u32				sync_nr;
>
>  	unsigned long			ias; /* IPA */
>  	unsigned long			oas; /* PA */
> @@ -836,7 +836,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>  		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
>  		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
>  		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
> -		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
>  		cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
>  		break;
>  	default:
> @@ -947,7 +946,6 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
>  	struct arm_smmu_cmdq_ent ent = {
>  		.opcode = CMDQ_OP_CMD_SYNC,
>  		.sync	= {
> -			.msidata = atomic_inc_return_relaxed(&smmu->sync_nr),
>  			.msiaddr = virt_to_phys(&smmu->sync_count),
>  		},
>  	};
> @@ -955,6 +953,8 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
>  	arm_smmu_cmdq_build_cmd(cmd, &ent);
>
>  	spin_lock_irqsave(&smmu->cmdq.lock, flags);
> +	ent.sync.msidata = ++smmu->sync_nr;
> +	cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent.sync.msidata);

I really don't like splitting this out from building the rest of the
command. Can you just move the call to arm_smmu_cmdq_build_cmd into the
critical section, please?

Thanks,

Will
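For concreteness, the rework Will is asking for would look something
like this (an untested sketch of the suggested shape, not the final
patch; arm_smmu_cmdq_insert_cmd() is the helper this function already
uses):

	spin_lock_irqsave(&smmu->cmdq.lock, flags);
	/* Assign msidata and build the whole command under the lock,
	 * so MSI data values cannot be reordered against the order in
	 * which commands are inserted into the queue. */
	ent.sync.msidata = ++smmu->sync_nr;
	arm_smmu_cmdq_build_cmd(cmd, &ent);
	arm_smmu_cmdq_insert_cmd(smmu, cmd);
	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);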
Re: [PATCH v3 08/10] dmapool: improve accuracy of debug statistics
On Tue, Aug 7, 2018 at 7:49 PM, Tony Battersby wrote:
> The "total number of blocks in pool" debug statistic currently does not
> take the boundary value into account, so it diverges from the "total
> number of blocks in use" statistic when a boundary is in effect. Add a
> calculation for the number of blocks per allocation that takes the
> boundary into account, and use it to replace the inaccurate calculation.

> +	retval->blks_per_alloc =
> +		(allocation / boundary) * (boundary / size) +
> +		(allocation % boundary) / size;

If boundary is guaranteed to be a power of 2, this could avoid costly
divisions (though it's a slow path anyway).

-- 
With Best Regards,
Andy Shevchenko
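For what it's worth, the power-of-2 variant Andy has in mind would
presumably look like this (a sketch only -- and, per Tony's reply
above, boundary is not actually guaranteed to be a power of 2 here):

	/* Valid only when boundary is a power of 2: replace the
	 * divide and modulo by boundary with a shift and a mask.
	 * ilog2() comes from <linux/log2.h>. */
	unsigned int bshift = ilog2(boundary);

	retval->blks_per_alloc =
		(allocation >> bshift) * (boundary / size) +
		(allocation & (boundary - 1)) / size;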
Re: [PATCH v3 07/10] dmapool: cleanup integer types
On Tue, Aug 7, 2018 at 7:48 PM, Tony Battersby wrote:
> To represent the size of a single allocation, dmapool currently uses
> 'unsigned int' in some places and 'size_t' in other places. Standardize
> on 'unsigned int' to reduce overhead, but use 'size_t' when counting all
> the blocks in the entire pool.

> 	else if ((boundary < size) || (boundary & (boundary - 1)))
> 		return NULL;

Just a side note: the above open-codes is_power_of_2().

-- 
With Best Regards,
Andy Shevchenko
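In other words (a trivial sketch; is_power_of_2() lives in
<linux/log2.h>, and the two forms are equivalent here because the
boundary == 0 case was already handled just above this branch):

	else if ((boundary < size) || !is_power_of_2(boundary))
		return NULL;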
Re: [PATCH v2 0/8] Tegra GART driver clean up and optimization
Hey Thierry,

On Sat, Aug 04, 2018 at 05:29:55PM +0300, Dmitry Osipenko wrote:
> Dmitry Osipenko (8):
>   memory: tegra: Provide facility for integration with the GART driver
>   iommu/tegra: gart: Provide access to Memory Controller driver
>   iommu/tegra: gart: Clean up drivers module code
>   iommu/tegra: gart: Remove pr_fmt and clean up includes
>   iommu/tegra: gart: Clean up driver probe errors handling
>   iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
>   iommu: Introduce iotlb_sync_map callback
>   iommu/tegra: gart: Optimize mapping / unmapping performance
>
>  drivers/iommu/iommu.c      |  8 ++-
>  drivers/iommu/tegra-gart.c | 99 +++---
>  drivers/memory/tegra/mc.c  | 26 --
>  include/linux/iommu.h      |  1 +
>  include/soc/tegra/mc.h     | 13 +
>  5 files changed, 103 insertions(+), 44 deletions(-)

Can you have a look please whether this is safe to go upstream?

Thanks,

	Joerg
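The core change in patch 7 is the interesting bit: the new callback
lets a driver like tegra-gart do one flush after an entire map
operation instead of one per page. Conceptually it works like this
(a sketch of the idea, not the literal diff; __iommu_map here stands
in for the existing per-page mapping loop):

int iommu_map(struct iommu_domain *domain, unsigned long iova,
	      phys_addr_t paddr, size_t size, int prot)
{
	const struct iommu_ops *ops = domain->ops;
	int ret;

	ret = __iommu_map(domain, iova, paddr, size, prot); /* per-page work */

	/* New optional callback: one flush for the whole range. */
	if (!ret && ops->iotlb_sync_map)
		ops->iotlb_sync_map(domain);

	return ret;
}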
Re: [RFC PATCH 2/7] iommu: Add share domain interface in iommu for spimdev
On Wed, Aug 01, 2018 at 06:22:16PM +0800, Kenneth Lee wrote:
> From: Kenneth Lee
>
> This patch adds a sharing interface for an iommu_group. The new
> interfaces:
>
>	iommu_group_share_domain()
>	iommu_group_unshare_domain()
>
> can be used by a virtual iommu_group (such as the iommu_group for
> spimdev) to share its parent's iommu_group.
>
> When the domain of the group is shared, it cannot be changed before it
> is unshared. In the future, notification can be added if update is
> required.

From the description or the patch I don't really get why this is
needed. Please update the description and make a case for this API
addition.

	Joerg
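For readers following along, the intended usage appears to be roughly
the following (my reading of the patch, with an assumed return type --
spelling this out is presumably exactly what Joerg is asking for):

	/* In the parent device's driver, when a virtual child
	 * (spimdev) is created. Assumes iommu_group_share_domain()
	 * returns the parent group's domain or an ERR_PTR. */
	struct iommu_domain *domain;

	domain = iommu_group_share_domain(parent_group);
	if (IS_ERR(domain))
		return PTR_ERR(domain);

	/* ... the domain is now pinned: attach/detach is refused
	 * until the matching unshare ... */

	iommu_group_unshare_domain(parent_group);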
Re: [PATCH] iommu/omap: Fix cache flushes on L2 table entries
On Mon, Aug 06, 2018 at 05:00:36PM +0200, Ralf Goebel wrote:
> The base address used for DMA operations on the second-level table
> incorrectly included the offset for the table entry. The offset was
> then added again, which led to incorrect behavior.
>
> Operations on the L1 table are not affected.
>
> The calculation of the base address is changed to point to the
> beginning of the L2 table.
>
> Fixes: bfee0cf0ee1d ("iommu/omap: Use DMA-API for performing cache flushes")
> Acked-by: Suman Anna
> Signed-off-by: Ralf Goebel

Applied for v4.19, thanks.
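The bug was of this general shape (illustrative only -- the helper
names below are made up for the sketch, not the omap driver's actual
ones):

	/* Flushing one L2 (second-level) page-table entry: */
	dma_addr_t table_base = l2_table_dma;	/* start of the L2 table */
	unsigned long off = idx * sizeof(u32);	/* offset of the entry   */

	/* buggy: the base already pointed at the entry, so the entry
	 * offset ended up being added twice */
	dma_sync_single_for_device(dev, (table_base + off) + off,
				   sizeof(u32), DMA_TO_DEVICE);

	/* fixed: base is the start of the table, offset added once */
	dma_sync_single_for_device(dev, table_base + off,
				   sizeof(u32), DMA_TO_DEVICE);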
Re: [PATCH] iommu: remove the ->map_sg indirection
On Mon, Jul 30, 2018 at 09:36:26AM +0200, Christoph Hellwig wrote:
> All iommu drivers use the default_iommu_map_sg implementation, and there
> is no good reason to ever override it. Just expose it as iommu_map_sg
> directly and remove the indirection, especially in our post-spectre world
> where indirect calls are horribly expensive.
>
> Signed-off-by: Christoph Hellwig

Applied, thanks.
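The shape of the change, from my reading of the patch (abridged sketch,
not the full diff): before, every caller bounced through an ops pointer
via an inline wrapper roughly like

	static inline size_t iommu_map_sg(struct iommu_domain *domain,
					  unsigned long iova,
					  struct scatterlist *sg,
					  unsigned int nents, int prot)
	{
		/* indirect call -- retpoline-expensive post-spectre */
		return domain->ops->map_sg(domain, iova, sg, nents, prot);
	}

and after, default_iommu_map_sg simply becomes iommu_map_sg, an
ordinary exported function, and the ->map_sg member of iommu_ops goes
away entirely.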