Re: [PATCH v3 2/2] drivers: remove force dma flag from buses

2018-04-09 Thread Rob Herring
On Fri, Mar 30, 2018 at 01:24:45PM +0530, Nipun Gupta wrote:
> With each bus implementing its own DMA configuration callback,
> there is no need for bus to explicitly have force_dma in its
> global structure. This patch modifies of_dma_configure API to
> accept an input parameter which specifies if implicit DMA
> configuration is required even when it is not described by the
> firmware.
> 
> Signed-off-by: Nipun Gupta 
> ---
> Changes in v2:
>   - This is a new change suggested by Robin and Christoph
> and is added to the series.
> 
> Changes in v3:
>   - Rebase and changes corresponding to the changes in patch 1/2
> 
>  drivers/amba/bus.c| 1 -
>  drivers/base/platform.c   | 3 +--
>  drivers/bcma/main.c   | 2 +-
>  drivers/dma/qcom/hidma_mgmt.c | 2 +-
>  drivers/gpu/host1x/bus.c  | 5 ++---
>  drivers/of/device.c   | 6 --
>  drivers/of/of_reserved_mem.c  | 2 +-
>  drivers/pci/pci-driver.c  | 3 +--
>  include/linux/device.h| 4 
>  include/linux/of_device.h | 8 ++--
>  10 files changed, 17 insertions(+), 19 deletions(-)

Reviewed-by: Rob Herring 



[PATCH v1 4/4] iommu/tegra: gart: Optimize map/unmap

2018-04-09 Thread Dmitry Osipenko
Currently the GART driver writes one page entry at a time and flushes after
each write. It is more optimal to aggregate the writes and flush the bus
buffer once at the end; this gives map/unmap a 10-40% performance boost
(depending on the size of the mapping) compared to flushing after each
entry update.

Signed-off-by: Dmitry Osipenko 
---
 drivers/iommu/tegra-gart.c | 63 +++---
 1 file changed, 48 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 4a0607669d34..9f59f5f17661 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -36,7 +36,7 @@
 #define GART_APERTURE_SIZE SZ_32M
 
 /* bitmap of the page sizes currently supported */
-#define GART_IOMMU_PGSIZES (SZ_4K)
+#define GART_IOMMU_PGSIZES GENMASK(24, 12)
 
 #define GART_REG_BASE  0x24
 #define GART_CONFIG(0x24 - GART_REG_BASE)
@@ -269,25 +269,21 @@ static void gart_iommu_domain_free(struct iommu_domain *domain)
kfree(gart_domain);
 }
 
-static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
- phys_addr_t pa, size_t bytes, int prot)
+static int gart_iommu_map_page(struct gart_device *gart,
+  unsigned long iova,
+  phys_addr_t pa)
 {
-   struct gart_domain *gart_domain = to_gart_domain(domain);
-   struct gart_device *gart = gart_domain->gart;
unsigned long flags;
unsigned long pfn;
unsigned long pte;
 
-   if (!gart_iova_range_valid(gart, iova, bytes))
-   return -EINVAL;
-
-   spin_lock_irqsave(&gart->pte_lock, flags);
pfn = __phys_to_pfn(pa);
if (!pfn_valid(pfn)) {
dev_err(gart->dev, "Invalid page: %pa\n", &pa);
-   spin_unlock_irqrestore(&gart->pte_lock, flags);
return -EINVAL;
}
+
+   spin_lock_irqsave(&gart->pte_lock, flags);
if (gart_debug) {
pte = gart_read_pte(gart, iova);
if (pte & GART_ENTRY_PHYS_ADDR_VALID) {
@@ -297,8 +293,41 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
}
}
gart_set_pte(gart, iova, GART_PTE(pfn));
+   spin_unlock_irqrestore(&gart->pte_lock, flags);
+
+   return 0;
+}
+
+static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
+ phys_addr_t pa, size_t bytes, int prot)
+{
+   struct gart_domain *gart_domain = to_gart_domain(domain);
+   struct gart_device *gart = gart_domain->gart;
+   size_t mapped;
+   int ret = -1;
+
+   if (!gart_iova_range_valid(gart, iova, bytes))
+   return -EINVAL;
+
+   for (mapped = 0; mapped < bytes; mapped += GART_PAGE_SIZE) {
+   ret = gart_iommu_map_page(gart, iova + mapped, pa + mapped);
+   if (ret)
+   break;
+   }
+
FLUSH_GART_REGS(gart);
+   return ret;
+}
+
+static int gart_iommu_unmap_page(struct gart_device *gart,
+unsigned long iova)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&gart->pte_lock, flags);
+   gart_set_pte(gart, iova, 0);
spin_unlock_irqrestore(&gart->pte_lock, flags);
+
return 0;
 }
 
@@ -307,16 +336,20 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 {
struct gart_domain *gart_domain = to_gart_domain(domain);
struct gart_device *gart = gart_domain->gart;
-   unsigned long flags;
+   size_t unmapped;
+   int ret;
 
if (!gart_iova_range_valid(gart, iova, bytes))
return 0;
 
-   spin_lock_irqsave(&gart->pte_lock, flags);
-   gart_set_pte(gart, iova, 0);
+   for (unmapped = 0; unmapped < bytes; unmapped += GART_PAGE_SIZE) {
+   ret = gart_iommu_unmap_page(gart, iova + unmapped);
+   if (ret)
+   break;
+   }
+
FLUSH_GART_REGS(gart);
-   spin_unlock_irqrestore(&gart->pte_lock, flags);
-   return bytes;
+   return unmapped;
 }
 
 static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
-- 
2.16.3
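
The batching pattern used by the patch above is not specific to the GART
hardware. The following minimal, self-contained C model is purely
illustrative -- the counters stand in for the real MMIO writes and the
bus-buffer flush, and all names are made up -- but it shows why flushing
once per mapping instead of once per entry is the win the commit message
measures. Note also that changing GART_IOMMU_PGSIZES to GENMASK(24, 12) is
what lets the IOMMU core hand the driver ranges larger than a single 4 KiB
page, so the new map/unmap loops actually have something to batch.

#include <stdio.h>
#include <stddef.h>

#define PAGE_SZ     4096UL
#define NUM_PAGES   1024UL

/* Stand-ins for the real register write and the bus-buffer flush. */
static unsigned long writes, flushes;

static void set_pte(unsigned long iova, unsigned long pfn)
{
	(void)iova; (void)pfn;
	writes++;			/* one register write per page entry */
}

static void flush_regs(void)
{
	flushes++;			/* read-back that drains the write buffer */
}

/* Old scheme: flush after every single entry update. */
static void map_per_entry_flush(unsigned long iova, size_t bytes)
{
	for (size_t done = 0; done < bytes; done += PAGE_SZ) {
		set_pte(iova + done, done / PAGE_SZ);
		flush_regs();
	}
}

/* New scheme: write all entries first, flush the buffer once at the end. */
static void map_batched(unsigned long iova, size_t bytes)
{
	for (size_t done = 0; done < bytes; done += PAGE_SZ)
		set_pte(iova + done, done / PAGE_SZ);
	flush_regs();
}

int main(void)
{
	size_t bytes = NUM_PAGES * PAGE_SZ;

	map_per_entry_flush(0x58000000UL, bytes);
	printf("per-entry flush: %lu writes, %lu flushes\n", writes, flushes);

	writes = flushes = 0;
	map_batched(0x58000000UL, bytes);
	printf("batched flush:   %lu writes, %lu flushes\n", writes, flushes);
	return 0;
}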



[PATCH v1 0/4] Tegra GART fixes and improvements

2018-04-09 Thread Dmitry Osipenko
The GART driver has never been utilized upstream, but this should finally
change sometime soon with the Tegra DRM driver rework. In general the GART
driver works fine, though there are a couple of things that could be
improved.

Dmitry Osipenko (4):
  iommu/tegra: gart: Add debugging facility
  iommu/tegra: gart: Fix gart_iommu_unmap()
  iommu/tegra: gart: Constify number of GART pages
  iommu/tegra: gart: Optimize map/unmap

 drivers/iommu/tegra-gart.c | 90 +++---
 1 file changed, 69 insertions(+), 21 deletions(-)

-- 
2.16.3



[PATCH v1 2/4] iommu/tegra: gart: Fix gart_iommu_unmap()

2018-04-09 Thread Dmitry Osipenko
gart_iommu_unmap() must return the number of unmapped bytes on success;
returning 0 means that unmapping failed, with the result that only one page
gets unmapped.

Signed-off-by: Dmitry Osipenko 
---
 drivers/iommu/tegra-gart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 4c0abdcd1ad2..89ec24c6952c 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -313,7 +313,7 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
gart_set_pte(gart, iova, 0);
FLUSH_GART_REGS(gart);
spin_unlock_irqrestore(&gart->pte_lock, flags);
-   return 0;
+   return bytes;
 }
 
 static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
-- 
2.16.3
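
For context, the reason a zero return truncates the operation lies in how
the IOMMU core consumes the callback's result. The simplified, self-contained
model below is illustrative only -- it mirrors the shape of the core unmap
walk rather than copying it -- and shows that once the driver reports 0 the
walk stops, leaving the rest of the range mapped:

#include <stdio.h>
#include <stddef.h>

#define GART_PAGE_SZ 4096UL

/*
 * Mock driver callbacks. With the bug, gart_iommu_unmap() cleared one PTE
 * but reported 0 bytes unmapped; with the fix it reports what it did.
 */
static size_t buggy_unmap(unsigned long iova, size_t bytes)
{
	(void)iova; (void)bytes;
	return 0;			/* bug: looks like a failure to the core */
}

static size_t fixed_unmap(unsigned long iova, size_t bytes)
{
	(void)iova;
	return bytes;			/* fix: report the unmapped size */
}

/*
 * Simplified model of the core unmap walk: keep calling the driver until
 * the requested size is covered; a zero return is treated as failure and
 * aborts the walk.
 */
static size_t core_unmap(size_t (*unmap)(unsigned long, size_t),
			 unsigned long iova, size_t size)
{
	size_t unmapped = 0;

	while (unmapped < size) {
		size_t ret = unmap(iova + unmapped, GART_PAGE_SZ);

		if (!ret)
			break;
		unmapped += ret;
	}
	return unmapped;
}

int main(void)
{
	size_t size = 8 * GART_PAGE_SZ;

	printf("buggy: core sees %zu of %zu bytes unmapped\n",
	       core_unmap(buggy_unmap, 0x58000000UL, size), size);
	printf("fixed: core sees %zu of %zu bytes unmapped\n",
	       core_unmap(fixed_unmap, 0x58000000UL, size), size);
	return 0;
}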



[PATCH v1 1/4] iommu/tegra: gart: Add debugging facility

2018-04-09 Thread Dmitry Osipenko
A page mapping could be overwritten by accident (a bug). We can catch this
case by checking the 'VALID' bit of the GART page entry prior to mapping a
page. Since that check introduces a small performance impact, it should be
enabled explicitly using the GART module's new 'debug' parameter.

Signed-off-by: Dmitry Osipenko 
---
 drivers/iommu/tegra-gart.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index b62f790ad1ba..4c0abdcd1ad2 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -72,6 +72,8 @@ struct gart_domain {
 
 static struct gart_device *gart_handle; /* unique for a system */
 
+static bool gart_debug;
+
 #define GART_PTE(_pfn) \
(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
 
@@ -271,6 +273,7 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
struct gart_device *gart = gart_domain->gart;
unsigned long flags;
unsigned long pfn;
+   unsigned long pte;
 
if (!gart_iova_range_valid(gart, iova, bytes))
return -EINVAL;
@@ -282,6 +285,14 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
spin_unlock_irqrestore(&gart->pte_lock, flags);
return -EINVAL;
}
+   if (gart_debug) {
+   pte = gart_read_pte(gart, iova);
+   if (pte & GART_ENTRY_PHYS_ADDR_VALID) {
+   spin_unlock_irqrestore(&gart->pte_lock, flags);
+   dev_err(gart->dev, "Page entry is in-use\n");
+   return -EBUSY;
+   }
+   }
gart_set_pte(gart, iova, GART_PTE(pfn));
FLUSH_GART_REGS(gart);
spin_unlock_irqrestore(&gart->pte_lock, flags);
@@ -515,7 +526,9 @@ static void __exit tegra_gart_exit(void)
 
 subsys_initcall(tegra_gart_init);
 module_exit(tegra_gart_exit);
+module_param(gart_debug, bool, 0644);
 
+MODULE_PARM_DESC(gart_debug, "Enable GART debugging");
 MODULE_DESCRIPTION("IOMMU API for GART in Tegra20");
 MODULE_AUTHOR("Hiroshi DOYU ");
 MODULE_ALIAS("platform:tegra-gart");
-- 
2.16.3
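
The runtime-debug switch added above relies on nothing GART-specific. A
minimal, hypothetical standalone module sketch of the same module_param()
idiom (all demo_* names are invented) looks like this; with mode 0644 the
flag is also writable at runtime through /sys/module/<name>/parameters/, in
addition to being settable on the kernel command line:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/moduleparam.h>

/*
 * Minimal sketch of the module_param() idiom the patch uses. The 0644
 * permissions expose the flag as a writable sysfs parameter file.
 */
static bool demo_debug;
module_param(demo_debug, bool, 0644);
MODULE_PARM_DESC(demo_debug, "Enable extra debug checks");

static int __init demo_init(void)
{
	pr_info("demo: debug checks %s\n",
		demo_debug ? "enabled" : "disabled");
	return 0;
}

static void __exit demo_exit(void)
{
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");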



[PATCH v1 3/4] iommu/tegra: gart: Constify number of GART pages

2018-04-09 Thread Dmitry Osipenko
GART has a fixed aperture size, hence the number of pages is constant.

Signed-off-by: Dmitry Osipenko 
---
 drivers/iommu/tegra-gart.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 89ec24c6952c..4a0607669d34 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -33,6 +33,8 @@
 
 #include 
 
+#define GART_APERTURE_SIZE SZ_32M
+
 /* bitmap of the page sizes currently supported */
 #define GART_IOMMU_PGSIZES (SZ_4K)
 
@@ -47,6 +49,8 @@
 #define GART_PAGE_MASK \
(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
 
+#define GART_PAGECOUNT (GART_APERTURE_SIZE >> GART_PAGE_SHIFT)
+
 struct gart_client {
struct device   *dev;
struct list_headlist;
@@ -55,7 +59,6 @@ struct gart_client {
 struct gart_device {
void __iomem*regs;
u32 *savedata;
-   u32 page_count; /* total remappable size */
dma_addr_t  iovmm_base; /* offset to vmm_area */
spinlock_t  pte_lock;   /* for pagetable */
struct list_headclient;
@@ -91,7 +94,7 @@ static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
 
 #define for_each_gart_pte(gart, iova)  \
for (iova = gart->iovmm_base;   \
-iova < gart->iovmm_base + GART_PAGE_SIZE * gart->page_count; \
+iova < gart->iovmm_base + GART_APERTURE_SIZE;  \
 iova += GART_PAGE_SIZE)
 
 static inline void gart_set_pte(struct gart_device *gart,
@@ -158,7 +161,7 @@ static inline bool gart_iova_range_valid(struct gart_device *gart,
iova_start = iova;
iova_end = iova_start + bytes - 1;
gart_start = gart->iovmm_base;
-   gart_end = gart_start + gart->page_count * GART_PAGE_SIZE - 1;
+   gart_end = gart_start + GART_APERTURE_SIZE - 1;
 
if (iova_start < gart_start)
return false;
@@ -241,7 +244,7 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
gart_domain->gart = gart;
gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
-   gart->page_count * GART_PAGE_SIZE - 1;
+   GART_APERTURE_SIZE - 1;
gart_domain->domain.geometry.force_aperture = true;
 
return &gart_domain->domain;
@@ -463,9 +466,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
INIT_LIST_HEAD(&gart->client);
gart->regs = gart_regs;
gart->iovmm_base = (dma_addr_t)res_remap->start;
-   gart->page_count = (resource_size(res_remap) >> GART_PAGE_SHIFT);
 
-   gart->savedata = vmalloc(sizeof(u32) * gart->page_count);
+   gart->savedata = vmalloc(sizeof(u32) * GART_PAGECOUNT);
if (!gart->savedata) {
dev_err(dev, "failed to allocate context save area\n");
return -ENOMEM;
-- 
2.16.3



[PATCH] dma-coherent: Clarify dma_mmap_from_dev_coherent documentation

2018-04-09 Thread Robin Murphy
The use of "correctly mapped" here is misleading, since it can give the
wrong expectation in the case that the memory *should* have been mapped
from the per-device pool, but doing so failed for other reasons.

Signed-off-by: Robin Murphy 
---
 drivers/base/dma-coherent.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index 1e6396bb807b..597d40893862 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -312,8 +312,9 @@ static int __dma_mmap_from_coherent(struct dma_coherent_mem *mem,
  * This checks whether the memory was allocated from the per-device
  * coherent memory pool and if so, maps that memory to the provided vma.
  *
- * Returns 1 if we correctly mapped the memory, or 0 if the caller should
- * proceed with mapping memory from generic pools.
+ * Returns 1 if @vaddr belongs to the device coherent pool and the caller
+ * should return @ret, or 0 if they should proceed with mapping memory from
+ * generic areas.
  */
 int dma_mmap_from_dev_coherent(struct device *dev, struct vm_area_struct *vma,
   void *vaddr, size_t size, int *ret)
-- 
2.16.1.dirty
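
The calling convention that the new wording describes is easiest to see from
the caller's side. A hedged, kernel-style sketch -- illustrative only, not a
real driver; my_mmap and its surroundings are hypothetical, and it mirrors
how dma_common_mmap() already uses the helper:

#include <linux/dma-mapping.h>
#include <linux/mm.h>

/*
 * Hypothetical mmap path. If dma_mmap_from_dev_coherent() returns 1, @vaddr
 * came from the per-device coherent pool and the caller must return @ret --
 * whether or not the mapping actually succeeded. Only a 0 return means
 * "not from the pool, go map memory from the generic areas".
 */
static int my_mmap(struct device *dev, struct vm_area_struct *vma,
		   void *vaddr, size_t size)
{
	int ret = -ENXIO;

	if (dma_mmap_from_dev_coherent(dev, vma, vaddr, size, &ret))
		return ret;	/* handled by the per-device pool, pass @ret on */

	/* ... fall through to the generic mapping path here ... */
	return ret;
}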



Re: [PATCH] base: dma-mapping: Postpone cpu addr translation on mmap()

2018-04-09 Thread Christoph Hellwig
On Mon, Apr 09, 2018 at 06:59:08PM +0200, Jacopo Mondi wrote:
> I'm still a bit puzzled on what happens if dma_mmap_from_dev_coherent() fails.
> Does a dma_mmap_from_dev_coherent() failure guarantee anyhow that the
> successive virt_to_page() isn't problematic as it is today?
> Or is it the
>   if (off < count && user_count <= (count - off))
> check that makes the translation safe?

It doesn't.  I think one major issue is that we should not simply fall back
to dma_common_mmap if no mmap method is provided, but need every instance of
dma_map_ops to explicitly opt into an mmap method that is known to work.

>  #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP
>   unsigned long user_count = vma_pages(vma);
>   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> - unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
>   unsigned long off = vma->vm_pgoff;
> + unsigned long pfn;
> 
>   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> 
> @@ -235,6 +235,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
>   return ret;
> 
>   if (off < count && user_count <= (count - off)) {
> + pfn = page_to_pfn(virt_to_page(cpu_addr));
>   ret = remap_pfn_range(vma, vma->vm_start,
> pfn + off,
> user_count << PAGE_SHIFT,

Why not:

ret = remap_pfn_range(vma, vma->vm_start,
page_to_pfn(virt_to_page(cpu_addr)) + off,

and save the temp variable?


[PATCH] base: dma-mapping: Postpone cpu addr translation on mmap()

2018-04-09 Thread Jacopo Mondi
Postpone the virt_to_page() translation for memory locations that are not
guaranteed to be backed by a struct page.

This patch fixes a specific issue seen on the SH architecture configured
with the SPARSEMEM memory model, when mapping buffers allocated with the
memblock APIs at system initialization time, which are therefore not backed
by the page infrastructure.

It applies to the general case as well, though, as the early translation is
incorrect anyhow and should be postponed until after trying to map memory
from the device coherent memory pool first.

Suggested-by: Laurent Pinchart 
Signed-off-by: Jacopo Mondi 

---
Compared to the RFC version I have tried to generalize the commit message,
please suggest any improvement to that.

I'm still a bit puzzled on what happens if dma_mmap_from_dev_coherent() fails.
Does a dma_mmap_from_dev_coherent() failure guarantee anyhow that the
successive virt_to_page() isn't problematic as it is today?
Or is it the
if (off < count && user_count <= (count - off))
check that makes the translation safe?

Thanks
   j

---
 drivers/base/dma-mapping.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
index 3b11835..8b4ec34 100644
--- a/drivers/base/dma-mapping.c
+++ b/drivers/base/dma-mapping.c
@@ -226,8 +226,8 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
 #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP
unsigned long user_count = vma_pages(vma);
unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
unsigned long off = vma->vm_pgoff;
+   unsigned long pfn;

vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

@@ -235,6 +235,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
return ret;

if (off < count && user_count <= (count - off)) {
+   pfn = page_to_pfn(virt_to_page(cpu_addr));
ret = remap_pfn_range(vma, vma->vm_start,
  pfn + off,
  user_count << PAGE_SHIFT,
--
2.7.4



Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()

2018-04-09 Thread Rich Felker
On Mon, Apr 09, 2018 at 04:06:15PM +0300, Laurent Pinchart wrote:
> Hello,
> 
> On Monday, 9 April 2018 14:11:22 EEST Robin Murphy wrote:
> > On 09/04/18 08:25, jacopo mondi wrote:
> > > Hi Robin, Laurent,
> > > 
> > >  a long time passed, sorry about this.
> > > 
> > > On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote:
> > >> On 14/11/17 17:08, Jacopo Mondi wrote:
> > >>> On SH4 architecture, with SPARSEMEM memory model, translating page to
> > >>> pfn hangs the CPU. Post-pone translation to pfn after
> > >>> dma_mmap_from_dev_coherent() function call as it succeeds and make page
> > >>> translation not necessary.
> > >>> 
> > >>> This patch was suggested by Laurent Pinchart and he's working to submit
> > >>> a proper fix mainline. Not sending for inclusion at the moment.
> > >> 
> > >> Y'know, I think this patch does have some merit by itself - until we know
> > >> that cpu_addr *doesn't* represent some device-private memory which is not
> > >> guaranteed to be backed by a struct page, calling virt_to_page() on it is
> > >> arguably semantically incorrect, even if it might happen to be benign in
> > >> most cases.
> > > 
> > > I still need to carry this patch in my trees to have a working dma memory
> > > mapping on SH4 platforms. My understanding from your comment is that
> > > there may be a way forward for this patch, do you still think the same?
> > > Have you got any suggestion on how to improve this eventually for
> > > inclusion?
> > 
> > As before, the change itself does seem reasonable; it might be worth
> > rewording the commit message in more general terms rather than making it
> > sound like an SH-specific workaround (which I really don't think it is),
> > but otherwise I'd say just repost it as a non-RFC patch.
> 
> I actually can't remember any better fix I would have in mind, so this looks 
> good to me :-) I agree with Robin, the commit message should be reworded. 
> Robin's explanation of why virt_to_page() should be postponed until we know 
> that cpu_addr represents memory that is guaranteed to be backed by a struct 
> page is a good starting point. You can mention SH4 as an example of an 
> architecture that will crash when calling virt_to_page() in such a case, but 
> the fix isn't specific to SH4.

Just checking since I joined late -- is the consensus that this does
not require any action/review specific to SH (in my role as maintainer
way behind on maintenance)?

Rich


Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()

2018-04-09 Thread Laurent Pinchart
Hi Rich,

On Monday, 9 April 2018 18:11:13 EEST Rich Felker wrote:
> On Mon, Apr 09, 2018 at 04:06:15PM +0300, Laurent Pinchart wrote:
> > On Monday, 9 April 2018 14:11:22 EEST Robin Murphy wrote:
> >> On 09/04/18 08:25, jacopo mondi wrote:
> >>> Hi Robin, Laurent,
> >>> 
> >>>  a long time passed, sorry about this.
> >>> 
> >>> On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote:
>  On 14/11/17 17:08, Jacopo Mondi wrote:
> > On SH4 architecture, with SPARSEMEM memory model, translating page
> > to pfn hangs the CPU. Post-pone translation to pfn after
> > dma_mmap_from_dev_coherent() function call as it succeeds and make
> > page translation not necessary.
> > 
> > This patch was suggested by Laurent Pinchart and he's working to
> > submit a proper fix mainline. Not sending for inclusion at the
> > moment.
>  
>  Y'know, I think this patch does have some merit by itself - until we
>  know that cpu_addr *doesn't* represent some device-private memory
>  which is not guaranteed to be backed by a struct page, calling
>  virt_to_page() on it is arguably semantically incorrect, even if it
>  might happen to be benign in most cases.
> >>> 
> >>> I still need to carry this patch in my trees to have a working dma
> >>> memory mapping on SH4 platforms. My understanding from your comment is
> >>> that there may be a way forward for this patch, do you still think the
> >>> same? Have you got any suggestion on how to improve this eventually
> >>> for inclusion?
> >> 
> >> As before, the change itself does seem reasonable; it might be worth
> >> rewording the commit message in more general terms rather than making it
> >> sound like an SH-specific workaround (which I really don't think it is),
> >> but otherwise I'd say just repost it as a non-RFC patch.
> > 
> > I actually can't remember any better fix I would have in mind, so this
> > looks good to me :-) I agree with Robin, the commit message should be
> > reworded. Robin's explanation of why virt_to_page() should be postponed
> > until we know that cpu_addr represents memory that is guaranteed to be
> > backed by a struct page is a good starting point. You can mention SH4 as
> > an example of an architecture that will crash when calling virt_to_page()
> > in such a case, but the fix isn't specific to SH4.
> 
> Just checking since I joined late -- is the consensus that this does
> not require any action/review specific to SH (in my role as maintainer
> way behind on maintenance)?

I don't think it requires any action specific to SH. If you want to review the 
patch in the context of SH though, please feel free to do so :-)

-- 
Regards,

Laurent Pinchart





Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()

2018-04-09 Thread Laurent Pinchart
Hello,

On Monday, 9 April 2018 14:11:22 EEST Robin Murphy wrote:
> On 09/04/18 08:25, jacopo mondi wrote:
> > Hi Robin, Laurent,
> > 
> >  a long time passed, sorry about this.
> > 
> > On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote:
> >> On 14/11/17 17:08, Jacopo Mondi wrote:
> >>> On SH4 architecture, with SPARSEMEM memory model, translating page to
> >>> pfn hangs the CPU. Post-pone translation to pfn after
> >>> dma_mmap_from_dev_coherent() function call as it succeeds and make page
> >>> translation not necessary.
> >>> 
> >>> This patch was suggested by Laurent Pinchart and he's working to submit
> >>> a proper fix mainline. Not sending for inclusion at the moment.
> >> 
> >> Y'know, I think this patch does have some merit by itself - until we know
> >> that cpu_addr *doesn't* represent some device-private memory which is not
> >> guaranteed to be backed by a struct page, calling virt_to_page() on it is
> >> arguably semantically incorrect, even if it might happen to be benign in
> >> most cases.
> > 
> > I still need to carry this patch in my trees to have a working dma memory
> > mapping on SH4 platforms. My understanding from your comment is that
> > there may be a way forward for this patch, do you still think the same?
> > Have you got any suggestion on how to improve this eventually for
> > inclusion?
> 
> As before, the change itself does seem reasonable; it might be worth
> rewording the commit message in more general terms rather than making it
> sound like an SH-specific workaround (which I really don't think it is),
> but otherwise I'd say just repost it as a non-RFC patch.

I actually can't remember any better fix I would have in mind, so this looks 
good to me :-) I agree with Robin, the commit message should be reworded. 
Robin's explanation of why virt_to_page() should be postponed until we know 
that cpu_addr represents memory that is guaranteed to be backed by a struct 
page is a good starting point. You can mention SH4 as an example of an 
architecture that will crash when calling virt_to_page() in such a case, but 
the fix isn't specific to SH4.

> >>> Suggested-by: Laurent Pinchart 
> >>> Signed-off-by: Jacopo Mondi 
> >>> ---
> >>> 
> >>>   drivers/base/dma-mapping.c | 3 ++-
> >>>   1 file changed, 2 insertions(+), 1 deletion(-)
> >>> 
> >>> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> >>> index e584edd..73d64d3 100644
> >>> --- a/drivers/base/dma-mapping.c
> >>> +++ b/drivers/base/dma-mapping.c
> >>> @@ -227,8 +227,8 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> >>>   #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP
> >>>   unsigned long user_count = vma_pages(vma);
> >>>   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> >>> - unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
> >>>   unsigned long off = vma->vm_pgoff;
> >>> + unsigned long pfn;
> >>> 
> >>>   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> >>> 
> >>> @@ -236,6 +236,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> >>>   return ret;
> >>>   
> >>>   if (off < count && user_count <= (count - off)) {
> >>> + pfn = page_to_pfn(virt_to_page(cpu_addr));
> >>>   ret = remap_pfn_range(vma, vma->vm_start,
> >>> pfn + off,
> >>> user_count << PAGE_SHIFT,

-- 
Regards,

Laurent Pinchart





Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()

2018-04-09 Thread Robin Murphy

Hi Jacopo,

On 09/04/18 08:25, jacopo mondi wrote:
> Hi Robin, Laurent,
>  a long time passed, sorry about this.
> 
> On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote:
>> On 14/11/17 17:08, Jacopo Mondi wrote:
>>> On SH4 architecture, with SPARSEMEM memory model, translating page to
>>> pfn hangs the CPU. Post-pone translation to pfn after
>>> dma_mmap_from_dev_coherent() function call as it succeeds and make page
>>> translation not necessary.
>>>
>>> This patch was suggested by Laurent Pinchart and he's working to submit
>>> a proper fix mainline. Not sending for inclusion at the moment.
>>
>> Y'know, I think this patch does have some merit by itself - until we know
>> that cpu_addr *doesn't* represent some device-private memory which is not
>> guaranteed to be backed by a struct page, calling virt_to_page() on it is
>> arguably semantically incorrect, even if it might happen to be benign in
>> most cases.
> 
> I still need to carry this patch in my trees to have a working dma memory
> mapping on SH4 platforms. My understanding from your comment is that there
> may be a way forward for this patch, do you still think the same? Have you
> got any suggestion on how to improve this eventually for inclusion?

As before, the change itself does seem reasonable; it might be worth
rewording the commit message in more general terms rather than making it
sound like an SH-specific workaround (which I really don't think it is),
but otherwise I'd say just repost it as a non-RFC patch.

Robin.

> Thanks
>    j
> 
>> Robin.
>>
>>> Suggested-by: Laurent Pinchart 
>>> Signed-off-by: Jacopo Mondi 
>>> ---
>>>   drivers/base/dma-mapping.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
>>> index e584edd..73d64d3 100644
>>> --- a/drivers/base/dma-mapping.c
>>> +++ b/drivers/base/dma-mapping.c
>>> @@ -227,8 +227,8 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
>>>   #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP
>>>   	unsigned long user_count = vma_pages(vma);
>>>   	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
>>> -	unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
>>>   	unsigned long off = vma->vm_pgoff;
>>> +	unsigned long pfn;
>>>
>>>   	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>>>
>>> @@ -236,6 +236,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
>>>   		return ret;
>>>
>>>   	if (off < count && user_count <= (count - off)) {
>>> +		pfn = page_to_pfn(virt_to_page(cpu_addr));
>>>   		ret = remap_pfn_range(vma, vma->vm_start,
>>>   				      pfn + off,
>>>   				      user_count << PAGE_SHIFT,
>>> --
>>> 2.7.4



dma-mapping: clear harmful GFP_* flags in common code

2018-04-09 Thread Alexey Brodkin
Hello,

May we have
e89f5b370153 ("dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrs")
back-ported to the 4.16 kernel, as it fixes:

57bf5a8 ("dma-mapping: clear harmful GFP_* flags in common code").

For more info about introduced problem see this thread:
http://lists.infradead.org/pipermail/linux-snps-arc/2018-March/003652.html

-Alexey


fix x86 swiotlb regression

2018-04-09 Thread Christoph Hellwig
Hi all,

this patch fixes a regression in the x86 swiotlb conversion.  This mostly
happened because swiotlb_dma_supported does the wrong thing (and did so for
a long time) and we switched x86 to use it.

There are a few others users of swiotlb_dma_supported that also look
rather broken, but I'll take care of those for the next merge window.


[PATCH] swiotlb: use dma_direct_supported for swiotlb_ops

2018-04-09 Thread Christoph Hellwig
swiotlb_alloc calls dma_direct_alloc, which can satisfy lower than 32-bit
dma mask requests using GFP_DMA if the architecture supports it.  Various
x86 drivers rely on that, so we need to support that.  At the same time
the whole kernel expects a 32-bit dma mask to just work, so the other magic
in swiotlb_dma_supported isn't actually needed either.

Reported-by: Dominik Brodowski 
Fixes: 6e4bf5867783 ("x86/dma: Use generic swiotlb_ops")
Signed-off-by: Christoph Hellwig 
---
 lib/swiotlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 47aeb04c1997..32aacd0d56a8 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -1087,6 +1087,6 @@ const struct dma_map_ops swiotlb_dma_ops = {
.unmap_sg   = swiotlb_unmap_sg_attrs,
.map_page   = swiotlb_map_page,
.unmap_page = swiotlb_unmap_page,
-   .dma_supported  = swiotlb_dma_supported,
+   .dma_supported  = dma_direct_supported,
 };
 #endif /* CONFIG_DMA_DIRECT_OPS */
-- 
2.16.3
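
To make the first sentence of the commit message concrete: a coherent mask
smaller than 32 bits forces allocations into low physical memory, which
dma_direct_alloc can only satisfy by retrying with GFP_DMA. A hedged,
hypothetical probe fragment -- not taken from any real driver, with made-up
names and sizes -- shows the kind of low-mask setup being talked about:

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/sizes.h>

/*
 * Hypothetical probe fragment: a legacy device that can only address 24
 * bits of DMA. dma_set_coherent_mask() consults the dma_map_ops
 * ->dma_supported hook, and dma_alloc_coherent() then has to place the
 * buffer below 16 MiB, which dma_direct_alloc() does via GFP_DMA.
 */
static int example_probe(struct device *dev)
{
	dma_addr_t dma_handle;
	void *buf;
	int ret;

	ret = dma_set_coherent_mask(dev, DMA_BIT_MASK(24));
	if (ret)
		return ret;		/* mask rejected by ->dma_supported */

	buf = dma_alloc_coherent(dev, SZ_4K, &dma_handle, GFP_KERNEL);
	if (!buf)
		return -ENOMEM;

	/* ... use buf / dma_handle, then release ... */
	dma_free_coherent(dev, SZ_4K, buf, dma_handle);
	return 0;
}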



Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()

2018-04-09 Thread jacopo mondi
Hi Robin, Laurent,
a long time passed, sorry about this.

On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote:
> On 14/11/17 17:08, Jacopo Mondi wrote:
> >On SH4 architecture, with SPARSEMEM memory model, translating page to
> >pfn hangs the CPU. Post-pone translation to pfn after
> >dma_mmap_from_dev_coherent() function call as it succeeds and make page
> >translation not necessary.
> >
> >This patch was suggested by Laurent Pinchart and he's working to submit
> >a proper fix mainline. Not sending for inclusion at the moment.
>
> Y'know, I think this patch does have some merit by itself - until we know
> that cpu_addr *doesn't* represent some device-private memory which is not
> guaranteed to be backed by a struct page, calling virt_to_page() on it is
> arguably semantically incorrect, even if it might happen to be benign in
> most cases.

I still need to carry this patch in my trees to have a working dma memory
mapping on SH4 platforms. My understanding from your comment is that there
may be a way forward for this patch, do you still think the same? Have you
got any suggestion on how to improve this eventually for inclusion?

Thanks
   j

>
> Robin.
>
> >Suggested-by: Laurent Pinchart 
> >Signed-off-by: Jacopo Mondi 
> >---
> >  drivers/base/dma-mapping.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> >index e584edd..73d64d3 100644
> >--- a/drivers/base/dma-mapping.c
> >+++ b/drivers/base/dma-mapping.c
> >@@ -227,8 +227,8 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> >  #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP
> > unsigned long user_count = vma_pages(vma);
> > unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> >-unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
> > unsigned long off = vma->vm_pgoff;
> >+unsigned long pfn;
> >
> > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> >
> >@@ -236,6 +236,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> > return ret;
> >
> > if (off < count && user_count <= (count - off)) {
> >+pfn = page_to_pfn(virt_to_page(cpu_addr));
> > ret = remap_pfn_range(vma, vma->vm_start,
> >   pfn + off,
> >   user_count << PAGE_SHIFT,
> >--
> >2.7.4
> >

