[RFC PATCH v7 2/5] iommu/dma: Add a new dma_map_ops of get_merge_boundary()

2019-06-20 Thread Yoshihiro Shimoda
This patch adds a new dma_map_ops of get_merge_boundary() to
expose the DMA merge boundary if the domain type is IOMMU_DOMAIN_DMA.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/iommu/dma-iommu.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 205d694..9950cb5 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1091,6 +1091,16 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
return ret;
 }
 
+static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
+{
+   struct iommu_domain *domain = iommu_get_dma_domain(dev);
+
+   if (domain->type != IOMMU_DOMAIN_DMA)
+   return 0;   /* can't merge */
+
+   return (1 << __ffs(domain->pgsize_bitmap)) - 1;
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
.alloc  = iommu_dma_alloc,
.free   = iommu_dma_free,
@@ -1106,6 +1116,7 @@ static const struct dma_map_ops iommu_dma_ops = {
.sync_sg_for_device = iommu_dma_sync_sg_for_device,
.map_resource   = iommu_dma_map_resource,
.unmap_resource = iommu_dma_unmap_resource,
+   .get_merge_boundary = iommu_dma_get_merge_boundary,
 };
 
 /*
-- 
2.7.4
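
As a worked example of the boundary computation in this patch (an
illustrative sketch, not part of the patch itself): __ffs() picks the
smallest supported IOMMU page size out of pgsize_bitmap, and the returned
mask is that page size minus one.

/* Standalone C illustration (not kernel code). */
#include <stdio.h>

int main(void)
{
	/* e.g. an IOMMU supporting 4 KiB, 2 MiB and 1 GiB pages */
	unsigned long pgsize_bitmap = 0x40201000UL;
	/* __builtin_ctzl() plays the role of the kernel's __ffs() */
	unsigned long boundary = (1UL << __builtin_ctzl(pgsize_bitmap)) - 1;

	printf("merge boundary mask = 0x%lx\n", boundary);	/* prints 0xfff */
	return 0;
}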



RE: [RFC PATCH v6 5/5] mmc: queue: Use bigger segments if IOMMU can merge the segments

2019-06-17 Thread Yoshihiro Shimoda
Hi Christoph,

> From: Christoph Hellwig, Sent: Monday, June 17, 2019 3:54 PM
> 
> On Mon, Jun 17, 2019 at 06:46:33AM +, Yoshihiro Shimoda wrote:
> > > can_merge seems a little too generic a name to me.  Maybe can_iommu_merge?
> >
> > I'll fix the name. Also, checking only the device_iommu_mapped() condition
> > will cause a problem on iommu=pt [1]. So, I'll add another condition here.
> 
> Instead of adding another condition here I think we need to properly
> abstract it out in the DMA layer.  E.g. have a

Thank you for your comment and sample code! I'll add such functions
on next patch series.

Best regards,
Yoshihiro Shimoda

> unsigned long dma_get_merge_boundary(struct device *dev)
> {
>   const struct dma_map_ops *ops = get_dma_ops(dev);
> 
>   if (!ops || !ops->get_merge_boundary)
>   return 0; /* can't merge */
>   return ops->get_merge_boundary(dev);
> }
> 
> and then implement the method in dma-iommu.c.
> 
> blk_queue_can_use_iommu_merging then comes:
> 
> bool blk_queue_enable_iommu_merging(struct request_queue *q,
>   struct device *dev)
> {
>   unsigned long boundary = dma_get_merge_boundary(dev);
> 
>   if (!boundary)
>   return false;
>   blk_queue_virt_boundary(q, boundary);
>   return true;
> }
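
As an aside, a hedged sketch of how a driver might consume the helpers
proposed above (hypothetical code: blk_queue_enable_iommu_merging() is only
Christoph's pseudocode, and MY_DRIVER_MAX_SEGS is a made-up software limit):

/* Sketch only, assuming the helper above exists as proposed. */
#include <linux/blkdev.h>
#include <linux/device.h>

#define MY_DRIVER_MAX_SEGS	128

static void my_driver_setup_queue(struct request_queue *q, struct device *dev)
{
	if (blk_queue_enable_iommu_merging(q, dev))
		/* merging active: allow a generous software segment count */
		blk_queue_max_segments(q, MY_DRIVER_MAX_SEGS);
	else
		/* no merging: fall back to the hardware limit (one segment here) */
		blk_queue_max_segments(q, 1);
}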


RE: [RFC PATCH v6 4/5] mmc: tmio: Use dma_max_mapping_size() instead of a workaround

2019-06-17 Thread Yoshihiro Shimoda
Hi Geert-san,

> From: Geert Uytterhoeven, Sent: Monday, June 17, 2019 3:23 PM
> 
> Hi Shimoda-san,
> 
> On Mon, Jun 17, 2019 at 6:54 AM Yoshihiro Shimoda
>  wrote:
> > > From: Geert Uytterhoeven, Sent: Friday, June 14, 2019 4:27 PM
> > > On Fri, Jun 14, 2019 at 9:18 AM Christoph Hellwig wrote:
> > > > On Thu, Jun 13, 2019 at 10:35:44PM +0200, Geert Uytterhoeven wrote:

> > > > This really should use a min_t on size_t.  Otherwise the patch looks
> > > > fine:
> > >
> > > Followed by another min() to make it fit in mmc->max_req_size, which is
> > > unsigned int.
> >
> > Geert-san:
> >
> > I'm afraid I cannot understand what this means.
> > Is this patch possible to go upstream? Or do you have any concern?
> 
> Please disregard my last comment: as the value of "mmc->max_blk_size *
> mmc->max_blk_count" is always 0xffff_ffff or less, "min_t(size_t,
> mmc->max_blk_size * mmc->max_blk_count, dma_max_mapping_size(&pdev->dev))"
> will always be 0xffff_ffff or less, too, so there is no extra step needed
> to make it fit in mmc->max_req_size.
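
To make the truncation hazard above concrete, a small standalone
illustration (plain C, not kernel code; min_t is re-derived here):

#include <stdio.h>
#include <stddef.h>

#define min_t(type, x, y) ((type)(x) < (type)(y) ? (type)(x) : (type)(y))

int main(void)
{
	size_t mapping_size = 0x100000000ULL;	/* e.g. a 4 GiB DMA limit */
	size_t req_size = 0xffffffffUL;		/* max_blk_size * max_blk_count */

	/* casting to unsigned int truncates 0x1_0000_0000 to 0 */
	printf("min_t(unsigned int, ...) = 0x%x\n",
	       min_t(unsigned int, req_size, mapping_size));
	/* size_t keeps both operands intact */
	printf("min_t(size_t, ...)       = 0x%zx\n",
	       min_t(size_t, req_size, mapping_size));
	return 0;
}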

Thank you for your prompt reply! I understood it.

> Sorry for the confusion.

No worries.

Best regards,
Yoshihiro Shimoda



RE: [RFC PATCH v6 5/5] mmc: queue: Use bigger segments if IOMMU can merge the segments

2019-06-17 Thread Yoshihiro Shimoda
Hi Christoph,

> From: Christoph Hellwig, Sent: Friday, June 14, 2019 4:25 PM
> 
> On Thu, Jun 13, 2019 at 07:20:15PM +0900, Yoshihiro Shimoda wrote:
> > +static unsigned int mmc_get_max_segments(struct mmc_host *host)
> > +{
> > +   return host->can_merge ? BLK_MAX_SEGMENTS : host->max_segs;
> > +}
> 
> Note that BLK_MAX_SEGMENTS is really a little misnamed, it just
> is a BLK_DEFAULT_SEGMENTS.  I think you are better off picking your
> own value here (even if 128 ends up ok) than reusing this somewhat
> confusing constant.

Thank you for your comments. I got it. I'll fix this.

> > +   /*
> > +* Since blk_mq_alloc_tag_set() calls .init_request() of mmc_mq_ops,
> > +* the host->can_merge should be set beforehand so that
> > +* mmc_get_max_segments() returns the correct max_segs.
> > +*/
> > +   if (host->max_segs < BLK_MAX_SEGMENTS &&
> > +   device_iommu_mapped(mmc_dev(host)))
> > +   host->can_merge = 1;
> > +   else
> > +   host->can_merge = 0;
> > +
> 
> can_merge seems a little too generic a name to me.  Maybe can_iommu_merge?

I'll fix the name. Also, checking only the device_iommu_mapped() condition will
cause a problem on iommu=pt [1]. So, I'll add another condition here.

[1]
https://marc.info/?l=linux-mmc&m=156050608709643&w=2

Best regards,
Yoshihiro Shimoda



RE: [RFC PATCH v6 5/5] mmc: queue: Use bigger segments if IOMMU can merge the segments

2019-06-17 Thread Yoshihiro Shimoda
Hi Wolfram-san,

> From: Wolfram Sang, Sent: Friday, June 14, 2019 4:59 AM
> 
> > -   blk_queue_max_segments(mq->queue, host->max_segs);
> > +   /* blk_queue_can_use_iommu_merging() should succeed if can_merge = 1 */
> > +   if (host->can_merge &&
> > +   !blk_queue_can_use_iommu_merging(mq->queue, mmc_dev(host)))
> > +   WARN_ON(1);
> > +   blk_queue_max_segments(mq->queue, mmc_get_max_segments(host));
> 
> Maybe we could use WARN here to save the comment and move the info to
> the printout?
> 
> - blk_queue_max_segments(mq->queue, host->max_segs);
> + if (host->can_merge)
> + WARN(!blk_queue_can_use_iommu_merging(mq->queue, mmc_dev(host)),
> +  "merging was advertised but not possible\n");
> + blk_queue_max_segments(mq->queue, mmc_get_max_segments(host));

Thank you for the suggestion. It's a good idea! I'll fix the patch.

Best regards,
Yoshihiro Shimoda



RE: [RFC PATCH v6 3/5] block: add a helper function to merge the segments by an IOMMU

2019-06-17 Thread Yoshihiro Shimoda
Hi Robin,

> From: Robin Murphy, Sent: Friday, June 14, 2019 6:55 PM
> 
> On 13/06/2019 11:20, Yoshihiro Shimoda wrote:

> > +bool blk_queue_can_use_iommu_merging(struct request_queue *q,
> > +struct device *dev)
> > +{
> > +   struct iommu_domain *domain;
> > +
> > +   /*
> > +* If the device DMA is translated by an IOMMU, we can assume
> > +* the device can merge the segments.
> > +*/
> > +   if (!device_iommu_mapped(dev))
> 
> Careful here - I think this validates the comment I made when this
> function was introduced, in that that name doesn't necessarily mean what
> it sounds like it might mean - "iommu_mapped" was as close as we managed
> to get to a convenient shorthand for "performs DMA through an
> IOMMU-API-enabled IOMMU". Specifically, it does not imply that
> translation is *currently* active; if you boot with "iommu=pt" or
> equivalent this will still return true even though the device will be
> using direct/SWIOTLB DMA ops without any IOMMU translation.

Thank you for your comments. I understood the meaning of "iommu_mapped" and
that this patch's condition causes a problem on iommu=pt.
So, I'll add an additional condition like
"domain->type == IOMMU_DOMAIN_DMA" to check whether the translation is
currently active on the domain or not.
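
A minimal sketch of that revised check (hypothetical code, not the posted
patch; it assumes the standard IOMMU API helpers):

#include <linux/device.h>
#include <linux/iommu.h>

static bool device_has_active_iommu_translation(struct device *dev)
{
	struct iommu_domain *domain;

	if (!device_iommu_mapped(dev))
		return false;

	domain = iommu_get_domain_for_dev(dev);
	/* with iommu=pt the domain is IOMMU_DOMAIN_IDENTITY, not DMA */
	return domain && domain->type == IOMMU_DOMAIN_DMA;
}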

Best regards,
Yoshihiro Shimoda



RE: [RFC PATCH v6 1/5] iommu: add an exported function to get minimum page size for a domain

2019-06-16 Thread Yoshihiro Shimoda
Hi Robin,

> From: Robin Murphy, Sent: Friday, June 14, 2019 6:41 PM
> 
> On 13/06/2019 11:20, Yoshihiro Shimoda wrote:
> > This patch adds an exported function to get minimum page size for
> > a domain. This patch also modifies similar code in iommu.c.
> 
> Heh, seeing this gave me a genuine déjà vu moment...
> 
> ...but it turns out I actually *have* reviewed this patch before :)
> 
> https://lore.kernel.org/lkml/05eca601-0264-8141-ceeb-7ef7ad5d5...@arm.com/

Thank you for the information :)
I realized my patch should have taken care of the CONFIG_IOMMU_API=n case.

However, the latest patch series below doesn't have such a patch. So, I'll keep
this patch in my next version.
https://lore.kernel.org/lkml/20190603011620.31999-1-baolu...@linux.intel.com/

Best regards,
Yoshihiro Shimoda

> Robin.
> 
> > Signed-off-by: Yoshihiro Shimoda 
> > ---
> >   drivers/iommu/iommu.c | 18 +++---
> >   include/linux/iommu.h |  1 +
> >   2 files changed, 16 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index 2a90638..7ed16af 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -280,6 +280,18 @@ iommu_insert_device_resv_regions(struct list_head *dev_resv_regions,
> > return ret;
> >   }
> >
> > +/**
> > + * iommu_get_minimum_page_size - get minimum page size for a domain
> > + * @domain: the domain
> > + *
> > + * Allow iommu driver to get a minimum page size for a domain.
> > + */
> > +unsigned long iommu_get_minimum_page_size(struct iommu_domain *domain)
> > +{
> > +   return 1UL << __ffs(domain->pgsize_bitmap);
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_get_minimum_page_size);
> > +
> >   int iommu_get_group_resv_regions(struct iommu_group *group,
> >  struct list_head *head)
> >   {
> > @@ -558,7 +570,7 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
> >
> > BUG_ON(!domain->pgsize_bitmap);
> >
> > -   pg_size = 1UL << __ffs(domain->pgsize_bitmap);
> > +   pg_size = iommu_get_minimum_page_size(domain);
> > INIT_LIST_HEAD(&mappings);
> >
> > iommu_get_resv_regions(dev, &mappings);
> > @@ -1595,7 +1607,7 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
> > return -EINVAL;
> >
> > /* find out the minimum page size supported */
> > -   min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> > +   min_pagesz = iommu_get_minimum_page_size(domain);
> >
> > /*
> >  * both the virtual address and the physical one, as well as
> > @@ -1655,7 +1667,7 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
> > return 0;
> >
> > /* find out the minimum page size supported */
> > -   min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> > +   min_pagesz = iommu_get_minimum_page_size(domain);
> >
> > /*
> >  * The virtual address, as well as the size of the mapping, must be
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 91af22a..7e53b43 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -366,6 +366,7 @@ extern int iommu_request_dma_domain_for_dev(struct device *dev);
> >   extern struct iommu_resv_region *
> >   iommu_alloc_resv_region(phys_addr_t start, size_t length, int prot,
> > enum iommu_resv_type type);
> > +extern unsigned long iommu_get_minimum_page_size(struct iommu_domain *domain);
> >   extern int iommu_get_group_resv_regions(struct iommu_group *group,
> > struct list_head *head);
> >
> >


RE: [RFC PATCH v6 1/5] iommu: add an exported function to get minimum page size for a domain

2019-06-16 Thread Yoshihiro Shimoda
Hi Wolfram-san,

> From: Wolfram Sang, Sent: Friday, June 14, 2019 4:38 AM
> 
> On Thu, Jun 13, 2019 at 07:20:11PM +0900, Yoshihiro Shimoda wrote:
> > This patch adds an exported function to get minimum page size for
> > a domain. This patch also modifies similar code in iommu.c.
> >
> > Signed-off-by: Yoshihiro Shimoda 
> > ---
> >  drivers/iommu/iommu.c | 18 +++---
> >  include/linux/iommu.h |  1 +
> >  2 files changed, 16 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index 2a90638..7ed16af 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -280,6 +280,18 @@ iommu_insert_device_resv_regions(struct list_head *dev_resv_regions,
> > return ret;
> >  }
> >
> > +/**
> > + * iommu_get_minimum_page_size - get minimum page size for a domain
> > + * @domain: the domain
> > + *
> > + * Allow iommu driver to get a minimum page size for a domain.
> > + */
> > +unsigned long iommu_get_minimum_page_size(struct iommu_domain *domain)
> > +{
> > +   return 1UL << __ffs(domain->pgsize_bitmap);
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_get_minimum_page_size);
> 
> What about making this a 'static inline' in the iommu header file? I'd
> think it is simple enough and would save us the EXPORT symbol.

Thank you for the review. I think so. I'll use 'static inline' instead of
EXPORT symbol.

Best regards,
Yoshihiro Shimoda



RE: [RFC PATCH v6 4/5] mmc: tmio: Use dma_max_mapping_size() instead of a workaround

2019-06-16 Thread Yoshihiro Shimoda
Hi Geert, Christoph,

Thank you for your comments!

> From: Geert Uytterhoeven, Sent: Friday, June 14, 2019 4:27 PM
> 
> Hi Christoph,
> 
> On Fri, Jun 14, 2019 at 9:18 AM Christoph Hellwig wrote:
> > On Thu, Jun 13, 2019 at 10:35:44PM +0200, Geert Uytterhoeven wrote:
> > > I'm always triggered by the use of min_t() and other casts:
> > > mmc->max_blk_size and mmc->max_blk_count are both unsigned int.
> > > dma_max_mapping_size() returns size_t, which can be 64-bit.
> > >
> > >  1) Can the multiplication overflow?
> > > Probably not, as per commit 2a55c1eac7882232 ("mmc: renesas_sdhi:
> > > prevent overflow for max_req_size"), but I thought I'd better ask.

Geert-san:

I agree.

> > >  2) In theory, dma_max_mapping_size() can return a number that doesn't
> > > fit in 32-bit, and will be truncated (to e.g. 0), leading to
> > > max_req_size is zero?

Geert-san:

I agree. If dma_max_mapping_size() returns 0x1_0000_0000, it will be truncated
to 0, and then max_req_size is set to zero. It is a problem. Also, the second
argument "mmc->max_blk_size * mmc->max_blk_count" will not overflow, and the
value is 0xffff_ffff or less. So, I also think this should use size_t instead
of unsigned int.

> > This really should use a min_t on size_t.  Otherwise the patch looks
> > fine:
> 
> Followed by another min() to make it fit in mmc->max_req_size, which is
> unsigned int.

Geert-san:

I'm afraid I cannot understand what this means.
Is this patch possible to go upstream? Or do you have any concern?


Best regards,
Yoshihiro Shimoda



RE: [RFC PATCH v6 4/5] mmc: tmio: Use dma_max_mapping_size() instead of a workaround

2019-06-16 Thread Yoshihiro Shimoda
Hi Wolfram-san,

> From: Wolfram Sang, Sent: Friday, June 14, 2019 4:46 AM
> 
> On Thu, Jun 13, 2019 at 07:20:14PM +0900, Yoshihiro Shimoda wrote:
> > Since the commit 133d624b1cee ("dma: Introduce dma_max_mapping_size()")
> > provides a helper function to get the max mapping size, we can use
> > the function instead of the workaround code for swiotlb.
> >
> > Signed-off-by: Yoshihiro Shimoda 
> 
> I love it! I'd really like to see this code go away. Do I get this right
> that this patch is kinda independent of the rest of the series? Anyway:

Thank you for your suggestion! I think so (because IOMMU and block patches seem
to need updates). I'll submit the 3/5 and 4/5 patches independently later.

> Acked-by: Wolfram Sang 

Thank you for your Acked-by!

Best regards,
Yoshihiro Shimoda



[RFC PATCH v6 2/5] block: sort headers on blk-settings.c

2019-06-13 Thread Yoshihiro Shimoda
This patch sorts the headers in alphabetical order to ease
the maintenance of this part.

Signed-off-by: Yoshihiro Shimoda 
---
 block/blk-settings.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 2ae348c..45f2c52 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -2,16 +2,16 @@
 /*
  * Functions related to setting various queue properties from drivers
  */
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/init.h>
 #include <linux/bio.h>
 #include <linux/blkdev.h>
-#include <linux/memblock.h>	/* for max_pfn/max_low_pfn */
 #include <linux/gcd.h>
-#include <linux/lcm.h>
-#include <linux/jiffies.h>
 #include <linux/gfp.h>
+#include <linux/init.h>
+#include <linux/jiffies.h>
+#include <linux/kernel.h>
+#include <linux/lcm.h>
+#include <linux/memblock.h>	/* for max_pfn/max_low_pfn */
+#include <linux/module.h>
 
 #include "blk.h"
 #include "blk-wbt.h"
-- 
2.7.4



[RFC PATCH v6 3/5] block: add a helper function to merge the segments by an IOMMU

2019-06-13 Thread Yoshihiro Shimoda
This patch adds a helper function to tell whether a queue can merge
the segments by an IOMMU.

Signed-off-by: Yoshihiro Shimoda 
---
 block/blk-settings.c   | 28 
 include/linux/blkdev.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 45f2c52..4e4e13e 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -4,9 +4,11 @@
  */
 #include <linux/bio.h>
 #include <linux/blkdev.h>
+#include <linux/device.h>
 #include <linux/gcd.h>
 #include <linux/gfp.h>
 #include <linux/init.h>
+#include <linux/iommu.h>
 #include <linux/jiffies.h>
 #include <linux/kernel.h>
 #include <linux/lcm.h>
@@ -831,6 +833,32 @@ void blk_queue_write_cache(struct request_queue *q, bool wc, bool fua)
 }
 EXPORT_SYMBOL_GPL(blk_queue_write_cache);
 
+/**
+ * blk_queue_can_use_iommu_merging - configure queue for merging segments.
+ * @q: the request queue for the device
+ * @dev:   the device pointer for dma
+ *
+ * Tell the block layer about the iommu merging of @q.
+ */
+bool blk_queue_can_use_iommu_merging(struct request_queue *q,
+struct device *dev)
+{
+   struct iommu_domain *domain;
+
+   /*
+* If the device DMA is translated by an IOMMU, we can assume
+* the device can merge the segments.
+*/
+   if (!device_iommu_mapped(dev))
+   return false;
+
+   domain = iommu_get_domain_for_dev(dev);
+   /* No need to update max_segment_size. see blk_queue_virt_boundary() */
+   blk_queue_virt_boundary(q, iommu_get_minimum_page_size(domain) - 1);
+
+   return true;
+}
+
 static int __init blk_settings_init(void)
 {
blk_max_low_pfn = max_low_pfn - 1;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 592669b..4d1f7dc 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
extern void blk_queue_dma_alignment(struct request_queue *, int);
 extern void blk_queue_update_dma_alignment(struct request_queue *, int);
 extern void blk_queue_rq_timeout(struct request_queue *, unsigned int);
extern void blk_queue_write_cache(struct request_queue *q, bool enabled, bool fua);
+extern bool blk_queue_can_use_iommu_merging(struct request_queue *q,
+   struct device *dev);
 
 /*
  * Number of physical segments as sent to the device.
-- 
2.7.4



[RFC PATCH v6 5/5] mmc: queue: Use bigger segments if IOMMU can merge the segments

2019-06-13 Thread Yoshihiro Shimoda
If max_segs of an mmc host is smaller than BLK_MAX_SEGMENTS,
the mmc subsystem tries to use bigger segments with the help of the
IOMMU subsystem, and then the mmc subsystem exposes this information
to the block layer by using blk_queue_can_use_iommu_merging().

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/mmc/core/queue.c | 33 +
 include/linux/mmc/host.h |  1 +
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index b5b9c61..59d7606 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -196,6 +196,11 @@ static void mmc_queue_setup_discard(struct request_queue *q,
blk_queue_flag_set(QUEUE_FLAG_SECERASE, q);
 }
 
+static unsigned int mmc_get_max_segments(struct mmc_host *host)
+{
+   return host->can_merge ? BLK_MAX_SEGMENTS : host->max_segs;
+}
+
 /**
  * mmc_init_request() - initialize the MMC-specific per-request data
  * @q: the request queue
@@ -209,7 +214,7 @@ static int __mmc_init_request(struct mmc_queue *mq, struct request *req,
struct mmc_card *card = mq->card;
struct mmc_host *host = card->host;
 
-   mq_rq->sg = mmc_alloc_sg(host->max_segs, gfp);
+   mq_rq->sg = mmc_alloc_sg(mmc_get_max_segments(host), gfp);
if (!mq_rq->sg)
return -ENOMEM;
 
@@ -368,15 +373,24 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
blk_queue_bounce_limit(mq->queue, limit);
blk_queue_max_hw_sectors(mq->queue,
min(host->max_blk_count, host->max_req_size / 512));
-   blk_queue_max_segments(mq->queue, host->max_segs);
+   /* blk_queue_can_use_iommu_merging() should succeed if can_merge = 1 */
+   if (host->can_merge &&
+   !blk_queue_can_use_iommu_merging(mq->queue, mmc_dev(host)))
+   WARN_ON(1);
+   blk_queue_max_segments(mq->queue, mmc_get_max_segments(host));
 
if (mmc_card_mmc(card))
block_size = card->ext_csd.data_sector_size;
 
blk_queue_logical_block_size(mq->queue, block_size);
-   blk_queue_max_segment_size(mq->queue,
+   /*
+* If blk_queue_can_use_iommu_merging() succeeded, it has already
+* called blk_queue_virt_boundary() for the IOMMU, so the mmc should
+* not call blk_queue_max_segment_size().
+*/
+   if (!host->can_merge)
+   blk_queue_max_segment_size(mq->queue,
round_down(host->max_seg_size, block_size));
-
INIT_WORK(>recovery_work, mmc_mq_recovery_handler);
INIT_WORK(>complete_work, mmc_blk_mq_complete_work);
 
@@ -422,6 +436,17 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card)
mq->tag_set.cmd_size = sizeof(struct mmc_queue_req);
mq->tag_set.driver_data = mq;
 
+   /*
+* Since blk_mq_alloc_tag_set() calls .init_request() of mmc_mq_ops,
+* the host->can_merge should be set beforehand so that
+* mmc_get_max_segments() returns the correct max_segs.
+*/
+   if (host->max_segs < BLK_MAX_SEGMENTS &&
+   device_iommu_mapped(mmc_dev(host)))
+   host->can_merge = 1;
+   else
+   host->can_merge = 0;
+
ret = blk_mq_alloc_tag_set(>tag_set);
if (ret)
return ret;
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 43d0f0c..84b9bef 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -398,6 +398,7 @@ struct mmc_host {
unsigned intretune_now:1;   /* do re-tuning at next req */
unsigned intretune_paused:1; /* re-tuning is temporarily disabled */
unsigned intuse_blk_mq:1;   /* use blk-mq */
+   unsigned intcan_merge:1;/* merging can be used */
 
int rescan_disable; /* disable card detection */
int rescan_entered; /* used with nonremovable devices */
-- 
2.7.4



[RFC PATCH v6 4/5] mmc: tmio: Use dma_max_mapping_size() instead of a workaround

2019-06-13 Thread Yoshihiro Shimoda
Since the commit 133d624b1cee ("dma: Introduce dma_max_mapping_size()")
provides a helper function to get the max mapping size, we can use
the function instead of the workaround code for swiotlb.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/mmc/host/tmio_mmc_core.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index 130b91c..85bd6aa6 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -26,6 +26,7 @@
 
 #include <linux/delay.h>
 #include <linux/device.h>
+#include <linux/dma-mapping.h>
 #include <linux/highmem.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
@@ -1189,19 +1190,9 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host)
mmc->max_blk_size = TMIO_MAX_BLK_SIZE;
mmc->max_blk_count = pdata->max_blk_count ? :
(PAGE_SIZE / mmc->max_blk_size) * mmc->max_segs;
-   mmc->max_req_size = mmc->max_blk_size * mmc->max_blk_count;
-   /*
-* Since swiotlb has memory size limitation, this will calculate
-* the maximum size locally (because we don't have any APIs for it now)
-* and check the current max_req_size. And then, this will update
-* the max_req_size if needed as a workaround.
-*/
-   if (swiotlb_max_segment()) {
-   unsigned int max_size = (1 << IO_TLB_SHIFT) * IO_TLB_SEGSIZE;
-
-   if (mmc->max_req_size > max_size)
-   mmc->max_req_size = max_size;
-   }
+   mmc->max_req_size = min_t(unsigned int,
+ mmc->max_blk_size * mmc->max_blk_count,
+ dma_max_mapping_size(&pdev->dev));
mmc->max_seg_size = mmc->max_req_size;
 
if (mmc_can_gpio_ro(mmc))
-- 
2.7.4



[RFC PATCH v6 1/5] iommu: add an exported function to get minimum page size for a domain

2019-06-13 Thread Yoshihiro Shimoda
This patch adds an exported function to get minimum page size for
a domain. This patch also modifies similar code in iommu.c.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/iommu/iommu.c | 18 +++---
 include/linux/iommu.h |  1 +
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2a90638..7ed16af 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -280,6 +280,18 @@ iommu_insert_device_resv_regions(struct list_head *dev_resv_regions,
return ret;
 }
 
+/**
+ * iommu_get_minimum_page_size - get minimum page size for a domain
+ * @domain: the domain
+ *
+ * Allow iommu driver to get a minimum page size for a domain.
+ */
+unsigned long iommu_get_minimum_page_size(struct iommu_domain *domain)
+{
+   return 1UL << __ffs(domain->pgsize_bitmap);
+}
+EXPORT_SYMBOL_GPL(iommu_get_minimum_page_size);
+
 int iommu_get_group_resv_regions(struct iommu_group *group,
 struct list_head *head)
 {
@@ -558,7 +570,7 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
 
BUG_ON(!domain->pgsize_bitmap);
 
-   pg_size = 1UL << __ffs(domain->pgsize_bitmap);
+   pg_size = iommu_get_minimum_page_size(domain);
INIT_LIST_HEAD(&mappings);
 
iommu_get_resv_regions(dev, &mappings);
@@ -1595,7 +1607,7 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
return -EINVAL;
 
/* find out the minimum page size supported */
-   min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+   min_pagesz = iommu_get_minimum_page_size(domain);
 
/*
 * both the virtual address and the physical one, as well as
@@ -1655,7 +1667,7 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
return 0;
 
/* find out the minimum page size supported */
-   min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+   min_pagesz = iommu_get_minimum_page_size(domain);
 
/*
 * The virtual address, as well as the size of the mapping, must be
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 91af22a..7e53b43 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -366,6 +366,7 @@ extern int iommu_request_dma_domain_for_dev(struct device *dev);
 extern struct iommu_resv_region *
 iommu_alloc_resv_region(phys_addr_t start, size_t length, int prot,
enum iommu_resv_type type);
+extern unsigned long iommu_get_minimum_page_size(struct iommu_domain *domain);
 extern int iommu_get_group_resv_regions(struct iommu_group *group,
struct list_head *head);
 
-- 
2.7.4



[RFC PATCH v6 0/5] treewide: improve R-Car SDHI performance

2019-06-13 Thread Yoshihiro Shimoda
This patch series is based on iommu.git / next branch.

Since the SDHI host internal DMAC of R-Car Gen3 cannot handle two or
more segments, the performance (especially for eMMC HS400 reading)
is not good. However, if an IOMMU is enabled on the DMAC, since the IOMMU
maps multiple scatter-gather buffers as one contiguous iova, the DMAC can
handle that iova, and then the performance can improve. In fact, I have
measured the performance by using bonnie++: the "Sequential Input - block"
rate was improved on r8a7795.

To achieve this, this patch series modifies the IOMMU and block subsystems
first. Since I'd like feedback from each subsystem on whether this approach
is acceptable for upstream, I'm submitting it treewide as an RFC.

Changes from v5:
 - Almost all patches are new code.
 - [4/5 for MMC] This is a refactor patch so that I don't add any
   {Tested,Reviewed}-by tags.
 - [5/5 for MMC] Modify MMC subsystem to use bigger segments instead of
   the renesas_sdhi driver.
 - [5/5 for MMC] Use BLK_MAX_SEGMENTS (128) instead of local value
   SDHI_MAX_SEGS_IN_IOMMU (512). Even if we use BLK_MAX_SEGMENTS,
   the performance is still good.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=127511

Changes from v4:
 - [DMA MAPPING] Add a new device_dma_parameters for iova contiguous.
 - [IOMMU] Add a new capable for "merging" segments.
 - [IOMMU] Add a capable ops into the ipmmu-vmsa driver.
 - [MMC] Sort headers in renesas_sdhi_core.c.
 - [MMC] Remove the following codes that made on v3 that can be achieved by
 DMA MAPPING and IOMMU subsystem:
 -- Check if R-Car Gen3 IPMMU is used or not on patch 3.
 -- Check if all multiple segment buffers are aligned to PAGE_SIZE on patch 3.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=125593

Changes from v3:
 - Use a helper function device_iommu_mapped on patch 1 and 3.
 - Check if R-Car Gen3 IPMMU is used or not on patch 3.
 - Check if all multiple segment buffers are aligned to PAGE_SIZE on patch 3.
 - Add Reviewed-by Wolfram-san on patch 1 and 2. Note that I also got his
   Reviewed-by on patch 3, but I changed it from v2. So, I didn't add
   his Reviewed-by at this time.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=120985

Changes from v2:
 - Add some conditions in the init_card().
 - Add a comment in the init_card().
 - Add definitions for some "MAX_SEGS".
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=116729

Changes from v1:
 - Remove adding init_card ops into struct tmio_mmc_dma_ops and
   tmio_mmc_host and just set init_card on renesas_sdhi_core.c.
 - Revise typos on "mmc: tmio: No memory size limitation if runs on IOMMU".
 - Add Simon-san's Reviewed-by on a tmio patch.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=110485

Yoshihiro Shimoda (5):
  iommu: add an exported function to get minimum page size for a domain
  block: sort headers on blk-settings.c
  block: add a helper function to merge the segments by an IOMMU
  mmc: tmio: Use dma_max_mapping_size() instead of a workaround
  mmc: queue: Use bigger segments if IOMMU can merge the segments

 block/blk-settings.c | 40 ++--
 drivers/iommu/iommu.c| 18 +++---
 drivers/mmc/core/queue.c | 33 +
 drivers/mmc/host/tmio_mmc_core.c | 17 -
 include/linux/blkdev.h   |  2 ++
 include/linux/iommu.h|  1 +
 include/linux/mmc/host.h |  1 +
 7 files changed, 86 insertions(+), 26 deletions(-)

-- 
2.7.4



RE: How to resolve an issue in swiotlb environment?

2019-06-12 Thread Yoshihiro Shimoda
Hi Christoph,

> From: Christoph Hellwig, Sent: Wednesday, June 12, 2019 8:31 PM
> 
> On Wed, Jun 12, 2019 at 08:52:21AM +, Yoshihiro Shimoda wrote:
> > Hi Christoph,
> >
> > > From: Christoph Hellwig, Sent: Wednesday, June 12, 2019 4:31 PM
> > >
> > > First things first:
> > >
> > > Yoshihiro, can you try this git branch?  The new bits are just the three
> > > patches at the end, but they sit on top of a few patches already sent
> > > out to the list, so a branch is probably either:
> > >
> > >git://git.infradead.org/users/hch/misc.git scsi-virt-boundary-fixes
> >
> > Thank you for the patches!
> > Unfortunately, the three patches could not resolve this issue.
> > However, it's a hint to me, and then I found the root cause:
> >  - slave_configure() in drivers/usb/storage/scsiglue.c calls
> >blk_queue_max_hw_sectors() with 2048 sectors (1 MiB) when 
> > USB_SPEED_SUPER or more.
> >  -- So that, even if your patches (also I fixed it a little [1]) could not 
> > resolve
> > the issue because the max_sectors is overwritten by above code.
> >
> > So, I think we should fix the slave_configure() by using 
> > dma_max_mapping_size().
> > What do you think? If so, I can make such a patch.
> 
> Yes, please do.

Thank you for your comment. I sent a patch to related mailing lists and you.

Best regards,
Yoshihiro Shimoda



RE: How to resolve an issue in swiotlb environment?

2019-06-12 Thread Yoshihiro Shimoda
Hi Christoph,

> From: Christoph Hellwig, Sent: Wednesday, June 12, 2019 4:31 PM
> 
> First things first:
> 
> Yoshihiro, can you try this git branch?  The new bits are just the three
> patches at the end, but they sit on top of a few patches already sent
> out to the list, so a branch is probably either:
> 
>git://git.infradead.org/users/hch/misc.git scsi-virt-boundary-fixes

Thank you for the patches!
Unfortunately, the three patches could not resolve this issue.
However, it's a hint to me, and then I found the root cause:
 - slave_configure() in drivers/usb/storage/scsiglue.c calls
   blk_queue_max_hw_sectors() with 2048 sectors (1 MiB) when USB_SPEED_SUPER or more.
 -- So, even your patches (which I also fixed a little [1]) could not resolve
   the issue, because max_sectors is overwritten by the above code.

So, I think we should fix the slave_configure() by using dma_max_mapping_size().
What do you think? If so, I can make such a patch.

[1]
In the "scsi: take the DMA max mapping size into account" patch,
+   shost->max_sectors = min_t(unsigned int, shost->max_sectors,
+   dma_max_mapping_size(dev) << SECTOR_SHIFT);

it should be:
+   dma_max_mapping_size(dev) >> SECTOR_SHIFT);

But, if we fix the slave_configure(), we don't need this patch, IIUC.

Best regards,
Yoshihiro Shimoda



RE: How to resolve an issue in swiotlb environment?

2019-06-11 Thread Yoshihiro Shimoda
Hi Christoph, Alan,

> From: Alan Stern, Sent: Tuesday, June 11, 2019 3:46 AM
> 
> On Mon, 10 Jun 2019, Christoph Hellwig wrote:
> 
> > Hi Yoshihiro,
> >
> > sorry for not taking care of this earlier, today is a public holiday
> > here and thus I'm not working much over the long weekend.

To Christoph:

No worries.

> > On Mon, Jun 10, 2019 at 11:13:07AM +, Yoshihiro Shimoda wrote:
> > > I have another way to avoid the issue. But it doesn't seem that a good way though...
> > > According to the commit that added blk_queue_virt_boundary() [3],
> > > this is needed for vhci_hcd as a workaround, so if we avoid calling it
> > > in the xhci-hcd driver, the issue disappears. What do you think?
> > > JFYI, I pasted a tentative patch at the end of the email [4].
> >
> > Oh, I hadn't even looked at why USB uses blk_queue_virt_boundary, and it
> > seems like the usage is wrong, as it doesn't follow the same rules as
> > all the others.  I think your patch goes in the right direction,
> > but instead of comparing a hcd name it needs to be keyed off a flag
> > set by the driver (I suspect there is one indicating native SG support,
> > but I can't quickly find it), and we need an alternative solution
> > for drivers that don't see like vhci.  I suspect just limiting the
> > entire transfer size to something that works for a single packet
> > for them would be fine.
> 
> Christoph:
> 
> In most of the different kinds of USB host controllers, the hardware is
> not capable of assembling a packet out of multiple buffers at arbitrary
> addresses.  As a matter of fact, xHCI is the only kind that _can_ do
> this.
> 
> In some cases, the hardware can assemble packets provided each buffer
> other than the last ends at a page boundary and each buffer other than
> the first starts at a page boundary (Intel would say the buffers are
> "virtually contiguous"), but this is a rather complex rule and we don't
> want to rely on it.  Plus, in other cases the hardware _can't_ do this.
> 
> Instead, we want the SG buffers to be set up so that each one (except
> the last) is an exact multiple of the maximum packet size.  That way,
> each packet can be assembled from the contents of a single buffer and
> there's no problem.

This is out of the topic though: if we prepare such an exact multiple
of the maximum packet size (1024, 512 or 64), is it possible to cause
trouble in an IOMMU environment? IIUC, dma_map_sg() maps SG buffers as
a single segment, and then the segment buffer is not contiguous.

> The maximum packet size depends on the type of USB connection.
> Typical values are 1024, 512, or 64.  It's always a power of two and
> it's smaller than 4096.  Therefore we simplify the problem even further
> by requiring that each SG buffer in a scatterlist (except the last one)
> be a multiple of the page size.  (It doesn't need to be aligned on a
> page boundary, as far as I remember.)
> 
> That's why the blk_queue_virt_boundary usage was added to the USB code.
> Perhaps it's not the right way of doing this; I'm not an expert on the
> inner workings of the block layer.  If you can suggest a better way to
> express our requirement, that would be great.

Since I'm also not familiar with the block layer, I could not find a better
way...

Best regards,
Yoshihiro Shimoda

> Alan Stern
> 
> PS: There _is_ a flag saying whether an HCD supports SG.  But what it
> means is that the driver can handle an SG list that meets the
> requirement above; it doesn't mean that the driver can reassemble the
> data from an SG list into a series of bounce buffers in order to meet
> the requirement.  We very much want not to do that, especially since
> the block layer should already be capable of doing it for us.



RE: How to resolve an issue in swiotlb environment?

2019-06-10 Thread Yoshihiro Shimoda
Hi Christoph, Alan,
(add linux-usb ML on CC.)

> From: Yoshihiro Shimoda, Sent: Friday, June 7, 2019 9:00 PM
> 
> Hi Christoph,
> 
> I think we should continue to discuss on this email thread instead of the 
> fixed DMA-API.txt patch [1]
> 
> [1]
https://marc.info/?t=15598941221&r=1&w=2
> 
> > From: Yoshihiro Shimoda, Sent: Monday, June 3, 2019 3:42 PM
> >
> > Hi linux-block and iommu mailing lists,
> >
> > I have an issue that a USB SSD with xHCI on R-Car H3 causes "swiotlb is full" like below.
> >
> > [   36.745286] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 524288 bytes), total 32768 (slots), used 1338 (slots)
> >
> > I have investigated this issue by using git bisect, and then I found the 
> > following commit:
> >
> > ---
> > commit 09324d32d2a0843e66652a087da6f77924358e62
> > Author: Christoph Hellwig 
> > Date:   Tue May 21 09:01:41 2019 +0200
> >
> > block: force an unlimited segment size on queues with a virt boundary
> > ---
> 
> Thank you for your comment on other email thread [2] like below:
> ---
> Turns out it isn't as simple as I thought, as there doesn't seem to
> be an easy way to get to the struct device used for DMA mapping
> from USB drivers.  I'll need to think a bit more how to handle that
> best.
> ---
> 
> [2]
https://marc.info/?l=linux-doc&m=155989651620473&w=2

I have another way to avoid the issue. But it doesn't seem that a good way though...
According to the commit that added blk_queue_virt_boundary() [3],
this is needed for vhci_hcd as a workaround, so if we avoid calling it
in the xhci-hcd driver, the issue disappears. What do you think?
JFYI, I pasted a tentative patch at the end of this email [4].

---
[3]
commit 747668dbc061b3e62bc1982767a3a1f9815fcf0e
Author: Alan Stern 
Date:   Mon Apr 15 13:19:25 2019 -0400

usb-storage: Set virt_boundary_mask to avoid SG overflows
---
[4]
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 59190d8..277c6f7e 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -30,6 +30,8 @@
 
 #include <linux/module.h>
 #include <linux/mutex.h>
+#include <linux/usb.h>
+#include <linux/usb/hcd.h>
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
@@ -65,6 +67,7 @@ static const char* host_info(struct Scsi_Host *host)
 static int slave_alloc (struct scsi_device *sdev)
 {
struct us_data *us = host_to_us(sdev->host);
+   struct usb_hcd *hcd = bus_to_hcd(us->pusb_dev->bus);
int maxp;
 
/*
@@ -80,8 +83,10 @@ static int slave_alloc (struct scsi_device *sdev)
 * Bulk maxpacket value.  Fortunately this value is always a
 * power of 2.  Inform the block layer about this requirement.
 */
-   maxp = usb_maxpacket(us->pusb_dev, us->recv_bulk_pipe, 0);
-   blk_queue_virt_boundary(sdev->request_queue, maxp - 1);
+   if (!strcmp(hcd->driver->description, "vhci_hcd")) {
+   maxp = usb_maxpacket(us->pusb_dev, us->recv_bulk_pipe, 0);
+   blk_queue_virt_boundary(sdev->request_queue, maxp - 1);
+   }
 
/*
 * Some host controllers may have alignment requirements.
---
Best regards,
Yoshihiro Shimoda



RE: [PATCH] Documentation: DMA-API: fix a function name of max_mapping_size

2019-06-07 Thread Yoshihiro Shimoda
Hi Christoph,

> From: Christoph Hellwig, Sent: Friday, June 7, 2019 5:35 PM
> 
> On Fri, Jun 07, 2019 at 08:19:08AM +, Yoshihiro Shimoda wrote:
> > Hi Christoph,
> >
> > > From: Christoph Hellwig, Sent: Friday, June 7, 2019 5:08 PM
> > >
> > > Looks good.  And it seems like you've also found the solution to
> > > your usb storage problem, but I'm going to post the variant I just
> > > hacked up nevertheless.
> >
> > Thank you for your reply! I think this API is related to my problem,
> > but I don't have any actual solution (a patch) for now. So, I'll wait
> > for your patch!
> 
> Turns out it isn't as simple as I thought, as there doesn't seem to
> be an easy way to get to the struct device used for DMA mapping
> from USB drivers.  I'll need to think a bit more how to handle that
> best.

Thank you for your reply. I sent an email on the original report as below.
https://marc.info/?l=linux-block&m=155990883224615&w=2

Best regards,
Yoshihiro Shimoda



RE: How to resolve an issue in swiotlb environment?

2019-06-07 Thread Yoshihiro Shimoda
Hi Christoph,

I think we should continue to discuss on this email thread instead of the fixed 
DMA-API.txt patch [1]

[1]
https://marc.info/?t=15598941221&r=1&w=2

> From: Yoshihiro Shimoda, Sent: Monday, June 3, 2019 3:42 PM
> 
> Hi linux-block and iommu mailing lists,
> 
> I have an issue that a USB SSD with xHCI on R-Car H3 causes "swiotlb is full" 
> like below.
> 
> [   36.745286] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 524288 bytes), total 32768 (slots), used 1338 (slots)
> 
> I have investigated this issue by using git bisect, and then I found the 
> following commit:
> 
> ---
> commit 09324d32d2a0843e66652a087da6f77924358e62
> Author: Christoph Hellwig 
> Date:   Tue May 21 09:01:41 2019 +0200
> 
> block: force an unlimited segment size on queues with a virt boundary
> ---

Thank you for your comment on other email thread [2] like below:
---
Turns out it isn't as simple as I thought, as there doesn't seem to
be an easy way to get to the struct device used for DMA mapping
from USB drivers.  I'll need to think a bit more how to handle that
best.
---

[2]
https://marc.info/?l=linux-doc&m=155989651620473&w=2

I'm not sure this is a correct way, but the issue disappears if I apply the
patch below to the USB storage driver. Notably, a WARNING happened on
blk_queue_max_segment_size().
Maybe we need to expand the argument "struct device *" of
blk_queue_virt_boundary() to call dma_max_mapping_size()?
---
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 59190d8..fa37b39 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -28,6 +28,7 @@
  * status of a command.
  */
 
+#include <linux/dma-mapping.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
 
@@ -83,6 +84,15 @@ static int slave_alloc (struct scsi_device *sdev)
maxp = usb_maxpacket(us->pusb_dev, us->recv_bulk_pipe, 0);
blk_queue_virt_boundary(sdev->request_queue, maxp - 1);
 
+{
+   struct device *dev = us->pusb_dev->bus->controller;
+
+   dev_info(dev, "%s: size = %zu\n", __func__, dma_max_mapping_size(dev));
+   blk_queue_max_segment_size(sdev->request_queue,
+  dma_max_mapping_size(dev));
+}
+
+
/*
 * Some host controllers may have alignment requirements.
 * We'll play it safe by requiring 512-byte alignment always.
---

Best regards,
Yoshihiro Shimoda



RE: [PATCH] Documentation: DMA-API: fix a function name of max_mapping_size

2019-06-07 Thread Yoshihiro Shimoda
Hi Christoph,

> From: Christoph Hellwig, Sent: Friday, June 7, 2019 5:08 PM
> 
> Looks good.  And it seems like you've also found the solution to
> your usb storage problem, but I'm going to post the variant I just
> hacked up nevertheless.

Thank you for your reply! I think this API is related to my problem,
but I don't have any actual solution (a patch) for now. So, I'll wait
for your patch!

Best regards,
Yoshihiro Shimoda



[PATCH] Documentation: DMA-API: fix a function name of max_mapping_size

2019-06-07 Thread Yoshihiro Shimoda
The exported function name is dma_max_mapping_size(), not
dma_direct_max_mapping_size(), so this patch fixes
the function name in the documentation.

Fixes: 133d624b1cee ("dma: Introduce dma_max_mapping_size()")
Signed-off-by: Yoshihiro Shimoda 
---
 Documentation/DMA-API.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index 0076150..e47c63b 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -198,7 +198,7 @@ call to set the mask to the value returned.
 ::
 
size_t
-   dma_direct_max_mapping_size(struct device *dev);
+   dma_max_mapping_size(struct device *dev);
 
 Returns the maximum size of a mapping for the device. The size parameter
 of the mapping functions like dma_map_single(), dma_map_page() and
-- 
2.7.4



RE: [RFC PATCH v5 3/8] iommu: add a new capable IOMMU_CAP_MERGING

2019-06-07 Thread Yoshihiro Shimoda
Hi Christoph,

> From: Christoph Hellwig, Sent: Friday, June 7, 2019 2:50 PM
> 
> On Fri, Jun 07, 2019 at 05:41:56AM +, Yoshihiro Shimoda wrote:
> > > bool blk_can_use_iommu_merging(struct request_queue *q, struct device *dev)
> > > {
> > >   if (!IOMMU_CAN_MERGE_SEGMENTS(dev))
> > >   return false;
> >
> > As Robin mentioned, all IOMMUs can merge segments so that we don't need
> > this condition, IIUC. However, this should check whether the device is 
> > mapped
> > on iommu by using device_iommu_mapped().
> 
> There are plenty of dma_map_ops based drivers that can't merge segments.
> Examples:
> 
>  - arch/ia64/sn/pci/pci_dma.c
>  - arch/mips/jazz/jazzdma.c
>  - arch/sparc/mm/io-unit.c
>  - arch/sparc/mm/iommu.c
>  - arch/x86/kernel/pci-calgary_64.c

Thank you for pointing these out. I'll check that code.

> Nevermind the diret mapping, swiotlb and other weirdos.

I got it.

> > >   blk_queue_virt_boundary(q, IOMMU_PAGE_SIZE(dev));
> > >   blk_queue_max_segment_size(q, IOMMU_MAX_SEGMENT_SIZE(dev));
> >
> > By the way, I reported an issue [1] and I'm thinking a dma_is_direct()
> > environment (especially for swiotlb) also needs such max_segment_size
> > changes somehow. What do you think?
> >
> > [1]
https://marc.info/?l=linux-block&m=155954415603356&w=2
> 
> That doesn't seem to be related to the segment merging.  I'll take
> a look, but next time please Cc the author of a suspect commit if
> you already bisect things.

Oops. I'll Cc the author next time.

Best regards,
Yoshihiro Shimoda



RE: [RFC PATCH v5 3/8] iommu: add a new capable IOMMU_CAP_MERGING

2019-06-06 Thread Yoshihiro Shimoda
Hi Christoph,

> From: Christoph Hellwig, Sent: Thursday, June 6, 2019 4:01 PM
> 
> On Thu, Jun 06, 2019 at 06:28:47AM +, Yoshihiro Shimoda wrote:
> > > The problem is that we need a way to communicate to the block layer
> > > that more than a single segment is ok IFF the DMA API instance supports
> > > merging.  And of course the answer will depend on further parameters
> > > like the maximum merged segment size and alignment for the segment.
> >
> > I'm afraid but I don't understand why we need a way to communicate to
> > the block layer that more than a single segment is ok IFF the DMA API
> > instance supports merging.
> 
> Assume a device (which I think is your case) that only supports a single
> segment in hardware.  In that case we set max_segments to 1 if no
> IOMMU is present.  But if we have a merge capable IOMMU we can set
> max_segments to unlimited (or some software limit for scatterlist
> allocation), as long as we set a virt_boundary matching what the IOMMU
> expects, and max_sectors_kb isn't larger than the max IOMMU mapping
> size.  Now we could probably just open code this in the driver, but
> I'd feel much happier having a block layer helper like this:

Thank you for the explanation in detail!

> bool blk_can_use_iommu_merging(struct request_queue *q, struct device *dev)
> {
>   if (!IOMMU_CAN_MERGE_SEGMENTS(dev))
>   return false;

As Robin mentioned, all IOMMUs can merge segments so that we don't need
this condition, IIUC. However, this should check whether the device is mapped
on iommu by using device_iommu_mapped().

>   blk_queue_virt_boundary(q, IOMMU_PAGE_SIZE(dev));
>   blk_queue_max_segment_size(q, IOMMU_MAX_SEGMENT_SIZE(dev));

By the way, I reported an issue [1] and I'm thinking a dma_is_direct()
environment (especially for swiotlb) also needs such max_segment_size changes
somehow. What do you think?

[1]
https://marc.info/?l=linux-block&m=155954415603356&w=2

>   return true;
> }
> 
> and the driver then does:
> 
>   if (blk_can_use_iommu_merging(q, dev)) {
>   blk_queue_max_segments(q, MAX_SW_SEGMENTS);
>   // initialize sg mempool, etc..
>   }

In this case, I assume that "the driver" is ./drivers/mmc/core/queue.c,
not any drivers/mmc/host/ code.

> Where the SCREAMING pseudo code calls are something we need to find a
> good API for.

I assumed
 - IOMMU_PAGE_SIZE(dev) = dma_get_seg_boundary(dev);
 - IOMMU_MAX_SEGMENT_SIZE(dev) = dma_get_max_seg_size(dev);

I could not find "IOMMU_PAGE_SIZE(dev)" for now.
If that's true, I'll add such a new API.
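
A hedged sketch of what those two helpers could look like (hypothetical
code, not existing kernel APIs; the names mirror Christoph's SCREAMING
placeholders):

#include <linux/bitops.h>
#include <linux/dma-mapping.h>
#include <linux/iommu.h>

static unsigned long example_iommu_page_size(struct device *dev)
{
	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

	/* smallest page size the attached IOMMU domain supports, or 0 */
	return domain ? 1UL << __ffs(domain->pgsize_bitmap) : 0;
}

static size_t example_iommu_max_segment_size(struct device *dev)
{
	/* the DMA API already tracks a per-device maximum segment size */
	return dma_get_max_seg_size(dev);
}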

> And thinking about it the backend doesn't need to be an iommu, swiotlb
> could handle this as well, which might be interesting for devices
> that need to bounce buffer anyway.  IIRC mmc actually has some code
> to copy multiple segments into a bounce buffer somewhere.

I see. So, as I mentioned above, it seems that swiotlb also needs this.
IIUC, mmc hasn't had a bounce buffer since the commit [2].

[2]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/mmc/core?h=v5.2-rc3=de3ee99b097dd51938276e3af388cd4ad0f2750a

> > The block layer already has a limit "max_segment_size" for each device so
> > that regardless of whether it can merge the segments, we can use the limit.
> > Is my understanding incorrect?
> 
> Yes.

Now I understand that the block layer's max_segment_size differs from the IOMMU's one.

Best regards,
Yoshihiro Shimoda



RE: [RFC PATCH v5 3/8] iommu: add a new capable IOMMU_CAP_MERGING

2019-06-06 Thread Yoshihiro Shimoda
Hi Christoph,

Thank you for your comments!

> From: Christoph Hellwig, Sent: Wednesday, June 5, 2019 9:38 PM
> 
> On Wed, Jun 05, 2019 at 01:21:59PM +0100, Robin Murphy wrote:
> > And if the problem is really that you're not getting merging because of
> > exposing the wrong parameters to the DMA API and/or the block layer, or
> > that you just can't quite express your requirement to the block layer in
> > the first place, then that should really be tackled at the source rather
> > than worked around further down in the stack.
> 
> The problem is that we need a way to communicate to the block layer
> that more than a single segment is ok IFF the DMA API instance supports
> merging.  And of course the answer will depend on further parameters
> like the maximum merged segment size and alignment for the segment.

I'm afraid I don't understand why we need a way to communicate to
the block layer that more than a single segment is ok IFF the DMA API
instance supports merging.

> We'll need some way to communicate that, but I don't really think this
> is series is the way to go.

I should discard the patches 1/8 through 4/8.

> We'd really need something hanging off the device (or through a query
> API) how the dma map ops implementation exposes under what circumstances
> it can merge.  The driver can then communicate that to the block layer
> so that the block layer doesn't split requests up when reaching the
> segement limit.

The block layer already has a limit "max_segment_size" for each device so that
regardless of whether it can merge the segments, we can use the limit.
Is my understanding incorrect?

Best regards,
Yoshihiro Shimoda



RE: [RFC PATCH v5 3/8] iommu: add a new capable IOMMU_CAP_MERGING

2019-06-05 Thread Yoshihiro Shimoda
Hi Robin,

Thank you for your comments!

> From: Robin Murphy, Sent: Wednesday, June 5, 2019 9:22 PM

> > @@ -902,8 +914,18 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
> > if (iommu_map_sg(domain, iova, sg, nents, prot) < iova_len)
> > goto out_free_iova;
> >
> > -   return __finalise_sg(dev, sg, nents, iova);
> > +   ret = __finalise_sg(dev, sg, nents, iova);
> > +   /*
> > +* Check whether the sg entry is single if a device requires it and
> > +* the IOMMU driver is capable of merging.
> > +*/
> > +   if (iova_contiguous && ret != 1)
> > +   goto out_unmap_sg;
> 
> I don't see that just failing really gives this option any value.
> Clearly the MMC driver has to do *something* to handle the failure (plus
> presumably the case of not having IOMMU DMA ops at all), which begs the
> question of why it couldn't just do whatever that is anyway, without all
> this infrastructure. For starters, it would be a far simpler and less
> invasive patch:
> 
>   if (dma_map_sg(...) > 1) {
>   dma_unmap_sg(...);
>   /* split into multiple requests and try again */
>   }

I understood it.

> But then it would make even more sense to just have the driver be
> proactive about its special requirement in the first place, and simply
> validate the list before it even tries to map it:
> 
>   for_each_sg(sgl, sg, n, i)
>   if ((i > 0 && sg->offset % PAGE_SIZE) ||
>   (i < n - 1 && sg->length % PAGE_SIZE))
>   /* segment will not be mergeable */

In a previous version, I made such code [1].
But I think I misunderstood Christoph's comments [2] [3].

[1]
https://patchwork.kernel.org/patch/10970047/

[2]
https://marc.info/?l=linux-renesas-soc&m=155956751811689&w=2

[3]
https://marc.info/?l=linux-renesas-soc&m=155852814607202&w=2

> For reference, I think v4l2 and possibly some areas of DRM already do
> something vaguely similar to judge whether they get contiguous buffers
> or not.

I see. I'll check these areas later.

> > +
> > +   return ret;
> >
> > +out_unmap_sg:
> > +   iommu_dma_unmap_sg(dev, sg, nents, dir, attrs);
> >   out_free_iova:
> > iommu_dma_free_iova(cookie, iova, iova_len);
> >   out_restore_sg:
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 91af22a..f971dd3 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -104,6 +104,7 @@ enum iommu_cap {
> > IOMMU_CAP_CACHE_COHERENCY,  /* IOMMU can enforce cache coherent DMA
> >transactions */
> > IOMMU_CAP_INTR_REMAP,   /* IOMMU supports interrupt isolation */
> > IOMMU_CAP_NOEXEC,   /* IOMMU_NOEXEC flag */
> > +   IOMMU_CAP_MERGING,  /* IOMMU supports segments merging */
> 
> This isn't a 'capability' of the IOMMU - "segment merging" equates to
> just remapping pages, and there's already a fundamental assumption that
> IOMMUs are capable of that. Plus it's very much a DMA API concept, so
> hardly belongs in the IOMMU API anyway.

I got it.

> All in all, I'm struggling to see the point of this. Although it's not a
> DMA API guarantee, iommu-dma already merges scatterlists as aggressively
> as it is allowed to, and will continue to do so for the foreseeable
> future (since it avoids considerable complication in the IOVA
> allocation), so if you want to make sure iommu_dma_map_sg() merges an
> entire list, just don't give it a non-mergeable list.

Thank you for the explanation. I didn't know that a driver should not
give it a non-mergeable list.

> And if you still
> really really want dma_map_sg() to have a behaviour of "merge to a
> single segment or fail", then that should at least be a DMA API
> attribute, which could in principle be honoured by bounce-buffering
> implementations as well.

I got it. For this patch series, it seems I have to modify the block layer
so that such a new DMA API is not needed though.

> And if the problem is really that you're not getting merging because of
> exposing the wrong parameters to the DMA API and/or the block layer, or
> that you just can't quite express your requirement to the block layer in
> the first place, then that should really be tackled at the source rather
> than worked around further down in the stack.

I'll reply to Christoph's email about this topic later.

Best regards,
Yoshihiro Shimoda

> Robin.


[RFC PATCH v5 1/8] dma-mapping: add a device driver helper for iova contiguous

2019-06-05 Thread Yoshihiro Shimoda
This API can set a flag indicating whether a device strictly requires
a contiguous IOVA mapping.

Signed-off-by: Yoshihiro Shimoda 
---
 include/linux/device.h  |  1 +
 include/linux/dma-mapping.h | 16 
 2 files changed, 17 insertions(+)

diff --git a/include/linux/device.h b/include/linux/device.h
index e85264f..a33d611 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -752,6 +752,7 @@ struct device_dma_parameters {
 */
unsigned int max_segment_size;
unsigned long segment_boundary_mask;
+   bool iova_contiguous;
 };
 
 /**
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6309a72..cdb4e75 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -729,6 +729,22 @@ static inline int dma_set_seg_boundary(struct device *dev, 
unsigned long mask)
return -EIO;
 }
 
+static inline int dma_get_iova_contiguous(struct device *dev)
+{
+   if (dev->dma_parms)
+   return dev->dma_parms->iova_contiguous;
+   return false;
+}
+
+static inline int dma_set_iova_contiguous(struct device *dev, bool contiguous)
+{
+   if (dev->dma_parms) {
+   dev->dma_parms->iova_contiguous = contiguous;
+   return 0;
+   }
+   return -EIO;
+}
+
 #ifndef dma_max_pfn
 static inline unsigned long dma_max_pfn(struct device *dev)
 {
-- 
2.7.4



[RFC PATCH v5 4/8] iommu/ipmmu-vmsa: add capable ops

2019-06-05 Thread Yoshihiro Shimoda
This patch adds a .capable callback to the ipmmu-vmsa iommu_ops,
reporting that the IPMMU can merge scatter-gather segments.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/iommu/ipmmu-vmsa.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 408ad0b..81170b8 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -608,6 +608,18 @@ static irqreturn_t ipmmu_irq(int irq, void *dev)
  * IOMMU Operations
  */
 
+static bool ipmmu_capable(enum iommu_cap cap)
+{
+   switch (cap) {
+   case IOMMU_CAP_MERGING:
+   return true;
+   default:
+   break;
+   }
+
+   return false;
+}
+
 static struct iommu_domain *__ipmmu_domain_alloc(unsigned type)
 {
struct ipmmu_vmsa_domain *domain;
@@ -950,6 +962,7 @@ static struct iommu_group *ipmmu_find_group(struct device 
*dev)
 }
 
 static const struct iommu_ops ipmmu_ops = {
+   .capable = ipmmu_capable,
.domain_alloc = ipmmu_domain_alloc,
.domain_free = ipmmu_domain_free,
.attach_dev = ipmmu_attach_device,
-- 
2.7.4



[RFC PATCH v5 5/8] mmc: tmio: No memory size limitation if runs on IOMMU

2019-06-05 Thread Yoshihiro Shimoda
This patch adds a condition to avoid the swiotlb memory size limitation
when the device runs behind an IOMMU.

Tested-by: Takeshi Saito 
Signed-off-by: Yoshihiro Shimoda 
Reviewed-by: Simon Horman 
Reviewed-by: Wolfram Sang 
---
 drivers/mmc/host/tmio_mmc_core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index 130b91c..c9f6a59 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -1194,9 +1194,10 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host)
 * Since swiotlb has memory size limitation, this will calculate
 * the maximum size locally (because we don't have any APIs for it now)
 * and check the current max_req_size. And then, this will update
-* the max_req_size if needed as a workaround.
+* the max_req_size if needed as a workaround. However, if the driver
+* runs on IOMMU, this workaround isn't needed.
 */
-   if (swiotlb_max_segment()) {
+   if (swiotlb_max_segment() && !device_iommu_mapped(&pdev->dev)) {
unsigned int max_size = (1 << IO_TLB_SHIFT) * IO_TLB_SEGSIZE;
 
if (mmc->max_req_size > max_size)
-- 
2.7.4



[RFC PATCH v5 8/8] mmc: renesas_sdhi: use multiple segments if possible

2019-06-05 Thread Yoshihiro Shimoda
If the IOMMU driver is capable of strictly merging segments into
a single segment, this can expose multiple segments in
host->mmc->max_segs to the block layer by using blk_queue_max_segments(),
which mmc_setup_queue() calls. Note that an SDIO card may use
multiple segments with non-page-aligned sizes, so multiple segments
are not exposed in that case, to avoid dma_map_sg() failing every
time.

Note that with renesas_sdhi_sys_dmac the max_segs value will change
from 32 to 512, but the sys_dmac can handle 512 segments, so this
init_card ops is added for the "TMIO_MMC_MIN_RCAR2" environment.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/mmc/host/renesas_sdhi_core.c  | 27 +++
 drivers/mmc/host/renesas_sdhi_internal_dmac.c |  4 
 2 files changed, 31 insertions(+)

diff --git a/drivers/mmc/host/renesas_sdhi_core.c 
b/drivers/mmc/host/renesas_sdhi_core.c
index c5ee4a6..379cefa 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -20,6 +20,7 @@
 
 #include 
 #include 
+#include <linux/iommu.h>
 #include 
 #include 
 #include 
@@ -46,6 +47,8 @@
 #define SDHI_VER_GEN3_SD   0xcc10
 #define SDHI_VER_GEN3_SDMMC0xcd10
 
+#define SDHI_MAX_SEGS_IN_IOMMU 512
+
 struct renesas_sdhi_quirks {
bool hs400_disabled;
bool hs400_4taps;
@@ -203,6 +206,28 @@ static void renesas_sdhi_clk_disable(struct tmio_mmc_host 
*host)
clk_disable_unprepare(priv->clk_cd);
 }
 
+static void renesas_sdhi_init_card(struct mmc_host *mmc, struct mmc_card *card)
+{
+   struct tmio_mmc_host *host = mmc_priv(mmc);
+
+   /*
+* If the IOMMU driver is capable of strictly merging segments into
+* a single segment, this can expose multiple segments in
+* host->mmc->max_segs to the block layer via blk_queue_max_segments(),
+* which mmc_setup_queue() calls. Note that an SDIO card may use
+* multiple segments with non-page-aligned sizes, so multiple segments
+* are not exposed in that case, to avoid dma_map_sg() failing every
+* time.
+*/
+   if (host->pdata->max_segs < SDHI_MAX_SEGS_IN_IOMMU &&
+   iommu_capable(host->pdev->dev.bus, IOMMU_CAP_MERGING) &&
+   (mmc_card_mmc(card) || mmc_card_sd(card)))
+   host->mmc->max_segs = SDHI_MAX_SEGS_IN_IOMMU;
+   else
+   host->mmc->max_segs = host->pdata->max_segs ? :
+ TMIO_DEFAULT_MAX_SEGS;
+}
+
 static int renesas_sdhi_card_busy(struct mmc_host *mmc)
 {
struct tmio_mmc_host *host = mmc_priv(mmc);
@@ -726,6 +751,8 @@ int renesas_sdhi_probe(struct platform_device *pdev,
 
/* SDR speeds are only available on Gen2+ */
if (mmc_data->flags & TMIO_MMC_MIN_RCAR2) {
+   host->ops.init_card = renesas_sdhi_init_card;
+
/* card_busy caused issues on r8a73a4 (pre-Gen2) CD-less SDHI */
host->ops.card_busy = renesas_sdhi_card_busy;
host->ops.start_signal_voltage_switch =
diff --git a/drivers/mmc/host/renesas_sdhi_internal_dmac.c 
b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
index 751fe91..a442f86 100644
--- a/drivers/mmc/host/renesas_sdhi_internal_dmac.c
+++ b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include <linux/iommu.h>
 #include 
 #include 
 #include 
@@ -337,6 +338,9 @@ static int renesas_sdhi_internal_dmac_probe(struct 
platform_device *pdev)
/* value is max of SD_SECCNT. Confirmed by HW engineers */
dma_set_max_seg_size(dev, 0xffffffff);
 
+   if (iommu_capable(dev->bus, IOMMU_CAP_MERGING))
+   dma_set_iova_contiguous(dev, true);
+
return renesas_sdhi_probe(pdev, &renesas_sdhi_internal_dmac_dma_ops);
 }
 
-- 
2.7.4



[RFC PATCH v5 7/8] mmc: renesas_sdhi: sort headers

2019-06-05 Thread Yoshihiro Shimoda
This patch sorts the headers into alphabetical order to ease
maintenance of this part.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/mmc/host/renesas_sdhi_core.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/host/renesas_sdhi_core.c 
b/drivers/mmc/host/renesas_sdhi_core.c
index 5e9e36e..c5ee4a6 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -18,20 +18,20 @@
  *
  */
 
-#include 
 #include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
 #include 
 #include 
-#include 
-#include 
-#include 
+#include 
+#include 
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
 #include 
 
 #include "renesas_sdhi.h"
-- 
2.7.4



[RFC PATCH v5 2/8] iommu/dma: move iommu_dma_unmap_sg() place

2019-06-05 Thread Yoshihiro Shimoda
iommu_dma_map_sg() will use the unmap function in the future. To
avoid a forward declaration, this patch moves the function.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/iommu/dma-iommu.c | 48 +++
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 0dee374..034caae 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -730,6 +730,30 @@ static void iommu_dma_unmap_page(struct device *dev, 
dma_addr_t dma_handle,
__iommu_dma_unmap(dev, dma_handle, size);
 }
 
+static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
+   int nents, enum dma_data_direction dir, unsigned long attrs)
+{
+   dma_addr_t start, end;
+   struct scatterlist *tmp;
+   int i;
+
+   if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+   iommu_dma_sync_sg_for_cpu(dev, sg, nents, dir);
+
+   /*
+* The scatterlist segments are mapped into a single
+* contiguous IOVA allocation, so this is incredibly easy.
+*/
+   start = sg_dma_address(sg);
+   for_each_sg(sg_next(sg), tmp, nents - 1, i) {
+   if (sg_dma_len(tmp) == 0)
+   break;
+   sg = tmp;
+   }
+   end = sg_dma_address(sg) + sg_dma_len(sg);
+   __iommu_dma_unmap(dev, start, end - start);
+}
+
 /*
  * Prepare a successfully-mapped scatterlist to give back to the caller.
  *
@@ -887,30 +911,6 @@ static int iommu_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
return 0;
 }
 
-static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
-   int nents, enum dma_data_direction dir, unsigned long attrs)
-{
-   dma_addr_t start, end;
-   struct scatterlist *tmp;
-   int i;
-
-   if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-   iommu_dma_sync_sg_for_cpu(dev, sg, nents, dir);
-
-   /*
-* The scatterlist segments are mapped into a single
-* contiguous IOVA allocation, so this is incredibly easy.
-*/
-   start = sg_dma_address(sg);
-   for_each_sg(sg_next(sg), tmp, nents - 1, i) {
-   if (sg_dma_len(tmp) == 0)
-   break;
-   sg = tmp;
-   }
-   end = sg_dma_address(sg) + sg_dma_len(sg);
-   __iommu_dma_unmap(dev, start, end - start);
-}
-
 static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
-- 
2.7.4



[RFC PATCH v5 6/8] mmc: tmio: Add a definition for default max_segs

2019-06-05 Thread Yoshihiro Shimoda
This patch adds a definition for default max_segs to be used by other
driver (renesas_sdhi) in the future.

Signed-off-by: Yoshihiro Shimoda 
Reviewed-by: Wolfram Sang 
---
 drivers/mmc/host/tmio_mmc.h  | 1 +
 drivers/mmc/host/tmio_mmc_core.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/tmio_mmc.h b/drivers/mmc/host/tmio_mmc.h
index c5ba13f..9e387be 100644
--- a/drivers/mmc/host/tmio_mmc.h
+++ b/drivers/mmc/host/tmio_mmc.h
@@ -106,6 +106,7 @@
 #define TMIO_MASK_IRQ (TMIO_MASK_READOP | TMIO_MASK_WRITEOP | 
TMIO_MASK_CMD)
 
 #define TMIO_MAX_BLK_SIZE 512
+#define TMIO_DEFAULT_MAX_SEGS 32
 
 struct tmio_mmc_data;
 struct tmio_mmc_host;
diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index c9f6a59..af1343e 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -1185,7 +1185,7 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host)
 
mmc->caps |= MMC_CAP_4_BIT_DATA | pdata->capabilities;
mmc->caps2 |= pdata->capabilities2;
-   mmc->max_segs = pdata->max_segs ? : 32;
+   mmc->max_segs = pdata->max_segs ? : TMIO_DEFAULT_MAX_SEGS;
mmc->max_blk_size = TMIO_MAX_BLK_SIZE;
mmc->max_blk_count = pdata->max_blk_count ? :
(PAGE_SIZE / mmc->max_blk_size) * mmc->max_segs;
-- 
2.7.4



[RFC PATCH v5 3/8] iommu: add a new capable IOMMU_CAP_MERGING

2019-06-05 Thread Yoshihiro Shimoda
This patch adds a new capability, IOMMU_CAP_MERGING, to check whether
the IOVA will be strictly contiguous if the device requires it and
the IOMMU driver has the capability.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/iommu/dma-iommu.c | 26 --
 include/linux/iommu.h |  1 +
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 034caae..ecf1a04 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -847,11 +847,16 @@ static int iommu_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
dma_addr_t iova;
size_t iova_len = 0;
unsigned long mask = dma_get_seg_boundary(dev);
-   int i;
+   int i, ret;
+   bool iova_contiguous = false;
 
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
iommu_dma_sync_sg_for_device(dev, sg, nents, dir);
 
+   if (dma_get_iova_contiguous(dev) &&
+   iommu_capable(dev->bus, IOMMU_CAP_MERGING))
+   iova_contiguous = true;
+
/*
 * Work out how much IOVA space we need, and align the segments to
 * IOVA granules for the IOMMU driver to handle. With some clever
@@ -867,6 +872,13 @@ static int iommu_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
sg_dma_len(s) = s_length;
s->offset -= s_iova_off;
s_length = iova_align(iovad, s_length + s_iova_off);
+   /*
+* Check whether the IOVA would be strictly contiguous if the
+* device requires it and the IOMMU driver is capable.
+*/
+   if (iova_contiguous && i > 0 &&
+   (s_iova_off || s->length != s_length))
+   return 0;
s->length = s_length;
 
/*
@@ -902,8 +914,18 @@ static int iommu_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
if (iommu_map_sg(domain, iova, sg, nents, prot) < iova_len)
goto out_free_iova;
 
-   return __finalise_sg(dev, sg, nents, iova);
+   ret = __finalise_sg(dev, sg, nents, iova);
+   /*
+* Check whether the scatterlist was merged into a single entry
+* if the device requires it and the IOMMU driver is capable.
+*/
+   if (iova_contiguous && ret != 1)
+   goto out_unmap_sg;
+
+   return ret;
 
+out_unmap_sg:
+   iommu_dma_unmap_sg(dev, sg, nents, dir, attrs);
 out_free_iova:
iommu_dma_free_iova(cookie, iova, iova_len);
 out_restore_sg:
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 91af22a..f971dd3 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -104,6 +104,7 @@ enum iommu_cap {
   transactions */
IOMMU_CAP_INTR_REMAP,   /* IOMMU supports interrupt isolation */
IOMMU_CAP_NOEXEC,   /* IOMMU_NOEXEC flag */
+   IOMMU_CAP_MERGING,  /* IOMMU supports segments merging */
 };
 
 /*
-- 
2.7.4



[RFC PATCH v5 0/8] treewide: improve R-Car SDHI performance

2019-06-05 Thread Yoshihiro Shimoda
This patch series is based on iommu.git / next branch.

Since the SDHI host internal DMAC of R-Car Gen3 cannot handle two or
more segments, the performance (especially for eMMC HS400 reading)
is not good. However, if an IOMMU is enabled on the DMAC, the IOMMU will
map multiple scatter-gather buffers as one contiguous IOVA, the DMAC can
handle that IOVA, and the performance can then improve. In fact, I have
measured the performance using bonnie++: the "Sequential Input - block"
rate was improved on r8a7795.

To achieve this, this patch series modifies DMA MAPPING and IOMMU
subsystem at first. Since I'd like to get any feedback from each
subsystem whether this way is acceptable for upstream, I submit it
to treewide with RFC.

Changes from v4:
 - [DMA MAPPING] Add a new device_dma_parameters for iova contiguous.
 - [IOMMU] Add a new capable for "merging" segments.
 - [IOMMU] Add a capable ops into the ipmmu-vmsa driver.
 - [MMC] Sort headers in renesas_sdhi_core.c.
 - [MMC] Remove the following codes that made on v3 that can be achieved by
 DMA MAPPING and IOMMU subsystem:
 -- Check if R-Car Gen3 IPMMU is used or not on patch 3.
 -- Check if all multiple segment buffers are aligned to PAGE_SIZE on patch 3.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=125593

Changes from v3:
 - Use a helper function device_iommu_mapped on patch 1 and 3.
 - Check if R-Car Gen3 IPMMU is used or not on patch 3.
 - Check if all multiple segment buffers are aligned to PAGE_SIZE on patch 3.
 - Add Reviewed-by Wolfram-san on patch 1 and 2. Note that I also got his
   Reviewed-by on patch 3, but I changed it from v2. So, I didn't add
   his Reviewed-by at this time.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=120985

Changes from v2:
 - Add some conditions in the init_card().
 - Add a comment in the init_card().
 - Add definitions for some "MAX_SEGS".
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=116729

Changes from v1:
 - Remove adding init_card ops into struct tmio_mmc_dma_ops and
   tmio_mmc_host and just set init_card on renesas_sdhi_core.c.
 - Revise typos on "mmc: tmio: No memory size limitation if runs on IOMMU".
 - Add Simon-san's Reviewed-by on a tmio patch.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=110485


Yoshihiro Shimoda (8):
  dma-mapping: add a device driver helper for iova contiguous
  iommu/dma: move iommu_dma_unmap_sg() place
  iommu: add a new capable IOMMU_CAP_MERGING
  iommu/ipmmu-vmsa: add capable ops
  mmc: tmio: No memory size limitation if runs on IOMMU
  mmc: tmio: Add a definition for default max_segs
  mmc: renesas_sdhi: sort headers
  mmc: renesas_sdhi: use multiple segments if possible

 drivers/iommu/dma-iommu.c | 74 +--
 drivers/iommu/ipmmu-vmsa.c| 13 +
 drivers/mmc/host/renesas_sdhi_core.c  | 43 +---
 drivers/mmc/host/renesas_sdhi_internal_dmac.c |  4 ++
 drivers/mmc/host/tmio_mmc.h   |  1 +
 drivers/mmc/host/tmio_mmc_core.c  |  7 +--
 include/linux/device.h|  1 +
 include/linux/dma-mapping.h   | 16 ++
 include/linux/iommu.h |  1 +
 9 files changed, 123 insertions(+), 37 deletions(-)

-- 
2.7.4



How to resolve an issue in swiotlb environment?

2019-06-03 Thread Yoshihiro Shimoda
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information:
[   36.745286] xhci-hcd ee000000.usb: swiotlb buffer is full 
(sz: 524288 bytes), total 32768 (slots), used 1338 (slots)
[   36.755854] xhci-hcd ee000000.usb: overflow 0x00073358+524288 of DMA 
mask ffffffffffffffff bus mask 0
[   36.765148] WARNING: CPU: 4 PID: 3652 at kernel/dma/direct.c:43 
report_addr+0x38/0xa0
[   36.772971] Modules linked in: rcar_du_drm rcar_lvds drm_kms_helper drm 
drm_panel_orientation_quirks vsp1 videobuf2_vmalloc videobuf2_dma_contig 
videobuf2_memops videobuf2_v4l2 videobuf2_common videodev snd_soc_rcar 
renesas_usbhs snd_soc_audio_graph_card media snd_soc_simple_card_utils 
crct10dif_ce renesas_usb3 snd_soc_ak4613 rcar_fcp pwm_rcar usb_dmac 
phy_rcar_gen3_usb3 pwm_bl ipv6
[   36.806896] CPU: 4 PID: 3652 Comm: usb-storage Not tainted 
5.1.0-12510-g09324d3 #17
[   36.814545] Hardware name: Renesas Salvator-X board based on r8a7795 ES2.0+ 
(DT)
[   36.821936] pstate: 4005 (nZcv daif -PAN -UAO)
[   36.826723] pc : report_addr+0x38/0xa0
[   36.830466] lr : report_addr+0x98/0xa0
[   36.834208] sp : 11f63970
[   36.837516] x29: 11f63970 x28:  
[   36.842823] x27:  x26: 1f020280 
[   36.848129] x25: 8006f32682a8 x24:  
[   36.853435] x23: 0001 x22:  
[   36.858742] x21: 0008 x20: 112b9000 
[   36.864049] x19: 8006fa399010 x18:  
[   36.869355] x17:  x16:  
[   36.874662] x15: 112b96c8 x14: 206b 
[   36.879968] x13: 73616d20414d4420 x12: 666f203838323432 
[   36.885275] x11: 352b303030303835 x10: 112b9f20 
[   36.890582] x9 : 11293018 x8 : 0187 
[   36.895888] x7 : ffcc x6 : 8006ff77f180 
[   36.901195] x5 : 8006ff77f180 x4 :  
[   36.906501] x3 : 8006ff785f10 x2 : e0090e1a0d687e00 
[   36.911808] x1 : e0090e1a0d687e00 x0 :  
[   36.917116] Call trace:
[   36.919558]  report_addr+0x38/0xa0
[   36.922957]  dma_direct_map_page+0x140/0x150
[   36.927222]  dma_direct_map_sg+0x78/0xe0
[   36.931146]  usb_hcd_map_urb_for_dma+0x2e4/0x448
[   36.935763]  xhci_map_urb_for_dma+0x54/0x60
[   36.939941]  usb_hcd_submit_urb+0x88/0x948
[   36.944032]  usb_submit_urb+0x3b4/0x558
[   36.947863]  usb_sg_wait+0x98/0x158
[   36.951352]  usb_stor_bulk_transfer_sglist.part.3+0x94/0x128
[   36.957006]  usb_stor_bulk_srb+0x48/0x88
[   36.960923]  usb_stor_Bulk_transport+0x10c/0x380
[   36.965536]  usb_stor_invoke_transport+0x3c/0x4f0
[   36.970236]  usb_stor_transparent_scsi_command+0xc/0x18
[   36.975456]  usb_stor_control_thread+0x1bc/0x258
[   36.980071]  kthread+0x124/0x128
[   36.983298]  ret_from_fork+0x10/0x18
[   36.986868] ---[ end trace 26ffc6c07675054c ]---
[   36.991976] xhci-hcd ee000000.usb: swiotlb buffer is full (sz: 524288 
bytes), total 32768 (slots), used 1338 (slots)
[   37.002973] xhci-hcd ee000000.usb: swiotlb buffer is full (sz: 524288 
bytes), total 32768 (slots), used 1338 (slots)
[   37.013873] xhci-hcd ee000000.usb: swiotlb buffer is full (sz: 524288 
bytes), total 32768 (slots), used 1338 (slots)
[   37.024721] xhci-hcd ee000000.usb: swiotlb buffer is full (sz: 524288 
bytes), total 32768 (slots), used 1338 (slots)
[   37.035537] xhci-hcd ee000000.usb: swiotlb buffer is full (sz: 524288 
bytes), total 32768 (slots), used 1338 (slots)
[   37.046333] xhci-hcd ee000000.usb: swiotlb buffer is full (sz: 524288 
bytes), total 32768 (slots), used 1338 (slots)
[   37.057115] xhci-hcd ee000000.usb: swiotlb buffer is full (sz: 524288 
bytes), total 32768 (slots), used 1338 (slots)
[   37.067899] xhci-hcd ee000000.usb: swiotlb buffer is full (sz: 524288 
bytes), total 32768 (slots), used 1338 (slots)
[   37.078676] xhci-hcd ee000000.usb: swiotlb buffer is full (sz: 524288 
bytes), total 32768 (slots), used 1338 (slots)
[   41.745471] swiotlb_tbl_map_single: 22211 callbacks suppressed

Best regards,
Yoshihiro Shimoda



RE: [PATCH v3 3/3] mmc: renesas_sdhi: use multiple segments if possible

2019-05-22 Thread Yoshihiro Shimoda
Hi Christoph,

Thank you for your review!

> From: Christoph Hellwig, Sent: Wednesday, May 22, 2019 9:29 PM
> 
> On Wed, May 22, 2019 at 07:18:39PM +0900, Yoshihiro Shimoda wrote:
> > In IOMMU environment, since it's possible to merge scatter gather
> > buffers of memory requests to one iova, this patch changes the max_segs
> > value when init_card of mmc_host timing to improve the transfer
> > performance on renesas_sdhi_internal_dmac.
> 
> Well, you can't merge everything with an IOMMU.  For one not every
> IOMMU can merge multiple scatterlist segments,

I didn't know such an IOMMU exists. But, since the R-Car Gen3 IOMMU device
(handled by ipmmu-vmsa.c) can merge multiple scatterlist segments,
should this mmc driver somehow check whether that IOMMU device is in use?
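
(A minimal sketch of such a check, assuming it lives in the SDHI
init path; device_iommu_mapped() is the generic helper for this:)

	/* only raise max_segs when the device is actually behind an IOMMU */
	if (device_iommu_mapped(&host->pdev->dev))
		host->mmc->max_segs = SDHI_MAX_SEGS_IN_IOMMU;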

> second even it can merge
> segements the segments need to be aligned to the IOMMU page size.

If this driver checks whether the segments are aligned to the IOMMU
page size before DMA API is called every time, is it acceptable?
If one of the segments is not aligned, this driver should not use
the DMAC.

>  And
> then of course we might have an upper limit on the total mapping.

IIUC, in such a case the DMA API will fail. What do you think?

> > +   if (host->pdata->max_segs < SDHI_MAX_SEGS_IN_IOMMU &&
> > +   host->pdev->dev.iommu_group &&
> > +   (mmc_card_mmc(card) || mmc_card_sd(card)))
> > +   host->mmc->max_segs = SDHI_MAX_SEGS_IN_IOMMU;
> 
> This is way to magic.  We'll need a proper DMA layer API to expose
> this information, and preferably a block layer helper to increase
> max_segs instead of hacking that up in the driver.

I think I should have described the details somewhere. This can expose
the information to the block layer by using blk_queue_max_segments(),
which mmc_setup_queue() calls. In other words, this init_card() ops
is called before the block device is created. Is this acceptable if
such a comment is added here?
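
(To illustrate the path being relied on here, simplified from
drivers/mmc/core/queue.c around this time; not the full function:)

	/*
	 * mmc_setup_queue() runs when the block queue is created, after
	 * init_card() has had a chance to adjust host->max_segs:
	 */
	blk_queue_max_segments(mq->queue, host->max_segs);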

Best regards,
Yoshihiro Shimoda



RE: [PATCH v3 0/6] iommu/ipmmu-vmsa: Suspend/resume support and assorted cleanups

2019-05-13 Thread Yoshihiro Shimoda
Hi Geert-san,

> From: Geert Uytterhoeven, Sent: Wednesday, April 24, 2019 10:55 PM
> 
>   Hi Jörg, Magnus,
> 
> On R-Car Gen3 systems with PSCI, PSCI may power down the SoC during
> system suspend, thus losing all IOMMU state.  Hence after s2ram, devices
> behind an IPMMU (e.g. SATA), and configured to use it, will fail to
> complete their I/O operations.
> 
> This patch series adds suspend/resume support to the Renesas IPMMU-VMSA
> IOMMU driver, and performs some smaller cleanups and fixes during the
> process.  Most patches are fairly independent, except for patch 6/6,
> which depends on patches 4/6 and 5/6.
> 
> Changes compared to v2:
>   - Fix sysfs path typo in patch description,
>   - Add Reviewed-by.
> 
> Changes compared to v1:
>   - Dropped "iommu/ipmmu-vmsa: Call ipmmu_ctx_write_root() instead of
> open coding",
>   - Add Reviewed-by,
>   - Merge IMEAR/IMELAR,
>   - s/ipmmu_context_init/ipmmu_domain_setup_context/,
>   - Drop PSCI checks.
> 
> This has been tested on Salvator-XS with R-Car H3 ES2.0, with IPMMU
> suport for SATA enabled.  To play safe, the resume operation has also
> been tested on R-Car M2-W.

Thank you for the patch! I reviewed this patch series and tested it on
R-Car H3 ES3.0 with IPMMU support for USB3.0 host and SDHI. So,

Reviewed-by: Yoshihiro Shimoda 
Tested-by: Yoshihiro Shimoda 

Best regards,
Yoshihiro Shimoda



[PATCH v2 2/2] iommu/ipmmu-vmsa: add an array of slave devices whitelist

2018-11-28 Thread Yoshihiro Shimoda
To avoid adding copy-and-pasted strcmp code in the future,
this patch adds an array, "rcar_gen3_slave_whitelist", to check
whether a device can work with the IPMMU.

Signed-off-by: Yoshihiro Shimoda 
Reviewed-by: Geert Uytterhoeven 
---
 drivers/iommu/ipmmu-vmsa.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 5f88031..60e3314 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -773,8 +773,13 @@ static int ipmmu_init_platform_device(struct device *dev,
{ /* sentinel */ }
 };
 
+static const char * const rcar_gen3_slave_whitelist[] = {
+};
+
 static bool ipmmu_slave_whitelist(struct device *dev)
 {
+   unsigned int i;
+
/*
 * For R-Car Gen3 use a white list to opt-in slave devices.
 * For Other SoCs, this returns true anyway.
@@ -786,7 +791,13 @@ static bool ipmmu_slave_whitelist(struct device *dev)
if (!soc_device_match(soc_rcar_gen3_whitelist))
return false;
 
-   /* By default, do not allow use of IPMMU */
+   /* Check whether this slave device can work with the IPMMU */
+   for (i = 0; i < ARRAY_SIZE(rcar_gen3_slave_whitelist); i++) {
+   if (!strcmp(dev_name(dev), rcar_gen3_slave_whitelist[i]))
+   return true;
+   }
+
+   /* Otherwise, do not allow use of IPMMU */
return false;
 }
 
-- 
1.9.1



[PATCH v2 1/2] iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist() to check SoC revisions

2018-11-28 Thread Yoshihiro Shimoda
Some R-Car Gen3 SoCs have hardware restrictions on the IPMMU. So,
to check whether a given R-Car Gen3 SoC can use the IPMMU correctly,
this patch modifies ipmmu_slave_whitelist().

Signed-off-by: Yoshihiro Shimoda 
Reviewed-by: Geert Uytterhoeven 
---
 drivers/iommu/ipmmu-vmsa.c | 34 ++
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 9e2655f..5f88031 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -754,12 +754,6 @@ static int ipmmu_init_platform_device(struct device *dev,
return 0;
 }
 
-static bool ipmmu_slave_whitelist(struct device *dev)
-{
-   /* By default, do not allow use of IPMMU */
-   return false;
-}
-
 static const struct soc_device_attribute soc_rcar_gen3[] = {
{ .soc_id = "r8a774a1", },
{ .soc_id = "r8a7795", },
@@ -771,11 +765,35 @@ static bool ipmmu_slave_whitelist(struct device *dev)
{ /* sentinel */ }
 };
 
+static const struct soc_device_attribute soc_rcar_gen3_whitelist[] = {
+   { .soc_id = "r8a7795", .revision = "ES3.*" },
+   { .soc_id = "r8a77965", },
+   { .soc_id = "r8a77990", },
+   { .soc_id = "r8a77995", },
+   { /* sentinel */ }
+};
+
+static bool ipmmu_slave_whitelist(struct device *dev)
+{
+   /*
+* For R-Car Gen3 use a white list to opt-in slave devices.
+* For Other SoCs, this returns true anyway.
+*/
+   if (!soc_device_match(soc_rcar_gen3))
+   return true;
+
+   /* Check whether this R-Car Gen3 can use the IPMMU correctly or not */
+   if (!soc_device_match(soc_rcar_gen3_whitelist))
+   return false;
+
+   /* By default, do not allow use of IPMMU */
+   return false;
+}
+
 static int ipmmu_of_xlate(struct device *dev,
  struct of_phandle_args *spec)
 {
-   /* For R-Car Gen3 use a white list to opt-in slave devices */
-   if (soc_device_match(soc_rcar_gen3) && !ipmmu_slave_whitelist(dev))
+   if (!ipmmu_slave_whitelist(dev))
return -ENODEV;
 
iommu_fwspec_add_ids(dev, spec->args, 1);
-- 
1.9.1



[PATCH 0/2] iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist()

2018-11-28 Thread Yoshihiro Shimoda
This patch set is based on iommu.git / latest next branch
(commit id = f262283c224537962cba0f41b8823e3be9f7b0ff)

I talked with Geert-san about this topic on below:
https://patchwork.kernel.org/patch/10651375/

Also Simon-san suggests we should keep the whitelist.

So, to avoid changing the behavior on R-Car Gen2, this patch set adds
two conditions. After applying this patch set, we can add slave
devices easily like below:

--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -774,6 +774,8 @@ static int ipmmu_init_platform_device(struct device *dev,
 };
 
 static const char * const rcar_gen3_slave_whitelist[] = {
+   "e670.dma-controller",
+   "e730.dma-controller"
 };
 
 static bool ipmmu_slave_whitelist(struct device *dev)


Changes from v1:
 - Use "ES3.*" instead of "ES3.0" for r8a7795 in patch 1.
 - Use "unsigned int" instead of "int" in patch 2.
 - Add Geert-san's Reviewed-by.
 

Yoshihiro Shimoda (2):
  iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist() to check SoC
revisions
  iommu/ipmmu-vmsa: add an array of slave devices whitelist

 drivers/iommu/ipmmu-vmsa.c | 45 +
 1 file changed, 37 insertions(+), 8 deletions(-)

-- 
1.9.1



RE: [PATCH 2/2] iommu/ipmmu-vmsa: add an array of slave devices whitelist

2018-11-28 Thread Yoshihiro Shimoda
Hi Geert-san,

> From: Geert Uytterhoeven, Sent: Wednesday, November 28, 2018 5:48 PM
> 
> Hi Shimoda-san,
> 
> On Wed, Nov 28, 2018 at 7:10 AM Yoshihiro Shimoda
>  wrote:
> > To avoid adding copy-and-pasted strcmp code in the future,
> > this patch adds an array, "rcar_gen3_slave_whitelist", to check
> > whether a device can work with the IPMMU.
> >
> > Signed-off-by: Yoshihiro Shimoda 
> 
> Reviewed-by: Geert Uytterhoeven 

Thank you for your review!

> One small comment below.
> 
> > --- a/drivers/iommu/ipmmu-vmsa.c
> > +++ b/drivers/iommu/ipmmu-vmsa.c
> > @@ -773,8 +773,13 @@ static int ipmmu_init_platform_device(struct device 
> > *dev,
> > { /* sentinel */ }
> >  };
> >
> > +static const char * const rcar_gen3_slave_whitelist[] = {
> > +};
> > +
> >  static bool ipmmu_slave_whitelist(struct device *dev)
> >  {
> > +   int i;
> 
> unsigned int

I got it. I'll submit v2 patch.

Best regards,
Yoshihiro Shimoda

> > +
> > /*
> >  * For R-Car Gen3 use a white list to opt-in slave devices.
> >  * For Other SoCs, this returns true anyway.
> > @@ -786,7 +791,13 @@ static bool ipmmu_slave_whitelist(struct device *dev)
> > if (!soc_device_match(soc_rcar_gen3_whitelist))
> > return false;
> >
> > -   /* By default, do not allow use of IPMMU */
> > +   /* Check whether this slave device can work with the IPMMU */
> > +   for (i = 0; i < ARRAY_SIZE(rcar_gen3_slave_whitelist); i++) {
> > +   if (!strcmp(dev_name(dev), rcar_gen3_slave_whitelist[i]))
> > +   return true;
> > +   }
> > +
> > +   /* Otherwise, do not allow use of IPMMU */
> > return false;
> >  }
> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds


RE: [PATCH 1/2] iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist() to check SoC revisions

2018-11-28 Thread Yoshihiro Shimoda
Hi Geert-san,

> From: Geert Uytterhoeven, Sent: Wednesday, November 28, 2018 5:47 PM
> 
> Hi Shimoda-san,
> 
> On Wed, Nov 28, 2018 at 7:10 AM Yoshihiro Shimoda
>  wrote:
> > Some R-Car Gen3 SoCs have hardware restrictions on the IPMMU. So,
> > to check whether a given R-Car Gen3 SoC can use the IPMMU correctly,
> > this patch modifies ipmmu_slave_whitelist().
> >
> > Signed-off-by: Yoshihiro Shimoda 
> 
> Reviewed-by: Geert Uytterhoeven 

Thank you for the review!

> One question below.
> 
> > --- a/drivers/iommu/ipmmu-vmsa.c
> > +++ b/drivers/iommu/ipmmu-vmsa.c
> 
> > @@ -771,11 +765,35 @@ static bool ipmmu_slave_whitelist(struct device *dev)
> > { /* sentinel */ }
> >  };
> >
> > +static const struct soc_device_attribute soc_rcar_gen3_whitelist[] = {
> > +   { .soc_id = "r8a7795", .revision = "ES3.0" },
> 
> Don't you want "ES3.*"?

Indeed (I want "ES3.*"). So, I'll submit v2 patch to fix it.

Best regards,
Yoshihiro Shimoda

> > +   { .soc_id = "r8a77965", },
> > +   { .soc_id = "r8a77990", },
> > +   { .soc_id = "r8a77995", },
> > +   { /* sentinel */ }
> > +};
> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds


[PATCH 2/2] iommu/ipmmu-vmsa: add an array of slave devices whitelist

2018-11-27 Thread Yoshihiro Shimoda
To avoid adding copy-and-pasted strcmp code in the future,
this patch adds an array, "rcar_gen3_slave_whitelist", to check
whether a device can work with the IPMMU.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/iommu/ipmmu-vmsa.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 1c38538..1f65469 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -773,8 +773,13 @@ static int ipmmu_init_platform_device(struct device *dev,
{ /* sentinel */ }
 };
 
+static const char * const rcar_gen3_slave_whitelist[] = {
+};
+
 static bool ipmmu_slave_whitelist(struct device *dev)
 {
+   int i;
+
/*
 * For R-Car Gen3 use a white list to opt-in slave devices.
 * For Other SoCs, this returns true anyway.
@@ -786,7 +791,13 @@ static bool ipmmu_slave_whitelist(struct device *dev)
if (!soc_device_match(soc_rcar_gen3_whitelist))
return false;
 
-   /* By default, do not allow use of IPMMU */
+   /* Check whether this slave device can work with the IPMMU */
+   for (i = 0; i < ARRAY_SIZE(rcar_gen3_slave_whitelist); i++) {
+   if (!strcmp(dev_name(dev), rcar_gen3_slave_whitelist[i]))
+   return true;
+   }
+
+   /* Otherwise, do not allow use of IPMMU */
return false;
 }
 
-- 
1.9.1



[PATCH 1/2] iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist() to check SoC revisions

2018-11-27 Thread Yoshihiro Shimoda
Some R-Car Gen3 SoCs have hardware restrictions on the IPMMU. So,
to check whether a given R-Car Gen3 SoC can use the IPMMU correctly,
this patch modifies ipmmu_slave_whitelist().

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/iommu/ipmmu-vmsa.c | 34 ++
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 9e2655f..1c38538 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -754,12 +754,6 @@ static int ipmmu_init_platform_device(struct device *dev,
return 0;
 }
 
-static bool ipmmu_slave_whitelist(struct device *dev)
-{
-   /* By default, do not allow use of IPMMU */
-   return false;
-}
-
 static const struct soc_device_attribute soc_rcar_gen3[] = {
{ .soc_id = "r8a774a1", },
{ .soc_id = "r8a7795", },
@@ -771,11 +765,35 @@ static bool ipmmu_slave_whitelist(struct device *dev)
{ /* sentinel */ }
 };
 
+static const struct soc_device_attribute soc_rcar_gen3_whitelist[] = {
+   { .soc_id = "r8a7795", .revision = "ES3.0" },
+   { .soc_id = "r8a77965", },
+   { .soc_id = "r8a77990", },
+   { .soc_id = "r8a77995", },
+   { /* sentinel */ }
+};
+
+static bool ipmmu_slave_whitelist(struct device *dev)
+{
+   /*
+* For R-Car Gen3 use a white list to opt-in slave devices.
+* For Other SoCs, this returns true anyway.
+*/
+   if (!soc_device_match(soc_rcar_gen3))
+   return true;
+
+   /* Check whether this R-Car Gen3 can use the IPMMU correctly or not */
+   if (!soc_device_match(soc_rcar_gen3_whitelist))
+   return false;
+
+   /* By default, do not allow use of IPMMU */
+   return false;
+}
+
 static int ipmmu_of_xlate(struct device *dev,
  struct of_phandle_args *spec)
 {
-   /* For R-Car Gen3 use a white list to opt-in slave devices */
-   if (soc_device_match(soc_rcar_gen3) && !ipmmu_slave_whitelist(dev))
+   if (!ipmmu_slave_whitelist(dev))
return -ENODEV;
 
iommu_fwspec_add_ids(dev, spec->args, 1);
-- 
1.9.1



[PATCH 0/2] iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist()

2018-11-27 Thread Yoshihiro Shimoda
This patch set is based on iommu.git / latest next branch
(commit id = f262283c224537962cba0f41b8823e3be9f7b0ff)

I talked with Geert-san about this topic on below:
https://patchwork.kernel.org/patch/10651375/

Also Simon-san suggests we should keep the whitelist.

So, to avoid changing the behavior on R-Car Gen2, this patch set adds
two conditions. After applying this patch set, we can add slave
devices easily like below:

--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -774,6 +774,8 @@ static int ipmmu_init_platform_device(struct device *dev,
 };
 
 static const char * const rcar_gen3_slave_whitelist[] = {
+   "e670.dma-controller",
+   "e730.dma-controller"
 };
 
 static bool ipmmu_slave_whitelist(struct device *dev)


Yoshihiro Shimoda (2):
  iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist() to check SoC
revisions
  iommu/ipmmu-vmsa: add an array of slave devices whitelist

 drivers/iommu/ipmmu-vmsa.c | 45 +
 1 file changed, 37 insertions(+), 8 deletions(-)

-- 
1.9.1



[PATCH] iommu/ipmmu-vmsa: IMUCTRn.TTSEL needs a special usage on R-Car Gen3

2018-07-08 Thread Yoshihiro Shimoda
The TTSEL field of the IMUCTRn register on R-Car Gen3 needs to be set
to an unused MMU context number even when the uTLBs are disabled
(the MMUEN bit of the IMUCTRn register = 0).
Since the initial value of IMUCTRn.TTSEL on all IPMMU domains is 0,
this patch adds a new feature, "reserved_context", to reserve IPMMU
context number 0 as the unused MMU context.

Signed-off-by: Yoshihiro Shimoda 
---
 drivers/iommu/ipmmu-vmsa.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 6a0e714..6cbd2bd 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -47,6 +47,7 @@ struct ipmmu_features {
unsigned int number_of_contexts;
bool setup_imbuscr;
bool twobit_imttbcr_sl0;
+   bool reserved_context;
 };
 
 struct ipmmu_vmsa_device {
@@ -925,6 +926,7 @@ static void ipmmu_device_reset(struct ipmmu_vmsa_device 
*mmu)
.number_of_contexts = 1, /* software only tested with one context */
.setup_imbuscr = true,
.twobit_imttbcr_sl0 = false,
+   .reserved_context = false,
 };
 
 static const struct ipmmu_features ipmmu_features_rcar_gen3 = {
@@ -933,6 +935,7 @@ static void ipmmu_device_reset(struct ipmmu_vmsa_device 
*mmu)
.number_of_contexts = 8,
.setup_imbuscr = false,
.twobit_imttbcr_sl0 = true,
+   .reserved_context = true,
 };
 
 static const struct of_device_id ipmmu_of_ids[] = {
@@ -1038,6 +1041,11 @@ static int ipmmu_probe(struct platform_device *pdev)
}
 
ipmmu_device_reset(mmu);
+
+   if (mmu->features->reserved_context) {
+   dev_info(>dev, "IPMMU context 0 is reserved\n");
+   set_bit(0, mmu->ctx);
+   }
}
 
/*
-- 
1.9.1



RE: [PATCH 1/2] mmc: renesas_sdhi: fix swiotlb buffer is full

2017-11-01 Thread Yoshihiro Shimoda
Hi,

> From: Konrad Rzeszutek, Sent: Wednesday, November 1, 2017 10:27 PM
> 
> On Fri, Oct 20, 2017 at 03:18:55AM +, Yoshihiro Shimoda wrote:
> > Hi again!
> >
> > > From: Yoshihiro Shimoda, Sent: Thursday, October 19, 2017 8:39 PM
> > >
> > > Hi Geert-san, Konrad-san,
> > >
> > > > From: Geert Uytterhoeven, Sent: Thursday, October 19, 2017 5:34 PM
> > > >
> > > > Hi Konrad,
> > > >
> > > > On Thu, Oct 19, 2017 at 2:24 AM, Konrad Rzeszutek Wilk
> > > > <kon...@darnok.org> wrote:
> > < snip >
> > > > >> > diff --git a/drivers/mmc/host/renesas_sdhi_internal_dmac.c 
> > > > >> > b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > > > >> > index f905f23..6c9b4b2 100644
> > > > >> > --- a/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > > > >> > +++ b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > > > >> > @@ -80,8 +80,9 @@
> > > > >> > .scc_offset = 0x1000,
> > > > >> > .taps   = rcar_gen3_scc_taps,
> > > > >> > .taps_num   = ARRAY_SIZE(rcar_gen3_scc_taps),
> > > > >> > -   /* Gen3 SDHI DMAC can handle 0xffffffff blk count, but seg 
> > > > >> > = 1 */
> > > > >> > -   .max_blk_count  = 0xffffffff,
> > > > >> > +   /* The swiotlb can handle memory size up to 256 kbytes for 
> > > > >> > now. */
> > > > >> > +   .max_blk_count  = 512,
> > > > >>
> > > > >> Fixing this in the individual drivers feels like the wrong solution 
> > > > >> to me.
> > > > >>
> > > > >> iommu: Is there a better (generic) way to handle this?
> > > > >
> > > > > Yes. See 7453c549f5f6485c0d79cad7844870dcc7d1b34d, aka 
> > > > > swiotlb_max_segment
> > > >
> > > > Thanks for the pointer!
> > > >
> > > > While I agree this can be used to avoid the swiotlb buffer full issue,
> > > > I believe it is a suboptimal solution if the device actually uses an 
> > > > IOMMU.
> > > > It limits the mapping size if CONFIG_SWIOTLB=y, which is always the
> > > > case for arm/arm64 these days.
> > >
> > > I'm afraid but I misunderstood this API's spec when I read it at first.
> > > After I tried to use it, I found the API cannot be used for a workaround 
> > > because
> > > this API returns total size of swiotlb.
> > >
> > > For example:
> > >  - The swiotlb_max_segment() returns 64M bytes from the API when a 
> > > default setting.
> > >   - In this case, the maximum size per a map is 256k bytes.
> > >  - The swiotlb_max_segment() returns 128M bytes from the API when I added 
> > > swiotlb=65536
> > >into the kernel parameter on arm64.
> > >   - In this case, the maximum size per a map is still 256k bytes because
> > > the swiotlb has hardcoded the size by the following code:
> > >  
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/swiotlb.c?h=v4.14-rc5#n254
> > >
> > > So, how do we handle to resolve (or avoid) the issue?
> >
> > Anyway, I made v2 patches by using swiotlb related definitions. Would you 
> > check it?
> 
> Did I miss that email? As in was I cc-ed?

This was my fault. When I submitted v2 patches, I didn't include your email and 
iommu mailing list...

> > https://patchwork.kernel.org/patch/10018879/
> 
> Why not use IO_TLB_SEGSIZE << IO_TLB_SHIFT or alternatively
> swiotlb_max_segment?  See 5584f1b1d73e9

I already made such a patch as v2 and it was merged into mmc.git / fixes branch.

https://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git/commit/?h=fixes&id=e90e8da72ad694a16a4ffa6e5adae3610208f73b

Best regards,
Yoshihiro Shimoda



RE: [PATCH 1/2] mmc: renesas_sdhi: fix swiotlb buffer is full

2017-10-19 Thread Yoshihiro Shimoda
Hi again!

> From: Yoshihiro Shimoda, Sent: Thursday, October 19, 2017 8:39 PM
> 
> Hi Geert-san, Konrad-san,
> 
> > From: Geert Uytterhoeven, Sent: Thursday, October 19, 2017 5:34 PM
> >
> > Hi Konrad,
> >
> > On Thu, Oct 19, 2017 at 2:24 AM, Konrad Rzeszutek Wilk
> > <kon...@darnok.org> wrote:
< snip >
> > >> > diff --git a/drivers/mmc/host/renesas_sdhi_internal_dmac.c 
> > >> > b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > >> > index f905f23..6c9b4b2 100644
> > >> > --- a/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > >> > +++ b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > >> > @@ -80,8 +80,9 @@
> > >> > .scc_offset = 0x1000,
> > >> > .taps   = rcar_gen3_scc_taps,
> > >> > .taps_num   = ARRAY_SIZE(rcar_gen3_scc_taps),
> > >> > -   /* Gen3 SDHI DMAC can handle 0xffffffff blk count, but seg = 1 
> > >> > */
> > >> > -   .max_blk_count  = 0xffffffff,
> > >> > +   /* The swiotlb can handle memory size up to 256 kbytes for 
> > >> > now. */
> > >> > +   .max_blk_count  = 512,
> > >>
> > >> Fixing this in the individual drivers feels like the wrong solution to 
> > >> me.
> > >>
> > >> iommu: Is there a better (generic) way to handle this?
> > >
> > > Yes. See 7453c549f5f6485c0d79cad7844870dcc7d1b34d, aka swiotlb_max_segment
> >
> > Thanks for the pointer!
> >
> > While I agree this can be used to avoid the swiotlb buffer full issue,
> > I believe it is a suboptimal solution if the device actually uses an IOMMU.
> > It limits the mapping size if CONFIG_SWIOTLB=y, which is always the
> > case for arm/arm64 these days.
> 
> I'm afraid but I misunderstood this API's spec when I read it at first.
> After I tried to use it, I found the API cannot be used for a workaround 
> because
> this API returns total size of swiotlb.
> 
> For example:
>  - The swiotlb_max_segment() returns 64M bytes from the API when a default 
> setting.
>   - In this case, the maximum size per a map is 256k bytes.
>  - The swiotlb_max_segment() returns 128M bytes from the API when I added 
> swiotlb=65536
>into the kernel parameter on arm64.
>   - In this case, the maximum size per a map is still 256k bytes because
> the swiotlb has hardcoded the size by the following code:
>  
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/swiotlb.c?h=v4.14-rc5#n254
> 
> So, how do we handle to resolve (or avoid) the issue?

Anyway, I made v2 patches by using swiotlb related definitions. Would you check 
it?
https://patchwork.kernel.org/patch/10018879/

Best regards,
Yoshihiro Shimoda

> Best regards,
> Yoshihiro Shimoda
> 
> > Gr{oetje,eeting}s,
> >
> > Geert
> >
> > --
> > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> > ge...@linux-m68k.org
> >
> > In personal conversations with technical people, I call myself a hacker. But
> > when I'm talking to journalists I just say "programmer" or something like 
> > that.
> > -- Linus Torvalds


RE: [PATCH 1/2] mmc: renesas_sdhi: fix swiotlb buffer is full

2017-10-19 Thread Yoshihiro Shimoda
Hi Geert-san, Konrad-san,

> From: Geert Uytterhoeven, Sent: Thursday, October 19, 2017 5:34 PM
> 
> Hi Konrad,
> 
> On Thu, Oct 19, 2017 at 2:24 AM, Konrad Rzeszutek Wilk
> <kon...@darnok.org> wrote:
> > On Tue, Oct 17, 2017 at 10:02:50AM +0200, Geert Uytterhoeven wrote:
> >> On Tue, Oct 17, 2017 at 9:30 AM, Yoshihiro Shimoda
> >> <yoshihiro.shimoda...@renesas.com> wrote:
> >> > Since the commit de3ee99b097d ("mmc: Delete bounce buffer handling")
> >> > deletes the bounce buffer handling, a request data size will be referred
> >> > to max_{req,seg}_size instead of MMC_QUEUE_BOUNCESZ (64k bytes).
> >> >
> >> > In other hand, renesas_sdhi_internal_dmac.c will set very big value of
> >> > max_{req,seg}_size because the max_blk_count is set to 0xffffffff.
> >> > And then, "swiotlb buffer is full" happens because swiotlb can handle
> >> > a memory size up to 256k bytes only (IO_TLB_SEGSIZE = 128 and
> >> > IO_TLB_SHIFT = 11).
> >> >
> >> > So, this patch fixes the issue to set max_blk_count to 512. Then,
> >> > the max_{req,seg}_size will be set to 256k bytes.
> >> >
> >> > Reported-by: Dirk Behme <dirk.be...@de.bosch.com>
> >> > Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda...@renesas.com>
> >>
> >> Thanks for your patch!
> >>
> >> > ---
> >> >  drivers/mmc/host/renesas_sdhi_internal_dmac.c | 5 +++--
> >> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/drivers/mmc/host/renesas_sdhi_internal_dmac.c 
> >> > b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> >> > index f905f23..6c9b4b2 100644
> >> > --- a/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> >> > +++ b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> >> > @@ -80,8 +80,9 @@
> >> > .scc_offset = 0x1000,
> >> > .taps   = rcar_gen3_scc_taps,
> >> > .taps_num   = ARRAY_SIZE(rcar_gen3_scc_taps),
> >> > -   /* Gen3 SDHI DMAC can handle 0xffffffff blk count, but seg = 1 */
> >> > -   .max_blk_count  = 0xffffffff,
> >> > +   /* The swiotlb can handle memory size up to 256 kbytes for now. 
> >> > */
> >> > +   .max_blk_count  = 512,
> >>
> >> Fixing this in the individual drivers feels like the wrong solution to me.
> >>
> >> iommu: Is there a better (generic) way to handle this?
> >
> > Yes. See 7453c549f5f6485c0d79cad7844870dcc7d1b34d, aka swiotlb_max_segment
> 
> Thanks for the pointer!
> 
> While I agree this can be used to avoid the swiotlb buffer full issue,
> I believe it is a suboptimal solution if the device actually uses an IOMMU.
> It limits the mapping size if CONFIG_SWIOTLB=y, which is always the
> case for arm/arm64 these days.

I'm afraid I misunderstood this API's spec when I first read it.
After trying to use it, I found the API cannot be used for the workaround
because it returns the total size of the swiotlb.

For example:
 - swiotlb_max_segment() returns 64M bytes with the default settings.
  - In this case, the maximum size per map is 256k bytes.
 - swiotlb_max_segment() returns 128M bytes when I added swiotlb=65536
   to the kernel parameters on arm64.
  - In this case, the maximum size per map is still 256k bytes because
    the swiotlb has hardcoded the size in the following code:
 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/swiotlb.c?h=v4.14-rc5#n254
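
(A quick check of where that 256 kbytes comes from; the constants are
the ones defined in include/linux/swiotlb.h:)

	/* maximum contiguous swiotlb mapping, hardcoded in lib/swiotlb.c */
	size = IO_TLB_SEGSIZE << IO_TLB_SHIFT;	/* 128 * 2048 = 262144 = 256k */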

So, how do we handle to resolve (or avoid) the issue?

Best regards,
Yoshihiro Shimoda

> Gr{oetje,eeting}s,
> 
> Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds


RE: [PATCH/RFC] iommu/dma: Per-domain flag to control size-alignment

2017-01-29 Thread Yoshihiro Shimoda
Hi Robin, Magnus,

> -Original Message-
> From: Robin Murphy
> Sent: Saturday, January 28, 2017 2:38 AM
> 
> Hi Magnus,
> 
> On 27/01/17 06:24, Magnus Damm wrote:
> > From: Magnus Damm <damm+rene...@opensource.se>
> >
> > Introduce the flag "no_size_align" to allow disabling size-alignment
> > on a per-domain basis. This follows the suggestion by the comment
> > in the code, however a per-device control may be preferred?
> >
> > Needed to make virtual space contiguous for certain devices.
> 
> That sounds very suspicious - a single allocation is contiguous with
> itself by definition, and anyone relying on multiple allocations being
> contiguous with one another is doing it wrong, because there's no way we
> could ever guarantee that (with this allocator, at any rate). I'd be
> very reticent to touch this without a specific example of what problem
> it solves.

Thank you for the comment! This patch came from my request.
But I completely misunderstood this "size-alignment" behavior,
and my concern was already resolved by the following patch last April:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/iommu?id=809eac54cdd62c67afea1e17080e681dfa33dc09

So, no one needs this patch anymore.

Best regards,
Yoshihiro Shimoda



[PATCH/RFC 3/4] iommu: dma: iommu iova domain reset

2017-01-25 Thread Yoshihiro Shimoda
From: Magnus Damm <damm+rene...@opensource.se>

To add workaround code for the ipmmu-vmsa driver, this patch adds
a new geometry flag, "force_reset_when_empty", to avoid reusing IOVA space.
When all PFNs happen to get unmapped, ask the IOMMU driver to
flush its state, and then start again from an empty IOVA space.

Signed-off-by: Magnus Damm <damm+rene...@opensource.se>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda...@renesas.com>
---
 drivers/iommu/dma-iommu.c | 42 +-
 drivers/iommu/iova.c  |  9 +
 include/linux/iommu.h |  2 ++
 include/linux/iova.h  |  1 +
 4 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index a0b8c0f..d0fa0b1 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -42,6 +42,7 @@ struct iommu_dma_cookie {
struct iova_domain  iovad;
struct list_headmsi_page_list;
spinlock_t  msi_lock;
+   spinlock_t  reset_lock;
 };
 
 static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
@@ -74,6 +75,7 @@ int iommu_get_dma_cookie(struct iommu_domain *domain)
 
spin_lock_init(&cookie->msi_lock);
INIT_LIST_HEAD(&cookie->msi_page_list);
+   spin_lock_init(&cookie->reset_lock);
domain->iova_cookie = cookie;
return 0;
 }
@@ -208,9 +210,11 @@ int dma_direction_to_prot(enum dma_data_direction dir, 
bool coherent)
 static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
dma_addr_t dma_limit)
 {
+   struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = cookie_iovad(domain);
unsigned long shift = iova_shift(iovad);
unsigned long length = iova_align(iovad, size) >> shift;
+   unsigned long flags;
struct iova *iova;
 
if (domain->geometry.force_aperture)
@@ -219,9 +223,19 @@ static struct iova *__alloc_iova(struct iommu_domain 
*domain, size_t size,
 * Enforce size-alignment to be safe - there could perhaps be an
 * attribute to control this per-device, or at least per-domain...
 */
-   iova = alloc_iova(iovad, length, dma_limit >> shift, true);
-   if (iova)
-   atomic_add(iova_size(iova), &cookie->iova_pfns_mapped);
+   if (domain->geometry.force_reset_when_empty) {
+   spin_lock_irqsave(&cookie->reset_lock, flags);
+
+   iova = alloc_iova(iovad, length, dma_limit >> shift, true);
+   if (iova)
+   atomic_add(iova_size(iova), &cookie->iova_pfns_mapped);
+
+   spin_unlock_irqrestore(&cookie->reset_lock, flags);
+   } else {
+   iova = alloc_iova(iovad, length, dma_limit >> shift, true);
+   if (iova)
+   atomic_add(iova_size(iova), &cookie->iova_pfns_mapped);
+   }
 
return iova;
 }
@@ -229,10 +243,28 @@ static struct iova *__alloc_iova(struct iommu_domain 
*domain, size_t size,
 void
 __free_iova_domain(struct iommu_domain *domain, struct iova *iova)
 {
+   struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = cookie_iovad(domain);
+   unsigned long flags;
 
-   atomic_sub(iova_size(iova), &cookie->iova_pfns_mapped);
-   __free_iova(iovad, iova);
+   /* In case force_reset_when_empty is set, do not reuse iova space
+* but instead simply keep on expanding seemingly forever.
+* When all pfns happen to get unmapped then ask the IOMMU driver to
+* flush the state followed by starting from an empty iova space.
+*/
+   if (domain->geometry.force_reset_when_empty) {
+   spin_lock_irqsave(&cookie->reset_lock, flags);
+   if (atomic_sub_return(iova_size(iova),
+ &iovad->iova_pfns_mapped) == 0) {
+   reset_iova_domain(iovad);
+   if (domain->ops->domain_reset)
+   domain->ops->domain_reset(domain);
+   }
+   spin_unlock_irqrestore(&cookie->reset_lock, flags);
+   } else {
+   atomic_sub(iova_size(iova), &iovad->iova_pfns_mapped);
+   __free_iova(iovad, iova);
+   }
 }
 
 
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 866ad65..50aaa46 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -464,6 +464,7 @@ void put_iova_domain(struct iova_domain *iovad)
while (node) {
struct iova *iova = rb_entry(node, struct iova, node);
 
+   __cached_rbnode_delete_update(iovad, iova);
rb_erase(node, &iovad->rbroot);
free_iova_mem(iova);
node = rb_first(&iovad->rbroot);
@@ -472,6 +473,14 @@ void put_iova_domain(struct iova_domain *iovad)
 }
 EXPORT_SYMBOL_GPL(put_iova_domain);
 
+void
+reset_iova_domain(struct iova_domain *iovad)
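
For illustration, the policy above can be summarized as a minimal
userspace sketch of the "reset when empty" idea, assuming a pthread
mutex in place of the irqsave spinlock; all names here (fwd_allocator,
fwd_alloc, fwd_free) are illustrative and are not part of this patch:

#include <pthread.h>

/* Illustrative only: a forward-only allocator that never reuses slots.
 * Once the count of live allocations drops to zero, the space is reset
 * and a flush hook (e.g. a TLB invalidate) runs while nothing is mapped. */
struct fwd_allocator {
	pthread_mutex_t lock;
	unsigned long next;		/* next free slot, never reused */
	unsigned long outstanding;	/* number of live allocations */
	void (*flush)(void);		/* safe to call only when empty */
};

static unsigned long fwd_alloc(struct fwd_allocator *a)
{
	pthread_mutex_lock(&a->lock);
	unsigned long slot = a->next++;
	a->outstanding++;
	pthread_mutex_unlock(&a->lock);
	return slot;
}

static void fwd_free(struct fwd_allocator *a)
{
	pthread_mutex_lock(&a->lock);
	if (--a->outstanding == 0) {
		a->next = 0;		/* restart from an empty space */
		if (a->flush)
			a->flush();	/* nothing is mapped right now */
	}
	pthread_mutex_unlock(&a->lock);
}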

[PATCH/RFC 2/4] iommu: iova: use __alloc_percpu_gfp() with GFP_NOWAIT in init_iova_rcaches()

2017-01-25 Thread Yoshihiro Shimoda
In the future, init_iova_rcaches() will be called in atomic context,
so the per-cpu caches must be allocated with GFP_NOWAIT instead of a
potentially sleeping allocation.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda...@renesas.com>
---
 drivers/iommu/iova.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index b7268a1..866ad65 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -723,7 +723,9 @@ static void init_iova_rcaches(struct iova_domain *iovad)
rcache = &iovad->rcaches[i];
spin_lock_init(>lock);
rcache->depot_size = 0;
-   rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache), cache_line_size());
+   rcache->cpu_rcaches = __alloc_percpu_gfp(sizeof(*cpu_rcache),
+cache_line_size(),
+GFP_NOWAIT);
if (WARN_ON(!rcache->cpu_rcaches))
continue;
for_each_possible_cpu(cpu) {
-- 
1.9.1
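
For background on the gfp choice: a GFP_KERNEL allocation may sleep to
reclaim memory, which is forbidden while holding a spinlock or otherwise
running atomically, whereas GFP_NOWAIT fails fast instead of sleeping.
A minimal sketch of that rule, using a hypothetical helper (ctx_alloc is
not an existing kernel API):

#include <linux/slab.h>
#include <linux/types.h>

/* Hypothetical helper, for illustration only: choose a gfp mask that
 * is legal for the calling context. */
static void *ctx_alloc(size_t size, bool atomic_ctx)
{
	/*
	 * GFP_KERNEL may sleep to reclaim memory; GFP_NOWAIT never
	 * sleeps and instead returns NULL when memory is not
	 * immediately available, so callers must handle failure.
	 */
	return kmalloc(size, atomic_ctx ? GFP_NOWAIT : GFP_KERNEL);
}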


[PATCH/RFC 4/4] iommu: ipmmu-vmsa: enable force_reset_when_empty

2017-01-25 Thread Yoshihiro Shimoda
On R-Car Gen3, the IPMMU can mistranslate an address if IMCTR.FLUSH
is set while other devices on the same domain are running. To avoid
this, this patch uses the force_reset_when_empty feature.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda...@renesas.com>
---
 drivers/iommu/ipmmu-vmsa.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 11550ac..4b62969 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -302,7 +302,8 @@ static void ipmmu_tlb_flush_all(void *cookie)
 {
struct ipmmu_vmsa_domain *domain = cookie;
 
-   ipmmu_tlb_invalidate(domain);
+   if (!domain->io_domain.geometry.force_reset_when_empty)
+   ipmmu_tlb_invalidate(domain);
 }
 
 static void ipmmu_tlb_add_flush(unsigned long iova, size_t size,
@@ -555,6 +556,13 @@ static void ipmmu_domain_free(struct iommu_domain *io_domain)
kfree(domain);
 }
 
+static void ipmmu_domain_reset(struct iommu_domain *io_domain)
+{
+   struct ipmmu_vmsa_domain *domain = to_vmsa_domain(io_domain);
+
+   ipmmu_tlb_invalidate(domain);
+}
+
 static int ipmmu_attach_device(struct iommu_domain *io_domain,
   struct device *dev)
 {
@@ -832,6 +840,7 @@ static struct iommu_domain *ipmmu_domain_alloc(unsigned type)
 static const struct iommu_ops ipmmu_ops = {
.domain_alloc = ipmmu_domain_alloc,
.domain_free = ipmmu_domain_free,
+   .domain_reset = ipmmu_domain_reset,
.attach_dev = ipmmu_attach_device,
.detach_dev = ipmmu_detach_device,
.map = ipmmu_map,
@@ -858,8 +867,10 @@ static struct iommu_domain *ipmmu_domain_alloc_dma(unsigned type)
 
case IOMMU_DOMAIN_DMA:
io_domain = __ipmmu_domain_alloc(type);
-   if (io_domain)
+   if (io_domain) {
iommu_get_dma_cookie(io_domain);
+   io_domain->geometry.force_reset_when_empty = true;
+   }
break;
}
 
@@ -927,6 +938,7 @@ static int ipmmu_of_xlate_dma(struct device *dev,
 static const struct iommu_ops ipmmu_ops = {
.domain_alloc = ipmmu_domain_alloc_dma,
.domain_free = ipmmu_domain_free_dma,
+   .domain_reset = ipmmu_domain_reset,
.attach_dev = ipmmu_attach_device,
.detach_dev = ipmmu_detach_device,
.map = ipmmu_map,
-- 
1.9.1
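
Stripped of the driver details, the shape of the workaround is a
deferred flush: per-unmap invalidation is suppressed, and a single
flush runs once the domain is known to be empty. A minimal sketch with
illustrative names (struct dom, hw_invalidate), not the patch itself:

#include <stdbool.h>

struct dom {
	bool defer_flush;	/* i.e. force_reset_when_empty */
	unsigned long mapped;	/* live mappings on this domain */
};

static void hw_invalidate(struct dom *d)
{
	/* stand-in for the real flush, e.g. setting IMCTR.FLUSH */
}

static void tlb_flush_all(struct dom *d)
{
	if (!d->defer_flush)
		hw_invalidate(d);	/* normal path: flush immediately */
	/* deferred path: do nothing here; domain_reset() flushes later */
}

static void domain_reset(struct dom *d)
{
	/* called only when d->mapped == 0, so no device can be relying
	 * on a translation that the flush would invalidate underneath it */
	hw_invalidate(d);
}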


[PATCH] iommu: fix second argument of trace_map() to report correct paddr

2016-02-09 Thread Yoshihiro Shimoda
Since the iommu_map() code adds pgsize to paddr for each page it maps,
trace_map() reported the wrong (final) paddr. This patch saves the
original value as "orig_paddr" in iommu_map() and uses it for trace_map().

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda...@renesas.com>
---
 drivers/iommu/iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0e3b009..bfd4f7c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1314,6 +1314,7 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
unsigned long orig_iova = iova;
unsigned int min_pagesz;
size_t orig_size = size;
+   phys_addr_t orig_paddr = paddr;
int ret = 0;
 
if (unlikely(domain->ops->map == NULL ||
@@ -1358,7 +1359,7 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
if (ret)
iommu_unmap(domain, orig_iova, orig_size - size);
else
-   trace_map(orig_iova, paddr, orig_size);
+   trace_map(orig_iova, orig_paddr, orig_size);
 
return ret;
 }
-- 
1.9.1
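
The bug class is worth a standalone illustration: a loop advances its
cursor, so logging the cursor afterwards reports the end of the range
rather than its start. A minimal sketch (map_range and its page loop
are illustrative, not the kernel code):

#include <stdio.h>

static void map_range(unsigned long iova, unsigned long paddr,
		      unsigned long size, unsigned long pgsize)
{
	unsigned long orig_paddr = paddr;	/* save before the loop mutates it */

	while (size) {
		/* ... map one page at (iova, paddr) ... */
		iova += pgsize;
		paddr += pgsize;
		size -= pgsize;
	}

	/* Logging paddr here would report the address one past the range. */
	printf("mapped range starting at paddr %#lx\n", orig_paddr);
}

int main(void)
{
	map_range(0x1000, 0x2000, 0x4000, 0x1000);
	return 0;
}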
