RE: [PATCH 1/2] mmc: renesas_sdhi: fix swiotlb buffer is full

2017-11-01 Thread Yoshihiro Shimoda
Hi,

> From: Konrad Rzeszutek Wilk, Sent: Wednesday, November 1, 2017 10:27 PM
> 
> On Fri, Oct 20, 2017 at 03:18:55AM +0000, Yoshihiro Shimoda wrote:
> > Hi again!
> >
> > > From: Yoshihiro Shimoda, Sent: Thursday, October 19, 2017 8:39 PM
> > >
> > > Hi Geert-san, Konrad-san,
> > >
> > > > From: Geert Uytterhoeven, Sent: Thursday, October 19, 2017 5:34 PM
> > > >
> > > > Hi Konrad,
> > > >
> > > > On Thu, Oct 19, 2017 at 2:24 AM, Konrad Rzeszutek Wilk
> > > >  wrote:
> > < snip >
> > > > >> > diff --git a/drivers/mmc/host/renesas_sdhi_internal_dmac.c 
> > > > >> > b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > > > >> > index f905f23..6c9b4b2 100644
> > > > >> > --- a/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > > > >> > +++ b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > > > >> > @@ -80,8 +80,9 @@
> > > > >> > .scc_offset = 0x1000,
> > > > >> > .taps   = rcar_gen3_scc_taps,
> > > > >> > .taps_num   = ARRAY_SIZE(rcar_gen3_scc_taps),
> > > > >> > -   /* Gen3 SDHI DMAC can handle 0xffffffff blk count, but seg = 1 */
> > > > >> > -   .max_blk_count  = 0xffffffff,
> > > > >> > +   /* The swiotlb can handle memory size up to 256 kbytes for now. */
> > > > >> > +   .max_blk_count  = 512,
> > > > >>
> > > > >> Fixing this in the individual drivers feels like the wrong solution 
> > > > >> to me.
> > > > >>
> > > > >> iommu: Is there a better (generic) way to handle this?
> > > > >
> > > > > Yes. See 7453c549f5f6485c0d79cad7844870dcc7d1b34d, aka 
> > > > > swiotlb_max_segment
> > > >
> > > > Thanks for the pointer!
> > > >
> > > > While I agree this can be used to avoid the swiotlb buffer full issue,
> > > > I believe it is a suboptimal solution if the device actually uses an 
> > > > IOMMU.
> > > > It limits the mapping size if CONFIG_SWIOTLB=y, which is always the
> > > > case for arm/arm64 these days.
> > >
> > > I'm afraid I misunderstood this API's spec when I first read it.
> > > After trying to use it, I found the API cannot be used for this
> > > workaround because it returns the total size of the swiotlb.
> > >
> > > For example:
> > >  - swiotlb_max_segment() returns 64 Mbytes with the default settings.
> > >    - In this case, the maximum size per map is 256 Kbytes.
> > >  - swiotlb_max_segment() returns 128 Mbytes when I add swiotlb=65536
> > >    to the kernel parameters on arm64.
> > >    - In this case, the maximum size per map is still 256 Kbytes, because
> > >      the swiotlb hardcodes the per-map size in the following code:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/swiotlb.c?h=v4.14-rc5#n254
> > >
> > > So, how should we resolve (or avoid) this issue?
> >
> > Anyway, I made v2 patches using swiotlb-related definitions. Would you
> > check them?
> 
> Did I miss that email? As in was I cc-ed?

This was my fault. When I submitted the v2 patches, I didn't CC you or the
iommu mailing list...

> > https://patchwork.kernel.org/patch/10018879/
> 
> Why not use IO_TLB_SEGSIZE << IO_TLB_SHIFT or alternatively
> swiotlb_max_segment?  See 5584f1b1d73e9

I already made such a patch in v2, and it was merged into the mmc.git fixes branch:

https://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git/commit/?h=fixes&id=e90e8da72ad694a16a4ffa6e5adae3610208f73b
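
[For reference, the shape of that workaround: cap the maximum request size
at the swiotlb per-mapping limit, which is IO_TLB_SEGSIZE slots of
(1 << IO_TLB_SHIFT) bytes each, i.e. 128 * 2 Kbytes = 256 Kbytes. Below is
a minimal sketch using the definitions Konrad mentions; the merged commit
may differ in detail, and the mmc_data->max_req_size field name is
illustrative, not taken from the commit.]

#include <linux/swiotlb.h>	/* IO_TLB_SEGSIZE, IO_TLB_SHIFT, swiotlb_max_segment() */

	/* Only clamp when swiotlb is actually available for DMA bouncing. */
	if (swiotlb_max_segment()) {
		/* 128 slots * 2 Kbytes per slot = 256 Kbytes per mapping */
		unsigned int max_size = (1 << IO_TLB_SHIFT) * IO_TLB_SEGSIZE;

		if (mmc_data->max_req_size > max_size)
			mmc_data->max_req_size = max_size;
	}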

Best regards,
Yoshihiro Shimoda



RE: [PATCH v4 03/12] intel-ipu3: Add IOMMU based dmamap support

2017-11-01 Thread Zhi, Yong
Hi, Sakari,

> -Original Message-
> From: Sakari Ailus [mailto:sakari.ai...@iki.fi]
> Sent: Friday, October 20, 2017 2:20 AM
> To: Zhi, Yong 
> Cc: linux-me...@vger.kernel.org; sakari.ai...@linux.intel.com; Zheng, Jian
> Xu ; Mani, Rajmohan
> ; Toivonen, Tuukka
> ; Hu, Jerry W ;
> a...@arndb.de; h...@lst.de; robin.mur...@arm.com; iommu@lists.linux-
> foundation.org; Tomasz Figa 
> Subject: Re: [PATCH v4 03/12] intel-ipu3: Add IOMMU based dmamap
> support
> 
> Hi Yong,
> 
> On Tue, Oct 17, 2017 at 10:48:59PM -0500, Yong Zhi wrote:
> > From: Tomasz Figa 
> >
> > This patch adds driver to support IPU3-specific MMU-aware memory
> > alloc/free and sg mapping functions.
> >
> > Signed-off-by: Tomasz Figa 
> > Signed-off-by: Yong Zhi 
> > ---
> >  drivers/media/pci/intel/ipu3/Kconfig   |   7 +
> >  drivers/media/pci/intel/ipu3/Makefile  |   2 +-
> >  drivers/media/pci/intel/ipu3/ipu3-dmamap.c | 342 +++++++++++++++++++++++++++
> >  drivers/media/pci/intel/ipu3/ipu3-dmamap.h |  33 +++
> >  4 files changed, 383 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/media/pci/intel/ipu3/ipu3-dmamap.c
> >  create mode 100644 drivers/media/pci/intel/ipu3/ipu3-dmamap.h
> >
> > diff --git a/drivers/media/pci/intel/ipu3/Kconfig
> > b/drivers/media/pci/intel/ipu3/Kconfig
> > index 46ff138f3e50..d7dab52dc881 100644
> > --- a/drivers/media/pci/intel/ipu3/Kconfig
> > +++ b/drivers/media/pci/intel/ipu3/Kconfig
> > @@ -26,3 +26,10 @@ config INTEL_IPU3_MMU
> > ---help---
> >   For IPU3, this option enables its MMU driver to translate its internal
> >   virtual address to 39 bits wide physical address for 64GBytes space
> >   access.
> > +
> > +config INTEL_IPU3_DMAMAP
> > +   tristate
> > +   default n
> > +   select IOMMU_IOVA
> > +   ---help---
> > + This is IPU3 IOMMU domain specific DMA driver.
> > diff --git a/drivers/media/pci/intel/ipu3/Makefile
> > b/drivers/media/pci/intel/ipu3/Makefile
> > index 91cac9cb7401..651773231496 100644
> > --- a/drivers/media/pci/intel/ipu3/Makefile
> > +++ b/drivers/media/pci/intel/ipu3/Makefile
> > @@ -13,4 +13,4 @@
> >
> >  obj-$(CONFIG_VIDEO_IPU3_CIO2) += ipu3-cio2.o
> >  obj-$(CONFIG_INTEL_IPU3_MMU) += ipu3-mmu.o
> > -
> > +obj-$(CONFIG_INTEL_IPU3_DMAMAP) += ipu3-dmamap.o
> > diff --git a/drivers/media/pci/intel/ipu3/ipu3-dmamap.c
> > b/drivers/media/pci/intel/ipu3/ipu3-dmamap.c
> > new file mode 100644
> > index ..e54bd9dfa302
> > --- /dev/null
> > +++ b/drivers/media/pci/intel/ipu3/ipu3-dmamap.c
> > @@ -0,0 +1,342 @@
> > +/*
> > + * Copyright (c) 2017 Intel Corporation.
> > + * Copyright (C) 2017 Google, Inc.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > +version
> > + * 2 as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> 
> Do you need this for something?
> 

Ouch, I will remove the unneeded headers.

> > +#include 
> > +
> > +#include "ipu3-css-pool.h"
> > +#include "ipu3.h"
> > +
> > +/*
> > + * Based on arch/arm64/mm/dma-mapping.c, with simplifications possible
> > + * due to driver-specific character of this file.
> > + */
> > +
> > +static int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
> > +{
> > +   int prot = coherent ? IOMMU_CACHE : 0;
> > +
> > +   switch (dir) {
> > +   case DMA_BIDIRECTIONAL:
> > +   return prot | IOMMU_READ | IOMMU_WRITE;
> > +   case DMA_TO_DEVICE:
> > +   return prot | IOMMU_READ;
> > +   case DMA_FROM_DEVICE:
> > +   return prot | IOMMU_WRITE;
> > +   default:
> > +   return 0;
> > +   }
> > +}
> > +
> > +/*
> > + * Free a buffer allocated by ipu3_dmamap_alloc_buffer()
> > + */
> > +static void ipu3_dmamap_free_buffer(struct page **pages, size_t size)
> > +{
> > +   int count = size >> PAGE_SHIFT;
> > +
> > +   while (count--)
> > +   __free_page(pages[count]);
> > +   kvfree(pages);
> > +}
> > +
> > +/*
> > + * Based on the implementation of __iommu_dma_alloc_pages()
> > + * defined in drivers/iommu/dma-iommu.c
> > + */
> > +static struct page **ipu3_dmamap_alloc_buffer(size_t size,
> > +  unsigned long order_mask, gfp_t gfp)
> > +{
> > +   struct page **pages;
> > +   unsigned int i = 0, count = size >> PAGE_SHIFT;
> > +  

[PATCH 2/2] iommu/ipmmu-vmsa: Hook up r8a779(70|95) DT matching code

2017-11-01 Thread Simon Horman
Support the r8a77970 (R-Car V3M) and r8a77995 (R-Car D3) IPMMUs by sharing
feature flags with r8a7795 (R-Car H3) and r8a7796 (R-Car M3-W). Also update
IOMMU_OF_DECLARE to hook up the compat strings.

Based on work for the r8a7796 by Magnus Damm.

Signed-off-by: Simon Horman 
---
 drivers/iommu/ipmmu-vmsa.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 02989bb060cc..c22520a453ff 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -749,6 +749,8 @@ static bool ipmmu_slave_whitelist(struct device *dev)
 static const struct soc_device_attribute soc_rcar_gen3[] = {
{ .soc_id = "r8a7795", },
{ .soc_id = "r8a7796", },
+   { .soc_id = "r8a77970", },
+   { .soc_id = "r8a77995", },
{ /* sentinel */ }
 };
 
@@ -1036,6 +1038,12 @@ static const struct of_device_id ipmmu_of_ids[] = {
.compatible = "renesas,ipmmu-r8a7796",
.data = &ipmmu_features_rcar_gen3,
}, {
+   .compatible = "renesas,ipmmu-r8a77970",
.data = &ipmmu_features_rcar_gen3,
+   }, {
+   .compatible = "renesas,ipmmu-r8a77995",
.data = &ipmmu_features_rcar_gen3,
+   }, {
/* Terminator */
},
 };
@@ -1224,6 +1232,10 @@ IOMMU_OF_DECLARE(ipmmu_r8a7795_iommu_of, "renesas,ipmmu-r8a7795",
 ipmmu_vmsa_iommu_of_setup);
 IOMMU_OF_DECLARE(ipmmu_r8a7796_iommu_of, "renesas,ipmmu-r8a7796",
 ipmmu_vmsa_iommu_of_setup);
+IOMMU_OF_DECLARE(ipmmu_r8a77970_iommu_of, "renesas,ipmmu-r8a77970",
+ipmmu_vmsa_iommu_of_setup);
+IOMMU_OF_DECLARE(ipmmu_r8a77995_iommu_of, "renesas,ipmmu-r8a77995",
+ipmmu_vmsa_iommu_of_setup);
 #endif
 
 MODULE_DESCRIPTION("IOMMU API for Renesas VMSA-compatible IPMMU");
-- 
2.11.0



Re: [PATCH 1/2] mmc: renesas_sdhi: fix swiotlb buffer is full

2017-11-01 Thread Konrad Rzeszutek Wilk
On Fri, Oct 20, 2017 at 03:18:55AM +0000, Yoshihiro Shimoda wrote:
> Hi again!
> 
> > From: Yoshihiro Shimoda, Sent: Thursday, October 19, 2017 8:39 PM
> > 
> > Hi Geert-san, Konrad-san,
> > 
> > > From: Geert Uytterhoeven, Sent: Thursday, October 19, 2017 5:34 PM
> > >
> > > Hi Konrad,
> > >
> > > On Thu, Oct 19, 2017 at 2:24 AM, Konrad Rzeszutek Wilk
> > >  wrote:
> < snip >
> > > >> > diff --git a/drivers/mmc/host/renesas_sdhi_internal_dmac.c 
> > > >> > b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > > >> > index f905f23..6c9b4b2 100644
> > > >> > --- a/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > > >> > +++ b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
> > > >> > @@ -80,8 +80,9 @@
> > > >> > .scc_offset = 0x1000,
> > > >> > .taps   = rcar_gen3_scc_taps,
> > > >> > .taps_num   = ARRAY_SIZE(rcar_gen3_scc_taps),
> > > >> > -   /* Gen3 SDHI DMAC can handle 0xffffffff blk count, but seg = 1 */
> > > >> > -   .max_blk_count  = 0xffffffff,
> > > >> > +   /* The swiotlb can handle memory size up to 256 kbytes for now. */
> > > >> > +   .max_blk_count  = 512,
> > > >>
> > > >> Fixing this in the individual drivers feels like the wrong solution to 
> > > >> me.
> > > >>
> > > >> iommu: Is there a better (generic) way to handle this?
> > > >
> > > > Yes. See 7453c549f5f6485c0d79cad7844870dcc7d1b34d, aka 
> > > > swiotlb_max_segment
> > >
> > > Thanks for the pointer!
> > >
> > > While I agree this can be used to avoid the swiotlb buffer full issue,
> > > I believe it is a suboptimal solution if the device actually uses an 
> > > IOMMU.
> > > It limits the mapping size if CONFIG_SWIOTLB=y, which is always the
> > > case for arm/arm64 these days.
> > 
> > I'm afraid I misunderstood this API's spec when I first read it.
> > After trying to use it, I found the API cannot be used for this
> > workaround because it returns the total size of the swiotlb.
> > 
> > For example:
> >  - swiotlb_max_segment() returns 64 Mbytes with the default settings.
> >    - In this case, the maximum size per map is 256 Kbytes.
> >  - swiotlb_max_segment() returns 128 Mbytes when I add swiotlb=65536
> >    to the kernel parameters on arm64.
> >    - In this case, the maximum size per map is still 256 Kbytes, because
> >      the swiotlb hardcodes the per-map size in the following code:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/swiotlb.c?h=v4.14-rc5#n254
> > 
> > So, how should we resolve (or avoid) this issue?
> 
> Anyway, I made v2 patches using swiotlb-related definitions. Would you
> check them?

Did I miss that email? As in was I cc-ed?

> https://patchwork.kernel.org/patch/10018879/

Why not use IO_TLB_SEGSIZE << IO_TLB_SHIFT or alternatively
swiotlb_max_segment?  See 5584f1b1d73e9


Re: [PATCH for-next 2/4] RDMA/hns: Add IOMMU enable support in hip08

2017-11-01 Thread Robin Murphy
On 01/11/17 07:46, Wei Hu (Xavier) wrote:
> 
> 
> On 2017/10/12 20:59, Robin Murphy wrote:
>> On 12/10/17 13:31, Wei Hu (Xavier) wrote:
>>>
>>> On 2017/10/1 0:10, Leon Romanovsky wrote:
>>>> On Sat, Sep 30, 2017 at 05:28:59PM +0800, Wei Hu (Xavier) wrote:
>>>>> If the IOMMU is enabled, the length of sg obtained from
>>>>> __iommu_map_sg_attrs is not 4kB. When the IOVA is set with the sg
>>>>> dma address, the IOVA will not be page continuous. and the VA
>>>>> returned from dma_alloc_coherent is a vmalloc address. However,
>>>>> the VA obtained by the page_address is a discontinuous VA. Under
>>>>> these circumstances, the IOVA should be calculated based on the
>>>>> sg length, and record the VA returned from dma_alloc_coherent
>>>>> in the struct of hem.
>>>>>
>>>>> Signed-off-by: Wei Hu (Xavier)
>>>>> Signed-off-by: Shaobo Xu
>>>>> Signed-off-by: Lijun Ou
>>>>> ---
>>>> Doug,
>>>>
>>>> I didn't invest time in reviewing it, but having "is_vmalloc_addr" in
>>>> driver code to deal with dma_alloc_coherent is most probably wrong.
>>>>
>>>> Thanks
>>> Hi, Leon & Doug
>>> We referred to the function named __ttm_dma_alloc_page in the kernel
>>> code, as below. There are similar methods in the bch_bio_map and
>>> mem_to_page functions in current 4.14-rcX.
>>>
>>> static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
>>> {
>>> struct dma_page *d_page;
>>>
>>> d_page = kmalloc(sizeof(struct dma_page), GFP_KERNEL);
>>> if (!d_page)
>>> return NULL;
>>>
>>> d_page->vaddr = dma_alloc_coherent(pool->dev, pool->size,
>>>&d_page->dma,
>>>pool->gfp_flags);
>>> if (d_page->vaddr) {
>>> if (is_vmalloc_addr(d_page->vaddr))
>>> d_page->p = vmalloc_to_page(d_page->vaddr);
>>> else
>>> d_page->p = virt_to_page(d_page->vaddr);
>> There are cases on various architectures where neither of those is
>> right. Whether those actually intersect with TTM or RDMA use-cases is
>> another matter, of course.
>>
>> What definitely is a problem is if you ever take that page and end up
>> accessing it through any virtual address other than the one explicitly
>> returned by dma_alloc_coherent(). That can blow the coherency wide open
>> and invite data loss, right up to killing the whole system with a
>> machine check on certain architectures.
>>
>> Robin.
> Hi, Robin
> Thanks for your comment.
> 
> We have one problem, and the related code is below.
> 1. Call the dma_alloc_coherent function several times to allocate memory.
> 2. vmap the allocated memory pages.
> 3. Software accesses the memory using the virt addr returned by vmap,
> and hardware uses the dma addr from dma_alloc_coherent.

The simple answer is "don't do that". Seriously. dma_alloc_coherent()
gives you a CPU virtual address and a DMA address with which to access
your buffer, and that is the limit of what you may infer about it. You
have no guarantee that the virtual address is either in the linear map
or vmalloc, and not some other special place. You have no guarantee that
the underlying memory even has an associated struct page at all.
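
[A hedged sketch of the pattern Robin describes: make one coherent
allocation per buffer and derive both the CPU and device views by
offsetting the addresses dma_alloc_coherent() itself returned, instead of
reconstructing struct pages and vmap()ing them. Names below are
illustrative, not taken from the hns driver.]

#include <linux/dma-mapping.h>	/* dma_alloc_coherent() */

	struct coherent_buf {
		void		*vaddr;	/* CPU view: only access via this */
		dma_addr_t	dma;	/* device view: program into hw */
		size_t		size;
	};

	static int coherent_buf_alloc(struct device *dev,
				      struct coherent_buf *buf, size_t size)
	{
		buf->vaddr = dma_alloc_coherent(dev, size, &buf->dma,
						GFP_KERNEL);
		if (!buf->vaddr)
			return -ENOMEM;
		buf->size = size;
		return 0;
	}

	/*
	 * Chunk at a byte offset: both views move by the same offset, so
	 * no virt_to_page()/vmalloc_to_page() guessing is ever needed.
	 */
	static void *coherent_buf_chunk(struct coherent_buf *buf,
					size_t offset, dma_addr_t *dma)
	{
		*dma = buf->dma + offset;
		return buf->vaddr + offset;
	}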

> When the IOMMU is disabled on the ARM64 architecture, we use virt_to_page()
> before vmap() and it works. When the IOMMU is enabled, using
> virt_to_page() causes a call trace later; we found the addr returned
> by dma_alloc_coherent is a vmalloc addr, so we added the conditional
> statement below, and it works.
> for (i = 0; i < buf->nbufs; ++i)
> pages[i] =
> is_vmalloc_addr(buf->page_list[i].buf) ?
> vmalloc_to_page(buf->page_list[i].buf) :
> virt_to_page(buf->page_list[i].buf);
> Can you give us a suggestion or a better method?

Oh my goodness, having now taken a closer look at this driver, I'm lost
for words in disbelief. To pick just one example:

u32 bits_per_long = BITS_PER_LONG;
...
if (bits_per_long == 64) {
/* memory mapping nonsense */
}

WTF does the size of a long have to do with DMA buffer management!?

Of course I can guess that it might be trying to make some tortuous
inference about vmalloc space being constrained on 32-bit platforms, but
still...

> 
> The related code as below:
> buf->page_list = kcalloc(buf->nbufs, sizeof(*buf->page_list),
>  GFP_KERNEL);
> if (!buf->page_list)
> return -ENOMEM;
> 
> for (i = 0; i < buf->nbufs; ++i) {
> buf->page_list[i].buf = dma_alloc_coherent(dev,
>   page_size, &t,
>   GFP_KERNEL);
> if (!buf->page_list[i].buf)
> goto err_free;
> 
> 

Re: [PATCH for-next 2/4] RDMA/hns: Add IOMMU enable support in hip08

2017-11-01 Thread Wei Hu (Xavier)


On 2017/10/12 20:59, Robin Murphy wrote:
> On 12/10/17 13:31, Wei Hu (Xavier) wrote:
>>
>> On 2017/10/1 0:10, Leon Romanovsky wrote:
>>> On Sat, Sep 30, 2017 at 05:28:59PM +0800, Wei Hu (Xavier) wrote:
>>>> If the IOMMU is enabled, the length of sg obtained from
>>>> __iommu_map_sg_attrs is not 4kB. When the IOVA is set with the sg
>>>> dma address, the IOVA will not be page continuous. and the VA
>>>> returned from dma_alloc_coherent is a vmalloc address. However,
>>>> the VA obtained by the page_address is a discontinuous VA. Under
>>>> these circumstances, the IOVA should be calculated based on the
>>>> sg length, and record the VA returned from dma_alloc_coherent
>>>> in the struct of hem.
>>>>
>>>> Signed-off-by: Wei Hu (Xavier)
>>>> Signed-off-by: Shaobo Xu
>>>> Signed-off-by: Lijun Ou
>>>> ---
>>> Doug,
>>>
>>> I didn't invest time in reviewing it, but having "is_vmalloc_addr" in
>>> driver code to deal with dma_alloc_coherent is most probably wrong.
>>>
>>> Thanks
>> Hi, Leon & Doug
>> We referred to the function named __ttm_dma_alloc_page in the kernel
>> code, as below. There are similar methods in the bch_bio_map and
>> mem_to_page functions in current 4.14-rcX.
>>
>> static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
>> {
>> struct dma_page *d_page;
>>
>> d_page = kmalloc(sizeof(struct dma_page), GFP_KERNEL);
>> if (!d_page)
>> return NULL;
>>
>> d_page->vaddr = dma_alloc_coherent(pool->dev, pool->size,
>>&d_page->dma,
>>pool->gfp_flags);
>> if (d_page->vaddr) {
>> if (is_vmalloc_addr(d_page->vaddr))
>> d_page->p = vmalloc_to_page(d_page->vaddr);
>> else
>> d_page->p = virt_to_page(d_page->vaddr);
> There are cases on various architectures where neither of those is
> right. Whether those actually intersect with TTM or RDMA use-cases is
> another matter, of course.
>
> What definitely is a problem is if you ever take that page and end up
> accessing it through any virtual address other than the one explicitly
> returned by dma_alloc_coherent(). That can blow the coherency wide open
> and invite data loss, right up to killing the whole system with a
> machine check on certain architectures.
>
> Robin.
Hi, Robin
Thanks for your comment.

We have one problem, and the related code is below.
1. Call the dma_alloc_coherent function several times to allocate memory.
2. vmap the allocated memory pages.
3. Software accesses the memory using the virt addr returned by vmap,
and hardware uses the dma addr from dma_alloc_coherent.
 
When the IOMMU is disabled on the ARM64 architecture, we use virt_to_page()
before vmap() and it works. When the IOMMU is enabled, using
virt_to_page() causes a call trace later; we found the addr returned
by dma_alloc_coherent is a vmalloc addr, so we added the conditional
statement below, and it works.
for (i = 0; i < buf->nbufs; ++i)
pages[i] =
is_vmalloc_addr(buf->page_list[i].buf) ?
vmalloc_to_page(buf->page_list[i].buf) :
virt_to_page(buf->page_list[i].buf);
Can you give us a suggestion or a better method?

The related code as below:
buf->page_list = kcalloc(buf->nbufs, sizeof(*buf->page_list),
 GFP_KERNEL);
if (!buf->page_list)
return -ENOMEM;

for (i = 0; i < buf->nbufs; ++i) {
buf->page_list[i].buf = dma_alloc_coherent(dev,
  page_size, &t,
  GFP_KERNEL);
if (!buf->page_list[i].buf)
goto err_free;

buf->page_list[i].map = t;
memset(buf->page_list[i].buf, 0, page_size);
}

pages = kmalloc_array(buf->nbufs, sizeof(*pages),
  GFP_KERNEL);
if (!pages)
goto err_free;

for (i = 0; i < buf->nbufs; ++i)
pages[i] =
is_vmalloc_addr(buf->page_list[i].buf) ?
vmalloc_to_page(buf->page_list[i].buf) :
virt_to_page(buf->page_list[i].buf);

buf->direct.buf = vmap(pages, buf->nbufs, VM_MAP,
   PAGE_KERNEL);
kfree(pages);
if (!buf->direct.buf)
goto err_free;

Regards
Wei Hu
>> } else {
>> kfree(d_page);
>> d_page = NULL;
>> }
>> return d_page;
>> }
>>
>> Regards
>> Wei Hu
>>>>  drivers/infiniband/hw/hns/hns_roce_alloc.c |  5 -
>>>>  drivers/infiniband/hw/hns/hns_roce_hem.c   | 30 +++---
>>>>  drivers/infiniband/hw/hns/hns_roce_hem.h   |  6 ++