Re: [PATCH v2 5/8] iommu/io-pgtable-arm: Support lockless operation

2017-06-26 Thread Linu Cherian
On Fri Jun 23, 2017 at 05:04:05PM +0530, Linu Cherian wrote: > On Fri Jun 23, 2017 at 11:35:25AM +0100, Robin Murphy wrote: > > On 23/06/17 09:56, Linu Cherian wrote: > > > On Fri Jun 23, 2017 at 11:23:26AM +0530, Linu Cherian wrote: > > >> > > >> Robin, > > >> Was trying to understand the new

Re: clean up and modularize arch dma_mapping interface V2

2017-06-26 Thread tndave
On 06/26/2017 02:47 AM, Christoph Hellwig wrote: On Sat, Jun 24, 2017 at 10:36:56AM -0500, Benjamin Herrenschmidt wrote: I think we still need to do it. For example we have a bunch new "funky" cases. I have no plan to do away with the selection - I just want a better interface than the

Re: [PATCH v7 34/36] x86/mm: Add support to encrypt the kernel in-place

2017-06-26 Thread Tom Lendacky
On 6/26/2017 10:45 AM, Borislav Petkov wrote: On Fri, Jun 23, 2017 at 12:44:46PM -0500, Tom Lendacky wrote: Normally the __p4d() macro would be used and that would be ok whether CONFIG_X86_5LEVEL is defined or not. But since __p4d() is part of the paravirt ops path I have to use

Re: [PATCH v7 34/36] x86/mm: Add support to encrypt the kernel in-place

2017-06-26 Thread Borislav Petkov
On Fri, Jun 23, 2017 at 12:44:46PM -0500, Tom Lendacky wrote: > Normally the __p4d() macro would be used and that would be ok whether > CONFIG_X86_5LEVEL is defined or not. But since __p4d() is part of the > paravirt ops path I have to use native_make_p4d(). So __p4d is in !CONFIG_PARAVIRT path.

Re: [RFC 5/9] iommu: Introduce fault notifier API

2017-06-26 Thread Alex Williamson
On Mon, 26 Jun 2017 08:27:52 -0700 Jacob Pan wrote: > On Fri, 23 Jun 2017 13:15:51 -0600 > Alex Williamson wrote: > > > On Fri, 23 Jun 2017 11:59:28 -0700 > > Jacob Pan wrote: > > > > > On Thu, 22 Jun

Re: [RFC 5/9] iommu: Introduce fault notifier API

2017-06-26 Thread Jacob Pan
On Fri, 23 Jun 2017 13:15:51 -0600 Alex Williamson wrote: > On Fri, 23 Jun 2017 11:59:28 -0700 > Jacob Pan wrote: > > > On Thu, 22 Jun 2017 16:53:17 -0600 > > Alex Williamson wrote: > > > > > On Wed, 14

Kernel crashes in iommu_flush_iotlb_psi()

2017-06-26 Thread Christian Kauhaus
Hi everyone, after updating our storage servers to kernel 4.9.33 we experience repeated crashes on some machines. A typical kernel stacktrace looks like this: BUG: unable to handle kernel NULL pointer dereference at 0304 [136234.369489] IP: [] flush_unmaps_timeout+0xa7/0x1c0

Re: [PATCH 1/1] iommu/arm-smmu-v3: replace writel with writel_relaxed in queue_inc_prod

2017-06-26 Thread Leizhen (ThunderTown)
On 2017/6/26 21:29, Leizhen (ThunderTown) wrote: > > > On 2017/6/21 17:08, Will Deacon wrote: >> On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote: >>> On 2017/6/20 19:35, Robin Murphy wrote: On 20/06/17 12:04, Zhen Lei wrote: > This function is protected by

[PATCH 1/5] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction

2017-06-26 Thread Zhen Lei
All TLBI commands should be followed by a SYNC command, to make sure that they have completely finished. So we can just add the TLBI commands to the queue and put off their execution until we meet a SYNC or other command. To prevent the following SYNC command from waiting for a long time because
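The deferral idea in this snippet can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the patch's code: `struct cmd_queue`, `queue_insert()`, `queue_publish()` and the `doorbells` counter are hypothetical names, and "publishing" stands in for whatever makes queued commands visible to the SMMU. TLBI commands are queued without being published; a SYNC (or any other command) publishes everything queued so far.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command kinds; not the driver's real opcodes. */
enum cmd_kind { CMD_TLBI, CMD_SYNC, CMD_OTHER };

#define QUEUE_SLOTS 64

struct cmd_queue {
    enum cmd_kind slots[QUEUE_SLOTS];
    size_t prod_local;   /* producer index known only to software */
    size_t prod_hw;      /* producer index the hardware would see */
    unsigned doorbells;  /* number of simulated producer-register writes */
};

/* Make all queued commands visible; models one producer-register write. */
static void queue_publish(struct cmd_queue *q)
{
    if (q->prod_hw != q->prod_local) {
        q->prod_hw = q->prod_local;
        q->doorbells++;
    }
}

/* Queue a command. TLBI commands are deferred; anything else publishes. */
static void queue_insert(struct cmd_queue *q, enum cmd_kind kind)
{
    assert(q->prod_local < QUEUE_SLOTS); /* sketch: no wrap-around handling */
    q->slots[q->prod_local++] = kind;
    if (kind != CMD_TLBI)
        queue_publish(q);
}
```

In this model, inserting any number of TLBI commands followed by one SYNC costs a single publish step. The snippet's truncated last sentence suggests the real patch also bounds how long a SYNC may end up waiting; that part is not modelled here.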

[PATCH 0/5] arm-smmu: performance optimization

2017-06-26 Thread Zhen Lei
I described the optimization in more detail in patches 1 and 2, and patches 3-5 are the implementation of patch 2 on arm-smmu/arm-smmu-v3. Patch 1 is v2. In v1, I directly replaced writel with writel_relaxed in queue_inc_prod. But Robin figured that it may lead the SMMU to consume stale memory contents. I

[PATCH 3/5] iommu/arm-smmu-v3: add support for unmap an iova range with only one tlb sync

2017-06-26 Thread Zhen Lei
1. remove tlb_sync operation in "unmap"
2. make sure each "unmap" will always be followed by tlb sync operation
The resultant effect is as below:
unmap memory page-1
tlb invalidate page-1
...
unmap memory page-n
tlb invalidate page-n
tlb sync

[PATCH 4/5] iommu/arm-smmu: add support for unmap a memory range with only one tlb sync

2017-06-26 Thread Zhen Lei
1. remove tlb_sync operation in "unmap"
2. make sure each "unmap" will always be followed by tlb sync operation
The resultant effect is as below:
unmap memory page-1
tlb invalidate page-1
...
unmap memory page-n
tlb invalidate page-n
tlb sync

[PATCH 2/5] iommu: add a new member unmap_tlb_sync into struct iommu_ops

2017-06-26 Thread Zhen Lei
An iova range may contain many pages/blocks, especially in the unmap_sg case. Currently, each page/block unmapping is followed by a tlb invalidation operation and a wait (called tlb_sync) until that operation is over. But actually we only need one tlb_sync in the last stage. Look at the loop
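The difference between syncing after every page and syncing once per range can be sketched as follows. This is a minimal user-space model, not the kernel code: `struct tlb_stats`, `tlb_invalidate_page()`, `tlb_sync()` and both `unmap_range_*()` functions are hypothetical stand-ins that only count operations, and the page-table update itself is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Counters standing in for real hardware operations (illustrative only). */
struct tlb_stats {
    unsigned invalidations; /* TLB invalidate commands issued */
    unsigned syncs;         /* waits for invalidations to complete */
};

/* Hypothetical helper: issue an invalidation for one page, without waiting. */
static void tlb_invalidate_page(struct tlb_stats *s, unsigned long iova)
{
    (void)iova;
    s->invalidations++;
}

/* Hypothetical helper: wait until all issued invalidations have finished. */
static void tlb_sync(struct tlb_stats *s)
{
    s->syncs++;
}

/* Behaviour described as current: each page unmap waits for its own sync. */
static void unmap_range_sync_each(struct tlb_stats *s, unsigned long iova,
                                  size_t npages, size_t page_size)
{
    assert(page_size > 0);
    for (size_t i = 0; i < npages; i++) {
        tlb_invalidate_page(s, iova + i * page_size);
        tlb_sync(s);
    }
}

/* Proposed behaviour: invalidate every page, then sync once at the end. */
static void unmap_range_sync_once(struct tlb_stats *s, unsigned long iova,
                                  size_t npages, size_t page_size)
{
    assert(page_size > 0);
    for (size_t i = 0; i < npages; i++)
        tlb_invalidate_page(s, iova + i * page_size);
    tlb_sync(s); /* single wait covers all invalidations above */
}
```

For a range of n pages, both variants issue n invalidations, but the first waits n times while the second waits once. How much time that saves depends on how expensive a sync is on the hardware in question, which this model does not measure.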

[PATCH 5/5] iommu/io-pgtable: delete member tlb_sync_pending of struct io_pgtable

2017-06-26 Thread Zhen Lei
This member is unused now, because the previous patches ensured that each unmap will always be followed by a tlb sync operation. By the way, ->tlb_flush_all executes tlb_sync by itself. Signed-off-by: Zhen Lei --- drivers/iommu/io-pgtable.h | 8 +--- 1 file

Re: [PATCH 1/1] iommu/arm-smmu-v3: replace writel with writel_relaxed in queue_inc_prod

2017-06-26 Thread Leizhen (ThunderTown)
On 2017/6/21 17:08, Will Deacon wrote: > On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote: >> On 2017/6/20 19:35, Robin Murphy wrote: >>> On 20/06/17 12:04, Zhen Lei wrote: This function is protected by spinlock, and the latter will do memory barrier implicitly. So

Re: [PATCH v2 0/8] io-pgtable lock removal

2017-06-26 Thread Leizhen (ThunderTown)
On 2017/6/26 21:12, John Garry wrote: > >>> >>> I saw Will has already sent the pull request. But, FWIW, we are seeing >>> roughly the same performance as v1 patchset. For PCI NIC, Zhou again >>> found performance drop goes from ~15->8% with SMMU enabled, and for >>> integrated storage

Re: [PATCH v2 0/8] io-pgtable lock removal

2017-06-26 Thread John Garry
I saw Will has already sent the pull request. But, FWIW, we are seeing roughly the same performance as v1 patchset. For PCI NIC, Zhou again found performance drop goes from ~15->8% with SMMU enabled, and for integrated storage controller [platform device], we still see a drop of about 50%,

[PATCH 4/4] iommu: qcom: initialize secure page table

2017-06-26 Thread Rob Clark
From: Stanimir Varbanov This basically gets the secure page table size, allocates memory for secure pagetables and passes the physical address to the trusted zone. Signed-off-by: Stanimir Varbanov Signed-off-by: Rob Clark

[PATCH 3/4] iommu: add qcom_iommu

2017-06-26 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the ARM SMMU spec, but not in a way that is compatible with how the arm-smmu driver is designed. It seems SMMU_SCR1.GASRAE=1 so the global register space is not accessible. This means it needs to get configuration from devicetree

[PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-06-26 Thread Rob Clark
Cc: devicet...@vger.kernel.org Signed-off-by: Rob Clark Reviewed-by: Rob Herring --- .../devicetree/bindings/iommu/qcom,iommu.txt | 121 + 1 file changed, 121 insertions(+) create mode 100644

[PATCH 2/4] iommu: arm-smmu: split out register defines

2017-06-26 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the same context-bank registers. Signed-off-by: Rob Clark --- drivers/iommu/arm-smmu-regs.h | 227 ++ drivers/iommu/arm-smmu.c | 203

[PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-06-26 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not implement the ARM SMMU spec in a way that is compatible with the arm-smmu driver. Rob Clark (3): Docs: dt: document qcom iommu bindings iommu: arm-smmu: split out register defines iommu: add qcom_iommu Stanimir Varbanov (1):

Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-26 Thread Joerg Roedel
On Fri, Jun 23, 2017 at 10:20:47AM -0400, Jan Vesely wrote: > I was able to trigger "Completion-Wait loop timed out" messages in the > following situation: > Hung OpenCL task running on dGPU. > dGPU goes to sleep. > sigterm to hung task. > it seems to recover OK after the dGPU is powered back on

Re: [PATCH v2 0/8] io-pgtable lock removal

2017-06-26 Thread John Garry
On 23/06/2017 10:58, Robin Murphy wrote: On 23/06/17 09:47, John Garry wrote: On 22/06/2017 16:53, Robin Murphy wrote: The feedback has been promising, so v2 is just a final update to cover a handful of memory ordering and cosmetic tweaks that came up when Will and I went through this offline.

[GIT PULL] iommu/arm-smmu: Updates for 4.13

2017-06-26 Thread Will Deacon
Hi Joerg, Please pull these arm-smmu updates for 4.13. The headline feature is Robin's conversion of the page table code to a lockless implementation, which significantly closes the DMA performance gap when compared to a system with the SMMU in bypass mode. We'll look at improving unmap

Re: [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel

2017-06-26 Thread Joerg Roedel
Hi Baoquan, On Fri, Jun 23, 2017 at 07:43:10PM +0800, Baoquan He wrote: > Do you think whether it's necessary to continue my kdump fix of amd iommu > patchset? Seems my last post was in Jan this year. I know you are very > busy on fixing bugs and reviewing tons of patches. Without your > guidance

Re: clean up and modularize arch dma_mapping interface V2

2017-06-26 Thread Christoph Hellwig
On Sat, Jun 24, 2017 at 10:36:56AM -0500, Benjamin Herrenschmidt wrote: > I think we still need to do it. For example we have a bunch new "funky" > cases. I have no plan to do away with the selection - I just want a better interface than the current one.

Re: DMA_ATTR_WEAK_ORDERING defintion, was Re: [PATCH] nvme: set DMA_ATTR_WEAK_ORDERING attribute on dma buffers

2017-06-26 Thread Arnd Bergmann
On Sat, Jun 24, 2017 at 9:35 AM, Christoph Hellwig wrote: > I always assumed that our streaming mappings are relaxed order for > TLP anyway. And at very least Documentation/DMA-attributes.txt seems > to imply something different: > > > DMA_ATTR_WEAK_ORDERING >

Re: new dma-mapping tree, was Re: clean up and modularize arch dma_mapping interface V2

2017-06-26 Thread Christoph Hellwig
On Wed, Jun 21, 2017 at 03:32:39PM +0200, Marek Szyprowski wrote: > linux-next > was a side effect of that. I think that for now it can be dropped in favor > of > Christoph's tree. I can also do some review and help in maintainers work if > needed, although I was recently busy with other stuff. >