[PATCH v6 7/7] iommu/amd: Use only natural aligned flushes in a VM

2021-07-23 Thread Nadav Amit
From: Nadav Amit When running on an AMD vIOMMU, it is better to avoid TLB flushes of unmodified PTEs. vIOMMUs require the hypervisor to synchronize the virtualized IOMMU's PTEs with the physical ones. This process induces overheads. AMD IOMMU allows us to flush any range that is aligned

[PATCH v6 6/7] iommu/amd: Sync once for scatter-gather operations

2021-07-23 Thread Nadav Amit
From: Nadav Amit On virtual machines, software must flush the IOTLB after each page table entry update. The iommu_map_sg() code iterates through the given scatter-gather list and invokes iommu_map() for each element in the scatter-gather list, which calls into the vendor IOMMU driver through

[PATCH v6 5/7] iommu/amd: Tailored gather logic for AMD

2021-07-23 Thread Nadav Amit
From: Nadav Amit AMD's IOMMU can flush efficiently (i.e., in a single flush) any range. This is in contrast, for instance, to Intel IOMMUs that have a limit on the number of pages that can be flushed in a single flush. In addition, AMD's IOMMU does not care about the page-size, so changes

[PATCH v6 4/7] iommu: Factor iommu_iotlb_gather_is_disjoint() out

2021-07-23 Thread Nadav Amit
From: Nadav Amit Refactor iommu_iotlb_gather_add_page() and factor out the logic that detects whether IOTLB gather range and a new range are disjoint. To be used by the next patch that implements different gathering logic for AMD. Note that updating gather->pgsize unconditionally d

[PATCH v6 1/7] iommu/amd: Selective flush on unmap

2021-07-23 Thread Nadav Amit
From: Nadav Amit Recent patch attempted to enable selective page flushes on AMD IOMMU but neglected to adapt amd_iommu_iotlb_sync() to use the selective flushes. Adapt amd_iommu_iotlb_sync() to use selective flushes and change amd_iommu_unmap() to collect the flushes. As a defensive measure

[PATCH v6 3/7] iommu: Improve iommu_iotlb_gather helpers

2021-07-23 Thread Nadav Amit
for clarity. Cc: Joerg Roedel Cc: Will Deacon Cc: Jiajun Cao Cc: Robin Murphy Cc: Lu Baolu Cc: iommu@lists.linux-foundation.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Robin Murphy Signed-off-by: Nadav Amit --- drivers/iommu/mtk_iommu.c | 6 +- include/linux/iommu.h | 38

[PATCH v6 2/7] iommu/amd: Do not use flush-queue when NpCache is on

2021-07-23 Thread Nadav Amit
From: Nadav Amit Do not use flush-queue on virtualized environments, where the NpCache capability of the IOMMU is set. This is required to reduce virtualization overheads. This change follows a similar change to Intel's VT-d and a detailed explanation as for the rationale is described in commit

[PATCH v6 0/7] iommu/amd: Enable page-selective flushes

2021-07-23 Thread Nadav Amit
From: Nadav Amit The previous patch, commit 268aa4548277 ("iommu/amd: Page-specific invalidations for more than one page") was supposed to enable page-selective IOTLB flushes on AMD. Besides the bug that was already fixed by commit a017c567915f ("iommu/amd: Fix wrong pare

Re: [PATCH v5 5/7] iommu/amd: Tailored gather logic for AMD

2021-07-13 Thread Nadav Amit
> On Jul 13, 2021, at 11:40 AM, Robin Murphy wrote: > > On 2021-07-13 10:41, Nadav Amit wrote: >> From: Nadav Amit >> AMD's IOMMU can flush efficiently (i.e., in a single flush) any range. >> This is in contrast, for instance, to Intel IOMMUs that have a limi

[PATCH v5 7/7] iommu/amd: Use only natural aligned flushes in a VM

2021-07-13 Thread Nadav Amit
From: Nadav Amit When running on an AMD vIOMMU, it is better to avoid TLB flushes of unmodified PTEs. vIOMMUs require the hypervisor to synchronize the virtualized IOMMU's PTEs with the physical ones. This process induces overheads. AMD IOMMU allows us to flush any range that is aligned

[PATCH v5 6/7] iommu/amd: Sync once for scatter-gather operations

2021-07-13 Thread Nadav Amit
From: Nadav Amit On virtual machines, software must flush the IOTLB after each page table entry update. The iommu_map_sg() code iterates through the given scatter-gather list and invokes iommu_map() for each element in the scatter-gather list, which calls into the vendor IOMMU driver through

[PATCH v5 5/7] iommu/amd: Tailored gather logic for AMD

2021-07-13 Thread Nadav Amit
From: Nadav Amit AMD's IOMMU can flush efficiently (i.e., in a single flush) any range. This is in contrast, for instance, to Intel IOMMUs that have a limit on the number of pages that can be flushed in a single flush. In addition, AMD's IOMMU does not care about the page-size, so changes

[PATCH v5 4/7] iommu: Factor iommu_iotlb_gather_is_disjoint() out

2021-07-13 Thread Nadav Amit
From: Nadav Amit Refactor iommu_iotlb_gather_add_page() and factor out the logic that detects whether IOTLB gather range and a new range are disjoint. To be used by the next patch that implements different gathering logic for AMD. Note that updating gather->pgsize unconditionally d

[PATCH v5 3/7] iommu: Improve iommu_iotlb_gather helpers

2021-07-13 Thread Nadav Amit
for clarity. Cc: Joerg Roedel Cc: Will Deacon Cc: Jiajun Cao Cc: Robin Murphy Cc: Lu Baolu Cc: iommu@lists.linux-foundation.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Robin Murphy Signed-off-by: Nadav Amit --- drivers/iommu/mtk_iommu.c | 6 +- include/linux/iommu.h | 38

[PATCH v5 2/7] iommu/amd: Do not use flush-queue when NpCache is on

2021-07-13 Thread Nadav Amit
From: Nadav Amit Do not use flush-queue on virtualized environments, where the NpCache capability of the IOMMU is set. This is required to reduce virtualization overheads. This change follows a similar change to Intel's VT-d and a detailed explanation as for the rationale is described in commit

[PATCH v5 1/7] iommu/amd: Selective flush on unmap

2021-07-13 Thread Nadav Amit
From: Nadav Amit Recent patch attempted to enable selective page flushes on AMD IOMMU but neglected to adapt amd_iommu_iotlb_sync() to use the selective flushes. Adapt amd_iommu_iotlb_sync() to use selective flushes and change amd_iommu_unmap() to collect the flushes. As a defensive measure

[PATCH v5 0/7] iommu/amd: Enable page-selective flushes

2021-07-13 Thread Nadav Amit
From: Nadav Amit The previous patch, commit 268aa4548277 ("iommu/amd: Page-specific invalidations for more than one page") was supposed to enable page-selective IOTLB flushes on AMD. Besides the bug that was already fixed by commit a017c567915f ("iommu/amd: Fix wrong pare

[PATCH v4 7/7] iommu/amd: Use only natural aligned flushes in a VM

2021-06-16 Thread Nadav Amit
From: Nadav Amit When running on an AMD vIOMMU, it is better to avoid TLB flushes of unmodified PTEs. vIOMMUs require the hypervisor to synchronize the virtualized IOMMU's PTEs with the physical ones. This process induces overheads. AMD IOMMU allows us to flush any range that is aligned

[PATCH v4 6/7] iommu/amd: Sync once for scatter-gather operations

2021-06-16 Thread Nadav Amit
From: Nadav Amit On virtual machines, software must flush the IOTLB after each page table entry update. The iommu_map_sg() code iterates through the given scatter-gather list and invokes iommu_map() for each element in the scatter-gather list, which calls into the vendor IOMMU driver through

[PATCH v4 4/7] iommu: Factor iommu_iotlb_gather_is_disjoint() out

2021-06-16 Thread Nadav Amit
From: Nadav Amit Refactor iommu_iotlb_gather_add_page() and factor out the logic that detects whether IOTLB gather range and a new range are disjoint. To be used by the next patch that implements different gathering logic for AMD. Note that updating gather->pgsize unconditionally d

[PATCH v4 5/7] iommu/amd: Tailored gather logic for AMD

2021-06-16 Thread Nadav Amit
From: Nadav Amit AMD's IOMMU can flush efficiently (i.e., in a single flush) any range. This is in contrast, for instance, to Intel IOMMUs that have a limit on the number of pages that can be flushed in a single flush. In addition, AMD's IOMMU does not care about the page-size, so changes

[PATCH v4 3/7] iommu: Improve iommu_iotlb_gather helpers

2021-06-16 Thread Nadav Amit
for clarity. Cc: Joerg Roedel Cc: Will Deacon Cc: Jiajun Cao Cc: Robin Murphy Cc: Lu Baolu Cc: iommu@lists.linux-foundation.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Robin Murphy Signed-off-by: Nadav Amit --- Changes from Robin's version: * Added iommu_iotlb_gather_add_range

[PATCH v4 2/7] iommu/amd: Do not use flush-queue when NpCache is on

2021-06-16 Thread Nadav Amit
From: Nadav Amit Do not use flush-queue on virtualized environments, where the NpCache capability of the IOMMU is set. This is required to reduce virtualization overheads. This change follows a similar change to Intel's VT-d and a detailed explanation as for the rationale is described in commit

[PATCH v4 1/7] iommu/amd: Selective flush on unmap

2021-06-16 Thread Nadav Amit
From: Nadav Amit Recent patch attempted to enable selective page flushes on AMD IOMMU but neglected to adapt amd_iommu_iotlb_sync() to use the selective flushes. Adapt amd_iommu_iotlb_sync() to use selective flushes and change amd_iommu_unmap() to collect the flushes. As a defensive measure

[PATCH v4 0/7] iommu/amd: Enable page-selective flushes

2021-06-16 Thread Nadav Amit
From: Nadav Amit The previous patch, commit 268aa4548277 ("iommu/amd: Page-specific invalidations for more than one page") was supposed to enable page-selective IOTLB flushes on AMD. Besides the bug that was already fixed by commit a017c567915f ("iommu/amd: Fix wrong pare

Re: [PATCH v3 5/6] iommu/amd: Tailored gather logic for AMD

2021-06-15 Thread Nadav Amit
> On Jun 15, 2021, at 12:20 PM, Robin Murphy wrote: > > On 2021-06-15 19:14, Nadav Amit wrote: >>> On Jun 15, 2021, at 5:55 AM, Robin Murphy wrote: >>> >>> On 2021-06-07 19:25, Nadav Amit wrote: >>>> From: Nadav Amit >>>> AMD's IOM

Re: [PATCH v3 3/6] iommu: Improve iommu_iotlb_gather helpers

2021-06-15 Thread Nadav Amit
> On Jun 15, 2021, at 12:05 PM, Nadav Amit wrote: > > > >> On Jun 15, 2021, at 3:42 AM, Robin Murphy wrote: >> >> On 2021-06-07 19:25, Nadav Amit wrote: >>> From: Robin Murphy >>> The Mediatek driver is not the only one which might want a

Re: [PATCH v3 3/6] iommu: Improve iommu_iotlb_gather helpers

2021-06-15 Thread Nadav Amit
> On Jun 15, 2021, at 3:42 AM, Robin Murphy wrote: > > On 2021-06-07 19:25, Nadav Amit wrote: >> From: Robin Murphy >> The Mediatek driver is not the only one which might want a basic >> address-based gathering behaviour, so although it's arguably simple >>

Re: [PATCH v3 4/6] iommu: Factor iommu_iotlb_gather_is_disjoint() out

2021-06-15 Thread Nadav Amit
> On Jun 15, 2021, at 3:29 AM, Will Deacon wrote: > > On Fri, Jun 11, 2021 at 09:50:31AM -0700, Nadav Amit wrote: >> >> >>> On Jun 11, 2021, at 6:57 AM, Will Deacon wrote: >>> >>> On Mon, Jun 07, 2021 at 11:25:39AM -0700, Nadav Amit w

Re: [PATCH v3 6/6] iommu/amd: Sync once for scatter-gather operations

2021-06-15 Thread Nadav Amit
> On Jun 15, 2021, at 4:25 AM, Robin Murphy wrote: > > On 2021-06-07 19:25, Nadav Amit wrote: >> From: Nadav Amit >> On virtual machines, software must flush the IOTLB after each page table >> entry update. >> The iommu_map_sg() code iterates thro

Re: [PATCH v3 2/6] iommu/amd: Do not use flush-queue when NpCache is on

2021-06-15 Thread Nadav Amit
> On Jun 15, 2021, at 6:08 AM, Robin Murphy wrote: > > On 2021-06-07 19:25, Nadav Amit wrote: >> From: Nadav Amit >> Do not use flush-queue on virtualized environments, where the NpCache >> capability of the IOMMU is set. This is required to reduce >> virtual

Re: [PATCH v3 5/6] iommu/amd: Tailored gather logic for AMD

2021-06-15 Thread Nadav Amit
> On Jun 15, 2021, at 5:55 AM, Robin Murphy wrote: > > On 2021-06-07 19:25, Nadav Amit wrote: >> From: Nadav Amit >> AMD's IOMMU can flush efficiently (i.e., in a single flush) any range. >> This is in contrast, for instance, to Intel IOMMUs that have a limi

Re: [PATCH v3 4/6] iommu: Factor iommu_iotlb_gather_is_disjoint() out

2021-06-11 Thread Nadav Amit
> On Jun 11, 2021, at 6:57 AM, Will Deacon wrote: > > On Mon, Jun 07, 2021 at 11:25:39AM -0700, Nadav Amit wrote: >> From: Nadav Amit >> >> Refactor iommu_iotlb_gather_add_page() and factor out the logic that >> detects whether IOTLB gather range and a new

[PATCH v3 5/6] iommu/amd: Tailored gather logic for AMD

2021-06-07 Thread Nadav Amit
From: Nadav Amit AMD's IOMMU can flush efficiently (i.e., in a single flush) any range. This is in contrast, for instance, to Intel IOMMUs that have a limit on the number of pages that can be flushed in a single flush. In addition, AMD's IOMMU does not care about the page-size, so changes

[PATCH v3 6/6] iommu/amd: Sync once for scatter-gather operations

2021-06-07 Thread Nadav Amit
From: Nadav Amit On virtual machines, software must flush the IOTLB after each page table entry update. The iommu_map_sg() code iterates through the given scatter-gather list and invokes iommu_map() for each element in the scatter-gather list, which calls into the vendor IOMMU driver through

[PATCH v3 3/6] iommu: Improve iommu_iotlb_gather helpers

2021-06-07 Thread Nadav Amit
From: Robin Murphy The Mediatek driver is not the only one which might want a basic address-based gathering behaviour, so although it's arguably simple enough to open-code, let's factor it out for the sake of cleanliness. Let's also take this opportunity to document the intent of these helpers

[PATCH v3 4/6] iommu: Factor iommu_iotlb_gather_is_disjoint() out

2021-06-07 Thread Nadav Amit
From: Nadav Amit Refactor iommu_iotlb_gather_add_page() and factor out the logic that detects whether IOTLB gather range and a new range are disjoint. To be used by the next patch that implements different gathering logic for AMD. Cc: Joerg Roedel Cc: Will Deacon Cc: Jiajun Cao Cc: Robin

[PATCH v3 2/6] iommu/amd: Do not use flush-queue when NpCache is on

2021-06-07 Thread Nadav Amit
From: Nadav Amit Do not use flush-queue on virtualized environments, where the NpCache capability of the IOMMU is set. This is required to reduce virtualization overheads. This change follows a similar change to Intel's VT-d and a detailed explanation as for the rationale is described in commit

[PATCH v3 1/6] iommu/amd: Selective flush on unmap

2021-06-07 Thread Nadav Amit
From: Nadav Amit Recent patch attempted to enable selective page flushes on AMD IOMMU but neglected to adapt amd_iommu_iotlb_sync() to use the selective flushes. Adapt amd_iommu_iotlb_sync() to use selective flushes and change amd_iommu_unmap() to collect the flushes. As a defensive measure

[PATCH v3 0/6] iommu/amd: Enable page-selective flushes

2021-06-07 Thread Nadav Amit
From: Nadav Amit The previous patch, commit 268aa4548277 ("iommu/amd: Page-specific invalidations for more than one page") was supposed to enable page-selective IOTLB flushes on AMD. Besides the bug that was already fixed by commit a017c567915f ("iommu/amd: Fix wrong pare

Re: [PATCH v2 0/4] iommu/amd: Enable page-selective flushes

2021-06-04 Thread Nadav Amit
> On Jun 4, 2021, at 11:53 AM, Robin Murphy wrote: > > On 2021-06-04 18:10, Nadav Amit wrote: >>> On Jun 4, 2021, at 8:38 AM, Joerg Roedel wrote: >>> >>> Hi Nadav, >>> >>> [Adding Robin] >>> >>> On Mon, May 24, 2021

Re: [PATCH v2 0/4] iommu/amd: Enable page-selective flushes

2021-06-04 Thread Nadav Amit
> On Jun 4, 2021, at 8:38 AM, Joerg Roedel wrote: > > Hi Nadav, > > [Adding Robin] > > On Mon, May 24, 2021 at 03:41:55PM -0700, Nadav Amit wrote: >> Nadav Amit (4): >> iommu/amd: Fix wrong parentheses on page-specific invalidations > > This patch i

Re: [PATCH 3/4] iommu/amd: Do not sync on page size changes

2021-06-01 Thread Nadav Amit
> On Jun 1, 2021, at 10:27 AM, Robin Murphy wrote: > > On 2021-06-01 17:39, Nadav Amit wrote: >>> On Jun 1, 2021, at 8:59 AM, Robin Murphy wrote: >>> >>> On 2021-05-02 07:59, Nadav Amit wrote: >>>> From: Nadav Amit >>>> Some IOMM

Re: [PATCH 3/4] iommu/amd: Do not sync on page size changes

2021-06-01 Thread Nadav Amit
> On Jun 1, 2021, at 8:59 AM, Robin Murphy wrote: > > On 2021-05-02 07:59, Nadav Amit wrote: >> From: Nadav Amit >> Some IOMMU architectures perform invalidations regardless of the page >> size. In such architectures there is no need to sync when the page size >

Re: [PATCH 1/4] iommu/amd: Fix wrong parentheses on page-specific invalidations

2021-05-31 Thread Nadav Amit
> On May 18, 2021, at 2:23 AM, Joerg Roedel wrote: > > On Sat, May 01, 2021 at 11:59:56PM -0700, Nadav Amit wrote: >> From: Nadav Amit >> >> The logic to determine the mask of page-specific invalidations was >> tested in userspace. As the code was copied in

Re: [git pull] IOMMU Fixes for Linux v5.13-rc3

2021-05-27 Thread Nadav Amit
> On May 27, 2021, at 10:57 AM, Joerg Roedel wrote: > > Signed PGP part > Hi Linus, > > The following changes since commit d07f6ca923ea0927a1024dfccafc5b53b61cfecc: > > Linux 5.13-rc2 (2021-05-16 15:27:44 -0700) For 5.13-rc3? Not -rc4?

[PATCH v2 4/4] iommu/amd: Do not use flush-queue when NpCache is on

2021-05-25 Thread Nadav Amit
From: Nadav Amit Do not use flush-queue on virtualized environments, where the NpCache capability of the IOMMU is set. This is required to reduce virtualization overheads. This change follows a similar change to Intel's VT-d and a detailed explanation as for the rationale is described in commit

[PATCH v2 2/4] iommu/amd: Selective flush on unmap

2021-05-25 Thread Nadav Amit
From: Nadav Amit Recent patch attempted to enable selective page flushes on AMD IOMMU but neglected to adapt amd_iommu_iotlb_sync() to use the selective flushes. Adapt amd_iommu_iotlb_sync() to use selective flushes and change amd_iommu_unmap() to collect the flushes. As a defensive measure

[PATCH v2 3/4] iommu/amd: Do not sync on page size changes

2021-05-25 Thread Nadav Amit
From: Nadav Amit Some IOMMU architectures perform invalidations regardless of the page size. In such architectures there is no need to sync when the page size changes or to regard pgsize when making interim flush in iommu_iotlb_gather_add_page(). Add a "ignore_gather_pgsize" propert

[PATCH v2 1/4] iommu/amd: Fix wrong parentheses on page-specific invalidations

2021-05-25 Thread Nadav Amit
From: Nadav Amit The logic to determine the mask of page-specific invalidations was tested in userspace. As the code was copied into the kernel, the parentheses were mistakenly set in the wrong place, resulting in the wrong mask. Fix it. Cc: Joerg Roedel Cc: Will Deacon Cc: Jiajun Cao Cc

[PATCH v2 0/4] iommu/amd: Enable page-selective flushes

2021-05-25 Thread Nadav Amit
From: Nadav Amit The previous patch, commit 268aa4548277 ("iommu/amd: Page-specific invalidations for more than one page") was supposed to enable page-selective IOTLB flushes on AMD. The patch had an embarrassing bug, and I apologize for it. Analysis as for why this bug did

[PATCH 4/4] iommu/amd: Do not use flush-queue when NpCache is on

2021-05-02 Thread Nadav Amit
From: Nadav Amit Do not use flush-queue on virtualized environments, where the NpCache capability of the IOMMU is set. This is required to reduce virtualization overheads. This change follows a similar change to Intel's VT-d and a detailed explanation as for the rationale is described in commit

[PATCH 3/4] iommu/amd: Selective flush on unmap

2021-05-02 Thread Nadav Amit
From: Nadav Amit Recent patch attempted to enable selective page flushes on AMD IOMMU but neglected to adapt amd_iommu_iotlb_sync() to use the selective flushes. Adapt amd_iommu_iotlb_sync() to use selective flushes and change amd_iommu_unmap() to collect the flushes. As a defensive measure

[PATCH 3/4] iommu/amd: Do not sync on page size changes

2021-05-02 Thread Nadav Amit
From: Nadav Amit Some IOMMU architectures perform invalidations regardless of the page size. In such architectures there is no need to sync when the page size changes or to regard pgsize when making interim flush in iommu_iotlb_gather_add_page(). Add a "ignore_gather_pgsize" propert

[PATCH 2/4] iommu/amd: Selective flush on unmap

2021-05-02 Thread Nadav Amit
From: Nadav Amit Recent patch attempted to enable selective page flushes on AMD IOMMU but neglected to adapt amd_iommu_iotlb_sync() to use the selective flushes. Adapt amd_iommu_iotlb_sync() to use selective flushes and change amd_iommu_unmap() to collect the flushes. As a defensive measure

[PATCH 2/4] iommu/amd: Do not sync on page size changes

2021-05-02 Thread Nadav Amit
From: Nadav Amit Some IOMMU architectures perform invalidations regardless of the page size. In such architectures there is no need to sync when the page size changes. In such architecture, there is no need to regard pgsize when making interim flush in iommu_iotlb_gather_add_page(). Add

[PATCH 1/4] iommu/amd: Fix wrong parentheses on page-specific invalidations

2021-05-02 Thread Nadav Amit
From: Nadav Amit The logic to determine the mask of page-specific invalidations was tested in userspace. As the code was copied into the kernel, the parentheses were mistakenly set in the wrong place, resulting in the wrong mask. Fix it. Cc: Joerg Roedel Cc: Will Deacon Cc: Jiajun Cao Cc

[PATCH 0/4] iommu/amd: Enable page-selective flushes

2021-05-02 Thread Nadav Amit
From: Nadav Amit The previous patch, commit 268aa4548277 ("iommu/amd: Page-specific invalidations for more than one page") was supposed to enable page-selective IOTLB flushes on AMD. The patch had an embarrassing bug, and I apologize for it. Analysis as for why this bug did

Re: [PATCH v2] iommu/vt-d: Force to flush iotlb before creating superpage

2021-04-30 Thread Nadav Amit
> On Apr 15, 2021, at 7:13 AM, Joerg Roedel wrote: > > On Thu, Apr 15, 2021 at 08:46:28AM +0800, Longpeng(Mike) wrote: >> Fixes: 6491d4d02893 ("intel-iommu: Free old page tables before creating >> superpage") >> Cc: # v3.0+ >> Link: >>

Re: [PATCH] iommu/amd: page-specific invalidations for more than one page

2021-04-08 Thread Nadav Amit
> On Apr 8, 2021, at 12:18 AM, Joerg Roedel wrote: > > Hi Nadav, > > On Wed, Apr 07, 2021 at 05:57:31PM +0000, Nadav Amit wrote: >> I tested it on real bare-metal hardware. I ran some basic I/O workloads >> with the IOMMU enabled, checkers enabled/disabled, and so

Re: [PATCH] iommu/amd: page-specific invalidations for more than one page

2021-04-07 Thread Nadav Amit
> On Apr 7, 2021, at 3:01 AM, Joerg Roedel wrote: > > On Tue, Mar 23, 2021 at 02:06:19PM -0700, Nadav Amit wrote: >> From: Nadav Amit >> >> Currently, IOMMU invalidations and device-IOTLB invalidations using >> AMD IOMMU fall back to full address-space inva

Re: A problem of Intel IOMMU hardware ?

2021-03-26 Thread Nadav Amit
> On Mar 26, 2021, at 7:31 PM, Lu Baolu wrote: > > Hi Nadav, > > On 3/19/21 12:46 AM, Nadav Amit wrote: >> So here is my guess: >> Intel probably used as a basis for the IOTLB an implementation of >> some other (regular) TLB design. >> Intel SDM say

[PATCH] iommu/amd: page-specific invalidations for more than one page

2021-03-23 Thread Nadav Amit
From: Nadav Amit Currently, IOMMU invalidations and device-IOTLB invalidations using AMD IOMMU fall back to full address-space invalidation if more than a single page needs to be flushed. Full flushes are especially inefficient when the IOMMU is virtualized by a hypervisor, since it requires

Re: A problem of Intel IOMMU hardware ?

2021-03-18 Thread Nadav Amit
e, Cloud Infrastructure Service Product Dept.) >> ; Nadav Amit >> Cc: chenjiashang ; David Woodhouse >> ; iommu@lists.linux-foundation.org; LKML >> ; alex.william...@redhat.com; Gonglei (Arei) >> ; w...@kernel.org >> Subject: RE: A problem of Intel IOMMU hardwar

Re: A problem of Intel IOMMU hardware ?

2021-03-18 Thread Nadav Amit
> On Mar 17, 2021, at 9:46 PM, Longpeng (Mike, Cloud Infrastructure Service > Product Dept.) wrote: > [Snip] > > NOTE, the magical thing happen...(*Operation-4*) we write the PTE > of Operation-1 from 0 to 0x3 which means can Read/Write, and then > we trigger DMA read again, it success and

Re: A problem of Intel IOMMU hardware ?

2021-03-17 Thread Nadav Amit
> On Mar 17, 2021, at 2:35 AM, Longpeng (Mike, Cloud Infrastructure Service > Product Dept.) wrote: > > Hi Nadav, > >> -Original Message- >> From: Nadav Amit [mailto:nadav.a...@gmail.com] >>> reproduce the problem with high probability (~50%). >

Re: A problem of Intel IOMMU hardware ?

2021-03-16 Thread Nadav Amit
> On Mar 16, 2021, at 8:16 PM, Longpeng (Mike, Cloud Infrastructure Service > Product Dept.) wrote: > > Hi guys, > > We find the Intel iommu cache (i.e. iotlb) maybe works wrong in a special > situation, it would cause DMA fails or get wrong data. > > The reproducer (based on Alex's vfio

Re: [PATCH v2] iommu/vt-d: do not use flush-queue when caching-mode is on

2021-01-27 Thread Nadav Amit
> On Jan 27, 2021, at 3:25 AM, Lu Baolu wrote: > > On 2021/1/27 14:17, Nadav Amit wrote: >> From: Nadav Amit >> When an Intel IOMMU is virtualized, and a physical device is >> passed-through to the VM, changes of the virtual IOMMU need to be >> propagated to t

[PATCH v3] iommu/vt-d: do not use flush-queue when caching-mode is on

2021-01-27 Thread Nadav Amit
From: Nadav Amit When an Intel IOMMU is virtualized, and a physical device is passed-through to the VM, changes of the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide "ca

[PATCH v2] iommu/vt-d: do not use flush-queue when caching-mode is on

2021-01-26 Thread Nadav Amit
From: Nadav Amit When an Intel IOMMU is virtualized, and a physical device is passed-through to the VM, changes of the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide "ca

Re: [PATCH] iommu/vt-d: do not use flush-queue when caching-mode is on

2021-01-26 Thread Nadav Amit
> On Jan 26, 2021, at 4:26 PM, Lu Baolu wrote: > > Hi Nadav, > > On 1/27/21 4:38 AM, Nadav Amit wrote: >> From: Nadav Amit >> When an Intel IOMMU is virtualized, and a physical device is >> passed-through to the VM, changes of the virtual IOMMU need to be

[PATCH] iommu/vt-d: do not use flush-queue when caching-mode is on

2021-01-26 Thread Nadav Amit
From: Nadav Amit When an Intel IOMMU is virtualized, and a physical device is passed-through to the VM, changes of the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide "ca

[PATCH] iommu/vt-d: Fix wrong analysis whether devices share the same bus

2019-08-20 Thread Nadav Amit via iommu
mu/vt-d: Allow interrupts from the entire bus for aliased devices") Cc: sta...@vger.kernel.org Cc: Logan Gunthorpe Cc: David Woodhouse Cc: Joerg Roedel Cc: Jacob Pan Signed-off-by: Nadav Amit --- drivers/iommu/intel_irq_remapping.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletion

Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Nadav Amit
> On Apr 17, 2019, at 10:26 AM, Ingo Molnar wrote: > > > * Nadav Amit wrote: > >>> On Apr 17, 2019, at 10:09 AM, Ingo Molnar wrote: >>> >>> >>> * Khalid Aziz wrote: >>> >>>>> I.e. the original motivation of

Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Nadav Amit
> On Apr 17, 2019, at 10:09 AM, Ingo Molnar wrote: > > > * Khalid Aziz wrote: > >>> I.e. the original motivation of the XPFO patches was to prevent execution >>> of direct kernel mappings. Is this motivation still present if those >>> mappings are non-executable? >>> >>> (Sorry if this has

Re: [RFC] avoid indirect calls for DMA direct mappings

2018-12-06 Thread Nadav Amit
> On Dec 6, 2018, at 9:43 AM, Jesper Dangaard Brouer wrote: > > On Thu, 6 Dec 2018 07:37:19 -0800 > Christoph Hellwig wrote: > >> Hi all, >> >> a while ago Jesper reported major performance regressions due to the >> spectre v2 mitigations in his XDP forwarding workloads. A large part >> of

Re: [RFC/RFT] Add noats flag to boot parameters

2018-05-03 Thread Nadav Amit
Sinan Kaya wrote: > +Bjorn, > > On 5/3/2018 9:59 AM, Joerg Roedel wrote: >> On Thu, May 03, 2018 at 09:46:34AM -0400, Sinan Kaya wrote: >>> I also like the idea in general. >>> Minor nit.. >>> >>> Shouldn't this be an iommu parameter rather than a PCI kernel command line

Re: [PATCH] mm/mmu_notifier: avoid double notification when it is useless

2017-10-03 Thread Nadav Amit
Jerome Glisse wrote: > On Wed, Oct 04, 2017 at 01:42:15AM +0200, Andrea Arcangeli wrote: > >> I'd like some more explanation about the inner working of "that new >> user" as per comment above. >> >> It would be enough to drop mmu_notifier_invalidate_range from above >>

Re: [PATCH 02/13] mm/rmap: update to new mmu_notifier semantic

2017-08-31 Thread Nadav Amit
Andrea Arcangeli <aarca...@redhat.com> wrote: > On Wed, Aug 30, 2017 at 08:47:19PM -0400, Jerome Glisse wrote: >> On Wed, Aug 30, 2017 at 04:25:54PM -0700, Nadav Amit wrote: >>> For both CoW and KSM, the correctness is maintained by calling >>> ptep_c

Re: [PATCH 02/13] mm/rmap: update to new mmu_notifier semantic

2017-08-30 Thread Nadav Amit
[cc’ing IOMMU people, which for some reason are not cc’d] Andrea Arcangeli <aarca...@redhat.com> wrote: > On Wed, Aug 30, 2017 at 11:00:32AM -0700, Nadav Amit wrote: >> It is not trivial to flush TLBs (primary or secondary) without holding the >> page-table lock, and as we

Re: [PATCH] iommu/vt-d: Remove unnecassary qi clflushes

2016-07-05 Thread Nadav Amit
Paolo Bonzini <pbonz...@redhat.com> wrote: > > > On 05/07/2016 18:27, Nadav Amit wrote: >>> Although such hardware is old, there are some hypervisors that do not set >>> the ecap.coherency of emulated IOMMUs. Yes, it is unwise, but there is no >>>

Re: [PATCH] iommu/vt-d: Remove unnecassary qi clflushes

2016-07-05 Thread Nadav Amit
Nadav Amit <nadav.a...@gmail.com> wrote: > Joerg Roedel <j...@8bytes.org> wrote: > >> On Fri, Jun 24, 2016 at 06:13:14AM -0700, Nadav Amit wrote: >>> According to the manual: "Hardware access to ... invalidation queue ... >>> are always c

Re: [PATCH] iommu/vt-d: Remove unnecassary qi clflushes

2016-06-27 Thread Nadav Amit
Joerg Roedel <j...@8bytes.org> wrote: > On Fri, Jun 24, 2016 at 06:13:14AM -0700, Nadav Amit wrote: >> According to the manual: "Hardware access to ... invalidation queue ... >> are always coherent." >> >> Remove unnecassary clflushes according

[PATCH] iommu/vt-d: Remove unnecassary qi clflushes

2016-06-24 Thread Nadav Amit
According to the manual: "Hardware access to ... invalidation queue ... are always coherent." Remove unnecassary clflushes accordingly. Signed-off-by: Nadav Amit <na...@vmware.com> --- Build-tested since I do not have an IOMMU that does not support coherency. --- drivers/

[PATCH v3] iommu/vt-d: Avoid write-tearing on PTE clear

2016-06-15 Thread Nadav Amit
. Avoid this scenario by using WRITE_ONCE, and order the writes on 32-bit kernels. Signed-off-by: Nadav Amit <na...@vmware.com> --- V3: Move split_dma_pte struct to dma_clear_pte (Joerg) Add comments (Joerg) V2: Use two WRITE_ONCE on 32-bit to avoid reordering --- drivers/iommu/intel-i

[PATCH v2] iommu/vt-d: Avoid write-tearing on PTE clear

2016-06-15 Thread Nadav Amit
. Avoid this scenario by using WRITE_ONCE, and order the writes on 32-bit kernels. Signed-off-by: Nadav Amit <na...@vmware.com> --- V2: Use two WRITE_ONCE on 32-bit to avoid reordering --- drivers/iommu/intel-iommu.c | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-)

Re: BUG: using smp_processor_id() in preemptible [00000000] code]

2016-06-09 Thread Nadav Amit
Alan Stern wrote: > On Thu, 9 Jun 2016, M G Berberich wrote: > >> Hello, >> >> With 4.7-rc2, after detecting a USB Mass Storage device >> >> [ 11.589843] usb-storage 4-2:1.0: USB Mass Storage device detected >> >> a constant flow of kernel-BUGS is reported

Re: [PATCH] iommu/vt-d: Avoid write-tearing on PTE clear

2016-06-03 Thread Nadav Amit
Ping? Nadav Amit <na...@vmware.com> wrote: > When a PTE is cleared, the write may be torn or performed as multiple > writes. In addition, in 32-bit kernel, writes are currently performed > using a single 64-bit write, which does not guarantee order. > > The byte-code ri

[PATCH] iommu/vt-d: Avoid write-tearing on PTE clear

2016-05-21 Thread Nadav Amit
. Avoid this scenario by using WRITE_ONCE, and order the writes on 32-bit kernels. Signed-off-by: Nadav Amit <na...@vmware.com> --- drivers/iommu/intel-iommu.c | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu

Re: [v3 13/26] KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI

2015-01-20 Thread Nadav Amit
Radim Krčmář rkrc...@redhat.com wrote: 2015-01-14 01:27+, Wu, Feng: the new hardware even doesn't consider the TPR for lowest priority interrupts delivery. A bold move ... what hardware was the first to do so? I think it was starting with Nehalem. Thanks, (Could be that QPI