On Fri, Sep 25, 2020 at 2:56 AM John Garry wrote:
>
> This series contains a patch to solve the longterm IOVA issue which
> leizhen originally tried to address at [0].
>
> I also included the small optimisation from Cong Wang, which never seems
> to have been accepted [
On Wed, Jan 22, 2020 at 9:07 AM Robin Murphy wrote:
>
> On 21/01/2020 5:21 pm, Cong Wang wrote:
> > On Tue, Jan 21, 2020 at 3:11 AM Robin Murphy wrote:
> >>
> >> On 18/12/2019 4:39 am, Cong Wang wrote:
> >>> The IOVA cache algorithm implemented in I
On Wed, Jan 22, 2020 at 9:34 AM Robin Murphy wrote:
> Sorry, but without convincing evidence, this change just looks like
> churn for the sake of it.
The time I wasted arguing with you is worth more than
the value this patch brings. So let's just drop it to save some
time.
Thanks.
On Tue, Jan 21, 2020 at 1:52 AM Robin Murphy wrote:
>
> On 18/12/2019 4:39 am, Cong Wang wrote:
> > If the magazine is empty, iova_magazine_free_pfns() should
> > be a nop, however it misses the case of mag->size==0. So we
> > should just call iova_magazine_empty().
>
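A minimal userspace sketch of the point made in the quoted commit message, under stated assumptions: every `model_*` name, the `MODEL_MAG_SIZE` value, and the counter standing in for `iovad->iova_rbtree_lock` are hypothetical and are not the kernel's code. The two predicates mirror the expressions quoted elsewhere in this thread (`mag && mag->size == IOVA_MAG_SIZE` and `!mag || mag->size == 0`). It only illustrates that an early return on an empty magazine skips the lock entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity for the sketch; not the kernel's IOVA_MAG_SIZE. */
#define MODEL_MAG_SIZE 4

struct model_magazine {
    unsigned long size;                  /* number of cached pfns */
    unsigned long pfns[MODEL_MAG_SIZE];
};

/* Stand-in for the rbtree spinlock: it only counts acquisitions. */
struct model_lock_stats {
    unsigned int acquisitions;
};

static int model_magazine_full(const struct model_magazine *mag)
{
    return mag && mag->size == MODEL_MAG_SIZE;
}

static int model_magazine_empty(const struct model_magazine *mag)
{
    return !mag || mag->size == 0;
}

/*
 * Release every pfn held by the magazine and return how many were freed.
 * An empty (or NULL) magazine returns early, so the lock is never taken.
 */
static unsigned long model_magazine_free_pfns(struct model_magazine *mag,
                                              struct model_lock_stats *lock)
{
    unsigned long freed = 0;

    if (model_magazine_empty(mag))
        return 0;

    lock->acquisitions++;                /* "lock" taken once for the batch */
    for (unsigned long i = 0; i < mag->size; i++) {
        mag->pfns[i] = 0;                /* stands in for freeing one IOVA */
        freed++;
    }
    /* "unlock" would happen here */

    mag->size = 0;
    return freed;
}
```

In this model, calling `model_magazine_free_pfns()` on a magazine with `size == 0` leaves `acquisitions` untouched, which is the contention saving the commit message describes.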
On Tue, Jan 21, 2020 at 3:11 AM Robin Murphy wrote:
>
> On 18/12/2019 4:39 am, Cong Wang wrote:
> > The IOVA cache algorithm implemented in IOMMU code does not
> > exactly match the original algorithm described in the paper
> > "Magazines and Vmem: Extending the Sl
On Tue, Dec 17, 2019 at 8:40 PM Cong Wang wrote:
>
> This patchset contains three small optimizations for the global spinlock
> contention in IOVA cache. Our memcache perf test shows this reduced its
> p999 latency down by 45% on AMD when IOMMU is enabled.
>
> (Resending v3
Both find_iova() and __free_iova() take iova_rbtree_lock,
there is no reason to take and release it twice inside
free_iova().
Fold them into one critical section by calling the unlock
versions instead.
Cc: Joerg Roedel
Cc: John Garry
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 8
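The locking change described in this commit message can be pictured with a small single-threaded model. This is a sketch under assumptions, not the kernel implementation: all `toy_*` names are hypothetical, and the "lock" is a flag plus a counter that provides no real mutual exclusion. It contrasts looking up and removing an entry in two separate critical sections with doing both in one, using helpers that expect the caller to already hold the lock.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Toy stand-in for an IOVA domain guarded by a single lock. */
struct toy_domain {
    int lock_held;                   /* 1 while the "lock" is held */
    unsigned int lock_acquisitions;  /* how many times it was taken */
    int used[TOY_SLOTS];
    unsigned long pfn[TOY_SLOTS];
};

static void toy_lock(struct toy_domain *d)
{
    assert(!d->lock_held);
    d->lock_held = 1;
    d->lock_acquisitions++;
}

static void toy_unlock(struct toy_domain *d)
{
    assert(d->lock_held);
    d->lock_held = 0;
}

/* Unlocked helpers: the caller must already hold the lock. */
static int toy_find_unlocked(struct toy_domain *d, unsigned long pfn)
{
    assert(d->lock_held);
    for (int i = 0; i < TOY_SLOTS; i++)
        if (d->used[i] && d->pfn[i] == pfn)
            return i;
    return -1;
}

static void toy_remove_unlocked(struct toy_domain *d, int slot)
{
    assert(d->lock_held);
    d->used[slot] = 0;
}

/* Setup helper: store pfn in the first free slot, return the slot or -1. */
static int toy_insert(struct toy_domain *d, unsigned long pfn)
{
    int slot = -1;

    toy_lock(d);
    for (int i = 0; i < TOY_SLOTS; i++) {
        if (!d->used[i]) {
            d->used[i] = 1;
            d->pfn[i] = pfn;
            slot = i;
            break;
        }
    }
    toy_unlock(d);
    return slot;
}

/* Before: lookup and removal each take and release the lock. */
static void toy_free_two_sections(struct toy_domain *d, unsigned long pfn)
{
    int slot;

    toy_lock(d);
    slot = toy_find_unlocked(d, pfn);
    toy_unlock(d);
    if (slot < 0)
        return;

    toy_lock(d);
    toy_remove_unlocked(d, slot);
    toy_unlock(d);
}

/* After: lookup and removal share one critical section. */
static void toy_free_one_section(struct toy_domain *d, unsigned long pfn)
{
    int slot;

    toy_lock(d);
    slot = toy_find_unlocked(d, pfn);
    if (slot >= 0)
        toy_remove_unlocked(d, slot);
    toy_unlock(d);
}
```

Besides halving the lock traffic in this model, the single-section variant never releases the lock between the lookup and the removal, so the entry it found cannot change in between.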
This patchset contains three small optimizations for the global spinlock
contention in IOVA cache. Our memcache perf test shows this reduced its
p999 latency down by 45% on AMD when IOMMU is enabled.
(Resending v3 on Joerg's request.)
Cong Wang (3):
iommu: avoid unnecessary mag
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 22 +++---
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index cb473ddce4cf..184d4c0e20b5 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -797,13 +797,23 @@
Together with a few other changes to make it exactly match
the pseudo code in the paper.
Cc: Joerg Roedel
Cc: John Garry
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 45 +++-
1 file changed, 28 insertions(+), 17 deletions(-)
diff --git a/driv
On Tue, Dec 17, 2019 at 1:43 AM Joerg Roedel wrote:
>
> On Thu, Nov 28, 2019 at 04:48:52PM -0800, Cong Wang wrote:
> > This patchset contains three small optimizations for the global spinlock
> > contention in IOVA cache. Our memcache perf test shows this reduced its
> >
This patchset contains three small optimizations for the global spinlock
contention in IOVA cache. Our memcache perf test shows this reduced its
p999 latency down by 45% on AMD when IOMMU is enabled.
Cong Wang (3):
iommu: avoid unnecessary magazine allocations
iommu: optimize
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 22 +++---
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index cb473ddce4cf..184d4c0e20b5 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -797,13 +797,23 @@
Together with a few other changes to make it exactly match
the pseudo code in the paper.
Cc: Joerg Roedel
Cc: John Garry
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 45 +++-
1 file changed, 28 insertions(+), 17 deletions(-)
diff --git a/driv
Both find_iova() and __free_iova() take iova_rbtree_lock,
there is no reason to take and release it twice inside
free_iova().
Fold them into one critical section by calling the unlock
versions instead.
Cc: Joerg Roedel
Cc: John Garry
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 8
On Mon, Dec 2, 2019 at 2:02 AM John Garry wrote:
>
> On 30/11/2019 06:02, Cong Wang wrote:
> > On Fri, Nov 29, 2019 at 5:24 AM John Garry wrote:
> >>
> >> On 29/11/2019 00:48, Cong Wang wrote:
> >>> If the maganize is empty, iova_magazine_free_pfns() sh
On Mon, Dec 2, 2019 at 8:59 AM Christoph Hellwig wrote:
>
> > + return (mag && mag->size == IOVA_MAG_SIZE);
>
> > + return (!mag || mag->size == 0);
>
> No need for the braces in both cases.
The current code is already this, I don't want to mix coding style
changes with a non-coding-style
On Mon, Dec 2, 2019 at 2:55 AM John Garry wrote:
> Apart from this change, did anyone ever consider kmem cache for the
> magazines?
You can always make any changes you want after this patch,
I can't do all optimizations in one single patch. :)
So, I will leave this to you.
Thanks.
On Mon, Dec 2, 2019 at 8:58 AM Christoph Hellwig wrote:
>
> I think a subject line better describes what you change, not that
> it matches an original algorithm. The fact that the fix matches
> the original algorithm can go somewhere towards the commit log,
> preferably with a reference to the act
On Fri, Nov 29, 2019 at 5:34 AM John Garry wrote:
>
> On 29/11/2019 00:48, Cong Wang wrote:
> > Both find_iova() and __free_iova() take iova_rbtree_lock,
> > there is no reason to take and release it twice inside
> > free_iova().
> >
> > Fold them into the cr
On Fri, Nov 29, 2019 at 5:24 AM John Garry wrote:
>
> On 29/11/2019 00:48, Cong Wang wrote:
> > If the maganize is empty, iova_magazine_free_pfns() should
>
> magazine
Good catch!
>
> > be a nop, however it misses the case of mag->size==0. So we
> >
On Fri, Nov 29, 2019 at 6:43 AM John Garry wrote:
>
> On 29/11/2019 00:48, Cong Wang wrote:
> > The IOVA cache algorithm implemented in IOMMU code does not
> > exactly match the original algorithm described in the paper.
> >
>
> which paper?
It's in drivers
This patchset contains three small optimizations for the global spinlock
contention in IOVA cache. Our memcache perf test shows this reduced its
p999 latency down by 45% on AMD when IOMMU is enabled.
Cong Wang (3):
iommu: match the original algorithm
iommu: optimize iova_magazine_free_pfns
Both find_iova() and __free_iova() take iova_rbtree_lock,
there is no reason to take and release it twice inside
free_iova().
Fold them into the critical section by calling the unlock
versions instead.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 8 ++--
1 file
If the maganize is empty, iova_magazine_free_pfns() should
be a nop, however it misses the case of mag->size==0. So we
should just call iova_magazine_empty().
This should reduce the contention on iovad->iova_rbtree_lock
a little bit.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers
t and only recycle them
when all of them are full.
Before this patch, rcache->depot[] contains either full or
freed entries, after this patch, it contains either full or
empty (but allocated) entries.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers/iommu/iov
On Wed, Nov 27, 2019 at 10:01 AM John Garry wrote:
>
> On 21/11/2019 00:13, Cong Wang wrote:
> > The IOVA cache algorithm implemented in IOMMU code does not
> > exactly match the original algorithm described in the paper.
> >
> > Particularly, it doesn't nee
Both find_iova() and __free_iova() take iova_rbtree_lock,
there is no reason to take and release it twice inside
free_iova().
Fold them into the critical section by calling the unlock
versions instead.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 8 ++--
1 file
This patchset contains three small optimizations for the global spinlock
contention in IOVA cache. Our memcache perf test shows this reduced its
p999 latency down by 45% on AMD when IOMMU is enabled.
Cong Wang (3):
iommu: match the original algorithm
iommu: optimize iova_magazine_free_pfns
If the maganize is empty, iova_magazine_free_pfns() should
be a nop, however it misses the case of mag->size==0. So we
should just call iova_magazine_empty().
This should reduce the contention on iovad->iova_rbtree_lock
a little bit.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 14 --
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 41c605b0058f..92f72a85e62a 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -900,7 +