On Fri, Sep 25, 2020 at 2:56 AM John Garry wrote:
>
> This series contains a patch to solve the longterm IOVA issue which
> leizhen originally tried to address at [0].
>
> I also included the small optimisation from Cong Wang, which never seems
> to have been accepted [
On Wed, Jan 22, 2020 at 9:07 AM Robin Murphy wrote:
>
> On 21/01/2020 5:21 pm, Cong Wang wrote:
> > On Tue, Jan 21, 2020 at 3:11 AM Robin Murphy wrote:
> >>
> >> On 18/12/2019 4:39 am, Cong Wang wrote:
> >>> The IOVA cache algorithm implemented
On Wed, Jan 22, 2020 at 9:34 AM Robin Murphy wrote:
> Sorry, but without convincing evidence, this change just looks like
> churn for the sake of it.
The time I have wasted arguing with you is worth more than
the value this patch brings. So let's just drop it to save some
time.
Thanks.
On Tue, Jan 21, 2020 at 1:52 AM Robin Murphy wrote:
>
> On 18/12/2019 4:39 am, Cong Wang wrote:
> > If the magazine is empty, iova_magazine_free_pfns() should
> > be a nop, however it misses the case of mag->size==0. So we
> > should just call iova_magazine_empty().
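
A minimal userspace sketch of the idea in the quoted commit message, not the kernel code itself: when the magazine holds nothing, return before touching the lock. All demo_* names, DEMO_MAG_SIZE and the counter that stands in for iova_rbtree_lock are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAG_SIZE 4 /* hypothetical capacity, not the kernel's value */

struct demo_magazine {
    unsigned long size;                 /* number of valid entries in pfns[] */
    unsigned long pfns[DEMO_MAG_SIZE];
};

/* Stand-in for the rbtree and its lock: it only counts what happened. */
struct demo_tree {
    unsigned long lock_acquisitions;
    unsigned long freed;
};

/* A missing magazine and a magazine with no entries are both "empty". */
int demo_mag_empty(const struct demo_magazine *mag)
{
    return !mag || mag->size == 0;
}

int demo_mag_full(const struct demo_magazine *mag)
{
    return mag && mag->size == DEMO_MAG_SIZE;
}

void demo_mag_free_pfns(struct demo_magazine *mag, struct demo_tree *tree)
{
    unsigned long i;

    assert(tree != NULL);

    /* Early return covers both mag == NULL and mag->size == 0. */
    if (demo_mag_empty(mag))
        return;

    tree->lock_acquisitions++;          /* models taking the lock once */
    for (i = 0; i < mag->size; i++)
        tree->freed++;                  /* models freeing one entry */
    mag->size = 0;
}
```

With this shape, calling demo_mag_free_pfns() on an empty magazine leaves lock_acquisitions untouched, which is the contention saving the commit message describes.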
On Tue, Jan 21, 2020 at 3:11 AM Robin Murphy wrote:
>
> On 18/12/2019 4:39 am, Cong Wang wrote:
> > The IOVA cache algorithm implemented in IOMMU code does not
> > exactly match the original algorithm described in the paper
> > "Magazines and Vmem: Extending the Sl
On Tue, Dec 17, 2019 at 8:40 PM Cong Wang wrote:
>
> This patchset contains three small optimizations for the global spinlock
> contention in IOVA cache. Our memcache perf test shows this reduced its
> p999 latency by 45% on AMD when IOMMU is enabled.
>
> (Resending v3
Both find_iova() and __free_iova() take iova_rbtree_lock,
there is no reason to take and release it twice inside
free_iova().
Fold them into one critical section by calling the unlocked
versions instead.
Cc: Joerg Roedel
Cc: John Garry
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 8
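
The locking change described above can be sketched in plain C. This is a single-threaded illustration under stated assumptions, not the patch: struct demo_lock is a hypothetical counter standing in for iova_rbtree_lock, and the demo_* helpers stand in for the locked and unlocked lookup and removal paths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock that counts acquisitions. */
struct demo_lock {
    int held;
    unsigned long acquisitions;
};

static void demo_lock_take(struct demo_lock *l)
{
    assert(!l->held);                   /* catch double locking in the sketch */
    l->held = 1;
    l->acquisitions++;
}

static void demo_lock_release(struct demo_lock *l)
{
    assert(l->held);
    l->held = 0;
}

struct demo_domain {
    struct demo_lock lock;
    unsigned long pfn;                  /* one tracked entry; 0 means "none" */
};

/* Unlocked helpers: the caller must already hold d->lock. */
static int demo_find_unlocked(struct demo_domain *d, unsigned long pfn)
{
    assert(d->lock.held);
    return d->pfn != 0 && d->pfn == pfn;
}

static void demo_remove_unlocked(struct demo_domain *d)
{
    assert(d->lock.held);
    d->pfn = 0;
}

/* Before: lookup and removal each open their own critical section. */
int demo_free_two_sections(struct demo_domain *d, unsigned long pfn)
{
    int found;

    demo_lock_take(&d->lock);
    found = demo_find_unlocked(d, pfn);
    demo_lock_release(&d->lock);
    if (!found)
        return 0;

    demo_lock_take(&d->lock);
    demo_remove_unlocked(d);
    demo_lock_release(&d->lock);
    return 1;
}

/* After: one critical section covers both steps. */
int demo_free_one_section(struct demo_domain *d, unsigned long pfn)
{
    int found;

    demo_lock_take(&d->lock);
    found = demo_find_unlocked(d, pfn);
    if (found)
        demo_remove_unlocked(d);
    demo_lock_release(&d->lock);
    return found;
}
```

Freeing the same entry costs two acquisitions in the first variant and one in the second. The folded variant also closes the window in which the entry could change between lookup and removal.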
This patchset contains three small optimizations for the global spinlock
contention in IOVA cache. Our memcache perf test shows this reduced its
p999 latency by 45% on AMD when IOMMU is enabled.
(Resending v3 on Joerg's request.)
Cong Wang (3):
iommu: avoid unnecessary magazine
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 22 +++---
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index cb473ddce4cf..184d4c0e20b5 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -797,13 +797,23 @@
es.
Together with a few other changes to make it exactly match
the pseudo code in the paper.
Cc: Joerg Roedel
Cc: John Garry
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 45 +++-
1 file changed, 28 insertions(+), 17 deletions(-)
diff --git a/drivers/i
This patchset contains three small optimizations for the global spinlock
contention in IOVA cache. Our memcache perf test shows this reduced its
p999 latency by 45% on AMD when IOMMU is enabled.
Cong Wang (3):
iommu: avoid unnecessary magazine allocations
iommu: optimize
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 22 +++---
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index cb473ddce4cf..184d4c0e20b5 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -797,13 +797,23 @@
es.
Together with a few other changes to make it exactly match
the pseudo code in the paper.
Cc: Joerg Roedel
Cc: John Garry
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 45 +++-
1 file changed, 28 insertions(+), 17 deletions(-)
diff --git a/drivers/i
Both find_iova() and __free_iova() take iova_rbtree_lock,
there is no reason to take and release it twice inside
free_iova().
Fold them into one critical section by calling the unlocked
versions instead.
Cc: Joerg Roedel
Cc: John Garry
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 8
On Mon, Dec 2, 2019 at 2:02 AM John Garry wrote:
>
> On 30/11/2019 06:02, Cong Wang wrote:
> > On Fri, Nov 29, 2019 at 5:24 AM John Garry wrote:
> >>
> >> On 29/11/2019 00:48, Cong Wang wrote:
> >>> If the maganize is empty, iova_magazine_free_pfns() sh
On Mon, Dec 2, 2019 at 8:59 AM Christoph Hellwig wrote:
>
> > + return (mag && mag->size == IOVA_MAG_SIZE);
>
> > + return (!mag || mag->size == 0);
>
> No need for the braces in both cases.
The current code is already this, I don't want to mix coding style
changes with a
On Mon, Dec 2, 2019 at 2:55 AM John Garry wrote:
> Apart from this change, did anyone ever consider kmem cache for the
> magazines?
You can always make any changes you want after this patch,
I can't do all optimizations in one single patch. :)
So, I will leave this to you.
Thanks.
On Mon, Dec 2, 2019 at 8:58 AM Christoph Hellwig wrote:
>
> I think a subject line better describes what you change, not that
> it matches an original algorithm. The fact that the fix matches
> the original algorithm can go somewhere towards the commit log,
> preferably with a reference to the
On Fri, Nov 29, 2019 at 5:34 AM John Garry wrote:
>
> On 29/11/2019 00:48, Cong Wang wrote:
> > Both find_iova() and __free_iova() take iova_rbtree_lock,
> > there is no reason to take and release it twice inside
> > free_iova().
> >
> > Fold them into the cr
On Fri, Nov 29, 2019 at 5:24 AM John Garry wrote:
>
> On 29/11/2019 00:48, Cong Wang wrote:
> > If the maganize is empty, iova_magazine_free_pfns() should
>
> magazine
Good catch!
>
> > be a nop, however it misses the case of mag->size==0. So we
> >
On Fri, Nov 29, 2019 at 6:43 AM John Garry wrote:
>
> On 29/11/2019 00:48, Cong Wang wrote:
> > The IOVA cache algorithm implemented in IOMMU code does not
> > exactly match the original algorithm described in the paper.
> >
>
> which paper?
It's in drivers/iomm
This patchset contains three small optimizations for the global spinlock
contention in IOVA cache. Our memcache perf test shows this reduced its
p999 latency by 45% on AMD when IOMMU is enabled.
Cong Wang (3):
iommu: match the original algorithm
iommu: optimize iova_magazine_free_pfns
Both find_iova() and __free_iova() take iova_rbtree_lock,
there is no reason to take and release it twice inside
free_iova().
Fold them into one critical section by calling the unlocked
versions instead.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 8 ++--
1 file
If the maganize is empty, iova_magazine_free_pfns() should
be a nop, however it misses the case of mag->size==0. So we
should just call iova_magazine_empty().
This should reduce the contention on iovad->iova_rbtree_lock
a little bit.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers
and only recycle them
when all of them are full.
Before this patch, rcache->depot[] contains either full or
freed entries, after this patch, it contains either full or
empty (but allocated) entries.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c |
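
One way to picture a depot that holds "either full or empty (but allocated)" magazines is the sketch below. It is only an illustration under stated assumptions: the demo_* names, the slot count and the exchange policy are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAG_SIZE    2 /* hypothetical magazine capacity */
#define DEMO_DEPOT_SLOTS 3 /* hypothetical number of depot slots */

struct demo_mag {
    unsigned long size;                 /* 0 = empty, DEMO_MAG_SIZE = full */
};

struct demo_depot {
    struct demo_mag *slot[DEMO_DEPOT_SLOTS];
};

/*
 * Hand a full magazine to the depot and take back an empty one that is
 * already allocated. Returns NULL when the depot has no empty magazine,
 * in which case the caller keeps its full magazine and must fall back
 * to some other path (not modelled here).
 */
struct demo_mag *demo_depot_exchange_full(struct demo_depot *depot,
                                          struct demo_mag *full)
{
    size_t i;

    assert(full != NULL && full->size == DEMO_MAG_SIZE);

    for (i = 0; i < DEMO_DEPOT_SLOTS; i++) {
        struct demo_mag *m = depot->slot[i];

        if (m != NULL && m->size == 0) {
            depot->slot[i] = full;      /* depot keeps the full magazine */
            return m;                   /* caller reuses the empty one */
        }
    }
    return NULL;
}
```

Keeping empty magazines allocated means the exchange needs no allocation on the hot path, at the cost of holding that memory while it is unused.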
On Wed, Nov 27, 2019 at 10:01 AM John Garry wrote:
>
> On 21/11/2019 00:13, Cong Wang wrote:
> > The IOVA cache algorithm implemented in IOMMU code does not
> > exactly match the original algorithm described in the paper.
> >
> > Particularly, it doesn't need to
Both find_iova() and __free_iova() take iova_rbtree_lock,
there is no reason to take and release it twice inside
free_iova().
Fold them into one critical section by calling the unlocked
versions instead.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 8 ++--
1 file
This patchset contains three small optimizations for the global spinlock
contention in IOVA cache. Our memcache perf test shows this reduced its
p999 latency by 45% on AMD when IOMMU is enabled.
Cong Wang (3):
iommu: match the original algorithm
iommu: optimize iova_magazine_free_pfns
If the maganize is empty, iova_magazine_free_pfns() should
be a nop, however it misses the case of mag->size==0. So we
should just call iova_magazine_empty().
This should reduce the contention on iovad->iova_rbtree_lock
a little bit.
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers
Cc: Joerg Roedel
Signed-off-by: Cong Wang
---
drivers/iommu/iova.c | 14 --
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 41c605b0058f..92f72a85e62a 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -900,7 +900,7