Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte

2012-04-20 Thread Takuya Yoshikawa
Sorry for the delay.

On Wed, 18 Apr 2012 12:01:03 +0800
Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:

  I have checked dirty-log-perf myself with this patch [01-07].
  
  GET_DIRTY_LOG for 1GB dirty pages has become more than 15% slower.

 Thanks for your test!
 
 Unbelievable; I will run more tests and check it more carefully.

GET_DIRTY_LOG now traverses the rmap lists intensively, so even a small
change can affect performance when there are many dirty pages.

 Could you please share your tool, so that I can reproduce the result and
 find the real reason?

It's already in kvm-unit-tests!

 I will check whether your tool is better than kernbench/autotest after
 reviewing it.

Let's focus on the lock-less work now; dirty-log-perf is not needed at the moment.

I think you failed to convey the real advantage of your lock-less approach!
I will write about this in the v3 thread.

Thanks,
Takuya


Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte

2012-04-20 Thread Takuya Yoshikawa
On Wed, 18 Apr 2012 18:03:10 +0800
Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:

 By the way, which case did you test? ept = 1?

Yes!

I am not worried about migration without EPT/NPT.

Takuya


Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte

2012-04-20 Thread Xiao Guangrong
On 04/21/2012 09:01 AM, Takuya Yoshikawa wrote:

 Sorry for the delay.
 
 On Wed, 18 Apr 2012 12:01:03 +0800
 Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:
 
 I have checked dirty-log-perf myself with this patch [01-07].

 GET_DIRTY_LOG for 1GB dirty pages has become more than 15% slower.
 
 Thanks for your test!

 Unbelievable; I will run more tests and check it more carefully.
 
 GET_DIRTY_LOG now traverses the rmap lists intensively, so even a small
 change can affect performance when there are many dirty pages.
 
 Could you please share your tool, so that I can reproduce the result and
 find the real reason?
 
 It's already in kvm-unit-tests!
 


Great, I will check that.

 I will check whether your tool is better than kernbench/autotest after
 reviewing it.
 
 Let's focus on the lock-less work now; dirty-log-perf is not needed at the moment.
 
 I think you failed to convey the real advantage of your lock-less approach!
 I will write about this in the v3 thread.


Thank you very much, Takuya!


Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte

2012-04-18 Thread Xiao Guangrong
On 04/17/2012 10:47 PM, Takuya Yoshikawa wrote:

 On Mon, 16 Apr 2012 11:36:25 +0800
 Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:
 
 I tested it with kernbench; no regression was found.
 
 Because kernbench is not a good test for this at all.
 
 It is not a problem, since the iter and the spte should be in the cache.
 
 I have checked dirty-log-perf myself with this patch [01-07].
 
 GET_DIRTY_LOG for 1GB dirty pages has become more than 15% slower.


By the way, which case did you test? ept = 1?



Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte

2012-04-17 Thread Takuya Yoshikawa
On Mon, 16 Apr 2012 11:36:25 +0800
Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:

 I tested it with kernbench; no regression was found.

Because kernbench is not a good test for this at all.

 It is not a problem, since the iter and the spte should be in the cache.

I have checked dirty-log-perf myself with this patch [01-07].

GET_DIRTY_LOG for 1GB dirty pages has become more than 15% slower.

This is not a pure cleanup; it introduces a significant regression in
rmap handling.

This is why I asked you to remove the cleanup patches from this patch
series.  Had it been a really trivial cleanup, I would not have asked.


Note: if you had checked the worst-case latency with this patch series,
you would have noticed this regression yourself.

Autotest and kernbench are not enough to see the effects of this work.

Thanks,
Takuya


Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte

2012-04-17 Thread Xiao Guangrong
On 04/17/2012 10:47 PM, Takuya Yoshikawa wrote:

 On Mon, 16 Apr 2012 11:36:25 +0800
 Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:
 
 I tested it with kernbench; no regression was found.
 
 Because kernbench is not a good test for this at all.
 
 It is not a problem, since the iter and the spte should be in the cache.
 
 I have checked dirty-log-perf myself with this patch [01-07].
 
 GET_DIRTY_LOG for 1GB dirty pages has become more than 15% slower.
 


Thanks for your test!

Unbelievable; I will run more tests and check it more carefully.

Could you please share your tool, so that I can reproduce the result and
find the real reason?

 
 Note: if you had checked the worst-case latency with this patch series,
 you would have noticed this regression yourself.
 
 Autotest and kernbench are not enough to see the effects of this work.
 


I will check whether your tool is better than kernbench/autotest after
reviewing it.



Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte

2012-04-15 Thread Xiao Guangrong
On 04/14/2012 10:44 AM, Takuya Yoshikawa wrote:

 On Fri, 13 Apr 2012 18:12:41 +0800
 Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:
 
 It is used to walk all the sptes of the specified pte_list; after
 this, the code of pte_list_walk can be removed.

 And it can restart the walk automatically if the spte is zapped.
 
 Well, I want to ask two questions:
 
   - why do you prefer pte_list_* naming to rmap_*?
 (not a big issue but just curious)


pte_list is a common infrastructure used for both the parent list and the rmap.
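
For reference, here is a minimal sketch of the encoding in question
(illustrative only, not code from the series; the PTE_LIST_EXT value and
the two helpers are assumptions for the sketch):

/*
 * Sketch: a pte_list head is a single tagged word.
 *   - bit 0 clear: the word itself is a pointer to one spte;
 *   - bit 0 set:   the word, minus the tag, points to a pte_list_desc
 *                  chaining up to PTE_LIST_EXT sptes.
 */
#define PTE_LIST_EXT 4                  /* assumed value */

struct pte_list_desc {
        u64 *sptes[PTE_LIST_EXT];       /* unused tail slots are NULL */
        struct pte_list_desc *more;     /* next descriptor in the chain */
};

static inline bool pte_list_is_single(unsigned long pte_list)
{
        return !(pte_list & 1);         /* tag bit clear: one spte only */
}

static inline struct pte_list_desc *pte_list_to_desc(unsigned long pte_list)
{
        return (struct pte_list_desc *)(pte_list & ~1ul);
}

Both the parent-pte list of a shadow page and the per-gfn rmap use this
same tagged-word layout, which is why one iterator can serve both.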

   - Are you sure the whole indirection by this patch will
 not introduce any regression?
 (not restricted to get_dirty)
 


I tested it with kernbench; no regression was found.

It is not a problem, since the iter and the spte should be in the cache.



[PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte

2012-04-13 Thread Xiao Guangrong
It is used to walk all the sptes of the specified pte_list; after
this, the code of pte_list_walk can be removed.

And it can restart the walk automatically if the spte is zapped.
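
To illustrate the intended usage, a hypothetical caller (not part of
this patch; drop_spte() stands in for whichever zapping helper the
caller uses) could zap a whole rmap like this:

static void rmap_zap_all(struct kvm *kvm, unsigned long *rmapp)
{
        struct spte_iterator iter;
        u64 *sptep;

        /*
         * Each drop_spte() makes *sptep non-present, so
         * pte_list_walk_check_restart() resets the iterator and the
         * next step restarts from the head until the list is empty.
         */
        for_each_rmap_spte(rmapp, &iter, sptep)
                drop_spte(kvm, sptep);
}

Because pte_list_walk_check_restart() runs as the loop's step
expression, any spte zapped in the body causes the next pte_list_next()
call to restart from the head, so stale desc/pos state is never used.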

Signed-off-by: Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c       |  233 +++---
 arch/x86/kvm/mmu_audit.c |    6 +-
 2 files changed, 118 insertions(+), 121 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a1c3628..4e91e94 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -900,26 +900,110 @@ static void pte_list_remove(u64 *spte, unsigned long *pte_list)
         }
 }
 
-typedef void (*pte_list_walk_fn) (u64 *spte);
-static void pte_list_walk(unsigned long *pte_list, pte_list_walk_fn fn)
-{
+/*
+ * Used by the following functions to iterate through the sptes linked by a
+ * pte_list. All fields are private and not assumed to be used outside.
+ */
+struct spte_iterator {
+        /* private fields */
+        unsigned long *pte_list;
+        /* holds the sptep if not NULL */
         struct pte_list_desc *desc;
-        int i;
+        /* index of the sptep, -1 means the walking does not start. */
+        int pos;
+};
 
-        if (!*pte_list)
-                return;
+static void pte_list_walk_init(struct spte_iterator *iter,
+                               unsigned long *pte_list)
+{
+        iter->pte_list = pte_list;
+        iter->pos = -1;
+}
+
+static void pte_list_walk_check_restart(struct spte_iterator *iter, u64 *spte)
+{
+        /*
+         * If the spte is zapped, we should set 'iter->pos = -1' to restart
+         * the walk.
+         */
+        if (!is_shadow_present_pte(*spte))
+                iter->pos = -1;
+}
+
+static u64 *pte_list_first(struct spte_iterator *iter)
+{
+        unsigned long pte_list = *iter->pte_list;
+        u64 *sptep;
+
+        if (!pte_list)
+                return NULL;
+
+        if (!(pte_list & 1)) {
+                iter->desc = NULL;
+                iter->pos = 0;
+                sptep = (u64 *)pte_list;
+
+                goto exit;
+        }
+
+        iter->desc = (struct pte_list_desc *)(pte_list & ~1ul);
+        iter->pos = 0;
+        sptep = iter->desc->sptes[iter->pos];
+
+exit:
+        WARN_ON(sptep && !is_shadow_present_pte(*sptep));
+        return sptep;
+}
 
-        if (!(*pte_list & 1))
-                return fn((u64 *)*pte_list);
+static u64 *pte_list_next(struct spte_iterator *iter)
+{
+        u64 *sptep = NULL;
 
-        desc = (struct pte_list_desc *)(*pte_list & ~1ul);
-        while (desc) {
-                for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i)
-                        fn(desc->sptes[i]);
-                desc = desc->more;
+        if (iter->pos < 0)
+                return pte_list_first(iter);
+
+        if (iter->desc) {
+                if (iter->pos < PTE_LIST_EXT - 1) {
+                        ++iter->pos;
+                        sptep = iter->desc->sptes[iter->pos];
+                        if (sptep)
+                                goto exit;
+                }
+
+                iter->desc = iter->desc->more;
+
+                if (iter->desc) {
+                        iter->pos = 0;
+                        /* desc->sptes[0] cannot be NULL */
+                        sptep = iter->desc->sptes[iter->pos];
+                        goto exit;
+                }
         }
+
+exit:
+        WARN_ON(sptep && !is_shadow_present_pte(*sptep));
+        return sptep;
 }
 
+
+#define for_each_pte_list_spte(pte_list, iter, spte)                    \
+        for (pte_list_walk_init(iter, pte_list);                        \
+             (spte = pte_list_next(iter)) != NULL;                      \
+             pte_list_walk_check_restart(iter, spte))
+
+typedef void (*pte_list_walk_fn) (u64 *spte);
+static void pte_list_walk(unsigned long *pte_list, pte_list_walk_fn fn)
+{
+        struct spte_iterator iter;
+        u64 *spte;
+
+        for_each_pte_list_spte(pte_list, &iter, spte)
+                fn(spte);
+}
+
+#define for_each_rmap_spte(rmap, iter, spte)                            \
+        for_each_pte_list_spte(rmap, iter, spte)
+
 static unsigned long *__gfn_to_rmap(gfn_t gfn, int level,
                                     struct kvm_memory_slot *slot)
 {
@@ -974,92 +1058,19 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
         pte_list_remove(spte, rmapp);
 }
 
-/*
- * Used by the following functions to iterate through the sptes linked by a
- * rmap.  All fields are private and not assumed to be used outside.
- */
-struct rmap_iterator {
-        /* private fields */
-        struct pte_list_desc *desc;     /* holds the sptep if not NULL */
-        int pos;                        /* index of the sptep */
-};
-
-/*
- * Iteration must be started by this function.  This should also be used after
- * removing/dropping sptes from the rmap link because in such cases the
- * information in the itererator may not be valid.
- *
- * Returns sptep if found, NULL otherwise.
- */
-static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
-{
-        u64 *sptep;
-
-        if (!rmap)
-

Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte

2012-04-13 Thread Takuya Yoshikawa
On Fri, 13 Apr 2012 18:12:41 +0800
Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:

 It is used to walk all the sptes of the specified pte_list; after
 this, the code of pte_list_walk can be removed.
 
 And it can restart the walk automatically if the spte is zapped.

Well, I want to ask two questions:

- why do you prefer pte_list_* naming to rmap_*?
  (not a big issue but just curious)
- Are you sure the whole indirection by this patch will
  not introduce any regression?
  (not restricted to get_dirty)

As my previous patch showed, even a slight, trivial change can introduce a
significant performance regression or improvement.

Thanks,
Takuya

 
 Signed-off-by: Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com>
 ---
  arch/x86/kvm/mmu.c       |  233 +++---
  arch/x86/kvm/mmu_audit.c |    6 +-
  2 files changed, 118 insertions(+), 121 deletions(-)
 
 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index a1c3628..4e91e94 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -900,26 +900,110 @@ static void pte_list_remove(u64 *spte, unsigned long *pte_list)
          }
  }
  
 -typedef void (*pte_list_walk_fn) (u64 *spte);
 -static void pte_list_walk(unsigned long *pte_list, pte_list_walk_fn fn)
 -{
 +/*
 + * Used by the following functions to iterate through the sptes linked by a
 + * pte_list. All fields are private and not assumed to be used outside.
 + */
 +struct spte_iterator {
 +        /* private fields */
 +        unsigned long *pte_list;
 +        /* holds the sptep if not NULL */
          struct pte_list_desc *desc;
 -        int i;
 +        /* index of the sptep, -1 means the walking does not start. */
 +        int pos;
 +};
  
 -        if (!*pte_list)
 -                return;
 +static void pte_list_walk_init(struct spte_iterator *iter,
 +                               unsigned long *pte_list)
 +{
 +        iter->pte_list = pte_list;
 +        iter->pos = -1;
 +}
 +
 +static void pte_list_walk_check_restart(struct spte_iterator *iter, u64 *spte)
 +{
 +        /*
 +         * If the spte is zapped, we should set 'iter->pos = -1' to restart
 +         * the walk.
 +         */
 +        if (!is_shadow_present_pte(*spte))
 +                iter->pos = -1;
 +}
 +
 +static u64 *pte_list_first(struct spte_iterator *iter)
 +{
 +        unsigned long pte_list = *iter->pte_list;
 +        u64 *sptep;
 +
 +        if (!pte_list)
 +                return NULL;
 +
 +        if (!(pte_list & 1)) {
 +                iter->desc = NULL;
 +                iter->pos = 0;
 +                sptep = (u64 *)pte_list;
 +
 +                goto exit;
 +        }
 +
 +        iter->desc = (struct pte_list_desc *)(pte_list & ~1ul);
 +        iter->pos = 0;
 +        sptep = iter->desc->sptes[iter->pos];
 +
 +exit:
 +        WARN_ON(sptep && !is_shadow_present_pte(*sptep));
 +        return sptep;
 +}
  
 -        if (!(*pte_list & 1))
 -                return fn((u64 *)*pte_list);
 +static u64 *pte_list_next(struct spte_iterator *iter)
 +{
 +        u64 *sptep = NULL;
  
 -        desc = (struct pte_list_desc *)(*pte_list & ~1ul);
 -        while (desc) {
 -                for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i)
 -                        fn(desc->sptes[i]);
 -                desc = desc->more;
 +        if (iter->pos < 0)
 +                return pte_list_first(iter);
 +
 +        if (iter->desc) {
 +                if (iter->pos < PTE_LIST_EXT - 1) {
 +                        ++iter->pos;
 +                        sptep = iter->desc->sptes[iter->pos];
 +                        if (sptep)
 +                                goto exit;
 +                }
 +
 +                iter->desc = iter->desc->more;
 +
 +                if (iter->desc) {
 +                        iter->pos = 0;
 +                        /* desc->sptes[0] cannot be NULL */
 +                        sptep = iter->desc->sptes[iter->pos];
 +                        goto exit;
 +                }
          }
 +
 +exit:
 +        WARN_ON(sptep && !is_shadow_present_pte(*sptep));
 +        return sptep;
  }
  
 +
 +#define for_each_pte_list_spte(pte_list, iter, spte)                    \
 +        for (pte_list_walk_init(iter, pte_list);                        \
 +             (spte = pte_list_next(iter)) != NULL;                      \
 +             pte_list_walk_check_restart(iter, spte))
 +
 +typedef void (*pte_list_walk_fn) (u64 *spte);
 +static void pte_list_walk(unsigned long *pte_list, pte_list_walk_fn fn)
 +{
 +        struct spte_iterator iter;
 +        u64 *spte;
 +
 +        for_each_pte_list_spte(pte_list, &iter, spte)
 +                fn(spte);
 +}
 +
 +#define for_each_rmap_spte(rmap, iter, spte)                            \
 +        for_each_pte_list_spte(rmap, iter, spte)
 +
  static unsigned long *__gfn_to_rmap(gfn_t gfn, int level,
                                      struct kvm_memory_slot *slot)
  {
 @@ -974,92 +1058,19 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
          pte_list_remove(spte, rmapp);
  }
  
 -/*
 - * Used by the following functions to iterate through the sptes linked by a
 - * rmap.  All fields are private and not assumed to be used outside.
 - */
 -struct rmap_iterator {
 -        /* private fields */