Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte
Sorry for the delay.

On Wed, 18 Apr 2012 12:01:03 +0800
Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:

>> I have checked dirty-log-perf myself with this patch [01-07].
>> GET_DIRTY_LOG for 1GB dirty pages has become more than 15% slower.
>
> Thanks for your test!  Unbelievable, i will do more tests and check it
> more carefully.

GET_DIRTY_LOG now traverses rmap lists intensively, so even a small
change can affect the performance when there are many dirty pages.

> Could you please open your tool, then i can reproduce it and find the
> real reason?

It's already in kvm unit tests!

> I will check whether your tool is better than kernbench/autotest after
> reviewing your tool.

Let's focus on lock-less now: dirty-log-perf is not needed for that.

I think you failed to show the real advantage of your lock-less
approach!  I will write about this on the v3 threads.

	Thanks,
	Takuya
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte
On Wed, 18 Apr 2012 18:03:10 +0800
Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:

> By the way, which case did you test? ept = 1?

Yes!  I am not worried about migration without EPT/NPT.

	Takuya
Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte
On 04/21/2012 09:01 AM, Takuya Yoshikawa wrote:

>> Could you please open your tool, then i can reproduce it and find the
>> real reason?
>
> It's already in kvm unit tests!

Great, i will check that.

>> I will check whether your tool is better than kernbench/autotest after
>> reviewing your tool.
>
> Let's focus on lock-less now: dirty-log-perf is not needed for that.
>
> I think you failed to show the real advantage of your lock-less
> approach!  I will write about this on the v3 threads.

Thank you very much, Takuya!
Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte
On 04/17/2012 10:47 PM, Takuya Yoshikawa wrote:

>> I tested it with kernbench, no regression is found.
>
> Because kernbench is not a good test for this at all.
>
>> It is not a problem since the iter and spte should be in the cache.
>
> I have checked dirty-log-perf myself with this patch [01-07].
> GET_DIRTY_LOG for 1GB dirty pages has become more than 15% slower.

By the way, which case did you test? ept = 1?
Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte
On Mon, 16 Apr 2012 11:36:25 +0800
Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:

> I tested it with kernbench, no regression is found.

Because kernbench is not a good test for this at all.

> It is not a problem since the iter and spte should be in the cache.

I have checked dirty-log-perf myself with this patch [01-07].
GET_DIRTY_LOG for 1GB dirty pages has become more than 15% slower.

This is not a pure cleanup: it introduces a significant regression in
rmap handling.  This is why I asked you to remove cleanup patches from
this patch series; had it been a really trivial cleanup, I would not
have asked that.

Note: if you had checked the worst-case latency with this patch series,
you would have noticed this regression by yourself.  Autotest and
kernbench are not enough to see the effect of this work.

	Thanks,
	Takuya
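[Editor's note: the methodological point above is that throughput benchmarks such as kernbench average away tail effects, while what matters for GET_DIRTY_LOG is the worst-case latency of a single call. A minimal, hypothetical harness sketch of that idea follows; operation() is a placeholder, not a KVM call, and a real measurement would issue the KVM_GET_DIRTY_LOG ioctl against a guest with dirty memory.]

#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* placeholder for the operation under test (hypothetical stand-in) */
static void operation(void)
{
	volatile uint64_t x = 0;
	for (int i = 0; i < 100000; i++)
		x += i;
}

static int64_t elapsed_ns(struct timespec s, struct timespec e)
{
	return (int64_t)(e.tv_sec - s.tv_sec) * 1000000000ll
	       + (e.tv_nsec - s.tv_nsec);
}

int main(void)
{
	int64_t worst = 0, total = 0;
	const int iters = 1000;

	for (int i = 0; i < iters; i++) {
		struct timespec s, e;

		clock_gettime(CLOCK_MONOTONIC, &s);
		operation();
		clock_gettime(CLOCK_MONOTONIC, &e);

		int64_t ns = elapsed_ns(s, e);
		total += ns;
		if (ns > worst)
			worst = ns;
	}

	/* a regression can hide in 'worst' while 'avg' looks unchanged */
	printf("avg %lld ns, worst %lld ns\n",
	       (long long)(total / iters), (long long)worst);
	return 0;
}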
Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte
On 04/17/2012 10:47 PM, Takuya Yoshikawa wrote:

> I have checked dirty-log-perf myself with this patch [01-07].
> GET_DIRTY_LOG for 1GB dirty pages has become more than 15% slower.

Thanks for your test!  Unbelievable, i will do more tests and check it
more carefully.

Could you please open your tool, then i can reproduce it and find the
real reason?

> Note: if you had checked the worst-case latency with this patch
> series, you would have noticed this regression by yourself.  Autotest
> and kernbench are not enough to see the effect of this work.

I will check whether your tool is better than kernbench/autotest after
reviewing your tool.
Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte
On 04/14/2012 10:44 AM, Takuya Yoshikawa wrote:
> On Fri, 13 Apr 2012 18:12:41 +0800
> Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:
>
>> It is used to walk all the sptes of the specified pte_list, after
>> this, the code of pte_list_walk can be removed.  And it can restart
>> the walking automatically if the spte is zapped.
>
> Well, I want to ask two questions:
>
> - why do you prefer pte_list_* naming to rmap_*?
>   (not a big issue but just curious)

pte_list is a common infrastructure for both parent-list and rmap.

> - Are you sure the whole indirection by this patch will not introduce
>   any regression?  (not restricted to get_dirty)

I tested it with kernbench, no regression is found.  It is not a
problem since the iter and spte should be in the cache.
[PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte
It is used to walk all the sptes of the specified pte_list, after this,
the code of pte_list_walk can be removed.

And it can restart the walking automatically if the spte is zapped.

Signed-off-by: Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c       |  233 +++---
 arch/x86/kvm/mmu_audit.c |    6 +-
 2 files changed, 118 insertions(+), 121 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a1c3628..4e91e94 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -900,26 +900,110 @@ static void pte_list_remove(u64 *spte, unsigned long *pte_list)
 	}
 }

-typedef void (*pte_list_walk_fn) (u64 *spte);
-static void pte_list_walk(unsigned long *pte_list, pte_list_walk_fn fn)
-{
+/*
+ * Used by the following functions to iterate through the sptes linked by a
+ * pte_list.  All fields are private and not assumed to be used outside.
+ */
+struct spte_iterator {
+	/* private fields */
+	unsigned long *pte_list;
+	/* holds the sptep if not NULL */
 	struct pte_list_desc *desc;
-	int i;
+	/* index of the sptep, -1 means the walking does not start. */
+	int pos;
+};

-	if (!*pte_list)
-		return;
+static void pte_list_walk_init(struct spte_iterator *iter,
+			       unsigned long *pte_list)
+{
+	iter->pte_list = pte_list;
+	iter->pos = -1;
+}
+
+static void pte_list_walk_check_restart(struct spte_iterator *iter, u64 *spte)
+{
+	/*
+	 * If the spte is zapped, we should set 'iter->pos = -1' to restart
+	 * the walk.
+	 */
+	if (!is_shadow_present_pte(*spte))
+		iter->pos = -1;
+}
+
+static u64 *pte_list_first(struct spte_iterator *iter)
+{
+	unsigned long pte_list = *iter->pte_list;
+	u64 *sptep;
+
+	if (!pte_list)
+		return NULL;
+
+	if (!(pte_list & 1)) {
+		iter->desc = NULL;
+		iter->pos = 0;
+		sptep = (u64 *)pte_list;
+
+		goto exit;
+	}
+
+	iter->desc = (struct pte_list_desc *)(pte_list & ~1ul);
+	iter->pos = 0;
+	sptep = iter->desc->sptes[iter->pos];
+
+exit:
+	WARN_ON(sptep && !is_shadow_present_pte(*sptep));
+	return sptep;
+}

-	if (!(*pte_list & 1))
-		return fn((u64 *)*pte_list);
+static u64 *pte_list_next(struct spte_iterator *iter)
+{
+	u64 *sptep = NULL;

-	desc = (struct pte_list_desc *)(*pte_list & ~1ul);
-	while (desc) {
-		for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i)
-			fn(desc->sptes[i]);
-		desc = desc->more;
+	if (iter->pos < 0)
+		return pte_list_first(iter);
+
+	if (iter->desc) {
+		if (iter->pos < PTE_LIST_EXT - 1) {
+			++iter->pos;
+			sptep = iter->desc->sptes[iter->pos];
+			if (sptep)
+				goto exit;
+		}
+
+		iter->desc = iter->desc->more;
+
+		if (iter->desc) {
+			iter->pos = 0;
+			/* desc->sptes[0] cannot be NULL */
+			sptep = iter->desc->sptes[iter->pos];
+			goto exit;
+		}
 	}
+
+exit:
+	WARN_ON(sptep && !is_shadow_present_pte(*sptep));
+	return sptep;
 }
+
+#define for_each_pte_list_spte(pte_list, iter, spte)	\
+	for (pte_list_walk_init(iter, pte_list);	\
+	     (spte = pte_list_next(iter)) != NULL;	\
+	     pte_list_walk_check_restart(iter, spte))
+
+typedef void (*pte_list_walk_fn) (u64 *spte);
+static void pte_list_walk(unsigned long *pte_list, pte_list_walk_fn fn)
+{
+	struct spte_iterator iter;
+	u64 *spte;
+
+	for_each_pte_list_spte(pte_list, &iter, spte)
+		fn(spte);
+}
+
+#define for_each_rmap_spte(rmap, iter, spte)	\
+	for_each_pte_list_spte(rmap, iter, spte)
+
 static unsigned long *__gfn_to_rmap(gfn_t gfn, int level,
 				    struct kvm_memory_slot *slot)
 {
@@ -974,92 +1058,19 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	pte_list_remove(spte, rmapp);
 }

-/*
- * Used by the following functions to iterate through the sptes linked by a
- * rmap.  All fields are private and not assumed to be used outside.
- */
-struct rmap_iterator {
-	/* private fields */
-	struct pte_list_desc *desc;	/* holds the sptep if not NULL */
-	int pos;			/* index of the sptep */
-};
-
-/*
- * Iteration must be started by this function.  This should also be used after
- * removing/dropping sptes from the rmap link because in such cases the
- * information in the itererator may not be valid.
- *
- * Returns sptep if found, NULL otherwise.
- */
-static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
-{
-	u64 *sptep;
-
-	if (!rmap)
Re: [PATCH v2 07/16] KVM: MMU: introduce for_each_pte_list_spte
On Fri, 13 Apr 2012 18:12:41 +0800
Xiao Guangrong <xiaoguangr...@linux.vnet.ibm.com> wrote:

> It is used to walk all the sptes of the specified pte_list, after
> this, the code of pte_list_walk can be removed.
>
> And it can restart the walking automatically if the spte is zapped.

Well, I want to ask two questions:

- why do you prefer pte_list_* naming to rmap_*?
  (not a big issue but just curious)

- Are you sure the whole indirection by this patch will not introduce
  any regression?  (not restricted to get_dirty)

As my previous patch showed, even a slight, trivial change can
introduce a significant performance regression/improvement.

	Thanks,
	Takuya