Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-28 Thread Xiao Guangrong
On 11/27/2013 03:58 AM, Marcelo Tosatti wrote: > On Tue, Nov 26, 2013 at 11:10:19AM +0800, Xiao Guangrong wrote: >> On 11/25/2013 10:23 PM, Marcelo Tosatti wrote: >>> On Mon, Nov 25, 2013 at 02:48:37PM +0200, Avi Kivity wrote: >>>> On Mon, Nov 25, 2013 at 8:11

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-28 Thread Xiao Guangrong
On 11/27/2013 03:58 AM, Marcelo Tosatti wrote: On Tue, Nov 26, 2013 at 11:10:19AM +0800, Xiao Guangrong wrote: On 11/25/2013 10:23 PM, Marcelo Tosatti wrote: On Mon, Nov 25, 2013 at 02:48:37PM +0200, Avi Kivity wrote: On Mon, Nov 25, 2013 at 8:11 AM, Xiao Guangrong xiaoguangr

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-28 Thread Xiao Guangrong
On 11/27/2013 03:31 AM, Marcelo Tosatti wrote: On Tue, Nov 26, 2013 at 11:21:37AM +0800, Xiao Guangrong wrote: On 11/26/2013 02:12 AM, Marcelo Tosatti wrote: On Mon, Nov 25, 2013 at 02:29:03PM +0800, Xiao Guangrong wrote: Also, there is no guarantee of termination (as long as sptes

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
On 11/26/2013 02:12 AM, Marcelo Tosatti wrote: > On Mon, Nov 25, 2013 at 02:29:03PM +0800, Xiao Guangrong wrote: >>>> Also, there is no guarantee of termination (as long as sptes are >>>> deleted with the correct timing). BTW, can't see any guarantee of >>>>

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
On 11/25/2013 10:23 PM, Marcelo Tosatti wrote: > On Mon, Nov 25, 2013 at 02:48:37PM +0200, Avi Kivity wrote: >> On Mon, Nov 25, 2013 at 8:11 AM, Xiao Guangrong >> wrote: >>> >>> On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti wrote: >> >> >> >

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
On 11/25/2013 10:08 PM, Marcelo Tosatti wrote: > On Mon, Nov 25, 2013 at 02:11:31PM +0800, Xiao Guangrong wrote: >> >> On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti wrote: >> >>> On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote: >>>> I

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
Hi Peter, On 11/25/2013 05:31 PM, Peter Zijlstra wrote: > On Fri, Nov 22, 2013 at 05:14:29PM -0200, Marcelo Tosatti wrote: >> Also, there is no guarantee of termination (as long as sptes are >> deleted with the correct timing). BTW, can't see any guarantee of >> termination for rculist nulls

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
On 11/25/2013 06:19 PM, Gleb Natapov wrote: > On Mon, Nov 25, 2013 at 02:11:31PM +0800, Xiao Guangrong wrote: >>> >>> For example, nothing prevents lockless walker to move into some >>> parent_ptes chain, right? >> >> No. >> >> The nulls can h

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
On 11/25/2013 06:19 PM, Gleb Natapov wrote: On Mon, Nov 25, 2013 at 02:11:31PM +0800, Xiao Guangrong wrote: For example, nothing prevents lockless walker to move into some parent_ptes chain, right? No. The nulls can help us to detect this case, for parent_ptes, the nulls points to shadow

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
Hi Peter, On 11/25/2013 05:31 PM, Peter Zijlstra wrote: On Fri, Nov 22, 2013 at 05:14:29PM -0200, Marcelo Tosatti wrote: Also, there is no guarantee of termination (as long as sptes are deleted with the correct timing). BTW, can't see any guarantee of termination for rculist nulls either (a

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
On 11/25/2013 10:08 PM, Marcelo Tosatti wrote: On Mon, Nov 25, 2013 at 02:11:31PM +0800, Xiao Guangrong wrote: On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti mtosa...@redhat.com wrote: On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote: It likes nulls list and we use the pte-list

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
On 11/25/2013 10:23 PM, Marcelo Tosatti wrote: On Mon, Nov 25, 2013 at 02:48:37PM +0200, Avi Kivity wrote: On Mon, Nov 25, 2013 at 8:11 AM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote: On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti mtosa...@redhat.com wrote: snip complicated stuff

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-25 Thread Xiao Guangrong
On 11/26/2013 02:12 AM, Marcelo Tosatti wrote: On Mon, Nov 25, 2013 at 02:29:03PM +0800, Xiao Guangrong wrote: Also, there is no guarantee of termination (as long as sptes are deleted with the correct timing). BTW, can't see any guarantee of termination for rculist nulls either (a writer can

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-24 Thread Xiao Guangrong
On 11/25/2013 02:11 PM, Xiao Guangrong wrote: > > On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti wrote: > >> On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote: >>> It likes nulls list and we use the pte-list as the nulls which can help us >>>

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-24 Thread Xiao Guangrong
On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti wrote: > On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote: >> It likes nulls list and we use the pte-list as the nulls which can help us to >> detect whether the "desc" is moved to another rmap then we

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-24 Thread Xiao Guangrong
On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti mtosa...@redhat.com wrote: On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote: It likes nulls list and we use the pte-list as the nulls which can help us to detect whether the desc is moved to another rmap then we can re-walk the rmap

Re: [PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-11-24 Thread Xiao Guangrong
On 11/25/2013 02:11 PM, Xiao Guangrong wrote: On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti mtosa...@redhat.com wrote: On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote: It likes nulls list and we use the pte-list as the nulls which can help us to detect whether the desc

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-11-20 Thread Xiao Guangrong
On Nov 21, 2013, at 3:47 AM, Marcelo Tosatti wrote: > On Wed, Nov 20, 2013 at 10:20:09PM +0800, Xiao Guangrong wrote: >>> But what guarantee does userspace require, from GET_DIRTY_LOG, while vcpus >>> are >>> executing? >> >> Aha. Single calling GET

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-11-20 Thread Xiao Guangrong
On Nov 20, 2013, at 8:29 AM, Marcelo Tosatti wrote: > On Wed, Aug 07, 2013 at 12:06:49PM +0800, Xiao Guangrong wrote: >> On 08/07/2013 09:48 AM, Marcelo Tosatti wrote: >>> On Tue, Jul 30, 2013 at 09:02:02PM +0800, Xiao Guangrong wrote: >>>> Make sure we can see the

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-11-20 Thread Xiao Guangrong
On Nov 20, 2013, at 8:29 AM, Marcelo Tosatti mtosa...@redhat.com wrote: On Wed, Aug 07, 2013 at 12:06:49PM +0800, Xiao Guangrong wrote: On 08/07/2013 09:48 AM, Marcelo Tosatti wrote: On Tue, Jul 30, 2013 at 09:02:02PM +0800, Xiao Guangrong wrote: Make sure we can see the writable spte before

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-11-20 Thread Xiao Guangrong
On Nov 21, 2013, at 3:47 AM, Marcelo Tosatti mtosa...@redhat.com wrote: On Wed, Nov 20, 2013 at 10:20:09PM +0800, Xiao Guangrong wrote: But what guarantee does userspace require, from GET_DIRTY_LOG, while vcpus are executing? Aha. Single calling GET_DIRTY_LOG is useless since new dirty

Re: [PATCH v3 04/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-11-14 Thread Xiao Guangrong
On 11/15/2013 02:39 AM, Marcelo Tosatti wrote: > On Thu, Nov 14, 2013 at 01:15:24PM +0800, Xiao Guangrong wrote: >> >> Hi Marcelo, >> >> On 11/14/2013 08:36 AM, Marcelo Tosatti wrote: >> >>> >>> Any code location which reads the writable

Re: [PATCH v3 04/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-11-14 Thread Xiao Guangrong
On 11/15/2013 02:39 AM, Marcelo Tosatti wrote: On Thu, Nov 14, 2013 at 01:15:24PM +0800, Xiao Guangrong wrote: Hi Marcelo, On 11/14/2013 08:36 AM, Marcelo Tosatti wrote: Any code location which reads the writable bit in the spte and assumes if its not set, that the translation which

Re: [PATCH v3 04/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-11-13 Thread Xiao Guangrong
Hi Marcelo, On 11/14/2013 08:36 AM, Marcelo Tosatti wrote: > > Any code location which reads the writable bit in the spte and assumes if its > not > set, that the translation which the spte refers to is not cached in a > remote CPU's TLB can become buggy. (*) > > It might be the case that

Re: [PATCH v3 04/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-11-13 Thread Xiao Guangrong
Hi Marcelo, On 11/14/2013 08:36 AM, Marcelo Tosatti wrote: Any code location which reads the writable bit in the spte and assumes if its not set, that the translation which the spte refers to is not cached in a remote CPU's TLB can become buggy. (*) It might be the case that now its

Re: [PATCH v3 00/15] KVM: MMU: locklessly write-protect

2013-11-10 Thread Xiao Guangrong
On 11/03/2013 08:29 PM, Gleb Natapov wrote: > Marcelo can you review it please? > Ping.. > On Wed, Oct 23, 2013 at 09:29:18PM +0800, Xiao Guangrong wrote: >> Changelog v3: >> - the changes from Gleb's review: >> 1) drop the patch which fixed the count o

Re: [PATCH v3 00/15] KVM: MMU: locklessly write-protect

2013-11-10 Thread Xiao Guangrong
On 11/03/2013 08:29 PM, Gleb Natapov wrote: Marcelo can you review it please? Ping.. On Wed, Oct 23, 2013 at 09:29:18PM +0800, Xiao Guangrong wrote: Changelog v3: - the changes from Gleb's review: 1) drop the patch which fixed the count of spte number in rmap since it can

Re: [PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-27 Thread Xiao Guangrong
On 10/24/2013 08:32 PM, Gleb Natapov wrote: > On Thu, Oct 24, 2013 at 07:01:49PM +0800, Xiao Guangrong wrote: >> On 10/24/2013 06:39 PM, Gleb Natapov wrote: >>> On Thu, Oct 24, 2013 at 06:10:46PM +0800, Xiao Guangrong wrote: >>>> On 10/24/2013 05:52 PM, Gleb Natap

Re: [PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-27 Thread Xiao Guangrong
On 10/24/2013 08:32 PM, Gleb Natapov wrote: On Thu, Oct 24, 2013 at 07:01:49PM +0800, Xiao Guangrong wrote: On 10/24/2013 06:39 PM, Gleb Natapov wrote: On Thu, Oct 24, 2013 at 06:10:46PM +0800, Xiao Guangrong wrote: On 10/24/2013 05:52 PM, Gleb Natapov wrote: On Thu, Oct 24, 2013 at 05:29

Re: [PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-24 Thread Xiao Guangrong
On 10/24/2013 06:39 PM, Gleb Natapov wrote: > On Thu, Oct 24, 2013 at 06:10:46PM +0800, Xiao Guangrong wrote: >> On 10/24/2013 05:52 PM, Gleb Natapov wrote: >>> On Thu, Oct 24, 2013 at 05:29:44PM +0800, Xiao Guangrong wrote: >>>> On 10/24/2013 05:19 PM, Gleb Natapo

Re: [PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-24 Thread Xiao Guangrong
On 10/24/2013 05:52 PM, Gleb Natapov wrote: > On Thu, Oct 24, 2013 at 05:29:44PM +0800, Xiao Guangrong wrote: >> On 10/24/2013 05:19 PM, Gleb Natapov wrote: >> >>>> @@ -946,7 +947,7 @@ static inline struct kvm_mmu_page *page_header(hpa_t >>>> shadow_

Re: [PATCH v3 13/15] KVM: MMU: locklessly write-protect the page

2013-10-24 Thread Xiao Guangrong
On 10/24/2013 05:17 PM, Gleb Natapov wrote: >> >> -/** >> - * kvm_mmu_write_protect_pt_masked - write protect selected PT level pages >> +static void __rmap_write_protect_lockless(u64 *sptep) >> +{ >> +u64 spte; >> + >> +retry: >> +/* >> + * Note we may partly read the sptep on

Re: [PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-24 Thread Xiao Guangrong
On 10/24/2013 05:19 PM, Gleb Natapov wrote: >> @@ -946,7 +947,7 @@ static inline struct kvm_mmu_page *page_header(hpa_t >> shadow_page) >> { >> struct page *page = pfn_to_page(shadow_page >> PAGE_SHIFT); >> >> -return (struct kvm_mmu_page *)page_private(page); >> +return (struct

Re: [PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-24 Thread Xiao Guangrong
On 10/24/2013 05:19 PM, Gleb Natapov wrote: @@ -946,7 +947,7 @@ static inline struct kvm_mmu_page *page_header(hpa_t shadow_page) { struct page *page = pfn_to_page(shadow_page >> PAGE_SHIFT); -return (struct kvm_mmu_page *)page_private(page); +return (struct kvm_mmu_page

Re: [PATCH v3 13/15] KVM: MMU: locklessly write-protect the page

2013-10-24 Thread Xiao Guangrong
On 10/24/2013 05:17 PM, Gleb Natapov wrote: -/** - * kvm_mmu_write_protect_pt_masked - write protect selected PT level pages +static void __rmap_write_protect_lockless(u64 *sptep) +{ +u64 spte; + +retry: +/* + * Note we may partly read the sptep on 32bit host, however, we

Re: [PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-24 Thread Xiao Guangrong
On 10/24/2013 05:52 PM, Gleb Natapov wrote: On Thu, Oct 24, 2013 at 05:29:44PM +0800, Xiao Guangrong wrote: On 10/24/2013 05:19 PM, Gleb Natapov wrote: @@ -946,7 +947,7 @@ static inline struct kvm_mmu_page *page_header(hpa_t shadow_page) { struct page *page = pfn_to_page(shadow_page

Re: [PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-24 Thread Xiao Guangrong
On 10/24/2013 06:39 PM, Gleb Natapov wrote: On Thu, Oct 24, 2013 at 06:10:46PM +0800, Xiao Guangrong wrote: On 10/24/2013 05:52 PM, Gleb Natapov wrote: On Thu, Oct 24, 2013 at 05:29:44PM +0800, Xiao Guangrong wrote: On 10/24/2013 05:19 PM, Gleb Natapov wrote: @@ -946,7 +947,7 @@ static

[PATCH v3 05/15] KVM: MMU: update spte and add it into rmap before dirty log

2013-10-23 Thread Xiao Guangrong
Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 84 ++ arch/x86/kvm/x86.c | 10 +++ 2 files changed, 76 insertions(+), 18 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 337d173..e85eed6 100644 --- a/arch/x86

[PATCH v3 02/15] KVM: MMU: lazily drop large spte

2013-10-23 Thread Xiao Guangrong
large spte to writable but only dirty the first page into the dirty-bitmap that means other pages are missed. Fixed it by only the normal sptes (on the PT_PAGE_TABLE_LEVEL level) can be fast fixed Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 36 arch

[PATCH v3 01/15] KVM: MMU: properly check last spte in fast_page_fault()

2013-10-23 Thread Xiao Guangrong
ble and avoids potential issue in the further development Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 40772ef..d2aacc2 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/

[PATCH v3 03/15] KVM: MMU: flush tlb if the spte can be locklessly modified

2013-10-23 Thread Xiao Guangrong
, see spte.w = 0, then without flush tlb unlock mmu-lock !!! At this point, the shadow page can still be writable due to the corrupt tlb entry Flush all TLB Signed-off-by: Xiao Guangrong

[PATCH v3 00/15] KVM: MMU: locklessly write-protect

2013-10-23 Thread Xiao Guangrong
te based on the dirty bitmap, we should ensure the writable spte can be found in rmap before the dirty bitmap is visible. Otherwise, we cleared the dirty bitmap and failed to write-protect the page. Performance result The performance result and the benchmark can be found at: http

[PATCH v3 04/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-10-23 Thread Xiao Guangrong
, that means it does not depend on PT_WRITABLE_MASK anymore Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 18 ++ arch/x86/kvm/x86.c | 9 +++-- 2 files changed, 21 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 62f18ec..337d173

[PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-10-23 Thread Xiao Guangrong
that lock) so that we can not see the same nulls used on different rmaps Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 35 +-- 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 5cce039..4687329 100644 -

[PATCH v3 08/15] KVM: MMU: introduce pte-list lockless walker

2013-10-23 Thread Xiao Guangrong
-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 57 ++ 1 file changed, 53 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 4687329..a864140 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -975,6

[PATCH v3 11/15] KVM: MMU: locklessly access shadow page under rcu protection

2013-10-23 Thread Xiao Guangrong
Use SLAB_DESTROY_BY_RCU to prevent the shadow page to be freed from the slab, so that it can be locklessly accessed by holding rcu lock Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm

[PATCH v3 12/15] KVM: MMU: check last spte with unawareness of mapping level

2013-10-23 Thread Xiao Guangrong
is_last_spte() do not depend on the mapping level anymore This is important to implement lockless write-protection since only spte is available at that time Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 25 - arch/x86/kvm/mmu_audit.c | 6 +++--- arch

[PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-23 Thread Xiao Guangrong
Allocate shadow pages from slab instead of page-allocator, frequent shadow page allocation and free can be hit in the slab cache, it is very useful for shadow mmu Signed-off-by: Xiao Guangrong --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/mmu.c | 46

[PATCH v3 14/15] KVM: MMU: clean up spte_write_protect

2013-10-23 Thread Xiao Guangrong
Now, the only user of spte_write_protect is rmap_write_protect which always calls spte_write_protect with pt_protect = true, so drop it and the unused parameter @kvm Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 19 --- 1 file changed, 8 insertions(+), 11 deletions

[PATCH v3 09/15] KVM: MMU: initialize the pointers in pte_list_desc properly

2013-10-23 Thread Xiao Guangrong
kmem_cache_zalloc Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index a864140..f3ae74e6 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -687,14 +687,15

[PATCH v3 13/15] KVM: MMU: locklessly write-protect the page

2013-10-23 Thread Xiao Guangrong
-off-by: Xiao Guangrong --- arch/x86/include/asm/kvm_host.h | 4 --- arch/x86/kvm/mmu.c | 59 ++--- arch/x86/kvm/mmu.h | 6 + arch/x86/kvm/x86.c | 11 4 files changed, 55 insertions(+), 25 deletions(-) diff

[PATCH v3 15/15] KVM: MMU: use rcu functions to access the pointer

2013-10-23 Thread Xiao Guangrong
Use rcu_assign_pointer() to update all the pointer in desc and use rcu_dereference() to lockless read the pointer Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 46 -- 1 file changed, 28 insertions(+), 18 deletions(-) diff --git a/arch/x86

[PATCH v3 06/15] KVM: MMU: redesign the algorithm of pte_list

2013-10-23 Thread Xiao Guangrong
che miss Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 179 - 1 file changed, 123 insertions(+), 56 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index e85eed6..5cce039 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86

[PATCH v3 06/15] KVM: MMU: redesign the algorithm of pte_list

2013-10-23 Thread Xiao Guangrong
-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 179 - 1 file changed, 123 insertions(+), 56 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index e85eed6..5cce039 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch

[PATCH v3 15/15] KVM: MMU: use rcu functions to access the pointer

2013-10-23 Thread Xiao Guangrong
Use rcu_assign_pointer() to update all the pointer in desc and use rcu_dereference() to lockless read the pointer Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 46 -- 1 file changed, 28 insertions(+), 18

[PATCH v3 09/15] KVM: MMU: initialize the pointers in pte_list_desc properly

2013-10-23 Thread Xiao Guangrong
kmem_cache_zalloc Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index a864140..f3ae74e6 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86

[PATCH v3 13/15] KVM: MMU: locklessly write-protect the page

2013-10-23 Thread Xiao Guangrong
-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/include/asm/kvm_host.h | 4 --- arch/x86/kvm/mmu.c | 59 ++--- arch/x86/kvm/mmu.h | 6 + arch/x86/kvm/x86.c | 11 4 files changed, 55

[PATCH v3 14/15] KVM: MMU: clean up spte_write_protect

2013-10-23 Thread Xiao Guangrong
Now, the only user of spte_write_protect is rmap_write_protect which always calls spte_write_protect with pt_protect = true, so drop it and the unused parameter @kvm Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 19 --- 1 file changed, 8

[PATCH v3 12/15] KVM: MMU: check last spte with unawareness of mapping level

2013-10-23 Thread Xiao Guangrong
is_last_spte() do not depend on the mapping level anymore This is important to implement lockless write-protection since only spte is available at that time Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 25 - arch/x86/kvm

[PATCH v3 10/15] KVM: MMU: allocate shadow pages from slab

2013-10-23 Thread Xiao Guangrong
Allocate shadow pages from slab instead of page-allocator, frequent shadow page allocation and free can be hit in the slab cache, it is very useful for shadow mmu Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/mmu.c

[PATCH v3 11/15] KVM: MMU: locklessly access shadow page under rcu protection

2013-10-23 Thread Xiao Guangrong
Use SLAB_DESTROY_BY_RCU to prevent the shadow page to be freed from the slab, so that it can be locklessly accessed by holding rcu lock Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git

[PATCH v3 07/15] KVM: MMU: introduce nulls desc

2013-10-23 Thread Xiao Guangrong
can not see the same nulls used on different rmaps Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 35 +-- 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 5cce039

[PATCH v3 08/15] KVM: MMU: introduce pte-list lockless walker

2013-10-23 Thread Xiao Guangrong
-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 57 ++ 1 file changed, 53 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 4687329..a864140 100644 --- a/arch/x86/kvm/mmu.c +++ b

[PATCH v3 04/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-10-23 Thread Xiao Guangrong
, that means it does not depend on PT_WRITABLE_MASK anymore Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 18 ++ arch/x86/kvm/x86.c | 9 +++-- 2 files changed, 21 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm

[PATCH v3 00/15] KVM: MMU: locklessly write-protect

2013-10-23 Thread Xiao Guangrong
is visible. Otherwise, we cleared the dirty bitmap and failed to write-protect the page. Performance result The performance result and the benchmark can be found at: http://permalink.gmane.org/gmane.linux.kernel/1534876 Xiao Guangrong (15): KVM: MMU: properly check last spte

[PATCH v3 01/15] KVM: MMU: properly check last spte in fast_page_fault()

2013-10-23 Thread Xiao Guangrong
and avoids potential issue in the further development Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 40772ef..d2aacc2 100644 --- a/arch/x86

[PATCH v3 02/15] KVM: MMU: lazily drop large spte

2013-10-23 Thread Xiao Guangrong
large spte to writable but only dirty the first page into the dirty-bitmap that means other pages are missed. Fixed it by only the normal sptes (on the PT_PAGE_TABLE_LEVEL level) can be fast fixed Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 36

[PATCH v3 05/15] KVM: MMU: update spte and add it into rmap before dirty log

2013-10-23 Thread Xiao Guangrong
Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 84 ++ arch/x86/kvm/x86.c | 10 +++ 2 files changed, 76 insertions(+), 18 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index

Re: [PATCH v2 12/15] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-10-15 Thread Xiao Guangrong
On Oct 16, 2013, at 6:21 AM, Marcelo Tosatti wrote: > On Tue, Oct 15, 2013 at 06:57:05AM +0300, Gleb Natapov wrote: >>> >>> Why is it safe to allow access, by the lockless page write protect >>> side, to spt pointer for shadow page A that can change to a shadow page >>> pointer of shadow page

Re: [PATCH v2 12/15] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-10-15 Thread Xiao Guangrong
On Oct 16, 2013, at 6:21 AM, Marcelo Tosatti mtosa...@redhat.com wrote: On Tue, Oct 15, 2013 at 06:57:05AM +0300, Gleb Natapov wrote: Why is it safe to allow access, by the lockless page write protect side, to spt pointer for shadow page A that can change to a shadow page pointer of

Re: [PATCH v2 12/15] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-10-09 Thread Xiao Guangrong
On 10/09/2013 09:56 AM, Marcelo Tosatti wrote: > On Tue, Oct 08, 2013 at 12:02:32PM +0800, Xiao Guangrong wrote: >> >> Hi Marcelo, >> >> On Oct 8, 2013, at 9:23 AM, Marcelo Tosatti wrote: >> >>>> >>>> + if (kvm->arch.rcu_free_shado

Re: [PATCH v2 12/15] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-10-09 Thread Xiao Guangrong
On 10/09/2013 09:56 AM, Marcelo Tosatti wrote: On Tue, Oct 08, 2013 at 12:02:32PM +0800, Xiao Guangrong wrote: Hi Marcelo, On Oct 8, 2013, at 9:23 AM, Marcelo Tosatti mtosa...@redhat.com wrote: + if (kvm->arch.rcu_free_shadow_page) { + kvm_mmu_isolate_pages(invalid_list

Re: [PATCH v2 12/15] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-10-07 Thread Xiao Guangrong
Hi Marcelo, On Oct 8, 2013, at 9:23 AM, Marcelo Tosatti wrote: >> >> +if (kvm->arch.rcu_free_shadow_page) { >> +kvm_mmu_isolate_pages(invalid_list); >> +sp = list_first_entry(invalid_list, struct kvm_mmu_page, link); >> +list_del_init(invalid_list); >>

Re: [PATCH v2 12/15] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-10-07 Thread Xiao Guangrong
Hi Marcelo, On Oct 8, 2013, at 9:23 AM, Marcelo Tosatti mtosa...@redhat.com wrote: +if (kvm->arch.rcu_free_shadow_page) { +kvm_mmu_isolate_pages(invalid_list); +sp = list_first_entry(invalid_list, struct kvm_mmu_page, link); +

Re: [PATCH v2 05/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-10-03 Thread Xiao Guangrong
On Oct 1, 2013, at 7:05 AM, Marcelo Tosatti wrote: > On Thu, Sep 05, 2013 at 06:29:08PM +0800, Xiao Guangrong wrote: >> Now we can flush all the TLBs out of the mmu lock without TLB corruption when >> write-protect the sptes, it is because: >> - we have marked large s

Re: [PATCH v2 03/15] KVM: MMU: lazily drop large spte

2013-10-03 Thread Xiao Guangrong
On Oct 1, 2013, at 6:39 AM, Marcelo Tosatti wrote: > On Thu, Sep 05, 2013 at 06:29:06PM +0800, Xiao Guangrong wrote: >> Currently, kvm zaps the large spte if write-protected is needed, the later >> read can fault on that spte. Actually, we can make the large spte readonly >>

Re: [PATCH v2 02/15] KVM: MMU: properly check last spte in fast_page_fault()

2013-10-03 Thread Xiao Guangrong
On Oct 1, 2013, at 5:23 AM, Marcelo Tosatti wrote: >> > > Unrelated to this patch: > > If vcpu->mode = OUTSIDE_GUEST_MODE, no IPI is sent > by kvm_flush_remote_tlbs. Yes. > > So how is this supposed to work again? > >/* > * Wait for all vcpus to exit guest mode and/or

Re: [PATCH v2 02/15] KVM: MMU: properly check last spte in fast_page_fault()

2013-10-03 Thread Xiao Guangrong
On Oct 1, 2013, at 5:23 AM, Marcelo Tosatti mtosa...@redhat.com wrote: Unrelated to this patch: If vcpu->mode = OUTSIDE_GUEST_MODE, no IPI is sent by kvm_flush_remote_tlbs. Yes. So how is this supposed to work again? /* * Wait for all vcpus to exit guest mode

Re: [PATCH v2 03/15] KVM: MMU: lazily drop large spte

2013-10-03 Thread Xiao Guangrong
On Oct 1, 2013, at 6:39 AM, Marcelo Tosatti mtosa...@redhat.com wrote: On Thu, Sep 05, 2013 at 06:29:06PM +0800, Xiao Guangrong wrote: Currently, kvm zaps the large spte if write-protected is needed, the later read can fault on that spte. Actually, we can make the large spte readonly instead

Re: [PATCH v2 05/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-10-03 Thread Xiao Guangrong
On Oct 1, 2013, at 7:05 AM, Marcelo Tosatti mtosa...@redhat.com wrote: On Thu, Sep 05, 2013 at 06:29:08PM +0800, Xiao Guangrong wrote: Now we can flush all the TLBs out of the mmu lock without TLB corruption when write-protect the sptes, it is because: - we have marked large sptes readonly

Re: [PATCH v2 09/15] KVM: MMU: introduce pte-list lockless walker

2013-09-16 Thread Xiao Guangrong
Hi Gleb, On 09/16/2013 08:42 PM, Gleb Natapov wrote: >> static unsigned long *__gfn_to_rmap(gfn_t gfn, int level, >> struct kvm_memory_slot *slot) >> { >> @@ -4651,7 +4700,7 @@ int kvm_mmu_module_init(void) >> { >> pte_list_desc_cache =

Re: [PATCH v2 09/15] KVM: MMU: introduce pte-list lockless walker

2013-09-16 Thread Xiao Guangrong
Hi Gleb, On 09/16/2013 08:42 PM, Gleb Natapov wrote: static unsigned long *__gfn_to_rmap(gfn_t gfn, int level, struct kvm_memory_slot *slot) { @@ -4651,7 +4700,7 @@ int kvm_mmu_module_init(void) { pte_list_desc_cache =

Re: [PATCH v2 01/15] KVM: MMU: fix the count of spte number

2013-09-08 Thread Xiao Guangrong
On Sep 8, 2013, at 10:01 PM, Gleb Natapov wrote: > On Sun, Sep 08, 2013 at 09:55:04PM +0800, Xiao Guangrong wrote: >> >> On Sep 8, 2013, at 8:19 PM, Gleb Natapov wrote: >> >>> On Thu, Sep 05, 2013 at 06:29:04PM +0800, Xiao Guangrong wrote: >>>> If th

Re: [PATCH v2 01/15] KVM: MMU: fix the count of spte number

2013-09-08 Thread Xiao Guangrong
On Sep 8, 2013, at 8:19 PM, Gleb Natapov wrote: > On Thu, Sep 05, 2013 at 06:29:04PM +0800, Xiao Guangrong wrote: >> If the desc is the last one and it is full, its sptes is not counted >> > Hmm, if desc is not full but it is not the last one all sptes after the > des

Re: [PATCH v2 09/15] KVM: MMU: introduce pte-list lockless walker

2013-09-08 Thread Xiao Guangrong
On Sep 5, 2013, at 6:29 PM, Xiao Guangrong wrote: > > A desc only has 3 entries > in the current code so it is not a problem now, but the issue will > be triggered if we expand the size of desc in the further development Sorry, this description is obviously wrong, the bug exis

Re: [PATCH v2 09/15] KVM: MMU: introduce pte-list lockless walker

2013-09-08 Thread Xiao Guangrong
On Sep 5, 2013, at 6:29 PM, Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote: A desc only has 3 entries in the current code so it is not a problem now, but the issue will be triggered if we expand the size of desc in the further development Sorry, this description is obviously wrong

Re: [PATCH v2 01/15] KVM: MMU: fix the count of spte number

2013-09-08 Thread Xiao Guangrong
On Sep 8, 2013, at 8:19 PM, Gleb Natapov g...@redhat.com wrote: On Thu, Sep 05, 2013 at 06:29:04PM +0800, Xiao Guangrong wrote: If the desc is the last one and it is full, its sptes are not counted Hmm, if desc is not full but it is not the last one all sptes after the desc are not counted

Re: [PATCH v2 01/15] KVM: MMU: fix the count of spte number

2013-09-08 Thread Xiao Guangrong
On Sep 8, 2013, at 10:01 PM, Gleb Natapov g...@redhat.com wrote: On Sun, Sep 08, 2013 at 09:55:04PM +0800, Xiao Guangrong wrote: On Sep 8, 2013, at 8:19 PM, Gleb Natapov g...@redhat.com wrote: On Thu, Sep 05, 2013 at 06:29:04PM +0800, Xiao Guangrong wrote: If the desc is the last one

Re: [PATCH v2] KVM: mmu: allow page tables to be in read-only slots

2013-09-05 Thread Xiao Guangrong
ng the accessed and > dirty bits. > > Note that this scenario is not supported by NPT at all, as explained by > comments in the code. Reviewed-by: Xiao Guangrong

[PATCH v2 06/15] KVM: MMU: update spte and add it into rmap before dirty log

2013-09-05 Thread Xiao Guangrong
Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 84 ++ arch/x86/kvm/x86.c | 10 +++ 2 files changed, 76 insertions(+), 18 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index a983570..8ea54d9 100644 --- a/arch/x86

[PATCH v2 05/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-09-05 Thread Xiao Guangrong
, that means it does not depend on PT_WRITABLE_MASK anymore Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 18 ++ arch/x86/kvm/x86.c | 9 +++-- 2 files changed, 21 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 7488229..a983570

[PATCH v2 07/15] KVM: MMU: redesign the algorithm of pte_list

2013-09-05 Thread Xiao Guangrong
che miss Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 180 - 1 file changed, 123 insertions(+), 57 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 8ea54d9..08fb4e2 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86

[PATCH v2 03/15] KVM: MMU: lazily drop large spte

2013-09-05 Thread Xiao Guangrong
large spte to writable but only dirty the first page into the dirty-bitmap that means other pages are missed. Fixed it by only the normal sptes (on the PT_PAGE_TABLE_LEVEL level) can be fast fixed Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 36 arch

[PATCH v2 08/15] KVM: MMU: introduce nulls desc

2013-09-05 Thread Xiao Guangrong
that lock) so that we can not see the same nulls used on different rmaps Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 35 +-- 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 08fb4e2..c5f1b27 100644 -

[PATCH v2 01/15] KVM: MMU: fix the count of spte number

2013-09-05 Thread Xiao Guangrong
If the desc is the last one and it is full, its sptes are not counted Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 6e2d2c8..7714fd8 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c

[PATCH v2 14/15] KVM: MMU: clean up spte_write_protect

2013-09-05 Thread Xiao Guangrong
Now, the only user of spte_write_protect is rmap_write_protect which always calls spte_write_protect with pt_protect = true, so drop it and the unused parameter @kvm Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 19 --- 1 file changed, 8 insertions(+), 11 deletions

[PATCH v2 09/15] KVM: MMU: introduce pte-list lockless walker

2013-09-05 Thread Xiao Guangrong
, but the issue will be triggered if we expand the size of desc in future development Thanks to SLAB_DESTROY_BY_RCU, the desc can be quickly reused Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 57 ++ 1 file changed, 53 insertions

[PATCH v2 04/15] KVM: MMU: flush tlb if the spte can be locklessly modified

2013-09-05 Thread Xiao Guangrong
, see spte.w = 0, then without flush tlb unlock mmu-lock !!! At this point, the shadow page can still be writable due to the corrupt tlb entry Flush all TLB Signed-off-by: Xiao Guangrong

[PATCH v2 10/15] KVM: MMU: initialize the pointers in pte_list_desc properly

2013-09-05 Thread Xiao Guangrong
kmem_cache_zalloc Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 3e1432f..fe80019 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -687,14 +687,15
