On 11/27/2013 03:58 AM, Marcelo Tosatti wrote:
> On Tue, Nov 26, 2013 at 11:10:19AM +0800, Xiao Guangrong wrote:
>> On 11/25/2013 10:23 PM, Marcelo Tosatti wrote:
>>> On Mon, Nov 25, 2013 at 02:48:37PM +0200, Avi Kivity wrote:
>>>> On Mon, Nov 25, 2013 at 8:11
On 11/27/2013 03:31 AM, Marcelo Tosatti wrote:
On Tue, Nov 26, 2013 at 11:21:37AM +0800, Xiao Guangrong wrote:
On 11/26/2013 02:12 AM, Marcelo Tosatti wrote:
On Mon, Nov 25, 2013 at 02:29:03PM +0800, Xiao Guangrong wrote:
Also, there is no guarantee of termination (as long as sptes
On 11/26/2013 02:12 AM, Marcelo Tosatti wrote:
> On Mon, Nov 25, 2013 at 02:29:03PM +0800, Xiao Guangrong wrote:
>>>> Also, there is no guarantee of termination (as long as sptes are
>>>> deleted with the correct timing). BTW, can't see any guarantee of
>>>> termination for rculist nulls either (a writer can
On 11/25/2013 10:23 PM, Marcelo Tosatti wrote:
> On Mon, Nov 25, 2013 at 02:48:37PM +0200, Avi Kivity wrote:
>> On Mon, Nov 25, 2013 at 8:11 AM, Xiao Guangrong
>> wrote:
>>>
>>> On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti wrote:
>> snip complicated stuff
On 11/25/2013 10:08 PM, Marcelo Tosatti wrote:
> On Mon, Nov 25, 2013 at 02:11:31PM +0800, Xiao Guangrong wrote:
>>
>> On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti wrote:
>>
>>> On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote:
>>>> It is like the nulls list and we use the pte-list
Hi Peter,
On 11/25/2013 05:31 PM, Peter Zijlstra wrote:
> On Fri, Nov 22, 2013 at 05:14:29PM -0200, Marcelo Tosatti wrote:
>> Also, there is no guarantee of termination (as long as sptes are
>> deleted with the correct timing). BTW, can't see any guarantee of
>> termination for rculist nulls either (a writer can
On 11/25/2013 06:19 PM, Gleb Natapov wrote:
> On Mon, Nov 25, 2013 at 02:11:31PM +0800, Xiao Guangrong wrote:
>>>
>>> For example, nothing prevents lockless walker to move into some
>>> parent_ptes chain, right?
>>
>> No.
>>
>> The nulls can help us to detect this case, for parent_ptes, the nulls points
>> to shadow
On 11/25/2013 02:11 PM, Xiao Guangrong wrote:
>
> On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti wrote:
>
>> On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote:
>>> It is like the nulls list and we use the pte-list as the nulls which can help us
>>> to detect whether the desc
On Nov 23, 2013, at 3:14 AM, Marcelo Tosatti wrote:
> On Wed, Oct 23, 2013 at 09:29:25PM +0800, Xiao Guangrong wrote:
>> It is like the nulls list and we use the pte-list as the nulls which can help us to
>> detect whether the "desc" is moved to another rmap, then we can re-walk the rmap
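The pte-list-as-nulls trick described above can be sketched in userspace C. Everything here is illustrative and hypothetical (the field names, the 3-entry desc, the low-bit tagging are assumptions, not KVM's actual pte_list_desc layout): the chain of descs ends in a tagged pointer that encodes the rmap head, so a lockless walker that lands on a foreign nulls knows its desc was recycled onto another rmap and must re-walk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified desc: a 3-entry block in a chain that is
 * terminated not by NULL but by a tagged "nulls" value encoding the
 * rmap head the chain belongs to. */
struct desc {
	uint64_t sptes[3];
	struct desc *more;	/* next desc, or a nulls marker */
};

/* Tag the rmap head's address with the low bit so it can never be
 * mistaken for a real (aligned) desc pointer. */
static struct desc *make_nulls(void *rmap_head)
{
	return (struct desc *)((uintptr_t)rmap_head | 1);
}

static int is_nulls(struct desc *p)
{
	return (uintptr_t)p & 1;
}

/* Lockless-style walk: count the sptes, then check that the nulls we
 * terminated on still names the rmap we started from. If a desc was
 * freed and recycled onto another rmap mid-walk, the nulls differs and
 * the caller knows it must re-walk. */
static int walk_rmap(void *rmap_head, struct desc *first, int *moved)
{
	int count = 0;
	struct desc *d;

	for (d = first; !is_nulls(d); d = d->more) {
		for (int i = 0; i < 3; i++)
			if (d->sptes[i])
				count++;
	}
	*moved = (d != make_nulls(rmap_head));
	return count;
}
```

In the real lockless path the walker would loop back to the rmap head when `*moved` is set, which is the "re-walk the rmap" behavior the changelog describes.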
On Nov 21, 2013, at 3:47 AM, Marcelo Tosatti wrote:
> On Wed, Nov 20, 2013 at 10:20:09PM +0800, Xiao Guangrong wrote:
>>> But what guarantee does userspace require, from GET_DIRTY_LOG, while vcpus
>>> are
>>> executing?
>>
>> Aha. Single calling GET_DIRTY_LOG is useless since new dirty
On Nov 20, 2013, at 8:29 AM, Marcelo Tosatti wrote:
> On Wed, Aug 07, 2013 at 12:06:49PM +0800, Xiao Guangrong wrote:
>> On 08/07/2013 09:48 AM, Marcelo Tosatti wrote:
>>> On Tue, Jul 30, 2013 at 09:02:02PM +0800, Xiao Guangrong wrote:
>>>> Make sure we can see the writable spte before
On 11/15/2013 02:39 AM, Marcelo Tosatti wrote:
> On Thu, Nov 14, 2013 at 01:15:24PM +0800, Xiao Guangrong wrote:
>>
>> Hi Marcelo,
>>
>> On 11/14/2013 08:36 AM, Marcelo Tosatti wrote:
>>
>>>
>>> Any code location which reads the writable bit in the spte and assumes, if it's not
>>> set, that the translation which
Hi Marcelo,
On 11/14/2013 08:36 AM, Marcelo Tosatti wrote:
>
> Any code location which reads the writable bit in the spte and assumes, if it's not
> set, that the translation which the spte refers to is not cached in a
> remote CPU's TLB can become buggy. (*)
>
> It might be the case that
On 11/03/2013 08:29 PM, Gleb Natapov wrote:
> Marcelo can you review it please?
>
Ping..
> On Wed, Oct 23, 2013 at 09:29:18PM +0800, Xiao Guangrong wrote:
>> Changelog v3:
>> - the changes from Gleb's review:
>> 1) drop the patch which fixed the count of spte number in rmap since it can
On 10/24/2013 08:32 PM, Gleb Natapov wrote:
> On Thu, Oct 24, 2013 at 07:01:49PM +0800, Xiao Guangrong wrote:
>> On 10/24/2013 06:39 PM, Gleb Natapov wrote:
>>> On Thu, Oct 24, 2013 at 06:10:46PM +0800, Xiao Guangrong wrote:
>>>> On 10/24/2013 05:52 PM, Gleb Natapov wrote:
On 10/24/2013 06:39 PM, Gleb Natapov wrote:
> On Thu, Oct 24, 2013 at 06:10:46PM +0800, Xiao Guangrong wrote:
>> On 10/24/2013 05:52 PM, Gleb Natapov wrote:
>>> On Thu, Oct 24, 2013 at 05:29:44PM +0800, Xiao Guangrong wrote:
>>>> On 10/24/2013 05:19 PM, Gleb Natapov wrote:
On 10/24/2013 05:52 PM, Gleb Natapov wrote:
> On Thu, Oct 24, 2013 at 05:29:44PM +0800, Xiao Guangrong wrote:
>> On 10/24/2013 05:19 PM, Gleb Natapov wrote:
>>
>>>> @@ -946,7 +947,7 @@ static inline struct kvm_mmu_page *page_header(hpa_t
>>>> shadow_page)
On 10/24/2013 05:17 PM, Gleb Natapov wrote:
>>
>> -/**
>> - * kvm_mmu_write_protect_pt_masked - write protect selected PT level pages
>> +static void __rmap_write_protect_lockless(u64 *sptep)
>> +{
>> +u64 spte;
>> +
>> +retry:
>> +/*
>> + * Note we may partly read the sptep on 32bit host, however, we
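The retry pattern in the quoted __rmap_write_protect_lockless() can be modeled with a compare-and-swap loop. This is a hedged userspace sketch, not the kernel code: SPTE_WRITABLE is a made-up bit for illustration, not KVM's real spte layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative writable bit, not KVM's actual spte encoding. */
#define SPTE_WRITABLE (1ull << 1)

/* Sketch of the retry pattern: keep re-reading the spte and retrying
 * the compare-and-swap until the writable bit is cleared, or until we
 * observe that it is already clear. */
static void write_protect_lockless(_Atomic uint64_t *sptep)
{
	uint64_t old = atomic_load(sptep);

	for (;;) {
		if (!(old & SPTE_WRITABLE))
			return;	/* already write-protected */
		/* On failure, 'old' is reloaded with the current value,
		 * which plays the role of the "retry:" in the original. */
		if (atomic_compare_exchange_strong(sptep, &old,
						   old & ~SPTE_WRITABLE))
			return;
	}
}
```

The CAS makes the clear atomic against a concurrent fault path that may set other spte bits between our read and our write.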
On 10/24/2013 05:19 PM, Gleb Natapov wrote:
>> @@ -946,7 +947,7 @@ static inline struct kvm_mmu_page *page_header(hpa_t
>> shadow_page)
>> {
>> struct page *page = pfn_to_page(shadow_page >> PAGE_SHIFT);
>>
>> -return (struct kvm_mmu_page *)page_private(page);
>> +return (struct kvm_mmu_page
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 84 ++
arch/x86/kvm/x86.c | 10 +++
2 files changed, 76 insertions(+), 18 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 337d173..e85eed6 100644
--- a/arch/x86
large spte to
writable but only dirties the first page in the dirty-bitmap, which means
other pages are missed. Fix it by allowing only normal sptes (on the
PT_PAGE_TABLE_LEVEL level) to be fast-fixed
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 36
arch
ble and avoids potential issue in the
further development
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 40772ef..d2aacc2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/
,
see spte.w = 0, then without flush tlb
unlock mmu-lock
!!! At this point, the shadow page can still be
writable due to the corrupt tlb entry
Flush all TLB
Signed-off-by: Xiao Guangrong
te based on the dirty bitmap,
we should ensure the writable spte can be found in the rmap before the dirty bitmap
is visible. Otherwise, we clear the dirty bitmap but fail to write-protect
the page.
Performance result
The performance result and the benchmark can be found at:
http://permalink.gmane.org/gmane.linux.kernel/1534876
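The ordering requirement above (the writable spte must be findable in the rmap before the dirty bitmap is visible) is a classic publish/observe pairing. Here is a minimal userspace model using C11 release/acquire in place of the kernel's barriers; all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Publish the writable spte into the rmap *before* the dirty bit
 * becomes visible, so a GET_DIRTY_LOG reader that sees the dirty bit
 * is guaranteed to find the spte and can write-protect it. The kernel
 * would use smp_wmb()/smp_rmb()-style barriers instead. */
static _Atomic uint64_t rmap_slot;	/* 0 = no spte published */
static _Atomic int dirty_bit;

static void writer_mark_dirty(uint64_t writable_spte)
{
	atomic_store_explicit(&rmap_slot, writable_spte,
			      memory_order_relaxed);
	/* release: the rmap update above is visible before the bit */
	atomic_store_explicit(&dirty_bit, 1, memory_order_release);
}

static uint64_t reader_get_dirty(void)
{
	if (!atomic_load_explicit(&dirty_bit, memory_order_acquire))
		return 0;
	/* acquire pairs with the release: the spte must be visible */
	return atomic_load_explicit(&rmap_slot, memory_order_relaxed);
}
```

If the two stores were reordered, the reader could clear the dirty bit while the spte is still invisible in the rmap, which is exactly the lost write-protect the changelog warns about.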
, that
means it does not depend on PT_WRITABLE_MASK anymore
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 18 ++
arch/x86/kvm/x86.c | 9 +++--
2 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 62f18ec..337d173
that lock) so that we can not see the same
nulls used on different rmaps
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 35 +--
1 file changed, 29 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 5cce039..4687329 100644
-
-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 57 ++
1 file changed, 53 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4687329..a864140 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -975,6
Use SLAB_DESTROY_BY_RCU to prevent the shadow page to be freed from the
slab, so that it can be locklessly accessed by holding rcu lock
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm
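A rough model of the contract SLAB_DESTROY_BY_RCU gives, as stated in the patch above: freed objects may be reused as the same type at any moment (memory only returns to the page allocator after a grace period), so a lockless reader must re-validate identity after dereferencing. The gfn/gen fields below are assumptions for illustration, not KVM's actual scheme.

```c
#include <assert.h>
#include <stdint.h>

/* A freed shadow page's memory may be immediately recycled into
 * another shadow page, so dereferencing a stale pointer is safe but
 * may land on a different object of the same type. */
struct shadow_page {
	uint64_t gfn;
	unsigned int gen;	/* bumped whenever the slab reuses it */
};

/* After a lockless dereference, confirm the object is still the one
 * that was looked up; if not, the caller restarts its walk. */
static int deref_valid(struct shadow_page *sp, uint64_t gfn,
		       unsigned int gen)
{
	return sp->gfn == gfn && sp->gen == gen;
}
```

This is why the flag alone is not enough: every lockless user has to pair the dereference with such a validity check.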
is_last_spte() do not depend on
the mapping level anymore
This is important to implement lockless write-protection since only spte is
available at that time
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 25 -
arch/x86/kvm/mmu_audit.c | 6 +++---
arch
Allocate shadow pages from slab instead of page-allocator, frequent
shadow page allocation and free can be hit in the slab cache, it is
very useful for shadow mmu
Signed-off-by: Xiao Guangrong
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/mmu.c | 46
Now, the only user of spte_write_protect is rmap_write_protect which
always calls spte_write_protect with pt_protect = true, so drop
it and the unused parameter @kvm
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 19 ---
1 file changed, 8 insertions(+), 11 deletions
kmem_cache_zalloc
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 27 +--
1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a864140..f3ae74e6 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -687,14 +687,15
-off-by: Xiao Guangrong
---
arch/x86/include/asm/kvm_host.h | 4 ---
arch/x86/kvm/mmu.c | 59 ++---
arch/x86/kvm/mmu.h | 6 +
arch/x86/kvm/x86.c | 11
4 files changed, 55 insertions(+), 25 deletions(-)
diff
Use rcu_assign_pointer() to update all the pointer in desc
and use rcu_dereference() to lockless read the pointer
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 46 --
1 file changed, 28 insertions(+), 18 deletions(-)
diff --git a/arch/x86
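The rcu_assign_pointer()/rcu_dereference() discipline described above, modeled in userspace with C11 atomics standing in for the kernel primitives (the real macros add checks and use dependency ordering; the desc layout here is simplified and hypothetical):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel primitives. */
#define rcu_assign_pointer(p, v) \
	atomic_store_explicit(&(p), (v), memory_order_release)
#define rcu_dereference(p) \
	atomic_load_explicit(&(p), memory_order_acquire)

struct desc {
	int spte_count;
	struct desc *_Atomic more;
};

static struct desc *_Atomic head;

/* Writer: fully initialize the new desc, then publish it with a
 * release store so a lockless reader never sees a half-built desc. */
static void publish(struct desc *d, int count, struct desc *next)
{
	d->spte_count = count;
	atomic_store_explicit(&d->more, next, memory_order_relaxed);
	rcu_assign_pointer(head, d);
}

/* Lockless reader: every pointer in the chain is fetched through
 * rcu_dereference(). */
static int count_all(void)
{
	int n = 0;
	struct desc *d;

	for (d = rcu_dereference(head); d; d = rcu_dereference(d->more))
		n += d->spte_count;
	return n;
}
```

The point of updating *all* the pointers in desc this way is that a reader may enter the chain at any link, so every link must be published with the same ordering guarantee.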
che miss
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 179 -
1 file changed, 123 insertions(+), 56 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index e85eed6..5cce039 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86
On Oct 16, 2013, at 6:21 AM, Marcelo Tosatti wrote:
> On Tue, Oct 15, 2013 at 06:57:05AM +0300, Gleb Natapov wrote:
>>>
>>> Why is it safe to allow access, by the lockless page write protect
>>> side, to spt pointer for shadow page A that can change to a shadow page
>>> pointer of shadow page
On 10/09/2013 09:56 AM, Marcelo Tosatti wrote:
> On Tue, Oct 08, 2013 at 12:02:32PM +0800, Xiao Guangrong wrote:
>>
>> Hi Marcelo,
>>
>> On Oct 8, 2013, at 9:23 AM, Marcelo Tosatti wrote:
>>
>>>>
>>>> + if (kvm->arch.rcu_free_shadow_page) {
Hi Marcelo,
On Oct 8, 2013, at 9:23 AM, Marcelo Tosatti wrote:
>>
>> +if (kvm->arch.rcu_free_shadow_page) {
>> +kvm_mmu_isolate_pages(invalid_list);
>> +sp = list_first_entry(invalid_list, struct kvm_mmu_page, link);
>> +list_del_init(invalid_list);
>>
On Oct 1, 2013, at 7:05 AM, Marcelo Tosatti wrote:
> On Thu, Sep 05, 2013 at 06:29:08PM +0800, Xiao Guangrong wrote:
>> Now we can flush all the TLBs out of the mmu lock without TLB corruption when
>> write-protecting the sptes, it is because:
>> - we have marked large sptes readonly
On Oct 1, 2013, at 6:39 AM, Marcelo Tosatti wrote:
> On Thu, Sep 05, 2013 at 06:29:06PM +0800, Xiao Guangrong wrote:
>> Currently, kvm zaps the large spte if write-protected is needed, the later
>> read can fault on that spte. Actually, we can make the large spte readonly
>> instead
On Oct 1, 2013, at 5:23 AM, Marcelo Tosatti wrote:
>>
>
> Unrelated to this patch:
>
> If vcpu->mode = OUTSIDE_GUEST_MODE, no IPI is sent
> by kvm_flush_remote_tlbs.
Yes.
>
> So how is this supposed to work again?
>
>/*
> * Wait for all vcpus to exit guest mode and/or
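The answer to the quoted question is the request/entry handshake: a vcpu with mode == OUTSIDE_GUEST_MODE cannot re-enter the guest without first seeing the pending request, so the IPI is only needed for vcpus currently in guest mode. A toy model follows (names and logic simplified; this is not the actual kvm_make_request machinery):

```c
#include <assert.h>
#include <stdatomic.h>

enum { OUTSIDE_GUEST_MODE, IN_GUEST_MODE };

/* Requests are consumed on every guest entry, so a vcpu that is
 * currently outside guest mode needs no IPI - it cannot re-enter the
 * guest without seeing the pending request first. */
struct vcpu {
	_Atomic int mode;
	_Atomic int tlb_flush_requested;
	int ipis_received;
};

static void flush_remote_tlbs(struct vcpu *v)
{
	atomic_store(&v->tlb_flush_requested, 1);
	if (atomic_load(&v->mode) == IN_GUEST_MODE)
		v->ipis_received++;	/* kick it out of guest mode */
}

/* Entry path: handle pending requests before running guest code. */
static int vcpu_enter_guest(struct vcpu *v)
{
	int flushed = atomic_exchange(&v->tlb_flush_requested, 0);

	atomic_store(&v->mode, IN_GUEST_MODE);
	return flushed;		/* 1 if a TLB flush was performed */
}
```

Either way, by the time the flusher is told all vcpus have left guest mode or acknowledged the IPI, no stale translation can be in use.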
Hi Gleb,
On 09/16/2013 08:42 PM, Gleb Natapov wrote:
>> static unsigned long *__gfn_to_rmap(gfn_t gfn, int level,
>> struct kvm_memory_slot *slot)
>> {
>> @@ -4651,7 +4700,7 @@ int kvm_mmu_module_init(void)
>> {
>> pte_list_desc_cache =
On Sep 8, 2013, at 10:01 PM, Gleb Natapov wrote:
> On Sun, Sep 08, 2013 at 09:55:04PM +0800, Xiao Guangrong wrote:
>>
>> On Sep 8, 2013, at 8:19 PM, Gleb Natapov wrote:
>>
>>> On Thu, Sep 05, 2013 at 06:29:04PM +0800, Xiao Guangrong wrote:
>>>> If the desc is the last one
On Sep 8, 2013, at 8:19 PM, Gleb Natapov wrote:
> On Thu, Sep 05, 2013 at 06:29:04PM +0800, Xiao Guangrong wrote:
>> If the desc is the last one and it is full, its sptes is not counted
>>
> Hmm, if desc is not full but it is not the last one all sptes after the
> desc are not counted
On Sep 5, 2013, at 6:29 PM, Xiao Guangrong
wrote:
>
> A desc only has 3 entries
> in the current code so it is not a problem now, but the issue will
> be triggered if we expand the size of desc in further development
Sorry, this description is obviously wrong; the bug exists
ng the accessed and
> dirty bits.
>
> Note that this scenario is not supported by NPT at all, as explained by
> comments in the code.
Reviewed-by: Xiao Guangrong
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 84 ++
arch/x86/kvm/x86.c | 10 +++
2 files changed, 76 insertions(+), 18 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a983570..8ea54d9 100644
--- a/arch/x86
, that
means it does not depend on PT_WRITABLE_MASK anymore
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 18 ++
arch/x86/kvm/x86.c | 9 +++--
2 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7488229..a983570
che miss
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 180 -
1 file changed, 123 insertions(+), 57 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 8ea54d9..08fb4e2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86
large spte to
writable but only dirty the first page into the dirty-bitmap that means
other pages are missed. Fixed it by only the normal sptes (on the
PT_PAGE_TABLE_LEVEL level) can be fast fixed
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 36
arch
that lock) so that we can not see the same
nulls used on different rmaps
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 35 +--
1 file changed, 29 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 08fb4e2..c5f1b27 100644
-
If the desc is the last one and it is full, its sptes are not counted
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6e2d2c8..7714fd8 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
Now, the only user of spte_write_protect is rmap_write_protect which
always calls spte_write_protect with pt_protect = true, so drop
it and the unused parameter @kvm
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 19 ---
1 file changed, 8 insertions(+), 11 deletions
, but the issue will
be triggered if we expand the size of desc in further development
Thanks to SLAB_DESTROY_BY_RCU, the desc can be quickly reused
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 57 ++
1 file changed, 53 insertions
,
see spte.w = 0, then without flush tlb
unlock mmu-lock
!!! At this point, the shadow page can still be
writable due to the corrupt tlb entry
Flush all TLB
Signed-off-by: Xiao Guangrong
kmem_cache_zalloc
Signed-off-by: Xiao Guangrong
---
arch/x86/kvm/mmu.c | 27 +--
1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 3e1432f..fe80019 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -687,14 +687,15
801 - 900 of 2152 matches