Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-17 Thread Xiao Guangrong
On 04/17/2012 03:41 PM, Avi Kivity wrote:

> On 04/17/2012 06:55 AM, Xiao Guangrong wrote:
>> On 04/16/2012 07:47 PM, Avi Kivity wrote:
>>
>>> On 04/16/2012 01:20 PM, Xiao Guangrong wrote:
>>
>> It is used to avoid the unnecessary overload 
>
> It's overloading me :(
>


 Sorry.

>>>
>>> The trick is to send those in separate patchset so the maintainer
>>> doesn't notice.
>>>
>>
>>
>> Thanks for your suggestion, i will pay more attention on it in the
>> further.
>>
>> For this patch, what did you mean of "those"? You mean the whole
>> rmap.PTE_LIST_WP_BIT (fast check for shadow page table write protection
>> and host write protection) or just about host_page_write_protect
>> (for KSM only)?
> 
> All of it.  Let's start with just modifying sptes concurrently and only
> later add reading bits from rmap concurrently, if it proves necessary.
> 


Okay, I agree.

>>
>> If we do not have rmap.PTE_LIST_WP_BIT, there may have regression on
>> shadow mmu.
>>
>> Hmm, do i need implement rmap.PTE_LIST_WP_BIT, then fast page fault?
> 
> Let's try to measure the effect without rmap.PTE_LIST_WP_BIT.  Usually
> PTE chains for page tables are short so the effect would be small.  Of
> course we can't tell about all guest.
> 


It is not about the rmap's sptes; it is about sp.sync write protection. If a
shadow page with sp.sync set is written, the fast page fault path will be
triggered even when there is no migration and no framebuffer.

I have run kernbench 10 times (without X) and taken the average:

keep rmap.PTE_LIST_WP_BIT: 53.494
comment out the rmap.PTE_LIST_WP_BIT check in page_fault_can_be_fast: 53.948

Anyway, to keep the review simple, let's merge fast page fault first and
discuss this in a separate patchset later.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-17 Thread Avi Kivity
On 04/17/2012 06:55 AM, Xiao Guangrong wrote:
> On 04/16/2012 07:47 PM, Avi Kivity wrote:
>
> > On 04/16/2012 01:20 PM, Xiao Guangrong wrote:
> 
>  It is used to avoid the unnecessary overload 
> >>>
> >>> It's overloading me :(
> >>>
> >>
> >>
> >> Sorry.
> >>
> > 
> > The trick is to send those in separate patchset so the maintainer
> > doesn't notice.
> > 
>
>
> Thanks for your suggestion, i will pay more attention on it in the
> further.
>
> For this patch, what did you mean of "those"? You mean the whole
> rmap.PTE_LIST_WP_BIT (fast check for shadow page table write protection
> and host write protection) or just about host_page_write_protect
> (for KSM only)?

All of it.  Let's start with just modifying sptes concurrently and only
later add reading bits from rmap concurrently, if it proves necessary.

>
> If we do not have rmap.PTE_LIST_WP_BIT, there may have regression on
> shadow mmu.
>
> Hmm, do i need implement rmap.PTE_LIST_WP_BIT, then fast page fault?

Let's try to measure the effect without rmap.PTE_LIST_WP_BIT.  Usually
PTE chains for page tables are short, so the effect should be small.  Of
course, we can't tell for all guests.

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-16 Thread Xiao Guangrong
On 04/16/2012 07:47 PM, Avi Kivity wrote:

> On 04/16/2012 01:20 PM, Xiao Guangrong wrote:

 It is used to avoid the unnecessary overload 
>>>
>>> It's overloading me :(
>>>
>>
>>
>> Sorry.
>>
> 
> The trick is to send those in separate patchset so the maintainer
> doesn't notice.
> 


Thanks for your suggestion, I will pay more attention to it in the
future.

For this patch, what did you mean by "those"? Did you mean the whole of
rmap.PTE_LIST_WP_BIT (the fast check for shadow page table write protection
and host write protection), or just host_page_write_protect
(for KSM only)?

If we do not have rmap.PTE_LIST_WP_BIT, there may be a regression on the
shadow MMU.

Hmm, should I implement rmap.PTE_LIST_WP_BIT first, and then fast page fault?



Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-16 Thread Avi Kivity
On 04/16/2012 01:20 PM, Xiao Guangrong wrote:
> >>
> >> It is used to avoid the unnecessary overload 
> > 
> > It's overloading me :(
> > 
>
>
> Sorry.
>

The trick is to send those in separate patchset so the maintainer
doesn't notice.

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-16 Thread Xiao Guangrong
On 04/16/2012 06:02 PM, Avi Kivity wrote:

> On 04/16/2012 06:25 AM, Xiao Guangrong wrote:
>> On 04/15/2012 11:16 PM, Avi Kivity wrote:
>>
>>> On 04/13/2012 01:14 PM, Xiao Guangrong wrote:
 Using bit 1 (PTE_LIST_WP_BIT) in rmap store the write-protect status
 to avoid unnecessary shadow page walking

 Signed-off-by: Xiao Guangrong 
 ---
  arch/x86/kvm/mmu.c |   40 ++--
  1 files changed, 34 insertions(+), 6 deletions(-)

 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index 0c6e92d..8b71908 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t 
 large_gfn)
return level - 1;
  }

 -#define PTE_LIST_DESC (0x1ull)
 +#define PTE_LIST_DESC_BIT 0
 +#define PTE_LIST_WP_BIT   1
 +#define PTE_LIST_DESC (1 << PTE_LIST_DESC_BIT)
  #define PTE_LIST_FLAG_MASK(0x3ull)

  static void
 @@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
return mmu_memory_cache_free_objects(cache);
  }

 +static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
 +{
 +  if (!(*spte & SPTE_HOST_WRITEABLE))
 +  __test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
 +}

>>>
>>> Why is this needed, in addition to spte.SPTE_WRITE_PROTECT?
>>>
>>
>>
>> It is used to avoid the unnecessary overload 
> 
> It's overloading me :(
> 


Sorry.

>> for fast page fault if
>> KSM is enabled. On the fast check path, it can see the gfn is write-protected
>> by host, then the fast page fault path is not called.
> 
> The fast page fault path is supposed to be fast, so it's okay if we take
> a bit of extra overhead before a COW (which is going to be slow anyway).
> 
> Let's get the simplest possible version in, and then discuss if/how we
> need to optimize it further.
> 


Okay, I will drop setting PTE_LIST_WP_BIT for this case in the next version. :)



Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-16 Thread Avi Kivity
On 04/16/2012 06:25 AM, Xiao Guangrong wrote:
> On 04/15/2012 11:16 PM, Avi Kivity wrote:
>
> > On 04/13/2012 01:14 PM, Xiao Guangrong wrote:
> >> Using bit 1 (PTE_LIST_WP_BIT) in rmap store the write-protect status
> >> to avoid unnecessary shadow page walking
> >>
> >> Signed-off-by: Xiao Guangrong 
> >> ---
> >>  arch/x86/kvm/mmu.c |   40 ++--
> >>  1 files changed, 34 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >> index 0c6e92d..8b71908 100644
> >> --- a/arch/x86/kvm/mmu.c
> >> +++ b/arch/x86/kvm/mmu.c
> >> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t 
> >> large_gfn)
> >>return level - 1;
> >>  }
> >>
> >> -#define PTE_LIST_DESC (0x1ull)
> >> +#define PTE_LIST_DESC_BIT 0
> >> +#define PTE_LIST_WP_BIT   1
> >> +#define PTE_LIST_DESC (1 << PTE_LIST_DESC_BIT)
> >>  #define PTE_LIST_FLAG_MASK(0x3ull)
> >>
> >>  static void
> >> @@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
> >>return mmu_memory_cache_free_objects(cache);
> >>  }
> >>
> >> +static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
> >> +{
> >> +  if (!(*spte & SPTE_HOST_WRITEABLE))
> >> +  __test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
> >> +}
> >>
> > 
> > Why is this needed, in addition to spte.SPTE_WRITE_PROTECT?
> > 
>
>
> It is used to avoid the unnecessary overload 

It's overloading me :(

> for fast page fault if
> KSM is enabled. On the fast check path, it can see the gfn is write-protected
> by host, then the fast page fault path is not called.

The fast page fault path is supposed to be fast, so it's okay if we take
a bit of extra overhead before a COW (which is going to be slow anyway).

Let's get the simplest possible version in, and then discuss if/how we
need to optimize it further.

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-15 Thread Xiao Guangrong
On 04/14/2012 11:01 AM, Takuya Yoshikawa wrote:

> On Fri, 13 Apr 2012 18:14:26 +0800
> Xiao Guangrong  wrote:
> 
>> Using bit 1 (PTE_LIST_WP_BIT) in rmap store the write-protect status
>> to avoid unnecessary shadow page walking
>>
>> Signed-off-by: Xiao Guangrong 
>> ---
>>  arch/x86/kvm/mmu.c |   40 ++--
>>  1 files changed, 34 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 0c6e92d..8b71908 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t 
>> large_gfn)
>>  return level - 1;
>>  }
>>
>> -#define PTE_LIST_DESC   (0x1ull)
>> +#define PTE_LIST_DESC_BIT   0
>> +#define PTE_LIST_WP_BIT 1
> 
> _BIT ?


What is the problem?

> 
>> @@ -2291,9 +2310,15 @@ static int mmu_need_write_protect(struct kvm_vcpu 
>> *vcpu, gfn_t gfn,
>>  {
>>  struct kvm_mmu_page *s;
>>  struct hlist_node *node;
>> +unsigned long *rmap;
>>  bool need_unsync = false;
>>
>> +rmap = gfn_to_rmap(vcpu->kvm, gfn, PT_PAGE_TABLE_LEVEL);
> 
> Please use consistent variable names.
> 
> In other parts of this patch, you are using rmapp for this.


Okay.



Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-15 Thread Xiao Guangrong
On 04/15/2012 11:16 PM, Avi Kivity wrote:

> On 04/13/2012 01:14 PM, Xiao Guangrong wrote:
>> Using bit 1 (PTE_LIST_WP_BIT) in rmap store the write-protect status
>> to avoid unnecessary shadow page walking
>>
>> Signed-off-by: Xiao Guangrong 
>> ---
>>  arch/x86/kvm/mmu.c |   40 ++--
>>  1 files changed, 34 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 0c6e92d..8b71908 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t 
>> large_gfn)
>>  return level - 1;
>>  }
>>
>> -#define PTE_LIST_DESC   (0x1ull)
>> +#define PTE_LIST_DESC_BIT   0
>> +#define PTE_LIST_WP_BIT 1
>> +#define PTE_LIST_DESC   (1 << PTE_LIST_DESC_BIT)
>>  #define PTE_LIST_FLAG_MASK  (0x3ull)
>>
>>  static void
>> @@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
>>  return mmu_memory_cache_free_objects(cache);
>>  }
>>
>> +static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
>> +{
>> +if (!(*spte & SPTE_HOST_WRITEABLE))
>> +__test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
>> +}
>>
> 
> Why is this needed, in addition to spte.SPTE_WRITE_PROTECT?
> 


It is used to avoid the unnecessary overload of the fast page fault path when
KSM is enabled. On the fast check path, it can see that the gfn is
write-protected by the host, so the fast page fault path is not called.



Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-15 Thread Avi Kivity
On 04/13/2012 01:14 PM, Xiao Guangrong wrote:
> Using bit 1 (PTE_LIST_WP_BIT) in rmap store the write-protect status
> to avoid unnecessary shadow page walking
>
> Signed-off-by: Xiao Guangrong 
> ---
>  arch/x86/kvm/mmu.c |   40 ++--
>  1 files changed, 34 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 0c6e92d..8b71908 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t 
> large_gfn)
>   return level - 1;
>  }
>
> -#define PTE_LIST_DESC(0x1ull)
> +#define PTE_LIST_DESC_BIT0
> +#define PTE_LIST_WP_BIT  1
> +#define PTE_LIST_DESC(1 << PTE_LIST_DESC_BIT)
>  #define PTE_LIST_FLAG_MASK   (0x3ull)
>
>  static void
> @@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
>   return mmu_memory_cache_free_objects(cache);
>  }
>
> +static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
> +{
> + if (!(*spte & SPTE_HOST_WRITEABLE))
> + __test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
> +}
>

Why is this needed, in addition to spte.SPTE_WRITE_PROTECT?

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-13 Thread Takuya Yoshikawa
On Fri, 13 Apr 2012 18:14:26 +0800
Xiao Guangrong  wrote:

> Using bit 1 (PTE_LIST_WP_BIT) in rmap store the write-protect status
> to avoid unnecessary shadow page walking
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  arch/x86/kvm/mmu.c |   40 ++--
>  1 files changed, 34 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 0c6e92d..8b71908 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t 
> large_gfn)
>   return level - 1;
>  }
> 
> -#define PTE_LIST_DESC(0x1ull)
> +#define PTE_LIST_DESC_BIT0
> +#define PTE_LIST_WP_BIT  1

_BIT ?

> +#define PTE_LIST_DESC(1 << PTE_LIST_DESC_BIT)
>  #define PTE_LIST_FLAG_MASK   (0x3ull)
> 

Well, I think that rmap is losing its original meaning:
before, it held just the information needed to link sptes.

Now it has more roles, and is becoming a kind of descriptor?

...

> @@ -2291,9 +2310,15 @@ static int mmu_need_write_protect(struct kvm_vcpu 
> *vcpu, gfn_t gfn,
>  {
>   struct kvm_mmu_page *s;
>   struct hlist_node *node;
> + unsigned long *rmap;
>   bool need_unsync = false;
> 
> + rmap = gfn_to_rmap(vcpu->kvm, gfn, PT_PAGE_TABLE_LEVEL);

Please use consistent variable names.

In other parts of this patch, you are using rmapp for this.

Takuya


[PATCH v2 10/16] KVM: MMU: fast check whether page is writable

2012-04-13 Thread Xiao Guangrong
Use bit 1 (PTE_LIST_WP_BIT) in the rmap head to store the write-protect
status, to avoid unnecessary shadow page walking.

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/mmu.c |   40 ++--
 1 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 0c6e92d..8b71908 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t 
large_gfn)
return level - 1;
 }

-#define PTE_LIST_DESC  (0x1ull)
+#define PTE_LIST_DESC_BIT  0
+#define PTE_LIST_WP_BIT1
+#define PTE_LIST_DESC  (1 << PTE_LIST_DESC_BIT)
 #define PTE_LIST_FLAG_MASK (0x3ull)

 static void
@@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
return mmu_memory_cache_free_objects(cache);
 }

+static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
+{
+   if (!(*spte & SPTE_HOST_WRITEABLE))
+   __test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
+}
+
 static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
 {
struct kvm_mmu_page *sp;
@@ -1074,7 +1082,10 @@ static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, 
gfn_t gfn)

sp = page_header(__pa(spte));
kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn);
+
rmapp = gfn_to_rmap(vcpu->kvm, gfn, sp->role.level);
+   host_page_write_protect(spte, rmapp);
+
return pte_list_add(vcpu, spte, rmapp);
 }

@@ -1164,16 +1175,23 @@ static bool rmap_write_protect(struct kvm *kvm, u64 gfn)
 {
struct kvm_memory_slot *slot;
unsigned long *rmapp;
-   int i;
+   int i = PT_PAGE_TABLE_LEVEL;
bool write_protected = false;

slot = gfn_to_memslot(kvm, gfn);
+   rmapp = __gfn_to_rmap(gfn, i, slot);
+
+   if (__test_and_set_bit(PTE_LIST_WP_BIT, rmapp))
+   return false;
+
+   do {
+   write_protected |= __rmap_write_protect(kvm, rmapp, i++);
+
+   if (i >= PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES)
+   break;

-   for (i = PT_PAGE_TABLE_LEVEL;
-i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
rmapp = __gfn_to_rmap(gfn, i, slot);
-   write_protected |= __rmap_write_protect(kvm, rmapp, i);
-   }
+   } while (true);

return write_protected;
 }
@@ -1225,6 +1243,7 @@ static int kvm_set_pte_rmapp(struct kvm *kvm, unsigned 
long *rmapp,

mmu_spte_clear_track_bits(sptep);
mmu_spte_set(sptep, new_spte);
+   host_page_write_protect(sptep, rmapp);
}
}

@@ -2291,9 +2310,15 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, 
gfn_t gfn,
 {
struct kvm_mmu_page *s;
struct hlist_node *node;
+   unsigned long *rmap;
bool need_unsync = false;

+   rmap = gfn_to_rmap(vcpu->kvm, gfn, PT_PAGE_TABLE_LEVEL);
+
if (!vcpu->kvm->arch.indirect_shadow_pages)
+   goto write_free;
+
+   if (!test_bit(PTE_LIST_WP_BIT, rmap))
return 0;

for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn, node) {
@@ -2309,6 +2334,9 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, 
gfn_t gfn,
}
if (need_unsync)
kvm_unsync_pages(vcpu, gfn);
+
+write_free:
+   __clear_bit(PTE_LIST_WP_BIT, rmap);
return 0;
 }

-- 
1.7.7.6
