Re: KVM: MMU: optimize set_spte for page sync

2008-12-10 Thread Avi Kivity

Marcelo Tosatti wrote:

Do you have objections to submitting this patch for 2.6.28? The hash
lookup kills performance for pagetable-write + context-switch intensive
workloads.
  


Can you quantify?

2.6.28 is out of the question; 2.6.28.stable is possible, depending on
the actual performance difference. I'd hate to introduce a correctness
problem just to improve performance.


--
error compiling committee.c: too many arguments to function



Re: KVM: MMU: optimize set_spte for page sync

2008-12-09 Thread Marcelo Tosatti
On Wed, Nov 26, 2008 at 01:12:20PM +0200, Avi Kivity wrote:
 Marcelo Tosatti wrote:

 Here it goes.

 KVM: MMU: optimize set_spte for page sync

 The write-protect verification in set_spte is unnecessary for page sync.

 It's guaranteed that, if the unsync spte was writable, the target page
 does not have a write-protected shadow (if it had, the spte would have
 been write-protected under mmu_lock by rmap_write_protect beforehand).

 The same reasoning applies to mark_page_dirty: the gfn has already been
 marked dirty via the pagefault path.

 The cost of the hash table and memslot lookups is quite significant if
 the workload is pagetable-write intensive, resulting in increased
 mmu_lock contention.

   

 Applied, thanks.

Avi,

Do you have objections to submitting this patch for 2.6.28? The hash
lookup kills performance for pagetable-write + context-switch intensive
workloads.



[PATCH 44/44] KVM: MMU: optimize set_spte for page sync

2008-12-09 Thread Avi Kivity
From: Marcelo Tosatti [EMAIL PROTECTED]

The write-protect verification in set_spte is unnecessary for page sync.

It's guaranteed that, if the unsync spte was writable, the target page
does not have a write-protected shadow (if it had, the spte would have
been write-protected under mmu_lock by rmap_write_protect beforehand).

The same reasoning applies to mark_page_dirty: the gfn has already been
marked dirty via the pagefault path.

The cost of the hash table and memslot lookups is quite significant if
the workload is pagetable-write intensive, resulting in increased
mmu_lock contention.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index fa3486d..dd20b19 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1593,6 +1593,15 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 
spte |= PT_WRITABLE_MASK;
 
+   /*
+* Optimization: for pte sync, if spte was writable the hash
+* lookup is unnecessary (and expensive). Write protection
+* is the responsibility of mmu_get_page / kvm_sync_page.
+* Same reasoning can be applied to dirty page accounting.
+*/
+   if (!can_unsync && is_writeble_pte(*shadow_pte))
+   goto set_pte;
+
if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
pgprintk("%s: found shadow page for %lx, marking ro\n",
 __func__, gfn);
-- 
1.6.0.3
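
For readers skimming the archive, the following is a small standalone toy
model of why the added check is cheap. It is not kernel code: the struct,
the hash size and the lookup below are hypothetical stand-ins for
kvm_mmu_page, kvm->arch.mmu_page_hash and kvm_mmu_lookup_page. The point
is that the fast path is a single bit test on the spte already in hand,
while the path it bypasses walks a per-gfn hash bucket (and, for dirty
accounting, does a memslot lookup), all under mmu_lock.

/* toy model -- simplified illustration, not kernel code */
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define PT_WRITABLE_MASK (1ULL << 1)
#define HASH_BUCKETS 1024

struct toy_shadow_page {
        uint64_t gfn;
        struct toy_shadow_page *next;
};

static struct toy_shadow_page *page_hash[HASH_BUCKETS];

/* Slow path: walk the hash bucket for this gfn (what the patch skips). */
static struct toy_shadow_page *toy_lookup_shadow(uint64_t gfn)
{
        struct toy_shadow_page *sp;

        for (sp = page_hash[gfn % HASH_BUCKETS]; sp; sp = sp->next)
                if (sp->gfn == gfn)
                        return sp;
        return NULL;
}

/* Fast path: a single bit test on the spte that is already in hand. */
static bool toy_spte_was_writable(uint64_t spte)
{
        return spte & PT_WRITABLE_MASK;
}

int main(void)
{
        uint64_t spte = PT_WRITABLE_MASK;       /* pretend the unsync spte was RW */
        uint64_t gfn = 0x1234;

        if (toy_spte_was_writable(spte))
                printf("fast path: skip write-protect check and dirty accounting\n");
        else
                printf("slow path: shadow page %p\n", (void *)toy_lookup_shadow(gfn));
        return 0;
}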



[PATCH] KVM: MMU: optimize set_spte for page sync

2008-11-26 Thread Avi Kivity
From: Marcelo Tosatti [EMAIL PROTECTED]

The write-protect verification in set_spte is unnecessary for page sync.

It's guaranteed that, if the unsync spte was writable, the target page
does not have a write-protected shadow (if it had, the spte would have
been write-protected under mmu_lock by rmap_write_protect beforehand).

The same reasoning applies to mark_page_dirty: the gfn has already been
marked dirty via the pagefault path.

The cost of the hash table and memslot lookups is quite significant if
the workload is pagetable-write intensive, resulting in increased
mmu_lock contention.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index fa3486d..dd20b19 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1593,6 +1593,15 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 
spte |= PT_WRITABLE_MASK;
 
+   /*
+* Optimization: for pte sync, if spte was writable the hash
+* lookup is unnecessary (and expensive). Write protection
+* is the responsibility of mmu_get_page / kvm_sync_page.
+* Same reasoning can be applied to dirty page accounting.
+*/
+   if (!can_unsync && is_writeble_pte(*shadow_pte))
+   goto set_pte;
+
if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
pgprintk("%s: found shadow page for %lx, marking ro\n",
 __func__, gfn);


Re: KVM: MMU: optimize set_spte for page sync

2008-11-26 Thread Avi Kivity

Marcelo Tosatti wrote:


Here it goes.

KVM: MMU: optimize set_spte for page sync

The write-protect verification in set_spte is unnecessary for page sync.

It's guaranteed that, if the unsync spte was writable, the target page
does not have a write-protected shadow (if it had, the spte would have
been write-protected under mmu_lock by rmap_write_protect beforehand).

The same reasoning applies to mark_page_dirty: the gfn has already been
marked dirty via the pagefault path.

The cost of the hash table and memslot lookups is quite significant if
the workload is pagetable-write intensive, resulting in increased
mmu_lock contention.

  


Applied, thanks.

--
error compiling committee.c: too many arguments to function



Re: KVM: MMU: optimize set_spte for page sync

2008-11-25 Thread Avi Kivity

Marcelo Tosatti wrote:

*shadow_pte can point to a different page if the guest updates the
pagetable and there is a fault before resync; the fault updates the
spte with the new gfn (and pfn) via mmu_set_spte. In that case the gfn
cache is updated, since:

} else if (pfn != spte_to_pfn(*shadow_pte)) {
printk("hfn old %lx new %lx\n",
 spte_to_pfn(*shadow_pte), pfn);
rmap_remove(vcpu->kvm, shadow_pte);
  


Okay. Please resend, but without the reversal of can_unsync; it will
make a more readable patch. If you like, send a follow-on that only
does the reversal.
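
To make the requested change concrete: the first posting renamed
can_unsync to sync_page and inverted its sense, so its guard read
"if (sync_page && is_writeble_pte(*shadow_pte))"; keeping the original
parameter only flips the test to "!can_unsync". A throwaway check of that
equivalence (plain C, not kernel code, assuming sync_page == !can_unsync
at every call site):

#include <assert.h>
#include <stdbool.h>

int main(void)
{
        /* With sync_page defined as !can_unsync, the guard from the
         * first posting and the guard from the resend accept exactly
         * the same cases. */
        for (int can_unsync = 0; can_unsync <= 1; can_unsync++) {
                bool sync_page = !can_unsync;

                for (int writable = 0; writable <= 1; writable++)
                        assert((sync_page && writable) ==
                               (!can_unsync && writable));
        }
        return 0;
}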


--
error compiling committee.c: too many arguments to function



Re: KVM: MMU: optimize set_spte for page sync

2008-11-25 Thread Marcelo Tosatti
On Tue, Nov 25, 2008 at 04:38:13PM +0200, Avi Kivity wrote:
 Marcelo Tosatti wrote:
 *shadow_pte can point to a different page if the guest updates the
 pagetable and there is a fault before resync; the fault updates the
 spte with the new gfn (and pfn) via mmu_set_spte. In that case the gfn
 cache is updated, since:

 } else if (pfn != spte_to_pfn(*shadow_pte)) {
 printk("hfn old %lx new %lx\n",
  spte_to_pfn(*shadow_pte), pfn);
 rmap_remove(vcpu->kvm, shadow_pte);
   

 Okay. Please resend, but without the reversal of can_unsync; it will
 make a more readable patch. If you like, send a follow-on that only
 does the reversal.

Here it goes.

KVM: MMU: optimize set_spte for page sync

The write-protect verification in set_spte is unnecessary for page sync.

It's guaranteed that, if the unsync spte was writable, the target page
does not have a write-protected shadow (if it had, the spte would have
been write-protected under mmu_lock by rmap_write_protect beforehand).

The same reasoning applies to mark_page_dirty: the gfn has already been
marked dirty via the pagefault path.

The cost of the hash table and memslot lookups is quite significant if
the workload is pagetable-write intensive, resulting in increased
mmu_lock contention.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -1593,6 +1593,15 @@ static int set_spte(struct kvm_vcpu *vcp
 
spte |= PT_WRITABLE_MASK;
 
+   /*
+* Optimization: for pte sync, if spte was writable the hash
+* lookup is unnecessary (and expensive). Write protection
+* is the responsibility of mmu_get_page / kvm_sync_page.
+* Same reasoning can be applied to dirty page accounting.
+*/
+   if (!can_unsync && is_writeble_pte(*shadow_pte))
+   goto set_pte;
+
if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
pgprintk("%s: found shadow page for %lx, marking ro\n",
 __func__, gfn);



Re: KVM: MMU: optimize set_spte for page sync

2008-11-24 Thread Marcelo Tosatti
On Sun, Nov 23, 2008 at 12:36:29PM +0200, Avi Kivity wrote:
 Marcelo Tosatti wrote:

 The cost of the hash table and memslot lookups is quite significant if
 the workload is pagetable-write intensive, resulting in increased
 mmu_lock contention.

 @@ -1593,7 +1593,16 @@ static int set_spte(struct kvm_vcpu *vcp
  spte |= PT_WRITABLE_MASK;
  -   if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
 +/*
 + * Optimization: for pte sync, if spte was writable the hash
 + * lookup is unnecessary (and expensive). Write protection
 + * is the responsibility of mmu_get_page / kvm_sync_page.
 + * Same reasoning can be applied to dirty page accounting.
 + */
 +if (sync_page && is_writeble_pte(*shadow_pte))
 +goto set_pte;
   

 What if *shadow_pte points at a different page?  Is that possible?

To a different gfn? Then sync_page will have nuked the spte:

if (gpte_to_gfn(gpte) != gfn || !is_present_pte(gpte) ||
!(gpte & PT_ACCESSED_MASK)) {
u64 nonpresent;
..
set_shadow_pte(sp->spt[i], nonpresent);
}

Otherwise:

/*
 * Using the cached information from sp->gfns is safe because:
 * - The spte has a reference to the struct page, so the pfn for a given
 * gfn can't change unless all sptes pointing to it are nuked first.



Re: KVM: MMU: optimize set_spte for page sync

2008-11-24 Thread Marcelo Tosatti
On Mon, Nov 24, 2008 at 01:04:23PM +0100, Marcelo Tosatti wrote:
 On Sun, Nov 23, 2008 at 12:36:29PM +0200, Avi Kivity wrote:
  Marcelo Tosatti wrote:
 
  The cost of the hash table and memslot lookups is quite significant if
  the workload is pagetable-write intensive, resulting in increased
  mmu_lock contention.
 
  @@ -1593,7 +1593,16 @@ static int set_spte(struct kvm_vcpu *vcp
 spte |= PT_WRITABLE_MASK;
   - if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
  +  /*
  +   * Optimization: for pte sync, if spte was writable the hash
  +   * lookup is unnecessary (and expensive). Write protection
  +   * is the responsibility of mmu_get_page / kvm_sync_page.
  +   * Same reasoning can be applied to dirty page accounting.
  +   */
  +  if (sync_page && is_writeble_pte(*shadow_pte))
  +  goto set_pte;

 
  What if *shadow_pte points at a different page?  Is that possible?

 To a different gfn? Then sync_page will have nuked the spte:
 
 if (gpte_to_gfn(gpte) != gfn || !is_present_pte(gpte) ||
 !(gpte & PT_ACCESSED_MASK)) {
 u64 nonpresent;
 ..
 set_shadow_pte(sp->spt[i], nonpresent);
 }
 
 Otherwise:
 
 /*
  * Using the cached information from sp->gfns is safe because:
  * - The spte has a reference to the struct page, so the pfn for a given
  * gfn can't change unless all sptes pointing to it are nuked first.

*shadow_pte can point to a different page if the guest updates the
pagetable and there is a fault before resync; the fault updates the
spte with the new gfn (and pfn) via mmu_set_spte. In that case the gfn
cache is updated, since:

} else if (pfn != spte_to_pfn(*shadow_pte)) {
printk("hfn old %lx new %lx\n",
 spte_to_pfn(*shadow_pte), pfn);
rmap_remove(vcpu->kvm, shadow_pte);
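
A minimal standalone sketch of the pfn-consistency check quoted above
(simplified mask and values; the real check sits inside mmu_set_spte and
uses PT64_BASE_ADDR_MASK): if the pfn backing the fault differs from what
the existing spte maps, the stale mapping is dropped via rmap_remove and
the spte is re-established with the new gfn, which is what keeps the gfn
cache consistent.

#include <stdint.h>
#include <stdio.h>

#define TOY_PAGE_SHIFT 12
/* Hypothetical stand-in for PT64_BASE_ADDR_MASK. */
#define TOY_BASE_ADDR_MASK 0x000ffffffffff000ULL

static uint64_t toy_spte_to_pfn(uint64_t spte)
{
        return (spte & TOY_BASE_ADDR_MASK) >> TOY_PAGE_SHIFT;
}

int main(void)
{
        /* Pretend the existing spte maps pfn 0xabc with low flag bits set,
         * while the current fault resolved the gpte to pfn 0xdef. */
        uint64_t shadow_pte = (0xabcULL << TOY_PAGE_SHIFT) | 0x7;
        uint64_t new_pfn = 0xdef;

        /* Mirrors the check quoted above: a mismatch means the guest
         * changed the gpte, so the old mapping must be removed before
         * the spte is set again with the new pfn. */
        if (new_pfn != toy_spte_to_pfn(shadow_pte))
                printf("hfn old %llx new %llx -> rmap_remove, then re-set spte\n",
                       (unsigned long long)toy_spte_to_pfn(shadow_pte),
                       (unsigned long long)new_pfn);
        return 0;
}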




KVM: MMU: optimize set_spte for page sync

2008-11-21 Thread Marcelo Tosatti

The write-protect verification in set_spte is unnecessary for page sync.

It's guaranteed that, if the unsync spte was writable, the target page
does not have a write-protected shadow (if it had, the spte would have
been write-protected under mmu_lock by rmap_write_protect beforehand).

The same reasoning applies to mark_page_dirty: the gfn has already been
marked dirty via the pagefault path.

The cost of the hash table and memslot lookups is quite significant if
the workload is pagetable-write intensive, resulting in increased
mmu_lock contention.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -1529,7 +1529,7 @@ static int kvm_unsync_page(struct kvm_vc
 }
 
 static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
- bool can_unsync)
+ bool sync_page)
 {
struct kvm_mmu_page *shadow;
 
@@ -1539,7 +1539,7 @@ static int mmu_need_write_protect(struct
return 1;
if (shadow->unsync)
return 0;
-   if (can_unsync && oos_shadow)
+   if (!sync_page && oos_shadow)
return kvm_unsync_page(vcpu, shadow);
return 1;
}
@@ -1550,7 +1550,7 @@ static int set_spte(struct kvm_vcpu *vcp
unsigned pte_access, int user_fault,
int write_fault, int dirty, int largepage,
gfn_t gfn, pfn_t pfn, bool speculative,
-   bool can_unsync)
+   bool sync_page)
 {
u64 spte;
int ret = 0;
@@ -1593,7 +1593,16 @@ static int set_spte(struct kvm_vcpu *vcp
 
spte |= PT_WRITABLE_MASK;
 
-   if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
+   /*
+* Optimization: for pte sync, if spte was writable the hash
+* lookup is unnecessary (and expensive). Write protection
+* is the responsibility of mmu_get_page / kvm_sync_page.
+* Same reasoning can be applied to dirty page accounting.
+*/
+   if (sync_page && is_writeble_pte(*shadow_pte))
+   goto set_pte;
+
+   if (mmu_need_write_protect(vcpu, gfn, sync_page)) {
pgprintk("%s: found shadow page for %lx, marking ro\n",
 __func__, gfn);
ret = 1;
@@ -1648,7 +1657,7 @@ static void mmu_set_spte(struct kvm_vcpu
}
}
if (set_spte(vcpu, shadow_pte, pte_access, user_fault, write_fault,
- dirty, largepage, gfn, pfn, speculative, true)) {
+ dirty, largepage, gfn, pfn, speculative, false)) {
if (write_fault)
*ptwrite = 1;
kvm_x86_ops->tlb_flush(vcpu);
Index: kvm/arch/x86/kvm/paging_tmpl.h
===
--- kvm.orig/arch/x86/kvm/paging_tmpl.h
+++ kvm/arch/x86/kvm/paging_tmpl.h
@@ -580,7 +580,7 @@ static int FNAME(sync_page)(struct kvm_v
pte_access = sp->role.access & FNAME(gpte_access)(vcpu, gpte);
set_spte(vcpu, sp->spt[i], pte_access, 0, 0,
 is_dirty_pte(gpte), 0, gfn,
-spte_to_pfn(sp->spt[i]), true, false);
+spte_to_pfn(sp->spt[i]), true, true);
}
 
return !nr_present;