Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On 04/17/2012 03:41 PM, Avi Kivity wrote:
> On 04/17/2012 06:55 AM, Xiao Guangrong wrote:
>> On 04/16/2012 07:47 PM, Avi Kivity wrote:
>>> On 04/16/2012 01:20 PM, Xiao Guangrong wrote:
>>>>>> It is used to avoid the unnecessary overload
>>>>> It's overloading me :(
>>>>
>>>> Sorry.
>>>
>>> The trick is to send those in separate patchsets so the maintainer
>>> doesn't notice.
>>
>> Thanks for your suggestion, I will pay more attention to this in the
>> future.
>>
>> For this patch, what did you mean by "those"? Do you mean the whole
>> rmap.PTE_LIST_WP_BIT (fast check for shadow page table write protection
>> and host write protection) or just host_page_write_protect
>> (for KSM only)?
>
> All of it. Let's start with just modifying sptes concurrently and only
> later add reading bits from rmap concurrently, if it proves necessary.

Okay, I agree.

>> If we do not have rmap.PTE_LIST_WP_BIT, there may be a regression on
>> the shadow mmu.
>>
>> Hmm, do I need to implement rmap.PTE_LIST_WP_BIT first, and then fast
>> page fault?
>
> Let's try to measure the effect without rmap.PTE_LIST_WP_BIT. Usually
> PTE chains for page tables are short so the effect would be small. Of
> course we can't tell about all guests.

It is not about the rmap's sptes, it is about sp.sync write protection:
if a synced shadow page is written, the fast page fault path will be
triggered even when there is no migration and no framebuffer.

I ran kernbench 10 times and took the average, without xwindow:

keep rmap.PTE_LIST_WP_BIT:                                            53.494
comment out the rmap.PTE_LIST_WP_BIT check in page_fault_can_be_fast:  53.948

Anyway, for ease of review, let's merge fast page fault first and
discuss this in a separate patchset later.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On 04/17/2012 06:55 AM, Xiao Guangrong wrote:
> On 04/16/2012 07:47 PM, Avi Kivity wrote:
>> On 04/16/2012 01:20 PM, Xiao Guangrong wrote:
>>>>> It is used to avoid the unnecessary overload
>>>> It's overloading me :(
>>>
>>> Sorry.
>>
>> The trick is to send those in separate patchsets so the maintainer
>> doesn't notice.
>
> Thanks for your suggestion, I will pay more attention to this in the
> future.
>
> For this patch, what did you mean by "those"? Do you mean the whole
> rmap.PTE_LIST_WP_BIT (fast check for shadow page table write protection
> and host write protection) or just host_page_write_protect
> (for KSM only)?

All of it. Let's start with just modifying sptes concurrently and only
later add reading bits from rmap concurrently, if it proves necessary.

> If we do not have rmap.PTE_LIST_WP_BIT, there may be a regression on
> the shadow mmu.
>
> Hmm, do I need to implement rmap.PTE_LIST_WP_BIT first, and then fast
> page fault?

Let's try to measure the effect without rmap.PTE_LIST_WP_BIT. Usually
PTE chains for page tables are short so the effect would be small. Of
course we can't tell about all guests.

--
error compiling committee.c: too many arguments to function
Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On 04/16/2012 07:47 PM, Avi Kivity wrote:
> On 04/16/2012 01:20 PM, Xiao Guangrong wrote:
>>>> It is used to avoid the unnecessary overload
>>> It's overloading me :(
>>
>> Sorry.
>
> The trick is to send those in separate patchsets so the maintainer
> doesn't notice.

Thanks for your suggestion, I will pay more attention to this in the
future.

For this patch, what did you mean by "those"? Do you mean the whole
rmap.PTE_LIST_WP_BIT (fast check for shadow page table write protection
and host write protection) or just host_page_write_protect
(for KSM only)?

If we do not have rmap.PTE_LIST_WP_BIT, there may be a regression on
the shadow mmu.

Hmm, do I need to implement rmap.PTE_LIST_WP_BIT first, and then fast
page fault?
Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On 04/16/2012 01:20 PM, Xiao Guangrong wrote:
>>> It is used to avoid the unnecessary overload
>> It's overloading me :(
>
> Sorry.

The trick is to send those in separate patchsets so the maintainer
doesn't notice.
Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On 04/16/2012 06:02 PM, Avi Kivity wrote:
> On 04/16/2012 06:25 AM, Xiao Guangrong wrote:
>> On 04/15/2012 11:16 PM, Avi Kivity wrote:
>>> On 04/13/2012 01:14 PM, Xiao Guangrong wrote:
>>>> Using bit 1 (PTE_LIST_WP_BIT) in rmap to store the write-protect
>>>> status to avoid unnecessary shadow page walking
>>>>
>>>> Signed-off-by: Xiao Guangrong
>>>> ---
>>>>  arch/x86/kvm/mmu.c |   40 ++++++++++++++++++++++++++++++++++------
>>>>  1 files changed, 34 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>>> index 0c6e92d..8b71908 100644
>>>> --- a/arch/x86/kvm/mmu.c
>>>> +++ b/arch/x86/kvm/mmu.c
>>>> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
>>>>  	return level - 1;
>>>>  }
>>>>
>>>> -#define PTE_LIST_DESC		(0x1ull)
>>>> +#define PTE_LIST_DESC_BIT	0
>>>> +#define PTE_LIST_WP_BIT		1
>>>> +#define PTE_LIST_DESC		(1 << PTE_LIST_DESC_BIT)
>>>>  #define PTE_LIST_FLAG_MASK	(0x3ull)
>>>>
>>>>  static void
>>>> @@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
>>>>  	return mmu_memory_cache_free_objects(cache);
>>>>  }
>>>>
>>>> +static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
>>>> +{
>>>> +	if (!(*spte & SPTE_HOST_WRITEABLE))
>>>> +		__test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
>>>> +}
>>>
>>> Why is this needed, in addition to spte.SPTE_WRITE_PROTECT?
>>
>> It is used to avoid the unnecessary overload
>
> It's overloading me :(

Sorry.

>> for fast page fault if
>> KSM is enabled. On the fast check path, it can see that the gfn is
>> write-protected by the host, so the fast page fault path is not taken.
>
> The fast page fault path is supposed to be fast, so it's okay if we take
> a bit of extra overhead before a COW (which is going to be slow anyway).
>
> Let's get the simplest possible version in, and then discuss if/how we
> need to optimize it further.

Okay, I will drop setting PTE_LIST_WP_BIT for this case in the next
version. :)
Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On 04/16/2012 06:25 AM, Xiao Guangrong wrote:
> On 04/15/2012 11:16 PM, Avi Kivity wrote:
>> On 04/13/2012 01:14 PM, Xiao Guangrong wrote:
>>> Using bit 1 (PTE_LIST_WP_BIT) in rmap to store the write-protect
>>> status to avoid unnecessary shadow page walking
>>>
>>> Signed-off-by: Xiao Guangrong
>>> ---
>>>  arch/x86/kvm/mmu.c |   40 ++++++++++++++++++++++++++++++++++------
>>>  1 files changed, 34 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index 0c6e92d..8b71908 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
>>>  	return level - 1;
>>>  }
>>>
>>> -#define PTE_LIST_DESC		(0x1ull)
>>> +#define PTE_LIST_DESC_BIT	0
>>> +#define PTE_LIST_WP_BIT		1
>>> +#define PTE_LIST_DESC		(1 << PTE_LIST_DESC_BIT)
>>>  #define PTE_LIST_FLAG_MASK	(0x3ull)
>>>
>>>  static void
>>> @@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
>>>  	return mmu_memory_cache_free_objects(cache);
>>>  }
>>>
>>> +static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
>>> +{
>>> +	if (!(*spte & SPTE_HOST_WRITEABLE))
>>> +		__test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
>>> +}
>>
>> Why is this needed, in addition to spte.SPTE_WRITE_PROTECT?
>
> It is used to avoid the unnecessary overload

It's overloading me :(

> for fast page fault if
> KSM is enabled. On the fast check path, it can see that the gfn is
> write-protected by the host, so the fast page fault path is not taken.

The fast page fault path is supposed to be fast, so it's okay if we take
a bit of extra overhead before a COW (which is going to be slow anyway).

Let's get the simplest possible version in, and then discuss if/how we
need to optimize it further.
Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On 04/14/2012 11:01 AM, Takuya Yoshikawa wrote:
> On Fri, 13 Apr 2012 18:14:26 +0800
> Xiao Guangrong wrote:
>
>> Using bit 1 (PTE_LIST_WP_BIT) in rmap to store the write-protect
>> status to avoid unnecessary shadow page walking
>>
>> Signed-off-by: Xiao Guangrong
>> ---
>>  arch/x86/kvm/mmu.c |   40 ++++++++++++++++++++++++++++++++++------
>>  1 files changed, 34 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 0c6e92d..8b71908 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
>>  	return level - 1;
>>  }
>>
>> -#define PTE_LIST_DESC		(0x1ull)
>> +#define PTE_LIST_DESC_BIT	0
>> +#define PTE_LIST_WP_BIT		1
>
> _BIT ?

What is the problem?

>> @@ -2291,9 +2310,15 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
>>  {
>>  	struct kvm_mmu_page *s;
>>  	struct hlist_node *node;
>> +	unsigned long *rmap;
>>  	bool need_unsync = false;
>>
>> +	rmap = gfn_to_rmap(vcpu->kvm, gfn, PT_PAGE_TABLE_LEVEL);
>
> Please use consistent variable names.
>
> In other parts of this patch, you are using rmapp for this.

Okay.
Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On 04/15/2012 11:16 PM, Avi Kivity wrote:
> On 04/13/2012 01:14 PM, Xiao Guangrong wrote:
>> Using bit 1 (PTE_LIST_WP_BIT) in rmap to store the write-protect
>> status to avoid unnecessary shadow page walking
>>
>> Signed-off-by: Xiao Guangrong
>> ---
>>  arch/x86/kvm/mmu.c |   40 ++++++++++++++++++++++++++++++++++------
>>  1 files changed, 34 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 0c6e92d..8b71908 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
>>  	return level - 1;
>>  }
>>
>> -#define PTE_LIST_DESC		(0x1ull)
>> +#define PTE_LIST_DESC_BIT	0
>> +#define PTE_LIST_WP_BIT		1
>> +#define PTE_LIST_DESC		(1 << PTE_LIST_DESC_BIT)
>>  #define PTE_LIST_FLAG_MASK	(0x3ull)
>>
>>  static void
>> @@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
>>  	return mmu_memory_cache_free_objects(cache);
>>  }
>>
>> +static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
>> +{
>> +	if (!(*spte & SPTE_HOST_WRITEABLE))
>> +		__test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
>> +}
>
> Why is this needed, in addition to spte.SPTE_WRITE_PROTECT?

It is used to avoid the unnecessary overload for fast page fault if KSM
is enabled. On the fast check path, it can see that the gfn is
write-protected by the host, so the fast page fault path is not taken.
Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On 04/13/2012 01:14 PM, Xiao Guangrong wrote:
> Using bit 1 (PTE_LIST_WP_BIT) in rmap to store the write-protect
> status to avoid unnecessary shadow page walking
>
> Signed-off-by: Xiao Guangrong
> ---
>  arch/x86/kvm/mmu.c |   40 ++++++++++++++++++++++++++++++++++------
>  1 files changed, 34 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 0c6e92d..8b71908 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
>  	return level - 1;
>  }
>
> -#define PTE_LIST_DESC		(0x1ull)
> +#define PTE_LIST_DESC_BIT	0
> +#define PTE_LIST_WP_BIT		1
> +#define PTE_LIST_DESC		(1 << PTE_LIST_DESC_BIT)
>  #define PTE_LIST_FLAG_MASK	(0x3ull)
>
>  static void
> @@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
>  	return mmu_memory_cache_free_objects(cache);
>  }
>
> +static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
> +{
> +	if (!(*spte & SPTE_HOST_WRITEABLE))
> +		__test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
> +}

Why is this needed, in addition to spte.SPTE_WRITE_PROTECT?
Re: [PATCH v2 10/16] KVM: MMU: fast check whether page is writable
On Fri, 13 Apr 2012 18:14:26 +0800
Xiao Guangrong wrote:

> Using bit 1 (PTE_LIST_WP_BIT) in rmap to store the write-protect
> status to avoid unnecessary shadow page walking
>
> Signed-off-by: Xiao Guangrong
> ---
>  arch/x86/kvm/mmu.c |   40 ++++++++++++++++++++++++++++++++++------
>  1 files changed, 34 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 0c6e92d..8b71908 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
>  	return level - 1;
>  }
>
> -#define PTE_LIST_DESC		(0x1ull)
> +#define PTE_LIST_DESC_BIT	0
> +#define PTE_LIST_WP_BIT		1

_BIT ?

> +#define PTE_LIST_DESC		(1 << PTE_LIST_DESC_BIT)
>  #define PTE_LIST_FLAG_MASK	(0x3ull)

Well, I think that rmap is losing its original meaning:

Before, it held just the information needed to link sptes.
Now it has more roles, and is becoming a kind of descriptor?

...

> @@ -2291,9 +2310,15 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
>  {
>  	struct kvm_mmu_page *s;
>  	struct hlist_node *node;
> +	unsigned long *rmap;
>  	bool need_unsync = false;
>
> +	rmap = gfn_to_rmap(vcpu->kvm, gfn, PT_PAGE_TABLE_LEVEL);

Please use consistent variable names.

In other parts of this patch, you are using rmapp for this.

	Takuya
[PATCH v2 10/16] KVM: MMU: fast check whether page is writable
Using bit 1 (PTE_LIST_WP_BIT) in rmap to store the write-protect status
to avoid unnecessary shadow page walking

Signed-off-by: Xiao Guangrong
---
 arch/x86/kvm/mmu.c |   40 ++++++++++++++++++++++++++++++++++------
 1 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 0c6e92d..8b71908 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -796,7 +796,9 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
 	return level - 1;
 }
 
-#define PTE_LIST_DESC		(0x1ull)
+#define PTE_LIST_DESC_BIT	0
+#define PTE_LIST_WP_BIT		1
+#define PTE_LIST_DESC		(1 << PTE_LIST_DESC_BIT)
 #define PTE_LIST_FLAG_MASK	(0x3ull)
 
 static void
@@ -1067,6 +1069,12 @@ static bool rmap_can_add(struct kvm_vcpu *vcpu)
 	return mmu_memory_cache_free_objects(cache);
 }
 
+static void host_page_write_protect(u64 *spte, unsigned long *rmapp)
+{
+	if (!(*spte & SPTE_HOST_WRITEABLE))
+		__test_and_set_bit(PTE_LIST_WP_BIT, rmapp);
+}
+
 static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
 {
 	struct kvm_mmu_page *sp;
@@ -1074,7 +1082,10 @@ static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
 
 	sp = page_header(__pa(spte));
 	kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn);
+
 	rmapp = gfn_to_rmap(vcpu->kvm, gfn, sp->role.level);
+	host_page_write_protect(spte, rmapp);
+
 	return pte_list_add(vcpu, spte, rmapp);
 }
 
@@ -1164,16 +1175,23 @@ static bool rmap_write_protect(struct kvm *kvm, u64 gfn)
 {
 	struct kvm_memory_slot *slot;
 	unsigned long *rmapp;
-	int i;
+	int i = PT_PAGE_TABLE_LEVEL;
 	bool write_protected = false;
 
 	slot = gfn_to_memslot(kvm, gfn);
+	rmapp = __gfn_to_rmap(gfn, i, slot);
+
+	if (__test_and_set_bit(PTE_LIST_WP_BIT, rmapp))
+		return false;
+
+	do {
+		write_protected |= __rmap_write_protect(kvm, rmapp, i++);
+
+		if (i >= PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES)
+			break;
 
-	for (i = PT_PAGE_TABLE_LEVEL;
-	     i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
 		rmapp = __gfn_to_rmap(gfn, i, slot);
-		write_protected |= __rmap_write_protect(kvm, rmapp, i);
-	}
+	} while (true);
 
 	return write_protected;
 }
@@ -1225,6 +1243,7 @@ static int kvm_set_pte_rmapp(struct kvm *kvm, unsigned long *rmapp,
 
 			mmu_spte_clear_track_bits(sptep);
 			mmu_spte_set(sptep, new_spte);
+			host_page_write_protect(sptep, rmapp);
 		}
 	}
 
@@ -2291,9 +2310,15 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
 {
 	struct kvm_mmu_page *s;
 	struct hlist_node *node;
+	unsigned long *rmap;
 	bool need_unsync = false;
 
+	rmap = gfn_to_rmap(vcpu->kvm, gfn, PT_PAGE_TABLE_LEVEL);
+	if (!vcpu->kvm->arch.indirect_shadow_pages)
+		goto write_free;
+
+	if (!test_bit(PTE_LIST_WP_BIT, rmap))
 		return 0;
 
 	for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn, node) {
@@ -2309,6 +2334,9 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
 	}
 	if (need_unsync)
 		kvm_unsync_pages(vcpu, gfn);
+
+write_free:
+	__clear_bit(PTE_LIST_WP_BIT, rmap);
 	return 0;
 }
-- 
1.7.7.6