Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-19 Thread Paolo Bonzini
Il 07/06/2013 10:51, Xiao Guangrong ha scritto:
> Changelog:
> V3:
>   All of these changes are from Gleb's review:
>   1) rename RET_MMIO_PF_EMU to RET_MMIO_PF_EMULATE.
>   2) smartly adjust kvm generation number in kvm_current_mmio_generation()
>  to avoid kvm_memslots->generation overflow.
> 
> V2:
>   - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
>   - use kvm->memslots->generation as kvm global generation-number
>   - fix comment and codestyle
>   - init kvm generation close to mmio wrap-around value
>   - keep kvm_mmu_zap_mmio_sptes
> 
> The current way holds the hot mmu-lock and walks all shadow pages; this
> does not scale. This patchset introduces a very simple and scalable way
> to fast-invalidate all mmio sptes - it does not need to walk any shadow
> pages or hold any locks.
> 
> The idea is simple:
> KVM maintains a global mmio valid generation-number, stored in
> kvm->memslots.generation, and every mmio spte stores the current global
> generation-number in its available bits when it is created.
> 
> When KVM needs to zap all mmio sptes, it simply increases the global
> generation-number. When a guest does an mmio access, KVM intercepts the
> MMIO #PF, walks the shadow page table and gets the mmio spte. If the
> generation-number on the spte does not equal the global generation-number,
> it goes to the normal #PF handler to update the mmio spte.
> 
> Since 19 bits are used to store the generation-number in the mmio spte,
> we zap all mmio sptes when the number wraps around.
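[Editor's note: the generation-number scheme described above can be sketched in plain C. This is a hypothetical simplification for illustration only - the bit position, helper names and layout below are assumptions, not the kernel's actual code.]

```c
#include <stdint.h>

/* 19 bits of generation, as stated in the cover letter */
#define MMIO_GEN_BITS  19u
#define MMIO_GEN_MASK  ((1u << MMIO_GEN_BITS) - 1)
#define MMIO_GEN_SHIFT 3u  /* assumed position of the spte's available bits */

/* Stamp the current global generation into the spte's available bits. */
static uint64_t mark_mmio_spte_gen(uint64_t spte, unsigned int global_gen)
{
	spte &= ~((uint64_t)MMIO_GEN_MASK << MMIO_GEN_SHIFT);
	return spte | ((uint64_t)(global_gen & MMIO_GEN_MASK) << MMIO_GEN_SHIFT);
}

/* On an MMIO #PF: a mismatching generation means the spte is stale and
 * the normal page-fault path must rebuild it. */
static int mmio_spte_is_valid(uint64_t spte, unsigned int global_gen)
{
	unsigned int gen = (spte >> MMIO_GEN_SHIFT) & MMIO_GEN_MASK;

	return gen == (global_gen & MMIO_GEN_MASK);
}
```

Zapping all mmio sptes then amounts to a single `global_gen++`: every previously created spte fails the check on its next use, without walking any shadow pages or taking any locks.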
> 
> Xiao Guangrong (6):
>   KVM: MMU: retain more available bits on mmio spte
>   KVM: MMU: store generation-number into mmio spte
>   KVM: MMU: make return value of mmio page fault handler more readable
>   KVM: MMU: fast invalidate all mmio sptes
>   KVM: MMU: add tracepoint for check_mmio_spte
>   KVM: MMU: init kvm generation close to mmio wrap-around value
> 
>  arch/x86/include/asm/kvm_host.h |   2 +-
>  arch/x86/kvm/mmu.c  | 131 
> 
>  arch/x86/kvm/mmu.h  |  17 ++
>  arch/x86/kvm/mmutrace.h |  34 +--
>  arch/x86/kvm/paging_tmpl.h  |  10 ++-
>  arch/x86/kvm/vmx.c  |  12 ++--
>  arch/x86/kvm/x86.c  |  11 +++-
>  7 files changed, 177 insertions(+), 40 deletions(-)
> 

Applied all to queue for now.  Will graduate to next after doing some
more tests.

Thanks!

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-19 Thread Xiao Guangrong
On 06/19/2013 07:08 PM, Paolo Bonzini wrote:
> Il 10/06/2013 19:03, Gleb Natapov ha scritto:
>> On Mon, Jun 10, 2013 at 10:43:52PM +0900, Takuya Yoshikawa wrote:
>>> On Mon, 10 Jun 2013 16:39:37 +0800
>>> Xiao Guangrong  wrote:
>>>
 On 06/10/2013 03:56 PM, Gleb Natapov wrote:
> On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:
>>>
> Looks good to me, but doesn't this obsolete kvm_mmu_zap_mmio_sptes() and
> sp->mmio_cached, so they should be removed as part of the patch series?

 Yes, I agree, they should be removed. :)
>>>
>>> I'm fine with removing it but please make it clear that you all agree
>>> on the same basis.
>>>
>>> Last time, Paolo mentioned the possibility to use some bits of spte for
>>> other things.  The suggestion there was to keep sp->mmio_cached code
>>> for the time we would need to reduce the bits for generation numbers.
>>>
>>> Do you think that zap_all() is now preemptible and can handle the
>>> situation as well as the current kvm_mmu_zap_mmio_sptes()?
>>>
>>> One downside is the need to zap unrelated shadow pages, but if this case
>>> is really very rare, yes I agree, it should not be a problem: it depends
>>> on how many bits we can use.
>>>
>>> Just please reconfirm.
>>>
>> It was me who mentioned the possibility of using some bits of the spte
>> for other things. But for now I have a use for only one bit. Now that you
>> have reminded me about that discussion I am not so sure we want to
>> remove kvm_mmu_zap_mmio_sptes(), but on the other hand it is
>> non-preemptable, so a large number of mmio sptes can cause soft lockups.
>> zap_all() is better in this regard now.
> 
> I asked Gleb on IRC, and he's fine with applying patch 7 too (otherwise
> there's hardly any benefit, because kvm_mmu_zap_mmio_sptes is
> non-preemptable).
> 
> I'm also changing the -13 to -150 since it's quite easy to generate 150
> calls to KVM_SET_USER_MEMORY_REGION.  Using QEMU, and for a pretty basic
> guest with virtio-net, IDE controller and VGA you get:
> 
> - 9-10 calls before starting the guest, depending on the guest memory size
> 
> - around 25 during the BIOS
> 
> - around 20 during kernel boot
> 
> - 34 during a single dump of the 64 KB ROM from a virtio-net device.

Okay. The change is fine with me. :)




Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-19 Thread Paolo Bonzini
Il 10/06/2013 19:03, Gleb Natapov ha scritto:
> On Mon, Jun 10, 2013 at 10:43:52PM +0900, Takuya Yoshikawa wrote:
>> On Mon, 10 Jun 2013 16:39:37 +0800
>> Xiao Guangrong  wrote:
>>
>>> On 06/10/2013 03:56 PM, Gleb Natapov wrote:
 On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:
>>
 Looks good to me, but doesn't this obsolete kvm_mmu_zap_mmio_sptes() and
 sp->mmio_cached, so they should be removed as part of the patch series?
>>>
>>> Yes, I agree, they should be removed. :)
>>
>> I'm fine with removing it but please make it clear that you all agree
>> on the same basis.
>>
>> Last time, Paolo mentioned the possibility to use some bits of spte for
>> other things.  The suggestion there was to keep sp->mmio_cached code
>> for the time we would need to reduce the bits for generation numbers.
>>
>> Do you think that zap_all() is now preemptible and can handle the
>> situation as well as the current kvm_mmu_zap_mmio_sptes()?
>>
>> One downside is the need to zap unrelated shadow pages, but if this case
>> is really very rare, yes I agree, it should not be a problem: it depends
>> on how many bits we can use.
>>
>> Just please reconfirm.
>>
> It was me who mentioned the possibility of using some bits of the spte
> for other things. But for now I have a use for only one bit. Now that you
> have reminded me about that discussion I am not so sure we want to
> remove kvm_mmu_zap_mmio_sptes(), but on the other hand it is
> non-preemptable, so a large number of mmio sptes can cause soft lockups.
> zap_all() is better in this regard now.

I asked Gleb on IRC, and he's fine with applying patch 7 too (otherwise
there's hardly any benefit, because kvm_mmu_zap_mmio_sptes is
non-preemptable).

I'm also changing the -13 to -150 since it's quite easy to generate 150
calls to KVM_SET_USER_MEMORY_REGION.  Using QEMU, and for a pretty basic
guest with virtio-net, IDE controller and VGA you get:

- 9-10 calls before starting the guest, depending on the guest memory size

- around 25 during the BIOS

- around 20 during kernel boot

- 34 during a single dump of the 64 KB ROM from a virtio-net device.
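[Editor's note: Paolo's counts add up to roughly 90 KVM_SET_USER_MEMORY_REGION calls for a basic boot plus more for device ROM dumps, which is why -150 (rather than -13) reliably pushes the generation across the wrap-around while the guest is still young. A sketch of that arithmetic - MMIO_MAX_GEN and the -150 offset come from the thread; the helpers are illustrative, not kernel code:]

```c
#include <stdint.h>

#define MMIO_GEN_BITS 19u
#define MMIO_MAX_GEN  (1u << MMIO_GEN_BITS)

/* Each memslot update bumps the generation; it wraps mod 2^19. */
static unsigned int next_gen(unsigned int gen)
{
	return (gen + 1) % MMIO_MAX_GEN;
}

/* Simulate a guest's early life starting from an init value close to
 * wrap-around and count how often the wrap (and hence the zap-all path)
 * is hit after nr_slot_updates KVM_SET_USER_MEMORY_REGION calls. */
static unsigned int wraps_during_boot(unsigned int init_offset,
				      unsigned int nr_slot_updates)
{
	unsigned int gen = MMIO_MAX_GEN - init_offset;
	unsigned int wraps = 0, i;

	for (i = 0; i < nr_slot_updates; i++) {
		gen = next_gen(gen);
		if (gen == 0)
			wraps++;
	}
	return wraps;
}
```

With an offset of 150, the ~90 calls of a basic boot alone do not wrap yet, but adding a couple of ROM dumps (34 calls each) does - so the overflow path is exercised once mmio sptes actually exist, addressing Marcelo's concern about -13.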

Paolo


Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-18 Thread Xiao Guangrong
On 06/18/2013 10:26 PM, Paolo Bonzini wrote:
> Il 07/06/2013 10:51, Xiao Guangrong ha scritto:
>> Changelog:
>> V3:
>>   All of these changes are from Gleb's review:
>>   1) rename RET_MMIO_PF_EMU to RET_MMIO_PF_EMULATE.
>>   2) smartly adjust kvm generation number in kvm_current_mmio_generation()
>>  to avoid kvm_memslots->generation overflow.
>>
>> V2:
>>   - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
>>   - use kvm->memslots->generation as kvm global generation-number
>>   - fix comment and codestyle
>>   - init kvm generation close to mmio wrap-around value
>>   - keep kvm_mmu_zap_mmio_sptes
>>
>> The current way holds the hot mmu-lock and walks all shadow pages; this
>> does not scale. This patchset introduces a very simple and scalable way
>> to fast-invalidate all mmio sptes - it does not need to walk any shadow
>> pages or hold any locks.
>>
>> The idea is simple:
>> KVM maintains a global mmio valid generation-number, stored in
>> kvm->memslots.generation, and every mmio spte stores the current global
>> generation-number in its available bits when it is created.
>>
>> When KVM needs to zap all mmio sptes, it simply increases the global
>> generation-number. When a guest does an mmio access, KVM intercepts the
>> MMIO #PF, walks the shadow page table and gets the mmio spte. If the
>> generation-number on the spte does not equal the global generation-number,
>> it goes to the normal #PF handler to update the mmio spte.
>>
>> Since 19 bits are used to store the generation-number in the mmio spte,
>> we zap all mmio sptes when the number wraps around.
>>
>> Xiao Guangrong (6):
>>   KVM: MMU: retain more available bits on mmio spte
>>   KVM: MMU: store generation-number into mmio spte
>>   KVM: MMU: make return value of mmio page fault handler more readable
>>   KVM: MMU: fast invalidate all mmio sptes
>>   KVM: MMU: add tracepoint for check_mmio_spte
>>   KVM: MMU: init kvm generation close to mmio wrap-around value
>>
>>  arch/x86/include/asm/kvm_host.h |   2 +-
>>  arch/x86/kvm/mmu.c  | 131 
>> 
>>  arch/x86/kvm/mmu.h  |  17 ++
>>  arch/x86/kvm/mmutrace.h |  34 +--
>>  arch/x86/kvm/paging_tmpl.h  |  10 ++-
>>  arch/x86/kvm/vmx.c  |  12 ++--
>>  arch/x86/kvm/x86.c  |  11 +++-
>>  7 files changed, 177 insertions(+), 40 deletions(-)
>>
> 
> Xiao, is it time to add more comments to the code or update
> Documentation/virtual/kvm/mmu.txt?  

Yes, it is.

We have not updated mmu.txt for a long time. I will post a separate patchset
to bring it up to date with the current mmu code.

> Don't worry about the English, it is
> more than understandable and I can help with the editing.

Okay. Thank you in advance, Paolo! :)




Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-18 Thread Marcelo Tosatti
On Mon, Jun 17, 2013 at 07:59:15PM +0800, Xiao Guangrong wrote:
> Sorry for the delayed reply; I was on vacation.
> 
> On 06/15/2013 10:22 AM, Takuya Yoshikawa wrote:
> > On Thu, 13 Jun 2013 21:08:21 -0300
> > Marcelo Tosatti  wrote:
> > 
> >> On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:
> > 
> >> - Where is the generation number increased?
> > 
> > Looks like when a new slot is installed in update_memslots() because
> > it's based on slots->generation.  This is not restricted to "create"
> > and "move".
> 
> Yes. It reuses slots->generation to avoid unnecessary synchronizations
> (RCU, memory barrier).
> 
> Increasing the mmio generation number in the case of "create" and "move"
> is ok - it adds no extra work unless the mmio generation number overflows,
> which is hardly ever triggered (since the valid mmio generation number
> range is large enough and zap_all scales well now), and the mmio spte is
> updated only when it is used in the future.
> 
> > 
> >> - Should use spinlock breakable code in kvm_mmu_zap_mmio_sptes()
> >> (picture guest with 512GB of RAM, even walking all those pages is
> >> expensive) (ah, patch to remove kvm_mmu_zap_mmio_sptes does that).
> >> - Is -13 enough to test wraparound? It's highly likely the guest has
> >> not begun executing by the time 13 kvm_set_memory calls are made
> >> (so no sptes around). Perhaps -2000 is more sensible (should confirm
> >> though).
> > 
> > In the future, after we've tested enough, we should change the testing
> > code to be executed only for some debugging configs.  Especially, if we
> > change zap_mmio_sptes() to zap_all_shadows(), very common guests, even
> > without huge memory like 512GB, can see the effect induced by sudden page
> > faults unnecessarily.
> > 
> > If necessary, developers can test the wraparound code by lowering the
> > max_gen itself anyway.
> 
> I agree.
> 
> > 
> >> - Why remove "if (change == KVM_MR_CREATE) || (change
> >> ==  KVM_MR_MOVE)" from kvm_arch_commit_memory_region?
> >> It's instructive.
> > 
> > There may be a chance that we miss generation wraparounds if we don't
> > check other cases: seems unlikely, but theoretically possible.
> > 
> > In short, all memory slot changes make mmio sptes stored in shadow pages
> > obsolete, or zapped for wraparounds, in the new way -- am I right?
> 
> Yes. You are definitely right. :)
> 
> Takuya-san, thank you very much for answering the questions for me, and
> thanks to all of you for patiently reviewing my patches.
> 
> Marcelo, your points?

Agreed - points are clear. Patchset looks good.



Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-18 Thread Paolo Bonzini
Il 07/06/2013 10:51, Xiao Guangrong ha scritto:
> Changelog:
> V3:
>   All of these changes are from Gleb's review:
>   1) rename RET_MMIO_PF_EMU to RET_MMIO_PF_EMULATE.
>   2) smartly adjust kvm generation number in kvm_current_mmio_generation()
>  to avoid kvm_memslots->generation overflow.
> 
> V2:
>   - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
>   - use kvm->memslots->generation as kvm global generation-number
>   - fix comment and codestyle
>   - init kvm generation close to mmio wrap-around value
>   - keep kvm_mmu_zap_mmio_sptes
> 
> The current way holds the hot mmu-lock and walks all shadow pages; this
> does not scale. This patchset introduces a very simple and scalable way
> to fast-invalidate all mmio sptes - it does not need to walk any shadow
> pages or hold any locks.
> 
> The idea is simple:
> KVM maintains a global mmio valid generation-number, stored in
> kvm->memslots.generation, and every mmio spte stores the current global
> generation-number in its available bits when it is created.
> 
> When KVM needs to zap all mmio sptes, it simply increases the global
> generation-number. When a guest does an mmio access, KVM intercepts the
> MMIO #PF, walks the shadow page table and gets the mmio spte. If the
> generation-number on the spte does not equal the global generation-number,
> it goes to the normal #PF handler to update the mmio spte.
> 
> Since 19 bits are used to store the generation-number in the mmio spte,
> we zap all mmio sptes when the number wraps around.
> 
> Xiao Guangrong (6):
>   KVM: MMU: retain more available bits on mmio spte
>   KVM: MMU: store generation-number into mmio spte
>   KVM: MMU: make return value of mmio page fault handler more readable
>   KVM: MMU: fast invalidate all mmio sptes
>   KVM: MMU: add tracepoint for check_mmio_spte
>   KVM: MMU: init kvm generation close to mmio wrap-around value
> 
>  arch/x86/include/asm/kvm_host.h |   2 +-
>  arch/x86/kvm/mmu.c  | 131 
> 
>  arch/x86/kvm/mmu.h  |  17 ++
>  arch/x86/kvm/mmutrace.h |  34 +--
>  arch/x86/kvm/paging_tmpl.h  |  10 ++-
>  arch/x86/kvm/vmx.c  |  12 ++--
>  arch/x86/kvm/x86.c  |  11 +++-
>  7 files changed, 177 insertions(+), 40 deletions(-)
> 

Xiao, is it time to add more comments to the code or update
Documentation/virtual/kvm/mmu.txt?  Don't worry about the English, it is
more than understandable and I can help with the editing.

Paolo


Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-17 Thread Xiao Guangrong
Sorry for the delayed reply; I was on vacation.

On 06/15/2013 10:22 AM, Takuya Yoshikawa wrote:
> On Thu, 13 Jun 2013 21:08:21 -0300
> Marcelo Tosatti  wrote:
> 
>> On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:
> 
>> - Where is the generation number increased?
> 
> Looks like when a new slot is installed in update_memslots() because
> it's based on slots->generation.  This is not restricted to "create"
> and "move".

Yes. It reuses slots->generation to avoid unnecessary synchronizations
(RCU, memory barrier).

Increasing the mmio generation number in the case of "create" and "move"
is ok - it adds no extra work unless the mmio generation number overflows,
which is hardly ever triggered (since the valid mmio generation number
range is large enough and zap_all scales well now), and the mmio spte is
updated only when it is used in the future.
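[Editor's note: the point above - that memslot updates already bump kvm->memslots->generation under the existing synchronization, so mmio sptes stamped with an older generation go stale "for free" - can be modeled in a few lines. The struct and helper names are simplified stand-ins, not the kernel's kvm_memslots:]

```c
/* Minimal model of a memslots container carrying a generation counter. */
struct memslots {
	unsigned int generation;
	/* slot array elided */
};

/* Any create/move/delete/flags change funnels through the slot-update
 * path and increments the generation as a side effect, so no extra
 * RCU synchronization or memory barrier is needed for mmio zapping. */
static void update_memslots(struct memslots *slots)
{
	slots->generation++;
}

/* An mmio spte created under generation spte_gen is valid only while
 * the global generation still equals it. */
static int mmio_gen_still_valid(const struct memslots *slots,
				unsigned int spte_gen)
{
	return slots->generation == spte_gen;
}
```

After any slot update, every mmio spte carrying the old generation fails `mmio_gen_still_valid()` on its next use, which is exactly the "obsolete or zapped" behavior Takuya summarizes below in the thread.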

> 
>> - Should use spinlock breakable code in kvm_mmu_zap_mmio_sptes()
>> (picture guest with 512GB of RAM, even walking all those pages is
>> expensive) (ah, patch to remove kvm_mmu_zap_mmio_sptes does that).
>> - Is -13 enough to test wraparound? It's highly likely the guest has
>> not begun executing by the time 13 kvm_set_memory calls are made
>> (so no sptes around). Perhaps -2000 is more sensible (should confirm
>> though).
> 
> In the future, after we've tested enough, we should change the testing
> code to be executed only for some debugging configs.  Especially, if we
> change zap_mmio_sptes() to zap_all_shadows(), very common guests, even
> without huge memory like 512GB, can see the effect induced by sudden page
> faults unnecessarily.
> 
> If necessary, developers can test the wraparound code by lowering the
> max_gen itself anyway.

I agree.

> 
>> - Why remove "if (change == KVM_MR_CREATE) || (change
>> ==  KVM_MR_MOVE)" from kvm_arch_commit_memory_region?
>> It's instructive.
> 
> There may be a chance that we miss generation wraparounds if we don't
> check other cases: seems unlikely, but theoretically possible.
> 
> In short, all memory slot changes make mmio sptes stored in shadow pages
> obsolete, or zapped for wraparounds, in the new way -- am I right?

Yes. You are definitely right. :)

Takuya-san, thank you very much for answering the questions for me, and
thanks to all of you for patiently reviewing my patches.

Marcelo, your points?



Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-14 Thread Takuya Yoshikawa
On Thu, 13 Jun 2013 21:08:21 -0300
Marcelo Tosatti  wrote:

> On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:

> - Where is the generation number increased?

Looks like when a new slot is installed in update_memslots() because
it's based on slots->generation.  This is not restricted to "create"
and "move".

> - Should use spinlock breakable code in kvm_mmu_zap_mmio_sptes()
> (picture guest with 512GB of RAM, even walking all those pages is
> expensive) (ah, patch to remove kvm_mmu_zap_mmio_sptes does that).
> - Is -13 enough to test wraparound? It's highly likely the guest has
> not begun executing by the time 13 kvm_set_memory calls are made
> (so no sptes around). Perhaps -2000 is more sensible (should confirm
> though).

In the future, after we've tested enough, we should change the testing
code to be executed only for some debugging configs.  Especially, if we
change zap_mmio_sptes() to zap_all_shadows(), very common guests, even
without huge memory like 512GB, can see the effect induced by sudden page
faults unnecessarily.

If necessary, developers can test the wraparound code by lowering the
max_gen itself anyway.

> - Why remove "if (change == KVM_MR_CREATE) || (change
> ==  KVM_MR_MOVE)" from kvm_arch_commit_memory_region?
> It's instructive.

There may be a chance that we miss generation wraparounds if we don't
check other cases: seems unlikely, but theoretically possible.

In short, all memory slot changes make mmio sptes stored in shadow pages
obsolete, or zapped for wraparounds, in the new way -- am I right?

Takuya


Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-13 Thread Marcelo Tosatti
On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:
> Changelog:
> V3:
>   All of these changes are from Gleb's review:
>   1) rename RET_MMIO_PF_EMU to RET_MMIO_PF_EMULATE.
>   2) smartly adjust kvm generation number in kvm_current_mmio_generation()
>  to avoid kvm_memslots->generation overflow.
> 
> V2:
>   - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
>   - use kvm->memslots->generation as kvm global generation-number
>   - fix comment and codestyle
>   - init kvm generation close to mmio wrap-around value
>   - keep kvm_mmu_zap_mmio_sptes
> 
> The current way holds the hot mmu-lock and walks all shadow pages; this
> does not scale. This patchset introduces a very simple and scalable way
> to fast-invalidate all mmio sptes - it does not need to walk any shadow
> pages or hold any locks.

Hi Xiao,

- Where is the generation number increased?
- Should use spinlock breakable code in kvm_mmu_zap_mmio_sptes()
(picture guest with 512GB of RAM, even walking all those pages is
expensive) (ah, patch to remove kvm_mmu_zap_mmio_sptes does that).
- Is -13 enough to test wraparound? It's highly likely the guest has
not begun executing by the time 13 kvm_set_memory calls are made
(so no sptes around). Perhaps -2000 is more sensible (should confirm
though).
- Why remove "if (change == KVM_MR_CREATE) || (change
==  KVM_MR_MOVE)" from kvm_arch_commit_memory_region?
It's instructive.

Otherwise looks good.



Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-10 Thread Gleb Natapov
On Mon, Jun 10, 2013 at 10:43:52PM +0900, Takuya Yoshikawa wrote:
> On Mon, 10 Jun 2013 16:39:37 +0800
> Xiao Guangrong  wrote:
> 
> > On 06/10/2013 03:56 PM, Gleb Natapov wrote:
> > > On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:
> 
> > > Looks good to me, but doesn't this obsolete kvm_mmu_zap_mmio_sptes() and
> > > sp->mmio_cached, so they should be removed as part of the patch series?
> > 
> > Yes, I agree, they should be removed. :)
> 
> I'm fine with removing it but please make it clear that you all agree
> on the same basis.
> 
> Last time, Paolo mentioned the possibility to use some bits of spte for
> other things.  The suggestion there was to keep sp->mmio_cached code
> for the time we would need to reduce the bits for generation numbers.
> 
> Do you think that zap_all() is now preemptible and can handle the
> situation as well as the current kvm_mmu_zap_mmio_sptes()?
> 
> One downside is the need to zap unrelated shadow pages, but if this case
> is really very rare, yes I agree, it should not be a problem: it depends
> on how many bits we can use.
> 
> Just please reconfirm.
> 
It was me who mentioned the possibility of using some bits of the spte
for other things. But for now I have a use for only one bit. Now that you
have reminded me about that discussion I am not so sure we want to
remove kvm_mmu_zap_mmio_sptes(), but on the other hand it is
non-preemptable, so a large number of mmio sptes can cause soft lockups.
zap_all() is better in this regard now.

--
Gleb.


Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-10 Thread Takuya Yoshikawa
On Mon, 10 Jun 2013 16:39:37 +0800
Xiao Guangrong  wrote:

> On 06/10/2013 03:56 PM, Gleb Natapov wrote:
> > On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:

> > Looks good to me, but doesn't this obsolete kvm_mmu_zap_mmio_sptes() and
> > sp->mmio_cached, so they should be removed as part of the patch series?
> 
> Yes, I agree, they should be removed. :)

I'm fine with removing it but please make it clear that you all agree
on the same basis.

Last time, Paolo mentioned the possibility to use some bits of spte for
other things.  The suggestion there was to keep sp->mmio_cached code
for the time we would need to reduce the bits for generation numbers.

Do you think that zap_all() is now preemptible and can handle the
situation as well as the current kvm_mmu_zap_mmio_sptes()?

One downside is the need to zap unrelated shadow pages, but if this case
is really very rare, yes I agree, it should not be a problem: it depends
on how many bits we can use.

Just please reconfirm.

Takuya

> 
> Here is a patch that does these things:
> 
> From bc1bc36e2640059f06c4860af802ecc74e1f3d2d Mon Sep 17 00:00:00 2001
> From: Xiao Guangrong 
> Date: Mon, 10 Jun 2013 16:28:55 +0800
> Subject: [PATCH 7/6] KVM: MMU: drop kvm_mmu_zap_mmio_sptes
> 
> Drop kvm_mmu_zap_mmio_sptes and use kvm_mmu_invalidate_zap_all_pages
> instead to handle mmio generation number overflow
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  arch/x86/include/asm/kvm_host.h |  1 -
>  arch/x86/kvm/mmu.c  | 22 +-
>  2 files changed, 1 insertion(+), 22 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 90d05ed..966f265 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -230,7 +230,6 @@ struct kvm_mmu_page {
>  #endif
> 
>   int write_flooding_count;
> - bool mmio_cached;
>  };
> 
>  struct kvm_pio_request {
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 35cd0b6..c87b19d 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -246,13 +246,11 @@ static unsigned int kvm_current_mmio_generation(struct 
> kvm *kvm)
>  static void mark_mmio_spte(struct kvm *kvm, u64 *sptep, u64 gfn,
>  unsigned access)
>  {
> - struct kvm_mmu_page *sp =  page_header(__pa(sptep));
>   unsigned int gen = kvm_current_mmio_generation(kvm);
>   u64 mask = generation_mmio_spte_mask(gen);
> 
>   access &= ACC_WRITE_MASK | ACC_USER_MASK;
>   mask |= shadow_mmio_mask | access | gfn << PAGE_SHIFT;
> - sp->mmio_cached = true;
> 
>   trace_mark_mmio_spte(sptep, gfn, access, gen);
>   mmu_spte_set(sptep, mask);
> @@ -4362,24 +4360,6 @@ void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm)
>   spin_unlock(&kvm->mmu_lock);
>  }
> 
> -static void kvm_mmu_zap_mmio_sptes(struct kvm *kvm)
> -{
> - struct kvm_mmu_page *sp, *node;
> - LIST_HEAD(invalid_list);
> -
> - spin_lock(&kvm->mmu_lock);
> -restart:
> - list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
> - if (!sp->mmio_cached)
> - continue;
> - if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> - goto restart;
> - }
> -
> - kvm_mmu_commit_zap_page(kvm, &invalid_list);
> - spin_unlock(&kvm->mmu_lock);
> -}
> -
>  static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
>  {
>   return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
> @@ -4395,7 +4375,7 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
>* when mark memslot invalid.
>*/
>   if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
> - kvm_mmu_zap_mmio_sptes(kvm);
> + kvm_mmu_invalidate_zap_all_pages(kvm);
>  }
> 
>  static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
> -- 
> 1.8.1.4
> 
> 


-- 
Takuya Yoshikawa 


Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-10 Thread Xiao Guangrong
On 06/10/2013 03:56 PM, Gleb Natapov wrote:
> On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:
>> Changelog:
>> V3:
>>   All of these changes are from Gleb's review:
>>   1) rename RET_MMIO_PF_EMU to RET_MMIO_PF_EMULATE.
>>   2) smartly adjust kvm generation number in kvm_current_mmio_generation()
>>  to avoid kvm_memslots->generation overflow.
>>
>> V2:
>>   - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
>>   - use kvm->memslots->generation as kvm global generation-number
>>   - fix comment and codestyle
>>   - init kvm generation close to mmio wrap-around value
>>   - keep kvm_mmu_zap_mmio_sptes
>>
>> The current way holds the hot mmu-lock and walks all shadow pages; this
>> does not scale. This patchset introduces a very simple and scalable way
>> to fast-invalidate all mmio sptes - it does not need to walk any shadow
>> pages or hold any locks.
>>
>> The idea is simple:
>> KVM maintains a global mmio valid generation-number, stored in
>> kvm->memslots.generation, and every mmio spte stores the current global
>> generation-number in its available bits when it is created.
>>
>> When KVM needs to zap all mmio sptes, it simply increases the global
>> generation-number. When a guest does an mmio access, KVM intercepts the
>> MMIO #PF, walks the shadow page table and gets the mmio spte. If the
>> generation-number on the spte does not equal the global generation-number,
>> it goes to the normal #PF handler to update the mmio spte.
>>
>> Since 19 bits are used to store the generation-number in the mmio spte,
>> we zap all mmio sptes when the number wraps around.
>>
> Looks good to me, but doesn't this obsolete kvm_mmu_zap_mmio_sptes() and
> sp->mmio_cached, so they should be removed as part of the patch series?

Yes, I agree, they should be removed. :)

Here is a patch that does these things:

From bc1bc36e2640059f06c4860af802ecc74e1f3d2d Mon Sep 17 00:00:00 2001
From: Xiao Guangrong 
Date: Mon, 10 Jun 2013 16:28:55 +0800
Subject: [PATCH 7/6] KVM: MMU: drop kvm_mmu_zap_mmio_sptes

Drop kvm_mmu_zap_mmio_sptes and use kvm_mmu_invalidate_zap_all_pages
instead to handle mmio generation number overflow

Signed-off-by: Xiao Guangrong 
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/mmu.c  | 22 +-
 2 files changed, 1 insertion(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 90d05ed..966f265 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -230,7 +230,6 @@ struct kvm_mmu_page {
 #endif

int write_flooding_count;
-   bool mmio_cached;
 };

 struct kvm_pio_request {
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 35cd0b6..c87b19d 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -246,13 +246,11 @@ static unsigned int kvm_current_mmio_generation(struct 
kvm *kvm)
 static void mark_mmio_spte(struct kvm *kvm, u64 *sptep, u64 gfn,
   unsigned access)
 {
-   struct kvm_mmu_page *sp =  page_header(__pa(sptep));
unsigned int gen = kvm_current_mmio_generation(kvm);
u64 mask = generation_mmio_spte_mask(gen);

access &= ACC_WRITE_MASK | ACC_USER_MASK;
mask |= shadow_mmio_mask | access | gfn << PAGE_SHIFT;
-   sp->mmio_cached = true;

trace_mark_mmio_spte(sptep, gfn, access, gen);
mmu_spte_set(sptep, mask);
@@ -4362,24 +4360,6 @@ void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm)
spin_unlock(&kvm->mmu_lock);
 }

-static void kvm_mmu_zap_mmio_sptes(struct kvm *kvm)
-{
-   struct kvm_mmu_page *sp, *node;
-   LIST_HEAD(invalid_list);
-
-   spin_lock(&kvm->mmu_lock);
-restart:
-   list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
-   if (!sp->mmio_cached)
-   continue;
-   if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
-   goto restart;
-   }
-
-   kvm_mmu_commit_zap_page(kvm, &invalid_list);
-   spin_unlock(&kvm->mmu_lock);
-}
-
 static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
 {
return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
@@ -4395,7 +4375,7 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
 * when mark memslot invalid.
 */
if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
-   kvm_mmu_zap_mmio_sptes(kvm);
+   kvm_mmu_invalidate_zap_all_pages(kvm);
 }

 static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
-- 
1.8.1.4


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-10 Thread Gleb Natapov
On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:
> Changelog:
> V3:
>   All of these changes are from Gleb's review:
>   1) rename RET_MMIO_PF_EMU to RET_MMIO_PF_EMULATE.
>   2) smartly adjust kvm generation number in kvm_current_mmio_generatio()
>  to avoid kvm_memslots->generation overflow.
> 
> V2:
>   - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
>   - use kvm->memslots->generation as kvm global generation-number
>   - fix comment and codestyle
>   - init kvm generation close to mmio wrap-around value
>   - keep kvm_mmu_zap_mmio_sptes
> 
> The current way holds the hot mmu-lock and walks all shadow pages, which
> does not scale. This patchset introduces a very simple and scalable way to
> fast-invalidate all mmio sptes - it does not need to walk any shadow pages
> or hold any locks.
> 
> The idea is simple:
> KVM maintains a global mmio valid generation-number which is stored in
> kvm->memslots.generation, and every mmio spte stores the current global
> generation-number into its available bits when it is created.
> 
> When KVM needs to zap all mmio sptes, it simply increases the global
> generation-number. When a guest does an mmio access, KVM intercepts the
> MMIO #PF, walks the shadow page table and gets the mmio spte. If the
> generation-number on the spte does not equal the global generation-number,
> it goes to the normal #PF handler to update the mmio spte.
> 
> Since 19 bits are used to store the generation-number on the mmio spte, we
> zap all mmio sptes when the number wraps around.
> 
Looks good to me, but doesn't this obsolete kvm_mmu_zap_mmio_sptes() and
sp->mmio_cached, so they should be removed as part of the patch series?

> Xiao Guangrong (6):
>   KVM: MMU: retain more available bits on mmio spte
>   KVM: MMU: store generation-number into mmio spte
>   KVM: MMU: make return value of mmio page fault handler more readable
>   KVM: MMU: fast invalidate all mmio sptes
>   KVM: MMU: add tracepoint for check_mmio_spte
>   KVM: MMU: init kvm generation close to mmio wrap-around value
> 
>  arch/x86/include/asm/kvm_host.h |   2 +-
>  arch/x86/kvm/mmu.c  | 131 
> 
>  arch/x86/kvm/mmu.h  |  17 ++
>  arch/x86/kvm/mmutrace.h |  34 +--
>  arch/x86/kvm/paging_tmpl.h  |  10 ++-
>  arch/x86/kvm/vmx.c  |  12 ++--
>  arch/x86/kvm/x86.c  |  11 +++-
>  7 files changed, 177 insertions(+), 40 deletions(-)
> 
> -- 
> 1.8.1.4

--
Gleb.


[PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-07 Thread Xiao Guangrong
Changelog:
V3:
  All of these changes are from Gleb's review:
  1) rename RET_MMIO_PF_EMU to RET_MMIO_PF_EMULATE.
  2) smartly adjust kvm generation number in kvm_current_mmio_generatio()
 to avoid kvm_memslots->generation overflow.

V2:
  - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
  - use kvm->memslots->generation as kvm global generation-number
  - fix comment and codestyle
  - init kvm generation close to mmio wrap-around value
  - keep kvm_mmu_zap_mmio_sptes

The current way holds the hot mmu-lock and walks all shadow pages, which
does not scale. This patchset introduces a very simple and scalable way to
fast-invalidate all mmio sptes - it does not need to walk any shadow pages
or hold any locks.

The idea is simple:
KVM maintains a global mmio valid generation-number which is stored in
kvm->memslots.generation, and every mmio spte stores the current global
generation-number into its available bits when it is created.

When KVM needs to zap all mmio sptes, it simply increases the global
generation-number. When a guest does an mmio access, KVM intercepts the
MMIO #PF, walks the shadow page table and gets the mmio spte. If the
generation-number on the spte does not equal the global generation-number,
it goes to the normal #PF handler to update the mmio spte.

Since 19 bits are used to store the generation-number on the mmio spte, we
zap all mmio sptes when the number wraps around.

Xiao Guangrong (6):
  KVM: MMU: retain more available bits on mmio spte
  KVM: MMU: store generation-number into mmio spte
  KVM: MMU: make return value of mmio page fault handler more readable
  KVM: MMU: fast invalidate all mmio sptes
  KVM: MMU: add tracepoint for check_mmio_spte
  KVM: MMU: init kvm generation close to mmio wrap-around value

 arch/x86/include/asm/kvm_host.h |   2 +-
 arch/x86/kvm/mmu.c  | 131 
 arch/x86/kvm/mmu.h  |  17 ++
 arch/x86/kvm/mmutrace.h |  34 +--
 arch/x86/kvm/paging_tmpl.h  |  10 ++-
 arch/x86/kvm/vmx.c  |  12 ++--
 arch/x86/kvm/x86.c  |  11 +++-
 7 files changed, 177 insertions(+), 40 deletions(-)

-- 
1.8.1.4
