Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-23 Thread Gleb Natapov
On Tue, Apr 23, 2013 at 08:19:02AM +0800, Xiao Guangrong wrote:
 On 04/22/2013 05:21 PM, Gleb Natapov wrote:
  On Sun, Apr 21, 2013 at 10:09:29PM +0800, Xiao Guangrong wrote:
  On 04/21/2013 09:03 PM, Gleb Natapov wrote:
  On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
  This patchset is based on my previous two patchsets:
  [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
  (https://lkml.org/lkml/2013/4/1/2)
 
  [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
  (https://lkml.org/lkml/2013/4/1/134)
 
  Changelog:
  V3:
    completely redesign the algorithm, please see below.
 
  This looks pretty complicated. Is it still needed in order to avoid soft
  lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?
 
  Yes.
 
  I discussed this point with Marcelo:
 
  ==
  BTW, to be honest, I do not think spin_needbreak is a good way - it does
  not fix the hot-lock contention and it just occupies more cpu time to avoid
  possible soft lock-ups.
 
  Especially, zap-all-shadow-pages can let other vcpus fault and contend for
  mmu-lock; then zap-all-shadow-pages releases mmu-lock and waits while the
  other vcpus create page tables again. zap-all-shadow-pages needs a long time
  to finish, and in the worst case it can never complete under intensive vcpu
  and memory usage.
 
  So what about mixed approach: use generation numbers and reload roots to
  quickly invalidate all shadow pages and then do kvm_mmu_zap_all_invalid().
  kvm_mmu_zap_all_invalid() is a new function that invalidates only shadow
  pages with stale generation number (and uses lock break technique). It
  may traverse active_mmu_pages from tail to head since new shadow pages
  will be added to the head of the list or it may use invalid slot rmap to
  find exactly what should be invalidated.
 
 I prefer unmapping the invalid rmap instead of zapping stale shadow pages
 in kvm_mmu_zap_all_invalid(); the former is faster.
 
Not sure what you mean here. What is unmapping the invalid rmap?

 This way may help but is not good enough: after reloading the mmu with the new
 generation number, all of the vcpus will fault for a long time, and trying to
 hold mmu-lock is not good even if the lock break technique is used.
If kvm_mmu_zap_all_invalid(slot) zaps only shadow pages that are
reachable from the slot's rmap, as opposed to zapping all invalid
shadow pages, it will have much less work to do. The slots that we
add/remove during hot plug are usually small. To guarantee reasonable
forward progress we can break the lock only after a certain number of
shadow pages have been invalidated. All other invalid shadow pages will be
zapped in make_mmu_pages_available() and the zapping will be spread across
page faults.

 
 I think we can do this step first, then move to unmapping the invalid rmap
 outside of mmu-lock later.
 

--
Gleb.


Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-23 Thread Xiao Guangrong
On 04/23/2013 02:28 PM, Gleb Natapov wrote:
 On Tue, Apr 23, 2013 at 08:19:02AM +0800, Xiao Guangrong wrote:
 On 04/22/2013 05:21 PM, Gleb Natapov wrote:
 On Sun, Apr 21, 2013 at 10:09:29PM +0800, Xiao Guangrong wrote:
 On 04/21/2013 09:03 PM, Gleb Natapov wrote:
 On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
 This patchset is based on my previous two patchsets:
 [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
 (https://lkml.org/lkml/2013/4/1/2)

 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
 (https://lkml.org/lkml/2013/4/1/134)

 Changelog:
 V3:
   completely redesign the algorithm, please see below.

 This looks pretty complicated. Is it still needed in order to avoid soft
 lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?

 Yes.

 I discussed this point with Marcelo:

 ==
 BTW, to be honest, I do not think spin_needbreak is a good way - it does
 not fix the hot-lock contention and it just occupies more cpu time to avoid
 possible soft lock-ups.

 Especially, zap-all-shadow-pages can let other vcpus fault and contend for
 mmu-lock; then zap-all-shadow-pages releases mmu-lock and waits while the
 other vcpus create page tables again. zap-all-shadow-pages needs a long time
 to finish, and in the worst case it can never complete under intensive vcpu
 and memory usage.

 So what about mixed approach: use generation numbers and reload roots to
 quickly invalidate all shadow pages and then do kvm_mmu_zap_all_invalid().
 kvm_mmu_zap_all_invalid() is a new function that invalidates only shadow
 pages with stale generation number (and uses lock break technique). It
 may traverse active_mmu_pages from tail to head since new shadow pages
 will be added to the head of the list or it may use invalid slot rmap to
 find exactly what should be invalidated.

 I prefer unmapping the invalid rmap instead of zapping stale shadow pages
 in kvm_mmu_zap_all_invalid(); the former is faster.

 Not sure what you mean here. What is unmapping the invalid rmap?

It is like you said below:
==
kvm_mmu_zap_all_invalid(slot) will only zap shadow pages that are
reachable from the slot's rmap
==
My suggestion is zapping the sptes that are linked in the slot's rmap.
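As a rough illustration of this spte-level variant: the rmap-walk helper below,
drop_slot_rmap_sptes(), is a made-up name standing in for a kvm_unmap_rmapp()-style
walk over all mapping levels; this is not code from the patchset.

    static void unmap_invalid_slot_rmap(struct kvm *kvm,
                                        struct kvm_memory_slot *slot)
    {
        gfn_t gfn;

        spin_lock(&kvm->mmu_lock);
        for (gfn = slot->base_gfn; gfn < slot->base_gfn + slot->npages; gfn++) {
            /*
             * Hypothetical helper: walk the rmap chains of this gfn at every
             * mapping level and clear each spte found, so later faults rebuild
             * the mappings against the remaining valid slots, instead of
             * zapping whole shadow pages.
             */
            drop_slot_rmap_sptes(kvm, slot, gfn);

            /* same lock-break rule as above to bound mmu_lock hold time */
            if (need_resched() || spin_needbreak(&kvm->mmu_lock))
                cond_resched_lock(&kvm->mmu_lock);
        }
        spin_unlock(&kvm->mmu_lock);
    }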

 
 This way may help but is not good enough: after reloading the mmu with the new
 generation number, all of the vcpus will fault for a long time, and trying to
 hold mmu-lock is not good even if the lock break technique is used.
 If kvm_mmu_zap_all_invalid(slot) zaps only shadow pages that are
 reachable from the slot's rmap, as opposed to zapping all invalid
 shadow pages, it will have much less work to do. The slots that we
 add/remove during hot plug are usually small. To guarantee reasonable
 forward progress we can break the lock only after a certain number of
 shadow pages have been invalidated. All other invalid shadow pages will be
 zapped in make_mmu_pages_available() and the zapping will be spread across
 page faults.

Not interested in hot-remove memory?

BTW, could you please review my previous patchsets and apply them if they
look ok? ;)

[PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
(https://lkml.org/lkml/2013/4/1/2)

[PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
(https://lkml.org/lkml/2013/4/1/134)

Thanks!




Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-23 Thread Gleb Natapov
On Tue, Apr 23, 2013 at 03:20:28PM +0800, Xiao Guangrong wrote:
 On 04/23/2013 02:28 PM, Gleb Natapov wrote:
  On Tue, Apr 23, 2013 at 08:19:02AM +0800, Xiao Guangrong wrote:
  On 04/22/2013 05:21 PM, Gleb Natapov wrote:
  On Sun, Apr 21, 2013 at 10:09:29PM +0800, Xiao Guangrong wrote:
  On 04/21/2013 09:03 PM, Gleb Natapov wrote:
  On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
  This patchset is based on my previous two patchsets:
  [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
  (https://lkml.org/lkml/2013/4/1/2)
 
  [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
  (https://lkml.org/lkml/2013/4/1/134)
 
  Changelog:
  V3:
    completely redesign the algorithm, please see below.
 
  This looks pretty complicated. Is it still needed in order to avoid soft
  lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?
 
  Yes.
 
  I discussed this point with Marcelo:
 
  ==
  BTW, to be honest, I do not think spin_needbreak is a good way - it does
  not fix the hot-lock contention and it just occupies more cpu time to avoid
  possible soft lock-ups.
 
  Especially, zap-all-shadow-pages can let other vcpus fault and contend for
  mmu-lock; then zap-all-shadow-pages releases mmu-lock and waits while the
  other vcpus create page tables again. zap-all-shadow-pages needs a long time
  to finish, and in the worst case it can never complete under intensive vcpu
  and memory usage.
 
  So what about mixed approach: use generation numbers and reload roots to
  quickly invalidate all shadow pages and then do kvm_mmu_zap_all_invalid().
  kvm_mmu_zap_all_invalid() is a new function that invalidates only shadow
  pages with stale generation number (and uses lock break technique). It
  may traverse active_mmu_pages from tail to head since new shadow pages
  will be added to the head of the list or it may use invalid slot rmap to
  find exactly what should be invalidated.
 
  I prefer unmapping the invalid rmap instead of zapping stale shadow pages
  in kvm_mmu_zap_all_invalid(); the former is faster.
 
  Not sure what you mean here. What is unmapping the invalid rmap?
 
 It is like you said below:
 ==
 kvm_mmu_zap_all_invalid(slot) will only zap shadow pages that are
 reachable from the slot's rmap
 ==
 My suggestion is zapping the sptes that are linked in the slot's rmap.
 
OK, so we are on the same page.

  
  This way may help but is not good enough: after reloading the mmu with the new
  generation number, all of the vcpus will fault for a long time, and trying to
  hold mmu-lock is not good even if the lock break technique is used.
  If kvm_mmu_zap_all_invalid(slot) zaps only shadow pages that are
  reachable from the slot's rmap, as opposed to zapping all invalid
  shadow pages, it will have much less work to do. The slots that we
  add/remove during hot plug are usually small. To guarantee reasonable
  forward progress we can break the lock only after a certain number of
  shadow pages have been invalidated. All other invalid shadow pages will be
  zapped in make_mmu_pages_available() and the zapping will be spread across
  page faults.
 
 Not interested in hot-remove memory?
 
I am, good point. Still, I think that with guaranteed forward progress the
slot removal time should be bounded by something reasonable. At least we
should have evidence of the contrary before optimizing for it. Hot memory
removal is not instantaneous from the guest's point of view either; the
guest needs to move memory around to make it possible.

 BTW, could you please review my previous patchsets and apply them if they
 look ok? ;)
 
I need Marcelo's acks on them too :)

 [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
 (https://lkml.org/lkml/2013/4/1/2)
 
But you yourself said that with this patch slot removal may never
complete under a high memory usage workload.

 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
 (https://lkml.org/lkml/2013/4/1/134)
 
Missed this one. Will review.

--
Gleb.


Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-22 Thread Gleb Natapov
On Sun, Apr 21, 2013 at 10:09:29PM +0800, Xiao Guangrong wrote:
 On 04/21/2013 09:03 PM, Gleb Natapov wrote:
  On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
   This patchset is based on my previous two patchsets:
  [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
  (https://lkml.org/lkml/2013/4/1/2)
 
  [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
  (https://lkml.org/lkml/2013/4/1/134)
 
  Changelog:
  V3:
    completely redesign the algorithm, please see below.
  
   This looks pretty complicated. Is it still needed in order to avoid soft
   lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?
 
 Yes.
 
 I discussed this point with Marcelo:
 
 ==
  BTW, to be honest, I do not think spin_needbreak is a good way - it does
  not fix the hot-lock contention and it just occupies more cpu time to avoid
  possible soft lock-ups.
  
  Especially, zap-all-shadow-pages can let other vcpus fault and contend for
  mmu-lock; then zap-all-shadow-pages releases mmu-lock and waits while the
  other vcpus create page tables again. zap-all-shadow-pages needs a long time
  to finish, and in the worst case it can never complete under intensive vcpu
  and memory usage.
 
So what about mixed approach: use generation numbers and reload roots to
quickly invalidate all shadow pages and then do kvm_mmu_zap_all_invalid().
kvm_mmu_zap_all_invalid() is a new function that invalidates only shadow
pages with stale generation number (and uses lock break technique). It
may traverse active_mmu_pages from tail to head since new shadow pages
will be added to the head of the list or it may use invalid slot rmap to
find exactly what should be invalidated.

 I still think the right way to fix this kind of thing is optimization for
 mmu-lock.
 ==
 
  Which parts scare you? Let's find a way to optimize for it. ;) For example,
  if you do not like unmap_memslot_rmap_nolock(), we can simplify it - we can
  use walk_shadow_page_lockless_begin() and walk_shadow_page_lockless_end() to
  protect the sptes instead of kvm->being_unmaped_rmap.
 

kvm->being_unmaped_rmap is particularly tricky, although it looks
correct. The additional indirection with rmap ops also does not help following
the code. I'd rather have if(slot is invalid) in a couple of places where
things should be done differently. In most places it will be WARN_ON(slot
is invalid).
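A tiny sketch of the structure being suggested; the function names here are
invented for the example, only the KVM_MEMSLOT_INVALID flag is real KVM code.

    /* the couple of paths that legitimately see a dying slot handle it explicitly */
    static void slot_handle_rmap_example(struct kvm *kvm, struct kvm_memory_slot *slot)
    {
        if (slot->flags & KVM_MEMSLOT_INVALID) {
            /* the slot is being deleted: only tear its sptes down */
            return;
        }
        /* ... ordinary rmap maintenance for a live slot ... */
    }

    /* every other path simply asserts that it never runs on an invalid slot */
    static void slot_write_protect_example(struct kvm *kvm, struct kvm_memory_slot *slot)
    {
        WARN_ON(slot->flags & KVM_MEMSLOT_INVALID);
        /* ... */
    }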

--
Gleb.


Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-22 Thread Gleb Natapov
On Sun, Apr 21, 2013 at 12:35:08PM -0300, Marcelo Tosatti wrote:
 On Sun, Apr 21, 2013 at 12:27:51PM -0300, Marcelo Tosatti wrote:
  On Sun, Apr 21, 2013 at 04:03:46PM +0300, Gleb Natapov wrote:
   On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
This patchset is based on my previous two patchsets:
[PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
(https://lkml.org/lkml/2013/4/1/2)

[PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
(https://lkml.org/lkml/2013/4/1/134)

Changelog:
V3:
  completely redesign the algorithm, please see below.

    This looks pretty complicated. Is it still needed in order to avoid soft
    lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?
  
  Do not want kvm_set_memory (cases: DELETE/MOVE/CREATES) to be
   susceptible to:
  
   vcpu 1               |   kvm_set_memory
   create shadow page   |
                        |   nuke shadow page
   create shadow page   |
                        |   nuke shadow page
   
   Which is guest-triggerable behavior with the spinlock preemption algorithm.
 
  Not only guest-triggerable in the sense of a malicious guest: the
  condition above can also be induced by host workload with a non-malicious
  guest system.
 
Is the problem that newly created shadow pages are immediately zapped?
Shouldn't the generation number/kvm_mmu_zap_all_invalid() idea described here
(https://lkml.org/lkml/2013/4/22/111) solve this?

 Also kvm_set_memory being relatively fast with huge memory guests
 is nice (which is what Xiao's idea allows).
 

--
Gleb.


Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-22 Thread Takuya Yoshikawa
On Mon, 22 Apr 2013 15:39:38 +0300
Gleb Natapov g...@redhat.com wrote:

   Do not want kvm_set_memory (cases: DELETE/MOVE/CREATES) to be
    susceptible to:
   
    vcpu 1               |   kvm_set_memory
    create shadow page   |
                         |   nuke shadow page
    create shadow page   |
                         |   nuke shadow page
    
    Which is guest-triggerable behavior with the spinlock preemption algorithm.
  
   Not only guest-triggerable in the sense of a malicious guest: the
   condition above can also be induced by host workload with a non-malicious
   guest system.
  
  Is the problem that newly created shadow pages are immediately zapped?
  Shouldn't the generation number/kvm_mmu_zap_all_invalid() idea described here
  (https://lkml.org/lkml/2013/4/22/111) solve this?

I guess so.  That's what Avi described when he tried to achieve
lockless TLB flushes.  Mixing that idea with Xiao's approach will
achieve reasonably nice performance, I think.

Various improvements should be added later on top of that if needed.

  Also kvm_set_memory being relatively fast with huge memory guests
   is nice (which is what Xiao's idea allows).

I agree with this point.  But if so, it should be actually measured on
such guests, even if the algorithm looks promising.

Takuya


Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-22 Thread Xiao Guangrong
On 04/22/2013 05:21 PM, Gleb Natapov wrote:
 On Sun, Apr 21, 2013 at 10:09:29PM +0800, Xiao Guangrong wrote:
 On 04/21/2013 09:03 PM, Gleb Natapov wrote:
 On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
 This patchset is based on my previous two patchsets:
 [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
 (https://lkml.org/lkml/2013/4/1/2)

 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
 (https://lkml.org/lkml/2013/4/1/134)

 Changelog:
 V3:
   completely redesign the algorithm, please see below.

 This looks pretty complicated. Is it still needed in order to avoid soft
 lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?

 Yes.

 I discussed this point with Marcelo:

 ==
 BTW, to be honest, I do not think spin_needbreak is a good way - it does
 not fix the hot-lock contention and it just occupies more cpu time to avoid
 possible soft lock-ups.

 Especially, zap-all-shadow-pages can let other vcpus fault and contend for
 mmu-lock; then zap-all-shadow-pages releases mmu-lock and waits while the
 other vcpus create page tables again. zap-all-shadow-pages needs a long time
 to finish, and in the worst case it can never complete under intensive vcpu
 and memory usage.

 So what about mixed approach: use generation numbers and reload roots to
 quickly invalidate all shadow pages and then do kvm_mmu_zap_all_invalid().
 kvm_mmu_zap_all_invalid() is a new function that invalidates only shadow
 pages with stale generation number (and uses lock break technique). It
 may traverse active_mmu_pages from tail to head since new shadow pages
 will be added to the head of the list or it may use invalid slot rmap to
 find exactly what should be invalidated.

I prefer unmapping the invalid rmap instead of zapping stale shadow pages
in kvm_mmu_zap_all_invalid(); the former is faster.

This way may help but is not good enough: after reloading the mmu with the new
generation number, all of the vcpus will fault for a long time, and trying to
hold mmu-lock is not good even if the lock break technique is used.

I think we can do this step first, then move to unmapping the invalid rmap
outside of mmu-lock later.

 
 I still think the right way to fix this kind of thing is optimization for
 mmu-lock.
 ==

 Which parts scare you? Let's find a way to optimize for it. ;) For example,
 if you do not like unmap_memslot_rmap_nolock(), we can simplify it - we can
 use walk_shadow_page_lockless_begin() and walk_shadow_page_lockless_end() to
 protect the sptes instead of kvm->being_unmaped_rmap.

 
 kvm->being_unmaped_rmap is particularly tricky, although it looks

Okay. Will use walk_shadow_page_lockless_begin() and
walk_shadow_page_lockless_end() instead.

 correct. The additional indirection with rmap ops also does not help following
 the code. I'd rather have if(slot is invalid) in a couple of places where
 things should be done differently. In most places it will be WARN_ON(slot
 is invalid).

Less change, good to me, will do. ;)

Thanks!



Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-22 Thread Marcelo Tosatti
On Mon, Apr 22, 2013 at 10:45:53PM +0900, Takuya Yoshikawa wrote:
 On Mon, 22 Apr 2013 15:39:38 +0300
 Gleb Natapov g...@redhat.com wrote:
 
Do not want kvm_set_memory (cases: DELETE/MOVE/CREATES) to be
 susceptible to:

 vcpu 1               |   kvm_set_memory
 create shadow page   |
                      |   nuke shadow page
 create shadow page   |
                      |   nuke shadow page
 
 Which is guest-triggerable behavior with the spinlock preemption algorithm.
   
    Not only guest-triggerable in the sense of a malicious guest: the
    condition above can also be induced by host workload with a non-malicious
    guest system.
   
   Is the problem that newly created shadow pages are immediately zapped?
   Shouldn't the generation number/kvm_mmu_zap_all_invalid() idea described here
   (https://lkml.org/lkml/2013/4/22/111) solve this?
 
 I guess so.  That's what Avi described when he tried to achieve
 lockless TLB flushes.  Mixing that idea with Xiao's approach will
 achieve reasonably nice performance, I think.

Yes.

 Various improvements should be added later on top of that if needed.
 
   Also kvm_set_memory being relatively fast with huge memory guests
    is nice (which is what Xiao's idea allows).
 
 I agree with this point.  But if so, it should be actually measured on
 such guests, even if the algorithm looks promising.

Works for me.



Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-21 Thread Gleb Natapov
On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
 This patchset is based on my previous two patchsets:
 [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
 (https://lkml.org/lkml/2013/4/1/2)
 
 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
 (https://lkml.org/lkml/2013/4/1/134)
 
 Changelog:
 V3:
   completely redesign the algorithm, please see below.
 
This looks pretty complicated. Is it still needed in order to avoid soft
lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?

--
Gleb.


Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-21 Thread Xiao Guangrong
On 04/21/2013 09:03 PM, Gleb Natapov wrote:
 On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
 This patchset is based on my previous two patchsets:
 [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
 (https://lkml.org/lkml/2013/4/1/2)

 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
 (https://lkml.org/lkml/2013/4/1/134)

 Changelog:
 V3:
   completely redesign the algorithm, please see below.

 This looks pretty complicated. Is it still needed in order to avoid soft
 lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?

Yes.

I discussed this point with Marcelo:

==
BTW, to be honest, I do not think spin_needbreak is a good way - it does
not fix the hot-lock contention and it just occupies more cpu time to avoid
possible soft lock-ups.

Especially, zap-all-shadow-pages can let other vcpus fault and contend for
mmu-lock; then zap-all-shadow-pages releases mmu-lock and waits while the
other vcpus create page tables again. zap-all-shadow-pages needs a long time
to finish, and in the worst case it can never complete under intensive vcpu
and memory usage.

I still think the right way to fix this kind of thing is optimization for
mmu-lock.
==

Which parts scare you? Let's find a way to optimize for it. ;) For example,
if you do not like unmap_memslot_rmap_nolock(), we can simplify it - we can
use walk_shadow_page_lockless_begin() and walk_shadow_page_lockless_end() to
protect the sptes instead of kvm->being_unmaped_rmap.

Thanks!
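For reference, the lockless-walk pair is already used in the mmu code roughly
like this (simplified sketch modeled on walk_shadow_page_get_mmio_spte(); it
is not code from this patchset):

    static u64 read_last_spte_lockless(struct kvm_vcpu *vcpu, u64 addr)
    {
        struct kvm_shadow_walk_iterator iterator;
        u64 spte = 0ull;

        /*
         * begin/end mark the vcpu as walking shadow page tables, so the
         * zapping side, which frees shadow pages only after forcing a remote
         * TLB flush, cannot free the pages (and sptes) touched in between.
         */
        walk_shadow_page_lockless_begin(vcpu);
        for_each_shadow_entry_lockless(vcpu, addr, iterator, spte)
            if (!is_shadow_present_pte(spte))
                break;
        walk_shadow_page_lockless_end(vcpu);

        return spte;
    }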



Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-21 Thread Marcelo Tosatti
On Sun, Apr 21, 2013 at 04:03:46PM +0300, Gleb Natapov wrote:
 On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
   This patchset is based on my previous two patchsets:
  [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
  (https://lkml.org/lkml/2013/4/1/2)
  
  [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
  (https://lkml.org/lkml/2013/4/1/134)
  
  Changelog:
  V3:
    completely redesign the algorithm, please see below.
  
 This looks pretty complicated. Is it still needed in order to avoid soft
 lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?

Do not want kvm_set_memory (cases: DELETE/MOVE/CREATES) to be
susceptible to:

vcpu 1               |   kvm_set_memory
create shadow page   |
                     |   nuke shadow page
create shadow page   |
                     |   nuke shadow page

Which is guest-triggerable behavior with the spinlock preemption algorithm.



Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-21 Thread Marcelo Tosatti
On Sun, Apr 21, 2013 at 12:27:51PM -0300, Marcelo Tosatti wrote:
 On Sun, Apr 21, 2013 at 04:03:46PM +0300, Gleb Natapov wrote:
  On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
    This patchset is based on my previous two patchsets:
   [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
   (https://lkml.org/lkml/2013/4/1/2)
   
   [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
   (https://lkml.org/lkml/2013/4/1/134)
   
   Changelog:
   V3:
     completely redesign the algorithm, please see below.
   
  This looks pretty complicated. Is it still needed in order to avoid soft
  lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?
 
 Do not want kvm_set_memory (cases: DELETE/MOVE/CREATES) to be
 susceptible to:
 
 vcpu 1               |   kvm_set_memory
 create shadow page   |
                      |   nuke shadow page
 create shadow page   |
                      |   nuke shadow page
 
 Which is guest-triggerable behavior with the spinlock preemption algorithm.

Not only guest-triggerable in the sense of a malicious guest: the
condition above can also be induced by host workload with a non-malicious
guest system.

Also kvm_set_memory being relatively fast with huge memory guests
is nice (which is what Xiao's idea allows).




Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-21 Thread Marcelo Tosatti
On Sun, Apr 21, 2013 at 10:09:29PM +0800, Xiao Guangrong wrote:
 On 04/21/2013 09:03 PM, Gleb Natapov wrote:
  On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
   This patchset is based on my previous two patchsets:
  [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
  (https://lkml.org/lkml/2013/4/1/2)
 
  [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
  (https://lkml.org/lkml/2013/4/1/134)
 
  Changelog:
  V3:
    completely redesign the algorithm, please see below.
 
  This looks pretty complicated. Is it still needed in order to avoid soft
  lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?
 
 Yes.
 
 I discussed this point with Marcelo:
 
 ==
 BTW, to be honest, I do not think spin_needbreak is a good way - it does
 not fix the hot-lock contention and it just occupies more cpu time to avoid
 possible soft lock-ups.
 
 Especially, zap-all-shadow-pages can let other vcpus fault and contend for
 mmu-lock; then zap-all-shadow-pages releases mmu-lock and waits while the
 other vcpus create page tables again. zap-all-shadow-pages needs a long time
 to finish, and in the worst case it can never complete under intensive vcpu
 and memory usage.
 
 I still think the right way to fix this kind of thing is optimization for
 mmu-lock.
 ==
 
 Which parts scare you? Let's find a way to optimize for it. ;) For example,
 if you do not like unmap_memslot_rmap_nolock(), we can simplify it - we can
 use walk_shadow_page_lockless_begin() and walk_shadow_page_lockless_end() to
 protect the sptes instead of kvm->being_unmaped_rmap.
 
 Thanks!

Xiao,

You can just remove all shadow rmaps now that you've agreed per-memslot
flushes are not necessary. Which then gets rid of the necessity for lockless
rmap accesses. Right?




Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages

2013-04-21 Thread Xiao Guangrong
On 04/21/2013 11:24 PM, Marcelo Tosatti wrote:
 On Sun, Apr 21, 2013 at 10:09:29PM +0800, Xiao Guangrong wrote:
 On 04/21/2013 09:03 PM, Gleb Natapov wrote:
 On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
 This patchset is based on my previous two patchsets:
 [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
 (https://lkml.org/lkml/2013/4/1/2)

 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
 (https://lkml.org/lkml/2013/4/1/134)

 Changelog:
 V3:
   completely redesign the algorithm, please see below.

 This looks pretty complicated. Is it still needed in order to avoid soft
 lockups after the "avoid potential soft lockup and unneeded mmu reload" patch?

 Yes.

 I discussed this point with Marcelo:

 ==
 BTW, to be honest, I do not think spin_needbreak is a good way - it does
 not fix the hot-lock contention and it just occupies more cpu time to avoid
 possible soft lock-ups.

 Especially, zap-all-shadow-pages can let other vcpus fault and contend for
 mmu-lock; then zap-all-shadow-pages releases mmu-lock and waits while the
 other vcpus create page tables again. zap-all-shadow-pages needs a long time
 to finish, and in the worst case it can never complete under intensive vcpu
 and memory usage.

 I still think the right way to fix this kind of thing is optimization for
 mmu-lock.
 ==

 Which parts scare you? Let's find a way to optimize for it. ;) For example,
 if you do not like unmap_memslot_rmap_nolock(), we can simplify it - we can
 use walk_shadow_page_lockless_begin() and walk_shadow_page_lockless_end() to
 protect the sptes instead of kvm->being_unmaped_rmap.

 Thanks!
 
 Xiao,
 
 You can just remove all shadow rmaps now that you've agreed per-memslot
 flushes are not necessary. Which then gets rid of the necessity for lockless
 rmap accesses. Right?

Hi Marcelo,

I am worried about:

==
We can not release all rmaps. If we do this, ->invalidate_page and
->invalidate_range_start can not find any spte using the host page,
which means the Accessed/Dirty state of the host page is no longer tracked
(kvm_set_pfn_accessed() and kvm_set_pfn_dirty() are not called properly).

[https://lkml.org/lkml/2013/4/18/358]
==

Do you think this is an issue? What's your idea?

Thanks!
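To make the worry concrete: when a notifier unmaps a host page, the rmap is
what lets KVM find the spte and propagate its Accessed/Dirty bits back to the
host page, roughly as mmu_spte_clear_track_bits() does today. The sketch below
is simplified; the pfn type and the exact mask checks are approximations, not
the real function.

    static void drop_spte_track_bits_example(u64 *sptep)
    {
        u64 old_spte = *sptep;
        pfn_t pfn = spte_to_pfn(old_spte);

        *sptep = 0ull;  /* the real code clears the spte atomically */

        /*
         * Without an rmap entry pointing at sptep, the mmu notifier could
         * never reach this spte, and these updates would be lost.
         */
        if (old_spte & shadow_accessed_mask)
            kvm_set_pfn_accessed(pfn);
        if (old_spte & shadow_dirty_mask)
            kvm_set_pfn_dirty(pfn);
    }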
