Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU

2020-10-07 Thread Paolo Bonzini
On 07/10/20 19:28, Ben Gardon wrote:
>> No, that would be just another way to write the same thing.  That said,
>> making the iteration API more complicated also has disadvantages because
>> if get a Cartesian explosion of changes.
> I wouldn't be too worried about that. The only things I ever found
> worth making an iterator case for were:
> Every SPTE
> Every present SPTE
> Every present leaf SPTE

* (vcpu, root) * (all levels, large only)

We only need a small subset of these, but the naming would be more
complex at least.

Paolo



Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU

2020-10-07 Thread Ben Gardon
On Wed, Oct 7, 2020 at 10:21 AM Paolo Bonzini  wrote:
>
> On 07/10/20 18:30, Ben Gardon wrote:
> >> I'm starting to wonder if another iterator like
> >> for_each_tdp_leaf_pte_root would be clearer, since this idiom repeats
> >> itself quite often.  The tdp_iter_next_leaf function would be easily
> >> implemented as
> >>
> >> while (likely(iter->valid) &&
> >>(!is_shadow_present_pte(iter.old_spte) ||
> >> is_last_spte(iter.old_spte, iter.level))
> >> tdp_iter_next(iter);
> > Do you see a substantial efficiency difference between adding a
> > tdp_iter_next_leaf and building on for_each_tdp_pte_using_root with
> > something like:
> >
> > #define for_each_tdp_leaf_pte_using_root(_iter, _root, _start, _end)\
> > for_each_tdp_pte_using_root(_iter, _root, _start, _end) \
> > if (!is_shadow_present_pte(_iter.old_spte) ||   \
> > !is_last_spte(_iter.old_spte, _iter.level)) \
> > continue;   \
> > else
> >
> > I agree that putting those checks in a wrapper makes the code more concise.
> >
>
> No, that would be just another way to write the same thing.  That said,
> making the iteration API more complicated also has disadvantages because
> if get a Cartesian explosion of changes.

I wouldn't be too worried about that. The only things I ever found
worth making an iterator case for were:
Every SPTE
Every present SPTE
Every present leaf SPTE

And really there aren't many cases that use the middle one.

>
> Regarding the naming, I'm leaning towards
>
> tdp_root_for_each_pte
> tdp_vcpu_for_each_pte
>
> which is shorter than the version with "using" and still clarifies that
> "root" and "vcpu" are the thing that the iteration works on.

That sounds good to me. I agree it's similarly clear.

>
> Paolo
>


Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU

2020-10-07 Thread Paolo Bonzini
On 07/10/20 18:30, Ben Gardon wrote:
>> I'm starting to wonder if another iterator like
>> for_each_tdp_leaf_pte_root would be clearer, since this idiom repeats
>> itself quite often.  The tdp_iter_next_leaf function would be easily
>> implemented as
>>
>> while (likely(iter->valid) &&
>>(!is_shadow_present_pte(iter.old_spte) ||
>> is_last_spte(iter.old_spte, iter.level))
>> tdp_iter_next(iter);
> Do you see a substantial efficiency difference between adding a
> tdp_iter_next_leaf and building on for_each_tdp_pte_using_root with
> something like:
> 
> #define for_each_tdp_leaf_pte_using_root(_iter, _root, _start, _end)\
> for_each_tdp_pte_using_root(_iter, _root, _start, _end) \
> if (!is_shadow_present_pte(_iter.old_spte) ||   \
> !is_last_spte(_iter.old_spte, _iter.level)) \
> continue;   \
> else
> 
> I agree that putting those checks in a wrapper makes the code more concise.
> 

No, that would be just another way to write the same thing.  That said,
making the iteration API more complicated also has disadvantages because
if get a Cartesian explosion of changes.

Regarding the naming, I'm leaning towards

tdp_root_for_each_pte
tdp_vcpu_for_each_pte

which is shorter than the version with "using" and still clarifies that
"root" and "vcpu" are the thing that the iteration works on.

Paolo



Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU

2020-10-07 Thread Ben Gardon
On Fri, Sep 25, 2020 at 6:09 PM Paolo Bonzini  wrote:
>
> On 25/09/20 23:22, Ben Gardon wrote:
> > + for_each_tdp_pte_root(iter, root, start, end) {
> > + if (!is_shadow_present_pte(iter.old_spte) ||
> > + is_last_spte(iter.old_spte, iter.level))
> > + continue;
> > +
>
> I'm starting to wonder if another iterator like
> for_each_tdp_leaf_pte_root would be clearer, since this idiom repeats
> itself quite often.  The tdp_iter_next_leaf function would be easily
> implemented as
>
> while (likely(iter->valid) &&
>(!is_shadow_present_pte(iter.old_spte) ||
> is_last_spte(iter.old_spte, iter.level))
> tdp_iter_next(iter);

Do you see a substantial efficiency difference between adding a
tdp_iter_next_leaf and building on for_each_tdp_pte_using_root with
something like:

#define for_each_tdp_leaf_pte_using_root(_iter, _root, _start, _end)\
for_each_tdp_pte_using_root(_iter, _root, _start, _end) \
if (!is_shadow_present_pte(_iter.old_spte) ||   \
!is_last_spte(_iter.old_spte, _iter.level)) \
continue;   \
else

I agree that putting those checks in a wrapper makes the code more concise.

>
> Paolo
>


Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU

2020-09-25 Thread Paolo Bonzini
On 25/09/20 23:22, Ben Gardon wrote:
> + for_each_tdp_pte_root(iter, root, start, end) {
> + if (!is_shadow_present_pte(iter.old_spte) ||
> + is_last_spte(iter.old_spte, iter.level))
> + continue;
> +

I'm starting to wonder if another iterator like
for_each_tdp_leaf_pte_root would be clearer, since this idiom repeats
itself quite often.  The tdp_iter_next_leaf function would be easily
implemented as

while (likely(iter->valid) &&
   (!is_shadow_present_pte(iter.old_spte) ||
is_last_spte(iter.old_spte, iter.level))
tdp_iter_next(iter);

Paolo



[PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU

2020-09-25 Thread Ben Gardon
Dirty logging ultimately breaks down MMU mappings to 4k granularity.
When dirty logging is no longer needed, these granaular mappings
represent a useless performance penalty. When dirty logging is disabled,
search the paging structure for mappings that could be re-constituted
into a large page mapping. Zap those mappings so that they can be
faulted in again at a higher mapping level.

Tested by running kvm-unit-tests and KVM selftests on an Intel Haswell
machine. This series introduced no new failures.

This series can be viewed in Gerrit at:
https://linux-review.googlesource.com/c/virt/kvm/kvm/+/2538

Signed-off-by: Ben Gardon 
---
 arch/x86/kvm/mmu/mmu.c |  3 ++
 arch/x86/kvm/mmu/tdp_mmu.c | 62 ++
 arch/x86/kvm/mmu/tdp_mmu.h |  2 ++
 3 files changed, 67 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b9074603f9df1..12892fc4f146d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6025,6 +6025,9 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
spin_lock(&kvm->mmu_lock);
slot_handle_leaf(kvm, (struct kvm_memory_slot *)memslot,
 kvm_mmu_zap_collapsible_spte, true);
+
+   if (kvm->arch.tdp_mmu_enabled)
+   kvm_tdp_mmu_zap_collapsible_sptes(kvm, memslot);
spin_unlock(&kvm->mmu_lock);
 }
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e5cb7f0ec23e8..a2895119655ac 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1099,3 +1099,65 @@ bool kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct 
kvm_memory_slot *slot)
return spte_set;
 }
 
+/*
+ * Clear non-leaf entries (and free associated page tables) which could
+ * be replaced by large mappings, for GFNs within the slot.
+ */
+static void zap_collapsible_spte_range(struct kvm *kvm,
+  struct kvm_mmu_page *root,
+  gfn_t start, gfn_t end)
+{
+   struct tdp_iter iter;
+   kvm_pfn_t pfn;
+   bool spte_set = false;
+   int as_id = kvm_mmu_page_as_id(root);
+
+   for_each_tdp_pte_root(iter, root, start, end) {
+   if (!is_shadow_present_pte(iter.old_spte) ||
+   is_last_spte(iter.old_spte, iter.level))
+   continue;
+
+   pfn = spte_to_pfn(iter.old_spte);
+   if (kvm_is_reserved_pfn(pfn) ||
+   !PageTransCompoundMap(pfn_to_page(pfn)))
+   continue;
+
+   *iter.sptep = 0;
+   handle_changed_spte(kvm, as_id, iter.gfn, iter.old_spte,
+   0, iter.level);
+   spte_set = true;
+
+   spte_set = !tdp_mmu_iter_cond_resched(kvm, &iter);
+   }
+
+   if (spte_set)
+   kvm_flush_remote_tlbs(kvm);
+}
+
+/*
+ * Clear non-leaf entries (and free associated page tables) which could
+ * be replaced by large mappings, for GFNs within the slot.
+ */
+void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
+  const struct kvm_memory_slot *slot)
+{
+   struct kvm_mmu_page *root;
+   int root_as_id;
+
+   for_each_tdp_mmu_root(kvm, root) {
+   root_as_id = kvm_mmu_page_as_id(root);
+   if (root_as_id != slot->as_id)
+   continue;
+
+   /*
+* Take a reference on the root so that it cannot be freed if
+* this thread releases the MMU lock and yields in this loop.
+*/
+   get_tdp_mmu_root(kvm, root);
+
+   zap_collapsible_spte_range(kvm, root, slot->base_gfn,
+  slot->base_gfn + slot->npages);
+
+   put_tdp_mmu_root(kvm, root);
+   }
+}
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 2c9322ba3462b..10e70699c5372 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -38,4 +38,6 @@ void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm,
   gfn_t gfn, unsigned long mask,
   bool wrprot);
 bool kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot);
+void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
+  const struct kvm_memory_slot *slot);
 #endif /* __KVM_X86_MMU_TDP_MMU_H */
-- 
2.28.0.709.gb0816b6eb0-goog