Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU
On 07/10/20 19:28, Ben Gardon wrote:
>> No, that would be just another way to write the same thing. That said,
>> making the iteration API more complicated also has disadvantages because
>> you get a Cartesian explosion of changes.
> I wouldn't be too worried about that. The only things I ever found
> worth making an iterator case for were:
> Every SPTE
> Every present SPTE
> Every present leaf SPTE

* (vcpu, root) * (all levels, large only)

We only need a small subset of these, but the naming would be more
complex at least.

Paolo
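To make the combinatorics concrete: crossing Ben's three filters with the two iteration subjects and the two level scopes Paolo lists already yields twelve wrapper names. A sketch of that name space (all names hypothetical, only to illustrate the explosion):

/*
 * 3 filters x 2 subjects x 2 level scopes = 12 wrappers, e.g.:
 *
 *   tdp_root_for_each_pte              tdp_vcpu_for_each_pte
 *   tdp_root_for_each_present_pte      tdp_vcpu_for_each_present_pte
 *   tdp_root_for_each_leaf_pte         tdp_vcpu_for_each_leaf_pte
 *
 * ...and each of the above again in a "large mappings only" variant.
 * Only a small subset of these would ever have callers.
 */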
Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU
On Wed, Oct 7, 2020 at 10:21 AM Paolo Bonzini wrote:
>
> On 07/10/20 18:30, Ben Gardon wrote:
> >> I'm starting to wonder if another iterator like
> >> for_each_tdp_leaf_pte_root would be clearer, since this idiom repeats
> >> itself quite often. The tdp_iter_next_leaf function would be easily
> >> implemented as
> >>
> >> while (likely(iter->valid) &&
> >>        (!is_shadow_present_pte(iter->old_spte) ||
> >>         is_last_spte(iter->old_spte, iter->level)))
> >>         tdp_iter_next(iter);
> > Do you see a substantial efficiency difference between adding a
> > tdp_iter_next_leaf and building on for_each_tdp_pte_using_root with
> > something like:
> >
> > #define for_each_tdp_leaf_pte_using_root(_iter, _root, _start, _end)	\
> > 	for_each_tdp_pte_using_root(_iter, _root, _start, _end)	\
> > 		if (!is_shadow_present_pte(_iter.old_spte) ||	\
> > 		    !is_last_spte(_iter.old_spte, _iter.level))	\
> > 			continue;				\
> > 		else
> >
> > I agree that putting those checks in a wrapper makes the code more concise.
>
> No, that would be just another way to write the same thing. That said,
> making the iteration API more complicated also has disadvantages because
> you get a Cartesian explosion of changes.

I wouldn't be too worried about that. The only things I ever found
worth making an iterator case for were:
Every SPTE
Every present SPTE
Every present leaf SPTE

And really there aren't many cases that use the middle one.

>
> Regarding the naming, I'm leaning towards
>
> tdp_root_for_each_pte
> tdp_vcpu_for_each_pte
>
> which is shorter than the version with "using" and still clarifies that
> "root" and "vcpu" are the thing that the iteration works on.

That sounds good to me. I agree it's similarly clear.

>
> Paolo
>
Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU
On 07/10/20 18:30, Ben Gardon wrote:
>> I'm starting to wonder if another iterator like
>> for_each_tdp_leaf_pte_root would be clearer, since this idiom repeats
>> itself quite often. The tdp_iter_next_leaf function would be easily
>> implemented as
>>
>> while (likely(iter->valid) &&
>>        (!is_shadow_present_pte(iter->old_spte) ||
>>         is_last_spte(iter->old_spte, iter->level)))
>>         tdp_iter_next(iter);
> Do you see a substantial efficiency difference between adding a
> tdp_iter_next_leaf and building on for_each_tdp_pte_using_root with
> something like:
>
> #define for_each_tdp_leaf_pte_using_root(_iter, _root, _start, _end)	\
> 	for_each_tdp_pte_using_root(_iter, _root, _start, _end)	\
> 		if (!is_shadow_present_pte(_iter.old_spte) ||	\
> 		    !is_last_spte(_iter.old_spte, _iter.level))	\
> 			continue;				\
> 		else
>
> I agree that putting those checks in a wrapper makes the code more concise.

No, that would be just another way to write the same thing. That said,
making the iteration API more complicated also has disadvantages because
you get a Cartesian explosion of changes.

Regarding the naming, I'm leaning towards

tdp_root_for_each_pte
tdp_vcpu_for_each_pte

which is shorter than the version with "using" and still clarifies that
"root" and "vcpu" are the thing that the iteration works on.

Paolo
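For concreteness, the renamed wrappers could be thin aliases over the iterators already in the series; a minimal sketch, assuming the for_each_tdp_pte_using_* macros keep their current arguments (the vcpu variant's existence and shape are assumptions here):

/* Iterate over the PTEs reachable from a given root / from a vCPU's root. */
#define tdp_root_for_each_pte(_iter, _root, _start, _end)	\
	for_each_tdp_pte_using_root(_iter, _root, _start, _end)

#define tdp_vcpu_for_each_pte(_iter, _vcpu, _start, _end)	\
	for_each_tdp_pte_using_vcpu(_iter, _vcpu, _start, _end)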
Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU
On Fri, Sep 25, 2020 at 6:09 PM Paolo Bonzini wrote:
>
> On 25/09/20 23:22, Ben Gardon wrote:
> > +	for_each_tdp_pte_root(iter, root, start, end) {
> > +		if (!is_shadow_present_pte(iter.old_spte) ||
> > +		    is_last_spte(iter.old_spte, iter.level))
> > +			continue;
> > +
>
> I'm starting to wonder if another iterator like
> for_each_tdp_leaf_pte_root would be clearer, since this idiom repeats
> itself quite often. The tdp_iter_next_leaf function would be easily
> implemented as
>
> while (likely(iter->valid) &&
>        (!is_shadow_present_pte(iter->old_spte) ||
>         is_last_spte(iter->old_spte, iter->level)))
>         tdp_iter_next(iter);

Do you see a substantial efficiency difference between adding a
tdp_iter_next_leaf and building on for_each_tdp_pte_using_root with
something like:

#define for_each_tdp_leaf_pte_using_root(_iter, _root, _start, _end)	\
	for_each_tdp_pte_using_root(_iter, _root, _start, _end)	\
		if (!is_shadow_present_pte(_iter.old_spte) ||	\
		    !is_last_spte(_iter.old_spte, _iter.level))	\
			continue;				\
		else

I agree that putting those checks in a wrapper makes the code more concise.

>
> Paolo
>
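The trailing else in Ben's macro is load-bearing: the if/continue swallows entries that fail the filter, and the dangling else adopts whatever statement follows the macro invocation as the loop body, so the wrapper composes like an ordinary for loop. A self-contained userspace sketch of the same trick (toy names, nothing from the series):

#include <stdio.h>

/* Base iterator: visit every index up to _len. */
#define for_each_val(_i, _len) \
	for (int _i = 0; _i < (_len); _i++)

/*
 * Filter wrapper: skip odd values; the final "else" binds the
 * statement that follows the macro as the body of the loop.
 */
#define for_each_even_val(_i, _arr, _len)	\
	for_each_val(_i, _len)			\
		if ((_arr)[_i] % 2 != 0)	\
			continue;		\
		else

int main(void)
{
	int vals[] = { 1, 2, 3, 4, 5, 6 };

	for_each_even_val(i, vals, 6)
		printf("%d\n", vals[i]);	/* prints 2, 4, 6 */

	return 0;
}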
Re: [PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU
On 25/09/20 23:22, Ben Gardon wrote:
> +	for_each_tdp_pte_root(iter, root, start, end) {
> +		if (!is_shadow_present_pte(iter.old_spte) ||
> +		    is_last_spte(iter.old_spte, iter.level))
> +			continue;
> +

I'm starting to wonder if another iterator like
for_each_tdp_leaf_pte_root would be clearer, since this idiom repeats
itself quite often. The tdp_iter_next_leaf function would be easily
implemented as

while (likely(iter->valid) &&
       (!is_shadow_present_pte(iter->old_spte) ||
        is_last_spte(iter->old_spte, iter->level)))
        tdp_iter_next(iter);

Paolo
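Spelled out as a complete helper, the suggestion might look like the sketch below. It assumes the tdp_iter fields used elsewhere in this series, and whether the helper should also take the initial step (as here) is a guess; the skip condition is the one quoted from the patch, so it leaves the iterator on present non-leaf entries:

/*
 * Sketch: advance the iterator, then keep stepping past entries that
 * are non-present or already leaf SPTEs, leaving the iterator on the
 * next present non-leaf entry (or invalid at the end of the range).
 */
static void tdp_iter_next_leaf(struct tdp_iter *iter)
{
	tdp_iter_next(iter);
	while (likely(iter->valid) &&
	       (!is_shadow_present_pte(iter->old_spte) ||
		is_last_spte(iter->old_spte, iter->level)))
		tdp_iter_next(iter);
}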
[PATCH 18/22] kvm: mmu: Support disabling dirty logging for the tdp MMU
Dirty logging ultimately breaks down MMU mappings to 4k granularity.
When dirty logging is no longer needed, these granular mappings
represent a useless performance penalty. When dirty logging is
disabled, search the paging structure for mappings that could be
reconstituted into a large page mapping. Zap those mappings so that
they can be faulted in again at a higher mapping level.

Tested by running kvm-unit-tests and KVM selftests on an Intel Haswell
machine. This series introduced no new failures.

This series can be viewed in Gerrit at:
	https://linux-review.googlesource.com/c/virt/kvm/kvm/+/2538

Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/mmu.c     |  3 ++
 arch/x86/kvm/mmu/tdp_mmu.c | 62 ++
 arch/x86/kvm/mmu/tdp_mmu.h |  2 ++
 3 files changed, 67 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b9074603f9df1..12892fc4f146d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6025,6 +6025,9 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
 	spin_lock(&kvm->mmu_lock);
 	slot_handle_leaf(kvm, (struct kvm_memory_slot *)memslot,
 			 kvm_mmu_zap_collapsible_spte, true);
+
+	if (kvm->arch.tdp_mmu_enabled)
+		kvm_tdp_mmu_zap_collapsible_sptes(kvm, memslot);
 	spin_unlock(&kvm->mmu_lock);
 }
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e5cb7f0ec23e8..a2895119655ac 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1099,3 +1099,65 @@ bool kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot)
 	return spte_set;
 }
 
+/*
+ * Clear non-leaf entries (and free associated page tables) which could
+ * be replaced by large mappings, for GFNs within the slot.
+ */
+static void zap_collapsible_spte_range(struct kvm *kvm,
+				       struct kvm_mmu_page *root,
+				       gfn_t start, gfn_t end)
+{
+	struct tdp_iter iter;
+	kvm_pfn_t pfn;
+	bool spte_set = false;
+	int as_id = kvm_mmu_page_as_id(root);
+
+	for_each_tdp_pte_root(iter, root, start, end) {
+		if (!is_shadow_present_pte(iter.old_spte) ||
+		    is_last_spte(iter.old_spte, iter.level))
+			continue;
+
+		pfn = spte_to_pfn(iter.old_spte);
+		if (kvm_is_reserved_pfn(pfn) ||
+		    !PageTransCompoundMap(pfn_to_page(pfn)))
+			continue;
+
+		*iter.sptep = 0;
+		handle_changed_spte(kvm, as_id, iter.gfn, iter.old_spte,
+				    0, iter.level);
+		spte_set = true;
+
+		spte_set = !tdp_mmu_iter_cond_resched(kvm, &iter);
+	}
+
+	if (spte_set)
+		kvm_flush_remote_tlbs(kvm);
+}
+
+/*
+ * Clear non-leaf entries (and free associated page tables) which could
+ * be replaced by large mappings, for GFNs within the slot.
+ */
+void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
+				       const struct kvm_memory_slot *slot)
+{
+	struct kvm_mmu_page *root;
+	int root_as_id;
+
+	for_each_tdp_mmu_root(kvm, root) {
+		root_as_id = kvm_mmu_page_as_id(root);
+		if (root_as_id != slot->as_id)
+			continue;
+
+		/*
+		 * Take a reference on the root so that it cannot be freed if
+		 * this thread releases the MMU lock and yields in this loop.
+		 */
+		get_tdp_mmu_root(kvm, root);
+
+		zap_collapsible_spte_range(kvm, root, slot->base_gfn,
+					   slot->base_gfn + slot->npages);
+
+		put_tdp_mmu_root(kvm, root);
+	}
+}
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 2c9322ba3462b..10e70699c5372 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -38,4 +38,6 @@ void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm,
 				       gfn_t gfn, unsigned long mask,
 				       bool wrprot);
 bool kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot);
+void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
+				       const struct kvm_memory_slot *slot);
 #endif /* __KVM_X86_MMU_TDP_MMU_H */
-- 
2.28.0.709.gb0816b6eb0-goog