[PATCH 5/5] KVM: Get rid of kvm_get_pfn()

2021-07-17 Thread Marc Zyngier
Nobody is using kvm_get_pfn() anymore. Get rid of it.

Signed-off-by: Marc Zyngier 
---
 include/linux/kvm_host.h | 1 -
 virt/kvm/kvm_main.c  | 9 +
 2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ae7735b490b4..9818d271c2a1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -824,7 +824,6 @@ void kvm_release_pfn_clean(kvm_pfn_t pfn);
 void kvm_release_pfn_dirty(kvm_pfn_t pfn);
 void kvm_set_pfn_dirty(kvm_pfn_t pfn);
 void kvm_set_pfn_accessed(kvm_pfn_t pfn);
-void kvm_get_pfn(kvm_pfn_t pfn);
 
 void kvm_release_pfn(kvm_pfn_t pfn, bool dirty, struct gfn_to_pfn_cache *cache);
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2e410a8a6a67..0284418c4400 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2215,7 +2215,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 * Get a reference here because callers of *hva_to_pfn* and
 * *gfn_to_pfn* ultimately call kvm_release_pfn_clean on the
 * returned pfn.  This is only needed if the VMA has VM_MIXEDMAP
-* set, but the kvm_get_pfn/kvm_release_pfn_clean pair will
+* set, but the kvm_try_get_pfn/kvm_release_pfn_clean pair will
 * simply do nothing for reserved pfns.
 *
 * Whoever called remap_pfn_range is also going to call e.g.
@@ -2612,13 +2612,6 @@ void kvm_set_pfn_accessed(kvm_pfn_t pfn)
 }
 EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
 
-void kvm_get_pfn(kvm_pfn_t pfn)
-{
-   if (!kvm_is_reserved_pfn(pfn))
-   get_page(pfn_to_page(pfn));
-}
-EXPORT_SYMBOL_GPL(kvm_get_pfn);
-
 static int next_segment(unsigned long len, int offset)
 {
if (len > PAGE_SIZE - offset)
-- 
2.30.2


[PATCH 4/5] KVM: arm64: Use get_page() instead of kvm_get_pfn()

2021-07-17 Thread Marc Zyngier
When mapping a THP, we are guaranteed that the page isn't reserved,
and we can safely avoid the kvm_is_reserved_pfn() call.

Replace kvm_get_pfn() with get_page(pfn_to_page()).
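
Spelled out, the replacement is the old helper minus its reserved-pfn
check; a minimal sketch (illustrative names, assuming the usual
linux/kvm_host.h context, not part of the patch):

	/* What kvm_get_pfn() used to do: guard the reference. */
	static void thp_ref_old(kvm_pfn_t pfn)
	{
		if (!kvm_is_reserved_pfn(pfn))	/* never true for a THP pfn */
			get_page(pfn_to_page(pfn));
	}

	/* What this patch open-codes instead: a THP pfn is always
	 * struct-page backed, so take the reference unconditionally. */
	static void thp_ref_new(kvm_pfn_t pfn)
	{
		get_page(pfn_to_page(pfn));
	}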

Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c036a480ca27..0e8dab124cbd 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -852,7 +852,7 @@ transparent_hugepage_adjust(struct kvm *kvm, struct kvm_memory_slot *memslot,
*ipap &= PMD_MASK;
kvm_release_pfn_clean(pfn);
pfn &= ~(PTRS_PER_PMD - 1);
-   kvm_get_pfn(pfn);
+   get_page(pfn_to_page(pfn));
*pfnp = pfn;
 
return PMD_SIZE;
-- 
2.30.2


[PATCH 0/5] KVM: Remove kvm_is_transparent_hugepage() and friends

2021-07-17 Thread Marc Zyngier
A while ago, Willy and Sean pointed out[1] that arm64 is the last user
of kvm_is_transparent_hugepage(), and that there would actually be
some benefit in looking at the userspace mapping directly instead.

This small series does exactly that, although it doesn't yet try to
support anything larger than a PMD-sized mapping for THPs. We could probably
look into unifying this with the huge PUD code, and there is still
some potential use of the contiguous hint.

As a consequence, it removes kvm_is_transparent_hugepage(),
PageTransCompoundMap() and kvm_get_pfn(), all of which have no user
left after this rework.

This has been lightly tested on an Altra box. Although nothing caught
fire, it requires some careful reviewing on the arm64 side.

[1] https://lore.kernel.org/r/ylplvfpxrip8n...@google.com

Marc Zyngier (5):
  KVM: arm64: Walk userspace page tables to compute the THP mapping size
  KVM: arm64: Avoid mapping size adjustment on permission fault
  KVM: Remove kvm_is_transparent_hugepage() and PageTransCompoundMap()
  KVM: arm64: Use get_page() instead of kvm_get_pfn()
  KVM: Get rid of kvm_get_pfn()

 arch/arm64/kvm/mmu.c   | 57 +-
 include/linux/kvm_host.h   |  1 -
 include/linux/page-flags.h | 37 -
 virt/kvm/kvm_main.c| 19 +
 4 files changed, 51 insertions(+), 63 deletions(-)

-- 
2.30.2


[PATCH 1/5] KVM: arm64: Walk userspace page tables to compute the THP mapping size

2021-07-17 Thread Marc Zyngier
We currently rely on the kvm_is_transparent_hugepage() helper to
discover whether a given page has the potential to be mapped as
a block mapping.

However, this API doesn't really give us everything we want:
- we don't get the size: this is not crucial today as we only
  support PMD-sized THPs, but we'd like to have larger sizes
  in the future
- we're the only user left of the API, and there is a will
  to remove it altogether

To address the above, implement a simple walker using the existing
page table infrastructure, and plumb it into transparent_hugepage_adjust().
No new page sizes are supported in the process.
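
The size returned by the walker is derived purely from the level of the
leaf entry covering the faulting HVA. A stand-alone sanity check of that
arithmetic (illustration only; assumes 4kB pages and mirrors the
ARM64_HW_PGTABLE_LEVEL_SHIFT() definition from pgtable-hwdef.h):

	#include <stdio.h>

	#define PAGE_SHIFT	12	/* assumed: 4kB pages */

	/* Mirrors arch/arm64/include/asm/pgtable-hwdef.h */
	#define ARM64_HW_PGTABLE_LEVEL_SHIFT(n)	((PAGE_SHIFT - 3) * (4 - (n)) + 3)

	int main(void)
	{
		/* get_user_mapping_size() returns BIT(shift) for the recorded level */
		for (unsigned int level = 1; level <= 3; level++)
			printf("leaf at level %u -> %lu bytes\n", level,
			       1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(level));
		return 0;	/* level 3 -> 4kB, level 2 -> 2MB, level 1 -> 1GB */
	}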

Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/mmu.c | 46 
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 3155c9e778f0..db6314b93e99 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -433,6 +433,44 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
return 0;
 }
 
+static struct kvm_pgtable_mm_ops kvm_user_mm_ops = {
+   /* We shouldn't need any other callback to walk the PT */
+   .phys_to_virt   = kvm_host_va,
+};
+
+struct user_walk_data {
+   u32 level;
+};
+
+static int user_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
+  enum kvm_pgtable_walk_flags flag, void * const arg)
+{
+   struct user_walk_data *data = arg;
+
+   data->level = level;
+   return 0;
+}
+
+static int get_user_mapping_size(struct kvm *kvm, u64 addr)
+{
+   struct user_walk_data data;
+	struct kvm_pgtable pgt = {
+		.pgd		= (kvm_pte_t *)kvm->mm->pgd,
+		.ia_bits	= VA_BITS,
+		.start_level	= 4 - CONFIG_PGTABLE_LEVELS,
+		.mm_ops		= &kvm_user_mm_ops,
+	};
+	struct kvm_pgtable_walker walker = {
+		.cb		= user_walker,
+		.flags		= KVM_PGTABLE_WALK_LEAF,
+		.arg		= &data,
+	};
+
+	kvm_pgtable_walk(&pgt, ALIGN_DOWN(addr, PAGE_SIZE), PAGE_SIZE, &walker);
+
+   return BIT(ARM64_HW_PGTABLE_LEVEL_SHIFT(data.level));
+}
+
 static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = {
.zalloc_page= stage2_memcache_zalloc_page,
.zalloc_pages_exact = kvm_host_zalloc_pages_exact,
@@ -780,7 +818,7 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
  * Returns the size of the mapping.
  */
 static unsigned long
-transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
+transparent_hugepage_adjust(struct kvm *kvm, struct kvm_memory_slot *memslot,
unsigned long hva, kvm_pfn_t *pfnp,
phys_addr_t *ipap)
 {
@@ -791,8 +829,8 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
 * sure that the HVA and IPA are sufficiently aligned and that the
 * block map is contained within the memslot.
 */
-   if (kvm_is_transparent_hugepage(pfn) &&
-   fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
+   if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE) &&
+   get_user_mapping_size(kvm, hva) >= PMD_SIZE) {
/*
 * The address we faulted on is backed by a transparent huge
 * page.  However, because we map the compound huge page and
@@ -1051,7 +1089,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 * backed by a THP and thus use block mapping if possible.
 */
if (vma_pagesize == PAGE_SIZE && !(force_pte || device))
-   vma_pagesize = transparent_hugepage_adjust(memslot, hva,
+   vma_pagesize = transparent_hugepage_adjust(kvm, memslot, hva,
 							   &pfn, &fault_ipa);
 
if (fault_status != FSC_PERM && !device && kvm_has_mte(kvm)) {
-- 
2.30.2


[PATCH 3/5] KVM: Remove kvm_is_transparent_hugepage() and PageTransCompoundMap()

2021-07-17 Thread Marc Zyngier
Now that arm64 has stopped using kvm_is_transparent_hugepage(),
we can remove it, as well as PageTransCompoundMap() which was
only used by the former.

Signed-off-by: Marc Zyngier 
---
 include/linux/page-flags.h | 37 -
 virt/kvm/kvm_main.c| 10 --
 2 files changed, 47 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 5922031ffab6..1ace27c4a8e0 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -632,43 +632,6 @@ static inline int PageTransCompound(struct page *page)
return PageCompound(page);
 }
 
-/*
- * PageTransCompoundMap is the same as PageTransCompound, but it also
- * guarantees the primary MMU has the entire compound page mapped
- * through pmd_trans_huge, which in turn guarantees the secondary MMUs
- * can also map the entire compound page. This allows the secondary
- * MMUs to call get_user_pages() only once for each compound page and
- * to immediately map the entire compound page with a single secondary
- * MMU fault. If there will be a pmd split later, the secondary MMUs
- * will get an update through the MMU notifier invalidation through
- * split_huge_pmd().
- *
- * Unlike PageTransCompound, this is safe to be called only while
- * split_huge_pmd() cannot run from under us, like if protected by the
- * MMU notifier, otherwise it may result in page->_mapcount check false
- * positives.
- *
- * We have to treat page cache THP differently since every subpage of it
- * would get _mapcount inc'ed once it is PMD mapped.  But, it may be PTE
- * mapped in the current process so comparing subpage's _mapcount to
- * compound_mapcount to filter out PTE mapped case.
- */
-static inline int PageTransCompoundMap(struct page *page)
-{
-   struct page *head;
-
-   if (!PageTransCompound(page))
-   return 0;
-
-   if (PageAnon(page))
-		return atomic_read(&page->_mapcount) < 0;
-
-   head = compound_head(page);
-   /* File THP is PMD mapped and not PTE mapped */
-	return atomic_read(&page->_mapcount) ==
-	       atomic_read(compound_mapcount_ptr(head));
-}
-
 /*
  * PageTransTail returns true for both transparent huge pages
  * and hugetlbfs pages, so it should only be called when it's known
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7d95126cda9e..2e410a8a6a67 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -189,16 +189,6 @@ bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
return true;
 }
 
-bool kvm_is_transparent_hugepage(kvm_pfn_t pfn)
-{
-   struct page *page = pfn_to_page(pfn);
-
-   if (!PageTransCompoundMap(page))
-   return false;
-
-   return is_transparent_hugepage(compound_head(page));
-}
-
 /*
  * Switches to specified vcpu, until a matching vcpu_put()
  */
-- 
2.30.2


[PATCH 2/5] KVM: arm64: Avoid mapping size adjustment on permission fault

2021-07-17 Thread Marc Zyngier
Since we only support PMD-sized mappings for THP, getting
a permission fault on a level that results in a mapping
being larger than PAGE_SIZE is a sure indication that we have
already upgraded our mapping to a PMD.

In this case, there is no need to try and parse userspace page
tables, as the fault information already tells us everything.
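
Concretely, the fault level reported by the hardware maps directly onto a
granule size, roughly as user_mem_abort() already derives it (a sketch for
illustration, assuming 4kB pages):

	fault_level = kvm_vcpu_trap_get_fault_level(vcpu);
	fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);

	/*
	 * A permission fault (FSC_PERM) at level 2 gives fault_granule ==
	 * 2MB (PMD_SIZE) > PAGE_SIZE: stage-2 already holds a block mapping,
	 * so vma_pagesize can be taken from fault_granule without walking
	 * the userspace page tables again.
	 */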

Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/mmu.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index db6314b93e99..c036a480ca27 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1088,9 +1088,14 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 * If we are not forced to use page mapping, check if we are
 * backed by a THP and thus use block mapping if possible.
 */
-   if (vma_pagesize == PAGE_SIZE && !(force_pte || device))
-   vma_pagesize = transparent_hugepage_adjust(kvm, memslot, hva,
-							   &pfn, &fault_ipa);
+   if (vma_pagesize == PAGE_SIZE && !force_pte) {
+   if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
+   vma_pagesize = fault_granule;
+   else
+   vma_pagesize = transparent_hugepage_adjust(kvm, memslot,
+								   hva, &pfn,
+								   &fault_ipa);
+   }
 
if (fault_status != FSC_PERM && !device && kvm_has_mte(kvm)) {
/* Check the VMM hasn't introduced a new VM_SHARED VMA */
-- 
2.30.2
