This series is built on top of Fuad's v7 "mapping guest_memfd backed
memory at the host" [1].

With James's KVM userfault [2], it is possible to handle stage-2 faults
in guest_memfd in userspace.  However, KVM itself also triggers faults
in guest_memfd in some cases, for example via PV interfaces like
kvmclock and PV EOI, and via the page table walking code when fetching
the MMIO instruction on x86.  It was agreed in the guest_memfd upstream
call on 23 Jan 2025 [3]
that KVM would be accessing those pages via userspace page tables.  In
order for such faults to be handled in userspace, guest_memfd needs to
support userfaultfd.
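
As an illustration, resolving such a fault from userspace could look
roughly like this (a hedged sketch: it assumes a guest_memfd made
mmap()-able by [1], reuses the shmem minor-fault feature bit on the
assumption that guest_memfd behaves analogously, and omits error
handling and the uffd event-reading loop):

#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

static void gmem_minor_fault_demo(int gmem_fd, size_t size, size_t page_size)
{
	int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
	struct uffdio_api api = {
		.api = UFFD_API,
		/* shmem minor-fault feature bit; the guest_memfd case
		 * is assumed to be analogous */
		.features = UFFD_FEATURE_MINOR_SHMEM,
	};
	ioctl(uffd, UFFDIO_API, &api);

	char *mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
			 gmem_fd, 0);

	struct uffdio_register reg = {
		.range = { .start = (uintptr_t)mem, .len = size },
		.mode = UFFDIO_REGISTER_MODE_MINOR,
	};
	ioctl(uffd, UFFDIO_REGISTER, &reg);

	/*
	 * A vCPU (or KVM itself) now faults on the range; a handler
	 * thread would read a struct uffd_msg from uffd, populate the
	 * page through the guest_memfd, then install the PTE:
	 */
	struct uffdio_continue cont = {
		.range = { .start = (uintptr_t)mem, .len = page_size },
	};
	ioctl(uffd, UFFDIO_CONTINUE, &cont);
}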

Changes since v1 [4]:
 - James, Peter: implement a full minor trap instead of a hybrid
   missing/minor trap
 - James, Peter: to avoid shmem- and guest_memfd-specific code in the
   UFFDIO_CONTINUE implementation, make it generic by calling
   vm_ops->fault() (a sketch of the idea follows this list)
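
For context, a minimal sketch of that idea (illustrative only: the
helper name, field and flag choices below are assumptions, not the
exact code in this series):

static int uffd_continue_get_page(struct vm_area_struct *dst_vma,
				  unsigned long dst_addr,
				  struct page **pagep)
{
	struct vm_fault vmf = {
		.vma = dst_vma,
		.address = dst_addr,
		.pgoff = linear_page_index(dst_vma, dst_addr),
		.flags = FAULT_FLAG_USER,	/* illustrative flag choice */
	};
	vm_fault_t ret;

	/*
	 * Let the backing filesystem look up the page, which must
	 * already be present for a minor fault, instead of calling
	 * shmem_get_folio() (or a gmem-specific helper) directly.
	 */
	ret = dst_vma->vm_ops->fault(&vmf);
	if (ret & VM_FAULT_ERROR)
		return -EFAULT;

	*pagep = vmf.page;
	return 0;
}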

While generalising the UFFDIO_CONTINUE implementation helped avoid
guest_memfd-specific code in mm/userfaultfd, userfaultfd still needs
access to KVM code to be able to verify the VMA type when handling
UFFDIO_REGISTER_MODE_MINOR, so for now I used an approach similar to
what Fuad did [5].
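
Concretely, the registration path gains a check along these lines (a
hedged sketch: kvm_gmem_vma_is_gmem() is added by this series, while
the surrounding condition is illustrative):

	/* UFFDIO_REGISTER_MODE_MINOR validation (illustrative) */
	if (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma) &&
	    !kvm_gmem_vma_is_gmem(vma))
		return -EINVAL;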

In v1, Peter mentioned the potential to eliminate taking the folio
lock [6].  I did not implement that, but according to my testing, the
performance of shmem minor fault handling stayed the same after the
migration to calling vm_ops->fault() (tested on x86).

Before:

./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time:     10.979277020s
Per-vcpu demand paging rate:    23876.253375 pgs/sec/vcpu
Overall demand paging rate:     23876.253375 pgs/sec

After:

./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time:     10.978893504s
Per-vcpu demand paging rate:    23877.087423 pgs/sec/vcpu
Overall demand paging rate:     23877.087423 pgs/sec

Nikita

[1] https://lore.kernel.org/kvm/20250318161823.4005529-1-ta...@google.com/T/
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthough...@google.com/T/
[3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAosPOk/edit?tab=t.0#heading=h.w1126rgli5e3
[4] https://lore.kernel.org/kvm/20250303133011.44095-1-kalya...@amazon.com/T/
[5] https://lore.kernel.org/kvm/20250318161823.4005529-1-ta...@google.com/T/#Z2e.:..:20250318161823.4005529-3-tabba::40google.com:1mm:swap.c
[6] https://lore.kernel.org/kvm/20250303133011.44095-1-kalya...@amazon.com/T/#m8695dc24d2cc633a6a486a8990e3f7d50d4efb79

Nikita Kalyazin (5):
  mm: userfaultfd: generic continue for non hugetlbfs
  KVM: guest_memfd: add kvm_gmem_vma_is_gmem
  mm: userfaultfd: allow to register continue for guest_memfd
  KVM: guest_memfd: add support for userfaultfd minor
  KVM: selftests: test userfaultfd minor for guest_memfd

 include/linux/mm_types.h                      |  3 +
 include/linux/userfaultfd_k.h                 | 13 ++-
 mm/hugetlb.c                                  |  2 +-
 mm/shmem.c                                    |  3 +-
 mm/userfaultfd.c                              | 25 +++--
 .../testing/selftests/kvm/guest_memfd_test.c  | 94 +++++++++++++++++++
 virt/kvm/guest_memfd.c                        | 15 +++
 virt/kvm/kvm_mm.h                             |  1 +
 8 files changed, 146 insertions(+), 10 deletions(-)


base-commit: 3cc51efc17a2c41a480eed36b31c1773936717e0
-- 
2.47.1

