This series is built on top of Fuad's v7 "mapping guest_memfd backed memory at the host" [1].
With James's KVM userfault [2], it is possible to handle stage-2
faults in guest_memfd in userspace. However, KVM itself also triggers
faults in guest_memfd in some cases, for example: PV interfaces like
kvmclock, PV EOI and page table walking code when fetching the MMIO
instruction on x86. It was agreed in the guest_memfd upstream call on
23 Jan 2025 [3] that KVM would be accessing those pages via userspace
page tables. In order for such faults to be handled in userspace,
guest_memfd needs to support userfaultfd.

Changes since v1 [4]:
 - James, Peter: implement a full minor trap instead of a hybrid
   missing/minor trap
 - James, Peter: to avoid shmem- and guest_memfd-specific code in the
   UFFDIO_CONTINUE implementation, make it generic by calling
   vm_ops->fault()

While generalising the UFFDIO_CONTINUE implementation helped avoid
guest_memfd-specific code in mm/userfaultfd.c, userfaultfd still needs
access to KVM code to be able to verify the VMA type when handling
UFFDIO_REGISTER_MODE_MINOR, so for now I used an approach similar to
Fuad's [5].

In v1, Peter mentioned the potential for eliminating the folio lock
acquisition [6]. I did not implement that, but according to my
testing, the performance of shmem minor fault handling stayed the same
after the migration to calling vm_ops->fault() (tested on x86).

Before:
./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48, 4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time: 10.979277020s
Per-vcpu demand paging rate: 23876.253375 pgs/sec/vcpu
Overall demand paging rate: 23876.253375 pgs/sec

After:
./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48, 4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time: 10.978893504s
Per-vcpu demand paging rate: 23877.087423 pgs/sec/vcpu
Overall demand paging rate: 23877.087423 pgs/sec

Nikita

[1] https://lore.kernel.org/kvm/20250318161823.4005529-1-ta...@google.com/T/
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthough...@google.com/T/
[3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAosPOk/edit?tab=t.0#heading=h.w1126rgli5e3
[4] https://lore.kernel.org/kvm/20250303133011.44095-1-kalya...@amazon.com/T/
[5] https://lore.kernel.org/kvm/20250318161823.4005529-1-ta...@google.com/T/#Z2e.:..:20250318161823.4005529-3-tabba::40google.com:1mm:swap.c
[6] https://lore.kernel.org/kvm/20250303133011.44095-1-kalya...@amazon.com/T/#m8695dc24d2cc633a6a486a8990e3f7d50d4efb79

Nikita Kalyazin (5):
  mm: userfaultfd: generic continue for non hugetlbfs
  KVM: guest_memfd: add kvm_gmem_vma_is_gmem
  mm: userfaultfd: allow to register continue for guest_memfd
  KVM: guest_memfd: add support for userfaultfd minor
  KVM: selftests: test userfaultfd minor for guest_memfd

 include/linux/mm_types.h                      |  3 +
 include/linux/userfaultfd_k.h                 | 13 ++-
 mm/hugetlb.c                                  |  2 +-
 mm/shmem.c                                    |  3 +-
 mm/userfaultfd.c                              | 25 +++--
 .../testing/selftests/kvm/guest_memfd_test.c  | 94 +++++++++++++++++++
 virt/kvm/guest_memfd.c                        | 15 +++
 virt/kvm/kvm_mm.h                             |  1 +
 8 files changed, 146 insertions(+), 10 deletions(-)

base-commit: 3cc51efc17a2c41a480eed36b31c1773936717e0
--
2.47.1
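
P.S. For readers less familiar with the minor fault flow, below is a
minimal userspace sketch of what this series enables. It is an
illustration, not code from the series: it assumes a host mmap() of
guest_memfd as provided by [1] and that minor-mode registration of
guest_memfd VMAs is accepted as per this series; error handling is
omitted, gmem_fd is a guest_memfd descriptor obtained from
KVM_CREATE_GUEST_MEMFD, and the page-population step is a placeholder.

#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

static void handle_gmem_minor_faults(int gmem_fd, size_t len,
				     size_t page_size)
{
	/* Create the userfaultfd and perform the API handshake. */
	int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
	struct uffdio_api api = { .api = UFFD_API };

	ioctl(uffd, UFFDIO_API, &api);

	/* Host mapping of guest_memfd, available with [1] applied. */
	char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
			 gmem_fd, 0);

	/*
	 * Minor-mode registration; accepting it for guest_memfd VMAs
	 * is what this series adds.
	 */
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)mem, .len = len },
		.mode = UFFDIO_REGISTER_MODE_MINOR,
	};
	ioctl(uffd, UFFDIO_REGISTER, &reg);

	for (;;) {
		struct uffd_msg msg;

		/*
		 * Blocks until a minor fault (e.g. from KVM's own
		 * access via the userspace page tables) is reported.
		 */
		if (read(uffd, &msg, sizeof(msg)) != sizeof(msg))
			break;
		if (msg.event != UFFD_EVENT_PAGEFAULT ||
		    !(msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_MINOR))
			continue;

		/*
		 * Populate the page contents here if needed (e.g. copy
		 * it in from a migration stream), then ask the kernel
		 * to install the PTE; after patch 1 this resolution
		 * path goes through vm_ops->fault().
		 */
		struct uffdio_continue cont = {
			.range = {
				.start = msg.arg.pagefault.address &
					 ~((__u64)page_size - 1),
				.len = page_size,
			},
		};
		ioctl(uffd, UFFDIO_CONTINUE, &cont);
	}
}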