On Mon, 11 May 2026 at 21:25, Sean Christopherson <[email protected]> wrote: > > On Mon, May 11, 2026, Fuad Tabba wrote: > > On Mon, 11 May 2026 at 15:58, Sean Christopherson <[email protected]> wrote: > > > > > > On Mon, May 11, 2026, Fuad Tabba wrote: > > > > kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and > > > > kvm_vm_free() calls kvm_vm_release() at teardown. A test that calls > > > > kvm_vm_release() and then kvm_vm_free() without a > > > > vm_recreate_with_one_vcpu() in between double-closes both FDs. > > > > > > That's a test bug, no? > > > > Fair point. Let me describe the actual use case and then I'd like > > your guidance on which way to fix it. > > > > The pKVM selftests I'm preparing need to verify that protected guest > > memory is zeroed when the VM is destroyed before the host regains > > visibility of those pages. The test pattern is: > > > > vm_userspace_mem_region_add(vm, ..., GPA_BASE, 1, GPA_PAGES, 0); > > /* run protected guest, which writes/shares/unshares its memory */ > > ... > > TEST_ASSERT(get_proc_locked_vm_size() > GPA_SIZE); /* still pinned */ > > > > kvm_vm_release(vm); /* trigger destroy */ > > > > /* host_mem is still mmap'd in the test process; pKVM should have > > * zeroed it before unpinning */ > > TEST_ASSERT(is_zero(region->host_mem, GPA_PAGES * PAGE_SIZE)); > > > > kvm_vm_free(vm); /* tidy up */ > > TEST_ASSERT(get_proc_locked_vm_size() == 0); > > > > I used kvm_vm_release() because it's the only public API that closes > > vm->fd to trigger kernel-side destruction. But the existing callers > > follow it with vm_recreate_with_one_vcpu(), so the "release + later > > kvm_vm_free()" path isn't exercised today. > > > > I see three ways to make this clean: > > a) This patch: kvm_vm_release() becomes idempotent for its three > > FDs, matching the kvm_stats_release() idiom it already invokes. > > b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g. > > kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel > > destruction while leaving the kvm_vm struct intact for > > post-destruction inspection. kvm_vm_free() learns to handle the > > half-released state. > > c) Something else entirely, e.g., the test should manage vm->fd > > directly and not rely on library helpers for this pattern. > > d) Fully kill the VM; validate the semantics with an explict mmap(). > > The entire point of the test you are writing is to verfiy that a guest_memfd > VMA > doesn't somehow cause KVM to leak state. So, make that obvious instead of > abusing > APIs that kinda sorta do what you want, but not really. > > mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, > region->guest_memfd); > > ... > > kvm_vm_free(vm); > > TEST_ASSERT(is_zero(mem, ...));
The test isn't about guest_memfd. The pKVM support that just landed via Will's series [1] is built on anonymous host memory, and that's the only backing this selftest exercises (VM_MEM_SRC_ANONYMOUS and VM_MEM_SRC_ANONYMOUS_THP). The property under test is the one documented by the same series in Documentation/virt/kvm/arm/pkvm.rst: Status: Isolation of anonymous memory and metadata pages. ... Reclaimed memory is zeroed by the hypervisor ... The test mirrors that contract: assert VmLck > donated size before teardown (pages pinned), trigger kernel-side destruction, assert the host VMA reads as zero (hypervisor scrubbed on reclaim), then assert VmLck == 0 (pages unpinned). kvm_mmap() + kvm_vm_free() + is_zero() doesn't translate here. The only host view of the donated pages is the memslot mmap, and kvm_vm_free() munmaps it on the way out, so inspection has to happen between kernel-side destruction and userspace free. kvm_vm_release() is the only library primitive that does that today. What do you suggest? Cheers, /fuad [1] https://lore.kernel.org/all/[email protected]/

