Hi Richard,

Thank you for the feedback.

On 7/23/24 11:33 PM, Richard Henderson wrote:
> On 7/23/24 11:05, Don Porter wrote:
>> +    if (env->hflags & HF_GUEST_MASK) {
>> +
>> +        /* Extract the EPTP value from vmcs12 structure, store in arch state */
>> +        if (env->nested_state->format == KVM_STATE_NESTED_FORMAT_VMX) {
>> +            struct vmcs12 *vmcs =
>> +                (struct vmcs12 *) env->nested_state->data.vmx->vmcs12;
>
> This is not required.  You appear to be confused by nested paging.
>
> First: nested paging is how hardware virtualization works.  When we are *using* hardware virtualization, all of that is the kernel's job.  Our job as hypervisor is to give a bag of pages to the kernel and have it map them into the guest intermediate address space.
>
> When we are *using* hardware virtualization, we are only ever concerned with one level of paging: from the guest to the intermediate address space.  From there we use QEMU data structures to map to QEMU virtual address space (address_space_ld/st, etc).
>
> This is all we will ever see from KVM, HVF etc.
>
> With TCG, we can *emulate* hardware virtualization.  It is at this point where we are concerned about two levels of paging, because QEMU is handling both.

I actually think we are close to the same understanding. The difference is that one can use KVM to emulate (well, pass through to) Intel's virtualization hardware for a guest, which is a use case I care about for my course. TCG appears to support only AMD's hardware virtualization interfaces.

One of my test cases for these debugging features was a simple guest/nested hypervisor running on emulated VT-x hardware. That is, the "guest" code enters VMX root mode, sets up a VMCS and EPT, launches a guest, and so on.

My understanding is that when one uses KVM in this way, KVM in the kernel creates shadow page tables that merge the guest hypervisor's and the host hypervisor's tables, transparently to the guest.
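
To make the merging concrete, here is a toy sketch of what I mean, not KVM's actual code. It models each table as a flat array of page numbers and composes the guest hypervisor's table with the host's. The names (build_shadow, NPAGES, UNMAPPED) are made up for illustration, and real tables are multi-level and carry permission bits.

```c
#include <stddef.h>
#include <stdint.h>

#define NPAGES   8           /* toy address space size, in pages */
#define UNMAPPED UINT32_MAX  /* sentinel for "no translation" */

/*
 * inner[]  : nested-guest page -> guest-hypervisor page
 * outer[]  : guest-hypervisor page -> host page
 * shadow[] : nested-guest page -> host page (the merged table)
 *
 * A shadow entry exists only if both levels map the page.
 */
static void build_shadow(const uint32_t inner[NPAGES],
                         const uint32_t outer[NPAGES],
                         uint32_t shadow[NPAGES])
{
    for (size_t i = 0; i < NPAGES; i++) {
        uint32_t mid = inner[i];

        if (mid == UNMAPPED || mid >= NPAGES) {
            shadow[i] = UNMAPPED;
        } else {
            shadow[i] = outer[mid];
        }
    }
}
```

The hardware then walks only the merged table, which is why the nested guest never sees the two separate levels.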

My reading of the KVM code is that this ioctl is the way that emulated architectural state (like the VMCS) is synced from the kernel back to QEMU.  I don't see another KVM API for getting things like the extended page table root and certain VMCS configuration flags that one needs to walk the page tables.  Most of this state is not currently exposed to debugging features in QEMU, which is why this definition was not needed until now.
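
For reference, this is roughly the information a page walker needs out of the EPT pointer once it has the raw 64-bit value. It is a minimal sketch following the EPTP field layout in the Intel SDM. The struct and function names are hypothetical, not taken from the patch or from QEMU, and the address mask assumes a physical address width of at most 52 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded EPTP fields. */
struct eptp_fields {
    uint64_t root_pa;      /* physical address of the EPT PML4 table */
    unsigned memtype;      /* bits 2:0, e.g. 0 = UC, 6 = WB */
    unsigned walk_levels;  /* bits 5:3 hold (page-walk length - 1) */
    bool     ad_enabled;   /* bit 6: accessed/dirty flags enabled */
};

/* Assumption: MAXPHYADDR <= 52, so the root address is bits 51:12. */
#define EPTP_ROOT_MASK 0x000ffffffffff000ULL

static struct eptp_fields eptp_decode(uint64_t eptp)
{
    struct eptp_fields f;

    f.root_pa     = eptp & EPTP_ROOT_MASK;
    f.memtype     = (unsigned)(eptp & 0x7);
    f.walk_levels = (unsigned)((eptp >> 3) & 0x7) + 1;
    f.ad_enabled  = ((eptp >> 6) & 0x1) != 0;
    return f;
}
```

For example, a value of 0x1234505e would decode to a root at 0x12345000, write-back memory type, a 4-level walk, and accessed/dirty flags enabled.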

I am open to other suggestions for how to get that state, like the EPT root pointer, from KVM.  Perhaps I am missing something.

----

I will admit the intermediate address space is a lot to get one's head around.  I believe I have consistently used appropriate address_space_ld/st and friends at this point.

Thanks again,

Don

