On 06/22/2010 05:54 PM, Nadav Har'El wrote:

I'm not sure what we need to do with a vmcs that is not in RAM.  It may
simplify things to return the error_page to the caller and set
KVM_REQ_TRIPLE_FAULT, so we don't have to deal with error handling later on.
This is a very good point. The approach in the patches I sent was to pin
the L1-specified VMCS page and map it every time it was needed, and unpin
and unmap it immediately afterward. This indeed will not work correctly if
called with interrupts disabled and for some reason the vmcs12 page is swapped
out...

I decided to reconsider this whole approach. In the new approach, we pin
and kmap the guest vmcs12 page on VMPTRLD time (when sleeping is fine),
and unpin and unmap it when a different VMPTRLD is done (or VMCLEAR, or
cleanup). Whenever we want to use this vmcs12 in the code - we can do so
without needing to swap in or map anything.
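
Roughly, the idea looks like this (just a sketch; the field names under
vmx->nested are illustrative, not the actual patch):

  static int nested_map_vmcs12(struct vcpu_vmx *vmx, gpa_t vmptr)
  {
          struct page *page;

          /* gfn_to_page() may sleep, which is fine while emulating VMPTRLD */
          page = gfn_to_page(vmx->vcpu.kvm, vmptr >> PAGE_SHIFT);
          if (is_error_page(page))
                  return -EINVAL;

          vmx->nested.current_vmcs12_page = page;
          vmx->nested.current_vmcs12 = kmap(page);  /* stays mapped */
          return 0;
  }

  static void nested_unmap_vmcs12(struct vcpu_vmx *vmx)
  {
          if (!vmx->nested.current_vmcs12_page)
                  return;
          kunmap(vmx->nested.current_vmcs12_page);
          kvm_release_page_dirty(vmx->nested.current_vmcs12_page);
          vmx->nested.current_vmcs12_page = NULL;
          vmx->nested.current_vmcs12 = NULL;
  }

VMPTRLD would call nested_unmap_vmcs12() for any previously loaded vmcs12
and then nested_map_vmcs12() for the new one; VMCLEAR and cleanup just call
nested_unmap_vmcs12().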

The code should now handle the rare situation when the vmcs12 gets swapped
out, and is cleaner (almost 100 lines of ugly nested_map_current()/
nested_unmap_current() calls were eliminated). The only downside I see is that
now when nested vmx is being used, a single page, the current vmcs of L1, is
always pinned and kmapped. I believe that pinning and mapping one single page
(no matter how many guests there are) is a small price to pay - we already
spend more than that in other places (e.g., one vmcs02 page per L2 guest).

Does this sound reasonable to you?

kmap() really should be avoided when possible. It is for when we don't have a pte pointing to the page (for example, accessing a user page from outside its process context).

Really, the correct API is kvm_read_guest() and kvm_write_guest(). They can easily be wrapped with something that takes a vmcs12 field and automatically references the vmptr:


  kvm_set_cr0(vcpu, gvmcs_read64(vcpu, guest_cr0));

This will take care of mark_page_dirty() etc.
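
For concreteness, such wrappers could look roughly like this (a sketch only;
vmx->nested.current_vmptr and struct vmcs12 are assumptions about how the
nested code stores the current vmptr and lays out the fields):

  static u64 gvmcs_read64(struct kvm_vcpu *vcpu, unsigned long offset)
  {
          gpa_t gpa = to_vmx(vcpu)->nested.current_vmptr + offset;
          u64 value;

          /* kvm_read_guest() finds the memslot and copies from guest memory */
          if (kvm_read_guest(vcpu->kvm, gpa, &value, sizeof(value)))
                  return 0;  /* a real version would propagate the error */
          return value;
  }

  static void gvmcs_write64(struct kvm_vcpu *vcpu, unsigned long offset, u64 value)
  {
          gpa_t gpa = to_vmx(vcpu)->nested.current_vmptr + offset;

          /* kvm_write_guest() takes care of mark_page_dirty() for us */
          kvm_write_guest(vcpu->kvm, gpa, &value, sizeof(value));
  }

The guest_cr0 argument in the example above would then just be shorthand
(a macro, say) for offsetof(struct vmcs12, guest_cr0).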

kvm_*_guest() is a slow API since it needs to search the memory slot list, but that can be optimized easily by caching the memory slot (and invalidating the cache when the memory map changes). The optimized API can end up doing just a pointer fetch and a get/put_user(), which is very efficient.
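
Something like this, very roughly (all names made up; the valid flag would be
cleared from the memslot update path whenever the memory map changes):

  struct gvmcs_cache {
          gpa_t         gpa;    /* guest-physical base of the vmcs12 page */
          unsigned long hva;    /* cached host virtual address */
          bool          valid;  /* cleared whenever the memslots change */
  };

  static int gvmcs_cached_read(struct kvm_vcpu *vcpu, struct gvmcs_cache *c,
                               unsigned long offset, u64 *val)
  {
          if (!c->valid) {
                  /* slow path: search the memslots once, remember the hva */
                  c->hva = gfn_to_hva(vcpu->kvm, c->gpa >> PAGE_SHIFT);
                  if (kvm_is_error_hva(c->hva))
                          return -EFAULT;
                  c->valid = true;
          }
          /* fast path: a pointer fetch plus a user-copy of one field */
          return copy_from_user(val, (void __user *)(c->hva + offset),
                                sizeof(*val)) ? -EFAULT : 0;
  }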

Now the only problem is access from atomic contexts; as far as I can tell there are two areas where this is needed:

1) interrupt injection
2) optimized VMREAD/VMWRITE emulation

There are other reasons to move interrupt injection out of atomic context. If that's the only thing in the way of using kvm_read/write_guest(), I'll be happy to prioritize that work.

Alex, Joerg, would gvmcb_{read,write}{32,64}() work for nsvm? All that kmapping is incredibly annoying.

I guess it's fine to start with the kmap() based implementation and change it later.

Note: you need to kunmap() and kvm_release_page_dirty() not only on vmxoff and vmptrld/vmclear, but also when ioctl(KVM_VCPU_RUN) exits to userspace. That ensures that live migration sees the page dirtied and is able to copy it.
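
Sketched with the hypothetical helper from earlier in the thread (the exact
call site, somewhere on the path where KVM_VCPU_RUN returns to userspace, is
left vague here on purpose):

  /* called before KVM_VCPU_RUN returns to userspace */
  static void nested_release_vmcs12_for_userspace(struct kvm_vcpu *vcpu)
  {
          /*
           * kvm_release_page_dirty() inside marks the page dirty in the
           * dirty bitmap, so live migration will re-copy it; the page is
           * re-pinned and re-mapped lazily on the next use.
           */
          nested_unmap_vmcs12(to_vmx(vcpu));
  }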


--
error compiling committee.c: too many arguments to function
