On 03/09/2015 09:07 PM, Nadav Amit wrote:
Avi Kivity <avi.kiv...@gmail.com> wrote:

On 03/09/2015 07:51 PM, Nadav Amit wrote:
Avi Kivity <avi.kiv...@gmail.com> wrote:

On 03/03/2015 11:52 AM, Paolo Bonzini wrote:
In this
case, the VM might expect exceptions when PTE bits which are higher than the
maximum (reported) address width are set, and it would not get such
exceptions. This problem can easily be reproduced with a small change to the
existing KVM unit-tests.

There are many variants to this problem, and the only solution which I
consider complete is to report the maximum (52) physical address width to
the VM, configure the VM to exit on #PF with reserved-bit error-codes, and
then emulate these faulting instructions.
Not even that would be a definitive solution.  If the guest tries to map
RAM (e.g. a PCI BAR that is backed by RAM) above the host MAXPHYADDR,
you would get EPT misconfiguration vmexits.

I think there is no way to emulate physical address width correctly,
except by disabling EPT.
Is the issue emulating a higher MAXPHYADDR on the guest than is available
on the host? I don't think there's any need to support that.

Emulating a lower setting on the guest than is available on the host is, I
think, desirable. Whether it would work depends on the relative priority
of EPT misconfiguration exits vs. page table permission faults.
Thanks for the feedback.

Guest page-table permission faults take priority over EPT misconfiguration.
KVM can even be set to trap page-table permission faults, at least in VT-x.
Anyhow, I don’t think it is enough.
Why is it not enough? If you trap a permission fault, you can inject any 
exception error code you like.
Because there is no real permission fault. In the following example, the VM
expects one (VM’s MAXPHYADDR=40), but there isn’t (Host’s MAXPHYADDR=46), so
the hypervisor cannot trap it. It can only trap all #PF, which is obviously
too intrusive.
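
To make the mismatch concrete, here is a minimal sketch (Python, for illustration only; the helper name is made up for this example and is not KVM code) of which PTE physical-address bits the guest treats as reserved while the host hardware does not, using the MAXPHYADDR values above:

```python
def reserved_pa_mask(maxphyaddr):
    """Mask of bits maxphyaddr..51, the reserved physical-address bits
    of a paging-structure entry for a given MAXPHYADDR."""
    return ((1 << 52) - 1) & ~((1 << maxphyaddr) - 1)

GUEST_MAXPHYADDR = 40  # what the VM is told
HOST_MAXPHYADDR = 46   # what the hardware implements

# Bits the guest expects to fault on, but which the CPU accepts as address bits.
guest_only = reserved_pa_mask(GUEST_MAXPHYADDR) & ~reserved_pa_mask(HOST_MAXPHYADDR)
guest_only_bits = [b for b in range(64) if (guest_only >> b) & 1]
print(guest_only_bits)
```

Any PTE that sets one of these bits is where the guest's expectation and the hardware's behavior diverge.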

There are three cases:

1) The guest has marked the page as not present. In this case, no reserved bits are set and the guest should receive its #PF.

2) The page is present and the permissions are sufficient. In this case, you will get an EPT misconfiguration and can proceed to inject a #PF with the reserved bit flag set.

3) The page is present but permissions are not sufficient. In this case you can trap the fault via the PFEC_MASK register and inject a #PF to the guest.

So you can emulate it and only trap permission faults. It's still too expensive though.
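
The three cases above can be sketched roughly as follows. This is an illustrative Python model, not KVM code: the function names are hypothetical, only a leaf PTE is considered, NX/SMEP/SMAP/CR0.WP are ignored, and it assumes the hypervisor has arranged for guest-physical addresses above the guest's MAXPHYADDR to hit an EPT misconfiguration. The last helper models the VMX page-fault error-code mask/match rule used in case 3.

```python
PTE_P = 1 << 0  # present
PTE_W = 1 << 1  # writable
PTE_U = 1 << 2  # user-accessible

def guest_reserved(pte, guest_maxphyaddr):
    """True if the PTE sets any address bit in guest_maxphyaddr..51."""
    mask = ((1 << 52) - 1) & ~((1 << guest_maxphyaddr) - 1)
    return bool(pte & mask)

def classify(pte, guest_maxphyaddr, write, user):
    """Map one leaf-PTE access onto the three cases (simplified model)."""
    if not pte & PTE_P:
        return "case 1: not present, hardware delivers the guest's #PF"
    if not guest_reserved(pte, guest_maxphyaddr):
        return "no guest-reserved bits set, nothing to emulate"
    allowed = (not user or bool(pte & PTE_U)) and (not write or bool(pte & PTE_W))
    if allowed:
        return "case 2: EPT misconfiguration exit, inject #PF with RSVD set"
    return "case 3: permission fault, trap via PFEC mask/match, inject #PF"

def pf_exits(pfec, pfec_mask, pfec_match, bitmap_bit14=True):
    """A #PF causes a VM exit when the result of (PFEC & MASK) == MATCH
    agrees with bit 14 of the exception bitmap."""
    return ((pfec & pfec_mask) == pfec_match) == bitmap_bit14

# Trap only faults whose error code has P=1 (permission faults): mask=1, match=1.
print(classify(0x200002000001, 40, write=False, user=True))
print(pf_exits(0x5, 1, 1), pf_exits(0x4, 1, 1))
```

With mask=1 and match=1, not-present faults (case 1) are delivered to the guest without an exit, which is why only permission faults pay the trapping cost.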


  Here is an example

My machine has MAXPHYADDR of 46. I modified kvm-unit-tests access test to
set pte.45 instead of pte.51, which from the VM point-of-view should cause
the #PF error-code to indicate that the reserved bits are set (just as pte.51 does).
Here is one error from the log:

test pte.p pte.45 pde.p user: FAIL: error code 5 expected d
Dump mapping: address: 123400000000
------L4: 304b007
------L3: 304c007
------L2: 304d001
------L1: 200002000001
This is with an ept misconfig programmed into that address, yes?
A reserved bit in the PTE is set - from the VM point-of-view. If there
wasn’t another cause for #PF, it would lead to EPT violation/misconfig.

As you can see, the #PF should have had two reasons: reserved bits, and user
access to supervisor only page. The error-code however does not indicate the
reserved-bits are set.
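
For readers decoding the log line, a small Python sketch (illustrative only; the flag names follow the usual x86 #PF error-code layout, and the helper is made up for this example) shows how the observed and expected error codes differ and what the dumped L1 entry contains:

```python
# x86 page-fault error code bits 0..4.
PFEC_FLAGS = {0: "P", 1: "W/R", 2: "U/S", 3: "RSVD", 4: "I/D"}

def decode_pfec(code):
    """Return the names of the flags set in a #PF error code."""
    return [name for bit, name in PFEC_FLAGS.items() if (code >> bit) & 1]

observed = decode_pfec(0x5)  # what the test got
expected = decode_pfec(0xd)  # what the test wanted

l1 = 0x200002000001          # L1 entry from the dump
bit45_set = bool((l1 >> 45) & 1)
user_bit_set = bool(l1 & (1 << 2))

print(observed, expected, bit45_set, user_bit_set)
```

The only difference between the two codes is the RSVD flag, and the L1 entry has bit 45 set with the user bit clear, matching the two fault reasons described above.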

Note that KVM did not get any exit on that faulting instruction. Otherwise it
would have tried to emulate the instruction and, assuming the instruction is
supported (and that the #PF was not on an instruction fetch), should have been
able to emulate the #PF correctly.
[ The test actually crashes soon after this error due to these reasons. ]

Anyhow, that is the reason for me to assume that having the maximum
MAXPHYADDR is better.
Well, that doesn't work for the reasons Paolo noted.  The guest can have an 
ivshmem device attached, and map it above a host-supported physical address, and 
suddenly it goes slow.
I fully understand. That’s the reason I don’t have a reasonable solution.

I can't think of one with reasonable performance either. Perhaps the maintainers could raise the issue with Intel. It looks academic but it can happen in real life -- KVM for example used to rely on reserved bits faults (it set all bits in the PTE so it wouldn't have been caught by this).