[PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-27 Thread Joerg Roedel
This patch implements emulation of the SVM next_rip
feature in the nested SVM implementation in KVM.

Signed-off-by: Joerg Roedel 
---
 arch/x86/kvm/svm.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7d10f2c..b44c9cc 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1919,6 +1919,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
 	nested_vmcb->control.exit_info_2       = vmcb->control.exit_info_2;
 	nested_vmcb->control.exit_int_info     = vmcb->control.exit_int_info;
 	nested_vmcb->control.exit_int_info_err = vmcb->control.exit_int_info_err;
+	nested_vmcb->control.next_rip          = vmcb->control.next_rip;
 
 	/*
 	 * If we emulate a VMRUN/#VMEXIT in the same host #vmexit cycle we have
@@ -3356,7 +3357,12 @@ static void svm_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
 		entry->ebx = 8; /* Lets support 8 ASIDs in case we add proper
 				   ASID emulation to nested SVM */
 		entry->ecx = 0; /* Reserved */
-		entry->edx = 0; /* Do not support any additional features */
+		entry->edx = 0; /* Per default do not support any
+				   additional features */
+
+		/* Support next_rip if host supports it */
+		if (svm_has(SVM_FEATURE_NRIP))
+			entry->edx |= SVM_FEATURE_NRIP;
 
 		break;
 	}
-- 
1.7.0.4
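
The CPUID half of the patch can be read as a masking rule: start from "no additional features" and advertise next_rip to the nested hypervisor only when the host has it. The following is a minimal, self-contained sketch of that rule, not the kernel code. `struct cpuid_entry` and `fill_nested_svm_leaf()` are hypothetical stand-ins, and the `host_features` parameter replaces the `svm_has()` query used in the patch. The bit value matches the kernel's `SVM_FEATURE_NRIP` definition (CPUID Fn8000_000A EDX bit 3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPUID Fn8000_000A EDX bit 3 (NRIPS), same value as SVM_FEATURE_NRIP. */
#define SVM_FEATURE_NRIP (1u << 3)

/* Hypothetical stand-in for struct kvm_cpuid_entry2 (only the fields used). */
struct cpuid_entry {
    uint32_t eax, ebx, ecx, edx;
};

/*
 * Fill the fields touched by the hunk above for a nested guest.
 * host_features is the host's EDX for this leaf; passing it in is an
 * assumption made so the sketch has no global state.
 */
static void fill_nested_svm_leaf(struct cpuid_entry *entry,
                                 uint32_t host_features)
{
    assert(entry != NULL);

    entry->ebx = 8; /* number of ASIDs, as in the patch context */
    entry->ecx = 0; /* reserved */
    entry->edx = 0; /* by default, no additional features */

    /* Advertise next_rip only if the host supports it. */
    if (host_features & SVM_FEATURE_NRIP)
        entry->edx |= SVM_FEATURE_NRIP;
}
```

Other host feature bits are deliberately not passed through: only features the nested implementation actually emulates should be visible to L1.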


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-27 Thread Avi Kivity

 On 07/27/2010 07:14 PM, Joerg Roedel wrote:

> This patch implements the emulations of the svm next_rip
> feature in the nested svm implementation in kvm.
>
> Signed-off-by: Joerg Roedel
> ---
>   arch/x86/kvm/svm.c |    8 +++++++-
>   1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 7d10f2c..b44c9cc 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1919,6 +1919,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
>  	nested_vmcb->control.exit_info_2       = vmcb->control.exit_info_2;
>  	nested_vmcb->control.exit_int_info     = vmcb->control.exit_int_info;
>  	nested_vmcb->control.exit_int_info_err = vmcb->control.exit_int_info_err;
> +	nested_vmcb->control.next_rip          = vmcb->control.next_rip;



Can it really be this simple?  Suppose we emulate a nested guest
instruction just before vmexit; doesn't that invalidate
vmcb->control.next_rip?  Can that happen?


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-28 Thread Roedel, Joerg
On Tue, Jul 27, 2010 at 02:32:35PM -0400, Avi Kivity wrote:
>   On 07/27/2010 07:14 PM, Joerg Roedel wrote:
> > This patch implements the emulations of the svm next_rip
> > feature in the nested svm implementation in kvm.
> >
> > Signed-off-by: Joerg Roedel
> > ---
> >   arch/x86/kvm/svm.c |8 +++-
> >   1 files changed, 7 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index 7d10f2c..b44c9cc 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -1919,6 +1919,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
> >  	nested_vmcb->control.exit_info_2       = vmcb->control.exit_info_2;
> >  	nested_vmcb->control.exit_int_info     = vmcb->control.exit_int_info;
> >  	nested_vmcb->control.exit_int_info_err = vmcb->control.exit_int_info_err;
> > +	nested_vmcb->control.next_rip          = vmcb->control.next_rip;
> >
> 
> Can it really be this simple?  Suppose we emulate a nested guest
> instruction just before vmexit; doesn't that invalidate
> vmcb->control.next_rip?  Can that happen?

Good point. I looked again into it. The documentation states:

	The next sequential instruction pointer (nRIP) is saved in
	the guest VMCB control area at location C8h on all #VMEXITs that
	are due to instruction intercepts, as defined in Section 15.8 on
	page 378, as well as MSR and IOIO intercepts and exceptions
	caused by the INT3, INTO, and BOUND instructions. For all other
	intercepts, nRIP is reset to zero.

There are a few intercepts that may need injection when running nested
immediately after an instruction emulation on the host side:

	INTR, NMI
	#PF, #GP, ...

None of these intercepts provides a valid next_rip on #vmexit, so we
should be safe here. The other way around, copying back a next_rip
pointer when there should be none, should also not happen as far as I
can see. The next_rip is only set for instruction intercepts, which are
either handled on the host side or reinjected directly into the L1
hypervisor.
If you don't see a failing case either, I think we are safe with this
simple implementation.
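
The argument above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not kernel code: `enum exit_class` is a hypothetical coarse classification (these are not real SVM exit codes), `hw_record_exit()` models the hardware rule quoted from the documentation, and `nested_copy_next_rip()` mirrors the one-line copy added to `nested_svm_vmexit()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical coarse classes of #VMEXIT reasons (not real exit codes). */
enum exit_class {
    EXIT_INSN_INTERCEPT,       /* instruction intercepts (CPUID, HLT, ...) */
    EXIT_MSR,                  /* MSR intercept */
    EXIT_IOIO,                 /* IOIO intercept */
    EXIT_EXCP_INT3_INTO_BOUND, /* exceptions raised by INT3/INTO/BOUND */
    EXIT_EXCP_OTHER,           /* #PF, #GP, ... */
    EXIT_INTR,                 /* physical interrupt */
    EXIT_NMI                   /* NMI */
};

/* Toy VMCB control area holding only the field under discussion. */
struct vmcb_control {
    uint64_t next_rip;
};

/* The documented rule: which exits carry a valid nRIP. */
static bool exit_provides_nrip(enum exit_class cls)
{
    switch (cls) {
    case EXIT_INSN_INTERCEPT:
    case EXIT_MSR:
    case EXIT_IOIO:
    case EXIT_EXCP_INT3_INTO_BOUND:
        return true;
    default:
        return false; /* for all other intercepts nRIP is reset to zero */
    }
}

/* Model of what the hardware stores in the VMCB on #VMEXIT. */
static void hw_record_exit(struct vmcb_control *c, enum exit_class cls,
                           uint64_t next_insn)
{
    c->next_rip = exit_provides_nrip(cls) ? next_insn : 0;
}

/* The copy the patch adds when reflecting the exit to L1. */
static void nested_copy_next_rip(struct vmcb_control *nested,
                                 const struct vmcb_control *host)
{
    nested->next_rip = host->next_rip;
}
```

In this model the exits that may be injected right after a host-side emulation (INTR, NMI, #PF, #GP) always record zero, so a plain copy cannot hand L1 a stale next_rip for them.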

Joerg

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632



Re: [PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-28 Thread Avi Kivity

 On 07/28/2010 12:37 PM, Roedel, Joerg wrote:



>> Can it really be this simple?  Suppose we emulate a nested guest
>> instruction just before vmexit; doesn't that invalidate
>> vmcb->control.next_rip?  Can that happen?
>
> Good point. I looked again into it. The documentation states:
>
> 	The next sequential instruction pointer (nRIP) is saved in
> 	the guest VMCB control area at location C8h on all #VMEXITs that
> 	are due to instruction intercepts, as defined in Section 15.8 on
> 	page 378, as well as MSR and IOIO intercepts and exceptions
> 	caused by the INT3, INTO, and BOUND instructions. For all other
> 	intercepts, nRIP is reset to zero.
>
> There are a few intercepts that may need injection when running nested
> immediately after an instruction emulation on the host side:
>
> 	INTR, NMI
> 	#PF, #GP, ...
>
> None of these intercepts provides a valid next_rip on #vmexit, so we
> should be safe here. The other way around, copying back a next_rip
> pointer when there should be none, should also not happen as far as I
> can see. The next_rip is only set for instruction intercepts, which are
> either handled on the host side or reinjected directly into the L1
> hypervisor.
> If you don't see a failing case either, I think we are safe with this
> simple implementation.


I agree, looks like everything's fine here.

We have a slightly different problem: if the nested guest manages to get
an instruction emulated by the host (if L1 assigned it the cirrus
framebuffer, for example, so from L1's point of view it is RAM, but
from L0's point of view it is emulated), then we miss the intercept.
L2 could take over L1 this way.




Re: [PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-28 Thread Roedel, Joerg
On Wed, Jul 28, 2010 at 06:28:06AM -0400, Avi Kivity wrote:
> We have a slightly different problem: if the nested guest manages to get
> an instruction emulated by the host (if L1 assigned it the cirrus
> framebuffer, for example, so from L1's point of view it is RAM, but
> from L0's point of view it is emulated), then we miss the intercept.
> L2 could take over L1 this way.

I wonder how this could happen. Shouldn't the shadow paging code take
care of this?

Joerg



Re: [PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-28 Thread Avi Kivity

 On 07/28/2010 02:25 PM, Roedel, Joerg wrote:

> On Wed, Jul 28, 2010 at 06:28:06AM -0400, Avi Kivity wrote:
>> We have a slightly different problem: if the nested guest manages to get
>> an instruction emulated by the host (if L1 assigned it the cirrus
>> framebuffer, for example, so from L1's point of view it is RAM, but
>> from L0's point of view it is emulated), then we miss the intercept.
>> L2 could take over L1 this way.
>
> I wonder how this could happen. Shouldn't the shadow paging code take
> care of this?



L1 thinks the memory is RAM, so it maps it directly and forgets about 
it.  L0 knows it isn't, so it leaves it unmapped and emulates any 
instruction which accesses it.  The emulator needs to check whether the 
instruction is intercepted or not.


Note, I think if the instruction operand is in mmio, we're safe, since 
the intercept has higher priority than memory access.  But if the 
instruction itself is on mmio, or if we entered the emulator through smp 
trickery, then the emulator will execute the instruction in nested guest 
context.




Re: [PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-28 Thread Roedel, Joerg
On Wed, Jul 28, 2010 at 07:34:11AM -0400, Avi Kivity wrote:
>   On 07/28/2010 02:25 PM, Roedel, Joerg wrote:
> > On Wed, Jul 28, 2010 at 06:28:06AM -0400, Avi Kivity wrote:
> >> We have a slightly different problem: if the nested guest manages to get
> >> an instruction emulated by the host (if L1 assigned it the cirrus
> >> framebuffer, for example, so from L1's point of view it is RAM, but
> >> from L0's point of view it is emulated), then we miss the intercept.
> >> L2 could take over L1 this way.
> > I wonder how this could happen. Shouldn't the shadow paging code take
> > care of this?
> >
> 
> L1 thinks the memory is RAM, so it maps it directly and forgets about 
> it.  L0 knows it isn't, so it leaves it unmapped and emulates any 
> instruction which accesses it.  The emulator needs to check whether the 
> instruction is intercepted or not.

Instruction intercepts take precedence over exception intercepts. So if
the L2 executes an instruction which the L1 hypervisor wants to
intercept we get this instruction intercept on the host side and
re-inject it.
To my understanding the fault-intercept which causes the emulator to run
can only happen if the instruction causing the fault isn't intercepted
itself.

> Note, I think if the instruction operand is in mmio, we're safe, since 
> the intercept has higher priority than memory access.  But if the 
> instruction itself is on mmio, or if we entered the emulator through smp 
> trickery, then the emulator will execute the instruction in nested guest 
> context.

Right. But if the guest executes code which is on mmio we are doomed
anyway because our instruction emulator does not emulate the whole x86
instruction set, right?



Re: [PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-28 Thread Avi Kivity

 On 07/28/2010 02:51 PM, Roedel, Joerg wrote:

> On Wed, Jul 28, 2010 at 07:34:11AM -0400, Avi Kivity wrote:
>>   On 07/28/2010 02:25 PM, Roedel, Joerg wrote:
>>> On Wed, Jul 28, 2010 at 06:28:06AM -0400, Avi Kivity wrote:
>>>> We have a slightly different problem: if the nested guest manages to get
>>>> an instruction emulated by the host (if L1 assigned it the cirrus
>>>> framebuffer, for example, so from L1's point of view it is RAM, but
>>>> from L0's point of view it is emulated), then we miss the intercept.
>>>> L2 could take over L1 this way.
>>> I wonder how this could happen. Shouldn't the shadow paging code take
>>> care of this?
>>
>> L1 thinks the memory is RAM, so it maps it directly and forgets about
>> it.  L0 knows it isn't, so it leaves it unmapped and emulates any
>> instruction which accesses it.  The emulator needs to check whether the
>> instruction is intercepted or not.
>
> Instruction intercepts take precedence over exception intercepts. So if
> the L2 executes an instruction which the L1 hypervisor wants to
> intercept we get this instruction intercept on the host side and
> re-inject it.
> To my understanding the fault-intercept which causes the emulator to run
> can only happen if the instruction causing the fault isn't intercepted
> itself.


If the instruction opcode is on mmio, the processor never sees the 
opcode and thus can not intercept.  Or the processor may see one 
instruction, which is not intercepted, but by the time the emulator 
kicks in a different instruction takes its place, since another vcpu is 
evilly cross-modifying the code.
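
The cross-modification case is a time-of-check/time-of-use problem: the processor decides on the bytes it executed, while the emulator decides on the bytes it fetches later. A deliberately tiny sketch of that gap follows, assuming a toy guest whose code page is a byte array and whose L1 intercepts only HLT. `struct toy_guest` and `emulator_must_exit_to_l1()` are hypothetical names, and only the first opcode byte is inspected (0x89 is a MOV store, 0xF4 is HLT).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define OPC_MOV_STORE 0x89u /* MOV r/m32, r32: a store that may hit MMIO */
#define OPC_HLT       0xF4u /* HLT */

/* Toy model of an L2 guest as seen by the L0 emulator. */
struct toy_guest {
    uint8_t code[16];       /* L2 code bytes, writable by other vcpus */
    bool l1_intercepts_hlt; /* L1 asked for HLT to be intercepted */
};

/*
 * Decide from the bytes the emulator itself fetches whether the
 * instruction must be reflected to L1 instead of being executed.
 * Nothing the processor saw earlier is trusted here.
 */
static bool emulator_must_exit_to_l1(const struct toy_guest *g, size_t rip)
{
    if (rip >= sizeof(g->code))
        return false; /* out of range for this toy model */

    uint8_t opcode = g->code[rip]; /* fresh fetch at emulation time */
    return opcode == OPC_HLT && g->l1_intercepts_hlt;
}
```

If another vcpu rewrites `code[rip]` from the store to HLT between the fault and the emulation, only a check made on the emulator's own fetch notices that L1 wanted this instruction.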



>> Note, I think if the instruction operand is in mmio, we're safe, since
>> the intercept has higher priority than memory access.  But if the
>> instruction itself is on mmio, or if we entered the emulator through smp
>> trickery, then the emulator will execute the instruction in nested guest
>> context.
>
> Right. But if the guest executes code which is on mmio we are doomed
> anyway because our instruction emulator does not emulate the whole x86
> instruction set, right?


The guest (L2 in this case) is doomed since its execution cannot 
continue.  But L1 and L0 are fine.  The problem with L2 avoiding 
intercepts is that L2 can change control registers and take over L1.




Re: [PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-28 Thread Roedel, Joerg
On Wed, Jul 28, 2010 at 07:57:36AM -0400, Avi Kivity wrote:

> If the instruction opcode is on mmio, the processor never sees the 
> opcode and thus can not intercept.  Or the processor may see one 
> instruction, which is not intercepted, but by the time the emulator 
> kicks in a different instruction takes its place, since another vcpu is 
> evilly cross-modifying the code.

Right. X-modifying code is a problem too.

> The guest (L2 in this case) is doomed since its execution cannot 
> continue.  But L1 and L0 are fine.  The problem with L2 avoiding 
> intercepts is that L2 can change control registers and take over L1.

Right too. We cannot ignore it. The right fix is probably a check for
the instruction intercepts right after the decoder has run and before
the emulator runs.

Joerg



Re: [PATCH 2/2] KVM: SVM: Emulate next_rip svm feature

2010-07-28 Thread Avi Kivity

 On 07/28/2010 03:18 PM, Roedel, Joerg wrote:




>> The guest (L2 in this case) is doomed since its execution cannot
>> continue.  But L1 and L0 are fine.  The problem with L2 avoiding
>> intercepts is that L2 can change control registers and take over L1.
>
> Right too. We cannot ignore it. The right fix is probably a check for
> the instruction intercepts right after the decoder has run and before
> the emulator runs.


Should be easy - just like we have the Priv flag, add a bitfield to 
opcode_table that says which bit we need to check in the control area.
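
A rough sketch of that idea, under loud assumptions: every name below (`struct opcode_desc`, `OP_PRIV`, the `ICPT_*` identifiers, `struct toy_vcpu`, `check_intercept()`) is hypothetical and does not reflect the real emulator's tables or the real VMCB intercept layout. It only shows the shape: each opcode entry names the intercept it is subject to, and a check between decode and execute consults L1's intercept mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define OP_PRIV (1u << 0) /* hypothetical "privileged instruction" flag */

/* Hypothetical intercept identifiers; each is a bit index in a toy mask. */
enum {
    ICPT_NONE = 0, /* instruction is never subject to an intercept */
    ICPT_CPUID,
    ICPT_HLT,
    ICPT_WRITE_CR0
};

/* One decoded opcode: flags plus the intercept that applies to it. */
struct opcode_desc {
    uint32_t flags;
    uint8_t intercept; /* one of ICPT_* */
};

/* Toy vcpu state: are we running L2, and what does L1 intercept? */
struct toy_vcpu {
    bool in_nested_guest;
    uint64_t l1_intercepts; /* bit n set => L1 intercepts ICPT n */
};

enum emu_result {
    EMU_EXECUTE,      /* go ahead and emulate the instruction */
    EMU_NESTED_VMEXIT /* reflect a #VMEXIT to L1 instead */
};

/* To be called after decode and before execute. */
static enum emu_result check_intercept(const struct toy_vcpu *vcpu,
                                       const struct opcode_desc *op)
{
    if (!vcpu->in_nested_guest || op->intercept == ICPT_NONE)
        return EMU_EXECUTE;

    if (vcpu->l1_intercepts & (1ull << op->intercept))
        return EMU_NESTED_VMEXIT;

    return EMU_EXECUTE;
}
```

Keeping the intercept identifier in the table entry means the check is a single lookup per emulated instruction, with no per-opcode special cases in the execute path.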

