Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-12 Thread Avi Kivity

On 04/12/2010 12:07 AM, Andre Przywara wrote:

On SVM we set the instruction length of skipped instructions
to hard-coded, well known values, which could be wrong when (bogus,
but valid) prefixes (REX, segment override) are used.
Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or
AthlonII) have an explicit NEXTRIP field in the VMCB containing the
desired information.
Since it is cheap to do so, we use this field to override the guessed
value on newer processors.
A fix for older CPUs would be rather expensive, as it would require
to fetch and partially decode the instruction. As the problem is not
a security issue and needs special, handcrafted code to trigger
(no compiler will ever generate such code), I omit a fix for older
CPUs.
If someone is interested, I have both a patch for these CPUs as well as
demo code triggering this issue: It segfaults under KVM, but runs
perfectly on native Linux.


@@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
  {
struct vcpu_svm *svm = to_svm(vcpu);

+   if (svm-vmcb-control.next_rip != 0)
+   svm-next_rip = svm-vmcb-control.next_rip;
+
if (!svm-next_rip) {
if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) !=
EMULATE_DONE)
   


How does this interact with nested svm?  Suppose we exit a nested guest, 
then emulate a #VMEXIT, we'll have next_rip from a previous exit?



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-12 Thread Alexander Graf

On 12.04.2010, at 12:20, Avi Kivity wrote:

 On 04/12/2010 12:07 AM, Andre Przywara wrote:
 On SVM we set the instruction length of skipped instructions
 to hard-coded, well known values, which could be wrong when (bogus,
 but valid) prefixes (REX, segment override) are used.
 Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or
 AthlonII) have an explicit NEXTRIP field in the VMCB containing the
 desired information.
 Since it is cheap to do so, we use this field to override the guessed
 value on newer processors.
 A fix for older CPUs would be rather expensive, as it would require
 to fetch and partially decode the instruction. As the problem is not
 a security issue and needs special, handcrafted code to trigger
 (no compiler will ever generate such code), I omit a fix for older
 CPUs.
 If someone is interested, I have both a patch for these CPUs as well as
 demo code triggering this issue: It segfaults under KVM, but runs
 perfectly on native Linux.
 
 
 @@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu 
 *vcpu)
  {
  struct vcpu_svm *svm = to_svm(vcpu);
 
 +if (svm-vmcb-control.next_rip != 0)
 +svm-next_rip = svm-vmcb-control.next_rip;
 +
  if (!svm-next_rip) {
  if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) !=
  EMULATE_DONE)
   
 
 How does this interact with nested svm?  Suppose we exit a nested guest, then 
 emulate a #VMEXIT, we'll have next_rip from a previous exit?

Since we don't call skip_emulated_instruction on #VMEXIT injection I think 
we're safe here. The instruction skip is done at the vmrun intercept.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-12 Thread Avi Kivity

On 04/12/2010 01:29 PM, Alexander Graf wrote:

On 12.04.2010, at 12:20, Avi Kivity wrote:

   

On 04/12/2010 12:07 AM, Andre Przywara wrote:
 

On SVM we set the instruction length of skipped instructions
to hard-coded, well known values, which could be wrong when (bogus,
but valid) prefixes (REX, segment override) are used.
Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or
AthlonII) have an explicit NEXTRIP field in the VMCB containing the
desired information.
Since it is cheap to do so, we use this field to override the guessed
value on newer processors.
A fix for older CPUs would be rather expensive, as it would require
to fetch and partially decode the instruction. As the problem is not
a security issue and needs special, handcrafted code to trigger
(no compiler will ever generate such code), I omit a fix for older
CPUs.
If someone is interested, I have both a patch for these CPUs as well as
demo code triggering this issue: It segfaults under KVM, but runs
perfectly on native Linux.


@@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
  {
struct vcpu_svm *svm = to_svm(vcpu);

+   if (svm-vmcb-control.next_rip != 0)
+   svm-next_rip = svm-vmcb-control.next_rip;
+
if (!svm-next_rip) {
if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) !=
EMULATE_DONE)

   

How does this interact with nested svm?  Suppose we exit a nested guest, then 
emulate a #VMEXIT, we'll have next_rip from a previous exit?
 

Since we don't call skip_emulated_instruction on #VMEXIT injection I think 
we're safe here. The instruction skip is done at the vmrun intercept.

   


I think you're right.  If the guest intercepts, we don't 
skip_emulated_instruction().  If the guest doesn't intercept, we keep 
the nested vmcb, so next_rip is right.


Neat.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Alexander Graf

On 11.04.2010, at 23:07, Andre Przywara wrote:

 On SVM we set the instruction length of skipped instructions
 to hard-coded, well known values, which could be wrong when (bogus,
 but valid) prefixes (REX, segment override) are used.
 Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or
 AthlonII) have an explicit NEXTRIP field in the VMCB containing the
 desired information.
 Since it is cheap to do so, we use this field to override the guessed
 value on newer processors.
 A fix for older CPUs would be rather expensive, as it would require
 to fetch and partially decode the instruction. As the problem is not
 a security issue and needs special, handcrafted code to trigger
 (no compiler will ever generate such code), I omit a fix for older
 CPUs.
 If someone is interested, I have both a patch for these CPUs as well as
 demo code triggering this issue: It segfaults under KVM, but runs
 perfectly on native Linux.
 
 Signed-off-by: Andre Przywara andre.przyw...@amd.com
 ---
 arch/x86/include/asm/svm.h |4 +++-
 arch/x86/kvm/svm.c |   13 -
 2 files changed, 11 insertions(+), 6 deletions(-)
 
 diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
 index b26a38d..1d91d05 100644
 --- a/arch/x86/include/asm/svm.h
 +++ b/arch/x86/include/asm/svm.h
 @@ -81,7 +81,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
   u32 event_inj_err;
   u64 nested_cr3;
   u64 lbr_ctl;
 - u8 reserved_5[832];
 + u64 reserved_5;
 + u64 next_rip;
 + u8 reserved_6[816];
 };
 
 
 diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
 index d04c7ad..7fff56c 100644
 --- a/arch/x86/kvm/svm.c
 +++ b/arch/x86/kvm/svm.c
 @@ -43,11 +43,11 @@ MODULE_LICENSE(GPL);
 #define SEG_TYPE_LDT 2
 #define SEG_TYPE_BUSY_TSS16 3
 
 -#define SVM_FEATURE_NPT  (1  0)
 -#define SVM_FEATURE_LBRV (1  1)
 -#define SVM_FEATURE_SVML (1  2)
 -#define SVM_FEATURE_NRIP (1  3)
 -#define SVM_FEATURE_PAUSE_FILTER (1  10)
 +#define SVM_FEATURE_NPT(1   0)
 +#define SVM_FEATURE_LBRV   (1   1)
 +#define SVM_FEATURE_SVML   (1   2)
 +#define SVM_FEATURE_NRIP   (1   3)
 +#define SVM_FEATURE_PAUSE_FILTER   (1  10)

This is pure indent fixing, right? Sounds like a separate patch to me.

 
 #define NESTED_EXIT_HOST  0   /* Exit handled on host level */
 #define NESTED_EXIT_DONE  1   /* Exit caused nested vmexit  */
 @@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu 
 *vcpu)
 {
   struct vcpu_svm *svm = to_svm(vcpu);
 
 + if (svm-vmcb-control.next_rip != 0)
 + svm-next_rip = svm-vmcb-control.next_rip;
 +

Wouldn't it be more obvious to create a wrapper function to set next_rip in all 
the callers instead of setting it manually and then have magic override the 
value you just set?

Let me illustrate what I mean. Currently we have:

static int halt_interception(struct vcpu_svm *svm)
{
svm-next_rip = kvm_rip_read(svm-vcpu) + 1;
skip_emulated_instruction(svm-vcpu);
return kvm_emulate_halt(svm-vcpu);
}

skip_emulated_instruction sets pc = next_rip. All fine and obvious. With your 
patch it would simply ignore next_rip, rendering the line before void without 
obvious indication of that behavior.

So what I'd prefer is something like:

/* Either adds offset n to the instruction counter or takes the next
instruction pointer from the vmcb if the CPU supports it */

static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add)
{
if (svm-vmcb-control.next_rip != 0)
return svm-vmcb-control.next_rip;

return kvm_rip_read(svm-vcpu) + add;
}

static int halt_interception(struct vcpu_svm *svm)
{
svm-next_rip = svm_next_rip(svm-vcpu, 1);
skip_emulated_instruction(svm-vcpu);
return kvm_emulate_halt(svm-vcpu);
}

That is more readable IMHO and makes the code flow a lot more obvious.


Alex--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Alexander Graf

On 11.04.2010, at 23:40, Alexander Graf wrote:

 
 /* Either adds offset n to the instruction counter or takes the next
instruction pointer from the vmcb if the CPU supports it */
 
 static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add)
 {
if (svm-vmcb-control.next_rip != 0)

In fact, that if should probably be:

if (svm_has(SVM_FEATURE_NRIP))


Alex
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Andre Przywara

Alexander Graf wrote:

On 11.04.2010, at 23:40, Alexander Graf wrote:


/* Either adds offset n to the instruction counter or takes the next
   instruction pointer from the vmcb if the CPU supports it */

static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add)
{
   if (svm-vmcb-control.next_rip != 0)


In fact, that if should probably be:

if (svm_has(SVM_FEATURE_NRIP))
This is not sufficient. The next RIP is only provided for some 
intercepts (namely instruction intercepts), so one would need to check 
the validity of this field anyway. By definition reserved VMCB fields 
are 0, and as 0 is never a valid _next_ RIP, this is a quick and correct 
check.


Regards,
Andre.

P.S. I don't have a strong opinion about your proposed refactoring, if 
Avi agrees I will rework it. I only found the current fix small and 
easy, and the mentioned patch for older CPUs removed the add line 
anyway, so the concerns you rose did not apply to the original version 
of the patch.


--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 488-3567-12

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Alexander Graf

On 11.04.2010, at 23:51, Andre Przywara wrote:

 Alexander Graf wrote:
 On 11.04.2010, at 23:40, Alexander Graf wrote:
 /* Either adds offset n to the instruction counter or takes the next
   instruction pointer from the vmcb if the CPU supports it */
 
 static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add)
 {
   if (svm-vmcb-control.next_rip != 0)
 In fact, that if should probably be:
if (svm_has(SVM_FEATURE_NRIP))
 This is not sufficient. The next RIP is only provided for some intercepts 
 (namely instruction intercepts), so one would need to check the validity of 
 this field anyway. By definition reserved VMCB fields are 0, and as 0 is 
 never a valid _next_ RIP, this is a quick and correct check.

It's not? If you're at -1 which is hlt in our imaginary case. What would the 
next instruction be?

 
 Regards,
 Andre.
 
 P.S. I don't have a strong opinion about your proposed refactoring, if Avi 
 agrees I will rework it. I only found the current fix small and easy, and the 
 mentioned patch for older CPUs removed the add line anyway, so the concerns 
 you rose did not apply to the original version of the patch.

What patch for older CPUs? The one that'd be expensive? I was more concerned 
about readability here - it's great to be able to follow code on what it does 
:-).


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Andre Przywara

Alexander Graf wrote:

On 11.04.2010, at 23:51, Andre Przywara wrote:


Alexander Graf wrote:

On 11.04.2010, at 23:40, Alexander Graf wrote:

/* Either adds offset n to the instruction counter or takes the next
  instruction pointer from the vmcb if the CPU supports it */

static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add)
{
  if (svm-vmcb-control.next_rip != 0)

In fact, that if should probably be:
   if (svm_has(SVM_FEATURE_NRIP))

This is not sufficient. The next RIP is only provided for some

 intercepts (namely instruction intercepts), so one would need to
 check the validity of this field anyway. By definition reserved

VMCB fields are 0, and as 0 is never a valid _next_ RIP, this

 is a quick and correct check.

It's not? If you're at -1 which is hlt in our imaginary case. What would the 
next instruction be?
A wrap-around to zero? From kernel space to user space? Come on, that 
sounds a bit constructed (A20, someone?). I dimly remember reading in 
our internal docs that 0 is a safe indicator for a missing NEXTRIP. I 
will do some research tomorrow.



P.S. I don't have a strong opinion about your proposed refactoring,
if Avi agrees I will rework it. I only found the current fix small
and easy, and the mentioned patch for older CPUs removed the add

 line anyway, so the concerns you rose did not apply to the original

version of the patch.


What patch for older CPUs? The one that'd be expensive?
Yes. It removes the guessed value lines entirely and triggers a decode 
if NEXTRIP is not available.

 I was more concerned about readability here - it's great to
 be able to follow code on what it does :-).
Maybe a comment about the overriding behavior of the NEXTRIP line would 
appease you?


Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 488-3567-12

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Alexander Graf

On 12.04.2010, at 00:13, Andre Przywara wrote:

 Alexander Graf wrote:
 On 11.04.2010, at 23:51, Andre Przywara wrote:
 Alexander Graf wrote:
 On 11.04.2010, at 23:40, Alexander Graf wrote:
 /* Either adds offset n to the instruction counter or takes the next
  instruction pointer from the vmcb if the CPU supports it */
 
 static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add)
 {
  if (svm-vmcb-control.next_rip != 0)
 In fact, that if should probably be:
   if (svm_has(SVM_FEATURE_NRIP))
 This is not sufficient. The next RIP is only provided for some
  intercepts (namely instruction intercepts), so one would need to
  check the validity of this field anyway. By definition reserved
 VMCB fields are 0, and as 0 is never a valid _next_ RIP, this
  is a quick and correct check.
 It's not? If you're at -1 which is hlt in our imaginary case. What would the 
 next instruction be?
 A wrap-around to zero? From kernel space to user space? Come on, that sounds 
 a bit constructed (A20, someone?). I dimly remember reading in our internal 
 docs that 0 is a safe indicator for a missing NEXTRIP. I will do some 
 research tomorrow.

I remember that this was the XBox hack. So it's not as constructed as it sounds 
:-). It's probably not real-world relevant, but worth keeping in mind that 
we're not 100% compatible then.

 
 P.S. I don't have a strong opinion about your proposed refactoring,
 if Avi agrees I will rework it. I only found the current fix small
 and easy, and the mentioned patch for older CPUs removed the add
  line anyway, so the concerns you rose did not apply to the original
 version of the patch.
 What patch for older CPUs? The one that'd be expensive?
 Yes. It removes the guessed value lines entirely and triggers a decode if 
 NEXTRIP is not available.

I see. No need to go that far. The current code works just fine as is and is 
fast.

  I was more concerned about readability here - it's great to
  be able to follow code on what it does :-).
 Maybe a comment about the overriding behavior of the NEXTRIP line would 
 appease you?

Hrm. I don't think it's too much work to go through the individual next_rip 
setters and convert them to call an inline function. In fact, it's probably 
easier than writing mails in this thread :-). If you like I can do it for you?


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html