Re: [PATCH] svm: implement NEXTRIPsave SVM feature
On 04/12/2010 12:07 AM, Andre Przywara wrote: On SVM we set the instruction length of skipped instructions to hard-coded, well known values, which could be wrong when (bogus, but valid) prefixes (REX, segment override) are used. Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or AthlonII) have an explicit NEXTRIP field in the VMCB containing the desired information. Since it is cheap to do so, we use this field to override the guessed value on newer processors. A fix for older CPUs would be rather expensive, as it would require to fetch and partially decode the instruction. As the problem is not a security issue and needs special, handcrafted code to trigger (no compiler will ever generate such code), I omit a fix for older CPUs. If someone is interested, I have both a patch for these CPUs as well as demo code triggering this issue: It segfaults under KVM, but runs perfectly on native Linux. @@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); + if (svm-vmcb-control.next_rip != 0) + svm-next_rip = svm-vmcb-control.next_rip; + if (!svm-next_rip) { if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) != EMULATE_DONE) How does this interact with nested svm? Suppose we exit a nested guest, then emulate a #VMEXIT, we'll have next_rip from a previous exit? -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] svm: implement NEXTRIPsave SVM feature
On 12.04.2010, at 12:20, Avi Kivity wrote: On 04/12/2010 12:07 AM, Andre Przywara wrote: On SVM we set the instruction length of skipped instructions to hard-coded, well known values, which could be wrong when (bogus, but valid) prefixes (REX, segment override) are used. Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or AthlonII) have an explicit NEXTRIP field in the VMCB containing the desired information. Since it is cheap to do so, we use this field to override the guessed value on newer processors. A fix for older CPUs would be rather expensive, as it would require to fetch and partially decode the instruction. As the problem is not a security issue and needs special, handcrafted code to trigger (no compiler will ever generate such code), I omit a fix for older CPUs. If someone is interested, I have both a patch for these CPUs as well as demo code triggering this issue: It segfaults under KVM, but runs perfectly on native Linux. @@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); +if (svm-vmcb-control.next_rip != 0) +svm-next_rip = svm-vmcb-control.next_rip; + if (!svm-next_rip) { if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) != EMULATE_DONE) How does this interact with nested svm? Suppose we exit a nested guest, then emulate a #VMEXIT, we'll have next_rip from a previous exit? Since we don't call skip_emulated_instruction on #VMEXIT injection I think we're safe here. The instruction skip is done at the vmrun intercept. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] svm: implement NEXTRIPsave SVM feature
On 04/12/2010 01:29 PM, Alexander Graf wrote: On 12.04.2010, at 12:20, Avi Kivity wrote: On 04/12/2010 12:07 AM, Andre Przywara wrote: On SVM we set the instruction length of skipped instructions to hard-coded, well known values, which could be wrong when (bogus, but valid) prefixes (REX, segment override) are used. Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or AthlonII) have an explicit NEXTRIP field in the VMCB containing the desired information. Since it is cheap to do so, we use this field to override the guessed value on newer processors. A fix for older CPUs would be rather expensive, as it would require to fetch and partially decode the instruction. As the problem is not a security issue and needs special, handcrafted code to trigger (no compiler will ever generate such code), I omit a fix for older CPUs. If someone is interested, I have both a patch for these CPUs as well as demo code triggering this issue: It segfaults under KVM, but runs perfectly on native Linux. @@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); + if (svm-vmcb-control.next_rip != 0) + svm-next_rip = svm-vmcb-control.next_rip; + if (!svm-next_rip) { if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) != EMULATE_DONE) How does this interact with nested svm? Suppose we exit a nested guest, then emulate a #VMEXIT, we'll have next_rip from a previous exit? Since we don't call skip_emulated_instruction on #VMEXIT injection I think we're safe here. The instruction skip is done at the vmrun intercept. I think you're right. If the guest intercepts, we don't skip_emulated_instruction(). If the guest doesn't intercept, we keep the nested vmcb, so next_rip is right. Neat. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] svm: implement NEXTRIPsave SVM feature
On 11.04.2010, at 23:07, Andre Przywara wrote: On SVM we set the instruction length of skipped instructions to hard-coded, well known values, which could be wrong when (bogus, but valid) prefixes (REX, segment override) are used. Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or AthlonII) have an explicit NEXTRIP field in the VMCB containing the desired information. Since it is cheap to do so, we use this field to override the guessed value on newer processors. A fix for older CPUs would be rather expensive, as it would require to fetch and partially decode the instruction. As the problem is not a security issue and needs special, handcrafted code to trigger (no compiler will ever generate such code), I omit a fix for older CPUs. If someone is interested, I have both a patch for these CPUs as well as demo code triggering this issue: It segfaults under KVM, but runs perfectly on native Linux. Signed-off-by: Andre Przywara andre.przyw...@amd.com --- arch/x86/include/asm/svm.h |4 +++- arch/x86/kvm/svm.c | 13 - 2 files changed, 11 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index b26a38d..1d91d05 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -81,7 +81,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area { u32 event_inj_err; u64 nested_cr3; u64 lbr_ctl; - u8 reserved_5[832]; + u64 reserved_5; + u64 next_rip; + u8 reserved_6[816]; }; diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index d04c7ad..7fff56c 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -43,11 +43,11 @@ MODULE_LICENSE(GPL); #define SEG_TYPE_LDT 2 #define SEG_TYPE_BUSY_TSS16 3 -#define SVM_FEATURE_NPT (1 0) -#define SVM_FEATURE_LBRV (1 1) -#define SVM_FEATURE_SVML (1 2) -#define SVM_FEATURE_NRIP (1 3) -#define SVM_FEATURE_PAUSE_FILTER (1 10) +#define SVM_FEATURE_NPT(1 0) +#define SVM_FEATURE_LBRV (1 1) +#define SVM_FEATURE_SVML (1 2) +#define SVM_FEATURE_NRIP (1 3) +#define SVM_FEATURE_PAUSE_FILTER (1 10) This is pure indent fixing, right? Sounds like a separate patch to me. #define NESTED_EXIT_HOST 0 /* Exit handled on host level */ #define NESTED_EXIT_DONE 1 /* Exit caused nested vmexit */ @@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); + if (svm-vmcb-control.next_rip != 0) + svm-next_rip = svm-vmcb-control.next_rip; + Wouldn't it be more obvious to create a wrapper function to set next_rip in all the callers instead of setting it manually and then have magic override the value you just set? Let me illustrate what I mean. Currently we have: static int halt_interception(struct vcpu_svm *svm) { svm-next_rip = kvm_rip_read(svm-vcpu) + 1; skip_emulated_instruction(svm-vcpu); return kvm_emulate_halt(svm-vcpu); } skip_emulated_instruction sets pc = next_rip. All fine and obvious. With your patch it would simply ignore next_rip, rendering the line before void without obvious indication of that behavior. So what I'd prefer is something like: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add) { if (svm-vmcb-control.next_rip != 0) return svm-vmcb-control.next_rip; return kvm_rip_read(svm-vcpu) + add; } static int halt_interception(struct vcpu_svm *svm) { svm-next_rip = svm_next_rip(svm-vcpu, 1); skip_emulated_instruction(svm-vcpu); return kvm_emulate_halt(svm-vcpu); } That is more readable IMHO and makes the code flow a lot more obvious. Alex-- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] svm: implement NEXTRIPsave SVM feature
On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add) { if (svm-vmcb-control.next_rip != 0) In fact, that if should probably be: if (svm_has(SVM_FEATURE_NRIP)) Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] svm: implement NEXTRIPsave SVM feature
Alexander Graf wrote: On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add) { if (svm-vmcb-control.next_rip != 0) In fact, that if should probably be: if (svm_has(SVM_FEATURE_NRIP)) This is not sufficient. The next RIP is only provided for some intercepts (namely instruction intercepts), so one would need to check the validity of this field anyway. By definition reserved VMCB fields are 0, and as 0 is never a valid _next_ RIP, this is a quick and correct check. Regards, Andre. P.S. I don't have a strong opinion about your proposed refactoring, if Avi agrees I will rework it. I only found the current fix small and easy, and the mentioned patch for older CPUs removed the add line anyway, so the concerns you rose did not apply to the original version of the patch. -- Andre Przywara AMD-Operating System Research Center (OSRC), Dresden, Germany Tel: +49 351 488-3567-12 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] svm: implement NEXTRIPsave SVM feature
On 11.04.2010, at 23:51, Andre Przywara wrote: Alexander Graf wrote: On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add) { if (svm-vmcb-control.next_rip != 0) In fact, that if should probably be: if (svm_has(SVM_FEATURE_NRIP)) This is not sufficient. The next RIP is only provided for some intercepts (namely instruction intercepts), so one would need to check the validity of this field anyway. By definition reserved VMCB fields are 0, and as 0 is never a valid _next_ RIP, this is a quick and correct check. It's not? If you're at -1 which is hlt in our imaginary case. What would the next instruction be? Regards, Andre. P.S. I don't have a strong opinion about your proposed refactoring, if Avi agrees I will rework it. I only found the current fix small and easy, and the mentioned patch for older CPUs removed the add line anyway, so the concerns you rose did not apply to the original version of the patch. What patch for older CPUs? The one that'd be expensive? I was more concerned about readability here - it's great to be able to follow code on what it does :-). Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] svm: implement NEXTRIPsave SVM feature
Alexander Graf wrote: On 11.04.2010, at 23:51, Andre Przywara wrote: Alexander Graf wrote: On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add) { if (svm-vmcb-control.next_rip != 0) In fact, that if should probably be: if (svm_has(SVM_FEATURE_NRIP)) This is not sufficient. The next RIP is only provided for some intercepts (namely instruction intercepts), so one would need to check the validity of this field anyway. By definition reserved VMCB fields are 0, and as 0 is never a valid _next_ RIP, this is a quick and correct check. It's not? If you're at -1 which is hlt in our imaginary case. What would the next instruction be? A wrap-around to zero? From kernel space to user space? Come on, that sounds a bit constructed (A20, someone?). I dimly remember reading in our internal docs that 0 is a safe indicator for a missing NEXTRIP. I will do some research tomorrow. P.S. I don't have a strong opinion about your proposed refactoring, if Avi agrees I will rework it. I only found the current fix small and easy, and the mentioned patch for older CPUs removed the add line anyway, so the concerns you rose did not apply to the original version of the patch. What patch for older CPUs? The one that'd be expensive? Yes. It removes the guessed value lines entirely and triggers a decode if NEXTRIP is not available. I was more concerned about readability here - it's great to be able to follow code on what it does :-). Maybe a comment about the overriding behavior of the NEXTRIP line would appease you? Andre. -- Andre Przywara AMD-Operating System Research Center (OSRC), Dresden, Germany Tel: +49 351 488-3567-12 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] svm: implement NEXTRIPsave SVM feature
On 12.04.2010, at 00:13, Andre Przywara wrote: Alexander Graf wrote: On 11.04.2010, at 23:51, Andre Przywara wrote: Alexander Graf wrote: On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add) { if (svm-vmcb-control.next_rip != 0) In fact, that if should probably be: if (svm_has(SVM_FEATURE_NRIP)) This is not sufficient. The next RIP is only provided for some intercepts (namely instruction intercepts), so one would need to check the validity of this field anyway. By definition reserved VMCB fields are 0, and as 0 is never a valid _next_ RIP, this is a quick and correct check. It's not? If you're at -1 which is hlt in our imaginary case. What would the next instruction be? A wrap-around to zero? From kernel space to user space? Come on, that sounds a bit constructed (A20, someone?). I dimly remember reading in our internal docs that 0 is a safe indicator for a missing NEXTRIP. I will do some research tomorrow. I remember that this was the XBox hack. So it's not as constructed as it sounds :-). It's probably not real-world relevant, but worth keeping in mind that we're not 100% compatible then. P.S. I don't have a strong opinion about your proposed refactoring, if Avi agrees I will rework it. I only found the current fix small and easy, and the mentioned patch for older CPUs removed the add line anyway, so the concerns you rose did not apply to the original version of the patch. What patch for older CPUs? The one that'd be expensive? Yes. It removes the guessed value lines entirely and triggers a decode if NEXTRIP is not available. I see. No need to go that far. The current code works just fine as is and is fast. I was more concerned about readability here - it's great to be able to follow code on what it does :-). Maybe a comment about the overriding behavior of the NEXTRIP line would appease you? Hrm. I don't think it's too much work to go through the individual next_rip setters and convert them to call an inline function. In fact, it's probably easier than writing mails in this thread :-). If you like I can do it for you? Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html