I looked up the VM-instruction error in the Intel manual. Error number 7
means "VM entry with invalid control field(s)", i.e. some VMCS control
fields were not properly configured at the time of VM entry.
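
To make the decoding explicit, here is a minimal sketch in C of how the
reported value can be interpreted. It assumes the bit-31 convention from
the vmx_handle_exit() code quoted below; the two function names are
hypothetical helpers of mine, not kernel functions, and the table is only
a partial excerpt of the VM-instruction error numbers in the Intel manual.

```c
#include <stdint.h>

/* Bit 31 of the exit reason marks a failed VM entry (value as quoted in
 * this thread for VMX_EXIT_REASONS_FAILED_VMENTRY). */
#define VMX_EXIT_REASONS_FAILED_VMENTRY 0x80000000u

/* Hypothetical helper: nonzero if the value is an exit reason with the
 * "failed VM entry" bit set (first branch of vmx_handle_exit()). */
int is_failed_vmentry_exit(uint32_t reason)
{
    return (reason & VMX_EXIT_REASONS_FAILED_VMENTRY) != 0;
}

/* Hypothetical helper: partial mapping of VM-instruction error numbers
 * to their descriptions. Only a few entries are listed here. */
const char *vm_instruction_error_str(uint32_t err)
{
    switch (err) {
    case 4:  return "VMLAUNCH with non-clear VMCS";
    case 5:  return "VMRESUME with non-launched VMCS";
    case 7:  return "VM entry with invalid control field(s)";
    case 8:  return "VM entry with invalid host-state field(s)";
    default: return "not listed in this sketch, see the Intel manual";
    }
}
```

With the reported value 0x7, is_failed_vmentry_exit() returns 0, so the
value is a VM-instruction error number rather than an exit reason, and
vm_instruction_error_str(7) gives the description above.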

I wonder why some emulated CPU models (e.g. Nehalem) run properly even
though nested VMX lacks support for the VMCS MSR load/store lists?

Besides, this bug has also been reported in the Red Hat Bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=892240
With some specific kernels (e.g. kernel 3.8.4-202.fc18.x86_64 on
Fedora 18) it works well.


On Tue, Apr 16, 2013 at 3:03 PM, Jan Kiszka <jan.kis...@web.de> wrote:

> On 2013-04-16 05:49, 李春奇 <Arthur Chunqi Li> wrote:
> > I switched to the latest version of the kvm kernel but the bug still occurred.
> >
> > On the startup of L1 VM on the host, the host kern.log will output:
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458090] kvm [2808]: vcpu0
> > unhandled rdmsr: 0x345
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458166] kvm_set_msr_common: 22
> > callbacks suppressed
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458169] kvm [2808]: vcpu0
> > unhandled wrmsr: 0x40 data 0
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458176] kvm [2808]: vcpu0
> > unhandled wrmsr: 0x60 data 0
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458182] kvm [2808]: vcpu0
> > unhandled wrmsr: 0x41 data 0
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458188] kvm [2808]: vcpu0
> > unhandled wrmsr: 0x61 data 0
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458194] kvm [2808]: vcpu0
> > unhandled wrmsr: 0x42 data 0
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458200] kvm [2808]: vcpu0
> > unhandled wrmsr: 0x62 data 0
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458206] kvm [2808]: vcpu0
> > unhandled wrmsr: 0x43 data 0
> > Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458211] kvm [2808]: vcpu0
> > unhandled wrmsr: 0x63 data 0
> > Apr 16 11:28:23 Blade1-02 kernel: [ 4908.471014] kvm [2808]: vcpu1
> > unhandled wrmsr: 0x40 data 0
> > Apr 16 11:28:23 Blade1-02 kernel: [ 4908.471024] kvm [2808]: vcpu1
> > unhandled wrmsr: 0x60 data 0
> >
> > When L1 VM starts and crashes, its kern.log will output:
> > Apr 16 11:28:55 kvm1 kernel: [   33.590101] device tap0 entered
> > promiscuous mode
> > Apr 16 11:28:55 kvm1 kernel: [   33.590140] br0: port 2(tap0) entered
> > forwarding state
> > Apr 16 11:28:55 kvm1 kernel: [   33.590146] br0: port 2(tap0) entered
> > forwarding state
> > Apr 16 11:29:04 kvm1 kernel: [   42.592103] br0: port 2(tap0) entered
> > forwarding state
> > Apr 16 11:29:19 kvm1 kernel: [   57.752731] kvm [1673]: vcpu0 unhandled
> > rdmsr: 0x345
> > Apr 16 11:29:19 kvm1 kernel: [   57.797261] kvm [1673]: vcpu0 unhandled
> > wrmsr: 0x40 data 0
> > Apr 16 11:29:19 kvm1 kernel: [   57.797315] kvm [1673]: vcpu0 unhandled
> > wrmsr: 0x60 data 0
> > Apr 16 11:29:19 kvm1 kernel: [   57.797366] kvm [1673]: vcpu0 unhandled
> > wrmsr: 0x41 data 0
> > Apr 16 11:29:19 kvm1 kernel: [   57.797416] kvm [1673]: vcpu0 unhandled
> > wrmsr: 0x61 data 0
> > Apr 16 11:29:19 kvm1 kernel: [   57.797466] kvm [1673]: vcpu0 unhandled
> > wrmsr: 0x42 data 0
> > Apr 16 11:29:19 kvm1 kernel: [   57.797516] kvm [1673]: vcpu0 unhandled
> > wrmsr: 0x62 data 0
> > Apr 16 11:29:19 kvm1 kernel: [   57.797566] kvm [1673]: vcpu0 unhandled
> > wrmsr: 0x43 data 0
> > Apr 16 11:29:19 kvm1 kernel: [   57.797616] kvm [1673]: vcpu0 unhandled
> > wrmsr: 0x63 data 0
> >
> > The host will output simultaneously:
> > Apr 16 11:29:20 Blade1-02 kernel: [ 4966.314742] nested_vmx_run: VMCS
> > MSR_{LOAD,STORE} unsupported
>
> That's important information. KVM does not yet implement this
> feature, but L1 is using it - doomed to fail. This feature gap of nested
> VMX needs to be closed at some point.
>
> >
> > And the call trace displayed on the console is the same as in the
> > previous mail.
> >
> > Besides, the L1 and L2 guests sometimes crash without any output,
> > while at other times they output the messages above.
> >
> >
> > So this indicates that the MSR controls may fail for the core2duo CPU
> > emulation.
> >
>
> Maybe varying the CPU type (try e.g. -cpu kvm64,+vmx) reduces the
> likelihood of this scenario with KVM as guest.
>
> >
> > For Jan,
> > I have traced the code of QEMU and KVM and found the code relevant to
> > the error "KVM: entry failed, hardware error 0x7". It is in the kernel,
> > arch/x86/kvm/vmx.c, function vmx_handle_exit():
> >
> > if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
> >         vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> >         vcpu->run->fail_entry.hardware_entry_failure_reason
> >                 = exit_reason;
> >         return 0;
> > }
> >
> > if (unlikely(vmx->fail)) {
> >         vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> >         vcpu->run->fail_entry.hardware_entry_failure_reason
> >                 = vmcs_read32(VM_INSTRUCTION_ERROR);
> >         return 0;
> > }
> >
> > The "entry failed" hardware error can come from either of these two
> > branches, both of which handle a failed VM entry. Since the macro
> > VMX_EXIT_REASONS_FAILED_VMENTRY is 0x80000000 and the reported error is
> > 0x7, the error must come from the second branch. I'm not sure what the
> > result of vmcs_read32(VM_INSTRUCTION_ERROR) refers to.
>
> Try to look this up in the Intel manual. It explains what instruction
> error 7 means. You will also find it when tracing down the error message
> of L0.
>
> Jan
>
>
>


-- 
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
