Re: Release plan for 0.12.0
Hello,

2009/9/30 Anthony Liguori <aligu...@us.ibm.com>:
> Hi,
>
> Now that 0.11.0 is behind us, it's time to start thinking about 0.12.0.
>
>  o storage live migration

Sorry for going a bit off topic, but my special NBD server can do this
independently of VMM implementations. See
http://bitbucket.org/hirofuchi/xnbd/wiki/Home if interested.

Takahiro
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: nice value is ignored on cpu time accounting of a guest?
On Tue, Oct 20, 2009 at 2:21 PM, Avi Kivity <a...@redhat.com> wrote:
> On 10/19/2009 06:46 PM, Ryota Ozaki wrote:
>> Hi,
>>
>> I have a question on cputime accounting of a guest. CPU time of a guest
>> is always accounted as 'user' time of cpustat even if nice value of
>> the guest is higher than 0. Is there a reason to do so? I think the cpu
>> time of the guest should be accounted into 'nice' as same as a normal
>> process. Am I wrong?
>
> Hm, guest time is accounted separately, and added to user time in /proc
> (so tools that don't know about guest time can read it as user time).

Yes, but I think always adding it to user time without regard to the nice
value is a problem. I want to fix it because user time is an account for
processes that have nice == 0.

> Looks like we need to add a separate guest_nice, or get rid of guest time
> altogether.

Hmm, guest time is already exposed via /proc/stat, so is adding guest_nice
the better fix here? I don't know of anyone who uses the 'guest' value, though.

  ozaki-r

> --
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.
Re: nice value is ignored on cpu time accounting of a guest?
On 10/20/2009 04:06 PM, Ryota Ozaki wrote:
>> Looks like we need to add a separate guest_nice, or get rid of guest time
>> altogether.
>
> Hmm, guest time is already exposed via /proc/stat so adding guest_nice
> is better if fix here? I don't know anyone utilize 'guest' value though.

No one uses guest time to my knowledge. However, we can't be sure, so
it's better to add guest_nice. Note you need to add guest_nice to
user_nice, so old tools see it as nice time (same as guest_time now).

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Re: nice value is ignored on cpu time accounting of a guest?
On 10/20/2009 04:27 PM, Ryota Ozaki wrote:
> On Tue, Oct 20, 2009 at 4:17 PM, Avi Kivity <a...@redhat.com> wrote:
>> On 10/20/2009 04:06 PM, Ryota Ozaki wrote:
>>>> Looks like we need to add a separate guest_nice, or get rid of guest
>>>> time altogether.
>>>
>>> Hmm, guest time is already exposed via /proc/stat so adding guest_nice
>>> is better if fix here? I don't know anyone utilize 'guest' value though.
>>
>> No one uses guest time to my knowledge. However, we can't be sure, so
>> it's better to add guest_nice. Note you need to add guest_nice to
>> user_nice, so old tools see it as nice time (same as guest_time now).
>
> Well, like this?
>
>         /* Add user time to cpustat. */
>         tmp = cputime_to_cputime64(cputime);
>         if (TASK_NICE(p) > 0) {
>                 cpustat->nice = cputime64_add(cpustat->nice, tmp);
>                 cpustat->guest_nice = cputime64_add(cpustat->guest_nice, tmp);
>         } else {
>                 cpustat->user = cputime64_add(cpustat->user, tmp);
>                 cpustat->guest = cputime64_add(cpustat->guest, tmp);
>         }
>
> In account_guest_time()?

Yes.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Re: nice value is ignored on cpu time accounting of a guest?
On Tue, Oct 20, 2009 at 4:34 PM, Avi Kivity <a...@redhat.com> wrote:
> On 10/20/2009 04:27 PM, Ryota Ozaki wrote:
>> Well, like this?
>>
>>         /* Add user time to cpustat. */
>>         tmp = cputime_to_cputime64(cputime);
>>         if (TASK_NICE(p) > 0) {
>>                 cpustat->nice = cputime64_add(cpustat->nice, tmp);
>>                 cpustat->guest_nice = cputime64_add(cpustat->guest_nice, tmp);
>>         } else {
>>                 cpustat->user = cputime64_add(cpustat->user, tmp);
>>                 cpustat->guest = cputime64_add(cpustat->guest, tmp);
>>         }
>>
>> In account_guest_time()?
>
> Yes.

Yes. OK, I'll send a patch later.

Thanks!
  ozaki-r
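The accounting rule agreed on in this thread can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: `struct cpustat` here is a simplified stand-in with plain 64-bit counters rather than the kernel's cputime64_t fields, and the nice value is passed in directly instead of being read with TASK_NICE(p). The point it shows is that guest time is counted twice on purpose, once in the new guest counter and once in the matching legacy counter, so tools unaware of guest time still see it as user or nice time.

```c
#include <stdint.h>

/* Simplified stand-in for the kernel's per-CPU cpustat counters (assumption). */
struct cpustat {
    uint64_t user;
    uint64_t nice;
    uint64_t guest;
    uint64_t guest_nice;   /* the counter proposed in the thread */
};

/*
 * Account guest run time. A guest with nice > 0 is charged to
 * nice and guest_nice; otherwise to user and guest. The legacy
 * counters (user, nice) include guest time so old tools keep working.
 */
static void account_guest_time(struct cpustat *cs, int task_nice,
                               uint64_t cputime)
{
    if (task_nice > 0) {
        cs->nice += cputime;
        cs->guest_nice += cputime;
    } else {
        cs->user += cputime;
        cs->guest += cputime;
    }
}
```

A tool that wants "pure" user time would then subtract guest from user, and guest_nice from nice.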
[PATCH] Fix up vmx_set_segment for booting older guests.
If a guest happens to be unlucky enough to use an address such as
0xc0000000 in the CS base address field, the next attempt to VM enter
will fail. This is because the vmcs_writel() that writes the base address
into the VMCS will sign-extend the field to 64-bits, and the Intel manual
states that bits 63:32 of this field *must* be 0. Use vmcs_write32()
where appropriate.

This fixes booting of an absolutely ancient Red Hat Linux 5.2
(not Enterprise Linux!) guest.

Signed-off-by: Chris Lalancette <clala...@redhat.com>
---
 arch/x86/kvm/vmx.c |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 364263a..311afd4 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1846,7 +1846,22 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
                 vmx->rmode.tr.ar = vmx_segment_access_rights(var);
                 return;
         }
-        vmcs_writel(sf->base, var->base);
+
+        /* Intel 64 and IA-32 Architecture Software Developer's Manual Vol. 3b,
+         * section 22.3.1.2 states that VMENTRY will fail if bits 63:32 of the
+         * base address for CS, SS, DS, ES are not 0 and the register is usable.
+         *
+         * If var->base happens to have bit 31 set, then it will get sign
+         * extended on the vmcs_writel(), causing this check to fail.  Make
+         * sure to use the 32-bit version where appropriate.
+         */
+        if (sf->base == GUEST_CS_BASE ||
+            ((~sf->ar_bytes & 0x0001) && (sf->base == GUEST_SS_BASE ||
+                                          sf->base == GUEST_DS_BASE ||
+                                          sf->base == GUEST_ES_BASE)))
+                vmcs_write32(sf->base, var->base);
+        else
+                vmcs_writel(sf->base, var->base);
         vmcs_write32(sf->limit, var->limit);
         vmcs_write16(sf->selector, var->selector);
         if (vmx->rmode.vm86_active && var->s) {
--
1.6.0.6
[PATCH] Print Guest VMCS state on vmexit failure
If we fail to handle a VMEXIT for some reason, print out a lot more
debugging information about the state of the GUEST VMCS area. This does
not fix a bug, but helps a lot when trying to track down the cause of a
VMEXIT/VMENTRY failure.

Signed-off-by: Chris Lalancette <clala...@redhat.com>
---
 arch/x86/kvm/vmx.c |   38 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 311afd4..37b1682 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3452,6 +3452,14 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 static const int kvm_vmx_max_exit_handlers =
         ARRAY_SIZE(kvm_vmx_exit_handlers);

+#define PRINT_GUEST_SEGMENT(seg) do {                                   \
+        printk(KERN_DEBUG #seg ": SELECTOR 0x%lx, BASE 0x%lx, LIMIT 0x%lx, AR 0x%lx\n", \
+               vmcs_readl(GUEST_##seg##_SELECTOR),                      \
+               vmcs_readl(GUEST_##seg##_BASE),                          \
+               vmcs_readl(GUEST_##seg##_LIMIT),                         \
+               vmcs_readl(GUEST_##seg##_AR_BYTES));                     \
+        } while(0)
+
 /*
  * The guest has exited.  See if we can fix it or if we need userspace
  * assistance.
@@ -3512,6 +3520,36 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
         else {
                 vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
                 vcpu->run->hw.hardware_exit_reason = exit_reason;
+
+                printk(KERN_DEBUG "GUEST STATE:\n");
+                printk(KERN_DEBUG "CR0: 0x%lx\n", vmcs_readl(GUEST_CR0));
+                printk(KERN_DEBUG "CR3: 0x%lx\n", vmcs_readl(GUEST_CR3));
+                printk(KERN_DEBUG "CR4: 0x%lx\n", vmcs_readl(GUEST_CR4));
+                printk(KERN_DEBUG "VMENTRY CONTROL: 0x%lx\n",
+                       vmcs_readl(VM_ENTRY_CONTROLS));
+                printk(KERN_DEBUG "DR7: 0x%lx\n", vmcs_readl(GUEST_DR7));
+                printk(KERN_DEBUG "SYSENTER ESP: 0x%lx\n",
+                       vmcs_readl(GUEST_SYSENTER_ESP));
+                printk(KERN_DEBUG "SYSENTER EIP: 0x%lx\n",
+                       vmcs_readl(GUEST_SYSENTER_EIP));
+
+                PRINT_GUEST_SEGMENT(CS);
+                PRINT_GUEST_SEGMENT(SS);
+                PRINT_GUEST_SEGMENT(DS);
+                PRINT_GUEST_SEGMENT(ES);
+                PRINT_GUEST_SEGMENT(FS);
+                PRINT_GUEST_SEGMENT(GS);
+                PRINT_GUEST_SEGMENT(TR);
+                PRINT_GUEST_SEGMENT(LDTR);
+
+                printk(KERN_DEBUG "GDTR: BASE 0x%lx, LIMIT 0x%lx",
+                       vmcs_readl(GUEST_GDTR_BASE),
+                       vmcs_readl(GUEST_GDTR_LIMIT));
+                printk(KERN_DEBUG "IDTR: BASE 0x%lx, LIMIT 0x%lx",
+                       vmcs_readl(GUEST_IDTR_BASE),
+                       vmcs_readl(GUEST_IDTR_LIMIT));
+                printk(KERN_DEBUG "RIP: 0x%lx\n",vmcs_readl(GUEST_RIP));
+                printk(KERN_DEBUG "RFLAGS: 0x%lx\n",vmcs_readl(GUEST_RFLAGS));
         }
         return 0;
 }
--
1.6.0.6
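The PRINT_GUEST_SEGMENT macro in this patch combines three preprocessor idioms: `#seg` turns the argument into a string, `GUEST_##seg##_BASE` pastes it into a field name, and the `do { ... } while (0)` wrapper makes the macro behave as a single statement. The following userspace sketch shows the same pattern. Everything except the idioms is an assumption made for illustration: `fake_vmcs_readl()`, the enum values, and FORMAT_GUEST_SEGMENT are hypothetical, and snprintf() into a buffer replaces printk().

```c
#include <stdio.h>

/* Hypothetical stand-ins for VMCS field encodings; the values are arbitrary. */
enum {
    GUEST_CS_SELECTOR = 1,
    GUEST_CS_BASE,
    GUEST_CS_LIMIT,
    GUEST_CS_AR_BYTES
};

/* Hypothetical stand-in for vmcs_readl(): derives a value from the field id. */
static unsigned long fake_vmcs_readl(int field)
{
    return 0x100UL * (unsigned long)field;
}

/*
 * #seg stringifies the segment name, GUEST_##seg##_X pastes it into
 * an enum name, and do { } while (0) lets the macro be used safely
 * as the body of an unbraced if/else.
 */
#define FORMAT_GUEST_SEGMENT(buf, len, seg) do {                             \
        snprintf((buf), (len),                                               \
                 #seg ": SELECTOR 0x%lx, BASE 0x%lx, LIMIT 0x%lx, AR 0x%lx", \
                 fake_vmcs_readl(GUEST_##seg##_SELECTOR),                    \
                 fake_vmcs_readl(GUEST_##seg##_BASE),                        \
                 fake_vmcs_readl(GUEST_##seg##_LIMIT),                       \
                 fake_vmcs_readl(GUEST_##seg##_AR_BYTES));                   \
    } while (0)
```

Without the closing brace before `while (0)` the macro does not compile, which is why the wrapper has to be written out in full.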
Re: [PATCH] Print Guest VMCS state on vmexit failure
On Tue, Oct 20, 2009 at 09:50:45AM +0200, Chris Lalancette wrote:
> If we fail to handle a VMEXIT for some reason, print out a lot more
> debugging information about the state of the GUEST VMCS area. This does
> not fix a bug, but helps a lot when trying to track down the cause of a
> VMEXIT/VMENTRY failure.
>
> Signed-off-by: Chris Lalancette <clala...@redhat.com>
> ---
>  arch/x86/kvm/vmx.c |   38 ++++++++++++++++++++++++++++++++++++++
>  1 files changed, 38 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 311afd4..37b1682 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3452,6 +3452,14 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>  static const int kvm_vmx_max_exit_handlers =
>          ARRAY_SIZE(kvm_vmx_exit_handlers);
>
> +#define PRINT_GUEST_SEGMENT(seg) do {                                   \
> +        printk(KERN_DEBUG #seg ": SELECTOR 0x%lx, BASE 0x%lx, LIMIT 0x%lx, AR 0x%lx\n", \
> +               vmcs_readl(GUEST_##seg##_SELECTOR),                      \
> +               vmcs_readl(GUEST_##seg##_BASE),                          \
> +               vmcs_readl(GUEST_##seg##_LIMIT),                         \
> +               vmcs_readl(GUEST_##seg##_AR_BYTES));                     \
> +        } while(0)
> +
>  /*
>   * The guest has exited.  See if we can fix it or if we need userspace
>   * assistance.
> @@ -3512,6 +3520,36 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
>          else {
>                  vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
>                  vcpu->run->hw.hardware_exit_reason = exit_reason;
> +
> +                printk(KERN_DEBUG "GUEST STATE:\n");
> +                printk(KERN_DEBUG "CR0: 0x%lx\n", vmcs_readl(GUEST_CR0));
> +                printk(KERN_DEBUG "CR3: 0x%lx\n", vmcs_readl(GUEST_CR3));
> +                printk(KERN_DEBUG "CR4: 0x%lx\n", vmcs_readl(GUEST_CR4));
> +                printk(KERN_DEBUG "VMENTRY CONTROL: 0x%lx\n",
> +                       vmcs_readl(VM_ENTRY_CONTROLS));
> +                printk(KERN_DEBUG "DR7: 0x%lx\n", vmcs_readl(GUEST_DR7));
> +                printk(KERN_DEBUG "SYSENTER ESP: 0x%lx\n",
> +                       vmcs_readl(GUEST_SYSENTER_ESP));
> +                printk(KERN_DEBUG "SYSENTER EIP: 0x%lx\n",
> +                       vmcs_readl(GUEST_SYSENTER_EIP));
> +
> +                PRINT_GUEST_SEGMENT(CS);
> +                PRINT_GUEST_SEGMENT(SS);
> +                PRINT_GUEST_SEGMENT(DS);
> +                PRINT_GUEST_SEGMENT(ES);
> +                PRINT_GUEST_SEGMENT(FS);
> +                PRINT_GUEST_SEGMENT(GS);
> +                PRINT_GUEST_SEGMENT(TR);
> +                PRINT_GUEST_SEGMENT(LDTR);
> +
> +                printk(KERN_DEBUG "GDTR: BASE 0x%lx, LIMIT 0x%lx",
> +                       vmcs_readl(GUEST_GDTR_BASE),
> +                       vmcs_readl(GUEST_GDTR_LIMIT));
> +                printk(KERN_DEBUG "IDTR: BASE 0x%lx, LIMIT 0x%lx",
> +                       vmcs_readl(GUEST_IDTR_BASE),
> +                       vmcs_readl(GUEST_IDTR_LIMIT));
> +                printk(KERN_DEBUG "RIP: 0x%lx\n",vmcs_readl(GUEST_RIP));
> +                printk(KERN_DEBUG "RFLAGS: 0x%lx\n",vmcs_readl(GUEST_RFLAGS));
>          }
>          return 0;

Maybe move this to a separate function? vmx_handle_exit() will be hard to
read with this blob in the middle.

--
                        Gleb.
Re: [PATCH] Fix up vmx_set_segment for booting older guests.
On 10/20/2009 04:50 PM, Chris Lalancette wrote:
> If a guest happens to be unlucky enough to use an address such as
> 0xc0000000 in the CS base address field, the next attempt to VM enter
> will fail. This is because the vmcs_writel() that writes the base address
> into the VMCS will sign-extend the field to 64-bits, and the Intel manual
> states that bits 63:32 of this field *must* be 0. Use vmcs_write32()
> where appropriate.
>
> This fixes booting of an absolutely ancient Red Hat Linux 5.2
> (not Enterprise Linux!) guest.
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 364263a..311afd4 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -1846,7 +1846,22 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
>                  vmx->rmode.tr.ar = vmx_segment_access_rights(var);
>                  return;
>          }
> -        vmcs_writel(sf->base, var->base);
> +
> +        /* Intel 64 and IA-32 Architecture Software Developer's Manual Vol. 3b,
> +         * section 22.3.1.2 states that VMENTRY will fail if bits 63:32 of the
> +         * base address for CS, SS, DS, ES are not 0 and the register is usable.
> +         *
> +         * If var->base happens to have bit 31 set, then it will get sign
> +         * extended on the vmcs_writel(), causing this check to fail.  Make
> +         * sure to use the 32-bit version where appropriate.
> +         */
> +        if (sf->base == GUEST_CS_BASE ||
> +            ((~sf->ar_bytes & 0x0001) && (sf->base == GUEST_SS_BASE ||
> +                                          sf->base == GUEST_DS_BASE ||
> +                                          sf->base == GUEST_ES_BASE)))
> +                vmcs_write32(sf->base, var->base);

This will leave high bits untouched, so if any were set, this will fail.

> +        else
> +                vmcs_writel(sf->base, var->base);
>          vmcs_write32(sf->limit, var->limit);

I think the correct fix is to zero extend in vmcs_writel() rather than
here. But as far as I can tell, it already does. Where does the sign
extension occur? Perhaps in userspace?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
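The review comment about high bits can be illustrated with a small model. This is a sketch of the concern as stated in the reply, not of real hardware or of the kernel's vmcs_write32()/vmcs_writel(): `model_write32()` and `model_write64()` are hypothetical helpers that treat a 64-bit field as a plain variable, under the assumption that a 32-bit write only replaces bits 31:0.

```c
#include <stdint.h>

/* Hypothetical model: a full-width write replaces all 64 bits. */
static void model_write64(uint64_t *field, uint64_t value)
{
    *field = value;
}

/*
 * Hypothetical model: a 32-bit write replaces bits 31:0 only,
 * so whatever was in bits 63:32 before the write stays there.
 */
static void model_write32(uint64_t *field, uint32_t value)
{
    *field = (*field & 0xffffffff00000000ULL) | value;
}
```

Under that assumption, a field that already holds a sign-extended base keeps its high bits after `model_write32()`, while writing a zero-extended value with `model_write64()` clears them.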
Re: [PATCH] Print Guest VMCS state on vmexit failure
On 10/20/2009 04:50 PM, Chris Lalancette wrote:
> If we fail to handle a VMEXIT for some reason, print out a lot more
> debugging information about the state of the GUEST VMCS area. This does
> not fix a bug, but helps a lot when trying to track down the cause of a
> VMEXIT/VMENTRY failure.

Register state can just as easily be examined in the qemu monitor.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Re: Do I set up separate bridges for each guest?
On 10/20/2009 04:37 AM, Neil Aggarwal wrote:
> Hello:
>
> I am installing KVM on top of CentOS 5.4 so I can have two guests
> running on my host. I would like to have the host and guests
> accessible from my network.
>
> Do I set up separate bridges for each guest or would they somehow
> be shared?
>
> If I set up separate bridges, I think I need to do in
> /etc/sysconfig/network-scripts on the host machine:
>
> 1. Set up ifcfg-eth0 with the ip information of the host
>    (For example 192.168.2.200)
> 2. Set up ifcfg-eth0:1 for the first guest. It will have BRIDGE=br1
> 3. Create ifcfg-br1 with the IP info for the first guest
>    (For example 192.168.2.201)
> 4. Set up ifcfg-eth0:2 for the second guest. It will have BRIDGE=br2
> 5. Create ifcfg-br2 with the IP info for the second guest
>    (For example 192.168.2.202)
>
> Is this correct or did I miss something?

The simplest thing is to use a single bridge for all. The physical nic
should be part of it and supply the outside world connection. The
physical nic doesn't need an IP; the bridge should own it. All vms can
use this bridge.

# cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
GATEWAYDEV=''
BOOTPROTO=dhcp
DELAY=0
HWADDR=00:14:5E:17:D0:04

# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
HWADDR=00:14:5E:17:D0:04
BRIDGE=br0

> Thanks,
>   Neil
>
> --
> Neil Aggarwal, (281)846-8957, www.JAMMConsulting.com
> Will your e-commerce site go offline if you have
> a DB server failure, fiber cut, flood, fire, or other disaster?
> If so, ask about our geographically redundant database system.
Re: [PATCH] Print Guest VMCS state on vmexit failure
Avi Kivity wrote:
> On 10/20/2009 04:50 PM, Chris Lalancette wrote:
>> If we fail to handle a VMEXIT for some reason, print out a lot more
>> debugging information about the state of the GUEST VMCS area. This does
>> not fix a bug, but helps a lot when trying to track down the cause of a
>> VMEXIT/VMENTRY failure.
>
> register state can just as easily be examined in the qemu monitor.

Ah, true. OK, forget this patch.

--
Chris Lalancette
Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states
Marcelo Tosatti wrote:
> On Thu, Oct 15, 2009 at 07:05:36PM +0200, Jan Kiszka wrote:
>> This plugs an NMI-related hole in the VCPU synchronization between
>> kernel and user space. So far, neither pending NMIs nor the inhibit NMI
>> mask was properly read/set which was able to cause problems on
>> vmsave/restore, live migration and system reset. Fix it by making use
>> of the new VCPU substate interface.
>>
>> Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
>> ---
>>
>>  Documentation/kvm/api.txt       |   12
>>  arch/x86/include/asm/kvm.h      |    7 +++
>>  arch/x86/include/asm/kvm_host.h |    2 ++
>>  arch/x86/kvm/svm.c              |   22
>>  arch/x86/kvm/vmx.c              |   30
>>  arch/x86/kvm/x86.c              |   26
>>  6 files changed, 99 insertions(+), 0 deletions(-)
>>
>> diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
>> index bee5bbd..e483edb 100644
>> --- a/Documentation/kvm/api.txt
>> +++ b/Documentation/kvm/api.txt
>> @@ -848,3 +848,15 @@ Deprecates: KVM_GET/SET_CPUID2
>>  Architectures: x86
>>  Payload: struct kvm_lapic
>>  Deprecates: KVM_GET/SET_LAPIC
>> +
>> +6.8 KVM_X86_VCPU_STATE_NMI
>> +
>> +Architectures: x86
>> +Payload: struct kvm_nmi_state
>> +Deprecates: -
>> +
>> +struct kvm_nmi_state {
>> +        __u8 pending;
>> +        __u8 masked;
>> +        __u8 pad1[6];
>
> Don't you also have to save nmi_injected, in case of failure during NMI
> delivery.

Something made me think it's not required. Don't ask me what, it was
wrong anyway. Will roll out -v3 for this patch.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [PATCH] Print Guest VMCS state on vmexit failure
Hi,
maybe it's a stupid question, but is this also available when qemu/kvm
is started using libvirt stuff? I think it uses the monitor, so it's
inaccessible for the user, no?
n.

On Tue, Oct 20, 2009 at 10:42:24AM +0200, Chris Lalancette wrote:
> Avi Kivity wrote:
>> On 10/20/2009 04:50 PM, Chris Lalancette wrote:
>>> If we fail to handle a VMEXIT for some reason, print out a lot more
>>> debugging information about the state of the GUEST VMCS area. This does
>>> not fix a bug, but helps a lot when trying to track down the cause of a
>>> VMEXIT/VMENTRY failure.
>>
>> register state can just as easily be examined in the qemu monitor.
>
> Ah, true. OK, forget this patch.
>
> --
> Chris Lalancette

--
-------------------------------------
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-------------------------------------
Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states
Avi Kivity wrote:
> On 10/20/2009 05:39 AM, Gleb Natapov wrote:
>>> BTW, what happens to exceptions that fail to be delivered? Can't see
>>> where they are saved/restored across migration.
>>
>> The instruction that caused an exception will be re-executed after
>> migration and exception will be regenerated.
>
> Except for debug exceptions (traps).
>
>> But I think we should migrate exception anyway for completeness.
>
> Yes.

So save/restore kvm_vcpu_arch::exception? As another substate or as part
of a generalized NMI substate?

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [PATCH] Print Guest VMCS state on vmexit failure
Nikola Ciprich wrote:
> Hi,
> maybe it's stupid question, but is this available also when qemu/kvm
> is started using libvirt stuff? I think it uses monitor so it's
> inaccessible for user no?

Yes and no. The monitor is inaccessible when using libvirt, but I totally
forgot that qemu dumps the register state to stderr before abort()'ing on
an unknown vm exit. Libvirt takes the output from stderr and stores it in
/var/log/libvirt/qemu/guestname. So you would still be able to see this
output when using libvirt.

--
Chris Lalancette
0.11: SMP guests using one host CPU only?
On an 8 CPU host, I created a guest with 4 CPUs (-smp 4).

Unfortunately, the guest only uses one host CPU. For example, running

  cat /dev/urandom | gzip -9 > /dev/null

several times on this guest causes load on only one host CPU.

Is this expected?

The host is running 2.6.32-rc5 and qemu-kvm-0.11. I also tried 2.6.31.5
with qemu-kvm-0.11, with the same result.

I have another machine, running a 2.6.24 kernel, where it works just fine
(running several CPU-intensive tasks on a guest results in several host
CPUs being loaded).

--
Tomasz Chmielewski
http://wpkg.org
Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states
On 10/20/2009 05:56 PM, Jan Kiszka wrote:
> So save/restore kvm_vcpu_arch::exception? As another substate or as part
> of a generalized NMI substate?

Yes. It's not part of an nmi substate, but both can be part of an
exception substate (but need to look at the docs vewy cawefuwy to make
sure we don't screw up again).

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states
On Tue, Oct 20, 2009 at 06:06:36PM +0900, Avi Kivity wrote:
> On 10/20/2009 05:56 PM, Jan Kiszka wrote:
>> So save/restore kvm_vcpu_arch::exception? As another substate or as part
>> of a generalized NMI substate?
>
> Yes. It's not part of an nmi substate, but both can be part of an
> exception substate (but need to look at the docs vewy cawefuwy to make
> sure we don't screw up again).

What do you mean? How can they both be part of an exception substate?

--
                        Gleb.
Re: [PATCH] Print Guest VMCS state on vmexit failure
On 10/20/2009 05:57 PM, Chris Lalancette wrote:
> Nikola Ciprich wrote:
>> Hi,
>> maybe it's stupid question, but is this available also when qemu/kvm
>> is started using libvirt stuff? I think it uses monitor so it's
>> inaccessible for user no?
>
> Yes and no. The monitor is inaccessible when using libvirt, but I totally
> forgot that qemu dumps the register state to stderr before abort()'ing on
> an unknown vm exit. Libvirt takes the output from stderr and stores it in
> /var/log/libvirt/qemu/guestname. So you would still be able to see this
> output when using libvirt.

We've dropped the stderr part (IIRC), but nothing prevents libvirt from
accessing the register state and providing it to the user. There's also
the multiple monitor support which can be used for debugging. Finally,
you can connect with gdb (need to dynamically start the gdb server via
the monitor, again needs libvirt support).

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states
On 10/20/2009 06:08 PM, Gleb Natapov wrote:
> On Tue, Oct 20, 2009 at 06:06:36PM +0900, Avi Kivity wrote:
>> On 10/20/2009 05:56 PM, Jan Kiszka wrote:
>>> So save/restore kvm_vcpu_arch::exception? As another substate or as part
>>> of a generalized NMI substate?
>>
>> Yes. It's not part of an nmi substate, but both can be part of an
>> exception substate (but need to look at the docs vewy cawefuwy to make
>> sure we don't screw up again).
>
> What do you mean? How they can be both part of exception substate?

Sorry, nomenclature failure. We need NMI state, Interrupt state (already
provided), and pending exception state (which can be a fault or a trap).
There's also some extra state associated with pending debug exceptions
(maybe we can copy it into dr6).

We can either put all of these into one substate, or into separate
substates. I'm not sure which is best.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
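To make the "one substate versus separate substates" choice concrete, here is a purely hypothetical layout sketch. Only the NMI fields (pending, masked, 6 bytes of padding) come from the struct kvm_nmi_state quoted earlier in this thread; every other name and field below is an assumption invented for illustration and is not the interface that was proposed or merged.

```c
#include <stdint.h>

/* Mirrors the fields of struct kvm_nmi_state as quoted in the thread. */
struct sketch_nmi_state {
    uint8_t pending;
    uint8_t masked;
    uint8_t pad[6];
};

/* Hypothetical pending-exception state (fault or trap); illustrative only. */
struct sketch_exception_state {
    uint8_t  pending;
    uint8_t  nr;              /* exception vector */
    uint8_t  has_error_code;
    uint8_t  pad;
    uint32_t error_code;
};

/* Option 1: a single combined substate carrying both parts. */
struct sketch_event_state {
    struct sketch_nmi_state       nmi;
    struct sketch_exception_state exception;
};
```

Option 2 would pass `sketch_nmi_state` and `sketch_exception_state` as two independent payloads. Keeping each part at a fixed, padded size is what lets either option be extended later without changing the layout of the other part.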
Re: 0.11: SMP guests using one host CPU only?
On 10/20/2009 06:03 PM, Tomasz Chmielewski wrote:
> On a 8 CPU host, I created a guest with 4 CPUs (-smp 4).
>
> Unfortunately, the guest only uses one host CPU. For example, running
>
>   cat /dev/urandom | gzip -9 > /dev/null
>
> several times on this guest causes load on only one host CPU.
>
> Is it expected?

No. What does 'top -H' show?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Re: [PATCH] Fix up vmx_set_segment for booting older guests.
Avi Kivity wrote:
>> +        else
>> +                vmcs_writel(sf->base, var->base);
>>          vmcs_write32(sf->limit, var->limit);
>
> I think the correct fix is to zero extend in vmcs_writel() rather than
> here. But as far as I can tell, it already does. Where does the sign
> extension occur? Perhaps in userspace?

Very good question Avi. I should have dug a bit deeper before posting.
I traced this further back, and here's what it looks like is going on:

arch/x86/kvm/x86.c:kvm_load_segment_descriptor() is responsible for
loading the CPU segment descriptor into the VMCS area. It does this by
calling load_segment_descriptor_to_kvm_desct(), doing a few minor
transformations of the data, then calling kvm_set_segment() to load it
into the VMCS.

The problem arises in load_segment_descriptor_to_kvm_desct() ->
seg_desct_to_kvm_desct(). seg_desct_to_kvm_desct() takes the struct
desc_struct (in this case, base0 == 0x0, base1 == 0x0, and
base2 == 0xc0), then calls get_desc_base() and stores the result in the
struct kvm_segment. The return value from get_desc_base() is [...].
It's here that the sign extension occurs, which eventually causes the
VM entry failure. get_desc_base() sign-extends because of some
complicated u8 to unsigned rules that I'm not completely sure of.

The below patch fixes my original issue, but I'm not at all sure that
this is the right thing to do. I could also change get_desc_base()
itself to do the casting, which should do the right thing for all
callers, but I'm not sure if that's what all callers want. Anybody else
have an opinion?

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a93ba29..b58bda2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3997,7 +3997,7 @@ static void kvm_set_segment(struct kvm_vcpu *vcpu,
 static void seg_desct_to_kvm_desct(struct desc_struct *seg_desc, u16 selector,
                                    struct kvm_segment *kvm_desct)
 {
-        kvm_desct->base = get_desc_base(seg_desc);
+        kvm_desct->base = (unsigned)get_desc_base(seg_desc);
         kvm_desct->limit = get_desc_limit(seg_desc);
         if (seg_desc->g) {
                 kvm_desct->limit <<= 12;

--
Chris Lalancette
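The effect described above can be reproduced in userspace. The sketch below is an assumption-laden model, not the kernel's get_desc_base(): `struct desc_bytes` and both helper functions are hypothetical, and the signed path is forced with an explicit `int32_t` conversion (implementation-defined for out-of-range values in C, two's-complement wraparound on the usual compilers) to mimic a 32-bit result being treated as signed before it is widened to 64 bits.

```c
#include <stdint.h>

/* Hypothetical container for the three base fields named in the thread. */
struct desc_bytes {
    uint16_t base0;   /* bits 15:0  */
    uint8_t  base1;   /* bits 23:16 */
    uint8_t  base2;   /* bits 31:24 */
};

/* Assemble the 32-bit base from its parts without any signed arithmetic. */
static uint32_t raw_base(const struct desc_bytes *d)
{
    return (uint32_t)d->base0 |
           ((uint32_t)d->base1 << 16) |
           ((uint32_t)d->base2 << 24);
}

/* Models the bug: the 32-bit value is viewed as signed, then widened. */
static uint64_t base_sign_extended(const struct desc_bytes *d)
{
    return (uint64_t)(int64_t)(int32_t)raw_base(d);
}

/* Models the fix: the 32-bit value stays unsigned while being widened. */
static uint64_t base_zero_extended(const struct desc_bytes *d)
{
    return (uint64_t)raw_base(d);
}
```

With base2 == 0xc0 the two helpers differ only in bits 63:32, which is exactly the part of the field that the VM entry check requires to be zero.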
Re: 0.11: SMP guests using one host CPU only?
Avi Kivity wrote:
> On 10/20/2009 06:03 PM, Tomasz Chmielewski wrote:
>> On a 8 CPU host, I created a guest with 4 CPUs (-smp 4).
>>
>> Unfortunately, the guest only uses one host CPU. For example, running
>>
>>   cat /dev/urandom | gzip -9 > /dev/null
>>
>> several times on this guest causes load on only one host CPU.
>>
>> Is it expected?
>
> No. What does 'top -H' show?

In the guest: 4 CPUs with ~100% usage each (when I press "1"); otherwise,
in the task list, multiple cat processes taking most CPU time (as they
read from /dev/urandom).

In the host: qemu-system-x86 (one process/thread) taking ~100% CPU; when
I press "1", I see only one CPU is used 100%, 7 other CPUs are more or
less not used.

guest command line:

/usr/bin/qemu-system-x86_64 -m 1024 -drive file=/srv/kvm/images/lvs2,if=virtio,cache=writeback,index=0,boot=on -net nic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3F -net tap,vlan=0,script=/etc/qemu-ifup -localtime -smp 4

There are 5 other guests (1 CPU) started before this guest.

--
Tomasz Chmielewski
http://wpkg.org
Re: vhost-net patches
On Mon, Oct 19, 2009 at 03:56:54PM -0700, Sridhar Samudrala wrote: On Sun, 2009-10-18 at 19:32 +0200, Michael S. Tsirkin wrote: On Sun, Oct 18, 2009 at 12:53:56PM +0200, Michael S. Tsirkin wrote: On Fri, Oct 16, 2009 at 12:29:29PM -0700, Sridhar Samudrala wrote: Hi Michael, We are trying out your vhost-net patches from your git trees on kernel.org. I am using mst/vhost.git as host kernel and mst/qemu-kvm.git for qemu. I am using the following qemu script to start the guest using userspace tap backend. home/sridhar/git/mst/qemu-kvm/x86_64-softmmu/qemu-system-x86_64 /home/sridhar/kvm_images/fedora10-1-vm -m 512 -drive file=/home/sridhar/kvm_images/fedora10-1-vm,if=virtio,index=0,boot=on -net nic,macaddr=54:52:00:35:e3:73,model=virtio -net tap,ifname=vnet0,script=no,downscript=no Now that i got the default backend to work, i wanted to try vhost in kernel. But could not figure out the right -net option to pass to qemu. Can you let me know the right syntax to start a guest using vhost. Thanks Sridhar Here's an example with raw socket: /root/kvm-test/bin/qemu-system-x86_64 -m 1G -kernel \ /boot/vmlinuz-$release -append \ 'root=UUID=d5d2d201-d086-42ad-bb1d-32fbe40eda71 ro quiet nosplash \ console=tty0 console=ttyS0,9600n8' -initrd /boot/guest-initrd.img \ $HOME/disk.raw.copy -net raw,ifname=eth3 -net nic,model=virtio,vhost \ -balloon none -redir tcp:8023::22 As you see, I changed the command line. You now simply add ,vhost after model, and it will locate host network interface specified earlier and attach to it. This should have been clear from running qemu with -help flag. Could you please suggest how can that text be clarified? I updated to your latest git trees and the default user-space tap backend using the following -net options worked fine. -net tap,ifname=vnet0,script=no,downscript=no -net nic,model=virtio But i could not get vhost to work with either raw or tap backends. I tried the following combinations. 
1) -net raw,ifname=eth0 -net nic,model=virtio,vhost
2) -net raw,ifname=vnet0, -net nic,model=virtio,vhost
3) -net tap,ifname=vnet0,script=no,downscript=no -net nic,model=virtio,vhost

They all failed with the following error
vhost_net_init returned -7
This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when vhost_net_start() fails. It looks like dev->binding->irqfd() is failing in vhost_virtqueue_init(). Haven't yet debugged further. I have CONFIG_EVENTFD enabled in the host kernel.

Are all the above -net options supposed to work? In your descriptions, you say that checksum/tso offload is not supported.

They should work with tap but not raw sockets yet.

Isn't it possible to send/receive large packets without checksum using AF_PACKET sockets if the attached interface supports these offloads. Do you see the same offload issue even when using tap backend via vhost?

Thanks
Sridhar
Re: vhost-net patches
On Mon, Oct 19, 2009 at 04:08:24PM -0700, Shirley Ma wrote:
Hello Michael,

They all failed with the following error
vhost_net_init returned -7
This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when vhost_net_start() fails. It looks like dev->binding->irqfd() is failing in vhost_virtqueue_init(). Haven't yet debugged further. I have CONFIG_EVENTFD enabled in the host kernel.

From the debug output, looks like the vnet-vector is not defined,

what is vnet-vector? And what do you mean by not defined?

and the default msix_entries_nr is 3, so it returned EINVAL from virtio_pci_irqfd. Looks we need to either disable QEMU_PCI_CAP_MSIX or define vector in QEMU configuration?

You shouldn't have to do anything.

I am not familiar with MSIX stuffs.

Thanks
Shirley

On Sun, 2009-10-18 at 19:32 +0200, Michael S. Tsirkin wrote:
On Sun, Oct 18, 2009 at 12:53:56PM +0200, Michael S. Tsirkin wrote:
On Fri, Oct 16, 2009 at 12:29:29PM -0700, Sridhar Samudrala wrote:
Hi Michael,
We are trying out your vhost-net patches from your git trees on kernel.org. I am using mst/vhost.git as host kernel and mst/qemu-kvm.git for qemu. I am using the following qemu script to start the guest using userspace tap backend.
home/sridhar/git/mst/qemu-kvm/x86_64-softmmu/qemu-system-x86_64 /home/ sridhar/kvm_images/fedora10-1-vm -m 512 -drive file=/home/sridhar/kvm_images/ fedora10-1-vm,if=virtio,index=0,boot=on -net nic,macaddr= 54:52:00:35:e3:73,model=virtio -net tap,ifname=vnet0,script=no,downscript=no Now that i got the default backend to work, i wanted to try vhost in kernel. But could not figure out the right -net option to pass to qemu. Can you let me know the right syntax to start a guest using vhost. Thanks Sridhar Here's an example with raw socket: /root/kvm-test/bin/qemu-system-x86_64 -m 1G -kernel \ /boot/vmlinuz-$release -append \ 'root=UUID=d5d2d201-d086-42ad-bb1d-32fbe40eda71 ro quiet nosplash \ console=tty0 console=ttyS0,9600n8' -initrd /boot/guest-initrd.img \ $HOME/disk.raw.copy -net raw,ifname=eth3 -net nic,model=virtio,vhost \ -balloon none -redir tcp:8023::22 As you see, I changed the command line. You now simply add ,vhost after model, and it will locate host network interface specified earlier and attach to it. This should have been clear from running qemu with -help flag. Could you please suggest how can that text be clarified? I updated to your latest git trees and the default user-space tap backend using the following -net options worked fine. -net tap,ifname=vnet0,script=no,downscript=no -net nic,model=virtio But i could not get vhost to work with either raw or tap backends. I tried the following combinations. 1) -net raw,ifname=eth0 -net nic,model=virtio,vhost 2) -net raw,ifname=vnet0, -net nic,model=virtio,vhost 3) -net tap,ifname=vnet0,script=no,downscript=no -net nic,model=virtio,vhost Yes, should work. They all failed with the following error vhost_net_init returned -7 This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when vhost_net_start() fails. It looks like dev-binding-irqfd() is failing in vhost_virtqueue_init(). what parameters are passed in? Haven't yet debugged further. this calls into virtio_pci_irqfd. 
I have CONFIG_EVENTFD enabled in the host kernel. Note you need to also enable eventfd support under kvm menu. Are all the above -net options supposed to work? In your descriptions, you say that checksum/tso offload is not supported. Isn't it possible to send/receive large packets without checksum using AF_PACKET sockets if the attached interface supports these offloads. Do you see the same offload issue even when using tap backend via vhost? Thanks Sridhar -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] kvm-kmod: Use the main development tree of kvm as Linux submodule
From: Wolfgang Mauerer wolfgang.maue...@siemens.com

Most people won't have the sources installed in the path that is the current default setting.

Signed-off-by: Wolfgang Mauerer wolfgang.maue...@siemens.com
---
 .gitmodules |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/.gitmodules b/.gitmodules
index 9c63921..42fc7a1 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,3 +1,3 @@
 [submodule "linux-2.6"]
 	path = linux-2.6
-	url = ../kvm.git
+	url = git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git
--
1.6.4
[PATCH 2/2] kvm-kmod: Document the build process
From: Wolfgang Mauerer wolfgang.maue...@siemens.com

A package without build instructions is like a kernel without a penguin.

Signed-off-by: Wolfgang Mauerer wolfgang.maue...@siemens.com
---
 README |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)
 create mode 100644 README

diff --git a/README b/README
new file mode 100644
index 000..40a72d3
--- /dev/null
+++ b/README
@@ -0,0 +1,26 @@
+Building the KVM kernel module is performed differently depending on whether
+you are working from a clone of the git repository or from a source release.
+
+- To build from a release, simply use ./configure (possibly with any
+  arguments that are required for your setup, see ./configure --help)
+  and make.
+
+- Building from a cloned git repository requires a kernel tree with the main
+  kvm sources that is included as a submodule in the linux-2.6/ directory. By
+  default, the KVM development tree on git.kernel.org is used, but you can
+  change this setting in .gitmodules
+
+  Before the kvm module can be built, the linux submodule must be initialised
+  and populated. The required sequence of commands is
+
+      git submodule init
+      git submodule update
+      ./configure
+      make sync
+      make
+
+  Notice that you can also specify an existing Linux tree for the
+  synchronisation stage by using
+
+      make sync LINUX=/path/to/tree
+
--
1.6.4
KVM: VMX: remove GUEST_CR3 write from vmx_vcpu_run
GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 value changes.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 364263a..325075f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3638,10 +3638,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);

-	if (enable_ept && is_paging(vcpu)) {
-		vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
+	if (enable_ept && is_paging(vcpu))
 		ept_load_pdptrs(vcpu);
-	}
+
 	/* Record the guest's net vcpu time for enforced NMI injections. */
 	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
 		vmx->entry_time = ktime_get();
Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states
On Tue, Oct 20, 2009 at 06:14:04PM +0900, Avi Kivity wrote: On 10/20/2009 06:08 PM, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 06:06:36PM +0900, Avi Kivity wrote: On 10/20/2009 05:56 PM, Jan Kiszka wrote: So save/restore kvm_vcpu_arch::exception? As another substate or as part of a generalized NMI substate? Yes. It's not part of an nmi substate, but both can be part of an exception substate (but need to look at the docs vewy cawefuwy to make sure we don't screw up again). What do you mean? How they can be both part of exception substate? Sorry, nomenclature failure. We need NMI state, Interrupt state (already provided), and pending exception state (which can be a fault or a trap). There's also some extra state associated with pending debug exceptions (maybe we can copy it into dr6). KVM_REQ_TRIPLE_FAULT can also be lost, but i don't think anybody cares? We can either put all of these into one substate, or into separate substates. I'm not sure which is best. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states
On Tue, Oct 20, 2009 at 09:13:02AM -0200, Marcelo Tosatti wrote: On Tue, Oct 20, 2009 at 06:14:04PM +0900, Avi Kivity wrote: On 10/20/2009 06:08 PM, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 06:06:36PM +0900, Avi Kivity wrote: On 10/20/2009 05:56 PM, Jan Kiszka wrote: So save/restore kvm_vcpu_arch::exception? As another substate or as part of a generalized NMI substate? Yes. It's not part of an nmi substate, but both can be part of an exception substate (but need to look at the docs vewy cawefuwy to make sure we don't screw up again). What do you mean? How they can be both part of exception substate? Sorry, nomenclature failure. We need NMI state, Interrupt state (already provided), and pending exception state (which can be a fault or a trap). There's also some extra state associated with pending debug exceptions (maybe we can copy it into dr6). KVM_REQ_TRIPLE_FAULT can also be lost, but i don't think anybody cares? If pending exception will be migrated KVM_REQ_TRIPLE_FAULT will be restored after guest will try to re-execute instruction that caused it. One more reason to migrate pending exceptions. And why not migrate KVM_REQ_TRIPLE_FAULT while we are at it. We can either put all of these into one substate, or into separate substates. I'm not sure which is best. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
List of unaccessible x86 states
Hi all,

as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one or more substates for the new KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):
 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:
 - sipi_vector
 - MCE (covered via kvm_queued_exception, or does it require more?)

Please extend or correct the list as required.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: List of unaccessible x86 states
On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM: VMX: remove GUEST_CR3 write from vmx_vcpu_run
On 10/20/2009 09:37 PM, Marcelo Tosatti wrote:
> GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 value changes.
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 364263a..325075f 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3638,10 +3638,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>
> -	if (enable_ept && is_paging(vcpu)) {
> -		vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
> +	if (enable_ept && is_paging(vcpu))
>  		ept_load_pdptrs(vcpu);
> -	}
> +
>  	/* Record the guest's net vcpu time for enforced NMI injections. */
>  	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
>  		vmx->entry_time = ktime_get();

Nice. Any reason why ept_load_pdptrs() couldn't go the same way?

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: 0.11: SMP guests using one host CPU only?
On 10/20/2009 07:17 PM, Tomasz Chmielewski wrote:
> Avi Kivity wrote:
>> On 10/20/2009 06:03 PM, Tomasz Chmielewski wrote:
>>> On an 8 CPU host, I created a guest with 4 CPUs (-smp 4). Unfortunately, the guest only uses one host CPU. For example, running cat /dev/urandom | gzip -9 > /dev/null several times on this guest causes load on only one host CPU. Is it expected?
>> No. What does 'top -H' show?
> In the guest - 4 CPUs with ~100% usage each (when I press 1), otherwise, in the task list, multiple cat processes taking most CPU time (as it reads from /dev/urandom).
>
> In the host - qemu-system-x86 (one process/thread) taking ~100% CPU; when I press 1, I see only one CPU is used 100%, 7 other CPUs are more or less not used.

I meant, how many qemu threads are there, and how much cpu does each take?

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Autotest] [PATCH] Test 802.1Q vlan of nic
See comments below. - Dor Laor dl...@redhat.com wrote: On 10/15/2009 11:48 AM, Amos Kong wrote: Test 802.1Q vlan of nic, config it by vconfig command. 1) Create two VMs 2) Setup guests in different vlan by vconfig and test communication by ping using hard-coded ip address 3) Setup guests in same vlan and test communication by ping 4) Recover the vlan config Signed-off-by: Amos Kongak...@redhat.com --- client/tests/kvm/kvm_tests.cfg.sample |6 +++ client/tests/kvm/tests/vlan_tag.py| 73 + 2 files changed, 79 insertions(+), 0 deletions(-) mode change 100644 = 100755 client/tests/kvm/scripts/qemu-ifup In general the above should come as an independent patch. create mode 100644 client/tests/kvm/tests/vlan_tag.py diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample index 9ccc9b5..4e47767 100644 --- a/client/tests/kvm/kvm_tests.cfg.sample +++ b/client/tests/kvm/kvm_tests.cfg.sample @@ -166,6 +166,12 @@ variants: used_cpus = 5 used_mem = 2560 +- vlan_tag: install setup +type = vlan_tag +subnet2 = 192.168.123 +vlans = 10 20 If we want to be fanatic and safe we should dynamically choose subnet and vlans numbers that are not used on the host instead of hard code it. For the sake of safety maybe we should start both VMs with -snapshot. Dor, what do you think? Is it safe to start 2 VMs with the same disk image when only one of them uses -snapshot? +nic_mode = tap +nic_model = e1000 Why only e1000? Let's test virtio and rtl8139 as well. Can't you inherit the nic model from the config? It's not just inherited, it's overwritten, because nic_model is defined later in the file in a variants block. So this nic_model line has no effect. 
- autoit: install setup type = autoit diff --git a/client/tests/kvm/scripts/qemu-ifup b/client/tests/kvm/scripts/qemu-ifup old mode 100644 new mode 100755 diff --git a/client/tests/kvm/tests/vlan_tag.py b/client/tests/kvm/tests/vlan_tag.py new file mode 100644 index 000..15e763f --- /dev/null +++ b/client/tests/kvm/tests/vlan_tag.py @@ -0,0 +1,73 @@ +import logging, time +from autotest_lib.client.common_lib import error +import kvm_subprocess, kvm_test_utils, kvm_utils + +def run_vlan_tag(test, params, env): + +Test 802.1Q vlan of nic, config it by vconfig command. + +1) Create two VMs +2) Setup guests in different vlan by vconfig and test communication by ping + using hard-coded ip address +3) Setup guests in same vlan and test communication by ping +4) Recover the vlan config + +@param test: Kvm test object +@param params: Dictionary with the test parameters. +@param env: Dictionary with test environment. + + +vm = [] +session = [] +subnet2 = params.get(subnet2) +vlans = params.get(vlans).split() + +vm.append(kvm_test_utils.get_living_vm(env, %s % params.get(main_vm))) There's no need for the %s here. ...get_living_vm(env, params.get(main_vm))) should work. +params_vm2 = params.copy() +params_vm2['image_snapshot'] = yes +params_vm2['kill_vm_gracefully'] = no +params_vm2[address_index] = int(params.get(address_index, 0))+1 +vm.append(vm[0].clone(vm2, params_vm2)) +kvm_utils.env_register_vm(env, vm2, vm[1]) +if not vm[1].create(): +raise error.TestError(VM 1 create faild) The whole 7-8 lines above should be grouped as a function to clone existing VM. It should be part of kvm autotest infrastructure. Besides that, it looks good. There's already a clone function and it's being used here. Instead of those 7-8 lines, why not just define the VM in the config file? It looks like you're always using 2 VMs so there's no reason to do this in test code. 
This should do what you want:

- vlan_tag: install setup
    type = vlan_tag
    subnet2 = 192.168.123
    vlans = 10 20
    nic_mode = tap
    vms += vm2
    extra_params_vm2 += -snapshot
    kill_vm_gracefully_vm2 = no
    address_index_vm2 = 1

The preprocessor then automatically creates vm2 and registers it in env. To use it in the test just do:

vm.append(kvm_test_utils.get_living_vm(env, "vm2"))

You can also use a parameter that tells the test which VM to use if you don't want the name vm2 hardcoded into the test. Add something like this to the config file:

2nd_vm = vm2

and in the test use params.get("2nd_vm") instead of "vm2" (just like you use main_vm).

+
+for i in range(2):
+session.append(kvm_test_utils.wait_for_login(vm[i]))
+
+try:
+vconfig_cmd = "vconfig add eth0 %s;ifconfig eth0.%s %s.%s"
+# Attempt to configure IPs for the VMs and record the results in
+# boolean variables
+#
Re: List of unaccessible x86 states
Alexander Graf wrote: On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. OK. Whole hflags or just the GIF bit? If we allow access to all bits, can user space cause any problems (beyond screwing up its guests) by passing weird patterns? Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 0.11: SMP guests using one host CPU only?
Avi Kivity wrote:
> On 10/20/2009 07:17 PM, Tomasz Chmielewski wrote:
>> Avi Kivity wrote:
>>> On 10/20/2009 06:03 PM, Tomasz Chmielewski wrote:
>>>> On an 8 CPU host, I created a guest with 4 CPUs (-smp 4). Unfortunately, the guest only uses one host CPU. For example, running cat /dev/urandom | gzip -9 > /dev/null several times on this guest causes load on only one host CPU. Is it expected?
>>> No. What does 'top -H' show?
>> In the guest - 4 CPUs with ~100% usage each (when I press 1), otherwise, in the task list, multiple cat processes taking most CPU time (as it reads from /dev/urandom).
>>
>> In the host - qemu-system-x86 (one process/thread) taking ~100% CPU; when I press 1, I see only one CPU is used 100%, 7 other CPUs are more or less not used.
> I meant, how many qemu threads are there, and how much cpu does each take?

There is only one qemu thread for the 4-cpu guest.

--
Tomasz Chmielewski
http://wpkg.org
Re: [PATCH] Fix up vmx_set_segment for booting older guests.
On 10/20/2009 07:02 PM, Chris Lalancette wrote: get_desc_base() sign-extends because of some complicated u8 to unsigned rules that I'm not completely sure of. The below patch fixes my original issue, but I'm not at all sure that this is the right thing to do. I could also change get_desc_base() itself to do the casting, which should do the right thing for all callers, but I'm not sure if that's what all callers want. Anybody else have an opinion? get_desc_base() is broken and should be fixed. No caller could possibly want this sign extension (64-bit segment bases are only possible using MSR_FS_BASE/MSR_GS_BASE/MSR_KERNEL_GS_BASE). -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
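For readers who want to see the sign extension discussed above in isolation, here is a minimal Python sketch that models the C arithmetic. It is not the kernel code: the field names base0/base1/base2 and the exact shape of the expression are assumptions made for illustration. The idea is that the u8 and u16 fields are promoted to a signed 32-bit int before shifting, so a top byte of 0x80 or more sets the int's sign bit, and widening that int to a 64-bit unsigned long fills the upper 32 bits with ones.

```python
def to_int32(value):
    """Interpret the low 32 bits of value as a signed 32-bit integer (models a C int)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def desc_base_buggy(base0, base1, base2):
    """Model the problem: combine in a signed 32-bit int, then widen to 64 bits.

    base0 (16 bits), base1 (8 bits) and base2 (8 bits) are assumed field
    names, not taken from the thread.
    """
    as_int = to_int32(base0 | (base1 << 16) | (base2 << 24))
    # Widening a negative int to a 64-bit unsigned value sign-extends it.
    return as_int & 0xFFFFFFFFFFFFFFFF


def desc_base_fixed(base0, base1, base2):
    """Model the fix: treat the combined value as unsigned 32-bit before widening."""
    return (base0 | (base1 << 16) | (base2 << 24)) & 0xFFFFFFFF
```

With a top byte below 0x80 both functions agree; the results only diverge once bit 31 of the base is set, which would be consistent with the problem showing up only for some guests.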
Re: 0.11: SMP guests using one host CPU only?
On 10/20/2009 10:19 PM, Tomasz Chmielewski wrote: I meant, how many qemu threads are there, and how much cpu does each take? There is only one qemu thread for the 4-cpu guest. Not possible. Even a single-cpu guest has two threads. What does 'ls /proc/$(pgrep qemu)/task' show? -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
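The check Avi asks for can also be scripted. The sketch below is an illustration only: it assumes a Linux host where each thread of a process appears as a numeric directory under /proc/<pid>/task, and the helper name list_threads and its proc_root parameter are made up for this example.

```python
import os


def list_threads(pid, proc_root="/proc"):
    """Return the sorted thread IDs found under <proc_root>/<pid>/task.

    proc_root is a hypothetical parameter so that the function can be
    pointed at a fake directory tree when testing.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    # Each thread is represented by a directory named after its thread ID.
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())
```

Calling list_threads() with the PID of the qemu process should return more than one entry if the guest's vcpus really run in separate threads.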
Re: List of unaccessible x86 states
On Tue, Oct 20, 2009 at 03:19:41PM +0200, Jan Kiszka wrote: Alexander Graf wrote: On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. OK. Whole hflags or just the GIF bit? If we allow access to all bits, can user space cause any problems (beyond screwing up its guests) by passing weird patterns? HF_NMI_MASK should be migrated too. Destination should enable IRET intercept if HF_NMI_MASK is set. Or we can assume that migration in the middle of NMI will never happen :) -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: List of unaccessible x86 states
On 20.10.2009, at 15:19, Jan Kiszka wrote:
> Alexander Graf wrote:
>> On 20.10.2009, at 15:01, Jan Kiszka wrote:
>>> Hi all,
>>>
>>> as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one or more substates for the new KVM_GET/SET_VCPU_STATE interface.
>>>
>>> What I read so far (or tried to patch already):
>>> - nmi_masked
>>> - nmi_pending
>>> - nmi_injected
>>> - kvm_queued_exception (whole struct content)
>>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>>
>>> Unclear points (for me) from the last discussion:
>>> - sipi_vector
>>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>>
>>> Please extend or correct the list as required.
>>
>> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it.
>
> OK. Whole hflags or just the GIF bit?

ag...@busu:~/git/kvm grep -R HF_ arch/x86/include/asm/*kvm*
arch/x86/include/asm/kvm_host.h:#define HF_GIF_MASK (1 << 0)
arch/x86/include/asm/kvm_host.h:#define HF_HIF_MASK (1 << 1)
arch/x86/include/asm/kvm_host.h:#define HF_VINTR_MASK (1 << 2)
arch/x86/include/asm/kvm_host.h:#define HF_NMI_MASK (1 << 3)
arch/x86/include/asm/kvm_host.h:#define HF_IRET_MASK (1 << 4)

I can only talk for GIF here and that should be fine. Not knowing about the others does seem like we could get race conditions though.

> If we allow access to all bits, can user space cause any problems (beyond screwing up its guests) by passing weird patterns?

IMHO the hflags should be converted between userspace and kernel representation. There's a good chance we run older userspace that doesn't know about certain flags yet and I'd like to keep the bits as flexible as possible.

Alex
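One way to picture the conversion between a user-space and a kernel representation of hflags is the rough Python sketch below. Everything in it is hypothetical: the USER_HF_* constants, the choice of which bits to export, and the function names are invented for illustration, and the kernel-side values assume the grep output above reads as 1 shifted left by the bit number. It is not an existing KVM interface.

```python
# Kernel-internal bits, as listed in the grep output (shift operators assumed).
HF_GIF_MASK = 1 << 0
HF_HIF_MASK = 1 << 1
HF_VINTR_MASK = 1 << 2
HF_NMI_MASK = 1 << 3
HF_IRET_MASK = 1 << 4

# Hypothetical user-visible bits; only these would be part of the ABI.
USER_HF_GIF = 1 << 0
USER_HF_HIF = 1 << 1

# Translation table: kernel bit -> user bit. Bits not listed stay kernel-private.
KERNEL_TO_USER = {
    HF_GIF_MASK: USER_HF_GIF,
    HF_HIF_MASK: USER_HF_HIF,
}


def kernel_to_user(hflags):
    """Export only the bits that user space is allowed to see."""
    user_flags = 0
    for kernel_bit, user_bit in KERNEL_TO_USER.items():
        if hflags & kernel_bit:
            user_flags |= user_bit
    return user_flags


def user_to_kernel(user_flags, current_hflags):
    """Apply the known user bits and leave kernel-private bits untouched.

    Unknown bits in user_flags are ignored, so weird patterns from user
    space cannot reach the internal flags.
    """
    exported = 0
    for kernel_bit in KERNEL_TO_USER:
        exported |= kernel_bit
    result = current_hflags & ~exported
    for kernel_bit, user_bit in KERNEL_TO_USER.items():
        if user_flags & user_bit:
            result |= kernel_bit
    return result
```

Because the table is the only place where the two layouts meet, the internal bit positions could change later without breaking older user space, which is the flexibility asked for above.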
Re: List of unaccessible x86 states
Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:19:41PM +0200, Jan Kiszka wrote: Alexander Graf wrote: On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. OK. Whole hflags or just the GIF bit? If we allow access to all bits, can user space cause any problems (beyond screwing up its guests) by passing weird patterns? HF_NMI_MASK should be migrated too. Destination should enable IRET intercept if HF_NMI_MASK is set. Or we can assume that migration in the middle of NMI will never happen :) HF_NMI_MASK is redundant to the vendor-agnostic nmi_masked and would therefore likely be masked out. Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: List of unaccessible x86 states
On Tue, Oct 20, 2009 at 03:29:38PM +0200, Jan Kiszka wrote: Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:19:41PM +0200, Jan Kiszka wrote: Alexander Graf wrote: On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. OK. Whole hflags or just the GIF bit? If we allow access to all bits, can user space cause any problems (beyond screwing up its guests) by passing weird patterns? HF_NMI_MASK should be migrated too. Destination should enable IRET intercept if HF_NMI_MASK is set. Or we can assume that migration in the middle of NMI will never happen :) HF_NMI_MASK is redundant to the vendor-agnostic nmi_masked and would therefore likely be masked out. Correct. We can restore HF_NMI_MASK from nmi_masked. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: List of unaccessible x86 states
On Tue, Oct 20, 2009 at 03:01:15PM +0200, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector Should be migrated. - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] v3: use upstream kvm_vcpu_ioctl
[v2: we already return -errno, so fix testers ] [v3: keep error message for apic related failures ] Signed-off-by: Glauber Costa glom...@redhat.com --- kvm-all.c |3 -- qemu-kvm-x86.c | 90 +-- qemu-kvm.c | 31 --- qemu-kvm.h |1 + 4 files changed, 48 insertions(+), 77 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 0a8aa4c..50cd1fb 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -863,7 +863,6 @@ int kvm_vm_ioctl(KVMState *s, int type, ...) return ret; } -#ifdef KVM_UPSTREAM int kvm_vcpu_ioctl(CPUState *env, int type, ...) { int ret; @@ -881,8 +880,6 @@ int kvm_vcpu_ioctl(CPUState *env, int type, ...) return ret; } -#endif - int kvm_has_sync_mmu(void) { #ifdef KVM_CAP_SYNC_MMU diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index fb70ede..09e4f8c 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -174,18 +174,11 @@ static int kvm_handle_tpr_access(CPUState *env) int kvm_enable_vapic(CPUState *env, uint64_t vapic) { - int r; struct kvm_vapic_addr va = { .vapic_addr = vapic, }; - r = ioctl(env-kvm_fd, KVM_SET_VAPIC_ADDR, va); - if (r == -1) { - r = -errno; - perror(kvm_enable_vapic); - return r; - } - return 0; + return kvm_vcpu_ioctl(env, KVM_SET_VAPIC_ADDR, va); } #endif @@ -283,28 +276,29 @@ int kvm_destroy_memory_alias(kvm_context_t kvm, uint64_t phys_start) int kvm_get_lapic(CPUState *env, struct kvm_lapic_state *s) { - int r; +int r = 0; + if (!kvm_irqchip_in_kernel()) - return 0; - r = ioctl(env-kvm_fd, KVM_GET_LAPIC, s); - if (r == -1) { - r = -errno; - perror(kvm_get_lapic); - } - return r; + return r; + + r = kvm_vcpu_ioctl(env, KVM_GET_LAPIC, s); +if (r 0) +fprintf(stderr, KVM_GET_LAPIC failed\n) +return r; } int kvm_set_lapic(CPUState *env, struct kvm_lapic_state *s) { - int r; +int r = 0; + if (!kvm_irqchip_in_kernel()) return 0; - r = ioctl(env-kvm_fd, KVM_SET_LAPIC, s); - if (r == -1) { - r = -errno; - perror(kvm_set_lapic); - } - return r; + + r = kvm_vcpu_ioctl(env, KVM_SET_LAPIC, s); + +if (r 0) +fprintf(stderr, KVM_SET_LAPIC failed\n) +return r; } #endif 
@@ -356,7 +350,6 @@ int kvm_has_pit_state2(kvm_context_t kvm) void kvm_show_code(CPUState *env) { #define SHOW_CODE_LEN 50 - int fd = env-kvm_fd; struct kvm_regs regs; struct kvm_sregs sregs; int r, n; @@ -365,13 +358,13 @@ void kvm_show_code(CPUState *env) char code_str[SHOW_CODE_LEN * 3 + 1]; unsigned long rip; - r = ioctl(fd, KVM_GET_SREGS, sregs); - if (r == -1) { + r = kvm_vcpu_ioctl(env, KVM_GET_SREGS, sregs); + if (r 0 ) { perror(KVM_GET_SREGS); return; } - r = ioctl(fd, KVM_GET_REGS, regs); - if (r == -1) { + r = kvm_vcpu_ioctl(env, KVM_GET_REGS, regs); + if (r 0) { perror(KVM_GET_REGS); return; } @@ -420,29 +413,25 @@ struct kvm_msr_list *kvm_get_msr_list(kvm_context_t kvm) int kvm_get_msrs(CPUState *env, struct kvm_msr_entry *msrs, int n) { struct kvm_msrs *kmsrs = qemu_malloc(sizeof *kmsrs + n * sizeof *msrs); -int r, e; +int r; kmsrs-nmsrs = n; memcpy(kmsrs-entries, msrs, n * sizeof *msrs); -r = ioctl(env-kvm_fd, KVM_GET_MSRS, kmsrs); -e = errno; +r = kvm_vcpu_ioctl(env, KVM_GET_MSRS, kmsrs); memcpy(msrs, kmsrs-entries, n * sizeof *msrs); free(kmsrs); -errno = e; return r; } int kvm_set_msrs(CPUState *env, struct kvm_msr_entry *msrs, int n) { struct kvm_msrs *kmsrs = qemu_malloc(sizeof *kmsrs + n * sizeof *msrs); -int r, e; +int r; kmsrs-nmsrs = n; memcpy(kmsrs-entries, msrs, n * sizeof *msrs); -r = ioctl(env-kvm_fd, KVM_SET_MSRS, kmsrs); -e = errno; +r = kvm_vcpu_ioctl(env, KVM_SET_MSRS, kmsrs); free(kmsrs); -errno = e; return r; } @@ -464,7 +453,7 @@ int kvm_get_mce_cap_supported(kvm_context_t kvm, uint64_t *mce_cap, int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap) { #ifdef KVM_CAP_MCE -return ioctl(env-kvm_fd, KVM_X86_SETUP_MCE, mcg_cap); +return kvm_vcpu_ioctl(env, KVM_X86_SETUP_MCE, mcg_cap); #else return -ENOSYS; #endif @@ -473,7 +462,7 @@ int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap) int kvm_set_mce(CPUState *env, struct kvm_x86_mce *m) { #ifdef KVM_CAP_MCE -return ioctl(env-kvm_fd, KVM_X86_SET_MCE, m); +return kvm_vcpu_ioctl(env, 
KVM_X86_SET_MCE, m); #else return -ENOSYS; #endif @@ -496,13 +485,12 @@ static void print_dt(FILE *file, const char *name, struct kvm_dtable *dt) void kvm_show_regs(CPUState
Re: List of unaccessible x86 states
Alexander Graf wrote:
> On 20.10.2009, at 15:01, Jan Kiszka wrote:
>> Hi all,
>>
>> as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one or more substates for the new KVM_GET/SET_VCPU_STATE interface.
>>
>> What I read so far (or tried to patch already):
>>
>> - nmi_masked
>> - nmi_pending
>> - nmi_injected
>> - kvm_queued_exception (whole struct content)
>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>
>> Unclear points (for me) from the last discussion:
>>
>> - sipi_vector
>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>
>> Please extend or correct the list as required.
>
> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it.

BTW, GIF is related to svm nesting, right?

Orit, are there any additional states arriving on the vmx side as well with your nesting patches?

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [Autotest] [PATCH] Test 802.1Q vlan of nic
On Tue, Oct 20, 2009 at 11:19 AM, Michael Goldish mgold...@redhat.com wrote: See comments below. - Dor Laor dl...@redhat.com wrote: On 10/15/2009 11:48 AM, Amos Kong wrote: Test 802.1Q vlan of nic, config it by vconfig command. 1) Create two VMs 2) Setup guests in different vlan by vconfig and test communication by ping using hard-coded ip address 3) Setup guests in same vlan and test communication by ping 4) Recover the vlan config Signed-off-by: Amos Kongak...@redhat.com --- client/tests/kvm/kvm_tests.cfg.sample | 6 +++ client/tests/kvm/tests/vlan_tag.py | 73 + 2 files changed, 79 insertions(+), 0 deletions(-) mode change 100644 = 100755 client/tests/kvm/scripts/qemu-ifup In general the above should come as an independent patch. create mode 100644 client/tests/kvm/tests/vlan_tag.py diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample index 9ccc9b5..4e47767 100644 --- a/client/tests/kvm/kvm_tests.cfg.sample +++ b/client/tests/kvm/kvm_tests.cfg.sample @@ -166,6 +166,12 @@ variants: used_cpus = 5 used_mem = 2560 + - vlan_tag: install setup + type = vlan_tag + subnet2 = 192.168.123 + vlans = 10 20 If we want to be fanatic and safe we should dynamically choose subnet and vlans numbers that are not used on the host instead of hard code it. For the sake of safety maybe we should start both VMs with -snapshot. Dor, what do you think? Is it safe to start 2 VMs with the same disk image when only one of them uses -snapshot? + nic_mode = tap + nic_model = e1000 Why only e1000? Let's test virtio and rtl8139 as well. Can't you inherit the nic model from the config? It's not just inherited, it's overwritten, because nic_model is defined later in the file in a variants block. So this nic_model line has no effect. 
- autoit: install setup type = autoit diff --git a/client/tests/kvm/scripts/qemu-ifup b/client/tests/kvm/scripts/qemu-ifup old mode 100644 new mode 100755 diff --git a/client/tests/kvm/tests/vlan_tag.py b/client/tests/kvm/tests/vlan_tag.py new file mode 100644 index 000..15e763f --- /dev/null +++ b/client/tests/kvm/tests/vlan_tag.py @@ -0,0 +1,73 @@ +import logging, time +from autotest_lib.client.common_lib import error +import kvm_subprocess, kvm_test_utils, kvm_utils + +def run_vlan_tag(test, params, env): + + Test 802.1Q vlan of nic, config it by vconfig command. + + 1) Create two VMs + 2) Setup guests in different vlan by vconfig and test communication by ping + using hard-coded ip address + 3) Setup guests in same vlan and test communication by ping + 4) Recover the vlan config + + @param test: Kvm test object + @param params: Dictionary with the test parameters. + @param env: Dictionary with test environment. + + + vm = [] + session = [] + subnet2 = params.get(subnet2) + vlans = params.get(vlans).split() + + vm.append(kvm_test_utils.get_living_vm(env, %s % params.get(main_vm))) There's no need for the %s here. ...get_living_vm(env, params.get(main_vm))) should work. + params_vm2 = params.copy() + params_vm2['image_snapshot'] = yes + params_vm2['kill_vm_gracefully'] = no + params_vm2[address_index] = int(params.get(address_index, 0))+1 + vm.append(vm[0].clone(vm2, params_vm2)) + kvm_utils.env_register_vm(env, vm2, vm[1]) + if not vm[1].create(): + raise error.TestError(VM 1 create faild) The whole 7-8 lines above should be grouped as a function to clone existing VM. It should be part of kvm autotest infrastructure. Besides that, it looks good. There's already a clone function and it's being used here. Instead of those 7-8 lines, why not just define the VM in the config file? It looks like you're always using 2 VMs so there's no reason to do this in test code.
This should do what you want: - vlan_tag: install setup type = vlan_tag subnet2 = 192.168.123 vlans = 10 20 nic_mode = tap vms += vm2 extra_params_vm2 += -snapshot kill_vm_gracefully_vm2 = no address_index_vm2 = 1 The preprocessor then automatically creates vm2 and registers it in env. To use it in the test just do: vm.append(kvm_test_utils.get_living_vm(env, vm2)) You can also use a parameter that tells the test which VM to use if you don't want the name vm2 hardcoded into the test. Add something like this to the config file: 2nd_vm = vm2 and in the test use params.get(2nd_vm) instead of vm2 (just like you use main_vm). + + for i in range(2): + session.append(kvm_test_utils.wait_for_login(vm[i])) + + try: + vconfig_cmd = vconfig add eth0 %s;ifconfig eth0.%s %s.%s + # Attempt
Re: List of unaccessible x86 states
On 20.10.2009, at 15:37, Jan Kiszka wrote:
> Alexander Graf wrote:
>> On 20.10.2009, at 15:01, Jan Kiszka wrote:
>>> Hi all,
>>>
>>> as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one or more substates for the new KVM_GET/SET_VCPU_STATE interface.
>>>
>>> What I read so far (or tried to patch already):
>>>
>>> - nmi_masked
>>> - nmi_pending
>>> - nmi_injected
>>> - kvm_queued_exception (whole struct content)
>>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>>
>>> Unclear points (for me) from the last discussion:
>>>
>>> - sipi_vector
>>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>>
>>> Please extend or correct the list as required.
>>
>> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it.
>
> BTW, GIF is related to svm nesting, right?

Yes and no. It's an architecture addition that came with SVM, yes.

The problem is that I don't want to support migrating while in a nested VM. We can just #VMEXIT just before migrating with a VMEXIT_INTR intercept.

Now just after #VMEXIT we're in a state that's pure host context, but has GIF=0. So we need to know about that in userspace to support migration.

Alex
RE: Do I set up separate bridges for each guest?
Dor:

> The simplest thing is to use a single bridge for all - The physical nic should be part of it and supply the outside world connection.
> The physical nic doesn't need an IP and the bridge should own it.
> All vms can use this bridge.

I want to assign a static IP to each of the guests, how would I do that with a single bridge?

Thanks,
Neil

--
Neil Aggarwal, (281)846-8957, www.JAMMConsulting.com
Will your e-commerce site go offline if you have a DB server failure, fiber cut, flood, fire, or other disaster? If so, ask about our geographically redundant database system.
Re: List of unaccessible x86 states
On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
> On 20.10.2009, at 15:37, Jan Kiszka wrote:
>> Alexander Graf wrote:
>>> On 20.10.2009, at 15:01, Jan Kiszka wrote:
>>>> Hi all,
>>>>
>>>> as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one or more substates for the new KVM_GET/SET_VCPU_STATE interface.
>>>>
>>>> What I read so far (or tried to patch already):
>>>>
>>>> - nmi_masked
>>>> - nmi_pending
>>>> - nmi_injected
>>>> - kvm_queued_exception (whole struct content)
>>>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>>>
>>>> Unclear points (for me) from the last discussion:
>>>>
>>>> - sipi_vector
>>>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>>>
>>>> Please extend or correct the list as required.
>>>
>>> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it.
>>
>> BTW, GIF is related to svm nesting, right?
>
> Yes and no. It's an architecture addition that came with SVM, yes.
>
> The problem is that I don't want to support migrating while in a

Why not?

> nested VM. We can just #VMEXIT just before migrating with a VMEXIT_INTR intercept.

We don't notify kernel about migration currently. CPU state is migrated when VM is already paused, how can we exit nested guest at this point?

> Now just after #VMEXIT we're in a state that's pure host context, but has GIF=0. So we need to know about that in userspace to support migration.
>
> Alex

--
Gleb.
Re: 0.11: SMP guests using one host CPU only?
Avi Kivity wrote:
> On 10/20/2009 10:19 PM, Tomasz Chmielewski wrote:
>>> I meant, how many qemu threads are there, and how much cpu does each take?
>>
>> There is only one qemu thread for the 4-cpu guest.
>
> Not possible. Even a single-cpu guest has two threads.
>
>> ps auxH should show me all threads? I started it multiple times, and it showed 1 thread for the 4-CPU guest (with no CPU intensive tasks running - could this be a reason?).
>
> What does 'ls /proc/$(pgrep qemu)/task' show?

Running several CPU-intensive processes on this guest uses only one CPU on the host.

Both ps auxH and /proc confirm that this guest has 4-5 threads when I run several CPU-intensive apps. Only one thread for this guest uses 100% CPU time; other threads use ~0%.

If I don't run any CPU-intensive tasks on this guest, it only runs one thread (unless I misinterpret something here).

Some 1-CPU guests have only one thread though?

# QEMU_TASKS=$(pgrep qemu)
# for QEMU_TASK in $QEMU_TASKS; do cat /proc/$QEMU_TASK/cmdline ; echo ; ls /proc/$QEMU_TASK/task ; echo ; done

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/lvs2,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3F-nettap,vlan=0,script=/etc/qemu-ifup-localtime-smp4
17687/ 19018/ 19020/ 19069/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/gluster1a,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3A-nettap,vlan=0,script=/etc/qemu-ifup-localtime
19220/ 24857/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/gluster2a,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3B-nettap,vlan=0,script=/etc/qemu-ifup-localtime
19252/ 24896/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/gluster3a,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3C-nettap,vlan=0,script=/etc/qemu-ifup-localtime
19258/ 24934/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/gluster4a,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3D-nettap,vlan=0,script=/etc/qemu-ifup-localtime
25878/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/lvs1,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3E-nettap,vlan=0,script=/etc/qemu-ifup-localtime
25920/

No CPU-intensive apps:

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/lvs2,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3F-nettap,vlan=0,script=/etc/qemu-ifup-localtime-smp4
17687/

--
Tomasz Chmielewski
http://wpkg.org
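For anyone who wants to repeat the per-thread check above without eyeballing ps output, here is a rough Python sketch. It only assumes the usual Linux /proc layout (one directory per thread under /proc/PID/task, utime and stime as fields 14 and 15 of the stat file). The function names and the proc_root parameter are made up for this example and are not part of qemu or any tool mentioned in the thread.

```python
import os


def list_thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread IDs of `pid`, read from <proc_root>/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())


def thread_cpu_ticks(pid, tid, proc_root="/proc"):
    """Return (utime, stime) in clock ticks for one thread of `pid`."""
    path = os.path.join(proc_root, str(pid), "task", str(tid), "stat")
    with open(path) as f:
        data = f.read()
    # The command name sits in parentheses and may contain spaces,
    # so only split the part that follows the last ')'.
    fields = data[data.rindex(")") + 2:].split()
    # fields[0] is the state (field 3 of the full line), so utime and
    # stime (fields 14 and 15) end up at indexes 11 and 12.
    return int(fields[11]), int(fields[12])
```

Sampling thread_cpu_ticks() twice a few seconds apart for every ID from list_thread_ids() should show whether more than one vcpu thread is actually accumulating CPU time.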
Re: List of unaccessible x86 states
On 20.10.2009, at 15:48, Gleb Natapov wrote:
> On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
>> On 20.10.2009, at 15:37, Jan Kiszka wrote:
>>> Alexander Graf wrote:
>>>> On 20.10.2009, at 15:01, Jan Kiszka wrote:
>>>>> Hi all,
>>>>>
>>>>> as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one or more substates for the new KVM_GET/SET_VCPU_STATE interface.
>>>>>
>>>>> What I read so far (or tried to patch already):
>>>>>
>>>>> - nmi_masked
>>>>> - nmi_pending
>>>>> - nmi_injected
>>>>> - kvm_queued_exception (whole struct content)
>>>>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>>>>
>>>>> Unclear points (for me) from the last discussion:
>>>>>
>>>>> - sipi_vector
>>>>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>>>>
>>>>> Please extend or correct the list as required.
>>>>
>>>> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it.
>>>
>>> BTW, GIF is related to svm nesting, right?
>>
>> Yes and no. It's an architecture addition that came with SVM, yes.
>>
>> The problem is that I don't want to support migrating while in a
>
> Why not?

Because then we'd have to transfer the whole host cpu cache and the merged intercept bitmaps to userspace as well. That's just too many internals to expose IMHO.

>> nested VM. We can just #VMEXIT just before migrating with a VMEXIT_INTR intercept.
>
> We don't notify kernel about migration currently. CPU state is migrated when VM is already paused, how can we exit nested guest at this point?

Hm - introduce a new ioctl? I haven't fully thought it through yet :-).

Alex
[PATCH] sched, cpuacct: fix niced guest time accounting
Hi Avi,

This is the patch we discussed earlier. Please review it.

BTW, should this be sent to lkml as well?

Regards,
  ozaki-r

From 8aea0f1a9acc891d1208bc462a05797765451ab4 Mon Sep 17 00:00:00 2001
From: Ryota Ozaki ozaki.ry...@gmail.com
Date: Tue, 20 Oct 2009 22:41:12 +0900
Subject: [PATCH] sched, cpuacct: fix niced guest time accounting

CPU time of a guest is always accounted in 'user' time without concern for the nice value of its counterpart process although the guest is scheduled under the nice value.

This patch fixes the defect and accounts cpu time of a niced guest in 'nice' time as same as a niced process.

And also the patch adds 'guest_nice' to cpuacct. The value provides niced guest cpu time which is like 'nice' to 'user'.

Signed-off-by: Ryota Ozaki ozaki.ry...@gmail.com
---
 Documentation/filesystems/proc.txt |    3 ++-
 fs/proc/stat.c                     |   17 +++--
 include/linux/kernel_stat.h        |    1 +
 kernel/sched.c                     |    9 +++--
 4 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2c48f94..4af0018 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1072,7 +1072,8 @@ second). The meanings of the columns are as follows, from left to right:
 - irq: servicing interrupts
 - softirq: servicing softirqs
 - steal: involuntary wait
-- guest: running a guest
+- guest: running a normal guest
+- guest_nice: running a niced guest

 The intr line gives counts of interrupts serviced since boot time, for each
 of the possible system interrupts. The first column is the total of all
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 7cc726c..67c30a7 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -27,7 +27,7 @@ static int show_stat(struct seq_file *p, void *v)
         int i, j;
         unsigned long jif;
         cputime64_t user, nice, system, idle, iowait, irq, softirq, steal;
-        cputime64_t guest;
+        cputime64_t guest, guest_nice;
         u64 sum = 0;
         u64 sum_softirq = 0;
         unsigned int per_softirq_sums[NR_SOFTIRQS] = {0};
@@ -36,7 +36,7 @@ static int show_stat(struct seq_file *p, void *v)

         user = nice = system = idle = iowait =
                 irq = softirq = steal = cputime64_zero;
-        guest = cputime64_zero;
+        guest = guest_nice = cputime64_zero;
         getboottime(&boottime);
         jif = boottime.tv_sec;

@@ -51,6 +51,8 @@ static int show_stat(struct seq_file *p, void *v)
                 softirq = cputime64_add(softirq, kstat_cpu(i).cpustat.softirq);
                 steal = cputime64_add(steal, kstat_cpu(i).cpustat.steal);
                 guest = cputime64_add(guest, kstat_cpu(i).cpustat.guest);
+                guest_nice = cputime64_add(guest_nice,
+                        kstat_cpu(i).cpustat.guest_nice);
                 for_each_irq_nr(j) {
                         sum += kstat_irqs_cpu(j, i);
                 }
@@ -65,7 +67,7 @@ static int show_stat(struct seq_file *p, void *v)
         }
         sum += arch_irq_stat();

-        seq_printf(p, "cpu %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
+        seq_printf(p, "cpu %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
                 (unsigned long long)cputime64_to_clock_t(user),
                 (unsigned long long)cputime64_to_clock_t(nice),
                 (unsigned long long)cputime64_to_clock_t(system),
@@ -74,7 +76,8 @@ static int show_stat(struct seq_file *p, void *v)
                 (unsigned long long)cputime64_to_clock_t(irq),
                 (unsigned long long)cputime64_to_clock_t(softirq),
                 (unsigned long long)cputime64_to_clock_t(steal),
-                (unsigned long long)cputime64_to_clock_t(guest));
+                (unsigned long long)cputime64_to_clock_t(guest),
+                (unsigned long long)cputime64_to_clock_t(guest_nice));
         for_each_online_cpu(i) {

                 /* Copy values here to work around gcc-2.95.3, gcc-2.96 */
@@ -88,8 +91,9 @@ static int show_stat(struct seq_file *p, void *v)
                 softirq = kstat_cpu(i).cpustat.softirq;
                 steal = kstat_cpu(i).cpustat.steal;
                 guest = kstat_cpu(i).cpustat.guest;
+                guest_nice = kstat_cpu(i).cpustat.guest_nice;
                 seq_printf(p,
-                        "cpu%d %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
+                        "cpu%d %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
                         i,
                         (unsigned long long)cputime64_to_clock_t(user),
                         (unsigned long long)cputime64_to_clock_t(nice),
@@ -99,7 +103,8 @@ static int show_stat(struct seq_file *p, void *v)
                         (unsigned long long)cputime64_to_clock_t(irq),
                         (unsigned long long)cputime64_to_clock_t(softirq),
                         (unsigned long long)cputime64_to_clock_t(steal),
-
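To illustrate what a userspace reader of the extended cpu lines could look like, here is a small Python sketch. It assumes the column order documented in the proc.txt hunk above, with guest_nice as the tenth value, and it assumes guest time stays folded into user/nice as discussed earlier in the thread. The helper names are invented for this example; missing trailing columns (older kernels) are treated as zero.

```python
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Parse one 'cpu' or 'cpuN' line of /proc/stat into (name, dict of ticks)."""
    parts = line.split()
    if not parts or not parts[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    values = [int(v) for v in parts[1:]]
    # Kernels without guest/guest_nice print fewer columns; pad with zeros.
    values += [0] * (len(CPU_FIELDS) - len(values))
    return parts[0], dict(zip(CPU_FIELDS, values))


def host_only_times(stats):
    """Subtract guest time from user/nice (assumes guest time is included there)."""
    return {"user": stats["user"] - stats["guest"],
            "nice": stats["nice"] - stats["guest_nice"]}
```

A tool that does not know about the new column keeps working, since it only reads the leading fields it already understands.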
Interface is requiring IP address even though it is for a bridge
Hello:

I am trying to follow the RHEL virtualization guide to set up a bridge on a system running CentOS 5.4.

I copied my ifcfg-eth0 to ifcfg-eth0:1 and set its content to this:

DEVICE=eth0:1
HWADDR=[The MAC address from eth0]
ONBOOT=yes
BRIDGE=br1

I then created ifcfg-br1 with this content:

DEVICE=br1
TYPE=Bridge
BOOTPROTO=static
BROADCAST=192.168.2.255
IPADDR=192.168.2.202
NETMASK=255.255.255.0
NETWORK=192.168.2.0
ONBOOT=yes
DELAY=0

When I go service network restart, I get this error:

error in ifcfg-eth0:1: didn't specify device or ipaddr

I specified the device so it looks like it wants an IP address but that is contrary to what I am reading on the Internet.

Am I supposed to give eth0:1 an IP address?

Thanks,
Neil

--
Neil Aggarwal, (281)846-8957, www.JAMMConsulting.com
Will your e-commerce site go offline if you have a DB server failure, fiber cut, flood, fire, or other disaster? If so, ask about our geographically redundant database system.
Re: [PATCH 0/3] Further integration with qemu.git
On Mon, Oct 19, 2009 at 11:20:41AM -0200, Glauber Costa wrote:
> A couple of more functions are used from qemu.git. Merging keeps going...

Applied, thanks.
Re: [PATCH] move tpr stuff to qemu-kvm-x86.c
On Mon, Oct 19, 2009 at 11:29:25AM -0200, Glauber Costa wrote:
> this whole tpr thing does not belong in common code. Move it to i386 specific files.

Applied, thanks.
Re: [PATCH, -next] KVM: x86: Fix 32-bit host build warning
On Tue, Oct 20, 2009 at 02:15:10PM +0200, Jan Kiszka wrote:
> Fixes "cast to pointer from integer of different size" on 32-bit hosts and applies a micro-refactoring.
>
> Signed-off-by: Jan Kiszka jan.kis...@siemens.com

Applied, thanks.
Re: KVM: VMX: remove GUEST_CR3 write from vmx_vcpu_run
On Tue, Oct 20, 2009 at 10:14:52PM +0900, Avi Kivity wrote:
> On 10/20/2009 09:37 PM, Marcelo Tosatti wrote:
>> GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 value changes.
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 364263a..325075f 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3638,10 +3638,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>  {
>>          struct vcpu_vmx *vmx = to_vmx(vcpu);
>>
>> -        if (enable_ept && is_paging(vcpu)) {
>> -                vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
>> +        if (enable_ept && is_paging(vcpu))
>>                  ept_load_pdptrs(vcpu);
>> -        }
>> +
>>          /* Record the guest's net vcpu time for enforced NMI injections. */
>>          if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
>>                  vmx->entry_time = ktime_get();
>
> Nice. Any reason why ept_load_pdptrs() couldn't go the same way?

It's already protected by VCPU_EXREG_PDPTR caching, so it does not buy much. The advantage would be symmetry with cr3.
Re: [PATCH] v3: use upstream kvm_vcpu_ioctl
On Tue, Oct 20, 2009 at 11:36:58AM -0200, Glauber Costa wrote:
> [v2: we already return -errno, so fix testers ]
> [v3: keep error message for apic related failures ]
>
> Signed-off-by: Glauber Costa glom...@redhat.com

Applied, thanks.
Re: [PATCH] v3: use upstream kvm_vcpu_ioctl
On Tue, Oct 20, 2009 at 11:36:58AM -0200, Glauber Costa wrote:
> [v2: we already return -errno, so fix testers ]
> [v3: keep error message for apic related failures ]
>
> Signed-off-by: Glauber Costa glom...@redhat.com

Dropped, does not compile.
qemu-kvm: require 4K aligned resource size for memory
KVM does not virtualize low address bits for memory accesses, so we must require that PCI BAR size is a multiple of 4K for passthrough to work (this also guarantees that address is 4K aligned).

Users of recent linux kernels can force resource size up to 4K using:

commit 32a9a682bef2f6fce7026bd94d1ce20028b0e52d
Author: Yuji Shimada shimada-...@necst.nec.co.jp
Date: Mon Mar 16 17:13:39 2009 +0900

    PCI: allow assignment of memory resources with a specified alignment

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 237060f..c2ef31f 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -381,6 +381,14 @@ static int assigned_dev_register_regions(PCIRegion *io_regions,
             int t = cur_region->type & IORESOURCE_PREFETCH
                 ? PCI_ADDRESS_SPACE_MEM_PREFETCH
                 : PCI_ADDRESS_SPACE_MEM;
+            if (cur_region->size & 0xFFF) {
+                fprintf(stderr, "Unable to assign device: PCI region %d "
+                        "at address 0x%llx has size 0x%x, "
+                        " which is not a multiple of 4K\n",
+                        i, (unsigned long long)cur_region->base_addr,
+                        cur_region->size);
+                return -1;
+            }

             /* map physical memory */
             pci_dev->v_addrs[i].e_physbase = cur_region->base_addr;
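As a quick illustration of the check this patch adds, the same mask test can be written in a few lines of Python. The function names are made up for this note. round_up_4k() is not something the patch does; it is only there to show what size a region would need to reach before it passes the check.

```python
PAGE_MASK_4K = 0xFFF  # low 12 bits, the same mask the patch tests


def is_4k_multiple(size):
    """True when `size` has no bits set below the 4K boundary."""
    return (size & PAGE_MASK_4K) == 0


def round_up_4k(size):
    """Smallest multiple of 4K that is >= `size` (illustrative only)."""
    return (size + PAGE_MASK_4K) & ~PAGE_MASK_4K
```

For example, a 256-byte BAR (0x100) fails is_4k_multiple() and would have to be sized up to 0x1000 before it could be assigned under this rule.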
Re: vhost-net patches
On Tue, Oct 20, 2009 at 10:14:55AM -0700, Shirley Ma wrote:
> Hello Michael,
>
>> what is vnet->vector? And what do you mean by not defined?
>
> In function:
>
> static int vhost_virtqueue_init()
> {
>     ..
>     r = vdev->binding->irqfd(vdev->binding_opaque, q->vector, vq->call);
>     ..
> }
>
> q->vector is 65535,

Thanks for debugging this. I think this means that guest does not use MSI-X. You can verify this by booting guest without vhost, and performing the following command:

cat /proc/interrupts

Please note that you currently need recent kernel in guest, so that it uses MSI-X. I plan on implementing regular IRQ, but not yet, and it will be slower anyway.

> in
>
> static int virtio_pci_irqfd()
> {
>     ..
>     if (vector >= proxy->pci_dev.msix_entries_nr) {
>         fprintf(stderr, "pci irq fd returned vector %d, msix_entries_nr %d \n",
>                 vector, proxy->pci_dev.msix_entries_nr); --- I added one output line here.
>         return -EINVAL;
>     }
>     ...
>
> The output is:
> pci irq fd returned vector 65535, msix_entries_nr 3, EINVAL is returned.
>
> thanks
> Shirley Ma
> IBM Linux Technology Center
> 15300 SW Koll Parkway
> Beaverton, OR 97006-6063
> Phone(Fax): (503) 578-7638
>
> From: Michael S. Tsirkin m...@redhat.com
> To: Shirley Ma/Beaverton/i...@ibmus
> Date: 10/20/2009 04:34 AM
> cc: s...@linux.vnet.ibm.com, David Stevens/Beaverton/i...@ibmus, kvm@vger.kernel.org
> Subject: Re: vhost-net patches
>
>> On Mon, Oct 19, 2009 at 04:08:24PM -0700, Shirley Ma wrote:
>>> Hello Michael,
>>>
>>>> They all failed with the following error
>>>> vhost_net_init returned -7
>>>> This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when vhost_net_start() fails. It looks like dev->binding->irqfd() is failing in vhost_virtqueue_init(). Haven't yet debugged further. I have CONFIG_EVENTFD enabled in the host kernel.
>>>
>>> From the debug output, looks like the vnet->vector is not defined,
>>
>> what is vnet->vector? And what do you mean by not defined?
>>
>>> and the default msix_entries_nr is 3, so it returned EINVAL from virtio_pci_irqfd. Looks we need to either disable QEMU_PCI_CAP_MSIX or define vector in QEMU configuration?
>>
>> You shouldn't have to do anything.
>>
>>> I am not familiar with MSIX stuffs.
>>>
>>> Thanks
>>> Shirley
>>>
>>> From: s...@linux.vnet.ibm.com
>>> To: Michael S. Tsirkin m...@redhat.com, kvm@vger.kernel.org
>>> Date: 10/19/2009 03:56 PM
>>> cc: David Stevens/Beaverton/i...@ibmus, Shirley Ma/Beaverton/i...@ibmus
>>> Subject: Re: vhost-net patches
>>>
>>>> On Sun, 2009-10-18 at 19:32 +0200, Michael S. Tsirkin wrote:
>>>>> On Sun, Oct 18, 2009 at 12:53:56PM +0200, Michael S. Tsirkin wrote:
>>>>>> On Fri, Oct 16, 2009 at 12:29:29PM -0700, Sridhar Samudrala wrote:
>>>>>>> Hi Michael,
>>>>>>>
>>>>>>> We are trying out your vhost-net patches from your git trees on kernel.org. I am using mst/vhost.git as host kernel and mst/qemu-kvm.git for qemu.
>>>>>>>
>>>>>>> I am using the following qemu script to start the guest using userspace tap backend.
>>>>>>>
>>>>>>> home/sridhar/git/mst/qemu-kvm/x86_64-softmmu/qemu-system-x86_64 /home/sridhar/kvm_images/fedora10-1-vm -m 512 -drive file=/home/sridhar/kvm_images/fedora10-1-vm,if=virtio,index=0,boot=on -net nic,macaddr=54:52:00:35:e3:73,model=virtio -net tap,ifname=vnet0,script=no,downscript=no
>>>>>>>
>>>>>>> Now that i got the default backend to work, i
Re: vhost-net patches
On Tue, Oct 20, 2009 at 10:27:38AM -0700, Shirley Ma wrote:
Hello Michael, Here is the output. I am using a guest 2.6.32-rc3 kernel. It doesn't use MSI-X. So which guest kernel should I use?

[...@localhost ~]$ cat /proc/interrupts
       CPU0
  0:    299   IO-APIC-edge   timer
  1:      2   IO-APIC-edge   i8042
  2:      0   XT-PIC-XT      cascade
  4:     76   IO-APIC-edge   serial
 11:   2126   IO-APIC-edge   virtio1, virtio0   - here is the virtio for both disk and network i/o??

Yes, this is regular shared IRQ, no good. I think your guest is too old, please use kernel 2.6.31 and up in guest. I will work to improve the error message as well.

 12:     89   IO-APIC-edge   i8042
NMI:      0   Non-maskable interrupts
LOC:   5146   Local timer interrupts
SPU:      0   Spurious interrupts
CNT:      0   Performance counter interrupts
PND:      0   Performance pending work
RES:      0   Rescheduling interrupts
CAL:      0   Function call interrupts
TLB:      0   TLB shootdowns
TRM:      0   Thermal event interrupts
MCE:      0   Machine check exceptions
MCP:      1   Machine check polls
ERR:      0
MIS:      0
[...@localhost ~]$ uname -r
2.6.32-rc3

Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

Michael S. Tsirkin m...@redhat.com
10/20/2009 10:18 AM
To: Shirley Ma/Beaverton/i...@ibmus
cc: David Stevens/Beaverton/i...@ibmus, kvm@vger.kernel.org, s...@linux.vnet.ibm.com
Subject: Re: vhost-net patches

On Tue, Oct 20, 2009 at 10:14:55AM -0700, Shirley Ma wrote:
Hello Michael, what is vnet-vector? And what do you mean by not defined?

In function:
static int vhost_virtqueue_init()
{
    ..
    r = vdev->binding->irqfd(vdev->binding_opaque, q->vector, vq->call);
    ..
}
q->vector is 65535,

Thanks for debugging this. I think this means that guest does not use MSI-X. You can verify this by booting guest without vhost, and performing the following command:
cat /proc/interrupts
Please note that you currently need recent kernel in guest, so that it uses MSI-X. I plan on implementing regular IRQ, but not yet, and it will be slower anyway.

in
static int virtio_pci_irqfd()
{
    ..
    if (vector >= proxy->pci_dev.msix_entries_nr) {
        fprintf(stderr, "pci irq fd returned vector %d, msix_entries_nr %d \n",
                vector, proxy->pci_dev.msix_entries_nr);   --- I added one output line here.
        return -EINVAL;
    }
    ...
The output is:
pci irq fd returned vector 65535, msix_entries_nr 3,
EINVAL is returned.

thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

Michael S. Tsirkin m...@redhat.com
10/20/2009 04:34 AM
To: Shirley Ma/Beaverton/i...@ibmus
cc: s...@linux.vnet.ibm.com, David Stevens/Beaverton/i...@ibmus, kvm@vger.kernel.org
Subject: Re: vhost-net patches

On Mon, Oct 19, 2009 at 04:08:24PM -0700, Shirley Ma wrote:
Hello Michael, They all failed with the following error
vhost_net_init returned -7

This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when vhost_net_start() fails. It looks like dev->binding->irqfd() is failing in vhost_virtqueue_init(). Haven't yet debugged further. I have CONFIG_EVENTFD enabled in the host kernel.

From the debug output, looks like the vnet-vector is not defined, what is vnet-vector?
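The bounds check discussed in this thread can be tried in isolation. The following is a minimal sketch, not the QEMU code: `check_msix_vector` and `NO_VECTOR` are hypothetical names, and the reading of 65535 (0xffff) as "no MSI-X vector assigned to this queue" is an assumption based on the values reported above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumption: 0xffff marks a virtqueue that has no MSI-X vector assigned. */
#define NO_VECTOR 0xffffu

/*
 * Hypothetical stand-in for the check quoted from virtio_pci_irqfd():
 * returns 0 when the vector indexes a valid MSI-X entry, -EINVAL otherwise.
 */
static int check_msix_vector(uint16_t vector, unsigned msix_entries_nr)
{
    if (vector >= msix_entries_nr) {
        return -EINVAL;
    }
    return 0;
}
```

With the numbers from the debug output (vector 65535, msix_entries_nr 3) the check fails, which would be consistent with a guest that never programmed an MSI-X vector for the queue.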
Re: [PATCH] v3: use upstream kvm_vcpu_ioctl
On Tue, Oct 20, 2009 at 03:10:18PM -0200, Marcelo Tosatti wrote: On Tue, Oct 20, 2009 at 11:36:58AM -0200, Glauber Costa wrote: [v2: we already return -errno, so fix testers ] [v3: keep error message for apic related failures ] Signed-off-by: Glauber Costa glom...@redhat.com Dropped, does not compile. sorry, my bad, silly mistake. will send another -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] v4: use upstream kvm_vcpu_ioctl
[v2: we already return -errno, so fix testers ]
[v3: keep error message for apic related failures ]
[v4: fix silly compile mistake ]

Signed-off-by: Glauber Costa glom...@redhat.com
---
 kvm-all.c      |    3 --
 qemu-kvm-x86.c |   90 +--
 qemu-kvm.c     |   31 ---
 qemu-kvm.h     |    1 +
 4 files changed, 48 insertions(+), 77 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 0a8aa4c..50cd1fb 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -863,7 +863,6 @@ int kvm_vm_ioctl(KVMState *s, int type, ...)
     return ret;
 }
 
-#ifdef KVM_UPSTREAM
 int kvm_vcpu_ioctl(CPUState *env, int type, ...)
 {
     int ret;
@@ -881,8 +880,6 @@ int kvm_vcpu_ioctl(CPUState *env, int type, ...)
     return ret;
 }
 
-#endif
-
 int kvm_has_sync_mmu(void)
 {
 #ifdef KVM_CAP_SYNC_MMU
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index fb70ede..c1d0ae9 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -174,18 +174,11 @@ static int kvm_handle_tpr_access(CPUState *env)
 
 int kvm_enable_vapic(CPUState *env, uint64_t vapic)
 {
-    int r;
     struct kvm_vapic_addr va = {
         .vapic_addr = vapic,
     };
 
-    r = ioctl(env->kvm_fd, KVM_SET_VAPIC_ADDR, &va);
-    if (r == -1) {
-        r = -errno;
-        perror("kvm_enable_vapic");
-        return r;
-    }
-    return 0;
+    return kvm_vcpu_ioctl(env, KVM_SET_VAPIC_ADDR, &va);
 }
 
 #endif
@@ -283,28 +276,29 @@ int kvm_destroy_memory_alias(kvm_context_t kvm, uint64_t phys_start)
 
 int kvm_get_lapic(CPUState *env, struct kvm_lapic_state *s)
 {
-    int r;
+    int r = 0;
+
     if (!kvm_irqchip_in_kernel())
-        return 0;
-    r = ioctl(env->kvm_fd, KVM_GET_LAPIC, s);
-    if (r == -1) {
-        r = -errno;
-        perror("kvm_get_lapic");
-    }
-    return r;
+        return r;
+
+    r = kvm_vcpu_ioctl(env, KVM_GET_LAPIC, s);
+    if (r < 0)
+        fprintf(stderr, "KVM_GET_LAPIC failed\n");
+    return r;
 }
 
 int kvm_set_lapic(CPUState *env, struct kvm_lapic_state *s)
 {
-    int r;
+    int r = 0;
+
     if (!kvm_irqchip_in_kernel())
         return 0;
-    r = ioctl(env->kvm_fd, KVM_SET_LAPIC, s);
-    if (r == -1) {
-        r = -errno;
-        perror("kvm_set_lapic");
-    }
-    return r;
+
+    r = kvm_vcpu_ioctl(env, KVM_SET_LAPIC, s);
+
+    if (r < 0)
+        fprintf(stderr, "KVM_SET_LAPIC failed\n");
+    return r;
 }
 
 #endif
@@ -356,7 +350,6 @@ int kvm_has_pit_state2(kvm_context_t kvm)
 void kvm_show_code(CPUState *env)
 {
 #define SHOW_CODE_LEN 50
-    int fd = env->kvm_fd;
     struct kvm_regs regs;
     struct kvm_sregs sregs;
     int r, n;
@@ -365,13 +358,13 @@ void kvm_show_code(CPUState *env)
     char code_str[SHOW_CODE_LEN * 3 + 1];
     unsigned long rip;
 
-    r = ioctl(fd, KVM_GET_SREGS, &sregs);
-    if (r == -1) {
+    r = kvm_vcpu_ioctl(env, KVM_GET_SREGS, &sregs);
+    if (r < 0 ) {
         perror("KVM_GET_SREGS");
         return;
     }
-    r = ioctl(fd, KVM_GET_REGS, &regs);
-    if (r == -1) {
+    r = kvm_vcpu_ioctl(env, KVM_GET_REGS, &regs);
+    if (r < 0) {
         perror("KVM_GET_REGS");
         return;
     }
@@ -420,29 +413,25 @@ struct kvm_msr_list *kvm_get_msr_list(kvm_context_t kvm)
 int kvm_get_msrs(CPUState *env, struct kvm_msr_entry *msrs, int n)
 {
     struct kvm_msrs *kmsrs = qemu_malloc(sizeof *kmsrs + n * sizeof *msrs);
-    int r, e;
+    int r;
 
     kmsrs->nmsrs = n;
     memcpy(kmsrs->entries, msrs, n * sizeof *msrs);
-    r = ioctl(env->kvm_fd, KVM_GET_MSRS, kmsrs);
-    e = errno;
+    r = kvm_vcpu_ioctl(env, KVM_GET_MSRS, kmsrs);
     memcpy(msrs, kmsrs->entries, n * sizeof *msrs);
     free(kmsrs);
-    errno = e;
     return r;
 }
 
 int kvm_set_msrs(CPUState *env, struct kvm_msr_entry *msrs, int n)
 {
     struct kvm_msrs *kmsrs = qemu_malloc(sizeof *kmsrs + n * sizeof *msrs);
-    int r, e;
+    int r;
 
     kmsrs->nmsrs = n;
     memcpy(kmsrs->entries, msrs, n * sizeof *msrs);
-    r = ioctl(env->kvm_fd, KVM_SET_MSRS, kmsrs);
-    e = errno;
+    r = kvm_vcpu_ioctl(env, KVM_SET_MSRS, kmsrs);
     free(kmsrs);
-    errno = e;
     return r;
 }
 
@@ -464,7 +453,7 @@ int kvm_get_mce_cap_supported(kvm_context_t kvm, uint64_t *mce_cap,
 int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap)
 {
 #ifdef KVM_CAP_MCE
-    return ioctl(env->kvm_fd, KVM_X86_SETUP_MCE, mcg_cap);
+    return kvm_vcpu_ioctl(env, KVM_X86_SETUP_MCE, mcg_cap);
 #else
     return -ENOSYS;
 #endif
@@ -473,7 +462,7 @@ int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap)
 int kvm_set_mce(CPUState *env, struct kvm_x86_mce *m)
 {
 #ifdef KVM_CAP_MCE
-    return ioctl(env->kvm_fd, KVM_X86_SET_MCE, m);
+    return kvm_vcpu_ioctl(env, KVM_X86_SET_MCE, m);
 #else
     return -ENOSYS;
 #endif
@@ -496,13 +485,12 @@ static void print_dt(FILE *file, const char *name, struct kvm_dtable
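The convention this patch relies on (callers test `r < 0` because the wrapper already returns `-errno`) can be illustrated with a small standalone helper. This is a sketch under assumptions: `ioctl_neg_errno` is a hypothetical name and takes a raw file descriptor, whereas the real kvm_vcpu_ioctl() takes a CPUState and a variable argument list.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: forwards to ioctl(2) and folds the "-1 plus errno"
 * convention into a single negative errno return value, so callers only
 * need to check for r < 0 and never have to save or restore errno.
 */
static int ioctl_neg_errno(int fd, unsigned long request, void *arg)
{
    int ret = ioctl(fd, request, arg);

    if (ret == -1) {
        ret = -errno;
    }
    return ret;
}
```

Callers written against the old `r == -1` test would silently miss failures once a wrapper like this is used, which appears to be what the v2 note about fixing testers refers to.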
Re: List of unaccessible x86 states
On Tue, Oct 20, 2009 at 03:01:15PM +0200, Jan Kiszka wrote:
Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one or more substates for the new KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):
- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:
- sipi_vector
- MCE (covered via kvm_queued_exception, or does it require more?)

Should save/restore the MCE MSRs (their contents are currently lost/overwritten AFAICS). MTRR contents are also dropped.
GDB + KVM Debug
I have now tried using both
Set arch i8086
and
Set arch i386:x86-64:intel
but still see the same issue. Do I need to apply any patch?

Abhishek

-Original Message-
From: Jan Kiszka [mailto:jan.kis...@siemens.com]
Sent: Thursday, September 17, 2009 1:36 AM
To: Saksena, Abhishek
Cc: kvm@vger.kernel.org
Subject: Re: GDB + KVM Debug

Saksena, Abhishek wrote:
I am using KVM-88. However, I still can't get gdb working. I started qemu with the -s -S options, and when I try to connect gdb to it I get the following error:

(gdb) target remote lochost:1234
lochost: unknown host
lochost:1234: No such file or directory.
(gdb) target remote locahost:1234
locahost: unknown host
locahost:1234: No such file or directory.
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
[New Thread 1]
Remote 'g' packet reply is too long: 2306f0ff023002f07f03000 0
(gdb)

Try 'set arch target-architecture' before connecting. This is required if you didn't load the corresponding target image into gdb.

Jan
--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: List of unaccessible x86 states
On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:48, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:37, Jan Kiszka wrote: Alexander Graf wrote: On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. BTW, GIF is related to svm nesting, right? Yes and no. It's an architecture addition that came with SVM, yes. The problem is that I don't want to support migrating while in a Why not? Because then we'd have to transfer the whole host cpu cache and the merged intercept bitmaps to userspace as well. That's just too many internals to expose IMHO. But the amount of information is constant no matter how l2 guest there are. Correct? We can expose it as separate substate. nested VM. We can just #VMEXIT just before migrating with a VMEXIT_INTR intercept. We don't notify kernel about migration currently. CPU state is migrated when VM is already paused, how we can exit nested guest at this point? Hm - introduce a new ioctl? I haven't fully thought it through yet :-). There is not software problem that can't be solved by introducing new ioctl :) -- Gleb. 
Re: List of unaccessible x86 states
On 20.10.2009, at 20:55, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:48, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:37, Jan Kiszka wrote: Alexander Graf wrote: On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. BTW, GIF is related to svm nesting, right? Yes and no. It's an architecture addition that came with SVM, yes. The problem is that I don't want to support migrating while in a Why not? Because then we'd have to transfer the whole host cpu cache and the merged intercept bitmaps to userspace as well. That's just too many internals to expose IMHO. But the amount of information is constant no matter how l2 guest there are. Correct? We can expose it as separate substate. Or we can just not migrate while in a nested guest :-). Which will make everything a lot easier. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: List of unaccessible x86 states
On Tue, Oct 20, 2009 at 08:59:48PM +0200, Alexander Graf wrote: On 20.10.2009, at 20:55, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:48, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:37, Jan Kiszka wrote: Alexander Graf wrote: On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. BTW, GIF is related to svm nesting, right? Yes and no. It's an architecture addition that came with SVM, yes. The problem is that I don't want to support migrating while in a Why not? Because then we'd have to transfer the whole host cpu cache and the merged intercept bitmaps to userspace as well. That's just too many internals to expose IMHO. But the amount of information is constant no matter how l2 guest there are. Correct? We can expose it as separate substate. Or we can just not migrate while in a nested guest :-). Which will make everything a lot easier. Suppose we have a l2 guest that handles interrupt/nmis by itself how can we force it to exit? I don't think requesting certain cpu state before migration is the right thing to do. What if user paused a VM and then decided to migrate? 
Or the VM was paused automatically because of a shortage of disk space, and management wants to migrate the VM to another host with a bigger disk? -- Gleb.
[PATCH] move hlt exit to arch-specific code, and use upstream version.
HLT exit calls directly an arch-specific function. Furthermore, upstream qemu already places it on arch specific code, so let's follow it. The function that handles halt itself is almost equal between them. So let's use it.

Signed-off-by: Glauber Costa glom...@redhat.com
---
 qemu-kvm-x86.c    |   14 +++---
 qemu-kvm.c        |    3 ---
 target-i386/kvm.c |    2 ++
 3 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index c1d0ae9..6573dc5 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -199,6 +199,9 @@ int kvm_arch_run(CPUState *env)
         r = kvm_handle_tpr_access(env);
         break;
 #endif
+    case KVM_EXIT_HLT:
+        r = kvm_handle_halt(env);
+        break;
     default:
         r = 1;
         break;
@@ -1377,17 +1380,6 @@ int kvm_arch_init_vcpu(CPUState *cenv)
     return 0;
 }
 
-int kvm_arch_halt(CPUState *env)
-{
-
-    if (!((env->interrupt_request & CPU_INTERRUPT_HARD) &&
-          (env->eflags & IF_MASK)) &&
-        !(env->interrupt_request & CPU_INTERRUPT_NMI)) {
-        env->halted = 1;
-    }
-    return 1;
-}
-
 void kvm_arch_pre_kvm_run(void *opaque, CPUState *env)
 {
     if (!kvm_irqchip_in_kernel())
diff --git a/qemu-kvm.c b/qemu-kvm.c
index b8ae4d8..42ead38 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1002,9 +1002,6 @@ int kvm_run(CPUState *env)
         case KVM_EXIT_MMIO:
             r = handle_mmio(env);
             break;
-        case KVM_EXIT_HLT:
-            r = kvm_arch_halt(env);
-            break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
             break;
         case KVM_EXIT_SHUTDOWN:
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 1cf0dc3..de10ef1 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -761,6 +761,7 @@ int kvm_arch_post_run(CPUState *env, struct kvm_run *run)
     return 0;
 }
 
+#endif
 static int kvm_handle_halt(CPUState *env)
 {
@@ -775,6 +776,7 @@ static int kvm_handle_halt(CPUState *env)
     return 1;
 }
 
+#ifdef KVM_UPSTREAM
 int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
 {
     int ret = 0;
--
1.6.2.5
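The condition in the removed kvm_arch_halt() is easier to follow when split into named parts. The sketch below mirrors that logic on a stand-in structure; `struct cpu_sketch` and `handle_halt_sketch` are hypothetical, and the flag values are illustrative rather than copied from QEMU's headers (only IF_MASK as EFLAGS bit 9 is taken as given).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in QEMU's headers. */
#define CPU_INTERRUPT_HARD 0x02u
#define CPU_INTERRUPT_NMI  0x200u
#define IF_MASK            0x200u   /* EFLAGS.IF, bit 9 */

/* Hypothetical stand-in for the few CPUState fields the check reads. */
struct cpu_sketch {
    uint32_t interrupt_request;
    uint32_t eflags;
    int halted;
};

/*
 * Mark the vcpu halted unless something can wake it right away:
 * a hard interrupt with interrupts enabled, or a pending NMI.
 * Always returns 1, like the removed kvm_arch_halt().
 */
static int handle_halt_sketch(struct cpu_sketch *env)
{
    int hard_deliverable = (env->interrupt_request & CPU_INTERRUPT_HARD) &&
                           (env->eflags & IF_MASK);
    int nmi_pending = (env->interrupt_request & CPU_INTERRUPT_NMI) != 0;

    if (!hard_deliverable && !nmi_pending) {
        env->halted = 1;
    }
    return 1;
}
```

A hard interrupt with IF clear does not prevent the halt, while a pending NMI always does, since NMIs ignore EFLAGS.IF.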
Re: List of unaccessible x86 states
On 20.10.2009, at 21:09, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 08:59:48PM +0200, Alexander Graf wrote: On 20.10.2009, at 20:55, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:48, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:37, Jan Kiszka wrote: Alexander Graf wrote: On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. BTW, GIF is related to svm nesting, right? Yes and no. It's an architecture addition that came with SVM, yes. The problem is that I don't want to support migrating while in a Why not? Because then we'd have to transfer the whole host cpu cache and the merged intercept bitmaps to userspace as well. That's just too many internals to expose IMHO. But the amount of information is constant no matter how l2 guest there are. Correct? We can expose it as separate substate. Or we can just not migrate while in a nested guest :-). Which will make everything a lot easier. Suppose we have a l2 guest that handles interrupt/nmis by itself how can we force it to exit? If the nested hypervisor doesn't intercept INTR we don't support it anyways. 
I don't think requesting certain cpu state before migration is the right thing to do. What if user paused a VM and then decided to migrate?

So pausing has to make it go out of nested guest context too? Then we're not in the nested guest context, right? :)

Or VM was paused automatically because of shortage of disk space and management want to migrate VM to other host with bigger disk?

Same as before. Really, pushing the whole nesting state over is not a good idea.

Alex
Re: List of unaccessible x86 states
On Tue, Oct 20, 2009 at 09:23:22PM +0200, Alexander Graf wrote: On 20.10.2009, at 21:09, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 08:59:48PM +0200, Alexander Graf wrote: On 20.10.2009, at 20:55, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:48, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote: On 20.10.2009, at 15:37, Jan Kiszka wrote: Alexander Graf wrote: On 20.10.2009, at 15:01, Jan Kiszka wrote: Hi all, as the list of yet user-unaccessible x86 states is a bit volatile ATM, this is an attempt to collect the precise requirements for additional state fields. Once everyone feels the list is complete, we can decide how to partition it into one ore more substates for the new KVM_GET/SET_VCPU_STATE interface. What I read so far (or tried to patch already): - nmi_masked - nmi_pending - nmi_injected - kvm_queued_exception (whole struct content) - KVM_REQ_TRIPLE_FAULT (from vcpu.requests) Unclear points (for me) from the last discussion: - sipi_vector - MCE (covered via kvm_queued_exception, or does it require more?) Please extend or correct the list as required. hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to sync it. BTW, GIF is related to svm nesting, right? Yes and no. It's an architecture addition that came with SVM, yes. The problem is that I don't want to support migrating while in a Why not? Because then we'd have to transfer the whole host cpu cache and the merged intercept bitmaps to userspace as well. That's just too many internals to expose IMHO. But the amount of information is constant no matter how l2 guest there are. Correct? We can expose it as separate substate. Or we can just not migrate while in a nested guest :-). Which will make everything a lot easier. Suppose we have a l2 guest that handles interrupt/nmis by itself how can we force it to exit? If the nested hypervisor doesn't intercept INTR we don't support it anyways. Why? 
I looked at the code briefly, and it looks like we just inject the interrupt as usual instead of doing a nested exit if l2 does not intercept INTR. Have I misinterpreted the code? Even if I have, why not support it?

I don't think requesting certain cpu state before migration is the right thing to do. What if user paused a VM and then decided to migrate?

So pausing has to make it go out of nested guest context too?

Probably.

Then we're not in the nested guest context, right? :)

Or VM was paused automatically because of shortage of disk space and management want to migrate VM to other host with bigger disk?

Same as before.

What do you mean?

Really, pushing the whole nesting state over is not a good idea.

Maybe just disallow migration with a nested guest running, then? Cross-vendor migration is not possible anyway.

-- Gleb.
Re: [PATCH] move hlt exit to arch-specific code, and use upstream version.
On Tue, Oct 20, 2009 at 05:15:32PM -0200, Glauber Costa wrote: HLT exit calls directly an arch-specific function. Furthermore, upstream qemu already places it on arch specific code, so let's follow it. The function that handles halt itself is almost equal between them. So let's use it. kvm_handle_halt() may return 1. If it does kvm_arch_run() will return 1 too and kvm_run() will abort. Signed-off-by: Glauber Costa glom...@redhat.com --- qemu-kvm-x86.c| 14 +++--- qemu-kvm.c|3 --- target-i386/kvm.c |2 ++ 3 files changed, 5 insertions(+), 14 deletions(-) diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index c1d0ae9..6573dc5 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -199,6 +199,9 @@ int kvm_arch_run(CPUState *env) r = kvm_handle_tpr_access(env); break; #endif +case KVM_EXIT_HLT: +r = kvm_handle_halt(env); +break; default: r = 1; break; @@ -1377,17 +1380,6 @@ int kvm_arch_init_vcpu(CPUState *cenv) return 0; } -int kvm_arch_halt(CPUState *env) -{ - -if (!((env-interrupt_request CPU_INTERRUPT_HARD) - (env-eflags IF_MASK)) - !(env-interrupt_request CPU_INTERRUPT_NMI)) { -env-halted = 1; -} -return 1; -} - void kvm_arch_pre_kvm_run(void *opaque, CPUState *env) { if (!kvm_irqchip_in_kernel()) diff --git a/qemu-kvm.c b/qemu-kvm.c index b8ae4d8..42ead38 100644 --- a/qemu-kvm.c +++ b/qemu-kvm.c @@ -1002,9 +1002,6 @@ int kvm_run(CPUState *env) case KVM_EXIT_MMIO: r = handle_mmio(env); break; -case KVM_EXIT_HLT: -r = kvm_arch_halt(env); -break; case KVM_EXIT_IRQ_WINDOW_OPEN: break; case KVM_EXIT_SHUTDOWN: diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 1cf0dc3..de10ef1 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -761,6 +761,7 @@ int kvm_arch_post_run(CPUState *env, struct kvm_run *run) return 0; } +#endif static int kvm_handle_halt(CPUState *env) { @@ -775,6 +776,7 @@ static int kvm_handle_halt(CPUState *env) return 1; } +#ifdef KVM_UPSTREAM int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run) { int ret = 0; -- 1.6.2.5 -- To unsubscribe from 
this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] move hlt exit to arch-specific code, and use upstream version.
On Tue, Oct 20, 2009 at 09:47:44PM +0200, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 05:15:32PM -0200, Glauber Costa wrote: HLT exit calls directly an arch-specific function. Furthermore, upstream qemu already places it on arch specific code, so let's follow it. The function that handles halt itself is almost equal between them. So let's use it. kvm_handle_halt() may return 1. If it does kvm_arch_run() will return 1 too and kvm_run() will abort. kvm_arch_halt() may return 1 as well. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] move hlt exit to arch-specific code, and use upstream version.
On Tue, Oct 20, 2009 at 05:56:35PM -0200, Glauber Costa wrote: On Tue, Oct 20, 2009 at 09:47:44PM +0200, Gleb Natapov wrote: On Tue, Oct 20, 2009 at 05:15:32PM -0200, Glauber Costa wrote: HLT exit calls directly an arch-specific function. Furthermore, upstream qemu already places it on arch specific code, so let's follow it. The function that handles halt itself is almost equal between them. So let's use it. kvm_handle_halt() may return 1. If it does kvm_arch_run() will return 1 too and kvm_run() will abort. kvm_arch_halt() may return 1 as well. But it's called from another place and its return value is handled differently. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM: VMX: flush TLB with INVEPT on cpu migration
On Friday 02 October 2009 00:16:58 you wrote:
It is possible that stale EPTP-tagged mappings are used, if a vcpu migrates to a different pcpu. Set KVM_REQ_TLB_FLUSH in vmx_vcpu_load, when switching pcpus, which will invalidate both VPID and EPT mappings on the next vm-entry.

Thank you - I was at the brink of a nervous break-down before discovering this. Maybe it would help for the future to add a comment to ept_misconfig_inspect_spte that explains that this might be caused by out of sync tlbs, too (esp. when it doesn't show an apparent cause of the misconfig)

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e86f1a6..97f4265 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -708,7 +708,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (vcpu->cpu != cpu) {
 		vcpu_clear(vmx);
 		kvm_migrate_timers(vcpu);
-		vpid_sync_vcpu_all(vmx);
+		set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests);
 		local_irq_disable();
 		list_add(&vmx->local_vcpus_link,
 			 &per_cpu(vcpus_on_cpu, cpu));

--
 /\  Best regards,                      | mla...@freebsd.org
 \ / Max Laier                          | ICQ #67774661
  X  http://pf4freebsd.love2party.net/  | mla...@efnet
 / \ ASCII Ribbon Campaign              | Against HTML Mail and News
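The pattern the patch uses, recording a flush request when the vcpu changes physical CPU and acting on it at the next entry, can be sketched in plain userspace C. Everything here is hypothetical (`struct vcpu_sketch`, `REQ_TLB_FLUSH`, both functions), and the bit operations are deliberately non-atomic, unlike the kernel's set_bit().

```c
#include <assert.h>

#define REQ_TLB_FLUSH 0   /* hypothetical request bit number */

/* Hypothetical stand-in for the vcpu fields involved. */
struct vcpu_sketch {
    int cpu;                  /* physical CPU the vcpu last ran on */
    unsigned long requests;   /* pending request bits */
    unsigned flushes;         /* counts simulated TLB flushes */
};

/* On load: if the physical CPU changed, request a flush instead of doing it now. */
static void vcpu_load_sketch(struct vcpu_sketch *v, int cpu)
{
    if (v->cpu != cpu) {
        v->requests |= 1UL << REQ_TLB_FLUSH;
        v->cpu = cpu;
    }
}

/* Before entering the guest: consume the request exactly once. */
static void vcpu_enter_sketch(struct vcpu_sketch *v)
{
    if (v->requests & (1UL << REQ_TLB_FLUSH)) {
        v->requests &= ~(1UL << REQ_TLB_FLUSH);
        v->flushes++;   /* a real implementation would flush the TLB here */
    }
}
```

Deferring the work to the entry path means a single flush routine decides what to invalidate, which matches the patch description of covering both VPID and EPT mappings on the next vm-entry.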
[PATCH] Add 'downscript=no' into kvm command line
If no downscript is assigned, add 'downscript=no' to avoid error:
/etc/qemu-ifdown: could not launch network script

Signed-off-by: Yolkfull Chow yz...@redhat.com
---
 client/tests/kvm/kvm_vm.py |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index a8d96ca..0b8efbc 100755
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -252,6 +252,8 @@ class VM:
                 if script_path:
                     script_path = kvm_utils.get_path(root_dir, script_path)
                     qemu_cmd += ",downscript=%s" % script_path
+                else:
+                    qemu_cmd += ",downscript=no"
                 # Proceed to next NIC
                 vlan += 1
--
1.6.2.5
[ANNOUNCE] Sheepdog: Distributed Storage System for KVM
Hi everyone,

Sheepdog is a distributed storage system for KVM/QEMU. It provides highly available block level storage volumes to VMs like Amazon EBS. Sheepdog supports advanced volume management features such as snapshot, cloning, and thin provisioning. Sheepdog runs on several tens or hundreds of nodes, and the architecture is fully symmetric; there is no central node such as a meta-data server.

The following list describes the features of Sheepdog.

* Linear scalability in performance and capacity
* No single point of failure
* Redundant architecture (data is written to multiple nodes)
- Tolerance against network failure
* Zero configuration (newly added machines will join the cluster automatically)
- Autonomous load balancing
* Snapshot
- Online snapshot from qemu-monitor
* Clone from a snapshot volume
* Thin provisioning
- Amazon EBS API support (to use from a Eucalyptus instance)

(* = current features, - = on our todo list)

More details and download links are here:
http://www.osrg.net/sheepdog/

Note that the code is still in an early stage. There are some critical TODO items:

- VM image deletion support
- Support architectures other than X86_64
- Data recovery
- Free space management
- Guarantee reliability and availability under heavy load
- Performance improvement
- Reclaim unused blocks
- More documentation

We hope to find people interested in working together. Enjoy!

Here are examples:

- create images
$ kvm-img create -f sheepdog "Alice's Disk" 256G
$ kvm-img create -f sheepdog "Bob's Disk" 256G

- list images
$ shepherd info -t vdi
4 : Alice's Disk 256 GB (allocated: 0 MB, shared: 0 MB), 2009-10-15 16:17:18, tag:0, current
8 : Bob's Disk   256 GB (allocated: 0 MB, shared: 0 MB), 2009-10-15 16:29:20, tag:0, current

- start up a virtual machine
$ kvm --drive format=sheepdog,file="Alice's Disk"

- create a snapshot
$ kvm-img snapshot -c name sheepdog:"Alice's Disk"

- clone from a snapshot
$ kvm-img create -b sheepdog:"Alice's Disk":0 -f sheepdog "Charlie's Disk"

Thanks.

--
MORITA, Kazutaka
NTT Cyber Space Labs
OSS Computing Project Kernel Group
E-mail: morita.kazut...@lab.ntt.co.jp
Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host
On 30.09.2009, at 15:29, Avi Kivity wrote: On 09/30/2009 03:17 PM, Avi Kivity wrote: { struct page *page[1]; @@ -2331,7 +2374,7 @@ static int kvm_vm_mmap(struct file *file, struct vm_area_struct *vma) static struct file_operations kvm_vm_fops = { .release= kvm_vm_release, .unlocked_ioctl = kvm_vm_ioctl, -.compat_ioctl = kvm_vm_ioctl, +.compat_ioctl = kvm_vm_compat_ioctl, .mmap = kvm_vm_mmap, }; static int kvm_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf) This is a bit painful - I tried to avoid compat_ioctl. Maybe it's better to have dirty_bitmap_virt, given no existing users are impacted. But that misses compat_ptr(). So it looks like we'll need compat_ioctl. Patch looks fine, except s/log.log/log/. I'd also sizeof (compat_log) instead of sizeof(log) to avoid frightening reviewers. So has there been any decision on which road to take here? Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host
On 20.10.2009, at 15:23, Avi Kivity wrote: On 10/20/2009 07:09 PM, Alexander Graf wrote: On 30.09.2009, at 15:29, Avi Kivity wrote: On 09/30/2009 03:17 PM, Avi Kivity wrote: { struct page *page[1]; @@ -2331,7 +2374,7 @@ static int kvm_vm_mmap(struct file *file, struct vm_area_struct *vma) static struct file_operations kvm_vm_fops = { .release= kvm_vm_release, .unlocked_ioctl = kvm_vm_ioctl, -.compat_ioctl = kvm_vm_ioctl, +.compat_ioctl = kvm_vm_compat_ioctl, .mmap = kvm_vm_mmap, }; static int kvm_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf) This is a bit painful - I tried to avoid compat_ioctl. Maybe it's better to have dirty_bitmap_virt, given no existing users are impacted. But that misses compat_ptr(). So it looks like we'll need compat_ioctl. Patch looks fine, except s/log.log/log/. I'd also sizeof (compat_log) instead of sizeof(log) to avoid frightening reviewers. So has there been any decision on which road to take here? compat_ioctl, and being more careful in the future. So I'll include Arnd's patch in my patchset instead? Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host
On 10/20/2009 10:28 PM, Alexander Graf wrote: compat_ioctl, and being more careful in the future. So I'll include Arnd's patch in my patchset instead? Send it independently and Marcelo or myself will apply it. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
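The reason a separate compat path is needed comes down to structure layout: a 32-bit caller passes a 32-bit user pointer where the 64-bit kernel expects a 64-bit one, so the argument has to be widened field by field. The userspace sketch below illustrates that conversion; the struct and function names are hypothetical, the field layout is an assumption rather than a copy of the KVM headers, and the cast stands in for what compat_ptr() does in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical native layout: the bitmap is a full-width pointer. */
struct dirty_log_native {
    uint32_t slot;
    uint32_t padding;
    void *dirty_bitmap;
};

/* Hypothetical layout as seen by a 32-bit caller: the pointer is 32 bits wide. */
struct dirty_log_compat {
    uint32_t slot;
    uint32_t padding;
    uint32_t dirty_bitmap;
    uint32_t padding2;
};

/* Widen the 32-bit user pointer; the other fields copy across unchanged. */
static void dirty_log_from_compat(struct dirty_log_native *dst,
                                  const struct dirty_log_compat *src)
{
    dst->slot = src->slot;
    dst->padding = src->padding;
    dst->dirty_bitmap = (void *)(uintptr_t)src->dirty_bitmap;
}
```

A compat handler would perform this conversion for the affected ioctl and pass every other request through to the native handler unchanged.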