[COMMIT master] KVM: VMX: Use macros instead of hex value on cr0 initialization
From: Eduardo Habkost ehabk...@redhat.com

This should have no effect, it is just to make the code clearer.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 364263a..1773017 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2538,7 +2538,7 @@ static int vmx_vcpu_reset(struct kvm_vcpu *vcpu)
 	if (vmx->vpid != 0)
 		vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
 
-	vmx->vcpu.arch.cr0 = 0x60000010;
+	vmx->vcpu.arch.cr0 = X86_CR0_NW | X86_CR0_CD | X86_CR0_ET;
 	vmx_set_cr0(&vmx->vcpu, vmx->vcpu.arch.cr0); /* enter rmode */
 	vmx_set_cr4(&vmx->vcpu, 0);
 	vmx_set_efer(&vmx->vcpu, 0);
--
To unsubscribe from this list: send the line "unsubscribe kvm-commits" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
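The equivalence claimed by this commit can be checked by hand: the architectural reset value of CR0 is 0x60000010, which is exactly the NW, CD and ET bits. The following is only a sketch for illustration; the macro names mirror the kernel's X86_CR0_* definitions, while cr0_reset_value() is a hypothetical helper that does not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR0 bit positions; names mirror the kernel macros. */
#define X86_CR0_ET 0x00000010u /* Extension Type, bit 4 */
#define X86_CR0_NW 0x20000000u /* Not Write-through, bit 29 */
#define X86_CR0_CD 0x40000000u /* Cache Disable, bit 30 */

/* Hypothetical helper: the CR0 value a vcpu should see after reset. */
static uint32_t cr0_reset_value(void)
{
	return X86_CR0_NW | X86_CR0_CD | X86_CR0_ET;
}
```

Comparing cr0_reset_value() against the literal 0x60000010 shows that the macro form and the hex form describe the same register contents.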
[COMMIT master] Merge commit 'tip/x86/entry'
From: Avi Kivity a...@redhat.com

Merge the user-return-notifier infrastructure.

Signed-off-by: Avi Kivity a...@redhat.com
[COMMIT master] KVM: SVM: init_vmcb(): remove redundant save-cr0 initialization
From: Eduardo Habkost ehabk...@redhat.com

The svm_set_cr0() call will initialize save->cr0 properly even when npt is
enabled, clearing the NW and CD bits as expected, so we don't need to
initialize it manually for npt_enabled anymore.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c9ef6c0..34b700f 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -648,8 +648,6 @@ static void init_vmcb(struct vcpu_svm *svm)
 		control->intercept_cr_write &= ~(INTERCEPT_CR0_MASK|
 						 INTERCEPT_CR3_MASK);
 		save->g_pat = 0x0007040600070406ULL;
-		/* enable caching because the QEMU Bios doesn't enable it */
-		save->cr0 = X86_CR0_ET;
 		save->cr3 = 0;
 		save->cr4 = 0;
 	}
[COMMIT master] Fix user return notifier build
From: Avi Kivity a...@redhat.com

When CONFIG_USER_RETURN_NOTIFIER is set, we need to link
kernel/user-return-notifier.o.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/kernel/Makefile b/kernel/Makefile
index b8d4cd8..0ae57a8 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_RING_BUFFER) += trace/
 obj-$(CONFIG_SMP) += sched_cpupri.o
 obj-$(CONFIG_SLOW_WORK) += slow-work.o
 obj-$(CONFIG_PERF_EVENTS) += perf_event.o
+obj-$(CONFIG_USER_RETURN_NOTIFIER) += user-return-notifier.o
 
 ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
 # According to Alan Modra a...@linuxcare.com.au, the -fno-omit-frame-pointer is
[COMMIT master] KVM: SVM: Reset cr0 properly on vcpu reset
From: Eduardo Habkost ehabk...@redhat.com

svm_vcpu_reset() was not properly resetting the contents of the guest-visible
cr0 register, causing the following issue:
https://bugzilla.redhat.com/show_bug.cgi?id=525699

Without resetting cr0 properly, the vcpu was running the SIPI bootstrap routine
with paging enabled, making the vcpu get a pagefault exception while trying to
run it.

Instead of setting vmcb->save.cr0 directly, the new code just resets
kvm->arch.cr0 and calls kvm_set_cr0(). The bits that were set/cleared on
vmcb->save.cr0 (PG, WP, !CD, !NW) will be set properly by svm_set_cr0().

kvm_set_cr0() is used instead of calling svm_set_cr0() directly to make sure
kvm_mmu_reset_context() is called to reset the mmu to nonpaging mode.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ffa6ad2..c9ef6c0 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -628,11 +628,12 @@ static void init_vmcb(struct vcpu_svm *svm)
 	save->rip = 0x0000fff0;
 	svm->vcpu.arch.regs[VCPU_REGS_RIP] = save->rip;
 
-	/*
-	 * cr0 val on cpu init should be 0x60000010, we enable cpu
-	 * cache by default. the orderly way is to enable cache in bios.
+	/* This is the guest-visible cr0 value.
+	 * svm_set_cr0() sets PG and WP and clears NW and CD on save->cr0.
 	 */
-	save->cr0 = 0x00000010 | X86_CR0_PG | X86_CR0_WP;
+	svm->vcpu.arch.cr0 = X86_CR0_NW | X86_CR0_CD | X86_CR0_ET;
+	kvm_set_cr0(&svm->vcpu, svm->vcpu.arch.cr0);
+
 	save->cr4 = X86_CR4_PAE;
 	/* rdx = ?? */
 
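The split between the guest-visible cr0 and the value kept in the VMCB save area, as described in the commit message, can be modelled in a few lines. This is a toy sketch under stated assumptions, not the kernel's svm_set_cr0(): struct toy_vcpu and toy_set_cr0() are hypothetical, and only the shadow-paging behaviour named in the message (PG and WP forced on, CD and NW forced off) is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define X86_CR0_ET 0x00000010u
#define X86_CR0_WP 0x00010000u
#define X86_CR0_NW 0x20000000u
#define X86_CR0_CD 0x40000000u
#define X86_CR0_PG 0x80000000u

/* Hypothetical stand-in for the two copies of cr0 kept per vcpu. */
struct toy_vcpu {
	uint32_t guest_cr0; /* what the guest reads back */
	uint32_t hw_cr0;    /* what the hardware actually runs with */
};

/* Record the guest's view, then derive the hardware value from it. */
static void toy_set_cr0(struct toy_vcpu *v, uint32_t cr0)
{
	v->guest_cr0 = cr0;
	v->hw_cr0 = (cr0 | X86_CR0_PG | X86_CR0_WP)
		    & ~(X86_CR0_CD | X86_CR0_NW);
}
```

With the reset value NW | CD | ET, the guest continues to see paging disabled while the hardware copy has PG and WP set, which is the situation the fix restores.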
[COMMIT master] KVM: VMX: Move MSR_KERNEL_GS_BASE out of the vmx autoload msr area
From: Avi Kivity a...@redhat.com

Currently MSR_KERNEL_GS_BASE is saved and restored as part of the
guest/host msr reloading. Since we wish to lazy-restore all the other
msrs, save and reload MSR_KERNEL_GS_BASE explicitly instead of using
the common code.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1773017..d1f40cc 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -99,7 +99,8 @@ struct vcpu_vmx {
 	int save_nmsrs;
 	int msr_offset_efer;
 #ifdef CONFIG_X86_64
-	int msr_offset_kernel_gs_base;
+	u64 msr_host_kernel_gs_base;
+	u64 msr_guest_kernel_gs_base;
 #endif
 	struct vmcs *vmcs;
 	struct {
@@ -202,7 +203,7 @@ static void ept_save_pdptrs(struct kvm_vcpu *vcpu);
  */
 static const u32 vmx_msr_index[] = {
 #ifdef CONFIG_X86_64
-	MSR_SYSCALL_MASK, MSR_LSTAR, MSR_CSTAR, MSR_KERNEL_GS_BASE,
+	MSR_SYSCALL_MASK, MSR_LSTAR, MSR_CSTAR,
 #endif
 	MSR_EFER, MSR_K6_STAR,
 };
@@ -674,10 +675,10 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 #endif
 
 #ifdef CONFIG_X86_64
-	if (is_long_mode(&vmx->vcpu))
-		save_msrs(vmx->host_msrs +
-			  vmx->msr_offset_kernel_gs_base, 1);
-
+	if (is_long_mode(&vmx->vcpu)) {
+		rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
+		wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
+	}
 #endif
 	load_msrs(vmx->guest_msrs, vmx->save_nmsrs);
 	load_transition_efer(vmx);
@@ -711,6 +712,12 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 	save_msrs(vmx->guest_msrs, vmx->save_nmsrs);
 	load_msrs(vmx->host_msrs, vmx->save_nmsrs);
 	reload_host_efer(vmx);
+#ifdef CONFIG_X86_64
+	if (is_long_mode(&vmx->vcpu)) {
+		rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
+		wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
+	}
+#endif
 }
 
 static void vmx_load_host_state(struct vcpu_vmx *vmx)
@@ -940,9 +947,6 @@ static void setup_msrs(struct vcpu_vmx *vmx)
 		index = __find_msr_index(vmx, MSR_CSTAR);
 		if (index >= 0)
 			move_msr_up(vmx, index, save_nmsrs++);
-		index = __find_msr_index(vmx, MSR_KERNEL_GS_BASE);
-		if (index >= 0)
-			move_msr_up(vmx, index, save_nmsrs++);
 		/*
 		 * MSR_K6_STAR is only needed on long mode guests, and only
 		 * if efer.sce is enabled.
@@ -954,10 +958,6 @@ static void setup_msrs(struct vcpu_vmx *vmx)
 #endif
 	vmx->save_nmsrs = save_nmsrs;
 
-#ifdef CONFIG_X86_64
-	vmx->msr_offset_kernel_gs_base =
-		__find_msr_index(vmx, MSR_KERNEL_GS_BASE);
-#endif
 	vmx->msr_offset_efer = __find_msr_index(vmx, MSR_EFER);
 
 	if (cpu_has_vmx_msr_bitmap()) {
@@ -1015,6 +1015,10 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 	case MSR_GS_BASE:
 		data = vmcs_readl(GUEST_GS_BASE);
 		break;
+	case MSR_KERNEL_GS_BASE:
+		vmx_load_host_state(to_vmx(vcpu));
+		data = to_vmx(vcpu)->msr_guest_kernel_gs_base;
+		break;
 	case MSR_EFER:
 		return kvm_get_msr_common(vcpu, msr_index, pdata);
 #endif
@@ -1068,6 +1072,10 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 	case MSR_GS_BASE:
 		vmcs_writel(GUEST_GS_BASE, data);
 		break;
+	case MSR_KERNEL_GS_BASE:
+		vmx_load_host_state(vmx);
+		vmx->msr_guest_kernel_gs_base = data;
+		break;
 #endif
 	case MSR_IA32_SYSENTER_CS:
 		vmcs_write32(GUEST_SYSENTER_CS, data);
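The explicit save-and-swap that this patch introduces for MSR_KERNEL_GS_BASE can be illustrated without touching real hardware. The sketch below is an assumption-laden model: fake_msr stands in for the hardware register, and toy_rdmsr(), toy_wrmsr(), struct toy_vmx and the two toy_*_host_state() functions are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware MSR_KERNEL_GS_BASE register. */
static uint64_t fake_msr;

static uint64_t toy_rdmsr(void) { return fake_msr; }
static void toy_wrmsr(uint64_t value) { fake_msr = value; }

/* Hypothetical per-vcpu copies of the host and guest values. */
struct toy_vmx {
	uint64_t host_kernel_gs_base;
	uint64_t guest_kernel_gs_base;
};

/* Entering guest context: remember the host value, install the guest's. */
static void toy_save_host_state(struct toy_vmx *vmx)
{
	vmx->host_kernel_gs_base = toy_rdmsr();
	toy_wrmsr(vmx->guest_kernel_gs_base);
}

/* Leaving guest context: capture the guest value, restore the host's. */
static void toy_load_host_state(struct toy_vmx *vmx)
{
	vmx->guest_kernel_gs_base = toy_rdmsr();
	toy_wrmsr(vmx->host_kernel_gs_base);
}
```

Because the guest value is read back on the way out, a guest that changes the register while running keeps its new value for the next entry, which is why the get/set MSR handlers in the patch first force the host state to be loaded.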
[COMMIT master] KVM: Fix Xen hvm msr ioctl by adding a flags field
From: Avi Kivity a...@redhat.com

So we can extend it later.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
index f504e0b..36594ba 100644
--- a/Documentation/kvm/api.txt
+++ b/Documentation/kvm/api.txt
@@ -608,8 +608,8 @@ page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
 memory.
 
 struct kvm_xen_hvm_config {
+	__u32 flags;
 	__u32 msr;
-	__u32 pad1;
 	__u64 blob_addr_32;
 	__u64 blob_addr_64;
 	__u8 blob_size_32;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7203bca..93ed656 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2542,6 +2542,9 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		if (copy_from_user(&kvm->arch.xen_hvm_config, argp,
 				   sizeof(struct kvm_xen_hvm_config)))
 			goto out;
+		r = -EINVAL;
+		if (kvm->arch.xen_hvm_config.flags)
+			goto out;
 		r = 0;
 		break;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index cf2b011..6ed1a12 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -494,8 +494,8 @@ struct kvm_x86_mce {
 
 #ifdef KVM_CAP_XEN_HVM
 struct kvm_xen_hvm_config {
+	__u32 flags;
 	__u32 msr;
-	__u32 pad1;
 	__u64 blob_addr_32;
 	__u64 blob_addr_64;
 	__u8 blob_size_32;
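The point of rejecting non-zero flags today is that any bit defined later can be told apart from garbage passed by old userspace. A minimal sketch of the pattern follows; struct toy_xen_hvm_config is a cut-down, hypothetical version of the real structure and toy_set_config() is not a kernel function. Unlike the patch, which copies first and then checks, this sketch checks before copying.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Cut-down, hypothetical version of the ioctl argument. */
struct toy_xen_hvm_config {
	uint32_t flags; /* must be zero until flag bits are defined */
	uint32_t msr;
};

/* Accept the configuration only if no unknown flag bits are set. */
static int toy_set_config(struct toy_xen_hvm_config *dst,
			  const struct toy_xen_hvm_config *src)
{
	if (src->flags)
		return -EINVAL;
	*dst = *src;
	return 0;
}
```

A caller that leaves flags at zero keeps working unchanged once new flag bits are introduced, while a caller that sets a bit gets a clear error on kernels that do not know it.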
[COMMIT master] KVM: powerpc: Fix BUILD_BUG_ON condition
From: Hollis Blanchard holl...@us.ibm.com

The old BUILD_BUG_ON implementation didn't work with __builtin_constant_p().
Fixing that revealed this test had been inverted for a long time without
anybody noticing...

Signed-off-by: Hollis Blanchard holl...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/powerpc/kvm/timing.h b/arch/powerpc/kvm/timing.h
index bb13b1f..a550f0f 100644
--- a/arch/powerpc/kvm/timing.h
+++ b/arch/powerpc/kvm/timing.h
@@ -48,7 +48,7 @@ static inline void kvmppc_set_exit_type(struct kvm_vcpu *vcpu, int type) {}
 static inline void kvmppc_account_exit_stat(struct kvm_vcpu *vcpu, int type)
 {
 	/* type has to be known at build time for optimization */
-	BUILD_BUG_ON(__builtin_constant_p(type));
+	BUILD_BUG_ON(!__builtin_constant_p(type));
 	switch (type) {
 	case EXT_INTR_EXITS:
 		vcpu->stat.ext_intr_exits++;
[COMMIT master] KVM: x86 shared msr infrastructure
From: Avi Kivity a...@redhat.com

The various syscall-related MSRs are fairly expensive to switch. Currently
we switch them on every vcpu preemption, which is far too often:

- if we're switching to a kernel thread (idle task, threaded interrupt,
  kernel-mode virtio server (vhost-net), for example) and back, then
  there's no need to switch those MSRs since kernel threads won't
  be exiting to userspace.

- if we're switching to another guest running an identical OS, most likely
  those MSRs will have the same value, so there's little point in reloading
  them.

- if we're running the same OS on the guest and host, the MSRs will have
  identical values and reloading is unnecessary.

This patch uses the new user return notifiers to implement last-minute
switching, and checks the msr values to avoid unnecessary reloading.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0558ff8..26a74b7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -809,4 +809,7 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
 int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
 int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
 
+void kvm_define_shared_msr(unsigned index, u32 msr);
+void kvm_set_shared_msr(unsigned index, u64 val);
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index b84e571..4cd4983 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -28,6 +28,7 @@ config KVM
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_EVENTFD
 	select KVM_APIC_ARCHITECTURE
+	select USER_RETURN_NOTIFIER
 	---help---
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions.  You will need a fairly recent
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b7f9bfe..7203bca 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -37,6 +37,7 @@
 #include <linux/iommu.h>
 #include <linux/intel-iommu.h>
 #include <linux/cpufreq.h>
+#include <linux/user-return-notifier.h>
 #include <trace/events/kvm.h>
 #undef TRACE_INCLUDE_FILE
 #define CREATE_TRACE_POINTS
@@ -87,6 +88,25 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
 int ignore_msrs = 0;
 module_param_named(ignore_msrs, ignore_msrs, bool, S_IRUGO | S_IWUSR);
 
+#define KVM_NR_SHARED_MSRS 16
+
+struct kvm_shared_msrs_global {
+	int nr;
+	struct kvm_shared_msr {
+		u32 msr;
+		u64 value;
+	} msrs[KVM_NR_SHARED_MSRS];
+};
+
+struct kvm_shared_msrs {
+	struct user_return_notifier urn;
+	bool registered;
+	u64 current_value[KVM_NR_SHARED_MSRS];
+};
+
+static struct kvm_shared_msrs_global __read_mostly shared_msrs_global;
+static DEFINE_PER_CPU(struct kvm_shared_msrs, shared_msrs);
+
 struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ "pf_fixed", VCPU_STAT(pf_fixed) },
 	{ "pf_guest", VCPU_STAT(pf_guest) },
@@ -123,6 +143,64 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ NULL }
 };
 
+static void kvm_on_user_return(struct user_return_notifier *urn)
+{
+	unsigned slot;
+	struct kvm_shared_msr *global;
+	struct kvm_shared_msrs *locals
+		= container_of(urn, struct kvm_shared_msrs, urn);
+
+	for (slot = 0; slot < shared_msrs_global.nr; ++slot) {
+		global = &shared_msrs_global.msrs[slot];
+		if (global->value != locals->current_value[slot]) {
+			wrmsrl(global->msr, global->value);
+			locals->current_value[slot] = global->value;
+		}
+	}
+	locals->registered = false;
+	user_return_notifier_unregister(urn);
+}
+
+void kvm_define_shared_msr(unsigned slot, u32 msr)
+{
+	int cpu;
+	u64 value;
+
+	if (slot >= shared_msrs_global.nr)
+		shared_msrs_global.nr = slot + 1;
+	shared_msrs_global.msrs[slot].msr = msr;
+	rdmsrl_safe(msr, &value);
+	shared_msrs_global.msrs[slot].value = value;
+	for_each_online_cpu(cpu)
+		per_cpu(shared_msrs, cpu).current_value[slot] = value;
+}
+EXPORT_SYMBOL_GPL(kvm_define_shared_msr);
+
+static void kvm_shared_msr_cpu_online(void)
+{
+	unsigned i;
+	struct kvm_shared_msrs *locals = &__get_cpu_var(shared_msrs);
+
+	for (i = 0; i < shared_msrs_global.nr; ++i)
+		locals->current_value[i] = shared_msrs_global.msrs[i].value;
+}
+
+void kvm_set_shared_msr(unsigned slot, u64 value)
+{
+	struct kvm_shared_msrs *smsr = &__get_cpu_var(shared_msrs);
+
+	if (value == smsr->current_value[slot])
+		return;
+	smsr->current_value[slot] = value;
+	wrmsrl(shared_msrs_global.msrs[slot].msr, value);
+	if (!smsr->registered) {
+		smsr->urn.on_user_return = kvm_on_user_return;
+		user_return_notifier_register(&smsr->urn);
+		smsr->registered = true;
+	}
+}
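The two savings described in the commit message (skip the write when the value is already current, and defer restoring host values until the return to userspace) can be modelled in userspace by counting simulated MSR writes. Everything below is a hypothetical single-CPU model: the toy_* names, the four-slot array and the write counter are illustrative assumptions, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_SHARED_MSRS 4

static uint64_t toy_host_value[TOY_NR_SHARED_MSRS];    /* values the host wants */
static uint64_t toy_current_value[TOY_NR_SHARED_MSRS]; /* values loaded right now */
static bool toy_notifier_registered;
static unsigned toy_msr_writes; /* counts the expensive operation */

/* Simulated wrmsr: update the "hardware" and count the write. */
static void toy_wrmsr(unsigned slot, uint64_t value)
{
	toy_current_value[slot] = value;
	toy_msr_writes++;
}

/* Load a guest value, but only if it differs from what is loaded. */
static void toy_set_shared_msr(unsigned slot, uint64_t value)
{
	if (value == toy_current_value[slot])
		return;
	toy_wrmsr(slot, value);
	toy_notifier_registered = true; /* host values need restoring later */
}

/* Runs just before returning to userspace: restore changed slots only. */
static void toy_on_user_return(void)
{
	for (unsigned slot = 0; slot < TOY_NR_SHARED_MSRS; ++slot)
		if (toy_host_value[slot] != toy_current_value[slot])
			toy_wrmsr(slot, toy_host_value[slot]);
	toy_notifier_registered = false;
}
```

In this model a guest whose MSR values match the host's never triggers a write, and a guest with different values costs one write on entry and one on the eventual return to userspace, regardless of how many kernel-thread context switches happen in between.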
[COMMIT master] KVM: remove duplicated task_switch check
From: Gleb Natapov g...@redhat.com

Probably introduced by a bad merge.

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index be968f1..2ef3906 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4532,11 +4532,6 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason)
 	if (reason != TASK_SWITCH_CALL && reason != TASK_SWITCH_GATE)
 		old_tss_sel = 0xffff;
 
-	/* set back link to prev task only if NT bit is set in eflags
-	   note that old_tss_sel is not used afetr this point */
-	if (reason != TASK_SWITCH_CALL && reason != TASK_SWITCH_GATE)
-		old_tss_sel = 0xffff;
-
 	if (nseg_desc.type & 8)
 		ret = kvm_task_switch_32(vcpu, tss_selector, old_tss_sel,
 					 old_tss_base, &nseg_desc);
[COMMIT master] Merge commit 'tip/x86/entry'
From: Avi Kivity a...@redhat.com
[COMMIT master] KVM: get_tss_base_addr() should return a gpa_t
From: Gleb Natapov g...@redhat.com

If the TSS we are switching to resides in high memory, the task switch will
fail since the address will be truncated. Windows2k3 does this sometimes
when running with more than 4G.

Cc: sta...@kernel.org
Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 93ed656..be968f1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4214,7 +4214,7 @@ static int save_guest_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
 	return kvm_write_guest_virt(dtable.base + index*8, seg_desc, sizeof(*seg_desc), vcpu);
 }
 
-static u32 get_tss_base_addr(struct kvm_vcpu *vcpu,
+static gpa_t get_tss_base_addr(struct kvm_vcpu *vcpu,
 			     struct desc_struct *seg_desc)
 {
 	u32 base_addr = get_desc_base(seg_desc);
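The truncation is easy to reproduce in isolation. The sketch below assumes, as the new gpa_t return type suggests, that the function hands back a guest physical address that may lie above 4G; toy_gpa_t, base_as_u32() and base_as_gpa() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit guest physical address type. */
typedef uint64_t toy_gpa_t;

/* Old behaviour: returning through a 32-bit type drops the high bits. */
static uint32_t base_as_u32(toy_gpa_t base)
{
	return (uint32_t)base;
}

/* New behaviour: the full address survives the return. */
static toy_gpa_t base_as_gpa(toy_gpa_t base)
{
	return base;
}
```

An address such as 0x100001000 (4G plus one page) comes back as 0x1000 through the 32-bit return type, so the task switch would read the TSS from the wrong place.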
Re: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM
On 2009/10/25 17:51, Dietmar Maurer wrote:
>>> Do you support multiple guests accessing the same image?
>> A VM image can be attached to any VM, but only to one VM at a time;
>> multiple running VMs cannot access the same VM image.
> I guess this is a problem when you want to do live migrations?

Yes, because Sheepdog locks a VM image when it is opened. To avoid this
problem, locking must be delayed until migration is done. This is also a
TODO item.

--
MORITA Kazutaka
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] KVM: x86: Add VCPU substate for NMI states
Avi Kivity wrote:
> On 10/15/2009 01:27 PM, Jan Kiszka wrote:
>>> Perhaps it makes sense to query about individual states, including
>>> existing ones? That will allow us to deprecate and then phase out
>>> broken states. It's probably not worth it.
>> You may do this already with the given design: Set up a VCPU, then
>> issue KVM_GET_VCPU_STATE on the substate in question. You will either
>> get an error code or 0 if the substate is supported. At least no
>> additional kernel code required.
> No, if some code requires a feature, we don't want to set up a guest
> and a vcpu and issue dummy commands in order to find out if we can
> actually run that code. Feature discovery needs to be a 'system ioctl'
> in the words of Documentation/kvm/api.txt.

OK, added some system IOCTL 'KVM_GET_VCPU_STATE_LIST' to my to-do list.

Jan
Re: I/O performance of VirtIO
Avi Kivity wrote:
> On 10/23/2009 12:06 AM, Alexander Graf wrote:
>> On 22.10.2009 at 18:29, Avi Kivity a...@redhat.com wrote:
>>> On 10/13/2009 08:35 AM, Jan Kiszka wrote:
>>>> It can be particularly slow if you use in-kernel irqchips and the
>>>> default NIC emulation (up to 10 times slower), some effect I always
>>>> wanted to understand on a rainy day. So, when you actually want
>>>> -net user, try -no-kvm-irqchip.
>>> This might be due to a missing SIGIO or SIGALRM; -no-kvm-irqchip
>>> generates a lot of extra signals and thus polling opportunities.
>> Isn't that what dedicated io threads are supposed to solve?
> No. Dedicated I/O threads provide parallelism. All latency needs is to
> have SIGIO sent on all file descriptors (or rather, in qemu-kvm with
> irqchip, to have all file descriptors in the poll() call).
>
> Jan, does slirp add new connections to the select set?

It should do so in slirp_select_fill (it iterates over all TCP/UDP sockets
of all instances). I think without doing this, slirp wouldn't receive a
single bit at all (no activity without FD_ISSET).

Jan
Re: List of unaccessible x86 states
On 10/25/2009 06:45 PM, Alexander Graf wrote:
>>> It's not. We can't use the guest memory for hsave because then the
>>> guest could break the l1 state, so a malicious hypervisor could
>>> break us.
>> Guest hsave should be used for storing guest state when switching
>> into the nested guest, not host state. Host state is not part of the
>> save/restore state in any case.
> No it's not. When going in an l2 guest, we need to save the l1 state in
> the hsave. Now if we'd use the l1 given hsave, the l2 guest could
> modify the hsave. That means the l2 guest could rewrite the intercept
> bitmap to 0 and compromise the host.

L1 hsave stores the architected state saved by vmrun, e.g. cs.sel,
next_rip, cr0, cr3, etc. The host intercept bitmap is not state since it
is calculated from the L1 intercept bitmap and host code. Indeed it can be
different from host to host even with the same guest state.

> That's why we're storing the hsave data in a host allocated page. Of
> course, we could save the whole hsave area off to the host on
> migration...

Sorry, -ENOPARSE.

--
error compiling committee.c: too many arguments to function
Re: I/O performance of VirtIO
On 10/26/2009 10:12 AM, Jan Kiszka wrote:
>> No. Dedicated I/O threads provide parallelism. All latency needs is to
>> have SIGIO sent on all file descriptors (or rather, in qemu-kvm with
>> irqchip, to have all file descriptors in the poll() call).
>>
>> Jan, does slirp add new connections to the select set?
> It should do so in slirp_select_fill (it iterates over all TCP/UDP
> sockets of all instances). I think without doing this, slirp wouldn't
> receive a single bit at all (no activity without FD_ISSET).

Yes, so it seems from the code. But something is missing if you get better
performance with -no-kvm-irqchip. Perhaps timers are off.

--
error compiling committee.c: too many arguments to function
Re: [PATCH] virtio-net: fix data corruption with OOM
On Mon, Oct 26, 2009 at 12:11:51PM +1030, Rusty Russell wrote:
> On Mon, 26 Oct 2009 03:33:40 am Michael S. Tsirkin wrote:
>> virtio net used to unlink skbs from send queues on error,
>> but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
>> we do not do this. This causes guest data corruption and crashes
>> with vhost since net core can requeue the skb or free it without
>> it being taken off the list.
>>
>> This patch fixes this by queueing the skb after successful
>> transmit.
>
> I originally thought that this was racy: as soon as we do add_buf, we
> need to make sure we're ready for the callback (for virtio_pci, it's
> ->kick, but we shouldn't rely on that).
>
> So a comment would be nice. How's this?

Acked-by: Michael S. Tsirkin m...@redhat.com

> Subject: virtio-net: fix data corruption with OOM
> Date: Sun, 25 Oct 2009 19:03:40 +0200
> From: Michael S. Tsirkin m...@redhat.com
>
> virtio net used to unlink skbs from send queues on error,
> but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
> we do not do this. This causes guest data corruption and crashes
> with vhost since net core can requeue the skb or free it without
> it being taken off the list.
>
> This patch fixes this by queueing the skb after successful
> transmit.
>
> Signed-off-by: Michael S. Tsirkin m...@redhat.com
> Signed-off-by: Rusty Russell ru...@rustcorp.com.au (+ comment)
> ---
>
> Rusty, here's a fix for another data corrupter I saw.
> This fixes a regression from 2.6.31, so definitely
> 2.6.32 I think. Comments?
>
>  drivers/net/virtio_net.c |    8 +++++---
>  1 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -516,8 +516,7 @@ again:
>  	/* Free up any pending old buffers before queueing new ones. */
>  	free_old_xmit_skbs(vi);
>  
> -	/* Put new one in send queue and do transmit */
> -	__skb_queue_head(&vi->send, skb);
> +	/* Try to transmit */
>  	capacity = xmit_skb(vi, skb);
>  
>  	/* This can happen with OOM and indirect buffers. */
> @@ -531,8 +530,17 @@ again:
>  		}
>  		return NETDEV_TX_BUSY;
>  	}
> +	vi->svq->vq_ops->kick(vi->svq);
>  
> -	vi->svq->vq_ops->kick(vi->svq);
> +	/*
> +	 * Put new one in send queue.  You'd expect we'd need this before
> +	 * xmit_skb calls add_buf(), since the callback can be triggered
> +	 * immediately after that.  But since the callback just triggers
> +	 * another call back here, normal network xmit locking prevents the
> +	 * race.
> +	 */
> +	__skb_queue_head(&vi->send, skb);
> +
>  	/* Don't wait up for transmitted skbs to be freed. */
>  	skb_orphan(skb);
>  	nf_reset(skb);
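The ordering change in this patch (only put the buffer on the driver's private list once the ring has accepted it) can be shown with a toy queue. This is a sketch under stated assumptions, not the virtio-net driver: struct toy_skb, struct toy_dev, toy_xmit_skb() and toy_start_xmit() are hypothetical, and a simple ring_full flag stands in for the OOM failure of add_buf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal packet and device state. */
struct toy_skb {
	struct toy_skb *next;
};

struct toy_dev {
	struct toy_skb *send_head; /* packets the driver will free later */
	bool ring_full;            /* stands in for add_buf() failing */
};

/* Simulated hand-off to the ring: fails when the ring is full. */
static int toy_xmit_skb(struct toy_dev *d)
{
	return d->ring_full ? -1 : 0;
}

/* Returns 0 on success, -1 when the caller must retry later. */
static int toy_start_xmit(struct toy_dev *d, struct toy_skb *skb)
{
	if (toy_xmit_skb(d) < 0)
		return -1; /* skb was never listed, so the caller still owns it */

	/* Only a successfully queued packet goes on the send list. */
	skb->next = d->send_head;
	d->send_head = skb;
	return 0;
}
```

On the failure path the packet is not on any driver list, so the network core can requeue or free it without leaving a dangling entry behind, which is the corruption the patch addresses.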
Re: 64 bit guest much faster ?
On 10/23/09 17:54, Stefan wrote:
> Hello,
>
> I have a simple question (sorry I'm a kvm beginner):
> Is it right that a 64bit guest (8 CPUs, 16GB) is much faster than a
> 32bit guest (8 CPUs, 16GB PAE).

Yes. With *that* much memory the 32bit guest struggles with address space
limitations (32bit -> 4G), whereas the 64bit guest doesn't.

With up to 1G you shouldn't see a noticeable difference. But the more
highmem the 32bit guest uses, the higher the penalty, especially without
ept/npt, as every kmap() of a high page is then a roundtrip to the
hypervisor.

cheers,
  Gerd
Jan Kiszka to maintain kvm-kmod
I am pleased to announce that Jan Kiszka has agreed to maintain
kvm-kmod.git, the backporting kit that allows running modern kvm code on
older kernels. Jan will release kvm-kmod-2.6.x.y packages and
kvm-kmod-2.6.x-rcy packages, while Marcelo and I will (with Jan's help)
release kvm-kmod-devel-xx. Many thanks to Jan for taking on this task.

As there are now many different sources of kvm kernel modules to choose
from, I wrote up a page that describes the various releases and what they
are suited for. This can be found in
http://www.linux-kvm.org/page/Getting_the_kvm_kernel_modules.

--
error compiling committee.c: too many arguments to function
Re: [PATCH] virtio-net: fix data corruption with OOM
On Mon, Oct 26, 2009 at 12:11:51PM +1030, Rusty Russell wrote:
> On Mon, 26 Oct 2009 03:33:40 am Michael S. Tsirkin wrote:
>> virtio net used to unlink skbs from send queues on error,
>> but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
>> we do not do this. This causes guest data corruption and crashes
>> with vhost since net core can requeue the skb or free it without
>> it being taken off the list.
>>
>> This patch fixes this by queueing the skb after successful
>> transmit.
>
> I originally thought that this was racy: as soon as we do add_buf, we
> need to make sure we're ready for the callback (for virtio_pci, it's
> ->kick, but we shouldn't rely on that).

BTW, wanted to note that unlink on error would *also* be racy if we did
any processing in the callback.

> So a comment would be nice. How's this?
>
> Subject: virtio-net: fix data corruption with OOM
> Date: Sun, 25 Oct 2009 19:03:40 +0200
> From: Michael S. Tsirkin m...@redhat.com
>
> virtio net used to unlink skbs from send queues on error,
> but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
> we do not do this. This causes guest data corruption and crashes
> with vhost since net core can requeue the skb or free it without
> it being taken off the list.
>
> This patch fixes this by queueing the skb after successful
> transmit.
>
> Signed-off-by: Michael S. Tsirkin m...@redhat.com
> Signed-off-by: Rusty Russell ru...@rustcorp.com.au (+ comment)
> ---
>
> Rusty, here's a fix for another data corrupter I saw.
> This fixes a regression from 2.6.31, so definitely
> 2.6.32 I think. Comments?
>
>  drivers/net/virtio_net.c |    8 +++++---
>  1 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -516,8 +516,7 @@ again:
>  	/* Free up any pending old buffers before queueing new ones. */
>  	free_old_xmit_skbs(vi);
>  
> -	/* Put new one in send queue and do transmit */
> -	__skb_queue_head(&vi->send, skb);
> +	/* Try to transmit */
>  	capacity = xmit_skb(vi, skb);
>  
>  	/* This can happen with OOM and indirect buffers. */
> @@ -531,8 +530,17 @@ again:
>  		}
>  		return NETDEV_TX_BUSY;
>  	}
> +	vi->svq->vq_ops->kick(vi->svq);
>  
> -	vi->svq->vq_ops->kick(vi->svq);
> +	/*
> +	 * Put new one in send queue.  You'd expect we'd need this before
> +	 * xmit_skb calls add_buf(), since the callback can be triggered
> +	 * immediately after that.  But since the callback just triggers
> +	 * another call back here, normal network xmit locking prevents the
> +	 * race.
> +	 */
> +	__skb_queue_head(&vi->send, skb);
> +
>  	/* Don't wait up for transmitted skbs to be freed. */
>  	skb_orphan(skb);
>  	nf_reset(skb);
Re: List of unaccessible x86 states
On 26.10.2009 at 09:33, Avi Kivity a...@redhat.com wrote:
> On 10/25/2009 06:45 PM, Alexander Graf wrote:
>>>> It's not. We can't use the guest memory for hsave because then the
>>>> guest could break the l1 state, so a malicious hypervisor could
>>>> break us.
>>> Guest hsave should be used for storing guest state when switching
>>> into the nested guest, not host state. Host state is not part of
>>> the save/restore state in any case.
>> No it's not. When going in an l2 guest, we need to save the l1 state
>> in the hsave. Now if we'd use the l1 given hsave, the l2 guest could
>> modify the hsave. That means the l2 guest could rewrite the intercept
>> bitmap to 0 and compromise the host.
> L1 hsave stores the architected state saved by vmrun, e.g. cs.sel,
> next_rip, cr0, cr3, etc. The host intercept bitmap is not state since
> it is calculated from the L1 intercept bitmap and host code. Indeed it
> can be different from host to host even with the same guest state.

Ah, so you'd only save off the cpu state parts of the vmcb. Currently we
save off control parts too, so we can easily swap them in on #vmexit.

So if we'd migrate off when inside the nested guest, we'd have to save off
the resume control state, OR them again with the guest vmcb control states
and be inside the nested guest.

Wouldn't it be much easier to not migrate / save state when inside a
nested guest? I'm afraid the code will become overly complex if we do
allow migration while in a nested context.

Alex
Re: List of unaccessible x86 states
On Sun, Oct 25, 2009 at 11:49:35AM +0200, Avi Kivity wrote:
> On 10/24/2009 12:35 PM, Alexander Graf wrote:
>> Hm, thinking about this again, it might be useful to have a
>> "currently in nested VM" flag here. That way userspace can decide if
>> it needs to get out of the nested state (for migration) or if it just
>> doesn't care.
> Getting out of nested state involves modifying state (both memory and
> registers). Nor can we in the general case force it. The guest can set
> up a situation where it is impossible to #vmexit.

There is actually more than that. If the guest runs in guest mode itself
we also need to report the host state to be able to do an #vmexit after
migration. In nested SVM the host state is not saved in the guest memory,
to prevent the guest from modifying it and breaking out of its
virtualization jail.

Joerg
Re: List of unaccessible x86 states
On 10/26/2009 11:11 AM, Alexander Graf wrote: L1 hsave stores the architected state saved by vmrun, e.g. cs.sel, next_rip, cr0, cr3, etc. The host intercept bitmap is not state since it is calculated from the L1 intercept bitmap and host code. Indeed it can be different from host to host even with the same guest state. Ah, so you'd only save off the cpu state parts of the vmcb. Currently we save off control parts too, so we can easily swap them in on #vmexit. These can still be saved in a host memory area as an optimization, and regenerated if needed. So if we'd migrate off when inside the nested guest, we'd have to save off the resume control state, OR them again with the guest vmcb control states and be inside the nested guest. Right, if the new state bit (guest mode) is set, we look at the control bits and OR them into the vmcb. That part can be reused with the VMRUN code. Wouldn't it be much easier to not migrate / save state when inside a nested guest? I'm afraid the code will become overly complex if we do allow migration while in a nested context. I can't really see why but then I don't know the code as well as you do. The current code won't work for guests which don't intercept external interrupts (probably only malware). For nested vmx it may be necessary since vmx has a mode where interrupts are acknowledged during #VMEXIT and the interrupt vector is saved into a register; you can't fake an interrupt #VMEXIT since you can't fake the vector. Xen is one guest which uses this mode. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
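The "OR them into the vmcb" step discussed above can be sketched in plain C. This is only an illustration under stated assumptions: `struct demo_intercepts` and `demo_effective_intercepts()` are hypothetical names, not the KVM data structures, and the three fields merely stand in for the real intercept vectors of the VMCB control area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the intercept fields of a VMCB control area. */
struct demo_intercepts {
    uint32_t cr_read;
    uint32_t cr_write;
    uint64_t insn;
};

/*
 * Recompute the effective intercepts from scratch: start from what the host
 * always needs and, only while the vcpu is in guest mode, OR in what L1
 * requested for its nested guest. Nothing here has to be migrated, because
 * both inputs are available on the destination.
 */
static struct demo_intercepts
demo_effective_intercepts(const struct demo_intercepts *host,
                          const struct demo_intercepts *l1,
                          bool guest_mode)
{
    struct demo_intercepts out;

    assert(host != NULL);
    out = *host;
    if (guest_mode) {
        assert(l1 != NULL);
        out.cr_read  |= l1->cr_read;
        out.cr_write |= l1->cr_write;
        out.insn     |= l1->insn;
    }
    return out;
}
```

The point of the sketch is that the merged bitmap is a pure function of host policy and the L1 VMCB, which is why it can differ from host to host without being part of the saved state.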
Re: List of unaccessible x86 states
On 10/26/2009 11:17 AM, Joerg Roedel wrote: On Sun, Oct 25, 2009 at 11:49:35AM +0200, Avi Kivity wrote: On 10/24/2009 12:35 PM, Alexander Graf wrote: Hm, thinking about this again, it might be useful to have an currently in nested VM flag here. That way userspace can decide if it needs to get out of the nested state (for migration) or if it just doesn't care. Getting out of nested state involves modifying state (both memory and registers). Nor can we in the general case force it. The guest can set up a situation where it is impossible to #vmexit. There is actually more than that. If the guest runs in guest mode itself we also need to report the host state to be able to do an #vmexit after migration. In nested SVM the host state is not saved in the guest memory to prevent the guest from modifying it and break out of its virtualization jail. Which host state? As far as I can tell, it can all be regenerated. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vhost-net patches
On Fri, Oct 23, 2009 at 09:23:40AM -0700, Shirley Ma wrote: I also hit guest skb_xmit panic. If these are the same panics I have seen myself, they are probably fixed with recent virtio patches I sent to Rusty. I put them on my vhost.git tree to make it easier for you to test. If you see any more crashes, please holler, preferably with a backtrace. -- MST -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: List of unaccessible x86 states
On Mon, Oct 26, 2009 at 11:21:12AM +0200, Avi Kivity wrote: On 10/26/2009 11:17 AM, Joerg Roedel wrote: On Sun, Oct 25, 2009 at 11:49:35AM +0200, Avi Kivity wrote: On 10/24/2009 12:35 PM, Alexander Graf wrote: Hm, thinking about this again, it might be useful to have an currently in nested VM flag here. That way userspace can decide if it needs to get out of the nested state (for migration) or if it just doesn't care. Getting out of nested state involves modifying state (both memory and registers). Nor can we in the general case force it. The guest can set up a situation where it is impossible to #vmexit. There is actually more than that. If the guest runs in guest mode itself we also need to report the host state to be able to do an #vmexit after migration. In nested SVM the host state is not saved in the guest memory to prevent the guest from modifying it and break out of its virtualization jail. Which host state? As far as I can tell, it can all be regenerated. The state which is loaded into the vcpu when a #vmexit is emulated. This includes segments, control registers and the host rip for example. Joerg -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: List of unaccessible x86 states
On 10/26/2009 11:30 AM, Joerg Roedel wrote: Which host state? As far as I can tell, it can all be regenerated. The state which is loaded into the vcpu when a #vmexit is emulated. This includes segments, control registers and the host rip for example. All of this state does not change between nested guest and normal guest mode. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: List of unaccessible x86 states
On Mon, Oct 26, 2009 at 11:39:46AM +0200, Avi Kivity wrote: On 10/26/2009 11:30 AM, Joerg Roedel wrote: Which host state? As far as I can tell, it can all be regenerated. The state which is loaded into the vcpu when a #vmexit is emulated. This includes segments, control registers and the host rip for example. All of this state does not change between nested guest and normal guest mode. I am talking about all the state that is saved in svm->nested.hsave. When we migrate a guest vcpu while it is running in guest mode itself (without forcing a nested #vmexit) this state is required when a #vmexit needs to be emulated on this vcpu after migration. Same is true for the nested intercept conditions. Joerg
Re: List of unaccessible x86 states
On 10/26/2009 11:56 AM, Joerg Roedel wrote: On Mon, Oct 26, 2009 at 11:39:46AM +0200, Avi Kivity wrote: On 10/26/2009 11:30 AM, Joerg Roedel wrote: Which host state? As far as I can tell, it can all be regenerated. The state which is loaded into the vcpu when a #vmexit is emulated. This includes segments, control registers and the host rip for example. All of this state does not change between nested guest and normal guest mode. I am talking about all the state that is saved in svm->nested.hsave. When we migrate a guest vcpu while it is running in guest mode itself (without forcing a nested #vmexit) this state is required when a #vmexit needs to be emulated on this vcpu after migration. Same is true for the nested intercept conditions. The state that is saved by VMRUN can be saved to guest memory and migrated. Extra state (like the intercepts for the previous mode) must be saved to host memory and not migrated; host intercepts can be regenerated. Concretely:

    hsave->save.es     = vmcb->save.es;
    hsave->save.cs     = vmcb->save.cs;
    hsave->save.ss     = vmcb->save.ss;
    hsave->save.ds     = vmcb->save.ds;
    hsave->save.gdtr   = vmcb->save.gdtr;
    hsave->save.idtr   = vmcb->save.idtr;
    hsave->save.efer   = svm->vcpu.arch.shadow_efer;
    hsave->save.cr0    = svm->vcpu.arch.cr0;
    hsave->save.cr4    = svm->vcpu.arch.cr4;
    hsave->save.rflags = vmcb->save.rflags;
    hsave->save.rip    = svm->next_rip;
    hsave->save.rsp    = vmcb->save.rsp;
    hsave->save.rax    = vmcb->save.rax;
    if (npt_enabled)
        hsave->save.cr3 = vmcb->save.cr3;
    else
        hsave->save.cr3 = svm->vcpu.arch.cr3;

Can all be saved to guest memory.

    copy_vmcb_control_area(hsave, vmcb);

Must not be saved into guest memory. On the other hand, it is not needed for migration. -- error compiling committee.c: too many arguments to function
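To make the proposed split concrete, here is a minimal user-space sketch. It assumes two hypothetical, heavily reduced structures (`demo_save_area`, `demo_control_area`) in place of the real VMCB layout, and is not the svm.c code; it only shows the architected state going to a guest-visible buffer while the control area goes to a host-private one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the two halves of a VMCB. */
struct demo_save_area {
    uint64_t cr0, cr3, cr4, efer;
    uint64_t rip, rsp, rax, rflags;
};

struct demo_control_area {
    uint32_t intercept_cr_write;
    uint64_t intercept;
};

struct demo_vmcb {
    struct demo_control_area control;
    struct demo_save_area save;
};

/*
 * State that VMRUN architecturally saves. In the proposal this may live in
 * the guest's own hsave page, so it travels with guest memory on migration.
 */
static void demo_save_architected(struct demo_save_area *guest_hsave,
                                  const struct demo_vmcb *vmcb)
{
    assert(guest_hsave != NULL && vmcb != NULL);
    *guest_hsave = vmcb->save;
}

/*
 * Control bits (intercepts and friends). These stay in host-private memory
 * so the nested guest cannot rewrite them, and they are not migrated.
 */
static void demo_save_control(struct demo_control_area *host_private,
                              const struct demo_vmcb *vmcb)
{
    assert(host_private != NULL && vmcb != NULL);
    *host_private = vmcb->control;
}
```

Keeping the two copies in separate functions mirrors the argument in the thread: only the first buffer is ever exposed to the guest, so a malicious L2 can at worst corrupt L1 state, not the host's intercept setup.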
Re: 64 bit guest much faster ?
On 10/26/2009 10:58 AM, Gerd Hoffmann wrote: On 10/23/09 17:54, Stefan wrote: Hello, I have a simple question (sorry I'm a kvm beginner): Is it right that a 64bit guest (8 CPUs, 16GB) is much faster than a 32bit guest (8 CPUs, 16GB PAE). Yes. With *that* much memory the 32bit guest struggles with address space limitations (32bit - 4G), whereas the 64bit guest doesn't. With up to 1G you shouldn't see a noticeable difference. But the more highmem the 32bit guest uses, the higher the penalty. Especially without ept/npt, as every kmap() of a high page is then a roundtrip to the hypervisor. Oh yes, without ept/npt the slowdown should indeed be significant with this much memory. -- error compiling committee.c: too many arguments to function
kvm problems on new hardware
Hello, I have a KVM virtualization problem. I've put together a new hardware (Supermicro) server with 2 E5530 CPUs and memory disk to start experimenting with virtualization. I intend to use the www.proxmox.com system/setup. I installed proxmox and started stress testing the hardware: parallel kernel compiles in a loop (concurrency_level=32), memtest86+ during the night, etc. The hardware/OS performs rock solid when I stress test it, but the moment I start a virtual guest (e.g. Debian netinstall) I get the first screen of the installation procedure in a VNC screen. Whether I choose normal install or expert install, the guest screen goes blank with only a cursor, and the kvm process prints an error on the console and starts to eat CPU cycles. So the host OS is not barfing; only the kvm process is giving problems and the guest is frozen. To see if it is related to the older proxmox kernel setup, I installed Ubuntu Karmic with libvirt on another hard drive for comparison: the same error happens when I start a guest.
Oct 23 09:34:14 ubuntu kernel: [ 416.226550] device vnet0 left promiscuous mode
Oct 23 09:34:14 ubuntu kernel: [ 416.226554] br0: port 2(vnet0) entering disabled state
Oct 23 09:34:57 ubuntu kernel: [ 459.544150] type=1505 audit(1256290497.414:17): operation=profile_load pid=1676 name=libvirt-2ae923e6-f06d-9f0d-d072-c2067b7cbee4
Oct 23 09:34:57 ubuntu kernel: [ 459.550725] device vnet0 entered promiscuous mode
Oct 23 09:34:57 ubuntu kernel: [ 459.551888] br0: port 2(vnet0) entering learning state
Oct 23 09:34:57 ubuntu kernel: [ 459.557989] type=1503 audit(1256290497.429:18): operation=open pid=1679 parent=1 profile=libvirt-2ae923e6-f06d-9f0d-d072-c2067b7cbee4 requested_mask=rw:: denied_mask=w:: fsuid=0 ouid=0 name=/var/tmp/debian-503-amd64-netinst.iso
Oct 23 09:35:05 ubuntu kernel: [ 468.066681] handle_exception: unexpected, vectoring info 0x8010 intr info 0x8b0d
Oct 23 09:35:05 ubuntu kernel: [ 468.066760] handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d
Oct 23 09:35:05 ubuntu kernel: [ 468.066836] handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d

In the BIOS there are settings for VT-d etc. I tried (IMHO) all combinations, but I am not able to start a guest. I spoke to Supermicro support and even got a more recent (as yet unpublished) BIOS. All without success. Now I am at the point where I don't know if I have a hardware or a software problem. I installed the intel-microcode package to see if that maybe fixed something: it didn't.

References:
chassis: http://supermicro.com/products/system/2U/6026/SYS-6026TT-BIBQRF.cfm?INF=
motherboard: http://supermicro.com/products/motherboard/QPI/5500/X8DTT-F.cfm
7 hour memtest: http://dth.net/supermicro/memtest86_completed_7hr.jpg
More output (dmesg and other stuff) in the same dir: http://dth.net/supermicro/

Am I trying to run this on hardware that is too recent and not yet tested? I hope somebody has some ideas/hints about this.
Danny
Re: 64 bit guest much faster ?
Avi Kivity wrote: On 10/26/2009 10:58 AM, Gerd Hoffmann wrote: On 10/23/09 17:54, Stefan wrote: Hello, I have a simple question (sorry I'm a kvm beginner): Is it right that a 64bit guest (8 CPUs, 16GB) is much faster than a 32bit guest (8 CPUs, 16GB PAE). Yes. With *that* much memory the 32bit guest struggles with address space limitations (32bit - 4G), whereas the 64bit guest doesn't. With up to 1G you shouldn't see a noticeable difference. But the more highmem the 32bit guest uses, the higher the penalty. Especially without ept/npt, as every kmap() of a high page is then a roundtrip to the hypervisor. Oh yes, without ept/npt the slowdown should indeed be significant with this much memory. How is it with a 4Gb guest/mem without PAE (I mean, with CONFIG_HIGHMEM_4G=y)? Or even 2Gb? With npt or without. Can we construct a sort of table of expected slowdowns (not in numbers but just in terms of significant, minor etc) of running 4Gb or 4Gb (and 1Gb and 1Gb if that makes a significant difference) 32bit guests with and without npt and 64bit guests, please? I guess it's quite interesting to many users. From the above it looks like it's better to run a 64bit kernel in the 32bit guest in these situations too. I haven't measured it, just because it never occurred to me that there MAY be any difference. But I've only non-npt hardware here at the moment. Thanks! /mjt
Re: 64 bit guest much faster ?
On 10/26/2009 12:42 PM, Michael Tokarev wrote: Oh yes, without ept/npt the slowdown should indeed be significant with this much memory. How is it with a 4Gb guest/mem without PAE (I mean, with CONFIG_HIGHMEM_4G=y)? Or even 2Gb? With npt or without. It'll be slow. Just use x86_64 with 1GB. Can we construct a sort of table of expected slowdowns (not in numbers but just in terms of significant, minor etc) of running 4Gb or 4Gb (and 1Gb and 1Gb if that makes a significant difference) 32bit guests with and without npt and 64bit guests, please? I guess it's quite interesting to many users. These tables will be useless, it greatly depends on workload. -- error compiling committee.c: too many arguments to function
Re: List of unaccessible x86 states
On Mon, Oct 26, 2009 at 12:09:25PM +0200, Avi Kivity wrote: On 10/26/2009 11:56 AM, Joerg Roedel wrote: On Mon, Oct 26, 2009 at 11:39:46AM +0200, Avi Kivity wrote: On 10/26/2009 11:30 AM, Joerg Roedel wrote: Which host state? As far as I can tell, it can all be regenerated. The state which is loaded into the vcpu when a #vmexit is emulated. This includes segments, control registers and the host rip for example. All of this state does not change between nested guest and normal guest mode. I am talking about all the state that is saved in svm->nested.hsave. When we migrate a guest vcpu while it is running in guest mode itself (without forcing a nested #vmexit) this state is required when a #vmexit needs to be emulated on this vcpu after migration. Same is true for the nested intercept conditions. The state that is saved by VMRUN can be saved to guest memory and migrated. Extra state (like the intercepts for the previous mode) must be saved to host memory and not migrated; host intercepts can be regenerated.
Ok, parts of the state can be saved in guest memory. But that's currently not done. This will need some care to not introduce a security hole. But it shouldn't be too difficult. The state that's not reproducible in a sane way is the intercept bitmap for the l2 guest. From the nested state what needs to be exposed to userspace for migration is:
* guest mode flag (as returned by is_nested)
* nested vmcb address
* nested hsave msr
* nested intercepts
* for nested nested paging: guest nested cr3 value
Another state which needs exposure is the last branch record related state.
Off-topic question: Will the new migration protocol include some kind of handshake to find out if migration is possible at all?
Joerg
Re: List of unaccessible x86 states
On 10/26/2009 12:45 PM, Joerg Roedel wrote: Ok, parts of the state can be saved in guest memory. But that's currently not done. This will need some care to not introduce a security hole. But it shouldn't be too difficult. The state that's not reproducible in a sane way is the intercept bitmap for the l2 guest. From the nested state what needs to be exposed to userspace for migration is:
* guest mode flag (as returned by is_nested)
* nested vmcb address
Yes, forgot that. We can store it in the hsave area (note the hsave area format becomes an ABI).
* nested hsave msr
That's already saved.
* nested intercepts
These are part of the guest vmcb. The host nested intercepts can be recalculated, no?
* for nested nested paging: guest nested cr3 value
Part of the guest vmcb.
Another state which needs exposure is the last branch record related state.
Aren't those just more MSRs?
Off-topic question: Will the new migration protocol include some kind of handshake to find out if migration is possible at all?
It's assumed that migration always works for a newer qemu version, and that the management tools don't attempt backward migration.
-- error compiling committee.c: too many arguments to function
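After the replies above, what remains to be exposed beyond guest memory and the existing MSR list is small. The following is a hypothetical sketch of such a record, not an existing KVM ABI: the struct name, field names and the sanity check are all assumptions made for illustration. The 4 KiB alignment check reflects the SVM requirement that VMCB and host save area addresses are page aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_MASK 0xfffULL /* low 12 bits of a 4 KiB page address */

/*
 * Hypothetical migration record. Intercepts and the nested cr3 are left out
 * on purpose: per the discussion they can be re-read from the guest's vmcb.
 */
struct demo_nested_state {
    bool     guest_mode;      /* vcpu was running a nested guest */
    uint64_t nested_vmcb_gpa; /* guest-physical address passed to VMRUN */
    uint64_t hsave_msr;       /* value of the guest's hsave MSR */
};

/* Reject records that cannot describe a resumable vcpu. */
static bool demo_nested_state_valid(const struct demo_nested_state *s)
{
    assert(s != NULL);
    if (s->hsave_msr & DEMO_PAGE_MASK)
        return false;
    if (!s->guest_mode)
        return true; /* the vmcb address is irrelevant outside guest mode */
    return (s->nested_vmcb_gpa & DEMO_PAGE_MASK) == 0;
}
```

A destination could run a check like this before resuming, which is also one place where the handshake Joerg asks about might hook in.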
Re: List of unaccessible x86 states
On Mon, Oct 26, 2009 at 12:56:31PM +0200, Avi Kivity wrote: On 10/26/2009 12:45 PM, Joerg Roedel wrote:
* nested intercepts
These are part of the guest vmcb. The host nested intercepts can be recalculated, no?
* for nested nested paging: guest nested cr3 value
Part of the guest vmcb.
This will work in most cases. But it's not architecturally sane because real hardware caches this information in the cpu. So software is free to modify the vmcb without impacting the in-cpu state until the next #vmexit. I don't know any software which relies on that so it may not be an issue.
Off-topic question: Will the new migration protocol include some kind of handshake to find out if migration is possible at all?
It's assumed that migration always works for a newer qemu version, and that the management tools don't attempt backward migration.
I think such a handshake would make sense just to prevent a nested svm hypervisor from being migrated to an intel machine or vice versa (just an example, there are more like sse*, nested nested paging, ...).
Joerg
[PATCH] Make vapic.S into optional rom
Signed-off-by: Gleb Natapov g...@redhat.com diff --git a/Makefile b/Makefile index ea568f5..acd9108 100644 --- a/Makefile +++ b/Makefile @@ -259,6 +259,7 @@ pxe-ne2k_pci.bin pxe-rtl8139.bin pxe-pcnet.bin pxe-e1000.bin \ bamboo.dtb petalogix-s3adsp1800.dtb \ multiboot.bin BLOBS += extboot.bin +BLOBS += vapic.bin else BLOBS= endif diff --git a/hw/pc.c b/hw/pc.c index 83012a9..819b78a 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -53,6 +53,7 @@ #define VGABIOS_FILENAME vgabios.bin #define VGABIOS_CIRRUS_FILENAME vgabios-cirrus.bin #define EXTBOOT_FILENAME extboot.bin +#define VAPIC_FILENAME vapic.bin #define PC_MAX_BIOS_SIZE (4 * 1024 * 1024) @@ -1149,6 +1150,7 @@ static void pc_init1(ram_addr_t ram_size, if (extboot_drive) { option_rom[nb_option_roms++] = qemu_strdup(EXTBOOT_FILENAME); } +option_rom[nb_option_roms++] = qemu_strdup(VAPIC_FILENAME); option_rom_offset = qemu_ram_alloc(PC_ROM_SIZE); cpu_register_physical_memory(PC_ROM_MIN_VGA, PC_ROM_SIZE, option_rom_offset); diff --git a/kvm-tpr-opt.c b/kvm-tpr-opt.c index 932b49b..2565d79 100644 --- a/kvm-tpr-opt.c +++ b/kvm-tpr-opt.c @@ -114,6 +114,7 @@ static uint32_t bios_addr; static uint32_t vapic_phys; static uint32_t bios_enabled; static uint32_t vbios_desc_phys; +static uint32_t vapic_bios_addr; static void update_vbios_real_tpr(void) { @@ -187,16 +188,16 @@ static int bios_is_mapped(CPUState *env, uint64_t rip) struct kvm_sregs sregs; unsigned perms; uint32_t i; -uint32_t offset, fixup; +uint32_t offset, fixup, start = vapic_bios_addr ? 
: 0xe; if (bios_enabled) return 1; kvm_get_sregs(env, sregs); -probe = (rip 0xf000) + 0xe; +probe = (rip 0xf000) + start; phys = map_addr(sregs, probe, perms); -if (phys != 0xe) +if (phys != start) return 0; bios_addr = probe; for (i = 0; i 64; ++i) { @@ -356,6 +357,17 @@ static int tpr_load(QEMUFile *f, void *s, int version_id) return 0; } +static void vtpr_ioport_write16(void *opaque, uint32_t addr, uint32_t val) +{ +struct kvm_regs regs; +CPUState *env = cpu_single_env; +struct kvm_sregs sregs; +kvm_get_regs(env, regs); +kvm_get_sregs(env, sregs); +vapic_bios_addr = ((sregs.cs.base + regs.rip) ~(512 - 1)) + val; +bios_enabled = 0; +} + static void vtpr_ioport_write(void *opaque, uint32_t addr, uint32_t val) { CPUState *env = cpu_single_env; @@ -386,5 +398,6 @@ void kvm_tpr_opt_setup(void) { register_savevm(kvm-tpr-opt, 0, 1, tpr_save, tpr_load, NULL); register_ioport_write(0x7e, 1, 1, vtpr_ioport_write, NULL); +register_ioport_write(0x7e, 2, 2, vtpr_ioport_write16, NULL); } diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile index 73e74d8..67ecc63 100644 --- a/pc-bios/optionrom/Makefile +++ b/pc-bios/optionrom/Makefile @@ -13,7 +13,7 @@ CFLAGS += -I$(SRC_PATH) CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector) QEMU_CFLAGS = $(CFLAGS) -build-all: multiboot.bin extboot.bin +build-all: multiboot.bin extboot.bin vapic.bin %.img: %.o $(call quiet-command,$(LD) -Ttext 0 -e _start -s -o $@ $, Building $(TARGET_DIR)$@) diff --git a/pc-bios/optionrom/vapic.S b/pc-bios/optionrom/vapic.S new file mode 100644 index 000..1924eeb --- /dev/null +++ b/pc-bios/optionrom/vapic.S @@ -0,0 +1,311 @@ + .text 0 + .code16 +.global _start +_start: + .short 0xaa55 + .byte (_end - _start) / 512 + mov $vapic_base, %ax + out %ax, $0x7e + lret + + .code32 +vapic_size = 2*4096 + +.macro fixup delta=-4 +777: + .text 1 + .long 777b + \delta - vapic_base + .text 0 +.endm + +.macro reenable_vtpr + out %al, $0x7e +.endm + +.text 1 + fixup_start = . 
+.text 0 + +vapic_base: + .ascii kvm aPiC + + /* relocation data */ + .long vapic_base; fixup + .long fixup_start ; fixup + .long fixup_end ; fixup + + .long vapic ; fixup + .long vapic_size +vcpu_shift: + .long 0 +real_tpr: + .long 0 + .long up_set_tpr; fixup + .long up_set_tpr_eax; fixup + .long up_get_tpr_eax; fixup + .long up_get_tpr_ecx; fixup + .long up_get_tpr_edx; fixup + .long up_get_tpr_ebx; fixup + .long 0 /* esp. won't work. */ + .long up_get_tpr_ebp; fixup + .long up_get_tpr_esi; fixup + .long up_get_tpr_edi; fixup + .long up_get_tpr_stack ; fixup + .long mp_set_tpr; fixup + .long mp_set_tpr_eax; fixup + .long mp_get_tpr_eax; fixup + .long mp_get_tpr_ecx; fixup + .long mp_get_tpr_edx; fixup + .long mp_get_tpr_ebx; fixup + .long 0 /* esp. won't work. */ + .long mp_get_tpr_ebp; fixup + .long mp_get_tpr_esi; fixup + .long mp_get_tpr_edi; fixup + .long
Re: [PATCH] Make vapic.S into optional rom
On 10/26/2009 01:42 PM, Gleb Natapov wrote: Need to remove the original implementation. What was this tested on? -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Make vapic.S into optional rom
On Mon, Oct 26, 2009 at 02:31:21PM +0200, Avi Kivity wrote: On 10/26/2009 01:42 PM, Gleb Natapov wrote: Need to remove the original implementation. That's in a submodule now. Different repository. Maybe it's worth leaving it in for a while? What was this tested on? WindowsXP 32 bit boot/reboot. -- Gleb.
Re: [PATCH] Make vapic.S into optional rom
On 10/26/2009 02:33 PM, Gleb Natapov wrote: On Mon, Oct 26, 2009 at 02:31:21PM +0200, Avi Kivity wrote: On 10/26/2009 01:42 PM, Gleb Natapov wrote: Need to remove the original implementation. That's in a submodule now. Different repository. Maybe it's worth leaving it in for a while? Then we won't know which version is used. Please send an additional patch. What was this tested on? WindowsXP 32 bit boot/reboot. smp? -- error compiling committee.c: too many arguments to function
Re: [PATCH] Make vapic.S into optional rom
On Mon, Oct 26, 2009 at 02:36:47PM +0200, Avi Kivity wrote: On 10/26/2009 02:33 PM, Gleb Natapov wrote: On Mon, Oct 26, 2009 at 02:31:21PM +0200, Avi Kivity wrote: On 10/26/2009 01:42 PM, Gleb Natapov wrote: Need to remove the original implementation. That's in a submodule now. Different repository. Maybe it's worth leaving it in for a while? Then we won't know which version is used. Please send an additional patch. What was this tested on? WindowsXP 32 bit boot/reboot. smp? Yes, -smp 2. -- Gleb.
[PATCH] remove vapic.S from pcbios
Compiled as option rom now. Signed-off-by: Gleb Natapov g...@redhat.com diff --git a/Makefile b/Makefile index 434d64e..bcd3ee2 100644 --- a/Makefile +++ b/Makefile @@ -105,8 +105,8 @@ rombios32.bin: rombios32.out rombios.h objcopy -O binary $ $@ ./biossums -pad $@ -rombios32.out: rombios32start.o rombios32.o vapic.o rombios32.ld - ld -o $@ -T rombios32.ld rombios32start.o vapic.o rombios32.o +rombios32.out: rombios32start.o rombios32.o rombios32.ld + ld -o $@ -T rombios32.ld rombios32start.o rombios32.o rombios32.o: rombios32.c acpi-dsdt.hex acpi-ssdt.hex $(GCC) -m32 -O2 -Wall -c -o $@ $ @@ -126,9 +126,6 @@ acpi-ssdt.hex: acpi-ssdt.dsl rombios32start.o: rombios32start.S $(GCC) -m32 -c -o $@ $ -vapic.o: vapic.S - $(GCC) -m32 -c -o $@ $ - BIOS-bochs-latest: rombios16.bin rombios32.bin cat rombios32.bin rombios16.bin $@ diff --git a/rombios32.ld b/rombios32.ld index 1fc99c3..ca31f54 100644 --- a/rombios32.ld +++ b/rombios32.ld @@ -6,10 +6,6 @@ SECTIONS . = 0x000e; .text : { *(.text)} .rodata: { *(.rodata*) } -. = ALIGN(64); -fixup_start = .; -.fixup: { *(.fixup) } -fixup_end = .; . = ALIGN(4096); _end = . ; .data 0x700 : AT (_end) { __data_start = .; *(.data); __data_end = .;} diff --git a/vapic.S b/vapic.S deleted file mode 100644 index cf2a474..000 --- a/vapic.S +++ /dev/null @@ -1,294 +0,0 @@ - .text - .code32 - .align 4096 - -vapic_size = 2*4096 - -.macro fixup delta=-4 -777: - .pushsection .fixup, a - .long 777b + \delta - vapic_base - .popsection -.endm - -.macro reenable_vtpr - out %al, $0x7e -.endm - -vapic_base: - .ascii kvm aPiC - - /* relocation data */ - .long vapic_base; fixup - .long fixup_start ; fixup - .long fixup_end ; fixup - - .long vapic ; fixup - .long vapic_size -vcpu_shift: - .long 0 -real_tpr: - .long 0 - .long up_set_tpr; fixup - .long up_set_tpr_eax; fixup - .long up_get_tpr_eax; fixup - .long up_get_tpr_ecx; fixup - .long up_get_tpr_edx; fixup - .long up_get_tpr_ebx; fixup - .long 0 /* esp. won't work. 
*/ - .long up_get_tpr_ebp; fixup - .long up_get_tpr_esi; fixup - .long up_get_tpr_edi; fixup - .long up_get_tpr_stack ; fixup - .long mp_set_tpr; fixup - .long mp_set_tpr_eax; fixup - .long mp_get_tpr_eax; fixup - .long mp_get_tpr_ecx; fixup - .long mp_get_tpr_edx; fixup - .long mp_get_tpr_ebx; fixup - .long 0 /* esp. won't work. */ - .long mp_get_tpr_ebp; fixup - .long mp_get_tpr_esi; fixup - .long mp_get_tpr_edi; fixup - .long mp_get_tpr_stack ; fixup - -.macro kvm_hypercall - .byte 0x0f, 0x01, 0xc1 -.endm - -kvm_hypercall_vapic_poll_irq = 1 - -pcr_cpu = 0x51 - -.align 64 - -mp_get_tpr_eax: - pushf - cli - reenable_vtpr - push %ecx - - fs/movzbl pcr_cpu, %eax - - mov vcpu_shift, %ecx; fixup - shl %cl, %eax - testb $1, vapic+4(%eax) ; fixup delta=-5 - jz mp_get_tpr_bad - movzbl vapic(%eax), %eax ; fixup - -mp_get_tpr_out: - pop %ecx - popf - ret - -mp_get_tpr_bad: - mov real_tpr, %eax ; fixup - mov (%eax), %eax - jmp mp_get_tpr_out - -mp_get_tpr_ebx: - mov %eax, %ebx - call mp_get_tpr_eax - xchg %eax, %ebx - ret - -mp_get_tpr_ecx: - mov %eax, %ecx - call mp_get_tpr_eax - xchg %eax, %ecx - ret - -mp_get_tpr_edx: - mov %eax, %edx - call mp_get_tpr_eax - xchg %eax, %edx - ret - -mp_get_tpr_esi: - mov %eax, %esi - call mp_get_tpr_eax - xchg %eax, %esi - ret - -mp_get_tpr_edi: - mov %eax, %edi - call mp_get_tpr_edi - xchg %eax, %edi - ret - -mp_get_tpr_ebp: - mov %eax, %ebp - call mp_get_tpr_eax - xchg %eax, %ebp - ret - -mp_get_tpr_stack: - call mp_get_tpr_eax - xchg %eax, 4(%esp) - ret - -mp_set_tpr_eax: - push %eax - call mp_set_tpr - ret - -mp_set_tpr: - pushf - push %eax - push %ecx - push %edx - push %ebx - cli - reenable_vtpr - -mp_set_tpr_failed: - fs/movzbl pcr_cpu, %edx - - mov vcpu_shift, %ecx; fixup - shl %cl, %edx - - testb $1, vapic+4(%edx) ; fixup delta=-5 - jz mp_set_tpr_bad - - mov vapic(%edx), %eax ; fixup - - mov %eax, %ebx - mov 24(%esp), %bl - - /* %ebx = new vapic (%bl = tpr, %bh = isr, %b3 = irr) */ - - lock cmpxchg %ebx, vapic(%edx) ; fixup - 
jnz mp_set_tpr_failed - - /* compute ppr */ - cmp %bh, %bl - jae mp_tpr_is_bigger
Re: [PATCH] Make vapic.S into optional rom
On 10/26/2009 01:42 PM, Gleb Natapov wrote: Signed-off-by: Gleb Natapov g...@redhat.com Applied this and the pcbios patch as well. Thanks. -- error compiling committee.c: too many arguments to function
buildbot failure in qemu-kvm on default_x86_64_out_of_tree
The Buildbot has detected a new failure of default_x86_64_out_of_tree on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/default_x86_64_out_of_tree/builds/67 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_1 Build Reason: Build Source Stamp: [branch next] HEAD Blamelist: Avi Kivity a...@redhat.com,Gleb Natapov g...@redhat.com BUILD FAILED: failed git sincerely, -The Buildbot -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
buildbot failure in qemu-kvm on default_x86_64_debian_5_0
The Buildbot has detected a new failure of default_x86_64_debian_5_0 on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/default_x86_64_debian_5_0/builds/126 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_1 Build Reason: Build Source Stamp: [branch next] HEAD Blamelist: Avi Kivity a...@redhat.com,Gleb Natapov g...@redhat.com BUILD FAILED: failed git sincerely, -The Buildbot -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
buildbot failure in qemu-kvm on default_i386_out_of_tree
The Buildbot has detected a new failure of default_i386_out_of_tree on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_out_of_tree/builds/65 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_2 Build Reason: Build Source Stamp: [branch next] HEAD Blamelist: Avi Kivity a...@redhat.com,Gleb Natapov g...@redhat.com BUILD FAILED: failed git sincerely, -The Buildbot
buildbot failure in qemu-kvm on default_i386_debian_5_0
The Buildbot has detected a new failure of default_i386_debian_5_0 on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_debian_5_0/builds/128 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_2 Build Reason: Build Source Stamp: [branch next] HEAD Blamelist: Avi Kivity a...@redhat.com,Gleb Natapov g...@redhat.com BUILD FAILED: failed git sincerely, -The Buildbot
Re: [Alacrityvm-devel] [KVM PATCH v2 1/2] KVM: export lockless GSI attribute
Avi Kivity wrote:
> On 10/23/2009 04:38 AM, Gregory Haskins wrote:
>> Certain GSI's support lockless injection, but we have no way to detect
>> which ones at the GSI level. Knowledge of this attribute will be
>> useful later in the series so that we can optimize irqfd injection
>> paths for cases where we know the code will not sleep. Therefore,
>> we provide an API to query a specific GSI.
>
> Instead of a lockless attribute, how about a ->set_atomic() method. For
> msi this can be the same as ->set(), for non-msi it can be a function
> that schedules the work (which will eventually call ->set()).
>
> The benefit is that we make a decision only once, when preparing the
> routing entry, and install that decision in the routing entry instead of
> making it again and again later.

Yeah, I like this idea. I think we can also get rid of the custom workqueue if we do this as well, TBD.

>> +int kvm_irq_check_lockless(struct kvm *kvm, u32 irq)
>
> bool kvm_irq_check_lockless(...)

We lose the ability to detect failure (such as ENOENT) if we do this, but it's moot if we move to the ->set_atomic() model, since this attribute is no longer necessary and this patch can be dropped.

Kind Regards,
-Greg
RE: [Qemu-devel] net packet storms with multiple NICs
> -----Original Message-----
> From: qemu-devel-bounces+chris.krumme=windriver@nongnu.org [mailto:qemu-devel-bounces+chris.krumme=windriver@nongnu.org] On Behalf Of Avi Kivity
> Sent: Sunday, October 25, 2009 9:23 AM
> To: Mark McLoughlin
> Cc: Michael Tokarev; qemu-de...@nongnu.org; KVM list
> Subject: Re: [Qemu-devel] net packet storms with multiple NICs
>
> On 10/23/2009 06:43 PM, Mark McLoughlin wrote:
>> On Fri, 2009-10-23 at 20:25 +0400, Michael Tokarev wrote:
>>> I've two questions:
>>>
>>> o what's the intended usage of all-vlan-equal case, when kvm (or qemu)
>>>   reflects packets from one interface to another? It's what bridge
>>>   in linux is for, I think.
>>
>> I don't think it's necessarily an intended use-case for the vlan feature
>
> Well, it is. vlan=x really means the ethernet segment named x. If
> you connect all your guest nics to one vlan, you are connecting them
> all to one ethernet segment, so any packet transmitted on one will be
> reflected on others.
>
> Whether this is a useful feature is another matter, but the code is
> functioning as expected.

Hello,

We had one environment where the NIC understood by u-boot and the NIC understood by the kernel were different. We just attached both to the same VLAN. During u-boot one was used for downloading the kernel, then once the kernel booted the other was used.

Not ideal, and maybe not important enough to keep the feature around, but it does get used now and again.

Thanks

Chris

> --
> error compiling committee.c: too many arguments to function
Re: [Qemu-devel] net packet storms with multiple NICs
On 10/26/2009 03:40 PM, Krumme, Chris wrote:
>> Well, it is. vlan=x really means the ethernet segment named x. If
>> you connect all your guest nics to one vlan, you are connecting them
>> all to one ethernet segment, so any packet transmitted on one will be
>> reflected on others.
>>
>> Whether this is a useful feature is another matter, but the code is
>> functioning as expected.
>
> Hello,
>
> We had one environment where the NIC understood by u-boot and the NIC
> understood by the kernel were different. We just attached both to the
> same VLAN. During u-boot one was used for downloading the kernel, then
> once the kernel booted the other was used.
>
> Not ideal, and maybe not important enough to keep the feature around,
> but it does get used now and again.

You could get the same behaviour by using two different vlans connected to the same bridge.

--
error compiling committee.c: too many arguments to function
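For readers following the thread, the "one vlan is one ethernet segment" behaviour described above can be pictured with a small sketch. This is only an illustration under stated assumptions: `struct nic` and `hub_transmit()` are hypothetical names and are not taken from QEMU's net code. It models a hub that reflects a frame to every other NIC attached to the same segment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a guest NIC attached to an emulated segment. */
struct nic {
    int vlan;       /* segment id, as in "vlan=x" */
    int rx_count;   /* frames delivered to this NIC so far */
};

/*
 * Deliver one frame sent by nics[sender] to every other NIC on the
 * same segment. Returns the number of NICs that received the frame.
 */
int hub_transmit(struct nic *nics, size_t n, size_t sender)
{
    int delivered = 0;

    assert(sender < n);
    for (size_t i = 0; i < n; i++) {
        /* Skip the sender itself and NICs on other segments. */
        if (i == sender || nics[i].vlan != nics[sender].vlan)
            continue;
        nics[i].rx_count++;
        delivered++;
    }
    return delivered;
}
```

With two guest NICs on the same segment, every frame one of them sends is seen by the other, which matches the reflection Michael observed; a NIC on a different segment sees nothing.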
[PATCH] KVM test: Unattended install: Mount isos as read only
Sometimes CD images can be located on read-only NFS shares, so always pass the ro option to the CD mount command in the unattended.py setup script.

Signed-off-by: Lucas Meneghel Rodrigues <l...@redhat.com>
---
 client/tests/kvm/scripts/unattended.py |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/client/tests/kvm/scripts/unattended.py b/client/tests/kvm/scripts/unattended.py
index febea6e..2667649 100755
--- a/client/tests/kvm/scripts/unattended.py
+++ b/client/tests/kvm/scripts/unattended.py
@@ -136,8 +136,8 @@ class UnattendedInstall(object):
         pxe_dest = os.path.join(self.tftp_root, 'pxelinux.0')
         shutil.copyfile(pxe_file, pxe_dest)

-        m_cmd = 'mount -t iso9660 -v -o loop %s %s' % (self.cdrom_iso,
-                                                       self.cdrom_mount)
+        m_cmd = 'mount -t iso9660 -v -o loop,ro %s %s' % (self.cdrom_iso,
+                                                          self.cdrom_mount)
         if os.system(m_cmd):
             raise SetupError('Could not mount CD image %s.' % self.cdrom_iso)

--
1.6.2.5
Re: [Alacrityvm-devel] [KVM PATCH v2 1/2] KVM: export lockless GSI attribute
Gregory Haskins wrote:
> Avi Kivity wrote:
>> On 10/23/2009 04:38 AM, Gregory Haskins wrote:
>>> Certain GSI's support lockless injection, but we have no way to detect
>>> which ones at the GSI level. Knowledge of this attribute will be
>>> useful later in the series so that we can optimize irqfd injection
>>> paths for cases where we know the code will not sleep. Therefore,
>>> we provide an API to query a specific GSI.
>>
>> Instead of a lockless attribute, how about a ->set_atomic() method. For
>> msi this can be the same as ->set(), for non-msi it can be a function
>> that schedules the work (which will eventually call ->set()).
>>
>> The benefit is that we make a decision only once, when preparing the
>> routing entry, and install that decision in the routing entry instead of
>> making it again and again later.
>
> Yeah, I like this idea. I think we can also get rid of the custom
> workqueue if we do this as well, TBD.

So I looked into this. It isn't straightforward because you need to retain some kind of state across the deferment on a per-request basis (not per-GSI). Today, this state is neatly tracked in the irqfd object itself (e.g. it knows to toggle the GSI).

So while generalizing this perhaps makes sense at some point, especially if irqfd-like interfaces get added, it probably doesn't make a ton of sense to expend energy on it ATM. It is basically a generalization of the irqfd deferment code. Let's just wait until we have a user beyond irqfd for now. Sound acceptable?

In the meantime, I found a bug in the irq_routing code, so I will submit a v3 with this fix, as well as a few other things I improved in the v2 series.

Kind Regards,
-Greg
[KVM PATCH v3 0/3] irqfd enhancements, and irq_routing fixes
(Applies to kvm.git/master:11b06403)

The following patches are cleanups/enhancements for IRQFD now that we have lockless interrupt injection. For more details, please see the patch headers.

These patches pass checkpatch, and are fully tested. Please consider for merging. Patch 1/3 is a fix for an issue that may exist upstream and should be considered for a more timely push upstream. Patches 2/3 - 3/3 are an enhancement only, so there is no urgency to push to mainline until a suitable merge window presents itself.

Kind Regards,
-Greg

[ Change log:

  v3:
     *) Added patch 1/3 as a fix for a race condition
     *) Minor cleanup to 2/3 to ensure that all shared vectors conform
        to a unified locking model.

  v2:
     *) dropped original cleanup which relied on the user registering
        MSI based GSIs or we may crash at runtime. Instead, we now
        check at registration whether the GSI supports lockless
        operation and dynamically adapt to either the original
        deferred path for lock-based injections, or direct for lockless.

  v1:
     *) original release
]

---

Gregory Haskins (3):
      KVM: Directly inject interrupts if they support lockless operation
      KVM: export lockless GSI attribute
      KVM: fix race in irq_routing logic

 include/linux/kvm_host.h |    8
 virt/kvm/eventfd.c       |   31 +++--
 virt/kvm/irq_comm.c      |   85 ++
 virt/kvm/kvm_main.c      |    1 +
 4 files changed, 98 insertions(+), 27 deletions(-)

--
Signature
[KVM PATCH v3 3/3] KVM: Directly inject interrupts if they support lockless operation
IRQFD currently uses a deferred workqueue item to execute the injection operation. It was originally designed this way because kvm_set_irq() required the caller to hold the irq_lock mutex, and the eventfd callback is invoked from within a non-preemptible critical section.

With the advent of lockless injection support for certain GSIs, the deferment mechanism is no longer technically needed in all cases. Since context switching to the workqueue is a source of interrupt latency, let's switch to a direct method whenever possible. Fortunately for us, the most common use of irqfd (MSI-based GSIs) readily supports lockless injection.

Signed-off-by: Gregory Haskins <ghask...@novell.com>
---

 virt/kvm/eventfd.c |   31 +++
 1 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 30f70fd..e6cc958 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -51,20 +51,34 @@ struct _irqfd {
 	wait_queue_t              wait;
 	struct work_struct        inject;
 	struct work_struct        shutdown;
+	void (*execute)(struct _irqfd *);
 };

 static struct workqueue_struct *irqfd_cleanup_wq;

 static void
-irqfd_inject(struct work_struct *work)
+irqfd_inject(struct _irqfd *irqfd)
 {
-	struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
 	struct kvm *kvm = irqfd->kvm;

 	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
 	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
 }

+static void
+irqfd_deferred_inject(struct work_struct *work)
+{
+	struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
+
+	irqfd_inject(irqfd);
+}
+
+static void
+irqfd_schedule(struct _irqfd *irqfd)
+{
+	schedule_work(&irqfd->inject);
+}
+
 /*
  * Race-free decouple logic (ordering is critical)
  */
@@ -126,7 +140,7 @@ irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)

 	if (flags & POLLIN)
 		/* An event has been signaled, inject an interrupt */
-		schedule_work(&irqfd->inject);
+		irqfd->execute(irqfd);

 	if (flags & POLLHUP) {
 		/* The eventfd is closing, detach from KVM */
@@ -179,7 +193,7 @@ kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi)
 	irqfd->kvm = kvm;
 	irqfd->gsi = gsi;
 	INIT_LIST_HEAD(&irqfd->list);
-	INIT_WORK(&irqfd->inject, irqfd_inject);
+	INIT_WORK(&irqfd->inject, irqfd_deferred_inject);
 	INIT_WORK(&irqfd->shutdown, irqfd_shutdown);

 	file = eventfd_fget(fd);
@@ -209,6 +223,15 @@ kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi)
 	list_add_tail(&irqfd->list, &kvm->irqfds.items);
 	spin_unlock_irq(&kvm->irqfds.lock);

+	ret = kvm_irq_check_lockless(kvm, gsi);
+	if (ret < 0)
+		goto fail;
+
+	if (ret)
+		irqfd->execute = irqfd_inject;
+	else
+		irqfd->execute = irqfd_schedule;
+
 	/*
 	 * Check if there was an event already pending on the eventfd
 	 * before we registered, and trigger it as if we didn't miss it.
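As a reading aid, the dispatch idea in this patch (decide once at registration, then call through a function pointer on every wakeup) can be sketched in plain user-space C. Everything below is a hypothetical stand-in: `struct irqfd_sketch` and the `sketch_*` functions are not kernel code, and the deferred path is modelled with a simple counter rather than a real workqueue, which would coalesce a work item that is already pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct _irqfd; not the kernel definition. */
struct irqfd_sketch {
    int injected;   /* interrupts actually delivered */
    int queued;     /* injections waiting for the deferred worker */
    void (*execute)(struct irqfd_sketch *);
};

/* Direct path: deliver right away from the caller's context. */
void sketch_inject(struct irqfd_sketch *irqfd)
{
    irqfd->injected++;
}

/* Deferred path: only record that work is pending. */
void sketch_schedule(struct irqfd_sketch *irqfd)
{
    irqfd->queued++;
}

/* Worker that later drains deferred requests via the direct path. */
void sketch_run_worker(struct irqfd_sketch *irqfd)
{
    while (irqfd->queued > 0) {
        irqfd->queued--;
        sketch_inject(irqfd);
    }
}

/* Decide once, at registration time, which path the wakeup will use. */
void sketch_assign(struct irqfd_sketch *irqfd, bool lockless)
{
    assert(irqfd != NULL);
    irqfd->injected = 0;
    irqfd->queued = 0;
    irqfd->execute = lockless ? sketch_inject : sketch_schedule;
}

/* Wakeup handler: no per-event decision, just call through the pointer. */
void sketch_wakeup(struct irqfd_sketch *irqfd)
{
    irqfd->execute(irqfd);
}
```

The point of the design is that the wakeup handler stays branch-free: the lockless check is paid once in the assign step, and the latency of the workqueue hop is avoided whenever the direct path was selected.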
[KVM PATCH v3 1/3] KVM: fix race in irq_routing logic
The current code suffers from the following race condition:

thread-1                                    thread-2
-----------------------------------------------------------

kvm_set_irq() {
   rcu_read_lock()
   irq_rt = rcu_dereference(table);
   rcu_read_unlock();

                                       kvm_set_irq_routing() {
                                          mutex_lock();
                                          irq_rt = table;
                                          rcu_assign_pointer();
                                          mutex_unlock();
                                          synchronize_rcu();

                                          kfree(irq_rt);

   irq_rt->entry->set(); /* bad */

-------------------------------------------------------------

Because the pointer is accessed outside of the read-side critical section. There are two basic patterns we can use to fix this bug:

1) Switch to sleeping-rcu and encompass the ->set() access within the read-side critical section,

   OR

2) Add reference counting to the irq_rt structure, and simply acquire the reference from within the RSCS.

This patch implements solution (1).

Signed-off-by: Gregory Haskins <ghask...@novell.com>
---

 include/linux/kvm_host.h |    6 +-
 virt/kvm/irq_comm.c      |   50 +++---
 virt/kvm/kvm_main.c      |    1 +
 3 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index bd5a616..1fe135d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -185,7 +185,10 @@ struct kvm {

 	struct mutex irq_lock;
 #ifdef CONFIG_HAVE_KVM_IRQCHIP
-	struct kvm_irq_routing_table *irq_routing;
+	struct {
+		struct srcu_struct            srcu;
+		struct kvm_irq_routing_table *table;
+	} irq_routing;
 	struct hlist_head mask_notifier_list;
 	struct hlist_head irq_ack_notifier_list;
 #endif
@@ -541,6 +544,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *entries,
 			unsigned nr,
 			unsigned flags);
+void kvm_init_irq_routing(struct kvm *kvm);
 void kvm_free_irq_routing(struct kvm *kvm);

 #else
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 00c68d2..db2553f 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -144,10 +144,11 @@ static int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
  */
 int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 {
-	struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
-	int ret = -1, i = 0;
+	struct kvm_kernel_irq_routing_entry *e;
+	int ret = -1;
 	struct kvm_irq_routing_table *irq_rt;
 	struct hlist_node *n;
+	int idx;

 	trace_kvm_set_irq(irq, level, irq_source_id);

@@ -155,21 +156,19 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 	 * IOAPIC. So set the bit in both. The guest will ignore
 	 * writes to the unused one.
 	 */
-	rcu_read_lock();
-	irq_rt = rcu_dereference(kvm->irq_routing);
+	idx = srcu_read_lock(&kvm->irq_routing.srcu);
+	irq_rt = rcu_dereference(kvm->irq_routing.table);
 	if (irq < irq_rt->nr_rt_entries)
-		hlist_for_each_entry(e, n, &irq_rt->map[irq], link)
-			irq_set[i++] = *e;
-	rcu_read_unlock();
+		hlist_for_each_entry(e, n, &irq_rt->map[irq], link) {
+			int r;

-	while(i--) {
-		int r;
-		r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level);
-		if (r < 0)
-			continue;
+			r = e->set(e, kvm, irq_source_id, level);
+			if (r < 0)
+				continue;

-		ret = r + ((ret < 0) ? 0 : ret);
-	}
+			ret = r + ((ret < 0) ? 0 : ret);
+		}
+	srcu_read_unlock(&kvm->irq_routing.srcu, idx);

 	return ret;
 }
@@ -179,17 +178,18 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 	struct kvm_irq_ack_notifier *kian;
 	struct hlist_node *n;
 	int gsi;
+	int idx;

 	trace_kvm_ack_irq(irqchip, pin);

-	rcu_read_lock();
-	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
+	idx = srcu_read_lock(&kvm->irq_routing.srcu);
+	gsi = rcu_dereference(kvm->irq_routing.table)->chip[irqchip][pin];
 	if (gsi != -1)
 		hlist_for_each_entry_rcu(kian, n, &kvm->irq_ack_notifier_list,
 					 link)
 			if (kian->gsi == gsi)
 				kian->irq_acked(kian);
-	rcu_read_unlock();
+	srcu_read_unlock(&kvm->irq_routing.srcu, idx);
 }

 void kvm_register_irq_ack_notifier(struct kvm *kvm,
@@ -287,11
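The rule behind solution (1), that the use of the table must sit inside the same read-side section as the dereference, can be illustrated with a deliberately simplified single-threaded model. This is a toy and not SRCU: `struct routing_sketch` and the `sketch_*` helpers are hypothetical, the "grace period" is reduced to a reader counter, and reclaim returns false where a real `synchronize_srcu()` would block (and would only wait for readers that started before it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical routing table; "freed" stands in for kfree(). */
struct table_sketch {
    int value;
    bool freed;
};

struct routing_sketch {
    struct table_sketch *table;  /* currently published table */
    int readers;                 /* active read-side sections */
};

/* Mark the start of a read-side section. */
void sketch_read_lock(struct routing_sketch *r)
{
    r->readers++;
}

/* Mark the end of a read-side section. */
void sketch_read_unlock(struct routing_sketch *r)
{
    assert(r->readers > 0);
    r->readers--;
}

/* Writer: swap in a new table and hand back the old one. */
struct table_sketch *sketch_publish(struct routing_sketch *r,
                                    struct table_sketch *new_table)
{
    struct table_sketch *old = r->table;

    r->table = new_table;
    return old;
}

/* Writer: the old table may only be released once no reader is active. */
bool sketch_try_reclaim(struct routing_sketch *r, struct table_sketch *old)
{
    if (r->readers > 0)
        return false;
    old->freed = true;
    return true;
}

/* Reader: the dereference and the use both stay inside the section. */
int sketch_use_table(struct routing_sketch *r)
{
    int value;

    sketch_read_lock(r);
    assert(!r->table->freed);
    value = r->table->value;
    sketch_read_unlock(r);
    return value;
}
```

In this model a reader that is still inside its section keeps the old table alive, which is exactly the guarantee the original code lost by calling ->set() after rcu_read_unlock().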
[KVM PATCH v3 2/3] KVM: export lockless GSI attribute
Certain GSI's support lockless injection, but we have no way to detect which ones at the GSI level. Knowledge of this attribute will be useful later in the series so that we can optimize irqfd injection paths for cases where we know the code will not sleep. Therefore, we provide an API to query a specific GSI.

Signed-off-by: Gregory Haskins <ghask...@novell.com>
---

 include/linux/kvm_host.h |    2 ++
 virt/kvm/irq_comm.c      |   35 ++-
 2 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 1fe135d..01151a6 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -119,6 +119,7 @@ struct kvm_memory_slot {
 struct kvm_kernel_irq_routing_entry {
 	u32 gsi;
 	u32 type;
+	bool lockless;
 	int (*set)(struct kvm_kernel_irq_routing_entry *e,
 		   struct kvm *kvm, int irq_source_id, int level);
 	union {
@@ -420,6 +421,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
 				   unsigned long *deliver_bitmask);
 #endif
 int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
+int kvm_irq_check_lockless(struct kvm *kvm, u32 irq);
 void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
 void kvm_register_irq_ack_notifier(struct kvm *kvm,
 				   struct kvm_irq_ack_notifier *kian);
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index db2553f..a7fd487 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -173,6 +173,35 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 	return ret;
 }

+int kvm_irq_check_lockless(struct kvm *kvm, u32 irq)
+{
+	struct kvm_kernel_irq_routing_entry *e;
+	struct kvm_irq_routing_table *irq_rt;
+	struct hlist_node *n;
+	int ret = -ENOENT;
+	int idx;
+
+	idx = srcu_read_lock(&kvm->irq_routing.srcu);
+	irq_rt = rcu_dereference(kvm->irq_routing.table);
+	if (irq < irq_rt->nr_rt_entries)
+		hlist_for_each_entry(e, n, &irq_rt->map[irq], link) {
+			if (!e->lockless) {
+				/*
+				 * all destinations need to be lockless to
+				 * declare that the GSI as a whole is also
+				 * lockless
+				 */
+				ret = 0;
+				break;
+			}
+
+			ret = 1;
+		}
+	srcu_read_unlock(&kvm->irq_routing.srcu, idx);
+
+	return ret;
+}
+
 void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 {
 	struct kvm_irq_ack_notifier *kian;
@@ -310,18 +339,22 @@ static int setup_routing_entry(struct kvm_irq_routing_table *rt,
 	int delta;
 	struct kvm_kernel_irq_routing_entry *ei;
 	struct hlist_node *n;
+	bool lockless = ue->type == KVM_IRQ_ROUTING_MSI;

 	/*
 	 * Do not allow GSI to be mapped to the same irqchip more than once.
 	 * Allow only one to one mapping between GSI and MSI.
+	 * Do not allow mixed lockless vs locked variants to coexist.
 	 */
 	hlist_for_each_entry(ei, n, &rt->map[ue->gsi], link)
 		if (ei->type == KVM_IRQ_ROUTING_MSI ||
-		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
+		    ue->u.irqchip.irqchip == ei->irqchip.irqchip ||
+		    ei->lockless != lockless)
 			return r;

 	e->gsi = ue->gsi;
 	e->type = ue->type;
+	e->lockless = lockless;
 	switch (ue->type) {
 	case KVM_IRQ_ROUTING_IRQCHIP:
 		delta = 0;
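The three-way return convention of kvm_irq_check_lockless() (negative error when the GSI has no routing entries, 0 when any destination needs a lock, 1 when all are lockless) is easy to misread in diff form. The following stand-alone sketch restates it over a plain array; `sketch_check_lockless()` is a hypothetical helper written for this illustration, with the hlist walk and SRCU locking left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns -ENOENT if there are no destinations, 0 if at least one
 * destination is not lockless, and 1 if every destination is lockless.
 */
int sketch_check_lockless(const bool *dest_lockless, size_t n)
{
    int ret = -ENOENT;

    assert(dest_lockless != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* A single locked destination makes the whole GSI locked. */
        if (!dest_lockless[i])
            return 0;
        ret = 1;
    }
    return ret;
}
```

This is also why Greg notes elsewhere in the thread that switching the return type to bool would lose the ability to report the "no such GSI" case.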
[ANNOUNCE] kvm-kmod-2.6.31.5
This package contains the kvm external modules, using the sources from the latest stable Linux release, 2.6.31.5. It can be used to update the kernel-side support of kvm without upgrading the host kernel.

This release has been tested on x86 down to host kernel 2.6.27 and builds down to 2.6.24. Building against older kernels is expected to be broken, but if anyone provides patches to fix it, I'm open to merging them.

Enjoy,
Jan
Re: Jan Kiszka to maintain kvm-kmod
Avi Kivity wrote:
> I am pleased to announce that Jan Kiszka has agreed to maintain
> kvm-kmod.git, the backporting kit that allows running modern kvm code on
> older kernels. Jan will release kvm-kmod-2.6.x.y packages and
> kvm-kmod-2.6.x-rcy packages, while Marcelo and I will (with Jan's help)
> release kvm-kmod-devel-xx.
>
> Many thanks to Jan for taking on this task.

Thanks for giving me the chance to screw even more things up. :)

Thanks also go to Siemens Corporate Technology and Siemens Enterprise Communications for sponsoring my work on kvm-kmod.

> As there are now many different sources of kvm kernel modules to choose
> from, I wrote up a page that describes the various releases and what
> they are suited for. This can be found in
> http://www.linux-kvm.org/page/Getting_the_kvm_kernel_modules.

And besides those releases, I will try to keep the kvm-kmod.git in sync with latest kvm.git so that developers can test most bleeding-edge kvm on not that much bleeding host kernels (I'm one of those).

At this chance I would like to underline that the quality of kvm-kmod support of course continues to depend on patch contributions. So if you are posting a new kvm feature that may require compat wrapping or you discover some breakage, please consider posting a corresponding update of kvm-kmod as well. TiA!

To help detect breakages, I've set up a buildbot [1] that checks kvm-kmod against its officially supported kvm version as well as the next branch in kvm.git (the former on commits, the latter on a nightly basis). That forecast already promises the next rain [2] - time to go home...

Jan

[1] http://buildbot.kiszka.org/kvm-kmod/
[2] http://buildbot.kiszka.org/kvm-kmod/builders/latest-kvm/builds/11/steps/compile/logs/stdio

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: qemu-kvm: sigsegv at exit
> On Thu, Oct 22, 2009 at 06:57:27PM -0200, Marcelo Tosatti wrote:
>> On Thu, Oct 22, 2009 at 02:00:15PM +0200, Michael S. Tsirkin wrote:
>>> Hi!
>>> I'm sometimes getting segfaults when I kill qemu.
>>> This time I caught it when qemu was under gdb:
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 0x411d0940 (LWP 14446)]
>>> 0x0040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335) at /home/mst/scm/qemu-kvm/vl.c:1009
>>> 1009        if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
>>> (gdb) l
>>> 1004    ts->next = *pt;
>>> 1005    *pt = ts;
>>> 1006
>>> 1007    /* Rearm if necessary */
>>> 1008    if (pt == &active_timers[ts->clock->type]) {
>>> 1009        if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
>>> 1010            qemu_rearm_alarm_timer(alarm_timer);
>>> 1011        }
>>> 1012        /* Interrupt execution to force deadline recalculation. */
>>> 1013        if (use_icount)
>>> (gdb) p alarm_timer
>>> $1 = (struct qemu_alarm_timer *) 0x0
>>> (gdb) where
>>> #0 0x0040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335) at /home/mst/scm/qemu-kvm/vl.c:1009
>>> #1 0x0041aadf in virtio_net_handle_tx (vdev=<value optimized out>, vq=0x19f5af0) at /home/mst/scm/qemu-kvm/hw/virtio-net.c:696
>>> #2 0x00421669 in kvm_run (vcpu=0x19d46a0, env=0x19c2250) at /home/mst/scm/qemu-kvm/qemu-kvm.c:797
>>> #3 0x004216d6 in kvm_cpu_exec (env=0x83d0f8) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1714
>>> #4 0x00422981 in ap_main_loop (_env=<value optimized out>) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1969
>>> #5 0x00377dc06367 in start_thread () from /lib64/libpthread.so.0
>>> #6 0x00377d0d30ad in clone () from /lib64/libc.so.6
>>> (gdb)
>>>
>>> So this probably means that we have already run quit_timers:
>>>
>>> static void quit_timers(void)
>>> {
>>>     alarm_timer->stop(alarm_timer);
>>>     alarm_timer = NULL;
>>> }
>>>
>>> but kvm vcpu thread is still running.
>>> Not sure what the right fix is here: should we stop kvm after main
>>> loop has exited?
>>
>> kvm_main_loop_wait(env, 0) can process the stop request (signalling
>> iothread that vcpu is stopped, so its OK to exit) and continue to
>> kvm_cpu_exec.
>>
>> Can you please try this:
>
> I applied this, and have not yet seen any segfaults at exit.
> Not sure whether this means anything as the crash is not 100% reproducible.
> Push it out to Anthony and we'll see, long term?
>
> Based on the knowledge of how to fix this, how would you go about
> reproducing it?

Add code to trigger the race manually, but i'm pretty sure thats it.

Thanks for testing.
Re: [PATCH] virtio-net: fix data corruption with OOM
On Mon, Oct 26, 2009 at 12:11:51PM +1030, Rusty Russell wrote:
> On Mon, 26 Oct 2009 03:33:40 am Michael S. Tsirkin wrote:
>> virtio net used to unlink skbs from send queues on error,
>> but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
>> we do not do this. This causes guest data corruption and crashes
>> with vhost since net core can requeue the skb or free it without
>> it being taken off the list.
>>
>> This patch fixes this by queueing the skb after successful
>> transmit.
>
> I originally thought that this was racy: as soon as we do add_buf, we need to
> make sure we're ready for the callback (for virtio_pci, it's ->kick, but we
> shouldn't rely on that).

Modified the guest slightly, and I am getting crashes again. I didn't have time to debug this, but based on previous experience, I reverted 48925e372f04f5e35fec6269127c62b2c71ab794, and the crash went away.

Rusty, what do you say we just revert 48925e372f04f5e35fec6269127c62b2c71ab794 for now?

How to reproduce: I used my vhost trees, and modified drivers/vhost/vhost.c:

-	vhost_workqueue = create_workqueue("vhost");
+	vhost_workqueue = create_singlethread_workqueue("vhost");

My guess is this modifies timing and uncovers more races, but of course there is a possibility that the bug is in vhost. Still, the fact that 2.6.31 and 48925e372f04f5e35fec6269127c62b2c71ab794 as a guest are both fine is a strong hint that 48925e372f04f5e35fec6269127c62b2c71ab794 is to blame.

[ 24.555691] BUG: unable to handle kernel NULL pointer dereference at 0008
[ 24.556658] IP: [a003f1b1] free_old_xmit_skbs+0x66/0xcd [virtio_net]
[ 24.556658] PGD 3e9ee067 PUD 3f38d067 PMD 0
[ 24.556658] Thread overran stack, or stack corrupted
[ 24.556658] Oops: 0002 [#1] SMP
[ 24.556658] last sysfs file: /sys/devices/virtual/input/input1/capabilities/sw
[ 24.556658] CPU 0
[ 24.556658] Modules linked in: virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet aacraid [last unloaded: scsi_wait_scan]
[ 24.556658] Pid: 0, comm: swapper Tainted: GW 2.6.32-rc4-net #6
[ 24.556658] RIP: 0010:[a003f1b1] [a003f1b1] free_old_xmit_skbs+0x66/0xcd [virtio_net]
[ 24.556658] RSP: 0018:880001c03d70 EFLAGS: 00010202
[ 24.556658] RAX: 88003e951418 RBX: 88003e953398 RCX:
[ 24.556658] RDX: RSI: 880001c03d84 RDI: 88003e953398
[ 24.556658] RBP: 880001c03db0 R08: 88003e2c949c R09:
[ 24.556658] R10: 880001c03f78 R11: fffbcc57 R12: 88003e65cdc0
[ 24.556658] R13: R14: 2000 R15: 880001c03d84
[ 24.556658] FS: () GS:880001c0() knlGS:
[ 24.556658] CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
[ 24.556658] CR2: 0008 CR3: 3eee4000 CR4: 06b0
[ 24.556658] DR0: DR1: DR2:
[ 24.556658] DR3: DR6: 0ff0 DR7: 0400
[ 24.556658] Process swapper (pid: 0, threadinfo 8174e000, task 817c09f0)
[ 24.556658] Stack:
[ 24.556658] 0002 88003e953398
[ 24.556658] <0> 88003e953398 88003e65cdc0 88003e65c800 88003e65ce70
[ 24.556658] <0> 880001c03df0 a003fb35 88003e65cc28 88003e953398
[ 24.556658] Call Trace:
[ 24.556658] <IRQ>
[ 24.556658] [a003fb35] start_xmit+0x38/0x15f [virtio_net]
[ 24.556658] [813ff768] dev_hard_start_xmit+0x26c/0x2d3
[ 24.556658] [81412016] sch_direct_xmit+0x5a/0x157
KVM: VMX: move CR3/PDPTR update to vmx_set_cr3
GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 is modified from outside guest context. Similarly pdptrs are updated via load_pdptrs.

Let kvm_set_cr3 perform the update, removing it from the vcpu_run fast path.

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: b/arch/x86/kvm/vmx.c
===================================================================
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1748,6 +1748,7 @@ static void vmx_set_cr3(struct kvm_vcpu
 		vmcs_write64(EPT_POINTER, eptp);
 		guest_cr3 = is_paging(vcpu) ? vcpu->arch.cr3 :
 			vcpu->kvm->arch.ept_identity_map_addr;
+		ept_load_pdptrs(vcpu);
 	}

 	vmx_flush_tlb(vcpu);
@@ -3638,10 +3639,6 @@ static void vmx_vcpu_run(struct kvm_vcpu
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);

-	if (enable_ept && is_paging(vcpu)) {
-		vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
-		ept_load_pdptrs(vcpu);
-	}
 	/* Record the guest's net vcpu time for enforced NMI injections. */
 	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
 		vmx->entry_time = ktime_get();
Index: b/arch/x86/kvm/x86.c
===================================================================
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4517,8 +4517,10 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct

 	mmu_reset_needed |= vcpu->arch.cr4 != sregs->cr4;
 	kvm_x86_ops->set_cr4(vcpu, sregs->cr4);
-	if (!is_long_mode(vcpu) && is_pae(vcpu))
+	if (!is_long_mode(vcpu) && is_pae(vcpu)) {
 		load_pdptrs(vcpu, vcpu->arch.cr3);
+		mmu_reset_needed = 1;
+	}

 	if (mmu_reset_needed)
 		kvm_mmu_reset_context(vcpu);
performance regression in virtio-net in 2.6.32-rc4
Hi!
I noticed a performance regression in virtio net: going from 2.6.31 to 2.6.32-rc4 I see this, for guest to host communication:

[...@tuck ~]$ ssh robin sh streamtest1
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.3 (11.0.0.3) port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.20    7806.48

[...@tuck ~]$ ssh robin sh streamtest1
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.3 (11.0.0.3) port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    6814.60

Note: I had to revert 48925e372f04f5e35fec6269127c62b2c71ab794, and I applied a patch "virtio-pci: fix per-vq MSI-X request logic" which fixes a bug introduced by f68d24082e22ccee3077d11aeb6dc5354f0ca7f1.

Any tips on debugging this?

--
MST
Re: qemu-kvm: sigsegv at exit
On Mon, Oct 26, 2009 at 04:43:11PM -0200, Marcelo Tosatti wrote: On Thu, Oct 22, 2009 at 06:57:27PM -0200, Marcelo Tosatti wrote: On Thu, Oct 22, 2009 at 02:00:15PM +0200, Michael S. Tsirkin wrote: Hi! I'm sometimes getting segfaults when I kill qemu. This time I caught it when qemu was under gdb: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x411d0940 (LWP 14446)] 0x0040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335) at /home/mst/scm/qemu-kvm/vl.c:1009 1009if ((alarm_timer-flags ALARM_FLAG_EXPIRED) == 0) { (gdb) l 1004ts-next = *pt; 1005*pt = ts; 1006 1007/* Rearm if necessary */ 1008if (pt == active_timers[ts-clock-type]) { 1009if ((alarm_timer-flags ALARM_FLAG_EXPIRED) == 0) { 1010qemu_rearm_alarm_timer(alarm_timer); 1011} 1012/* Interrupt execution to force deadline recalculation. */ 1013if (use_icount) (gdb) p alarm_timer $1 = (struct qemu_alarm_timer *) 0x0 (gdb) where #0 0x0040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335) at /home/mst/scm/qemu-kvm/vl.c:1009 #1 0x0041aadf in virtio_net_handle_tx (vdev=value optimized out, vq=0x19f5af0) at /home/mst/scm/qemu-kvm/hw/virtio-net.c:696 #2 0x00421669 in kvm_run (vcpu=0x19d46a0, env=0x19c2250) at /home/mst/scm/qemu-kvm/qemu-kvm.c:797 #3 0x004216d6 in kvm_cpu_exec (env=0x83d0f8) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1714 #4 0x00422981 in ap_main_loop (_env=value optimized out) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1969 #5 0x00377dc06367 in start_thread () from /lib64/libpthread.so.0 #6 0x00377d0d30ad in clone () from /lib64/libc.so.6 (gdb) So this probably means that we have already run quit_timers: static void quit_timers(void) { alarm_timer-stop(alarm_timer); alarm_timer = NULL; } but kvm vcpu thread is still running. Not sure what the right fix is here: should we stop kvm after main loop has exited? kvm_main_loop_wait(env, 0) can process the stop request (signalling iothread that vcpu is stopped, so its OK to exit) and continue to kvm_cpu_exec. 
Can you please try this: I applied this, and have not yet seen any segfaults at exit. Not sure whether this means anything, as the crash is not 100% reproducible. Push it out to Anthony and we'll see, long term? Based on the knowledge of how to fix this, how would you go about reproducing it? Add code to trigger the race manually, If you like, send a patch adding such code, I will test. but I'm pretty sure that's it. Thanks for testing.
KVM: MMU: update invlpg handler comment
Large page translations are always synchronized (either in level 3 or level 2), so it's not necessary to properly deal with them in the invlpg handler.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: b/arch/x86/kvm/paging_tmpl.h
===
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -467,7 +467,6 @@ static void FNAME(invlpg)(struct kvm_vcp
 		level = iterator.level;
 		sptep = iterator.sptep;
 
-		/* FIXME: properly handle invlpg on large guest pages */
 		if (level == PT_PAGE_TABLE_LEVEL ||
 		    ((level == PT_DIRECTORY_LEVEL && is_large_pte(*sptep))) ||
 		    ((level == PT_PDPE_LEVEL && is_large_pte(*sptep)))) {
Re: fix qemu-kvm sigsegv at exit
On Mon, Oct 26, 2009 at 04:46:02PM -0200, Marcelo Tosatti wrote: Michael reported a qemu-kvm SIGSEGV at shutdown: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x411d0940 (LWP 14446)] 0x0040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335) at /home/mst/scm/qemu-kvm/vl.c:1009 1009if ((alarm_timer-flags ALARM_FLAG_EXPIRED) == 0) { (gdb) l 1004ts-next = *pt; 1005*pt = ts; 1006 1007/* Rearm if necessary */ 1008if (pt == active_timers[ts-clock-type]) { 1009if ((alarm_timer-flags ALARM_FLAG_EXPIRED) == 0) { 1010 qemu_rearm_alarm_timer(alarm_timer); 1011} 1012/* Interrupt execution to force deadline recalculation. */ 1013if (use_icount) (gdb) p alarm_timer $1 = (struct qemu_alarm_timer *) 0x0 Problem is kvm_main_loop_wait(env, 0) can process the stop request (signalling iothread that vcpu is stopped, so its OK to exit) and continue to kvm_cpu_exec. Make sure cpu is not stopped before proceeding to kvm_cpu_exec. Signed-off-by: Marcelo Tosatti mtosa...@redhat.com Reported-by: Michael S. Tsirkin m...@redhat.com diff --git a/qemu-kvm.c b/qemu-kvm.c index 4c13628..ab8f0e4 100644 --- a/qemu-kvm.c +++ b/qemu-kvm.c @@ -1868,7 +1868,8 @@ static int kvm_main_loop_cpu(CPUState *env) } if (run_cpu) { kvm_main_loop_wait(env, 0); -kvm_cpu_exec(env); +if (!is_cpu_stopped(env)) +kvm_cpu_exec(env); I wonder if calling kvm_cpu_exec() after kvm_main_loop_wait() will fix the problem? } else { kvm_main_loop_wait(env, 1000); } -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: fix qemu-kvm sigsegv at exit
On Mon, Oct 26, 2009 at 08:58:49PM +0200, Gleb Natapov wrote: On Mon, Oct 26, 2009 at 04:46:02PM -0200, Marcelo Tosatti wrote: Michael reported a qemu-kvm SIGSEGV at shutdown: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x411d0940 (LWP 14446)] 0x0040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335) at /home/mst/scm/qemu-kvm/vl.c:1009 1009if ((alarm_timer-flags ALARM_FLAG_EXPIRED) == 0) { (gdb) l 1004ts-next = *pt; 1005*pt = ts; 1006 1007/* Rearm if necessary */ 1008if (pt == active_timers[ts-clock-type]) { 1009if ((alarm_timer-flags ALARM_FLAG_EXPIRED) == 0) { 1010qemu_rearm_alarm_timer(alarm_timer); 1011} 1012/* Interrupt execution to force deadline recalculation. */ 1013if (use_icount) (gdb) p alarm_timer $1 = (struct qemu_alarm_timer *) 0x0 Problem is kvm_main_loop_wait(env, 0) can process the stop request (signalling iothread that vcpu is stopped, so its OK to exit) and continue to kvm_cpu_exec. Make sure cpu is not stopped before proceeding to kvm_cpu_exec. Signed-off-by: Marcelo Tosatti mtosa...@redhat.com Reported-by: Michael S. Tsirkin m...@redhat.com diff --git a/qemu-kvm.c b/qemu-kvm.c index 4c13628..ab8f0e4 100644 --- a/qemu-kvm.c +++ b/qemu-kvm.c @@ -1868,7 +1868,8 @@ static int kvm_main_loop_cpu(CPUState *env) } if (run_cpu) { kvm_main_loop_wait(env, 0); -kvm_cpu_exec(env); +if (!is_cpu_stopped(env)) +kvm_cpu_exec(env); I wonder if calling kvm_cpu_exec() after kvm_main_loop_wait() will fix the problem? Yeah, that would also do it. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
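The ordering problem discussed in this thread can be modelled in a few lines. What follows is only an illustrative sketch in Python, not qemu-kvm code: `Vcpu`, `main_loop_wait`, `cpu_exec`, and `run_iteration` are hypothetical names standing in for the vcpu state, `kvm_main_loop_wait()`, `kvm_cpu_exec()`, and one pass of the loop body in `kvm_main_loop_cpu()`. It shows why the stopped flag has to be rechecked after the wait step.

```python
class Vcpu:
    """Hypothetical stand-in for the per-vcpu state."""

    def __init__(self):
        self.stop_requested = False
        self.stopped = False
        self.exec_count = 0


def main_loop_wait(vcpu):
    # Models the wait step: a pending stop request is acknowledged here.
    # Once acknowledged, the I/O thread may tear down shared state
    # (in the report above, alarm_timer is set to NULL).
    if vcpu.stop_requested:
        vcpu.stop_requested = False
        vcpu.stopped = True


def cpu_exec(vcpu):
    # Models guest execution, which may touch that shared state.
    vcpu.exec_count += 1


def run_iteration(vcpu, check_stopped):
    main_loop_wait(vcpu)
    if check_stopped and vcpu.stopped:
        return  # patched behaviour: never run a vcpu that was just stopped
    cpu_exec(vcpu)


# Without the check, the vcpu still executes once after being stopped.
unpatched = Vcpu()
unpatched.stop_requested = True
run_iteration(unpatched, check_stopped=False)

# With the check, execution is skipped.
patched = Vcpu()
patched.stop_requested = True
run_iteration(patched, check_stopped=True)

print(unpatched.exec_count, patched.exec_count)
```

The model is single-threaded on purpose: the window is not about timing inside one thread, but about the vcpu loop continuing after it has already told the I/O thread that it is safe to exit.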
Re: [PATCH] virtio-net: fix data corruption with OOM
On Mon, Oct 26, 2009 at 08:42:43PM +0200, Michael S. Tsirkin wrote: On Mon, Oct 26, 2009 at 12:11:51PM +1030, Rusty Russell wrote: On Mon, 26 Oct 2009 03:33:40 am Michael S. Tsirkin wrote: virtio net used to unlink skbs from send queues on error, but ever since 48925e372f04f5e35fec6269127c62b2c71ab794 we do not do this. This causes guest data corruption and crashes with vhost since net core can requeue the skb or free it without it being taken off the list. This patch fixes this by queueing the skb after successful transmit. I originally thought that this was racy: as soon as we do add_buf, we need to make sure we're ready for the callback (for virtio_pci, it's ->kick, but we shouldn't rely on that). Modified the guest slightly, and I am getting crashes again. I didn't have time to debug this, but based on previous experience, I reverted 48925e372f04f5e35fec6269127c62b2c71ab794, and the crash went away. Rusty, what do you say we just revert 48925e372f04f5e35fec6269127c62b2c71ab794 for now? Hmm. Can't reproduce the crash anymore. There is a small chance that the problem was my error, so I guess I should try to reproduce and debug this, after all.
-cpu host AMD Host
Is -cpu host supported on AMD hosts? Whenever I try to use this option on a Windows Vista/7 client, I get a blue screen. Removing the option, the client works fine.

Host kernel 2.6.31.4. Userspace is qemu-kvm-0.11.0. (Previous versions fail too.)

/proc/cpuinfo snippet:
vendor_id : AuthenticAMD
cpu family : 15
model : 107
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 5200+

Thanks,
-- Marty
Re: vhost-net patches
On Fri, Oct 23, 2009 at 09:23:40AM -0700, Shirley Ma wrote: Hello Michael, Some initial vhost test netperf results on my T61 laptop from the working tap device are here, latency has been significant decreased, but throughput from guest to host has huge regression. I also hit guest skb_xmit panic. netperf TCP_STREAM, default setup, 60 secs run guest-host drops from 3XXXMb/s to 1XXXMb/s (regression) host-guest increases from 3XXXMb/s to 4Mb/s TCP_RR, 60 secs run (very impressive) guest-host trans/s increases from 2XXX/s to 13XXX/s host-guest trans/s increases from 2XXX/s to 13XXX/s Thanks Shirley Shirley, could you please test the following patch? It is surprising to me that it should improve performance, but seems to do this in my setup. Please comment. diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 30708c6..67bfc08 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -775,7 +775,7 @@ void vhost_no_notify(struct vhost_virtqueue *vq) int vhost_init(void) { - vhost_workqueue = create_workqueue(vhost); + vhost_workqueue = create_singlethread_workqueue(vhost); if (!vhost_workqueue) return -ENOMEM; return 0; diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index a140dad..49026bb 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -106,10 +106,14 @@ static void handle_tx(struct vhost_net *net) .msg_flags = MSG_DONTWAIT, }; size_t len, total_len = 0; - int err; + int err, wmem; size_t hdr_size; struct socket *sock = rcu_dereference(vq-private_data); - if (!sock || !sock_writeable(sock-sk)) + if (!sock) + return; + + wmem = atomic_read(sock-sk-sk_wmem_alloc); + if (wmem = sock-sk-sk_sndbuf) return; use_mm(net-dev.mm); -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Sun, 2009-10-25 at 15:01 +0200, Avi Kivity wrote: On 10/23/2009 02:33 AM, Hollis Blanchard wrote: On Wed, 2009-10-21 at 17:03 +0200, Alexander Graf wrote: KVM for PowerPC only supports embedded cores at the moment. While it makes sense to virtualize on small machines, it's even more fun to do so on big boxes. So I figured we need KVM for PowerPC64 as well. This patchset implements KVM support for Book3s_64 hosts and guest support for Book3s_64 and G3/G4. Acked-by: Hollis Blanchard holl...@us.ibm.com Avi, please apply these patches. I still need acks for the arch/powerpc/{kernel,mm} bits, simple as they are, from the powerpc maintainers. OK, BenH says they're on his todo list. In the meantime, please apply patch #2, because it fixes the broken qemu build. -- Hollis Blanchard IBM Linux Technology Center
xp guest, blue screen c0000221 on boot
Hangs on boot, xp guest: STOP: c221 Unknown Hard Error \SystemRoot\System32\ntdll.dll Will boot into safe mode, but _not_ into safe mode with networking. Boots into non-MS VMs fine. * what cpu model (examples: Intel Core Duo, Intel Core 2 Duo, AMD Opteron 2210). See /proc/cpuinfo if you're not sure. processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 14 model name : Genuine Intel(R) CPU L2400 @ 1.66GHz stepping: 8 cpu MHz : 1000.000 cache size : 2048 KB physical id : 0 siblings: 2 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fdiv_bug: no hlt_bug : no f00f_bug: no coma_bug: no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts pni monitor vmx est tm2 xtpr pdcm bogomips: 3324.92 clflush size: 64 power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 14 model name : Genuine Intel(R) CPU L2400 @ 1.66GHz stepping: 8 cpu MHz : 1000.000 cache size : 2048 KB physical id : 0 siblings: 2 core id : 1 cpu cores : 2 apicid : 1 initial apicid : 1 fdiv_bug: no hlt_bug : no f00f_bug: no coma_bug: no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts pni monitor vmx est tm2 xtpr pdcm bogomips: 3324.97 clflush size: 64 power management: * what kvm version you are using. If you're using git directly, provide the output of 'git describe'. 
Same behavior with ubuntu package 0.11.0-0ubuntu6 (karmic) and source qemu-kvm-0.11.0 * the host kernel version Linux monkamu 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC 2009 i686 GNU/Linux * what host kernel arch you are using (i386 or x86_64) i386 * what guest you are using, including OS type (Linux, Windows, Solaris, etc.), bitness (32 or 64), kernel version XP Pro 32 SP 3 * the qemu command line you are using to start the guest kvm -cpu coreduo,-nx -hda /z/xp.img -boot c -usb -usbdevice tablet -m 512 * whether the problem goes away if using the -no-kvm-irqchip or -no-kvm-pit switch. No * whether the problem also appears with the -no-kvm switch. No -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ANNOUNCE] kvm-kmod-2.6.31.5
On 26.10.2009, at 18:26, Jan Kiszka wrote: This package contains the kvm external modules, using the sources from latest stable Linux release 2.6.31.5. It can be used to update the kernel-side support of kvm without upgrading the host kernel. This release has been tested on x86 down to host kernel 2.6.27 and builds down to 2.6.24. Building against older kernels is expected to be broken, but if anyone provides patches to fix it, I'm open to merge them. Aww - I'm missing the awesome changelogs :-). Great to see you take this up Jan! Alex
Re: vhost-net patches
Hello Michael, On Mon, 2009-10-26 at 22:05 +0200, Michael S. Tsirkin wrote: Shirley, could you please test the following patch? With this patch, the performance has improved from 1xxx to 2xxx Mb/s, but there is still a performance gap compared to running without vhost. It was 3xxx Mb/s before from guest to host on my setup. Looks like your git tree virtio_net has also fixed the skb_xmit panic I have seen before, good news. Thanks Shirley
Re: vhost-net patches
Pulled your git tree, didn't see the panic. Thanks Shirley
Re: vhost-net patches
On Sun, 2009-10-25 at 11:11 +0200, Michael S. Tsirkin wrote: What is vnet0? That's a tap interface. I am binding a raw socket to a tap interface and it doesn't work. Is that supported? Thanks Shirley
[ kvm-Bugs-2886754 ] Extreme slow down using -cpu host
Bugs item #2886754, was opened at 2009-10-26 23:41
Message generated for change (Tracker Item Submitted) made by nwxi
You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2886754&group_id=180599

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: qemu
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: nwxi (nwxi)
Assigned to: Nobody/Anonymous (nobody)
Summary: Extreme slow down using -cpu host

Initial Comment:
Hi! I'm currently experiencing a massive slowdown when using the flag -cpu host on an AMD Phenom 905e with qemu-kvm 0.11.0 and kernel 2.6.31.3 x86_64. This affects network (tested e1000 and virtio) and i/o performance (virtio). Further, there are dozens of log messages from the kvm module complaining about:

cpu0/1 unhandled rdmsr: 0xc0010055

Both the slowdown and the rdmsr message do not happen when using qemu64 as cpu, but also occur when using -cpu phenom.

Thank you for your comments!
regards, Michael
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
Not sure which patch in the series this is needed for since I applied them all, but I got:

  CC      arch/powerpc/kvm/timing.o
arch/powerpc/kvm/timing.c:205: error: 'THIS_MODULE' undeclared here (not in a function)

Signed-off-by: Olof Johansson o...@lixom.net

diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
index 2aa371e..7037855 100644
--- a/arch/powerpc/kvm/timing.c
+++ b/arch/powerpc/kvm/timing.c
@@ -23,6 +23,7 @@
 #include <linux/seq_file.h>
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
+#include <linux/module.h>
 
 #include <asm/time.h>
 #include <asm-generic/div64.h>
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Mon, 2009-10-26 at 18:06 -0500, Olof Johansson wrote:
Not sure which patch in the series this is needed for since I applied them all, but I got:

  CC      arch/powerpc/kvm/timing.o
arch/powerpc/kvm/timing.c:205: error: 'THIS_MODULE' undeclared here (not in a function)

Signed-off-by: Olof Johansson o...@lixom.net

diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
index 2aa371e..7037855 100644
--- a/arch/powerpc/kvm/timing.c
+++ b/arch/powerpc/kvm/timing.c
@@ -23,6 +23,7 @@
 #include <linux/seq_file.h>
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
+#include <linux/module.h>
 
 #include <asm/time.h>
 #include <asm-generic/div64.h>

For some reason, I'm not seeing this build break, but the patch is obviously correct.

Acked-by: Hollis Blanchard holl...@us.ibm.com

-- 
Hollis Blanchard
IBM Linux Technology Center
[PATCH] KVM test: Add new program cd_hash.py
A new program that evaluates hash strings, intended to help kvm autotest administrators was added, cd_hash. Usage: cd_hash.py [options] Options: -h, --helpshow this help message and exit -i FILENAME, --iso=FILENAME path to a ISO file whose hash string will be evaluated. This script will calculate: * MD5SUM for the 1st MB of the file * SHA1SUM for the 1st MB of the file * MD5SUM for the whole file * SHA1SUM for the whole file The hashes for the 1st MB are calculated first in the case the user only wants them. This program replaces calc_md5sum_1m. Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com --- client/tests/kvm/calc_md5sum_1m.py | 21 -- client/tests/kvm/cd_hash.py| 54 2 files changed, 54 insertions(+), 21 deletions(-) delete mode 100755 client/tests/kvm/calc_md5sum_1m.py create mode 100755 client/tests/kvm/cd_hash.py diff --git a/client/tests/kvm/calc_md5sum_1m.py b/client/tests/kvm/calc_md5sum_1m.py deleted file mode 100755 index 153a1e0..000 --- a/client/tests/kvm/calc_md5sum_1m.py +++ /dev/null @@ -1,21 +0,0 @@ -#!/usr/bin/python - -Program that calculates the md5sum for the first megabyte of a file. -It's faster than calculating the md5sum for the whole ISO image. - -...@copyright: Red Hat 2008-2009 -...@author: Uri Lublin (u...@redhat.com) - - -import os, sys -import kvm_utils - - -if len(sys.argv) 2: -print 'usage: %s iso-filename' % sys.argv[0] -else: -fname = sys.argv[1] -if not os.access(fname, os.F_OK) or not os.access(fname, os.R_OK): -print 'bad file name or permissions' -else: -print kvm_utils.hash_file(fname, 1024*1024, method=md5) diff --git a/client/tests/kvm/cd_hash.py b/client/tests/kvm/cd_hash.py new file mode 100755 index 000..483d71c --- /dev/null +++ b/client/tests/kvm/cd_hash.py @@ -0,0 +1,54 @@ +#!/usr/bin/python + +Program that calculates several hashes for a given CD image. 
+ +...@copyright: Red Hat 2008-2009 + + +import os, sys, optparse, logging +import common, kvm_utils +from autotest_lib.client.common_lib import logging_config, logging_manager + + +class KvmLoggingConfig(logging_config.LoggingConfig): +def configure_logging(self, results_dir=None, verbose=False): +super(KvmLoggingConfig, self).configure_logging(use_console=True, +verbose=verbose) + +if __name__ == __main__: +parser = optparse.OptionParser() +parser.add_option('-i', '--iso', type=string, dest=filename, + action='store', + help='path to a ISO file whose hash string will be ' + 'evaluated.') + +options, args = parser.parse_args() +filename = options.filename + +logging_manager.configure_logging(KvmLoggingConfig()) + +if not filename: +parser.print_help() +sys.exit(1) + +filename = os.path.abspath(filename) + +file_exists = os.path.isfile(filename) +can_read_file = os.access(filename, os.R_OK) +if not file_exists: +logging.critical(File %s does not exist. Aborting..., filename) +sys.exit(1) +if not can_read_file: +logging.critical(File %s does not have read permissions. + Aborting..., filename) +sys.exit(1) + +logging.info(Hash values for file %s, os.path.basename(filename)) +logging.info(md5(1m): %s, kvm_utils.hash_file(filename, 1024*1024, +method=md5)) +logging.info(sha1 (1m): %s, kvm_utils.hash_file(filename, 1024*1024, +method=sha1)) +logging.info(md5 (full): %s, kvm_utils.hash_file(filename, +method=md5)) +logging.info(sha1 (full): %s, kvm_utils.hash_file(filename, +method=sha1)) -- 1.6.2.5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
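The idea behind hashing only the first megabyte (a quick fingerprint) versus the whole image can be sketched as follows. This is a minimal, standalone illustration written for Python 3 with `hashlib`; the `hash_file` function below is a hypothetical stand-in with a similar call shape to the `kvm_utils.hash_file(filename, size, method=...)` calls in the patch, and is not the actual autotest implementation. The `chunk_size` parameter is an assumption added for the sketch.

```python
import hashlib


def hash_file(path, size=None, method="md5", chunk_size=65536):
    """Return the hex digest of the first `size` bytes of `path`.

    If `size` is None, the whole file is hashed. `method` is any
    algorithm name accepted by hashlib.new(), e.g. "md5" or "sha1".
    """
    digest = hashlib.new(method)
    remaining = size
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            # Read in bounded chunks so large ISO images are not
            # loaded into memory at once.
            to_read = chunk_size if remaining is None else min(chunk_size, remaining)
            data = f.read(to_read)
            if not data:
                break  # end of file reached before `size` bytes
            digest.update(data)
            if remaining is not None:
                remaining -= len(data)
    return digest.hexdigest()
```

A call such as `hash_file("image.iso", 1024 * 1024, method="sha1")` would then give the 1 MB fingerprint, while omitting the size hashes the full file (the file name here is only a placeholder).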
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Oct 26, 2009, at 6:20 PM, Hollis Blanchard wrote: For some reason, I'm not seeing this build break, but the patch is obviously correct. Acked-by: Hollis Blanchard holl...@us.ibm.com I saw it when building with pasemi_defconfig + manually enabled KVM options (all available). -Olof
Virtio block module slower than IDE
Hi, I am running Proxmox 1.4 (which uses the 2.6.30.1 kvm modules) and am experiencing performance problems with Linux guests using the virtio_blk module. Especially with random IO it is a lot slower than IDE. Ubuntu 9.10 VM on LVM storage with VirtIO: === bonnie++ -s 16384 Version 1.03c --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP ubuntu910 16G 39209 96 45383 3 29984 6 33996 73 90472 8 636.5 1 --Sequential Create-- Random Create -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 + +++ + +++ + +++ 23837 56 + +++ + +++ ubuntu910,16G,39209,96,45383,3,29984,6,33996,73,90472,8,636.5,1,16,+,+++,+,+++,+,+++,23837,56,+,+++,+,+++ postmark set size 1 1000 set number 300 set transactions 2500 run PostMark v1.51 : 8/14/01 Creating files...Done Performing transactions..Done Deleting files...Done Time: 141 seconds total 122 seconds of transactions (20 per second) Files: 1540 created (10 per second) Creation alone: 300 files (17 per second) Mixed with transactions: 1240 files (10 per second) 1242 read (10 per second) 1258 appended (10 per second) 1540 deleted (10 per second) Deletion alone: 280 files (140 per second) Mixed with transactions: 1260 files (10 per second) Data: 7653.28 megabytes read (54.28 megabytes per second) 9534.76 megabytes written (67.62 megabytes per second) === Ubuntu 9.10 VM on LVM storage with IDE: === Version 1.03c --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP ubuntu910 16G 38796 97 63574 5 31138 7 34604 74 92490 8 2803 7 --Sequential Create-- Random Create -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 + +++ + +++ + +++ 23745 56 + +++ + 
+++ ubuntu910,16G,38796,97,63574,5,31138,7,34604,74,92490,8,2803.0,7,16,+,+++,+,+++,+,+++,23745,56,+,+++,+,+++ PostMark v1.51 : 8/14/01 Creating files...Done Performing transactions..Done Deleting files...Done Time: 126 seconds total 111 seconds of transactions (22 per second) Files: 1540 created (12 per second) Creation alone: 300 files (20 per second) Mixed with transactions: 1240 files (11 per second) 1242 read (11 per second) 1258 appended (11 per second) 1540 deleted (12 per second) Deletion alone: 280 files (280 per second) Mixed with transactions: 1260 files (11 per second) Data: 7653.28 megabytes read (60.74 megabytes per second) 9534.76 megabytes written (75.67 megabytes per second) === Configuration: dual quadcore Opteron 2350, Mtron 7000 solid state drive, 8 gb ram, 6 gb assigned to vm, swap disabled on both host and vm. KVM command line used by Proxmox for VirtIO: /usr/bin/kvm -monitor unix:/var/run/qemu-server/102.mon,server,nowait -vnc unix:/var/run/qemu-server/102.vnc,password -pidfile /var/run/qemu-server/102.pid -daemonize -usbdevice tablet -name ubuntu910 -smp sockets=1,cores=1 -boot cad -vga cirrus -tdf-drive file=/dev/vmstorage/vm-102-disk-1,if=virtio,index=0,boot=on -m 6000 -net user,vlan=1000,hostname=ubuntu910 -net nic,vlan=1000,model=rtl8139,macaddr=CE:14:D4:DC:2B:94 Also tried with Ubuntu 9.04 instead of 9.10, but the results are similar. Any idea what might be the problem? Yours sincerely, Floris Bos -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] virtio-net: fix data corruption with OOM
From: Michael S. Tsirkin m...@redhat.com Date: Mon, 26 Oct 2009 11:07:13 +0200 Another, and hopefully the last, note, is that git-am can only handle Subject/From lines at the beginning of the message. So git style of the mail would be ... I think it's weird. We could invent some kind of separator that would make git-am accept Subject/From/Date lines in the middle of the message, so that discussion can come before the description. Worth it? There is no need for this. patchwork handles this situation perfectly and this is what I use to apply all networking patches. Anything in a reply to a patch that looks like a signoff or ACK, patchwork adds to the commit message in the mbox blob it spits out for me.
Re: KVM: VMX: move CR3/PDPTR update to vmx_set_cr3
On Tuesday 27 October 2009 02:48:33 Marcelo Tosatti wrote: GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 is modified from outside guest context. Similarly pdptrs are updated via load_pdptrs. Let kvm_set_cr3 perform the update, removing it from the vcpu_run fast path. Looks fine to me. Acked-by: Sheng Yang sh...@linux.intel.com -- regards Yang, Sheng Signed-off-by: Marcelo Tosatti mtosa...@redhat.com Index: b/arch/x86/kvm/vmx.c === --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1748,6 +1748,7 @@ static void vmx_set_cr3(struct kvm_vcpu vmcs_write64(EPT_POINTER, eptp); guest_cr3 = is_paging(vcpu) ? vcpu-arch.cr3 : vcpu-kvm-arch.ept_identity_map_addr; + ept_load_pdptrs(vcpu); } vmx_flush_tlb(vcpu); @@ -3638,10 +3639,6 @@ static void vmx_vcpu_run(struct kvm_vcpu { struct vcpu_vmx *vmx = to_vmx(vcpu); - if (enable_ept is_paging(vcpu)) { - vmcs_writel(GUEST_CR3, vcpu-arch.cr3); - ept_load_pdptrs(vcpu); - } /* Record the guest's net vcpu time for enforced NMI injections. */ if (unlikely(!cpu_has_virtual_nmis() vmx-soft_vnmi_blocked)) vmx-entry_time = ktime_get(); Index: b/arch/x86/kvm/x86.c === --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4517,8 +4517,10 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct mmu_reset_needed |= vcpu-arch.cr4 != sregs-cr4; kvm_x86_ops-set_cr4(vcpu, sregs-cr4); - if (!is_long_mode(vcpu) is_pae(vcpu)) + if (!is_long_mode(vcpu) is_pae(vcpu)) { load_pdptrs(vcpu, vcpu-arch.cr3); + mmu_reset_needed = 1; + } if (mmu_reset_needed) kvm_mmu_reset_context(vcpu); -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [KVM PATCH v3 1/3] KVM: fix race in irq_routing logic
On Mon, Oct 26, 2009 at 12:21:57PM -0400, Gregory Haskins wrote: The current code suffers from the following race condition: thread-1thread-2 --- kvm_set_irq() { rcu_read_lock() irq_rt = rcu_dereference(table); rcu_read_unlock(); kvm_set_irq_routing() { mutex_lock(); irq_rt = table; rcu_assign_pointer(); mutex_unlock(); synchronize_rcu(); kfree(irq_rt); irq_rt-entry-set(); /* bad */ - Because the pointer is accessed outside of the read-side critical section. There are two basic patterns we can use to fix this bug: 1) Switch to sleeping-rcu and encompass the -set() access within the read-side critical section, OR 2) Add reference counting to the irq_rt structure, and simply acquire the reference from within the RSCS. This patch implements solution (1). Looks like a good transformation! A few questions interspersed below. Signed-off-by: Gregory Haskins ghask...@novell.com --- include/linux/kvm_host.h |6 +- virt/kvm/irq_comm.c | 50 +++--- virt/kvm/kvm_main.c |1 + 3 files changed, 35 insertions(+), 22 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index bd5a616..1fe135d 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -185,7 +185,10 @@ struct kvm { struct mutex irq_lock; #ifdef CONFIG_HAVE_KVM_IRQCHIP - struct kvm_irq_routing_table *irq_routing; + struct { + struct srcu_structsrcu; Each structure has its own SRCU domain. This is OK, but just asking if that is the intent. It does look like the SRCU primitives are passed a pointer to the correct structure, and that the return value from srcu_read_lock() gets passed into the matching srcu_read_unlock() like it needs to be, so that is good. + struct kvm_irq_routing_table *table; + } irq_routing; struct hlist_head mask_notifier_list; struct hlist_head irq_ack_notifier_list; #endif [ . . . ] @@ -155,21 +156,19 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level) * IOAPIC. So set the bit in both. The guest will ignore * writes to the unused one. 
*/ - rcu_read_lock(); - irq_rt = rcu_dereference(kvm-irq_routing); + idx = srcu_read_lock(kvm-irq_routing.srcu); + irq_rt = rcu_dereference(kvm-irq_routing.table); if (irq irq_rt-nr_rt_entries) - hlist_for_each_entry(e, n, irq_rt-map[irq], link) - irq_set[i++] = *e; - rcu_read_unlock(); + hlist_for_each_entry(e, n, irq_rt-map[irq], link) { What prevents the above list from changing while we are traversing it? (Yes, presumably whatever was preventing it from changing before this patch, but what?) Mostly kvm-lock is held, but not always. And if kvm-lock were held all the time, there would be no point in using SRCU. ;-) + int r; - while(i--) { - int r; - r = irq_set[i].set(irq_set[i], kvm, irq_source_id, level); - if (r 0) - continue; + r = e-set(e, kvm, irq_source_id, level); + if (r 0) + continue; - ret = r + ((ret 0) ? 0 : ret); - } + ret = r + ((ret 0) ? 0 : ret); + } + srcu_read_unlock(kvm-irq_routing.srcu, idx); return ret; } @@ -179,17 +178,18 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin) struct kvm_irq_ack_notifier *kian; struct hlist_node *n; int gsi; + int idx; trace_kvm_ack_irq(irqchip, pin); - rcu_read_lock(); - gsi = rcu_dereference(kvm-irq_routing)-chip[irqchip][pin]; + idx = srcu_read_lock(kvm-irq_routing.srcu); + gsi = rcu_dereference(kvm-irq_routing.table)-chip[irqchip][pin]; if (gsi != -1) hlist_for_each_entry_rcu(kian, n, kvm-irq_ack_notifier_list, link) And same question here -- what keeps the above list from changing while we are traversing it? Thanx, Paul -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM Test: Add re.IGNORECASE to re.compile to verify_ip_address_ in kvm_utils.py
Since the MAC address is (changed to) lowercase and the output of
'arping' is in uppercase, we need re.IGNORECASE in re.compile().
(Passing re.IGNORECASE to the search call has no effect on the
compiled regex.)

Signed-off-by: Cao, Chen k...@redhat.com
---
 client/tests/kvm/kvm_utils.py |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index f72984a..934f223 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -190,7 +190,7 @@ def verify_ip_address_ownership(ip, macs, timeout=10.0):
     # Compile a regex that matches the given IP address and any of the given
     # MAC addresses
     mac_regex = "|".join("(%s)" % mac for mac in macs)
-    regex = re.compile(r"\b%s\b.*\b(%s)\b" % (ip, mac_regex))
+    regex = re.compile(r"\b%s\b.*\b(%s)\b" % (ip, mac_regex), re.IGNORECASE)

     # Check the ARP cache
     o = commands.getoutput("/sbin/arp -n")
--
1.6.0.6
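For readers who want to see the effect of the flag in isolation, here is a minimal sketch. The helper name build_arp_regex(), the use of re.escape(), and the sample ARP line are assumptions made for illustration; they are not part of kvm_utils.py.

```python
import re


def build_arp_regex(ip, macs, ignore_case=True):
    """Hypothetical helper: match `ip` followed by any of `macs` on one line."""
    # re.escape() is an addition for this sketch; the patch interpolates
    # the strings directly.
    mac_regex = "|".join("(%s)" % re.escape(mac) for mac in macs)
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile(r"\b%s\b.*\b(%s)\b" % (re.escape(ip), mac_regex), flags)


# Made-up line in the style of ARP output, with an uppercase MAC address.
sample_line = "192.168.122.10 ether 52:54:00:AB:CD:EF C virbr0"
lowercase_macs = ["52:54:00:ab:cd:ef"]

with_flag = build_arp_regex("192.168.122.10", lowercase_macs)
without_flag = build_arp_regex("192.168.122.10", lowercase_macs,
                               ignore_case=False)

print(bool(with_flag.search(sample_line)))
print(bool(without_flag.search(sample_line)))
```

The flag has to be given at compile time because a compiled pattern carries its own flags; on a compiled pattern's search() method the second positional argument is a start position, not a flags value.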
[PATCH 1/2] KVM-test: Add execute permission to qemu-ifup script
qemu-ifup is a script for setting up the network bridge. Without
execute permission, we always hit this problem:

autotest/client/tests/kvm/scripts/qemu-ifup: could not launch network script
Could not initialize device 'tap

Signed-off-by: Amos Kong ak...@redhat.com
---
 0 files changed, 0 insertions(+), 0 deletions(-)
 mode change 100644 => 100755 client/tests/kvm/scripts/qemu-ifup

diff --git a/client/tests/kvm/scripts/qemu-ifup b/client/tests/kvm/scripts/qemu-ifup
old mode 100644
new mode 100755
--
1.5.5.6
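The same 100644 to 100755 change can be checked or applied from Python. This is only a sketch under assumptions: ensure_executable() is a hypothetical helper, not something in the autotest tree, and it is exercised here on a temporary file rather than on the real qemu-ifup script.

```python
import os
import stat
import tempfile


def ensure_executable(path):
    """Hypothetical helper: add an execute bit wherever a read bit is set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Shift the read bits (0o444) down to the execute positions (0o111),
    # so 0o644 becomes 0o755.
    exec_bits = (mode & 0o444) >> 2
    os.chmod(path, mode | exec_bits)
    return os.access(path, os.X_OK)


# Demonstrate on a scratch file that starts out as mode 0644.
fd, scratch_path = tempfile.mkstemp()
os.close(fd)
os.chmod(scratch_path, 0o644)
now_executable = ensure_executable(scratch_path)
final_mode = stat.S_IMODE(os.stat(scratch_path).st_mode)
os.remove(scratch_path)
```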
[PATCH 2/2] KVM-test: Test 802.1Q vlan of nic
Test 802.1Q vlan of nic, config it by vconfig command.
  1) Create two VMs
  2) Setup guests in different vlan by vconfig and test communication by ping
     using hard-coded ip address
  3) Setup guests in same vlan and test communication by ping
  4) Recover the vlan config

The subnet of vlan can be setup in configure file.

Signed-off-by: Amos Kong ak...@redhat.com
---
 client/tests/kvm/kvm_tests.cfg.sample |   12 ++
 client/tests/kvm/tests/vlan_tag.py    |   68 +
 2 files changed, 80 insertions(+), 0 deletions(-)
 create mode 100644 client/tests/kvm/tests/vlan_tag.py

diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
index 573206c..7f9512a 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -157,6 +157,18 @@ variants:
         used_cpus = 5
         used_mem = 2560

+    - vlan_tag:  install setup
+        type = vlan_tag
+        # subnet2 should not be used by host
+        subnet2 = 192.168.123
+        vlans = 10 20
+        nic_mode = tap
+        vms += " vm2"
+        extra_params_vm1 += " -snapshot"
+        extra_params_vm2 += " -snapshot"
+        kill_vm_gracefully_vm2 = no
+        address_index_vm2 = 1
+
     - autoit:       install setup
         type = autoit
         autoit_binary = D:\AutoIt3.exe
diff --git a/client/tests/kvm/tests/vlan_tag.py b/client/tests/kvm/tests/vlan_tag.py
new file mode 100644
index 000..ada919f
--- /dev/null
+++ b/client/tests/kvm/tests/vlan_tag.py
@@ -0,0 +1,68 @@
+import logging, time
+from autotest_lib.client.common_lib import error
+import kvm_subprocess, kvm_test_utils, kvm_utils
+
+def run_vlan_tag(test, params, env):
+    """
+    Test 802.1Q vlan of nic, config it by vconfig command.
+
+    1) Create two VMs
+    2) Setup guests in different vlan by vconfig and test communication by ping
+       using hard-coded ip address
+    3) Setup guests in same vlan and test communication by ping
+    4) Recover the vlan config
+
+    @param test: Kvm test object
+    @param params: Dictionary with the test parameters.
+    @param env: Dictionary with test environment.
+    """
+
+    vm = []
+    session = []
+    subnet2 = params.get("subnet2")
+    vlans = params.get("vlans").split()
+
+    vm.append(kvm_test_utils.get_living_vm(env, params.get("main_vm")))
+    vm.append(kvm_test_utils.get_living_vm(env, "vm2"))
+
+    if not vm[1].create():
+        raise error.TestError("VM 1 create faild")
+
+    for i in range(2):
+        session.append(kvm_test_utils.wait_for_login(vm[i]))
+
+    try:
+        vconfig_cmd = "vconfig add eth0 %s;ifconfig eth0.%s %s.%s"
+        # Attempt to configure IPs for the VMs and record the results in
+        # boolean variables
+        # Make vm1 and vm2 in the different vlan
+
+        ip_config_vm1_ok = (session[0].get_command_status(vconfig_cmd
+                            % (vlans[0], vlans[0], subnet2, 11)) == 0)
+        ip_config_vm2_ok = (session[1].get_command_status(vconfig_cmd
+                            % (vlans[1], vlans[1], subnet2, 12)) == 0)
+        if not ip_config_vm1_ok or not ip_config_vm2_ok:
+            raise error.TestError, "Fail to config VMs ip address"
+        ping_diff_vlan_ok = (session[0].get_command_status(
+                             "ping -c 2 -I eth0.%s %s.12" % (vlans[0], subnet2)) == 0)
+
+        if ping_diff_vlan_ok:
+            raise error.TestFail("VM 2 is unexpectedly pingable in different "
+                                 "vlan")
+        # Make vm2 in the same vlan with vm1
+        vlan_config_vm2_ok = (session[1].get_command_status(
+                              "vconfig rem eth0.%s;vconfig add eth0 %s;"
+                              "ifconfig eth0.%s %s.12" %
+                              (vlans[1], vlans[0], vlans[0], subnet2)) == 0)
+        if not vlan_config_vm2_ok:
+            raise error.TestError, "Fail to config ip address of VM 2"
+
+        ping_same_vlan_ok = (session[0].get_command_status(
+                             "ping -c 2 -I eth0.%s %s.12" % (vlans[0], subnet2)) == 0)
+        if not ping_same_vlan_ok:
+            raise error.TestFail("Fail to ping the guest in same vlan")
+    finally:
+        # Clean the vlan config
+        for i in range(2):
+            session[i].get_command_status("vconfig rem eth0.%s" % vlans[0])
+            session[i].close()
--
1.5.5.6
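To make the guest-side commands in the patch above easier to follow, the sketch below expands the same format strings outside of autotest. The helper names vlan_setup_cmd() and ping_cmd() are hypothetical and do not appear in the patch; the values come from the sample config (vlans 10 and 20, subnet2 192.168.123).

```python
# Format string copied from the patch: create the VLAN interface on eth0,
# then give it an address in the configured subnet.
VCONFIG_CMD = "vconfig add eth0 %s;ifconfig eth0.%s %s.%s"


def vlan_setup_cmd(vlan, subnet, host):
    """Hypothetical helper: command that puts a guest into `vlan`."""
    return VCONFIG_CMD % (vlan, vlan, subnet, host)


def ping_cmd(vlan, subnet, host, count=2):
    """Hypothetical helper: ping `host` through the eth0.<vlan> interface."""
    return "ping -c %d -I eth0.%s %s.%s" % (count, vlan, subnet, host)


vlans = "10 20".split()
subnet2 = "192.168.123"

# Step 2 of the test: guests in different VLANs, ping expected to fail.
vm1_setup = vlan_setup_cmd(vlans[0], subnet2, 11)
vm2_setup = vlan_setup_cmd(vlans[1], subnet2, 12)
vm1_ping = ping_cmd(vlans[0], subnet2, 12)

for cmd in (vm1_setup, vm2_setup, vm1_ping):
    print(cmd)
```

Both guests get addresses in the same subnet, so only the VLAN tag decides whether the ping from VM 1 reaches VM 2.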
Re: [Autotest] [PATCH] Test 802.1Q vlan of nic
On Wed, Oct 21, 2009 at 06:37:56PM +0800, Amos Kong wrote:
> On Tue, Oct 20, 2009 at 09:19:50AM -0400, Michael Goldish wrote:
> > See comments below.
>
> Hi all,
> Thanks for your reply.
> .
> Agree with you. When I tested this case, the original
> get_command_status() always caused a special read problem, so I used
> sendline(). I'll replace sendline() with get_command_status() later.
>
> > Other than these minor issues the test looks good.
>
> I'll re-send another patch later. Thanks again!

Hello all,

Execute on VM1:

    ping -c 2 -I eth0.10 IP_Address_eth0.10_VM2

We can use the -I option to select the interface ping uses, so there is
no need to put eth0.10 and eth0 in different subnets.
But eth0 and eth0.10 have the same MAC address, so eth0.10 cannot get an
address by DHCP. If we assign the address in the code, it may collide
with addresses used by others. This method is no better than setting
subnet2 in the config file, so I'll send a new version first.

Any suggestions are welcome :)

Best Regards,
Amos

--
Amos Kong
Quality Engineer
Raycom Office(Beijing), Red Hat Inc.
Phone: +86-10-62608183
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Sun, 2009-10-25 at 15:01 +0200, Avi Kivity wrote:
> On 10/23/2009 02:33 AM, Hollis Blanchard wrote:
> > On Wed, 2009-10-21 at 17:03 +0200, Alexander Graf wrote:
> > > KVM for PowerPC only supports embedded cores at the moment.
> > >
> > > While it makes sense to virtualize on small machines, it's even more fun
> > > to do so on big boxes. So I figured we need KVM for PowerPC64 as well.
> > >
> > > This patchset implements KVM support for Book3s_64 hosts and guest support
> > > for Book3s_64 and G3/G4.
> >
> > Acked-by: Hollis Blanchard holl...@us.ibm.com
> >
> > Avi, please apply these patches
>
> I still need acks for the arch/powerpc/{kernel,mm} bits, simple as they
> are, from the powerpc maintainers.

OK, BenH says they're on his todo list.

In the meantime, please apply patch #2, because it fixes the broken qemu
build.

--
Hollis Blanchard
IBM Linux Technology Center
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Mon, 2009-10-26 at 18:06 -0500, Olof Johansson wrote:
> Not sure which patch in the series this is needed for since I applied
> them all, but I got:
>
>   CC      arch/powerpc/kvm/timing.o
> arch/powerpc/kvm/timing.c:205: error: 'THIS_MODULE' undeclared here (not in a function)
>
> Signed-off-by: Olof Johansson o...@lixom.net
>
> diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
> index 2aa371e..7037855 100644
> --- a/arch/powerpc/kvm/timing.c
> +++ b/arch/powerpc/kvm/timing.c
> @@ -23,6 +23,7 @@
>  #include <linux/seq_file.h>
>  #include <linux/debugfs.h>
>  #include <linux/uaccess.h>
> +#include <linux/module.h>
>
>  #include <asm/time.h>
>  #include <asm-generic/div64.h>

For some reason, I'm not seeing this build break, but the patch is
obviously correct.

Acked-by: Hollis Blanchard holl...@us.ibm.com

--
Hollis Blanchard
IBM Linux Technology Center
Re: [PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v5
On Oct 26, 2009, at 6:20 PM, Hollis Blanchard wrote:
> For some reason, I'm not seeing this build break, but the patch is
> obviously correct.
>
> Acked-by: Hollis Blanchard holl...@us.ibm.com

I saw it when building with pasemi_defconfig + manually enabled KVM
options (all available).

-Olof