[COMMIT master] KVM: Trivial format fix in setup_routing_entry()
From: Chris Wright chr...@sous-sol.org

Remove extra tab.

Signed-off-by: Chris Wright chr...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 4fa1f60..a8bd466 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -271,7 +271,7 @@ static int setup_routing_entry(struct kvm_kernel_irq_routing_entry *e,
 			delta = 8;
 			break;
 		case KVM_IRQCHIP_IOAPIC:
-				e->set = kvm_set_ioapic_irq;
+			e->set = kvm_set_ioapic_irq;
 			break;
 		default:
 			goto out;
--
To unsubscribe from this list: send the line unsubscribe kvm-commits in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[COMMIT master] KVM: Fix NX support reporting
From: Avi Kivity a...@redhat.com

NX support is bit 20, not bit 1.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5fcde2c..8b0d777 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1266,7 +1266,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
 		bit(X86_FEATURE_MMX) | bit(X86_FEATURE_FXSR) |
 		bit(X86_FEATURE_SYSCALL) |
-		(bit(X86_FEATURE_NX) && is_efer_nx()) |
+		(is_efer_nx() ? bit(X86_FEATURE_NX) : 0) |
 #ifdef CONFIG_X86_64
 		bit(X86_FEATURE_LM) |
 #endif
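The effect of this fix can be illustrated outside the kernel. The sketch below is not the kernel code: it assumes the removed expression joined its two operands with a logical AND (the operator is lost in this archive, but it matches the "bit 1" wording of the commit message), and it hard-codes two architectural values as assumptions, NX as bit 20 of the CPUID feature word and EFER.NXE as bit 11. The helper names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define FEATURE_NX_BIT 20u          /* NX position in the CPUID feature word */
#define EFER_NX        (1ull << 11) /* NXE enable bit in the EFER MSR */

/* Build a single-bit mask, as the kernel's bit() helper does. */
static uint32_t bit(unsigned int n)
{
    assert(n < 32);
    return (uint32_t)1 << n;
}

/* Hypothetical stand-in for is_efer_nx(): non-zero when NX is enabled. */
static int efer_has_nx(uint64_t efer)
{
    return (efer & EFER_NX) != 0;
}

/* Assumed old form: a logical AND collapses to 0 or 1, so the result
 * lands in the lowest bit instead of bit 20. */
static uint32_t nx_mask_logical_and(uint64_t efer)
{
    return (uint32_t)(bit(FEATURE_NX_BIT) && efer_has_nx(efer));
}

/* Fixed form: select the whole feature bit, or nothing. */
static uint32_t nx_mask_select(uint64_t efer)
{
    return efer_has_nx(efer) ? bit(FEATURE_NX_BIT) : 0;
}
```

With NX enabled, the first helper yields 1 while the second yields 0x100000, which is the mask the CPUID code needs to OR into the feature word.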
[COMMIT master] KVM: Make EFER reads safe when EFER does not exist
From: Avi Kivity a...@redhat.com

Some processors don't have EFER; don't oops if userspace wants us to
read EFER when we check NX.

Cc: sta...@kernel.org
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8b0d777..2d7082c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1128,9 +1128,9 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 
 static int is_efer_nx(void)
 {
-	u64 efer;
+	unsigned long long efer = 0;
 
-	rdmsrl(MSR_EFER, efer);
+	rdmsrl_safe(MSR_EFER, &efer);
 	return efer & EFER_NX;
 }
 
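The pattern in this patch (initialize the result, then use a read that can fail without faulting) can be sketched in user space. This is only an illustration under stated assumptions: read_efer_safe() is a hypothetical stand-in for rdmsrl_safe(), the cpu_has_efer and hw_value parameters simulate the hardware, and EFER_NX is assumed to be bit 11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFER_NX (1ull << 11) /* assumed: NXE enable bit in EFER */

/* Hypothetical stand-in for rdmsrl_safe(): returns 0 and stores the value
 * on success; returns a negative code and leaves *val untouched when the
 * register does not exist. */
static int read_efer_safe(int cpu_has_efer, uint64_t hw_value, uint64_t *val)
{
    assert(val != NULL);
    if (!cpu_has_efer)
        return -1;
    *val = hw_value;
    return 0;
}

/* Mirrors the shape of the patched is_efer_nx(): the zero initializer is
 * what makes a failed read report "NX not enabled". */
static int is_efer_nx(int cpu_has_efer, uint64_t hw_value)
{
    uint64_t efer = 0;

    (void)read_efer_safe(cpu_has_efer, hw_value, &efer);
    return (efer & EFER_NX) != 0;
}
```

Without the initializer, a failed read would leave efer indeterminate, so both halves of the change are needed together.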
[COMMIT master] Increment virtio-net savevm version to avoid conflict with upstream QEMU.
From: Anthony Liguori aligu...@us.ibm.com

When TAP_VNET_HDR eventually merges into upstream QEMU, it cannot change the
format of the version 6 savevm data.  This means that we're going to have to
bump the version up to 7.

I'm happy to reserve version 7 as having TAP_VNET_HDR support to allow time to
include this support in upstream QEMU.  For those shipping products based on
KVM though, it's important that we do not conflict with upstream QEMU
versioning or else it's going to result in breakage of backwards compatibility.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 5f5f2f3..ac8e030 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -21,7 +21,12 @@
 
 #define TAP_VNET_HDR
 
-#define VIRTIO_NET_VM_VERSION    6
+/* Version 7 has TAP_VNET_HDR support.  This is reserved in upstream QEMU to
+ * avoid future conflict.
+ * We can't assume verisons > 7 have TAP_VNET_HDR support until this is merged
+ * in upstream QEMU.
+ */
+#define VIRTIO_NET_VM_VERSION    7
 
 #define MAC_TABLE_ENTRIES    32
 #define MAX_VLAN    (1 << 12)   /* Per 802.1Q definition */
@@ -652,8 +657,9 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
         qemu_get_buffer(f, (uint8_t *)n->vlans, MAX_VLAN >> 3);
 
 #ifdef TAP_VNET_HDR
-    if (qemu_get_be32(f))
+    if (version_id == 7 && qemu_get_be32(f)) {
         tap_using_vnet_hdr(n->vc->vlan->first_client, 1);
+    }
 #endif
 
     if (n->tx_timer_active) {
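The load-side change relies on the version check short-circuiting, so that streams saved with other versions never consume the extra field. A minimal sketch of that idea follows; it does not use QEMU's API. struct stream and get_be32() are hypothetical stand-ins for QEMUFile and qemu_get_be32(), and the operator between the two conditions is assumed to be a logical AND.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VNET_HDR_VERSION 7 /* version reserved for the extra field */

/* Hypothetical in-memory stream standing in for QEMUFile. */
struct stream {
    const uint8_t *data;
    size_t len;
    size_t pos;
};

/* Read a big-endian 32-bit value; returns 0 if fewer than 4 bytes remain. */
static uint32_t get_be32(struct stream *s)
{
    uint32_t v = 0;

    assert(s != NULL && s->pos <= s->len);
    if (s->len - s->pos < 4)
        return 0;
    for (int i = 0; i < 4; i++)
        v = (v << 8) | s->data[s->pos++];
    return v;
}

/* Returns 1 when the saved state says the vnet header is in use.
 * The version test comes first, so other versions leave the stream
 * position untouched. */
static int load_vnet_hdr_flag(struct stream *s, int version_id)
{
    if (version_id == VNET_HDR_VERSION && get_be32(s))
        return 1;
    return 0;
}
```

Loading a version 6 stream through load_vnet_hdr_flag() leaves pos unchanged, which is why the old on-disk format stays readable.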
[COMMIT master] Remove -cpu-vendor-string
From: Anthony Liguori aligu...@us.ibm.com Superceded by qemu '-cpu vendor=...' option. Signed-off-by: Anthony Liguori aligu...@us.ibm.com Signed-off-by: Avi Kivity a...@redhat.com diff --git a/linux-user/main.c b/linux-user/main.c index 5967fa3..dc39b05 100644 --- a/linux-user/main.c +++ b/linux-user/main.c @@ -43,7 +43,6 @@ int singlestep; static const char *interp_prefix = CONFIG_QEMU_PREFIX; const char *qemu_uname_release = CONFIG_UNAME_RELEASE; -const char *cpu_vendor_string = NULL; #if defined(__i386__) !defined(CONFIG_STATIC) /* Force usage of an ELF interpreter even if it is an ELF shared diff --git a/qemu-options.hx b/qemu-options.hx index a11ead9..f7d83c9 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -1545,8 +1545,6 @@ DEF(pcidevice, HAS_ARG, QEMU_OPTION_pcidevice, #endif DEF(enable-nesting, 0, QEMU_OPTION_enable_nesting, -enable-nesting enable support for running a VM inside the VM (AMD only)\n) -DEF(cpu-vendor, HAS_ARG, QEMU_OPTION_cpu_vendor, --cpu-vendor STRING override the cpuid vendor string\n) DEF(nvram, HAS_ARG, QEMU_OPTION_nvram, -nvram FILE provide ia64 nvram contents\n) DEF(tdf, 0, QEMU_OPTION_tdf, diff --git a/target-i386/helper.c b/target-i386/helper.c index 24fcea8..7440983 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -91,8 +91,6 @@ static void add_flagname_to_bitmaps(char *flagname, uint32_t *features, fprintf(stderr, CPU feature %s not found\n, flagname); } -extern const char *cpu_vendor_string; - typedef struct x86_def_t { const char *name; uint32_t level; @@ -431,9 +429,6 @@ static int cpu_x86_register (CPUX86State *env, const char *cpu_model) { const char *model_id = def-model_id; int c, len, i; - -if (cpu_vendor_string != NULL) -model_id = cpu_vendor_string; if (!model_id) model_id = ; len = strlen(model_id); diff --git a/vl.c b/vl.c index fbc84a7..38f208a 100644 --- a/vl.c +++ b/vl.c @@ -268,7 +268,6 @@ const char *mem_path = NULL; int mem_prealloc = 1; /* force preallocation of physical target memory */ 
#endif long hpagesize = 0; -const char *cpu_vendor_string; #ifdef TARGET_ARM int old_param = 0; #endif @@ -5216,9 +5215,6 @@ int main(int argc, char **argv, char **envp) nb_prom_envs++; break; #endif - case QEMU_OPTION_cpu_vendor: - cpu_vendor_string = optarg; - break; #ifdef TARGET_ARM case QEMU_OPTION_old_param: old_param = 1; -- To unsubscribe from this list: send the line unsubscribe kvm-commits in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[COMMIT master] Don't clean *.dtb
From: Anthony Liguori aligu...@us.ibm.com

*.dtb is under source control, so it shouldn't be deleted.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Acked-by: Hollis Blanchard holl...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/pc-bios/Makefile b/pc-bios/Makefile
index dabeb4c..315288d 100644
--- a/pc-bios/Makefile
+++ b/pc-bios/Makefile
@@ -16,4 +16,4 @@ all: $(TARGETS)
 	dtc -I dts -O dtb -o $@ $<
 
 clean:
-	rm -f $(TARGETS) *.o *~ *.dtb
+	rm -f $(TARGETS) *.o *~
[COMMIT master] Increment version id for CPU save state
From: Anthony Liguori aligu...@us.ibm.com

9 is reserved for KVM.  KVM cannot support migration from any other version.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index f054af1..af0ee18 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -838,7 +838,9 @@ static inline int cpu_get_time_fast(void)
 #define cpu_signal_handler cpu_x86_signal_handler
 #define cpu_list x86_cpu_list
 
-#define CPU_SAVE_VERSION 8
+/* CPU_SAVE_VERSION 9 is reserved for KVM.  This is to avoid breakage as KVM
+ * merges into upstream QEMU */
+#define CPU_SAVE_VERSION 9
 
 /* MMU modes definitions */
 #define MMU_MODE0_SUFFIX _kernel
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 399204d..7f75d31 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -196,6 +196,9 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
     if (version_id != 3 && version_id != 4 && version_id != 5
         && version_id != 6 && version_id != 7 && version_id != 8)
         return -EINVAL;
+    /* KVM cannot accept migrations from QEMU today */
+    if (version_id != 9)
+        return -EINVAL;
     for(i = 0; i < CPU_NB_REGS; i++)
         qemu_get_betls(f, &env->regs[i]);
     qemu_get_betls(f, &env->eip);
[COMMIT master] Fix build when objdir != srcdir
From: Anthony Liguori aligu...@us.ibm.com This requires adding the necessary bits to configure to create the directories and symlinks for libkvm. It also requires sticking KVM_CFLAGS in config-host.mak to ensure that it gets the right set of includes for the kernel headers. Signed-off-by: Anthony Liguori aligu...@us.ibm.com Signed-off-by: Avi Kivity a...@redhat.com diff --git a/configure b/configure index f59476d..7bc7f22 100755 --- a/configure +++ b/configure @@ -520,7 +520,7 @@ if test $werror = yes ; then CFLAGS=$CFLAGS -Werror fi -CFLAGS=$CFLAGS -I$(readlink -f kvm/libkvm) +CFLAGS=$CFLAGS -I$(readlink -f $source_path/kvm/libkvm) if test $solaris = no ; then if ld --version 2/dev/null | grep GNU ld /dev/null 2/dev/null ; then @@ -1788,6 +1788,11 @@ bsd) ;; esac +# this is a temp hack needed for libkvm +if test $kvm = yes ; then +echo KVM_CFLAGS=$kvm_cflags $config_mak +fi + tools= if test `expr $target_list : .*softmmu.*` != 0 ; then tools=qemu-img\$(EXESUF) $tools @@ -2165,10 +2170,11 @@ done # for target in $targets # build tree in object directory if source path is different from current one if test $source_path_used = yes ; then -DIRS=tests tests/cris slirp audio +DIRS=tests tests/cris slirp audio kvm/libkvm FILES=Makefile tests/Makefile FILES=$FILES tests/cris/Makefile tests/cris/.gdbinit FILES=$FILES tests/test-mmap.c +FILES=$FILES kvm/libkvm/Makefile for dir in $DIRS ; do mkdir -p $dir done diff --git a/kvm/libkvm/Makefile b/kvm/libkvm/Makefile index 727ce48..8de7eaf 100644 --- a/kvm/libkvm/Makefile +++ b/kvm/libkvm/Makefile @@ -1,5 +1,12 @@ include ../../config-host.mak -include config-$(ARCH).mak +ifneq ($(VPATH),) +srcdir=$(VPATH)/kvm/libkvm +else +srcdir=. 
+endif + +include $(srcdir)/config-$(ARCH).mak + # libkvm is not -Wredundant-decls friendly yet CFLAGS += -Wno-redundant-decls @@ -18,6 +25,8 @@ LDFLAGS += $(CFLAGS) CXXFLAGS = $(autodepend-flags) +VPATH:=$(VPATH)/kvm/libkvm + autodepend-flags = -MMD -MF $(dir $*).$(notdir $*).d -- To unsubscribe from this list: send the line unsubscribe kvm-commits in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[COMMIT master] Rename config-powerpc to config-ppc
From: Hollis Blanchard holl...@us.ibm.com

Apparently $(ARCH) now holds the qemu meaning, rather than the KVM meaning.

Signed-off-by: Hollis Blanchard holl...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/kvm/libkvm/config-powerpc.mak b/kvm/libkvm/config-ppc.mak
similarity index 100%
rename from kvm/libkvm/config-powerpc.mak
rename to kvm/libkvm/config-ppc.mak
[COMMIT master] Fix format of package version for KVM
From: Anthony Liguori aligu...@us.ibm.com

We currently show 0.10.50kvm-devel whereas pkgversion normally would show
0.10.50 (kvm-devel).  This is due to some weirdness in how pkgversion is
constructed in configure.  This corrects the version display.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/configure b/configure
index 09b61a6..a9a6756 100755
--- a/configure
+++ b/configure
@@ -194,7 +194,7 @@ blobs=yes
 fdt=yes
 sdl_x11=no
 xen=yes
-pkgversion="kvm-devel"
+pkgversion=" (kvm-devel)"
 signalfd=no
 eventfd=no
 cpu_emulation=yes
[COMMIT master] Fix missing prototype warning.
From: Hollis Blanchard holl...@us.ibm.com

As far as I can see, kvm_destroy_memory_region_works() has nothing to do with
KVM_CAP_DEVICE_ASSIGNMENT, so move the prototype outside that ifdef block.

Signed-off-by: Hollis Blanchard holl...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/kvm/libkvm/libkvm.h b/kvm/libkvm/libkvm.h
index ce6f054..c23d37b 100644
--- a/kvm/libkvm/libkvm.h
+++ b/kvm/libkvm/libkvm.h
@@ -739,6 +739,7 @@ int kvm_assign_irq(kvm_context_t kvm,
 int kvm_deassign_irq(kvm_context_t kvm,
                      struct kvm_assigned_irq *assigned_irq);
 #endif
+#endif
 
 /*!
  * \brief Determines whether destroying memory regions is allowed
@@ -748,7 +749,6 @@ int kvm_deassign_irq(kvm_context_t kvm,
  * \param kvm Pointer to the current kvm_context
  */
 int kvm_destroy_memory_region_works(kvm_context_t kvm);
-#endif
 
 #ifdef KVM_CAP_DEVICE_DEASSIGNMENT
 /*!
[COMMIT master] Fix warning when __ia64__ is not defined.
From: Hollis Blanchard holl...@us.ibm.com

Signed-off-by: Hollis Blanchard holl...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/kvm/libkvm/kvm-common.h b/kvm/libkvm/kvm-common.h
index 96361e8..591fb53 100644
--- a/kvm/libkvm/kvm-common.h
+++ b/kvm/libkvm/kvm-common.h
@@ -22,7 +22,7 @@
 #define KVM_MAX_NUM_MEM_REGIONS 1u
 #define MAX_VCPUS 64
 #define LIBKVM_S390_ORIGIN (0UL)
-#elif __ia64__
+#elif defined(__ia64__)
 #define KVM_MAX_NUM_MEM_REGIONS 32u
 #define MAX_VCPUS 256
 #else
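The commit message is empty, so the reasoning here is an inference: in a preprocessor expression an identifier that is not defined evaluates to 0, and compilers can warn about that (GCC does so under -Wundef), whereas defined() only tests for existence. The sketch below shows the warning-free form. The DEMO_ names and the fallback value are made up for the example and are not taken from kvm-common.h.

```c
#include <assert.h>

/* Hypothetical per-architecture limit, selected with defined() so that
 * the test is well-formed on hosts where the macro does not exist. */
#if defined(__s390__)
#define DEMO_MAX_VCPUS 64
#elif defined(__ia64__) /* no evaluation of an undefined identifier */
#define DEMO_MAX_VCPUS 256
#else
#define DEMO_MAX_VCPUS 16 /* made-up fallback for illustration */
#endif

static int demo_max_vcpus(void)
{
    /* Sanity check: every branch above must define a positive limit. */
    assert(DEMO_MAX_VCPUS > 0);
    return DEMO_MAX_VCPUS;
}
```

Writing the middle branch as `#elif __ia64__` would select the same value on every host, but it asks the preprocessor to evaluate a name that may not exist.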
[COMMIT stable-0.10] Read kvm version from KVM_VERSION file
From: Avi Kivity a...@redhat.com This allows the packager to add a KVM_VERSION file to the tarball instead of modifying the source. Signed-off-by: Avi Kivity a...@redhat.com diff --git a/configure b/configure index 0cfdf7b..04e072b 100755 --- a/configure +++ b/configure @@ -152,6 +152,17 @@ case $cpu in cpu=unknown ;; esac + +kvm_version() { +local fname=$(dirname $0)/KVM_VERSION + +if test -f $fname; then +cat $fname +else +echo kvm-devel +fi +} + gprof=no sparse=no bigendian=no @@ -190,6 +201,7 @@ aix=no blobs=yes fdt=yes sdl_x11=no +pkgversion=$(kvm_version) signalfd=no eventfd=no cpu_emulation=yes @@ -1474,7 +1486,7 @@ fi qemu_version=`head $source_path/VERSION` echo VERSION=$qemu_version $config_mak echo #define QEMU_VERSION \$qemu_version\ $config_h -echo #define KVM_VERSION \kvm-devel\ $config_h +echo #define KVM_VERSION \${pkgversion}\ $config_h echo SRC_PATH=$source_path $config_mak if [ $source_path_used = yes ]; then -- To unsubscribe from this list: send the line unsubscribe kvm-commits in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[COMMIT master] patch add_powerpc_kvm_headers.diff
From: Hollis Blanchard holl...@us.ibm.com Signed-off-by: Avi Kivity a...@redhat.com diff --git a/kvm/kernel/arch/powerpc/include/asm/kvm.h b/kvm/kernel/arch/powerpc/include/asm/kvm.h new file mode 100644 index 000..c4f1ed1 --- /dev/null +++ b/kvm/kernel/arch/powerpc/include/asm/kvm.h @@ -0,0 +1,102 @@ +#ifndef KVM_UNIFDEF_H +#define KVM_UNIFDEF_H + +#ifdef __i386__ +#ifndef CONFIG_X86_32 +#define CONFIG_X86_32 1 +#endif +#endif + +#ifdef __x86_64__ +#ifndef CONFIG_X86_64 +#define CONFIG_X86_64 1 +#endif +#endif + +#if defined(__i386__) || defined (__x86_64__) +#ifndef CONFIG_X86 +#define CONFIG_X86 1 +#endif +#endif + +#ifdef __ia64__ +#ifndef CONFIG_IA64 +#define CONFIG_IA64 1 +#endif +#endif + +#ifdef __PPC__ +#ifndef CONFIG_PPC +#define CONFIG_PPC 1 +#endif +#endif + +#ifdef __s390__ +#ifndef CONFIG_S390 +#define CONFIG_S390 1 +#endif +#endif + +#endif +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + * + * Copyright IBM Corp. 
2007 + * + * Authors: Hollis Blanchard holl...@us.ibm.com + */ + +#ifndef __LINUX_KVM_POWERPC_H +#define __LINUX_KVM_POWERPC_H + +#include linux/types.h + +struct kvm_regs { + __u64 pc; + __u64 cr; + __u64 ctr; + __u64 lr; + __u64 xer; + __u64 msr; + __u64 srr0; + __u64 srr1; + __u64 pid; + + __u64 sprg0; + __u64 sprg1; + __u64 sprg2; + __u64 sprg3; + __u64 sprg4; + __u64 sprg5; + __u64 sprg6; + __u64 sprg7; + + __u64 gpr[32]; +}; + +struct kvm_sregs { +}; + +struct kvm_fpu { + __u64 fpr[32]; +}; + +struct kvm_debug_exit_arch { +}; + +/* for KVM_SET_GUEST_DEBUG */ +struct kvm_guest_debug_arch { +}; + +#endif /* __LINUX_KVM_POWERPC_H */ diff --git a/kvm/kernel/arch/powerpc/include/asm/kvm_44x.h b/kvm/kernel/arch/powerpc/include/asm/kvm_44x.h new file mode 100644 index 000..956f252 --- /dev/null +++ b/kvm/kernel/arch/powerpc/include/asm/kvm_44x.h @@ -0,0 +1,108 @@ +#ifndef KVM_UNIFDEF_H +#define KVM_UNIFDEF_H + +#ifdef __i386__ +#ifndef CONFIG_X86_32 +#define CONFIG_X86_32 1 +#endif +#endif + +#ifdef __x86_64__ +#ifndef CONFIG_X86_64 +#define CONFIG_X86_64 1 +#endif +#endif + +#if defined(__i386__) || defined (__x86_64__) +#ifndef CONFIG_X86 +#define CONFIG_X86 1 +#endif +#endif + +#ifdef __ia64__ +#ifndef CONFIG_IA64 +#define CONFIG_IA64 1 +#endif +#endif + +#ifdef __PPC__ +#ifndef CONFIG_PPC +#define CONFIG_PPC 1 +#endif +#endif + +#ifdef __s390__ +#ifndef CONFIG_S390 +#define CONFIG_S390 1 +#endif +#endif + +#endif +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + * + * Copyright IBM Corp. 2008 + * + * Authors: Hollis Blanchard holl...@us.ibm.com + */ + +#ifndef __ASM_44X_H__ +#define __ASM_44X_H__ + +#include linux/kvm_host.h + +#define PPC44x_TLB_SIZE 64 + +/* If the guest is expecting it, this can be as large as we like; we'd just + * need to find some way of advertising it. */ +#define KVM44x_GUEST_TLB_SIZE 64 + +struct kvmppc_44x_tlbe { + u32 tid; /* Only the low 8 bits are used. */ + u32 word0; + u32 word1; + u32 word2; +}; + +struct kvmppc_44x_shadow_ref { + struct page *page; + u16 gtlb_index; + u8 writeable; + u8 tid; +}; + +struct kvmppc_vcpu_44x { + /* Unmodified copy of the guest's TLB. */ + struct kvmppc_44x_tlbe guest_tlb[KVM44x_GUEST_TLB_SIZE]; + + /* References to guest pages in the hardware TLB. */ + struct kvmppc_44x_shadow_ref shadow_refs[PPC44x_TLB_SIZE]; + + /* State of the shadow TLB at guest context switch time. */ + struct kvmppc_44x_tlbe shadow_tlb[PPC44x_TLB_SIZE]; + u8 shadow_tlb_mod[PPC44x_TLB_SIZE]; + + struct kvm_vcpu vcpu; +}; + +static inline struct kvmppc_vcpu_44x *to_44x(struct kvm_vcpu *vcpu) +{ + return
[COMMIT stable-0.10] Merge commit 'v0.10.3' into stable-0.10
From: Avi Kivity a...@redhat.com * commit 'v0.10.3': Update version for 0.10.3 release Implement cancellation method for dma async I/O (Avi Kivity) Convert vectored aio emulation to use a dedicated pool (Avi Kivity) Refactor aio callback allocation to use an aiocb pool (Avi Kivity) Fix hw/acpi.c build w/ DEBUG enabled Make sure not to fall through on error in loadvm Pci nic: pci_register_device can fail Fix serial option with -drive suport device driver initialization model kvm: Avoid COW if KVM MMU is asynchronous vnc: windup keypad keys for qemu console emulation -- To unsubscribe from this list: send the line unsubscribe kvm-commits in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[COMMIT stable-0.10] kvm: Add release script
From: Avi Kivity a...@redhat.com Signed-off-by: Avi Kivity a...@redhat.com diff --git a/kvm/scripts/make-release b/kvm/scripts/make-release new file mode 100755 index 000..3b1dccf --- /dev/null +++ b/kvm/scripts/make-release @@ -0,0 +1,60 @@ +#!/bin/bash -e + +usage() { +echo usage: $0 [--upload] [--formal] commit [name] +exit 1 +} + +[[ -f ~/.kvmreleaserc ]] . ~/.kvmreleaserc + +upload= +formal= + +releasedir=~/sf-release +[[ -z $TMP ]] TMP=/tmp +tmpdir=$TMP/qemu-kvm-make-release.$$ +while [[ $1 = -* ]]; do +opt=$1 +shift +case $opt in + --upload) + upload=yes + ;; + --formal) + formal=yes + ;; + *) + usage + ;; +esac +done + +commit=$1 +name=$2 + +if [[ -z $commit ]]; then +usage +fi + +if [[ -z $name ]]; then +name=$commit +fi + +tarball=$releasedir/$name.tar + +cd $(dirname $0)/../.. +git archive --prefix=$name/ --format=tar $commit $tarball + +if [[ -n $formal ]]; then +mkdir -p $tmpdir +echo $name $tmpdir/KVM_VERSION +tar -rf $tarball --transform s,^,$name/, -C $tmpdir KVM_VERSION +rm -rf $tmpdir +fi + +gzip -9 $tarball +tarball=$tarball.gz + +if [[ -n $upload ]]; then +rsync --progress -h $tarball a...@frs.sourceforge.net:uploads/ +fi -- To unsubscribe from this list: send the line unsubscribe kvm-commits in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[COMMIT stable-0.10] kvm: disable kqemu
From: Avi Kivity a...@redhat.com

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/configure b/configure
index 913bcb8..0cfdf7b 100755
--- a/configure
+++ b/configure
@@ -314,6 +314,7 @@ if [ "$cpu" = "i386" -o "$cpu" = "x86_64" ] ; then
     kqemu="yes"
     audio_possible_drivers="$audio_possible_drivers fmod"
     kvm="yes"
+    kqemu="no"
 fi
 if [ "$cpu" = "ia64" ] ; then
     kvm="yes"
[COMMIT master] Revert Sync idcache after emualted DMA operations for ia64
From: Hollis Blanchard holl...@us.ibm.com This reverts commit 9dc99a28236161a5a1b4c58f1e9c4ec6179cb976. Aside from the other issues discussed on kvm-devel, this commit breaks the PowerPC build. Signed-off-by: Hollis Blanchard holl...@us.ibm.com Signed-off-by: Avi Kivity a...@redhat.com diff --git a/cache-utils.h b/cache-utils.h index 2a07fbd..b45fde4 100644 --- a/cache-utils.h +++ b/cache-utils.h @@ -33,22 +33,8 @@ static inline void flush_icache_range(unsigned long start, unsigned long stop) asm volatile (sync : : : memory); asm volatile (isync : : : memory); } -#define qemu_sync_idcache flush_icache_range -#else -#ifdef __ia64__ -static inline void qemu_sync_idcache(unsigned long start, unsigned long stop) -{ -while (start stop) { - asm volatile (fc %0 :: r(start)); - start += 32; -} -asm volatile (;;sync.i;;srlz.i;;); -} #else -static inline void qemu_sync_idcache(unsigned long start, unsigned long stop) {} -#endif - #define qemu_cache_utils_init(envp) do { (void) (envp); } while (0) #endif diff --git a/cutils.c b/cutils.c index e2bee1e..a1652ab 100644 --- a/cutils.c +++ b/cutils.c @@ -23,7 +23,6 @@ */ #include qemu-common.h #include host-utils.h -#include cache-utils.h #include assert.h void pstrcpy(char *buf, int buf_size, const char *str) @@ -177,8 +176,6 @@ void qemu_iovec_from_buffer(QEMUIOVector *qiov, const void *buf, size_t count) if (copy qiov-iov[i].iov_len) copy = qiov-iov[i].iov_len; memcpy(qiov-iov[i].iov_base, p, copy); -qemu_sync_idcache((unsigned long)qiov-iov[i].iov_base, - (unsigned long)(qiov-iov[i].iov_base + copy)); p += copy; count -= copy; } diff --git a/dma-helpers.c b/dma-helpers.c index f71ca3b..f9eb224 100644 --- a/dma-helpers.c +++ b/dma-helpers.c @@ -9,7 +9,6 @@ #include dma.h #include block_int.h -#include cache-utils.h static AIOPool dma_aio_pool; @@ -138,8 +137,6 @@ static BlockDriverAIOCB *dma_bdrv_io( BlockDriverCompletionFunc *cb, void *opaque, int is_write) { -int i; -QEMUIOVector *qiov; DMAAIOCB *dbs = 
qemu_aio_get_pool(dma_aio_pool, bs, cb, opaque); dbs-acb = NULL; @@ -152,15 +149,6 @@ static BlockDriverAIOCB *dma_bdrv_io( dbs-bh = NULL; qemu_iovec_init(dbs-iov, sg-nsg); dma_bdrv_cb(dbs, 0); - -if (!is_write) { -qiov = dbs-iov; -for (i = 0; i qiov-niov; ++i) { - qemu_sync_idcache((unsigned long)qiov-iov[i].iov_base, - (unsigned long)(qiov-iov[i].iov_base + qiov-iov[i].iov_len)); - } -} - if (!dbs-acb) { qemu_aio_release(dbs); return NULL; diff --git a/exec.c b/exec.c index 0c5545e..a5bca49 100644 --- a/exec.c +++ b/exec.c @@ -3400,9 +3400,6 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len, addr1 += l; access_len -= l; } -if (kvm_enabled()) - flush_icache_range((unsigned long)buffer, - (unsigned long)buffer + access_len); } return; } -- To unsubscribe from this list: send the line unsubscribe kvm-commits in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[COMMIT stable-0.10] Fix warning when __ia64__ is not defined.
From: Hollis Blanchard holl...@us.ibm.com

Signed-off-by: Hollis Blanchard holl...@us.ibm.com
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/kvm/libkvm/kvm-common.h b/kvm/libkvm/kvm-common.h
index de1ada2..9060820 100644
--- a/kvm/libkvm/kvm-common.h
+++ b/kvm/libkvm/kvm-common.h
@@ -22,7 +22,7 @@
 #define KVM_MAX_NUM_MEM_REGIONS 1u
 #define MAX_VCPUS 64
 #define LIBKVM_S390_ORIGIN (0UL)
-#elif __ia64__
+#elif defined(__ia64__)
 #define KVM_MAX_NUM_MEM_REGIONS 32u
 #define MAX_VCPUS 256
 #else
[COMMIT master] kvm: Add release script
From: Avi Kivity a...@redhat.com Signed-off-by: Avi Kivity a...@redhat.com diff --git a/kvm/scripts/make-release b/kvm/scripts/make-release new file mode 100755 index 000..3b1dccf --- /dev/null +++ b/kvm/scripts/make-release @@ -0,0 +1,60 @@ +#!/bin/bash -e + +usage() { +echo usage: $0 [--upload] [--formal] commit [name] +exit 1 +} + +[[ -f ~/.kvmreleaserc ]] . ~/.kvmreleaserc + +upload= +formal= + +releasedir=~/sf-release +[[ -z $TMP ]] TMP=/tmp +tmpdir=$TMP/qemu-kvm-make-release.$$ +while [[ $1 = -* ]]; do +opt=$1 +shift +case $opt in + --upload) + upload=yes + ;; + --formal) + formal=yes + ;; + *) + usage + ;; +esac +done + +commit=$1 +name=$2 + +if [[ -z $commit ]]; then +usage +fi + +if [[ -z $name ]]; then +name=$commit +fi + +tarball=$releasedir/$name.tar + +cd $(dirname $0)/../.. +git archive --prefix=$name/ --format=tar $commit $tarball + +if [[ -n $formal ]]; then +mkdir -p $tmpdir +echo $name $tmpdir/KVM_VERSION +tar -rf $tarball --transform s,^,$name/, -C $tmpdir KVM_VERSION +rm -rf $tmpdir +fi + +gzip -9 $tarball +tarball=$tarball.gz + +if [[ -n $upload ]]; then +rsync --progress -h $tarball a...@frs.sourceforge.net:uploads/ +fi -- To unsubscribe from this list: send the line unsubscribe kvm-commits in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface
On Mon, Apr 27, 2009 at 02:33:34PM -0400, Gregory Haskins wrote:

> +	/* We re-use eventfd for irqfd */
> +	fd = sys_eventfd2(0, 0);
> +	if (fd < 0) {
> +		ret = fd;
> +		goto fail;
> +	}
> +
> +	/* We maintain a reference to eventfd for the irqfd lifetime */
> +	file = eventfd_fget(fd);
> +	if (IS_ERR(file)) {
> +		ret = PTR_ERR(file);
> +		goto fail;
> +	}
> +
> +	irqfd->file = file;

This is just plain wrong.  You have no promise whatsoever that caller of
that sucker won't race with e.g. dup2().  IOW, you can't assume that
file will be of the expected kind.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] qemu-kvm: build system Add link to qemu
Jan Kiszka wrote:
> I'm getting closer to a working qemu-kvm, but there are still a few
> messy parts. The magic dance goes like this:
>
>   cd qemu-kvm/kvm
>   ln -s .. qemu   (or apply patch below)
>   ./configure -whatever
>   make
>
> Still, this is unintuitive. As both top-level configure and Makefile
> already differ from upstream, I see no reason not tweaking them also in
> way that
>
>   ./configure
>   make
>
> from the top-level directory behaves as expected again. May look into
> this later (and the other warnings the build threw at me), now I've to
> understand an ugly shadow page table inconsistency of kvm...
>
> Jan
>
> ---
> Subject: [PATCH] qemu-kvm: build system Add link to qemu
>
> Signed-off-by: Jan Kiszka jan.kis...@siemens.com
> ---
>
>  kvm/qemu |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>  create mode 120000 kvm/qemu
>
> diff --git a/kvm/qemu b/kvm/qemu
> new file mode 120000
> index 0000000..a96aa0e
> --- /dev/null
> +++ b/kvm/qemu
> @@ -0,0 +1 @@
> +..
> \ No newline at end of file

This shouldn't be needed.  Can you confirm this with current qemu-kvm.git?

--
error compiling committee.c: too many arguments to function
Re: PPC support for qemu-kvm
Hollis Blanchard wrote:
> These patches fix a number of issues with PowerPC builds of
> qemu-kvm.git. However, even after applying these patches it still
> doesn't build, due to confusion with KVM_UPSTREAM and CONFIG_KVM.

Applied all, thanks.

--
error compiling committee.c: too many arguments to function
Re: [PATCH] kvm: trivial format fix in setup_routing_entry()
Chris Wright wrote:
> Remove extra tab.

Applied, thanks.

--
error compiling committee.c: too many arguments to function
Re: Unable to boot guest on kernel 2.6.29.1 with kvm-84 or kvm-85
Avi Kivity wrote:
Kenni Lund wrote:
Avi Kivity a...@redhat.com wrote:
Kenni Lund wrote:

Ok, but as I write in my message, I'm using the KVM modules from the latest upstream kernel, not the kvm-85 modules. According to the KVM download page, http://www.linux-kvm.org/page/Downloads, any kernel above 2.6.25 should work with the latest KVM userspace. This has been true until now in my case, but it breaks with 2.6.29.1 and that's the reason why I'm posting this bug report.

Can you try a bisect?

Yes, sorry for the late reply. I did the bisect as requested and it returned the following results:

# bad: [8d7bff2d72660d9d60aa371ae3d1356bbf329a09] Linux 2.6.29.1
# good: [4a6908a3a050aacc9c3a2f36b276b46c0629ad91] Linux 2.6.28
git bisect start 'v2.6.29.1' 'v2.6.28' '--' 'arch/x86/kvm' 'virt/kvm'
# good: [b82091824ee4970adf92d5cd6d57b12273171625] KVM: Prevent trace call into unloaded module text
git bisect good b82091824ee4970adf92d5cd6d57b12273171625
# good: [7f59f492da722eb3551bbe1f8f4450a21896f05d] KVM: use cpumask_var_t for cpus_hardware_enabled
git bisect good 7f59f492da722eb3551bbe1f8f4450a21896f05d
# good: [19de40a8472fa64693eab844911eec277d489f6c] KVM: change KVM to use IOMMU API
git bisect good 19de40a8472fa64693eab844911eec277d489f6c
# good: [2aaf69dcee864f4fb6402638dd2f263324ac839f] KVM: MMU: Map device MMIO as UC in EPT
git bisect good 2aaf69dcee864f4fb6402638dd2f263324ac839f
# good: [682edb4c01e690c7c7cd772dbd6f4e0fd74dc572] KVM: Fix assigned devices circular locking dependency
git bisect good 682edb4c01e690c7c7cd772dbd6f4e0fd74dc572
# bad: [f438349efb8247cd0c1d453a4131b1f801bf5691] KVM: VMX: Don't allow uninhibited access to EFER on i386
git bisect bad f438349efb8247cd0c1d453a4131b1f801bf5691
# good: [516a1a7e9dc80358030fe01aabb3bedf882db9e2] KVM: VMX: Flush volatile msrs before emulating rdmsr
git bisect good 516a1a7e9dc80358030fe01aabb3bedf882db9e2

And the final output:

f438349efb8247cd0c1d453a4131b1f801bf5691 is first bad commit
commit f438349efb8247cd0c1d453a4131b1f801bf5691
Author: Avi Kivity
Date: Thu Mar 26 23:05:03 2009 +

    KVM: VMX: Don't allow uninhibited access to EFER on i386

    upstream commit: 16175a796d061833aacfbd9672235f2d2725df65

    vmx_set_msr() does not allow i386 guests to touch EFER, but they can still do so through the default: label in the switch. If they set EFER_LME, they can oops the host.

    Fix by having EFER access through the normal channel (which will check for EFER_LME) even on i386.

    Reported-and-tested-by: Benjamin Gilbert
    Cc: sta...@kernel.org
    Signed-off-by: Avi Kivity
    Signed-off-by: Chris Wright

:04 04 cf7848d35c136beee6665e67839080d450977af0 0a39980481dd346306b2ac54dbe916741515f1f1 M arch

FYI, I also tested 2.6.29.2 and the issue still exists. Do you need more information?

Please try the attached patch.

It won't help - I reproduced the issue. Instead, try passing the parameter '-cpu qemu32' (or '-cpu qemu64,-nx').

--
error compiling committee.c: too many arguments to function
Re: Implement generic double fault generation mechanism
On Thu, Apr 30, 2009 at 03:24:07PM +0800, Dong, Eddie wrote:

Move Double-Fault generation logic out of page fault exception generating function to cover more generic case.

Signed-off-by: Eddie Dong eddie.d...@intel.com

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab1fdac..51a8dad 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -162,12 +162,59 @@ void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data)
 }
 EXPORT_SYMBOL_GPL(kvm_set_apic_base);

+#define EXCPT_BENIGN		0
+#define EXCPT_CONTRIBUTORY	1
+#define EXCPT_PF		2
+
+static int exception_class(int vector)
+{
+	if (vector == 14)
+		return EXCPT_PF;
+	else if (vector == 0 || (vector >= 10 && vector <= 13))
+		return EXCPT_CONTRIBUTORY;
+	else
+		return EXCPT_BENIGN;
+}
+

This makes double fault (8) a benign exception. Surely not what you want.

+static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
+		unsigned nr, bool has_error, u32 error_code)
+{
+	u32 prev_nr;
+	int class1, class2;
+
+	if (!vcpu->arch.exception.pending) {
+		vcpu->arch.exception.pending = true;
+		vcpu->arch.exception.has_error_code = has_error;
+		vcpu->arch.exception.nr = nr;
+		vcpu->arch.exception.error_code = error_code;
+		return;
+	}
+
+	/* to check exception */
+	prev_nr = vcpu->arch.exception.nr;
+	class2 = exception_class(nr);
+	class1 = exception_class(prev_nr);
+	if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY)
+		|| (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
+		/* generate double fault per SDM Table 5-5 */
+		printk(KERN_DEBUG "kvm: double fault 0x%x on 0x%x\n",
+			prev_nr, nr);
+		vcpu->arch.exception.pending = true;
+		vcpu->arch.exception.has_error_code = 1;
+		vcpu->arch.exception.nr = DF_VECTOR;
+		vcpu->arch.exception.error_code = 0;
+		if (prev_nr == DF_VECTOR) {
+			/* triple fault - shutdown */
+			set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
+		}
+	} else
+		printk(KERN_ERR "Exception 0x%x on 0x%x happens serially\n",
+			prev_nr, nr);
+}

When two exceptions happen serially it is better to replace the pending exception with the new one. This way the first exception (which is lost) will be regenerated when the instruction is re-executed.

--
Gleb.
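The classification issue raised in the review can be made concrete with a small user-space sketch. This is not the patch under review nor whatever was eventually merged; the enum names and combine_exceptions() are made up for illustration, and the handling of a pending #DF (a further contributory exception or page fault leads to shutdown) is an assumption based on the SDM's description of double faults. The point is only that vector 8 has to be checked explicitly, since the vector ranges alone would classify it as benign.

```c
#include <assert.h>

/* Vector numbers from the x86 exception table; the enum names are made up
 * for this sketch and are not taken from the kernel. */
enum { VEC_DE = 0, VEC_DF = 8, VEC_TS = 10, VEC_GP = 13, VEC_PF = 14 };

enum excpt_class { EXCPT_BENIGN, EXCPT_CONTRIBUTORY, EXCPT_PF };
enum outcome { DELIVER_SERIALLY, DOUBLE_FAULT, TRIPLE_FAULT };

/* Same ranges as the quoted patch: #PF is its own class, #DE and
 * vectors 10..13 are contributory, everything else is benign. */
static enum excpt_class exception_class(int vector)
{
    assert(vector >= 0 && vector < 256);
    if (vector == VEC_PF)
        return EXCPT_PF;
    if (vector == VEC_DE || (vector >= VEC_TS && vector <= VEC_GP))
        return EXCPT_CONTRIBUTORY;
    return EXCPT_BENIGN;
}

/* Decide what happens when 'nr' is raised while 'prev_nr' is still pending. */
static enum outcome combine_exceptions(int prev_nr, int nr)
{
    enum excpt_class class1 = exception_class(prev_nr);
    enum excpt_class class2 = exception_class(nr);

    /* Assumption: a pending #DF is handled before the class comparison,
     * so it is never treated as just another benign exception. */
    if (prev_nr == VEC_DF)
        return class2 == EXCPT_BENIGN ? DELIVER_SERIALLY : TRIPLE_FAULT;

    if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY) ||
        (class1 == EXCPT_PF && class2 != EXCPT_BENIGN))
        return DOUBLE_FAULT;

    return DELIVER_SERIALLY;
}
```

For example, combine_exceptions(13, 13) yields DOUBLE_FAULT, while combine_exceptions(13, 14) yields DELIVER_SERIALLY, matching the condition in the quoted patch.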
[PATCH] kvm: Add helpers for checking and requiring kvm extensions
Instead of open-coding the check extension sequence, provide helpers for checking whether an extension exists, and for aborting if an extension is missing.

Signed-off-by: Avi Kivity a...@redhat.com
---
 kvm-all.c         |   63 +++--
 kvm.h             |    6 +
 target-i386/kvm.c |    8 +--
 3 files changed, 39 insertions(+), 38 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 36659a9..1642a2a 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -64,6 +64,30 @@ struct KVMState

 static KVMState *kvm_state;

+int kvm_check_extension(int extension)
+{
+    int ret;
+
+    ret = kvm_ioctl(kvm_state, KVM_CHECK_EXTENSION, extension);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_CHECK_EXTENSION failed: %s\n", strerror(errno));
+        exit(1);
+    }
+    return ret;
+}
+
+int kvm_require_extension(int extension, const char *estr)
+{
+    int ret;
+
+    ret = kvm_check_extension(extension);
+    if (!ret) {
+        fprintf(stderr, "Required KVM extension %s not present\n", estr);
+        exit(1);
+    }
+    return ret;
+}
+
 static KVMSlot *kvm_alloc_slot(KVMState *s)
 {
     int i;
@@ -331,6 +355,8 @@ int kvm_init(int smp_cpus)

     s = qemu_mallocz(sizeof(KVMState));

+    kvm_state = s;
+
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     TAILQ_INIT(&s->kvm_sw_breakpoints);
 #endif
@@ -368,42 +394,22 @@ int kvm_init(int smp_cpus)
      * just use a user allocated buffer so we can use regular pages
      * unmodified. Make sure we have a sufficiently modern version of KVM.
      */
-    ret = kvm_ioctl(s, KVM_CHECK_EXTENSION, KVM_CAP_USER_MEMORY);
-    if (ret <= 0) {
-        if (ret == 0)
-            ret = -EINVAL;
-        fprintf(stderr, "kvm does not support KVM_CAP_USER_MEMORY\n");
-        goto err;
-    }
+    KVM_REQUIRE_EXTENSION(KVM_CAP_USER_MEMORY);

     /* There was a nasty bug in < kvm-80 that prevents memory slots from being
      * destroyed properly. Since we rely on this capability, refuse to work
      * with any kernel without this capability. */
-    ret = kvm_ioctl(s, KVM_CHECK_EXTENSION,
-                    KVM_CAP_DESTROY_MEMORY_REGION_WORKS);
-    if (ret <= 0) {
-        if (ret == 0)
-            ret = -EINVAL;
-
-        fprintf(stderr,
-                "KVM kernel module broken (DESTROY_MEMORY_REGION)\n"
-                "Please upgrade to at least kvm-81.\n");
-        goto err;
-    }
+    KVM_REQUIRE_EXTENSION(KVM_CAP_DESTROY_MEMORY_REGION_WORKS);

     s->coalesced_mmio = 0;
 #ifdef KVM_CAP_COALESCED_MMIO
-    ret = kvm_ioctl(s, KVM_CHECK_EXTENSION, KVM_CAP_COALESCED_MMIO);
-    if (ret > 0)
-        s->coalesced_mmio = ret;
+    s->coalesced_mmio = kvm_check_extension(KVM_CAP_COALESCED_MMIO);
 #endif

     ret = kvm_arch_init(s, smp_cpus);
     if (ret < 0)
         goto err;

-    kvm_state = s;
-
     return 0;

 err:
@@ -415,6 +421,8 @@ err:
     }
     qemu_free(s);

+    kvm_state = NULL;
+
     return ret;
 }

@@ -763,14 +771,7 @@ int kvm_vcpu_ioctl(CPUState *env, int type, ...)

 int kvm_has_sync_mmu(void)
 {
-#ifdef KVM_CAP_SYNC_MMU
-    KVMState *s = kvm_state;
-
-    if (kvm_ioctl(s, KVM_CHECK_EXTENSION, KVM_CAP_SYNC_MMU) > 0)
-        return 1;
-#endif
-
-    return 0;
+    return kvm_check_extension(KVM_CAP_SYNC_MMU);
 }

 void kvm_setup_guest_memory(void *start, size_t size)
diff --git a/kvm.h b/kvm.h
index 0ea2426..bd4e8d4 100644
--- a/kvm.h
+++ b/kvm.h
@@ -29,6 +29,12 @@ struct kvm_run;

 /* external API */

+#define KVM_REQUIRE_EXTENSION(extension) \
+    kvm_require_extension(extension, #extension)
+
+int kvm_check_extension(int extension);
+int kvm_require_extension(int extension, const char *estr);
+
 int kvm_init(int smp_cpus);
 int kvm_init_vcpu(CPUState *env);

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 2de8b81..b534b2d 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -154,19 +154,13 @@ static int kvm_has_msr_star(CPUState *env)

 int kvm_arch_init(KVMState *s, int smp_cpus)
 {
-    int ret;
-
     /* create vm86 tss. KVM uses vm86 mode to emulate 16-bit code
      * directly. In order to use vm86 mode, a TSS is needed. Since this
      * must be part of guest physical memory, we need to allocate it. Older
      * versions of KVM just assumed that it would be at the end of physical
      * memory but that doesn't work with more than 4GB of memory. We simply
      * refuse to work with those older versions of KVM. */
-    ret = kvm_ioctl(s, KVM_CHECK_EXTENSION, KVM_CAP_SET_TSS_ADDR);
-    if (ret <= 0) {
-        fprintf(stderr, "kvm does not support KVM_CAP_SET_TSS_ADDR\n");
-        return ret;
-    }
+    KVM_REQUIRE_EXTENSION(KVM_CAP_SET_TSS_ADDR);

     /* this address is 3 pages before the bios, and the bios should present
      * as unavaible memory. FIXME, need to ensure the e820 map deals with
--
1.6.1.1
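The check/require split in the patch above, including the stringizing macro, can be tried outside QEMU with a stand-in for the ioctl. In this sketch probe_capability() and the CAP_* constants are hypothetical placeholders for KVM_CHECK_EXTENSION and the KVM_CAP_* values; only the shape of the helpers follows the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical capability numbers; the real ones are the KVM_CAP_* constants. */
enum { CAP_USER_MEMORY = 1, CAP_COALESCED_MMIO = 2, CAP_SYNC_MMU = 3 };

/* Stand-in for the KVM_CHECK_EXTENSION ioctl: negative on error,
 * 0 if the capability is absent, positive if present. */
static int probe_capability(int cap)
{
    switch (cap) {
    case CAP_USER_MEMORY:    return 1;
    case CAP_COALESCED_MMIO: return 4;  /* a positive value can carry data */
    default:                 return 0;
    }
}

/* Returns the probe result; aborts only if the probe itself fails. */
static int check_extension(int cap)
{
    int ret = probe_capability(cap);

    if (ret < 0) {
        fprintf(stderr, "capability probe failed for %d\n", cap);
        exit(1);
    }
    return ret;
}

/* Aborts if the capability is missing, naming it in the message. */
static int require_extension(int cap, const char *name)
{
    int ret;

    assert(name != NULL);
    ret = check_extension(cap);
    if (!ret) {
        fprintf(stderr, "Required extension %s not present\n", name);
        exit(1);
    }
    return ret;
}

/* '#cap' turns the argument as written into a string literal. */
#define REQUIRE_EXTENSION(cap) require_extension(cap, #cap)
```

Because the argument of the # operator is not macro-expanded, the same trick reports the symbolic name even when the capability is a #define rather than an enum constant, which is what lets the helper print a useful message without a lookup table.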
Re: Unable to boot guest on kernel 2.6.29.1 with kvm-84 or kvm-85
Avi Kivity wrote:
Kenni Lund wrote:
Avi Kivity a...@redhat.com wrote:
Kenni Lund wrote:

Ok, but as I write in my message, I'm using the KVM modules from the latest upstream kernel, not the kvm-85 modules. According to the KVM download page, http://www.linux-kvm.org/page/Downloads, any kernel above 2.6.25 should work with the latest KVM userspace. This has been true until now in my case, but it breaks with 2.6.29.1 and that's the reason why I'm posting this bug report.

Can you try a bisect?

Yes, sorry for the late reply. I did the bisect as requested and it returned the following results:

# bad: [8d7bff2d72660d9d60aa371ae3d1356bbf329a09] Linux 2.6.29.1
# good: [4a6908a3a050aacc9c3a2f36b276b46c0629ad91] Linux 2.6.28
git bisect start 'v2.6.29.1' 'v2.6.28' '--' 'arch/x86/kvm' 'virt/kvm'
# good: [b82091824ee4970adf92d5cd6d57b12273171625] KVM: Prevent trace call into unloaded module text
git bisect good b82091824ee4970adf92d5cd6d57b12273171625
# good: [7f59f492da722eb3551bbe1f8f4450a21896f05d] KVM: use cpumask_var_t for cpus_hardware_enabled
git bisect good 7f59f492da722eb3551bbe1f8f4450a21896f05d
# good: [19de40a8472fa64693eab844911eec277d489f6c] KVM: change KVM to use IOMMU API
git bisect good 19de40a8472fa64693eab844911eec277d489f6c
# good: [2aaf69dcee864f4fb6402638dd2f263324ac839f] KVM: MMU: Map device MMIO as UC in EPT
git bisect good 2aaf69dcee864f4fb6402638dd2f263324ac839f
# good: [682edb4c01e690c7c7cd772dbd6f4e0fd74dc572] KVM: Fix assigned devices circular locking dependency
git bisect good 682edb4c01e690c7c7cd772dbd6f4e0fd74dc572
# bad: [f438349efb8247cd0c1d453a4131b1f801bf5691] KVM: VMX: Don't allow uninhibited access to EFER on i386
git bisect bad f438349efb8247cd0c1d453a4131b1f801bf5691
# good: [516a1a7e9dc80358030fe01aabb3bedf882db9e2] KVM: VMX: Flush volatile msrs before emulating rdmsr
git bisect good 516a1a7e9dc80358030fe01aabb3bedf882db9e2

And the final output:

f438349efb8247cd0c1d453a4131b1f801bf5691 is first bad commit
commit f438349efb8247cd0c1d453a4131b1f801bf5691
Author: Avi Kivity
Date: Thu Mar 26 23:05:03 2009 +

    KVM: VMX: Don't allow uninhibited access to EFER on i386

    upstream commit: 16175a796d061833aacfbd9672235f2d2725df65

    vmx_set_msr() does not allow i386 guests to touch EFER, but they can still do so through the default: label in the switch. If they set EFER_LME, they can oops the host.

    Fix by having EFER access through the normal channel (which will check for EFER_LME) even on i386.

    Reported-and-tested-by: Benjamin Gilbert
    Cc: sta...@kernel.org
    Signed-off-by: Avi Kivity
    Signed-off-by: Chris Wright

:04 04 cf7848d35c136beee6665e67839080d450977af0 0a39980481dd346306b2ac54dbe916741515f1f1 M arch

FYI, I also tested 2.6.29.2 and the issue still exists. Do you need more information?

Please try the attached patch.

It won't help - I reproduced the issue. Instead, try passing the parameter '-cpu qemu32' (or '-cpu qemu64,-nx').

Adding the parameter '-cpu qemu32' (32bit host + 32 bit guest) makes the WinXP guest boot.

...but is this parameter equal to '-no-kvm'? Eg. with emulated CPU?

Best Regards
Kenni Lund
Re: Unable to boot guest on kernel 2.6.29.1 with kvm-84 or kvm-85
Kenni Lund wrote:

It won't help - I reproduced the issue. Instead, try passing the parameter '-cpu qemu32' (or '-cpu qemu64,-nx').

Adding the parameter '-cpu qemu32' (32bit host + 32 bit guest) makes the WinXP guest boot. ...but is this parameter equal to '-no-kvm'? Eg. with emulated CPU?

No, as you can tell from the speed (and 'info kvm' output). '-cpu qemu32' means use the cpu features exposed by the virtual qemu32 cpu. The default is qemu64, which includes 64-bit support and NX. Your host kernel doesn't have NX support, but qemu erroneously reports that it does, causing Windows to get confused.

--
error compiling committee.c: too many arguments to function
[PATCH 0/4] Fix kvm cpuid reporting
kvm supports an interface for reporting which cpuid features are supported. Use it for trimming the cpu feature set reported to the guest. This prevents, for example, reporting NX to a guest when in fact we do not support it.

Avi Kivity (4):
  kvm: Add support for querying supported cpu features
  Make x86 cpuid feature names available in file scope
  Fix x86 feature modifications for features that set multiple bits
  kvm: Trim cpu features not supported by kvm

 kvm.h                |    3 ++
 target-i386/helper.c |   98 +
 target-i386/kvm.c    |   80
 3 files changed, 149 insertions(+), 32 deletions(-)
[PATCH 2/4] Make x86 cpuid feature names available in file scope
To be used later.

Signed-off-by: Avi Kivity a...@redhat.com
---
 target-i386/helper.c |   55 +
 1 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/target-i386/helper.c b/target-i386/helper.c
index a070e08..88585b8 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -32,39 +32,40 @@

 //#define DEBUG_MMU

+/* feature flags taken from "Intel Processor Identification and the CPUID
+ * Instruction" and AMD's "CPUID Specification". In cases of disagreement
+ * about feature names, the Linux name is used. */
+static const char *feature_name[] = {
+    "fpu", "vme", "de", "pse", "tsc", "msr", "pae", "mce",
+    "cx8", "apic", NULL, "sep", "mtrr", "pge", "mca", "cmov",
+    "pat", "pse36", "pn" /* Intel psn */, "clflush" /* Intel clfsh */, NULL, "ds" /* Intel dts */, "acpi", "mmx",
+    "fxsr", "sse", "sse2", "ss", "ht" /* Intel htt */, "tm", "ia64", "pbe",
+};
+static const char *ext_feature_name[] = {
+    "pni" /* Intel,AMD sse3 */, NULL, NULL, "monitor", "ds_cpl", "vmx", NULL /* Linux smx */, "est",
+    "tm2", "ssse3", "cid", NULL, NULL, "cx16", "xtpr", NULL,
+    NULL, NULL, "dca", NULL, NULL, NULL, NULL, "popcnt",
+    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+};
+static const char *ext2_feature_name[] = {
+    "fpu", "vme", "de", "pse", "tsc", "msr", "pae", "mce",
+    "cx8" /* AMD CMPXCHG8B */, "apic", NULL, "syscall", "mtrr", "pge", "mca", "cmov",
+    "pat", "pse36", NULL, NULL /* Linux mp */, "nx" /* Intel xd */, NULL, "mmxext", "mmx",
+    "fxsr", "fxsr_opt" /* AMD ffxsr */, "pdpe1gb" /* AMD Page1GB */, "rdtscp", NULL, "lm" /* Intel 64 */, "3dnowext", "3dnow",
+};
+static const char *ext3_feature_name[] = {
+    "lahf_lm" /* AMD LahfSahf */, "cmp_legacy", "svm", "extapic" /* AMD ExtApicSpace */, "cr8legacy" /* AMD AltMovCr8 */, "abm", "sse4a", "misalignsse",
+    "3dnowprefetch", "osvw", NULL /* Linux ibs */, NULL, "skinit", "wdt", NULL, NULL,
+    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+};
+
 static void add_flagname_to_bitmaps(char *flagname, uint32_t *features,
                                     uint32_t *ext_features,
                                     uint32_t *ext2_features,
                                     uint32_t *ext3_features)
 {
     int i;
-    /* feature flags taken from "Intel Processor Identification and the CPUID
-     * Instruction" and AMD's "CPUID Specification". In cases of disagreement
-     * about feature names, the Linux name is used. */
-    static const char *feature_name[] = {
-        "fpu", "vme", "de", "pse", "tsc", "msr", "pae", "mce",
-        "cx8", "apic", NULL, "sep", "mtrr", "pge", "mca", "cmov",
-        "pat", "pse36", "pn" /* Intel psn */, "clflush" /* Intel clfsh */, NULL, "ds" /* Intel dts */, "acpi", "mmx",
-        "fxsr", "sse", "sse2", "ss", "ht" /* Intel htt */, "tm", "ia64", "pbe",
-    };
-    static const char *ext_feature_name[] = {
-       "pni" /* Intel,AMD sse3 */, NULL, NULL, "monitor", "ds_cpl", "vmx", NULL /* Linux smx */, "est",
-       "tm2", "ssse3", "cid", NULL, NULL, "cx16", "xtpr", NULL,
-       NULL, NULL, "dca", NULL, NULL, NULL, NULL, "popcnt",
-       NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-    };
-    static const char *ext2_feature_name[] = {
-       "fpu", "vme", "de", "pse", "tsc", "msr", "pae", "mce",
-       "cx8" /* AMD CMPXCHG8B */, "apic", NULL, "syscall", "mtrr", "pge", "mca", "cmov",
-       "pat", "pse36", NULL, NULL /* Linux mp */, "nx" /* Intel xd */, NULL, "mmxext", "mmx",
-       "fxsr", "fxsr_opt" /* AMD ffxsr */, "pdpe1gb" /* AMD Page1GB */, "rdtscp", NULL, "lm" /* Intel 64 */, "3dnowext", "3dnow",
-    };
-    static const char *ext3_feature_name[] = {
-       "lahf_lm" /* AMD LahfSahf */, "cmp_legacy", "svm", "extapic" /* AMD ExtApicSpace */, "cr8legacy" /* AMD AltMovCr8 */, "abm", "sse4a", "misalignsse",
-       "3dnowprefetch", "osvw", NULL /* Linux ibs */, NULL, "skinit", "wdt", NULL, NULL,
-       NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-       NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-    };

     for ( i = 0 ; i < 32 ; i++ )
         if (feature_name[i] && !strcmp (flagname, feature_name[i])) {
--
1.6.1.1
[PATCH 3/4] Fix x86 feature modifications for features that set multiple bits
QEMU allows adding or removing cpu features by using the syntax '-cpu +feature' or '-cpu -feature'. Some cpuid features cause more than one bit to be set or cleared; but QEMU stops after just one bit has been modified, causing the feature bits to be inconsistent.

Fix by allowing all feature bits corresponding to a given name to be set.

Signed-off-by: Avi Kivity a...@redhat.com
---
 target-i386/helper.c |   13 -
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/target-i386/helper.c b/target-i386/helper.c
index 88585b8..0c91133 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -66,28 +66,31 @@ static void add_flagname_to_bitmaps(char *flagname, uint32_t *features,
                                     uint32_t *ext3_features)
 {
     int i;
+    int found = 0;

     for ( i = 0 ; i < 32 ; i++ )
         if (feature_name[i] && !strcmp (flagname, feature_name[i])) {
             *features |= 1 << i;
-            return;
+            found = 1;
         }
     for ( i = 0 ; i < 32 ; i++ )
         if (ext_feature_name[i] && !strcmp (flagname, ext_feature_name[i])) {
             *ext_features |= 1 << i;
-            return;
+            found = 1;
         }
     for ( i = 0 ; i < 32 ; i++ )
         if (ext2_feature_name[i] && !strcmp (flagname, ext2_feature_name[i])) {
             *ext2_features |= 1 << i;
-            return;
+            found = 1;
         }
     for ( i = 0 ; i < 32 ; i++ )
         if (ext3_feature_name[i] && !strcmp (flagname, ext3_feature_name[i])) {
             *ext3_features |= 1 << i;
-            return;
+            found = 1;
         }
-    fprintf(stderr, "CPU feature %s not found\n", flagname);
+    if (!found) {
+        fprintf(stderr, "CPU feature %s not found\n", flagname);
+    }
 }

 typedef struct x86_def_t {
--
1.6.1.1
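The effect of dropping the early return can be seen with a cut-down model. The two four-entry tables below are hypothetical stand-ins for the 32-entry feature_name[] and ext2_feature_name[] arrays, and add_flagname() is an illustrative name; the sketch only shows that a name such as "fpu", which appears in both tables, now sets a bit in both words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Shortened, hypothetical tables: 4 entries each instead of 32. */
#define NBITS 4
static const char *feature_name[NBITS]      = { "fpu", "vme", NULL, "pse" };
static const char *ext2_feature_name[NBITS] = { "fpu", "vme", "nx", NULL };

/* Set every bit whose table entry matches 'flagname'; no early return,
 * so a name present in both tables updates both bitmaps.
 * Returns 1 if at least one bit was set, 0 if the name is unknown. */
static int add_flagname(const char *flagname,
                        uint32_t *features, uint32_t *ext2_features)
{
    int i;
    int found = 0;

    assert(flagname != NULL);
    for (i = 0; i < NBITS; i++)
        if (feature_name[i] && !strcmp(flagname, feature_name[i])) {
            *features |= UINT32_C(1) << i;
            found = 1;
        }
    for (i = 0; i < NBITS; i++)
        if (ext2_feature_name[i] && !strcmp(flagname, ext2_feature_name[i])) {
            *ext2_features |= UINT32_C(1) << i;
            found = 1;
        }
    return found;
}
```

With the original early return, only the first word would have been updated for "fpu", leaving the CPUID 1 and extended-leaf views of the same feature out of step.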
[PATCH 4/4] kvm: Trim cpu features not supported by kvm
Remove cpu features that are not supported by kvm from the cpuid features reported to the guest.

Signed-off-by: Avi Kivity a...@redhat.com
---
 target-i386/helper.c |   30 ++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/target-i386/helper.c b/target-i386/helper.c
index 0c91133..bdf242b 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -93,6 +93,21 @@ static void add_flagname_to_bitmaps(char *flagname, uint32_t *features,
     }
 }

+static void kvm_trim_features(uint32_t *features, uint32_t supported,
+                              const char *names[])
+{
+    int i;
+    uint32_t mask;
+
+    for (i = 0; i < 32; ++i) {
+        mask = 1U << i;
+        if ((*features & mask) && !(supported & mask)) {
+            printf("Processor feature %s not supported by kvm\n", names[i]);
+            *features &= ~mask;
+        }
+    }
+}
+
 typedef struct x86_def_t {
     const char *name;
     uint32_t level;
@@ -1699,5 +1714,20 @@ CPUX86State *cpu_x86_init(const char *cpu_model)

     qemu_init_vcpu(env);

+    if (kvm_enabled()) {
+        kvm_trim_features(&env->cpuid_features,
+                          kvm_arch_get_supported_cpuid(env, 1, R_EDX),
+                          feature_name);
+        kvm_trim_features(&env->cpuid_ext_features,
+                          kvm_arch_get_supported_cpuid(env, 1, R_ECX),
+                          ext_feature_name);
+        kvm_trim_features(&env->cpuid_ext2_features,
+                          kvm_arch_get_supported_cpuid(env, 0x80000001, R_EDX),
+                          ext2_feature_name);
+        kvm_trim_features(&env->cpuid_ext3_features,
+                          kvm_arch_get_supported_cpuid(env, 0x80000001, R_ECX),
+                          ext3_feature_name);
+    }
+
     return env;
 }
--
1.6.1.1
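The trimming loop is easy to exercise on its own. The following is a sketch under stated assumptions rather than the QEMU function: trim_features() and the four-entry name table are hypothetical, the return value (number of bits cleared) is added for testing, and unnamed bits are printed as "(unnamed)" since the real tables contain NULL entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 4-entry name table; the real tables have 32 entries. */
static const char *demo_names[4] = { "fpu", "vme", NULL, "pse" };

/* Clear every bit that is requested in *features but absent from
 * 'supported', reporting each one. Returns how many bits were cleared. */
static int trim_features(uint32_t *features, uint32_t supported,
                         const char *names[], int nbits)
{
    int i, trimmed = 0;

    assert(features != NULL && nbits >= 0 && nbits <= 32);
    for (i = 0; i < nbits; ++i) {
        uint32_t mask = UINT32_C(1) << i;

        if ((*features & mask) && !(supported & mask)) {
            printf("Processor feature %s not supported\n",
                   names[i] ? names[i] : "(unnamed)");
            *features &= ~mask;
            trimmed++;
        }
    }
    return trimmed;
}
```

Requesting bits 0, 1 and 3 against a supported mask of 0x3 clears only bit 3 ("pse"); bits that are supported, or that were never requested, are left alone.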
[PATCH 1/4] kvm: Add support for querying supported cpu features
kvm does not support all cpu features; add support for dynamically querying the supported feature set.

Signed-off-by: Avi Kivity a...@redhat.com
---
 kvm.h             |    3 ++
 target-i386/kvm.c |   80 +
 2 files changed, 83 insertions(+), 0 deletions(-)

diff --git a/kvm.h b/kvm.h
index bd4e8d4..c134c45 100644
--- a/kvm.h
+++ b/kvm.h
@@ -124,6 +124,9 @@ void kvm_arch_remove_all_hw_breakpoints(void);

 void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg);

+uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function,
+                                      int reg);
+
 /* generic hooks - to be moved/refactored once there are more users */

 static inline void cpu_synchronize_state(CPUState *env, int modified)
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index b534b2d..5f54ff5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -34,6 +34,86 @@
     do { } while (0)
 #endif

+#ifdef KVM_CAP_EXT_CPUID
+
+static struct kvm_cpuid2 *try_get_cpuid(KVMState *s, int max)
+{
+    struct kvm_cpuid2 *cpuid;
+    int r, size;
+
+    size = sizeof(*cpuid) + max * sizeof(*cpuid->entries);
+    cpuid = (struct kvm_cpuid2 *)qemu_mallocz(size);
+    cpuid->nent = max;
+    r = kvm_ioctl(s, KVM_GET_SUPPORTED_CPUID, cpuid);
+    if (r < 0) {
+        if (r == -E2BIG) {
+            qemu_free(cpuid);
+            return NULL;
+        } else {
+            fprintf(stderr, "KVM_GET_SUPPORTED_CPUID failed: %s\n",
+                    strerror(-r));
+            exit(1);
+        }
+    }
+    return cpuid;
+}
+
+uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, int reg)
+{
+    struct kvm_cpuid2 *cpuid;
+    int i, max;
+    uint32_t ret = 0;
+    uint32_t cpuid_1_edx;
+
+    if (!kvm_check_extension(KVM_CAP_EXT_CPUID)) {
+        return -1U;
+    }
+
+    max = 1;
+    while ((cpuid = try_get_cpuid(env->kvm_state, max)) == NULL) {
+        max *= 2;
+    }
+
+    for (i = 0; i < cpuid->nent; ++i) {
+        if (cpuid->entries[i].function == function) {
+            switch (reg) {
+            case R_EAX:
+                ret = cpuid->entries[i].eax;
+                break;
+            case R_EBX:
+                ret = cpuid->entries[i].ebx;
+                break;
+            case R_ECX:
+                ret = cpuid->entries[i].ecx;
+                break;
+            case R_EDX:
+                ret = cpuid->entries[i].edx;
+                if (function == 0x80000001) {
+                    /* On Intel, kvm returns cpuid according to the Intel spec,
+                     * so add missing bits according to the AMD spec:
+                     */
+                    cpuid_1_edx = kvm_arch_get_supported_cpuid(env, 1, R_EDX);
+                    ret |= cpuid_1_edx & 0xdfeff7ff;
+                }
+                break;
+            }
+        }
+    }
+
+    qemu_free(cpuid);
+
+    return ret;
+}
+
+#else
+
+uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, int reg)
+{
+    return -1U;
+}
+
+#endif
+
 int kvm_arch_init_vcpu(CPUState *env)
 {
     struct {
--
1.6.1.1
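The try_get_cpuid() loop in this patch is an instance of a common pattern: guess a buffer size, let the callee reject it with E2BIG, double and retry. A stand-alone sketch follows, with fake_query() and AVAILABLE as hypothetical stand-ins for the KVM_GET_SUPPORTED_CPUID ioctl and the number of entries the kernel has to report.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Header plus flexible array, mirroring the nent/entries[] layout. */
struct entry_list {
    int nent;
    int entries[];
};

#define AVAILABLE 5   /* hypothetical: entries the backend wants to return */

/* Stand-in for the ioctl: fails with -E2BIG if the buffer is too small,
 * otherwise fills it and reports how many entries were written. */
static int fake_query(struct entry_list *list)
{
    int i;

    if (list->nent < AVAILABLE)
        return -E2BIG;
    for (i = 0; i < AVAILABLE; i++)
        list->entries[i] = i * 10;
    list->nent = AVAILABLE;
    return 0;
}

/* One attempt with room for 'max' entries; NULL means "too small, retry". */
static struct entry_list *try_get(int max)
{
    struct entry_list *list;
    int r;

    assert(max > 0);
    list = calloc(1, sizeof(*list) + (size_t)max * sizeof(list->entries[0]));
    if (!list)
        abort();
    list->nent = max;
    r = fake_query(list);
    if (r == -E2BIG) {
        free(list);
        return NULL;
    }
    if (r < 0)
        abort();
    return list;
}

/* Double the capacity until the query fits. */
static struct entry_list *get_all(void)
{
    struct entry_list *list;
    int max = 1;

    while ((list = try_get(max)) == NULL)
        max *= 2;
    return list;
}
```

Doubling keeps the number of failed attempts logarithmic in the final size, at the cost of allocating up to twice what is needed; for a one-off query at startup that trade-off is usually acceptable.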
Re: [PATCH] Revert Sync idcache after emualted DMA operations for ia64
Hollis Blanchard wrote:

This reverts commit 9dc99a28236161a5a1b4c58f1e9c4ec6179cb976.

Aside from the other issues discussed on kvm-devel, this commit breaks the PowerPC build.

Applied, thanks.

--
error compiling committee.c: too many arguments to function
Re: [PATCH 03/04] qemu-kvm: Remove the dependency for phys_ram_base for ipf.c
Jes Sorensen wrote:
Zhang, Xiantao wrote:
Jes Sorensen wrote:

I still can't see the difference with the patch in Avi's tree except nvram stuff. And I believe the global variable you mentioned should be only used for nvram. So I propose an incremental patch for that. :)

Hi,

Here is an incremental version of the patch. I think the differences should be pretty obvious now :-)

It fixes the memcpy issues in the hob and nvram code and also cleans up the interfaces a lot.

Avi, please add.

Looks good to me. Xiantao?

--
error compiling committee.c: too many arguments to function
Re: [PATCH 03/21] Remove use of signalfd in block-raw-posix.c
Avi Kivity wrote:
Anthony Liguori wrote:

Oh okay. But signal delivery is slow; for example the FPU needs to be reset.

Is it really justified to add all of this extra code (including signalfd emulation) for something that probably isn't even measurable?

We don't have to add signalfd emulation; we can simply use signal+pipe in that case. We won't know if it's measurable or not until we measure it (or not).

I like using wiz-bang features of Linux as much as the next guy, but I think we're stretching to justify it here :-)

I think it's worth it in this case. It will become more important in time, too.

Out of curiosity, I measured this:

[...@balrog test]$ ./signal
 2777 ns/signal (pipe)
  844 ns/signal (signalfd)

At 10000 signals/sec, this will save about 2% cpu time. It's definitely worthwhile for the handful of lines it takes.

test program:

#include <signal.h>
#include <unistd.h>
#include <stdint.h>
#include <sys/signalfd.h>
#include <sys/time.h>
#include <stdio.h>

static int wfd;

/* signal+pipe variant: the handler forwards each signal as one byte */
static void handler(int signum)
{
    char b = 0;

    write(wfd, &b, 1);
}

static int create_pipe(void)
{
    int fd[2];
    sigset_t s;

    pipe(fd);
    wfd = fd[1];
    signal(SIGUSR1, handler);
    sigemptyset(&s);
    sigaddset(&s, SIGUSR1);
    sigprocmask(SIG_UNBLOCK, &s, NULL);
    return fd[0];
}

/* signalfd variant: the signal stays blocked and is read from the fd */
static int create_signalfd(void)
{
    sigset_t s;

    sigemptyset(&s);
    sigaddset(&s, SIGUSR1);
    sigprocmask(SIG_BLOCK, &s, NULL);
    return signalfd(-1, &s, 0);
}

static uint64_t time_usec(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (uint64_t)tv.tv_sec * 1000000 + tv.tv_usec;
}

#define N 1000

static void test(const char *name, int fd, int len)
{
    int i;
    uint64_t t1, t2;
    char buf[128];

    t1 = time_usec();
    for (i = 0; i < N; ++i) {
        raise(SIGUSR1);
        read(fd, buf, len);
    }
    t2 = time_usec();
    close(fd);
    printf("%5d ns/signal (%s)\n", (int)(1000 * (t2 - t1) / N), name);
}

int main(int ac, char **av)
{
    test("pipe", create_pipe(), 1);
    test("signalfd", create_signalfd(), 128);
    return 0;
}

--
error compiling committee.c: too many arguments to function
Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface
Michael S. Tsirkin wrote:

On Mon, Apr 27, 2009 at 02:33:34PM -0400, Gregory Haskins wrote:

This allows an eventfd to be registered as an irq source with a guest. Any signaling operation on the eventfd (via userspace or kernel) will inject the registered GSI at the next available window.

Signed-off-by: Gregory Haskins ghask...@novell.com

If we ever want to use this with e.g. MSI-X emulation in guest, and want to be strictly compliant to MSI-X, we'll need a way for guest to mask interrupts, and for host to report that a masked interrupt is pending. Ideally, all this will be doable with a couple of mmapped pages to avoid vmexits/system calls.

We could do this in two ways:

- move msix entry emulation into the kernel
- require the device to support replacing its irqfd, and juggle it like so:
  - guest disables msi
  - replace device model fd with eventfd belonging to us
  - when the device fires its eventfd, set the irq pending bit
  - guest enables msi
  - if the pending bit is set, fire the interrupt?
  - replace device model fd with the real irqfd

I'm leaning towards the latter, though it's not an easy call.

+static void
+irqfd_inject(struct work_struct *work)
+{
+	struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
+	struct kvm *kvm = irqfd->kvm;
+
+	mutex_lock(&kvm->lock);
+	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
+	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
+	mutex_unlock(&kvm->lock);

This will do weird stuff (deliver the irq twice) if the irq is MSI/MSI-X. I know this was discussed already and is a temporary shortcut, but maybe add a comment that we really want kvm_toggle_irq, so that we won't forget?

If so, that's a bug. MSI should ignore kvm_set_irq(..., 0).

--
error compiling committee.c: too many arguments to function
qemu-kvm: unapplied patches in my queue
Hi,

The following patches have been posted a week ago:

[PATCH] qemu-kvm: make clean should propagate into libkvm dire
[PATCH] qemu-kvm: fix compiler warning
[PATCH] qemu-kvm: make kvm_create_pit static

No comments have been made since then - does this mean they can be applied?

Thanks,

--
MST
Re: qemu-kvm: unapplied patches in my queue
Michael S. Tsirkin wrote:

Hi,

The following patches have been posted a week ago:

[PATCH] qemu-kvm: make clean should propagate into libkvm dire
[PATCH] qemu-kvm: fix compiler warning
[PATCH] qemu-kvm: make kvm_create_pit static

No comments have been made since then - does this mean they can be applied?

I make a note when I apply patches, and make a comment when something is wrong with them. No comment means they're either waiting to be reviewed, or that I forgot about them completely.

--
error compiling committee.c: too many arguments to function
Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface
On Sun, 3 May 2009, Al Viro wrote:

On Mon, Apr 27, 2009 at 02:33:34PM -0400, Gregory Haskins wrote:

+	/* We re-use eventfd for irqfd */
+	fd = sys_eventfd2(0, 0);
+	if (fd < 0) {
+		ret = fd;
+		goto fail;
+	}
+
+	/* We maintain a reference to eventfd for the irqfd lifetime */
+	file = eventfd_fget(fd);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		goto fail;
+	}
+
+	irqfd->file = file;

This is just plain wrong. You have no promise whatsoever that caller of that sucker won't race with e.g. dup2(). IOW, you can't assume that file will be of the expected kind.

The eventfd_fget() checks for the file_operations pointer, before returning the file*, and fails if the fd is not an eventfd. Or you have other concerns?

- Davide
Linux x86 guest panics in skb_copy_bits
Hi all,

I have a pretty straightforward setup.

Hypervisor:
dual xeon e5205 running Gentoo Linux kernel 2.6.27 with virtio devices enabled
kvm 84
libvirt 0.5.1

Guest:
32-bit, virtio for nic and disk, qcow2. Linux 2.6.28.

Network is bridged using tap and brctl.

I'm running Apache on the guest. Whenever I send enough data through the virtual NIC, I get a panic in skb_copy_bits. I've tried using the e1000 driver instead of the virtio one, but that makes no difference.

Has anyone else seen this behavior before? I got this on 2.6.27 and 2.6.28.

Here's a snippet:

[280204.340016] [<c02253e2>] panic+0x4e/0xea
[280204.340016] [<c05906b9>] oops_end+0x8f/0xa3
[280204.340016] [<c0204e94>] die+0x57/0x5f
[280204.340016] [<c0592192>] do_page_fault+0x605/0x6bc
[280204.340016] [<c059010d>] ? _spin_lock+0x15/0x18
[280204.340016] [<c04bfad8>] ? __qdisc_run+0xe6/0x1a7
[280204.340016] [<c0591b8d>] ? do_page_fault+0x0/0x6bc
[280204.340016] [<c0590482>] error_code+0x72/0x78
[280204.340016] [<c04ace8c>] ? skb_copy_bits+0x4f/0x1c4
[280204.340016] [<c0215864>] ? kvm_set_pte+0x26/0x29
[280204.340016] [<c054889c>] xdr_skb_read_bits+0x1f/0x37
[280204.340016] [<c054872b>] xdr_partial_copy_from_skb+0x117/0x16c
[280204.340016] [<c0549ec4>] xs_tcp_data_recv+0x245/0x3de
[280204.340016] [<c054887d>] ? xdr_skb_read_bits+0x0/0x37
[280204.340016] [<c04e07d6>] tcp_read_sock+0x8c/0x1e2
[280204.340016] [<c0549c7f>] ? xs_tcp_data_recv+0x0/0x3de
[280204.340016] [<c054a5d1>] xs_tcp_data_ready+0x54/0x64
[280204.340016] [<c04e9469>] tcp_rcv_established+0x524/0x7b7
[280204.340016] [<c04ee4b2>] tcp_v4_do_rcv+0x173/0x2dc

--
Justin Dossey
Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface
On Sun, May 03, 2009 at 11:07:26AM -0700, Davide Libenzi wrote:

On Sun, 3 May 2009, Al Viro wrote:

On Mon, Apr 27, 2009 at 02:33:34PM -0400, Gregory Haskins wrote:

+	/* We re-use eventfd for irqfd */
+	fd = sys_eventfd2(0, 0);
+	if (fd < 0) {
+		ret = fd;
+		goto fail;
+	}
+
+	/* We maintain a reference to eventfd for the irqfd lifetime */
+	file = eventfd_fget(fd);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		goto fail;
+	}
+
+	irqfd->file = file;

This is just plain wrong. You have no promise whatsoever that caller of that sucker won't race with e.g. dup2(). IOW, you can't assume that file will be of the expected kind.

The eventfd_fget() checks for the file_operations pointer, before returning the file*, and fails if the fd is not an eventfd. Or you have other concerns?

OK, but... it's still wrong. Descriptor numbers are purely for interaction with userland; using them that way violates very general race-prevention rules, even if you do paper over the worst of consequences with check in eventfd_fget().

General rules:
* descriptor you've generated is fit only for return to userland;
* descriptor you've got from userland is fit only for *single* fget() or equivalent, unless you are one of the core syscalls manipulating the descriptor table itself (dup2, etc.)
* once file is installed in descriptor table, you'd better be past the last failure exit; sys_close() on cleanup path is not acceptable. That's what reserving descriptors is for.

IOW, the sane solution would be to export something that returns your struct file *. And stop playing with fd completely.
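The dup2() concern raised in this message can be illustrated from userspace. The following is only a sketch, not kernel code and not part of any posted patch: the function names are made up for illustration, and the dup2() call is issued from the same thread to stand in for what a racing thread could do between the moment a descriptor is created and the moment it is looked up again by number.

```c
#include <sys/eventfd.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: 1 if 'fd' currently names a pipe, 0 if not, -1 on error. */
static int fd_is_pipe(int fd)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	return S_ISFIFO(st.st_mode) ? 1 : 0;
}

/*
 * Hypothetical demo: create an eventfd, then rebind the same descriptor
 * number to a pipe with dup2(), as a racing thread could.
 * Returns 1 if the number named an eventfd first and a pipe afterwards,
 * 0 if the rebinding was not observed, -1 on setup failure.
 */
static int descriptor_rebound_demo(void)
{
	int pipefd[2];
	int efd, before, after;

	efd = eventfd(0, 0);
	if (efd < 0)
		return -1;
	if (pipe(pipefd) < 0) {
		close(efd);
		return -1;
	}

	before = fd_is_pipe(efd);	/* the eventfd is not a pipe */

	/* The number 'efd' is silently re-pointed at the pipe's read end. */
	if (dup2(pipefd[0], efd) < 0) {
		close(efd);
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}

	after = fd_is_pipe(efd);	/* same number, different file */

	close(efd);
	close(pipefd[0]);
	close(pipefd[1]);

	if (before < 0 || after < 0)
		return -1;
	return (before == 0 && after == 1) ? 1 : 0;
}
```

Any code that looks the number up a second time has to cope with getting a different file back, which is why holding on to the struct file * itself is the more robust choice.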
Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface
On Sun, May 03, 2009 at 07:59:40PM +0300, Avi Kivity wrote: Michael S. Tsirkin wrote: On Mon, Apr 27, 2009 at 02:33:34PM -0400, Gregory Haskins wrote: This allows an eventfd to be registered as an irq source with a guest. Any signaling operation on the eventfd (via userspace or kernel) will inject the registered GSI at the next available window. Signed-off-by: Gregory Haskins ghask...@novell.com If we ever want to use this with e.g. MSI-X emulation in guest, and want to be stricly compliant to MSI-X, we'll need a way for guest to mask interrupts, and for host to report that a masked interrupt is pending. Ideally, all this will be doable with a couple of mmapped pages to avoid vmexits/system calls. We could do this in two ways: - move msix entry emulation into the kernel It's not too bad IMO: MSIX is just a table with a list of vectors, you check the mask bit on each interrupt, if masked mark vector pending and poll until unmasked. - require the device to support replacing its irqfd, and juggle it like so: - guest disables msi - replace device model fd with eventfd belonging to us - when the device fires its eventfd, set the irq pending bit - guest enables msi - if the pending bit is set, fire the interrupt? - replace device model fd with the real irqfd Looks like a lot of code. No? I'm leaning towards the latter, though it's not an easy call. Actually there's a third option: add KVM_MASK_IRQ, KVM_UNMASK_IRQ ioctls which will block/unblock guest from getting interrupt on this irq, whatever the source. Interrupts are queued in kernel while masked. A third ioctl KVM_PENDING_IRQS will return the status for a set if IRQs. qemu would call these ioctls when guest edits the MSIX vector control or reads the pending bit array. 
+static void
+irqfd_inject(struct work_struct *work)
+{
+	struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
+	struct kvm *kvm = irqfd->kvm;
+
+	mutex_lock(&kvm->lock);
+	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
+	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
+	mutex_unlock(&kvm->lock);

This will do weird stuff (deliver the irq twice) if the irq is MSI/MSI-X. I know this was discussed already and is a temporary shortcut, but maybe add a comment that we really want kvm_toggle_irq, so that we won't forget?

If so, that's a bug. MSI should ignore kvm_set_irq(..., 0).

-- MST
Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface
Michael S. Tsirkin wrote: On Sun, May 03, 2009 at 07:59:40PM +0300, Avi Kivity wrote: Michael S. Tsirkin wrote: On Mon, Apr 27, 2009 at 02:33:34PM -0400, Gregory Haskins wrote: This allows an eventfd to be registered as an irq source with a guest. Any signaling operation on the eventfd (via userspace or kernel) will inject the registered GSI at the next available window. Signed-off-by: Gregory Haskins ghask...@novell.com If we ever want to use this with e.g. MSI-X emulation in guest, and want to be stricly compliant to MSI-X, we'll need a way for guest to mask interrupts, and for host to report that a masked interrupt is pending. Ideally, all this will be doable with a couple of mmapped pages to avoid vmexits/system calls. We could do this in two ways: - move msix entry emulation into the kernel It's not too bad IMO: MSIX is just a table with a list of vectors, you check the mask bit on each interrupt, if masked mark vector pending and poll until unmasked. Right, but it's more and more core, and more and more bugs. Bugs in the kernel are more expensive to fix and get to users. - require the device to support replacing its irqfd, and juggle it like so: - guest disables msi - replace device model fd with eventfd belonging to us - when the device fires its eventfd, set the irq pending bit - guest enables msi - if the pending bit is set, fire the interrupt? - replace device model fd with the real irqfd Looks like a lot of code. No? We'll need exactly the same code if we do it in the kernel. The only addition is replacing the fd. I'm leaning towards the latter, though it's not an easy call. Actually there's a third option: add KVM_MASK_IRQ, KVM_UNMASK_IRQ ioctls which will block/unblock guest from getting interrupt on this irq, whatever the source. Interrupts are queued in kernel while masked. A third ioctl KVM_PENDING_IRQS will return the status for a set if IRQs. qemu would call these ioctls when guest edits the MSIX vector control or reads the pending bit array. 
I think this is the best option. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
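The mask/unmask/pending semantics discussed in this thread can be sketched as a small state machine. This is a toy userspace model only: the KVM_MASK_IRQ, KVM_UNMASK_IRQ and KVM_PENDING_IRQS ioctls are proposals from the thread, not an existing interface, and every name below is hypothetical. The model also assumes a single pending bit per irq, as in the MSI-X pending bit array, so several interrupts raised while masked collapse into one delivery on unmask.

```c
#include <stdint.h>

#define TOY_NR_IRQS 32	/* hypothetical limit, for illustration only */

struct toy_irq_state {
	uint32_t masked;			/* bit n set: irq n is masked */
	uint32_t pending;			/* bit n set: irq n fired while masked */
	unsigned int delivered[TOY_NR_IRQS];	/* injections that reached the guest */
};

/* An interrupt source fires: deliver it, or latch it if the irq is masked. */
static void toy_irq_raise(struct toy_irq_state *s, unsigned int irq)
{
	uint32_t bit;

	if (irq >= TOY_NR_IRQS)
		return;
	bit = UINT32_C(1) << irq;
	if (s->masked & bit)
		s->pending |= bit;
	else
		s->delivered[irq]++;
}

/* Models the proposed "mask" operation. */
static void toy_irq_mask(struct toy_irq_state *s, unsigned int irq)
{
	if (irq < TOY_NR_IRQS)
		s->masked |= UINT32_C(1) << irq;
}

/* Models the proposed "unmask" operation: a latched interrupt fires now. */
static void toy_irq_unmask(struct toy_irq_state *s, unsigned int irq)
{
	uint32_t bit;

	if (irq >= TOY_NR_IRQS)
		return;
	bit = UINT32_C(1) << irq;
	s->masked &= ~bit;
	if (s->pending & bit) {
		s->pending &= ~bit;
		s->delivered[irq]++;
	}
}

/* Models the proposed "pending" query for a set of irqs given as a bitmask. */
static uint32_t toy_irq_pending(const struct toy_irq_state *s, uint32_t set)
{
	return s->pending & set;
}
```

In this model the device side only ever calls toy_irq_raise(), while the mask, unmask and pending operations are what qemu would drive when the guest edits the MSI-X vector control or reads the pending bit array.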
file descriptor abuses
On Sun, May 03, 2009 at 08:01:36PM +0100, Al Viro wrote:

General rules:
* descriptor you've generated is fit only for return to userland;
* descriptor you've got from userland is fit only for *single* fget() or equivalent, unless you are one of the core syscalls manipulating the descriptor table itself (dup2, etc.)
* once file is installed in descriptor table, you'd better be past the last failure exit; sys_close() on cleanup path is not acceptable. That's what reserving descriptors is for.

IOW, the sane solution would be to export something that returns your struct file *. And stop playing with fd completely.

Speaking of which, quick look through fget() callers shows this turd:

static int p9_socket_open(struct p9_client *client, struct socket *csocket)
{
	int fd, ret;

	fd = sock_map_fd(csocket, 0);
	.
	ret = p9_fd_open(client, fd, fd);
	if (ret < 0) {
		P9_EPRINTK(KERN_ERR, "p9_socket_open: failed to open fd\n");
		sockfd_put(csocket);
		return ret;
	}
	.
	return 0;
}

where p9_fd_open() calls fget() on its 2nd and 3rd arguments. Which does worse than just a leak, AFAICT - on failure exit it leaves a dangling pointer from descriptor table.

On the almost unrelated note, we have (in drivers/sneak_in^Wstaging/usbip) sockfd_to_socket(), with all callers leaking struct file, AFAICS.
Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface
On Sun, May 03, 2009 at 10:17:16PM +0300, Avi Kivity wrote:

Actually there's a third option: add KVM_MASK_IRQ, KVM_UNMASK_IRQ ioctls which will block/unblock guest from getting interrupt on this irq, whatever the source. Interrupts are queued in kernel while masked. A third ioctl KVM_PENDING_IRQS will return the status for a set of IRQs. qemu would call these ioctls when guest edits the MSIX vector control or reads the pending bit array.

I think this is the best option.

Sounds good.

-- MST
Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface
On Sun, 3 May 2009, Al Viro wrote:

IOW, the sane solution would be to export something that returns your struct file *. And stop playing with fd completely.

This builds but it's not tested at all.

- Make all the work of the old anon_inode_getfd(), done by a new anon_inode_getfile(), with anon_inode_getfd() using its services
- Make all the work done by sys_eventfd(), done by a new eventfd_file_create() (which in turn uses anon_inode_getfile()), with sys_eventfd() using its services

IRQfd can use eventfd_file_create(), fget(), get_unused_fd_flags() and fd_install() just before returning. Is that what you had in mind?

- Davide

---
 fs/anon_inodes.c| 68 +---
 fs/eventfd.c| 44 +---
 include/linux/anon_inodes.h |3 +
 include/linux/eventfd.h |6 +++
 4 files changed, 92 insertions(+), 29 deletions(-)

Index: linux-2.6.mod/fs/anon_inodes.c
===
--- linux-2.6.mod.orig/fs/anon_inodes.c	2009-05-03 12:21:09.0 -0700
+++ linux-2.6.mod/fs/anon_inodes.c	2009-05-03 12:54:02.0 -0700
@@ -64,28 +64,24 @@ static const struct dentry_operations an
  *
  * Creates a new file by hooking it on a single inode. This is useful for files
  * that do not need to have a full-fledged inode in order to operate correctly.
- * All the files created with anon_inode_getfd() will share a single inode,
+ * All the files created with anon_inode_getfile() will share a single inode,
  * hence saving memory and avoiding code duplication for the file/inode/dentry
- * setup. Returns new descriptor or -error.
+ * setup. Returns the newly created file* or error.
  */
-int anon_inode_getfd(const char *name, const struct file_operations *fops,
-		     void *priv, int flags)
+struct file *anon_inode_getfile(const char *name,
+				const struct file_operations *fops,
+				void *priv, int flags)
 {
 	struct qstr this;
 	struct dentry *dentry;
 	struct file *file;
-	int error, fd;
+	int error;

 	if (IS_ERR(anon_inode_inode))
-		return -ENODEV;
+		return ERR_PTR(-ENODEV);

 	if (fops->owner && !try_module_get(fops->owner))
-		return -ENOENT;
-
-	error = get_unused_fd_flags(flags);
-	if (error < 0)
-		goto err_module;
-	fd = error;
+		return ERR_PTR(-ENOENT);

 	/*
 	 * Link the inode to a directory entry by creating a unique name
@@ -97,7 +93,7 @@ int anon_inode_getfd(const char *name, c
 	this.hash = 0;
 	dentry = d_alloc(anon_inode_mnt->mnt_sb->s_root, &this);
 	if (!dentry)
-		goto err_put_unused_fd;
+		goto err_module;

 	/*
 	 * We know the anon_inode inode count is always greater than zero,
@@ -123,16 +119,54 @@ int anon_inode_getfd(const char *name, c
 	file->f_version = 0;
 	file->private_data = priv;

+	return file;
+
+err_dput:
+	dput(dentry);
+err_module:
+	module_put(fops->owner);
+	return ERR_PTR(error);
+}
+EXPORT_SYMBOL_GPL(anon_inode_getfile);
+
+/**
+ * anon_inode_getfd - creates a new file instance by hooking it up to an
+ *                    anonymous inode, and a dentry that describe the class
+ *                    of the file
+ *
+ * @name:    [in]    name of the class of the new file
+ * @fops:    [in]    file operations for the new file
+ * @priv:    [in]    private data for the new file (will be file's private_data)
+ * @flags:   [in]    flags
+ *
+ * Creates a new file by hooking it on a single inode. This is useful for files
+ * that do not need to have a full-fledged inode in order to operate correctly.
+ * All the files created with anon_inode_getfd() will share a single inode,
+ * hence saving memory and avoiding code duplication for the file/inode/dentry
+ * setup. Returns new descriptor or -error.
+ */
+int anon_inode_getfd(const char *name, const struct file_operations *fops,
+		     void *priv, int flags)
+{
+	int error, fd;
+	struct file *file;
+
+	error = get_unused_fd_flags(flags);
+	if (error < 0)
+		return error;
+	fd = error;
+
+	file = anon_inode_getfile(name, fops, priv, flags);
+	if (IS_ERR(file)) {
+		error = PTR_ERR(file);
+		goto err_put_unused_fd;
+	}
 	fd_install(fd, file);

 	return fd;

-err_dput:
-	dput(dentry);
 err_put_unused_fd:
 	put_unused_fd(fd);
-err_module:
-	module_put(fops->owner);
 	return error;
 }
 EXPORT_SYMBOL_GPL(anon_inode_getfd);
Index: linux-2.6.mod/fs/eventfd.c
===
--- linux-2.6.mod.orig/fs/eventfd.c	2009-05-03 12:21:09.0 -0700
+++ linux-2.6.mod/fs/eventfd.c	2009-05-03
Custom BIOS supported size
Hello,

What is the maximum size supported for a custom BIOS image (e.g. coreboot-based)? I tried some 256K coreboot BIOS images and they seemed to work fine, but it blew up with a 3MB image (which by the way works on qemu just fine). Qemu also seems to fail with images greater than 4MB, and I'd appreciate it if either kvm or qemu had support for such images, and maybe even greater than that if technically possible.

You can get such images from http://www.coreboot.org/QEMU
The 3MB one that I tried can be fetched from http://panzer.utcluj.ro/~alien/coreboot/AVATT/BIOS/OpenVZ/bios.bin

Thanks!
Cristi

-- Ing. Cristi Măgherușan, System/Network Engineer Technical University of Cluj-Napoca, Romania http://cc.utcluj.ro +40264 401247
[RFC] Bring in all the Linux headers we depend on in QEMU
Sorry this explanation is long winded, but this is a messy situation.

In Linux, there isn't a very consistent policy about userspace kernel header inclusion. On a typical Linux system, you're likely to find kernel headers in three places.

glibc headers (/usr/include/{linux,asm})

These headers are installed by glibc. They very often are based on much older kernel versions than the kernel you have in your distribution. For software that depends on these headers, very often this means that your software detects features being missing that are present on your kernel. Furthermore, glibc only installs the headers it needs so very often certain headers have dependencies that aren't met. A classic example is linux/compiler.h and the broken usbdevice_fs.h header that depends on it. There are still distributions today that QEMU doesn't compile on because of this. Today, most of QEMU's code depends on these headers.

/lib/modules/$(uname -r)/build

These are the kernel headers that are installed as part of your kernel. In general, this is a pretty good place to find the headers that are associated with the kernel version you're actually running on. However, these headers are part of the kernel build tree and are not always guaranteed to be includable from userspace.

random kernel tree

Developers, in particular, like to point things at their random kernel trees. In general though, relying on a full kernel source tree being available isn't a good idea. Kernel headers change dramatically across versions too so it's very likely that we would need to have a lot of #ifdefs dependent on kernel versions, or some of the uglier work arounds we have in usb-linux.c.

I think the best way to avoid #ifdefs and dependencies on broken/incomplete glibc headers is to include all of the Linux headers we need within QEMU. The attached patch does just this.

I think there's room for discussion about whether we really want to do this.
We could potentially depend on some more common glibc headers (like asm/types.h) while bringing in less reliable headers (if_tun.h/virtio*). Including them all seems like the most robust solution to me though. Comments? Regards, Anthony Liguori From b42aa57e94fc6b05897271b0816fa34d70670c9a Mon Sep 17 00:00:00 2001 From: Anthony Liguori aligu...@us.ibm.com Date: Sat, 2 May 2009 15:45:24 -0500 Subject: [PATCH] Import Linux headers into QEMU These are the headers from Linux 2.6.29 that have been modified to work under QEMU. This includes the necessary scripts to generate the headers from the original Linux source tree. I've not included the headers in this patch as they are quite big and would make review more difficult. I've included headers for all host architectures QEMU currently supports but I've only tested x86. Signed-off-by: Anthony Liguori aligu...@us.ibm.com --- Makefile.target |3 -- configure | 37 ++- linux/Makefile | 108 +++ linux/README| 19 ++ linux/fixup.sed | 19 ++ usb-linux.c |1 - 6 files changed, 165 insertions(+), 22 deletions(-) create mode 100644 linux/Makefile create mode 100644 linux/README create mode 100644 linux/fixup.sed diff --git a/Makefile.target b/Makefile.target index f735105..474245a 100644 --- a/Makefile.target +++ b/Makefile.target @@ -124,9 +124,6 @@ CFLAGS+=-I/opt/SUNWspro/prod/include/cc endif endif -kvm.o: CFLAGS+=$(KVM_CFLAGS) -kvm-all.o: CFLAGS+=$(KVM_CFLAGS) - all: $(PROGS) # diff --git a/configure b/configure index 82fb60a..379a2a6 100755 --- a/configure +++ b/configure @@ -1081,6 +1081,23 @@ EOF fi fi +# Linux kernel headers CFLAGS +if test -z $kerneldir ; then +linux_cflags=-I$source_path/linux +else +linux_cflags=-I$kerneldir/include +if test \( $cpu = i386 -o $cpu = x86_64 \) \ + -a -d $kerneldir/arch/x86/include ; then + linux_cflags=$linux_cflags -I$kerneldir/arch/x86/include +elif test $cpu = ppc -a -d $kerneldir/arch/powerpc/include ; then + linux_cflags=$linux_cflags -I$kerneldir/arch/powerpc/include +elif test -d 
$kerneldir/arch/$cpu/include ; then + linux_cflags=$linux_cflags -I$kerneldir/arch/$cpu/include +fi +fi + +OS_CFLAGS=$OS_CFLAGS $linux_cflags + ## # kvm probe if test $kvm = yes ; then @@ -1100,27 +1117,14 @@ if test $kvm = yes ; then #endif int main(void) { return 0; } EOF - if test $kerneldir != ; then - kvm_cflags=-I$kerneldir/include - if test \( $cpu = i386 -o $cpu = x86_64 \) \ - -a -d $kerneldir/arch/x86/include ; then -kvm_cflags=$kvm_cflags -I$kerneldir/arch/x86/include - elif test $cpu = ppc -a -d $kerneldir/arch/powerpc/include ; then - kvm_cflags=$kvm_cflags -I$kerneldir/arch/powerpc/include -elif test -d $kerneldir/arch/$cpu/include ; then -kvm_cflags=$kvm_cflags
Re: kvm-userspace broken?
Avi Kivity wrote:

Oliver Rath wrote:

Hi List, maybe I missed some announcements, but git clone git://git.kernel.org/pub/scm/virt/kvm/kvm-userspace.git gives no response:

kvm-userspace # git clone git://git.kernel.org/pub/scm/virt/kvm/kvm-userspace.git
Initialized empty Git repository in /home/oliver/kvm-userspace/kvm-userspace/.git/
fatal: The remote end hung up unexpectedly

What's up there?

kvm-userspace.git has been retired; it's now playing golf in git://git.kernel.org/pub/scm/virt/kvm/retired/kvm-userspace.git. Use git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git instead.

The latest tarball on sourceforge is kvm-85, yet the clone I just made from qemu-kvm 'git describes' itself as kvm-84-756-gf7d114d. Is that right?

-- Hans
Re: KVM x86_64 with SR-IOV..?
On Sat, 2009-05-02 at 18:22 +0800, Sheng Yang wrote: On Thu, Apr 30, 2009 at 01:22:54PM -0700, Nicholas A. Bellinger wrote: Greetings KVM folks, I wondering if any information exists for doing SR-IOV on the new VT-d capable chipsets with KVM..? From what I understand the patches for doing this with KVM are floating around, but I have been unable to find any user-level docs for actually making it all go against a upstream v2.6.30-rc3 code.. So far I have been doing IOV testing with Xen 3.3 and 3.4.0-pre, and I am really hoping to be able to jump to KVM for single-function and and then multi-function SR-IOV. I know that the VM migration stuff for IOV in Xen is up and running, and I assume it is being worked in for KVM instance migration as well..? This part is less important (at least for me :-) than getting a stable SR-IOV setup running under the KVM hypervisor.. Does anyone have any pointers for this..? Any comments or suggestions are appreciated! Hi Nicholas The patches are not floating around now. As you know, SR-IOV for Linux have been in 2.6.30, so then you can use upstream KVM and qemu-kvm(or recent released kvm-85) with 2.6.30-rc3 as host kernel. And some time ago, there are several SRIOV related patches for qemu-kvm, and now they all have been checked in. And for KVM, the extra document is not necessary, for you can simple assign a VF to guest like any other devices. And how to create VF is specific for each device driver. So just create a VF then assign it to KVM guest is fine. Greetings Sheng, So, I have been trying the latest kvm-85 release on a v2.6.30-rc3 checkout from linux-2.6.git on a CentOS 5u3 x86_64 install on Intel IOH-5520 based dual socket Nehalem board. 
I have enabled DMAR and Interrupt Remapping my KVM host using v2.6.30-rc3 and from what I can tell, the KVM_CAP_* defines from libkvm are enabled with building kvm-85 after './configure --kerneldir=/usr/src/linux-2.6.git' and the PCI passthrough code is being enabled in kvm-85/qemu/hw/device-assignment.c AFAICT.. From there, I use the freshly installed qemu-x86_64-system binary to start a Debian 5 x86_64 HVM (that previously had been moving network packets under Xen for PCIe passthrough). I see the MSI-X interrupt remapping working on the KVM host for the passed -pcidevice, and the MMIO mappings from the qemu build that I also saw while using Xen/qemu-dm built with PCI passthrough are there as well.. But while the KVM guest is booting, I see the following exception(s) from qemu-x86_64-system for one of the VFs for a multi-function PCIe device: BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1) I try with one of the on-board e1000e ports (02:00.0) and I see the same exception along with some MSI-X exceptions from qemu-x86_64-system in KVM guest.. However, I am still able to see the e1000e and the other vxge multi-function device with lspci, but I am unable to dhcp or ping with the e1000e and VF from multi-function device fails to register the MSI-X interrupt in the guest.. S, I enabled the debugging code in kvm-85/qemu/hw/device-assignment.c and see the PAGE aligned MMIO memory for the passed PCIe device is being released during the BUG exceptions above.. Is there something else I should be looking at..? I have pci-stub enabled, and I unbind 02:00.0 from /sys/bus/pci/drivers/e1000e/unbind successfully (just like with Xen and pciback), but I am unable to do the 'echo -n 02:00.0 /sys/bus/pci/drivers/pci-stub/bind' (it returns write error, no such device, with no dmesg output) on the KVM host running v2.6.30-rc3. Is this supposed to happen on v2.6.30-rc3 with pci-stub..? I am also using the the kvm-85 source dist kvm_intel.ko and kvm.ko kernel modules. 
Is there something I am missing when building kvm-85 for SR-IOV passthrough..? Also FYI, I am having to use pci=resource_alignment= because the BIOS does not PAGE_SIZE align the MMIO BARs for my multi-function devices.. Also, I tried with disabling the DMAR with the Intel IOMMU passthrough from this patch: https://lists.linux-foundation.org/pipermail/iommu/2009-April/001339.html that did not make it into v2.6.30-rc3. The patch logic was enabled but still I saw the same kvm exceptions from qemu-system-x86_64. Anyways, I am going to give it a shot with the Fedora 11 x86_64 Preview and see if it works as expected with a IOH-5520 chipset with the AMI BIOS on a Tyan S7010 with Xeon 5520s. Hopefully this is just a kvm-85 build and/or install issue I am seeing on my CentOS 5u3 install (that has a Xen PCIe passthrough setup on it as well) with v2.6.30-rc3. I will try on a fresh install on a distro with the new KVM logic and see what happens. :-) Thanks for your comments! --nab -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] kvm-kmod: fix build on kernels with kvm trace set
Avi Kivity wrote: Michael S. Tsirkin wrote: CONFIG_KVM_TRACE in kernel conflicts with the definition in external module. external-module-compat-comm.h tried to work around this, but this didn't work as some code still does #include linux/autoconf.h directly. Solve this differently by s/CONFIG_KVM_TRACE/CONFIG_KMOD_KVM_TRACE/ in awk. Had to tighten regular expressions in hack-module.awk so that they don't trigger on kvm_host.h . Signed-off-by: Michael S. Tsirkin m...@redhat.com --- Makefile |5 +++-- configure |2 +- external-module-compat-comm.h |7 --- x86/Kbuild|2 +- x86/hack-module.awk |8 +--- 5 files changed, 10 insertions(+), 14 deletions(-) diff --git a/Makefile b/Makefile index f2ef811..9cdc0af 100644 --- a/Makefile +++ b/Makefile @@ -34,8 +34,8 @@ hack-files-ia64 = kvm_main.c kvm_fw.c kvm_lib.c kvm-ia64.c hack-files = $(hack-files-$(ARCH_DIR)) -ifeq ($(EXT_CONFIG_KVM_TRACE),y) -module_defines += -DEXT_CONFIG_KVM_TRACE=y +ifeq ($(CONFIG_KMOD_KVM_TRACE),y) +module_defines += -DCONFIG_KMOD_KVM_TRACE=1 endif all:: prerequisite @@ -72,6 +72,7 @@ header-sync: for i in $$(find $T -name '*.h'); do \ $(call unifdef,$$i); done $(call hack, include/linux/kvm.h) +$(call hack, include/linux/kvm_host.h) $(call hack, include/asm-$(ARCH_DIR)/kvm.h) set -e for i in $$(find $T -type f -printf '%P '); \ do mkdir -p $$(dirname $$i); cmp -s $$i $T/$$i || cp $T/$$i $$i; done diff --git a/configure b/configure index 30af6e7..6e12bb1 100755 --- a/configure +++ b/configure @@ -122,5 +122,5 @@ DEPMOD_VERSION=$depmod_version EOF cat EOF config.kbuild -EXT_CONFIG_KVM_TRACE=$kvm_trace +CONFIG_KMOD_KVM_TRACE=$kvm_trace EOF diff --git a/external-module-compat-comm.h b/external-module-compat-comm.h index c955927..e561448 100644 --- a/external-module-compat-comm.h +++ b/external-module-compat-comm.h @@ -18,13 +18,6 @@ #include linux/hrtimer.h #include asm/bitops.h -/* Override CONFIG_KVM_TRACE */ -#ifdef EXT_CONFIG_KVM_TRACE -# define CONFIG_KVM_TRACE 1 -#else -# undef CONFIG_KVM_TRACE 
-#endif - /* * 2.6.16 does not have GFP_NOWAIT */ diff --git a/x86/Kbuild b/x86/Kbuild index d3aca00..fbdb28b 100644 --- a/x86/Kbuild +++ b/x86/Kbuild @@ -7,7 +7,7 @@ kvm-objs := kvm_main.o x86.o mmu.o x86_emulate.o ../anon_inodes.o irq.o i8259.o lapic.o ioapic.o preempt.o i8254.o coalesced_mmio.o irq_comm.o \ timer.o \ ../external-module-compat.o -ifeq ($(EXT_CONFIG_KVM_TRACE),y) +ifeq ($(CONFIG_KMOD_KVM_TRACE),y) kvm-objs += kvm_trace.o endif ifeq ($(CONFIG_IOMMU_API),y) diff --git a/x86/hack-module.awk b/x86/hack-module.awk index 260eeef..f3d95be 100644 --- a/x86/hack-module.awk +++ b/x86/hack-module.awk @@ -4,7 +4,7 @@ BEGIN { split(INIT_WORK desc_struct ldttss_desc64 desc_ptr \ hrtimer_expires_remaining \ on_each_cpu relay_open request_irq , compat_apis); } -/^int kvm_init\(/ { anon_inodes = 1 } +/^int kvm_init\([^)]*\)$/ { anon_inodes = 1 } /return 0;/ anon_inodes { print \tr = kvm_init_anon_inodes();; @@ -17,7 +17,7 @@ BEGIN { split(INIT_WORK desc_struct ldttss_desc64 desc_ptr \ anon_inodes = 0 } -/^void kvm_exit/ { anon_inodes_exit = 1 } +/^void kvm_exit[^)]*\)$/ { anon_inodes_exit = 1 } /\}/ anon_inodes_exit { print \tkvm_exit_anon_inodes();; @@ -25,7 +25,7 @@ BEGIN { split(INIT_WORK desc_struct ldttss_desc64 desc_ptr \ anon_inodes_exit = 0 } -/^int kvm_arch_init/ { kvm_arch_init = 1 } +/^int kvm_arch_init[^)])$/ { kvm_arch_init = 1 } /\tsc_khz\/ kvm_arch_init { sub(\\tsc_khz\\, kvm_tsc_khz) } /^}/ { kvm_arch_init = 0 } @@ -85,6 +85,8 @@ BEGIN { split(INIT_WORK desc_struct ldttss_desc64 desc_ptr \ /\kvm_.*_fops\.owner = module;/ { $0 = IF_ANON_INODES_DOES_REFCOUNTS( $0 ) } +{ sub(/\CONFIG_KVM_TRACE\/, CONFIG_KMOD_KVM_TRACE) } + { print } /unsigned long flags;/ vmx_load_host_state { Xiantao, do we need to change this for ia64? IA64 didn't support kvm trace, so doesn't need these changes , thanks! 
:) Xiantao
RE: [PATCH] Revert Sync idcache after emualted DMA operations for ia64
Avi Kivity wrote:

Hollis Blanchard wrote:

This reverts commit 9dc99a28236161a5a1b4c58f1e9c4ec6179cb976. Aside from the other issues discussed on kvm-devel, this commit breaks the PowerPC build.

Applied, thanks.

Hollis, could you explain why this patch breaks the powerpc build? qemu_sync_icache has a definition for the non-ia64 case, so it shouldn't break any arch-specific build.

Xiantao
RE: [PATCH 03/04] qemu-kvm: Remove the dependency for phys_ram_base for ipf.c
Avi Kivity wrote: Jes Sorensen wrote: Zhang, Xiantao wrote: Jes Sorensen wrote: I still can't see the difference with the patch in Avi's tree except nvram stuff. And I believe the global variable you mentioned should be only used for nvram. So I propose an incremental patch for that. :) Hi, Here is an incremental version of the patch. I think the differences should be pretty obvious now :-) It fixes the memcpy issues in the hob and nvram code and also cleans up the interfaces a lot. Avi, please add. Looks good to me. Xiantao? Hi, Jes Have you tested nvram support with this patch? I Xiantao-- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Assign the correct pci id range to virtio_pci
On Mon, 27 Apr 2009 12:53:25 pm Pantelis Koukousoulas wrote:

On Mon, Apr 27, 2009 at 3:44 AM, Anthony Liguori anth...@codemonkey.ws wrote:

Would be good to at least include the experiment range in case people are making third-party virtio modules and want to play around without replacing virtio-{pci,*}.

I'd be happy with a simple comment explaining the 0x103f (e.g., /* Not yet using the full 0x1000 - 0x10ef to hedge our bets in case we broke the ABI. */ as explained above)

Thanks, I like your patch. Where did this idea of experimental range come from, BTW? I prefer your module cmdline approach, as it discourages deployment with such numbers.

Thanks,
Rusty.
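The two ranges mentioned in the quoted comment can be written down as a pair of checks. This is only an illustrative sketch: the macro and function names are hypothetical, not taken from the virtio_pci driver, and the only values used are the 0x1000, 0x103f and 0x10ef bounds quoted in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bounds as quoted in the thread; the names are made up for this sketch. */
#define TOY_VIRTIO_ID_FIRST		0x1000u
#define TOY_VIRTIO_ID_LAST_CLAIMED	0x103fu	/* what the driver matches today */
#define TOY_VIRTIO_ID_LAST_FULL		0x10efu	/* end of the full range */

/* True if the driver would claim this PCI device id. */
static bool toy_virtio_id_claimed(uint16_t device_id)
{
	return device_id >= TOY_VIRTIO_ID_FIRST &&
	       device_id <= TOY_VIRTIO_ID_LAST_CLAIMED;
}

/* True if the id lies in the part of the full range held back for now. */
static bool toy_virtio_id_held_back(uint16_t device_id)
{
	return device_id > TOY_VIRTIO_ID_LAST_CLAIMED &&
	       device_id <= TOY_VIRTIO_ID_LAST_FULL;
}
```

Keeping the claimed range narrower than the full one leaves the upper ids free to be assigned differently later, which is the hedge the proposed comment describes.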
Re: KVM x86_64 with SR-IOV..?
On Monday 04 May 2009 08:53:07 Nicholas A. Bellinger wrote: On Sat, 2009-05-02 at 18:22 +0800, Sheng Yang wrote: On Thu, Apr 30, 2009 at 01:22:54PM -0700, Nicholas A. Bellinger wrote: Greetings KVM folks, I wondering if any information exists for doing SR-IOV on the new VT-d capable chipsets with KVM..? From what I understand the patches for doing this with KVM are floating around, but I have been unable to find any user-level docs for actually making it all go against a upstream v2.6.30-rc3 code.. So far I have been doing IOV testing with Xen 3.3 and 3.4.0-pre, and I am really hoping to be able to jump to KVM for single-function and and then multi-function SR-IOV. I know that the VM migration stuff for IOV in Xen is up and running, and I assume it is being worked in for KVM instance migration as well..? This part is less important (at least for me :-) than getting a stable SR-IOV setup running under the KVM hypervisor.. Does anyone have any pointers for this..? Any comments or suggestions are appreciated! Hi Nicholas The patches are not floating around now. As you know, SR-IOV for Linux have been in 2.6.30, so then you can use upstream KVM and qemu-kvm(or recent released kvm-85) with 2.6.30-rc3 as host kernel. And some time ago, there are several SRIOV related patches for qemu-kvm, and now they all have been checked in. And for KVM, the extra document is not necessary, for you can simple assign a VF to guest like any other devices. And how to create VF is specific for each device driver. So just create a VF then assign it to KVM guest is fine. Greetings Sheng, So, I have been trying the latest kvm-85 release on a v2.6.30-rc3 checkout from linux-2.6.git on a CentOS 5u3 x86_64 install on Intel IOH-5520 based dual socket Nehalem board. 
I have enabled DMAR and Interrupt Remapping on my KVM host using v2.6.30-rc3 and from what I can tell, the KVM_CAP_* defines from libkvm are enabled when building kvm-85 after './configure --kerneldir=/usr/src/linux-2.6.git' and the PCI passthrough code is being enabled in kvm-85/qemu/hw/device-assignment.c AFAICT.. From there, I use the freshly installed qemu-x86_64-system binary to start a Debian 5 x86_64 HVM (that previously had been moving network packets under Xen for PCIe passthrough). I see the MSI-X interrupt remapping working on the KVM host for the passed -pcidevice, and the MMIO mappings from the qemu build that I also saw while using Xen/qemu-dm built with PCI passthrough are there as well.. Hi Nicholas But while the KVM guest is booting, I see the following exception(s) from qemu-x86_64-system for one of the VFs for a multi-function PCIe device: BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1) This one is mostly harmless. I try with one of the on-board e1000e ports (02:00.0) and I see the same exception along with some MSI-X exceptions from qemu-x86_64-system in KVM guest.. However, I am still able to see the e1000e and the other vxge multi-function device with lspci, but I am unable to dhcp or ping with the e1000e and VF from multi-function device fails to register the MSI-X interrupt in the guest.. Did you see the interrupt in the guest and host side? I think you can try the on-board e1000e for MSI-X first. And please ensure the correlated driver has been loaded correctly. And what do you mean by some MSI-X exceptions? Better with the log. So, I enabled the debugging code in kvm-85/qemu/hw/device-assignment.c and see the PAGE aligned MMIO memory for the passed PCIe device is being released during the BUG exceptions above.. Is there something else I should be looking at..? That part of memory should be released to trap MMIO for the MSI-X table.
I have pci-stub enabled, and I unbind 02:00.0 from /sys/bus/pci/drivers/e1000e/unbind successfully (just like with Xen and pciback), but I am unable to do the 'echo -n 02:00.0 > /sys/bus/pci/drivers/pci-stub/bind' (it returns write error, no such device, with no dmesg output) on the KVM host running v2.6.30-rc3. Is this supposed to happen on v2.6.30-rc3 with pci-stub..? Maybe you need echo 0000:02:00.0 > /sys/bus/pci/drivers/pci-stub/bind? I am also using the kvm-85 source dist kvm_intel.ko and kvm.ko kernel modules. Is there something I am missing when building kvm-85 for SR-IOV passthrough..? I think the first thing is to confirm that device assignment works in your environment, using the on-board card. You can also refer to http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM And you can post debug_device_assignment=1 log and qemu log and the tail of dmesg as well. Thanks! -- regards Yang, Sheng Also FYI, I am having to use pci=resource_alignment= because the BIOS does not PAGE_SIZE align the MMIO BARs for my multi-function devices.. Also, I tried with disabling the DMAR with the Intel IOMMU passthrough from this patch:
Re: [PATCH] Assign the correct pci id range to virtio_pci
I'd be happy with a simple comment explaining the 0x103f (e.g., /* Not yet using the full 0x1000 - 0x10ef to hedge our bets in case we broke the ABI. */ as explained above) Thanks, I like your patch. Where did this idea of experimental range come from, BTW? In the qemu sources, there is a file pci-ids.txt that documents the PCI ID rules. I'm attaching it for your convenience. I prefer your module cmdline approach, as it discourages deployment with such numbers. Great, I like this way better too, because it allows using the full experimental range (16 IDs) while also allowing for breaking the virtio_pci ABI. Thanks, Pantelis

PCI IDs for qemu

Red Hat, Inc. donates a part of its device ID range to qemu, to be used for
virtual devices. The vendor ID is 1af4 (formerly Qumranet ID).

The 1000 - 10ff device ID range is used for VirtIO devices.

The 1100 device ID is used as PCI Subsystem ID for existing hardware
devices emulated by qemu.

All other device IDs are reserved.

VirtIO Device IDs

1af4:1000  network device
1af4:1001  block device
1af4:1002  balloon device
1af4:1003  console device

1af4:1004  Reserved.
  to       Contact Gerd Hoffmann kra...@redhat.com to get a
1af4:10ef  device ID assigned for your new virtio device.

1af4:10f0  Available for experimental usage without registration.
  to       Must get official ID when the code leaves the test lab
1af4:10ff  (i.e. when seeking upstream merge or shipping a distro/product)
           to avoid conflicts.
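The ID layout in pci-ids.txt can be read as a simple range check. The following is only a sketch: the enum, the function names, and the choice to classify IDs this way are assumptions made for illustration and do not come from qemu or the virtio_pci driver. Only the numeric ranges are taken from the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Vendor ID named in pci-ids.txt (formerly the Qumranet ID). */
#define QEMU_PCI_VENDOR_ID 0x1af4u

/* Hypothetical classification of a device ID; the names are illustrative. */
enum qemu_id_class {
    QEMU_ID_OTHER_VENDOR,        /* vendor is not 1af4 */
    QEMU_ID_VIRTIO_KNOWN,        /* 1000..1003: network, block, balloon, console */
    QEMU_ID_VIRTIO_UNASSIGNED,   /* 1004..10ef: reserved, assigned on request */
    QEMU_ID_VIRTIO_EXPERIMENTAL, /* 10f0..10ff: usable without registration */
    QEMU_ID_SUBSYSTEM,           /* 1100: subsystem ID for emulated hardware */
    QEMU_ID_RESERVED             /* any other device ID under 1af4 */
};

/* Map a vendor/device pair onto the ranges listed in pci-ids.txt. */
enum qemu_id_class qemu_classify_id(uint16_t vendor, uint16_t device)
{
    if (vendor != QEMU_PCI_VENDOR_ID)
        return QEMU_ID_OTHER_VENDOR;
    if (device >= 0x1000 && device <= 0x1003)
        return QEMU_ID_VIRTIO_KNOWN;
    if (device >= 0x1004 && device <= 0x10ef)
        return QEMU_ID_VIRTIO_UNASSIGNED;
    if (device >= 0x10f0 && device <= 0x10ff)
        return QEMU_ID_VIRTIO_EXPERIMENTAL;
    if (device == 0x1100)
        return QEMU_ID_SUBSYSTEM;
    return QEMU_ID_RESERVED;
}

/* Human-readable label for a class, e.g. for a debug message. */
const char *qemu_id_class_name(enum qemu_id_class c)
{
    static const char *const names[] = {
        "other vendor", "virtio (assigned)", "virtio (unassigned)",
        "virtio (experimental)", "subsystem ID", "reserved"
    };

    assert((size_t)c < sizeof(names) / sizeof(names[0]));
    return names[c];
}
```

Under this reading, the 0x103f mentioned in the thread falls in the unassigned virtio range, and the experimental range 10f0 to 10ff covers the 16 IDs Pantelis refers to.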
Re: [RFC 1/2] Add MCE simulation support to qemu/tcg
I found there is a qemu-kvm.git on git.kernel.org. I will re-base my patches for that. And I found that the kvm support in qemu (target-i386/kvm.c) is quite different from that in qemu-kvm (kvm/). Where should MCE support go? To target-i386/kvm.c or kvm/ or both? Best Regards, Huang Ying

On Thu, 2009-04-30 at 16:53 +0800, Huang Ying wrote:

- MCE features are initialized when VCPU is initialized according to CPUID.
- A monitor command mce is added to inject a MCE.
- A new interrupt mask: CPU_INTERRUPT_MCE is added to inject the MCE.

Signed-off-by: Huang Ying ying.hu...@intel.com

---
 cpu-all.h               |    4 ++
 cpu-exec.c              |    4 ++
 monitor.c               |   49 +
 target-i386/cpu.h       |   22 +++
 target-i386/helper.c    |   70
 target-i386/op_helper.c |   34 +++
 6 files changed, 183 insertions(+)

--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -202,6 +202,7 @@
 #define CR4_DE_MASK  (1 << 3)
 #define CR4_PSE_MASK (1 << 4)
 #define CR4_PAE_MASK (1 << 5)
+#define CR4_MCE_MASK (1 << 6)
 #define CR4_PGE_MASK (1 << 7)
 #define CR4_PCE_MASK (1 << 8)
 #define CR4_OSFXSR_SHIFT 9
@@ -248,6 +249,17 @@
 #define PG_ERROR_RSVD_MASK 0x08
 #define PG_ERROR_I_D_MASK  0x10
 
+#define MCE_CAP_DEF     0x100
+#define MCE_BANKS_DEF   4
+
+#define MCG_CTL_P       (1UL<<8)
+
+#define MCG_STATUS_MCIP (1UL<<2)
+
+#define MCI_STATUS_VAL  (1UL<<63)
+#define MCI_STATUS_OVER (1UL<<62)
+#define MCI_STATUS_UC   (1UL<<61)
+
 #define MSR_IA32_TSC          0x10
 #define MSR_IA32_APICBASE     0x1b
 #define MSR_IA32_APICBASE_BSP (1<<8)
@@ -288,6 +300,11 @@
 
 #define MSR_MTRRdefType 0x2ff
 
+#define MSR_MC0_CTL     0x400
+#define MSR_MC0_STATUS  0x401
+#define MSR_MC0_ADDR    0x402
+#define MSR_MC0_MISC    0x403
+
 #define MSR_EFER        0xc0000080
 
 #define MSR_EFER_SCE    (1 << 0)
@@ -673,6 +690,11 @@ typedef struct CPUX86State {
     /* in order to simplify APIC support, we leave this pointer to the
        user */
     struct APICState *apic_state;
+
+    uint64 mcg_cap;
+    uint64 mcg_status;
+    uint64 mcg_ctl;
+    uint64 *mce_banks;
 } CPUX86State;
 
 CPUX86State *cpu_x86_init(const char *cpu_model);
--- a/target-i386/op_helper.c
+++ b/target-i386/op_helper.c
@@ -3133,7 +3133,23 @@ void helper_wrmsr(void)
     case MSR_MTRRdefType:
         env->mtrr_deftype = val;
         break;
+    case MSR_MCG_STATUS:
+        env->mcg_status = val;
+        break;
+    case MSR_MCG_CTL:
+        if ((env->mcg_cap & MCG_CTL_P)
+            && (val == 0 || val == ~(uint64_t)0))
+            env->mcg_ctl = val;
+        break;
     default:
+        if ((uint32_t)ECX >= MSR_MC0_CTL
+            && (uint32_t)ECX < MSR_MC0_CTL + (4 * env->mcg_cap & 0xff)) {
+            uint32_t offset = (uint32_t)ECX - MSR_MC0_CTL;
+            if ((offset & 0x3) != 0
+                || (val == 0 || val == ~(uint64_t)0))
+                env->mce_banks[offset] = val;
+            break;
+        }
         /* XXX: exception ? */
         break;
     }
@@ -3252,7 +3268,25 @@ void helper_rdmsr(void)
         /* XXX: exception ? */
         val = 0;
         break;
+    case MSR_MCG_CAP:
+        val = env->mcg_cap;
+        break;
+    case MSR_MCG_CTL:
+        if (env->mcg_cap & MCG_CTL_P)
+            val = env->mcg_ctl;
+        else
+            val = 0;
+        break;
+    case MSR_MCG_STATUS:
+        val = env->mcg_status;
+        break;
     default:
+        if ((uint32_t)ECX >= MSR_MC0_CTL
+            && (uint32_t)ECX < MSR_MC0_CTL + (4 * env->mcg_cap & 0xff)) {
+            uint32_t offset = (uint32_t)ECX - MSR_MC0_CTL;
+            val = env->mce_banks[offset];
+            break;
+        }
         /* XXX: exception ? */
         val = 0;
         break;
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -1430,6 +1430,75 @@ static void breakpoint_handler(CPUState
 }
 #endif /* !CONFIG_USER_ONLY */
 
+/* This should come from sysemu.h - if we could include it here... */
+void qemu_system_reset_request(void);
+
+void cpu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status,
+                        uint64_t mcg_status, uint64_t addr, uint64_t misc)
+{
+    uint64_t mcg_cap = cenv->mcg_cap;
+    unsigned bank_num = mcg_cap & 0xff;
+    uint64_t *banks = cenv->mce_banks;
+
+    if (bank >= bank_num || !(status & MCI_STATUS_VAL))
+        return;
+
+    /*
+     * if MSR_MCG_CTL is not all 1s, the uncorrected error
+     * reporting is disabled
+     */
+    if ((status & MCI_STATUS_UC) && (mcg_cap & MCG_CTL_P) &&
+        cenv->mcg_ctl != ~(uint64_t)0)
+        return;
+    banks += 4 * bank;
+    /*
+     * if MSR_MCi_CTL is not all 1s, the uncorrected
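The per-bank register handling in the helper_wrmsr() and helper_rdmsr() hunks can be read as an index calculation: four MSRs per bank starting at MSR_MC0_CTL, with the bank count taken from the low byte of MCG_CAP. Below is a minimal sketch of that reading. The struct, enum, and function names are hypothetical and not part of the patch, and the bank-count mask is parenthesized explicitly here as an assumption about the intent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_MC0_CTL       0x400u /* first per-bank MSR, value from the patch */
#define MCE_REGS_PER_BANK 4u     /* CTL, STATUS, ADDR, MISC */
#define MCG_CAP_BANK_MASK 0xffu  /* assumption: low byte of MCG_CAP = bank count */

/* Register within a bank, in MSR order. */
enum mce_bank_reg { MCE_REG_CTL, MCE_REG_STATUS, MCE_REG_ADDR, MCE_REG_MISC };

/* Hypothetical result type: which bank and which register an MSR refers to. */
struct mce_msr_ref {
    unsigned bank;
    enum mce_bank_reg reg;
};

/* Return true and fill *out if msr is a per-bank MCE register for mcg_cap. */
bool mce_decode_bank_msr(uint32_t msr, uint64_t mcg_cap, struct mce_msr_ref *out)
{
    uint32_t bank_count = (uint32_t)(mcg_cap & MCG_CAP_BANK_MASK);
    uint32_t offset;

    assert(out != NULL);
    if (msr < MSR_MC0_CTL ||
        msr >= MSR_MC0_CTL + MCE_REGS_PER_BANK * bank_count)
        return false;

    offset = msr - MSR_MC0_CTL;
    out->bank = offset / MCE_REGS_PER_BANK;
    out->reg = (enum mce_bank_reg)(offset % MCE_REGS_PER_BANK);
    return true;
}

/* Mirror the write filter in the patch: CTL accepts only 0 or all 1s. */
bool mce_bank_write_allowed(enum mce_bank_reg reg, uint64_t val)
{
    if (reg != MCE_REG_CTL)
        return true;
    return val == 0 || val == ~(uint64_t)0;
}
```

With mcg_cap set to 0x104 (assuming MCE_CAP_DEF | MCE_BANKS_DEF), MSR 0x401 decodes to bank 0 STATUS and MSR 0x410 is rejected as out of range.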
Re: KVM x86_64 with SR-IOV..?
On Mon, 2009-05-04 at 10:09 +0800, Sheng Yang wrote: On Monday 04 May 2009 08:53:07 Nicholas A. Bellinger wrote: On Sat, 2009-05-02 at 18:22 +0800, Sheng Yang wrote: On Thu, Apr 30, 2009 at 01:22:54PM -0700, Nicholas A. Bellinger wrote: Greetings KVM folks, I wondering if any information exists for doing SR-IOV on the new VT-d capable chipsets with KVM..? From what I understand the patches for doing this with KVM are floating around, but I have been unable to find any user-level docs for actually making it all go against a upstream v2.6.30-rc3 code.. So far I have been doing IOV testing with Xen 3.3 and 3.4.0-pre, and I am really hoping to be able to jump to KVM for single-function and and then multi-function SR-IOV. I know that the VM migration stuff for IOV in Xen is up and running, and I assume it is being worked in for KVM instance migration as well..? This part is less important (at least for me :-) than getting a stable SR-IOV setup running under the KVM hypervisor.. Does anyone have any pointers for this..? Any comments or suggestions are appreciated! Hi Nicholas The patches are not floating around now. As you know, SR-IOV for Linux have been in 2.6.30, so then you can use upstream KVM and qemu-kvm(or recent released kvm-85) with 2.6.30-rc3 as host kernel. And some time ago, there are several SRIOV related patches for qemu-kvm, and now they all have been checked in. And for KVM, the extra document is not necessary, for you can simple assign a VF to guest like any other devices. And how to create VF is specific for each device driver. So just create a VF then assign it to KVM guest is fine. Greetings Sheng, So, I have been trying the latest kvm-85 release on a v2.6.30-rc3 checkout from linux-2.6.git on a CentOS 5u3 x86_64 install on Intel IOH-5520 based dual socket Nehalem board. 
I have enabled DMAR and Interrupt Remapping on my KVM host using v2.6.30-rc3 and from what I can tell, the KVM_CAP_* defines from libkvm are enabled when building kvm-85 after './configure --kerneldir=/usr/src/linux-2.6.git' and the PCI passthrough code is being enabled in kvm-85/qemu/hw/device-assignment.c AFAICT.. From there, I use the freshly installed qemu-x86_64-system binary to start a Debian 5 x86_64 HVM (that previously had been moving network packets under Xen for PCIe passthrough). I see the MSI-X interrupt remapping working on the KVM host for the passed -pcidevice, and the MMIO mappings from the qemu build that I also saw while using Xen/qemu-dm built with PCI passthrough are there as well.. Hi Nicholas But while the KVM guest is booting, I see the following exception(s) from qemu-x86_64-system for one of the VFs for a multi-function PCIe device: BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1) This one is mostly harmless. Ok, good to know.. :-) I try with one of the on-board e1000e ports (02:00.0) and I see the same exception along with some MSI-X exceptions from qemu-x86_64-system in KVM guest.. However, I am still able to see the e1000e and the other vxge multi-function device with lspci, but I am unable to dhcp or ping with the e1000e and VF from multi-function device fails to register the MSI-X interrupt in the guest.. Did you see the interrupt in the guest and host side? Ok, I am restarting the e1000e test with a fresh Fedora 11 install and KVM host kernel 2.6.29.1-111.fc11.x86_64.
After unbinding and attaching the e1000e single-function device at 02:00.0 to pci-stub with:

echo 8086 10d3 > /sys/bus/pci/drivers/pci-stub/new_id
echo 0000:02:00.0 > /sys/bus/pci/devices/0000:02:00.0/driver/unbind
echo 0000:02:00.0 > /sys/bus/pci/drivers/pci-stub/bind

I see the following in the KVM host kernel ring buffer:

e1000e 0000:02:00.0: PCI INT A disabled
pci-stub 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
pci-stub 0000:02:00.0: irq 58 for MSI/MSI-X

I think you can try the on-board e1000e for MSI-X first. And please ensure the correlated driver has been loaded correctly. nod.. And what do you mean by some MSI-X exceptions? Better with the log. Ok, with the Fedora 11 installed qemu-kvm, I see the expected kvm_destroy_phys_mem() statements:

#kvm-host qemu-kvm -m 2048 -smp 8 -pcidevice host=02:00.0 lenny64guest1-orig.img
BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)

However I still see the following in the KVM guest kernel ring buffer running v2.6.30-rc in the HVM guest.

[5.523790] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
[5.524582] e1000e 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
[5.525710] e1000e 0000:00:05.0: setting latency timer to 64
[5.526048] 0000:00:05.0: 0000:00:05.0: Failed to initialize MSI-X interrupts. Falling back to MSI interrupts.
[5.527200]
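The three sysfs writes used to hand a device to pci-stub (new_id, unbind, bind) can be sketched as a function that only builds the path and value for each step. This is an illustration under stated assumptions: the struct and function names are hypothetical, the address is assumed to be in the full domain:bus:device.function form such as 0000:02:00.0, and the code deliberately does not touch sysfs itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One pending write: the sysfs file and the string to write into it. */
struct sysfs_write {
    char path[128];
    char value[32];
};

/*
 * Fill steps[0..2] with the new_id, unbind and bind writes for one device.
 * Returns 0 on success, -1 if bdf is not in "dddd:bb:dd.f" form.
 */
int pci_stub_rebind_plan(unsigned vendor, unsigned device,
                         const char *bdf, struct sysfs_write steps[3])
{
    assert(bdf != NULL && steps != NULL);

    /* A bare "bb:dd.f" without the domain prefix is rejected here. */
    if (strlen(bdf) != 12 || bdf[4] != ':' || bdf[7] != ':' || bdf[10] != '.')
        return -1;

    /* Step 1: teach pci-stub the vendor/device ID pair. */
    snprintf(steps[0].path, sizeof(steps[0].path),
             "/sys/bus/pci/drivers/pci-stub/new_id");
    snprintf(steps[0].value, sizeof(steps[0].value),
             "%04x %04x", vendor, device);

    /* Step 2: detach the device from its current driver. */
    snprintf(steps[1].path, sizeof(steps[1].path),
             "/sys/bus/pci/devices/%s/driver/unbind", bdf);
    snprintf(steps[1].value, sizeof(steps[1].value), "%s", bdf);

    /* Step 3: attach the device to pci-stub. */
    snprintf(steps[2].path, sizeof(steps[2].path),
             "/sys/bus/pci/drivers/pci-stub/bind");
    snprintf(steps[2].value, sizeof(steps[2].value), "%s", bdf);

    return 0;
}
```

Calling it with vendor 0x8086, device 0x10d3 and "0000:02:00.0" yields the same three writes as the echo commands in the message, while "02:00.0" without the domain is refused.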
Re: [PATCH 03/04] qemu-kvm: Remove the dependency for phys_ram_base for ipf.c
Zhang, Xiantao wrote: Avi Kivity wrote: Looks good to me. Xiantao? Hi, Jes Have you tested nvram support with this patch? I Xiantao No, but it is behaving exactly like the old code, so it is no more broken than the old code was. Let's apply this and then look at the nvram issues afterwards. Jes
RE: [PATCH 03/04] qemu-kvm: Remove the dependency for phys_ram_base for ipf.c
Jes Sorensen wrote: Zhang, Xiantao wrote: Avi Kivity wrote: Looks good to me. Xiantao? Hi, Jes Have you tested nvram support with this patch? I Xiantao No, but it is behaving exactly like the old code, so it is no more broken than the old code was. Let's apply this and then look at the nvram issues afterwards. Okay. :) Xiantao
Re: KVM x86_64 with SR-IOV..?
On Sun, 2009-05-03 at 21:36 -0700, Nicholas A. Bellinger wrote: On Mon, 2009-05-04 at 10:09 +0800, Sheng Yang wrote: On Monday 04 May 2009 08:53:07 Nicholas A. Bellinger wrote: On Sat, 2009-05-02 at 18:22 +0800, Sheng Yang wrote: On Thu, Apr 30, 2009 at 01:22:54PM -0700, Nicholas A. Bellinger wrote: Greetings KVM folks, I wondering if any information exists for doing SR-IOV on the new VT-d capable chipsets with KVM..? From what I understand the patches for doing this with KVM are floating around, but I have been unable to find any user-level docs for actually making it all go against a upstream v2.6.30-rc3 code.. So far I have been doing IOV testing with Xen 3.3 and 3.4.0-pre, and I am really hoping to be able to jump to KVM for single-function and and then multi-function SR-IOV. I know that the VM migration stuff for IOV in Xen is up and running, and I assume it is being worked in for KVM instance migration as well..? This part is less important (at least for me :-) than getting a stable SR-IOV setup running under the KVM hypervisor.. Does anyone have any pointers for this..? Any comments or suggestions are appreciated! Hi Nicholas The patches are not floating around now. As you know, SR-IOV for Linux have been in 2.6.30, so then you can use upstream KVM and qemu-kvm(or recent released kvm-85) with 2.6.30-rc3 as host kernel. And some time ago, there are several SRIOV related patches for qemu-kvm, and now they all have been checked in. And for KVM, the extra document is not necessary, for you can simple assign a VF to guest like any other devices. And how to create VF is specific for each device driver. So just create a VF then assign it to KVM guest is fine. Greetings Sheng, So, I have been trying the latest kvm-85 release on a v2.6.30-rc3 checkout from linux-2.6.git on a CentOS 5u3 x86_64 install on Intel IOH-5520 based dual socket Nehalem board. 
Re: KVM x86_64 with SR-IOV..?
On Sun, 2009-05-03 at 22:28 -0700, Nicholas A. Bellinger wrote: On Sun, 2009-05-03 at 21:36 -0700, Nicholas A. Bellinger wrote: On Mon, 2009-05-04 at 10:09 +0800, Sheng Yang wrote: On Monday 04 May 2009 08:53:07 Nicholas A. Bellinger wrote: On Sat, 2009-05-02 at 18:22 +0800, Sheng Yang wrote: On Thu, Apr 30, 2009 at 01:22:54PM -0700, Nicholas A. Bellinger wrote: Greetings KVM folks, I wondering if any information exists for doing SR-IOV on the new VT-d capable chipsets with KVM..? From what I understand the patches for doing this with KVM are floating around, but I have been unable to find any user-level docs for actually making it all go against a upstream v2.6.30-rc3 code.. So far I have been doing IOV testing with Xen 3.3 and 3.4.0-pre, and I am really hoping to be able to jump to KVM for single-function and and then multi-function SR-IOV. I know that the VM migration stuff for IOV in Xen is up and running, and I assume it is being worked in for KVM instance migration as well..? This part is less important (at least for me :-) than getting a stable SR-IOV setup running under the KVM hypervisor.. Does anyone have any pointers for this..? Any comments or suggestions are appreciated! Hi Nicholas The patches are not floating around now. As you know, SR-IOV for Linux have been in 2.6.30, so then you can use upstream KVM and qemu-kvm(or recent released kvm-85) with 2.6.30-rc3 as host kernel. And some time ago, there are several SRIOV related patches for qemu-kvm, and now they all have been checked in. And for KVM, the extra document is not necessary, for you can simple assign a VF to guest like any other devices. And how to create VF is specific for each device driver. So just create a VF then assign it to KVM guest is fine. Greetings Sheng, So, I have been trying the latest kvm-85 release on a v2.6.30-rc3 checkout from linux-2.6.git on a CentOS 5u3 x86_64 install on Intel IOH-5520 based dual socket Nehalem board. 