Re: [Qemu-devel] [PATCH 0/8]: QMP feature negotiation support
Luiz Capitulino <lcapitul...@redhat.com> writes:

On Mon, 01 Feb 2010 20:37:41 +0100 Markus Armbruster <arm...@redhat.com> wrote:

Luiz Capitulino <lcapitul...@redhat.com> writes:

On Mon, 01 Feb 2010 18:08:27 +0100 Markus Armbruster <arm...@redhat.com> wrote:

[...]

I don't doubt your design does the job. I just think it's overly
general. I had something far more stupid in mind:

    client connects
    server -> client: version & capability offer (one message)
again:
    client -> server: capability selection (one message)
    server -> client: either okay or error (one message)
    if error goto again
    connection is now ready for commands

No modes. The distinct lack of generality is a design feature.

I like the simplicity and if we were allowed to change later I'd do it.
The question is if we will ever want features to be _configured_ before
the protocol is operational. In this case we'd need to pass feature
arguments through the capability selection command, which will get ugly
and hard to use/understand. Mode oriented support doesn't have this
limitation. Maybe we will never really use it, but it's safer.

Capability selection could be done as an object where the name/value
pairs are capability/argument. If you need multiple arguments for a
capability, make the capability's value an object.

That's exactly what seems complicated to me, because besides performing
two functions (enable/configure) some feature setup could require more
commands to be done in a clear way.

What do you mean by feature setup? And how does it go beyond setting a
bunch of parameters?

The async messages setup in the previous series was an example of this.

I don't remember the details. Could you summarize?

As said we might never use this, but I wouldn't like to regret later.

A somewhat plausible example for how it could be needed would help.
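For readers following the thread, the mode-less handshake proposed above could be sketched roughly as follows. This is only an illustration of one selection round on the server side; the capability names and the bitmask representation are assumptions made for the example, not anything defined in the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; the thread does not name any. */
enum {
    CAP_EXAMPLE_A = 1u << 0,
    CAP_EXAMPLE_B = 1u << 1,
};

/*
 * Server side of one selection round: accept the selection only if
 * every requested capability was part of the offer.
 */
static bool select_capabilities(uint32_t offered, uint32_t requested,
                                uint32_t *enabled)
{
    if (requested & ~offered) {
        return false;       /* "error": the client goes back to "again" */
    }
    *enabled = requested;   /* "okay": connection is ready for commands */
    return true;
}
```

A client that gets an error simply sends another selection; nothing is enabled until one round succeeds, which is the whole point of having no modes.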
[Qemu-devel] [PATCH 04/21] KVM: x86: Fix up misreported CPU features
From qemu-kvm: Kernels before 2.6.30 misreported some essential CPU
features via KVM_GET_SUPPORTED_CPUID. Fix them up.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 target-i386/kvm.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 504f501..9fb96b5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -101,12 +101,18 @@ uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, int reg)
                 break;
             case R_EDX:
                 ret = cpuid->entries[i].edx;
-                if (function == 0x80000001) {
+                switch (function) {
+                case 1:
+                    /* KVM before 2.6.30 misreports the following features */
+                    ret |= CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA;
+                    break;
+                case 0x80000001:
                     /* On Intel, kvm returns cpuid according to the Intel spec,
                      * so add missing bits according to the AMD spec:
                      */
                     cpuid_1_edx = kvm_arch_get_supported_cpuid(env, 1, R_EDX);
                     ret |= cpuid_1_edx & 0xdfeff7ff;
+                    break;
                 }
                 break;
             }
-- 
1.6.0.2
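To make the leaf 1 fix-up concrete, here is a minimal stand-alone sketch of what OR-ing in the four feature bits does to the EDX value. The bit positions are the standard CPUID.1:EDX ones; the helper name is made up for this example and does not exist in the tree.

```c
#include <stdint.h>

/* CPUID leaf 1 EDX feature bits (standard x86 bit positions). */
#define CPUID_MCE   (1u << 7)
#define CPUID_MTRR  (1u << 12)
#define CPUID_MCA   (1u << 14)
#define CPUID_PAT   (1u << 16)

/* Hypothetical helper: force the bits that old kernels failed to report. */
static uint32_t fixup_cpuid_1_edx(uint32_t reported)
{
    return reported | CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA;
}
```

Since the operation is a plain OR, bits the kernel already reports correctly are left untouched.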
[Qemu-devel] [PATCH 03/21] qemu-kvm: Clean up register access API
qemu-kvm's functions for accessing the VCPU registers are
kvm_arch_load/save_regs. Use them directly instead of going through
various wrappers. Specifically, we do not need on_vcpu wrapping as all
users either already run in the related thread or call while the vm is
stopped.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 qemu-kvm.c            |   37 +++----------------------------------
 qemu-kvm.h            |   11 -----------
 target-ia64/machine.c |    4 ++--
 3 files changed, 5 insertions(+), 47 deletions(-)

diff --git a/qemu-kvm.c b/qemu-kvm.c
index a305907..97c098c 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -862,7 +862,7 @@ int pre_kvm_run(kvm_context_t kvm, CPUState *env)
     kvm_arch_pre_run(env, env->kvm_run);

     if (env->kvm_cpu_state.regs_modified) {
-        kvm_arch_put_registers(env);
+        kvm_arch_load_regs(env);
         env->kvm_cpu_state.regs_modified = 0;
     }

@@ -1532,16 +1532,11 @@ static void on_vcpu(CPUState *env, void (*func)(void *data), void *data)
         qemu_cond_wait(&qemu_work_cond);
 }

-void kvm_arch_get_registers(CPUState *env)
-{
-    kvm_arch_save_regs(env);
-}
-
 static void do_kvm_cpu_synchronize_state(void *_env)
 {
     CPUState *env = _env;
     if (!env->kvm_cpu_state.regs_modified) {
-        kvm_arch_get_registers(env);
+        kvm_arch_save_regs(env);
         env->kvm_cpu_state.regs_modified = 1;
     }
 }
@@ -1584,32 +1579,6 @@ void kvm_update_interrupt_request(CPUState *env)
     }
 }

-static void kvm_do_load_registers(void *_env)
-{
-    CPUState *env = _env;
-
-    kvm_arch_load_regs(env);
-}
-
-void kvm_load_registers(CPUState *env)
-{
-    if (kvm_enabled() && qemu_system_ready)
-        on_vcpu(env, kvm_do_load_registers, env);
-}
-
-static void kvm_do_save_registers(void *_env)
-{
-    CPUState *env = _env;
-
-    kvm_arch_save_regs(env);
-}
-
-void kvm_save_registers(CPUState *env)
-{
-    if (kvm_enabled())
-        on_vcpu(env, kvm_do_save_registers, env);
-}
-
 static void kvm_do_load_mpstate(void *_env)
 {
     CPUState *env = _env;
@@ -2379,7 +2348,7 @@ static void kvm_invoke_set_guest_debug(void *data)
     struct kvm_set_guest_debug_data *dbg_data = data;

     if (cpu_single_env->kvm_cpu_state.regs_modified) {
-        kvm_arch_put_registers(cpu_single_env);
+        kvm_arch_save_regs(cpu_single_env);
         cpu_single_env->kvm_cpu_state.regs_modified = 0;
     }
     dbg_data->err =
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 6b3e5a1..1354227 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -902,8 +902,6 @@ int kvm_main_loop(void);
 int kvm_init_ap(void);
 #ifndef QEMU_KVM_NO_CPU
 int kvm_vcpu_inited(CPUState *env);
-void kvm_load_registers(CPUState *env);
-void kvm_save_registers(CPUState *env);
 void kvm_load_mpstate(CPUState *env);
 void kvm_save_mpstate(CPUState *env);
 int kvm_cpu_exec(CPUState *env);
@@ -1068,8 +1066,6 @@ void kvm_load_tsc(CPUState *env);
 #ifdef TARGET_I386
 #define qemu_kvm_has_pit_state2() (0)
 #endif
-#define kvm_load_registers(env) do {} while(0)
-#define kvm_save_registers(env) do {} while(0)
 #define kvm_save_mpstate(env) do {} while(0)
 #define qemu_kvm_cpu_stop(env) do {} while(0)
 static inline void kvm_init_vcpu(CPUState *env)
@@ -1098,13 +1094,6 @@ static inline int kvm_sync_vcpus(void)
 }

 #ifndef QEMU_KVM_NO_CPU
-void kvm_arch_get_registers(CPUState *env);
-
-static inline void kvm_arch_put_registers(CPUState *env)
-{
-    kvm_load_registers(env);
-}
-
 void kvm_cpu_synchronize_state(CPUState *env);

 static inline void cpu_synchronize_state(CPUState *env)
diff --git a/target-ia64/machine.c b/target-ia64/machine.c
index 70ef379..7d29575 100644
--- a/target-ia64/machine.c
+++ b/target-ia64/machine.c
@@ -9,7 +9,7 @@ void cpu_save(QEMUFile *f, void *opaque)
     CPUState *env = opaque;

     if (kvm_enabled()) {
-        kvm_save_registers(env);
+        kvm_arch_save_regs(env);
         kvm_arch_save_mpstate(env);
     }
 }
@@ -19,7 +19,7 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
     CPUState *env = opaque;

     if (kvm_enabled()) {
-        kvm_load_registers(env);
+        kvm_arch_load_regs(env);
         kvm_arch_load_mpstate(env);
     }
     return 0;
-- 
1.6.0.2
[Qemu-devel] [PATCH 02/21] KVM: Make vmport KVM-compatible
The vmport device accesses the VCPU registers, so it requires proper
cpu_synchronize_state. Add it to vmport_ioport_read, which also
synchronizes vmport_ioport_write.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 hw/vmport.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/hw/vmport.c b/hw/vmport.c
index 884af3f..6c9d7c9 100644
--- a/hw/vmport.c
+++ b/hw/vmport.c
@@ -25,6 +25,7 @@
 #include "isa.h"
 #include "pc.h"
 #include "sysemu.h"
+#include "kvm.h"

 //#define VMPORT_DEBUG

@@ -58,6 +59,8 @@ static uint32_t vmport_ioport_read(void *opaque, uint32_t addr)
     unsigned char command;
     uint32_t eax;

+    cpu_synchronize_state(env);
+
     eax = env->regs[R_EAX];
     if (eax != VMPORT_MAGIC)
         return eax;
-- 
1.6.0.2
[Qemu-devel] [PATCH 2/2] powerpc/e500: adjust fdt and ramdisk loading addr
Since the kernel uImage keeps getting bigger, the old fixed load bases
result in overlapping regions. Add padding for the fdt and ramdisk so
that they won't overlap with the uImage.

Signed-off-by: Liu Yu <yu@freescale.com>
---
 hw/ppce500_mpc8544ds.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 9a5654b..3826156 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -34,8 +34,10 @@

 #define BINARY_DEVICE_TREE_FILE    "mpc8544ds.dtb"
 #define UIMAGE_LOAD_BASE           0
-#define DTB_LOAD_BASE              0x600000
-#define INITRD_LOAD_BASE           0x2000000
+#define DTC_LOAD_PAD               0x500000
+#define DTC_PAD_MASK               0xFFFFF
+#define INITRD_LOAD_PAD            0x2000000
+#define INITRD_PAD_MASK            0xFFFFFF

 #define RAM_SIZES_ALIGN            (64UL << 20)

@@ -170,8 +172,8 @@ static void mpc8544ds_init(ram_addr_t ram_size,
     target_phys_addr_t entry=0;
     target_phys_addr_t loadaddr=UIMAGE_LOAD_BASE;
     target_long kernel_size=0;
-    target_ulong dt_base=DTB_LOAD_BASE;
-    target_ulong initrd_base=INITRD_LOAD_BASE;
+    target_ulong dt_base = 0;
+    target_ulong initrd_base = 0;
     target_long initrd_size=0;
     int i=0;
     unsigned int pci_irq_nrs[4] = {1, 2, 3, 4};
@@ -246,6 +248,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,

     /* Load initrd. */
     if (initrd_filename) {
+        initrd_base = (kernel_size + INITRD_LOAD_PAD) & ~INITRD_PAD_MASK;
         initrd_size = load_image_targphys(initrd_filename, initrd_base,
                                           ram_size - initrd_base);

@@ -258,6 +261,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,

     /* If we're loading a kernel directly, we must load the device tree too. */
     if (kernel_filename) {
+        dt_base = (kernel_size + DTC_LOAD_PAD) & ~DTC_PAD_MASK;
         if (mpc8544_load_device_tree(dt_base, ram_size,
                     initrd_base, initrd_size, kernel_cmdline) < 0) {
             fprintf(stderr, "couldn't load device tree\n");
-- 
1.6.4
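The address computation in this patch can be tried in isolation. The sketch below is illustrative only: the helper name is invented, and it assumes, as the patch does via UIMAGE_LOAD_BASE, that the kernel is loaded at address 0 so that kernel_size is also the end address of the image.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring "(kernel_size + PAD) & ~MASK" from the
 * patch. Masking rounds *down*, so the resulting base is guaranteed to
 * lie at least (pad - mask) bytes past the end of the kernel image.
 */
static uint32_t padded_base(uint32_t kernel_size, uint32_t pad, uint32_t mask)
{
    return (kernel_size + pad) & ~mask;
}
```

With a pad of 0x500000 and a mask of 0xFFFFF, a kernel of 0x123456 bytes yields a device tree base of 0x600000. Because the pad is larger than the mask, the rounding can never pull the base back into the kernel image.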
[Qemu-devel] [PATCH 1/2] powerpc/booke: move fdt loading to rom infrastructure
It's convenient to use the rom infrastructure for overlap checking,
reset handling, etc., and uImage and ramdisk loading have already moved
to it. Also, free the fdt after adding it to rom.

Signed-off-by: Liu Yu <yu@freescale.com>
---
 hw/ppc440_bamboo.c     |   15 ++++++++-------
 hw/ppce500_mpc8544ds.c |   17 ++++++++++-------
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
index 1ab9872..9d95417 100644
--- a/hw/ppc440_bamboo.c
+++ b/hw/ppc440_bamboo.c
@@ -27,7 +27,7 @@

 #define BINARY_DEVICE_TREE_FILE "bamboo.dtb"

-static void *bamboo_load_device_tree(target_phys_addr_t addr,
+static int bamboo_load_device_tree(target_phys_addr_t addr,
                                      uint32_t ramsize,
                                      target_phys_addr_t initrd_base,
                                      target_phys_addr_t initrd_size,
@@ -42,11 +42,13 @@ static void *bamboo_load_device_tree(target_phys_addr_t addr,

     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, BINARY_DEVICE_TREE_FILE);
     if (!filename) {
+        ret = -1;
         goto out;
     }
     fdt = load_device_tree(filename, &fdt_size);
     qemu_free(filename);
     if (fdt == NULL) {
+        ret = -1;
         goto out;
     }

@@ -75,12 +77,13 @@ static void *bamboo_load_device_tree(target_phys_addr_t addr,
     if (kvm_enabled())
         kvmppc_fdt_update(fdt);

-    cpu_physical_memory_write (addr, (void *)fdt, fdt_size);
+    ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
+    qemu_free(fdt);

 out:
 #endif

-    return fdt;
+    return ret;
 }

 static void bamboo_init(ram_addr_t ram_size,
@@ -101,7 +104,6 @@ static void bamboo_init(ram_addr_t ram_size,
     target_ulong initrd_base = 0;
     target_long initrd_size = 0;
     target_ulong dt_base = 0;
-    void *fdt;
     int i;

     /* Setup CPU. */
@@ -153,9 +155,8 @@ static void bamboo_init(ram_addr_t ram_size,
         else
             dt_base = kernel_size + loadaddr;

-        fdt = bamboo_load_device_tree(dt_base, ram_size,
-                          initrd_base, initrd_size, kernel_cmdline);
-        if (fdt == NULL) {
+        if (bamboo_load_device_tree(dt_base, ram_size,
+                        initrd_base, initrd_size, kernel_cmdline) < 0) {
             fprintf(stderr, "couldn't load device tree\n");
             exit(1);
         }
diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index ea30816..9a5654b 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -72,7 +72,7 @@ out:
 }
 #endif

-static void *mpc8544_load_device_tree(target_phys_addr_t addr,
+static int mpc8544_load_device_tree(target_phys_addr_t addr,
                                      uint32_t ramsize,
                                      target_phys_addr_t initrd_base,
                                      target_phys_addr_t initrd_size,
@@ -87,11 +87,13 @@ static void *mpc8544_load_device_tree(target_phys_addr_t addr,

     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, BINARY_DEVICE_TREE_FILE);
     if (!filename) {
+        ret = -1;
         goto out;
     }
     fdt = load_device_tree(filename, &fdt_size);
     qemu_free(filename);
     if (fdt == NULL) {
+        ret = -1;
         goto out;
     }

@@ -123,6 +125,7 @@ static void *mpc8544_load_device_tree(target_phys_addr_t addr,

         if ((dp = opendir("/proc/device-tree/cpus/")) == NULL) {
             printf("Can't open directory /proc/device-tree/cpus/\n");
+            ret = -1;
             goto out;
         }

@@ -136,6 +139,7 @@ static void *mpc8544_load_device_tree(target_phys_addr_t addr,
         closedir(dp);
         if (buf[0] == '\0') {
             printf("Unknow host!\n");
+            ret = -1;
             goto out;
         }

@@ -143,12 +147,13 @@ static void *mpc8544_load_device_tree(target_phys_addr_t addr,
         mpc8544_copy_soc_cell(fdt, buf, "timebase-frequency");
     }

-    cpu_physical_memory_write (addr, (void *)fdt, fdt_size);
+    ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
+    qemu_free(fdt);

 out:
 #endif

-    return fdt;
+    return ret;
 }

 static void mpc8544ds_init(ram_addr_t ram_size,
@@ -168,7 +173,6 @@ static void mpc8544ds_init(ram_addr_t ram_size,
     target_ulong dt_base=DTB_LOAD_BASE;
     target_ulong initrd_base=INITRD_LOAD_BASE;
     target_long initrd_size=0;
-    void *fdt;
     int i=0;
     unsigned int pci_irq_nrs[4] = {1, 2, 3, 4};
     qemu_irq *irqs, *mpic, *pci_irqs;
@@ -254,9 +258,8 @@ static void mpc8544ds_init(ram_addr_t ram_size,

     /* If we're loading a kernel directly, we must load the device tree too. */
     if (kernel_filename) {
-        fdt = mpc8544_load_device_tree(dt_base, ram_size,
-                          initrd_base, initrd_size, kernel_cmdline);
-        if (fdt == NULL) {
+        if (mpc8544_load_device_tree(dt_base, ram_size,
+                    initrd_base, initrd_size, kernel_cmdline) < 0) {
[Qemu-devel] [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id
Setting the boot CPU ID is arch-specific KVM stuff. So push it where it
belongs.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 hw/pc.c        |    3 ---
 qemu-kvm-x86.c |    3 ++-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 6c15a9f..3df6195 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -803,9 +803,6 @@ static void pc_init1(ram_addr_t ram_size,
 #endif
     }

-    if (kvm_enabled()) {
-        kvm_set_boot_cpu_id(0);
-    }
     for (i = 0; i < smp_cpus; i++) {
         env = pc_new_cpu(cpu_model);
     }
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 9de018e..0f34451 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void)
     if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK))
         vmstate_register(0, &vmstate_kvmclock, &kvmclock_data);
 #endif
-    return 0;
+
+    return kvm_set_boot_cpu_id(0);
 }

 static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
-- 
1.6.0.2
[Qemu-devel] [PATCH 01/21] qemu-kvm: Drop vmport changes
This attempt to make vmport KVM compatible is half-broken and is
scheduled to be replaced by proper upstream support.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 hw/vmport.c |   13 +------------
 1 files changed, 1 insertions(+), 12 deletions(-)

diff --git a/hw/vmport.c b/hw/vmport.c
index 648861b..884af3f 100644
--- a/hw/vmport.c
+++ b/hw/vmport.c
@@ -21,12 +21,10 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-
 #include "hw.h"
 #include "isa.h"
 #include "pc.h"
 #include "sysemu.h"
-#include "qemu-kvm.h"

 //#define VMPORT_DEBUG

@@ -59,10 +57,6 @@ static uint32_t vmport_ioport_read(void *opaque, uint32_t addr)
     CPUState *env = cpu_single_env;
     unsigned char command;
     uint32_t eax;
-    uint32_t ret;
-
-    if (kvm_enabled())
-        kvm_save_registers(env);

     eax = env->regs[R_EAX];
     if (eax != VMPORT_MAGIC)
@@ -79,12 +73,7 @@ static uint32_t vmport_ioport_read(void *opaque, uint32_t addr)
         return eax;
     }

-    ret = s->func[command](s->opaque[command], addr);
-
-    if (kvm_enabled())
-        kvm_load_registers(env);
-
-    return ret;
+    return s->func[command](s->opaque[command], addr);
 }

 static void vmport_ioport_write(void *opaque, uint32_t addr, uint32_t val)
-- 
1.6.0.2
[Qemu-devel] [PATCH 17/21] qemu-kvm: Use VCPU event state for reset and vmsave/load
Push reading/writing of vcpu_events into kvm_arch_load/save_regs to
avoid KVM-specific hooks in generic code.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 kvm.h                 |    2 --
 qemu-kvm-x86.c        |    6 ++++--
 target-i386/kvm.c     |    4 ++--
 target-i386/machine.c |    6 ------
 4 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/kvm.h b/kvm.h
index e4005d8..686ee33 100644
--- a/kvm.h
+++ b/kvm.h
@@ -53,8 +53,6 @@ int kvm_set_migration_log(int enable);

 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
-int kvm_put_vcpu_events(CPUState *env, int level);
-int kvm_get_vcpu_events(CPUState *env);

 void kvm_setup_guest_memory(void *start, size_t size);

diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 21476db..f484149 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -972,6 +972,8 @@ void kvm_arch_load_regs(CPUState *env, int level)
     if (level >= KVM_PUT_RESET_STATE) {
         kvm_arch_load_mpstate(env);
     }
+
+    kvm_put_vcpu_events(env, level);
 }

 void kvm_load_tsc(CPUState *env)
@@ -1141,6 +1143,7 @@ void kvm_arch_save_regs(CPUState *env)
         }
     }
     kvm_arch_save_mpstate(env);
+    kvm_get_vcpu_events(env);
 }

 static void do_cpuid_ent(struct kvm_cpuid_entry2 *e, uint32_t function,
@@ -1215,7 +1218,7 @@ int kvm_arch_init_vcpu(CPUState *cenv)

     qemu_kvm_load_lapic(cenv);

-    cenv->interrupt_injected = -1;
+    kvm_arch_reset_vcpu(cenv);

 #ifdef KVM_CPUID_SIGNATURE
     /* Paravirtualization CPUIDs */
@@ -1381,7 +1384,6 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
-    kvm_put_vcpu_events(env, KVM_PUT_RESET_STATE);
     if (!cpu_is_bsp(env) && !kvm_irqchip_in_kernel()) {
         env->interrupt_request &= ~CPU_INTERRUPT_HARD;
         env->halted = 1;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index fefd5a5..9bd2952 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -789,7 +789,7 @@ static int kvm_get_mp_state(CPUState *env)
 }
 #endif

-int kvm_put_vcpu_events(CPUState *env, int level)
+static int kvm_put_vcpu_events(CPUState *env, int level)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
     struct kvm_vcpu_events events;
@@ -825,7 +825,7 @@ int kvm_put_vcpu_events(CPUState *env, int level)
 #endif
 }

-int kvm_get_vcpu_events(CPUState *env)
+static int kvm_get_vcpu_events(CPUState *env)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
     struct kvm_vcpu_events events;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 6fca559..bcc315b 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -5,7 +5,6 @@

 #include "exec-all.h"
 #include "kvm.h"
-#include "qemu-kvm.h"

 static const VMStateDescription vmstate_segment = {
     .name = "segment",
@@ -322,10 +321,6 @@ static void cpu_pre_save(void *opaque)
     CPUState *env = opaque;
     int i;

-    if (kvm_enabled()) {
-        kvm_get_vcpu_events(env);
-    }
-
     /* FPU */
     env->fpus_vmstate = (env->fpus & ~0x3800) | (env->fpstt & 0x7) << 11;
     env->fptag_vmstate = 0;
@@ -362,7 +357,6 @@ static int cpu_post_load(void *opaque, int version_id)

     if (kvm_enabled()) {
         kvm_load_tsc(env);
-        kvm_put_vcpu_events(env, KVM_PUT_FULL_STATE);
     }

     return 0;
-- 
1.6.0.2
[Qemu-devel] [PATCH 00/21] qemu-kvm: Hook cleanups and extended use of upstream code
Let's start with the overall stats:

 31 files changed, 274 insertions(+), 822 deletions(-)

So this series drops far more than 500 lines of redundant code, moving
qemu-kvm yet a bit closer to upstream.

The other highlight is the simplification of synchronization between
in-kernel and user space VCPU states. This area used to cause a lot of
problems in the past because it was tricky to get things right,
specifically during the multi-threaded startup. The new approach pushes
all the sync work around reset and vmsave/load into generic code, not
only removing the burden from developers of, say, in-kernel APIC
support, but also dropping most of our kvm-specific hooks, especially in
the qemu-kvm tree.

While I tested this on various VMs around, and things look good so far,
I wouldn't be surprised if there are some regressions remaining,
specifically in the non-x86 parts that I wasn't able to test or even
build. Please have a careful look!

Regarding the organization of the series: Patches prefixed with KVM: are
for upstream, unmodified or with only minor adjustments. But I have a
separate series against uq/master here that just needs final polishing
and can then be rolled out as well.

You can pull this series from

    git://git.kiszka.org/qemu-kvm.git queues/vcpu-state

There are two more items on my to-do list, yet with medium prio:
 o switch kvm_arch_save/load_regs and sub-functions to upstream code
 o drop qemu-kvm's slot management in favor of upstream's implementation

Jan Kiszka (21):
  qemu-kvm: Drop vmport changes
  KVM: Make vmport KVM-compatible
  qemu-kvm: Clean up register access API
  KVM: x86: Fix up misreported CPU features
  qemu-kvm: Use upstream kvm_enabled and cpu_synchronize_state
  qemu-kvm: Use upstream kvm_setup_guest_memory
  qemu-kvm: Use some more upstream prototypes
  qemu-kvm: Use upstream kvm_arch_get_supported_cpuid
  qemu-kvm: Use upstream kvm_pit_in_kernel
  KVM: Move and rename regs_modified
  KVM: Rework of guest debug state writing
  qemu-kvm: Use upstream kvm_vcpu_dirty
  qemu-kvm: Use upstream guest debug code
  qemu-kvm: Rework VCPU state writeback API
  qemu-kvm: Clean up mpstate synchronization
  KVM: x86: Restrict writeback of VCPU state
  qemu-kvm: Use VCPU event state for reset and vmsave/load
  qemu-kvm: Cleanup/fix TSC and PV clock writeback
  qemu-kvm: Clean up KVM's APIC hooks
  qemu-kvm: Move kvm_set_boot_cpu_id
  qemu-kvm: Bring qemu_init_vcpu back home

 cpu-defs.h            |    2 +-
 exec.c                |   17 --
 hw/apic.c             |   47 +-
 hw/i8254.c            |    6 +-
 hw/i8259.c            |    2 +-
 hw/ioapic.c           |    2 +-
 hw/msix.c             |    3 +-
 hw/pc.c               |   13 +--
 hw/pcspk.c            |    4 +-
 hw/piix_pci.c         |    2 +-
 hw/ppc_newworld.c     |    3 -
 hw/ppc_oldworld.c     |    3 -
 hw/s390-virtio.c      |    1 -
 hw/vmport.c           |   14 +--
 kvm-all.c             |   51 +++---
 kvm.h                 |   35 +++--
 qemu-kvm-ia64.c       |    6 +-
 qemu-kvm-x86.c        |  415 +
 qemu-kvm.c            |  159 +++
 qemu-kvm.h            |  158 +--
 savevm.c              |    4 +
 sysemu.h              |    4 +
 target-i386/cpu.h     |    9 +-
 target-i386/helper.c  |    2 +
 target-i386/kvm.c     |   61 +--
 target-i386/machine.c |   27
 target-ia64/machine.c |    5 +-
 target-ppc/kvm.c      |    2 +-
 target-ppc/machine.c  |    4 -
 target-s390x/kvm.c    |    3 +-
 vl.c                  |   32 -
 31 files changed, 274 insertions(+), 822 deletions(-)
[Qemu-devel] [PATCH 16/21] KVM: x86: Restrict writeback of VCPU state
Do not write nmi_pending, sipi_vector, and mpstate unless we at least go
through a reset. And TSC as well as KVM wallclocks should only be
written on full sync, otherwise we risk dropping some time during state
read-modify-write.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 kvm.h                 |    2 +-
 qemu-kvm-x86.c        |    2 +-
 target-i386/kvm.c     |   32 ++++++++++++++++++++------------
 target-i386/machine.c |    2 +-
 4 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/kvm.h b/kvm.h
index ee8b3f6..e4005d8 100644
--- a/kvm.h
+++ b/kvm.h
@@ -53,7 +53,7 @@ int kvm_set_migration_log(int enable);

 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
-int kvm_put_vcpu_events(CPUState *env);
+int kvm_put_vcpu_events(CPUState *env, int level);
 int kvm_get_vcpu_events(CPUState *env);

 void kvm_setup_guest_memory(void *start, size_t size);
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 6b5895f..21476db 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -1381,7 +1381,7 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
-    kvm_put_vcpu_events(env);
+    kvm_put_vcpu_events(env, KVM_PUT_RESET_STATE);
     if (!cpu_is_bsp(env) && !kvm_irqchip_in_kernel()) {
         env->interrupt_request &= ~CPU_INTERRUPT_HARD;
         env->halted = 1;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 4a0c8bb..fefd5a5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -544,7 +544,7 @@ static void kvm_msr_entry_set(struct kvm_msr_entry *entry,
     entry->data = value;
 }

-static int kvm_put_msrs(CPUState *env)
+static int kvm_put_msrs(CPUState *env, int level)
 {
     struct {
         struct kvm_msrs info;
@@ -558,7 +558,6 @@ static int kvm_put_msrs(CPUState *env)
     kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_EIP, env->sysenter_eip);
     if (kvm_has_msr_star(env))
         kvm_msr_entry_set(&msrs[n++], MSR_STAR, env->star);
-    kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
     kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
 #ifdef TARGET_X86_64
     /* FIXME if lm capable */
@@ -567,8 +566,12 @@ static int kvm_put_msrs(CPUState *env)
     kvm_msr_entry_set(&msrs[n++], MSR_FMASK, env->fmask);
     kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
 #endif
-    kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr);
-    kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
+    if (level == KVM_PUT_FULL_STATE) {
+        kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
+        kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
+                          env->system_time_msr);
+        kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
+    }

     msr_data.info.nmsrs = n;

@@ -786,7 +789,7 @@ static int kvm_get_mp_state(CPUState *env)
 }
 #endif

-int kvm_put_vcpu_events(CPUState *env)
+int kvm_put_vcpu_events(CPUState *env, int level)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
     struct kvm_vcpu_events events;
@@ -810,8 +813,11 @@ int kvm_put_vcpu_events(CPUState *env)

     events.sipi_vector = env->sipi_vector;

-    events.flags =
-        KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+    events.flags = 0;
+    if (level >= KVM_PUT_RESET_STATE) {
+        events.flags |=
+            KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+    }

     return kvm_vcpu_ioctl(env, KVM_SET_VCPU_EVENTS, &events);
 #else
@@ -882,15 +888,17 @@ int kvm_arch_put_registers(CPUState *env, int level)
     if (ret < 0)
         return ret;

-    ret = kvm_put_msrs(env);
+    ret = kvm_put_msrs(env, level);
     if (ret < 0)
         return ret;

-    ret = kvm_put_mp_state(env);
-    if (ret < 0)
-        return ret;
+    if (level >= KVM_PUT_RESET_STATE) {
+        ret = kvm_put_mp_state(env);
+        if (ret < 0)
+            return ret;
+    }

-    ret = kvm_put_vcpu_events(env);
+    ret = kvm_put_vcpu_events(env, level);
     if (ret < 0)
         return ret;

diff --git a/target-i386/machine.c b/target-i386/machine.c
index 61e6a87..6fca559 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -362,7 +362,7 @@ static int cpu_post_load(void *opaque, int version_id)

     if (kvm_enabled()) {
         kvm_load_tsc(env);
-        kvm_put_vcpu_events(env);
+        kvm_put_vcpu_events(env, KVM_PUT_FULL_STATE);
     }

     return 0;
-- 
1.6.0.2
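The level-gated writeback introduced by this patch boils down to two small predicates. The following sketch restates them outside of QEMU; the enum, its numeric values, the flag macros and both function names are stand-ins chosen for the example (the patch itself only shows KVM_PUT_RESET_STATE and KVM_PUT_FULL_STATE being compared, not how they are defined).

```c
#include <stdint.h>

/* Hypothetical stand-ins for the KVM_PUT_* levels used in the patch. */
enum put_level {
    PUT_RUNTIME_STATE = 1,   /* assumed lowest level, not shown in the patch */
    PUT_RESET_STATE   = 2,
    PUT_FULL_STATE    = 3,
};

/* Hypothetical stand-ins for the KVM_VCPUEVENT_VALID_* flags. */
#define EV_VALID_NMI_PENDING  (1u << 0)
#define EV_VALID_SIPI_VECTOR  (1u << 1)

/* Which optional event fields should be handed to the kernel at this level. */
static uint32_t vcpu_event_flags(enum put_level level)
{
    uint32_t flags = 0;

    if (level >= PUT_RESET_STATE) {
        flags |= EV_VALID_NMI_PENDING | EV_VALID_SIPI_VECTOR;
    }
    return flags;
}

/* TSC and the KVM clock MSRs are only written on a full sync. */
static int writes_time_msrs(enum put_level level)
{
    return level == PUT_FULL_STATE;
}
```

The ordering of the levels matters: a reset implies everything a runtime sync writes, and a full sync implies everything a reset writes, so ">=" comparisons are enough for the reset-only fields while the time MSRs need an exact match on the top level.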
[Qemu-devel] [PATCH 09/21] qemu-kvm: Use upstream kvm_pit_in_kernel
Drop private version in favor of recently added upstream service and
track its state directly in KVMState.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 hw/i8254.c     |    4 ++--
 hw/pc.c        |    2 +-
 hw/pcspk.c     |    4 ++--
 kvm-all.c      |    2 +-
 kvm.h          |    2 +-
 qemu-kvm-x86.c |   12 ++++++------
 qemu-kvm.c     |    5 -----
 qemu-kvm.h     |   13 +------------
 8 files changed, 14 insertions(+), 30 deletions(-)

diff --git a/hw/i8254.c b/hw/i8254.c
index db9e94a..1add08e 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -491,7 +491,7 @@ void hpet_disable_pit(void)
 {
     PITChannelState *s = &pit_state.channels[0];

-    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+    if (kvm_enabled() && kvm_pit_in_kernel()) {
         if (qemu_kvm_has_pit_state2()) {
             kvm_hpet_disable_kpit();
         } else {
@@ -515,7 +515,7 @@ void hpet_enable_pit(void)
     PITState *pit = &pit_state;
     PITChannelState *s = &pit->channels[0];

-    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+    if (kvm_enabled() && kvm_pit_in_kernel()) {
         if (qemu_kvm_has_pit_state2()) {
             kvm_hpet_enable_kpit();
         } else {
diff --git a/hw/pc.c b/hw/pc.c
index dac373e..7a7dfa7 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -951,7 +951,7 @@ static void pc_init1(ram_addr_t ram_size,
         ioapic_irq_hack = isa_irq;
     }
 #ifdef CONFIG_KVM_PIT
-    if (kvm_enabled() && qemu_kvm_pit_in_kernel())
+    if (kvm_enabled() && kvm_pit_in_kernel())
         pit = kvm_pit_init(0x40, isa_reserve_irq(0));
     else
 #endif
diff --git a/hw/pcspk.c b/hw/pcspk.c
index 128836b..fb5f763 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -56,7 +56,7 @@ static void kvm_get_pit_ch2(PITState *pit,
 {
     struct kvm_pit_state pit_state;

-    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+    if (kvm_enabled() && kvm_pit_in_kernel()) {
         kvm_get_pit(kvm_context, &pit_state);
         pit->channels[2].mode = pit_state.channels[2].mode;
         pit->channels[2].count = pit_state.channels[2].count;
@@ -71,7 +71,7 @@ static void kvm_get_pit_ch2(PITState *pit,
 static void kvm_set_pit_ch2(PITState *pit,
                             struct kvm_pit_state *inkernel_state)
 {
-    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+    if (kvm_enabled() && kvm_pit_in_kernel()) {
         inkernel_state->channels[2].mode = pit->channels[2].mode;
         inkernel_state->channels[2].count = pit->channels[2].count;
         inkernel_state->channels[2].count_load_time =
diff --git a/kvm-all.c b/kvm-all.c
index e7fa605..6cbca97 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -164,13 +164,13 @@ int kvm_irqchip_in_kernel(void)
     return kvm_state->irqchip_in_kernel;
 }

-#ifdef KVM_UPSTREAM
 int kvm_pit_in_kernel(void)
 {
     return kvm_state->pit_in_kernel;
 }

+#ifdef KVM_UPSTREAM

 int kvm_init_vcpu(CPUState *env)
 {
     KVMState *s = kvm_state;
diff --git a/kvm.h b/kvm.h
index 189a5d4..253b45d 100644
--- a/kvm.h
+++ b/kvm.h
@@ -68,10 +68,10 @@ int kvm_remove_breakpoint(CPUState *current_env, target_ulong addr,
                           target_ulong len, int type);
 void kvm_remove_all_breakpoints(CPUState *current_env);
 int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap);
+#endif /* KVM_UPSTREAM */

 int kvm_pit_in_kernel(void);
 int kvm_irqchip_in_kernel(void);
-#endif /* KVM_UPSTREAM */

 /* internal API */

diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 0457a6e..074b510 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -119,13 +119,13 @@ static int kvm_create_pit(kvm_context_t kvm)
 #ifdef KVM_CAP_PIT
     int r;

-    kvm->pit_in_kernel = 0;
+    kvm_state->pit_in_kernel = 0;
     if (!kvm->no_pit_creation) {
         r = kvm_ioctl(kvm_state, KVM_CHECK_EXTENSION, KVM_CAP_PIT);
         if (r > 0) {
             r = kvm_vm_ioctl(kvm_state, KVM_CREATE_PIT);
             if (r >= 0)
-                kvm->pit_in_kernel = 1;
+                kvm_state->pit_in_kernel = 1;
             else {
                 fprintf(stderr, "Create kernel PIC irqchip failed\n");
                 return r;
@@ -311,14 +311,14 @@ int kvm_set_lapic(CPUState *env, struct kvm_lapic_state *s)

 int kvm_get_pit(kvm_context_t kvm, struct kvm_pit_state *s)
 {
-    if (!kvm->pit_in_kernel)
+    if (!kvm_pit_in_kernel())
         return 0;
     return kvm_vm_ioctl(kvm_state, KVM_GET_PIT, s);
 }

 int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s)
 {
-    if (!kvm->pit_in_kernel)
+    if (!kvm_pit_in_kernel())
         return 0;
     return kvm_vm_ioctl(kvm_state, KVM_SET_PIT, s);
 }
@@ -326,14 +326,14 @@ int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s)
 #ifdef KVM_CAP_PIT_STATE2
 int kvm_get_pit2(kvm_context_t kvm, struct kvm_pit_state2 *ps2)
 {
-    if (!kvm->pit_in_kernel)
+    if (!kvm_pit_in_kernel())
         return 0;
     return
[Qemu-devel] [PATCH 10/21] KVM: Move and rename regs_modified
Touching the user space representation of KVM's VCPU state is -
naturally - a per-VCPU thing. So move the dirty flag into KVM_CPU_COMMON
and rename it at this chance to reflect its true meaning.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 cpu-defs.h |    1 +
 kvm-all.c  |   12 ++++++------
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/cpu-defs.h b/cpu-defs.h
index cf502e9..49a9e8d 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -208,6 +208,7 @@ struct KVMCPUState {
     struct KVMState *kvm_state;                                         \
     struct kvm_run *kvm_run;                                            \
     int kvm_fd;                                                         \
+    int kvm_vcpu_dirty;                                                 \
     uint32_t stop;   /* Stop request */                                 \
     uint32_t stopped; /* Artificially stopped */                        \
     struct KVMCPUState kvm_cpu_state;
diff --git a/kvm-all.c b/kvm-all.c
index 6cbca97..3516f01 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -573,9 +573,9 @@ static void kvm_run_coalesced_mmio(CPUState *env, struct kvm_run *run)

 void kvm_cpu_synchronize_state(CPUState *env)
 {
-    if (!env->kvm_state->regs_modified) {
+    if (!env->kvm_vcpu_dirty) {
         kvm_arch_get_registers(env);
-        env->kvm_state->regs_modified = 1;
+        env->kvm_vcpu_dirty = 1;
     }
 }

@@ -593,9 +593,9 @@ int kvm_cpu_exec(CPUState *env)
             break;
         }

-        if (env->kvm_state->regs_modified) {
+        if (env->kvm_vcpu_dirty) {
             kvm_arch_put_registers(env);
-            env->kvm_state->regs_modified = 0;
+            env->kvm_vcpu_dirty = 0;
         }

         kvm_arch_pre_run(env, run);
@@ -951,9 +951,9 @@ static void kvm_invoke_set_guest_debug(void *data)
     struct kvm_set_guest_debug_data *dbg_data = data;
     CPUState *env = dbg_data->env;

-    if (env->kvm_vcpu_dirty) {
+    if (env->kvm_vcpu_dirty) {
         kvm_arch_put_registers(env);
-        env->kvm_state->regs_modified = 0;
+        env->kvm_vcpu_dirty = 0;
     }
     dbg_data->err = kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, &dbg_data->dbg);
 }
-- 
1.6.0.2
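The per-VCPU dirty flag implements a simple lazy synchronization scheme: read the kernel state at most once, and write it back only if user space may have touched it. A stripped-down model of that pattern, with made-up structure and function names rather than QEMU's, might look like this:

```c
#include <stdbool.h>

/* Minimal stand-in for a VCPU; all field names are illustrative only. */
struct vcpu {
    bool dirty;        /* user-space copy may be newer than the kernel's */
    int  kernel_regs;  /* pretend in-kernel register state */
    int  user_regs;    /* pretend user-space copy */
    int  reads;        /* number of kernel -> user transfers */
    int  writes;       /* number of user -> kernel transfers */
};

/* Fetch state from the kernel once, then treat the user copy as authoritative. */
static void vcpu_synchronize_state(struct vcpu *v)
{
    if (!v->dirty) {
        v->user_regs = v->kernel_regs;
        v->reads++;
        v->dirty = true;
    }
}

/* Before re-entering the guest, write back only if the copy was touched. */
static void vcpu_pre_run(struct vcpu *v)
{
    if (v->dirty) {
        v->kernel_regs = v->user_regs;
        v->writes++;
        v->dirty = false;
    }
}
```

Keeping the flag in the per-VCPU structure rather than in shared state means one VCPU's synchronization cannot mask or trigger a writeback for another.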
[Qemu-devel] [PATCH 12/21] qemu-kvm: Use upstream kvm_vcpu_dirty
Drop regs_modified in favor of upstream's equivalent and clean up
kvm_cpu_synchronize_state while at it.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 cpu-defs.h |    1 -
 hw/pc.c    |    2 +-
 qemu-kvm.c |   18 +-
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/cpu-defs.h b/cpu-defs.h
index 49a9e8d..c57d8df 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -142,7 +142,6 @@ struct KVMCPUState {
     pthread_t thread;
     int signalled;
     struct qemu_work_item *queued_work_first, *queued_work_last;
-    int regs_modified;
 };

 #define CPU_TEMP_BUF_NLONGS 128
diff --git a/hw/pc.c b/hw/pc.c
index 7a7dfa7..af6ea8b 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -744,7 +744,7 @@ CPUState *pc_new_cpu(const char *cpu_model)
         fprintf(stderr, "Unable to find x86 CPU definition\n");
         exit(1);
     }
-    env->kvm_cpu_state.regs_modified = 1;
+    env->kvm_vcpu_dirty = 1;
     if ((env->cpuid_features & CPUID_APIC) || smp_cpus > 1) {
         env->cpuid_apic_id = env->cpu_index;
         /* APIC reset callback resets cpu */
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 3ad0ec7..c04f805 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -861,9 +861,9 @@ int pre_kvm_run(kvm_context_t kvm, CPUState *env)
 {
     kvm_arch_pre_run(env, env->kvm_run);

-    if (env->kvm_cpu_state.regs_modified) {
+    if (env->kvm_vcpu_dirty) {
         kvm_arch_load_regs(env);
-        env->kvm_cpu_state.regs_modified = 0;
+        env->kvm_vcpu_dirty = 0;
     }

     pthread_mutex_unlock(&qemu_mutex);
@@ -1530,16 +1530,16 @@ static void on_vcpu(CPUState *env, void (*func)(void *data), void *data)
 static void do_kvm_cpu_synchronize_state(void *_env)
 {
     CPUState *env = _env;
-    if (!env->kvm_cpu_state.regs_modified) {
-        kvm_arch_save_regs(env);
-        env->kvm_cpu_state.regs_modified = 1;
-    }
+
+    kvm_arch_save_regs(env);
 }

 void kvm_cpu_synchronize_state(CPUState *env)
 {
-    if (!env->kvm_cpu_state.regs_modified)
+    if (!env->kvm_vcpu_dirty) {
         on_vcpu(env, do_kvm_cpu_synchronize_state, env);
+        env->kvm_vcpu_dirty = 1;
+    }
 }

 static void inject_interrupt(void *data)
@@ -2329,9 +2329,9 @@ static void kvm_invoke_set_guest_debug(void *data)
 {
     struct kvm_set_guest_debug_data *dbg_data = data;

-    if (cpu_single_env->kvm_cpu_state.regs_modified) {
+    if (cpu_single_env->kvm_vcpu_dirty) {
         kvm_arch_save_regs(cpu_single_env);
-        cpu_single_env->kvm_cpu_state.regs_modified = 0;
+        cpu_single_env->kvm_vcpu_dirty = 0;
     }
     dbg_data->err = kvm_set_guest_debug(cpu_single_env,
-- 
1.6.0.2
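For readers following the series, the dirty-flag protocol in the patch above can be modelled in a few lines of plain C. This is only a sketch with a mock kernel: struct toy_vcpu and every toy_* function are made up for illustration and are not QEMU code. It shows that the kernel state is fetched at most once per synchronization and written back at most once before the next guest entry.

```c
#include <assert.h>

/* Toy stand-ins for illustration only; none of these are QEMU types. */
struct toy_vcpu {
    int kernel_regs; /* register state as held by the (mock) kernel */
    int user_regs;   /* user space copy of the same state */
    int dirty;       /* plays the role of kvm_vcpu_dirty */
    int saves;       /* number of kernel -> user transfers */
    int loads;       /* number of user -> kernel transfers */
};

/* Fetch state from the kernel at most once, then mark the copy dirty. */
static void toy_cpu_synchronize_state(struct toy_vcpu *v)
{
    if (!v->dirty) {
        v->user_regs = v->kernel_regs;
        v->saves++;
        v->dirty = 1;
    }
}

/* Before entering the guest, write the copy back if it is dirty. */
static void toy_pre_run(struct toy_vcpu *v)
{
    if (v->dirty) {
        v->kernel_regs = v->user_regs;
        v->loads++;
        v->dirty = 0;
    }
}
```

Calling toy_cpu_synchronize_state twice in a row transfers state only once; user space may then modify user_regs freely, and the next toy_pre_run pushes the result to the mock kernel and clears the flag.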
[Qemu-devel] [PATCH 11/21] KVM: Rework of guest debug state writing
So far we synchronized any dirty VCPU state back into the kernel before
updating the guest debug state. This was a tribute to a deficit in x86
kernels before 2.6.33. But as this is an arch-dependent issue, it is
better handled in the x86 part of KVM, removing the writeback point from
generic code.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c         |   12
 target-i386/cpu.h |    9 -
 target-i386/kvm.c |   11 +++
 3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 3516f01..9c921cc 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -951,10 +951,6 @@ static void kvm_invoke_set_guest_debug(void *data)
     struct kvm_set_guest_debug_data *dbg_data = data;
     CPUState *env = dbg_data->env;

-    if (env->kvm_vcpu_dirty) {
-        kvm_arch_put_registers(env);
-        env->kvm_vcpu_dirty = 0;
-    }
     dbg_data->err = kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, &dbg_data->dbg);
 }

@@ -962,12 +958,12 @@ int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap)
 {
     struct kvm_set_guest_debug_data data;

-    data.dbg.control = 0;
-    if (env->singlestep_enabled)
-        data.dbg.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
+    data.dbg.control = reinject_trap;
+    if (env->singlestep_enabled) {
+        data.dbg.control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
+    }

     kvm_arch_update_guest_debug(env, &data.dbg);
-    data.dbg.control |= reinject_trap;
     data.env = env;

     on_vcpu(env, kvm_invoke_set_guest_debug, &data);
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 7d0bbd0..7787fb1 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -21,6 +21,10 @@

 #include "config.h"

+#ifdef CONFIG_KVM
+#include <linux/kvm.h> /* for kvm_guest_debug */
+#endif
+
 #ifdef TARGET_X86_64
 #define TARGET_LONG_BITS 64
 #else
@@ -718,7 +722,10 @@ typedef struct CPUX86State {
     uint8_t has_error_code;
     uint32_t sipi_vector;
     uint32_t cpuid_kvm_features;
-
+#if defined(CONFIG_KVM) && defined(KVM_CAP_SET_GUEST_DEBUG)
+    struct kvm_guest_debug kvm_guest_debug;
+#endif
+
     /* in order to simplify APIC support, we leave this pointer to the
        user */
     struct APICState *apic_state;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 8743f32..5ac12a8 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -865,6 +865,15 @@ int kvm_arch_put_registers(CPUState *env)
     if (ret < 0)
         return ret;

+    /*
+     * Kernels before 2.6.33 overwrote flags.TF injected via SET_GUEST_DEBUG
+     * while updating GP regs. Work around this by updating the debug state
+     * once again.
+     */
+    ret = kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, &env->kvm_guest_debug);
+    if (ret < 0)
+        return ret;
+
     ret = kvm_put_fpu(env);
     if (ret < 0)
         return ret;
@@ -1163,6 +1172,8 @@ void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg)
                 (len_code[hw_breakpoint[n].len] << (18 + n*4));
         }
     }
+    /* Keep a copy for the writeback workaround in kvm_arch_put_registers */
+    memcpy(&env->kvm_guest_debug, dbg, sizeof(env->kvm_guest_debug));
 }
 #endif /* KVM_CAP_SET_GUEST_DEBUG */
 #endif
-- 
1.6.0.2
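The two ideas in the patch above (build the control word starting from reinject_trap, and re-apply a cached copy of the debug state after every register writeback) can be illustrated with a mock. This is a sketch under stated assumptions: the TOY_* constants are stand-in values, struct toy_vcpu and all toy_* functions are hypothetical, and the "old kernel" behaviour is simulated by a function that simply clears the TF flag.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag values for illustration; real values come from linux/kvm.h. */
enum {
    TOY_GUESTDBG_ENABLE     = 0x1,
    TOY_GUESTDBG_SINGLESTEP = 0x2,
    TOY_GUESTDBG_INJECT_DB  = 0x40000
};

struct toy_vcpu {
    int singlestep_enabled;
    uint32_t cached_control; /* copy kept for the writeback workaround */
    uint32_t kernel_control; /* what the mock kernel last received */
    int kernel_tf;           /* mock of flags.TF inside the kernel */
};

/* Start from the reinject bits, then OR in single-stepping if requested. */
static uint32_t toy_debug_control(int singlestep_enabled, uint32_t reinject_trap)
{
    uint32_t control = reinject_trap;

    if (singlestep_enabled) {
        control |= TOY_GUESTDBG_ENABLE | TOY_GUESTDBG_SINGLESTEP;
    }
    return control;
}

/* Mock of the SET_GUEST_DEBUG call: single-stepping sets TF. */
static void toy_set_guest_debug(struct toy_vcpu *v, uint32_t control)
{
    v->kernel_control = control;
    v->kernel_tf = (control & TOY_GUESTDBG_SINGLESTEP) != 0;
}

/* Mock of an old kernel: writing general-purpose registers clobbers TF. */
static void toy_old_kernel_set_regs(struct toy_vcpu *v)
{
    v->kernel_tf = 0;
}

/* Compute the control word, cache it, and hand it to the mock kernel. */
static void toy_update_guest_debug(struct toy_vcpu *v, uint32_t reinject_trap)
{
    v->cached_control = toy_debug_control(v->singlestep_enabled, reinject_trap);
    toy_set_guest_debug(v, v->cached_control);
}

/* Register writeback followed by re-applying the cached debug state. */
static void toy_put_registers(struct toy_vcpu *v)
{
    toy_old_kernel_set_regs(v);
    toy_set_guest_debug(v, v->cached_control);
}
```

Without the second toy_set_guest_debug call in toy_put_registers, the mock TF flag would stay cleared after a writeback, which is the failure mode the patch comment describes for kernels before 2.6.33.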
[Qemu-devel] [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization
Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86, properly synchronize with halted in the accessor functions. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- hw/apic.c |7 qemu-kvm-ia64.c |4 ++- qemu-kvm-x86.c| 88 +++- qemu-kvm.c| 30 - qemu-kvm.h| 15 target-i386/machine.c |6 --- target-ia64/machine.c |3 ++ 7 files changed, 55 insertions(+), 98 deletions(-) diff --git a/hw/apic.c b/hw/apic.c index 3e03e10..092c61e 100644 --- a/hw/apic.c +++ b/hw/apic.c @@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env) s-wait_for_sipi = 1; env-halted = !(s-apicbase MSR_IA32_APICBASE_BSP); -#ifdef KVM_CAP_MP_STATE -if (kvm_enabled() kvm_irqchip_in_kernel()) { -env-mp_state -= env-halted ? KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE; -kvm_load_mpstate(env); -} -#endif } static void apic_startup(APICState *s, int vector_num) diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c index fc8110e..39bcbeb 100644 --- a/qemu-kvm-ia64.c +++ b/qemu-kvm-ia64.c @@ -124,7 +124,9 @@ void kvm_arch_cpu_reset(CPUState *env) { if (kvm_irqchip_in_kernel(kvm_context)) { #ifdef KVM_CAP_MP_STATE - kvm_reset_mpstate(env-kvm_cpu_state.vcpu_ctx); +struct kvm_mp_state mp_state = {.mp_state = KVM_MP_STATE_UNINITIALIZED +}; +kvm_set_mpstate(env, mp_state); #endif } else { env-interrupt_request = ~CPU_INTERRUPT_HARD; diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 63cd095..6b5895f 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -754,6 +754,48 @@ static int get_msr_entry(struct kvm_msr_entry *entry, CPUState *env) return 0; } +static void kvm_arch_save_mpstate(CPUState *env) +{ +#ifdef KVM_CAP_MP_STATE +int r; +struct kvm_mp_state mp_state; + +r = kvm_get_mpstate(env, mp_state); +if (r 0) { +env-mp_state = -1; +} else { +env-mp_state = mp_state.mp_state; +if (kvm_irqchip_in_kernel()) { +env-halted = (env-mp_state == KVM_MP_STATE_HALTED); +} +} +#else +env-mp_state = -1; +#endif +} + +static void kvm_arch_load_mpstate(CPUState *env) +{ +#ifdef KVM_CAP_MP_STATE +struct 
kvm_mp_state mp_state; + +/* + * -1 indicates that the host did not support GET_MP_STATE ioctl, + * so don't touch it. + */ +if (env-mp_state != -1) { +if (kvm_irqchip_in_kernel()) { +env-mp_state = env-halted ? KVM_MP_STATE_UNINITIALIZED : + KVM_MP_STATE_RUNNABLE; +/* Avoid deadlock: no user space IRQ will ever clear it. */ +env-halted = 0; +} +mp_state.mp_state = env-mp_state; +kvm_set_mpstate(env, mp_state); +} +#endif +} + static void set_v8086_seg(struct kvm_segment *lhs, const SegmentCache *rhs) { lhs-selector = rhs-selector; @@ -926,6 +968,10 @@ void kvm_arch_load_regs(CPUState *env, int level) rc = kvm_set_msrs(env, msrs, n); if (rc == -1) perror(kvm_set_msrs FAILED); + +if (level = KVM_PUT_RESET_STATE) { +kvm_arch_load_mpstate(env); +} } void kvm_load_tsc(CPUState *env) @@ -940,36 +986,6 @@ void kvm_load_tsc(CPUState *env) perror(kvm_set_tsc FAILED.\n); } -void kvm_arch_save_mpstate(CPUState *env) -{ -#ifdef KVM_CAP_MP_STATE -int r; -struct kvm_mp_state mp_state; - -r = kvm_get_mpstate(env, mp_state); -if (r 0) -env-mp_state = -1; -else -env-mp_state = mp_state.mp_state; -#else -env-mp_state = -1; -#endif -} - -void kvm_arch_load_mpstate(CPUState *env) -{ -#ifdef KVM_CAP_MP_STATE -struct kvm_mp_state mp_state = { .mp_state = env-mp_state }; - -/* - * -1 indicates that the host did not support GET_MP_STATE ioctl, - * so don't touch it. 
- */ -if (env-mp_state != -1) -kvm_set_mpstate(env, mp_state); -#endif -} - void kvm_arch_save_regs(CPUState *env) { struct kvm_regs regs; @@ -1366,15 +1382,9 @@ void kvm_arch_cpu_reset(CPUState *env) { kvm_arch_reset_vcpu(env); kvm_put_vcpu_events(env); -if (!cpu_is_bsp(env)) { - if (kvm_irqchip_in_kernel()) { -#ifdef KVM_CAP_MP_STATE - kvm_reset_mpstate(env); -#endif - } else { - env-interrupt_request = ~CPU_INTERRUPT_HARD; - env-halted = 1; - } +if (!cpu_is_bsp(env) !kvm_irqchip_in_kernel()) { +env-interrupt_request = ~CPU_INTERRUPT_HARD; +env-halted = 1; } } diff --git a/qemu-kvm.c b/qemu-kvm.c index 53030f1..efa6a29 100644 --- a/qemu-kvm.c +++ b/qemu-kvm.c @@ -1579,36 +1579,6 @@ void kvm_update_interrupt_request(CPUState *env) } } -static void kvm_do_load_mpstate(void *_env) -{ -CPUState *env = _env; - -kvm_arch_load_mpstate(env); -} - -void kvm_load_mpstate(CPUState *env) -{ -if
[Qemu-devel] [PATCH 05/21] qemu-kvm: Use upstream kvm_enabled and cpu_synchronize_state
They are identical, no need for private copies. This requires replacing qemu-kvm.h includes with kvm.h, a good thing anyway, and reveals that there is no need for QEMU_KVM_NO_CPU protection. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- hw/i8254.c|2 +- hw/i8259.c|2 +- hw/ioapic.c |2 +- hw/msix.c |3 +-- hw/pc.c |2 +- hw/piix_pci.c |2 +- kvm.h |7 +++ qemu-kvm.h| 41 - vl.c |3 ++- 9 files changed, 11 insertions(+), 53 deletions(-) diff --git a/hw/i8254.c b/hw/i8254.c index c4f8d2e..db9e94a 100644 --- a/hw/i8254.c +++ b/hw/i8254.c @@ -25,7 +25,7 @@ #include pc.h #include isa.h #include qemu-timer.h -#include qemu-kvm.h +#include kvm.h #include i8254.h //#define DEBUG_PIT diff --git a/hw/i8259.c b/hw/i8259.c index 7a484c0..b64c6fb 100644 --- a/hw/i8259.c +++ b/hw/i8259.c @@ -27,7 +27,7 @@ #include monitor.h #include qemu-timer.h -#include qemu-kvm.h +#include kvm.h /* debug PIC */ //#define DEBUG_PIC diff --git a/hw/ioapic.c b/hw/ioapic.c index a66325d..0adb0ac 100644 --- a/hw/ioapic.c +++ b/hw/ioapic.c @@ -26,7 +26,7 @@ #include qemu-timer.h #include host-utils.h -#include qemu-kvm.h +#include kvm.h //#define DEBUG_IOAPIC diff --git a/hw/msix.c b/hw/msix.c index 87f125b..faee0b2 100644 --- a/hw/msix.c +++ b/hw/msix.c @@ -14,8 +14,7 @@ #include hw.h #include msix.h #include pci.h -#define QEMU_KVM_NO_CPU -#include qemu-kvm.h +#include kvm.h /* Declaration from linux/pci_regs.h */ #define PCI_CAP_ID_MSIX 0x11 /* MSI-X */ diff --git a/hw/pc.c b/hw/pc.c index 97e16ce..dac373e 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -47,7 +47,7 @@ #include multiboot.h #include device-assignment.h -#include qemu-kvm.h +#include kvm.h /* output Bochs bios info messages */ //#define DEBUG_BIOS diff --git a/hw/piix_pci.c b/hw/piix_pci.c index 155587b..170f858 100644 --- a/hw/piix_pci.c +++ b/hw/piix_pci.c @@ -28,7 +28,7 @@ #include pci_host.h #include isa.h #include sysbus.h -#include qemu-kvm.h +#include kvm.h /* * I440FX chipset data sheet. 
diff --git a/kvm.h b/kvm.h index 9fa4e25..d0f4bbe 100644 --- a/kvm.h +++ b/kvm.h @@ -18,8 +18,6 @@ #include qemu-queue.h #include qemu-kvm.h -#ifdef KVM_UPSTREAM - #ifdef CONFIG_KVM extern int kvm_allowed; @@ -28,6 +26,7 @@ extern int kvm_allowed; #define kvm_enabled() (0) #endif +#ifdef KVM_UPSTREAM struct kvm_run; /* external API */ @@ -138,6 +137,8 @@ int kvm_check_extension(KVMState *s, unsigned int extension); uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, int reg); +#endif + void kvm_cpu_synchronize_state(CPUState *env); /* generic hooks - to be moved/refactored once there are more users */ @@ -150,5 +151,3 @@ static inline void cpu_synchronize_state(CPUState *env) } #endif - -#endif diff --git a/qemu-kvm.h b/qemu-kvm.h index 1354227..d838bca 100644 --- a/qemu-kvm.h +++ b/qemu-kvm.h @@ -8,9 +8,7 @@ #ifndef THE_ORIGINAL_AND_TRUE_QEMU_KVM_H #define THE_ORIGINAL_AND_TRUE_QEMU_KVM_H -#ifndef QEMU_KVM_NO_CPU #include cpu.h -#endif #include signal.h #include stdlib.h @@ -94,8 +92,6 @@ void kvm_show_code(CPUState *env); int handle_halt(CPUState *env); -#ifndef QEMU_KVM_NO_CPU - int handle_shutdown(kvm_context_t kvm, CPUState *env); void post_kvm_run(kvm_context_t kvm, CPUState *env); int pre_kvm_run(kvm_context_t kvm, CPUState *env); @@ -113,8 +109,6 @@ struct kvm_x86_mce; int kvm_set_mce(CPUState *env, struct kvm_x86_mce *mce); #endif -#endif - /*! 
* \brief Create new KVM context * @@ -880,8 +874,6 @@ static inline int kvm_init(int smp_cpus) return 0; } -#ifndef QEMU_KVM_NO_CPU - static inline void kvm_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, uint64_t mcg_status, uint64_t addr, uint64_t misc, @@ -891,16 +883,11 @@ static inline void kvm_inject_x86_mce(CPUState *cenv, int bank, abort(); } -#endif - -extern int kvm_allowed; - #endif /* !CONFIG_KVM */ int kvm_main_loop(void); int kvm_init_ap(void); -#ifndef QEMU_KVM_NO_CPU int kvm_vcpu_inited(CPUState *env); void kvm_load_mpstate(CPUState *env); void kvm_save_mpstate(CPUState *env); @@ -914,7 +901,6 @@ int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap); void kvm_apic_init(CPUState *env); /* called from vcpu initialization */ void qemu_kvm_load_lapic(CPUState *env); -#endif void kvm_hpet_enable_kpit(void); void kvm_hpet_disable_kpit(void); @@ -923,13 +909,11 @@ int kvm_set_irq(int irq, int level, int *status); int kvm_physical_memory_set_dirty_tracking(int enable); int kvm_update_dirty_pages_log(void); -#ifndef QEMU_KVM_NO_CPU void qemu_kvm_call_with_env(void (*func)(void *), void *data, CPUState *env); void qemu_kvm_cpuid_on_env(CPUState *env); void
[Qemu-devel] [PATCH 06/21] qemu-kvm: Use upstream kvm_setup_guest_memory
Nothing is missing in upstream's kvm_setup_guest_memory; it is even more
careful about error handling.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c  |    3 ---
 kvm.h      |    3 +--
 qemu-kvm.c |   15 ---
 qemu-kvm.h |    1 -
 4 files changed, 1 insertions(+), 21 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 0423fff..e7fa605 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -886,7 +886,6 @@ int kvm_has_vcpu_events(void)
     return kvm_state->vcpu_events;
 }

-#ifdef KVM_UPSTREAM
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
@@ -905,8 +904,6 @@ void kvm_setup_guest_memory(void *start, size_t size)
     }
 }

-#endif /* KVM_UPSTREAM */
-
 #ifdef KVM_CAP_SET_GUEST_DEBUG
 #ifdef KVM_UPSTREAM

diff --git a/kvm.h b/kvm.h
index d0f4bbe..05ee540 100644
--- a/kvm.h
+++ b/kvm.h
@@ -54,10 +54,9 @@ int kvm_has_vcpu_events(void);
 int kvm_put_vcpu_events(CPUState *env);
 int kvm_get_vcpu_events(CPUState *env);

-#ifdef KVM_UPSTREAM
-
 void kvm_setup_guest_memory(void *start, size_t size);

+#ifdef KVM_UPSTREAM
 int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);

diff --git a/qemu-kvm.c b/qemu-kvm.c
index 97c098c..76f056c 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -2321,21 +2321,6 @@ void kvm_set_phys_mem(target_phys_addr_t start_addr, ram_addr_t size,
     return;
 }

-int kvm_setup_guest_memory(void *area, unsigned long size)
-{
-    int ret = 0;
-
-#ifdef MADV_DONTFORK
-    if (kvm_enabled() && !kvm_has_sync_mmu())
-        ret = madvise(area, size, MADV_DONTFORK);
-#endif
-
-    if (ret)
-        perror("madvise");
-
-    return ret;
-}
-
 #ifdef KVM_CAP_SET_GUEST_DEBUG

 struct kvm_set_guest_debug_data {
diff --git a/qemu-kvm.h b/qemu-kvm.h
index d838bca..0664c1d 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -923,7 +923,6 @@ void kvm_cpu_destroy_phys_mem(target_phys_addr_t start_addr,
                               unsigned long size);
 void kvm_qemu_log_memory(target_phys_addr_t start, target_phys_addr_t size,
                          int log);
-int kvm_setup_guest_memory(void *area, unsigned long size);
 int kvm_qemu_create_memory_alias(uint64_t phys_start, uint64_t len,
                                  uint64_t target_phys);
 int kvm_qemu_destroy_memory_alias(uint64_t phys_start);
-- 
1.6.0.2
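As a standalone illustration of what the removed helper does, here is a minimal sketch that marks an anonymous mapping with MADV_DONTFORK when no synchronous MMU is available. Assumptions: this is Linux-only, has_sync_mmu is a plain parameter standing in for kvm_has_sync_mmu(), and toy_setup_guest_memory and toy_demo_dontfork are hypothetical names. How upstream reacts to a failure is not visible in the diff above, so this version merely reports the error to its caller.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 on success or when nothing needs to be done, -1 on failure. */
static int toy_setup_guest_memory(void *start, size_t size, int has_sync_mmu)
{
    if (has_sync_mmu) {
        return 0; /* no protection against fork needed in this model */
    }
#ifdef MADV_DONTFORK
    if (madvise(start, size, MADV_DONTFORK) != 0) {
        perror("madvise");
        return -1;
    }
    return 0;
#else
    fprintf(stderr, "MADV_DONTFORK is not available on this host\n");
    return -1;
#endif
}

/* Map one page, apply the advice, and unmap it again. */
static int toy_demo_dontfork(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;
    int ret;

    if (page <= 0) {
        return -1;
    }
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    ret = toy_setup_guest_memory(p, (size_t)page, 0);
    munmap(p, (size_t)page);
    return ret;
}
```

madvise expects a page-aligned start address, which is why the demo takes its memory from mmap rather than malloc.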
[Qemu-devel] [PATCH 21/21] qemu-kvm: Bring qemu_init_vcpu back home
There is no need for this hack anymore; initialization is now robust against reordering as it doesn't try to write the VCPU state on its own. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- hw/pc.c |5 - target-i386/helper.c |2 ++ 2 files changed, 2 insertions(+), 5 deletions(-) diff --git a/hw/pc.c b/hw/pc.c index 3df6195..cd0746c 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -751,11 +751,6 @@ CPUState *pc_new_cpu(const char *cpu_model) } else { qemu_register_reset((QEMUResetHandler*)cpu_reset, env); } - -/* kvm needs this to run after the apic is initialized. Otherwise, - * it can access invalid state and crash. - */ -qemu_init_vcpu(env); return env; } diff --git a/target-i386/helper.c b/target-i386/helper.c index f9d63f6..f83e8cc 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -1953,6 +1953,8 @@ CPUX86State *cpu_x86_init(const char *cpu_model) } mce_init(env); +qemu_init_vcpu(env); + return env; } -- 1.6.0.2
[Qemu-devel] [PATCH 19/21] qemu-kvm: Clean up KVM's APIC hooks
The APIC is part of the VCPU state, so trigger its readout and writeback from kvm_arch_save/load_regs. Thanks to the transparent sync on reset and vmsave/load, we can also drop explicit sync code, reducing the diff to upstream. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- hw/apic.c | 37 + qemu-kvm-x86.c |4 ++-- qemu-kvm.h |5 ++--- 3 files changed, 9 insertions(+), 37 deletions(-) diff --git a/hw/apic.c b/hw/apic.c index 092c61e..d8c4f7c 100644 --- a/hw/apic.c +++ b/hw/apic.c @@ -24,8 +24,6 @@ #include host-utils.h #include kvm.h -#include qemu-kvm.h - //#define DEBUG_APIC /* APIC Local Vector Table */ @@ -951,36 +949,22 @@ static void kvm_kernel_lapic_load_from_user(APICState *s) #endif -void qemu_kvm_load_lapic(CPUState *env) +void kvm_load_lapic(CPUState *env) { #ifdef KVM_CAP_IRQCHIP -if (kvm_enabled() kvm_vcpu_inited(env) kvm_irqchip_in_kernel()) { -kvm_kernel_lapic_load_from_user(env-apic_state); -} -#endif -} - -static void apic_pre_save(void *opaque) -{ -#ifdef KVM_CAP_IRQCHIP -APICState *s = (void *)opaque; - if (kvm_enabled() kvm_irqchip_in_kernel()) { -kvm_kernel_lapic_save_to_user(s); +kvm_kernel_lapic_load_from_user(env-apic_state); } #endif } -static int apic_post_load(void *opaque, int version_id) +void kvm_save_lapic(CPUState *env) { #ifdef KVM_CAP_IRQCHIP -APICState *s = opaque; - if (kvm_enabled() kvm_irqchip_in_kernel()) { -kvm_kernel_lapic_load_from_user(s); +kvm_kernel_lapic_save_to_user(env-apic_state); } #endif -return 0; } /* This function is only used for old state version 1 and 2 */ @@ -1019,9 +1003,6 @@ static int apic_load_old(QEMUFile *f, void *opaque, int version_id) if (version_id = 2) qemu_get_timer(f, s-timer); - -qemu_kvm_load_lapic(s-cpu_env); - return 0; } @@ -1052,9 +1033,7 @@ static const VMStateDescription vmstate_apic = { VMSTATE_INT64(next_time, APICState), VMSTATE_TIMER(timer, APICState), VMSTATE_END_OF_LIST() -}, -.pre_save = apic_pre_save, -.post_load = apic_post_load, +} }; static void apic_reset(void *opaque) @@ 
-1077,7 +1056,6 @@ static void apic_reset(void *opaque) */ s-lvt[APIC_LVT_LINT0] = 0x700; } -qemu_kvm_load_lapic(s-cpu_env); } static CPUReadMemoryFunc * const apic_mem_read[3] = { @@ -1121,11 +1099,6 @@ int apic_init(CPUState *env) vmstate_register(s-idx, vmstate_apic, s); qemu_register_reset(apic_reset, s); -/* apic_reset must be called before the vcpu threads are initialized and load - * registers, in qemu-kvm. - */ -apic_reset(s); - local_apics[s-idx] = s; return 0; } diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 4b78570..9de018e 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -974,6 +974,7 @@ void kvm_arch_load_regs(CPUState *env, int level) if (level = KVM_PUT_RESET_STATE) { kvm_arch_load_mpstate(env); +kvm_load_lapic(env); } kvm_put_vcpu_events(env, level); @@ -1134,6 +1135,7 @@ void kvm_arch_save_regs(CPUState *env) } } kvm_arch_save_mpstate(env); +kvm_save_lapic(env); kvm_get_vcpu_events(env); } @@ -1207,8 +1209,6 @@ int kvm_arch_init_vcpu(CPUState *cenv) CPUState copy; uint32_t i, j, limit; -qemu_kvm_load_lapic(cenv); - kvm_arch_reset_vcpu(cenv); #ifdef KVM_CPUID_SIGNATURE diff --git a/qemu-kvm.h b/qemu-kvm.h index 2af206c..fea23a4 100644 --- a/qemu-kvm.h +++ b/qemu-kvm.h @@ -864,9 +864,8 @@ static inline void kvm_inject_x86_mce(CPUState *cenv, int bank, int kvm_main_loop(void); int kvm_init_ap(void); int kvm_vcpu_inited(CPUState *env); -void kvm_apic_init(CPUState *env); -/* called from vcpu initialization */ -void qemu_kvm_load_lapic(CPUState *env); +void kvm_save_lapic(CPUState *env); +void kvm_load_lapic(CPUState *env); void kvm_hpet_enable_kpit(void); void kvm_hpet_disable_kpit(void); -- 1.6.0.2
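The level check that the patch above adds to kvm_arch_load_regs can be pictured as a function from writeback level to the set of substates written. The sketch below is an illustration only: the numeric values of the toy levels are assumptions, and toy_load_regs and the TOY_WROTE_* bits are hypothetical names. It models only the three substates discussed here (general registers, mpstate, local APIC).

```c
#include <assert.h>

/* Assumed ordering: higher levels write a superset of lower ones. */
enum toy_put_level {
    TOY_PUT_ASYNC_STATE = 1,
    TOY_PUT_RESET_STATE = 2,
    TOY_PUT_FULL_STATE  = 3
};

/* One bit per substate that a writeback touched. */
enum {
    TOY_WROTE_REGS    = 1,
    TOY_WROTE_MPSTATE = 2,
    TOY_WROTE_LAPIC   = 4
};

/* Returns a bit mask of the substates written for the given level. */
static unsigned toy_load_regs(enum toy_put_level level)
{
    unsigned written = TOY_WROTE_REGS; /* always written */

    if (level >= TOY_PUT_RESET_STATE) {
        /* Only safe while all VCPUs are stopped. */
        written |= TOY_WROTE_MPSTATE | TOY_WROTE_LAPIC;
    }
    return written;
}
```

The save direction has no such gate in the diff: kvm_arch_save_regs reads mpstate and the local APIC unconditionally, so only writeback depends on the level.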
[Qemu-devel] [PATCH 08/21] qemu-kvm: Use upstream kvm_arch_get_supported_cpuid
It is identical to our version now, so drop the copy. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- kvm.h |3 - qemu-kvm-x86.c| 106 - qemu-kvm.h|5 -- target-i386/kvm.c |4 +- 4 files changed, 2 insertions(+), 116 deletions(-) diff --git a/kvm.h b/kvm.h index b5ed744..189a5d4 100644 --- a/kvm.h +++ b/kvm.h @@ -137,11 +137,8 @@ void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg); int kvm_check_extension(KVMState *s, unsigned int extension); -#ifdef KVM_UPSTREAM uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, int reg); -#endif - void kvm_cpu_synchronize_state(CPUState *env); /* generic hooks - to be moved/refactored once there are more users */ diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 7f820a4..0457a6e 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -627,106 +627,6 @@ int kvm_disable_tpr_access_reporting(CPUState *env) #endif -#ifdef KVM_CAP_EXT_CPUID - -static struct kvm_cpuid2 *try_get_cpuid(kvm_context_t kvm, int max) -{ - struct kvm_cpuid2 *cpuid; - int r, size; - - size = sizeof(*cpuid) + max * sizeof(*cpuid-entries); - cpuid = qemu_malloc(size); - cpuid-nent = max; - r = kvm_ioctl(kvm_state, KVM_GET_SUPPORTED_CPUID, cpuid); - if (r == 0 cpuid-nent = max) - r = -E2BIG; - if (r 0) { - if (r == -E2BIG) { - free(cpuid); - return NULL; - } else { - fprintf(stderr, KVM_GET_SUPPORTED_CPUID failed: %s\n, - strerror(-r)); - exit(1); - } - } - return cpuid; -} - -#define R_EAX 0 -#define R_ECX 1 -#define R_EDX 2 -#define R_EBX 3 -#define R_ESP 4 -#define R_EBP 5 -#define R_ESI 6 -#define R_EDI 7 - -uint32_t kvm_get_supported_cpuid(kvm_context_t kvm, uint32_t function, int reg) -{ - struct kvm_cpuid2 *cpuid; - int i, max; - uint32_t ret = 0; - uint32_t cpuid_1_edx; - - if (!kvm_check_extension(kvm_state, KVM_CAP_EXT_CPUID)) { - return -1U; - } - - max = 1; - while ((cpuid = try_get_cpuid(kvm, max)) == NULL) { - max *= 2; - } - - for (i = 0; i cpuid-nent; ++i) { - if (cpuid-entries[i].function == function) { 
- switch (reg) { - case R_EAX: - ret = cpuid-entries[i].eax; - break; - case R_EBX: - ret = cpuid-entries[i].ebx; - break; - case R_ECX: - ret = cpuid-entries[i].ecx; - break; - case R_EDX: - ret = cpuid-entries[i].edx; -if (function == 1) { -/* kvm misreports the following features - */ -ret |= 1 12; /* MTRR */ -ret |= 1 16; /* PAT */ -ret |= 1 7; /* MCE */ -ret |= 1 14; /* MCA */ -} - - /* On Intel, kvm returns cpuid according to -* the Intel spec, so add missing bits -* according to the AMD spec: -*/ - if (function == 0x8001) { - cpuid_1_edx = kvm_get_supported_cpuid(kvm, 1, R_EDX); - ret |= cpuid_1_edx 0xdfeff7ff; - } - break; - } - } - } - - free(cpuid); - - return ret; -} - -#else - -uint32_t kvm_get_supported_cpuid(kvm_context_t kvm, uint32_t function, int reg) -{ - return -1U; -} - -#endif int kvm_qemu_create_memory_alias(uint64_t phys_start, uint64_t len, uint64_t target_phys) @@ -1686,12 +1586,6 @@ int kvm_arch_init_irq_routing(void) return 0; } -uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, - int reg) -{ -return kvm_get_supported_cpuid(kvm_context, function, reg); -} - void kvm_arch_process_irqchip_events(CPUState *env) { if (env-interrupt_request CPU_INTERRUPT_INIT) { diff --git a/qemu-kvm.h b/qemu-kvm.h index 150017d..7b75fdd 100644 --- a/qemu-kvm.h +++ b/qemu-kvm.h @@ -859,8 +859,6 @@ int kvm_assign_set_msix_entry(kvm_context_t kvm,
[Qemu-devel] [PATCH 07/21] qemu-kvm: Use some more upstream prototypes
Drop our private typedef of KVMState and use more identical upstream prototypes. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- kvm.h | 10 +++--- qemu-kvm.c |4 +++- qemu-kvm.h | 24 +++- 3 files changed, 13 insertions(+), 25 deletions(-) diff --git a/kvm.h b/kvm.h index 05ee540..b5ed744 100644 --- a/kvm.h +++ b/kvm.h @@ -32,11 +32,13 @@ struct kvm_run; /* external API */ int kvm_init(int smp_cpus); +#endif /* KVM_UPSTREAM */ int kvm_init_vcpu(CPUState *env); int kvm_cpu_exec(CPUState *env); +#ifdef KVM_UPSTREAM void kvm_set_phys_mem(target_phys_addr_t start_addr, ram_addr_t size, ram_addr_t phys_offset); @@ -47,19 +49,19 @@ int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr, int kvm_log_start(target_phys_addr_t phys_addr, ram_addr_t size); int kvm_log_stop(target_phys_addr_t phys_addr, ram_addr_t size); int kvm_set_migration_log(int enable); +#endif /* KVM_UPSTREAM */ int kvm_has_sync_mmu(void); -#endif /* KVM_UPSTREAM */ int kvm_has_vcpu_events(void); int kvm_put_vcpu_events(CPUState *env); int kvm_get_vcpu_events(CPUState *env); void kvm_setup_guest_memory(void *start, size_t size); -#ifdef KVM_UPSTREAM int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size); int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size); +#ifdef KVM_UPSTREAM int kvm_insert_breakpoint(CPUState *current_env, target_ulong addr, target_ulong len, int type); int kvm_remove_breakpoint(CPUState *current_env, target_ulong addr, @@ -69,6 +71,7 @@ int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap); int kvm_pit_in_kernel(void); int kvm_irqchip_in_kernel(void); +#endif /* KVM_UPSTREAM */ /* internal API */ @@ -97,7 +100,6 @@ int kvm_arch_init(KVMState *s, int smp_cpus); int kvm_arch_init_vcpu(CPUState *env); -#endif void kvm_arch_reset_vcpu(CPUState *env); #ifdef KVM_UPSTREAM @@ -131,9 +133,11 @@ int kvm_arch_remove_hw_breakpoint(target_ulong addr, void kvm_arch_remove_all_hw_breakpoints(void); void 
kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg); +#endif int kvm_check_extension(KVMState *s, unsigned int extension); +#ifdef KVM_UPSTREAM uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, int reg); #endif diff --git a/qemu-kvm.c b/qemu-kvm.c index 76f056c..12442a7 100644 --- a/qemu-kvm.c +++ b/qemu-kvm.c @@ -1909,12 +1909,14 @@ static void *ap_main_loop(void *_env) return NULL; } -void kvm_init_vcpu(CPUState *env) +int kvm_init_vcpu(CPUState *env) { pthread_create(env-kvm_cpu_state.thread, NULL, ap_main_loop, env); while (env-created == 0) qemu_cond_wait(qemu_vcpu_cond); + +return 0; } int kvm_vcpu_inited(CPUState *env) diff --git a/qemu-kvm.h b/qemu-kvm.h index 0664c1d..150017d 100644 --- a/qemu-kvm.h +++ b/qemu-kvm.h @@ -891,7 +891,6 @@ int kvm_init_ap(void); int kvm_vcpu_inited(CPUState *env); void kvm_load_mpstate(CPUState *env); void kvm_save_mpstate(CPUState *env); -int kvm_cpu_exec(CPUState *env); int kvm_insert_breakpoint(CPUState * current_env, target_ulong addr, target_ulong len, int type); int kvm_remove_breakpoint(CPUState * current_env, target_ulong addr, @@ -933,9 +932,6 @@ void kvm_arch_save_regs(CPUState *env); void kvm_arch_load_regs(CPUState *env); void kvm_arch_load_mpstate(CPUState *env); void kvm_arch_save_mpstate(CPUState *env); -int kvm_arch_init_vcpu(CPUState *cenv); -int kvm_arch_pre_run(CPUState *env, struct kvm_run *run); -int kvm_arch_post_run(CPUState *env, struct kvm_run *run); int kvm_arch_has_work(CPUState *env); void kvm_arch_process_irqchip_events(CPUState *env); int kvm_arch_try_push_interrupts(void *opaque); @@ -981,8 +977,6 @@ void kvm_tpr_access_report(CPUState *env, uint64_t rip, int is_write); void kvm_tpr_vcpu_start(CPUState *env); int qemu_kvm_get_dirty_pages(unsigned long phys_addr, void *buf); -int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size); -int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size); int 
kvm_arch_init_irq_routing(void); @@ -1021,17 +1015,14 @@ void qemu_kvm_cpu_stop(CPUState *env); int kvm_arch_halt(CPUState *env); int handle_tpr_access(void *opaque, CPUState *env, uint64_t rip, int is_write); -int kvm_has_sync_mmu(void); #define qemu_kvm_pit_in_kernel() kvm_pit_in_kernel(kvm_context) #define qemu_kvm_has_gsi_routing() kvm_has_gsi_routing(kvm_context) #ifdef TARGET_I386 #define qemu_kvm_has_pit_state2() kvm_has_pit_state2(kvm_context) #endif -void kvm_init_vcpu(CPUState *env); void kvm_load_tsc(CPUState *env); #else -#define kvm_has_sync_mmu() (0) #define kvm_nested 0 #define qemu_kvm_pit_in_kernel() (0) #define qemu_kvm_has_gsi_routing() (0) @@ -1040,10 +1031,6 @@ void
[Qemu-devel] [PATCH 13/21] qemu-kvm: Use upstream guest debug code
The code was absolutely identical except for a previous cleanup in upstream. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- kvm-all.c |7 +- kvm.h |4 - qemu-kvm-x86.c| 178 ++-- qemu-kvm.c| 44 - qemu-kvm.h| 37 --- target-i386/kvm.c |2 +- 6 files changed, 11 insertions(+), 261 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 9c921cc..f3cfa2c 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -919,7 +919,9 @@ static void on_vcpu(CPUState *env, void (*func)(void *data), void *data) func(data); #endif } -#endif /* KVM_UPSTREAM */ +#else /* !KVM_UPSTREAM */ +static void on_vcpu(CPUState *env, void (*func)(void *data), void *data); +#endif /* !KVM_UPSTREAM */ struct kvm_sw_breakpoint *kvm_find_sw_breakpoint(CPUState *env, target_ulong pc) @@ -938,8 +940,6 @@ int kvm_sw_breakpoints_active(CPUState *env) return !QTAILQ_EMPTY(env-kvm_state-kvm_sw_breakpoints); } -#ifdef KVM_UPSTREAM - struct kvm_set_guest_debug_data { struct kvm_guest_debug dbg; CPUState *env; @@ -969,7 +969,6 @@ int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap) on_vcpu(env, kvm_invoke_set_guest_debug, data); return data.err; } -#endif int kvm_insert_breakpoint(CPUState *current_env, target_ulong addr, target_ulong len, int type) diff --git a/kvm.h b/kvm.h index 253b45d..740fd1a 100644 --- a/kvm.h +++ b/kvm.h @@ -61,14 +61,12 @@ void kvm_setup_guest_memory(void *start, size_t size); int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size); int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size); -#ifdef KVM_UPSTREAM int kvm_insert_breakpoint(CPUState *current_env, target_ulong addr, target_ulong len, int type); int kvm_remove_breakpoint(CPUState *current_env, target_ulong addr, target_ulong len, int type); void kvm_remove_all_breakpoints(CPUState *current_env); int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap); -#endif /* KVM_UPSTREAM */ int kvm_pit_in_kernel(void); int kvm_irqchip_in_kernel(void); @@ -101,7 +99,6 @@ int 
kvm_arch_init(KVMState *s, int smp_cpus); int kvm_arch_init_vcpu(CPUState *env); void kvm_arch_reset_vcpu(CPUState *env); -#ifdef KVM_UPSTREAM struct kvm_guest_debug; struct kvm_debug_exit_arch; @@ -133,7 +130,6 @@ int kvm_arch_remove_hw_breakpoint(target_ulong addr, void kvm_arch_remove_all_hw_breakpoints(void); void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg); -#endif int kvm_check_extension(KVMState *s, unsigned int extension); diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 074b510..834e9c1 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -835,6 +835,13 @@ void kvm_arch_load_regs(CPUState *env) kvm_set_regs(env, regs); +/* + * Kernels before 2.6.33 overwrote flags.TF injected via SET_GUEST_DEBUG + * while updating GP regs. Work around this by updating the debug state + * once again. + */ +kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, env-kvm_guest_debug); + memset(fpu, 0, sizeof fpu); fpu.fsw = env-fpus ~(7 11); fpu.fsw |= (env-fpstt 7) 11; @@ -1372,177 +1379,6 @@ void kvm_arch_cpu_reset(CPUState *env) } } -int kvm_arch_insert_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint *bp) -{ -uint8_t int3 = 0xcc; - -if (cpu_memory_rw_debug(env, bp-pc, (uint8_t *)bp-saved_insn, 1, 0) || -cpu_memory_rw_debug(env, bp-pc, int3, 1, 1)) -return -EINVAL; -return 0; -} - -int kvm_arch_remove_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint *bp) -{ -uint8_t int3; - -if (cpu_memory_rw_debug(env, bp-pc, int3, 1, 0) || int3 != 0xcc || -cpu_memory_rw_debug(env, bp-pc, (uint8_t *)bp-saved_insn, 1, 1)) -return -EINVAL; -return 0; -} - -#ifdef KVM_CAP_SET_GUEST_DEBUG -static struct { -target_ulong addr; -int len; -int type; -} hw_breakpoint[4]; - -static int nb_hw_breakpoint; - -static int find_hw_breakpoint(target_ulong addr, int len, int type) -{ -int n; - -for (n = 0; n nb_hw_breakpoint; n++) - if (hw_breakpoint[n].addr == addr hw_breakpoint[n].type == type - (hw_breakpoint[n].len == len || len == -1)) - return n; -return -1; -} - -int 
kvm_arch_insert_hw_breakpoint(target_ulong addr, - target_ulong len, int type) -{ -switch (type) { -case GDB_BREAKPOINT_HW: - len = 1; - break; -case GDB_WATCHPOINT_WRITE: -case GDB_WATCHPOINT_ACCESS: - switch (len) { - case 1: - break; - case 2: - case 4: - case 8: - if (addr (len - 1)) - return -EINVAL; - break; - default: - return -EINVAL; - } -
[Qemu-devel] [PATCH 14/21] qemu-kvm: Rework VCPU state writeback API
This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:

 - cpu_synchronize_all_states in qemu_savevm_state_complete
   (initial sync from kernel before vmsave)
 - cpu_synchronize_all_post_init in qemu_loadvm_state
   (writeback after vmload)
 - cpu_synchronize_all_post_init in main after machine init
 - cpu_synchronize_all_post_reset in qemu_system_reset
   (writeback after system reset)

These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:

 - KVM_PUT_ASYNC_STATE (during runtime, other VCPUs continue to run)
 - KVM_PUT_RESET_STATE (on synchronous system reset, all VCPUs stopped)
 - KVM_PUT_FULL_STATE  (on init or vmload, all VCPUs stopped as well)

This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.

cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.

Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs. It does
not touch qemu-kvm's special hooks for mpstate, vcpu_events, or tsc
loading. They will be cleaned up by individual patches.
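The mapping from writeback points to levels described above can be tabulated as follows. This is an illustrative sketch, not code from the patch: the toy_* names and the numeric level values are assumptions, and only the three writeback triggers named in the description are listed (cpu_synchronize_all_states reads from the kernel and therefore has no writeback level).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed numeric ordering; the description only fixes the three names. */
enum toy_put_level {
    TOY_PUT_ASYNC_STATE = 1, /* runtime, other VCPUs continue to run */
    TOY_PUT_RESET_STATE = 2, /* synchronous system reset, all VCPUs stopped */
    TOY_PUT_FULL_STATE  = 3  /* init or vmload, all VCPUs stopped */
};

struct toy_writeback_point {
    const char *trigger;
    enum toy_put_level level;
};

static const struct toy_writeback_point toy_points[] = {
    { "vcpu exec after cpu_synchronize_state", TOY_PUT_ASYNC_STATE },
    { "cpu_synchronize_all_post_reset",        TOY_PUT_RESET_STATE },
    { "cpu_synchronize_all_post_init",         TOY_PUT_FULL_STATE  },
};

/* Look up the writeback level for a trigger; -1 if it is unknown. */
static int toy_level_for(const char *trigger)
{
    size_t i;

    for (i = 0; i < sizeof(toy_points) / sizeof(toy_points[0]); i++) {
        if (strcmp(toy_points[i].trigger, trigger) == 0) {
            return (int)toy_points[i].level;
        }
    }
    return -1;
}

/* Per the description, only the async level runs alongside other VCPUs. */
static int toy_all_vcpus_stopped(enum toy_put_level level)
{
    return level >= TOY_PUT_RESET_STATE;
}
```

An arch-specific writer can then use a check like toy_all_vcpus_stopped to decide whether substates shared with other VCPUs or in-kernel devices are safe to write.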
Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- exec.c| 17 - hw/apic.c |3 --- hw/pc.c |1 - hw/ppc_newworld.c |3 --- hw/ppc_oldworld.c |3 --- hw/s390-virtio.c |1 - kvm-all.c | 19 +-- kvm.h | 22 +- qemu-kvm-ia64.c |2 +- qemu-kvm-x86.c|3 +-- qemu-kvm.c| 16 +--- qemu-kvm.h|2 +- savevm.c |4 sysemu.h |4 target-i386/kvm.c |2 +- target-i386/machine.c | 10 -- target-ia64/machine.c |2 -- target-ppc/kvm.c |2 +- target-ppc/machine.c |4 target-s390x/kvm.c|3 +-- vl.c | 29 + 21 files changed, 90 insertions(+), 62 deletions(-) diff --git a/exec.c b/exec.c index ade09cb..7b35e0f 100644 --- a/exec.c +++ b/exec.c @@ -529,21 +529,6 @@ void cpu_exec_init_all(unsigned long tb_size) #if defined(CPU_SAVE_VERSION) !defined(CONFIG_USER_ONLY) -static void cpu_common_pre_save(void *opaque) -{ -CPUState *env = opaque; - -cpu_synchronize_state(env); -} - -static int cpu_common_pre_load(void *opaque) -{ -CPUState *env = opaque; - -cpu_synchronize_state(env); -return 0; -} - static int cpu_common_post_load(void *opaque, int version_id) { CPUState *env = opaque; @@ -561,8 +546,6 @@ static const VMStateDescription vmstate_cpu_common = { .version_id = 1, .minimum_version_id = 1, .minimum_version_id_old = 1, -.pre_save = cpu_common_pre_save, -.pre_load = cpu_common_pre_load, .post_load = cpu_common_post_load, .fields = (VMStateField []) { VMSTATE_UINT32(halted, CPUState), diff --git a/hw/apic.c b/hw/apic.c index ae805dc..3e03e10 100644 --- a/hw/apic.c +++ b/hw/apic.c @@ -488,7 +488,6 @@ void apic_init_reset(CPUState *env) if (!s) return; -cpu_synchronize_state(env); s-tpr = 0; s-spurious_vec = 0xff; s-log_dest = 0; @@ -1070,8 +1069,6 @@ static void apic_reset(void *opaque) APICState *s = opaque; int bsp; -cpu_synchronize_state(s-cpu_env); - bsp = cpu_is_bsp(s-cpu_env); s-apicbase = 0xfee0 | (bsp ? 
MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE; diff --git a/hw/pc.c b/hw/pc.c index af6ea8b..6c15a9f 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -744,7 +744,6 @@ CPUState *pc_new_cpu(const char *cpu_model) fprintf(stderr, Unable to find x86 CPU definition\n); exit(1); } -env-kvm_vcpu_dirty = 1; if ((env-cpuid_features CPUID_APIC) || smp_cpus 1) { env-cpuid_apic_id = env-cpu_index; /* APIC reset callback resets cpu */ diff --git a/hw/ppc_newworld.c b/hw/ppc_newworld.c index a4c714a..9e288bd 100644 --- a/hw/ppc_newworld.c +++ b/hw/ppc_newworld.c @@ -139,9 +139,6 @@ static void ppc_core99_init (ram_addr_t ram_size, envs[i] = env; } -/* Make sure all register sets take effect */ -cpu_synchronize_state(env); - /* allocate RAM */ ram_offset = qemu_ram_alloc(ram_size); cpu_register_physical_memory(0, ram_size, ram_offset); diff --git a/hw/ppc_oldworld.c b/hw/ppc_oldworld.c index 7ccc6a1..1aa05ed 100644 --- a/hw/ppc_oldworld.c +++ b/hw/ppc_oldworld.c @@ -164,9
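The level-based writeback described in the commit message above can be pictured roughly as follows. This is a minimal sketch, not code from the patch: the three level names come from the commit message, but their numeric values, the SUBSTATE_* groups and the function substates_for_level() are assumptions invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Level names are taken from the commit message; the numeric values are
 * an assumption made for this sketch. */
enum {
    KVM_PUT_ASYNC_STATE = 1, /* during runtime, other VCPUs continue to run */
    KVM_PUT_RESET_STATE = 2, /* synchronous system reset, all VCPUs stopped */
    KVM_PUT_FULL_STATE  = 3  /* init or vmload, all VCPUs stopped as well */
};

/* Hypothetical substate groups, invented for illustration only. */
#define SUBSTATE_REGS     (1u << 0)
#define SUBSTATE_MP_STATE (1u << 1)
#define SUBSTATE_TSC      (1u << 2)

/* Return the substates a hypothetical arch hook would write at 'level'. */
uint32_t substates_for_level(int level)
{
    uint32_t set = SUBSTATE_REGS;      /* written at every level */

    assert(level >= KVM_PUT_ASYNC_STATE && level <= KVM_PUT_FULL_STATE);
    if (level >= KVM_PUT_RESET_STATE)
        set |= SUBSTATE_MP_STATE;      /* added for reset and full writeback */
    if (level >= KVM_PUT_FULL_STATE)
        set |= SUBSTATE_TSC;           /* added for full writeback only */
    return set;
}
```

The point of the ordering is that each higher level writes a superset of the lower one, so callers only pick a level and never list substates themselves.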
[Qemu-devel] Re: [RFC 0/2]: QMP DISK_ERROR event
Hi Luiz,

On 01.02.2010 19:07, Luiz Capitulino wrote:
> Hi there,
>
> I've been requested by libvirt guys to add a QMP event for disk I/O errors,
> this is what this series is about.
>
> It's an RFC because I need feedback on the following:
>
> 1. drive_get_on_error() is called on all disk errors, right?

Well, yes, it is for all devices that support rerror/werror. But it also might be called in other situations. Look at the "get" in the function name, it's really a getter function and not an event handler.

> 2. I've tested only ENOSPC errors, is there a way to test other errors?
>    Like read ones?

So you'll probably want some EIO. Some recent bugs I've been handling were about images on NFS when the NFS server went away. It's a reliable way to get EIO (mount with -osoft and small timeouts). I guess qemu-nbd and the nbd: protocol might work, too. Or maybe copy the start of a qcow2 image to a too small device.

> 3. Is this the right approach at all? :)

Yes and no. As I said above, drive_get_on_error() is not the right place to do it. Unfortunately it looks like there isn't a single generic place where it can be done, but the call to the event handler must be added to every device.

Kevin
[Qemu-devel] Re: [PATCH 2/2] QMP: Introduce DISK_ERROR event
On 01.02.2010 19:07, Luiz Capitulino wrote:
> It's emitted when a disk write or read fails, some device information is
> provided.
>
> We can also provide error details in the future.
>
> Example:
>
> { "event": "DISK_ERROR",
>     "data": { "device": "ide0-hd1",
>               "operation": "write",
>               "action": "stop" },
>     "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
>
> NOTE: Adding a small reference in QMP/qmp-events.txt, but this file is wrong
> and will be replaced by proper documentation shortly.
>
> Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
> ---
>  QMP/qmp-events.txt |    7 +++++++
>  monitor.c          |    3 +++
>  monitor.h          |    1 +
>  vl.c               |   34 +++++++++++++++++++++++++++++++++-
>  4 files changed, 44 insertions(+), 1 deletions(-)
>
> diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
> index dc48ccc..e968ef5 100644
> --- a/QMP/qmp-events.txt
> +++ b/QMP/qmp-events.txt
> @@ -43,3 +43,10 @@ Data: 'server' and 'client' keys with the same keys as 'query-vnc'.
>  Description: Issued when the VNC session is made active.
>  Data: 'server' and 'client' keys with the same keys as 'query-vnc'.
> +
> +7 DISK_ERROR
> +
> +
> +Description: Issued when a disk I/O error occurs
> +Data: 'device' (device name), 'action' (action to be taken),
> +      'operation' (read or write)
> diff --git a/monitor.c b/monitor.c
> index fb7c572..82edd79 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -378,6 +378,9 @@ void monitor_protocol_event(MonitorEvent event, QObject *data)
>          case QEVENT_VNC_DISCONNECTED:
>              event_name = "VNC_DISCONNECTED";
>              break;
> +        case QEVENT_DISK_ERROR:
> +            event_name = "DISK_ERROR";
> +            break;
>          default:
>              abort();
>              break;
> diff --git a/monitor.h b/monitor.h
> index b0f9270..beaddaf 100644
> --- a/monitor.h
> +++ b/monitor.h
> @@ -23,6 +23,7 @@ typedef enum MonitorEvent {
>      QEVENT_VNC_CONNECTED,
>      QEVENT_VNC_INITIALIZED,
>      QEVENT_VNC_DISCONNECTED,
> +    QEVENT_DISK_ERROR,
>      QEVENT_MAX,
>  } MonitorEvent;
>
> diff --git a/vl.c b/vl.c
> index 57c439d..1f69f56 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1856,10 +1856,42 @@ static BlockInterfaceErrorAction drive_get_err_action(
>      return is_read ? BLOCK_ERR_REPORT : BLOCK_ERR_STOP_ENOSPC;
>  }
>
> +static void driver_err_event(
> +    BlockInterfaceErrorAction action, int is_read, const char *device)
> +{
> +    QObject *data;
> +    const char *action_str;
> +
> +    switch (action) {
> +    case BLOCK_ERR_REPORT:
> +        action_str = "report";
> +        break;
> +    case BLOCK_ERR_IGNORE:
> +        action_str = "ignore";
> +        break;
> +    case BLOCK_ERR_STOP_ANY:
> +    case BLOCK_ERR_STOP_ENOSPC:
> +        action_str = "stop";

This is wrong. If it's BLOCK_ERR_STOP_ENOSPC, the action taken depends on the error code. It might as well be a report instead of stop if it was an EIO, for example.

But the problem is probably going to go away when you stop abusing a getter function and add some calls that are explicitly made for your requirements.

Kevin
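Kevin's objection can be made concrete with a small sketch in which the reported action depends on both the configured policy and the actual error code. The enum names are the ones used in the quoted patch; the function disk_error_action_str(), the enum ordering, and the convention that 'error' is a positive errno value are assumptions made for this illustration, not part of the series.

```c
#include <assert.h>
#include <errno.h>

/* Enumerator names as used in the quoted patch; redefined here so the
 * sketch is self-contained (ordering is an assumption). */
typedef enum {
    BLOCK_ERR_REPORT,
    BLOCK_ERR_IGNORE,
    BLOCK_ERR_STOP_ENOSPC,
    BLOCK_ERR_STOP_ANY
} BlockInterfaceErrorAction;

/* Hypothetical helper: map policy plus error code to the action string.
 * 'error' is assumed to be a positive errno value such as ENOSPC or EIO. */
const char *disk_error_action_str(BlockInterfaceErrorAction policy, int error)
{
    assert(error >= 0);
    switch (policy) {
    case BLOCK_ERR_IGNORE:
        return "ignore";
    case BLOCK_ERR_STOP_ANY:
        return "stop";
    case BLOCK_ERR_STOP_ENOSPC:
        /* Only ENOSPC stops the VM; anything else is reported. */
        return error == ENOSPC ? "stop" : "report";
    case BLOCK_ERR_REPORT:
    default:
        return "report";
    }
}
```

With this shape, an EIO under the stop-on-ENOSPC policy yields "report" rather than "stop", which is the case the review points out.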
[Qemu-devel] Re: KVM call agenda for Feb 2
Chris Wright wrote:
> Please send in any agenda items you are interested in covering.

[not sure, though, if I'll manage to join due to an overlapping meeting]

- state of in-kernel APIC/IOAPIC/PIT upstream merge
- road map to get rid of qemu-kvm's slot management
  (IMHO: qemu-kvm-0.13)
- any further ongoing/planned upstream merge efforts?

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH 00/13] i386 cpuid: cleanup and fixes
Hi,

first: I know that this conflicts with John Cooper's latest patch, but I want to send this out for review and to help merging the stuff.

This patchset cleans up the CPUID handling code in QEMU. The biggest change is obviously the move of the CPUID function to a separate file (cpuid.c). This helps to split up a rather large source file, whose name (helper.c) is also a bit misleading. Please tell me soon if you don't like it so that I can rebase the rest of patches.

Additionally the rest of the patches beautify or simplify some code. Feature additions are:

 5/13: add missing CPUID feature bit names
 6/13: list CPUID feature bit names when using -cpu ?
 9/13: -cpu host propagates more CPUID leafs, so that the cache topology
       will be visible in the guest
10/13: add CPUID feature bit trimming for TCG: Features not supported by
       the emulator will be masked out.
11/13: always show all CPU types: also expose the newer (64bit) CPU types
       for the i386 emulator. 64bit features will be masked out due to 10/13.
12/13: add kvm32 CPU model: Per popular request add a counterpart to kvm64
       describing a basic hardware virtualization capable CPU for
       migration purposes.

More details in the commit messages.

Note: In contrast to the last version I left out patches which change the CPUID bits of existing CPU models to avoid regressions with guests.

Please review and comment.

Regards,
Andre.
[Qemu-devel] [PATCH 02/13] cpuid: replace magic number with named constant
CPUID leaf Fn8000_0001.EDX contains a copy of many Fn_0001.EDX bits. Define a name for the mask to improve readability and avoid typos. Signed-off-by: Andre Przywara andre.przyw...@amd.com --- target-i386/cpuid.c | 11 ++- 1 files changed, 6 insertions(+), 5 deletions(-) diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c index aaa14ba..0a17020 100644 --- a/target-i386/cpuid.c +++ b/target-i386/cpuid.c @@ -130,6 +130,7 @@ typedef struct x86_def_t { CPUID_MSR | CPUID_MCE | CPUID_CX8 | CPUID_PGE | CPUID_CMOV | \ CPUID_PAT | CPUID_FXSR | CPUID_MMX | CPUID_SSE | CPUID_SSE2 | \ CPUID_PAE | CPUID_SEP | CPUID_APIC) +#define EXT2_FEATURE_MASK 0x0183F3FF static x86_def_t x86_defs[] = { #ifdef TARGET_X86_64 { @@ -147,7 +148,7 @@ static x86_def_t x86_defs[] = { /* this feature is needed for Solaris and isn't fully implemented */ CPUID_PSE36, .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_CX16 | CPUID_EXT_POPCNT, -.ext2_features = (PPRO_FEATURES 0x0183F3FF) | +.ext2_features = (PPRO_FEATURES EXT2_FEATURE_MASK) | CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX, .ext3_features = CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A, @@ -170,7 +171,7 @@ static x86_def_t x86_defs[] = { .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | CPUID_EXT_CX16 | CPUID_EXT_POPCNT, /* Missing: CPUID_EXT2_PDPE1GB, CPUID_EXT2_RDTSCP */ -.ext2_features = (PPRO_FEATURES 0x0183F3FF) | +.ext2_features = (PPRO_FEATURES EXT2_FEATURE_MASK) | CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX | CPUID_EXT2_3DNOW | CPUID_EXT2_3DNOWEXT | CPUID_EXT2_MMXEXT | CPUID_EXT2_FFXSR, @@ -220,7 +221,7 @@ static x86_def_t x86_defs[] = { /* Missing: CPUID_EXT_POPCNT, CPUID_EXT_MONITOR */ .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_CX16, /* Missing: CPUID_EXT2_PDPE1GB, CPUID_EXT2_RDTSCP */ -.ext2_features = (PPRO_FEATURES 0x0183F3FF) | +.ext2_features = (PPRO_FEATURES EXT2_FEATURE_MASK) | CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX, /* Missing: CPUID_EXT3_LAHF_LM, CPUID_EXT3_CMP_LEG, 
CPUID_EXT3_EXTAPIC, CPUID_EXT3_CR8LEG, CPUID_EXT3_ABM, CPUID_EXT3_SSE4A, @@ -308,7 +309,7 @@ static x86_def_t x86_defs[] = { .stepping = 3, .features = PPRO_FEATURES | CPUID_PSE36 | CPUID_VME | CPUID_MTRR | CPUID_MCA, -.ext2_features = (PPRO_FEATURES 0x0183F3FF) | CPUID_EXT2_MMXEXT | +.ext2_features = (PPRO_FEATURES EXT2_FEATURE_MASK) | CPUID_EXT2_MMXEXT | CPUID_EXT2_3DNOW | CPUID_EXT2_3DNOWEXT, .xlevel = 0x8008, /* XXX: put another string ? */ @@ -330,7 +331,7 @@ static x86_def_t x86_defs[] = { CPUID_EXT_SSE3 /* PNI */ | CPUID_EXT_SSSE3, /* Missing: CPUID_EXT_DSCPL | CPUID_EXT_EST | * CPUID_EXT_TM2 | CPUID_EXT_XTPR */ -.ext2_features = (PPRO_FEATURES 0x0183F3FF) | CPUID_EXT2_NX, +.ext2_features = (PPRO_FEATURES EXT2_FEATURE_MASK) | CPUID_EXT2_NX, /* Missing: .ext3_features = CPUID_EXT3_LAHF_LM */ .xlevel = 0x800A, .model_id = Intel(R) Atom(TM) CPU N270 @ 1.60GHz, -- 1.6.4
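To show what the named mask buys, here is a minimal sketch of deriving the mirrored Fn8000_0001.EDX bits from the leaf 1 EDX feature word. EXT2_FEATURE_MASK and its value are from the patch; the helper ext2_from_features() is hypothetical, and only three of the CPUID_* bit definitions are repeated here to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as in CPUID leaf 1 EDX. */
#define CPUID_SEP  (1u << 11)
#define CPUID_MMX  (1u << 23)
#define CPUID_SSE2 (1u << 26)

/* Mask name and value as introduced by the patch. */
#define EXT2_FEATURE_MASK 0x0183F3FFu

/* SEP (bit 11) lies outside the mask, so it is never copied over. */
static_assert((EXT2_FEATURE_MASK & CPUID_SEP) == 0, "SEP is not mirrored");

/* Hypothetical helper: keep only the bits that the extended leaf mirrors. */
uint32_t ext2_from_features(uint32_t features)
{
    return features & EXT2_FEATURE_MASK;
}
```

Passing CPUID_SEP | CPUID_MMX | CPUID_SSE2 through the helper keeps only CPUID_MMX, since bits 11 and 26 are not part of the mask.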
[Qemu-devel] [PATCH 05/13] cpuid: add missing CPUID feature flag names
Some CPUID feature flags had no string value, so they could not be switched on or off from the command line. Add names for the missing ones mentioned in the current public CPUID specification from both Intel and AMD. Those only mentioned in the Linux kernel source I put as comments. Signed-off-by: Andre Przywara andre.przyw...@amd.com --- target-i386/cpuid.c | 15 --- 1 files changed, 8 insertions(+), 7 deletions(-) diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c index 0238718..19d58e1 100644 --- a/target-i386/cpuid.c +++ b/target-i386/cpuid.c @@ -52,11 +52,11 @@ static const char *feature_name[] = { fxsr, sse, sse2, ss, ht /* Intel htt */, tm, ia64, pbe, }; static const char *ext_feature_name[] = { -pni /* Intel,AMD sse3 */, NULL, NULL, monitor, -ds_cpl, vmx, NULL /* Linux smx */, est, -tm2, ssse3, cid, NULL, NULL, cx16, xtpr, NULL, -NULL, NULL, dca, NULL, NULL, NULL, NULL, popcnt, -NULL, NULL, NULL, NULL, NULL, NULL, NULL, hypervisor, +pni /* Intel,AMD sse3 */, pclmuldq, dtes64, monitor, +ds_cpl, vmx, smx, est, +tm2, ssse3, cid, NULL, NULL /* FMA */, cx16, xtpr, pdcm, +NULL, NULL, dca, sse4_1, sse4_2, x2apic, movbe, popcnt, +NULL, aes, xsave, osxsave, NULL /* AVX */, NULL, NULL, hypervisor, }; static const char *ext2_feature_name[] = { fpu, vme, de, pse, tsc, msr, pae, mce, @@ -71,8 +71,9 @@ static const char *ext3_feature_name[] = { lahf_lm /* AMD LahfSahf */, cmp_legacy, svm, extapic /* AMD ExtApicSpace */, cr8legacy /* AMD AltMovCr8 */, abm, sse4a, misalignsse, -3dnowprefetch, osvw, NULL /* Linux ibs */, NULL, skinit, wdt, NULL, NULL, -NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, +3dnowprefetch, osvw, ibs, NULL /* SSE-5 */, +skinit, wdt, NULL, NULL, +NULL, NULL, NULL, nodeid_msr, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, }; -- 1.6.4
[Qemu-devel] [PATCH 08/13] cpuid: simplify CPUID flag search function
Avoid code duplication and handle the CPUID flag name search in a loop.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   38 +++++++++++++-------------------------
 1 files changed, 13 insertions(+), 25 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 3f56c50..635c2f4 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -90,34 +90,22 @@ static void add_flagname_to_bitmaps(const char *flagname, uint32_t *features,
                                     uint32_t *ext3_features,
                                     uint32_t *kvm_features)
 {
-    int i;
+    int i, j;
     int found = 0;
-
-    for ( i = 0 ; i < 32 ; i++ )
-        if (feature_name[i] && !strcmp (flagname, feature_name[i])) {
-            *features |= 1 << i;
-            found = 1;
-        }
-    for ( i = 0 ; i < 32 ; i++ )
-        if (ext_feature_name[i] && !strcmp (flagname, ext_feature_name[i])) {
-            *ext_features |= 1 << i;
-            found = 1;
-        }
-    for ( i = 0 ; i < 32 ; i++ )
-        if (ext2_feature_name[i] && !strcmp (flagname, ext2_feature_name[i])) {
-            *ext2_features |= 1 << i;
-            found = 1;
-        }
-    for ( i = 0 ; i < 32 ; i++ )
-        if (ext3_feature_name[i] && !strcmp (flagname, ext3_feature_name[i])) {
-            *ext3_features |= 1 << i;
-            found = 1;
-        }
-    for ( i = 0 ; i < 32 ; i++ )
-        if (kvm_feature_name[i] && !strcmp (flagname, kvm_feature_name[i])) {
-            *kvm_features |= 1 << i;
-            found = 1;
+    const char ** feature_names[5] = {feature_name, ext_feature_name,
+                                      ext2_feature_name, ext3_feature_name,
+                                      kvm_feature_name};
+    uint32_t* feature_flags[5] = {features, ext_features, ext2_features,
+                                  ext3_features, kvm_features};
+
+    for (j = 0; j < 5; j++) {
+        for ( i = 0 ; i < 32 ; i++ ) {
+            if (feature_names[j][i] && !strcmp(flagname, feature_names[j][i])) {
+                *feature_flags[j] |= 1 << i;
+                found = 1;
+            }
         }
+    }
     if (!found) {
         fprintf(stderr, "CPU feature %s not found\n", flagname);
--
1.6.4
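The table-driven search from this patch can be tried in isolation with a reduced example. This is a sketch under stated assumptions: the two shortened name tables, the NUM_WORDS and NUM_BITS sizes and the function add_flagname() are invented here, whereas the patch iterates over five tables of 32 entries each.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_WORDS 2   /* the patch uses 5 feature words */
#define NUM_BITS  4   /* the patch scans 32 bits per word */

/* Hypothetical, shortened name tables; NULL marks an unnamed bit. */
static const char *word0_names[NUM_BITS] = { "fpu", "vme", NULL, "pse" };
static const char *word1_names[NUM_BITS] = { "pni", NULL, NULL, "monitor" };

/* Set the bit named 'flagname' in the matching word; return 1 if found. */
int add_flagname(const char *flagname, uint32_t *word0, uint32_t *word1)
{
    const char **names[NUM_WORDS] = { word0_names, word1_names };
    uint32_t *words[NUM_WORDS] = { word0, word1 };
    int i, j, found = 0;

    assert(flagname && word0 && word1);
    for (j = 0; j < NUM_WORDS; j++) {
        for (i = 0; i < NUM_BITS; i++) {
            /* Skip unnamed bits, then compare against the table entry. */
            if (names[j][i] && !strcmp(flagname, names[j][i])) {
                *words[j] |= 1u << i;
                found = 1;
            }
        }
    }
    return found;
}
```

Keeping the name tables and the destination words in two parallel arrays means a new feature word only needs one more entry in each array instead of another copy of the loop.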
[Qemu-devel] [PATCH 11/13] cpuid: Always expose 32 and 64-bit CPUs
Since 64-bit capability is just another CPUID bit we now properly mask, there is no reason anymore to hide the 64-bit capable CPU models from a 32-bit only QEMU. All 64-bit CPUs can be used perfectly in 32-bit legacy mode anyway, so these models also make sense for 32-bit.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 6e6ee54..b03a363 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -153,7 +153,6 @@ typedef struct x86_def_t {
         CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A)

 static x86_def_t x86_defs[] = {
-#ifdef TARGET_X86_64
     {
         .name = "qemu64",
         .level = 4,
@@ -252,7 +251,6 @@ static x86_def_t x86_defs[] = {
         .xlevel = 0x80000008,
         .model_id = "Common KVM processor"
     },
-#endif
     {
         .name = "qemu32",
         .level = 4,
--
1.6.4
[Qemu-devel] [PATCH 10/13] cpuid: add TCG feature bit trimming
In KVM we trim the user provided CPUID bits to match the host CPU's one. Introduce a similar feature to QEMU/TCG. Create a mask of TCG's capabilities and apply it to the user bits. This allows to let the CPU models reflect their native archetypes. Signed-off-by: Andre Przywara andre.przyw...@amd.com --- target-i386/cpuid.c | 26 ++ 1 files changed, 26 insertions(+), 0 deletions(-) diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c index 6aa1f3f..6e6ee54 100644 --- a/target-i386/cpuid.c +++ b/target-i386/cpuid.c @@ -137,6 +137,21 @@ typedef struct x86_def_t { CPUID_PAT | CPUID_FXSR | CPUID_MMX | CPUID_SSE | CPUID_SSE2 | \ CPUID_PAE | CPUID_SEP | CPUID_APIC) #define EXT2_FEATURE_MASK 0x0183F3FF + +#define TCG_FEATURES (CPUID_FP87 | CPUID_PSE | CPUID_TSC | CPUID_MSR | \ + CPUID_PAE | CPUID_MCE | CPUID_CX8 | CPUID_APIC | CPUID_SEP | \ + CPUID_MTRR | CPUID_PGE | CPUID_MCA | CPUID_CMOV | CPUID_PAT | \ + CPUID_PSE36 | CPUID_CLFLUSH | CPUID_ACPI | CPUID_MMX | \ + CPUID_FXSR | CPUID_SSE | CPUID_SSE2 | CPUID_SS) +#define TCG_EXT_FEATURES (CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | \ + CPUID_EXT_CX16 | CPUID_EXT_POPCNT | CPUID_EXT_XSAVE | \ + CPUID_EXT_HYPERVISOR) +#define TCG_EXT2_FEATURES ((TCG_FEATURES EXT2_FEATURE_MASK) | \ + CPUID_EXT2_NX | CPUID_EXT2_MMXEXT | CPUID_EXT2_RDTSCP | \ + CPUID_EXT2_3DNOW | CPUID_EXT2_3DNOWEXT) +#define TCG_EXT3_FEATURES (CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM | \ + CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A) + static x86_def_t x86_defs[] = { #ifdef TARGET_X86_64 { @@ -616,6 +631,17 @@ int cpu_x86_register (CPUX86State *env, const char *cpu_model) env-cpuid_ext2_features = def-ext2_features; env-cpuid_xlevel = def-xlevel; env-cpuid_kvm_features = def-kvm_features; +env-cpuid_ext3_features = def-ext3_features; +if (!kvm_enabled()) { +env-cpuid_features = TCG_FEATURES; +env-cpuid_ext_features = TCG_EXT_FEATURES; +env-cpuid_ext2_features = (TCG_EXT2_FEATURES +#ifdef TARGET_X86_64 +| CPUID_EXT2_SYSCALL | CPUID_EXT2_LM +#endif +); 
+env-cpuid_ext3_features = TCG_EXT3_FEATURES; +} { const char *model_id = def-model_id; int c, len, i; -- 1.6.4
[Qemu-devel] [PATCH 03/13] cpuid: moved host_cpuid function and remove prototype
the host_cpuid function was located at the end of the file and had a prototype before it's first use. Move it up and remove the prototype. Signed-off-by: Andre Przywara andre.przyw...@amd.com --- target-i386/cpuid.c | 70 -- 1 files changed, 34 insertions(+), 36 deletions(-) diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c index 0a17020..cc080f4 100644 --- a/target-i386/cpuid.c +++ b/target-i386/cpuid.c @@ -338,8 +338,40 @@ static x86_def_t x86_defs[] = { }, }; -static void host_cpuid(uint32_t function, uint32_t count, uint32_t *eax, - uint32_t *ebx, uint32_t *ecx, uint32_t *edx); +static void host_cpuid(uint32_t function, uint32_t count, + uint32_t *eax, uint32_t *ebx, + uint32_t *ecx, uint32_t *edx) +{ +#if defined(CONFIG_KVM) +uint32_t vec[4]; + +#ifdef __x86_64__ +asm volatile(cpuid + : =a(vec[0]), =b(vec[1]), + =c(vec[2]), =d(vec[3]) + : 0(function), c(count) : cc); +#else +asm volatile(pusha \n\t + cpuid \n\t + mov %%eax, 0(%2) \n\t + mov %%ebx, 4(%2) \n\t + mov %%ecx, 8(%2) \n\t + mov %%edx, 12(%2) \n\t + popa + : : a(function), c(count), S(vec) + : memory, cc); +#endif + +if (eax) + *eax = vec[0]; +if (ebx) + *ebx = vec[1]; +if (ecx) + *ecx = vec[2]; +if (edx) + *edx = vec[3]; +#endif +} static int cpu_x86_fill_model_id(char *str) { @@ -578,40 +610,6 @@ int cpu_x86_register (CPUX86State *env, const char *cpu_model) return 0; } -static void host_cpuid(uint32_t function, uint32_t count, - uint32_t *eax, uint32_t *ebx, - uint32_t *ecx, uint32_t *edx) -{ -#if defined(CONFIG_KVM) -uint32_t vec[4]; - -#ifdef __x86_64__ -asm volatile(cpuid - : =a(vec[0]), =b(vec[1]), - =c(vec[2]), =d(vec[3]) - : 0(function), c(count) : cc); -#else -asm volatile(pusha \n\t - cpuid \n\t - mov %%eax, 0(%2) \n\t - mov %%ebx, 4(%2) \n\t - mov %%ecx, 8(%2) \n\t - mov %%edx, 12(%2) \n\t - popa - : : a(function), c(count), S(vec) - : memory, cc); -#endif - -if (eax) - *eax = vec[0]; -if (ebx) - *ebx = vec[1]; -if (ecx) - *ecx = vec[2]; -if (edx) - *edx = vec[3]; -#endif -} static 
void get_cpuid_vendor(CPUX86State *env, uint32_t *ebx, uint32_t *ecx, uint32_t *edx) -- 1.6.4
[Qemu-devel] [PATCH 09/13] cpuid: propagate further CPUID leafs when -cpu host
-cpu host currently only propagates the CPU's family/model/stepping, the brand name and the feature bits. Add a whitelist of safe CPUID leafs to let the guest see the actual CPU's cache details and other things. Signed-off-by: Andre Przywara andre.przyw...@amd.com --- target-i386/cpu.h |5 - target-i386/cpuid.c | 28 ++-- 2 files changed, 26 insertions(+), 7 deletions(-) diff --git a/target-i386/cpu.h b/target-i386/cpu.h index f826d3d..982f815 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -581,6 +581,9 @@ typedef struct { #define NB_MMU_MODES 2 +#define CPUID_FLAGS_VENDOR_OVERRIDE 1 +#define CPUID_FLAGS_HOST 2 + typedef struct CPUX86State { /* standard registers */ target_ulong regs[CPU_NB_REGS]; @@ -685,7 +688,7 @@ typedef struct CPUX86State { uint32_t cpuid_ext2_features; uint32_t cpuid_ext3_features; uint32_t cpuid_apic_id; -int cpuid_vendor_override; +uint32_t cpuid_flags; /* MTRRs */ uint64_t mtrr_fixed[11]; diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c index 635c2f4..6aa1f3f 100644 --- a/target-i386/cpuid.c +++ b/target-i386/cpuid.c @@ -122,7 +122,7 @@ typedef struct x86_def_t { uint32_t features, ext_features, ext2_features, ext3_features, kvm_features; uint32_t xlevel; char model_id[48]; -int vendor_override; +uint32_t flags; } x86_def_t; #define I486_FEATURES (CPUID_FP87 | CPUID_VME | CPUID_PSE) @@ -419,7 +419,7 @@ static int cpu_x86_fill_host(x86_def_t *x86_cpu_def) x86_cpu_def-ext2_features = edx; x86_cpu_def-ext3_features = ecx; cpu_x86_fill_model_id(x86_cpu_def-model_id); -x86_cpu_def-vendor_override = 0; +x86_cpu_def-flags = CPUID_FLAGS_HOST; return 0; } @@ -529,7 +529,7 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) x86_cpu_def-vendor2 |= ((uint8_t)val[i + 4]) (8 * i); x86_cpu_def-vendor3 |= ((uint8_t)val[i + 8]) (8 * i); } -x86_cpu_def-vendor_override = 1; +x86_cpu_def-flags |= CPUID_FLAGS_VENDOR_OVERRIDE; } else if (!strcmp(featurestr, model_id)) { pstrcpy(x86_cpu_def-model_id, 
sizeof(x86_cpu_def-model_id), val); @@ -602,7 +602,7 @@ int cpu_x86_register (CPUX86State *env, const char *cpu_model) env-cpuid_vendor2 = CPUID_VENDOR_INTEL_2; env-cpuid_vendor3 = CPUID_VENDOR_INTEL_3; } -env-cpuid_vendor_override = def-vendor_override; +env-cpuid_flags = def-flags; env-cpuid_level = def-level; if (def-family 0x0f) env-cpuid_version = 0xf00 | ((def-family - 0x0f) 20); @@ -647,22 +647,38 @@ static void get_cpuid_vendor(CPUX86State *env, uint32_t *ebx, * this if you want to use KVM's sysenter/syscall emulation * in compatibility mode and when doing cross vendor migration */ -if (kvm_enabled() env-cpuid_vendor_override) { +if (kvm_enabled() +(env-cpuid_flags CPUID_FLAGS_VENDOR_OVERRIDE) == 0) { host_cpuid(0, 0, NULL, ebx, ecx, edx); } } +#define CPUID_LEAF_PROPAGATE ((1 0x02) | (1 0x04) | (1 0x05) |\ + (1 0x0D)) +#define CPUID_LEAF_PROPAGATE_EXTENDED ((1 0x05) | (1 0x06) |\ + (1 0x08) | (1 0x19) | (1 0x1A)) + void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx) { -/* test if maximum index reached */ if (index 0x8000) { +/* test if maximum index reached */ if (index env-cpuid_xlevel) index = env-cpuid_level; +if ((env-cpuid_flags CPUID_FLAGS_HOST) +((1 (index - 0x8000)) CPUID_LEAF_PROPAGATE_EXTENDED)) { +host_cpuid(index, count, eax, ebx, ecx, edx); +return; +} } else { if (index env-cpuid_level) index = env-cpuid_level; +if ((env-cpuid_flags CPUID_FLAGS_HOST) +((1 index) CPUID_LEAF_PROPAGATE)) { +host_cpuid(index, count, eax, ebx, ecx, edx); +return; +} } switch(index) { -- 1.6.4
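The leaf whitelist in this patch is a pair of 32-bit bitmaps, one for standard leafs and one for extended leafs offset by 0x80000000. A minimal sketch of the lookup follows; the two macros carry the leaf numbers from the patch, while leaf_is_propagated() and its explicit range check (which avoids shifting by 32 or more) are additions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Leaf numbers as listed in the patch. */
#define CPUID_LEAF_PROPAGATE \
    ((1u << 0x02) | (1u << 0x04) | (1u << 0x05) | (1u << 0x0D))
#define CPUID_LEAF_PROPAGATE_EXTENDED \
    ((1u << 0x05) | (1u << 0x06) | (1u << 0x08) | (1u << 0x19) | (1u << 0x1A))

/* Every whitelisted offset must fit into a 32-bit bitmap. */
static_assert(0x1A < 32, "leaf offsets must fit in a 32-bit bitmap");

/* Hypothetical helper: 1 if the leaf would be passed through to the host. */
int leaf_is_propagated(uint32_t index)
{
    if (index & 0x80000000u) {
        uint32_t off = index - 0x80000000u;
        /* Range check added in this sketch to keep the shift defined. */
        return off < 32 && ((1u << off) & CPUID_LEAF_PROPAGATE_EXTENDED) != 0;
    }
    return index < 32 && ((1u << index) & CPUID_LEAF_PROPAGATE) != 0;
}
```

A bitmap keeps the check to a single AND per lookup, at the cost of only covering the first 32 leafs of each range.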
[Qemu-devel] [PATCH 04/13] cpuid: Replace strtok with get_opt_name
To avoid the non-reentrant capable strtok() use the QEMU defined get_opt_name() to parse the -cpu parameter list. Since there is a name clash between linux-user/mmap.c:qemu_malloc() and qemu-malloc.c:qemu_malloc() I copied the small function from qemu-option.c into cpuid.c. Not the best solution, bit IMO the least intrusive and smallest one. Signed-off-by: Andre Przywara andre.przyw...@amd.com --- target-i386/cpuid.c | 34 -- 1 files changed, 24 insertions(+), 10 deletions(-) diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c index cc080f4..0238718 100644 --- a/target-i386/cpuid.c +++ b/target-i386/cpuid.c @@ -24,6 +24,23 @@ #include cpu.h #include kvm.h +static const char *get_opt_name(char *buf, int buf_size, +const char *p, char delim) +{ +char *q; + +q = buf; +while (*p != '\0' *p != delim) { +if (q (q - buf) buf_size - 1) +*q++ = *p; +p++; +} +if (q) +*q = '\0'; + +return p; +} + /* feature flags taken from Intel Processor Identification and the CPUID * Instruction and AMD's CPUID Specification. In cases of disagreement * about feature names, the Linux name is used. 
*/ @@ -423,8 +440,8 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) unsigned int i; x86_def_t *def; -char *s = strdup(cpu_model); -char *featurestr, *name = strtok(s, ,); +const char* s; +char featurestr[64]; uint32_t plus_features = 0, plus_ext_features = 0, plus_ext2_features = 0, plus_ext3_features = 0, plus_kvm_features = 0; uint32_t minus_features = 0, minus_ext_features = 0, @@ -432,14 +449,15 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) minus_kvm_features = 0; uint32_t numvalue; +s = get_opt_name(featurestr, 64, cpu_model, ','); def = NULL; for (i = 0; i ARRAY_SIZE(x86_defs); i++) { -if (strcmp(name, x86_defs[i].name) == 0) { +if (strcmp(featurestr, x86_defs[i].name) == 0) { def = x86_defs[i]; break; } } -if (kvm_enabled() strcmp(name, host) == 0) { +if (kvm_enabled() strcmp(featurestr, host) == 0) { cpu_x86_fill_host(x86_cpu_def); } else if (!def) { goto error; @@ -453,10 +471,9 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) plus_ext_features, plus_ext2_features, plus_ext3_features, plus_kvm_features); -featurestr = strtok(NULL, ,); - -while (featurestr) { +while (*s != 0) { char *val; +s = get_opt_name(featurestr, 64, s + 1, ','); if (featurestr[0] == '+') { add_flagname_to_bitmaps(featurestr + 1, plus_features, plus_ext_features, plus_ext2_features, @@ -536,7 +553,6 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) (+feature|-feature|feature=xyz)\n, featurestr); goto error; } -featurestr = strtok(NULL, ,); } x86_cpu_def-features |= plus_features; x86_cpu_def-ext_features |= plus_ext_features; @@ -548,11 +564,9 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) x86_cpu_def-ext2_features = ~minus_ext2_features; x86_cpu_def-ext3_features = ~minus_ext3_features; x86_cpu_def-kvm_features = ~minus_kvm_features; -free(s); return 0; error: -free(s); return -1; } -- 1.6.4
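The strtok()-free parsing loop can be exercised outside QEMU. In the sketch below, get_opt_name() follows the copy added by the patch; count_plus_features() is a hypothetical helper that walks a -cpu style string the same way cpu_x86_find_by_name() does after this change, and the 64-byte token buffer mirrors the size used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Copy characters up to 'delim' (or end of string) into buf, truncating
 * to buf_size - 1 characters; return a pointer to the delimiter or '\0'. */
const char *get_opt_name(char *buf, int buf_size, const char *p, char delim)
{
    char *q = buf;

    assert(p != NULL);
    while (*p != '\0' && *p != delim) {
        if (q && (q - buf) < buf_size - 1)
            *q++ = *p;
        p++;
    }
    if (q)
        *q = '\0';
    return p;
}

/* Hypothetical helper: count the "+feature" tokens in a string such as
 * "qemu64,+sse3,-nx". The first token is the model name and is skipped. */
int count_plus_features(const char *cpu_model)
{
    char tok[64];
    const char *s = get_opt_name(tok, sizeof(tok), cpu_model, ',');
    int n = 0;

    while (*s != '\0') {
        /* s points at the delimiter, so the next token starts at s + 1. */
        s = get_opt_name(tok, sizeof(tok), s + 1, ',');
        if (tok[0] == '+')
            n++;
    }
    return n;
}
```

Because the token is copied into a caller-provided buffer, the input string is never modified and no strdup()/free() pair is needed, which is why the patch can drop both.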
[Qemu-devel] [PATCH 07/13] cpuid: remove unnecessary kvm_trim function
Correct me if I am wrong, but kvm_trim looks like a really bloated implementation of a bitwise AND. So remove this function and replace it with the real stuff(TM).

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/kvm.c |   27 ++++++---------------------
 1 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 5b093ce..daa65c1 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -125,19 +125,6 @@ uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, int reg)

 #endif

-static void kvm_trim_features(uint32_t *features, uint32_t supported)
-{
-    int i;
-    uint32_t mask;
-
-    for (i = 0; i < 32; ++i) {
-        mask = 1U << i;
-        if ((*features & mask) && !(supported & mask)) {
-            *features &= ~mask;
-        }
-    }
-}
-
 #ifdef CONFIG_KVM_PARA
 struct kvm_para_features {
         int cap;
@@ -186,18 +173,16 @@ int kvm_arch_init_vcpu(CPUState *env)

     env->mp_state = KVM_MP_STATE_RUNNABLE;

-    kvm_trim_features(&env->cpuid_features,
-        kvm_arch_get_supported_cpuid(env, 1, R_EDX));
+    env->cpuid_features &= kvm_arch_get_supported_cpuid(env, 1, R_EDX);

     i = env->cpuid_ext_features & CPUID_EXT_HYPERVISOR;
-    kvm_trim_features(&env->cpuid_ext_features,
-        kvm_arch_get_supported_cpuid(env, 1, R_ECX));
+    env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(env, 1, R_ECX);
     env->cpuid_ext_features |= i;

-    kvm_trim_features(&env->cpuid_ext2_features,
-        kvm_arch_get_supported_cpuid(env, 0x80000001, R_EDX));
-    kvm_trim_features(&env->cpuid_ext3_features,
-        kvm_arch_get_supported_cpuid(env, 0x80000001, R_ECX));
+    env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(env, 0x80000001,
+                                                             R_EDX);
+    env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(env, 0x80000001,
+                                                             R_ECX);

     cpuid_i = 0;

--
1.6.4
[Qemu-devel] [PATCH 06/13] cpuid: list all known x86 CPUID feature flags
-cpu ? currently gives us a list of known CPU models. Add host if using KVM and a list of known CPUID feature flags to the output.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   22 +++++++++++++++++++++-
 1 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 19d58e1..3f56c50 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -573,10 +573,30 @@ error:

 void x86_cpu_list (FILE *f, int (*cpu_fprintf)(FILE *f, const char *fmt, ...))
 {
-    unsigned int i;
+    unsigned int i, j;
+    const char **stringlist[] = {feature_name, ext_feature_name,
+                                 ext2_feature_name, ext3_feature_name};

     for (i = 0; i < ARRAY_SIZE(x86_defs); i++)
         (*cpu_fprintf)(f, "x86 %16s\n", x86_defs[i].name);
+    if (kvm_enabled()) {
+        (*cpu_fprintf)(f, "x86 %16s\n", "host");
+    }
+
+    (*cpu_fprintf)(f, "x86 recognized feature flags:\n");
+    for (j = 0; j < 4; j++) {
+        for (i = 0; i < 32; i++) {
+            if (j == 2 && ((1 << i) & EXT2_FEATURE_MASK))
+                continue;
+            if (stringlist[j][i] == NULL)
+                continue;
+            (*cpu_fprintf)(f, "%s ", stringlist[j][i]);
+            if (i == 15)
+                (*cpu_fprintf)(f, "\n");
+        }
+        (*cpu_fprintf)(f, "\n");
+    }
+    return;
 }

 int cpu_x86_register (CPUX86State *env, const char *cpu_model)
--
1.6.4
[Qemu-devel] [PATCH 12/13] cpuid: Add kvm32 CPU model
Create a kvm32 CPU model that describes a least common denominator for KVM capable guest CPUs. Useful for migration purposes.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index b03a363..65dcb23 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -263,6 +263,20 @@ static x86_def_t x86_defs[] = {
         .model_id = "QEMU Virtual CPU version " QEMU_VERSION,
     },
     {
+        .name = "kvm32",
+        .level = 5,
+        .family = 15,
+        .model = 6,
+        .stepping = 1,
+        .features = PPRO_FEATURES |
+            CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA | CPUID_PSE36,
+        .ext_features = CPUID_EXT_SSE3,
+        .ext2_features = PPRO_FEATURES & EXT2_FEATURE_MASK,
+        .ext3_features = 0,
+        .xlevel = 0x80000008,
+        .model_id = "Common 32-bit KVM processor"
+    },
+    {
         .name = "coreduo",
         .level = 10,
         .family = 6,
--
1.6.4
[Qemu-devel] [PATCH 13/13] cpuid: fix CPUID levels
Bump up the xlevel number for qemu32 to allow parsing of the processor name string for this model. Similarly the 486 processor should have at least the feature bit leaf enabled.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 65dcb23..725efe3 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -259,7 +259,7 @@ static x86_def_t x86_defs[] = {
         .stepping = 3,
         .features = PPRO_FEATURES,
         .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_POPCNT,
-        .xlevel = 0,
+        .xlevel = 0x80000004,
         .model_id = "QEMU Virtual CPU version " QEMU_VERSION,
     },
     {
@@ -297,7 +297,7 @@ static x86_def_t x86_defs[] = {
     },
     {
         .name = "486",
-        .level = 0,
+        .level = 1,
         .family = 4,
         .model = 0,
         .stepping = 0,
--
1.6.4
[Qemu-devel] Re: [PATCH] Add cpu model configuration support.. (resend)
john cooper wrote:
> [target-x86_64.conf was unintentionally omitted from the earlier patch]
>
> This is a reimplementation of prior versions which adds
> the ability to define cpu models for contemporary processors.
> The added models are likewise selected via -cpu name,
> and are intended to displace the existing convention
> of -cpu qemu64 augmented with a series of feature flags.
> ...

John,

first I would like to apologize for sending out my patch series although I know that it heavily conflicts with yours. Actually you beat me just by hours with yours, I had mine ready on Friday evening and just delayed the sending until Monday ;-)

Can you split up the patch into a series of smaller ones (maybe git add -i can help you here?). This version is a bit large for proper review and mixes fixes and feature additions. Additionally, this would help to merge both of our versions.

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448 3567 12
to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Andrew Bowd; Thomas M. McCoy; Giuliano Meroni
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
Re: [Qemu-devel] [PATCH 2/5] socket: Add a reconnect option.
Anthony Liguori wrote:
> I'm all for doing things incrementally but there has to be a big picture
> that the incremental bit fits into otherwise you end up with a bunch of
> random features that don't work together well.

Well, if you just add stuff without ever changing anything that went before, of course.

> Honestly, I'd strongly suggest splitting the reconnect logic out of the
> series when resubmitting.

IMO the RNG stuff is worthless without the reconnect logic. You can't have a machine in a production environment that just stops getting entropy forever when you (say) restart the EGD, perhaps during a package update. Or when someone unplugs the entropy source temporarily or something like that.

> I think it's just too hacky with too weak of a justification. If you
> really want this functionality, we can discuss the right approach for
> doing it but it's gotta be done in a way that's not introducing a
> one-off case just for the random number generator.

I don't think it's a case of 'really want' as much as 'it's completely essential' :-)

I still think that unless there are any other use cases, there's not much to discuss. The code is already generic to some degree: it notifies users, and it's got a configurable delay. What else do we need?

I implemented it generically rather than stuff it into the virtio-rng driver *because* I didn't think a dedicated version of it was the right way to go, but without some other use cases, I can't see what good there is in bikeshedding over this.

-Ian
[Qemu-devel] Re: [PATCH 00/21] qemu-kvm: Hook cleanups and extended use of upstream code
On 02.02.2010, at 09:18, Jan Kiszka wrote:

> Let's start with the overall stats:
>
>  31 files changed, 274 insertions(+), 822 deletions(-)
>
> So this series drops far more than 500 lines of redundant code, moving
> qemu-kvm yet a bit closer to upstream.
>
> The other highlight is the simplification of synchronization between
> in-kernel and user space VCPU states. This area used to cause a lot of
> problems in the past because it was tricky to get things right,
> specifically during the multi-threaded startup. The new approach pushes
> all the sync work around reset and vmsave/load into generic code, not
> only removing the burden from developers of, say, in-kernel APIC
> support, but also dropping most of our kvm-specific hooks, especially in
> the qemu-kvm tree.
>
> While I tested this on various VMs around, and things look good so far,
> I wouldn't be surprised if there are some regressions remaining,
> specifically in the non-x86 parts that I wasn't able to test or even
> build. Please have a careful look!

The good news on that part is that apart from IA64, all other archs are broken in qemu-kvm anyways, but work on upstream qemu. So moving towards upstream definitely helps here.

Alex
[Qemu-devel] Re: [PATCH 03/21] qemu-kvm: Clean up register access API
Gleb Natapov wrote: On Tue, Feb 02, 2010 at 09:18:49AM +0100, Jan Kiszka wrote: qemu-kvm's functios for accessing the VCPU registers are kvm_arch_load/save_regs. Use them directly instead of going through various wrappers. Specifically, we do not need on_vcpu wrapping as all users either already run in the related thread or call while the vm is stopped. Can we put check for that into those functions just to be sure. Something like: assert(!vm_stopped env-thread_id != pthread_id()) Good idea. Will add this to a potential v2 or send an add-on patch. We just need something else than vm_stopped (for reset, only the vcpu threads are stopped, not the vm), probably env-stopped in qemu-kvm. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- qemu-kvm.c| 37 +++-- qemu-kvm.h| 11 --- target-ia64/machine.c |4 ++-- 3 files changed, 5 insertions(+), 47 deletions(-) diff --git a/qemu-kvm.c b/qemu-kvm.c index a305907..97c098c 100644 --- a/qemu-kvm.c +++ b/qemu-kvm.c @@ -862,7 +862,7 @@ int pre_kvm_run(kvm_context_t kvm, CPUState *env) kvm_arch_pre_run(env, env-kvm_run); if (env-kvm_cpu_state.regs_modified) { -kvm_arch_put_registers(env); +kvm_arch_load_regs(env); env-kvm_cpu_state.regs_modified = 0; } @@ -1532,16 +1532,11 @@ static void on_vcpu(CPUState *env, void (*func)(void *data), void *data) qemu_cond_wait(qemu_work_cond); } -void kvm_arch_get_registers(CPUState *env) -{ -kvm_arch_save_regs(env); -} - static void do_kvm_cpu_synchronize_state(void *_env) { CPUState *env = _env; if (!env-kvm_cpu_state.regs_modified) { -kvm_arch_get_registers(env); +kvm_arch_save_regs(env); env-kvm_cpu_state.regs_modified = 1; } } @@ -1584,32 +1579,6 @@ void kvm_update_interrupt_request(CPUState *env) } } -static void kvm_do_load_registers(void *_env) -{ -CPUState *env = _env; - -kvm_arch_load_regs(env); -} - -void kvm_load_registers(CPUState *env) -{ -if (kvm_enabled() qemu_system_ready) -on_vcpu(env, kvm_do_load_registers, env); -} - -static void kvm_do_save_registers(void *_env) -{ -CPUState 
*env = _env; - -kvm_arch_save_regs(env); -} - -void kvm_save_registers(CPUState *env) -{ -if (kvm_enabled()) -on_vcpu(env, kvm_do_save_registers, env); -} - static void kvm_do_load_mpstate(void *_env) { CPUState *env = _env; @@ -2379,7 +2348,7 @@ static void kvm_invoke_set_guest_debug(void *data) struct kvm_set_guest_debug_data *dbg_data = data; if (cpu_single_env-kvm_cpu_state.regs_modified) { -kvm_arch_put_registers(cpu_single_env); +kvm_arch_save_regs(cpu_single_env); cpu_single_env-kvm_cpu_state.regs_modified = 0; } dbg_data-err = diff --git a/qemu-kvm.h b/qemu-kvm.h index 6b3e5a1..1354227 100644 --- a/qemu-kvm.h +++ b/qemu-kvm.h @@ -902,8 +902,6 @@ int kvm_main_loop(void); int kvm_init_ap(void); #ifndef QEMU_KVM_NO_CPU int kvm_vcpu_inited(CPUState *env); -void kvm_load_registers(CPUState *env); -void kvm_save_registers(CPUState *env); void kvm_load_mpstate(CPUState *env); void kvm_save_mpstate(CPUState *env); int kvm_cpu_exec(CPUState *env); @@ -1068,8 +1066,6 @@ void kvm_load_tsc(CPUState *env); #ifdef TARGET_I386 #define qemu_kvm_has_pit_state2() (0) #endif -#define kvm_load_registers(env) do {} while(0) -#define kvm_save_registers(env) do {} while(0) #define kvm_save_mpstate(env) do {} while(0) #define qemu_kvm_cpu_stop(env) do {} while(0) static inline void kvm_init_vcpu(CPUState *env) @@ -1098,13 +1094,6 @@ static inline int kvm_sync_vcpus(void) } #ifndef QEMU_KVM_NO_CPU -void kvm_arch_get_registers(CPUState *env); - -static inline void kvm_arch_put_registers(CPUState *env) -{ -kvm_load_registers(env); -} - void kvm_cpu_synchronize_state(CPUState *env); static inline void cpu_synchronize_state(CPUState *env) diff --git a/target-ia64/machine.c b/target-ia64/machine.c index 70ef379..7d29575 100644 --- a/target-ia64/machine.c +++ b/target-ia64/machine.c @@ -9,7 +9,7 @@ void cpu_save(QEMUFile *f, void *opaque) CPUState *env = opaque; if (kvm_enabled()) { -kvm_save_registers(env); +kvm_arch_save_regs(env); kvm_arch_save_mpstate(env); } } @@ -19,7 +19,7 @@ int 
cpu_load(QEMUFile *f, void *opaque, int version_id) CPUState *env = opaque; if (kvm_enabled()) { -kvm_load_registers(env); +kvm_arch_load_regs(env); kvm_arch_load_mpstate(env); } return 0; -- 1.6.0.2 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb. Jan --
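The sanity check discussed in this reply could look roughly like the sketch below. It is only an illustration under stated assumptions: `struct vcpu`, its fields, and both function names are hypothetical and reduced stand-ins, not the real CPUState or qemu-kvm API, and the "stopped" flag plays the role Jan suggests for env->stopped.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical, reduced view of a VCPU; not the real CPUState. */
struct vcpu {
    pthread_t thread;  /* thread that runs this VCPU */
    int stopped;       /* non-zero while the VCPU thread is parked */
};

/* Direct register access is safe from the owning thread or while stopped. */
static int vcpu_regs_access_ok(const struct vcpu *cpu)
{
    return cpu->stopped || pthread_equal(cpu->thread, pthread_self());
}

/* Hypothetical wrapper showing where the assertion would sit. */
static void vcpu_load_regs_checked(struct vcpu *cpu)
{
    assert(vcpu_regs_access_ok(cpu));
    /* ... the real code would call kvm_arch_load_regs() here ... */
}
```

Keeping the predicate separate from the assertion makes it possible to log or test the condition without aborting.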
[Qemu-devel] Re: [PATCH 00/21] qemu-kvm: Hook cleanups and extended use of upstream code
Alexander Graf wrote:
> On 02.02.2010, at 09:18, Jan Kiszka wrote:
>
> > Let's start with the overall stats:
> >
> >  31 files changed, 274 insertions(+), 822 deletions(-)
> >
> > So this series drops far more than 500 lines of redundant code, moving
> > qemu-kvm yet a bit closer to upstream.
> >
> > The other highlight is the simplification of synchronization between
> > in-kernel and user space VCPU states. This area used to cause a lot of
> > problems in the past because it was tricky to get things right,
> > specifically during the multi-threaded startup. The new approach pushes
> > all the sync work around reset and vmsave/load into generic code, not
> > only removing the burden from developers of, say, in-kernel APIC
> > support, but also dropping most of our kvm-specific hooks, especially in
> > the qemu-kvm tree.
> >
> > While I tested this on various VMs around, and things look good so far,
> > I wouldn't be surprised if there are some regressions remaining,
> > specifically in the non-x86 parts that I wasn't able to test or even
> > build. Please have a careful look!
>
> The good news on that part is that apart from IA64, all other archs are
> broken in qemu-kvm anyways, but work on upstream qemu. So moving towards
> upstream definitely helps here.

OK, then you probably want my corresponding uq/master series in order to test. Will try to roll them out ASAP.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 0/8]: QMP feature negotiation support
On Tue, 02 Feb 2010 09:03:32 +0100
Markus Armbruster arm...@redhat.com wrote:

> Luiz Capitulino lcapitul...@redhat.com writes:
>
> > On Mon, 01 Feb 2010 20:37:41 +0100
> > Markus Armbruster arm...@redhat.com wrote:
> >
> > > Luiz Capitulino lcapitul...@redhat.com writes:
> > >
> > > > On Mon, 01 Feb 2010 18:08:27 +0100
> > > > Markus Armbruster arm...@redhat.com wrote:
> > > [...]
> > > > > I don't doubt your design does the job. I just think it's overly
> > > > > general. I had something far more stupid in mind:
> > > > >
> > > > >     client connects
> > > > >     server -> client: version & capability offer (one message)
> > > > > again:
> > > > >     client -> server: capability selection (one message)
> > > > >     server -> client: either okay or error (one message)
> > > > >     if error goto again
> > > > >     connection is now ready for commands
> > > > >
> > > > > No modes. The distinct lack of generality is a design feature.
> > > >
> > > > I like the simplicity and if we were allowed to change later I'd
> > > > do it.
> > > >
> > > > The question is if we will ever want features to be _configured_
> > > > before the protocol is operational. In this case we'd need to pass
> > > > feature arguments through the capability selection command, which
> > > > will get ugly and hard to use/understand.
> > > >
> > > > Mode oriented support doesn't have this limitation. Maybe we'll
> > > > never really use it, but it's safer.
> > >
> > > Capability selection could be done as an object where the name/value
> > > pairs are capability/argument. If you need multiple arguments for a
> > > capability, make the capability's value an object.
> >
> > That's exactly what seems complicated to me, because besides
> > performing two functions (enable/configure) some feature setup could
> > require more commands to be done in a clear way.
>
> What do you mean by feature setup? And how does it go beyond setting a
> bunch of parameters?
>
> > The async messages setup in the previous series was an example
> > of this.
>
> I don't remember the details. Could you summarize?

 Not the best example since we agreed async messages setup could be done in operational mode, but in case other features will require it:

1. The async message feature _and_ each async message were disabled by default
2. You could enable async message feature with capability_enable
3. Then, each message should be enabled separately with async_message_enable

 The use case here is: a feature needs to be configured before the protocol is operational. It's possible to do this with a command like feature, but it'll get bloated over time.
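For comparison, the simpler "offer, select, okay or error, retry" handshake quoted above can be sketched as a tiny state machine. This is an illustration only: the capability bits, struct, and function names are hypothetical and not part of QMP, and rejecting a second selection after success is an added assumption.

```c
#include <stdint.h>

/* Hypothetical capability bits; QMP does not define these names. */
enum {
    CAP_ASYNC_MESSAGES = 1u << 0,
    CAP_EXAMPLE_OTHER  = 1u << 1
};

enum nego_result { NEGO_OK = 0, NEGO_ERROR = -1 };

struct nego_state {
    uint32_t offered;  /* sent once by the server in its greeting */
    uint32_t enabled;  /* set by a successful selection */
    int      ready;    /* non-zero once commands are accepted */
};

/* Server side: handle one "capability selection" message. */
static enum nego_result nego_select(struct nego_state *st, uint32_t wanted)
{
    /* Unknown capability, or selection after success (assumption): error. */
    if (st->ready || (wanted & ~st->offered)) {
        return NEGO_ERROR;  /* before success, the client may simply retry */
    }
    st->enabled = wanted;
    st->ready = 1;          /* connection is now ready for commands */
    return NEGO_OK;
}
```

Per-capability arguments, the point of contention in the thread, would have to be carried in the selection message itself; the bitmask here deliberately leaves them out.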
[Qemu-devel] Re: [RFC 0/2]: QMP DISK_ERROR event
On Tue, 02 Feb 2010 10:25:19 +0100
Kevin Wolf kw...@redhat.com wrote:

> Hi Luiz,
>
> On 01.02.2010 19:07, Luiz Capitulino wrote:
> > Hi there,
> >
> > I've been requested by libvirt guys to add a QMP event for disk I/O
> > errors, this is what this series is about.
> >
> > It's a RFC because I need feedback on the following:
> >
> > 1. drive_get_on_error() is called on all disk errors, right?
>
> Well, yes, it is for all devices that support rerror/werror. But it also
> might be called in other situations. Look at the "get" in the function
> name, it's really a getter function and not an event handler.
>
> > 2. I've tested only ENOSPC errors, is there a way to test other
> >    errors? Like read ones?
>
> So you'll probably want some EIO. Some recent bugs I've been handling
> were about images on NFS when the NFS server went away. It's a
> reliable way to get EIO (mount with -osoft and small timeouts).
>
> I guess qemu-nbd and the nbd: protocol might work, too. Or maybe copy
> the start of a qcow2 image to a too small device.

 Thanks!

> > 3. Is this the right approach at all? :)
>
> Yes and no. As I said above, drive_get_on_error() is not the right place
> to do it. Unfortunately it looks like there isn't a single generic place
> where it can be done, but the call to the event handler must be added to
> every device.

 Can't it be added to subsystems? Like ide, virtio etc? Maybe in the same function that calls drive_get_on_error()?
[Qemu-devel] Re: [RFC 0/2]: QMP DISK_ERROR event
On 02.02.2010 13:17, Luiz Capitulino wrote:
> > > 3. Is this the right approach at all? :)
> >
> > Yes and no. As I said above, drive_get_on_error() is not the right place
> > to do it. Unfortunately it looks like there isn't a single generic place
> > where it can be done, but the call to the event handler must be added to
> > every device.
>
>  Can't it be added to subsystems? Like ide, virtio etc? Maybe in the same
> function that calls drive_get_on_error()?

This is what I meant by devices, yes. Putting it into the same function sounds good, too.

Kevin
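The arrangement agreed on here (keep the getter pure, emit the event in the per-device function that calls it) might be sketched as follows. All names are hypothetical stand-ins rather than QEMU's actual API; the callback merely counts events where real code would emit the QMP event.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rerror/werror policies. */
enum err_action { ERR_ACTION_REPORT, ERR_ACTION_IGNORE, ERR_ACTION_STOP };

/* Hypothetical event sink type. */
typedef void (*disk_error_cb)(const char *device, int error, int is_read,
                              enum err_action action, void *opaque);

/* Pure getter, playing the role the thread gives drive_get_on_error(). */
static enum err_action get_on_error(enum err_action rerror,
                                    enum err_action werror, int is_read)
{
    return is_read ? rerror : werror;
}

/* Per-device error path: look up the policy, then notify exactly once. */
static enum err_action handle_rw_error(const char *device, int error,
                                       int is_read,
                                       enum err_action rerror,
                                       enum err_action werror,
                                       disk_error_cb notify, void *opaque)
{
    enum err_action action = get_on_error(rerror, werror, is_read);

    if (notify != NULL) {
        notify(device, error, is_read, action, opaque);
    }
    return action;
}

/* Example sink: counts events through the opaque pointer. */
static void count_event(const char *device, int error, int is_read,
                        enum err_action action, void *opaque)
{
    (void)device; (void)error; (void)is_read; (void)action;
    ++*(int *)opaque;
}
```

Because the getter stays free of side effects, callers that only want to inspect the policy do not trigger spurious events.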
[Qemu-devel] Re: [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization
On Tue, Feb 02, 2010 at 09:19:01AM +0100, Jan Kiszka wrote: Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86, properly synchronize with halted in the accessor functions. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- hw/apic.c |7 qemu-kvm-ia64.c |4 ++- qemu-kvm-x86.c| 88 +++- qemu-kvm.c| 30 - qemu-kvm.h| 15 target-i386/machine.c |6 --- target-ia64/machine.c |3 ++ 7 files changed, 55 insertions(+), 98 deletions(-) diff --git a/hw/apic.c b/hw/apic.c index 3e03e10..092c61e 100644 --- a/hw/apic.c +++ b/hw/apic.c @@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env) s-wait_for_sipi = 1; env-halted = !(s-apicbase MSR_IA32_APICBASE_BSP); -#ifdef KVM_CAP_MP_STATE -if (kvm_enabled() kvm_irqchip_in_kernel()) { -env-mp_state -= env-halted ? KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE; -kvm_load_mpstate(env); -} -#endif } static void apic_startup(APICState *s, int vector_num) diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c index fc8110e..39bcbeb 100644 --- a/qemu-kvm-ia64.c +++ b/qemu-kvm-ia64.c @@ -124,7 +124,9 @@ void kvm_arch_cpu_reset(CPUState *env) { if (kvm_irqchip_in_kernel(kvm_context)) { #ifdef KVM_CAP_MP_STATE - kvm_reset_mpstate(env-kvm_cpu_state.vcpu_ctx); +struct kvm_mp_state mp_state = {.mp_state = KVM_MP_STATE_UNINITIALIZED +}; +kvm_set_mpstate(env, mp_state); #endif } else { env-interrupt_request = ~CPU_INTERRUPT_HARD; diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 63cd095..6b5895f 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -754,6 +754,48 @@ static int get_msr_entry(struct kvm_msr_entry *entry, CPUState *env) return 0; } +static void kvm_arch_save_mpstate(CPUState *env) +{ +#ifdef KVM_CAP_MP_STATE +int r; +struct kvm_mp_state mp_state; + +r = kvm_get_mpstate(env, mp_state); +if (r 0) { +env-mp_state = -1; +} else { +env-mp_state = mp_state.mp_state; +if (kvm_irqchip_in_kernel()) { +env-halted = (env-mp_state == KVM_MP_STATE_HALTED); +} +} +#else +env-mp_state = -1; +#endif +} + +static void 
kvm_arch_load_mpstate(CPUState *env) +{ +#ifdef KVM_CAP_MP_STATE +struct kvm_mp_state mp_state; + +/* + * -1 indicates that the host did not support GET_MP_STATE ioctl, + * so don't touch it. + */ +if (env-mp_state != -1) { +if (kvm_irqchip_in_kernel()) { +env-mp_state = env-halted ? KVM_MP_STATE_UNINITIALIZED : + KVM_MP_STATE_RUNNABLE; When irqchip is in kernel env-halted doesn't contain any relevant information, so this is incorrect. Actually env-halted is updated only to show correct cpu state during info cpus. +/* Avoid deadlock: no user space IRQ will ever clear it. */ And this comment explains why looking at env-halt when irqchip is in kernel is wrong :) +env-halted = 0; +} +mp_state.mp_state = env-mp_state; +kvm_set_mpstate(env, mp_state); +} +#endif +} + static void set_v8086_seg(struct kvm_segment *lhs, const SegmentCache *rhs) { lhs-selector = rhs-selector; @@ -926,6 +968,10 @@ void kvm_arch_load_regs(CPUState *env, int level) rc = kvm_set_msrs(env, msrs, n); if (rc == -1) perror(kvm_set_msrs FAILED); + +if (level = KVM_PUT_RESET_STATE) { +kvm_arch_load_mpstate(env); +} } void kvm_load_tsc(CPUState *env) @@ -940,36 +986,6 @@ void kvm_load_tsc(CPUState *env) perror(kvm_set_tsc FAILED.\n); } -void kvm_arch_save_mpstate(CPUState *env) -{ -#ifdef KVM_CAP_MP_STATE -int r; -struct kvm_mp_state mp_state; - -r = kvm_get_mpstate(env, mp_state); -if (r 0) -env-mp_state = -1; -else -env-mp_state = mp_state.mp_state; -#else -env-mp_state = -1; -#endif -} - -void kvm_arch_load_mpstate(CPUState *env) -{ -#ifdef KVM_CAP_MP_STATE -struct kvm_mp_state mp_state = { .mp_state = env-mp_state }; - -/* - * -1 indicates that the host did not support GET_MP_STATE ioctl, - * so don't touch it. 
- */ -if (env-mp_state != -1) -kvm_set_mpstate(env, mp_state); -#endif -} - void kvm_arch_save_regs(CPUState *env) { struct kvm_regs regs; @@ -1366,15 +1382,9 @@ void kvm_arch_cpu_reset(CPUState *env) { kvm_arch_reset_vcpu(env); kvm_put_vcpu_events(env); -if (!cpu_is_bsp(env)) { - if (kvm_irqchip_in_kernel()) { -#ifdef KVM_CAP_MP_STATE - kvm_reset_mpstate(env); -#endif - } else { - env-interrupt_request = ~CPU_INTERRUPT_HARD; - env-halted = 1; - } +if
[Qemu-devel] Re: [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization
Gleb Natapov wrote:
> On Tue, Feb 02, 2010 at 09:19:01AM +0100, Jan Kiszka wrote:
> > Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
> > properly synchronize with halted in the accessor functions.
> >
> > Signed-off-by: Jan Kiszka jan.kis...@siemens.com
> > ---
> [...]
> > diff --git a/hw/apic.c b/hw/apic.c
> > index 3e03e10..092c61e 100644
> > --- a/hw/apic.c
> > +++ b/hw/apic.c
> > @@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env)
> >      s->wait_for_sipi = 1;
> >
> >      env->halted = !(s->apicbase & MSR_IA32_APICBASE_BSP);
> > -#ifdef KVM_CAP_MP_STATE
> > -    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
> > -        env->mp_state
> > -            = env->halted ? KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE;
> > -        kvm_load_mpstate(env);
> > -    }
> > -#endif
> >  }
> >
> >  static void apic_startup(APICState *s, int vector_num)
> [...]
> > +static void kvm_arch_load_mpstate(CPUState *env)
> > +{
> > +#ifdef KVM_CAP_MP_STATE
> > +    struct kvm_mp_state mp_state;
> > +
> > +    /*
> > +     * -1 indicates that the host did not support GET_MP_STATE ioctl,
> > +     *  so don't touch it.
> > +     */
> > +    if (env->mp_state != -1) {
> > +        if (kvm_irqchip_in_kernel()) {
> > +            env->mp_state = env->halted ? KVM_MP_STATE_UNINITIALIZED :
> > +                                          KVM_MP_STATE_RUNNABLE;
> When irqchip is in kernel env->halted doesn't contain any relevant
> information, so this is incorrect. Actually env->halted is updated only
> to show correct cpu state during "info cpus".

OK, copied from apic_init_reset, see above. So that hunk was probably at least useless, and now it's harmful. Will drop this and only sync from mp_state -> halted.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
[Qemu-devel] Re: [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization
On Tue, Feb 02, 2010 at 01:31:50PM +0100, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Tue, Feb 02, 2010 at 09:19:01AM +0100, Jan Kiszka wrote:
> > > Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
> > > properly synchronize with halted in the accessor functions.
> > >
> > > Signed-off-by: Jan Kiszka jan.kis...@siemens.com
> > > ---
> > [...]
> > > --- a/hw/apic.c
> > > +++ b/hw/apic.c
> > > @@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env)
> > >      s->wait_for_sipi = 1;
> > >
> > >      env->halted = !(s->apicbase & MSR_IA32_APICBASE_BSP);
> > > -#ifdef KVM_CAP_MP_STATE
> > > -    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
> > > -        env->mp_state
> > > -            = env->halted ? KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE;
> > > -        kvm_load_mpstate(env);
> > > -    }
> > > -#endif
> > >  }
> > [...]
> > > +    if (env->mp_state != -1) {
> > > +        if (kvm_irqchip_in_kernel()) {
> > > +            env->mp_state = env->halted ? KVM_MP_STATE_UNINITIALIZED :
> > > +                                          KVM_MP_STATE_RUNNABLE;
> > When irqchip is in kernel env->halted doesn't contain any relevant
> > information, so this is incorrect. Actually env->halted is updated only
> > to show correct cpu state during "info cpus".
>
> OK, copied from apic_init_reset, see above. So that hunk was probably at
> least useless, and now it's harmful. Will drop this and only sync from
> mp_state -> halted.

It was not useless in apic_init_reset; it was a shortcut for:

    env->mp_state = !(s->apicbase & MSR_IA32_APICBASE_BSP) ?
        KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE;

On reset the BSP VCPU should set env->mp_state to KVM_MP_STATE_RUNNABLE and all others to KVM_MP_STATE_UNINITIALIZED.

--
			Gleb.
[Qemu-devel] Re: [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization
Gleb Natapov wrote:
> On Tue, Feb 02, 2010 at 01:31:50PM +0100, Jan Kiszka wrote:
> > Gleb Natapov wrote:
> > > On Tue, Feb 02, 2010 at 09:19:01AM +0100, Jan Kiszka wrote:
> > > > Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
> > > > properly synchronize with halted in the accessor functions.
> > > [...]
> > > > +    if (env->mp_state != -1) {
> > > > +        if (kvm_irqchip_in_kernel()) {
> > > > +            env->mp_state = env->halted ? KVM_MP_STATE_UNINITIALIZED :
> > > > +                                          KVM_MP_STATE_RUNNABLE;
> > > When irqchip is in kernel env->halted doesn't contain any relevant
> > > information, so this is incorrect. Actually env->halted is updated only
> > > to show correct cpu state during "info cpus".
> >
> > OK, copied from apic_init_reset, see above. So that hunk was probably at
> > least useless, and now it's harmful. Will drop this and only sync from
> > mp_state -> halted.
>
> It was not useless in apic_init_reset; it was a shortcut for:
>
>     env->mp_state = !(s->apicbase & MSR_IA32_APICBASE_BSP) ?
>         KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE;
>
> On reset the BSP VCPU should set env->mp_state to KVM_MP_STATE_RUNNABLE
> and all others to KVM_MP_STATE_UNINITIALIZED.

OK, so it belongs in the kvm vcpu init code then, in a less cryptic form.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
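The outcome of this exchange can be summarized in a small sketch: on reset the BSP becomes runnable and all other VCPUs uninitialized, and with an in-kernel irqchip the halted flag is only ever derived from mp_state, never the other way round. The enum and struct below are local stand-ins that mirror the KVM_MP_STATE_* names from the thread; they are assumptions for illustration, not the kernel's definitions or qemu-kvm's code.

```c
/* Local stand-ins mirroring the KVM_MP_STATE_* names used in the thread. */
enum mp_state {
    MP_STATE_RUNNABLE,
    MP_STATE_UNINITIALIZED,
    MP_STATE_HALTED
};

/* Hypothetical, reduced VCPU state; not the real CPUState. */
struct vcpu {
    int is_bsp;             /* non-zero for the boot processor */
    int irqchip_in_kernel;  /* non-zero if the kernel models the APIC */
    enum mp_state mp_state;
    int halted;             /* informational when irqchip_in_kernel */
};

/* Reset: BSP starts runnable, every other VCPU starts uninitialized. */
static void vcpu_reset_mp_state(struct vcpu *cpu)
{
    cpu->mp_state = cpu->is_bsp ? MP_STATE_RUNNABLE : MP_STATE_UNINITIALIZED;
}

/* After reading state from the kernel: one-way sync mp_state -> halted. */
static void vcpu_sync_halted(struct vcpu *cpu)
{
    if (cpu->irqchip_in_kernel) {
        cpu->halted = (cpu->mp_state == MP_STATE_HALTED);
    }
}
```

Deriving mp_state from is_bsp directly, instead of from halted, avoids the indirection that Gleb points out in the original apic_init_reset hunk.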
Re: [Qemu-devel] system_reset command cause assert failed
On Tue, 2 Feb 2010 09:35:16 +0800
Roy Tam roy...@gmail.com wrote:

> 2010/2/2 Luiz Capitulino lcapitul...@redhat.com:
> > On Tue, 2 Feb 2010 00:26:53 +0800
> > Roy Tam roy...@gmail.com wrote:
> >
> > > 2010/2/2 Luiz Capitulino lcapitul...@redhat.com:
> > > > Hm, I'm puzzled. Is this failing on malloc()? At least qemu_malloc()
> > > > is the last QEMU function I see in the logs. From now on I only see
> > > > msvcrt functions...
> > > >
> > > > Maybe you can type 'run' in gdb, run system_reset in the Monitor
> > > > and then switch back to gdb and type 'bt'?
> > >
> > > source-less debugging seems better...
> >
> > As far as I can understand something bad happens while the parser
> > is processing the first ' character of the qobject_from_jsonf() call
> > in monitor.c:4524. Strange.
> >
> > Can you try 'info pci', 'info block' and 'info version'? Do they
> > work?
> >
> > Maybe this is a refcount problem? Anthony, could you take a look too
> > please?
>
> rebuild with -gstabs -O1, you can see double free here:

 Ok, so we have a double free and

> #0 qobject_to_qdict (obj=0x0) at qobject.h:108
> #1 0x004127ae in pci_device_print (mon=0x494c460, device=0x49696c0)
>    at /home/roy/qemu/hw/pci.c:1165

 a segfault.

 I don't know what's happening, I'll have to run QEMU on windows and try to reproduce it.
Re: [Qemu-devel] [ANNOUNCE] New qemu.org website
On 02/02/2010 12:05 AM, Mulyadi Santosa wrote:
> On Mon, Feb 1, 2010 at 9:26 PM, Anthony Liguori anth...@codemonkey.ws wrote:
> > Hi,
> >
> > The new qemu.org wiki is now live. I've transferred all of the content
> > from the old website and have now switched www.qemu.org to redirect to
> > wiki.qemu.org.
>
> Hi Anthony...
>
> Right now, February 2nd, 1:02 PM GMT+7 (Indonesian time), I get a timeout
> when accessing qemu.org. Is it down for maintenance?

Since going live, there's a memory leak on the system that's resulting in the OOM killer going off. I've got some tracking in place right now that should help us get to the bottom of it.

So apologies if the site has some downtime over the next couple of days as we figure this issue out.

Regards,

Anthony Liguori
[Qemu-devel] usb-host quirks
Hi,

I've got a buggy device that needs a special workaround to be usable under host-usb access. The device really doesn't like being reset via USBDEVFS_RESET: the device firmware (or whatever) locks up immediately and no longer responds properly. With the following patch it works fine, though.

So I was wondering what the accepted way is to get these quirks upstream into the qemu source tree. Is usb-linux.c the correct place, or should we put the quirk somewhere else?

---
 usb-linux.c |    4
 1 file changed, 4 insertions(+)

--- qemu.orig/usb-linux.c
+++ qemu/usb-linux.c
@@ -389,6 +389,10 @@ static void usb_host_handle_reset(USBDev
 
     dprintf("husb: reset device %u.%u\n", s->bus_num, s->addr);
 
+    if (((s->descr[8] << 8) | s->descr[9]) == 0x2471 &&
+        ((s->descr[10] << 8) | s->descr[11]) == 0x0853)
+        return;
+
     ioctl(s->fd, USBDEVFS_RESET);
 
     usb_host_claim_interfaces(s, s->configuration);

-- 
Greetings, Michael.
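One possible answer to the "where should quirks live" question is a small lookup table instead of an inline comparison. The sketch below is only an illustration: the struct, flag, and function names are hypothetical, and the single entry reuses the two values tested in the patch, read from the descriptor bytes in the same order the patch uses.

```c
#include <stddef.h>
#include <stdint.h>

#define USB_QUIRK_NO_RESET 0x1u  /* hypothetical flag: skip USBDEVFS_RESET */

/* Hypothetical quirk entry keyed on the two IDs the patch compares. */
struct usb_quirk {
    uint16_t vendor;
    uint16_t product;
    unsigned flags;
};

/* The single entry uses the values tested in the patch. */
static const struct usb_quirk usb_quirks[] = {
    { 0x2471, 0x0853, USB_QUIRK_NO_RESET },
};

/* Same byte order as the patch: descr[8..9] and descr[10..11], high first. */
static unsigned usb_quirk_flags(const uint8_t *descr)
{
    uint16_t vendor  = (uint16_t)((descr[8]  << 8) | descr[9]);
    uint16_t product = (uint16_t)((descr[10] << 8) | descr[11]);
    size_t i;

    for (i = 0; i < sizeof(usb_quirks) / sizeof(usb_quirks[0]); i++) {
        if (usb_quirks[i].vendor == vendor &&
            usb_quirks[i].product == product) {
            return usb_quirks[i].flags;
        }
    }
    return 0;  /* no quirk known for this device */
}
```

A reset handler would then return early when `usb_quirk_flags(s->descr) & USB_QUIRK_NO_RESET` is set, and further devices become one-line table additions.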
[Qemu-devel] Re: [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id
On Tue, Feb 02, 2010 at 09:19:06AM +0100, Jan Kiszka wrote:
> Setting the boot CPU ID is arch-specific KVM stuff. So push it where it
> belongs to.
>
pc_init1 is also arch-specific, no? TCG should also be able to have BSP apic_id != 0.

> Signed-off-by: Jan Kiszka jan.kis...@siemens.com
> ---
>  hw/pc.c        |    3 ---
>  qemu-kvm-x86.c |    3 ++-
>  2 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/hw/pc.c b/hw/pc.c
> index 6c15a9f..3df6195 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -803,9 +803,6 @@ static void pc_init1(ram_addr_t ram_size,
>  #endif
>      }
>
> -    if (kvm_enabled()) {
> -        kvm_set_boot_cpu_id(0);
> -    }
>      for (i = 0; i < smp_cpus; i++) {
>          env = pc_new_cpu(cpu_model);
>      }
> diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
> index 9de018e..0f34451 100644
> --- a/qemu-kvm-x86.c
> +++ b/qemu-kvm-x86.c
> @@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void)
>      if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK))
>          vmstate_register(0, &vmstate_kvmclock, &kvmclock_data);
>  #endif
> -    return 0;
> +
> +    return kvm_set_boot_cpu_id(0);
>  }
>
>  static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
> --
> 1.6.0.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
			Gleb.
[Qemu-devel] Re: [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id
Gleb Natapov wrote: On Tue, Feb 02, 2010 at 09:19:06AM +0100, Jan Kiszka wrote: Setting the boot CPU ID is arch-specific KVM stuff. So push it where it belongs to. pc_init1 is also arch-specific, no? TCG should also be able to have BSP apic_id != 0. But not kvm-specific. I don't understand your second remark. Can you help me how TCG is affected by kvm_set_boot_cpu_id? Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- hw/pc.c|3 --- qemu-kvm-x86.c |3 ++- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/hw/pc.c b/hw/pc.c index 6c15a9f..3df6195 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -803,9 +803,6 @@ static void pc_init1(ram_addr_t ram_size, #endif } -if (kvm_enabled()) { -kvm_set_boot_cpu_id(0); -} for (i = 0; i smp_cpus; i++) { env = pc_new_cpu(cpu_model); } diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 9de018e..0f34451 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void) if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK)) vmstate_register(0, vmstate_kvmclock, kvmclock_data); #endif -return 0; + +return kvm_set_boot_cpu_id(0); } static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index, -- 1.6.0.2 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] qcow2: Fix signedness bugs
Checking for return codes < 0 isn't really going to work with unsigned types. Use signed types instead.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-cluster.c |   12 ++++++------
 block/qcow2.h         |    6 ++----
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4e30d16..3501a94 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -219,7 +219,8 @@ static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index)
     BDRVQcowState *s = bs->opaque;
     int min_index;
     uint64_t old_l2_offset;
-    uint64_t *l2_table, l2_offset;
+    uint64_t *l2_table;
+    int64_t l2_offset;

     old_l2_offset = s->l1_table[l1_index];

@@ -560,7 +561,8 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
 {
     BDRVQcowState *s = bs->opaque;
     int l2_index, ret;
-    uint64_t l2_offset, *l2_table, cluster_offset;
+    uint64_t l2_offset, *l2_table;
+    int64_t cluster_offset;
     int nb_csectors;

     ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index);
@@ -704,10 +706,8 @@ err:
  *
  * Return 0 on success and -errno in error cases
  */
-uint64_t qcow2_alloc_cluster_offset(BlockDriverState *bs,
-                                    uint64_t offset,
-                                    int n_start, int n_end,
-                                    int *num, QCowL2Meta *m)
+int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
+    int n_start, int n_end, int *num, QCowL2Meta *m)
 {
     BDRVQcowState *s = bs->opaque;
     int l2_index, ret;
diff --git a/block/qcow2.h b/block/qcow2.h
index d9ea6ab..de9397a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -192,10 +192,8 @@ void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,

 uint64_t qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
     int *num);
-uint64_t qcow2_alloc_cluster_offset(BlockDriverState *bs,
-                              uint64_t offset,
-                              int n_start, int n_end,
-                              int *num, QCowL2Meta *m);
+int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
+    int n_start, int n_end, int *num, QCowL2Meta *m);
 uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
                                          uint64_t offset,
                                          int compressed_size);
--
1.6.5.2
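The class of bug this patch addresses can be shown in isolation. This is a minimal sketch, assuming a hypothetical helper alloc_offset() that reports failure as a negative errno value; none of the names below come from the qcow2 code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical allocator: returns an offset on success, -errno on failure. */
static int64_t alloc_offset(int fail)
{
    return fail ? -ENOSPC : 4096;
}

/* Buggy pattern: the result lands in an unsigned variable, so the "< 0"
 * test can never be true and the error is treated as a huge valid offset. */
static int failed_unsigned(int fail)
{
    uint64_t off = alloc_offset(fail);
    return off < 0;   /* always 0 */
}

/* Fixed pattern: a signed variable keeps the sign of -errno. */
static int failed_signed(int fail)
{
    int64_t off = alloc_offset(fail);
    return off < 0;
}
```

Some compilers can warn about the always-false comparison in failed_unsigned(), but typically only when extra warnings are enabled, which is presumably how this kind of check goes unnoticed.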
[Qemu-devel] Re: [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id
On Tue, Feb 02, 2010 at 03:20:02PM +0100, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Tue, Feb 02, 2010 at 09:19:06AM +0100, Jan Kiszka wrote:
> > > Setting the boot CPU ID is arch-specific KVM stuff. So push it where it
> > > belongs to.
> > pc_init1 is also arch-specific, no? TCG should also be able to have BSP
> > apic_id != 0.
>
> But not kvm-specific.
>
> I don't understand your second remark. Can you help me how TCG is
> affected by kvm_set_boot_cpu_id?

It is not affected right now. It assumes that the APIC ID of the BSP CPU is 0, but this limitation does not exist on real HW. So when QEMU is fixed and it becomes possible to configure which CPU is the BSP, this will be the place to do it.

> > > Signed-off-by: Jan Kiszka jan.kis...@siemens.com
> > > ---
> > >  hw/pc.c        |    3 ---
> > >  qemu-kvm-x86.c |    3 ++-
> > >  2 files changed, 2 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/hw/pc.c b/hw/pc.c
> > > index 6c15a9f..3df6195 100644
> > > --- a/hw/pc.c
> > > +++ b/hw/pc.c
> > > @@ -803,9 +803,6 @@ static void pc_init1(ram_addr_t ram_size,
> > >  #endif
> > >      }
> > >
> > > -    if (kvm_enabled()) {
> > > -        kvm_set_boot_cpu_id(0);
> > > -    }
> > >      for (i = 0; i < smp_cpus; i++) {
> > >          env = pc_new_cpu(cpu_model);
> > >      }
> > > diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
> > > index 9de018e..0f34451 100644
> > > --- a/qemu-kvm-x86.c
> > > +++ b/qemu-kvm-x86.c
> > > @@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void)
> > >      if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK))
> > >          vmstate_register(0, &vmstate_kvmclock, &kvmclock_data);
> > >  #endif
> > > -    return 0;
> > > +
> > > +    return kvm_set_boot_cpu_id(0);
> > >  }
> > >
> > >  static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
> > > --
> > > 1.6.0.2
> >
> > --
> > 			Gleb.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

--
			Gleb.
[Qemu-devel] Re: [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id
Gleb Natapov wrote:
> On Tue, Feb 02, 2010 at 03:20:02PM +0100, Jan Kiszka wrote:
> > Gleb Natapov wrote:
> > > On Tue, Feb 02, 2010 at 09:19:06AM +0100, Jan Kiszka wrote:
> > > > Setting the boot CPU ID is arch-specific KVM stuff. So push it where
> > > > it belongs to.
> > > pc_init1 is also arch-specific, no? TCG should also be able to have BSP
> > > apic_id != 0.
> >
> > But not kvm-specific.
> >
> > I don't understand your second remark. Can you help me how TCG is
> > affected by kvm_set_boot_cpu_id?
> It is not affected right now. It assumes that the APIC ID of the BSP CPU
> is 0, but this limitation does not exist on real HW. So when QEMU is fixed
> and it becomes possible to configure which CPU is the BSP, this will be
> the place to do it.

On that day pc_init1 (or whatever x86 part) will set the BSP number somewhere in env or apicstate, and we will transfer it to kvm afterwards. The point is that kvm_* belongs in kvm[-all].c as far as possible. And in this case it is possible.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 0/8]: QMP feature negotiation support
Luiz Capitulino lcapitul...@redhat.com writes: On Tue, 02 Feb 2010 09:03:32 +0100 Markus Armbruster arm...@redhat.com wrote: Luiz Capitulino lcapitul...@redhat.com writes: On Mon, 01 Feb 2010 20:37:41 +0100 Markus Armbruster arm...@redhat.com wrote: Luiz Capitulino lcapitul...@redhat.com writes: On Mon, 01 Feb 2010 18:08:27 +0100 Markus Armbruster arm...@redhat.com wrote: [...] I don't doubt your design does the job. I just think it's overly general. I had something far more stupid in mind: client connects server - client: version capability offer (one message) again: client - server: capability selection (one message) server - client: either okay or error (one message) if error goto again connection is now ready for commands No modes. The distinct lack of generality is a design feature. I like the simplicity and if we were allowed to change later I'd do it. The question is if we will ever want features to be _configured_ before the protocol is operational. In this case we'd need to pass feature arguments through the capability selection command, which will get ugly and hard to use/understand. Mode oriented support doesn't have this limitation. Maybe we won't never really use it, but it's safer. Capability selection could be done as an object where the name/value pairs are capability/argument. If you need multiple arguments for a capability, make the capability's value an object. That's exactly what seems complicated to me, because besides performing two functions (enable/configure) some feature setup could require more commands to be done in a clear way. What do you mean by feature setup? And how does it go beyond setting a bunch of parameters? The async messages setup in the previous series was an example of this. I don't remember the details. Could you summarize? Not the best example since we agreed async messages setup could be done in operational mode, but in case other features will require it: 1. 
The async message feature _and_ each async message were disabled by default 2. You could enable async message feature with capability_enable 3. Then, each message should be enabled separately with async_message_enable The use case here is: a feature requires to be configured before the protocol is operational. Okay, let's pretend for the sake of the argument that async message enable/disable is core protocol, and thus needs to be controlled via capabilities. An obvious way is to have one capability for every enable/disable switch. The server's capability offer lists them all, and the client's capability selection includes the one it wants. What if this leads to dozens of capabilities? It's a machine protocol, and a machine can cope with sixty capabilities just as fine as with six. Six hundred would be kind of ugly, though. If we absolutely insist on controlling async messages with a single capability, things get slightly more complex. The capability now sports an object value, with a member for each enable/disable switch. The client's capability client selection supplies such an object value. If we're worried about discoverability, we can make the server's capability offer include a description of each capability's value. And now let's quit pretending, and remind ourselves that capabilities are for variations of the core protocol. Do we really expect the core protocol to become so baroque that we'll need a full-blown configuration mode? It's possible to do this with a command like feature, but it'll get bloated over time. I doubt it.
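The simple handshake argued for in this thread (server offers capabilities, client selects, server answers okay or error, client retries on error) can be sketched as follows. This is only an illustration: the capability names and the functions selection_ok() and negotiate() are made up and are not part of QMP or of the patch series under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capability offer sent by the server on connect. */
static const char *const offered[] = { "async-events", "timestamps" };

/* Server side: accept a selection only if every name was offered. */
static int selection_ok(const char *const *sel, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int found = 0;
        for (size_t j = 0; j < sizeof(offered) / sizeof(offered[0]); j++) {
            if (strcmp(sel[i], offered[j]) == 0) {
                found = 1;
            }
        }
        if (!found) {
            return 0;   /* server answers "error"; client goes to "again" */
        }
    }
    return 1;           /* server answers "okay"; ready for commands */
}

/* One capability selection message from the client. */
struct selection {
    const char *const *names;
    size_t count;
};

/* Client side: try selections in order until one is accepted.
 * Returns the index of the accepted selection, or -1 if none was. */
static int negotiate(const struct selection *tries, size_t ntries)
{
    for (size_t i = 0; i < ntries; i++) {
        if (selection_ok(tries[i].names, tries[i].count)) {
            return (int)i;
        }
    }
    return -1;
}
```

Capabilities that carry arguments would replace the plain name list with name/value pairs, which is the object-valued variant described above.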
[Qemu-devel] The new qemu.org
The new site looks nice. When is the Mac OS X section under "Compilation from the sources" going to be updated from the lame "The Mac OS X patches are not fully merged in QEMU, so you should look at the QEMU mailing list archive to have all the necessary information."? This is unacceptable.
[Qemu-devel] KVM developer call minutes (Feb 2)
Minutes (please reply w/ corrections or follow-ups):

state of in-kernel APIC/IOAPIC/PIT upstream merge
- Glauber?...

road map to get rid of qemu-kvm's slot management (IMHO: qemu-kvm-0.13)
- no real feedback here
- any further ongoing/planned upstream merge efforts?
- SMP
- needs in-kernel irqchip support upstream
- possibly make it a stream in uq for integration and autotesting
- PCI Device Assignment cleanup, proper capabilities support
- do we move to uio support before pushing upstream?
- Michael's UIO patch for 2.3 devices is upstream, so moving to uio would add a feature
- what is missing w/out PCIe bus emulation
- AER, ARI, few other less critical things
- any 64-bit issues or other things that a driver may probe for that will break w/out PCIe
- most are probing PCI capabilities

upstream queue uq qemu-kvm
- flush mmio buffer periodically
- enabled unconditional save/restore
- started porting -mempath
- getting autotest to run on upstream to autotest before sending patches to upstream
- anthony wants to receive pull request w/ inline patches for review

qmp feature negotiation issue
- Luiz and Markus discussing alternatives, either seems fine
- Anthony will follow-up on list
[Qemu-devel] Question on qcow2 image with base image
Hi, when I use a qcow2 image based on a base image, what should happen when I invoke the commit command from the qemu monitor ? Is it expected/intended to flush the data into the base image ? IIUC, that is what happening in the released qemu (0.12). I would expect it not to touch the base image. Naphtali
[Qemu-devel] [PATCH 0/3] Event signaling tweaks
This series of three patches makes two small changes to qemu_event_read and qemu_event_increment. These are preparatory to merging eventfd usage in the iothread from qemu-kvm (which would have conflicts, so it has to be done with some care). Paolo Bonzini (3): do not loop on an incomplete io_thread_fd read loop qemu_event_increment if we have an EINTR fix placement of config-host.h inclusion osdep.c |7 --- vl.c| 12 2 files changed, 12 insertions(+), 7 deletions(-)
[Qemu-devel] [PATCH 1/3] do not loop on an incomplete io_thread_fd read
No need to loop if less than a full buffer is read; the next read would return EAGAIN.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 vl.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 6f1e1ab..46c1118 100644
--- a/vl.c
+++ b/vl.c
@@ -3210,12 +3210,12 @@ static void qemu_event_read(void *opaque)
 {
     int fd = (unsigned long)opaque;
     ssize_t len;
+    char buffer[512];

     /* Drain the notify pipe */
     do {
-        char buffer[512];
         len = read(fd, buffer, sizeof(buffer));
-    } while ((len == -1 && errno == EINTR) || len > 0);
+    } while ((len == -1 && errno == EINTR) || len == sizeof(buffer));
 }

 static int qemu_event_init(void)
--
1.6.6
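The effect of the changed loop condition can be tried outside QEMU. This is a sketch under stated assumptions: drain() follows the new condition from the patch but additionally counts read() calls, and pipe_with_bytes() is a made-up helper that only exists to set up a test pipe.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Drain a non-blocking fd and return the number of read() calls made.
 * Loop again only on EINTR or when the buffer was filled completely;
 * after a short read, another read() would just fail with EAGAIN. */
static int drain(int fd)
{
    char buffer[512];
    ssize_t len;
    int calls = 0;

    do {
        len = read(fd, buffer, sizeof(buffer));
        calls++;
    } while ((len == -1 && errno == EINTR) || len == (ssize_t)sizeof(buffer));

    return calls;
}

/* Hypothetical test helper: create a pipe with a non-blocking read end
 * that already holds n zero bytes (n <= 1024). Returns 0 on success. */
static int pipe_with_bytes(int fds[2], size_t n)
{
    char zeros[1024] = { 0 };

    if (n > sizeof(zeros) || pipe(fds) == -1) {
        return -1;
    }
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) == -1) {
        return -1;
    }
    return write(fds[1], zeros, n) == (ssize_t)n ? 0 : -1;
}
```

With the old condition (len > 0), every drain would end with one extra read() that returns EAGAIN; the new condition saves that call unless the pending data is an exact multiple of the buffer size.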
[Qemu-devel] [PATCH 2/3] loop write in qemu_event_increment upon EINTR
Same as what qemu-kvm does.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 vl.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 46c1118..f150eca 100644
--- a/vl.c
+++ b/vl.c
@@ -3198,8 +3198,12 @@ static void qemu_event_increment(void)
     if (io_thread_fd == -1)
         return;

-    ret = write(io_thread_fd, &byte, sizeof(byte));
-    if (ret < 0 && (errno != EINTR && errno != EAGAIN)) {
+    do {
+        ret = write(io_thread_fd, &byte, sizeof(byte));
+    } while (ret < 0 && errno == EINTR);
+
+    /* EAGAIN is fine, a read must be pending.  */
+    if (ret < 0 && errno != EAGAIN) {
         fprintf(stderr, "qemu_event_increment: write() filed: %s\n",
                 strerror(errno));
         exit (1);
--
1.6.6
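The retry logic can be exercised without real signals by injecting the write function. In this sketch notify() mirrors the loop from the patch; the write_fn typedef and flaky_write() are assumptions added purely so that the EINTR path can be triggered on demand.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Same signature as write(2), so a fake can be passed in for testing. */
typedef ssize_t (*write_fn)(int fd, const void *buf, size_t count);

/* Write one wake-up byte. Retries on EINTR and treats EAGAIN as success,
 * because a full pipe means a wake-up is already pending.
 * Returns 0 on success, -1 on any other error. */
static int notify(int fd, write_fn do_write)
{
    static const char byte = 0;
    ssize_t ret;

    do {
        ret = do_write(fd, &byte, sizeof(byte));
    } while (ret < 0 && errno == EINTR);

    if (ret < 0 && errno != EAGAIN) {
        return -1;
    }
    return 0;
}

/* Fake write(): fails with EINTR twice, then reports success. */
static int fake_calls;

static ssize_t flaky_write(int fd, const void *buf, size_t count)
{
    (void)fd;
    (void)buf;
    if (++fake_calls <= 2) {
        errno = EINTR;
        return -1;
    }
    return (ssize_t)count;
}
```

In real use the second argument would simply be write from unistd.h. Before the patch, an EINTR was silently ignored and the wake-up byte was lost.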
[Qemu-devel] [PATCH 3/3] fix placement of config-host.h inclusion
The #ifdef CONFIG_SOLARIS below was useless without this patch.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 osdep.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/osdep.c b/osdep.c
index cf3a2c6..9059f01 100644
--- a/osdep.c
+++ b/osdep.c
@@ -28,14 +28,15 @@
 #include <errno.h>
 #include <unistd.h>
 #include <fcntl.h>
+
+/* Needed early for CONFIG_BSD etc. */
+#include "config-host.h"
+
 #ifdef CONFIG_SOLARIS
 #include <sys/types.h>
 #include <sys/statvfs.h>
 #endif

-/* Needed early for CONFIG_BSD etc. */
-#include "config-host.h"
-
 #ifdef _WIN32
 #include <windows.h>
 #elif defined(CONFIG_BSD)
--
1.6.6
[Qemu-devel] Re: [PATCH] Add cpu model configuration support.. (resend)
Andre Przywara wrote: +[cpudef] + name = Conroe + level = 2 + vendor = GenuineIntel + family = 6 + model = 2 + stepping = 3 + feature_edx = sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc pse de fpumtrr clflush mca pse36 + feature_ecx = sse3 ssse3 + extfeature_edx = fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpulm syscall nx + extfeature_ecx = lahf_lm Wouldn't it be much more user friendly to merge them all into one string? Just from the feature names it is quite obscure to guess which flag belongs into which string (especially since we lack the EXTn_ prefix we had in helper.c). I haven't tried it, but the parsing code looks like this shouldn't be too hard. To avoid overlong lines one could think about a += operator. That's true. Although I expect setup of a cpu model to be a rather infrequent occurrence by the expert (+/-) user so the above didn't strike me as a significant issue. Also -cpu ?cpuid dumps out the entire motley crew of flags relative to each grouping for reference. That said the current config file syntax seems rather rigid and I think your suggestion makes sense. I avoided modifying the parser at this point just in the interest of minimizing the sprawl of this patch. I would just drop all definitions here except qemu{32,64} and kvm{32,64}. The other models should be described in the config file. That's the goal but I wanted to leave an interim firewall of sorts. If the target-x86_64.conf isn't installed for whatever reason, qemu still can fall back to the internal definitions. Even here it isn't strictly necessary to remove an internal def as it can be redefined in the config file which will override the internal version. In general -cpu ?model will indicate internal vs. 
externally defined models by enclosing internal model names in brackets: : x86 Opteron_G3 AMD Opteron 23xx (Gen 3 Class Opteron) : x86 [athlon] QEMU Virtual CPU version 0.12.50 : It also seems worth dropping a hint to the user in the case qemu fails to find a target config file rather than leaving them to puzzle out why an external model has gone missing. Thanks for the feedback. -john -- john.coo...@redhat.com
[Qemu-devel] [PATCH v0 0/5]: BLOCK_IO_ERROR QMP event
Hi,

This series adds the BLOCK_IO_ERROR event the libvirt guys have requested. I have made some improvements after Kevin's feedback and hope it's in better shape now.

The only small issue is that I couldn't get a read error: I followed Kevin's advice wrt NFS, but got only write errors. I've tested with ide and virtio.

Thanks.
[Qemu-devel] [PATCH 1/5] QMP: BLOCK_IO_ERROR event handling
This commit adds the basic definitions for the BLOCK_IO_ERROR event, but actual event emission will be introduced by the next commits. NOTE: Adding a small reference in QMP/qmp-events.txt, but this file is wrong and will be replaced by proper documentation shortly. Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- QMP/qmp-events.txt |7 +++ monitor.c |3 +++ monitor.h |1 + 3 files changed, 11 insertions(+), 0 deletions(-) diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt index dc48ccc..7886192 100644 --- a/QMP/qmp-events.txt +++ b/QMP/qmp-events.txt @@ -43,3 +43,10 @@ Data: 'server' and 'client' keys with the same keys as 'query-vnc'. Description: Issued when the VNC session is made active. Data: 'server' and 'client' keys with the same keys as 'query-vnc'. + +7 BLOCK_IO_ERROR + + +Description: Issued when a disk I/O error occurs +Data: 'device' (device name), 'action' (action to be taken), + 'operation' (read or write) diff --git a/monitor.c b/monitor.c index fb7c572..6e688ac 100644 --- a/monitor.c +++ b/monitor.c @@ -378,6 +378,9 @@ void monitor_protocol_event(MonitorEvent event, QObject *data) case QEVENT_VNC_DISCONNECTED: event_name = VNC_DISCONNECTED; break; +case QEVENT_BLOCK_IO_ERROR: +event_name = BLOCK_IO_ERROR; +break; default: abort(); break; diff --git a/monitor.h b/monitor.h index b0f9270..e35f1e4 100644 --- a/monitor.h +++ b/monitor.h @@ -23,6 +23,7 @@ typedef enum MonitorEvent { QEVENT_VNC_CONNECTED, QEVENT_VNC_INITIALIZED, QEVENT_VNC_DISCONNECTED, +QEVENT_BLOCK_IO_ERROR, QEVENT_MAX, } MonitorEvent; -- 1.6.6
[Qemu-devel] [PATCH 2/5] block: BLOCK_IO_ERROR QMP event
This commit introduces the bdrv_mon_event() function, which should be called by block subsystems (e.g. IDE) when an I/O error occurs, so that a QMP event is emitted.

The following information is currently provided in the event:

- device name
- operation (i.e. read or write)
- action taken (e.g. stop)

Event example:

{ "event": "BLOCK_IO_ERROR",
  "data": { "device": "ide0-hd1",
            "operation": "write",
            "action": "stop" },
  "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 block.c |   29 +++++++++++++++++++++++++++++
 block.h |    6 ++++++
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 1919d19..2913124 100644
--- a/block.c
+++ b/block.c
@@ -1164,6 +1164,35 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
     return bs->drv->bdrv_is_allocated(bs, sector_num, nb_sectors, pnum);
 }

+void bdrv_mon_event(const BlockDriverState *bdrv,
+                    BlockMonEventAction action, int is_read)
+{
+    QObject *data;
+    const char *action_str;
+
+    switch (action) {
+    case BDRV_ACTION_REPORT:
+        action_str = "report";
+        break;
+    case BDRV_ACTION_IGNORE:
+        action_str = "ignore";
+        break;
+    case BDRV_ACTION_STOP:
+        action_str = "stop";
+        break;
+    default:
+        abort();
+    }
+
+    data = qobject_from_jsonf("{ 'device': %s, 'action': %s, 'operation': %s }",
+                              bdrv->device_name,
+                              action_str,
+                              is_read ? "read" : "write");
+    monitor_protocol_event(QEVENT_BLOCK_IO_ERROR, data);
+
+    qobject_decref(data);
+}
+
 static void bdrv_print_dict(QObject *obj, void *opaque)
 {
     QDict *bs_dict;
diff --git a/block.h b/block.h
index ecf66c5..a834300 100644
--- a/block.h
+++ b/block.h
@@ -44,6 +44,12 @@ typedef struct QEMUSnapshotInfo {
 #define BDRV_SECTOR_SIZE   (1 << BDRV_SECTOR_BITS)
 #define BDRV_SECTOR_MASK   ~(BDRV_SECTOR_SIZE - 1);

+typedef enum {
+    BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
+} BlockMonEventAction;
+
+void bdrv_mon_event(const BlockDriverState *bdrv,
+                    BlockMonEventAction action, int is_read);
 void bdrv_info_print(Monitor *mon, const QObject *data);
 void bdrv_info(Monitor *mon, QObject **ret_data);
 void bdrv_stats_print(Monitor *mon, const QObject *data);
--
1.6.6
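For readers without a QEMU tree at hand, the mapping from the action enum to the strings in the event's "data" member can be reproduced standalone. This is only a sketch: it reuses the enum from the patch, but action_name() and format_event_data() are illustrative names, and plain snprintf() stands in for qobject_from_jsonf(), so the device name is not JSON-escaped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same enumerators as in the patch; everything else is illustrative. */
typedef enum {
    BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
} BlockMonEventAction;

static const char *action_name(BlockMonEventAction action)
{
    switch (action) {
    case BDRV_ACTION_REPORT: return "report";
    case BDRV_ACTION_IGNORE: return "ignore";
    case BDRV_ACTION_STOP:   return "stop";
    }
    return "unknown";
}

/* Format the "data" member of a BLOCK_IO_ERROR event into buf.
 * Returns what snprintf() returns. No escaping is done on device. */
static int format_event_data(char *buf, size_t len, const char *device,
                             BlockMonEventAction action, int is_read)
{
    return snprintf(buf, len,
                    "{ \"device\": \"%s\", \"operation\": \"%s\", "
                    "\"action\": \"%s\" }",
                    device, is_read ? "read" : "write",
                    action_name(action));
}
```

Calling format_event_data(buf, sizeof(buf), "ide0-hd1", BDRV_ACTION_STOP, 0) yields the same device, operation and action values as the event example in the commit message.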
[Qemu-devel] [PATCH 3/5] ide: Generate BLOCK_IO_ERROR QMP event
Just call bdrv_mon_event() in the right place. Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- hw/ide/core.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/hw/ide/core.c b/hw/ide/core.c index b6643e8..603e537 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -480,14 +480,17 @@ static int ide_handle_rw_error(IDEState *s, int error, int op) int is_read = (op BM_STATUS_RETRY_READ); BlockInterfaceErrorAction action = drive_get_on_error(s-bs, is_read); -if (action == BLOCK_ERR_IGNORE) +if (action == BLOCK_ERR_IGNORE) { +bdrv_mon_event(s-bs, BDRV_ACTION_IGNORE, is_read); return 0; +} if ((error == ENOSPC action == BLOCK_ERR_STOP_ENOSPC) || action == BLOCK_ERR_STOP_ANY) { s-bus-bmdma-unit = s-unit; s-bus-bmdma-status |= op; vm_stop(0); +bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read); } else { if (op BM_STATUS_DMA_RETRY) { dma_buf_commit(s, 0); @@ -495,6 +498,7 @@ static int ide_handle_rw_error(IDEState *s, int error, int op) } else { ide_rw_error(s); } +bdrv_mon_event(s-bs, BDRV_ACTION_REPORT, is_read); } return 1; -- 1.6.6
[Qemu-devel] [PATCH 4/5] scsi: Generate BLOCK_IO_ERROR QMP event
Just call bdrv_mon_event() in the right place. Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- hw/scsi-disk.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index b34fbaa..1285122 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -182,16 +182,20 @@ static int scsi_handle_write_error(SCSIDiskReq *r, int error) BlockInterfaceErrorAction action = drive_get_on_error(s-qdev.dinfo-bdrv, 0); -if (action == BLOCK_ERR_IGNORE) +if (action == BLOCK_ERR_IGNORE) { +bdrv_mon_event(s-qdev.dinfo-bdrv, BDRV_ACTION_IGNORE, 0); return 0; +} if ((error == ENOSPC action == BLOCK_ERR_STOP_ENOSPC) || action == BLOCK_ERR_STOP_ANY) { r-status |= SCSI_REQ_STATUS_RETRY; vm_stop(0); +bdrv_mon_event(s-qdev.dinfo-bdrv, BDRV_ACTION_STOP, 0); } else { scsi_command_complete(r, CHECK_CONDITION, HARDWARE_ERROR); +bdrv_mon_event(s-qdev.dinfo-bdrv, BDRV_ACTION_REPORT, 0); } return 1; -- 1.6.6
[Qemu-devel] [PATCH 5/5] virtio-blk: Generate BLOCK_IO_ERROR QMP event
Just call bdrv_mon_event() in the right place. Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- hw/virtio-blk.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 037a79c..75adbec 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -105,16 +105,20 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error, drive_get_on_error(req-dev-bs, is_read); VirtIOBlock *s = req-dev; -if (action == BLOCK_ERR_IGNORE) +if (action == BLOCK_ERR_IGNORE) { +bdrv_mon_event(req-dev-bs, BDRV_ACTION_IGNORE, is_read); return 0; +} if ((error == ENOSPC action == BLOCK_ERR_STOP_ENOSPC) || action == BLOCK_ERR_STOP_ANY) { req-next = s-rq; s-rq = req; vm_stop(0); +bdrv_mon_event(req-dev-bs, BDRV_ACTION_STOP, is_read); } else { virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR); +bdrv_mon_event(req-dev-bs, BDRV_ACTION_REPORT, is_read); } return 1; -- 1.6.6
Re: [Qemu-devel] usb-host quirks
On 02/02/2010 06:42 AM, Michael Buesch wrote: Hi, I've got a buggy device that needs a special workaround to be usable under host-usb access. The device really doesn't like being reset via USBDEVFS_RESET. It immediatenly locks up the device firmware or whatever. It won't respond properly anymore. With the following patch it works fine, though. What about the USBDEVFS_RESET in usb_host_open? Does that have an impact? For some USB keys I have had to add an additional reset prior to claiming interfaces: diff --git a/usb-linux.c b/usb-linux.c index 1aaa595..092e75c 100644 --- a/usb-linux.c +++ b/usb-linux.c @@ -906,6 +906,9 @@ static int usb_host_open(USBHostDevice *dev, int bus_num, #endif +/* some keys require a reset before the getconfig */ +ioctl(fd, USBDEVFS_RESET); + /* * Initial configuration is -1 which makes us claim first * available config. We used to start with 1, which does not David Ahern So I was wondering what the accepted way was to get these quirks upstream into the qemu source tree. Is usb-linux.c the correct place, or should we put the quirk into a different place? --- usb-linux.c |4 1 file changed, 4 insertions(+) --- qemu.orig/usb-linux.c +++ qemu/usb-linux.c @@ -389,6 +389,10 @@ static void usb_host_handle_reset(USBDev dprintf(husb: reset device %u.%u\n, s-bus_num, s-addr); +if (((s-descr[8] 8) | s-descr[9]) == 0x2471 +((s-descr[10] 8) | s-descr[11]) == 0x0853) +return; + ioctl(s-fd, USBDEVFS_RESET); usb_host_claim_interfaces(s, s-configuration);
Re: [Qemu-devel] system_reset command cause assert failed
2010/2/2 Luiz Capitulino lcapitul...@redhat.com: On Tue, 2 Feb 2010 09:35:16 +0800 Roy Tam roy...@gmail.com wrote: 2010/2/2 Luiz Capitulino lcapitul...@redhat.com: On Tue, 2 Feb 2010 00:26:53 +0800 Roy Tam roy...@gmail.com wrote: 2010/2/2 Luiz Capitulino lcapitul...@redhat.com: Hm, I'm puzzled. Is this failing on malloc()? At least qemu_malloc() is the last qemu's function I see in the logs. From now on I only see msvcrt functions... Maybe, you can type run on gdb, run system_reset on the Monitor and then switch back to gdb and type bt? source-less debugging seems better... As far as I can understand something bad happens while the parser is processing the first ' character of the qobject_from_jsonf() call in monitor.c:4524. Strange. Can you try 'info pci', 'info block' and 'info version'? Do they work? Maybe this is a refcount problem? Anthony, could you take a look too please? rebuild with -gstabs -O1, you can see double free here: Ok, so we have a double free and Clarify that after digging into sources further, it is not double free, but parse_json not be executed by json_lexer_feed_char as I put asm(int3) in parse_json but there's no SIGTRAP be raised. (for system_reset and system_powerdown) #0 qobject_to_qdict (obj=0x0) at qobject.h:108 #1 0x004127ae in pci_device_print (mon=0x494c460, device=0x49696c0) at /home/roy/qemu/hw/pci.c:1165 a segfault. for this, parse_json was executed by json_lexer_feed_char. a workaround patch is here, but why null qobj has pushed into qlist? 
diff --git a/hw/pci.c b/hw/pci.c index 023f7b6..84e7b35 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -1161,8 +1161,11 @@ static void pci_device_print(Monitor *mon, QDict *device) qdict_get_int(info, limit)); } +QObject* qobj; QLIST_FOREACH_ENTRY(qdict_get_qlist(device, regions), entry) { -qdict = qobject_to_qdict(qlist_entry_obj(entry)); +qobj = qlist_entry_obj(entry); +if(!qobj) continue; +qdict = qobject_to_qdict(qobj); monitor_printf(mon, BAR%d: , (int) qdict_get_int(qdict, bar)); addr = qdict_get_int(qdict, address);
RE: [Qemu-devel] [Patch] Support translating Guest physical address to Host virtual address.
Hi, Any futher comments for this patch so that we can modify? thanks, jiajia Max Asbock wrote: On Wed, 2010-01-27 at 15:39 -0600, Anthony Liguori wrote: On 01/26/2010 09:25 PM, Zheng, Jiajia wrote: Add command p2v to translate Guest physical address to Host virtual address. For what purpose? Signed-off-by: Max Asbockmasb...@linux.vnet.ibm.com Jiajia Zhengjiajia.zh...@intel.com --- diff --git a/monitor.c b/monitor.c index b33b01f..83d9ac7 100644 --- a/monitor.c +++ b/monitor.c @@ -668,6 +668,11 @@ static void do_info_uuid(Monitor *mon, QObject **ret_data) *ret_data = qobject_from_jsonf({ 'UUID': %s }, uuid); } +static void do_info_p2v(Monitor *mon) +{ +monitor_printf(mon, p2v implemented\n); +} These should be implemented as QMP commands. /* get the current CPU defined by the user */ static int mon_set_cpu(int cpu_index) { @@ -2283,6 +2288,14 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) break; } } +static void do_p2v(Monitor *mon, const QDict *qdict) +{ +target_long size = 4096; +target_long addr = qdict_get_int(qdict, addr); + +monitor_printf(mon, Guest physical address %p is mapped at host virtual address %p\n, (void *)addr, cpu_physical_memory_map( (target_phys_addr_t)addr, (target_phys_addr_t *)size, 0)); This isn't quite right. It assumes TARGET_PAGE_SIZE is 4k which is certainly not always true. It also assumes that cpu_physical_memory_map() something that has some meaning which isn't necessarily the case. It could be a pointer to a bounce buffer. Could you give an end-to-end description of how you expect this mechanism to be used so we can work out a more appropriate set of interfaces. I assume this is MCE related. The purpose of this is to translate a guest physical address to a host virtual address. This was indeed used for MCE testing. The p2v command provides one step in a chain of translations from guest virtual to guest physical to host virtual to host physical. Host physical is then used to inject a machine check error. 
As a consequence the HPOISON code on the host and the MCE injection code in qemu are exercised. I was always assuming that this implementation perhaps isn't the most optimal, but it simply worked for our test case. What would an appropriate method be to get a host virtual address for guest physical address that represents a page of RAM? thanks, Max
[Qemu-devel] Re: Network shutdown under load
Hi, we're currently having this problem on two production servers that 2-4 times a day one interface shuts down. We've four KVMs running on two hosts (2x2). All VMs have eth0 and eth1 running virtio_net. All eth0's are connected to bridge br0 and all eth1's are connected to br1 on the host. Here are the startup options for one VM (the others are quite similar [of course other mac address, ...]): /usr/bin/kvm -m 8192 -smp 8 -cpu host -daemonize -k de -vnc 127.0.0.1:1 -monitor telnet:172.18.105.46:,server,nowait -localtime -pidfile /tmp/kvm-dodoma.pid -drive file=/data/kvm/kvmimages/dodoma.qcow2,if=virtio,cache=none,boot=on -drive file=/data/kvm/kvmimages/dodoma-vdb.qcow2,if=virtio,cache=none -net nic,vlan=104,model=virtio,macaddr=00:ff:48:e5:4b:8d -net tap,vlan=104,ifname=tap.b.dodoma,script=no -net nic,vlan=96,model=virtio,macaddr=00:ff:48:e5:4b:8f -net tap,vlan=96,ifname=tap.f.dodoma,script=no I've tried the very latest Gentoo kernel 2.6.30 on the host and guest (all VMs and hosts running Gentoo btw.). With kernel 2.6.31 on host and 2.6.30 on guest the problem still exist. I've tried KVM 0.11.1, 0.12.1.2 and 0.12.2 running with kernel 2.6.30 and 2.6.31 on the host side. Interestingly all the VMs almost have the same network traffic (in and out) but the VMs running Apache bind to eth1 have the biggest problems. They shut down eth1 2-4 times a day. eth0 is running fine despite that it is doing almost the same traffic amount but this traffic comes from the database where as eth1 sends the traffic to the proxy (Varnish). So incoming traffic seems to work fine here but outgoing traffic is problematic. On the other hand the VMs running Varnish getting all the traffic through eth1. Here I've only seen one shutdown of eth1 in 48 hours. Is there anything I can help to debug this problem? Is there already a fix available? Otherwise I really have to install KVM-88 which runs fine on some other hosts. Thanks! 
Robert

Tom Lendacky wrote:

There's been some discussion of this already in the kvm list, but I want to summarize what I've found and also include the qemu-devel list in an effort to find a solution to this problem.

Running a netperf test between two kvm guests results in the guest's network interface shutting down. I originally found this using kvm guests on two different machines that were connected via a 10GbE link. However, I found this problem can be easily reproduced using two guests on the same machine.

I am running the 2.6.32 level of the kvm.git tree and the 0.12.1.2 level of the qemu-kvm.git tree.

The setup includes two bridges, br0 and br1. The commands used to start the guests are as follows:

usr/local/bin/qemu-system-x86_64 -name cape-vm001 -m 1024 \
    -drive file=/autobench/var/tmp/cape-vm001-raw.img,if=virtio,index=0,media=disk,boot=on \
    -net nic,model=virtio,vlan=0,macaddr=00:16:3E:00:62:51,netdev=cape-vm001-eth0 \
    -netdev tap,id=cape-vm001-eth0,script=/autobench/var/tmp/ifup-kvm-br0,downscript=/autobench/var/tmp/ifdown-kvm-br0 \
    -net nic,model=virtio,vlan=1,macaddr=00:16:3E:00:62:D1,netdev=cape-vm001-eth1 \
    -netdev tap,id=cape-vm001-eth1,script=/autobench/var/tmp/ifup-kvm-br1,downscript=/autobench/var/tmp/ifdown-kvm-br1 \
    -vnc :1 -monitor telnet::5701,server,nowait -snapshot -daemonize

usr/local/bin/qemu-system-x86_64 -name cape-vm002 -m 1024 \
    -drive file=/autobench/var/tmp/cape-vm002-raw.img,if=virtio,index=0,media=disk,boot=on \
    -net nic,model=virtio,vlan=0,macaddr=00:16:3E:00:62:61,netdev=cape-vm002-eth0 \
    -netdev tap,id=cape-vm002-eth0,script=/autobench/var/tmp/ifup-kvm-br0,downscript=/autobench/var/tmp/ifdown-kvm-br0 \
    -net nic,model=virtio,vlan=1,macaddr=00:16:3E:00:62:E1,netdev=cape-vm002-eth1 \
    -netdev tap,id=cape-vm002-eth1,script=/autobench/var/tmp/ifup-kvm-br1,downscript=/autobench/var/tmp/ifdown-kvm-br1 \
    -vnc :2 -monitor telnet::5702,server,nowait -snapshot -daemonize

The ifup-kvm-br0 script takes the (first) qemu created tap device and brings it up
and adds it to bridge br0. The ifup-kvm-br1 script takes the (second) qemu created tap device and brings it up and adds it to bridge br1.

Each ethernet device within a guest is on its own subnet. For example:

  guest 1: eth0 has addr 192.168.100.32 and eth1 has addr 192.168.101.32
  guest 2: eth0 has addr 192.168.100.64 and eth1 has addr 192.168.101.64

On one of the guests run netserver:

  netserver -L 192.168.101.32 -p 12000

On the other guest run netperf:

  netperf -L 192.168.101.64 -H 192.168.101.32 -p 12000 -t TCP_STREAM -l 60 -c -C -- -m 16K -M 16K

It may take more than one netperf run (I find that my second run almost always causes the shutdown), but the network on the eth1 links will stop working.

I did some debugging and found that in qemu on the guest running netserver:

  - the receive_disabled variable is set and never gets reset
  - the read_poll event handler for the eth1 tap device is disabled and never re-enabled

These conditions result in no