date:20100202

Re: [Qemu-devel] [PATCH 0/8]: QMP feature negotiation support

2010-02-02 Thread Markus Armbruster

Luiz Capitulino lcapitul...@redhat.com writes:

 On Mon, 01 Feb 2010 20:37:41 +0100
 Markus Armbruster arm...@redhat.com wrote:

 Luiz Capitulino lcapitul...@redhat.com writes:
 
  On Mon, 01 Feb 2010 18:08:27 +0100
  Markus Armbruster arm...@redhat.com wrote:
[...]
  I don't doubt your design does the job.  I just think it's overly
  general.  I had something far more stupid in mind:
  
  client connects
  server -> client: version & capability offer (one message)
again:
  client -> server: capability selection (one message)
  server -> client: either okay or error (one message)
  if error goto again
  connection is now ready for commands
  
  No modes.  The distinct lack of generality is a design feature.
 
   I like the simplicity and if we were allowed to change later I'd
  do it.
 
   The question is if we will ever want features to be _configured_
  before the protocol is operational. In this case we'd need to
  pass feature arguments through the capability selection command,
  which will get ugly and hard to use/understand.
 
   Mode oriented support doesn't have this limitation. Maybe we
  won't ever really use it, but it's safer.
 
 Capability selection could be done as an object where the name/value
 pairs are capability/argument.  If you need multiple arguments for a
 capability, make the capability's value an object.

  That's exactly what seems complicated to me, because besides performing
 two functions (enable/configure) some feature setup could require
 more commands to be done in a clear way.

What do you mean by feature setup?  And how does it go beyond setting
a bunch of parameters?

  The async messages setup in the previous series was an example of this.

I don't remember the details.  Could you summarize?

  As said we might never use this, but I wouldn't like to regret later.

A somewhat plausible example for how it could be needed would help.
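
For reference, the one-message selection loop proposed earlier in this thread could look roughly like the following on the server side. This is only a minimal sketch: the capability names, the bitmask representation and `struct session` are made up for illustration and are not part of QMP or of the posted series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; QMP does not define these names. */
enum {
    CAP_ASYNC_EVENTS = 1u << 0,
    CAP_EXTRA_ERRORS = 1u << 1,
};

struct session {
    uint32_t offered;   /* sent once in the server's greeting */
    uint32_t enabled;   /* valid only after a successful selection */
    bool ready;         /* true once commands may be issued */
};

/* Server side of "client -> server: capability selection".
 * Returns 0 for "okay" and -1 for "error"; on error the session stays
 * unready and the client simply sends another selection. */
static int select_capabilities(struct session *s, uint32_t requested)
{
    if (requested & ~s->offered) {
        return -1;              /* asked for something never offered */
    }
    s->enabled = requested;
    s->ready = true;
    return 0;
}
```

There are no modes here: the only state is "ready or not", which is the lack of generality argued for above. Per-capability arguments would need a richer selection message than a bitmask.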

[Qemu-devel] [PATCH 04/21] KVM: x86: Fix up misreported CPU features

2010-02-02 Thread Jan Kiszka

From qemu-kvm: Kernels before 2.6.30 misreported some essential CPU
features via KVM_GET_SUPPORTED_CPUID. Fix them up.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 target-i386/kvm.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 504f501..9fb96b5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -101,12 +101,18 @@ uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, int reg)
 break;
 case R_EDX:
 ret = cpuid->entries[i].edx;
-if (function == 0x80000001) {
+switch (function) {
+case 1:
+/* KVM before 2.6.30 misreports the following features */
+ret |= CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA;
+break;
+case 0x80000001:
 /* On Intel, kvm returns cpuid according to the Intel spec,
  * so add missing bits according to the AMD spec:
  */
 cpuid_1_edx = kvm_arch_get_supported_cpuid(env, 1, R_EDX);
 ret |= cpuid_1_edx & 0xdfeff7ff;
+break;
 }
 break;
 }
-- 
1.6.0.2
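
The fix-up in this patch amounts to OR-ing known-good bits into the EDX value the kernel reported. A standalone sketch of that logic follows; `fixup_cpuid_edx` is a hypothetical helper written for illustration, not a function from the QEMU tree. The bit positions are the architectural CPUID leaf 1 EDX positions, and the mask is the one used in the patch.

```c
#include <stdint.h>

/* CPUID leaf 1, EDX feature bits (architectural bit positions). */
#define CPUID_MCE  (1u << 7)
#define CPUID_MTRR (1u << 12)
#define CPUID_MCA  (1u << 14)
#define CPUID_PAT  (1u << 16)

/* Hypothetical helper: returns the corrected EDX value for one leaf.
 * leaf1_edx is the already corrected EDX of leaf 1. */
static uint32_t fixup_cpuid_edx(uint32_t function, uint32_t edx,
                                uint32_t leaf1_edx)
{
    switch (function) {
    case 1:
        /* Features that old kernels fail to report. */
        edx |= CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA;
        break;
    case 0x80000001:
        /* AMD mirrors most leaf 1 bits in the extended leaf. */
        edx |= leaf1_edx & 0xdfeff7ff;
        break;
    }
    return edx;
}
```

Leaves other than 1 and 0x80000001 pass through unchanged, which matches the switch in the patch having no default case.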

[Qemu-devel] [PATCH 03/21] qemu-kvm: Clean up register access API

2010-02-02 Thread Jan Kiszka

qemu-kvm's functions for accessing the VCPU registers are
kvm_arch_load/save_regs. Use them directly instead of going through
various wrappers. Specifically, we do not need on_vcpu wrapping as all
users either already run in the related thread or call while the vm is
stopped.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 qemu-kvm.c|   37 +++--
 qemu-kvm.h|   11 ---
 target-ia64/machine.c |4 ++--
 3 files changed, 5 insertions(+), 47 deletions(-)

diff --git a/qemu-kvm.c b/qemu-kvm.c
index a305907..97c098c 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -862,7 +862,7 @@ int pre_kvm_run(kvm_context_t kvm, CPUState *env)
 kvm_arch_pre_run(env, env->kvm_run);
 
 if (env->kvm_cpu_state.regs_modified) {
-kvm_arch_put_registers(env);
+kvm_arch_load_regs(env);
 env->kvm_cpu_state.regs_modified = 0;
 }
 
@@ -1532,16 +1532,11 @@ static void on_vcpu(CPUState *env, void (*func)(void *data), void *data)
 qemu_cond_wait(&qemu_work_cond);
 }
 
-void kvm_arch_get_registers(CPUState *env)
-{
-   kvm_arch_save_regs(env);
-}
-
 static void do_kvm_cpu_synchronize_state(void *_env)
 {
 CPUState *env = _env;
 if (!env->kvm_cpu_state.regs_modified) {
-kvm_arch_get_registers(env);
+kvm_arch_save_regs(env);
 env->kvm_cpu_state.regs_modified = 1;
 }
 }
@@ -1584,32 +1579,6 @@ void kvm_update_interrupt_request(CPUState *env)
 }
 }
 
-static void kvm_do_load_registers(void *_env)
-{
-CPUState *env = _env;
-
-kvm_arch_load_regs(env);
-}
-
-void kvm_load_registers(CPUState *env)
-{
-if (kvm_enabled() && qemu_system_ready)
-on_vcpu(env, kvm_do_load_registers, env);
-}
-
-static void kvm_do_save_registers(void *_env)
-{
-CPUState *env = _env;
-
-kvm_arch_save_regs(env);
-}
-
-void kvm_save_registers(CPUState *env)
-{
-if (kvm_enabled())
-on_vcpu(env, kvm_do_save_registers, env);
-}
-
 static void kvm_do_load_mpstate(void *_env)
 {
 CPUState *env = _env;
@@ -2379,7 +2348,7 @@ static void kvm_invoke_set_guest_debug(void *data)
 struct kvm_set_guest_debug_data *dbg_data = data;
 
 if (cpu_single_env->kvm_cpu_state.regs_modified) {
-kvm_arch_put_registers(cpu_single_env);
+kvm_arch_save_regs(cpu_single_env);
 cpu_single_env->kvm_cpu_state.regs_modified = 0;
 }
 dbg_data->err =
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 6b3e5a1..1354227 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -902,8 +902,6 @@ int kvm_main_loop(void);
 int kvm_init_ap(void);
 #ifndef QEMU_KVM_NO_CPU
 int kvm_vcpu_inited(CPUState *env);
-void kvm_load_registers(CPUState *env);
-void kvm_save_registers(CPUState *env);
 void kvm_load_mpstate(CPUState *env);
 void kvm_save_mpstate(CPUState *env);
 int kvm_cpu_exec(CPUState *env);
@@ -1068,8 +1066,6 @@ void kvm_load_tsc(CPUState *env);
 #ifdef TARGET_I386
 #define qemu_kvm_has_pit_state2() (0)
 #endif
-#define kvm_load_registers(env) do {} while(0)
-#define kvm_save_registers(env) do {} while(0)
 #define kvm_save_mpstate(env)   do {} while(0)
 #define qemu_kvm_cpu_stop(env) do {} while(0)
 static inline void kvm_init_vcpu(CPUState *env)
@@ -1098,13 +1094,6 @@ static inline int kvm_sync_vcpus(void)
 }
 
 #ifndef QEMU_KVM_NO_CPU
-void kvm_arch_get_registers(CPUState *env);
-
-static inline void kvm_arch_put_registers(CPUState *env)
-{
-kvm_load_registers(env);
-}
-
 void kvm_cpu_synchronize_state(CPUState *env);
 
 static inline void cpu_synchronize_state(CPUState *env)
diff --git a/target-ia64/machine.c b/target-ia64/machine.c
index 70ef379..7d29575 100644
--- a/target-ia64/machine.c
+++ b/target-ia64/machine.c
@@ -9,7 +9,7 @@ void cpu_save(QEMUFile *f, void *opaque)
 CPUState *env = opaque;
 
 if (kvm_enabled()) {
-kvm_save_registers(env);
+kvm_arch_save_regs(env);
 kvm_arch_save_mpstate(env);
 }
 }
@@ -19,7 +19,7 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
 CPUState *env = opaque;
 
 if (kvm_enabled()) {
-kvm_load_registers(env);
+kvm_arch_load_regs(env);
 kvm_arch_load_mpstate(env);
 }
 return 0;
-- 
1.6.0.2

[Qemu-devel] [PATCH 02/21] KVM: Make vmport KVM-compatible

2010-02-02 Thread Jan Kiszka

The vmport device accesses the VCPU registers, so it requires proper
cpu_synchronize_state. Add it to vmport_ioport_read, which also
synchronizes vmport_ioport_write.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/vmport.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/hw/vmport.c b/hw/vmport.c
index 884af3f..6c9d7c9 100644
--- a/hw/vmport.c
+++ b/hw/vmport.c
@@ -25,6 +25,7 @@
 #include "isa.h"
 #include "pc.h"
 #include "sysemu.h"
+#include "kvm.h"
 
 //#define VMPORT_DEBUG
 
@@ -58,6 +59,8 @@ static uint32_t vmport_ioport_read(void *opaque, uint32_t addr)
 unsigned char command;
 uint32_t eax;
 
+cpu_synchronize_state(env);
+
 eax = env->regs[R_EAX];
 if (eax != VMPORT_MAGIC)
 return eax;
-- 
1.6.0.2

[Qemu-devel] [PATCH 2/2] powerpc/e500: adjust fdt and ramdisk loading addr

2010-02-02 Thread Liu Yu

Since the kernel uImage keeps getting bigger, the old fixed load bases
make the regions overlap.

Add padding for the fdt and ramdisk so that they won't overlap with the uImage.

Signed-off-by: Liu Yu yu@freescale.com
---
 hw/ppce500_mpc8544ds.c |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 9a5654b..3826156 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -34,8 +34,10 @@
 
 #define BINARY_DEVICE_TREE_FILE    "mpc8544ds.dtb"
 #define UIMAGE_LOAD_BASE           0
-#define DTB_LOAD_BASE              0x600000
-#define INITRD_LOAD_BASE           0x2000000
+#define DTC_LOAD_PAD               0x500000
+#define DTC_PAD_MASK               0xFFFFF
+#define INITRD_LOAD_PAD            0x2000000
+#define INITRD_PAD_MASK            0xFFFFFF
 
 #define RAM_SIZES_ALIGN            (64UL << 20)
 
@@ -170,8 +172,8 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 target_phys_addr_t entry=0;
 target_phys_addr_t loadaddr=UIMAGE_LOAD_BASE;
 target_long kernel_size=0;
-target_ulong dt_base=DTB_LOAD_BASE;
-target_ulong initrd_base=INITRD_LOAD_BASE;
+target_ulong dt_base = 0;
+target_ulong initrd_base = 0;
 target_long initrd_size=0;
 int i=0;
 unsigned int pci_irq_nrs[4] = {1, 2, 3, 4};
@@ -246,6 +248,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 
 /* Load initrd. */
 if (initrd_filename) {
+initrd_base = (kernel_size + INITRD_LOAD_PAD) & ~INITRD_PAD_MASK;
 initrd_size = load_image_targphys(initrd_filename, initrd_base,
   ram_size - initrd_base);
 
@@ -258,6 +261,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 
 /* If we're loading a kernel directly, we must load the device tree too. */
 if (kernel_filename) {
+dt_base = (kernel_size + DTC_LOAD_PAD) & ~DTC_PAD_MASK;
 if (mpc8544_load_device_tree(dt_base, ram_size,
 initrd_base, initrd_size, kernel_cmdline) < 0) {
 fprintf(stderr, "couldn't load device tree\n");
-- 
1.6.4
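
To make the address arithmetic in this patch concrete, here is a small worked sketch. `place_after_kernel` is a hypothetical helper, and the kernel size used in the usage note is an arbitrary example rather than a measured uImage size. The calculation assumes the kernel is loaded at address 0, as UIMAGE_LOAD_BASE is in this file, so that kernel_size is also the kernel's end address.

```c
#include <stdint.h>

/* Hypothetical helper mirroring "(kernel_size + PAD) & ~MASK":
 * move past the kernel by `pad` bytes, then round down to a
 * (mask + 1)-byte boundary. */
static uint32_t place_after_kernel(uint32_t kernel_size, uint32_t pad,
                                   uint32_t mask)
{
    return (kernel_size + pad) & ~mask;
}
```

With a kernel of 0x312345 bytes, a pad of 0x500000 and a mask of 0xFFFFF, the result is 0x800000; with a pad of 0x2000000 and a mask of 0xFFFFFF it is 0x2000000. Because the value is rounded down, the result stays above the end of the kernel only while the pad is larger than the mask.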

[Qemu-devel] [PATCH 1/2] powerpc/booke: move fdt loading to rom infrastructure

2010-02-02 Thread Liu Yu

It's convenient to use the rom infrastructure for overlap checking,
reset handling, etc. uImage and ramdisk loading have already moved to it.

Also, free the fdt after adding it to the rom.

Signed-off-by: Liu Yu yu@freescale.com
---
 hw/ppc440_bamboo.c |   15 ---
 hw/ppce500_mpc8544ds.c |   17 ++---
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
index 1ab9872..9d95417 100644
--- a/hw/ppc440_bamboo.c
+++ b/hw/ppc440_bamboo.c
@@ -27,7 +27,7 @@
 
 #define BINARY_DEVICE_TREE_FILE "bamboo.dtb"
 
-static void *bamboo_load_device_tree(target_phys_addr_t addr,
+static int bamboo_load_device_tree(target_phys_addr_t addr,
  uint32_t ramsize,
  target_phys_addr_t initrd_base,
  target_phys_addr_t initrd_size,
@@ -42,11 +42,13 @@ static void *bamboo_load_device_tree(target_phys_addr_t addr,
 
 filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, BINARY_DEVICE_TREE_FILE);
 if (!filename) {
+ret = -1;
 goto out;
 }
 fdt = load_device_tree(filename, &fdt_size);
 qemu_free(filename);
 if (fdt == NULL) {
+ret = -1;
 goto out;
 }
 
@@ -75,12 +77,13 @@ static void *bamboo_load_device_tree(target_phys_addr_t addr,
 if (kvm_enabled())
 kvmppc_fdt_update(fdt);
 
-cpu_physical_memory_write (addr, (void *)fdt, fdt_size);
+ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
+qemu_free(fdt);
 
 out:
 #endif
 
-return fdt;
+return ret;
 }
 
 static void bamboo_init(ram_addr_t ram_size,
@@ -101,7 +104,6 @@ static void bamboo_init(ram_addr_t ram_size,
 target_ulong initrd_base = 0;
 target_long initrd_size = 0;
 target_ulong dt_base = 0;
-void *fdt;
 int i;
 
 /* Setup CPU. */
@@ -153,9 +155,8 @@ static void bamboo_init(ram_addr_t ram_size,
 else
 dt_base = kernel_size + loadaddr;
 
-fdt = bamboo_load_device_tree(dt_base, ram_size,
-  initrd_base, initrd_size, kernel_cmdline);
-if (fdt == NULL) {
+if (bamboo_load_device_tree(dt_base, ram_size,
+initrd_base, initrd_size, kernel_cmdline) < 0) {
 fprintf(stderr, "couldn't load device tree\n");
 exit(1);
 }
diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index ea30816..9a5654b 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -72,7 +72,7 @@ out:
 }
 #endif
 
-static void *mpc8544_load_device_tree(target_phys_addr_t addr,
+static int mpc8544_load_device_tree(target_phys_addr_t addr,
  uint32_t ramsize,
  target_phys_addr_t initrd_base,
  target_phys_addr_t initrd_size,
@@ -87,11 +87,13 @@ static void *mpc8544_load_device_tree(target_phys_addr_t addr,
 
 filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, BINARY_DEVICE_TREE_FILE);
 if (!filename) {
+ret = -1;
 goto out;
 }
 fdt = load_device_tree(filename, &fdt_size);
 qemu_free(filename);
 if (fdt == NULL) {
+ret = -1;
 goto out;
 }
 
@@ -123,6 +125,7 @@ static void *mpc8544_load_device_tree(target_phys_addr_t addr,
 
 if ((dp = opendir("/proc/device-tree/cpus/")) == NULL) {
 printf("Can't open directory /proc/device-tree/cpus/\n");
+ret = -1;
 goto out;
 }
 
@@ -136,6 +139,7 @@ static void *mpc8544_load_device_tree(target_phys_addr_t addr,
 closedir(dp);
 if (buf[0] == '\0') {
 printf("Unknow host!\n");
+ret = -1;
 goto out;
 }
 
@@ -143,12 +147,13 @@ static void *mpc8544_load_device_tree(target_phys_addr_t addr,
 mpc8544_copy_soc_cell(fdt, buf, "timebase-frequency");
 }
 
-cpu_physical_memory_write (addr, (void *)fdt, fdt_size);
+ret = rom_add_blob_fixed(BINARY_DEVICE_TREE_FILE, fdt, fdt_size, addr);
+qemu_free(fdt);
 
 out:
 #endif
 
-return fdt;
+return ret;
 }
 
 static void mpc8544ds_init(ram_addr_t ram_size,
@@ -168,7 +173,6 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 target_ulong dt_base=DTB_LOAD_BASE;
 target_ulong initrd_base=INITRD_LOAD_BASE;
 target_long initrd_size=0;
-void *fdt;
 int i=0;
 unsigned int pci_irq_nrs[4] = {1, 2, 3, 4};
 qemu_irq *irqs, *mpic, *pci_irqs;
@@ -254,9 +258,8 @@ static void mpc8544ds_init(ram_addr_t ram_size,
 
 /* If we're loading a kernel directly, we must load the device tree too. */
 if (kernel_filename) {
-fdt = mpc8544_load_device_tree(dt_base, ram_size,
-  initrd_base, initrd_size, kernel_cmdline);
-if (fdt == NULL) {
+if (mpc8544_load_device_tree(dt_base, ram_size,
+initrd_base, initrd_size, kernel_cmdline) < 0) {

[Qemu-devel] [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id

2010-02-02 Thread Jan Kiszka

Setting the boot CPU ID is arch-specific KVM stuff, so push it where it
belongs.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/pc.c|3 ---
 qemu-kvm-x86.c |3 ++-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 6c15a9f..3df6195 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -803,9 +803,6 @@ static void pc_init1(ram_addr_t ram_size,
 #endif
 }
 
-if (kvm_enabled()) {
-kvm_set_boot_cpu_id(0);
-}
 for (i = 0; i < smp_cpus; i++) {
 env = pc_new_cpu(cpu_model);
 }
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 9de018e..0f34451 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void)
 if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK))
 vmstate_register(0, &vmstate_kvmclock, &kvmclock_data);
 #endif
-return 0;
+
+return kvm_set_boot_cpu_id(0);
 }
 
 static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
-- 
1.6.0.2

[Qemu-devel] [PATCH 01/21] qemu-kvm: Drop vmport changes

2010-02-02 Thread Jan Kiszka

This attempt to make vmport KVM compatible is half-broken and is
scheduled to be replaced by proper upstream support.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/vmport.c |   13 +
 1 files changed, 1 insertions(+), 12 deletions(-)

diff --git a/hw/vmport.c b/hw/vmport.c
index 648861b..884af3f 100644
--- a/hw/vmport.c
+++ b/hw/vmport.c
@@ -21,12 +21,10 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-
 #include "hw.h"
 #include "isa.h"
 #include "pc.h"
 #include "sysemu.h"
-#include "qemu-kvm.h"
 
 //#define VMPORT_DEBUG
 
@@ -59,10 +57,6 @@ static uint32_t vmport_ioport_read(void *opaque, uint32_t addr)
 CPUState *env = cpu_single_env;
 unsigned char command;
 uint32_t eax;
-uint32_t ret;
-
-if (kvm_enabled())
-   kvm_save_registers(env);
 
 eax = env->regs[R_EAX];
 if (eax != VMPORT_MAGIC)
@@ -79,12 +73,7 @@ static uint32_t vmport_ioport_read(void *opaque, uint32_t 
addr)
 return eax;
 }
 
-ret = s->func[command](s->opaque[command], addr);
-
-if (kvm_enabled())
-   kvm_load_registers(env);
-
-return ret;
+return s->func[command](s->opaque[command], addr);
 }
 
 static void vmport_ioport_write(void *opaque, uint32_t addr, uint32_t val)
-- 
1.6.0.2

[Qemu-devel] [PATCH 17/21] qemu-kvm: Use VCPU event state for reset and vmsave/load

2010-02-02 Thread Jan Kiszka

Push reading/writing of vcpu_events into kvm_arch_load/save_regs to
avoid KVM-specific hooks in generic code.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm.h |2 --
 qemu-kvm-x86.c|6 --
 target-i386/kvm.c |4 ++--
 target-i386/machine.c |6 --
 4 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/kvm.h b/kvm.h
index e4005d8..686ee33 100644
--- a/kvm.h
+++ b/kvm.h
@@ -53,8 +53,6 @@ int kvm_set_migration_log(int enable);
 
 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
-int kvm_put_vcpu_events(CPUState *env, int level);
-int kvm_get_vcpu_events(CPUState *env);
 
 void kvm_setup_guest_memory(void *start, size_t size);
 
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 21476db..f484149 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -972,6 +972,8 @@ void kvm_arch_load_regs(CPUState *env, int level)
 if (level >= KVM_PUT_RESET_STATE) {
 kvm_arch_load_mpstate(env);
 }
+
+kvm_put_vcpu_events(env, level);
 }
 
 void kvm_load_tsc(CPUState *env)
@@ -1141,6 +1143,7 @@ void kvm_arch_save_regs(CPUState *env)
 }
 }
 kvm_arch_save_mpstate(env);
+kvm_get_vcpu_events(env);
 }
 
 static void do_cpuid_ent(struct kvm_cpuid_entry2 *e, uint32_t function,
@@ -1215,7 +1218,7 @@ int kvm_arch_init_vcpu(CPUState *cenv)
 
 qemu_kvm_load_lapic(cenv);
 
-cenv->interrupt_injected = -1;
+kvm_arch_reset_vcpu(cenv);
 
 #ifdef KVM_CPUID_SIGNATURE
 /* Paravirtualization CPUIDs */
@@ -1381,7 +1384,6 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
 kvm_arch_reset_vcpu(env);
-kvm_put_vcpu_events(env, KVM_PUT_RESET_STATE);
 if (!cpu_is_bsp(env) && !kvm_irqchip_in_kernel()) {
 env->interrupt_request &= ~CPU_INTERRUPT_HARD;
 env->halted = 1;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index fefd5a5..9bd2952 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -789,7 +789,7 @@ static int kvm_get_mp_state(CPUState *env)
 }
 #endif
 
-int kvm_put_vcpu_events(CPUState *env, int level)
+static int kvm_put_vcpu_events(CPUState *env, int level)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
 struct kvm_vcpu_events events;
@@ -825,7 +825,7 @@ int kvm_put_vcpu_events(CPUState *env, int level)
 #endif
 }
 
-int kvm_get_vcpu_events(CPUState *env)
+static int kvm_get_vcpu_events(CPUState *env)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
 struct kvm_vcpu_events events;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 6fca559..bcc315b 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -5,7 +5,6 @@
 
 #include "exec-all.h"
 #include "kvm.h"
-#include "qemu-kvm.h"
 
 static const VMStateDescription vmstate_segment = {
 .name = "segment",
@@ -322,10 +321,6 @@ static void cpu_pre_save(void *opaque)
 CPUState *env = opaque;
 int i;
 
-if (kvm_enabled()) {
-kvm_get_vcpu_events(env);
-}
-
 /* FPU */
 env->fpus_vmstate = (env->fpus & ~0x3800) | (env->fpstt & 0x7) << 11;
 env-fptag_vmstate = 0;
@@ -362,7 +357,6 @@ static int cpu_post_load(void *opaque, int version_id)
 
 if (kvm_enabled()) {
 kvm_load_tsc(env);
-kvm_put_vcpu_events(env, KVM_PUT_FULL_STATE);
 }
 
 return 0;
-- 
1.6.0.2

[Qemu-devel] [PATCH 00/21] qemu-kvm: Hook cleanups and extended use of upstream code

2010-02-02 Thread Jan Kiszka

Let's start with the overall stats:

 31 files changed, 274 insertions(+), 822 deletions(-)

So this series drops far more than 500 lines of redundant code, moving
qemu-kvm yet a bit closer to upstream.

The other highlight is the simplification of synchronization between
in-kernel and user space VCPU states. This area used to cause a lot of
problems in the past because it was tricky to get things right,
specifically during the multi-threaded startup. The new approach pushes
all the sync work around reset and vmsave/load into generic code, not
only removing the burden from developers of, say, in-kernel APIC
support, but also dropping most of our kvm-specific hooks, especially in
the qemu-kvm tree.

While I tested this on various VMs around, and things look good so far,
I wouldn't be surprised if there are some regressions remaining,
specifically in the non-x86 parts that I wasn't able to test or even
build. Please have a careful look!

Regarding the organization of the series: Patches prefixed with KVM:
are for upstream, unmodified or with only minor adjustments. But I have
a separate series against uq/master here that just needs final polishing
and can then be rolled out as well.

You can pull this series from

git://git.kiszka.org/qemu-kvm.git queues/vcpu-state

There are two more items on my to-do list, yet with medium prio:
 o switch kvm_arch_save/load_regs and sub-functions to upstream code
 o drop qemu-kvm's slot management in favor of upstream's implementation

Jan Kiszka (21):
  qemu-kvm: Drop vmport changes
  KVM: Make vmport KVM-compatible
  qemu-kvm: Clean up register access API
  KVM: x86: Fix up misreported CPU features
  qemu-kvm: Use upstream kvm_enabled and cpu_synchronize_state
  qemu-kvm: Use upstream kvm_setup_guest_memory
  qemu-kvm: Use some more upstream prototypes
  qemu-kvm: Use upstream kvm_arch_get_supported_cpuid
  qemu-kvm: Use upstream kvm_pit_in_kernel
  KVM: Move and rename regs_modified
  KVM: Rework of guest debug state writing
  qemu-kvm: Use upstream kvm_vcpu_dirty
  qemu-kvm: Use upstream guest debug code
  qemu-kvm: Rework VCPU state writeback API
  qemu-kvm: Clean up mpstate synchronization
  KVM: x86: Restrict writeback of VCPU state
  qemu-kvm: Use VCPU event state for reset and vmsave/load
  qemu-kvm: Cleanup/fix TSC and PV clock writeback
  qemu-kvm: Clean up KVM's APIC hooks
  qemu-kvm: Move kvm_set_boot_cpu_id
  qemu-kvm: Bring qemu_init_vcpu back home

 cpu-defs.h|2 +-
 exec.c|   17 --
 hw/apic.c |   47 +-
 hw/i8254.c|6 +-
 hw/i8259.c|2 +-
 hw/ioapic.c   |2 +-
 hw/msix.c |3 +-
 hw/pc.c   |   13 +--
 hw/pcspk.c|4 +-
 hw/piix_pci.c |2 +-
 hw/ppc_newworld.c |3 -
 hw/ppc_oldworld.c |3 -
 hw/s390-virtio.c  |1 -
 hw/vmport.c   |   14 +--
 kvm-all.c |   51 +++---
 kvm.h |   35 +++--
 qemu-kvm-ia64.c   |6 +-
 qemu-kvm-x86.c|  415 +
 qemu-kvm.c|  159 +++
 qemu-kvm.h|  158 +--
 savevm.c  |4 +
 sysemu.h  |4 +
 target-i386/cpu.h |9 +-
 target-i386/helper.c  |2 +
 target-i386/kvm.c |   61 +--
 target-i386/machine.c |   27 
 target-ia64/machine.c |5 +-
 target-ppc/kvm.c  |2 +-
 target-ppc/machine.c  |4 -
 target-s390x/kvm.c|3 +-
 vl.c  |   32 -
 31 files changed, 274 insertions(+), 822 deletions(-)

[Qemu-devel] [PATCH 16/21] KVM: x86: Restrict writeback of VCPU state

2010-02-02 Thread Jan Kiszka

Do not write nmi_pending, sipi_vector, and mpstate unless we at least go
through a reset. And TSC as well as KVM wallclocks should only be
written on full sync, otherwise we risk dropping some time during
state read-modify-write.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm.h |2 +-
 qemu-kvm-x86.c|2 +-
 target-i386/kvm.c |   32 
 target-i386/machine.c |2 +-
 4 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/kvm.h b/kvm.h
index ee8b3f6..e4005d8 100644
--- a/kvm.h
+++ b/kvm.h
@@ -53,7 +53,7 @@ int kvm_set_migration_log(int enable);
 
 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
-int kvm_put_vcpu_events(CPUState *env);
+int kvm_put_vcpu_events(CPUState *env, int level);
 int kvm_get_vcpu_events(CPUState *env);
 
 void kvm_setup_guest_memory(void *start, size_t size);
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 6b5895f..21476db 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -1381,7 +1381,7 @@ void kvm_arch_push_nmi(void *opaque)
 void kvm_arch_cpu_reset(CPUState *env)
 {
 kvm_arch_reset_vcpu(env);
-kvm_put_vcpu_events(env);
+kvm_put_vcpu_events(env, KVM_PUT_RESET_STATE);
 if (!cpu_is_bsp(env) && !kvm_irqchip_in_kernel()) {
 env->interrupt_request &= ~CPU_INTERRUPT_HARD;
 env->halted = 1;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 4a0c8bb..fefd5a5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -544,7 +544,7 @@ static void kvm_msr_entry_set(struct kvm_msr_entry *entry,
 entry->data = value;
 }
 
-static int kvm_put_msrs(CPUState *env)
+static int kvm_put_msrs(CPUState *env, int level)
 {
 struct {
 struct kvm_msrs info;
@@ -558,7 +558,6 @@ static int kvm_put_msrs(CPUState *env)
 kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_EIP, env->sysenter_eip);
 if (kvm_has_msr_star(env))
 kvm_msr_entry_set(&msrs[n++], MSR_STAR, env->star);
-kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
 kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
 #ifdef TARGET_X86_64
 /* FIXME if lm capable */
@@ -567,8 +566,12 @@ static int kvm_put_msrs(CPUState *env)
 kvm_msr_entry_set(&msrs[n++], MSR_FMASK, env->fmask);
 kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
 #endif
-kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,  env->system_time_msr);
-kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK,  env->wall_clock_msr);
+if (level == KVM_PUT_FULL_STATE) {
+kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
+kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
+  env->system_time_msr);
+kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
+}
+}
 
 msr_data.info.nmsrs = n;
 
@@ -786,7 +789,7 @@ static int kvm_get_mp_state(CPUState *env)
 }
 #endif
 
-int kvm_put_vcpu_events(CPUState *env)
+int kvm_put_vcpu_events(CPUState *env, int level)
 {
 #ifdef KVM_CAP_VCPU_EVENTS
 struct kvm_vcpu_events events;
@@ -810,8 +813,11 @@ int kvm_put_vcpu_events(CPUState *env)
 
 events.sipi_vector = env->sipi_vector;
 
-events.flags =
-KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+events.flags = 0;
+if (level >= KVM_PUT_RESET_STATE) {
+events.flags |=
+KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+}
 
 return kvm_vcpu_ioctl(env, KVM_SET_VCPU_EVENTS, &events);
 #else
@@ -882,15 +888,17 @@ int kvm_arch_put_registers(CPUState *env, int level)
 if (ret < 0)
 return ret;
 
-ret = kvm_put_msrs(env);
+ret = kvm_put_msrs(env, level);
 if (ret < 0)
 return ret;
 
-ret = kvm_put_mp_state(env);
-if (ret < 0)
-return ret;
+if (level >= KVM_PUT_RESET_STATE) {
+ret = kvm_put_mp_state(env);
+if (ret < 0)
+return ret;
+}
 
-ret = kvm_put_vcpu_events(env);
+ret = kvm_put_vcpu_events(env, level);
 if (ret < 0)
 return ret;
 
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 61e6a87..6fca559 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -362,7 +362,7 @@ static int cpu_post_load(void *opaque, int version_id)
 
 if (kvm_enabled()) {
 kvm_load_tsc(env);
-kvm_put_vcpu_events(env);
+kvm_put_vcpu_events(env, KVM_PUT_FULL_STATE);
 }
 
 return 0;
-- 
1.6.0.2
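
The level checks in this patch can be summarised as a mapping from writeback level to the groups of state that get written. The sketch below is illustrative only: the `PUT_*` values and the `WB_*` group names are assumptions made for this example (the patch itself uses KVM_PUT_RESET_STATE and KVM_PUT_FULL_STATE), and `writeback_groups` is a hypothetical helper.

```c
/* Writeback levels, ordered from least to most invasive.
 * The numeric values are an assumption for this sketch. */
enum {
    PUT_RUNTIME_STATE = 1,
    PUT_RESET_STATE   = 2,
    PUT_FULL_STATE    = 3,
};

/* Hypothetical names for the groups of state the patch distinguishes. */
enum {
    WB_REGS_MSRS = 1u << 0,  /* always written */
    WB_MPSTATE   = 1u << 1,  /* mpstate */
    WB_NMI_SIPI  = 1u << 2,  /* nmi_pending and sipi_vector */
    WB_CLOCKS    = 1u << 3,  /* TSC and the KVM clock MSRs */
};

static unsigned writeback_groups(int level)
{
    unsigned groups = WB_REGS_MSRS;

    if (level >= PUT_RESET_STATE) {
        groups |= WB_MPSTATE | WB_NMI_SIPI;
    }
    if (level == PUT_FULL_STATE) {
        groups |= WB_CLOCKS;   /* only on full sync, to avoid losing time */
    }
    return groups;
}
```

A plain runtime writeback therefore leaves mpstate, pending NMIs, the SIPI vector and the clocks untouched, which is the read-modify-write hazard the commit message describes.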

[Qemu-devel] [PATCH 09/21] qemu-kvm: Use upstream kvm_pit_in_kernel

2010-02-02 Thread Jan Kiszka

Drop private version in favor of recently added upstream service and
track its state directly in KVMState.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/i8254.c |4 ++--
 hw/pc.c|2 +-
 hw/pcspk.c |4 ++--
 kvm-all.c  |2 +-
 kvm.h  |2 +-
 qemu-kvm-x86.c |   12 ++--
 qemu-kvm.c |5 -
 qemu-kvm.h |   13 +
 8 files changed, 14 insertions(+), 30 deletions(-)

diff --git a/hw/i8254.c b/hw/i8254.c
index db9e94a..1add08e 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -491,7 +491,7 @@ void hpet_disable_pit(void)
 {
 PITChannelState *s = &pit_state.channels[0];
 
-if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+if (kvm_enabled() && kvm_pit_in_kernel()) {
 if (qemu_kvm_has_pit_state2()) {
 kvm_hpet_disable_kpit();
 } else {
@@ -515,7 +515,7 @@ void hpet_enable_pit(void)
 PITState *pit = &pit_state;
 PITChannelState *s = &pit->channels[0];
 
-if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+if (kvm_enabled() && kvm_pit_in_kernel()) {
 if (qemu_kvm_has_pit_state2()) {
 kvm_hpet_enable_kpit();
 } else {
diff --git a/hw/pc.c b/hw/pc.c
index dac373e..7a7dfa7 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -951,7 +951,7 @@ static void pc_init1(ram_addr_t ram_size,
 ioapic_irq_hack = isa_irq;
 }
 #ifdef CONFIG_KVM_PIT
-if (kvm_enabled() && qemu_kvm_pit_in_kernel())
+if (kvm_enabled() && kvm_pit_in_kernel())
pit = kvm_pit_init(0x40, isa_reserve_irq(0));
 else
 #endif
diff --git a/hw/pcspk.c b/hw/pcspk.c
index 128836b..fb5f763 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -56,7 +56,7 @@ static void kvm_get_pit_ch2(PITState *pit,
 {
 struct kvm_pit_state pit_state;
 
-if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+if (kvm_enabled() && kvm_pit_in_kernel()) {
 kvm_get_pit(kvm_context, &pit_state);
 pit->channels[2].mode = pit_state.channels[2].mode;
 pit->channels[2].count = pit_state.channels[2].count;
@@ -71,7 +71,7 @@ static void kvm_get_pit_ch2(PITState *pit,
 static void kvm_set_pit_ch2(PITState *pit,
 struct kvm_pit_state *inkernel_state)
 {
-if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+if (kvm_enabled() && kvm_pit_in_kernel()) {
 inkernel_state->channels[2].mode = pit->channels[2].mode;
 inkernel_state->channels[2].count = pit->channels[2].count;
 inkernel_state->channels[2].count_load_time =
diff --git a/kvm-all.c b/kvm-all.c
index e7fa605..6cbca97 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -164,13 +164,13 @@ int kvm_irqchip_in_kernel(void)
 return kvm_state-irqchip_in_kernel;
 }
 
-#ifdef KVM_UPSTREAM
 int kvm_pit_in_kernel(void)
 {
 return kvm_state-pit_in_kernel;
 }
 
 
+#ifdef KVM_UPSTREAM
 int kvm_init_vcpu(CPUState *env)
 {
 KVMState *s = kvm_state;
diff --git a/kvm.h b/kvm.h
index 189a5d4..253b45d 100644
--- a/kvm.h
+++ b/kvm.h
@@ -68,10 +68,10 @@ int kvm_remove_breakpoint(CPUState *current_env, target_ulong addr,
   target_ulong len, int type);
 void kvm_remove_all_breakpoints(CPUState *current_env);
 int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap);
+#endif /* KVM_UPSTREAM */
 
 int kvm_pit_in_kernel(void);
 int kvm_irqchip_in_kernel(void);
-#endif /* KVM_UPSTREAM */
 
 /* internal API */
 
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 0457a6e..074b510 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -119,13 +119,13 @@ static int kvm_create_pit(kvm_context_t kvm)
 #ifdef KVM_CAP_PIT
int r;
 
-   kvm->pit_in_kernel = 0;
+   kvm_state->pit_in_kernel = 0;
if (!kvm->no_pit_creation) {
r = kvm_ioctl(kvm_state, KVM_CHECK_EXTENSION, KVM_CAP_PIT);
if (r > 0) {
r = kvm_vm_ioctl(kvm_state, KVM_CREATE_PIT);
if (r >= 0)
-   kvm->pit_in_kernel = 1;
+   kvm_state->pit_in_kernel = 1;
else {
fprintf(stderr, "Create kernel PIC irqchip failed\n");
return r;
@@ -311,14 +311,14 @@ int kvm_set_lapic(CPUState *env, struct kvm_lapic_state *s)
 
 int kvm_get_pit(kvm_context_t kvm, struct kvm_pit_state *s)
 {
-   if (!kvm->pit_in_kernel)
+   if (!kvm_pit_in_kernel())
return 0;
return kvm_vm_ioctl(kvm_state, KVM_GET_PIT, s);
 }
 
 int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s)
 {
-   if (!kvm->pit_in_kernel)
+   if (!kvm_pit_in_kernel())
return 0;
return kvm_vm_ioctl(kvm_state, KVM_SET_PIT, s);
 }
@@ -326,14 +326,14 @@ int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s)
 #ifdef KVM_CAP_PIT_STATE2
 int kvm_get_pit2(kvm_context_t kvm, struct kvm_pit_state2 *ps2)
 {
-   if (!kvm->pit_in_kernel)
+   if (!kvm_pit_in_kernel())
return 0;
return

[Qemu-devel] [PATCH 10/21] KVM: Move and rename regs_modified

2010-02-02 Thread Jan Kiszka

Touching the user space representation of KVM's VCPU state is -
naturally - a per-VCPU thing. So move the dirty flag into KVM_CPU_COMMON
and take the opportunity to rename it to reflect its true meaning.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 cpu-defs.h |1 +
 kvm-all.c  |   12 ++--
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/cpu-defs.h b/cpu-defs.h
index cf502e9..49a9e8d 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -208,6 +208,7 @@ struct KVMCPUState {
 struct KVMState *kvm_state; \
 struct kvm_run *kvm_run;\
 int kvm_fd; \
+int kvm_vcpu_dirty; \
 uint32_t stop;   /* Stop request */ \
 uint32_t stopped; /* Artificially stopped */\
 struct KVMCPUState kvm_cpu_state;
diff --git a/kvm-all.c b/kvm-all.c
index 6cbca97..3516f01 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -573,9 +573,9 @@ static void kvm_run_coalesced_mmio(CPUState *env, struct kvm_run *run)
 
 void kvm_cpu_synchronize_state(CPUState *env)
 {
-    if (!env->kvm_state->regs_modified) {
+    if (!env->kvm_vcpu_dirty) {
         kvm_arch_get_registers(env);
-        env->kvm_state->regs_modified = 1;
+        env->kvm_vcpu_dirty = 1;
     }
 }
 
@@ -593,9 +593,9 @@ int kvm_cpu_exec(CPUState *env)
             break;
         }
 
-        if (env->kvm_state->regs_modified) {
+        if (env->kvm_vcpu_dirty) {
             kvm_arch_put_registers(env);
-            env->kvm_state->regs_modified = 0;
+            env->kvm_vcpu_dirty = 0;
         }
 
         kvm_arch_pre_run(env, run);
@@ -951,9 +951,9 @@ static void kvm_invoke_set_guest_debug(void *data)
     struct kvm_set_guest_debug_data *dbg_data = data;
     CPUState *env = dbg_data->env;
 
-    if (env->kvm_state->regs_modified) {
+    if (env->kvm_vcpu_dirty) {
         kvm_arch_put_registers(env);
-        env->kvm_state->regs_modified = 0;
+        env->kvm_vcpu_dirty = 0;
     }
     dbg_data->err = kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, &dbg_data->dbg);
 }
-- 
1.6.0.2

[Qemu-devel] [PATCH 12/21] qemu-kvm: Use upstream kvm_vcpu_dirty

2010-02-02 Thread Jan Kiszka

Drop regs_modified in favor of upstream's equivalent and clean up
kvm_cpu_synchronize_state while at it.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 cpu-defs.h |1 -
 hw/pc.c|2 +-
 qemu-kvm.c |   18 +-
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/cpu-defs.h b/cpu-defs.h
index 49a9e8d..c57d8df 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -142,7 +142,6 @@ struct KVMCPUState {
     pthread_t thread;
     int signalled;
     struct qemu_work_item *queued_work_first, *queued_work_last;
-    int regs_modified;
 };
 
 #define CPU_TEMP_BUF_NLONGS 128
diff --git a/hw/pc.c b/hw/pc.c
index 7a7dfa7..af6ea8b 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -744,7 +744,7 @@ CPUState *pc_new_cpu(const char *cpu_model)
         fprintf(stderr, "Unable to find x86 CPU definition\n");
         exit(1);
     }
-    env->kvm_cpu_state.regs_modified = 1;
+    env->kvm_vcpu_dirty = 1;
     if ((env->cpuid_features & CPUID_APIC) || smp_cpus > 1) {
         env->cpuid_apic_id = env->cpu_index;
         /* APIC reset callback resets cpu */
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 3ad0ec7..c04f805 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -861,9 +861,9 @@ int pre_kvm_run(kvm_context_t kvm, CPUState *env)
 {
     kvm_arch_pre_run(env, env->kvm_run);
 
-    if (env->kvm_cpu_state.regs_modified) {
+    if (env->kvm_vcpu_dirty) {
         kvm_arch_load_regs(env);
-        env->kvm_cpu_state.regs_modified = 0;
+        env->kvm_vcpu_dirty = 0;
     }
 
     pthread_mutex_unlock(&qemu_mutex);
@@ -1530,16 +1530,16 @@ static void on_vcpu(CPUState *env, void (*func)(void *data), void *data)
 static void do_kvm_cpu_synchronize_state(void *_env)
 {
     CPUState *env = _env;
-    if (!env->kvm_cpu_state.regs_modified) {
-        kvm_arch_save_regs(env);
-        env->kvm_cpu_state.regs_modified = 1;
-    }
+
+    kvm_arch_save_regs(env);
 }
 
 void kvm_cpu_synchronize_state(CPUState *env)
 {
-    if (!env->kvm_cpu_state.regs_modified)
+    if (!env->kvm_vcpu_dirty) {
         on_vcpu(env, do_kvm_cpu_synchronize_state, env);
+        env->kvm_vcpu_dirty = 1;
+    }
 }
 
 static void inject_interrupt(void *data)
@@ -2329,9 +2329,9 @@ static void kvm_invoke_set_guest_debug(void *data)
 {
     struct kvm_set_guest_debug_data *dbg_data = data;
 
-    if (cpu_single_env->kvm_cpu_state.regs_modified) {
+    if (cpu_single_env->kvm_vcpu_dirty) {
         kvm_arch_save_regs(cpu_single_env);
-        cpu_single_env->kvm_cpu_state.regs_modified = 0;
+        cpu_single_env->kvm_vcpu_dirty = 0;
     }
     dbg_data->err =
         kvm_set_guest_debug(cpu_single_env,
-- 
1.6.0.2

[Qemu-devel] [PATCH 11/21] KVM: Rework of guest debug state writing

2010-02-02 Thread Jan Kiszka

So far we synchronized any dirty VCPU state back into the kernel before
updating the guest debug state. This was a tribute to a deficit in x86
kernels before 2.6.33. But as this is an arch-dependent issue, it is
better handled in the x86 part of KVM, removing the writeback point from
generic code.
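
The workaround can be pictured with a small model, assuming a toy
"kernel" whose register writeback clobbers the trap flag the way the
affected kernels did. Everything named toy_* or TOY_* below is
hypothetical; only the position of TF in EFLAGS (bit 8) is a fact about
x86. The arch side keeps a copy of the last debug state and applies it
once more after writing the registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FLAG_TF        (1u << 8)  /* x86 EFLAGS trap flag */
#define TOY_DBG_SINGLESTEP 0x1u       /* hypothetical control bit */

struct toy_vcpu {
    uint32_t kernel_eflags;  /* flags as held by the modelled kernel */
    uint32_t user_eflags;    /* user-space copy of the flags */
    uint32_t cached_dbg;     /* last debug control word we submitted */
};

/* Stand-in for the SET_GUEST_DEBUG call: single-stepping injects TF. */
void toy_set_guest_debug(struct toy_vcpu *v, uint32_t control)
{
    assert(v != NULL);
    if (control & TOY_DBG_SINGLESTEP)
        v->kernel_eflags |= TOY_FLAG_TF;
}

/* Generic side: remember the state, then submit it. No register
 * writeback is forced here anymore. */
void toy_update_guest_debug(struct toy_vcpu *v, uint32_t control)
{
    v->cached_dbg = control;
    toy_set_guest_debug(v, control);
}

/* Arch side: the register write overwrites the injected TF, so the
 * cached debug state is applied once again afterwards. */
void toy_put_registers(struct toy_vcpu *v)
{
    v->kernel_eflags = v->user_eflags;
    toy_set_guest_debug(v, v->cached_dbg);
}
```

The ordering problem is thereby handled where it originates, and the
generic debug path no longer needs to know about the dirty flag.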

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c |   12 
 target-i386/cpu.h |9 -
 target-i386/kvm.c |   11 +++
 3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 3516f01..9c921cc 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -951,10 +951,6 @@ static void kvm_invoke_set_guest_debug(void *data)
     struct kvm_set_guest_debug_data *dbg_data = data;
     CPUState *env = dbg_data->env;
 
-    if (env->kvm_vcpu_dirty) {
-        kvm_arch_put_registers(env);
-        env->kvm_vcpu_dirty = 0;
-    }
     dbg_data->err = kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, &dbg_data->dbg);
 }
 
@@ -962,12 +958,12 @@ int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap)
 {
     struct kvm_set_guest_debug_data data;
 
-    data.dbg.control = 0;
-    if (env->singlestep_enabled)
-        data.dbg.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
+    data.dbg.control = reinject_trap;
 
+    if (env->singlestep_enabled) {
+        data.dbg.control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
+    }
     kvm_arch_update_guest_debug(env, &data.dbg);
-    data.dbg.control |= reinject_trap;
     data.env = env;
 
     on_vcpu(env, kvm_invoke_set_guest_debug, &data);
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 7d0bbd0..7787fb1 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -21,6 +21,10 @@
 
 #include "config.h"
 
+#ifdef CONFIG_KVM
+#include <linux/kvm.h>  /* for kvm_guest_debug */
+#endif
+
 #ifdef TARGET_X86_64
 #define TARGET_LONG_BITS 64
 #else
@@ -718,7 +722,10 @@ typedef struct CPUX86State {
     uint8_t has_error_code;
     uint32_t sipi_vector;
     uint32_t cpuid_kvm_features;
-
+#if defined(CONFIG_KVM) && defined(KVM_CAP_SET_GUEST_DEBUG)
+    struct kvm_guest_debug kvm_guest_debug;
+#endif
+
     /* in order to simplify APIC support, we leave this pointer to the
        user */
     struct APICState *apic_state;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 8743f32..5ac12a8 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -865,6 +865,15 @@ int kvm_arch_put_registers(CPUState *env)
     if (ret < 0)
         return ret;
 
+    /*
+     * Kernels before 2.6.33 overwrote flags.TF injected via SET_GUEST_DEBUG
+     * while updating GP regs. Work around this by updating the debug state
+     * once again.
+     */
+    ret = kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, &env->kvm_guest_debug);
+    if (ret < 0)
+        return ret;
+
     ret = kvm_put_fpu(env);
     if (ret < 0)
         return ret;
@@ -1163,6 +1172,8 @@ void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg)
                 (len_code[hw_breakpoint[n].len] << (18 + n*4));
         }
     }
+    /* Keep a copy for the writeback workaround in kvm_arch_put_registers */
+    memcpy(&env->kvm_guest_debug, dbg, sizeof(env->kvm_guest_debug));
 }
 #endif /* KVM_CAP_SET_GUEST_DEBUG */
 #endif
-- 
1.6.0.2

[Qemu-devel] [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization

2010-02-02 Thread Jan Kiszka

Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
properly synchronize with halted in the accessor functions.
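
As a rough illustration of that synchronization, assuming a toy VCPU
with hypothetical toy_* names and TOY_MP_* values that are not the
kernel's constants: on readout the halted flag is derived from the MP
state, and on writeback the MP state is derived from halted, which is
then cleared. Both directions apply only when the irqchip lives in the
kernel, and -1 marks a host without MP state support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical MP states for the model. */
enum { TOY_MP_RUNNABLE, TOY_MP_UNINITIALIZED, TOY_MP_HALTED };

struct toy_vcpu {
    int mp_state;         /* user-space copy; -1 means unsupported */
    bool halted;          /* user-space halted flag */
    int kernel_mp_state;  /* state as held by the modelled kernel */
};

/* Readout: kernel -> user space. */
void toy_save_mpstate(struct toy_vcpu *v, bool irqchip_in_kernel)
{
    assert(v != NULL);
    v->mp_state = v->kernel_mp_state;
    if (irqchip_in_kernel)
        v->halted = (v->mp_state == TOY_MP_HALTED);
}

/* Writeback: user space -> kernel. */
void toy_load_mpstate(struct toy_vcpu *v, bool irqchip_in_kernel)
{
    assert(v != NULL);
    if (v->mp_state == -1)
        return;  /* host lacks MP state support: leave it alone */
    if (irqchip_in_kernel) {
        v->mp_state = v->halted ? TOY_MP_UNINITIALIZED : TOY_MP_RUNNABLE;
        v->halted = false;  /* nothing in user space would clear it */
    }
    v->kernel_mp_state = v->mp_state;
}
```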

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/apic.c |7 
 qemu-kvm-ia64.c   |4 ++-
 qemu-kvm-x86.c|   88 +++-
 qemu-kvm.c|   30 -
 qemu-kvm.h|   15 
 target-i386/machine.c |6 ---
 target-ia64/machine.c |3 ++
 7 files changed, 55 insertions(+), 98 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 3e03e10..092c61e 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env)
     s->wait_for_sipi = 1;
 
     env->halted = !(s->apicbase & MSR_IA32_APICBASE_BSP);
-#ifdef KVM_CAP_MP_STATE
-    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
-        env->mp_state
-            = env->halted ? KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE;
-        kvm_load_mpstate(env);
-    }
-#endif
 }
 
 static void apic_startup(APICState *s, int vector_num)
diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c
index fc8110e..39bcbeb 100644
--- a/qemu-kvm-ia64.c
+++ b/qemu-kvm-ia64.c
@@ -124,7 +124,9 @@ void kvm_arch_cpu_reset(CPUState *env)
 {
     if (kvm_irqchip_in_kernel(kvm_context)) {
 #ifdef KVM_CAP_MP_STATE
-        kvm_reset_mpstate(env->kvm_cpu_state.vcpu_ctx);
+        struct kvm_mp_state mp_state = {.mp_state = KVM_MP_STATE_UNINITIALIZED
+        };
+        kvm_set_mpstate(env, &mp_state);
 #endif
     } else {
         env->interrupt_request &= ~CPU_INTERRUPT_HARD;
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 63cd095..6b5895f 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -754,6 +754,48 @@ static int get_msr_entry(struct kvm_msr_entry *entry, CPUState *env)
         return 0;
 }
 
+static void kvm_arch_save_mpstate(CPUState *env)
+{
+#ifdef KVM_CAP_MP_STATE
+    int r;
+    struct kvm_mp_state mp_state;
+
+    r = kvm_get_mpstate(env, &mp_state);
+    if (r < 0) {
+        env->mp_state = -1;
+    } else {
+        env->mp_state = mp_state.mp_state;
+        if (kvm_irqchip_in_kernel()) {
+            env->halted = (env->mp_state == KVM_MP_STATE_HALTED);
+        }
+    }
+#else
+    env->mp_state = -1;
+#endif
+}
+
+static void kvm_arch_load_mpstate(CPUState *env)
+{
+#ifdef KVM_CAP_MP_STATE
+    struct kvm_mp_state mp_state;
+
+    /*
+     * -1 indicates that the host did not support GET_MP_STATE ioctl,
+     *  so don't touch it.
+     */
+    if (env->mp_state != -1) {
+        if (kvm_irqchip_in_kernel()) {
+            env->mp_state = env->halted ? KVM_MP_STATE_UNINITIALIZED :
+                                          KVM_MP_STATE_RUNNABLE;
+            /* Avoid deadlock: no user space IRQ will ever clear it. */
+            env->halted = 0;
+        }
+        mp_state.mp_state = env->mp_state;
+        kvm_set_mpstate(env, &mp_state);
+    }
+#endif
+}
+
 static void set_v8086_seg(struct kvm_segment *lhs, const SegmentCache *rhs)
 {
     lhs->selector = rhs->selector;
@@ -926,6 +968,10 @@ void kvm_arch_load_regs(CPUState *env, int level)
     rc = kvm_set_msrs(env, msrs, n);
     if (rc == -1)
         perror("kvm_set_msrs FAILED");
+
+    if (level >= KVM_PUT_RESET_STATE) {
+        kvm_arch_load_mpstate(env);
+    }
 }
 
 void kvm_load_tsc(CPUState *env)
@@ -940,36 +986,6 @@ void kvm_load_tsc(CPUState *env)
         perror("kvm_set_tsc FAILED.\n");
 }
 
-void kvm_arch_save_mpstate(CPUState *env)
-{
-#ifdef KVM_CAP_MP_STATE
-    int r;
-    struct kvm_mp_state mp_state;
-
-    r = kvm_get_mpstate(env, &mp_state);
-    if (r < 0)
-        env->mp_state = -1;
-    else
-        env->mp_state = mp_state.mp_state;
-#else
-    env->mp_state = -1;
-#endif
-}
-
-void kvm_arch_load_mpstate(CPUState *env)
-{
-#ifdef KVM_CAP_MP_STATE
-    struct kvm_mp_state mp_state = { .mp_state = env->mp_state };
-
-    /*
-     * -1 indicates that the host did not support GET_MP_STATE ioctl,
-     *  so don't touch it.
-     */
-    if (env->mp_state != -1)
-        kvm_set_mpstate(env, &mp_state);
-#endif
-}
-
 void kvm_arch_save_regs(CPUState *env)
 {
 struct kvm_regs regs;
@@ -1366,15 +1382,9 @@ void kvm_arch_cpu_reset(CPUState *env)
 {
     kvm_arch_reset_vcpu(env);
     kvm_put_vcpu_events(env);
-    if (!cpu_is_bsp(env)) {
-        if (kvm_irqchip_in_kernel()) {
-#ifdef KVM_CAP_MP_STATE
-            kvm_reset_mpstate(env);
-#endif
-        } else {
-            env->interrupt_request &= ~CPU_INTERRUPT_HARD;
-            env->halted = 1;
-        }
+    if (!cpu_is_bsp(env) && !kvm_irqchip_in_kernel()) {
+        env->interrupt_request &= ~CPU_INTERRUPT_HARD;
+        env->halted = 1;
     }
 }
 
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 53030f1..efa6a29 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1579,36 +1579,6 @@ void kvm_update_interrupt_request(CPUState *env)
 }
 }
 
-static void kvm_do_load_mpstate(void *_env)
-{
-CPUState *env = _env;
-
-kvm_arch_load_mpstate(env);
-}
-
-void kvm_load_mpstate(CPUState *env)
-{
-if

[Qemu-devel] [PATCH 05/21] qemu-kvm: Use upstream kvm_enabled and cpu_synchronize_state

2010-02-02 Thread Jan Kiszka

They are identical, no need for private copies. This requires replacing
qemu-kvm.h includes with kvm.h, a good thing anyway, and reveals that
there is no need for QEMU_KVM_NO_CPU protection.
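
For readers unfamiliar with the pattern, here is a sketch of how one
shared header can serve both KVM and non-KVM builds, which is why a
private copy adds nothing. TOY_CONFIG_KVM and all toy_* names are
hypothetical stand-ins; the exact upstream definitions are presumably
similar but are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build switch standing in for the real config option. */
#ifdef TOY_CONFIG_KVM
extern int toy_kvm_allowed;
#define toy_kvm_enabled() (toy_kvm_allowed)
#else
#define toy_kvm_enabled() (0)   /* compiles the KVM path away */
#endif

struct toy_cpu {
    int sync_calls;   /* counts calls into the KVM-specific path */
};

/* KVM-specific implementation; a real one would fetch registers. */
void toy_kvm_cpu_synchronize_state(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    cpu->sync_calls++;
}

/* Generic hook used by device models: a no-op unless KVM is active. */
static inline void toy_cpu_synchronize_state(struct toy_cpu *cpu)
{
    if (toy_kvm_enabled()) {
        toy_kvm_cpu_synchronize_state(cpu);
    }
}
```

Callers only ever include the one header and call the generic hook, so
no per-file guard macro is needed to hide CPU-dependent declarations.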

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/i8254.c|2 +-
 hw/i8259.c|2 +-
 hw/ioapic.c   |2 +-
 hw/msix.c |3 +--
 hw/pc.c   |2 +-
 hw/piix_pci.c |2 +-
 kvm.h |7 +++
 qemu-kvm.h|   41 -
 vl.c  |3 ++-
 9 files changed, 11 insertions(+), 53 deletions(-)

diff --git a/hw/i8254.c b/hw/i8254.c
index c4f8d2e..db9e94a 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -25,7 +25,7 @@
 #include "pc.h"
 #include "isa.h"
 #include "qemu-timer.h"
-#include "qemu-kvm.h"
+#include "kvm.h"
 #include "i8254.h"
 
 //#define DEBUG_PIT
diff --git a/hw/i8259.c b/hw/i8259.c
index 7a484c0..b64c6fb 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -27,7 +27,7 @@
 #include "monitor.h"
 #include "qemu-timer.h"
 
-#include "qemu-kvm.h"
+#include "kvm.h"
 
 /* debug PIC */
 //#define DEBUG_PIC
diff --git a/hw/ioapic.c b/hw/ioapic.c
index a66325d..0adb0ac 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -26,7 +26,7 @@
 #include "qemu-timer.h"
 #include "host-utils.h"
 
-#include "qemu-kvm.h"
+#include "kvm.h"
 
 //#define DEBUG_IOAPIC
 
diff --git a/hw/msix.c b/hw/msix.c
index 87f125b..faee0b2 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -14,8 +14,7 @@
 #include "hw.h"
 #include "msix.h"
 #include "pci.h"
-#define QEMU_KVM_NO_CPU
-#include "qemu-kvm.h"
+#include "kvm.h"
 
 /* Declaration from linux/pci_regs.h */
 #define  PCI_CAP_ID_MSIX 0x11 /* MSI-X */
diff --git a/hw/pc.c b/hw/pc.c
index 97e16ce..dac373e 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -47,7 +47,7 @@
 #include "multiboot.h"
 #include "device-assignment.h"
 
-#include "qemu-kvm.h"
+#include "kvm.h"
 
 /* output Bochs bios info messages */
 //#define DEBUG_BIOS
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 155587b..170f858 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -28,7 +28,7 @@
 #include "pci_host.h"
 #include "isa.h"
 #include "sysbus.h"
-#include "qemu-kvm.h"
+#include "kvm.h"
 
 /*
  * I440FX chipset data sheet.
diff --git a/kvm.h b/kvm.h
index 9fa4e25..d0f4bbe 100644
--- a/kvm.h
+++ b/kvm.h
@@ -18,8 +18,6 @@
 #include "qemu-queue.h"
 #include "qemu-kvm.h"
 
-#ifdef KVM_UPSTREAM
-
 #ifdef CONFIG_KVM
 extern int kvm_allowed;
 
@@ -28,6 +26,7 @@ extern int kvm_allowed;
 #define kvm_enabled() (0)
 #endif
 
+#ifdef KVM_UPSTREAM
 struct kvm_run;
 
 /* external API */
@@ -138,6 +137,8 @@ int kvm_check_extension(KVMState *s, unsigned int extension);
 
 uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function,
                                       int reg);
+#endif
+
 void kvm_cpu_synchronize_state(CPUState *env);
 
 /* generic hooks - to be moved/refactored once there are more users */
@@ -150,5 +151,3 @@ static inline void cpu_synchronize_state(CPUState *env)
 }
 
 #endif
-
-#endif
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 1354227..d838bca 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -8,9 +8,7 @@
 #ifndef THE_ORIGINAL_AND_TRUE_QEMU_KVM_H
 #define THE_ORIGINAL_AND_TRUE_QEMU_KVM_H
 
-#ifndef QEMU_KVM_NO_CPU
 #include "cpu.h"
-#endif
 
 #include <signal.h>
 #include <stdlib.h>
@@ -94,8 +92,6 @@ void kvm_show_code(CPUState *env);
 
 int handle_halt(CPUState *env);
 
-#ifndef QEMU_KVM_NO_CPU
-
 int handle_shutdown(kvm_context_t kvm, CPUState *env);
 void post_kvm_run(kvm_context_t kvm, CPUState *env);
 int pre_kvm_run(kvm_context_t kvm, CPUState *env);
@@ -113,8 +109,6 @@ struct kvm_x86_mce;
 int kvm_set_mce(CPUState *env, struct kvm_x86_mce *mce);
 #endif
 
-#endif
-
 /*!
  * \brief Create new KVM context
  *
@@ -880,8 +874,6 @@ static inline int kvm_init(int smp_cpus)
 return 0;
 }
 
-#ifndef QEMU_KVM_NO_CPU
-
 static inline void kvm_inject_x86_mce(CPUState *cenv, int bank,
                                       uint64_t status, uint64_t mcg_status,
                                       uint64_t addr, uint64_t misc,
@@ -891,16 +883,11 @@ static inline void kvm_inject_x86_mce(CPUState *cenv, int bank,
 abort();
 }
 
-#endif
-
-extern int kvm_allowed;
-
 #endif  /* !CONFIG_KVM */
 
 
 int kvm_main_loop(void);
 int kvm_init_ap(void);
-#ifndef QEMU_KVM_NO_CPU
 int kvm_vcpu_inited(CPUState *env);
 void kvm_load_mpstate(CPUState *env);
 void kvm_save_mpstate(CPUState *env);
@@ -914,7 +901,6 @@ int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap);
 void kvm_apic_init(CPUState *env);
 /* called from vcpu initialization */
 void qemu_kvm_load_lapic(CPUState *env);
-#endif
 
 void kvm_hpet_enable_kpit(void);
 void kvm_hpet_disable_kpit(void);
@@ -923,13 +909,11 @@ int kvm_set_irq(int irq, int level, int *status);
 int kvm_physical_memory_set_dirty_tracking(int enable);
 int kvm_update_dirty_pages_log(void);
 
-#ifndef QEMU_KVM_NO_CPU
 void qemu_kvm_call_with_env(void (*func)(void *), void *data, CPUState *env);
 void qemu_kvm_cpuid_on_env(CPUState *env);
 void

[Qemu-devel] [PATCH 06/21] qemu-kvm: Use upstream kvm_setup_guest_memory

2010-02-02 Thread Jan Kiszka

Nothing is missing in upstream kvm_setup_guest_memory; it is even more
careful about error handling.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c  |3 ---
 kvm.h  |3 +--
 qemu-kvm.c |   15 ---
 qemu-kvm.h |1 -
 4 files changed, 1 insertions(+), 21 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 0423fff..e7fa605 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -886,7 +886,6 @@ int kvm_has_vcpu_events(void)
     return kvm_state->vcpu_events;
 }
 
-#ifdef KVM_UPSTREAM
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 if (!kvm_has_sync_mmu()) {
@@ -905,8 +904,6 @@ void kvm_setup_guest_memory(void *start, size_t size)
 }
 }
 
-#endif /* KVM_UPSTREAM */
-
 #ifdef KVM_CAP_SET_GUEST_DEBUG
 
 #ifdef KVM_UPSTREAM
diff --git a/kvm.h b/kvm.h
index d0f4bbe..05ee540 100644
--- a/kvm.h
+++ b/kvm.h
@@ -54,10 +54,9 @@ int kvm_has_vcpu_events(void);
 int kvm_put_vcpu_events(CPUState *env);
 int kvm_get_vcpu_events(CPUState *env);
 
-#ifdef KVM_UPSTREAM
-
 void kvm_setup_guest_memory(void *start, size_t size);
 
+#ifdef KVM_UPSTREAM
 int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 97c098c..76f056c 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -2321,21 +2321,6 @@ void kvm_set_phys_mem(target_phys_addr_t start_addr, ram_addr_t size,
     return;
 }
 
-int kvm_setup_guest_memory(void *area, unsigned long size)
-{
-    int ret = 0;
-
-#ifdef MADV_DONTFORK
-    if (kvm_enabled() && !kvm_has_sync_mmu())
-        ret = madvise(area, size, MADV_DONTFORK);
-#endif
-
-    if (ret)
-        perror("madvise");
-
-    return ret;
-}
-
 #ifdef KVM_CAP_SET_GUEST_DEBUG
 
 struct kvm_set_guest_debug_data {
diff --git a/qemu-kvm.h b/qemu-kvm.h
index d838bca..0664c1d 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -923,7 +923,6 @@ void kvm_cpu_destroy_phys_mem(target_phys_addr_t start_addr,
   unsigned long size);
 void kvm_qemu_log_memory(target_phys_addr_t start, target_phys_addr_t size,
  int log);
-int kvm_setup_guest_memory(void *area, unsigned long size);
 int kvm_qemu_create_memory_alias(uint64_t phys_start, uint64_t len,
  uint64_t target_phys);
 int kvm_qemu_destroy_memory_alias(uint64_t phys_start);
-- 
1.6.0.2

[Qemu-devel] [PATCH 21/21] qemu-kvm: Bring qemu_init_vcpu back home

2010-02-02 Thread Jan Kiszka

There is no need for this hack anymore: initialization is now robust
against reordering as it doesn't try to write the VCPU state on its own.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/pc.c  |5 -
 target-i386/helper.c |2 ++
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 3df6195..cd0746c 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -751,11 +751,6 @@ CPUState *pc_new_cpu(const char *cpu_model)
 } else {
 qemu_register_reset((QEMUResetHandler*)cpu_reset, env);
 }
-
-/* kvm needs this to run after the apic is initialized. Otherwise,
- * it can access invalid state and crash.
- */
-qemu_init_vcpu(env);
 return env;
 }
 
diff --git a/target-i386/helper.c b/target-i386/helper.c
index f9d63f6..f83e8cc 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -1953,6 +1953,8 @@ CPUX86State *cpu_x86_init(const char *cpu_model)
 }
 mce_init(env);
 
+qemu_init_vcpu(env);
+
 return env;
 }
 
-- 
1.6.0.2

[Qemu-devel] [PATCH 19/21] qemu-kvm: Clean up KVM's APIC hooks

2010-02-02 Thread Jan Kiszka

The APIC is part of the VCPU state, so trigger its readout and writeback
from kvm_arch_save/load_regs. Thanks to the transparent sync on reset
and vmsave/load, we can also drop explicit sync code, reducing the diff
to upstream.
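
A compact model of the resulting flow, assuming hypothetical toy_*
names and writeback levels: the APIC is read out together with the
other registers, and written back only when the requested level is at
least the reset level, apparently mirroring the level check visible in
the kvm_arch_load_regs hunk of this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical writeback levels, ordered from least to most complete. */
enum toy_put_level {
    TOY_PUT_RUNTIME_STATE = 1,
    TOY_PUT_RESET_STATE   = 2,
    TOY_PUT_FULL_STATE    = 3
};

struct toy_vcpu {
    int user_regs, kernel_regs;   /* general register state */
    int user_apic, kernel_apic;   /* APIC state, part of the VCPU state */
};

/* Readout (kernel -> user space) always includes the APIC. */
void toy_save_regs(struct toy_vcpu *v)
{
    assert(v != NULL);
    v->user_regs = v->kernel_regs;
    v->user_apic = v->kernel_apic;
}

/* Writeback (user space -> kernel) includes the APIC only on reset
 * or full-state levels, not on every guest entry. */
void toy_load_regs(struct toy_vcpu *v, enum toy_put_level level)
{
    assert(v != NULL);
    v->kernel_regs = v->user_regs;
    if (level >= TOY_PUT_RESET_STATE)
        v->kernel_apic = v->user_apic;
}
```

Because reset and state load already trigger such a writeback, separate
hooks that push the APIC state on their own become redundant.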

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/apic.c  |   37 +
 qemu-kvm-x86.c |4 ++--
 qemu-kvm.h |5 ++---
 3 files changed, 9 insertions(+), 37 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 092c61e..d8c4f7c 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -24,8 +24,6 @@
 #include "host-utils.h"
 #include "kvm.h"
 
-#include "qemu-kvm.h"
-
 //#define DEBUG_APIC
 
 /* APIC Local Vector Table */
@@ -951,36 +949,22 @@ static void kvm_kernel_lapic_load_from_user(APICState *s)
 
 #endif
 
-void qemu_kvm_load_lapic(CPUState *env)
+void kvm_load_lapic(CPUState *env)
 {
 #ifdef KVM_CAP_IRQCHIP
-    if (kvm_enabled() && kvm_vcpu_inited(env) && kvm_irqchip_in_kernel()) {
-        kvm_kernel_lapic_load_from_user(env->apic_state);
-    }
-#endif
-}
-
-static void apic_pre_save(void *opaque)
-{
-#ifdef KVM_CAP_IRQCHIP
-    APICState *s = (void *)opaque;
-
     if (kvm_enabled() && kvm_irqchip_in_kernel()) {
-        kvm_kernel_lapic_save_to_user(s);
+        kvm_kernel_lapic_load_from_user(env->apic_state);
     }
 #endif
 }
 
-static int apic_post_load(void *opaque, int version_id)
+void kvm_save_lapic(CPUState *env)
 {
 #ifdef KVM_CAP_IRQCHIP
-    APICState *s = opaque;
-
     if (kvm_enabled() && kvm_irqchip_in_kernel()) {
-        kvm_kernel_lapic_load_from_user(s);
+        kvm_kernel_lapic_save_to_user(env->apic_state);
     }
 #endif
-    return 0;
 }
 
 /* This function is only used for old state version 1 and 2 */
@@ -1019,9 +1003,6 @@ static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
 
     if (version_id >= 2)
         qemu_get_timer(f, s->timer);
-
-    qemu_kvm_load_lapic(s->cpu_env);
-
     return 0;
 }
 
@@ -1052,9 +1033,7 @@ static const VMStateDescription vmstate_apic = {
 VMSTATE_INT64(next_time, APICState),
 VMSTATE_TIMER(timer, APICState),
 VMSTATE_END_OF_LIST()
-},
-.pre_save = apic_pre_save,
-.post_load = apic_post_load,
+}
 };
 
 static void apic_reset(void *opaque)
@@ -1077,7 +1056,6 @@ static void apic_reset(void *opaque)
  */
         s->lvt[APIC_LVT_LINT0] = 0x700;
     }
-    qemu_kvm_load_lapic(s->cpu_env);
 }
 
 static CPUReadMemoryFunc * const apic_mem_read[3] = {
@@ -1121,11 +1099,6 @@ int apic_init(CPUState *env)
     vmstate_register(s->idx, &vmstate_apic, s);
     qemu_register_reset(apic_reset, s);
 
-    /* apic_reset must be called before the vcpu threads are initialized and load
-     * registers, in qemu-kvm.
-     */
-    apic_reset(s);
-
     local_apics[s->idx] = s;
     return 0;
 }
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 4b78570..9de018e 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -974,6 +974,7 @@ void kvm_arch_load_regs(CPUState *env, int level)
 
     if (level >= KVM_PUT_RESET_STATE) {
         kvm_arch_load_mpstate(env);
+        kvm_load_lapic(env);
     }
 
     kvm_put_vcpu_events(env, level);
@@ -1134,6 +1135,7 @@ void kvm_arch_save_regs(CPUState *env)
 }
 }
 kvm_arch_save_mpstate(env);
+kvm_save_lapic(env);
 kvm_get_vcpu_events(env);
 }
 
@@ -1207,8 +1209,6 @@ int kvm_arch_init_vcpu(CPUState *cenv)
 CPUState copy;
 uint32_t i, j, limit;
 
-qemu_kvm_load_lapic(cenv);
-
 kvm_arch_reset_vcpu(cenv);
 
 #ifdef KVM_CPUID_SIGNATURE
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 2af206c..fea23a4 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -864,9 +864,8 @@ static inline void kvm_inject_x86_mce(CPUState *cenv, int bank,
 int kvm_main_loop(void);
 int kvm_init_ap(void);
 int kvm_vcpu_inited(CPUState *env);
-void kvm_apic_init(CPUState *env);
-/* called from vcpu initialization */
-void qemu_kvm_load_lapic(CPUState *env);
+void kvm_save_lapic(CPUState *env);
+void kvm_load_lapic(CPUState *env);
 
 void kvm_hpet_enable_kpit(void);
 void kvm_hpet_disable_kpit(void);
-- 
1.6.0.2

[Qemu-devel] [PATCH 08/21] qemu-kvm: Use upstream kvm_arch_get_supported_cpuid

2010-02-02 Thread Jan Kiszka

It is identical to our version now, so drop the copy.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm.h |3 -
 qemu-kvm-x86.c|  106 -
 qemu-kvm.h|5 --
 target-i386/kvm.c |4 +-
 4 files changed, 2 insertions(+), 116 deletions(-)

diff --git a/kvm.h b/kvm.h
index b5ed744..189a5d4 100644
--- a/kvm.h
+++ b/kvm.h
@@ -137,11 +137,8 @@ void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg);
 
 int kvm_check_extension(KVMState *s, unsigned int extension);
 
-#ifdef KVM_UPSTREAM
 uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function,
   int reg);
-#endif
-
 void kvm_cpu_synchronize_state(CPUState *env);
 
 /* generic hooks - to be moved/refactored once there are more users */
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 7f820a4..0457a6e 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -627,106 +627,6 @@ int kvm_disable_tpr_access_reporting(CPUState *env)
 
 #endif
 
-#ifdef KVM_CAP_EXT_CPUID
-
-static struct kvm_cpuid2 *try_get_cpuid(kvm_context_t kvm, int max)
-{
-    struct kvm_cpuid2 *cpuid;
-    int r, size;
-
-    size = sizeof(*cpuid) + max * sizeof(*cpuid->entries);
-    cpuid = qemu_malloc(size);
-    cpuid->nent = max;
-    r = kvm_ioctl(kvm_state, KVM_GET_SUPPORTED_CPUID, cpuid);
-    if (r == 0 && cpuid->nent >= max)
-        r = -E2BIG;
-    if (r < 0) {
-        if (r == -E2BIG) {
-            free(cpuid);
-            return NULL;
-        } else {
-            fprintf(stderr, "KVM_GET_SUPPORTED_CPUID failed: %s\n",
-                    strerror(-r));
-            exit(1);
-        }
-    }
-    return cpuid;
-}
-
-#define R_EAX 0
-#define R_ECX 1
-#define R_EDX 2
-#define R_EBX 3
-#define R_ESP 4
-#define R_EBP 5
-#define R_ESI 6
-#define R_EDI 7
-
-uint32_t kvm_get_supported_cpuid(kvm_context_t kvm, uint32_t function, int reg)
-{
-    struct kvm_cpuid2 *cpuid;
-    int i, max;
-    uint32_t ret = 0;
-    uint32_t cpuid_1_edx;
-
-    if (!kvm_check_extension(kvm_state, KVM_CAP_EXT_CPUID)) {
-        return -1U;
-    }
-
-    max = 1;
-    while ((cpuid = try_get_cpuid(kvm, max)) == NULL) {
-        max *= 2;
-    }
-
-    for (i = 0; i < cpuid->nent; ++i) {
-        if (cpuid->entries[i].function == function) {
-            switch (reg) {
-            case R_EAX:
-                ret = cpuid->entries[i].eax;
-                break;
-            case R_EBX:
-                ret = cpuid->entries[i].ebx;
-                break;
-            case R_ECX:
-                ret = cpuid->entries[i].ecx;
-                break;
-            case R_EDX:
-                ret = cpuid->entries[i].edx;
-                if (function == 1) {
-                    /* kvm misreports the following features
-                     */
-                    ret |= 1 << 12; /* MTRR */
-                    ret |= 1 << 16; /* PAT */
-                    ret |= 1 << 7;  /* MCE */
-                    ret |= 1 << 14; /* MCA */
-                }
-
-                /* On Intel, kvm returns cpuid according to
-                 * the Intel spec, so add missing bits
-                 * according to the AMD spec:
-                 */
-                if (function == 0x80000001) {
-                    cpuid_1_edx = kvm_get_supported_cpuid(kvm, 1, R_EDX);
-                    ret |= cpuid_1_edx & 0xdfeff7ff;
-                }
-                break;
-            }
-        }
-    }
-
-    free(cpuid);
-
-    return ret;
-}
-
-#else
-
-uint32_t kvm_get_supported_cpuid(kvm_context_t kvm, uint32_t function, int reg)
-{
-   return -1U;
-}
-
-#endif
 int kvm_qemu_create_memory_alias(uint64_t phys_start,
  uint64_t len,
  uint64_t target_phys)
@@ -1686,12 +1586,6 @@ int kvm_arch_init_irq_routing(void)
 return 0;
 }
 
-uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function,
-  int reg)
-{
-return kvm_get_supported_cpuid(kvm_context, function, reg);
-}
-
 void kvm_arch_process_irqchip_events(CPUState *env)
 {
     if (env->interrupt_request & CPU_INTERRUPT_INIT) {
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 150017d..7b75fdd 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -859,8 +859,6 @@ int kvm_assign_set_msix_entry(kvm_context_t kvm,

[Qemu-devel] [PATCH 07/21] qemu-kvm: Use some more upstream prototypes

2010-02-02 Thread Jan Kiszka

Drop our private typedef of KVMState and use more identical upstream
prototypes.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm.h  |   10 +++---
 qemu-kvm.c |4 +++-
 qemu-kvm.h |   24 +++-
 3 files changed, 13 insertions(+), 25 deletions(-)

diff --git a/kvm.h b/kvm.h
index 05ee540..b5ed744 100644
--- a/kvm.h
+++ b/kvm.h
@@ -32,11 +32,13 @@ struct kvm_run;
 /* external API */
 
 int kvm_init(int smp_cpus);
+#endif /* KVM_UPSTREAM */
 
 int kvm_init_vcpu(CPUState *env);
 
 int kvm_cpu_exec(CPUState *env);
 
+#ifdef KVM_UPSTREAM
 void kvm_set_phys_mem(target_phys_addr_t start_addr,
   ram_addr_t size,
   ram_addr_t phys_offset);
@@ -47,19 +49,19 @@ int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
 int kvm_log_start(target_phys_addr_t phys_addr, ram_addr_t size);
 int kvm_log_stop(target_phys_addr_t phys_addr, ram_addr_t size);
 int kvm_set_migration_log(int enable);
+#endif /* KVM_UPSTREAM */
 
 int kvm_has_sync_mmu(void);
-#endif /* KVM_UPSTREAM */
 int kvm_has_vcpu_events(void);
 int kvm_put_vcpu_events(CPUState *env);
 int kvm_get_vcpu_events(CPUState *env);
 
 void kvm_setup_guest_memory(void *start, size_t size);
 
-#ifdef KVM_UPSTREAM
 int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 
+#ifdef KVM_UPSTREAM
 int kvm_insert_breakpoint(CPUState *current_env, target_ulong addr,
   target_ulong len, int type);
 int kvm_remove_breakpoint(CPUState *current_env, target_ulong addr,
@@ -69,6 +71,7 @@ int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap);
 
 int kvm_pit_in_kernel(void);
 int kvm_irqchip_in_kernel(void);
+#endif /* KVM_UPSTREAM */
 
 /* internal API */
 
@@ -97,7 +100,6 @@ int kvm_arch_init(KVMState *s, int smp_cpus);
 
 int kvm_arch_init_vcpu(CPUState *env);
 
-#endif
 void kvm_arch_reset_vcpu(CPUState *env);
 #ifdef KVM_UPSTREAM
 
@@ -131,9 +133,11 @@ int kvm_arch_remove_hw_breakpoint(target_ulong addr,
 void kvm_arch_remove_all_hw_breakpoints(void);
 
 void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg);
+#endif
 
 int kvm_check_extension(KVMState *s, unsigned int extension);
 
+#ifdef KVM_UPSTREAM
 uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function,
   int reg);
 #endif
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 76f056c..12442a7 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1909,12 +1909,14 @@ static void *ap_main_loop(void *_env)
 return NULL;
 }
 
-void kvm_init_vcpu(CPUState *env)
+int kvm_init_vcpu(CPUState *env)
 {
     pthread_create(&env->kvm_cpu_state.thread, NULL, ap_main_loop, env);
 
     while (env->created == 0)
         qemu_cond_wait(&qemu_vcpu_cond);
+
+    return 0;
 }
 
 int kvm_vcpu_inited(CPUState *env)
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 0664c1d..150017d 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -891,7 +891,6 @@ int kvm_init_ap(void);
 int kvm_vcpu_inited(CPUState *env);
 void kvm_load_mpstate(CPUState *env);
 void kvm_save_mpstate(CPUState *env);
-int kvm_cpu_exec(CPUState *env);
 int kvm_insert_breakpoint(CPUState * current_env, target_ulong addr,
   target_ulong len, int type);
 int kvm_remove_breakpoint(CPUState * current_env, target_ulong addr,
@@ -933,9 +932,6 @@ void kvm_arch_save_regs(CPUState *env);
 void kvm_arch_load_regs(CPUState *env);
 void kvm_arch_load_mpstate(CPUState *env);
 void kvm_arch_save_mpstate(CPUState *env);
-int kvm_arch_init_vcpu(CPUState *cenv);
-int kvm_arch_pre_run(CPUState *env, struct kvm_run *run);
-int kvm_arch_post_run(CPUState *env, struct kvm_run *run);
 int kvm_arch_has_work(CPUState *env);
 void kvm_arch_process_irqchip_events(CPUState *env);
 int kvm_arch_try_push_interrupts(void *opaque);
@@ -981,8 +977,6 @@ void kvm_tpr_access_report(CPUState *env, uint64_t rip, int is_write);
 void kvm_tpr_vcpu_start(CPUState *env);
 
 int qemu_kvm_get_dirty_pages(unsigned long phys_addr, void *buf);
-int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
-int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 
 int kvm_arch_init_irq_routing(void);
 
@@ -1021,17 +1015,14 @@ void qemu_kvm_cpu_stop(CPUState *env);
 int kvm_arch_halt(CPUState *env);
 int handle_tpr_access(void *opaque, CPUState *env, uint64_t rip,
   int is_write);
-int kvm_has_sync_mmu(void);
 
 #define qemu_kvm_pit_in_kernel() kvm_pit_in_kernel(kvm_context)
 #define qemu_kvm_has_gsi_routing() kvm_has_gsi_routing(kvm_context)
 #ifdef TARGET_I386
 #define qemu_kvm_has_pit_state2() kvm_has_pit_state2(kvm_context)
 #endif
-void kvm_init_vcpu(CPUState *env);
 void kvm_load_tsc(CPUState *env);
 #else
-#define kvm_has_sync_mmu() (0)
 #define kvm_nested 0
 #define qemu_kvm_pit_in_kernel() (0)
 #define qemu_kvm_has_gsi_routing() (0)
@@ -1040,10 +1031,6 @@ void

[Qemu-devel] [PATCH 13/21] qemu-kvm: Use upstream guest debug code

2010-02-02 Thread Jan Kiszka

The code was absolutely identical except for a previous cleanup in upstream.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c |7 +-
 kvm.h |4 -
 qemu-kvm-x86.c|  178 ++--
 qemu-kvm.c|   44 -
 qemu-kvm.h|   37 ---
 target-i386/kvm.c |2 +-
 6 files changed, 11 insertions(+), 261 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 9c921cc..f3cfa2c 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -919,7 +919,9 @@ static void on_vcpu(CPUState *env, void (*func)(void 
*data), void *data)
 func(data);
 #endif
 }
-#endif /* KVM_UPSTREAM */
+#else /* !KVM_UPSTREAM */
+static void on_vcpu(CPUState *env, void (*func)(void *data), void *data);
+#endif /* !KVM_UPSTREAM */
 
 struct kvm_sw_breakpoint *kvm_find_sw_breakpoint(CPUState *env,
  target_ulong pc)
@@ -938,8 +940,6 @@ int kvm_sw_breakpoints_active(CPUState *env)
 return !QTAILQ_EMPTY(&env->kvm_state->kvm_sw_breakpoints);
 }
 
-#ifdef KVM_UPSTREAM
-
 struct kvm_set_guest_debug_data {
 struct kvm_guest_debug dbg;
 CPUState *env;
@@ -969,7 +969,6 @@ int kvm_update_guest_debug(CPUState *env, unsigned long 
reinject_trap)
 on_vcpu(env, kvm_invoke_set_guest_debug, data);
 return data.err;
 }
-#endif
 
 int kvm_insert_breakpoint(CPUState *current_env, target_ulong addr,
   target_ulong len, int type)
diff --git a/kvm.h b/kvm.h
index 253b45d..740fd1a 100644
--- a/kvm.h
+++ b/kvm.h
@@ -61,14 +61,12 @@ void kvm_setup_guest_memory(void *start, size_t size);
 int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 
-#ifdef KVM_UPSTREAM
 int kvm_insert_breakpoint(CPUState *current_env, target_ulong addr,
   target_ulong len, int type);
 int kvm_remove_breakpoint(CPUState *current_env, target_ulong addr,
   target_ulong len, int type);
 void kvm_remove_all_breakpoints(CPUState *current_env);
 int kvm_update_guest_debug(CPUState *env, unsigned long reinject_trap);
-#endif /* KVM_UPSTREAM */
 
 int kvm_pit_in_kernel(void);
 int kvm_irqchip_in_kernel(void);
@@ -101,7 +99,6 @@ int kvm_arch_init(KVMState *s, int smp_cpus);
 int kvm_arch_init_vcpu(CPUState *env);
 
 void kvm_arch_reset_vcpu(CPUState *env);
-#ifdef KVM_UPSTREAM
 
 struct kvm_guest_debug;
 struct kvm_debug_exit_arch;
@@ -133,7 +130,6 @@ int kvm_arch_remove_hw_breakpoint(target_ulong addr,
 void kvm_arch_remove_all_hw_breakpoints(void);
 
 void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg);
-#endif
 
 int kvm_check_extension(KVMState *s, unsigned int extension);
 
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 074b510..834e9c1 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -835,6 +835,13 @@ void kvm_arch_load_regs(CPUState *env)
 
 kvm_set_regs(env, regs);
 
+/*
+ * Kernels before 2.6.33 overwrote flags.TF injected via SET_GUEST_DEBUG
+ * while updating GP regs. Work around this by updating the debug state
+ * once again.
+ */
+kvm_vcpu_ioctl(env, KVM_SET_GUEST_DEBUG, &env->kvm_guest_debug);
+
 memset(&fpu, 0, sizeof fpu);
 fpu.fsw = env->fpus & ~(7 << 11);
 fpu.fsw |= (env->fpstt & 7) << 11;
@@ -1372,177 +1379,6 @@ void kvm_arch_cpu_reset(CPUState *env)
 }
 }
 
-int kvm_arch_insert_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint *bp)
-{
-uint8_t int3 = 0xcc;
-
-if (cpu_memory_rw_debug(env, bp->pc, (uint8_t *)&bp->saved_insn, 1, 0) ||
-cpu_memory_rw_debug(env, bp->pc, &int3, 1, 1))
-return -EINVAL;
-return 0;
-}
-
-int kvm_arch_remove_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint *bp)
-{
-uint8_t int3;
-
-if (cpu_memory_rw_debug(env, bp->pc, &int3, 1, 0) || int3 != 0xcc ||
-cpu_memory_rw_debug(env, bp->pc, (uint8_t *)&bp->saved_insn, 1, 1))
-return -EINVAL;
-return 0;
-}
-
-#ifdef KVM_CAP_SET_GUEST_DEBUG
-static struct {
-target_ulong addr;
-int len;
-int type;
-} hw_breakpoint[4];
-
-static int nb_hw_breakpoint;
-
-static int find_hw_breakpoint(target_ulong addr, int len, int type)
-{
-int n;
-
-for (n = 0; n < nb_hw_breakpoint; n++)
-   if (hw_breakpoint[n].addr == addr && hw_breakpoint[n].type == type &&
-   (hw_breakpoint[n].len == len || len == -1))
-   return n;
-return -1;
-}
-
-int kvm_arch_insert_hw_breakpoint(target_ulong addr,
-  target_ulong len, int type)
-{
-switch (type) {
-case GDB_BREAKPOINT_HW:
-   len = 1;
-   break;
-case GDB_WATCHPOINT_WRITE:
-case GDB_WATCHPOINT_ACCESS:
-   switch (len) {
-   case 1:
-   break;
-   case 2:
-   case 4:
-   case 8:
-   if (addr & (len - 1))
-   return -EINVAL;
-   break;
-   default:
-   return -EINVAL;
-   }
-

[Qemu-devel] [PATCH 14/21] qemu-kvm: Rework VCPU state writeback API

2010-02-02 Thread Jan Kiszka

This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:

- cpu_synchronize_all_states in qemu_savevm_state_complete
  (initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
  (writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
  (writeback after system reset)

These writeback points, plus the existing one of VCPU exec after
cpu_synchronize_state, map onto three levels of writeback:

- KVM_PUT_ASYNC_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE  (on init or vmload, all VCPUs stopped as well)

This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.

cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.

Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs. It does
not touch qemu-kvm's special hooks for mpstate, vcpu_events, or tsc
loading. They will be cleaned up by individual patches.
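
To make the level idea concrete, here is a minimal sketch of how an arch-specific writer could pick substates from the level it is handed. Only the three KVM_PUT_* names come from the description above; their numeric values, the SUBSTATE_* groups and substates_for_level() are assumptions made up for illustration, as is the idea that each level includes everything the lower one writes.

```c
#include <assert.h>

/* Writeback levels named in the patch description; the values are assumed. */
enum {
    KVM_PUT_ASYNC_STATE = 1,  /* runtime, other VCPUs continue to run */
    KVM_PUT_RESET_STATE = 2,  /* system reset, all VCPUs stopped */
    KVM_PUT_FULL_STATE  = 3   /* init or vmload, all VCPUs stopped */
};

/* Hypothetical substate groups, not taken from the patch. */
enum {
    SUBSTATE_RUNTIME = 1 << 0,  /* state that may change while running */
    SUBSTATE_RESET   = 1 << 1,  /* state only rewritten on reset */
    SUBSTATE_FULL    = 1 << 2   /* state only set at init or vmload */
};

/* Return the set of substates to write back for the given level. */
static unsigned substates_for_level(int level)
{
    unsigned mask = SUBSTATE_RUNTIME;

    assert(level >= KVM_PUT_ASYNC_STATE && level <= KVM_PUT_FULL_STATE);
    if (level >= KVM_PUT_RESET_STATE) {
        mask |= SUBSTATE_RESET;
    }
    if (level >= KVM_PUT_FULL_STATE) {
        mask |= SUBSTATE_FULL;
    }
    return mask;
}
```

With such a helper the callers only name the situation they are in (runtime, reset, full), and the decision about what to write stays in one place.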

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 exec.c|   17 -
 hw/apic.c |3 ---
 hw/pc.c   |1 -
 hw/ppc_newworld.c |3 ---
 hw/ppc_oldworld.c |3 ---
 hw/s390-virtio.c  |1 -
 kvm-all.c |   19 +--
 kvm.h |   22 +-
 qemu-kvm-ia64.c   |2 +-
 qemu-kvm-x86.c|3 +--
 qemu-kvm.c|   16 +---
 qemu-kvm.h|2 +-
 savevm.c  |4 
 sysemu.h  |4 
 target-i386/kvm.c |2 +-
 target-i386/machine.c |   10 --
 target-ia64/machine.c |2 --
 target-ppc/kvm.c  |2 +-
 target-ppc/machine.c  |4 
 target-s390x/kvm.c|3 +--
 vl.c  |   29 +
 21 files changed, 90 insertions(+), 62 deletions(-)

diff --git a/exec.c b/exec.c
index ade09cb..7b35e0f 100644
--- a/exec.c
+++ b/exec.c
@@ -529,21 +529,6 @@ void cpu_exec_init_all(unsigned long tb_size)
 
 #if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY)
 
-static void cpu_common_pre_save(void *opaque)
-{
-CPUState *env = opaque;
-
-cpu_synchronize_state(env);
-}
-
-static int cpu_common_pre_load(void *opaque)
-{
-CPUState *env = opaque;
-
-cpu_synchronize_state(env);
-return 0;
-}
-
 static int cpu_common_post_load(void *opaque, int version_id)
 {
 CPUState *env = opaque;
@@ -561,8 +546,6 @@ static const VMStateDescription vmstate_cpu_common = {
 .version_id = 1,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
-.pre_save = cpu_common_pre_save,
-.pre_load = cpu_common_pre_load,
 .post_load = cpu_common_post_load,
 .fields  = (VMStateField []) {
 VMSTATE_UINT32(halted, CPUState),
diff --git a/hw/apic.c b/hw/apic.c
index ae805dc..3e03e10 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -488,7 +488,6 @@ void apic_init_reset(CPUState *env)
 if (!s)
 return;
 
-cpu_synchronize_state(env);
 s->tpr = 0;
 s->spurious_vec = 0xff;
 s->log_dest = 0;
@@ -1070,8 +1069,6 @@ static void apic_reset(void *opaque)
 APICState *s = opaque;
 int bsp;
 
-cpu_synchronize_state(s->cpu_env);
-
 bsp = cpu_is_bsp(s->cpu_env);
 s->apicbase = 0xfee00000 |
 (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
diff --git a/hw/pc.c b/hw/pc.c
index af6ea8b..6c15a9f 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -744,7 +744,6 @@ CPUState *pc_new_cpu(const char *cpu_model)
 fprintf(stderr, "Unable to find x86 CPU definition\n");
 exit(1);
 }
-env->kvm_vcpu_dirty = 1;
 if ((env->cpuid_features & CPUID_APIC) || smp_cpus > 1) {
 env->cpuid_apic_id = env->cpu_index;
 /* APIC reset callback resets cpu */
diff --git a/hw/ppc_newworld.c b/hw/ppc_newworld.c
index a4c714a..9e288bd 100644
--- a/hw/ppc_newworld.c
+++ b/hw/ppc_newworld.c
@@ -139,9 +139,6 @@ static void ppc_core99_init (ram_addr_t ram_size,
 envs[i] = env;
 }
 
-/* Make sure all register sets take effect */
-cpu_synchronize_state(env);
-
 /* allocate RAM */
 ram_offset = qemu_ram_alloc(ram_size);
 cpu_register_physical_memory(0, ram_size, ram_offset);
diff --git a/hw/ppc_oldworld.c b/hw/ppc_oldworld.c
index 7ccc6a1..1aa05ed 100644
--- a/hw/ppc_oldworld.c
+++ b/hw/ppc_oldworld.c
@@ -164,9

[Qemu-devel] Re: [RFC 0/2]: QMP DISK_ERROR event

2010-02-02 Thread Kevin Wolf

Hi Luiz,

Am 01.02.2010 19:07, schrieb Luiz Capitulino:
  Hi there,
 
  I've been requested by libvirt guys to add a QMP event for disk I/O errors,
 this is what this series is about.
 
  It's a RFC because I need feedback on the following:
 
 1. drive_get_on_error() is called on all disk errors, right?

Well, yes, it is for all devices that support rerror/werror. But it also
might be called in other situations. Look at the "get" in the function
name: it's really a getter function and not an event handler.

 2. I've tested only ENOSPC errors, is there a way to test other errors? Like
 read ones?

So you'll probably want some EIO. Some recent bugs I've been handling
were about images on NFS when the NFS server went away. It's a
reliable way to get EIO (mount with -osoft and small timeouts). I guess
qemu-nbd and the nbd: protocol might work, too.

Or maybe copy the start of a qcow2 image to a too small device.

 3. Is this the right approach at all? :)

Yes and no. As I said above, drive_get_on_error() is not the right place
to do it. Unfortunately, it looks like there isn't a single generic place
where it can be done; the call to the event handler must be added to
every device.

Kevin

[Qemu-devel] Re: [PATCH 2/2] QMP: Introduce DISK_ERROR event

2010-02-02 Thread Kevin Wolf

Am 01.02.2010 19:07, schrieb Luiz Capitulino:
 It's emitted when a disk write or read fails, some device information
 is provided. We can also provide error details in the future.
 
 Example:
 
 { "event": "DISK_ERROR",
   "data": { "device": "ide0-hd1",
             "operation": "write",
             "action": "stop" },
   "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
 
 NOTE: Adding a small reference in QMP/qmp-events.txt, but this file is
 wrong and will be replaced by proper documentation shortly.
 
 Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
 ---
  QMP/qmp-events.txt |7 +++
  monitor.c  |3 +++
  monitor.h  |1 +
  vl.c   |   34 +-
  4 files changed, 44 insertions(+), 1 deletions(-)
 
 diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
 index dc48ccc..e968ef5 100644
 --- a/QMP/qmp-events.txt
 +++ b/QMP/qmp-events.txt
 @@ -43,3 +43,10 @@ Data: 'server' and 'client' keys with the same keys as 
 'query-vnc'.
  
  Description: Issued when the VNC session is made active.
  Data: 'server' and 'client' keys with the same keys as 'query-vnc'.
 +
 +7 DISK_ERROR
 +
 +
 +Description: Issued when a disk I/O error occurs
 +Data: 'device' (device name), 'action' (action to be taken),
 +  'operation' (read or write)
 diff --git a/monitor.c b/monitor.c
 index fb7c572..82edd79 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -378,6 +378,9 @@ void monitor_protocol_event(MonitorEvent event, QObject 
 *data)
  case QEVENT_VNC_DISCONNECTED:
  event_name = "VNC_DISCONNECTED";
  break;
 +case QEVENT_DISK_ERROR:
 +event_name = "DISK_ERROR";
 +break;
  default:
  abort();
  break;
 diff --git a/monitor.h b/monitor.h
 index b0f9270..beaddaf 100644
 --- a/monitor.h
 +++ b/monitor.h
 @@ -23,6 +23,7 @@ typedef enum MonitorEvent {
  QEVENT_VNC_CONNECTED,
  QEVENT_VNC_INITIALIZED,
  QEVENT_VNC_DISCONNECTED,
 +QEVENT_DISK_ERROR,
  QEVENT_MAX,
  } MonitorEvent;
  
 diff --git a/vl.c b/vl.c
 index 57c439d..1f69f56 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -1856,10 +1856,42 @@ static BlockInterfaceErrorAction drive_get_err_action(
  return is_read ? BLOCK_ERR_REPORT : BLOCK_ERR_STOP_ENOSPC;
  }
  
 +static void driver_err_event(
 +BlockInterfaceErrorAction action, int is_read, const char *device)
 +{
 +QObject *data;
 +const char *action_str;
 +
 +switch (action) {
 +case BLOCK_ERR_REPORT:
 +action_str = "report";
 +break;
 +case BLOCK_ERR_IGNORE:
 +action_str = "ignore";
 +break;
 +case BLOCK_ERR_STOP_ANY:
 +case BLOCK_ERR_STOP_ENOSPC:
 +action_str = "stop";

This is wrong. If it's BLOCK_ERR_STOP_ENOSPC, the action taken depends
on the error code. It might as well be a "report" instead of "stop" if
it was an EIO, for example.

But the problem is probably going to go away when you stop abusing a
getter function and add some calls that are explicitly made for your
requirements.
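
To illustrate the point about the error code, a minimal sketch of a mapping that takes both the configured policy and the actual errno into account. The ErrPolicy enum and effective_action() are local stand-ins invented for this example, not the names or the code in vl.c.

```c
#include <assert.h>
#include <errno.h>

/* Local stand-ins for the configured error policies (hypothetical names). */
typedef enum {
    ERR_POLICY_REPORT,
    ERR_POLICY_IGNORE,
    ERR_POLICY_STOP_ENOSPC,
    ERR_POLICY_STOP_ANY
} ErrPolicy;

/* Map the policy plus the errno of the failed request to the action taken. */
static const char *effective_action(ErrPolicy policy, int error)
{
    assert(error > 0);
    switch (policy) {
    case ERR_POLICY_IGNORE:
        return "ignore";
    case ERR_POLICY_STOP_ANY:
        return "stop";
    case ERR_POLICY_STOP_ENOSPC:
        /* Only ENOSPC stops the VM; anything else is reported to the guest. */
        return error == ENOSPC ? "stop" : "report";
    case ERR_POLICY_REPORT:
    default:
        return "report";
    }
}
```

The event would then carry the result of such a function, computed at the place where the error is actually handled, rather than a string derived from the policy alone.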

Kevin

[Qemu-devel] Re: KVM call agenda for Feb 2

2010-02-02 Thread Jan Kiszka

Chris Wright wrote:
 Please send in any agenda items you are interested in covering.

[not sure, though, if I'll manage to join due to an overlapping meeting]

- state of in-kernel APIC/IOAPIC/PIT upstream merge
- road map to get rid of qemu-kvm's slot management
  (IMHO: qemu-kvm-0.13)
- any further ongoing/planned upstream merge efforts?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH 00/13] i386 cpuid: cleanup and fixes

2010-02-02 Thread Andre Przywara

Hi,

first: I know that this conflicts with John Cooper's latest patch, but I want
to send this out for review and to help merging the stuff.

This patchset cleans up the CPUID handling code in QEMU. The biggest change
is obviously the move of the CPUID function to a separate file (cpuid.c).
This helps to split up a rather large source file, whose name (helper.c) is
also a bit misleading.
Please tell me soon if you don't like it so that I can rebase the rest of
the patches.
Additionally, the rest of the patches beautify or simplify some code.
Feature additions are:
 5/13: add missing CPUID feature bit names
 6/13: list CPUID feature bit names when using -cpu ?
 9/13: -cpu host propagates more CPUID leafs, so that the cache topology
   will be visible in the guest
10/13: add CPUID feature bit trimming for TCG: Features not supported by
   the emulator will be masked out.
11/13: always show all CPU types: also expose the newer (64bit) CPU types
   for the i386 emulator. 64bit features will be masked out due to 10/13.
12/13: add kvm32 CPU model: Per popular request add a counterpart to kvm64
   describing a basic hardware virtualization capable CPU for migration
   purposes.

More details in the commit messages.

Note: In contrast to the last version, I left out patches which change
the CPUID bits of existing CPU models to avoid regressions with guests.

Please review and comment.

Regards,
Andre.

[Qemu-devel] [PATCH 02/13] cpuid: replace magic number with named constant

2010-02-02 Thread Andre Przywara

CPUID leaf Fn8000_0001.EDX contains a copy of many Fn0000_0001.EDX bits.
Define a name for the mask to improve readability and avoid typos.
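
As a quick illustration of what the mask selects, here is a small stand-alone sketch. The mask value is the one from the patch; the helper ext2_shared_bits() is a name made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* EDX bits that the extended leaf mirrors from the standard leaf. */
#define EXT2_FEATURE_MASK 0x0183F3FFu

/* Keep only the standard EDX bits that are also valid in the extended leaf. */
static uint32_t ext2_shared_bits(uint32_t std_edx)
{
    /* Bit 11 is SEP in the standard leaf but SYSCALL in the extended
     * one, so it must never be copied over. */
    assert((EXT2_FEATURE_MASK & (1u << 11)) == 0);
    return std_edx & EXT2_FEATURE_MASK;
}
```

The mask covers bits 0-9, 12-17, 23 and 24, so MMX (bit 23) and FXSR (bit 24) pass through, while SEP (bit 11) and SSE (bit 25) are dropped.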

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   11 ++-
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index aaa14ba..0a17020 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -130,6 +130,7 @@ typedef struct x86_def_t {
   CPUID_MSR | CPUID_MCE | CPUID_CX8 | CPUID_PGE | CPUID_CMOV | \
   CPUID_PAT | CPUID_FXSR | CPUID_MMX | CPUID_SSE | CPUID_SSE2 | \
   CPUID_PAE | CPUID_SEP | CPUID_APIC)
+#define EXT2_FEATURE_MASK 0x0183F3FF
 static x86_def_t x86_defs[] = {
 #ifdef TARGET_X86_64
 {
@@ -147,7 +148,7 @@ static x86_def_t x86_defs[] = {
 /* this feature is needed for Solaris and isn't fully implemented */
 CPUID_PSE36,
 .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_CX16 | CPUID_EXT_POPCNT,
-.ext2_features = (PPRO_FEATURES & 0x0183F3FF) |
+.ext2_features = (PPRO_FEATURES & EXT2_FEATURE_MASK) |
 CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
 .ext3_features = CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM |
 CPUID_EXT3_ABM | CPUID_EXT3_SSE4A,
@@ -170,7 +171,7 @@ static x86_def_t x86_defs[] = {
 .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | CPUID_EXT_CX16 |
 CPUID_EXT_POPCNT,
 /* Missing: CPUID_EXT2_PDPE1GB, CPUID_EXT2_RDTSCP */
-.ext2_features = (PPRO_FEATURES & 0x0183F3FF) |
+.ext2_features = (PPRO_FEATURES & EXT2_FEATURE_MASK) |
 CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX |
 CPUID_EXT2_3DNOW | CPUID_EXT2_3DNOWEXT | CPUID_EXT2_MMXEXT |
 CPUID_EXT2_FFXSR,
@@ -220,7 +221,7 @@ static x86_def_t x86_defs[] = {
 /* Missing: CPUID_EXT_POPCNT, CPUID_EXT_MONITOR */
 .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_CX16,
 /* Missing: CPUID_EXT2_PDPE1GB, CPUID_EXT2_RDTSCP */
-.ext2_features = (PPRO_FEATURES & 0x0183F3FF) |
+.ext2_features = (PPRO_FEATURES & EXT2_FEATURE_MASK) |
 CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
 /* Missing: CPUID_EXT3_LAHF_LM, CPUID_EXT3_CMP_LEG, CPUID_EXT3_EXTAPIC,
 CPUID_EXT3_CR8LEG, CPUID_EXT3_ABM, CPUID_EXT3_SSE4A,
@@ -308,7 +309,7 @@ static x86_def_t x86_defs[] = {
 .stepping = 3,
 .features = PPRO_FEATURES | CPUID_PSE36 | CPUID_VME |
 CPUID_MTRR | CPUID_MCA,
-.ext2_features = (PPRO_FEATURES & 0x0183F3FF) | CPUID_EXT2_MMXEXT |
+.ext2_features = (PPRO_FEATURES & EXT2_FEATURE_MASK) | CPUID_EXT2_MMXEXT |
   CPUID_EXT2_3DNOW | CPUID_EXT2_3DNOWEXT,
 .xlevel = 0x80000008,
 /* XXX: put another string ? */
@@ -330,7 +331,7 @@ static x86_def_t x86_defs[] = {
 CPUID_EXT_SSE3 /* PNI */ | CPUID_EXT_SSSE3,
 /* Missing: CPUID_EXT_DSCPL | CPUID_EXT_EST |
  * CPUID_EXT_TM2 | CPUID_EXT_XTPR */
-.ext2_features = (PPRO_FEATURES & 0x0183F3FF) | CPUID_EXT2_NX,
+.ext2_features = (PPRO_FEATURES & EXT2_FEATURE_MASK) | CPUID_EXT2_NX,
 /* Missing: .ext3_features = CPUID_EXT3_LAHF_LM */
 .xlevel = 0x8000000A,
 .model_id = "Intel(R) Atom(TM) CPU N270   @ 1.60GHz",
-- 
1.6.4

[Qemu-devel] [PATCH 05/13] cpuid: add missing CPUID feature flag names

2010-02-02 Thread Andre Przywara

Some CPUID feature flags had no string value, so they could not be
switched on or off from the command line.
Add names for the missing ones mentioned in the current public CPUID
specification from both Intel and AMD. Those only mentioned in the
Linux kernel source I put as comments.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   15 ---
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 0238718..19d58e1 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -52,11 +52,11 @@ static const char *feature_name[] = {
 "fxsr", "sse", "sse2", "ss", "ht" /* Intel htt */, "tm", "ia64", "pbe",
 };
 static const char *ext_feature_name[] = {
-"pni" /* Intel,AMD sse3 */, NULL, NULL, "monitor",
-"ds_cpl", "vmx", NULL /* Linux smx */, "est",
-"tm2", "ssse3", "cid", NULL, NULL, "cx16", "xtpr", NULL,
-NULL, NULL, "dca", NULL, NULL, NULL, NULL, "popcnt",
-NULL, NULL, NULL, NULL, NULL, NULL, NULL, "hypervisor",
+"pni" /* Intel,AMD sse3 */, "pclmuldq", "dtes64", "monitor",
+"ds_cpl", "vmx", "smx", "est",
+"tm2", "ssse3", "cid", NULL, NULL /* FMA */, "cx16", "xtpr", "pdcm",
+NULL, NULL, "dca", "sse4_1", "sse4_2", "x2apic", "movbe", "popcnt",
+NULL, "aes", "xsave", "osxsave", NULL /* AVX */, NULL, NULL, "hypervisor",
 };
 static const char *ext2_feature_name[] = {
 "fpu", "vme", "de", "pse", "tsc", "msr", "pae", "mce",
@@ -71,8 +71,9 @@ static const char *ext3_feature_name[] = {
 "lahf_lm" /* AMD LahfSahf */, "cmp_legacy",
 "svm", "extapic" /* AMD ExtApicSpace */,
 "cr8legacy" /* AMD AltMovCr8 */, "abm", "sse4a", "misalignsse",
-"3dnowprefetch", "osvw", NULL /* Linux ibs */, NULL, "skinit", "wdt", NULL, NULL,
-NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+"3dnowprefetch", "osvw", "ibs", NULL /* SSE-5 */,
+"skinit", "wdt", NULL, NULL,
+NULL, NULL, NULL, "nodeid_msr", NULL, NULL, NULL, NULL,
 NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
 };
 
-- 
1.6.4

[Qemu-devel] [PATCH 08/13] cpuid: simplify CPUID flag search function

2010-02-02 Thread Andre Przywara

Avoid code duplication by handling the CPUID flag name search in a
loop.
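
For readers who want to try the idea outside of QEMU, here is a reduced, self-contained sketch of the table-driven search. The two name tables are shortened to four entries each and add_flag() is a hypothetical name; the real function walks five tables of 32 entries.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NWORDS 2   /* number of feature words in this sketch */
#define NBITS  4   /* shortened tables, the real ones have 32 entries */

static const char *word0_names[NBITS] = { "fpu", "vme", NULL, "pse" };
static const char *word1_names[NBITS] = { "pni", NULL, NULL, "monitor" };

/* Set the bit matching 'flag' in whichever word knows that name.
 * Returns 1 if the name was found in at least one table, 0 otherwise. */
static int add_flag(const char *flag, uint32_t *w0, uint32_t *w1)
{
    const char **names[NWORDS] = { word0_names, word1_names };
    uint32_t *words[NWORDS] = { w0, w1 };
    int i, j, found = 0;

    assert(flag != NULL);
    for (j = 0; j < NWORDS; j++) {
        for (i = 0; i < NBITS; i++) {
            if (names[j][i] && !strcmp(flag, names[j][i])) {
                *words[j] |= 1u << i;
                found = 1;
            }
        }
    }
    return found;
}
```

Adding a further feature word then only means extending the two parallel arrays instead of copying another loop.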

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   38 +-
 1 files changed, 13 insertions(+), 25 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 3f56c50..635c2f4 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -90,34 +90,22 @@ static void add_flagname_to_bitmaps(const char *flagname, 
uint32_t *features,
 uint32_t *ext3_features,
 uint32_t *kvm_features)
 {
-int i;
+int i, j;
 int found = 0;
-
-for ( i = 0 ; i < 32 ; i++ )
-if (feature_name[i] && !strcmp (flagname, feature_name[i])) {
-*features |= 1 << i;
-found = 1;
-}
-for ( i = 0 ; i < 32 ; i++ )
-if (ext_feature_name[i] && !strcmp (flagname, ext_feature_name[i])) {
-*ext_features |= 1 << i;
-found = 1;
-}
-for ( i = 0 ; i < 32 ; i++ )
-if (ext2_feature_name[i] && !strcmp (flagname, ext2_feature_name[i])) {
-*ext2_features |= 1 << i;
-found = 1;
-}
-for ( i = 0 ; i < 32 ; i++ )
-if (ext3_feature_name[i] && !strcmp (flagname, ext3_feature_name[i])) {
-*ext3_features |= 1 << i;
-found = 1;
-}
-for ( i = 0 ; i < 32 ; i++ )
-if (kvm_feature_name[i] && !strcmp (flagname, kvm_feature_name[i])) {
-*kvm_features |= 1 << i;
-found = 1;
+const char ** feature_names[5] = {feature_name, ext_feature_name,
+  ext2_feature_name, ext3_feature_name,
+  kvm_feature_name};
+uint32_t* feature_flags[5] = {features, ext_features, ext2_features,
+  ext3_features, kvm_features};
+
+for (j = 0; j < 5; j++) {
+for ( i = 0 ; i < 32 ; i++ ) {
+if (feature_names[j][i] && !strcmp(flagname, feature_names[j][i])) {
+*feature_flags[j] |= 1 << i;
+found = 1;
+}
 }
+}
 
 if (!found) {
 fprintf(stderr, "CPU feature %s not found\n", flagname);
-- 
1.6.4

[Qemu-devel] [PATCH 11/13] cpuid: Always expose 32 and 64-bit CPUs

2010-02-02 Thread Andre Przywara

Since 64-bit capability is just another CPUID bit we now properly
mask, there is no reason anymore to hide the 64-bit capable CPU
models from a 32-bit only QEMU. All 64-bit CPUs can be used
perfectly in 32-bit legacy mode anyway, so these models also make
sense for 32-bit.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 6e6ee54..b03a363 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -153,7 +153,6 @@ typedef struct x86_def_t {
   CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A)
 
 static x86_def_t x86_defs[] = {
-#ifdef TARGET_X86_64
 {
 .name = "qemu64",
 .level = 4,
@@ -252,7 +251,6 @@ static x86_def_t x86_defs[] = {
 .xlevel = 0x80000008,
 .model_id = "Common KVM processor"
 },
-#endif
 {
 .name = "qemu32",
 .level = 4,
-- 
1.6.4

[Qemu-devel] [PATCH 10/13] cpuid: add TCG feature bit trimming

2010-02-02 Thread Andre Przywara

In KVM we trim the user-provided CPUID bits to match those of the host
CPU. Introduce a similar feature to QEMU/TCG. Create a mask of TCG's
capabilities and apply it to the user bits.
This lets the CPU models reflect their native archetypes.
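
The trimming itself is a plain AND per feature word. A minimal sketch follows; trim_features() is a hypothetical helper, and reporting the dropped bits is an addition for illustration that the patch does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Keep only the requested bits that the emulator supports.
 * If 'dropped' is non-NULL, it receives the bits that were masked out. */
static uint32_t trim_features(uint32_t requested, uint32_t supported,
                              uint32_t *dropped)
{
    uint32_t kept = requested & supported;
    uint32_t lost = requested & ~supported;

    /* Every requested bit ends up either kept or dropped. */
    assert((kept | lost) == requested);
    if (dropped != NULL) {
        *dropped = lost;
    }
    return kept;
}
```

A caller could use the dropped set to warn the user that a feature requested on the command line is not available under TCG.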

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 6aa1f3f..6e6ee54 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -137,6 +137,21 @@ typedef struct x86_def_t {
   CPUID_PAT | CPUID_FXSR | CPUID_MMX | CPUID_SSE | CPUID_SSE2 | \
   CPUID_PAE | CPUID_SEP | CPUID_APIC)
 #define EXT2_FEATURE_MASK 0x0183F3FF
+
+#define TCG_FEATURES (CPUID_FP87 | CPUID_PSE | CPUID_TSC | CPUID_MSR | \
+  CPUID_PAE | CPUID_MCE | CPUID_CX8 | CPUID_APIC | CPUID_SEP | \
+  CPUID_MTRR | CPUID_PGE | CPUID_MCA | CPUID_CMOV | CPUID_PAT | \
+  CPUID_PSE36 | CPUID_CLFLUSH | CPUID_ACPI | CPUID_MMX | \
+  CPUID_FXSR | CPUID_SSE | CPUID_SSE2 | CPUID_SS)
+#define TCG_EXT_FEATURES (CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | \
+  CPUID_EXT_CX16 | CPUID_EXT_POPCNT | CPUID_EXT_XSAVE | \
+  CPUID_EXT_HYPERVISOR)
+#define TCG_EXT2_FEATURES ((TCG_FEATURES & EXT2_FEATURE_MASK) | \
+  CPUID_EXT2_NX | CPUID_EXT2_MMXEXT | CPUID_EXT2_RDTSCP | \
+  CPUID_EXT2_3DNOW | CPUID_EXT2_3DNOWEXT)
+#define TCG_EXT3_FEATURES (CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM | \
+  CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A)
+
 static x86_def_t x86_defs[] = {
 #ifdef TARGET_X86_64
 {
@@ -616,6 +631,17 @@ int cpu_x86_register (CPUX86State *env, const char 
*cpu_model)
 env->cpuid_ext2_features = def->ext2_features;
 env->cpuid_xlevel = def->xlevel;
 env->cpuid_kvm_features = def->kvm_features;
+env->cpuid_ext3_features = def->ext3_features;
+if (!kvm_enabled()) {
+env->cpuid_features &= TCG_FEATURES;
+env->cpuid_ext_features &= TCG_EXT_FEATURES;
+env->cpuid_ext2_features &= (TCG_EXT2_FEATURES
+#ifdef TARGET_X86_64
+| CPUID_EXT2_SYSCALL | CPUID_EXT2_LM
+#endif
+);
+env->cpuid_ext3_features &= TCG_EXT3_FEATURES;
+}
 {
 const char *model_id = def->model_id;
 int c, len, i;
-- 
1.6.4

[Qemu-devel] [PATCH 03/13] cpuid: moved host_cpuid function and remove prototype

2010-02-02 Thread Andre Przywara

The host_cpuid function was located at the end of the file and had
a prototype before its first use. Move it up and remove the
prototype.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   70 --
 1 files changed, 34 insertions(+), 36 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 0a17020..cc080f4 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -338,8 +338,40 @@ static x86_def_t x86_defs[] = {
 },
 };
 
-static void host_cpuid(uint32_t function, uint32_t count, uint32_t *eax,
-   uint32_t *ebx, uint32_t *ecx, uint32_t *edx);
+static void host_cpuid(uint32_t function, uint32_t count,
+   uint32_t *eax, uint32_t *ebx,
+   uint32_t *ecx, uint32_t *edx)
+{
+#if defined(CONFIG_KVM)
+uint32_t vec[4];
+
+#ifdef __x86_64__
+asm volatile("cpuid"
+ : "=a"(vec[0]), "=b"(vec[1]),
+   "=c"(vec[2]), "=d"(vec[3])
+ : "0"(function), "c"(count) : "cc");
+#else
+asm volatile("pusha \n\t"
+ "cpuid \n\t"
+ "mov %%eax, 0(%2) \n\t"
+ "mov %%ebx, 4(%2) \n\t"
+ "mov %%ecx, 8(%2) \n\t"
+ "mov %%edx, 12(%2) \n\t"
+ "popa"
+ : : "a"(function), "c"(count), "S"(vec)
+ : "memory", "cc");
+#endif
+
+if (eax)
+   *eax = vec[0];
+if (ebx)
+   *ebx = vec[1];
+if (ecx)
+   *ecx = vec[2];
+if (edx)
+   *edx = vec[3];
+#endif
+}
 
 static int cpu_x86_fill_model_id(char *str)
 {
@@ -578,40 +610,6 @@ int cpu_x86_register (CPUX86State *env, const char 
*cpu_model)
 return 0;
 }
 
-static void host_cpuid(uint32_t function, uint32_t count,
-   uint32_t *eax, uint32_t *ebx,
-   uint32_t *ecx, uint32_t *edx)
-{
-#if defined(CONFIG_KVM)
-uint32_t vec[4];
-
-#ifdef __x86_64__
-asm volatile("cpuid"
- : "=a"(vec[0]), "=b"(vec[1]),
-   "=c"(vec[2]), "=d"(vec[3])
- : "0"(function), "c"(count) : "cc");
-#else
-asm volatile("pusha \n\t"
- "cpuid \n\t"
- "mov %%eax, 0(%2) \n\t"
- "mov %%ebx, 4(%2) \n\t"
- "mov %%ecx, 8(%2) \n\t"
- "mov %%edx, 12(%2) \n\t"
- "popa"
- : : "a"(function), "c"(count), "S"(vec)
- : "memory", "cc");
-#endif
-
-if (eax)
-   *eax = vec[0];
-if (ebx)
-   *ebx = vec[1];
-if (ecx)
-   *ecx = vec[2];
-if (edx)
-   *edx = vec[3];
-#endif
-}
 
 static void get_cpuid_vendor(CPUX86State *env, uint32_t *ebx,
  uint32_t *ecx, uint32_t *edx)
-- 
1.6.4

[Qemu-devel] [PATCH 09/13] cpuid: propagate further CPUID leafs when -cpu host

2010-02-02 Thread Andre Przywara

-cpu host currently only propagates the CPU's family/model/stepping,
the brand name and the feature bits.
Add a whitelist of safe CPUID leafs to let the guest see the actual
CPU's cache details and other things.
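
The whitelist is a bitmask indexed by leaf number, one mask for the standard range and one for the extended range. A stand-alone sketch of the lookup is below. The leaf numbers are the ones the patch whitelists; leaf_is_propagated() is a made-up helper, and the explicit range check is an addition here, since shifting a 32-bit value by 32 or more is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Standard leaves passed through from the host. */
#define LEAF_PROPAGATE ((1u << 0x02) | (1u << 0x04) | (1u << 0x05) | \
                        (1u << 0x0D))
/* Extended leaves, as offsets from 0x80000000. */
#define LEAF_PROPAGATE_EXTENDED ((1u << 0x05) | (1u << 0x06) | \
                                 (1u << 0x08) | (1u << 0x19) | (1u << 0x1A))

#define EXTENDED_BASE 0x80000000u

/* Return 1 if the given CPUID leaf is on the whitelist, 0 otherwise. */
static int leaf_is_propagated(uint32_t index)
{
    uint32_t mask = LEAF_PROPAGATE;

    if (index >= EXTENDED_BASE) {
        index -= EXTENDED_BASE;
        mask = LEAF_PROPAGATE_EXTENDED;
    }
    if (index >= 32) {
        return 0;   /* outside the bitmask, never whitelisted */
    }
    assert(index < 32);
    return (int)((mask >> index) & 1u);
}
```

Leaf 4 (cache parameters) and leaf 0x80000006 (L2/L3 cache information) pass this check, while leaf 1 and leaf 0x80000001 (feature bits) do not, since those are built from the CPU model instead.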

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpu.h   |5 -
 target-i386/cpuid.c |   28 ++--
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index f826d3d..982f815 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -581,6 +581,9 @@ typedef struct {
 
 #define NB_MMU_MODES 2
 
+#define CPUID_FLAGS_VENDOR_OVERRIDE 1
+#define CPUID_FLAGS_HOST 2
+
 typedef struct CPUX86State {
 /* standard registers */
 target_ulong regs[CPU_NB_REGS];
@@ -685,7 +688,7 @@ typedef struct CPUX86State {
 uint32_t cpuid_ext2_features;
 uint32_t cpuid_ext3_features;
 uint32_t cpuid_apic_id;
-int cpuid_vendor_override;
+uint32_t cpuid_flags;
 
 /* MTRRs */
 uint64_t mtrr_fixed[11];
diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 635c2f4..6aa1f3f 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -122,7 +122,7 @@ typedef struct x86_def_t {
 uint32_t features, ext_features, ext2_features, ext3_features, 
kvm_features;
 uint32_t xlevel;
 char model_id[48];
-int vendor_override;
+uint32_t flags;
 } x86_def_t;
 
 #define I486_FEATURES (CPUID_FP87 | CPUID_VME | CPUID_PSE)
@@ -419,7 +419,7 @@ static int cpu_x86_fill_host(x86_def_t *x86_cpu_def)
 x86_cpu_def->ext2_features = edx;
 x86_cpu_def->ext3_features = ecx;
 cpu_x86_fill_model_id(x86_cpu_def->model_id);
-x86_cpu_def->vendor_override = 0;
+x86_cpu_def->flags = CPUID_FLAGS_HOST;
 
 return 0;
 }
@@ -529,7 +529,7 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
const char *cpu_model)
 x86_cpu_def->vendor2 |= ((uint8_t)val[i + 4]) << (8 * i);
 x86_cpu_def->vendor3 |= ((uint8_t)val[i + 8]) << (8 * i);
 }
-x86_cpu_def->vendor_override = 1;
+x86_cpu_def->flags |= CPUID_FLAGS_VENDOR_OVERRIDE;
 } else if (!strcmp(featurestr, "model_id")) {
 pstrcpy(x86_cpu_def->model_id, sizeof(x86_cpu_def->model_id),
 val);
@@ -602,7 +602,7 @@ int cpu_x86_register (CPUX86State *env, const char 
*cpu_model)
 env->cpuid_vendor2 = CPUID_VENDOR_INTEL_2;
 env->cpuid_vendor3 = CPUID_VENDOR_INTEL_3;
 }
-env->cpuid_vendor_override = def->vendor_override;
+env->cpuid_flags = def->flags;
 env->cpuid_level = def->level;
 if (def->family > 0x0f)
 env->cpuid_version = 0xf00 | ((def->family - 0x0f) << 20);
@@ -647,22 +647,38 @@ static void get_cpuid_vendor(CPUX86State *env, uint32_t 
*ebx,
  * this if you want to use KVM's sysenter/syscall emulation
  * in compatibility mode and when doing cross vendor migration
  */
-if (kvm_enabled() && env->cpuid_vendor_override) {
+if (kvm_enabled() &&
+(env->cpuid_flags & CPUID_FLAGS_VENDOR_OVERRIDE) == 0) {
 host_cpuid(0, 0, NULL, ebx, ecx, edx);
 }
 }
 
+#define CPUID_LEAF_PROPAGATE ((1 << 0x02) | (1 << 0x04) | (1 << 0x05) |\
+  (1 << 0x0D))
+#define CPUID_LEAF_PROPAGATE_EXTENDED ((1 << 0x05) | (1 << 0x06) |\
+   (1 << 0x08) | (1 << 0x19) | (1 << 0x1A))
+
 void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
uint32_t *eax, uint32_t *ebx,
uint32_t *ecx, uint32_t *edx)
 {
-/* test if maximum index reached */
 if (index & 0x80000000) {
+/* test if maximum index reached */
 if (index > env->cpuid_xlevel)
 index = env->cpuid_level;
+if ((env->cpuid_flags & CPUID_FLAGS_HOST) &&
+((1 << (index - 0x80000000)) & CPUID_LEAF_PROPAGATE_EXTENDED)) {
+host_cpuid(index, count, eax, ebx, ecx, edx);
+return;
+}
 } else {
 if (index > env->cpuid_level)
 index = env->cpuid_level;
+if ((env->cpuid_flags & CPUID_FLAGS_HOST) &&
+((1 << index) & CPUID_LEAF_PROPAGATE)) {
+host_cpuid(index, count, eax, ebx, ecx, edx);
+return;
+}
 }
 
 switch(index) {
-- 
1.6.4

[Qemu-devel] [PATCH 04/13] cpuid: Replace strtok with get_opt_name

2010-02-02 Thread Andre Przywara

To avoid the non-reentrant strtok(), use the QEMU-defined
get_opt_name() to parse the -cpu parameter list. Since there is a
name clash between linux-user/mmap.c:qemu_malloc() and
qemu-malloc.c:qemu_malloc(), I copied the small function from
qemu-option.c into cpuid.c. Not the best solution, but IMO the
least intrusive and smallest one.
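
For reference, a stand-alone sketch of how the parsing works without strtok(). get_opt_name() follows the function added by this patch; nth_token() is a hypothetical helper written only for this example. Unlike strtok(), nothing here modifies the input string or keeps hidden state between calls.

```c
#include <assert.h>
#include <stddef.h>

/* Copy characters from p into buf until delim or end of string,
 * truncating to buf_size - 1. Returns a pointer to the delimiter
 * (or the terminating NUL) in the input. */
static const char *get_opt_name(char *buf, int buf_size,
                                const char *p, char delim)
{
    char *q = buf;

    while (*p != '\0' && *p != delim) {
        if (q && (q - buf) < buf_size - 1) {
            *q++ = *p;
        }
        p++;
    }
    if (q) {
        *q = '\0';
    }
    return p;
}

/* Hypothetical helper: copy the n-th comma-separated token of spec
 * into out. Returns 1 on success, 0 if spec has fewer tokens. */
static int nth_token(const char *spec, int n, char *out, int out_size)
{
    const char *s = spec;
    int i;

    assert(out != NULL && out_size > 0);
    for (i = 0; ; i++) {
        s = get_opt_name(out, out_size, s, ',');
        if (i == n) {
            return 1;
        }
        if (*s == '\0') {
            return 0;
        }
        s++;   /* skip the delimiter */
    }
}
```

With the string "qemu64,+sse3,-nx", token 0 is the model name and the following tokens are the feature switches, which is the order the loop in cpu_x86_find_by_name() relies on.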

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   34 --
 1 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index cc080f4..0238718 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -24,6 +24,23 @@
 #include "cpu.h"
 #include "kvm.h"
 
+static const char *get_opt_name(char *buf, int buf_size,
+const char *p, char delim)
+{
+char *q;
+
+q = buf;
+while (*p != '\0'  *p != delim) {
+if (q  (q - buf)  buf_size - 1)
+*q++ = *p;
+p++;
+}
+if (q)
+*q = '\0';
+
+return p;
+}
+
 /* feature flags taken from Intel Processor Identification and the CPUID
  * Instruction and AMD's CPUID Specification. In cases of disagreement
  * about feature names, the Linux name is used. */
@@ -423,8 +440,8 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
const char *cpu_model)
 unsigned int i;
 x86_def_t *def;
 
-    char *s = strdup(cpu_model);
-    char *featurestr, *name = strtok(s, ",");
+    const char* s;
+    char featurestr[64];
 uint32_t plus_features = 0, plus_ext_features = 0,
 plus_ext2_features = 0, plus_ext3_features = 0, plus_kvm_features = 0;
 uint32_t minus_features = 0, minus_ext_features = 0,
@@ -432,14 +449,15 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
const char *cpu_model)
 minus_kvm_features = 0;
 uint32_t numvalue;
 
+    s = get_opt_name(featurestr, 64, cpu_model, ',');
     def = NULL;
     for (i = 0; i < ARRAY_SIZE(x86_defs); i++) {
-        if (strcmp(name, x86_defs[i].name) == 0) {
+        if (strcmp(featurestr, x86_defs[i].name) == 0) {
             def = &x86_defs[i];
             break;
         }
     }
-    if (kvm_enabled() && strcmp(name, "host") == 0) {
+    if (kvm_enabled() && strcmp(featurestr, "host") == 0) {
         cpu_x86_fill_host(x86_cpu_def);
     } else if (!def) {
         goto error;
@@ -453,10 +471,9 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
const char *cpu_model)
 plus_ext_features, plus_ext2_features, plus_ext3_features,
 plus_kvm_features);
 
-    featurestr = strtok(NULL, ",");
-
-    while (featurestr) {
+    while (*s != 0) {
         char *val;
+        s = get_opt_name(featurestr, 64, s + 1, ',');
 if (featurestr[0] == '+') {
 add_flagname_to_bitmaps(featurestr + 1, plus_features,
 plus_ext_features, plus_ext2_features,
@@ -536,7 +553,6 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
const char *cpu_model)
 (+feature|-feature|feature=xyz)\n, featurestr);
 goto error;
 }
-        featurestr = strtok(NULL, ",");
 }
     x86_cpu_def->features |= plus_features;
     x86_cpu_def->ext_features |= plus_ext_features;
@@ -548,11 +564,9 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
const char *cpu_model)
     x86_cpu_def->ext2_features &= ~minus_ext2_features;
     x86_cpu_def->ext3_features &= ~minus_ext3_features;
     x86_cpu_def->kvm_features &= ~minus_kvm_features;
-free(s);
 return 0;
 
 error:
-free(s);
 return -1;
 }
 
-- 
1.6.4
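
For anyone who wants to try the parsing loop from the patch above in isolation, here is a minimal sketch. get_opt_name() is the helper as added by the patch (with the operators written out); split_cpu_model() is a hypothetical driver that only mirrors the "model name first, then one token per comma" loop, and the fixed 64-byte token size simply follows the buffer size used in the patch.

```c
#include <stddef.h>

/* Helper as added by the patch: copies characters from p into buf until
 * delim or end of string, truncating to buf_size - 1 characters, and
 * returns a pointer to the delimiter (or to the terminating NUL). */
static const char *get_opt_name(char *buf, int buf_size,
                                const char *p, char delim)
{
    char *q = buf;

    while (*p != '\0' && *p != delim) {
        if (q && (q - buf) < buf_size - 1)
            *q++ = *p;
        p++;
    }
    if (q)
        *q = '\0';

    return p;
}

/* Hypothetical driver (not in the patch): splits "model,+flag,-flag"
 * into tokens without modifying the input and without static state.
 * Returns the number of tokens stored. */
static int split_cpu_model(const char *cpu_model,
                           char tokens[][64], int max_tokens)
{
    int n = 0;
    const char *s;

    if (max_tokens < 1)
        return 0;

    /* The first token is the model name. */
    s = get_opt_name(tokens[n++], 64, cpu_model, ',');

    /* Each following token starts one character past the comma. */
    while (*s != '\0' && n < max_tokens)
        s = get_opt_name(tokens[n++], 64, s + 1, ',');

    return n;
}
```

Unlike strtok(), this neither writes into the input string nor keeps hidden state between calls, which is why the strdup()/free() pair can go away.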

[Qemu-devel] [PATCH 07/13] cpuid: remove unnecessary kvm_trim function

2010-02-02 Thread Andre Przywara

Correct me if I am wrong, but kvm_trim looks like a really bloated
implementation of a bitwise AND. So remove this function and replace
it with the real stuff(TM).

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/kvm.c |   27 ++-
 1 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 5b093ce..daa65c1 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -125,19 +125,6 @@ uint32_t kvm_arch_get_supported_cpuid(CPUState *env, 
uint32_t function, int reg)
 
 #endif
 
-static void kvm_trim_features(uint32_t *features, uint32_t supported)
-{
-    int i;
-    uint32_t mask;
-
-    for (i = 0; i < 32; ++i) {
-        mask = 1U << i;
-        if ((*features & mask) && !(supported & mask)) {
-            *features &= ~mask;
-        }
-    }
-}
-
 #ifdef CONFIG_KVM_PARA
 struct kvm_para_features {
 int cap;
@@ -186,18 +173,16 @@ int kvm_arch_init_vcpu(CPUState *env)
 
     env->mp_state = KVM_MP_STATE_RUNNABLE;
 
-    kvm_trim_features(&env->cpuid_features,
-        kvm_arch_get_supported_cpuid(env, 1, R_EDX));
+    env->cpuid_features &= kvm_arch_get_supported_cpuid(env, 1, R_EDX);
 
     i = env->cpuid_ext_features & CPUID_EXT_HYPERVISOR;
-    kvm_trim_features(&env->cpuid_ext_features,
-        kvm_arch_get_supported_cpuid(env, 1, R_ECX));
+    env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(env, 1, R_ECX);
     env->cpuid_ext_features |= i;
 
-    kvm_trim_features(&env->cpuid_ext2_features,
-        kvm_arch_get_supported_cpuid(env, 0x80000001, R_EDX));
-    kvm_trim_features(&env->cpuid_ext3_features,
-        kvm_arch_get_supported_cpuid(env, 0x80000001, R_ECX));
+    env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(env, 0x80000001,
+                                                             R_EDX);
+    env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(env, 0x80000001,
+                                                             R_ECX);
 
 cpuid_i = 0;
 
-- 
1.6.4
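
The claim that the removed loop is just a bitwise AND can be checked with a few lines of C. The sketch below keeps the removed function as shown in the patch; trim_with_and() and trims_agree() are hypothetical helpers added only for the comparison.

```c
#include <stdint.h>

/* The loop removed by the patch: clears every bit of *features that is
 * set there but not set in supported. */
static void kvm_trim_features(uint32_t *features, uint32_t supported)
{
    int i;
    uint32_t mask;

    for (i = 0; i < 32; ++i) {
        mask = 1U << i;
        if ((*features & mask) && !(supported & mask)) {
            *features &= ~mask;
        }
    }
}

/* The replacement: a single AND keeps exactly the bits set in both. */
static uint32_t trim_with_and(uint32_t features, uint32_t supported)
{
    return features & supported;
}

/* Hypothetical checker: returns 1 if both variants give the same result. */
static int trims_agree(uint32_t features, uint32_t supported)
{
    uint32_t looped = features;

    kvm_trim_features(&looped, supported);
    return looped == trim_with_and(features, supported);
}
```

A bit survives the loop only if it is set in both operands, which is the definition of AND, so the two variants agree for every input.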

[Qemu-devel] [PATCH 06/13] cpuid: list all known x86 CPUID feature flags

2010-02-02 Thread Andre Przywara

"-cpu ?" currently gives us a list of known CPU models. Add "host" if
using KVM and a list of known CPUID feature flags to the output.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   22 +-
 1 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 19d58e1..3f56c50 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -573,10 +573,30 @@ error:
 
 void x86_cpu_list (FILE *f, int (*cpu_fprintf)(FILE *f, const char *fmt, ...))
 {
-unsigned int i;
+unsigned int i, j;
+const char **stringlist[] = {feature_name, ext_feature_name,
+ ext2_feature_name, ext3_feature_name};
 
     for (i = 0; i < ARRAY_SIZE(x86_defs); i++)
         (*cpu_fprintf)(f, "x86 %16s\n", x86_defs[i].name);
+    if (kvm_enabled()) {
+        (*cpu_fprintf)(f, "x86 %16s\n", "host");
+    }
+
+    (*cpu_fprintf)(f, "x86 recognized feature flags:\n");
+    for (j = 0; j < 4; j++) {
+        for (i = 0; i < 32; i++) {
+            if (j == 2 && ((1 << i) & EXT2_FEATURE_MASK))
+                continue;
+            if (stringlist[j][i] == NULL)
+                continue;
+            (*cpu_fprintf)(f, "%s ", stringlist[j][i]);
+            if (i == 15)
+                (*cpu_fprintf)(f, "\n");
+        }
+        (*cpu_fprintf)(f, "\n");
+    }
+return;
 }
 
 int cpu_x86_register (CPUX86State *env, const char *cpu_model)
-- 
1.6.4

[Qemu-devel] [PATCH 12/13] cpuid: Add kvm32 CPU model

2010-02-02 Thread Andre Przywara

Create a kvm32 CPU model that describes a least common denominator
for KVM capable guest CPUs. Useful for migration purposes.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index b03a363..65dcb23 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -263,6 +263,20 @@ static x86_def_t x86_defs[] = {
         .model_id = "QEMU Virtual CPU version " QEMU_VERSION,
     },
     {
+        .name = "kvm32",
+        .level = 5,
+        .family = 15,
+        .model = 6,
+        .stepping = 1,
+        .features = PPRO_FEATURES |
+            CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA | CPUID_PSE36,
+        .ext_features = CPUID_EXT_SSE3,
+        .ext2_features = PPRO_FEATURES & EXT2_FEATURE_MASK,
+        .ext3_features = 0,
+        .xlevel = 0x80000008,
+        .model_id = "Common 32-bit KVM processor"
+    },
+    {
         .name = "coreduo",
 .level = 10,
 .family = 6,
-- 
1.6.4

[Qemu-devel] [PATCH 13/13] cpuid: fix CPUID levels

2010-02-02 Thread Andre Przywara

Bump up the xlevel number for qemu32 to allow parsing of the processor
name string for this model.
Similarly, the 486 processor should have at least the feature bit
leaf enabled.

Signed-off-by: Andre Przywara andre.przyw...@amd.com
---
 target-i386/cpuid.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 65dcb23..725efe3 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -259,7 +259,7 @@ static x86_def_t x86_defs[] = {
 .stepping = 3,
 .features = PPRO_FEATURES,
 .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_POPCNT,
-        .xlevel = 0,
+        .xlevel = 0x80000004,
         .model_id = "QEMU Virtual CPU version " QEMU_VERSION,
 },
 {
@@ -297,7 +297,7 @@ static x86_def_t x86_defs[] = {
 },
 {
         .name = "486",
-.level = 0,
+.level = 1,
 .family = 4,
 .model = 0,
 .stepping = 0,
-- 
1.6.4

[Qemu-devel] Re: [PATCH] Add cpu model configuration support.. (resend)

2010-02-02 Thread Andre Przywara


john cooper wrote:


[target-x86_64.conf was unintentionally omitted from the earlier patch]

This is a reimplementation of prior versions which adds
the ability to define cpu models for contemporary processors.
The added models are likewise selected via -cpu name,
and are intended to displace the existing convention
of -cpu qemu64 augmented with a series of feature flags.

 ...
John,

first I would like to apologize for sending out my patch series although 
I know that it heavily conflicts with yours. Actually you beat me just 
by hours with yours, I had mine ready on Friday evening and just delayed 
the sending until Monday ;-)


Can you split up the patch into a series of smaller ones (maybe git add 
-i can help you here?). This version is a bit large for proper review 
and mixes fixes and feature additions. Additionally, this would help to 
merge our two versions.


Regards,
Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448 3567 12
to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Andrew Bowd; Thomas M. McCoy; Giuliano Meroni
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

Re: [Qemu-devel] [PATCH 2/5] socket: Add a reconnect option.

2010-02-02 Thread Ian Molton

Anthony Liguori wrote:

 I'm all for doing things incrementally but there has to be a big picture
 that the incremental bit fits into otherwise you end up with a bunch of
 random features that don't work together well.

Well, if you just add stuff without ever changing anything that went
before, of course.

 Honestly, I'd strongly suggest splitting the reconnect logic out of the
 series when resubmitting.

IMO the RNG stuff is worthless without the reconnect logic. You can't
have a machine in a production environment that just stops getting
entropy forever when you (say) restart the EGD, perhaps during a package
update. Or when someone unplugs the entropy source temporarily or
something like that.

  I think it's just too hacky with too weak of
 a justification.  If you really want this functionality, we can discuss
 the right approach for doing it but it's gotta be done in a way that's
 not introducing a one-off case just for the random number generator.

I don't think it's a case of 'really want' as much as 'it's completely
essential' :-)

I still think that unless there are any other use cases, there's not much
to discuss. The code is already generic to some degree: it notifies
users, and it has a configurable delay. What else do we need? I
implemented it generically rather than stuffing it into the virtio-rng
driver *because* I didn't think a dedicated version of it was the right
way to go, but without some other use cases, I can't see what good there
is in bikeshedding over this.

-Ian

[Qemu-devel] Re: [PATCH 00/21] qemu-kvm: Hook cleanups and extended use of upstream code

2010-02-02 Thread Alexander Graf


On 02.02.2010, at 09:18, Jan Kiszka wrote:

 Let's start with the overall stats:
 
 31 files changed, 274 insertions(+), 822 deletions(-)
 
 So this series drops far more than 500 lines of redundant code, moving
 qemu-kvm yet a bit closer to upstream.
 
 The other highlight is the simplification of synchronization between
 in-kernel and user space VCPU states. This area used to cause a lot of
 problems in the past because it was tricky to get things right,
 specifically during the multi-threaded startup. The new approach pushes
 all the sync work around reset and vmsave/load into generic code, not
 only removing the burden from developers of, say, in-kernel APIC
 support, but also dropping most of our kvm-specific hooks, especially in
 the qemu-kvm tree.
 
 While I tested this on various VMs around, and things look good so far,
 I wouldn't be surprised if there are some regressions remaining,
 specifically in the non-x86 parts that I wasn't able to test or even
 build. Please have a careful look!

The good news on that part is that apart from IA64, all other archs are broken 
in qemu-kvm anyways, but work on upstream qemu. So moving towards upstream 
definitely helps here.

Alex

[Qemu-devel] Re: [PATCH 03/21] qemu-kvm: Clean up register access API

2010-02-02 Thread Jan Kiszka

Gleb Natapov wrote:
 On Tue, Feb 02, 2010 at 09:18:49AM +0100, Jan Kiszka wrote:
 qemu-kvm's functions for accessing the VCPU registers are
 kvm_arch_load/save_regs. Use them directly instead of going through
 various wrappers. Specifically, we do not need on_vcpu wrapping as all
 users either already run in the related thread or call while the vm is
 stopped.

 Can we put a check for that into those functions, just to be sure?
 Something like:
   assert(!vm_stopped && env->thread_id != pthread_id())
 

Good idea. Will add this to a potential v2 or send an add-on patch. We
just need something other than vm_stopped (for reset, only the vcpu
threads are stopped, not the vm), probably env->stopped in qemu-kvm.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  qemu-kvm.c|   37 +++--
  qemu-kvm.h|   11 ---
  target-ia64/machine.c |4 ++--
  3 files changed, 5 insertions(+), 47 deletions(-)

 diff --git a/qemu-kvm.c b/qemu-kvm.c
 index a305907..97c098c 100644
 --- a/qemu-kvm.c
 +++ b/qemu-kvm.c
 @@ -862,7 +862,7 @@ int pre_kvm_run(kvm_context_t kvm, CPUState *env)
  kvm_arch_pre_run(env, env-kvm_run);
  
  if (env-kvm_cpu_state.regs_modified) {
 -kvm_arch_put_registers(env);
 +kvm_arch_load_regs(env);
  env-kvm_cpu_state.regs_modified = 0;
  }
  
 @@ -1532,16 +1532,11 @@ static void on_vcpu(CPUState *env, void (*func)(void 
 *data), void *data)
  qemu_cond_wait(qemu_work_cond);
  }
  
 -void kvm_arch_get_registers(CPUState *env)
 -{
 -kvm_arch_save_regs(env);
 -}
 -
  static void do_kvm_cpu_synchronize_state(void *_env)
  {
  CPUState *env = _env;
  if (!env-kvm_cpu_state.regs_modified) {
 -kvm_arch_get_registers(env);
 +kvm_arch_save_regs(env);
  env-kvm_cpu_state.regs_modified = 1;
  }
  }
 @@ -1584,32 +1579,6 @@ void kvm_update_interrupt_request(CPUState *env)
  }
  }
  
 -static void kvm_do_load_registers(void *_env)
 -{
 -CPUState *env = _env;
 -
 -kvm_arch_load_regs(env);
 -}
 -
 -void kvm_load_registers(CPUState *env)
 -{
 -if (kvm_enabled()  qemu_system_ready)
 -on_vcpu(env, kvm_do_load_registers, env);
 -}
 -
 -static void kvm_do_save_registers(void *_env)
 -{
 -CPUState *env = _env;
 -
 -kvm_arch_save_regs(env);
 -}
 -
 -void kvm_save_registers(CPUState *env)
 -{
 -if (kvm_enabled())
 -on_vcpu(env, kvm_do_save_registers, env);
 -}
 -
  static void kvm_do_load_mpstate(void *_env)
  {
  CPUState *env = _env;
 @@ -2379,7 +2348,7 @@ static void kvm_invoke_set_guest_debug(void *data)
  struct kvm_set_guest_debug_data *dbg_data = data;
  
  if (cpu_single_env-kvm_cpu_state.regs_modified) {
 -kvm_arch_put_registers(cpu_single_env);
 +kvm_arch_save_regs(cpu_single_env);
  cpu_single_env-kvm_cpu_state.regs_modified = 0;
  }
  dbg_data-err =
 diff --git a/qemu-kvm.h b/qemu-kvm.h
 index 6b3e5a1..1354227 100644
 --- a/qemu-kvm.h
 +++ b/qemu-kvm.h
 @@ -902,8 +902,6 @@ int kvm_main_loop(void);
  int kvm_init_ap(void);
  #ifndef QEMU_KVM_NO_CPU
  int kvm_vcpu_inited(CPUState *env);
 -void kvm_load_registers(CPUState *env);
 -void kvm_save_registers(CPUState *env);
  void kvm_load_mpstate(CPUState *env);
  void kvm_save_mpstate(CPUState *env);
  int kvm_cpu_exec(CPUState *env);
 @@ -1068,8 +1066,6 @@ void kvm_load_tsc(CPUState *env);
  #ifdef TARGET_I386
  #define qemu_kvm_has_pit_state2() (0)
  #endif
 -#define kvm_load_registers(env) do {} while(0)
 -#define kvm_save_registers(env) do {} while(0)
  #define kvm_save_mpstate(env)   do {} while(0)
  #define qemu_kvm_cpu_stop(env) do {} while(0)
  static inline void kvm_init_vcpu(CPUState *env)
 @@ -1098,13 +1094,6 @@ static inline int kvm_sync_vcpus(void)
  }
  
  #ifndef QEMU_KVM_NO_CPU
 -void kvm_arch_get_registers(CPUState *env);
 -
 -static inline void kvm_arch_put_registers(CPUState *env)
 -{
 -kvm_load_registers(env);
 -}
 -
  void kvm_cpu_synchronize_state(CPUState *env);
  
  static inline void cpu_synchronize_state(CPUState *env)
 diff --git a/target-ia64/machine.c b/target-ia64/machine.c
 index 70ef379..7d29575 100644
 --- a/target-ia64/machine.c
 +++ b/target-ia64/machine.c
 @@ -9,7 +9,7 @@ void cpu_save(QEMUFile *f, void *opaque)
  CPUState *env = opaque;
  
  if (kvm_enabled()) {
 -kvm_save_registers(env);
 +kvm_arch_save_regs(env);
  kvm_arch_save_mpstate(env);
  }
  }
 @@ -19,7 +19,7 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
  CPUState *env = opaque;
  
  if (kvm_enabled()) {
 -kvm_load_registers(env);
 +kvm_arch_load_regs(env);
  kvm_arch_load_mpstate(env);
  }
  return 0;
 -- 
 1.6.0.2

 --
   Gleb.

Jan

--

[Qemu-devel] Re: [PATCH 00/21] qemu-kvm: Hook cleanups and extended use of upstream code

2010-02-02 Thread Jan Kiszka

Alexander Graf wrote:
 On 02.02.2010, at 09:18, Jan Kiszka wrote:
 
 Let's start with the overall stats:

 31 files changed, 274 insertions(+), 822 deletions(-)

 So this series drops far more than 500 lines of redundant code, moving
 qemu-kvm yet a bit closer to upstream.

 The other highlight is the simplification of synchronization between
 in-kernel and user space VCPU states. This area used to cause a lot of
 problems in the past because it was tricky to get things right,
 specifically during the multi-threaded startup. The new approach pushes
 all the sync work around reset and vmsave/load into generic code, not
 only removing the burden from developers of, say, in-kernel APIC
 support, but also dropping most of our kvm-specific hooks, especially in
 the qemu-kvm tree.

 While I tested this on various VMs around, and things look good so far,
 I wouldn't be surprised if there are some regressions remaining,
 specifically in the non-x86 parts that I wasn't able to test or even
 build. Please have a careful look!
 
 The good news on that part is that apart from IA64, all other archs are 
 broken in qemu-kvm anyways, but work on upstream qemu. So moving towards 
 upstream definitely helps here.
 

OK, then you probably want my corresponding uq/master series in order to
test. Will try to roll them out ASAP.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 0/8]: QMP feature negotiation support

2010-02-02 Thread Luiz Capitulino

On Tue, 02 Feb 2010 09:03:32 +0100
Markus Armbruster arm...@redhat.com wrote:

 Luiz Capitulino lcapitul...@redhat.com writes:
 
  On Mon, 01 Feb 2010 20:37:41 +0100
  Markus Armbruster arm...@redhat.com wrote:
 
  Luiz Capitulino lcapitul...@redhat.com writes:
  
   On Mon, 01 Feb 2010 18:08:27 +0100
   Markus Armbruster arm...@redhat.com wrote:
 [...]
   I don't doubt your design does the job.  I just think it's overly
   general.  I had something far more stupid in mind:
   
   client connects
   server - client: version  capability offer (one message)
 again:
   client - server: capability selection (one message)
   server - client: either okay or error (one message)
   if error goto again
   connection is now ready for commands
   
   No modes.  The distinct lack of generality is a design feature.
  
I like the simplicity and if we were allowed to change later I'd
   do it.
  
The question is if we will ever want features to be _configured_
   before the protocol is operational. In this case we'd need to
   pass feature arguments through the capability selection command,
   which will get ugly and hard to use/understand.
  
Mode oriented support doesn't have this limitation. Maybe we
   will never really use it, but it's safer.
  
  Capability selection could be done as an object where the name/value
  pairs are capability/argument.  If you need multiple arguments for a
  capability, make the capability's value an object.
 
   That's exactly what seems complicated to me, because besides performing
  two functions (enable/configure) some feature setup could require
  more commands to be done in a clear way.
 
 What do you mean by feature setup?  And how does it go beyond setting
 a bunch of parameters?
 
   The async messages setup in the previous series was an example of this.
 
 I don't remember the details.  Could you summarize?

 Not the best example, since we agreed async messages setup could be done
in operational mode, but in case other features require it:

1. The async message feature _and_ each async message were disabled by
   default
2. You could enable the async message feature with capability_enable
3. Then, each message had to be enabled separately with async_message_enable

 The use case here is: a feature needs to be configured before the
protocol is operational.

 It's possible to do this with a command like feature, but it'll get
bloated over time.
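
To make the comparison concrete, the simpler scheme quoted above (one offer, one selection, okay or error, retry on error) could be modelled on the server side roughly as follows. This is a sketch only: Session, the CAP_* bits and all function names are hypothetical, capabilities are reduced to plain on/off bits with no arguments, and refusing a second selection once the connection is ready is an assumption, not something stated in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; the names are illustrative only. */
enum {
    CAP_ASYNC_MESSAGES = 1u << 0,
    CAP_EXAMPLE_OTHER  = 1u << 1
};

/* Hypothetical per-connection state kept by the server. */
typedef struct {
    uint32_t offered;  /* capabilities announced in the greeting */
    uint32_t enabled;  /* capabilities the client selected */
    bool ready;        /* true once commands are accepted */
} Session;

/* Client connects: the server records what it offered. */
static void session_init(Session *s, uint32_t offered)
{
    s->offered = offered;
    s->enabled = 0;
    s->ready = false;
}

/* Handles one capability selection message. Returns true ("okay") and
 * marks the connection ready if every requested capability was offered.
 * Returns false ("error") otherwise and leaves the session untouched,
 * so the client can simply try again. */
static bool session_select(Session *s, uint32_t requested)
{
    if (s->ready) {
        return false;  /* assumption: negotiation happens only once */
    }
    if (requested & ~s->offered) {
        return false;  /* asked for something that was never offered */
    }
    s->enabled = requested;
    s->ready = true;
    return true;
}
```

Features that need per-item setup, like the async message example, are exactly what this model cannot express without extending the selection message, which is the trade-off being discussed.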

[Qemu-devel] Re: [RFC 0/2]: QMP DISK_ERROR event

2010-02-02 Thread Luiz Capitulino

On Tue, 02 Feb 2010 10:25:19 +0100
Kevin Wolf kw...@redhat.com wrote:

 Hi Luiz,
 
 Am 01.02.2010 19:07, schrieb Luiz Capitulino:
   Hi there,
  
   I've been requested by libvirt guys to add a QMP event for disk I/O errors,
  this is what this series is about.
  
   It's a RFC because I need feedback on the following:
  
  1. drive_get_on_error() is called on all disk errors, right?
 
 Well, yes, it is for all devices that support rerror/werror. But it also
 might be called in other situations. Look at the "get" in the function
 name: it's really a getter function and not an event handler.
 
  2. I've tested only ENOSPC errors, is there a way to test other errors? Like
  read ones?
 
 So you'll probably want some EIO. Some recent bugs I've been handling
 were about images on NFS when the NFS server went away. It's a
 reliable way to get EIO (mount with -osoft and small timeouts). I guess
 qemu-nbd and the nbd: protocol might work, too.
 
 Or maybe copy the start of a qcow2 image to a too small device.

 Thanks!

  3. Is this the right approach at all? :)
 
 Yes and no. As I said above, drive_get_on_error() is not the right place
 to do it. Unfortunately it looks like there isn't a single generic place
 where it can be done, but the call to the event handler must be added to
 every device.

 Can't it be added to subsystems? Like ide, virtio etc?

 Maybe in the same function that calls drive_get_on_error()?

[Qemu-devel] Re: [RFC 0/2]: QMP DISK_ERROR event

2010-02-02 Thread Kevin Wolf

Am 02.02.2010 13:17, schrieb Luiz Capitulino:
 3. Is this the right approach at all? :)

 Yes and no. As I said above, drive_get_on_error() is not the right place
 to do it. Unfortunately it looks like there isn't a single generic place
 where it can be done, but the call to the event handler must be added to
 every device.
 
  Can't it be added to subsystems? Like ide, virtio etc?
 
  Maybe in the same function that calls drive_get_on_error()?

This is what I meant by devices, yes. Putting it into the same function
sounds good, too.

Kevin

[Qemu-devel] Re: [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization

2010-02-02 Thread Gleb Natapov

On Tue, Feb 02, 2010 at 09:19:01AM +0100, Jan Kiszka wrote:
 Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
 properly synchronize with halted in the accessor functions.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/apic.c |7 
  qemu-kvm-ia64.c   |4 ++-
  qemu-kvm-x86.c|   88 +++-
  qemu-kvm.c|   30 -
  qemu-kvm.h|   15 
  target-i386/machine.c |6 ---
  target-ia64/machine.c |3 ++
  7 files changed, 55 insertions(+), 98 deletions(-)
 
 diff --git a/hw/apic.c b/hw/apic.c
 index 3e03e10..092c61e 100644
 --- a/hw/apic.c
 +++ b/hw/apic.c
 @@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env)
  s-wait_for_sipi = 1;
  
  env-halted = !(s-apicbase  MSR_IA32_APICBASE_BSP);
 -#ifdef KVM_CAP_MP_STATE
 -if (kvm_enabled()  kvm_irqchip_in_kernel()) {
 -env-mp_state
 -= env-halted ? KVM_MP_STATE_UNINITIALIZED : 
 KVM_MP_STATE_RUNNABLE;
 -kvm_load_mpstate(env);
 -}
 -#endif
  }
  
  static void apic_startup(APICState *s, int vector_num)
 diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c
 index fc8110e..39bcbeb 100644
 --- a/qemu-kvm-ia64.c
 +++ b/qemu-kvm-ia64.c
 @@ -124,7 +124,9 @@ void kvm_arch_cpu_reset(CPUState *env)
  {
  if (kvm_irqchip_in_kernel(kvm_context)) {
  #ifdef KVM_CAP_MP_STATE
 - kvm_reset_mpstate(env-kvm_cpu_state.vcpu_ctx);
 +struct kvm_mp_state mp_state = {.mp_state = 
 KVM_MP_STATE_UNINITIALIZED
 +};
 +kvm_set_mpstate(env, mp_state);
  #endif
  } else {
   env-interrupt_request = ~CPU_INTERRUPT_HARD;
 diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
 index 63cd095..6b5895f 100644
 --- a/qemu-kvm-x86.c
 +++ b/qemu-kvm-x86.c
 @@ -754,6 +754,48 @@ static int get_msr_entry(struct kvm_msr_entry *entry, 
 CPUState *env)
  return 0;
  }
  
 +static void kvm_arch_save_mpstate(CPUState *env)
 +{
 +#ifdef KVM_CAP_MP_STATE
 +int r;
 +struct kvm_mp_state mp_state;
 +
 +r = kvm_get_mpstate(env, mp_state);
 +if (r  0) {
 +env-mp_state = -1;
 +} else {
 +env-mp_state = mp_state.mp_state;
 +if (kvm_irqchip_in_kernel()) {
 +env-halted = (env-mp_state == KVM_MP_STATE_HALTED);
 +}
 +}
 +#else
 +env-mp_state = -1;
 +#endif
 +}
 +
 +static void kvm_arch_load_mpstate(CPUState *env)
 +{
 +#ifdef KVM_CAP_MP_STATE
 +struct kvm_mp_state mp_state;
 +
 +/*
 + * -1 indicates that the host did not support GET_MP_STATE ioctl,
 + *  so don't touch it.
 + */
 +if (env-mp_state != -1) {
 +if (kvm_irqchip_in_kernel()) {
 +env-mp_state = env-halted ? KVM_MP_STATE_UNINITIALIZED :
 +  KVM_MP_STATE_RUNNABLE;
When irqchip is in kernel, env->halted doesn't contain any relevant
information, so this is incorrect. Actually, env->halted is updated only
to show the correct cpu state during info cpus.

 +/* Avoid deadlock: no user space IRQ will ever clear it. */
And this comment explains why looking at env->halted when irqchip is in
kernel is wrong :)

 +env-halted = 0;
 +}
 +mp_state.mp_state = env-mp_state;
 +kvm_set_mpstate(env, mp_state);
 +}
 +#endif
 +}
 +
  static void set_v8086_seg(struct kvm_segment *lhs, const SegmentCache *rhs)
  {
  lhs-selector = rhs-selector;
 @@ -926,6 +968,10 @@ void kvm_arch_load_regs(CPUState *env, int level)
  rc = kvm_set_msrs(env, msrs, n);
  if (rc == -1)
  perror(kvm_set_msrs FAILED);
 +
 +if (level = KVM_PUT_RESET_STATE) {
 +kvm_arch_load_mpstate(env);
 +}
  }
  
  void kvm_load_tsc(CPUState *env)
 @@ -940,36 +986,6 @@ void kvm_load_tsc(CPUState *env)
  perror(kvm_set_tsc FAILED.\n);
  }
  
 -void kvm_arch_save_mpstate(CPUState *env)
 -{
 -#ifdef KVM_CAP_MP_STATE
 -int r;
 -struct kvm_mp_state mp_state;
 -
 -r = kvm_get_mpstate(env, mp_state);
 -if (r  0)
 -env-mp_state = -1;
 -else
 -env-mp_state = mp_state.mp_state;
 -#else
 -env-mp_state = -1;
 -#endif
 -}
 -
 -void kvm_arch_load_mpstate(CPUState *env)
 -{
 -#ifdef KVM_CAP_MP_STATE
 -struct kvm_mp_state mp_state = { .mp_state = env-mp_state };
 -
 -/*
 - * -1 indicates that the host did not support GET_MP_STATE ioctl,
 - *  so don't touch it.
 - */
 -if (env-mp_state != -1)
 -kvm_set_mpstate(env, mp_state);
 -#endif
 -}
 -
  void kvm_arch_save_regs(CPUState *env)
  {
  struct kvm_regs regs;
 @@ -1366,15 +1382,9 @@ void kvm_arch_cpu_reset(CPUState *env)
  {
  kvm_arch_reset_vcpu(env);
  kvm_put_vcpu_events(env);
 -if (!cpu_is_bsp(env)) {
 - if (kvm_irqchip_in_kernel()) {
 -#ifdef KVM_CAP_MP_STATE
 - kvm_reset_mpstate(env);
 -#endif
 - } else {
 - env-interrupt_request = ~CPU_INTERRUPT_HARD;
 - env-halted = 1;
 - }
 +if

[Qemu-devel] Re: [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization

2010-02-02 Thread Jan Kiszka

Gleb Natapov wrote:
 On Tue, Feb 02, 2010 at 09:19:01AM +0100, Jan Kiszka wrote:
 Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
 properly synchronize with halted in the accessor functions.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/apic.c |7 
  qemu-kvm-ia64.c   |4 ++-
  qemu-kvm-x86.c|   88 
 +++-
  qemu-kvm.c|   30 -
  qemu-kvm.h|   15 
  target-i386/machine.c |6 ---
  target-ia64/machine.c |3 ++
  7 files changed, 55 insertions(+), 98 deletions(-)

 diff --git a/hw/apic.c b/hw/apic.c
 index 3e03e10..092c61e 100644
 --- a/hw/apic.c
 +++ b/hw/apic.c
 @@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env)
  s-wait_for_sipi = 1;
  
  env-halted = !(s-apicbase  MSR_IA32_APICBASE_BSP);
 -#ifdef KVM_CAP_MP_STATE
 -if (kvm_enabled()  kvm_irqchip_in_kernel()) {
 -env-mp_state
 -= env-halted ? KVM_MP_STATE_UNINITIALIZED : 
 KVM_MP_STATE_RUNNABLE;
 -kvm_load_mpstate(env);
 -}
 -#endif
  }
  
  static void apic_startup(APICState *s, int vector_num)
 diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c
 index fc8110e..39bcbeb 100644
 --- a/qemu-kvm-ia64.c
 +++ b/qemu-kvm-ia64.c
 @@ -124,7 +124,9 @@ void kvm_arch_cpu_reset(CPUState *env)
  {
  if (kvm_irqchip_in_kernel(kvm_context)) {
  #ifdef KVM_CAP_MP_STATE
 -kvm_reset_mpstate(env-kvm_cpu_state.vcpu_ctx);
 +struct kvm_mp_state mp_state = {.mp_state = 
 KVM_MP_STATE_UNINITIALIZED
 +};
 +kvm_set_mpstate(env, mp_state);
  #endif
  } else {
  env-interrupt_request = ~CPU_INTERRUPT_HARD;
 diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
 index 63cd095..6b5895f 100644
 --- a/qemu-kvm-x86.c
 +++ b/qemu-kvm-x86.c
 @@ -754,6 +754,48 @@ static int get_msr_entry(struct kvm_msr_entry *entry, 
 CPUState *env)
  return 0;
  }
  
 +static void kvm_arch_save_mpstate(CPUState *env)
 +{
 +#ifdef KVM_CAP_MP_STATE
 +int r;
 +struct kvm_mp_state mp_state;
 +
 +r = kvm_get_mpstate(env, mp_state);
 +if (r  0) {
 +env-mp_state = -1;
 +} else {
 +env-mp_state = mp_state.mp_state;
 +if (kvm_irqchip_in_kernel()) {
 +env-halted = (env-mp_state == KVM_MP_STATE_HALTED);
 +}
 +}
 +#else
 +env-mp_state = -1;
 +#endif
 +}
 +
 +static void kvm_arch_load_mpstate(CPUState *env)
 +{
 +#ifdef KVM_CAP_MP_STATE
 +struct kvm_mp_state mp_state;
 +
 +/*
 + * -1 indicates that the host did not support GET_MP_STATE ioctl,
 + *  so don't touch it.
 + */
 +if (env-mp_state != -1) {
 +if (kvm_irqchip_in_kernel()) {
 +env-mp_state = env-halted ? KVM_MP_STATE_UNINITIALIZED :
 +  KVM_MP_STATE_RUNNABLE;
 When irqchip is in kernel, env->halted doesn't contain any relevant
 information, so this is incorrect. Actually, env->halted is updated only
 to show the correct cpu state during info cpus.

OK, copied from apic_init_reset, see above. So that hunk was probably at
least useless, and now it's harmful. Will drop this and only sync from
mp_state -> halted.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization

2010-02-02 Thread Gleb Natapov

On Tue, Feb 02, 2010 at 01:31:50PM +0100, Jan Kiszka wrote:
 Gleb Natapov wrote:
  On Tue, Feb 02, 2010 at 09:19:01AM +0100, Jan Kiszka wrote:
  Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
  properly synchronize with halted in the accessor functions.
 
  Signed-off-by: Jan Kiszka jan.kis...@siemens.com
  ---
   hw/apic.c |7 
   qemu-kvm-ia64.c   |4 ++-
   qemu-kvm-x86.c|   88 
  +++-
   qemu-kvm.c|   30 -
   qemu-kvm.h|   15 
   target-i386/machine.c |6 ---
   target-ia64/machine.c |3 ++
   7 files changed, 55 insertions(+), 98 deletions(-)
 
  diff --git a/hw/apic.c b/hw/apic.c
  index 3e03e10..092c61e 100644
  --- a/hw/apic.c
  +++ b/hw/apic.c
  @@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env)
   s-wait_for_sipi = 1;
   
   env-halted = !(s-apicbase  MSR_IA32_APICBASE_BSP);
  -#ifdef KVM_CAP_MP_STATE
  -if (kvm_enabled()  kvm_irqchip_in_kernel()) {
  -env-mp_state
  -= env-halted ? KVM_MP_STATE_UNINITIALIZED : 
  KVM_MP_STATE_RUNNABLE;
  -kvm_load_mpstate(env);
  -}
  -#endif
   }
   
   static void apic_startup(APICState *s, int vector_num)
  diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c
  index fc8110e..39bcbeb 100644
  --- a/qemu-kvm-ia64.c
  +++ b/qemu-kvm-ia64.c
  @@ -124,7 +124,9 @@ void kvm_arch_cpu_reset(CPUState *env)
   {
   if (kvm_irqchip_in_kernel(kvm_context)) {
   #ifdef KVM_CAP_MP_STATE
  -  kvm_reset_mpstate(env-kvm_cpu_state.vcpu_ctx);
  +struct kvm_mp_state mp_state = {.mp_state = 
  KVM_MP_STATE_UNINITIALIZED
  +};
  +kvm_set_mpstate(env, mp_state);
   #endif
   } else {
 env-interrupt_request = ~CPU_INTERRUPT_HARD;
  diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
  index 63cd095..6b5895f 100644
  --- a/qemu-kvm-x86.c
  +++ b/qemu-kvm-x86.c
  @@ -754,6 +754,48 @@ static int get_msr_entry(struct kvm_msr_entry *entry, 
  CPUState *env)
   return 0;
   }
   
  +static void kvm_arch_save_mpstate(CPUState *env)
  +{
  +#ifdef KVM_CAP_MP_STATE
  +int r;
  +struct kvm_mp_state mp_state;
  +
  +r = kvm_get_mpstate(env, mp_state);
  +if (r  0) {
  +env-mp_state = -1;
  +} else {
  +env-mp_state = mp_state.mp_state;
  +if (kvm_irqchip_in_kernel()) {
  +env-halted = (env-mp_state == KVM_MP_STATE_HALTED);
  +}
  +}
  +#else
  +env-mp_state = -1;
  +#endif
  +}
  +
  +static void kvm_arch_load_mpstate(CPUState *env)
  +{
  +#ifdef KVM_CAP_MP_STATE
  +struct kvm_mp_state mp_state;
  +
  +/*
  + * -1 indicates that the host did not support GET_MP_STATE ioctl,
  + *  so don't touch it.
  + */
  +if (env-mp_state != -1) {
  +if (kvm_irqchip_in_kernel()) {
  +env-mp_state = env-halted ? KVM_MP_STATE_UNINITIALIZED :
  +  KVM_MP_STATE_RUNNABLE;
  When irqchip is in kernel env-halted doesn't contain any relevant
  information, so this is incorrect. Actually env-halted is updated only
  to show correct cpu state during info cpus.
 
 OK, copied from apic_init_reset, see above. So that hunk was probably at
 least useless, and now it's harmfull. Will drop this and only sync from
 mp_state - halted.
 
It was not useless in apic_init_reset; it was a shortcut for:
env->mp_state = !(s->apicbase & MSR_IA32_APICBASE_BSP) ?
KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE;

On reset the BSP VCPU should set env->mp_state to KVM_MP_STATE_RUNNABLE and
all others to KVM_MP_STATE_UNINITIALIZED.
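
A minimal sketch of that reset rule, using local stand-in definitions rather
than the real QEMU/KVM headers. reset_mp_state() is a hypothetical helper, and
the constants are redefined here only so the snippet compiles on its own:

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins so the sketch compiles without QEMU or kernel headers. */
#define MSR_IA32_APICBASE_BSP      (1u << 8)  /* BSP flag in the APIC base MSR */
#define KVM_MP_STATE_RUNNABLE      0
#define KVM_MP_STATE_UNINITIALIZED 1

/* Hypothetical helper: pick the mp_state a VCPU should have after reset. */
static int reset_mp_state(uint32_t apicbase)
{
    /* Only the bootstrap processor starts runnable; the others wait. */
    return (apicbase & MSR_IA32_APICBASE_BSP) ? KVM_MP_STATE_RUNNABLE
                                              : KVM_MP_STATE_UNINITIALIZED;
}
```

The point of the sketch is that the decision depends only on the BSP bit in
apicbase, not on env->halted.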

--
Gleb.

[Qemu-devel] Re: [PATCH 15/21] qemu-kvm: Clean up mpstate synchronization

2010-02-02 Thread Jan Kiszka

Gleb Natapov wrote:
 On Tue, Feb 02, 2010 at 01:31:50PM +0100, Jan Kiszka wrote:
 Gleb Natapov wrote:
 On Tue, Feb 02, 2010 at 09:19:01AM +0100, Jan Kiszka wrote:
 Push mpstate reading/writing into kvm_arch_load/save_regs and, on x86,
 properly synchronize with halted in the accessor functions.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/apic.c |7 
  qemu-kvm-ia64.c   |4 ++-
  qemu-kvm-x86.c|   88 
 +++-
  qemu-kvm.c|   30 -
  qemu-kvm.h|   15 
  target-i386/machine.c |6 ---
  target-ia64/machine.c |3 ++
  7 files changed, 55 insertions(+), 98 deletions(-)

 diff --git a/hw/apic.c b/hw/apic.c
 index 3e03e10..092c61e 100644
 --- a/hw/apic.c
 +++ b/hw/apic.c
 @@ -507,13 +507,6 @@ void apic_init_reset(CPUState *env)
  s-wait_for_sipi = 1;
  
  env-halted = !(s-apicbase  MSR_IA32_APICBASE_BSP);
 -#ifdef KVM_CAP_MP_STATE
 -if (kvm_enabled()  kvm_irqchip_in_kernel()) {
 -env-mp_state
 -= env-halted ? KVM_MP_STATE_UNINITIALIZED : 
 KVM_MP_STATE_RUNNABLE;
 -kvm_load_mpstate(env);
 -}
 -#endif
  }
  
  static void apic_startup(APICState *s, int vector_num)
 diff --git a/qemu-kvm-ia64.c b/qemu-kvm-ia64.c
 index fc8110e..39bcbeb 100644
 --- a/qemu-kvm-ia64.c
 +++ b/qemu-kvm-ia64.c
 @@ -124,7 +124,9 @@ void kvm_arch_cpu_reset(CPUState *env)
  {
  if (kvm_irqchip_in_kernel(kvm_context)) {
  #ifdef KVM_CAP_MP_STATE
 -  kvm_reset_mpstate(env-kvm_cpu_state.vcpu_ctx);
 +struct kvm_mp_state mp_state = {.mp_state = 
 KVM_MP_STATE_UNINITIALIZED
 +};
 +kvm_set_mpstate(env, mp_state);
  #endif
  } else {
env-interrupt_request = ~CPU_INTERRUPT_HARD;
 diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
 index 63cd095..6b5895f 100644
 --- a/qemu-kvm-x86.c
 +++ b/qemu-kvm-x86.c
 @@ -754,6 +754,48 @@ static int get_msr_entry(struct kvm_msr_entry *entry, 
 CPUState *env)
  return 0;
  }
  
 +static void kvm_arch_save_mpstate(CPUState *env)
 +{
 +#ifdef KVM_CAP_MP_STATE
 +int r;
 +struct kvm_mp_state mp_state;
 +
 +r = kvm_get_mpstate(env, mp_state);
 +if (r  0) {
 +env-mp_state = -1;
 +} else {
 +env-mp_state = mp_state.mp_state;
 +if (kvm_irqchip_in_kernel()) {
 +env-halted = (env-mp_state == KVM_MP_STATE_HALTED);
 +}
 +}
 +#else
 +env-mp_state = -1;
 +#endif
 +}
 +
 +static void kvm_arch_load_mpstate(CPUState *env)
 +{
 +#ifdef KVM_CAP_MP_STATE
 +struct kvm_mp_state mp_state;
 +
 +/*
 + * -1 indicates that the host did not support GET_MP_STATE ioctl,
 + *  so don't touch it.
 + */
 +if (env-mp_state != -1) {
 +if (kvm_irqchip_in_kernel()) {
 +env-mp_state = env-halted ? KVM_MP_STATE_UNINITIALIZED :
 +  KVM_MP_STATE_RUNNABLE;
 When irqchip is in kernel env-halted doesn't contain any relevant
 information, so this is incorrect. Actually env-halted is updated only
 to show correct cpu state during info cpus.
 OK, copied from apic_init_reset, see above. So that hunk was probably at
 least useless, and now it's harmfull. Will drop this and only sync from
 mp_state - halted.

 It was not useless in apic_init_reset it was a shortcut for:
 env-mp_state = !(s-apicbase  MSR_IA32_APICBASE_BSP) ? 
 KVM_MP_STATE_UNINITIALIZED : KVM_MP_STATE_RUNNABLE;
 
 On reset BSP VCPU should set env-mp_state to KVM_MP_STATE_RUNNABLE and
 all others to KVM_MP_STATE_UNINITIALIZED.

OK, that belongs in the kvm vcpu init code then - less encrypted.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] system_reset command cause assert failed

2010-02-02 Thread Luiz Capitulino

On Tue, 2 Feb 2010 09:35:16 +0800
Roy Tam roy...@gmail.com wrote:

 2010/2/2 Luiz Capitulino lcapitul...@redhat.com:
  On Tue, 2 Feb 2010 00:26:53 +0800
  Roy Tam roy...@gmail.com wrote:
 
  2010/2/2 Luiz Capitulino lcapitul...@redhat.com:
 
Hm, I'm puzzled. Is this failing on malloc()? At least qemu_malloc()
   is the last qemu's function I see in the logs.
  
From now on I only see msvcrt functions...
  
Maybe, you can type run on gdb, run system_reset on the
   Monitor and then switch back to gdb and type bt?
  
  source-less debugging seems better...
 
   As far as I can understand something bad happens while the parser
  is processing the first ' character of the qobject_from_jsonf()
  call in monitor.c:4524.
 
   Strange. Can you try 'info pci', 'info block' and 'info version'?
  Do they work?
 
   Maybe this is a refcount problem?
 
   Anthony, could you take a look too please?
 
 
 rebuild with -gstabs -O1, you can see double free here:

 Ok, so we have a double free and

 #0  qobject_to_qdict (obj=0x0) at qobject.h:108
 #1  0x004127ae in pci_device_print (mon=0x494c460, device=0x49696c0)
 at /home/roy/qemu/hw/pci.c:1165

 a segfault.

 I don't know what's happening, I'll have to run QEMU on windows and
try to reproduce it.

Re: [Qemu-devel] [ANNOUNCE] New qemu.org website

2010-02-02 Thread Anthony Liguori


On 02/02/2010 12:05 AM, Mulyadi Santosa wrote:

On Mon, Feb 1, 2010 at 9:26 PM, Anthony Liguori anth...@codemonkey.ws wrote:
   

Hi,

The new qemu.org wiki is now live.  I've transferred all of the content from
the old website and have now switched www.qemu.org to redirect to
wiki.qemu.org.
 

Hi Anthony...

Right now, February 2nd, 1:02 PM GMT+7 (Indonesian time), I get time
out when accessing qemu.org. Is it down for maintenance?
   


Since going live, there's a memory leak on the system that's resulting 
in the OOM killer going off.  I've got some tracking in place right now 
that should help us get to the bottom of it.


So apologies if the site has some down time over the next couple days as 
we figure this issue out.


Regards,

Anthony Liguori

[Qemu-devel] usb-host quirks

2010-02-02 Thread Michael Buesch

Hi,

I've got a buggy device that needs a special workaround to be usable under
host-usb access. The device really doesn't like being reset via USBDEVFS_RESET.
It immediately locks up the device firmware or whatever. It won't respond
properly anymore.
With the following patch it works fine, though.

So I was wondering what the accepted way was to get these quirks upstream into
the qemu source tree. Is usb-linux.c the correct place, or should we put the
quirk into a different place?

---
 usb-linux.c |4 
 1 file changed, 4 insertions(+)

--- qemu.orig/usb-linux.c
+++ qemu/usb-linux.c
@@ -389,6 +389,10 @@ static void usb_host_handle_reset(USBDev
 
     dprintf("husb: reset device %u.%u\n", s->bus_num, s->addr);
 
+    if (((s->descr[8] << 8) | s->descr[9]) == 0x2471 &&
+        ((s->descr[10] << 8) | s->descr[11]) == 0x0853)
+        return;
+
     ioctl(s->fd, USBDEVFS_RESET);
 
     usb_host_claim_interfaces(s, s->configuration);

-- 
Greetings, Michael.

[Qemu-devel] Re: [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id

2010-02-02 Thread Gleb Natapov

On Tue, Feb 02, 2010 at 09:19:06AM +0100, Jan Kiszka wrote:
 Setting the boot CPU ID is arch-specific KVM stuff. So push it where it
 belongs to.
 
pc_init1 is also arch-specific, no? TCG should also be able to
have BSP apic_id != 0.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/pc.c|3 ---
  qemu-kvm-x86.c |3 ++-
  2 files changed, 2 insertions(+), 4 deletions(-)
 
 diff --git a/hw/pc.c b/hw/pc.c
 index 6c15a9f..3df6195 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -803,9 +803,6 @@ static void pc_init1(ram_addr_t ram_size,
  #endif
  }
  
 -if (kvm_enabled()) {
 -kvm_set_boot_cpu_id(0);
 -}
  for (i = 0; i  smp_cpus; i++) {
  env = pc_new_cpu(cpu_model);
  }
 diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
 index 9de018e..0f34451 100644
 --- a/qemu-kvm-x86.c
 +++ b/qemu-kvm-x86.c
 @@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void)
  if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK))
  vmstate_register(0, vmstate_kvmclock, kvmclock_data);
  #endif
 -return 0;
 +
 +return kvm_set_boot_cpu_id(0);
  }
  
  static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
 -- 
 1.6.0.2
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Gleb.

[Qemu-devel] Re: [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id

2010-02-02 Thread Jan Kiszka

Gleb Natapov wrote:
 On Tue, Feb 02, 2010 at 09:19:06AM +0100, Jan Kiszka wrote:
 Setting the boot CPU ID is arch-specific KVM stuff. So push it where it
 belongs to.

 pc_init1 is also arch-specific, no? TCG should also be able to
 have BSP apic_id != 0.

But not kvm-specific.

I don't understand your second remark. Can you help me how TCG is
affected by kvm_set_boot_cpu_id?

 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/pc.c|3 ---
  qemu-kvm-x86.c |3 ++-
  2 files changed, 2 insertions(+), 4 deletions(-)

 diff --git a/hw/pc.c b/hw/pc.c
 index 6c15a9f..3df6195 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -803,9 +803,6 @@ static void pc_init1(ram_addr_t ram_size,
  #endif
  }
  
 -if (kvm_enabled()) {
 -kvm_set_boot_cpu_id(0);
 -}
  for (i = 0; i  smp_cpus; i++) {
  env = pc_new_cpu(cpu_model);
  }
 diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
 index 9de018e..0f34451 100644
 --- a/qemu-kvm-x86.c
 +++ b/qemu-kvm-x86.c
 @@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void)
  if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK))
  vmstate_register(0, vmstate_kvmclock, kvmclock_data);
  #endif
 -return 0;
 +
 +return kvm_set_boot_cpu_id(0);
  }
  
  static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
 -- 
 1.6.0.2

 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 --
   Gleb.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH] qcow2: Fix signedness bugs

2010-02-02 Thread Kevin Wolf

Checking for return codes < 0 isn't really going to work with unsigned
types. Use signed types instead.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-cluster.c |   12 ++--
 block/qcow2.h |6 ++
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4e30d16..3501a94 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -219,7 +219,8 @@ static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index)
     BDRVQcowState *s = bs->opaque;
     int min_index;
     uint64_t old_l2_offset;
-    uint64_t *l2_table, l2_offset;
+    uint64_t *l2_table;
+    int64_t l2_offset;
 
     old_l2_offset = s->l1_table[l1_index];
 
@@ -560,7 +561,8 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
 {
     BDRVQcowState *s = bs->opaque;
     int l2_index, ret;
-    uint64_t l2_offset, *l2_table, cluster_offset;
+    uint64_t l2_offset, *l2_table;
+    int64_t cluster_offset;
     int nb_csectors;
 
     ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index);
@@ -704,10 +706,8 @@ err:
  *
  * Return 0 on success and -errno in error cases
  */
-uint64_t qcow2_alloc_cluster_offset(BlockDriverState *bs,
-                                    uint64_t offset,
-                                    int n_start, int n_end,
-                                    int *num, QCowL2Meta *m)
+int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
+    int n_start, int n_end, int *num, QCowL2Meta *m)
 {
     BDRVQcowState *s = bs->opaque;
     int l2_index, ret;
diff --git a/block/qcow2.h b/block/qcow2.h
index d9ea6ab..de9397a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -192,10 +192,8 @@ void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
 
 uint64_t qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
 int *num);
-uint64_t qcow2_alloc_cluster_offset(BlockDriverState *bs,
-  uint64_t offset,
-  int n_start, int n_end,
-  int *num, QCowL2Meta *m);
+int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
+int n_start, int n_end, int *num, QCowL2Meta *m);
 uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
  uint64_t offset,
  int compressed_size);
-- 
1.6.5.2
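
The class of bug this patch addresses can be reproduced outside QEMU. In the
sketch below, alloc_offset() is a made-up stand-in for an allocator that
returns either an offset or -errno; it is not a qcow2 function, and the two
checker functions exist only to contrast the unsigned and signed variants:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Made-up allocator stand-in: returns an offset, or -ENOSPC on failure. */
static int64_t alloc_offset(int fail)
{
    return fail ? -ENOSPC : 0x10000;
}

/* Buggy pattern: the result lands in an unsigned variable. */
static int failed_unsigned(int fail)
{
    uint64_t offset = alloc_offset(fail);
    /* -ENOSPC became a huge positive value, so this is never true. */
    return offset < 0;
}

/* Fixed pattern: a signed variable keeps the negative error code. */
static int failed_signed(int fail)
{
    int64_t offset = alloc_offset(fail);
    return offset < 0;
}
```

Some compilers flag the unsigned comparison as always false when extra
warnings are enabled, which is one way such bugs get noticed.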

[Qemu-devel] Re: [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id

2010-02-02 Thread Gleb Natapov

On Tue, Feb 02, 2010 at 03:20:02PM +0100, Jan Kiszka wrote:
 Gleb Natapov wrote:
  On Tue, Feb 02, 2010 at 09:19:06AM +0100, Jan Kiszka wrote:
  Setting the boot CPU ID is arch-specific KVM stuff. So push it where it
  belongs to.
 
  pc_init1 is also arch-specific, no? TCG should also be able to
  have BSP apic_id != 0.
 
 But not kvm-specific.
 
 I don't understand your second remark. Can you help me how TCG is
 affected by kvm_set_boot_cpu_id?
 
It is not affected right now. It assumes that the apic ID of the BSP cpu is 0,
but this limitation does not exist on real HW. So when QEMU is fixed
and it becomes possible to configure which CPU is the BSP, this will be the
place to do it.

  
  Signed-off-by: Jan Kiszka jan.kis...@siemens.com
  ---
   hw/pc.c|3 ---
   qemu-kvm-x86.c |3 ++-
   2 files changed, 2 insertions(+), 4 deletions(-)
 
  diff --git a/hw/pc.c b/hw/pc.c
  index 6c15a9f..3df6195 100644
  --- a/hw/pc.c
  +++ b/hw/pc.c
  @@ -803,9 +803,6 @@ static void pc_init1(ram_addr_t ram_size,
   #endif
   }
   
  -if (kvm_enabled()) {
  -kvm_set_boot_cpu_id(0);
  -}
   for (i = 0; i  smp_cpus; i++) {
   env = pc_new_cpu(cpu_model);
   }
  diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
  index 9de018e..0f34451 100644
  --- a/qemu-kvm-x86.c
  +++ b/qemu-kvm-x86.c
  @@ -695,7 +695,8 @@ int kvm_arch_qemu_create_context(void)
   if (kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK))
   vmstate_register(0, vmstate_kvmclock, kvmclock_data);
   #endif
  -return 0;
  +
  +return kvm_set_boot_cpu_id(0);
   }
   
   static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
  -- 
  1.6.0.2
 
  --
  To unsubscribe from this list: send the line unsubscribe kvm in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
  
  --
  Gleb.
 
 Jan
 
 -- 
 Siemens AG, Corporate Technology, CT T DE IT 1
 Corporate Competence Center Embedded Linux

--
Gleb.

[Qemu-devel] Re: [PATCH 20/21] qemu-kvm: Move kvm_set_boot_cpu_id

2010-02-02 Thread Jan Kiszka

Gleb Natapov wrote:
 On Tue, Feb 02, 2010 at 03:20:02PM +0100, Jan Kiszka wrote:
 Gleb Natapov wrote:
 On Tue, Feb 02, 2010 at 09:19:06AM +0100, Jan Kiszka wrote:
 Setting the boot CPU ID is arch-specific KVM stuff. So push it where it
 belongs to.

 pc_init1 is also arch-specific, no? TCG should also be able to
 have BSP apic_id != 0.
 But not kvm-specific.

 I don't understand your second remark. Can you help me how TCG is
 affected by kvm_set_boot_cpu_id?

 It is not affected right now. It assumes that apic ID of BSP cpu is 0,
 but this limitation does not exists on real HW. So when QEMU will be fixed
 and it will be possible to configure what CPU is BSP this will be the
 pace to do it.

That day pc_init1 (or whatever x86 part) will set the bsp number
somewhere in env or apicstate, and we will transfer that afterwards to kvm.

The point is that kvm_* belongs into kvm[-all].c as far as possible. And
in this case it is possible.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 0/8]: QMP feature negotiation support

2010-02-02 Thread Markus Armbruster

Luiz Capitulino lcapitul...@redhat.com writes:

 On Tue, 02 Feb 2010 09:03:32 +0100
 Markus Armbruster arm...@redhat.com wrote:

 Luiz Capitulino lcapitul...@redhat.com writes:
 
  On Mon, 01 Feb 2010 20:37:41 +0100
  Markus Armbruster arm...@redhat.com wrote:
 
  Luiz Capitulino lcapitul...@redhat.com writes:
  
   On Mon, 01 Feb 2010 18:08:27 +0100
   Markus Armbruster arm...@redhat.com wrote:
 [...]
   I don't doubt your design does the job.  I just think it's overly
   general.  I had something far more stupid in mind:
   
   client connects
   server - client: version  capability offer (one message)
 again:
   client - server: capability selection (one message)
   server - client: either okay or error (one message)
   if error goto again
   connection is now ready for commands
   
   No modes.  The distinct lack of generality is a design feature.
  
I like the simplicity and if we were allowed to change later I'd
   do it.
  
The question is if we will ever want features to be _configured_
   before the protocol is operational. In this case we'd need to
   pass feature arguments through the capability selection command,
   which will get ugly and hard to use/understand.
  
Mode oriented support doesn't have this limitation. Maybe we
   won't never really use it, but it's safer.
  
  Capability selection could be done as an object where the name/value
  pairs are capability/argument.  If you need multiple arguments for a
  capability, make the capability's value an object.
 
   That's exactly what seems complicated to me, because besides performing
  two functions (enable/configure) some feature setup could require
  more commands to be done in a clear way.
 
 What do you mean by feature setup?  And how does it go beyond setting
 a bunch of parameters?
 
   The async messages setup in the previous series was an example of this.
 
 I don't remember the details.  Could you summarize?

  Not the best example since we agreed async messages setup could be done
 in operational mode, but in case other features will require it:

 1. The async message feature _and_ each async message were disabled by
default
 2. You could enable async message feature with capability_enable
 3. Then, each message should be enabled separately with async_message_enable

  The use case here is: a feature requires to be configured before the
 protocol is operational.

Okay, let's pretend for the sake of the argument that async message
enable/disable is core protocol, and thus needs to be controlled via
capabilities.

An obvious way is to have one capability for every enable/disable
switch.  The server's capability offer lists them all, and the client's
capability selection includes the one it wants.

What if this leads to dozens of capabilities?  It's a machine protocol,
and a machine can cope with sixty capabilities just as well as with six.
Six hundred would be kind of ugly, though.

If we absolutely insist on controlling async messages with a single
capability, things get slightly more complex.  The capability now sports
an object value, with a member for each enable/disable switch.  The
client's capability selection supplies such an object value.

If we're worried about discoverability, we can make the server's
capability offer include a description of each capability's value.
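
To make the simple scheme concrete, here is a rough sketch of the server-side
check for one capability selection message. The capability names and the
check_selection() helper are made up for illustration; nothing here is actual
QMP code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capability names offered by the server; not real QMP ones. */
static const char *const offered[] = { "async-msg", "timestamps" };
static const size_t n_offered = sizeof(offered) / sizeof(offered[0]);

/* Return 1 if the server offered the named capability, 0 otherwise. */
static int is_offered(const char *name)
{
    for (size_t i = 0; i < n_offered; i++) {
        if (strcmp(offered[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Validate one capability selection message from the client.
 * Returns 0 ("okay", connection ready for commands) or -1 ("error",
 * the client may send another selection).
 */
static int check_selection(const char *const *selected, size_t n_selected)
{
    for (size_t i = 0; i < n_selected; i++) {
        if (!is_offered(selected[i])) {
            return -1;
        }
    }
    return 0;
}
```

One switch per capability keeps the check a plain membership test; an object
value per capability would need a per-capability validator on top of this.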

And now let's quit pretending, and remind ourselves that capabilities
are for variations of the core protocol.  Do we really expect the core
protocol to become so baroque that we'll need a full-blown configuration
mode?

  It's possible to do this with a command like feature, but it'll get
 bloated over time.

I doubt it.

[Qemu-devel] The new qemu.org

2010-02-02 Thread G 3

The new site looks nice. When is the Mac OS X section under
"Compilation from the sources" going to be updated from the lame "The
Mac OS X patches are not fully merged in QEMU, so you should look at
the QEMU mailing list archive to have all the necessary
information."? This is unacceptable.

[Qemu-devel] KVM developer call minutes (Feb 2)

2010-02-02 Thread Chris Wright

Minutes (please reply w/ corrections or follow-ups):

state of in-kernel APIC/IOAPIC/PIT upstream merge
- Glauber?...

road map to get rid of qemu-kvm's slot management (IMHO: qemu-kvm-0.13)
- no real feedback here

- any further ongoing/planned upstream merge efforts?
  - SMP
- needs in-kernel irqchip support upstream
- possibly make it a stream in uq for integration and autotesting
  - PCI Device Assignment cleanup, proper capabilities support
- do we move to uio support before pushing upstream?
- Michael's UIO patch for 2.3 devices is upstream, so moving to uio
  would add a feature
- what is missing w/out PCIe bus emulation
  - AER, ARI, few other less critical things
- any 64-bit issues or other things that a driver may probe for that
  will break w/out PCIe
  - most are probing PCI capabilities

upstream queue uq qemu-kvm 
- flush mmio buffer periodically
- enabled unconditional save/restore
- started porting -mempath
- getting autotest to run on upstream to autotest before sending patches
  to upstream
- anthony wants to receive pull request w/ inline patches for review

qmp feature negotiation issue
- Luiz and Markus discussing alternatives, either seems fine
  - Anthony will follow-up on list

[Qemu-devel] Question on qcow2 image with base image

2010-02-02 Thread Naphtali Sprei

Hi,
when I use a qcow2 image based on a base image, what should happen when I
invoke the commit command from the qemu monitor?
Is it expected/intended to flush the data into the base image?
IIUC, that is what is happening in the released qemu (0.12).
I would expect it not to touch the base image.

  Naphtali

[Qemu-devel] [PATCH 0/3] Event signaling tweaks

2010-02-02 Thread Paolo Bonzini

This series of three patches makes two small changes to qemu_event_read
and qemu_event_increment.  These are preparatory to merging eventfd
usage in the iothread from qemu-kvm (which would have conflicts, so it
has to be done with some care).

Paolo Bonzini (3):
  do not loop on an incomplete io_thread_fd read
  loop qemu_event_increment if we have an EINTR
  fix placement of config-host.h inclusion

 osdep.c |7 ---
 vl.c|   12 
 2 files changed, 12 insertions(+), 7 deletions(-)

[Qemu-devel] [PATCH 1/3] do not loop on an incomplete io_thread_fd read

2010-02-02 Thread Paolo Bonzini

No need to loop if less than a full buffer is read, the next
read would return EAGAIN.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 vl.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 6f1e1ab..46c1118 100644
--- a/vl.c
+++ b/vl.c
@@ -3210,12 +3210,12 @@ static void qemu_event_read(void *opaque)
 {
     int fd = (unsigned long)opaque;
     ssize_t len;
+    char buffer[512];
 
     /* Drain the notify pipe */
     do {
-        char buffer[512];
         len = read(fd, buffer, sizeof(buffer));
-    } while ((len == -1 && errno == EINTR) || len > 0);
+    } while ((len == -1 && errno == EINTR) || len == sizeof(buffer));
 }
 
 static int qemu_event_init(void)
-- 
1.6.6
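
The drain loop from this patch can be tried standalone on a non-blocking pipe.
This is only a sketch: drain_fd() and demo_drain() are hypothetical names, and
the demo assumes n stays well below the pipe capacity so the writes never
block:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Drain a non-blocking fd; stop at the first short read. Returns bytes read. */
static size_t drain_fd(int fd)
{
    char buffer[512];
    ssize_t len;
    size_t total = 0;

    do {
        len = read(fd, buffer, sizeof(buffer));
        if (len > 0) {
            total += (size_t)len;
        }
        /* A short read means the pipe is empty; no need to read again. */
    } while ((len == -1 && errno == EINTR) || len == (ssize_t)sizeof(buffer));

    return total;
}

/* Demo helper: queue n bytes in a pipe, then drain the read end. */
static size_t demo_drain(size_t n)
{
    int fds[2];
    char byte = 0;
    size_t drained;

    if (pipe(fds) != 0) {
        return 0;
    }
    fcntl(fds[0], F_SETFL, O_NONBLOCK);  /* only the read end is non-blocking */
    for (size_t i = 0; i < n; i++) {
        if (write(fds[1], &byte, 1) != 1) {
            break;
        }
    }
    drained = drain_fd(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return drained;
}
```

With exactly one or two full buffers queued, the loop does perform one extra
read that fails with EAGAIN; that is the only case where the extra call
remains.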

[Qemu-devel] [PATCH 2/3] loop write in qemu_event_increment upon EINTR

2010-02-02 Thread Paolo Bonzini

Same as what qemu-kvm does.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 vl.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 46c1118..f150eca 100644
--- a/vl.c
+++ b/vl.c
@@ -3198,8 +3198,12 @@ static void qemu_event_increment(void)
     if (io_thread_fd == -1)
         return;
 
-    ret = write(io_thread_fd, &byte, sizeof(byte));
-    if (ret < 0 && (errno != EINTR && errno != EAGAIN)) {
+    do {
+        ret = write(io_thread_fd, &byte, sizeof(byte));
+    } while (ret < 0 && errno == EINTR);
+
+    /* EAGAIN is fine, a read must be pending.  */
+    if (ret < 0 && errno != EAGAIN) {
         fprintf(stderr, "qemu_event_increment: write() filed: %s\n",
                 strerror(errno));
         exit (1);
-- 
1.6.6
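
The same retry pattern as a standalone sketch. notify_fd() and
demo_notify_on_full_pipe() are hypothetical names, and unlike the patch the
helper returns an error code instead of exiting:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Write one wake-up byte to fd, retrying on EINTR.
 * Returns 0 on success or when the pipe is full (EAGAIN), -1 otherwise.
 */
static int notify_fd(int fd)
{
    static const char byte = 0;
    ssize_t ret;

    do {
        ret = write(fd, &byte, sizeof(byte));
    } while (ret < 0 && errno == EINTR);

    /* EAGAIN is fine: the pipe is full, so a wake-up is already pending. */
    if (ret < 0 && errno != EAGAIN) {
        return -1;
    }
    return 0;
}

/* Demo helper: fill a non-blocking pipe, then notify once more. */
static int demo_notify_on_full_pipe(void)
{
    int fds[2];
    char byte = 0;
    int ret;

    if (pipe(fds) != 0) {
        return -1;
    }
    fcntl(fds[1], F_SETFL, O_NONBLOCK);
    while (write(fds[1], &byte, 1) == 1) {
        /* keep writing until the pipe reports it is full */
    }
    ret = notify_fd(fds[1]);
    close(fds[0]);
    close(fds[1]);
    return ret;
}
```

A full pipe is treated as success because the reader will wake up anyway; any
other failure, such as a bad descriptor, is reported to the caller.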

[Qemu-devel] [PATCH 3/3] fix placement of config-host.h inclusion

2010-02-02 Thread Paolo Bonzini

The #ifdef CONFIG_SOLARIS below was useless without this patch.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 osdep.c |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/osdep.c b/osdep.c
index cf3a2c6..9059f01 100644
--- a/osdep.c
+++ b/osdep.c
@@ -28,14 +28,15 @@
 #include <errno.h>
 #include <unistd.h>
 #include <fcntl.h>
+
+/* Needed early for CONFIG_BSD etc. */
+#include "config-host.h"
+
 #ifdef CONFIG_SOLARIS
 #include <sys/types.h>
 #include <sys/statvfs.h>
 #endif
 
-/* Needed early for CONFIG_BSD etc. */
-#include "config-host.h"
-
 #ifdef _WIN32
 #include <windows.h>
 #elif defined(CONFIG_BSD)
-- 
1.6.6

[Qemu-devel] Re: [PATCH] Add cpu model configuration support.. (resend)

2010-02-02 Thread john cooper

Andre Przywara wrote:

 +[cpudef]
 +   name = Conroe
 +   level = 2
 +   vendor = GenuineIntel
 +   family = 6
 +   model = 2
 +   stepping = 3
 +   feature_edx = sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae
 msr tsc pse de fpumtrr clflush mca pse36
 +   feature_ecx = sse3 ssse3
 +   extfeature_edx = fxsr mmx pat cmov pge apic cx8 mce pae msr tsc
 pse de fpulm syscall nx
 +   extfeature_ecx = lahf_lm
 Wouldn't it be much more user friendly to merge them all into one
 string? Just from the feature names it is quite obscure to guess which
 flag belongs into which string (especially since we lack the EXTn_
 prefix we had in helper.c). I haven't tried it, but the parsing code
 looks like this shouldn't be too hard.
 To avoid overlong lines one could think about a += operator.

That's true.  Although I expect setup of a cpu model to
be a rather infrequent occurrence by the expert (+/-)
user so the above didn't strike me as a significant issue.
Also -cpu ?cpuid dumps out the entire motley crew of
flags relative to each grouping for reference.

That said the current config file syntax seems rather
rigid and I think your suggestion makes sense.  I avoided
modifying the parser at this point just in the interest of
minimizing the sprawl of this patch.

 I would just drop all definitions here except qemu{32,64} and
 kvm{32,64}. The other models should be described in the config file.

That's the goal but I wanted to leave an interim firewall
of sorts.  If the target-x86_64.conf isn't installed for
whatever reason, qemu still can fall back to the internal
definitions.  Even here it isn't strictly necessary to
remove an internal def as it can be redefined in the
config file which will override the internal version.
In general -cpu ?model will indicate internal vs.
externally defined models by enclosing internal model names
in brackets:

:
x86   Opteron_G3  AMD Opteron 23xx (Gen 3 Class Opteron)
:
x86 [athlon]  QEMU Virtual CPU version 0.12.50
:

It also seems worth dropping a hint to the user in the case qemu
fails to find a target config file rather than leaving them to
puzzle out why an external model has gone missing.

Thanks for the feedback.

-john

-- 
john.coo...@redhat.com

[Qemu-devel] [PATCH v0 0/5]: BLOCK_IO_ERROR QMP event

2010-02-02 Thread Luiz Capitulino

 Hi,

 This series adds the BLOCK_IO_ERROR event libvirt guys have requested,
I have made some improvements after Kevin's feedback and hope it's in better
shape now.

 The only small issue is that I couldn't get a read error. I've followed Kevin's
advice wrt NFS, but got only write errors...

 I've tested with ide and virtio.

 Thanks.

[Qemu-devel] [PATCH 1/5] QMP: BLOCK_IO_ERROR event handling

2010-02-02 Thread Luiz Capitulino

This commit adds the basic definitions for the BLOCK_IO_ERROR
event, but actual event emission will be introduced by the
next commits.

NOTE: Adding a small reference in QMP/qmp-events.txt, but this
file is wrong and will be replaced by proper documentation shortly.

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 QMP/qmp-events.txt |7 +++
 monitor.c  |3 +++
 monitor.h  |1 +
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
index dc48ccc..7886192 100644
--- a/QMP/qmp-events.txt
+++ b/QMP/qmp-events.txt
@@ -43,3 +43,10 @@ Data: 'server' and 'client' keys with the same keys as 
'query-vnc'.
 
 Description: Issued when the VNC session is made active.
 Data: 'server' and 'client' keys with the same keys as 'query-vnc'.
+
+7 BLOCK_IO_ERROR
+
+
+Description: Issued when a disk I/O error occurs
+Data: 'device' (device name), 'action' (action to be taken),
+  'operation' (read or write)
diff --git a/monitor.c b/monitor.c
index fb7c572..6e688ac 100644
--- a/monitor.c
+++ b/monitor.c
@@ -378,6 +378,9 @@ void monitor_protocol_event(MonitorEvent event, QObject *data)
         case QEVENT_VNC_DISCONNECTED:
             event_name = "VNC_DISCONNECTED";
             break;
+        case QEVENT_BLOCK_IO_ERROR:
+            event_name = "BLOCK_IO_ERROR";
+            break;
         default:
             abort();
             break;
diff --git a/monitor.h b/monitor.h
index b0f9270..e35f1e4 100644
--- a/monitor.h
+++ b/monitor.h
@@ -23,6 +23,7 @@ typedef enum MonitorEvent {
 QEVENT_VNC_CONNECTED,
 QEVENT_VNC_INITIALIZED,
 QEVENT_VNC_DISCONNECTED,
+QEVENT_BLOCK_IO_ERROR,
 QEVENT_MAX,
 } MonitorEvent;
 
-- 
1.6.6

[Qemu-devel] [PATCH 2/5] block: BLOCK_IO_ERROR QMP event

2010-02-02 Thread Luiz Capitulino

This commit introduces the bdrv_mon_event() function, which
should be called by block subsystems (eg. IDE) when an I/O
error occurs, so that a QMP event is emitted.

The following information is currently provided in the event:

- device name
- operation (ie. read or write)
- action taken (eg. stop)

Event example:

{ "event": "BLOCK_IO_ERROR",
    "data": { "device": "ide0-hd1",
              "operation": "write",
              "action": "stop" },
    "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 block.c |   29 +
 block.h |6 ++
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 1919d19..2913124 100644
--- a/block.c
+++ b/block.c
@@ -1164,6 +1164,35 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
     return bs->drv->bdrv_is_allocated(bs, sector_num, nb_sectors, pnum);
 }
 
+void bdrv_mon_event(const BlockDriverState *bdrv,
+                    BlockMonEventAction action, int is_read)
+{
+    QObject *data;
+    const char *action_str;
+
+    switch (action) {
+    case BDRV_ACTION_REPORT:
+        action_str = "report";
+        break;
+    case BDRV_ACTION_IGNORE:
+        action_str = "ignore";
+        break;
+    case BDRV_ACTION_STOP:
+        action_str = "stop";
+        break;
+    default:
+        abort();
+    }
+
+    data = qobject_from_jsonf("{ 'device': %s, 'action': %s, 'operation': %s }",
+                              bdrv->device_name,
+                              action_str,
+                              is_read ? "read" : "write");
+    monitor_protocol_event(QEVENT_BLOCK_IO_ERROR, data);
+
+    qobject_decref(data);
+}
+
 static void bdrv_print_dict(QObject *obj, void *opaque)
 {
 QDict *bs_dict;
diff --git a/block.h b/block.h
index ecf66c5..a834300 100644
--- a/block.h
+++ b/block.h
@@ -44,6 +44,12 @@ typedef struct QEMUSnapshotInfo {
 #define BDRV_SECTOR_SIZE   (1 << BDRV_SECTOR_BITS)
 #define BDRV_SECTOR_MASK   ~(BDRV_SECTOR_SIZE - 1);
 
+typedef enum {
+BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
+} BlockMonEventAction;
+
+void bdrv_mon_event(const BlockDriverState *bdrv,
+BlockMonEventAction action, int is_read);
 void bdrv_info_print(Monitor *mon, const QObject *data);
 void bdrv_info(Monitor *mon, QObject **ret_data);
 void bdrv_stats_print(Monitor *mon, const QObject *data);
-- 
1.6.6
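
For readers following the event format described in this patch, here is a
minimal standalone sketch of the same idea: map an action to its string and
build the "data" member of the event. It is not the QEMU implementation. The
names Action, action_to_str() and format_event_data() are hypothetical, plain
snprintf() stands in for qobject_from_jsonf(), and no JSON escaping of the
device name is attempted.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the BlockMonEventAction enum in the patch. */
typedef enum {
    ACTION_REPORT, ACTION_IGNORE, ACTION_STOP
} Action;

/* Map an action to the string used in the event; NULL if unknown. */
static const char *action_to_str(Action action)
{
    switch (action) {
    case ACTION_REPORT:
        return "report";
    case ACTION_IGNORE:
        return "ignore";
    case ACTION_STOP:
        return "stop";
    }
    return NULL;
}

/*
 * Format the "data" member of a BLOCK_IO_ERROR-style event into buf.
 * Returns 0 on success, -1 on an unknown action or a too-small buffer.
 * Assumption: device contains no characters that need JSON escaping.
 */
static int format_event_data(char *buf, size_t len, const char *device,
                             Action action, int is_read)
{
    const char *action_str = action_to_str(action);
    int n;

    if (!action_str) {
        return -1;
    }

    n = snprintf(buf, len,
                 "{ \"device\": \"%s\", \"operation\": \"%s\", "
                 "\"action\": \"%s\" }",
                 device, is_read ? "read" : "write", action_str);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Unlike the patch, which calls abort() on an unknown action, the sketch
reports the failure to the caller; either choice is defensible, and the
sketch only picks the one that is easier to exercise in isolation.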

[Qemu-devel] [PATCH 3/5] ide: Generate BLOCK_IO_ERROR QMP event

2010-02-02 Thread Luiz Capitulino

Just call bdrv_mon_event() in the right place.

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 hw/ide/core.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index b6643e8..603e537 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -480,14 +480,17 @@ static int ide_handle_rw_error(IDEState *s, int error, int op)
     int is_read = (op & BM_STATUS_RETRY_READ);
     BlockInterfaceErrorAction action = drive_get_on_error(s->bs, is_read);
 
-    if (action == BLOCK_ERR_IGNORE)
+    if (action == BLOCK_ERR_IGNORE) {
+        bdrv_mon_event(s->bs, BDRV_ACTION_IGNORE, is_read);
         return 0;
+    }
 
     if ((error == ENOSPC && action == BLOCK_ERR_STOP_ENOSPC)
             || action == BLOCK_ERR_STOP_ANY) {
         s->bus->bmdma->unit = s->unit;
         s->bus->bmdma->status |= op;
         vm_stop(0);
+        bdrv_mon_event(s->bs, BDRV_ACTION_STOP, is_read);
     } else {
         if (op & BM_STATUS_DMA_RETRY) {
             dma_buf_commit(s, 0);
@@ -495,6 +498,7 @@ static int ide_handle_rw_error(IDEState *s, int error, int op)
         } else {
             ide_rw_error(s);
         }
+        bdrv_mon_event(s->bs, BDRV_ACTION_REPORT, is_read);
     }
 
     return 1;
-- 
1.6.6

[Qemu-devel] [PATCH 4/5] scsi: Generate BLOCK_IO_ERROR QMP event

2010-02-02 Thread Luiz Capitulino

Just call bdrv_mon_event() in the right place.

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 hw/scsi-disk.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index b34fbaa..1285122 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -182,16 +182,20 @@ static int scsi_handle_write_error(SCSIDiskReq *r, int error)
     BlockInterfaceErrorAction action =
         drive_get_on_error(s->qdev.dinfo->bdrv, 0);
 
-    if (action == BLOCK_ERR_IGNORE)
+    if (action == BLOCK_ERR_IGNORE) {
+        bdrv_mon_event(s->qdev.dinfo->bdrv, BDRV_ACTION_IGNORE, 0);
         return 0;
+    }
 
     if ((error == ENOSPC && action == BLOCK_ERR_STOP_ENOSPC)
             || action == BLOCK_ERR_STOP_ANY) {
         r->status |= SCSI_REQ_STATUS_RETRY;
         vm_stop(0);
+        bdrv_mon_event(s->qdev.dinfo->bdrv, BDRV_ACTION_STOP, 0);
     } else {
         scsi_command_complete(r, CHECK_CONDITION,
                 HARDWARE_ERROR);
+        bdrv_mon_event(s->qdev.dinfo->bdrv, BDRV_ACTION_REPORT, 0);
     }
 
     return 1;
-- 
1.6.6

[Qemu-devel] [PATCH 5/5] virtio-blk: Generate BLOCK_IO_ERROR QMP event

2010-02-02 Thread Luiz Capitulino

Just call bdrv_mon_event() in the right place.

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 hw/virtio-blk.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 037a79c..75adbec 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -105,16 +105,20 @@ static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
         drive_get_on_error(req->dev->bs, is_read);
     VirtIOBlock *s = req->dev;
 
-    if (action == BLOCK_ERR_IGNORE)
+    if (action == BLOCK_ERR_IGNORE) {
+        bdrv_mon_event(req->dev->bs, BDRV_ACTION_IGNORE, is_read);
         return 0;
+    }
 
     if ((error == ENOSPC && action == BLOCK_ERR_STOP_ENOSPC)
             || action == BLOCK_ERR_STOP_ANY) {
         req->next = s->rq;
         s->rq = req;
         vm_stop(0);
+        bdrv_mon_event(req->dev->bs, BDRV_ACTION_STOP, is_read);
     } else {
         virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
+        bdrv_mon_event(req->dev->bs, BDRV_ACTION_REPORT, is_read);
     }
 
     return 1;
-- 
1.6.6

Re: [Qemu-devel] usb-host quirks

2010-02-02 Thread David S. Ahern


On 02/02/2010 06:42 AM, Michael Buesch wrote:
 Hi,
 
 I've got a buggy device that needs a special workaround to be usable under
 host-usb access. The device really doesn't like being reset via
 USBDEVFS_RESET. It immediately locks up the device firmware or whatever.
 It won't respond properly anymore.
 With the following patch it works fine, though.
 

What about the USBDEVFS_RESET in usb_host_open? Does that have an impact?

For some USB keys I have had to add an additional reset prior to
claiming interfaces:

diff --git a/usb-linux.c b/usb-linux.c
index 1aaa595..092e75c 100644
--- a/usb-linux.c
+++ b/usb-linux.c
@@ -906,6 +906,9 @@ static int usb_host_open(USBHostDevice *dev, int bus_num,
 #endif


+    /* some keys require a reset before the getconfig */
+    ioctl(fd, USBDEVFS_RESET);
+
     /*
      * Initial configuration is -1 which makes us claim first
      * available config. We used to start with 1, which does not


David Ahern


 So I was wondering what the accepted way was to get these quirks upstream
 into the qemu source tree. Is usb-linux.c the correct place, or should we
 put the quirk into a different place?
 
 ---
  usb-linux.c |4 
  1 file changed, 4 insertions(+)
 
 --- qemu.orig/usb-linux.c
 +++ qemu/usb-linux.c
 @@ -389,6 +389,10 @@ static void usb_host_handle_reset(USBDev
  
      dprintf("husb: reset device %u.%u\n", s->bus_num, s->addr);
  
 +    if (((s->descr[8] << 8) | s->descr[9]) == 0x2471 &&
 +        ((s->descr[10] << 8) | s->descr[11]) == 0x0853)
 +        return;
 +
      ioctl(s->fd, USBDEVFS_RESET);
  
      usb_host_claim_interfaces(s, s->configuration);
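
On the question of where such quirks could live: one common shape is a small
table keyed by vendor and product ID, consulted before the reset. The sketch
below is only an illustration of that idea, not code from usb-linux.c. The
struct, the QUIRK_NO_RESET flag, the function names and the table entry
(placeholder IDs 0x1234:0x5678) are all hypothetical. It relies on the
standard USB device descriptor layout, where idVendor sits at byte offsets
8-9 and idProduct at offsets 10-11, both little-endian.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk flag: skip USBDEVFS_RESET for this device. */
#define QUIRK_NO_RESET 0x1u

struct usb_quirk {
    uint16_t vendor_id;
    uint16_t product_id;
    unsigned flags;
};

/* Placeholder entry for illustration only; not a real device. */
static const struct usb_quirk quirk_table[] = {
    { 0x1234, 0x5678, QUIRK_NO_RESET },
};

/* Read a little-endian 16-bit field from a raw descriptor. */
static uint16_t descr_le16(const uint8_t *descr, size_t off)
{
    return (uint16_t)(descr[off] | (descr[off + 1] << 8));
}

/*
 * Return the quirk flags for the device whose raw device descriptor is
 * given, or 0 if the descriptor is too short or no entry matches.
 */
static unsigned usb_quirk_flags(const uint8_t *descr, size_t len)
{
    uint16_t vid, pid;
    size_t i;

    if (len < 12) {
        return 0;
    }

    vid = descr_le16(descr, 8);   /* idVendor */
    pid = descr_le16(descr, 10);  /* idProduct */

    for (i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++) {
        if (quirk_table[i].vendor_id == vid &&
            quirk_table[i].product_id == pid) {
            return quirk_table[i].flags;
        }
    }
    return 0;
}
```

A reset handler could then return early when the QUIRK_NO_RESET bit is set.
Because the IDs are assembled in the descriptor's little-endian byte order,
table entries use the same values that tools such as lsusb print.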

Re: [Qemu-devel] system_reset command cause assert failed

2010-02-02 Thread Roy Tam

2010/2/2 Luiz Capitulino lcapitul...@redhat.com:
 On Tue, 2 Feb 2010 09:35:16 +0800
 Roy Tam roy...@gmail.com wrote:

 2010/2/2 Luiz Capitulino lcapitul...@redhat.com:
  On Tue, 2 Feb 2010 00:26:53 +0800
  Roy Tam roy...@gmail.com wrote:
 
  2010/2/2 Luiz Capitulino lcapitul...@redhat.com:
 
Hm, I'm puzzled. Is this failing on malloc()? At least qemu_malloc()
   is the last qemu's function I see in the logs.
  
From now on I only see msvcrt functions...
  
Maybe, you can type run on gdb, run system_reset on the
   Monitor and then switch back to gdb and type bt?
  
  source-less debugging seems better...
 
   As far as I can understand something bad happens while the parser
  is processing the first ' character of the qobject_from_jsonf()
  call in monitor.c:4524.
 
   Strange. Can you try 'info pci', 'info block' and 'info version'?
  Do they work?
 
   Maybe this is a refcount problem?
 
   Anthony, could you take a look too please?
 

 rebuild with -gstabs -O1, you can see double free here:

  Ok, so we have a double free and


To clarify: after digging into the sources further, it is not a double
free. Rather, parse_json is never executed by json_lexer_feed_char: I put
asm("int3") in parse_json, but no SIGTRAP is raised (for system_reset
and system_powerdown).

 #0  qobject_to_qdict (obj=0x0) at qobject.h:108
 #1  0x004127ae in pci_device_print (mon=0x494c460, device=0x49696c0)
 at /home/roy/qemu/hw/pci.c:1165

  a segfault.

For this one, parse_json was executed by json_lexer_feed_char.
A workaround patch is below, but why was a NULL qobj pushed into the qlist?

diff --git a/hw/pci.c b/hw/pci.c
index 023f7b6..84e7b35 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1161,8 +1161,11 @@ static void pci_device_print(Monitor *mon, QDict *device)
                        qdict_get_int(info, "limit"));
     }

+    QObject* qobj;
     QLIST_FOREACH_ENTRY(qdict_get_qlist(device, "regions"), entry) {
-        qdict = qobject_to_qdict(qlist_entry_obj(entry));
+        qobj = qlist_entry_obj(entry);
+        if(!qobj) continue;
+        qdict = qobject_to_qdict(qobj);
         monitor_printf(mon, "  BAR%d: ", (int) qdict_get_int(qdict, "bar"));

         addr = qdict_get_int(qdict, "address");

RE: [Qemu-devel] [Patch] Support translating Guest physical address to Host virtual address.

2010-02-02 Thread Zheng, Jiajia

Hi, 
Any further comments on this patch, so that we can revise it?

thanks, 
jiajia

Max Asbock wrote:
 On Wed, 2010-01-27 at 15:39 -0600, Anthony Liguori wrote:
 On 01/26/2010 09:25 PM, Zheng, Jiajia wrote:
 Add command p2v to translate Guest physical address to Host virtual
 address. 
 
 
 For what purpose?
 
 Signed-off-by: Max Asbockmasb...@linux.vnet.ibm.com
 Jiajia Zhengjiajia.zh...@intel.com
 ---
 diff --git a/monitor.c b/monitor.c
 index b33b01f..83d9ac7 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -668,6 +668,11 @@ static void do_info_uuid(Monitor *mon, QObject **ret_data)
       *ret_data = qobject_from_jsonf("{ 'UUID': %s }", uuid);
   }
 
 +static void do_info_p2v(Monitor *mon)
 +{
 +    monitor_printf(mon, "p2v implemented\n");
 +}
 
 
 These should be implemented as QMP commands.
 
   /* get the current CPU defined by the user */
   static int mon_set_cpu(int cpu_index)
   {
 @@ -2283,6 +2288,14 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict)
       break;
   }
   }
 +static void do_p2v(Monitor *mon, const QDict *qdict)
 +{
 +    target_long size = 4096;
 +    target_long addr = qdict_get_int(qdict, "addr");
 +
 +    monitor_printf(mon, "Guest physical address %p is mapped at host virtual address %p\n",
 +                   (void *)addr,
 +                   cpu_physical_memory_map((target_phys_addr_t)addr,
 +                                           (target_phys_addr_t *)&size, 0));
 
 
 This isn't quite right.  It assumes TARGET_PAGE_SIZE is 4k which is
 certainly not always true.  It also assumes that
 cpu_physical_memory_map() returns something that has some meaning, which
 isn't necessarily the case.  It could be a pointer to a bounce buffer.
 
 Could you give an end-to-end description of how you expect this
 mechanism to be used so we can work out a more appropriate set of
 interfaces.  I assume this is MCE related.
 
 
 The purpose of this is to translate a guest physical address to a host
 virtual address.
 This was indeed used for MCE testing. The p2v command provides one
 step in a chain of translations from guest virtual to guest physical
 to host virtual to host physical. Host physical is then used to
 inject a machine check error. As a consequence the HPOISON code on
 the host and the MCE injection code in qemu are exercised.
 I was always assuming that this implementation perhaps isn't the most
 optimal, but it simply worked for our test case.
 
 What would an appropriate method be to get a host virtual address for
 guest physical address that represents a page of RAM?
 
 thanks,
 Max
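
The objection above to the hard-coded 4096 can be made concrete with a
small standalone sketch that takes the page size as a parameter instead of
assuming 4k. This is not QEMU code: page_bits is a hypothetical parameter
standing in for whatever the target defines (it must be less than 64), and
all function names are made up for illustration. The sketch only covers the
address arithmetic. It says nothing about whether a mapped pointer refers to
guest RAM or to a bounce buffer.

```c
#include <stdint.h>

/* Base address of the page containing addr, for a 2^page_bits page size. */
static uint64_t page_base(uint64_t addr, unsigned page_bits)
{
    return addr & ~((UINT64_C(1) << page_bits) - 1);
}

/* Offset of addr within its page. */
static uint64_t page_offset(uint64_t addr, unsigned page_bits)
{
    return addr & ((UINT64_C(1) << page_bits) - 1);
}

/* Number of bytes from addr up to the end of its page. */
static uint64_t bytes_to_page_end(uint64_t addr, unsigned page_bits)
{
    return (UINT64_C(1) << page_bits) - page_offset(addr, page_bits);
}

/*
 * Clamp a requested length so that [addr, addr + len) does not cross a
 * page boundary; a translation obtained for one page says nothing about
 * where the next guest page lives on the host.
 */
static uint64_t clamp_len_to_page(uint64_t addr, uint64_t len,
                                  unsigned page_bits)
{
    uint64_t room = bytes_to_page_end(addr, page_bits);

    return len < room ? len : room;
}
```

With page_bits = 12 the helpers reproduce the 4k case; with a different
value the same guest address splits into a different base and offset, which
is why a fixed size of 4096 is not safe across targets.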

[Qemu-devel] Re: Network shutdown under load

2010-02-02 Thread RW

Hi,

we're currently having this problem on two production servers:
2-4 times a day one interface shuts down. We have four KVM guests
running on two hosts (2x2). All VMs have eth0 and eth1 running virtio_net.
All eth0's are connected to bridge br0 and all eth1's are connected to
br1 on the host. Here are the startup options for one VM (the
others are quite similar [of course other mac address, ...]):

/usr/bin/kvm -m 8192 -smp 8 -cpu host -daemonize -k de -vnc 127.0.0.1:1
-monitor telnet:172.18.105.46:,server,nowait -localtime -pidfile
/tmp/kvm-dodoma.pid -drive
file=/data/kvm/kvmimages/dodoma.qcow2,if=virtio,cache=none,boot=on
-drive file=/data/kvm/kvmimages/dodoma-vdb.qcow2,if=virtio,cache=none
-net nic,vlan=104,model=virtio,macaddr=00:ff:48:e5:4b:8d -net
tap,vlan=104,ifname=tap.b.dodoma,script=no -net
nic,vlan=96,model=virtio,macaddr=00:ff:48:e5:4b:8f -net
tap,vlan=96,ifname=tap.f.dodoma,script=no

I've tried the very latest Gentoo kernel 2.6.30 on the host and
guest (all VMs and hosts are running Gentoo, btw.). With kernel
2.6.31 on the host and 2.6.30 on the guest the problem still exists.
I've tried KVM 0.11.1, 0.12.1.2 and 0.12.2 running with kernel 2.6.30
and 2.6.31 on the host side.

Interestingly, all the VMs have almost the same network traffic
(in and out), but the VMs running Apache bound to eth1 have
the biggest problems. They shut down eth1 2-4 times a day.
eth0 is running fine even though it carries almost the same
amount of traffic, but that traffic comes from the database, whereas
eth1 sends the traffic to the proxy (Varnish). So incoming traffic
seems to work fine here but outgoing traffic is problematic. On the
other hand, the VMs running Varnish are getting all their traffic through
eth1. Here I've only seen one shutdown of eth1 in 48 hours.

Is there anything I can do to help debug this problem? Is there
already a fix available? Otherwise I really have to install KVM-88
which runs fine on some other hosts.

Thanks!
Robert


Tom Lendacky wrote:
 There's been some discussion of this already in the kvm list, but I want to
 summarize what I've found and also include the qemu-devel list in an effort
 to find a solution to this problem.

 Running a netperf test between two kvm guests results in the guest's network 
 interface shutting down. I originally found this using kvm guests on two 
 different machines that were connected via a 10GbE link.  However, I found 
 this problem can be easily reproduced using two guests on the same machine.

 I am running the 2.6.32 level of the kvm.git tree and the 0.12.1.2 level of 
 the qemu-kvm.git tree.

 The setup includes two bridges, br0 and br1.

 The commands used to start the guests are as follows:
 usr/local/bin/qemu-system-x86_64 -name cape-vm001 -m 1024
 -drive file=/autobench/var/tmp/cape-vm001-raw.img,if=virtio,index=0,media=disk,boot=on
 -net nic,model=virtio,vlan=0,macaddr=00:16:3E:00:62:51,netdev=cape-vm001-eth0
 -netdev tap,id=cape-vm001-eth0,script=/autobench/var/tmp/ifup-kvm-br0,downscript=/autobench/var/tmp/ifdown-kvm-br0
 -net nic,model=virtio,vlan=1,macaddr=00:16:3E:00:62:D1,netdev=cape-vm001-eth1
 -netdev tap,id=cape-vm001-eth1,script=/autobench/var/tmp/ifup-kvm-br1,downscript=/autobench/var/tmp/ifdown-kvm-br1
 -vnc :1 -monitor telnet::5701,server,nowait -snapshot -daemonize

 usr/local/bin/qemu-system-x86_64 -name cape-vm002 -m 1024
 -drive file=/autobench/var/tmp/cape-vm002-raw.img,if=virtio,index=0,media=disk,boot=on
 -net nic,model=virtio,vlan=0,macaddr=00:16:3E:00:62:61,netdev=cape-vm002-eth0
 -netdev tap,id=cape-vm002-eth0,script=/autobench/var/tmp/ifup-kvm-br0,downscript=/autobench/var/tmp/ifdown-kvm-br0
 -net nic,model=virtio,vlan=1,macaddr=00:16:3E:00:62:E1,netdev=cape-vm002-eth1
 -netdev tap,id=cape-vm002-eth1,script=/autobench/var/tmp/ifup-kvm-br1,downscript=/autobench/var/tmp/ifdown-kvm-br1
 -vnc :2 -monitor telnet::5702,server,nowait -snapshot -daemonize

 The ifup-kvm-br0 script takes the (first) qemu created tap device and brings
 it up and adds it to bridge br0.  The ifup-kvm-br1 script takes the (second)
 qemu created tap device and brings it up and adds it to bridge br1.

 Each ethernet device within a guest is on its own subnet.  For example:
   guest 1 eth0 has addr 192.168.100.32 and eth1 has addr 192.168.101.32
   guest 2 eth0 has addr 192.168.100.64 and eth1 has addr 192.168.101.64

 On one of the guests run netserver:
   netserver -L 192.168.101.32 -p 12000

 On the other guest run netperf:
   netperf -L 192.168.101.64 -H 192.168.101.32 -p 12000 -t TCP_STREAM -l 60 -c -C -- -m 16K -M 16K

 It may take more than one netperf run (I find that my second run almost
 always causes the shutdown) but the network on the eth1 links will stop
 working.

 I did some debugging and found that in qemu on the guest running netserver:
  - the receive_disabled variable is set and never gets reset
  - the read_poll event handler for the eth1 tap device is disabled and never 
 re-enabled
 These conditions result in no
