date:20120827

Re: [Qemu-devel] [PATCH 2/4] qxl: split qxl functions in common and pci files

2012-08-27 Thread Gerd Hoffmann

On 08/24/12 21:14, Erlon Cruz wrote:
 From: Fabiano Fidêncio fabi...@fidencio.org
 
 This commit splits qxl functions into common functions (located in
 qxl.c) and pci-specific functions (located in qxl-pci.c).
 All prototypes are being kept in qxl.h, as common MACROS and inline
 functions. Moreover, this commit is exposing a lot of APIs, don't know
 if it's the correct way to do it, but it was the only way that we saw to
 do it.

Try enabling rename detection for this one (git format-patch -M).

 diff --git a/hw/qxl.h b/hw/qxl.h
 index f25e341..516e7da 100644
 --- a/hw/qxl.h
 +++ b/hw/qxl.h
 @@ -143,6 +143,44 @@ typedef struct QXLDevice {
  }   \
  } while (0)
  
 +/*
 + * NOTE: SPICE_RING_PROD_ITEM accesses memory on the pci bar and as
 + * such can be changed by the guest, so to avoid a guest trigerrable
 + * abort we just qxl_set_guest_bug and set the return to NULL. Still
 + * it may happen as a result of emulator bug as well.
 + */

Why are these here and not in qxl-pci.c?

 +void init_qxl_rom(QXLDevice *d);
 +void init_qxl_ram(QXLDevice *d);

Same question.

 +void interface_get_init_info(QXLInstance *sin, QXLDevInitInfo *info);
 +int interface_get_command(QXLInstance *sin, struct QXLCommandExt *ext);
 +int interface_req_cmd_notification(QXLInstance *sin);
 +void interface_release_resource(QXLInstance *sin, struct QXLReleaseInfoExt 
 ext);
 +int interface_get_cursor_command(QXLInstance *sin, struct QXLCommandExt 
 *ext);
 +int interface_req_cursor_notification(QXLInstance *sin);
 +void interface_notify_update(QXLInstance *sin, uint32_t update_id);
 +int interface_flush_resources(QXLInstance *sin);
 +void interface_update_area_complete(QXLInstance *sin, uint32_t surface_id, 
 QXLRect *dirty, uint32_t num_updated_rects);
 +void interface_async_complete(QXLInstance *sin, uint64_t cookie_token);
 +ram_addr_t qxl_rom_size(void);

Same question.

I'd expect at least some of these to have a virtio-specific
implementation.  interface_get_command() for example, which gets a qxl
command from the ring ...

cheers,
  Gerd
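
One way to picture the split Gerd is asking about is a per-transport operations table, so that common code never touches the ring directly. This is only a sketch under assumptions: `DemoDevice`, `QXLTransportOps`, `pci_get_command`, `virtio_get_command` and `demo_get_command` are hypothetical names, not taken from the patch or from QEMU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device with a per-transport operations table. */
typedef struct DemoDevice DemoDevice;

typedef struct QXLTransportOps {
    /* Returns 1 if a command was fetched into *cmd, 0 if none is pending. */
    int (*get_command)(DemoDevice *d, int *cmd);
} QXLTransportOps;

struct DemoDevice {
    const QXLTransportOps *ops;
    int pci_ring_cmd;     /* stand-in for a command ring in a PCI BAR */
    int virtio_queue_cmd; /* stand-in for a virtqueue element */
};

/* PCI flavour: would read from the ring mapped in the BAR. */
static int pci_get_command(DemoDevice *d, int *cmd)
{
    *cmd = d->pci_ring_cmd;
    return 1;
}

/* virtio flavour: would pop an element from a virtqueue. */
static int virtio_get_command(DemoDevice *d, int *cmd)
{
    *cmd = d->virtio_queue_cmd;
    return 1;
}

static const QXLTransportOps pci_ops = { .get_command = pci_get_command };
static const QXLTransportOps virtio_ops = { .get_command = virtio_get_command };

/* Common code: only ever calls through the table. */
int demo_get_command(DemoDevice *d, int *cmd)
{
    assert(d != NULL && d->ops != NULL && cmd != NULL);
    return d->ops->get_command(d, cmd);
}
```

With such a table, only the functions that are truly transport-independent would need prototypes in the shared header.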

Re: [Qemu-devel] [PATCH 4/4] qxl: introducing virtio-qxl

2012-08-27 Thread Gerd Hoffmann

  Hi,

 To enable the VirtIOQXL device, use '-virtio-qxl'. Video output will be

Please don't add a new option.  'qemu -vga none -device virtio-qxl'
should work these days.  You could also make virtio-qxl a valid choice
for '-vga' for convenience.

cheers,
  Gerd

Re: [Qemu-devel] Implementing qxl-virtio on QEMU

2012-08-27 Thread Gerd Hoffmann

On 08/24/12 21:14, Erlon Cruz wrote:
 The following patches provide video support for non-PCI architectures,
 please review!

Can you give an overview of the virtio-qxl virtual hardware design?

thanks,
  Gerd

Re: [Qemu-devel] [PATCH v2] register reset handler to write image into memory

2012-08-27 Thread Alexander Graf



On 26.08.2012, at 20:50, Yin Olivia-R63875 r63...@freescale.com wrote:

 Thanks to Dunrong and Andreas.
 
 $ scripts/get_maintainer.pl -f hw/loader.c
 Alexander Graf ag...@suse.de (commit_signer:3/6=50%)
 Anthony Liguori aligu...@us.ibm.com (commit_signer:2/6=33%)
 Stefan Weil w...@mail.berlios.de (commit_signer:1/6=17%)
 Benjamin Herrenschmidt b...@kernel.crashing.org (commit_signer:1/6=17%)
 Avi Kivity a...@redhat.com (commit_signer:1/6=17%)
 
 Dear maintainers,
 Could you please help review this patch?
 
 So far I got feedback from Andreas and try to answer the question.
 
 This patch does not answer the question why you try to avoid the ROM blobs
 and what ROM blobs are still being used for after your patch. I don't 
 think it makes much sense to work around them for your use cases and to 
 leave them behind - if there's something fundamentally wrong with them 
 they should be ripped out completely or fixed. But maybe I'm 
 misunderstanding 
 in the absence of explanations?
 
 It's a general problem. 
 
 For example, in my case, there're 3 different files loaded from host rootfs.
 $ qemu-system-ppc -enable-kvm -m 256 -nographic -M mpc8544ds -kernel 
 uImage.8572.agraf -initrd /media/ram/guest-8572.rootfs.ext2.gz -append 
 "root=/dev/ram rw loglevel=7 console=ttyS0,115200" -serial tcp::4445,server 
 -net nic
 
 (qemu) info roms
 addr= size=0x782840 mem=ram name=uImage.8572.agraf
 addr=00c0 size=0x01 mem=ram name=mpc8544ds.dtb
 addr=0200 size=0x3f922f mem=ram 
 name=/media/ram/guest-8572.rootfs.ext2.gz
 
 The problem is that rom_add_*() mallocs memory for the image, and then 
 rom_reset()
 copies those images into the guest's memory, but the QEMU memory does not get 
 freed.
 On a VM reset, the images get recopied from QEMU to guest.
 
 Comparing the memory map of qemu process before and after starting up guest,
 we can find that QEMU consumes much memory for those images.
 
 $ diff -urN pmap.pre.log pmap.post.log
 --- pmap.pre.log
 +++ pmap.post.log
 @@ -33,7 +33,14 @@
 0ffee000  8K rwx--  /lib/ld-2.13.so
 1000   3472K r-x--  qemu-system-ppc
 10374000112K rwx--  qemu-system-ppc
 -1039   6524K rwx--[ anon ]
 +1039   7100K rwx--[ anon ]
 48002000 16K rw---[ anon ]
 +48006000  4K -[ anon ]
 +48007000   8188K rw---[ anon ]
 +48806000  8K rw-s-[ anon ]
 +48808000  4K rw---[ anon ]
 +48809000 262144K rw---[ anon ]
 +58809000   5280K rw---[ anon ]
 +5cb98000   7692K rw---[ anon ]
 bf93e000132K rw---[ stack ]
 - total14456K
 + total   298352K
 
 Instead, we can reload them from disk on a reset rather than holding onto the 
 images in QEMU's memory.
 
 With this patch, the two big images (uImage and especially initrd) will not 
 be loaded into QEMU's memory
 (qemu) info roms
 addr=00c0 size=0x01 mem=ram name=mpc8544ds.dtb
 
 It will save much memory space according to memory map of QEMU process.
 # diff -urN pmap.pre.log pmap.post.log
 --- pmap.pre.log
 +++ pmap.post.log
 @@ -33,7 +33,14 @@
 0ffee000  8K rwx--  /lib/ld-2.13.so
 1000   3472K r-x--  qemu-system-ppc
 10374000112K rwx--  qemu-system-ppc
 -1039   6524K rwx--[ anon ]
 +1039   7036K rwx--[ anon ]
 48002000 16K rw---[ anon ]
 +48006000  4K -[ anon ]
 +48007000   8188K rw---[ anon ]
 +48806000  8K rw-s-[ anon ]
 +48808000  4K rw---[ anon ]
 +48809000 262144K rw---[ anon ]
 +58809000  4K rw---[ anon ]
 +58c04000   1204K rw---[ anon ]
 bfb2a000132K rw---[ stack ]
 - total14456K
 + total   286524K
 
 This patch changes all the image load process called by load_uimage() and 
 load_image_targphys() in platform initialization.

This doesn't explain why you leave the old in-RAM code alive though. The only 
reason I can imagine would be to allow for reset to not reload new roms after 
an update.

Anthony, any opinion here? Do we need the keep-in-RAM rom code? Or could we 
just always load rom blobs on demand for everything?


Alex
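
The "load on demand" alternative under discussion can be sketched as a reset handler that re-reads the image from the host file each time instead of keeping a copy in the emulator's heap. This is a minimal illustration under assumptions: `ImageOnDemand` and `image_reset_handler` are hypothetical names, and a plain byte buffer stands in for guest memory. It is not the code from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical description of an image that is loaded at reset time. */
typedef struct ImageOnDemand {
    const char *path;         /* host file to load */
    unsigned char *guest_ram; /* stand-in for guest memory */
    size_t guest_ram_size;
    size_t load_offset;       /* where the image goes in guest memory */
} ImageOnDemand;

/*
 * Called on every (simulated) reset. Reads the file straight into the
 * guest buffer, so no private copy of the image is kept between resets.
 * Returns the number of bytes copied, or -1 if the file cannot be opened.
 * Data that does not fit into the remaining guest memory is not copied.
 */
long image_reset_handler(ImageOnDemand *img)
{
    FILE *f;
    size_t room, n;

    assert(img->load_offset <= img->guest_ram_size);

    f = fopen(img->path, "rb");
    if (!f) {
        return -1;
    }
    room = img->guest_ram_size - img->load_offset;
    n = fread(img->guest_ram + img->load_offset, 1, room, f);
    fclose(f);
    return (long)n;
}
```

One consequence of this design, which matches Alex's remark, is that a file replaced on the host between two resets would be picked up by the second reset.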

 
 Best Regards,
 Olivia
 
 -Original Message-
 From: Dunrong Huang [mailto:riegama...@gmail.com]
 Sent: Thursday, August 23, 2012 6:44 PM
 To: Yin Olivia-R63875
 Cc: qemu-...@nongnu.org; qemu-devel@nongnu.org
 Subject: Re: [Qemu-devel] [PATCH v2] register reset handler to write
 image into memory
 
 2012/8/23 Yin Olivia-R63875 r63...@freescale.com:
 Dear All,
 
 I can't find MAINTAINER of hw/loader.c.
 Who can help review and apply this patch?
 
 Please use the script scripts/get_maintainer.pl, like:
 $ scripts/get_maintainer.pl your_patch_file.patch or
 $ scripts/get_maintainer.pl -f hw/loader.c
 Best Regards,
 Olivia Yin
 
 
 
 --
 Best Regards,
 
 Dunrong Huang

Re: [Qemu-devel] [PATCH for-1.2 0/2] migrate PV EOI MSR

2012-08-27 Thread Jan Kiszka

On 2012-08-26 17:59, Michael S. Tsirkin wrote:
 It turns out PV EOI gets disabled after migration -
 until next guest reset.
 This is because we are missing code to actually migrate it.
 This patch fixes it up: it does not do anything useful
 without kvm irqchip but applies cleanly to qemu.git
 as well as qemu-kvm.git, so I think it's cleaner
 to apply it in qemu.git to keep diff to minimum.

There is nothing except pci-assign left in qemu-kvm (which will be
posted for upstream in a minute), so you are intuitively doing the right
thing.

Patch 2 looks good to me, see patch 1 for the clean procedure.

Jan





Re: [Qemu-devel] [PATCH for-1.2 1/2] linux-headers: update asm/kvm_para.h to 3.6

2012-08-27 Thread Jan Kiszka

On 2012-08-26 17:59, Michael S. Tsirkin wrote:
 Update asm-x86/kvm_para.h to version present in Linux 3.6.

Nope, we have update-linux-headers.sh for this. Just run it against
3.6-rcX, grab the result, and mention the source (release version or
kvm.git hash).

Jan

 This is needed for the new PV EOI feature.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  linux-headers/asm-x86/kvm_para.h | 7 +++
  1 file changed, 7 insertions(+)
 
 diff --git a/linux-headers/asm-x86/kvm_para.h 
 b/linux-headers/asm-x86/kvm_para.h
 index f2ac46a..a1c3d72 100644
 --- a/linux-headers/asm-x86/kvm_para.h
 +++ b/linux-headers/asm-x86/kvm_para.h
 @@ -22,6 +22,7 @@
  #define KVM_FEATURE_CLOCKSOURCE2 3
  #define KVM_FEATURE_ASYNC_PF 4
  #define KVM_FEATURE_STEAL_TIME   5
 +#define KVM_FEATURE_PV_EOI   6
  
  /* The last 8 bits are used to indicate how to interpret the flags field
   * in pvclock structure. If no bits are set, all flags are ignored.
 @@ -37,6 +38,7 @@
  #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
  #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
  #define MSR_KVM_STEAL_TIME  0x4b564d03
 +#define MSR_KVM_PV_EOI_EN  0x4b564d04
  
  struct kvm_steal_time {
   __u64 steal;
 @@ -89,5 +91,10 @@ struct kvm_vcpu_pv_apf_data {
   __u32 enabled;
  };
  
 +#define KVM_PV_EOI_BIT 0
 +#define KVM_PV_EOI_MASK (0x1 << KVM_PV_EOI_BIT)
 +#define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
 +#define KVM_PV_EOI_DISABLED 0x0
 +
  
  #endif /* _ASM_X86_KVM_PARA_H */
 





[Qemu-devel] [PATCH 0/4] uq/master: Add classic PCI device assignment

2012-08-27 Thread Jan Kiszka

I'm proud to present probably the last patch series to merge qemu-kvm
into upstream: This one adds PCI device assignment for x86 using the
classic interface that the KVM model provides. See the last patch for
reasons why we still want this while next-generation device assignment
via VFIO is approaching.

It's been a long journey, but once this is merged, I think we can close
the qemu-kvm chapter. I already did so, all work is based on QEMU now.

Jan Kiszka (4):
  kvm: Introduce kvm_irqchip_update_msi_route
  kvm: Introduce kvm_has_intx_set_mask
  kvm: i386: Add services required for PCI device assignment
  kvm: i386: Add classic PCI device assignment

 hw/kvm/Makefile.objs   |2 +-
 hw/kvm/pci-assign.c| 1929 
 kvm-all.c  |   50 ++
 kvm.h  |2 +
 target-i386/kvm.c  |  141 
 target-i386/kvm_i386.h |   22 +
 6 files changed, 2145 insertions(+), 1 deletions(-)
 create mode 100644 hw/kvm/pci-assign.c

-- 
1.7.3.4

[Qemu-devel] [PATCH 2/4] kvm: Introduce kvm_has_intx_set_mask

2012-08-27 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

Will be used by PCI device assignment code.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c |8 
 kvm.h |1 +
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index fd9d9b4..84d4f7f 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -88,6 +88,7 @@ struct KVMState
 int pit_state2;
 int xsave, xcrs;
 int many_ioeventfds;
+int intx_set_mask;
 /* The man page (and posix) say ioctl numbers are signed int, but
  * they're not.  Linux, glibc and *BSD all treat ioctl numbers as
  * unsigned, and treating them as signed here can break things */
@@ -1387,6 +1388,8 @@ int kvm_init(void)
         s->irq_set_ioctl = KVM_IRQ_LINE_STATUS;
     }
 
+    s->intx_set_mask = kvm_check_extension(s, KVM_CAP_PCI_2_3);
+
     ret = kvm_arch_init(s);
     if (ret < 0) {
         goto err;
@@ -1739,6 +1742,11 @@ int kvm_has_gsi_routing(void)
 #endif
 }
 
+int kvm_has_intx_set_mask(void)
+{
+    return kvm_state->intx_set_mask;
+}
+
 void *kvm_vmalloc(ram_addr_t size)
 {
 #ifdef TARGET_S390X
diff --git a/kvm.h b/kvm.h
index 5cefe3a..dea2998 100644
--- a/kvm.h
+++ b/kvm.h
@@ -117,6 +117,7 @@ int kvm_has_xcrs(void);
 int kvm_has_pit_state2(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_gsi_routing(void);
+int kvm_has_intx_set_mask(void);
 
 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUArchState *env);
-- 
1.7.3.4

[Qemu-devel] [PATCH 1/4] kvm: Introduce kvm_irqchip_update_msi_route

2012-08-27 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

This service allows to update an MSI route without releasing/reacquiring
the associated VIRQ. Will be used by PCI device assignment, later on
likely also by virtio/vhost and VFIO.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c |   42 ++
 kvm.h |1 +
 2 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index d4d8a1f..fd9d9b4 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -963,6 +963,30 @@ static void kvm_add_routing_entry(KVMState *s,
 kvm_irqchip_commit_routes(s);
 }
 
+static int kvm_update_routing_entry(KVMState *s,
+                                    struct kvm_irq_routing_entry *new_entry)
+{
+    struct kvm_irq_routing_entry *entry;
+    int n;
+
+    for (n = 0; n < s->irq_routes->nr; n++) {
+        entry = &s->irq_routes->entries[n];
+        if (entry->gsi != new_entry->gsi) {
+            continue;
+        }
+
+        entry->type = new_entry->type;
+        entry->flags = new_entry->flags;
+        entry->u = new_entry->u;
+
+        kvm_irqchip_commit_routes(s);
+
+        return 0;
+    }
+
+    return -ESRCH;
+}
+
 void kvm_irqchip_add_irq_route(KVMState *s, int irq, int irqchip, int pin)
 {
 struct kvm_irq_routing_entry e;
@@ -1125,6 +1149,24 @@ int kvm_irqchip_add_msi_route(KVMState *s, MSIMessage 
msg)
 return virq;
 }
 
+int kvm_irqchip_update_msi_route(KVMState *s, int virq, MSIMessage msg)
+{
+    struct kvm_irq_routing_entry kroute;
+
+    if (!kvm_irqchip_in_kernel()) {
+        return -ENOSYS;
+    }
+
+    kroute.gsi = virq;
+    kroute.type = KVM_IRQ_ROUTING_MSI;
+    kroute.flags = 0;
+    kroute.u.msi.address_lo = (uint32_t)msg.address;
+    kroute.u.msi.address_hi = msg.address >> 32;
+    kroute.u.msi.data = msg.data;
+
+    return kvm_update_routing_entry(s, &kroute);
+}
+
 static int kvm_irqchip_assign_irqfd(KVMState *s, int fd, int virq, bool assign)
 {
 struct kvm_irqfd irqfd = {
diff --git a/kvm.h b/kvm.h
index 37d1f81..5cefe3a 100644
--- a/kvm.h
+++ b/kvm.h
@@ -270,6 +270,7 @@ int kvm_set_ioeventfd_mmio(int fd, uint32_t adr, uint32_t 
val, bool assign,
 int kvm_set_ioeventfd_pio_word(int fd, uint16_t adr, uint16_t val, bool 
assign);
 
 int kvm_irqchip_add_msi_route(KVMState *s, MSIMessage msg);
+int kvm_irqchip_update_msi_route(KVMState *s, int virq, MSIMessage msg);
 void kvm_irqchip_release_virq(KVMState *s, int virq);
 
 int kvm_irqchip_add_irqfd_notifier(KVMState *s, EventNotifier *n, int virq);
-- 
1.7.3.4
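
The update-in-place idea from this patch (find the routing entry by GSI, overwrite it, commit once) can be tried outside QEMU with simplified stand-in types. This is a sketch under assumptions: `DemoRoute`, `DemoRouteTable`, `demo_update_route` and `demo_update_msi_route` are hypothetical names, and the `commits` counter merely stands in for `kvm_irqchip_commit_routes()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one MSI routing entry. */
typedef struct DemoRoute {
    uint32_t gsi;
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
} DemoRoute;

typedef struct DemoRouteTable {
    DemoRoute *entries;
    size_t nr;
    unsigned commits; /* counts how often the table was "committed" */
} DemoRouteTable;

/* Overwrite the entry whose GSI matches; -ESRCH if there is none. */
int demo_update_route(DemoRouteTable *t, const DemoRoute *new_entry)
{
    size_t n;

    assert(t != NULL && new_entry != NULL);
    for (n = 0; n < t->nr; n++) {
        if (t->entries[n].gsi != new_entry->gsi) {
            continue;
        }
        t->entries[n] = *new_entry;
        t->commits++; /* the real code pushes the table to the kernel here */
        return 0;
    }
    return -ESRCH;
}

/* Split the 64-bit MSI address into its low and high 32-bit halves. */
int demo_update_msi_route(DemoRouteTable *t, uint32_t virq,
                          uint64_t address, uint32_t data)
{
    DemoRoute r = {
        .gsi = virq,
        .address_lo = (uint32_t)address,
        .address_hi = (uint32_t)(address >> 32),
        .data = data,
    };

    return demo_update_route(t, &r);
}
```

Because the entry keeps its GSI, anything already bound to that virtual IRQ stays valid, which is the point of updating rather than releasing and re-adding the route.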

[Qemu-devel] [PATCH 3/4] kvm: i386: Add services required for PCI device assignment

2012-08-27 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

These helpers abstract the interaction of upcoming pci-assign with the
KVM kernel services. Put them under i386 only as other archs will
implement device pass-through via VFIO and not this classic interface.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 target-i386/kvm.c  |  141 
 target-i386/kvm_i386.h |   22 
 2 files changed, 163 insertions(+), 0 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 696b14a..5e2d4f5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -31,6 +31,7 @@
 #include "hw/apic.h"
 #include "ioport.h"
 #include "hyperv.h"
+#include "hw/pci.h"
 
 //#define DEBUG_KVM
 
@@ -2055,3 +2056,143 @@ void kvm_arch_init_irq_routing(KVMState *s)
 kvm_msi_via_irqfd_allowed = true;
 kvm_gsi_routing_allowed = true;
 }
+
+/* Classic KVM device assignment interface. Will remain x86 only. */
+int kvm_device_pci_assign(KVMState *s, PCIHostDeviceAddress *dev_addr,
+                          uint32_t flags, uint32_t *dev_id)
+{
+    struct kvm_assigned_pci_dev dev_data = {
+        .segnr = dev_addr->domain,
+        .busnr = dev_addr->bus,
+        .devfn = PCI_DEVFN(dev_addr->slot, dev_addr->function),
+        .flags = flags,
+    };
+    int ret;
+
+    dev_data.assigned_dev_id =
+        (dev_addr->domain << 16) | (dev_addr->bus << 8) | dev_data.devfn;
+
+    ret = kvm_vm_ioctl(s, KVM_ASSIGN_PCI_DEVICE, &dev_data);
+    if (ret < 0) {
+        return ret;
+    }
+
+    *dev_id = dev_data.assigned_dev_id;
+
+    return 0;
+}
+
+int kvm_device_pci_deassign(KVMState *s, uint32_t dev_id)
+{
+    struct kvm_assigned_pci_dev dev_data = {
+        .assigned_dev_id = dev_id,
+    };
+
+    return kvm_vm_ioctl(s, KVM_DEASSIGN_PCI_DEVICE, &dev_data);
+}
+
+static int kvm_assign_irq_internal(KVMState *s, uint32_t dev_id,
+                                   uint32_t irq_type, uint32_t guest_irq)
+{
+    struct kvm_assigned_irq assigned_irq = {
+        .assigned_dev_id = dev_id,
+        .guest_irq = guest_irq,
+        .flags = irq_type,
+    };
+
+    if (kvm_check_extension(s, KVM_CAP_ASSIGN_DEV_IRQ)) {
+        return kvm_vm_ioctl(s, KVM_ASSIGN_DEV_IRQ, &assigned_irq);
+    } else {
+        return kvm_vm_ioctl(s, KVM_ASSIGN_IRQ, &assigned_irq);
+    }
+}
+
+int kvm_device_intx_assign(KVMState *s, uint32_t dev_id, bool use_host_msi,
+                           uint32_t guest_irq)
+{
+    uint32_t irq_type = KVM_DEV_IRQ_GUEST_INTX |
+        (use_host_msi ? KVM_DEV_IRQ_HOST_MSI : KVM_DEV_IRQ_HOST_INTX);
+
+    return kvm_assign_irq_internal(s, dev_id, irq_type, guest_irq);
+}
+
+int kvm_device_intx_set_mask(KVMState *s, uint32_t dev_id, bool masked)
+{
+    struct kvm_assigned_pci_dev dev_data = {
+        .assigned_dev_id = dev_id,
+        .flags = masked ? KVM_DEV_ASSIGN_MASK_INTX : 0,
+    };
+
+    return kvm_vm_ioctl(s, KVM_ASSIGN_SET_INTX_MASK, &dev_data);
+}
+
+static int kvm_deassign_irq_internal(KVMState *s, uint32_t dev_id,
+                                     uint32_t type)
+{
+    struct kvm_assigned_irq assigned_irq = {
+        .assigned_dev_id = dev_id,
+        .flags = type,
+    };
+
+    return kvm_vm_ioctl(s, KVM_DEASSIGN_DEV_IRQ, &assigned_irq);
+}
+
+int kvm_device_intx_deassign(KVMState *s, uint32_t dev_id, bool use_host_msi)
+{
+    return kvm_deassign_irq_internal(s, dev_id, KVM_DEV_IRQ_GUEST_INTX |
+        (use_host_msi ? KVM_DEV_IRQ_HOST_MSI : KVM_DEV_IRQ_HOST_INTX));
+}
+
+int kvm_device_msi_assign(KVMState *s, uint32_t dev_id, int virq)
+{
+    return kvm_assign_irq_internal(s, dev_id, KVM_DEV_IRQ_HOST_MSI |
+                                              KVM_DEV_IRQ_GUEST_MSI, virq);
+}
+
+int kvm_device_msi_deassign(KVMState *s, uint32_t dev_id)
+{
+    return kvm_deassign_irq_internal(s, dev_id, KVM_DEV_IRQ_GUEST_MSI |
+                                                KVM_DEV_IRQ_HOST_MSI);
+}
+
+bool kvm_device_msix_supported(KVMState *s)
+{
+    /* The kernel lacks a corresponding KVM_CAP, so we probe by calling
+     * KVM_ASSIGN_SET_MSIX_NR with an invalid parameter. */
+    return kvm_vm_ioctl(s, KVM_ASSIGN_SET_MSIX_NR, NULL) == -EFAULT;
+}
+
+int kvm_device_msix_init_vectors(KVMState *s, uint32_t dev_id,
+                                 uint32_t nr_vectors)
+{
+    struct kvm_assigned_msix_nr msix_nr = {
+        .assigned_dev_id = dev_id,
+        .entry_nr = nr_vectors,
+    };
+
+    return kvm_vm_ioctl(s, KVM_ASSIGN_SET_MSIX_NR, &msix_nr);
+}
+
+int kvm_device_msix_set_vector(KVMState *s, uint32_t dev_id, uint32_t vector,
+                               int virq)
+{
+    struct kvm_assigned_msix_entry msix_entry = {
+        .assigned_dev_id = dev_id,
+        .gsi = virq,
+        .entry = vector,
+    };
+
+    return kvm_vm_ioctl(s, KVM_ASSIGN_SET_MSIX_ENTRY, &msix_entry);
+}
+
+int kvm_device_msix_assign(KVMState *s, uint32_t dev_id)
+{
+return kvm_assign_irq_internal(s, dev_id, KVM_DEV_IRQ_HOST_MSIX |
+
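
The device ID built in kvm_device_pci_assign() packs the PCI domain, bus and devfn into one 32-bit value. The following stand-alone sketch reproduces that packing so it can be checked by hand. `DEMO_PCI_DEVFN` and `demo_assigned_dev_id` are hypothetical names; the macro is assumed to pack a 5-bit slot and a 3-bit function the way a PCI devfn is normally encoded.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed devfn packing: 5-bit slot in bits 7..3, 3-bit function in bits 2..0. */
#define DEMO_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* domain in bits 31..16, bus in bits 15..8, devfn in bits 7..0 */
uint32_t demo_assigned_dev_id(uint16_t domain, uint8_t bus,
                              uint8_t slot, uint8_t function)
{
    uint32_t devfn;

    assert(slot < 32 && function < 8);
    devfn = DEMO_PCI_DEVFN(slot, function);
    return ((uint32_t)domain << 16) | ((uint32_t)bus << 8) | devfn;
}
```

For example, domain 0, bus 1, slot 2, function 3 gives devfn 0x13 and therefore the ID 0x0113.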

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

2012-08-27 Thread Paolo Bonzini

On 25/08/2012 09:42, liu ping fan wrote:
 
  I don't see why MMIO dispatch should hold the IDEBus ref rather than the
  PCIIDEState.
 
 When changing the third parameter of memory_region_init_io() from
 void *opaque to Object *obj, obj and opaque do not necessarily map 1:1.
 In that situation, for MemoryRegionOps to tell them apart, we should
 pass PCIIDEState->bus[0] and bus[1] separately.

The rule should be that the obj is the object that you want referenced,
and that should be the PCIIDEState.

But this is anyway moot because it only applies to objects that are
converted to use unlocked dispatch.  This likely will not be the case
for IDE.

Paolo

  In the case of the PIIX, the BARs are set up by the PCIIDEState in
  bmdma_setup_bar (called by bmdma_setup_bar).
 
 Suppose we have converted PCIIDEState->bmdma[0]/[1] to Object. In
 mmio-dispatch, object_ref will then apply to bmdma[0]/[1], but this cannot
 prevent PCIIDEState->refcnt from reaching 0, and then the whole object disappears!

[Qemu-devel] [PATCH V6 0/2] Add JSON output to qemu-img info

2012-08-27 Thread Benoît Canet

This patchset adds a JSON output mode to the qemu-img info command.
It's a rewrite from scratch of the original patchset by Wenchao Xia,
following Anthony Liguori's advice on JSON formatting.

The --output=(json|human) option is now mandatory on the command line.

Benoît Canet (3):
  qapi: Add SnapshotInfo.
  qapi: Add ImageInfo.
  qemu-img: Add json output option to the info command.

in v2:

eblake: make some fields optional
squash the two qapi patches together
fix a typo on vm_clock_nsec

bcanet: fix a potential memory leak

in v3: 

lcapitulino: 
   remove unneeded test
   put '\n' at the end of json in printf statement
   drop the unneeded head pointer in collect_snapshots

in v4:

Wenchao Xia & Kevin Wolf: -Refactor to separate ImageInfo
   collection from human printing.

Kevin Wolf: -Use --output=(json|human).
-make the two choices exclusive and print a message
if none is specified.
-cosmetic '=' alignment in collect snapshots.

Benoît Canet: -add full-backing-filename to the ImageInfo structure
  (needed for human printing)
  -make ImageInfo-actual_size optional depending on the
  context.
in v5:

Eric Blake: -use a constant for getopt parsing to avoid future
 short options collision.
-make the command default to --output=human.
-fix spurious whitespace change.
-split vm-clock-nsec in two fields vm-clock-sec and
 vm-clock-nsec.
-declare JSON structure as Since 1.3 
  
in v6:

Blue Swirl: -Add missing const in getopt structure declaration.

Eric Blake: -Remove spurious undef.
-Use an enum instead of two booleans.

Benoît Canet (2):
  qapi: Add SnapshotInfo and ImageInfo.
  qemu-img: Add json output option to the info command.

 Makefile |3 +-
 qapi-schema.json |   64 ++
 qemu-img.c   |  259 +-
 3 files changed, 282 insertions(+), 44 deletions(-)

-- 
1.7.9.5
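
Two of the review points in the changelog above (a getopt constant that cannot collide with a short option, and an enum instead of two booleans) can be sketched with plain getopt_long(). This is an illustration under assumptions, not the code from the patch: `OutputFormat`, `OPTION_OUTPUT` and `parse_output_format` are hypothetical names.

```c
#include <assert.h>
#include <getopt.h>
#include <string.h>

/* One enum instead of two booleans: exactly one format is selected. */
typedef enum {
    OFORMAT_HUMAN,
    OFORMAT_JSON,
    OFORMAT_INVALID
} OutputFormat;

/* A value outside the char range cannot collide with any short option. */
enum { OPTION_OUTPUT = 256 };

OutputFormat parse_output_format(int argc, char **argv)
{
    static const struct option long_options[] = {
        {"output", required_argument, 0, OPTION_OUTPUT},
        {0, 0, 0, 0}
    };
    OutputFormat fmt = OFORMAT_HUMAN; /* default when --output is absent */
    int c;

    assert(argc >= 1 && argv != NULL);
    optind = 1; /* restart scanning so the function can be called again */
    while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1) {
        if (c != OPTION_OUTPUT) {
            return OFORMAT_INVALID; /* unknown option or missing argument */
        }
        if (!strcmp(optarg, "json")) {
            fmt = OFORMAT_JSON;
        } else if (!strcmp(optarg, "human")) {
            fmt = OFORMAT_HUMAN;
        } else {
            return OFORMAT_INVALID;
        }
    }
    return fmt;
}
```

Defaulting to the human format keeps existing command lines working, which is the behaviour the v5 changelog entry asks for.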

[Qemu-devel] [PATCH V6 2/2] qemu-img: Add json output option to the info command.

2012-08-27 Thread Benoît Canet

The --output=[human|json] option makes qemu-img info print either a
human-readable or a JSON representation, at the user's choice.

example:
{
    "snapshots": [
        {
            "vm-clock-nsec": 637102488,
            "name": "vm-20120821145509",
            "date-sec": 1345553709,
            "date-nsec": 220289000,
            "vm-clock-sec": 20,
            "id": "1",
            "vm-state-size": 96522745
        },
        {
            "vm-clock-nsec": 28210866,
            "name": "vm-20120821154059",
            "date-sec": 1345556459,
            "date-nsec": 171392000,
            "vm-clock-sec": 46,
            "id": "2",
            "vm-state-size": 101208714
        }
    ],
    "virtual-size": 1073741824,
    "filename": "snap.qcow2",
    "cluster-size": 65536,
    "format": "qcow2",
    "actual-size": 985587712,
    "dirty-flag": false
}

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 Makefile   |3 +-
 qemu-img.c |  259 ++--
 2 files changed, 218 insertions(+), 44 deletions(-)

diff --git a/Makefile b/Makefile
index ab82ef3..9ba064b 100644
--- a/Makefile
+++ b/Makefile
@@ -160,7 +160,8 @@ tools-obj-y = $(oslib-obj-y) $(trace-obj-y) qemu-tool.o 
qemu-timer.o \
iohandler.o cutils.o iov.o async.o
 tools-obj-$(CONFIG_POSIX) += compatfd.o
 
-qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y)
+qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y) $(qapi-obj-y) \
+  qapi-visit.o qapi-types.o
 qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y) $(block-obj-y)
 qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y) $(block-obj-y)
 
diff --git a/qemu-img.c b/qemu-img.c
index 80cfb9b..fe4a4fc 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -21,12 +21,16 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
+#include "qapi-visit.h"
+#include "qapi/qmp-output-visitor.h"
+#include "qjson.h"
 #include "qemu-common.h"
 #include "qemu-option.h"
 #include "qemu-error.h"
 #include "osdep.h"
 #include "sysemu.h"
 #include "block_int.h"
+#include <getopt.h>
 #include <stdio.h>
 
 #ifdef _WIN32
@@ -84,6 +88,7 @@ static void help(void)
            "  '-p' show progress of command (only certain commands)\n"
            "  '-S' indicates the consecutive number of bytes that must contain 
 only zeros\n"
            "       for qemu-img to create a sparse image during conversion\n"
+           "  '--output' takes the format in which the output must be done 
 (human or json)\n"
            "\n"
            "Parameters to check subcommand:\n"
            "  '-r' tries to repair any inconsistencies that are found during 
 the check.\n"
@@ -1102,21 +1107,196 @@ static void dump_snapshots(BlockDriverState *bs)
 g_free(sn_tab);
 }
 
-static int img_info(int argc, char **argv)
+static void collect_snapshots(BlockDriverState *bs , ImageInfo *info)
+{
+    int i, sn_count;
+    QEMUSnapshotInfo *sn_tab = NULL;
+    SnapshotInfoList *info_list, *cur_item = NULL;
+    sn_count = bdrv_snapshot_list(bs, &sn_tab);
+
+    for (i = 0; i < sn_count; i++) {
+        info->has_snapshots = true;
+        info_list = g_new0(SnapshotInfoList, 1);
+
+        info_list->value                = g_new0(SnapshotInfo, 1);
+        info_list->value->id            = g_strdup(sn_tab[i].id_str);
+        info_list->value->name          = g_strdup(sn_tab[i].name);
+        info_list->value->vm_state_size = sn_tab[i].vm_state_size;
+        info_list->value->date_sec      = sn_tab[i].date_sec;
+        info_list->value->date_nsec     = sn_tab[i].date_nsec;
+        info_list->value->vm_clock_sec  = sn_tab[i].vm_clock_nsec / 1000000000;
+        info_list->value->vm_clock_nsec = sn_tab[i].vm_clock_nsec % 1000000000;
+
+        /* XXX: waiting for the qapi to support GSList */
+        if (!cur_item) {
+            info->snapshots = cur_item = info_list;
+        } else {
+            cur_item->next = info_list;
+            cur_item = info_list;
+        }
+
+    }
+
+    g_free(sn_tab);
+}
+
+static void dump_json_image_info(ImageInfo *info)
+{
+    Error *errp = NULL;
+    QString *str;
+    QmpOutputVisitor *ov = qmp_output_visitor_new();
+    QObject *obj;
+    visit_type_ImageInfo(qmp_output_get_visitor(ov),
+                         &info, NULL, &errp);
+    obj = qmp_output_get_qobject(ov);
+    str = qobject_to_json_pretty(obj);
+    assert(str != NULL);
+    printf("%s\n", qstring_get_str(str));
+    qobject_decref(obj);
+    qmp_output_visitor_cleanup(ov);
+    QDECREF(str);
+}
+
+static void collect_backing_file_format(ImageInfo *info, char *filename)
+{
+    BlockDriverState *bs = NULL;
+    bs = bdrv_new_open(filename, NULL,
+                       BDRV_O_FLAGS | BDRV_O_NO_BACKING);
+    if (!bs) {
+        return;
+    }
+    info->backing_filename_format =
+        g_strdup(bdrv_get_format_name(bs));
+    bdrv_delete(bs);
+    info->has_backing_filename_format = true;
+}
+
+static void collect_image_info(BlockDriverState *bs,
+   ImageInfo *info,
+   const char

[Qemu-devel] [PATCH V6 1/2] qapi: Add SnapshotInfo and ImageInfo.

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 qapi-schema.json |   64 ++
 1 file changed, 64 insertions(+)

diff --git a/qapi-schema.json b/qapi-schema.json
index a92adb1..ffe3a0a 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -126,6 +126,70 @@
 'running', 'save-vm', 'shutdown', 'suspended', 'watchdog' ] }
 
 ##
+# @SnapshotInfo
+#
+# @id: unique snapshot id
+#
+# @name: user chosen name
+#
+# @vm-state-size: size of the VM state
+#
+# @date-sec: UTC date of the snapshot in seconds
+#
+# @date-nsec: fractional part in nano seconds to be used with date-sec
+#
+# @vm-clock-sec: VM clock relative to boot in seconds
+#
+# @vm-clock-nsec: fractional part in nano seconds to be used with vm-clock-sec
+#
+# Since: 1.3
+#
+##
+
+{ 'type': 'SnapshotInfo',
+  'data': { 'id': 'str', 'name': 'str', 'vm-state-size': 'int',
+'date-sec': 'int', 'date-nsec': 'int',
+'vm-clock-sec': 'int', 'vm-clock-nsec': 'int' } }
+
+##
+# @ImageInfo:
+#
+# Information about a QEMU image file
+#
+# @filename: name of the image file
+#
+# @format: format of the image file
+#
+# @virtual-size: maximum capacity in bytes of the image
+#
+# @actual-size: #optional actual size on disk in bytes of the image
+#
+# @dirty-flag: #optional true if image is not cleanly closed
+#
+# @cluster-size: #optional size of a cluster in bytes
+#
+# @encrypted: #optional true if the image is encrypted
+#
+# @backing-filename: #optional name of the backing file
+#
+# @full-backing-filename: #optional full path of the backing file
+#
+# @backing-filename-format: #optional the format of the backing file
+#
+# @snapshots: #optional list of VM snapshots
+#
+# Since: 1.3
+#
+##
+
+{ 'type': 'ImageInfo',
+  'data': {'filename': 'str', 'format': 'str', '*dirty-flag': 'bool',
+   '*actual-size': 'int', 'virtual-size': 'int',
+   '*cluster-size': 'int', '*encrypted': 'bool',
+   '*backing-filename': 'str', '*full-backing-filename': 'str',
+   '*backing-filename-format': 'str', '*snapshots': ['SnapshotInfo'] } 
}
+
+##
 # @StatusInfo:
 #
 # Information about VCPU run state
-- 
1.7.9.5

Re: [Qemu-devel] [PATCH 8/9] qdev: make qdev_set_parent_bus() just set a link property

2012-08-27 Thread liu ping fan

On Sun, Aug 26, 2012 at 11:51 PM, Anthony Liguori aligu...@us.ibm.com wrote:
 Also make setting the link to NULL break the bus link

 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 ---
  hw/qdev.c |   48 ++--
  1 files changed, 42 insertions(+), 6 deletions(-)

 diff --git a/hw/qdev.c b/hw/qdev.c
 index 86e1337..525a0cb 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 @@ -100,8 +100,7 @@ static void bus_add_child(BusState *bus, DeviceState 
 *child)

  void qdev_set_parent_bus(DeviceState *dev, BusState *bus)
  {
 -    dev->parent_bus = bus;
 -    bus_add_child(bus, dev);
 +    object_property_set_link(OBJECT(dev), OBJECT(bus), "parent_bus", NULL);
  }

  /* Create a new device.  This only initializes the device state structure
 @@ -241,8 +240,8 @@ void qbus_reset_all_fn(void *opaque)
  /* can be used as ->unplug() callback for the simple cases */
  int qdev_simple_unplug_cb(DeviceState *dev)
  {
 -/* just zap it */
 -qdev_free(dev);
 +/* Unplug from parent bus via a forced eject */
 +qdev_set_parent_bus(dev, NULL);

I think it is more reliable to remove the reference properties (child,
link) before object_finalize().  So when the unplug finishes, we delete all
the references (bus-child, bus-child) with _del_property rather than
_set_property.

  return 0;
  }

 @@ -646,6 +645,40 @@ void qdev_property_add_static(DeviceState *dev, Property 
 *prop,
  assert_no_error(local_err);
  }

 +static void qdev_set_link_property(Object *obj, Visitor *v, void *opaque,
 +                                   const char *name, Error **errp)
 +{
 +    DeviceState *dev = DEVICE(obj);
 +    BusState *parent_bus = dev->parent_bus;
 +
 +    object_set_link_property(obj, v, opaque, name, errp);
 +
 +    if (parent_bus) {
 +        bus_remove_child(parent_bus, dev);
 +    }
 +
 +    if (dev->parent_bus) {
 +        bus_add_child(dev->parent_bus, dev);
 +    }
 +
 +    if (!dev->parent_bus) {
 +        notifier_list_notify(&dev->eject_notifier, dev);
 +    }
 +}
 +
 +static void qdev_release_link_property(Object *obj, const char *name,
 +                                       void *opaque)
 +{
 +    DeviceState *dev = DEVICE(obj);
 +
 +    if (dev->parent_bus) {
 +        bus_remove_child(dev->parent_bus, dev);
 +        object_unref(OBJECT(dev->parent_bus));
 +    }
 +
 +    dev->parent_bus = NULL;
 +}
 +
  static void device_initfn(Object *obj)
  {
  DeviceState *dev = DEVICE(obj);
 @@ -670,8 +703,11 @@ static void device_initfn(Object *obj)
  } while (class != object_class_by_name(TYPE_DEVICE));
  qdev_prop_set_globals(dev);

 -    object_property_add_link(OBJECT(dev), "parent_bus", TYPE_BUS,
 -                             (Object **)&dev->parent_bus, NULL);
 +    object_property_add(OBJECT(dev), "parent_bus", "link<" TYPE_BUS ">",
 +                        object_get_link_property,
 +                        qdev_set_link_property,
 +                        qdev_release_link_property,
 +                        &dev->parent_bus, NULL);

      notifier_list_init(&dev->eject_notifier);
  }
 --
 1.7.5.4
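
The behaviour of the quoted setter (leave the old bus, join the new one, and signal an eject when the link ends up NULL) can be modelled with two small structs. This is a sketch under assumptions: `DemoBus`, `DemoDev` and `demo_set_parent_bus` are hypothetical names, a child counter replaces the bus's child list, and a counter replaces the eject notifier list.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoBus {
    int nr_children; /* stand-in for the bus's list of children */
} DemoBus;

typedef struct DemoDev {
    DemoBus *parent_bus;
    int eject_notified; /* counts eject notifications */
} DemoDev;

/* Setter: leave the old bus, join the new one, notify if now detached. */
void demo_set_parent_bus(DemoDev *dev, DemoBus *bus)
{
    DemoBus *old = dev->parent_bus;

    dev->parent_bus = bus;

    if (old) {
        assert(old->nr_children > 0);
        old->nr_children--;
    }
    if (dev->parent_bus) {
        dev->parent_bus->nr_children++;
    } else {
        dev->eject_notified++; /* link is NULL after the write: eject */
    }
}
```

As in the quoted patch, the notification fires whenever the link is NULL after the write, so writing NULL is what models a forced eject.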

Re: [Qemu-devel] [RFC PATCH 0/9] qom: improve reference counting and hotplug

2012-08-27 Thread liu ping fan

On Sun, Aug 26, 2012 at 11:51 PM, Anthony Liguori aligu...@us.ibm.com wrote:
 Right now, you need to pair up object_new with object_delete.  This is
 impractical when using reference counting because we would like to ensure that
 object_unref() also frees memory when needed.

 The first few patches fix this problem by introducing a release callback so
 that objects that need special release behavior (i.e. g_free) can do that.

 Since link and child properties all hold references, in order to actually free
 an object, we need to break those links.  User created devices end up as
 children of a container.  But child properties cannot be removed which means
 there's no obvious way to remove the reference and ultimately free the object.

Why? Since we call _add_child() in qdev_device_add(), why can't we
call object_property_del_child() in qmp_device_del()? Could you
explain this in more detail?

 We introduce the concept of nullable child properties to solve this.  This is
 a child property that can be broken by writing NULL to the child link.  Today
 we set all /peripheral* children to be nullable so that they can be deleted by
 management tools.

 In terms of modeling hotplug, we represent unplug by removing the object from
 the parent bus.  We need to register a notifier for when this happens so that
 we can also remove the parent's child property to ultimately release the object.

 Putting it all together, we have:

 1) qmp_device_del will issue a callback to a device.  The default callback will
    do a forced eject (which means writing NULL to the parent_bus link).

 2) PCI hotplug is a bit more sophisticated in that it waits for the guest to
do the ejection.

 3) qmp_device_del will register an eject notifier such that the device gets
completely removed.

 There's a slight change in behavior here.  A device is not automatically
 destroyed based on a guest initiated eject.  A management tool must explicitly
 break the parent's link to the child in order for the device to disappear
 completely.  device_del behaves exactly as it does today though.

 This is an RFC.  I've tested the series quite a lot (it was hard to get the
 reference counting right) but not enough to apply.  I also don't think the
 series is quite split right and may not bisect cleanly.

 I also want to write up a document describing object life cycle since admittedly
 the above is probably not that easy to follow.

 I wanted to share this now though because it works and I think the concepts are
 right.
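
The life cycle described in the quoted cover letter can be illustrated with a small, self-contained C sketch. This is not the QOM API: `Obj`, `obj_set_child()`, `obj_clear_child()` and `released_count` are hypothetical names invented for illustration. The sketch only assumes what the cover letter states: an object is freed through a release callback when its last reference drops, a parent holds a reference through its child link, and only a nullable child link can be broken by writing NULL to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical miniature object model; not the real QOM types. */
typedef struct Obj Obj;
struct Obj {
    int ref;                    /* reference count */
    void (*release)(Obj *obj);  /* called when the last reference drops */
    Obj *child;                 /* parent's reference to a child, or NULL */
    int child_nullable;         /* may the link be broken by writing NULL? */
};

static int released_count;      /* bookkeeping for the illustration only */

static void obj_free(Obj *obj)
{
    released_count++;
    free(obj);
}

static Obj *obj_new(void)
{
    Obj *o = calloc(1, sizeof(*o));
    if (!o) {
        abort();
    }
    o->ref = 1;
    o->release = obj_free;
    return o;
}

static void obj_ref(Obj *o)
{
    o->ref++;
}

static void obj_unref(Obj *o)
{
    assert(o->ref > 0);
    if (--o->ref == 0) {
        if (o->child) {
            obj_unref(o->child);    /* drop the reference held by the link */
        }
        if (o->release) {
            o->release(o);
        }
    }
}

/* The parent takes its own reference on the child. */
static void obj_set_child(Obj *parent, Obj *child, int nullable)
{
    obj_ref(child);
    parent->child = child;
    parent->child_nullable = nullable;
}

/* "Write NULL to the child link"; fails unless the link is nullable. */
static int obj_clear_child(Obj *parent)
{
    Obj *child = parent->child;

    if (!child || !parent->child_nullable) {
        return -1;
    }
    parent->child = NULL;
    obj_unref(child);
    return 0;
}
```

In this model a device created by a management tool stays alive after its creator drops its reference, because the container still holds one; it is only released once the nullable link is cleared.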

Re: [Qemu-devel] [RFC][PATCH v4 3/3] tcg: Optimize qemu_ld/st by generating slow paths at the end of a block

2012-08-27 Thread Yeongkyoon Lee


On 2012-07-29 00:39, Yeongkyoon Lee wrote:

On 2012-07-25 23:00, Richard Henderson wrote:

On 07/25/2012 12:35 AM, Yeongkyoon Lee wrote:

+#if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
+/* Macros/structures for qemu_ld/st IR code optimization:
+   TCG_MAX_HELPER_LABELS is defined as same as OPC_BUF_SIZE in exec-all.h. */

+#define TCG_MAX_QEMU_LDST   640

Why statically size this ...


This just follows the existing TCG code style, namely the allocation of the
labels of TCGContext in tcg.c.






+/* labels info for qemu_ld/st IRs
+   The labels help to generate TLB miss case codes at the end of TB */

+TCGLabelQemuLdst *qemu_ldst_labels;

... and then allocate the array dynamically?


ditto.




+/* jne slow_path */
+/* XXX: How to avoid using OPC_JCC_long for peephole optimization? */

+tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);

You can't, not and maintain the code-generate-until-address-reached
exception invariant.


+#ifndef CONFIG_QEMU_LDST_OPTIMIZATION
  uint8_t __ldb_mmu(target_ulong addr, int mmu_idx);
  void __stb_mmu(target_ulong addr, uint8_t val, int mmu_idx);
  uint16_t __ldw_mmu(target_ulong addr, int mmu_idx);
@@ -28,6 +30,30 @@ void __stl_cmmu(target_ulong addr, uint32_t val, int mmu_idx);

  uint64_t __ldq_cmmu(target_ulong addr, int mmu_idx);
  void __stq_cmmu(target_ulong addr, uint64_t val, int mmu_idx);
  #else
+/* Extended versions of MMU helpers for qemu_ld/st optimization.
+   The additional argument is a host code address accessing guest memory */

+uint8_t ext_ldb_mmu(target_ulong addr, int mmu_idx, uintptr_t ra);

Don't tie LDST_OPTIMIZATION directly to the extended function calls.

For a host supporting predication, like ARM, the best code sequence
may look like

(1) TLB check
(2) If hit, load value from memory
(3) If miss, call miss case (5)
(4) ... next code
...
(5) Load call parameters
(6) Tail call (aka jump) to MMU helper

so that (a) we need not explicitly load the address of (3) by hand
for your RA parameter and (b) the mmu helper returns directly to (4).


r~


I think the difference between current HEAD and the code sequence you
describe is code locality.
My LDST_OPTIMIZATION patches improve code locality and also
remove one jump.
They show about a 4% improvement in CoreMark performance on an x86 host,
which supports predication like ARM.
The performance gain for AREG0 cases would probably be even
larger.
I'm not sure yet where the performance gain comes from; I'll
check it with some tests later.


In my humble opinion, there is nothing to lose with LDST_OPTIMIZATION
except for implicitly adding one argument to the MMU helpers, which
doesn't look critical.

What is your opinion?

Thanks.



It's been a long time.

I've measured the performance impact of the one-jump difference in the fast
qemu_ld/st path (TLB hit).
The result shows a 3.6% CoreMark improvement from removing one jump, with
slow paths generated at the end of the block in both cases.
That means removing one jump accounts for most of the performance
gain from LDST_OPTIMIZATION.
As a result, the extended MMU helper functions are needed to attain that
gain, and those extended functions are used only implicitly.


BTW, who will finally confirm my patches?
I have sent four versions of my patches, in which I have applied all the
reasonable feedback from this community.
Currently, v4 is the final candidate, though it might need merging with the
latest HEAD because it was sent one month ago.


Thanks.
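
The fast-path/slow-path split discussed in this thread can be modelled in plain C. The sketch below is only an analogy, not TCG output or QEMU code: the tiny direct-mapped TLB, `guest_ldb()`, `sketch_ldb_slow()` and the counters are hypothetical names, and the `ra` argument merely stands in for the host return address that the extended helpers receive. The fast path is a tag compare plus a load; on a miss, control leaves the hot path for an out-of-line helper that refills the entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SIZE  8
#define PAGE_BITS 12
#define PAGE_MASK (~(((uint32_t)1 << PAGE_BITS) - 1))

typedef struct {
    uint32_t tag;        /* guest page address */
    uint8_t *host_page;  /* host pointer to the start of that page */
    int valid;
} TlbEntry;

static TlbEntry tlb[TLB_SIZE];
static uint8_t guest_ram[4 << PAGE_BITS];   /* four guest pages */
static unsigned slow_path_calls;
static uintptr_t last_ra;

/* Slow path: refill the TLB entry, then perform the access.
 * 'ra' models the extra "host code address" helper argument. */
static uint8_t sketch_ldb_slow(uint32_t addr, uintptr_t ra)
{
    TlbEntry *e = &tlb[(addr >> PAGE_BITS) % TLB_SIZE];

    assert(addr < sizeof(guest_ram));
    slow_path_calls++;
    last_ra = ra;
    e->tag = addr & PAGE_MASK;
    e->host_page = &guest_ram[addr & PAGE_MASK];
    e->valid = 1;
    return e->host_page[addr & ~PAGE_MASK];
}

/* Fast path: one compare and one load when the TLB hits. */
static uint8_t guest_ldb(uint32_t addr, uintptr_t ra)
{
    TlbEntry *e = &tlb[(addr >> PAGE_BITS) % TLB_SIZE];

    if (e->valid && e->tag == (addr & PAGE_MASK)) {
        return e->host_page[addr & ~PAGE_MASK];
    }
    return sketch_ldb_slow(addr, ra);   /* rare case, kept out of line */
}
```

Only the first access to a page goes through the helper; later accesses to the same page stay on the short path, which is the case the quoted measurements are about.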

Re: [Qemu-devel] [PATCH 9/9] hotplug: refactor hotplug to leverage new QOM functions

2012-08-27 Thread liu ping fan

On Sun, Aug 26, 2012 at 11:51 PM, Anthony Liguori aligu...@us.ibm.com wrote:
 1) DeviceState::unplug requests for an eject to happen
- the default implementation is a forced eject

 2) A bus can eject a device by setting the parent_bus to NULL
- this detaches the device from the bus
- this does *not* cause the device to disappear

 3) The current implementation on unplug also registers an eject notifier
    - the eject notifier will detach the device from the parent.  This will cause
      the device to disappear

 4) A purely guest initiated unplug will not delete a device but will cause the
    device to appear detached from the guest's PoV.

 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 ---
  hw/acpi_piix4.c   |3 ++-
  hw/pci.c  |   10 +-
  hw/pcie.c |2 +-
  hw/qdev.c |   22 ++
  hw/qdev.h |2 ++
  hw/shpc.c |2 +-
  hw/xen_platform.c |2 +-
  7 files changed, 38 insertions(+), 5 deletions(-)

 diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
 index 72d6e5c..eac53b3 100644
 --- a/hw/acpi_piix4.c
 +++ b/hw/acpi_piix4.c
 @@ -305,7 +305,8 @@ static void acpi_piix_eject_slot(PIIX4PMState *s, unsigned slots)
  if (pc->no_hotplug) {
  slot_free = false;
  } else {
 -qdev_free(qdev);
 +/* Force eject of device */
 +qdev_set_parent_bus(qdev, NULL);

Do we need to wait for the guest's ACKs for all of this device's children
before we change this node's topology in the device tree?
I think we can mark the current device as unplug_state=ACK, and then
decide whether to detach it or not.
Each unplug ack from the guest will first check whether the current node
can be released.  If it can be released, then go bottom-up
through the device tree to check whether the upper device can be
released as well.
If the removal of the lower device (devB) causes the upper device (devA) to
become a leaf, then we can remove devA.
A leaf device is defined as one that has no BusState kids, OR all of its
BusState kids are empty.

This method can avoid sudden removal.
  }
  }
  }
 diff --git a/hw/pci.c b/hw/pci.c
 index 437af70..cc555c2 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
 @@ -46,6 +46,14 @@ static char *pcibus_get_dev_path(DeviceState *dev);
  static char *pcibus_get_fw_dev_path(DeviceState *dev);
  static int pcibus_reset(BusState *qbus);

 +static void pcibus_remove_child(BusState *bus, DeviceState *dev)
 +{
 +    PCIDevice *pci_dev = PCI_DEVICE(dev);
 +    PCIBus *pci_bus = PCI_BUS(bus);
 +
 +    pci_bus->devices[pci_dev->devfn] = NULL;
 +}
 +
  static Property pci_props[] = {
  DEFINE_PROP_PCI_DEVFN(addr, PCIDevice, devfn, -1),
  DEFINE_PROP_STRING(romfile, PCIDevice, romfile),
 @@ -65,6 +73,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
      k->get_dev_path = pcibus_get_dev_path;
      k->get_fw_dev_path = pcibus_get_fw_dev_path;
      k->reset = pcibus_reset;
 +    k->remove_child = pcibus_remove_child;
  }

  static const TypeInfo pci_bus_info = {
 @@ -833,7 +842,6 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
  static void do_pci_unregister_device(PCIDevice *pci_dev)
  {
      qemu_free_irqs(pci_dev->irq);
 -    pci_dev->bus->devices[pci_dev->devfn] = NULL;
      pci_config_free(pci_dev);
  }

 diff --git a/hw/pcie.c b/hw/pcie.c
 index 7c92f19..d10ffea 100644
 --- a/hw/pcie.c
 +++ b/hw/pcie.c
 @@ -235,7 +235,7 @@ static int pcie_cap_slot_hotplug(DeviceState *qdev,
 PCI_EXP_SLTSTA_PDS);
  pcie_cap_slot_event(d, PCI_EXP_HP_EV_PDC);
  } else {
 -qdev_free(&pci_dev->qdev);
 +qdev_set_parent_bus(DEVICE(pci_dev), NULL);
  pci_word_test_and_clear_mask(exp_cap + PCI_EXP_SLTSTA,
   PCI_EXP_SLTSTA_PDS);
  pcie_cap_slot_event(d, PCI_EXP_HP_EV_PDC);
 diff --git a/hw/qdev.c b/hw/qdev.c
 index 525a0cb..be41f00 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 @@ -62,6 +62,7 @@ static void qdev_property_add_legacy(DeviceState *dev, Property *prop,

  static void bus_remove_child(BusState *bus, DeviceState *child)
  {
 +BusClass *bc = BUS_GET_CLASS(bus);
  BusChild *kid;

  QTAILQ_FOREACH(kid, &bus->children, sibling) {
 @@ -71,6 +72,11 @@ static void bus_remove_child(BusState *bus, DeviceState *child)
  snprintf(name, sizeof(name), "child[%d]", kid->index);
  QTAILQ_REMOVE(&bus->children, kid, sibling);
  object_property_del(OBJECT(bus), name, NULL);
 +
 +if (bc->remove_child) {
 +bc->remove_child(bus, kid->child);
 +}
 +
  g_free(kid);
  return;
  }
 @@ -192,9 +198,20 @@ void qdev_set_legacy_instance_id(DeviceState *dev, int 
 alias_id,
  dev-alias_required_for_version = required_for_version;
  }

 +static void qdev_finish_unplug(Notifier *notifier, void *data)
 +{
 +

Re: [Qemu-devel] [PATCH v7 0/6] convert sendkey to qapi

2012-08-27 Thread Amos Kong


On 20/08/12 23:08, Luiz Capitulino wrote:

On Mon, 20 Aug 2012 07:25:13 -0600
Eric Blake ebl...@redhat.com wrote:


On 08/19/2012 10:39 PM, Amos Kong wrote:

This series converted 'sendkey' command to qapi. The raw value
in hexadecimal format is not supported by 'send-key' of qmp.


Are we still trying to get this into 1.2, or have we missed that deadline?


Too late for 1.2, IMO.


So I need to wait and repost a V8 (# Since: 1.3) after 1.2 is released?

--
Amos.

[Qemu-devel] [RFC V5 08/11] quorum: Add quorum mechanism.

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/quorum.c |  222 +++-
 1 file changed, 221 insertions(+), 1 deletion(-)

diff --git a/block/quorum.c b/block/quorum.c
index 791ef4a..3fa9d53 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -14,6 +14,20 @@
  */
 
 #include "block_int.h"
+#include <zlib.h>
+
+typedef struct QuorumVoteItem {
+int index;
+QLIST_ENTRY(QuorumVoteItem) next;
+} QuorumVoteItem;
+
+typedef struct QuorumVoteVersion {
+unsigned long value;
+int index;
+int vote_count;
+QLIST_HEAD(, QuorumVoteItem) items;
+QLIST_ENTRY(QuorumVoteVersion) next;
+} QuorumVoteVersion;
 
 typedef struct {
 BlockDriverState **bs;
@@ -31,6 +45,10 @@ typedef struct QuorumSingleAIOCB {
 QuorumAIOCB *parent;
 } QuorumSingleAIOCB;
 
+typedef struct QuorumVotes {
+QLIST_HEAD(, QuorumVoteVersion) vote_list;
+} QuorumVotes;
+
 struct QuorumAIOCB {
 BlockDriverAIOCB common;
 BDRVQuorumState *bqs;
@@ -48,6 +66,8 @@ struct QuorumAIOCB {
 int success_count;  /* number of successfully completed AIOCB */
 bool *finished; /* completion signal for cancel */
 
+QuorumVotes votes;
+
 void (*vote)(QuorumAIOCB *acb);
 int vote_ret;
 };
@@ -204,6 +224,11 @@ static void quorum_aio_bh(void *opaque)
 }
 
     qemu_bh_delete(acb->bh);
+
+    if (acb->vote_ret) {
+        ret = acb->vote_ret;
+    }
+
     acb->common.cb(acb->common.opaque, ret);
     if (acb->finished) {
         *acb->finished = true;
@@ -239,6 +264,7 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
     acb->nb_sectors = nb_sectors;
     acb->vote = NULL;
     acb->vote_ret = 0;
+    QLIST_INIT(&acb->votes.vote_list);
 
     for (i = 0; i < s->total; i++) {
         acb->aios[i].buf = NULL;
@@ -266,10 +292,202 @@ static void quorum_aio_cb(void *opaque, int ret)
 return;
 }
 
+    /* Do the vote */
+    if (acb->vote) {
+        acb->vote(acb);
+    }
+
     acb->bh = qemu_bh_new(quorum_aio_bh, acb);
     qemu_bh_schedule(acb->bh);
 }
 
+static void quorum_print_bad(QuorumAIOCB *acb, const char *filename)
+{
+    fprintf(stderr, "quorum: corrected error in quorum file %s: sector_num=%"
+            PRId64 " nb_sectors=%i\n", filename, acb->sector_num,
+            acb->nb_sectors);
+}
+
+static void quorum_print_failure(QuorumAIOCB *acb)
+{
+    fprintf(stderr, "quorum: failure sector_num=%" PRId64 " nb_sectors=%i\n",
+            acb->sector_num, acb->nb_sectors);
+}
+
+static void quorum_print_bad_versions(QuorumAIOCB *acb,
+                                      unsigned long checksum)
+{
+    QuorumVoteVersion *version;
+    QuorumVoteItem *item;
+    BDRVQuorumState *s = acb->bqs;
+
+    QLIST_FOREACH(version, &acb->votes.vote_list, next) {
+        if (version->value == checksum) {
+            continue;
+        }
+        QLIST_FOREACH(item, &version->items, next) {
+            quorum_print_bad(acb, s->filenames[item->index]);
+        }
+    }
+}
+
+static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
+{
+    int i;
+    assert(dest->niov == source->niov);
+    assert(dest->size == source->size);
+    for (i = 0; i < source->niov; i++) {
+        assert(dest->iov[i].iov_len == source->iov[i].iov_len);
+        memcpy(dest->iov[i].iov_base,
+               source->iov[i].iov_base,
+               source->iov[i].iov_len);
+    }
+}
+
+static void quorum_count_vote(QuorumVotes *votes,
+                              unsigned long checksum,
+                              int index)
+{
+    QuorumVoteVersion *v = NULL, *version = NULL;
+    QuorumVoteItem *item;
+
+    /* look if we have something with this checksum */
+    QLIST_FOREACH(v, &votes->vote_list, next) {
+        if (v->value == checksum) {
+            version = v;
+            break;
+        }
+    }
+
+    /* It's a version not yet in the list add it */
+    if (!version) {
+        version = g_new0(QuorumVoteVersion, 1);
+        QLIST_INIT(&version->items);
+        version->value = checksum;
+        version->index = index;
+        version->vote_count = 0;
+        QLIST_INSERT_HEAD(&votes->vote_list, version, next);
+    }
+
+    version->vote_count++;
+
+    item = g_new0(QuorumVoteItem, 1);
+    item->index = index;
+    QLIST_INSERT_HEAD(&version->items, item, next);
+}
+
+static void quorum_free_vote_list(QuorumVotes *votes)
+{
+    QuorumVoteVersion *version, *next_version;
+    QuorumVoteItem *item, *next_item;
+
+    QLIST_FOREACH_SAFE(version, &votes->vote_list, next, next_version) {
+        QLIST_REMOVE(version, next);
+        QLIST_FOREACH_SAFE(item, &version->items, next, next_item) {
+            QLIST_REMOVE(item, next);
+            g_free(item);
+        }
+        g_free(version);
+    }
+}
+
+static unsigned long quorum_compute_checksum(QuorumAIOCB *acb, int i)
+{
+    int j;
+    unsigned long adler = adler32(0L, Z_NULL, 0);
+    QEMUIOVector *qiov = &acb->qiovs[i];
+
+    for (j = 0; j < qiov->niov; j++) {
+        adler = adler32(adler,
+

Re: [Qemu-devel] [PATCH 1/2] migration: Allow the migrate command to work on file: urls

2012-08-27 Thread Benoît Canet

Adding Luiz to the thread since he is involved with migration.

Luiz, do you have any hints on doing this properly?

Benoît

 On Thursday 23 Aug 2012 at 13:34:01 (+0100), Daniel P. Berrange wrote:
 On Thu, Aug 23, 2012 at 02:28:07PM +0200, Benoît Canet wrote:
  Usage:
  (qemu) migrate file:/path/to/vm_statefile
  
  Signed-off-by: Benoit Canet ben...@irqsave.net
  ---
   migration-fd.c |4 ++--
   migration.c|   20 +++-
   migration.h|2 +-
   3 files changed, 22 insertions(+), 4 deletions(-)
  
  diff --git a/migration-fd.c b/migration-fd.c
  index 50138ed..d39e44a 100644
  --- a/migration-fd.c
  +++ b/migration-fd.c
  @@ -73,9 +73,9 @@ static int fd_close(MigrationState *s)
   return 0;
   }
   
  -int fd_start_outgoing_migration(MigrationState *s, const char *fdname)
  +int fd_start_outgoing_migration(MigrationState *s, const char *fdname, int fd)
   {
  -    s->fd = monitor_get_fd(cur_mon, fdname);
  +    s->fd = fd ? fd : monitor_get_fd(cur_mon, fdname);
       if (s->fd == -1) {
           DPRINTF("fd_migration: invalid file descriptor identifier\n");
           goto err_after_get_fd;
  diff --git a/migration.c b/migration.c
  index 1edeec5..679847d 100644
  --- a/migration.c
  +++ b/migration.c
  @@ -239,9 +239,14 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
   static int migrate_fd_cleanup(MigrationState *s)
   {
   int ret = 0;
  +struct stat st;
   
       qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
   
  +    if (!fstat(s->fd, &st) && S_ISREG(st.st_mode)) {
  +        fsync(s->fd);
  +    }
  +
       if (s->file) {
           DPRINTF("closing file\n");
           ret = qemu_fclose(s->file);
  @@ -475,6 +480,17 @@ void migrate_del_blocker(Error *reason)
   migration_blockers = g_slist_remove(migration_blockers, reason);
   }
   
  +static int file_start_outgoing_migration(MigrationState *s,
  +                                         const char *filename)
  +{
  +    int fd;
  +    fd = open(filename, O_CREAT|O_TRUNC|O_WRONLY, S_IRUSR|S_IWUSR);
  +    if (fd < 0) {
  +        return -errno;
  +    }
  +    return fd_start_outgoing_migration(s, NULL, fd);
 
 'fd_start_outgoing_migration' requires that the FD you give it
 supports non-blocking I/O. File descriptors opened from plain
 files or block devices do not honour that requirement. So this
 proposed code will cause the entire QEMU process to block while
 migration is taking place. This is why no one has ever implemented
 the 'file:' protocol in QEMU before.
 
 To deal with this issue you'd either have to use the POSIX
 async I/O APIs (or QEMU's internal equivalent), or spawn a
 separate 'dd' helper process and give QEMU a pipe FD instead.
 The latter is what libvirt does to implement migrate to file.
 
 Daniel
 -- 
 |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o- http://virt-manager.org :|
 |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
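
The second option Daniel describes (a `dd` helper fed through a pipe) could be sketched as follows. This is an illustration only, not code from QEMU or libvirt: `spawn_dd_writer()` is a hypothetical helper, and the sketch assumes a POSIX system with a `dd` binary on the PATH. The caller gets back the write end of a pipe with O_NONBLOCK set, which is the kind of descriptor fd-based migration expects, while `dd` performs the blocking file writes in a separate process.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: spawn "dd of=<filename>" reading from a pipe.
 * Returns the non-blocking write end of the pipe, or -errno on failure.
 * The child's pid is stored in *child so the caller can waitpid() on it. */
static int spawn_dd_writer(const char *filename, pid_t *child)
{
    char of_arg[4096];
    int fds[2];
    int flags, err;
    pid_t pid;

    assert(filename != NULL && child != NULL);

    if (snprintf(of_arg, sizeof(of_arg), "of=%s", filename)
        >= (int)sizeof(of_arg)) {
        return -ENAMETOOLONG;
    }
    if (pipe(fds) < 0) {
        return -errno;
    }

    pid = fork();
    if (pid < 0) {
        err = errno;
        close(fds[0]);
        close(fds[1]);
        return -err;
    }
    if (pid == 0) {
        /* Child: make the read end of the pipe dd's standard input. */
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execlp("dd", "dd", of_arg, (char *)NULL);
        _exit(127);                 /* exec failed */
    }

    /* Parent: keep only the write end and make it non-blocking. */
    close(fds[0]);
    flags = fcntl(fds[1], F_GETFL);
    if (flags < 0 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        err = errno;
        close(fds[1]);              /* dd sees EOF and exits */
        waitpid(pid, NULL, 0);
        return -err;
    }

    *child = pid;
    return fds[1];
}
```

A caller would write the migration stream to the returned descriptor, close it when done, and then wait for the child to make sure `dd` has finished writing. Note that `dd` prints its transfer statistics on stderr unless that is redirected.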

[Qemu-devel] [RFC V5 10/11] quorum: Add quorum_invalidate_cache().

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/quorum.c |   11 +++
 1 file changed, 11 insertions(+)

diff --git a/block/quorum.c b/block/quorum.c
index 09eed84..c9dcd9c 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -571,6 +571,16 @@ static int64_t quorum_getlength(BlockDriverState *bs)
 return value;
 }
 
+static void quorum_invalidate_cache(BlockDriverState *bs)
+{
+    BDRVQuorumState *s = bs->opaque;
+    int i;
+
+    for (i = 0; i < s->total; i++) {
+        bdrv_invalidate_cache(s->bs[i]);
+    }
+}
+
 static BlockDriver bdrv_quorum = {
     .format_name        = "quorum",
     .protocol_name      = "quorum",
@@ -585,6 +595,7 @@ static BlockDriver bdrv_quorum = {
 
 .bdrv_aio_readv = quorum_aio_readv,
 .bdrv_aio_writev= quorum_aio_writev,
+.bdrv_invalidate_cache = quorum_invalidate_cache,
 };
 
 static void bdrv_quorum_init(void)
-- 
1.7.9.5

[Qemu-devel] [RFC V5 04/11] quorum: Add quorum_aio_writev and its dependencies.

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/quorum.c |  112 
 1 file changed, 112 insertions(+)

diff --git a/block/quorum.c b/block/quorum.c
index b9fb2b9..cd11cfb 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -172,6 +172,116 @@ static void quorum_close(BlockDriverState *bs)
 g_free(s-bs);
 }
 
+static void quorum_aio_cancel(BlockDriverAIOCB *blockacb)
+{
+    QuorumAIOCB *acb = container_of(blockacb, QuorumAIOCB, common);
+    bool finished = false;
+
+    /* Wait for the request to finish */
+    acb->finished = &finished;
+    while (!finished) {
+        qemu_aio_wait();
+    }
+}
+
+static AIOPool quorum_aio_pool = {
+.aiocb_size = sizeof(QuorumAIOCB),
+.cancel = quorum_aio_cancel,
+};
+
+static void quorum_aio_bh(void *opaque)
+{
+    QuorumAIOCB *acb = opaque;
+    BDRVQuorumState *s = acb->bqs;
+    int ret;
+
+    ret = s->threshold <= acb->success_count ? 0 : -EIO;
+
+    qemu_bh_delete(acb->bh);
+    acb->common.cb(acb->common.opaque, ret);
+    if (acb->finished) {
+        *acb->finished = true;
+    }
+    g_free(acb->aios);
+    g_free(acb->qiovs);
+    qemu_aio_release(acb);
+}
+
+static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
+                                   BlockDriverState *bs,
+                                   QEMUIOVector *qiov,
+                                   int64_t sector_num,
+                                   int nb_sectors,
+                                   BlockDriverCompletionFunc *cb,
+                                   void *opaque)
+{
+    QuorumAIOCB *acb = qemu_aio_get(&quorum_aio_pool, bs, cb, opaque);
+    int i;
+
+    acb->aios = g_new0(QuorumSingleAIOCB, s->total);
+    acb->qiovs = g_new0(QEMUIOVector, s->total);
+
+    acb->bqs = s;
+    acb->qiov = qiov;
+    acb->bh = NULL;
+    acb->count = 0;
+    acb->success_count = 0;
+    acb->sector_num = sector_num;
+    acb->nb_sectors = nb_sectors;
+    acb->vote = NULL;
+    acb->vote_ret = 0;
+
+    for (i = 0; i < s->total; i++) {
+        acb->aios[i].buf = NULL;
+        acb->aios[i].ret = 0;
+        acb->aios[i].parent = acb;
+    }
+
+    return acb;
+}
+
+static void quorum_aio_cb(void *opaque, int ret)
+{
+    QuorumSingleAIOCB *sacb = opaque;
+    QuorumAIOCB *acb = sacb->parent;
+    BDRVQuorumState *s = acb->bqs;
+
+    sacb->ret = ret;
+    acb->count++;
+    if (ret == 0) {
+        acb->success_count++;
+    }
+    assert(acb->count <= s->total);
+    assert(acb->success_count <= s->total);
+    if (acb->count < s->total) {
+        return;
+    }
+
+    acb->bh = qemu_bh_new(quorum_aio_bh, acb);
+    qemu_bh_schedule(acb->bh);
+}
+
+static BlockDriverAIOCB *quorum_aio_writev(BlockDriverState *bs,
+                                          int64_t sector_num,
+                                          QEMUIOVector *qiov,
+                                          int nb_sectors,
+                                          BlockDriverCompletionFunc *cb,
+                                          void *opaque)
+{
+    BDRVQuorumState *s = bs->opaque;
+    QuorumAIOCB *acb = quorum_aio_get(s, bs, qiov, sector_num, nb_sectors,
+                                      cb, opaque);
+    int i;
+
+    for (i = 0; i < s->total; i++) {
+        acb->aios[i].aiocb = bdrv_aio_writev(s->bs[i], sector_num, qiov,
+                                             nb_sectors, quorum_aio_cb,
+                                             &acb->aios[i]);
+    }
+
+    return &acb->common;
+}
+
 static BlockDriver bdrv_quorum = {
     .format_name        = "quorum",
     .protocol_name      = "quorum",
@@ -180,6 +290,8 @@ static BlockDriver bdrv_quorum = {
 
 .bdrv_file_open = quorum_open,
 .bdrv_close = quorum_close,
+
+.bdrv_aio_writev= quorum_aio_writev,
 };
 
 static void bdrv_quorum_init(void)
-- 
1.7.9.5

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

2012-08-27 Thread Jan Kiszka

On 2012-08-27 09:01, Paolo Bonzini wrote:
 On 25/08/2012 09:42, liu ping fan wrote:

 I don't see why MMIO dispatch should hold the IDEBus ref rather than the
 PCIIDEState.

 When changing memory_region_init_io()'s 3rd parameter from void *opaque to
 Object *obj, the obj : opaque mapping is not necessarily 1:1. In such a
 situation, in order to let MemoryRegionOps tell them apart, we
 should pass PCIIDEState->bus[0] and bus[1] separately.
 
 The rule should be that the obj is the object that you want referenced,
 and that should be the PCIIDEState.
 
 But this is anyway moot because it only applies to objects that are
 converted to use unlocked dispatch.  This likely will not be the case
 for IDE.

BTW, I'm pretty sure - after implementing the basics for BQL-free PIO
dispatching - that device objects are the wrong target for reference
counting. We keep memory regions in our dispatching tables (PIO
dispatching needs some refactoring for this), and those regions need
protection for BQL-free use. Devices can't pass away as long as they have
referenced regions; memory region deregistration services will have to
take care of this.

I'm currently not using reference counting at all, I'm enforcing that
only BQL-protected regions can be deregistered.

Also note that there seems to be another misconception in the
discussions: deregistration is not only bound to device unplug. It also
happens on device reconfiguration, e.g. PCI BAR (re-)mapping. Another
strong indicator that we should worry about individual memory regions,
not devices.

Jan





[Qemu-devel] [RFC V5 02/11] quorum: Create BDRVQuorumState and BlkDriver and do init.

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/quorum.c |   22 ++
 1 file changed, 22 insertions(+)

diff --git a/block/quorum.c b/block/quorum.c
index 65a6b55..19a9a44 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -15,6 +15,13 @@
 
 #include "block_int.h"
 
+typedef struct {
+BlockDriverState **bs;
+int threshold;
+int total;
+char **filenames;
+} BDRVQuorumState;
+
 typedef struct QuorumAIOCB QuorumAIOCB;
 
 typedef struct QuorumSingleAIOCB {
@@ -26,6 +33,7 @@ typedef struct QuorumSingleAIOCB {
 
 struct QuorumAIOCB {
 BlockDriverAIOCB common;
+BDRVQuorumState *bqs;
 QEMUBH *bh;
 
 /* Request metadata */
@@ -43,3 +51,17 @@ struct QuorumAIOCB {
 void (*vote)(QuorumAIOCB *acb);
 int vote_ret;
 };
+
+static BlockDriver bdrv_quorum = {
+    .format_name        = "quorum",
+    .protocol_name      = "quorum",
+
+    .instance_size      = sizeof(BDRVQuorumState),
+};
+
+static void bdrv_quorum_init(void)
+{
+    bdrv_register(&bdrv_quorum);
+}
+
+block_init(bdrv_quorum_init);
-- 
1.7.9.5

[Qemu-devel] [RFC V5 09/11] quorum: Add quorum_getlength().

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/quorum.c |   24 
 1 file changed, 24 insertions(+)

diff --git a/block/quorum.c b/block/quorum.c
index 3fa9d53..09eed84 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -549,12 +549,36 @@ static coroutine_fn int quorum_co_flush(BlockDriverState *bs)
 return 0;
 }
 
+static int64_t quorum_getlength(BlockDriverState *bs)
+{
+    BDRVQuorumState *s = bs->opaque;
+    QuorumVoteVersion *winner = NULL;
+    QuorumVotes votes;
+    int64_t value;
+    int i;
+
+    QLIST_INIT(&votes.vote_list);
+    for (i = 0; i < s->total; i++) {
+        quorum_count_vote(&votes, (unsigned long) bdrv_getlength(s->bs[i]), i);
+    }
+
+    /* vote to select the most represented version */
+    winner = quorum_get_vote_winner(&votes);
+
+    value = (int64_t) winner->value;
+    quorum_free_vote_list(&votes);
+
+    return value;
+}
+
 static BlockDriver bdrv_quorum = {
     .format_name        = "quorum",
     .protocol_name      = "quorum",
 
 .instance_size  = sizeof(BDRVQuorumState),
 
+.bdrv_getlength = quorum_getlength,
+
 .bdrv_file_open = quorum_open,
 .bdrv_close = quorum_close,
 .bdrv_co_flush_to_disk = quorum_co_flush,
-- 
1.7.9.5

[Qemu-devel] [RFC V5 11/11] quorum: Add quorum_co_is_allocated.

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/quorum.c |   32 
 1 file changed, 32 insertions(+)

diff --git a/block/quorum.c b/block/quorum.c
index c9dcd9c..5a9f598 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -581,6 +581,37 @@ static void quorum_invalidate_cache(BlockDriverState *bs)
 }
 }
 
+static int coroutine_fn quorum_co_is_allocated(BlockDriverState *bs,
+                                               int64_t sector_num,
+                                               int nb_sectors,
+                                               int *pnum)
+{
+    BDRVQuorumState *s = bs->opaque;
+    QuorumVoteVersion *winner = NULL;
+    QuorumVotes result_votes, num_votes;
+    int i, result, num;
+
+    QLIST_INIT(&result_votes.vote_list);
+    QLIST_INIT(&num_votes.vote_list);
+
+    for (i = 0; i < s->total; i++) {
+        result = bdrv_co_is_allocated(s->bs[i], sector_num, nb_sectors, &num);
+        quorum_count_vote(&result_votes, result, i);
+        quorum_count_vote(&num_votes, num, i);
+    }
+
+    winner = quorum_get_vote_winner(&result_votes);
+    result = winner->value;
+
+    winner = quorum_get_vote_winner(&num_votes);
+    *pnum = winner->value;
+
+    quorum_free_vote_list(&result_votes);
+    quorum_free_vote_list(&num_votes);
+
+    return result;
+}
+
 static BlockDriver bdrv_quorum = {
     .format_name        = "quorum",
     .protocol_name      = "quorum",
@@ -596,6 +627,7 @@ static BlockDriver bdrv_quorum = {
 .bdrv_aio_readv = quorum_aio_readv,
 .bdrv_aio_writev= quorum_aio_writev,
 .bdrv_invalidate_cache = quorum_invalidate_cache,
+.bdrv_co_is_allocated  = quorum_co_is_allocated,
 };
 
 static void bdrv_quorum_init(void)
-- 
1.7.9.5

[Qemu-devel] [RFC V5 07/11] quorum: Add quorum_aio_readv.

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/quorum.c |   38 +-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/block/quorum.c b/block/quorum.c
index f83b4cf..791ef4a 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -193,15 +193,24 @@ static void quorum_aio_bh(void *opaque)
 {
     QuorumAIOCB *acb = opaque;
     BDRVQuorumState *s = acb->bqs;
-    int ret;
+    int i, ret;
 
     ret = s->threshold <= acb->success_count ? 0 : -EIO;
 
+    for (i = 0; i < s->total; i++) {
+        qemu_vfree(acb->aios[i].buf);
+        acb->aios[i].buf = NULL;
+        acb->aios[i].ret = 0;
+    }
+
     qemu_bh_delete(acb->bh);
     acb->common.cb(acb->common.opaque, ret);
     if (acb->finished) {
         *acb->finished = true;
     }
+    for (i = 0; i < s->total; i++) {
+        qemu_iovec_destroy(&acb->qiovs[i]);
+    }
     g_free(acb->aios);
     g_free(acb->qiovs);
     qemu_aio_release(acb);
@@ -261,6 +270,32 @@ static void quorum_aio_cb(void *opaque, int ret)
     qemu_bh_schedule(acb->bh);
 }
 
+static BlockDriverAIOCB *quorum_aio_readv(BlockDriverState *bs,
+                                         int64_t sector_num,
+                                         QEMUIOVector *qiov,
+                                         int nb_sectors,
+                                         BlockDriverCompletionFunc *cb,
+                                         void *opaque)
+{
+    BDRVQuorumState *s = bs->opaque;
+    QuorumAIOCB *acb = quorum_aio_get(s, bs, qiov, sector_num,
+                                      nb_sectors, cb, opaque);
+    int i;
+
+    for (i = 0; i < s->total; i++) {
+        acb->aios[i].buf = qemu_blockalign(bs->file, qiov->size);
+        qemu_iovec_init(&acb->qiovs[i], qiov->niov);
+        qemu_iovec_clone(&acb->qiovs[i], qiov, acb->aios[i].buf);
+    }
+
+    for (i = 0; i < s->total; i++) {
+        bdrv_aio_readv(s->bs[i], sector_num, qiov, nb_sectors,
+                       quorum_aio_cb, &acb->aios[i]);
+    }
+
+    return &acb->common;
+}
+
 static BlockDriverAIOCB *quorum_aio_writev(BlockDriverState *bs,
   int64_t sector_num,
   QEMUIOVector *qiov,
@@ -304,6 +339,7 @@ static BlockDriver bdrv_quorum = {
 .bdrv_close = quorum_close,
 .bdrv_co_flush_to_disk = quorum_co_flush,
 
+.bdrv_aio_readv = quorum_aio_readv,
 .bdrv_aio_writev= quorum_aio_writev,
 };
 
-- 
1.7.9.5

[Qemu-devel] [RFC V5 06/11] quorum: Add quorum_co_flush().

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/quorum.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/block/quorum.c b/block/quorum.c
index cd11cfb..f83b4cf 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -282,6 +282,18 @@ static BlockDriverAIOCB *quorum_aio_writev(BlockDriverState *bs,
     return &acb->common;
 }
 
+static coroutine_fn int quorum_co_flush(BlockDriverState *bs)
+{
+    BDRVQuorumState *s = bs->opaque;
+    int i;
+
+    for (i = 0; i < s->total; i++) {
+        bdrv_co_flush(s->bs[i]);
+    }
+
+    return 0;
+}
+
 static BlockDriver bdrv_quorum = {
     .format_name        = "quorum",
     .protocol_name      = "quorum",
@@ -290,6 +302,7 @@ static BlockDriver bdrv_quorum = {
 
 .bdrv_file_open = quorum_open,
 .bdrv_close = quorum_close,
+.bdrv_co_flush_to_disk = quorum_co_flush,
 
 .bdrv_aio_writev= quorum_aio_writev,
 };
-- 
1.7.9.5

[Qemu-devel] [RFC V5 00/11] Quorum disk image corruption resiliency

2012-08-27 Thread Benoît Canet

This patchset creates a block driver implementing a quorum using $total QEMU
disk images. Writes are mirrored on the $total files.
For reads, the $total files are read at the same time and a vote is
held to determine whether a qiov version is present $threshold or more times.
The majority version is then returned to the upper layers.
When fewer than $threshold versions of the data are returned by the lower
layer, the quorum is broken and the read returns -EIO.

The goal of this patchset is to be turned into a QEMU block filter living just
above raw-*.c and below qcow2/qed once the required infrastructure is done.

The main users of this feature will be people using NFS appliances, which can
be subject to bitflip errors.

This patchset can be used to replace blkverify and the out of tree blkmirror.

usage: -drive 
file=quorum:threshold/total:image_1.raw,,...,,image_total.raw,if=virtio,cache=none
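
The read-side vote described above can be illustrated with a minimal, self-contained sketch. It is not the code from the patch, which keeps per-version vote lists and computes adler32 checksums over the qiovs; `quorum_vote_sketch()` is a hypothetical function that takes already-computed checksums, one per image, and returns the index of an image holding the majority version, or -1 when fewer than `threshold` images agree (the case where the driver returns -EIO).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pick the most represented checksum among 'total'
 * replicas. Returns the index of the first replica carrying the winning
 * version, or -1 if that version appears fewer than 'threshold' times. */
static int quorum_vote_sketch(const unsigned long *checksums,
                              int total, int threshold)
{
    int best = -1;
    int best_count = 0;
    int i, j;

    assert(checksums != NULL && total > 0 && threshold > 0);

    for (i = 0; i < total; i++) {
        int count = 0;

        /* Count how many replicas returned the same version as replica i. */
        for (j = 0; j < total; j++) {
            if (checksums[j] == checksums[i]) {
                count++;
            }
        }
        if (count > best_count) {
            best_count = count;
            best = i;
        }
    }

    return best_count >= threshold ? best : -1;
}
```

With threshold/total set to 2/3, a single corrupted image is outvoted by the other two, while three mutually different versions break the quorum.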

in v2:

eblake: fix typos
squash two first commits

afärber: Modify the Makefile on first commit

bcanet: move function prototype of quorum.c one patch down

in v3:

Blue Swirl: change char * to uint8_t * in QuorumSingleAIOCB

Eric Blake: Add escaping of the : separator
Allow to specify the n/m ratio parameters of the Quorum

Stefan Hajnoczi: Squash quorum_close and quorum_open patch to avoid leak
 Add missing bdrv_delete() in quorum_close
 simpler quorum_getlength
                 make the quorum_check_ret threshold a user setting (bind it to n)
                 move blkverify_iovec_clone() and blkverify_iovec_compare() to cutils.c
                 free unconditionally qemu_blockalign() with qemu_vfree()
                 turn assignment into assert in quorum_copy_qiov()

in v4:

Eric Blake: verbose commit message for Add quorum_open() and quorum_close()
use of a bool for the escape variable in the same commit
simplify a if to a one liner in the same commit
replace += 1 by ++ in a number of places
make quorum_getlength return a quorum vote.

Blue Swirl: replace n and m by threshold and total
ignore flush errors in quorum_co_flush

Stefan Hajnoczi: removal of a macro in Add quorum mechanism
 call qemu_iovec_destroy in the bh

Benoît Canet: Now use QuorumVoteItem and QuorumVoteVersion as names for the
  voting structs 
  refactor and rename function to quorum_count_vote.

in v5:

Blue Swirl: replace ':' by ',' as separator to allow networked paths
replace remaining occurrences of n and m by threshold and total

Eric Blake: fix commit message about escaping

Benoît Canet: Factorise voting into quorum_get_vote_winner()
  Create quorum_invalidate_cache to enable live migration
  Create quorum_co_is_allocated to enable streaming.
  Fix escaping 

Benoît Canet (11):
  quorum: Create quorum.c, add QuorumSingleAIOCB and QuorumAIOCB.
  quorum: Create BDRVQuorumState and BlkDriver and do init.
  quorum: Add quorum_open() and quorum_close().
  quorum: Add quorum_aio_writev and its dependencies.
  blkverify: Extract qemu_iovec_clone() and qemu_iovec_compare() from
blkverify.
  quorum: Add quorum_co_flush().
  quorum: Add quorum_aio_readv.
  quorum: Add quorum mechanism.
  quorum: Add quorum_getlength().
  quorum: Add quorum_invalidate_cache().
  quorum: Add quorum_co_is_allocated.

 block/Makefile.objs |1 +
 block/blkverify.c   |  108 +
 block/quorum.c  |  638 +++
 cutils.c|  103 +
 qemu-common.h   |2 +
 5 files changed, 746 insertions(+), 106 deletions(-)
 create mode 100644 block/quorum.c

-- 
1.7.9.5

[Qemu-devel] [RFC V5 05/11] blkverify: Extract qemu_iovec_clone() and qemu_iovec_compare() from blkverify.

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/blkverify.c |  108 +
 cutils.c  |  103 ++
 qemu-common.h |2 +
 3 files changed, 107 insertions(+), 106 deletions(-)

diff --git a/block/blkverify.c b/block/blkverify.c
index 9d5f1ec..79d36d5 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -123,110 +123,6 @@ static int64_t blkverify_getlength(BlockDriverState *bs)
     return bdrv_getlength(s->test_file);
 }
 
-/**
- * Check that I/O vector contents are identical
- *
- * @a:          I/O vector
- * @b:          I/O vector
- * @ret:        Offset to first mismatching byte or -1 if match
- */
-static ssize_t blkverify_iovec_compare(QEMUIOVector *a, QEMUIOVector *b)
-{
-    int i;
-    ssize_t offset = 0;
-
-    assert(a->niov == b->niov);
-    for (i = 0; i < a->niov; i++) {
-        size_t len = 0;
-        uint8_t *p = (uint8_t *)a->iov[i].iov_base;
-        uint8_t *q = (uint8_t *)b->iov[i].iov_base;
-
-        assert(a->iov[i].iov_len == b->iov[i].iov_len);
-        while (len < a->iov[i].iov_len && *p++ == *q++) {
-            len++;
-        }
-
-        offset += len;
-
-        if (len != a->iov[i].iov_len) {
-            return offset;
-        }
-    }
-    return -1;
-}
-
-typedef struct {
-    int src_index;
-    struct iovec *src_iov;
-    void *dest_base;
-} IOVectorSortElem;
-
-static int sortelem_cmp_src_base(const void *a, const void *b)
-{
-    const IOVectorSortElem *elem_a = a;
-    const IOVectorSortElem *elem_b = b;
-
-    /* Don't overflow */
-    if (elem_a->src_iov->iov_base < elem_b->src_iov->iov_base) {
-        return -1;
-    } else if (elem_a->src_iov->iov_base > elem_b->src_iov->iov_base) {
-        return 1;
-    } else {
-        return 0;
-    }
-}
-
-static int sortelem_cmp_src_index(const void *a, const void *b)
-{
-    const IOVectorSortElem *elem_a = a;
-    const IOVectorSortElem *elem_b = b;
-
-    return elem_a->src_index - elem_b->src_index;
-}
-
-/**
- * Copy contents of I/O vector
- *
- * The relative relationships of overlapping iovecs are preserved.  This is
- * necessary to ensure identical semantics in the cloned I/O vector.
- */
-static void blkverify_iovec_clone(QEMUIOVector *dest, const QEMUIOVector *src,
-                                  void *buf)
-{
-    IOVectorSortElem sortelems[src->niov];
-    void *last_end;
-    int i;
-
-    /* Sort by source iovecs by base address */
-    for (i = 0; i < src->niov; i++) {
-        sortelems[i].src_index = i;
-        sortelems[i].src_iov = &src->iov[i];
-    }
-    qsort(sortelems, src->niov, sizeof(sortelems[0]), sortelem_cmp_src_base);
-
-    /* Allocate buffer space taking into account overlapping iovecs */
-    last_end = NULL;
-    for (i = 0; i < src->niov; i++) {
-        struct iovec *cur = sortelems[i].src_iov;
-        ptrdiff_t rewind = 0;
-
-        /* Detect overlap */
-        if (last_end && last_end > cur->iov_base) {
-            rewind = last_end - cur->iov_base;
-        }
-
-        sortelems[i].dest_base = buf - rewind;
-        buf += cur->iov_len - MIN(rewind, cur->iov_len);
-        last_end = MAX(cur->iov_base + cur->iov_len, last_end);
-    }
-
-    /* Sort by source iovec index and build destination iovec */
-    qsort(sortelems, src->niov, sizeof(sortelems[0]), sortelem_cmp_src_index);
-    for (i = 0; i < src->niov; i++) {
-        qemu_iovec_add(dest, sortelems[i].dest_base, src->iov[i].iov_len);
-    }
-}
-
 static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
                                          int64_t sector_num, QEMUIOVector *qiov,
                                          int nb_sectors,
@@ -290,7 +186,7 @@ static void blkverify_aio_cb(void *opaque, int ret)
 
 static void blkverify_verify_readv(BlkverifyAIOCB *acb)
 {
-    ssize_t offset = blkverify_iovec_compare(acb->qiov, &acb->raw_qiov);
+    ssize_t offset = qemu_iovec_compare(acb->qiov, &acb->raw_qiov);
     if (offset != -1) {
         blkverify_err(acb, "contents mismatch in sector %" PRId64,
                       acb->sector_num + (int64_t)(offset / BDRV_SECTOR_SIZE));
@@ -308,7 +204,7 @@ static BlockDriverAIOCB *blkverify_aio_readv(BlockDriverState *bs,
     acb->verify = blkverify_verify_readv;
     acb->buf = qemu_blockalign(bs->file, qiov->size);
     qemu_iovec_init(&acb->raw_qiov, acb->qiov->niov);
-    blkverify_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
+    qemu_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
 
     bdrv_aio_readv(s->test_file, sector_num, qiov, nb_sectors,
                    blkverify_aio_cb, acb);
diff --git a/cutils.c b/cutils.c
index ee4614d..dcdd60f 100644
--- a/cutils.c
+++ b/cutils.c
@@ -245,6 +245,109 @@ size_t qemu_iovec_memset(QEMUIOVector *qiov, size_t offset,
     return iov_memset(qiov->iov, qiov->niov, offset, fillc, bytes);
 }
 
+/**
+ * Check that I/O vector contents are identical
+ *
+ * @a:  I/O vector
+ * @b:  I/O vector
+ * @ret:Offset to first mismatching

[Qemu-devel] [RFC V5 03/11] quorum: Add quorum_open() and quorum_close().

2012-08-27 Thread Benoît Canet

Valid quorum resources look like
quorum:threshold/total:path/to/image_1, ... ,path/to/image_total

',' is used as a separator so that networked paths can be used
'\' is the escape character for filenames containing ','
'\' also escapes itself

On the command line, for the quorum files img,test.raw, img2.raw
and img3.raw, the invocation looks like:

-drive file=quorum:2/3:img\\,,test.raw,,img2.raw,,img3.raw
(note the double ,, and \\)
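
The escaping rules above can be illustrated with a small stand-alone splitter. This is a sketch under assumptions rather than the patch's code: split_escaped and its signature are hypothetical, and it assumes the -drive option parser has already turned each ',,' into a single ',' so that only the '\' escapes remain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative splitter (hypothetical name and signature): split `names`
 * in place on unescaped ',' and store up to `max` pointers in `out`.
 * '\' escapes the following character, so "\," yields a literal comma and
 * "\\" a literal backslash.  Returns the number of fields found, or -1 if
 * there are more than `max`.
 */
static int split_escaped(char *names, char **out, int max)
{
    size_t len, i;
    size_t k = 0;           /* write position, never ahead of i */
    int n = 0;              /* index of the field being filled */
    bool escape = false;

    assert(names != NULL && out != NULL && max > 0);
    len = strlen(names);
    out[0] = names;
    for (i = 0; i < len; i++) {
        char c = names[i];

        /* An unescaped ',' terminates the current field. */
        if (!escape && c == ',') {
            names[k++] = '\0';
            if (++n >= max) {
                return -1;
            }
            out[n] = names + k;
            continue;
        }

        /* An unescaped '\' is dropped and protects the next character. */
        escape = !escape && c == '\\';
        if (!escape) {
            names[k++] = c;
        }
    }
    names[k] = '\0';
    return n + 1;
}
```

With the input img\,test.raw,img2.raw,img3.raw this yields the three names img,test.raw, img2.raw and img3.raw.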

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/quorum.c |  123 
 1 file changed, 123 insertions(+)

diff --git a/block/quorum.c b/block/quorum.c
index 19a9a44..b9fb2b9 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -52,11 +52,134 @@ struct QuorumAIOCB {
 int vote_ret;
 };
 
+/* Valid quorum resources look like
+ * quorum:threshold/total:path/to/image_1, ... ,path/to/image_total
+ *
+ * ',' is used as a separator to allow to use network path
+ * '\' is the escaping character for filename containing ','
+ */
+static int quorum_open(BlockDriverState *bs, const char *filename, int flags)
+{
+    BDRVQuorumState *s = bs->opaque;
+    int i, j, k, len, ret = 0;
+    char *a, *b, *names;
+    bool escape;
+
+    /* Parse the quorum: prefix */
+    if (strncmp(filename, "quorum:", strlen("quorum:"))) {
+        return -EINVAL;
+    }
+
+    filename += strlen("quorum:");
+
+    /* Get threshold */
+    errno = 0;
+    s->threshold = strtoul(filename, &a, 10);
+    if (*a != '/' || errno) {
+        return -EINVAL;
+    }
+    a++;
+
+    /* Get total */
+    errno = 0;
+    s->total = strtoul(a, &b, 10);
+    if (*b != ':' || errno) {
+        return -EINVAL;
+    }
+    b++;
+
+    if (s->threshold < 1 || s->total < 2) {
+        return -EINVAL;
+    }
+
+    if (s->threshold > s->total) {
+        return -EINVAL;
+    }
+
+    s->bs = g_malloc0(sizeof(BlockDriverState *) * s->total);
+    /* Two allocations for all filenames: simpler to free */
+    s->filenames = g_malloc0(sizeof(char *) * s->total);
+    names = g_strdup(b);
+
+    /* Get the filenames pointers */
+    escape = false;
+    s->filenames[0] = names;
+    len = strlen(names);
+    for (i = j = k = 0; i < len && j < s->total; i++) {
+        /* separation between two files */
+        if (!escape && names[i] == ',') {
+            char *prev = s->filenames[j];
+            prev[k] = '\0';
+            s->filenames[++j] = prev + k + 1;
+            k = 0;
+            continue;
+        }
+
+        escape = !escape && names[i] == '\\';
+
+        /* if we are not escaping copy */
+        if (!escape) {
+            s->filenames[j][k++] = names[i];
+        }
+    }
+    /* terminate last string */
+    s->filenames[j][k] = '\0';
+
+    if ((j + 1) != s->total) {
+        ret = -EINVAL;
+        goto free_exit;
+    }
+
+    /* Open files */
+    for (i = 0; i < s->total; i++) {
+        s->bs[i] = bdrv_new("");
+        ret = bdrv_open(s->bs[i], s->filenames[i], flags, NULL);
+        if (ret < 0) {
+            goto error_exit;
+        }
+    }
+
+    goto exit;
+
+error_exit:
+    for (; i >= 0; i--) {
+        bdrv_delete(s->bs[i]);
+        s->bs[i] = NULL;
+    }
+free_exit:
+    g_free(s->filenames[0]);
+    g_free(s->filenames);
+    s->filenames = NULL;
+    g_free(s->bs);
+exit:
+    return ret;
+}
+
+static void quorum_close(BlockDriverState *bs)
+{
+    BDRVQuorumState *s = bs->opaque;
+    int i;
+
+    for (i = 0; i < s->total; i++) {
+        /* Ensure writes reach stable storage */
+        bdrv_flush(s->bs[i]);
+        bdrv_delete(s->bs[i]);
+    }
+
+    g_free(s->filenames[0]);
+    g_free(s->filenames);
+    s->filenames = NULL;
+    g_free(s->bs);
+}
+
 static BlockDriver bdrv_quorum = {
     .format_name        = "quorum",
     .protocol_name      = "quorum",
 
     .instance_size      = sizeof(BDRVQuorumState),
+
+    .bdrv_file_open     = quorum_open,
+    .bdrv_close         = quorum_close,
 };
 
 static void bdrv_quorum_init(void)
-- 
1.7.9.5

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

2012-08-27 Thread liu ping fan

On Mon, Aug 27, 2012 at 3:47 PM, Jan Kiszka jan.kis...@web.de wrote:
 On 2012-08-27 09:01, Paolo Bonzini wrote:
 On 25/08/2012 09:42, liu ping fan wrote:

 I don't see why MMIO dispatch should hold the IDEBus ref rather than the
 PCIIDEState.

 When transfer memory_region_init_io()  3rd para from void* opaque to
 Object* obj,  the obj : opaque is not neccessary 1:1 map. For such
 situation, in order to let MemoryRegionOps tell between them, we
 should pass PCIIDEState-bus[0], bus[1] separately.

 The rule should be that the obj is the object that you want referenced,
 and that should be the PCIIDEState.

 But this is anyway moot because it only applies to objects that are
 converted to use unlocked dispatch.  This likely will not be the case
 for IDE.

 BTW, I'm pretty sure - after implementing the basics for BQL-free PIO
 dispatching - that device objects are the wrong target for reference

Hi Jan, thanks for the reminder, but could you explain it in more detail?
The mmio dispatch table holds one ref on the device. Before that ref is
released (when unplugging), we detach all of the device's memory regions (mr)
from memory, then drop the ref. So I think no leak will be exposed by the mr,
and it is safe to use the device as the target for reference counting.

 counting. We keep memory regions in our dispatching tables (PIO
 dispatching needs some refactoring for this), and those regions need
 protection for BQL-free use. Devices can't pass away as long as the have
Yes, that is right. A device can pass away only after its mr has been removed
from the dispatching tables.

Thanks, pingfan
 referenced regions, memory region deregistration services will have to
 take care of this.


 I'm currently not using reference counting at all, I'm enforcing that
 only BQL-protected regions can be deregistered.

 Also note that there seems to be another misconception in the
 discussions: deregistration is not only bound to device unplug. It also
 happens on device reconfiguration, e.g. PCI BAR (re-)mapping. Another
 strong indicator that we should worry about individual memory regions,
 not devices.

 Jan

[Qemu-devel] [RFC V5 01/11] quorum: Create quorum.c, add QuorumSingleAIOCB and QuorumAIOCB.

2012-08-27 Thread Benoît Canet

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block/Makefile.objs |1 +
 block/quorum.c  |   45 +
 2 files changed, 46 insertions(+)
 create mode 100644 block/quorum.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index b5754d3..66af6dc 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -4,6 +4,7 @@ block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
 block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
 block-obj-y += stream.o
+block-obj-y += quorum.o
 block-obj-$(CONFIG_WIN32) += raw-win32.o
 block-obj-$(CONFIG_POSIX) += raw-posix.o
 block-obj-$(CONFIG_LIBISCSI) += iscsi.o
diff --git a/block/quorum.c b/block/quorum.c
new file mode 100644
index 000..65a6b55
--- /dev/null
+++ b/block/quorum.c
@@ -0,0 +1,45 @@
+/*
+ * Quorum Block filter
+ *
+ * Copyright (C) 2012 Nodalink, SARL.
+ *
+ * Author:
+ *   Benoît Canet benoit.ca...@irqsave.net
+ *
+ * Based on the design and code of blkverify.c (Copyright (C) 2010 IBM, Corp)
+ * and blkmirror.c (Copyright (C) 2011 Red Hat, Inc).
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "block_int.h"
+
+typedef struct QuorumAIOCB QuorumAIOCB;
+
+typedef struct QuorumSingleAIOCB {
+    BlockDriverAIOCB *aiocb;
+    uint8_t *buf;
+    int ret;
+    QuorumAIOCB *parent;
+} QuorumSingleAIOCB;
+
+struct QuorumAIOCB {
+    BlockDriverAIOCB common;
+    QEMUBH *bh;
+
+    /* Request metadata */
+    int64_t sector_num;
+    int nb_sectors;
+
+    QEMUIOVector *qiov;         /* calling readv IOV */
+
+    QuorumSingleAIOCB *aios;    /* individual AIOs */
+    QEMUIOVector *qiovs;        /* individual IOVs */
+    int count;                  /* number of completed AIOCB */
+    int success_count;          /* number of successfully completed AIOCB */
+    bool *finished;             /* completion signal for cancel */
+
+    void (*vote)(QuorumAIOCB *acb);
+    int vote_ret;
+};
-- 
1.7.9.5

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

2012-08-27 Thread Jan Kiszka

On 2012-08-27 10:17, liu ping fan wrote:
 On Mon, Aug 27, 2012 at 3:47 PM, Jan Kiszka jan.kis...@web.de wrote:
 On 2012-08-27 09:01, Paolo Bonzini wrote:
 On 25/08/2012 09:42, liu ping fan wrote:

 I don't see why MMIO dispatch should hold the IDEBus ref rather than the
 PCIIDEState.

 When transfer memory_region_init_io()  3rd para from void* opaque to
 Object* obj,  the obj : opaque is not neccessary 1:1 map. For such
 situation, in order to let MemoryRegionOps tell between them, we
 should pass PCIIDEState-bus[0], bus[1] separately.

 The rule should be that the obj is the object that you want referenced,
 and that should be the PCIIDEState.

 But this is anyway moot because it only applies to objects that are
 converted to use unlocked dispatch.  This likely will not be the case
 for IDE.

 BTW, I'm pretty sure - after implementing the basics for BQL-free PIO
 dispatching - that device objects are the wrong target for reference
 
 Hi Jan, thanks for reminder, but could you explain it more detail?
 mmio dispatch table holds 1 ref for device, before releasing this
 ref,( When unplugging, we detach all the device's mr from memory, then
 drop the ref. So I think that no leak will be exposed by mr  and it is
 safe to use device as target for reference.

It would be a mistake to assume that memory regions can only be embedded
in device objects. Memory regions can be reconfigured or dynamically
added/removed (see e.g. portio lists) - there is no device in this
sentence. Regions are stored in the dispatching table, they will first
of all be touched without holding the BQL. So their content has to be
stable in that period, and it is the proper abstraction, IMHO, to focus
on their life cycle management and attach all the rest to them.

 
 counting. We keep memory regions in our dispatching tables (PIO
 dispatching needs some refactoring for this), and those regions need
 protection for BQL-free use. Devices can't pass away as long as the have
 Yes, it is right. Device can pass away only after mr removed from
 dispatching tables

Great, then you don't have to worry about device objects in the context
of dispatching.

Jan





[Qemu-devel] [PATCH] [MIPS] Fix operands of RECIP2.S and RECIP2.PS

2012-08-27 Thread Richard Sandiford

Read the second input operand of RECIP2.S and RECIP2.PS from FT rather
than FD.  RECIP2.D is already correct.

Signed-off-by: Richard Sandiford rdsandif...@googlemail.com
---
 target-mips/translate.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 7104d30..d812986 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -6805,7 +6805,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
op1,
 TCGv_i32 fp1 = tcg_temp_new_i32();
 
 gen_load_fpr32(fp0, fs);
-gen_load_fpr32(fp1, fd);
+gen_load_fpr32(fp1, ft);
 gen_helper_float_recip2_s(fp0, fp0, fp1);
 tcg_temp_free_i32(fp1);
 gen_store_fpr32(fp0, fd);
@@ -7543,7 +7543,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
op1,
 TCGv_i64 fp1 = tcg_temp_new_i64();
 
 gen_load_fpr64(ctx, fp0, fs);
-gen_load_fpr64(ctx, fp1, fd);
+gen_load_fpr64(ctx, fp1, ft);
 gen_helper_float_recip2_ps(fp0, fp0, fp1);
 tcg_temp_free_i64(fp1);
 gen_store_fpr64(ctx, fp0, fd);
-- 
1.7.7.6

[Qemu-devel] [PATCH] [MIPS] Fix order of CVT.PS.S operands

2012-08-27 Thread Richard Sandiford

The FS input to CVT.PS.S is the high half and FT is the low half.
tcg_gen_concat_i32_i64 takes the low half first, so the operands
were in the wrong order.
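
The operand order can be modelled in plain C. This is only a sketch: concat_lo_hi stands in for the "low half first" behaviour described above, and all three function names are illustrative rather than QEMU APIs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Plain-C model of a "low half first" concatenation: the first argument
 * becomes bits 0..31 of the result, the second becomes bits 32..63.
 */
static uint64_t concat_lo_hi(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* CVT.PS.S packing: FS supplies the high half, FT the low half. */
static uint64_t cvt_ps_s_pack(uint32_t fs, uint32_t ft)
{
    return concat_lo_hi(ft, fs);
}

/* Usage example: FS ends up in the upper 32 bits of the paired single. */
void check_cvt_ps_s_order(void)
{
    assert(cvt_ps_s_pack(0x11111111u, 0x22222222u) == 0x1111111122222222ULL);
}
```

Passing (fs, ft) straight through, as the code did before this fix, would place FS in the low half instead.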

Signed-off-by: Richard Sandiford rdsandif...@googlemail.com
---
 target-mips/translate.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 06f0ac6..defc021 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -6907,7 +6907,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
op1,
 
 gen_load_fpr32(fp32_0, fs);
 gen_load_fpr32(fp32_1, ft);
-tcg_gen_concat_i32_i64(fp64, fp32_0, fp32_1);
+tcg_gen_concat_i32_i64(fp64, fp32_1, fp32_0);
 tcg_temp_free_i32(fp32_1);
 tcg_temp_free_i32(fp32_0);
 gen_store_fpr64(ctx, fp64, fd);
-- 
1.7.7.6

Re: [Qemu-devel] qcow2: online snasphots : internal vs external ?

2012-08-27 Thread Stefan Hajnoczi

On Sun, Aug 26, 2012 at 10:56 AM, Alexandre DERUMIER
aderum...@odiso.com wrote:
 It is possible to achieve the same behaviour with external snapshot ? (I 
 would like to do it online)
 I don't see how I can rollback to the point of time of the snapshot.

The snapshot only captures the contents of the disk.  Rollback does
not make sense without shutting down the guest.  The OS/file system
would be very confused if the disk contents changed underneath it.

Existing hotplug can be used.  For example, if we have an external
snapshot of a virtio-blk drive, we can use hotplug to remove the
drive, choose the snapshot file and attach it again.  This only works
for data drives, the root file system usually cannot be changed
while the guest is running.

You may also wish to look at libvirt for higher level snapshot primitives.

 Also I see that snapshot_blkdev qmp command give in his description:
 Otherwise the snapshot will be internal! (currently unsupported).

 is Live internal snapshots on the roadmap ?

I'm not aware of anyone working on adding internal snapshot in the
near future.  Patches are welcome.

Stefan

Re: [Qemu-devel] [PATCH] [MIPS] Fix order of CVT.PS.S operands

2012-08-27 Thread Stefan Hajnoczi

On Mon, Aug 27, 2012 at 9:53 AM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 The FS input to CVT.PS.S is the high half and FT is the low half.
 tcg_gen_concat_i32_i64 takes the low half first, so the operands
 were in the wrong order.

 Signed-off-by: Richard Sandiford rdsandif...@googlemail.com
 ---
  target-mips/translate.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index 06f0ac6..defc021 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -6907,7 +6907,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
 op1,

  gen_load_fpr32(fp32_0, fs);
  gen_load_fpr32(fp32_1, ft);
 -tcg_gen_concat_i32_i64(fp64, fp32_0, fp32_1);
 +tcg_gen_concat_i32_i64(fp64, fp32_1, fp32_0);
  tcg_temp_free_i32(fp32_1);
  tcg_temp_free_i32(fp32_0);
  gen_store_fpr64(ctx, fp64, fd);
 --
 1.7.7.6

CCing Aurelien for MIPS.  You can look at ./MAINTAINERS to see who
should be CCed.

Stefan

Re: [Qemu-devel] [PATCH] [MIPS] Fix operands of RECIP2.S and RECIP2.PS

2012-08-27 Thread Stefan Hajnoczi

On Mon, Aug 27, 2012 at 9:50 AM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Read the second input operand of RECIP2.S and RECIP2.PS from FT rather
 than FD.  RECIP2.D is already correct.

 Signed-off-by: Richard Sandiford rdsandif...@googlemail.com
 ---
  target-mips/translate.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index 7104d30..d812986 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -6805,7 +6805,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
 op1,
  TCGv_i32 fp1 = tcg_temp_new_i32();

  gen_load_fpr32(fp0, fs);
 -gen_load_fpr32(fp1, fd);
 +gen_load_fpr32(fp1, ft);
  gen_helper_float_recip2_s(fp0, fp0, fp1);
  tcg_temp_free_i32(fp1);
  gen_store_fpr32(fp0, fd);
 @@ -7543,7 +7543,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
 op1,
  TCGv_i64 fp1 = tcg_temp_new_i64();

  gen_load_fpr64(ctx, fp0, fs);
 -gen_load_fpr64(ctx, fp1, fd);
 +gen_load_fpr64(ctx, fp1, ft);
  gen_helper_float_recip2_ps(fp0, fp0, fp1);
  tcg_temp_free_i64(fp1);
  gen_store_fpr64(ctx, fp0, fd);
 --
 1.7.7.6

CCing Aurelien for MIPS.  You can look at ./MAINTAINERS to see who
should be CCed.

Stefan

Re: [Qemu-devel] [PATCH] hw/pl110: Fix spelling of 'palette'

2012-08-27 Thread Peter Maydell

On 27 August 2012 06:19, Stefan Weil s...@weilnetz.de wrote:
 On 26.08.2012 23:30, Peter Maydell wrote:

 Fix the spelling of 'palette' used in various local variables
 and structure members.
   if (offset = 0x200  offset  0x400) {
   /* Pallette.  */


 What about this one? For V2 of your patch, you may add a

 Reviewed-by: Stefan Weil s...@weilnetz.de

Thanks; as you may have guessed I didn't do a case-insensitive
search...

-- PMM

Re: [Qemu-devel] qcow2: online snasphots : internal vs external ?

2012-08-27 Thread Alexandre DERUMIER

Thanks again Stefan

The snapshot only captures the contents of the disk. Rollback does
not make sense without shutting down the guest. The OS/file system
would be very confused if the disk contents changed underneath it.

Existing hotplug can be used. For example, if we have an external
snapshot of a virtio-blk drive, we can use hotplug to remove the
drive, choose the snapshot file and attach it again. This only works
for data drives, the root file system usually cannot be changed
while the guest is running.


Yes, sure, rollback must be done offline.
But what I meant was: with an external snapshot, how can I roll back to the
point of the snapshot?

Example:
image1.qcow2
file: /beforesnap1
take a snapshot (snap1), so qemu switches to snap1.qcow2
write some file:
file:
/aftersnap1
/beforesnap1

Now, how can I roll back to the point in time of snap1?
I can reuse image1.qcow2, but if I write some data to it, I don't see how I
can return to the point in time of snap1 (like qemu-img snapshot -a with
internal snapshots).


You may also wish to look at libvirt for higher level snapshot primitives.
Thanks, I'll look at libvirt to see how they do things.


- Original message -

From: Stefan Hajnoczi stefa...@gmail.com
To: Alexandre DERUMIER aderum...@odiso.com
Cc: Jeff Cody jc...@redhat.com, qemu-devel qemu-devel@nongnu.org, Paolo Bonzini pbonz...@redhat.com, Eric Blake ebl...@redhat.com
Sent: Monday 27 August 2012 11:04:14
Subject: Re: [Qemu-devel] qcow2: online snasphots : internal vs external ?

On Sun, Aug 26, 2012 at 10:56 AM, Alexandre DERUMIER
aderum...@odiso.com wrote:
 It is possible to achieve the same behaviour with external snapshot ? (I 
 would like to do it online)
 I don't see how I can rollback to the point of time of the snapshot.

The snapshot only captures the contents of the disk. Rollback does
not make sense without shutting down the guest. The OS/file system
would be very confused if the disk contents changed underneath it.

Existing hotplug can be used. For example, if we have an external
snapshot of a virtio-blk drive, we can use hotplug to remove the
drive, choose the snapshot file and attach it again. This only works
for data drives, the root file system usually cannot be changed
while the guest is running.

You may also wish to look at libvirt for higher level snapshot primitives. 

 Also I see that snapshot_blkdev qmp command give in his description:
 Otherwise the snapshot will be internal! (currently unsupported).

 is Live internal snapshots on the roadmap ?

I'm not aware of anyone working on adding internal snapshot in the
near future. Patches are welcome.

Stefan




Re: [Qemu-devel] qcow2: online snasphots : internal vs external ?

2012-08-27 Thread Paolo Bonzini

On 27/08/2012 11:26, Alexandre DERUMIER wrote:
 how can I rollback to the point of the snapshot.
 
 exemple :
 image1.qcow2
 file : /beforesnap1
 take a snaphot (snap1), so qemu switch to snap1.qcow2
 write some file:
 file: 
 /aftersnap1.
 /beforesnap1
 
 Now, how can I rollback to the point of time of snap1 ?
 I can reuse image1.qcow2, but if I write some datas on it, I don't
 see how I can return to the point of time of the snap1. (like qemu-image -a
 with internal snapshots)

If you can drop snap1.qcow2 altogether, you just use image1.qcow2 the
next time you start QEMU.

If you cannot, you create snap2.qcow2 based on image1.qcow2:

  qemu-img create -f qcow2 -o backing_file=image1.qcow2 snap2.qcow2

and use it the next time you start QEMU.

Paolo

Re: [Qemu-devel] qcow2: online snasphots : internal vs external ?

2012-08-27 Thread Paolo Bonzini

On 27/08/2012 11:04, Stefan Hajnoczi wrote:
  Also I see that snapshot_blkdev qmp command give in his description:
  Otherwise the snapshot will be internal! (currently unsupported).
 
  is Live internal snapshots on the roadmap ?
 I'm not aware of anyone working on adding internal snapshot in the
 near future.  Patches are welcome.

The main problem with internal snapshots is that it's difficult to work
with two snapshots at the same time, especially if you need to write to
two of them.  IIUC this is why people concentrated more on external
snapshots.

It's not intrinsic to internal snapshots, more like a wart of the
implementation, but not an easily fixed one.

Paolo

Re: [Qemu-devel] [PATCH 2/2] mips-linux-user: Always support rdhwr.

2012-08-27 Thread Aurelien Jarno

On Fri, Mar 30, 2012 at 01:16:37PM -0400, Richard Henderson wrote:
 The kernel will emulate this instruction if it's not supported
 natively.  This insn is used for TLS, among other things, and
 so is required by modern glibc.
 
 Signed-off-by: Richard Henderson r...@twiddle.net
 Cc: Riku Voipio riku.voi...@iki.fi
 ---
  target-mips/translate.c |4 
  1 files changed, 4 insertions(+), 0 deletions(-)
 
 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index 300d95e..ed28ca8 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -8111,7 +8111,11 @@ gen_rdhwr (CPUMIPSState *env, DisasContext *ctx, int 
 rt, int rd)
  {
  TCGv t0;
  
 +#if !defined(CONFIG_USER_ONLY)
 +/* The Linux kernel will emulate rdhwr if it's not supported natively.
 +   Therefore only check the ISA in system mode.  */
  check_insn(env, ctx, ISA_MIPS32R2);
 +#endif
  t0 = tcg_temp_new();
  
  switch (rd) {

Thanks, applied.
 
-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] [MIPS] Fix operands of RECIP2.S and RECIP2.PS

2012-08-27 Thread Aurelien Jarno

On Mon, Aug 27, 2012 at 09:50:38AM +0100, Richard Sandiford wrote:
 Read the second input operand of RECIP2.S and RECIP2.PS from FT rather
 than FD.  RECIP2.D is already correct.
 
 Signed-off-by: Richard Sandiford rdsandif...@googlemail.com
 ---
  target-mips/translate.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index 7104d30..d812986 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -6805,7 +6805,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
 op1,
  TCGv_i32 fp1 = tcg_temp_new_i32();
  
  gen_load_fpr32(fp0, fs);
 -gen_load_fpr32(fp1, fd);
 +gen_load_fpr32(fp1, ft);
  gen_helper_float_recip2_s(fp0, fp0, fp1);
  tcg_temp_free_i32(fp1);
  gen_store_fpr32(fp0, fd);
 @@ -7543,7 +7543,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
 op1,
  TCGv_i64 fp1 = tcg_temp_new_i64();
  
  gen_load_fpr64(ctx, fp0, fs);
 -gen_load_fpr64(ctx, fp1, fd);
 +gen_load_fpr64(ctx, fp1, ft);
  gen_helper_float_recip2_ps(fp0, fp0, fp1);
  tcg_temp_free_i64(fp1);
  gen_store_fpr64(ctx, fp0, fd);

Thanks, applied.


-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] ATAPI: Add support for ASCQ in sense codes

2012-08-27 Thread Kevin Wolf

On 31.07.2012 09:14, Paolo Bonzini wrote:
 On 31/07/2012 04:07, Ronnie Sahlberg wrote:
 Add support for setting the ASCQ for SCSI sense codes in the ATAPI driver.
 Use this to set ASCQ==2 for the medium removal prevention that is 
 recommended in MMC for this condition.

 asc:0x53 ascq:0x02 is the recommended error for MEDIUM_REMOVAL_PREVENTED and 
 is listed in Annex F in MMC
 
 You also need to cover migration.
 
 You could either add a subsection, or something like this:
 
 diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
 index f7f714c..89c0157 100644
 --- a/hw/ide/atapi.c
 +++ b/hw/ide/atapi.c
 @@ -1143,3 +1143,20 @@ void ide_atapi_cmd(IDEState *s)
  
  ide_atapi_cmd_error(s, ILLEGAL_REQUEST, ASC_ILLEGAL_OPCODE);
  }
 +
 +void ide_atapi_post_load(IDEState *s, int version_id)
 +{
 +    if (version_id < 3) {
 +        if (s->sense_key == UNIT_ATTENTION &&
 +            s->asc == ASC_MEDIUM_MAY_HAVE_CHANGED) {
 +            s->cdrom_changed = 1;
 +        }
 +    }
 +
 +    /* This is simpler than adding a subsection just for the ascq.  */
 +    if (s->asc == ASC_MEDIA_REMOVAL_PREVENTED) {
 +        s->ascq = 2;
 +    } else {
 +        s->ascq = 0;
 +    }
 +}
 diff --git a/hw/ide/core.c b/hw/ide/core.c
 index cb5ca4b..959ac48 100644
 --- a/hw/ide/core.c
 +++ b/hw/ide/core.c
 @@ -2154,12 +2154,7 @@ static int ide_drive_post_load(void *opaque, int version_id)
  {
      IDEState *s = opaque;
  
 -    if (version_id < 3) {
 -        if (s->sense_key == UNIT_ATTENTION &&
 -            s->asc == ASC_MEDIUM_MAY_HAVE_CHANGED) {
 -            s->cdrom_changed = 1;
 -        }
 -    }
 +    ide_atapi_post_load(s, version_id);
      if (s->identify_set) {
          bdrv_set_enable_write_cache(s->bs, !!(s->identify_data[85] & (1 << 5)));
      }
 diff --git a/hw/ide/internal.h b/hw/ide/internal.h
 index 7170bd9..2572461 100644
 --- a/hw/ide/internal.h
 +++ b/hw/ide/internal.h
 @@ -572,6 +572,7 @@ BlockDriverAIOCB *ide_issue_trim(BlockDriverState *bs,
  /* hw/ide/atapi.c */
  void ide_atapi_cmd(IDEState *s);
  void ide_atapi_cmd_reply_end(IDEState *s);
 +void ide_atapi_post_load(IDEState *s, int version_id);
  
  /* hw/ide/qdev.c */
  void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id);
 
 
 In fact, I wonder if it is simpler to make up the ascq directly in
 cmd_request_sense, instead of changing all invocations of
 ide_atapi_cmd_error.  Kevin, any preferences?

Oh, I missed this question and wondered why there was no v2... I'd
prefer this patch with migration support added.

Kevin
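
The post-load idea discussed above can be sketched in isolation as follows. This is only an illustration under stated assumptions: `struct atapi_sense` and `atapi_sense_post_load` are hypothetical stand-ins for IDEState and the proposed ide_atapi_post_load, and the numeric constants are the usual SCSI values (sense key 6 for UNIT ATTENTION, asc 0x28 and 0x53), not copied from the QEMU headers.

```c
#include <stdint.h>

/* Assumed SCSI values; the real definitions live in the QEMU headers. */
#define SENSE_UNIT_ATTENTION         6
#define ASC_MEDIUM_MAY_HAVE_CHANGED  0x28
#define ASC_MEDIA_REMOVAL_PREVENTED  0x53

/* Hypothetical stand-in for the sense-related fields of IDEState. */
struct atapi_sense {
    uint8_t sense_key;
    uint8_t asc;
    uint8_t ascq;          /* not carried in the migration stream here */
    int cdrom_changed;
};

/* Rebuild state that the incoming migration stream does not contain. */
static void atapi_sense_post_load(struct atapi_sense *s, int version_id)
{
    if (version_id < 3) {
        if (s->sense_key == SENSE_UNIT_ATTENTION &&
            s->asc == ASC_MEDIUM_MAY_HAVE_CHANGED) {
            s->cdrom_changed = 1;
        }
    }

    /* The ascq is derived from the asc instead of being migrated. */
    s->ascq = (s->asc == ASC_MEDIA_REMOVAL_PREVENTED) ? 2 : 0;
}
```

Deriving the ascq this way only works while each asc maps to exactly one ascq; a second ascq for the same asc would need a real subsection.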

Re: [Qemu-devel] [PATCH] [MIPS] Fix order of CVT.PS.S operands

2012-08-27 Thread Aurelien Jarno

On Mon, Aug 27, 2012 at 09:53:29AM +0100, Richard Sandiford wrote:
 The FS input to CVT.PS.S is the high half and FT is the low half.
 tcg_gen_concat_i32_i64 takes the low half first, so the operands
 were in the wrong order.
 
 Signed-off-by: Richard Sandiford rdsandif...@googlemail.com
 ---
  target-mips/translate.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index 06f0ac6..defc021 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -6907,7 +6907,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
 op1,
  
  gen_load_fpr32(fp32_0, fs);
  gen_load_fpr32(fp32_1, ft);
 -tcg_gen_concat_i32_i64(fp64, fp32_0, fp32_1);
 +tcg_gen_concat_i32_i64(fp64, fp32_1, fp32_0);
  tcg_temp_free_i32(fp32_1);
  tcg_temp_free_i32(fp32_0);
  gen_store_fpr64(ctx, fp64, fd);

Thanks, applied.


-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net
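
The operand-order fix above can be checked with a plain C model. This is a sketch, not the TCG code: `concat_i32_i64` and `cvt_ps_s` are hypothetical helpers that only mirror the argument convention described in the commit message (low half first) and the CVT.PS.S rule that FS is the high half and FT the low half.

```c
#include <stdint.h>

/* Same convention as described for tcg_gen_concat_i32_i64: low half first. */
static uint64_t concat_i32_i64(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* CVT.PS.S: FS supplies the upper single, FT the lower single. */
static uint64_t cvt_ps_s(uint32_t fs, uint32_t ft)
{
    return concat_i32_i64(ft, fs);
}
```

Passing fs first, as the old code did, would swap the two singles in the paired-single result.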

Re: [Qemu-devel] [PATCH 1/2] target-mips: Streamline indexed cp1 memory addressing.

2012-08-27 Thread Aurelien Jarno

On Fri, Mar 30, 2012 at 01:16:36PM -0400, Richard Henderson wrote:
 We've already eliminated both base and index being zero.
 ---
  target-mips/translate.c |3 +--
  1 files changed, 1 insertions(+), 2 deletions(-)
 
 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index a663b74..300d95e 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -7742,8 +7742,7 @@ static void gen_flt3_ldst (DisasContext *ctx, uint32_t 
 opc,
  } else if (index == 0) {
  gen_load_gpr(t0, base);
  } else {
 -gen_load_gpr(t0, index);
 -gen_op_addr_add(ctx, t0, cpu_gpr[base], t0);
 +gen_op_addr_add(ctx, t0, cpu_gpr[base], cpu_gpr[index]);
  }
  /* Don't do NOP if destination is zero: we must perform the actual
 memory access. */

Thanks, applied.


-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] ATAPI: STARTSTOPUNIT only eject/load media if powercondition is 0

2012-08-27 Thread Kevin Wolf

Am 31.07.2012 03:28, schrieb Ronnie Sahlberg:
 The START STOP UNIT command will only eject/load media if
 power condition is zero.
 
 If power condition is !0 then LOEJ and START will be ignored.
 
 From MMC (sbc contains similar wordings too)
   The Power Conditions field requests the block device to be placed
   in the power condition defined in
   Table 558. If this field has a value other than 0h then the Start
   and LoEj bits shall be ignored.
 
 Signed-off-by: Ronnie Sahlberg ronniesahlb...@gmail.com

Thanks, applied to block-next for 1.3.

Kevin
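
The rule quoted from MMC can be sketched as a small decoder. The sketch assumes the usual START STOP UNIT layout (byte 4 of the CDB: Power Condition in bits 7-4, LoEj in bit 1, Start in bit 0); `start_stop_media_action` and `enum media_action` are hypothetical names, not the QEMU ones.

```c
#include <stdbool.h>
#include <stdint.h>

enum media_action { MEDIA_NONE, MEDIA_EJECT, MEDIA_LOAD };

/* Decide what a START STOP UNIT CDB asks us to do with the tray. */
static enum media_action start_stop_media_action(const uint8_t *cdb)
{
    uint8_t pwrcnd = (cdb[4] >> 4) & 0x0f;   /* Power Condition field */
    bool loej = (cdb[4] & 0x02) != 0;        /* load/eject requested */
    bool start = (cdb[4] & 0x01) != 0;       /* 1 = load, 0 = eject */

    /* A non-zero power condition means Start and LoEj are ignored. */
    if (pwrcnd != 0 || !loej) {
        return MEDIA_NONE;
    }
    return start ? MEDIA_LOAD : MEDIA_EJECT;
}
```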

[Qemu-devel] [PATCH] audio: previous audio buffer should be flushed

2012-08-27 Thread munkyu.im

*** BLURB HERE ***

munkyu.im (1):
  audio: previous audio buffer should be flushed

 audio/winwaveaudio.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

-- 
1.7.4.1

[Qemu-devel] [PATCH] audio: previous audio buffer should be flushed

2012-08-27 Thread munkyu.im

The buffer must be flushed when audio output is paused, but unlike the
other backends, the Winwave audio backend does not do this.
As a result, when the user stops and restarts playback, stale audio data
from the previous stream is played before the expected sound.
So use waveOutReset() instead of waveOutPause().
---
 audio/winwaveaudio.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/audio/winwaveaudio.c b/audio/winwaveaudio.c
index 663abb9..7de12a6 100644
--- a/audio/winwaveaudio.c
+++ b/audio/winwaveaudio.c
@@ -361,9 +361,9 @@ static int winwave_ctl_out (HWVoiceOut *hw, int cmd, ...)
 
 case VOICE_DISABLE:
 if (!wave->paused) {
-mr = waveOutPause (wave->hwo);
+mr = waveOutReset (wave->hwo);
 if (mr != MMSYSERR_NOERROR) {
-winwave_logerr (mr, "waveOutPause");
+winwave_logerr (mr, "waveOutReset");
 }
 else {
 wave->paused = 1;
-- 
1.7.4.1

[Qemu-devel] [PATCH v2] hw/pl110: Fix spelling of 'palette'

2012-08-27 Thread Peter Maydell

Fix the spelling of 'palette' used in various local variables,
structure members and comments.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Reviewed-by: Stefan Weil s...@weilnetz.de
---
v1-v2 changes: fix a comment which I'd missed before because
it wasn't all-lowercase (thanks Stefan).

 hw/pl110.c  | 30 +++---
 hw/pl110_template.h | 22 +++---
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/hw/pl110.c b/hw/pl110.c
index f94608c..a582640 100644
--- a/hw/pl110.c
+++ b/hw/pl110.c
@@ -55,8 +55,8 @@ typedef struct {
 enum pl110_bppmode bpp;
 int invalidate;
 uint32_t mux_ctrl;
-uint32_t pallette[256];
-uint32_t raw_pallette[128];
+uint32_t palette[256];
+uint32_t raw_palette[128];
 qemu_irq irq;
 } pl110_state;
 
@@ -79,8 +79,8 @@ static const VMStateDescription vmstate_pl110 = {
 VMSTATE_INT32(rows, pl110_state),
 VMSTATE_UINT32(bpp, pl110_state),
 VMSTATE_INT32(invalidate, pl110_state),
-VMSTATE_UINT32_ARRAY(pallette, pl110_state, 256),
-VMSTATE_UINT32_ARRAY(raw_pallette, pl110_state, 128),
+VMSTATE_UINT32_ARRAY(palette, pl110_state, 256),
+VMSTATE_UINT32_ARRAY(raw_palette, pl110_state, 128),
 VMSTATE_UINT32_V(mux_ctrl, pl110_state, 2),
 VMSTATE_END_OF_LIST()
 }
@@ -236,7 +236,7 @@ static void pl110_update_display(void *opaque)
    s->upbase, s->cols, s->rows,
    src_width, dest_width, 0,
    s->invalidate,
-   fn, s->pallette,
+   fn, s->palette,
    first, last);
 if (first >= 0) {
 dpy_update(s->ds, 0, first, s->cols, last - first + 1);
@@ -253,13 +253,13 @@ static void pl110_invalidate_display(void * opaque)
 }
 }
 
-static void pl110_update_pallette(pl110_state *s, int n)
+static void pl110_update_palette(pl110_state *s, int n)
 {
 int i;
 uint32_t raw;
 unsigned int r, g, b;
 
-raw = s->raw_pallette[n];
+raw = s->raw_palette[n];
 n <<= 1;
 for (i = 0; i < 2; i++) {
 r = (raw & 0x1f) << 3;
@@ -271,17 +271,17 @@ static void pl110_update_pallette(pl110_state *s, int n)
 raw >>= 6;
 switch (ds_get_bits_per_pixel(s->ds)) {
 case 8:
-s->pallette[n] = rgb_to_pixel8(r, g, b);
+s->palette[n] = rgb_to_pixel8(r, g, b);
 break;
 case 15:
-s->pallette[n] = rgb_to_pixel15(r, g, b);
+s->palette[n] = rgb_to_pixel15(r, g, b);
 break;
 case 16:
-s->pallette[n] = rgb_to_pixel16(r, g, b);
+s->palette[n] = rgb_to_pixel16(r, g, b);
 break;
 case 24:
 case 32:
-s->pallette[n] = rgb_to_pixel32(r, g, b);
+s->palette[n] = rgb_to_pixel32(r, g, b);
 break;
 }
 n++;
@@ -314,7 +314,7 @@ static uint64_t pl110_read(void *opaque, target_phys_addr_t 
offset,
 return idregs[s->version][(offset - 0xfe0) >> 2];
 }
 if (offset >= 0x200 && offset < 0x400) {
-return s->raw_pallette[(offset - 0x200) >> 2];
+return s->raw_palette[(offset - 0x200) >> 2];
 }
 switch (offset >> 2) {
 case 0: /* LCDTiming0 */
@@ -364,10 +364,10 @@ static void pl110_write(void *opaque, target_phys_addr_t 
offset,
is written to.  */
 s->invalidate = 1;
 if (offset >= 0x200 && offset < 0x400) {
-/* Pallette.  */
+/* Palette.  */
 n = (offset - 0x200) >> 2;
-s->raw_pallette[(offset - 0x200) >> 2] = val;
-pl110_update_pallette(s, n);
+s->raw_palette[(offset - 0x200) >> 2] = val;
+pl110_update_palette(s, n);
 return;
 }
 switch (offset >> 2) {
diff --git a/hw/pl110_template.h b/hw/pl110_template.h
index 1dce32a..e738e4a 100644
--- a/hw/pl110_template.h
+++ b/hw/pl110_template.h
@@ -129,14 +129,14 @@ static drawfn glue(pl110_draw_fn_,BITS)[48] =
 
 static void glue(pl110_draw_line1_,NAME)(void *opaque, uint8_t *d, const 
uint8_t *src, int width, int deststep)
 {
-uint32_t *pallette = opaque;
+uint32_t *palette = opaque;
 uint32_t data;
 while (width > 0) {
 data = *(uint32_t *)src;
 #ifdef SWAP_PIXELS
-#define FN(x, y) COPY_PIXEL(d, pallette[(data >> (y + 7 - (x))) & 1]);
+#define FN(x, y) COPY_PIXEL(d, palette[(data >> (y + 7 - (x))) & 1]);
 #else
-#define FN(x, y) COPY_PIXEL(d, pallette[(data >> ((x) + y)) & 1]);
+#define FN(x, y) COPY_PIXEL(d, palette[(data >> ((x) + y)) & 1]);
 #endif
 #ifdef SWAP_WORDS
 FN_8(24)
@@ -157,14 +157,14 @@ static void glue(pl110_draw_line1_,NAME)(void *opaque, 
uint8_t *d, const uint8_t
 
 static void glue(pl110_draw_line2_,NAME)(void *opaque, uint8_t *d, const 
uint8_t *src, int width, int deststep)
 {
-uint32_t *pallette = opaque;
+uint32_t *palette = opaque;
 uint32_t data;
 while (width > 0) {
 data =

Re: [Qemu-devel] [PATCH] Add privilege level check to several Cop0 instructions.

2012-08-27 Thread Aurelien Jarno

On Sat, Sep 17, 2011 at 05:05:32PM -0700, Eric Johnson wrote:
 The MIPS Architecture Verification Programs (AVPs) check privileged
 instructions for the required privilege level.  These changes are needed
 to pass the AVP suite.
 
 Signed-off-by: Eric Johnson er...@mips.com
 ---
  target-mips/translate.c |   10 ++
  1 files changed, 10 insertions(+), 0 deletions(-)
 
 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index d5b1c76..d99a716 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -5940,6 +5940,8 @@ static void gen_cp0 (CPUState *env, DisasContext *ctx, 
 uint32_t opc, int rt, int
  {
  const char *opn = "ldst";
  
 +check_cp0_enabled(ctx);
 +
  switch (opc) {
  case OPC_MFC0:
  if (rt == 0) {
 @@ -10125,6 +10127,7 @@ static void gen_pool32axf (CPUState *env, 
 DisasContext *ctx, int rt, int rs,
  #ifndef CONFIG_USER_ONLY
  case MFC0:
  case MFC0 + 32:
 +check_cp0_enabled(ctx);
  if (rt == 0) {
  /* Treat as NOP. */
  break;
 @@ -10136,6 +10139,7 @@ static void gen_pool32axf (CPUState *env, 
 DisasContext *ctx, int rt, int rs,
  {
  TCGv t0 = tcg_temp_new();
  
 +check_cp0_enabled(ctx);
  gen_load_gpr(t0, rt);
  gen_mtc0(env, ctx, t0, rs, (ctx->opcode >> 11) & 0x7);
  tcg_temp_free(t0);
 @@ -10230,10 +10234,12 @@ static void gen_pool32axf (CPUState *env, 
 DisasContext *ctx, int rt, int rs,
  switch (minor) {
  case RDPGPR:
  check_insn(env, ctx, ISA_MIPS32R2);
 +check_cp0_enabled(ctx);
  gen_load_srsgpr(rt, rs);
  break;
  case WRPGPR:
  check_insn(env, ctx, ISA_MIPS32R2);
 +check_cp0_enabled(ctx);
  gen_store_srsgpr(rt, rs);
  break;
  default:
 @@ -10276,6 +10282,7 @@ static void gen_pool32axf (CPUState *env, 
 DisasContext *ctx, int rt, int rs,
  {
  TCGv t0 = tcg_temp_new();
  
 +check_cp0_enabled(ctx);
  save_cpu_state(ctx, 1);
  gen_helper_di(t0);
  gen_store_gpr(t0, rs);
 @@ -10288,6 +10295,7 @@ static void gen_pool32axf (CPUState *env, 
 DisasContext *ctx, int rt, int rs,
  {
  TCGv t0 = tcg_temp_new();
  
 +check_cp0_enabled(ctx);
  save_cpu_state(ctx, 1);
  gen_helper_ei(t0);
  gen_store_gpr(t0, rs);
 @@ -10765,6 +10773,7 @@ static void decode_micromips32_opc (CPUState *env, 
 DisasContext *ctx,
  minor = (ctx->opcode >> 12) & 0xf;
  switch (minor) {
  case CACHE:
 +check_cp0_enabled(ctx);
  /* Treat as no-op. */
  break;
  case LWC2:
 @@ -12216,6 +12225,7 @@ static void decode_opc (CPUState *env, DisasContext 
 *ctx, int *is_branch)
   break;
  case OPC_CACHE:
  check_insn(env, ctx, ISA_MIPS3 | ISA_MIPS32);
 +check_cp0_enabled(ctx);
  /* Treat as NOP. */
  break;
  case OPC_PREF:
 
 

Thanks, applied.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] Allow microMIPS SWP and SDP to have RD equal to BASE.

2012-08-27 Thread Aurelien Jarno

On Sat, Sep 17, 2011 at 05:28:16PM -0700, Eric Johnson wrote:
 The microMIPS SWP and SDP instructions do not modify GPRs.  So their
 behavior is well defined when RD equals BASE.  The MIPS Architecture
 Verification Programs (AVPs) check that they work as expected.  This
 is required for AVPs to pass.
 
 Signed-off-by: Eric Johnson er...@mips.com
 ---
  target-mips/translate.c |   10 +-
  1 files changed, 9 insertions(+), 1 deletions(-)
 
 The patch applies to a8467c7a0e8b024a18608ff7db31ca2f2297e641.
 
 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index d5b1c76..82cf75b 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -10034,7 +10034,7 @@ static void gen_ldst_pair (DisasContext *ctx, 
 uint32_t opc, int rd,
  const char *opn = "ldst_pair";
  TCGv t0, t1;
  
 -if (ctx->hflags & MIPS_HFLAG_BMASK || rd == 31 || rd == base) {
 +if (ctx->hflags & MIPS_HFLAG_BMASK || rd == 31) {
  generate_exception(ctx, EXCP_RI);
  return;
  }
 @@ -10046,6 +10046,10 @@ static void gen_ldst_pair (DisasContext *ctx, 
 uint32_t opc, int rd,
  
  switch (opc) {
  case LWP:
 +if (rd == base) {
 +generate_exception(ctx, EXCP_RI);
 +return;
 +}
  save_cpu_state(ctx, 0);
  op_ld_lw(t1, t0, ctx);
  gen_store_gpr(t1, rd);
 @@ -10067,6 +10071,10 @@ static void gen_ldst_pair (DisasContext *ctx, 
 uint32_t opc, int rd,
  break;
  #ifdef TARGET_MIPS64
  case LDP:
 +if (rd == base) {
 +generate_exception(ctx, EXCP_RI);
 +return;
 +}
  save_cpu_state(ctx, 0);
  op_ld_ld(t1, t0, ctx);
  gen_store_gpr(t1, rd);
 
 

Thanks, applied.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net
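
The resulting check can be summarised in a standalone C sketch. The names `enum pair_op` and `ldst_pair_is_reserved` are hypothetical; the logic follows the patch above: a pair access in a branch delay slot or with rd == 31 is always reserved, while rd == base is reserved only for the loads, because the stores do not modify any GPR.

```c
#include <stdbool.h>

enum pair_op { OP_LWP, OP_SWP, OP_LDP, OP_SDP };

/* Return true if the instruction should raise a Reserved Instruction. */
static bool ldst_pair_is_reserved(enum pair_op op, int rd, int base,
                                  bool in_delay_slot)
{
    if (in_delay_slot || rd == 31) {
        return true;
    }
    /* Only the loads overwrite rd, so only they conflict with base. */
    if ((op == OP_LWP || op == OP_LDP) && rd == base) {
        return true;
    }
    return false;
}
```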

Re: [Qemu-devel] [RFC PATCH 0/9] qom: improve reference counting and hotplug

2012-08-27 Thread Andreas Färber

Am 27.08.2012 09:22, schrieb liu ping fan:
 On Sun, Aug 26, 2012 at 11:51 PM, Anthony Liguori aligu...@us.ibm.com wrote:
 Right now, you need to pair up object_new with object_delete.  This is
 impractical when using reference counting because we would like to ensure 
 that
 object_unref() also frees memory when needed.

 The first few patches fix this problem by introducing a release callback so
 that objects that need special release behavior (i.e. g_free) can do that.

 Since link and child properties all hold references, in order to actually 
 free
 an object, we need to break those links.  User created devices end up as
 children of a container.  But child properties cannot be removed which means
 there's no obvious way to remove the reference and ultimately free the 
 object.

 Why? Since we call _add_child() in qdev_device_add(), why can not we
 call object_property_del_child() for qmp_device_del(). Could you
 explain it in more detail?

Seconded. If we hot-unplug a device, we should surely remove its child
property from /machine/unassigned or parent bus or whatever.
Why is it that child properties cannot be removed?

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-27 Thread Andreas Färber

Hi,

Am 27.08.2012 08:28, schrieb Jan Kiszka:
 From: Jan Kiszka jan.kis...@siemens.com
 
 This adds PCI device assignment for i386 targets using the classic KVM
 interfaces. This version is 100% identical to what is being maintained
 in qemu-kvm for several years and is supported by libvirt as well. It is
 expected to remain relevant for another couple of years until kernels
 without full-features and performance-wise equivalent VFIO support are
 obsolete.
 
 A refactoring to-do that should be done in-tree is to model MSI and
 MSI-X support via the generic PCI layer, similar to what VFIO is already
 doing for MSI-X. This should improve the correctness and clean up the
 code from duplicate logic.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/kvm/Makefile.objs |2 +-
  hw/kvm/pci-assign.c  | 1929 
 ++
  2 files changed, 1930 insertions(+), 1 deletions(-)
  create mode 100644 hw/kvm/pci-assign.c
[...]
 diff --git a/hw/kvm/pci-assign.c b/hw/kvm/pci-assign.c
 new file mode 100644
 index 000..9cce02c
 --- /dev/null
 +++ b/hw/kvm/pci-assign.c
 @@ -0,0 +1,1929 @@
 +/*
 + * Copyright (c) 2007, Neocleus Corporation.
 + *
 + * This program is free software; you can redistribute it and/or modify it
 + * under the terms and conditions of the GNU General Public License,
 + * version 2, as published by the Free Software Foundation.

The downside of accepting this into qemu.git is that it gets us a huge
blob of GPLv2-only code without history of contributors for GPLv2+
relicensing...

 + *
 + * This program is distributed in the hope it will be useful, but WITHOUT
 + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
 + * more details.
 + *
 + * You should have received a copy of the GNU General Public License along 
 with
 + * this program; if not, write to the Free Software Foundation, Inc., 59 
 Temple
 + * Place - Suite 330, Boston, MA 02111-1307 USA.

(Expect the usual GNU address reminder here.)

 + *
 + *
 + *  Assign a PCI device from the host to a guest VM.
 + *
 + *  Adapted for KVM by Qumranet.
 + *
 + *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
 + *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
 + *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
 + *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
 + *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
 + */
 +#include <stdio.h>
 +#include <unistd.h>
 +#include <sys/io.h>
 +#include <sys/mman.h>
 +#include <sys/types.h>
 +#include <sys/stat.h>
 +#include "hw/hw.h"
 +#include "hw/pc.h"
 +#include "qemu-error.h"
 +#include "console.h"
 +#include "hw/loader.h"
 +#include "monitor.h"
 +#include "range.h"
 +#include "sysemu.h"
 +#include "hw/pci.h"
 +#include "hw/msi.h"

 +#include "kvm_i386.h"

Am I correct to understand we compile this only for i386 / x86_64?
(apic.o in kvm/Makefile.objs hints in that direction) You may want to
update the description in the comment above accordingly, also mentioning
that this is some deprecated backwards-compatibility thing.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [RFC PATCH 0/9] qom: improve reference counting and hotplug

2012-08-27 Thread Paolo Bonzini

Il 27/08/2012 13:46, Andreas Färber ha scritto:
 
  Since link and child properties all hold references, in order to 
  actually free
  an object, we need to break those links.  User created devices end up as
  children of a container.  But child properties cannot be removed which 
  means
  there's no obvious way to remove the reference and ultimately free the 
  object.
 
  Why? Since we call _add_child() in qdev_device_add(), why can not we
  call object_property_del_child() for qmp_device_del(). Could you
  explain it in more detail?
 Seconded. If we hot-unplug a device, we should surely remove its child
 property from /machine/unassigned or parent bus or whatever.

Sure, as soon as the device is ejected by the guest.  But until that
point we need to keep the device in the QOM tree so that: 1) it has a
canonical path; 2) it can be examined; 3) it keeps children alive.

 Why is it that child properties cannot be removed?

Yeah, I didn't quite understand the difference between unparenting and
setting the child property to NULL.

Paolo
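
The reference-counting question in this thread can be made concrete with a toy model. This is a sketch under loud assumptions: `Obj`, `obj_ref`, `obj_unref`, `obj_add_child`, `obj_del_child` and `count_release` are hypothetical and are not the QOM API; the model has a single child slot and a release callback, which is enough to show that the object is only freed once the child property's reference is dropped.

```c
#include <stddef.h>

typedef struct Obj Obj;
struct Obj {
    unsigned ref;
    void (*release)(Obj *obj);   /* e.g. a wrapper around free() */
    Obj *child;                  /* single "child property" slot */
};

/* Illustrative release callback: just counts how often it ran. */
static int released_count;
static void count_release(Obj *obj)
{
    (void)obj;
    released_count++;
}

static void obj_ref(Obj *o)
{
    o->ref++;
}

static void obj_unref(Obj *o)
{
    if (--o->ref == 0) {
        if (o->child) {              /* dropping the parent drops its child ref */
            Obj *c = o->child;
            o->child = NULL;
            obj_unref(c);
        }
        if (o->release) {
            o->release(o);
        }
    }
}

/* The child property holds its own reference to the child. */
static void obj_add_child(Obj *parent, Obj *child)
{
    obj_ref(child);
    parent->child = child;
}

/* Removing the property is what finally lets the child be released. */
static void obj_del_child(Obj *parent)
{
    Obj *c = parent->child;
    if (c) {
        parent->child = NULL;
        obj_unref(c);
    }
}
```

In this model a device stays alive, and reachable through its parent, after its creator drops the initial reference; it is released only when the child link is removed, which matches the point about keeping the device in the tree until the guest ejects it.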

Re: [Qemu-devel] [PATCH 4/4] kvm: i386: Add classic PCI device assignment

2012-08-27 Thread Jan Kiszka

On 2012-08-27 14:07, Andreas Färber wrote:
 Hi,
 
 Am 27.08.2012 08:28, schrieb Jan Kiszka:
 From: Jan Kiszka jan.kis...@siemens.com

 This adds PCI device assignment for i386 targets using the classic KVM
 interfaces. This version is 100% identical to what is being maintained
 in qemu-kvm for several years and is supported by libvirt as well. It is
 expected to remain relevant for another couple of years until kernels
 without full-features and performance-wise equivalent VFIO support are
 obsolete.

 A refactoring to-do that should be done in-tree is to model MSI and
 MSI-X support via the generic PCI layer, similar to what VFIO is already
 doing for MSI-X. This should improve the correctness and clean up the
 code from duplicate logic.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/kvm/Makefile.objs |2 +-
  hw/kvm/pci-assign.c  | 1929 
 ++
  2 files changed, 1930 insertions(+), 1 deletions(-)
  create mode 100644 hw/kvm/pci-assign.c
 [...]
 diff --git a/hw/kvm/pci-assign.c b/hw/kvm/pci-assign.c
 new file mode 100644
 index 000..9cce02c
 --- /dev/null
 +++ b/hw/kvm/pci-assign.c
 @@ -0,0 +1,1929 @@
 +/*
 + * Copyright (c) 2007, Neocleus Corporation.
 + *
 + * This program is free software; you can redistribute it and/or modify it
 + * under the terms and conditions of the GNU General Public License,
 + * version 2, as published by the Free Software Foundation.
 
 The downside of accepting this into qemu.git is that it gets us a huge
 blob of GPLv2-only code without history of contributors for GPLv2+
 relicensing...

The history is documented in qemu-kvm. I personally don't see it will
pay off going through this, but someone else may, and nothing will
prevent trying this at least. I can leave a comment.

BTW, VFIO will be GPLv2 only as well. If I understood Alex correctly, it
is too much derived from this code. IOW: There is probably no PCI
assignment without this restriction in the foreseeable future.

 
 + *
 + * This program is distributed in the hope it will be useful, but WITHOUT
 + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
 + * more details.
 + *
 + * You should have received a copy of the GNU General Public License along 
 with
 + * this program; if not, write to the Free Software Foundation, Inc., 59 
 Temple
 + * Place - Suite 330, Boston, MA 02111-1307 USA.
 
 (Expect the usual GNU address reminder here.)

Will fix.

 
 + *
 + *
 + *  Assign a PCI device from the host to a guest VM.
 + *
 + *  Adapted for KVM by Qumranet.
 + *
 + *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
 + *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
 + *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
 + *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
 + *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
 + */
 +#include <stdio.h>
 +#include <unistd.h>
 +#include <sys/io.h>
 +#include <sys/mman.h>
 +#include <sys/types.h>
 +#include <sys/stat.h>
 +#include "hw/hw.h"
 +#include "hw/pc.h"
 +#include "qemu-error.h"
 +#include "console.h"
 +#include "hw/loader.h"
 +#include "monitor.h"
 +#include "range.h"
 +#include "sysemu.h"
 +#include "hw/pci.h"
 +#include "hw/msi.h"
 
 +#include "kvm_i386.h"
 
 Am I correct to understand we compile this only for i386 / x86_64?

This is correct.

 (apic.o in kvm/Makefile.objs hints in that direction) You may want to
 update the description in the comment above accordingly, also mentioning
 that this is some deprecated backwards-compatibility thing.

You mean in the header of pci-assign.c? Can do.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCHv2 0/4] migrate PV EOI MSR

2012-08-27 Thread Michael S. Tsirkin

It turns out PV EOI gets disabled after migration -
until next guest reset.
This is because we are missing code to actually migrate it.
This patch fixes it up: it applies cleanly to qemu.git
as well as qemu-kvm.git, so I think it's cleaner
to apply it in qemu.git to keep diff to minimum.

Note: there's talk about adding infrastructure for
CPUID whitelisting which conceivably could be used
for migration compat support. I am guessing this won't be
1.2 material - when it's ready we can easily replace
a simple flag that this patchset adds with something else.

So this just adds minimal code to avoid regressing
cross-version migration.

Note: there's a kernel bug in linux 3.6-rc3 - apply
my patch 'kvm: fix KVM_GET_MSR for PV EOI' in order to
use this patchset on it.

Needed for 1.2.

Changes from v1:
Update all headers from 3.6-rc3 to keep them in sync (Jan)
Disable cpuid flag for qemu 1.1 and older (Orit)

Michael S. Tsirkin (4):
  linux-headers: update to 3.6-rc3
  pc: refactor compat code
  cpuid: disable pv eoi for 1.1 and older compat types
  kvm: get/set PV EOI MSR

 hw/Makefile.objs  |  2 +-
 hw/cpu_flags.c| 32 +++
 hw/cpu_flags.h|  9 
 hw/pc_piix.c  | 46 ---
 linux-headers/asm-s390/kvm.h  |  2 +-
 linux-headers/asm-s390/kvm_para.h |  2 +-
 linux-headers/asm-x86/kvm.h   |  1 +
 linux-headers/asm-x86/kvm_para.h  |  7 ++
 linux-headers/linux/kvm.h |  3 +++
 target-i386/cpu.c |  8 +++
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 13 +++
 target-i386/machine.c | 21 ++
 13 files changed, 136 insertions(+), 11 deletions(-)
 create mode 100644 hw/cpu_flags.c
 create mode 100644 hw/cpu_flags.h

-- 
MST
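
The compat approach described in this cover letter can be sketched as follows. Everything here is an assumption made for illustration: `machine_compat_disable_pv_eoi`, `enum cpu_flag_request` and `kvm_features_with_pv_eoi` are hypothetical names, and only the bit number 6 for KVM_FEATURE_PV_EOI is taken from the header update in this series. The idea is that an old machine type flips a global default, while an explicit +kvm_pv_eoi or -kvm_pv_eoi on the command line still wins.

```c
#include <stdbool.h>
#include <stdint.h>

#define KVM_FEATURE_PV_EOI 6

/* Set once by the init function of an old machine type. */
static bool pv_eoi_disabled_by_machine;

static void machine_compat_disable_pv_eoi(void)
{
    pv_eoi_disabled_by_machine = true;
}

/* What the user asked for on the -cpu command line. */
enum cpu_flag_request { FLAG_DEFAULT, FLAG_FORCE_ON, FLAG_FORCE_OFF };

/* Compute the KVM feature word with the PV EOI bit set or cleared. */
static uint32_t kvm_features_with_pv_eoi(uint32_t features,
                                         enum cpu_flag_request req)
{
    uint32_t bit = UINT32_C(1) << KVM_FEATURE_PV_EOI;
    bool on = (req == FLAG_FORCE_ON) ||
              (req == FLAG_DEFAULT && !pv_eoi_disabled_by_machine);

    return on ? (features | bit) : (features & ~bit);
}
```

Keeping the guest-visible CPUID stable per machine type is what prevents a guest migrated from an older QEMU from seeing the feature bit appear under it.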

[Qemu-devel] [PATCHv2 1/4] linux-headers: update to 3.6-rc3

2012-08-27 Thread Michael S. Tsirkin

Update linux-headers to the version present in Linux 3.6-rc3.
The asm-x86/kvm_para.h update is needed for the new PV EOI
feature.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 linux-headers/asm-s390/kvm.h  | 2 +-
 linux-headers/asm-s390/kvm_para.h | 2 +-
 linux-headers/asm-x86/kvm.h   | 1 +
 linux-headers/asm-x86/kvm_para.h  | 7 +++
 linux-headers/linux/kvm.h | 3 +++
 5 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index bdcbe0f..d25da59 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -1,7 +1,7 @@
 #ifndef __LINUX_KVM_S390_H
 #define __LINUX_KVM_S390_H
 /*
- * asm-s390/kvm.h - KVM s390 specific structures and definitions
+ * KVM s390 specific structures and definitions
  *
  * Copyright IBM Corp. 2008
  *
diff --git a/linux-headers/asm-s390/kvm_para.h 
b/linux-headers/asm-s390/kvm_para.h
index 8e2dd67..870051f 100644
--- a/linux-headers/asm-s390/kvm_para.h
+++ b/linux-headers/asm-s390/kvm_para.h
@@ -1,5 +1,5 @@
 /*
- * asm-s390/kvm_para.h - definition for paravirtual devices on s390
+ * definition for paravirtual devices on s390
  *
  * Copyright IBM Corp. 2008
  *
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index e7d1c19..246617e 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -12,6 +12,7 @@
 /* Select x86 specific features in linux/kvm.h */
 #define __KVM_HAVE_PIT
 #define __KVM_HAVE_IOAPIC
+#define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_DEVICE_ASSIGNMENT
 #define __KVM_HAVE_MSI
 #define __KVM_HAVE_USER_NMI
diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h
index f2ac46a..a1c3d72 100644
--- a/linux-headers/asm-x86/kvm_para.h
+++ b/linux-headers/asm-x86/kvm_para.h
@@ -22,6 +22,7 @@
 #define KVM_FEATURE_CLOCKSOURCE2 3
 #define KVM_FEATURE_ASYNC_PF   4
 #define KVM_FEATURE_STEAL_TIME 5
+#define KVM_FEATURE_PV_EOI 6
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
@@ -37,6 +38,7 @@
 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
 #define MSR_KVM_STEAL_TIME  0x4b564d03
+#define MSR_KVM_PV_EOI_EN  0x4b564d04
 
 struct kvm_steal_time {
__u64 steal;
@@ -89,5 +91,10 @@ struct kvm_vcpu_pv_apf_data {
__u32 enabled;
 };
 
+#define KVM_PV_EOI_BIT 0
+#define KVM_PV_EOI_MASK (0x1 << KVM_PV_EOI_BIT)
+#define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
+#define KVM_PV_EOI_DISABLED 0x0
+
 
 #endif /* _ASM_X86_KVM_PARA_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 5a9d4e3..4b9e575 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -617,6 +617,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_SIGNAL_MSI 77
 #define KVM_CAP_PPC_GET_SMMU_INFO 78
 #define KVM_CAP_S390_COW 79
+#define KVM_CAP_PPC_ALLOC_HTAB 80
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -828,6 +829,8 @@ struct kvm_s390_ucas_mapping {
 #define KVM_SIGNAL_MSI _IOW(KVMIO,  0xa5, struct kvm_msi)
 /* Available with KVM_CAP_PPC_GET_SMMU_INFO */
 #define KVM_PPC_GET_SMMU_INFO _IOR(KVMIO,  0xa6, struct kvm_ppc_smmu_info)
+/* Available with KVM_CAP_PPC_ALLOC_HTAB */
+#define KVM_PPC_ALLOCATE_HTAB _IOWR(KVMIO, 0xa7, __u32)
 
 /*
  * ioctls for vcpu fds
-- 
MST

[Qemu-devel] [PATCHv2 2/4] pc: refactor compat code

2012-08-27 Thread Michael S. Tsirkin

In preparation for adding PV EOI migration for 1.2,
trivially refactor some compat code
to make it easier to add version-specific
cpuid tweaks.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pc_piix.c | 44 
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index a771d79..008d42f 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -369,6 +369,22 @@ static QEMUMachine pc_machine_v1_2 = {
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 };
 
+static void pc_machine_v1_1_compat(void)
+{
+}
+
+static void pc_init_pci_v1_1(ram_addr_t ram_size,
+ const char *boot_device,
+ const char *kernel_filename,
+ const char *kernel_cmdline,
+ const char *initrd_filename,
+ const char *cpu_model)
+{
+pc_machine_v1_1_compat();
+pc_init_pci(ram_size, boot_device, kernel_filename,
+kernel_cmdline, initrd_filename, cpu_model);
+}
+
 #define PC_COMPAT_1_1 \
 {\
 .driver   = "virtio-scsi-pci",\
@@ -403,7 +419,7 @@ static QEMUMachine pc_machine_v1_2 = {
 static QEMUMachine pc_machine_v1_1 = {
 .name = "pc-1.1",
 .desc = "Standard PC",
-.init = pc_init_pci,
+.init = pc_init_pci_v1_1,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -439,7 +455,7 @@ static QEMUMachine pc_machine_v1_1 = {
 static QEMUMachine pc_machine_v1_0 = {
 .name = "pc-1.0",
 .desc = "Standard PC",
-.init = pc_init_pci,
+.init = pc_init_pci_v1_1,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -455,7 +471,7 @@ static QEMUMachine pc_machine_v1_0 = {
 static QEMUMachine pc_machine_v0_15 = {
 .name = "pc-0.15",
 .desc = "Standard PC",
-.init = pc_init_pci,
+.init = pc_init_pci_v1_1,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -488,7 +504,7 @@ static QEMUMachine pc_machine_v0_15 = {
 static QEMUMachine pc_machine_v0_14 = {
 .name = "pc-0.14",
 .desc = "Standard PC",
-.init = pc_init_pci,
+.init = pc_init_pci_v1_1,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -519,10 +535,22 @@ static QEMUMachine pc_machine_v0_14 = {
 .value= stringify(1),\
 }
 
+static void pc_init_pci_v0_13(ram_addr_t ram_size,
+ const char *boot_device,
+ const char *kernel_filename,
+ const char *kernel_cmdline,
+ const char *initrd_filename,
+ const char *cpu_model)
+{
+pc_machine_v1_1_compat();
+pc_init_pci_no_kvmclock(ram_size, boot_device, kernel_filename,
+kernel_cmdline, initrd_filename, cpu_model);
+}
+
 static QEMUMachine pc_machine_v0_13 = {
 .name = "pc-0.13",
 .desc = "Standard PC",
-.init = pc_init_pci_no_kvmclock,
+.init = pc_init_pci_v0_13,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -560,7 +588,7 @@ static QEMUMachine pc_machine_v0_13 = {
 static QEMUMachine pc_machine_v0_12 = {
 .name = "pc-0.12",
 .desc = "Standard PC",
-.init = pc_init_pci_no_kvmclock,
+.init = pc_init_pci_v0_13,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -594,7 +622,7 @@ static QEMUMachine pc_machine_v0_12 = {
 static QEMUMachine pc_machine_v0_11 = {
 .name = "pc-0.11",
 .desc = "Standard PC, qemu 0.11",
-.init = pc_init_pci_no_kvmclock,
+.init = pc_init_pci_v0_13,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
@@ -616,7 +644,7 @@ static QEMUMachine pc_machine_v0_11 = {
 static QEMUMachine pc_machine_v0_10 = {
 .name = "pc-0.10",
 .desc = "Standard PC, qemu 0.10",
-.init = pc_init_pci_no_kvmclock,
+.init = pc_init_pci_v0_13,
 .max_cpus = 255,
 .default_machine_opts = KVM_MACHINE_OPTIONS,
 .compat_props = (GlobalProperty[]) {
-- 
MST

[Qemu-devel] [PATCHv2 3/4] cpuid: disable pv eoi for 1.1 and older compat types

2012-08-27 Thread Michael S. Tsirkin

In preparation for adding PV EOI support, disable PV EOI by default for
1.1 and older machine types, to avoid CPUID changing during migration.

PV EOI can still be enabled/disabled by specifying it explicitly.
Enable for 1.1
-M pc-1.1 -cpu kvm64,+kvm_pv_eoi
Disable for 1.2
-M pc-1.2 -cpu kvm64,-kvm_pv_eoi

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/Makefile.objs  |  2 +-
 hw/cpu_flags.c| 32 
 hw/cpu_flags.h|  9 +
 hw/pc_piix.c  |  2 ++
 target-i386/cpu.c |  8 
 5 files changed, 52 insertions(+), 1 deletion(-)
 create mode 100644 hw/cpu_flags.c
 create mode 100644 hw/cpu_flags.h

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 850b87b..3f2532a 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -1,5 +1,5 @@
 hw-obj-y = usb/ ide/
-hw-obj-y += loader.o
+hw-obj-y += loader.o cpu_flags.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
 hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
 hw-obj-y += fw_cfg.o
diff --git a/hw/cpu_flags.c b/hw/cpu_flags.c
new file mode 100644
index 000..2422d20
--- /dev/null
+++ b/hw/cpu_flags.c
@@ -0,0 +1,32 @@
+/*
+ * CPU compatibility flags.
+ *
+ * Copyright (c) 2012 Red Hat Inc.
+ * Author: Michael S. Tsirkin.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+#include "hw/cpu_flags.h"
+
+static bool __kvm_pv_eoi_disabled;
+
+void disable_kvm_pv_eoi(void)
+{
+   __kvm_pv_eoi_disabled = true;
+}
+
+bool kvm_pv_eoi_disabled(void)
+{
+   return __kvm_pv_eoi_disabled;
+}
diff --git a/hw/cpu_flags.h b/hw/cpu_flags.h
new file mode 100644
index 000..05777b6
--- /dev/null
+++ b/hw/cpu_flags.h
@@ -0,0 +1,9 @@
+#ifndef HW_CPU_FLAGS_H
+#define HW_CPU_FLAGS_H
+
+#include <stdbool.h>
+
+void disable_kvm_pv_eoi(void);
+bool kvm_pv_eoi_disabled(void);
+
+#endif
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 008d42f..bdbceda 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -46,6 +46,7 @@
 #ifdef CONFIG_XEN
 #  include <xen/hvm/hvm_info_table.h>
 #endif
+#include "cpu_flags.h"
 
 #define MAX_IDE_BUS 2
 
@@ -371,6 +372,7 @@ static QEMUMachine pc_machine_v1_2 = {
 
 static void pc_machine_v1_1_compat(void)
 {
+disable_kvm_pv_eoi();
 }
 
 static void pc_init_pci_v1_1(ram_addr_t ram_size,
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 120a2e3..0d02fd1 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -23,6 +23,7 @@
 
 #include cpu.h
 #include kvm.h
+#include asm/kvm_para.h
 
 #include qemu-option.h
 #include qemu-config.h
@@ -33,6 +34,7 @@
 #include "hyperv.h"
 
 #include "hw/hw.h"
+#include "hw/cpu_flags.h"
 
 /* feature flags taken from Intel Processor Identification and the CPUID
  * Instruction and AMD's CPUID Specification.  In cases of disagreement
@@ -889,6 +891,12 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
const char *cpu_model)
 
 plus_kvm_features = ~0; /* not supported bits will be filtered out later */
 
+/* Disable PV EOI for old machine types.
+ * Feature flags can still override. */
+if (kvm_pv_eoi_disabled()) {
+plus_kvm_features &= ~(0x1 << KVM_FEATURE_PV_EOI);
+}
+
 add_flagname_to_bitmaps("hypervisor", &plus_features,
 &plus_ext_features, &plus_ext2_features, &plus_ext3_features,
 &plus_kvm_features, &plus_svm_features);
-- 
MST

[Qemu-devel] [PATCHv2 4/4] kvm: get/set PV EOI MSR

2012-08-27 Thread Michael S. Tsirkin

Support get/set of new PV EOI MSR, for migration.
Add an optional section for MSR value - send it
out in case MSR was changed from the default value (0).

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 target-i386/cpu.h |  1 +
 target-i386/kvm.c | 13 +
 target-i386/machine.c | 21 +
 3 files changed, 35 insertions(+)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index aabf993..3c57d8b 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -699,6 +699,7 @@ typedef struct CPUX86State {
 uint64_t system_time_msr;
 uint64_t wall_clock_msr;
 uint64_t async_pf_en_msr;
+uint64_t pv_eoi_en_msr;
 
 uint64_t tsc;
 uint64_t tsc_deadline;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 5e2d4f5..6790180 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -64,6 +64,7 @@ static bool has_msr_star;
 static bool has_msr_hsave_pa;
 static bool has_msr_tsc_deadline;
 static bool has_msr_async_pf_en;
+static bool has_msr_pv_eoi_en;
 static bool has_msr_misc_enable;
 static int lm_capable_kernel;
 
@@ -456,6 +457,8 @@ int kvm_arch_init_vcpu(CPUX86State *env)
 
 has_msr_async_pf_en = c->eax & (1 << KVM_FEATURE_ASYNC_PF);
 
+has_msr_pv_eoi_en = c->eax & (1 << KVM_FEATURE_PV_EOI);
+
 cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
 
 for (i = 0; i <= limit; i++) {
@@ -1018,6 +1021,10 @@ static int kvm_put_msrs(CPUX86State *env, int level)
 kvm_msr_entry_set(&msrs[n++], MSR_KVM_ASYNC_PF_EN,
   env->async_pf_en_msr);
 }
+if (has_msr_pv_eoi_en) {
+kvm_msr_entry_set(&msrs[n++], MSR_KVM_PV_EOI_EN,
+  env->pv_eoi_en_msr);
+}
 if (hyperv_hypercall_available()) {
 kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_GUEST_OS_ID, 0);
 kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_HYPERCALL, 0);
@@ -1260,6 +1267,9 @@ static int kvm_get_msrs(CPUX86State *env)
 if (has_msr_async_pf_en) {
 msrs[n++].index = MSR_KVM_ASYNC_PF_EN;
 }
+if (has_msr_pv_eoi_en) {
+msrs[n++].index = MSR_KVM_PV_EOI_EN;
+}
 
 if (env->mcg_cap) {
 msrs[n++].index = MSR_MCG_STATUS;
@@ -1339,6 +1349,9 @@ static int kvm_get_msrs(CPUX86State *env)
 case MSR_KVM_ASYNC_PF_EN:
 env->async_pf_en_msr = msrs[i].data;
 break;
+case MSR_KVM_PV_EOI_EN:
+env->pv_eoi_en_msr = msrs[i].data;
+break;
 }
 }
 
diff --git a/target-i386/machine.c b/target-i386/machine.c
index a8be058..4771508 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -279,6 +279,13 @@ static bool async_pf_msr_needed(void *opaque)
 return cpu->async_pf_en_msr != 0;
 }
 
+static bool pv_eoi_msr_needed(void *opaque)
+{
+CPUX86State *cpu = opaque;
+
+return cpu->pv_eoi_en_msr != 0;
+}
+
 static const VMStateDescription vmstate_async_pf_msr = {
 .name = "cpu/async_pf_msr",
 .version_id = 1,
@@ -290,6 +297,17 @@ static const VMStateDescription vmstate_async_pf_msr = {
 }
 };
 
+static const VMStateDescription vmstate_pv_eoi_msr = {
+.name = "cpu/async_pv_eoi_msr",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField []) {
+VMSTATE_UINT64(pv_eoi_en_msr, CPUX86State),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static bool fpop_ip_dp_needed(void *opaque)
 {
 CPUX86State *env = opaque;
@@ -454,6 +472,9 @@ static const VMStateDescription vmstate_cpu = {
 .vmsd = &vmstate_async_pf_msr,
 .needed = async_pf_msr_needed,
 } , {
+.vmsd = &vmstate_pv_eoi_msr,
+.needed = pv_eoi_msr_needed,
+} , {
 .vmsd = &vmstate_fpop_ip_dp,
 .needed = fpop_ip_dp_needed,
 }, {
-- 
MST

[Qemu-devel] [PATCH] Save/load PC speaker internal state

2012-08-27 Thread Pavel Dovgaluk

Save PC speaker state to remove differences between system
states after saving the snapshot and after loading it again.
This patch is needed for deterministic replay of the execution.

Signed-off-by: Pavel Dovgalyuk pavel.dovga...@gmail.com
---
 hw/pcspk.c |   18 ++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/hw/pcspk.c b/hw/pcspk.c
index e430324..3fb3dd1 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -159,10 +159,28 @@ static const MemoryRegionOps pcspk_io_ops = {
 },
 };
 
+static const VMStateDescription vmstate_spk = {
+.name = "pcspk",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField[]) {
+VMSTATE_UINT8_ARRAY(sample_buf, PCSpkState, PCSPK_BUF_LEN),
+VMSTATE_UINT32(pit_count, PCSpkState),
+VMSTATE_UINT32(samples, PCSpkState),
+VMSTATE_UINT32(play_pos, PCSpkState),
+VMSTATE_INT32(data_on, PCSpkState),
+VMSTATE_INT32(dummy_refresh_clock, PCSpkState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static int pcspk_initfn(ISADevice *dev)
 {
 PCSpkState *s = DO_UPCAST(PCSpkState, dev, dev);
 
+vmstate_register(NULL, 0, &vmstate_spk, s);
+
 memory_region_init_io(&s->ioport, &pcspk_io_ops, s, "elcr", 1);
 isa_register_ioport(dev, &s->ioport, s->iobase);

Re: [Qemu-devel] [PATCHv2 1/4] linux-headers: update to 3.6-rc3

2012-08-27 Thread Peter Maydell

On 27 August 2012 13:20, Michael S. Tsirkin m...@redhat.com wrote:
 Update linux-headers to version present in Linux 3.6-rc3.
 The asm-x86/kvm_para.h header update is needed for the new PV EOI
 feature.

 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  linux-headers/asm-s390/kvm.h  | 2 +-
  linux-headers/asm-s390/kvm_para.h | 2 +-
  linux-headers/asm-x86/kvm.h   | 1 +
  linux-headers/asm-x86/kvm_para.h  | 7 +++
  linux-headers/linux/kvm.h | 3 +++
  5 files changed, 13 insertions(+), 2 deletions(-)

The latest version of update-linux-headers.sh should have caused
this update to include asm-generic/kvm_para.h, I think. Did the
script not pull that header in, or were you maybe using an old
version of the script or forgot to git add the new file?

thanks
-- PMM

Re: [Qemu-devel] [PATCHv2 1/4] linux-headers: update to 3.6-rc3

2012-08-27 Thread Jan Kiszka

On 2012-08-27 14:42, Peter Maydell wrote:
 On 27 August 2012 13:20, Michael S. Tsirkin m...@redhat.com wrote:
 Update linux-headers to version present in Linux 3.6-rc3.
 The asm-x86/kvm_para.h header update is needed for the new PV EOI
 feature.

 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  linux-headers/asm-s390/kvm.h  | 2 +-
  linux-headers/asm-s390/kvm_para.h | 2 +-
  linux-headers/asm-x86/kvm.h   | 1 +
  linux-headers/asm-x86/kvm_para.h  | 7 +++
  linux-headers/linux/kvm.h | 3 +++
  5 files changed, 13 insertions(+), 2 deletions(-)
 
 The latest version of update-linux-headers.sh should have caused
 this update to include asm-generic/kvm_para.h, I think. Did the
 script not pull that header in, or were you maybe using an old
 version of the script or forgot to git add the new file?

To be fair, that is hard to guess. We should add some magic to the
update script to detect new files and maybe suggest them for addition.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH] Save/load PC speaker internal state

2012-08-27 Thread Peter Maydell

On 27 August 2012 13:21, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:
 Save PC speaker state to remove differences between system
 states after saving the snapshot and after loading it again.
 This patch is needed for deterministic replay of the execution.

 Signed-off-by: Pavel Dovgalyuk pavel.dovga...@gmail.com

Hi Pavel; thanks for this patch. Couple of minor issues:

 +static const VMStateDescription vmstate_spk = {
 +.name = "pcspk",
 +.version_id = 1,
 +.minimum_version_id = 1,
 +.minimum_version_id_old = 1,
 +.fields  = (VMStateField[]) {
 +VMSTATE_UINT8_ARRAY(sample_buf, PCSpkState, PCSPK_BUF_LEN),
 +VMSTATE_UINT32(pit_count, PCSpkState),
 +VMSTATE_UINT32(samples, PCSpkState),
 +VMSTATE_UINT32(play_pos, PCSpkState),
 +VMSTATE_INT32(data_on, PCSpkState),
 +VMSTATE_INT32(dummy_refresh_clock, PCSpkState),

I think you also need to update the types in the PCSpkState
struct from int/unsigned int to int32_t/uint32_t, otherwise this
won't compile on a 64 bit system.

 +VMSTATE_END_OF_LIST()
 +}
 +};
 +
  static int pcspk_initfn(ISADevice *dev)
  {
  PCSpkState *s = DO_UPCAST(PCSpkState, dev, dev);

 +vmstate_register(NULL, 0, &vmstate_spk, s);
 +

It's nicer to register the vmstate by setting
dc->vmsd = &vmstate_spk;
in pcspk_class_initfn(); then you don't need to explicitly call
vmstate_register().
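
The suggestion above can be sketched roughly as follows. This is only an illustration: the two struct types are reduced stand-ins invented for the sketch, not QEMU's real definitions, and the real class init function takes an ObjectClass and casts it, which is skipped here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's types, cut down to what the sketch needs. */
typedef struct VMStateDescription {
    const char *name;
    int version_id;
} VMStateDescription;

typedef struct DeviceClass {
    const VMStateDescription *vmsd;  /* migration description, if any */
} DeviceClass;

static const VMStateDescription vmstate_spk = {
    .name = "pcspk",
    .version_id = 1,
};

/* Class-level init: runs once per device type, so the description is
 * attached to the class and no per-instance registration call is needed.
 * (Simplified signature; an assumption made for this sketch.) */
static void pcspk_class_initfn(DeviceClass *dc)
{
    dc->vmsd = &vmstate_spk;
}
```

The point of the design is that the device core can then register and unregister the state together with the device's lifetime, rather than each init function doing it by hand.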

  memory_region_init_io(&s->ioport, &pcspk_io_ops, s, "elcr", 1);
  isa_register_ioport(dev, &s->ioport, s->iobase);

-- PMM

Re: [Qemu-devel] [PATCH 8/9] qdev: make qdev_set_parent_bus() just set a link property

2012-08-27 Thread Anthony Liguori

liu ping fan qemul...@gmail.com writes:

 On Sun, Aug 26, 2012 at 11:51 PM, Anthony Liguori aligu...@us.ibm.com wrote:
 Also make setting the link to NULL break the bus link

 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 ---
  hw/qdev.c |   48 ++--
  1 files changed, 42 insertions(+), 6 deletions(-)

 diff --git a/hw/qdev.c b/hw/qdev.c
 index 86e1337..525a0cb 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 @@ -100,8 +100,7 @@ static void bus_add_child(BusState *bus, DeviceState 
 *child)

  void qdev_set_parent_bus(DeviceState *dev, BusState *bus)
  {
 -dev->parent_bus = bus;
 -bus_add_child(bus, dev);
 +object_property_set_link(OBJECT(dev), OBJECT(bus), "parent_bus", NULL);
  }

  /* Create a new device.  This only initializes the device state structure
 @@ -241,8 +240,8 @@ void qbus_reset_all_fn(void *opaque)
  /* can be used as ->unplug() callback for the simple cases */
  int qdev_simple_unplug_cb(DeviceState *dev)
  {
 -/* just zap it */
 -qdev_free(dev);
 +/* Unplug from parent bus via a forced eject */
 +qdev_set_parent_bus(dev, NULL);

 I think it is more reliable to remove the reference properties (child,
 link) before object_finalize().  So when the unplug finishes, we delete all
 the references (bus->child, bus<-child) with _del_property rather than
 _set_property.

object_finalize is called when ref=0.  You cannot remove refs in
finalize because by definition, ref=0.
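
A toy model may make the ordering concrete. It is a sketch under assumptions: the Obj and Link types and every function name below are invented for illustration and are not the QOM API. It shows only that a link holds a reference, so the link has to be broken before finalize can ever run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted object. */
typedef struct Obj {
    int ref;
    bool finalized;
} Obj;

/* Runs only once the count has reached zero, so there is nothing
 * left for it to drop. */
static void obj_finalize(Obj *o)
{
    assert(o->ref == 0);
    o->finalized = true;
}

static void obj_ref(Obj *o)
{
    o->ref++;
}

static void obj_unref(Obj *o)
{
    assert(o->ref > 0);
    if (--o->ref == 0) {
        obj_finalize(o);
    }
}

/* Hypothetical link property: holds one reference to its target. */
typedef struct Link {
    Obj *target;
} Link;

/* Pointing the link elsewhere (or at NULL) releases the old target. */
static void link_set(Link *l, Obj *target)
{
    if (target) {
        obj_ref(target);
    }
    if (l->target) {
        obj_unref(l->target);
    }
    l->target = target;
}
```

In this model, setting the link to NULL is what lets the count reach zero; finalize is the consequence, not the place where the reference gets dropped.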

Regards,

Anthony Liguori

Re: [Qemu-devel] qcow2: online snasphots : internal vs external ?

2012-08-27 Thread Alexandre DERUMIER

Ok, got it,

Thanks Paolo !

- Original Message -

From: Paolo Bonzini pbonz...@redhat.com
To: qemu-devel@nongnu.org
Sent: Monday 27 August 2012 12:10:34
Subject: Re: [Qemu-devel] qcow2: online snasphots : internal vs external ?

On 27/08/2012 11:26, Alexandre DERUMIER wrote:
 How can I roll back to the point of the snapshot?

 Example:
 image1.qcow2
 file: /beforesnap1
 Take a snapshot (snap1), so qemu switches to snap1.qcow2.
 Write some files:
 files:
 /aftersnap1
 /beforesnap1

 Now, how can I roll back to the point in time of snap1?
 I can reuse image1.qcow2, but if I write some data to it, I don't
 see how I can return to the point in time of snap1 (like qemu-img snapshot -a
 with internal snapshots).

If you can drop snap1.qcow2 altogether, you just use image1.qcow2 the
next time you start QEMU.

If you cannot, you create snap2.qcow2 based on image1.qcow2:

qemu-img create -f qcow2 -obacking_file=image1.qcow2 snap2.qcow2

and use it the next time you start QEMU.

Paolo






--
Alexandre Derumier
Systems and Network Engineer

Landline: 03 20 68 88 85
Fax: 03 20 68 90 88

45 Bvd du Général Leclerc 59100 Roubaix
12 rue Marivaux 75002 Paris

Re: [Qemu-devel] [RFC PATCH 0/9] qom: improve reference counting and hotplug

2012-08-27 Thread Anthony Liguori

Paolo Bonzini pbonz...@redhat.com writes:

 On 27/08/2012 13:46, Andreas Färber wrote:
 
  Since link and child properties all hold references, in order to 
  actually free
  an object, we need to break those links.  User created devices end up as
  children of a container.  But child properties cannot be removed which 
  means
  there's no obvious way to remove the reference and ultimately free the 
  object.
 
  Why? Since we call _add_child() in qdev_device_add(), why can not we
  call object_property_del_child() for qmp_device_del(). Could you
  explain it in more detail?
 Seconded. If we hot-unplug a device, we should surely remove its child
 property from /machine/unassigned or parent bus or whatever.

That's exactly what is happening in this series.  qmp_device_del adds an
ejection notifier that unparents the device to remove the last reference count.


 Sure, as soon as the device is ejected by the guest.  But until that
 point we need to keep the device in the QOM tree so that: 1) it has a
 canonical path; 2) it can be examined; 3) it keeps children alive.

 Why is it that child properties cannot be removed?

 Yeah, I didn't quite understand the difference between unparenting and
 setting the child property to NULL.

They are exactly the same thing.  Setting the child property to NULL is
unparenting.

Unparenting is essentially deleting.  This series makes it such that
there is a white list of devices that are capable of being deleted.

Regards,

Anthony Liguori


 Paolo

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

2012-08-27 Thread Anthony Liguori

Liu Ping Fan qemul...@gmail.com writes:

 From: Liu Ping Fan pingf...@linux.vnet.ibm.com

 Scenario:
   obja is embedded in objA. When objA's ref reaches 0, objA will be freed,
 but at that time obja can still be in use.

 A real example is:
 typedef struct PCIIDEState {
 PCIDevice dev;
 IDEBus bus[2]; <-- created in place
 .
 }

 Without big-lock protection for mmio-dispatch, we will hold the
 object's refcnt. So memory_region_init_io() will replace its third
 parameter, void *opaque, with Object *obj.
 With this patch, holding the IDEBus ref protects PCIIDEState from
 disappearing during mmio-dispatch.

 The ref cycle is broken when calling qdev_delete_subtree().

 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com

I think this is solving the wrong problem.  There are many, many
dependencies a device may have on other devices.  Memory allocation
isn't the only one.

The problem is that we want to make sure that a device doesn't go away
while an MMIO dispatch is happening.  This is easy to solve without
touching reference counting.

The device will hold a lock while the MMIO is being dispatched.  The
delete path simply needs to acquire that same lock.  This will ensure
that a delete operation cannot finish while MMIO is still in flight.
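
As a rough illustration of that locking scheme (a sketch only; the Dev type, its fields and the function names are hypothetical, and the "deleted" flag is an assumption added so that later accesses can be refused):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical device with one register, guarded by a per-device lock. */
typedef struct Dev {
    pthread_mutex_t lock;
    bool deleted;
    int reg;
} Dev;

static void dev_init(Dev *d)
{
    pthread_mutex_init(&d->lock, NULL);
    d->deleted = false;
    d->reg = 0;
}

/* MMIO write: the lock is held for the whole dispatch.
 * Returns false if the device has already been deleted. */
static bool dev_mmio_write(Dev *d, int val)
{
    bool ok;

    pthread_mutex_lock(&d->lock);
    ok = !d->deleted;
    if (ok) {
        d->reg = val;
    }
    pthread_mutex_unlock(&d->lock);
    return ok;
}

/* Delete path: taking the same lock means this cannot complete while
 * a dispatch is still inside dev_mmio_write(). */
static void dev_delete(Dev *d)
{
    pthread_mutex_lock(&d->lock);
    d->deleted = true;
    pthread_mutex_unlock(&d->lock);
}
```

Note that the sketch only marks the device as deleted; the memory holding the lock still has to stay valid for any caller that may yet try to take it, so actually freeing the structure is a separate question that this does not answer.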

Regarding deleting a device, not all devices are capable of being
deleted and specifically, devices that are composed within the memory of
another device cannot be directly deleted (they can only be deleted
as part of their parent's destruction).

Regards,

Anthony Liguori

 ---
  hw/qdev.c |2 ++
  hw/qdev.h |1 +
  2 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/hw/qdev.c b/hw/qdev.c
 index e2339a1..b09ebbf 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 @@ -510,6 +510,8 @@ void qbus_create_inplace(BusState *bus, const char 
 *typename,
  {
  object_initialize(bus, typename);
  
 +bus->overlap = parent;
 +object_ref(OBJECT(bus->overlap));
  bus->parent = parent;
  bus->name = name ? g_strdup(name) : NULL;
  qbus_realize(bus);
 diff --git a/hw/qdev.h b/hw/qdev.h
 index 182cfa5..9bc5783 100644
 --- a/hw/qdev.h
 +++ b/hw/qdev.h
 @@ -117,6 +117,7 @@ struct BusState {
  int allow_hotplug;
  bool qom_allocated;
  bool glib_allocated;
 +DeviceState *overlap;
  int max_index;
  QTAILQ_HEAD(ChildrenHead, BusChild) children;
  QLIST_ENTRY(BusState) sibling;
 -- 
 1.7.4.4

Re: [Qemu-devel] [PATCH v7 0/6] convert sendkey to qapi

2012-08-27 Thread Luiz Capitulino

On Mon, 27 Aug 2012 15:23:31 +0800
Amos Kong ak...@redhat.com wrote:

 On 20/08/12 23:08, Luiz Capitulino wrote:
  On Mon, 20 Aug 2012 07:25:13 -0600
  Eric Blake ebl...@redhat.com wrote:
 
  On 08/19/2012 10:39 PM, Amos Kong wrote:
  This series converted 'sendkey' command to qapi. The raw value
  in hexadecimal format is not supported by 'send-key' of qmp.
 
  Are we still trying to get this into 1.2, or have we missed that deadline?
 
  Too late for 1.2, IMO.
 
 So I need to wait and repost a V8(# Since: 1.3) after 1.2 is released ?

I haven't reviewed this yet. If it's good enough, then I can do the
s/1.2/1.3 change myself when applying this to the qmp queue.

Re: [Qemu-devel] [PATCH 2/9] object: automatically free objects based on a release function

2012-08-27 Thread Andreas Färber

On 26.08.2012 17:51, Anthony Liguori wrote:
 Now object_delete() simply has the semantics of unref'ing an object and
 unparenting it.
 
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com

Acked-by: Andreas Färber afaer...@suse.de

/-F

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH v2 0/6] Running Microport UNIX (ca 1987)

2012-08-27 Thread Anthony Liguori

malc av1...@comtv.ru writes:

 On Thu, 23 Aug 2012, Matthew Ogilvie wrote:

 After applying this version 2 of this patch series, I can
 successfully run Microport UNIX System V/386, v 2.1 (ca 1987)
 under qemu.  (although not if I try to enable KVM)
 
 Version 1 of this series was posted about 4 weeks ago.  See
 http://patchwork.ozlabs.org/project/qemu-devel/list/?submitter=15654
 
 The patches are all independent, except that the documentation part
 of patch 5 (vga) adds onto patch 4 (retrace=) changes.

 [..snip..]

 Applied, thanks.

malc, please revert these patches.

They were not adequately reviewed and they also do not qualify for the
stage of the release we're in.

Regards,

Anthony Liguori

 -- 
 mailto:av1...@comtv.ru

Re: [Qemu-devel] [PATCH v2 6/6] i8259: add -no-spurious-interrupt-hack option

2012-08-27 Thread Anthony Liguori

Matthew Ogilvie mmogilvi_q...@miniinfo.net writes:

 This patch provides a way to optionally suppress spurious interrupts,
 as a workaround for systems described below:

 Some old operating systems do not handle spurious interrupts well,
 and qemu tends to generate them significantly more often than
 real hardware.

This is the wrong approach.  Instead, add a LostTickPolicy property to the
i8259 device.

Regards,

Anthony Liguori


 Examples:
   - Microport UNIX System V/386 v 2.1 (ca 1987)
 (The main problem I'm fixing: Without this patch, it panics
 sporadically when accessing the hard disk.)
   - AT&T UNIX System V/386 Release 4.0 Version 2.1a (ca 1991)
 See screenshot in "QEMU Official OS Support List":
 http://www.claunia.com/qemu/objectManager.php?sClass=application&iId=9
 (I don't have this system to test.)
   - A report about OS/2 boot lockup from 2004 by Hampa Hug:
 http://lists.nongnu.org/archive/html/qemu-devel/2004-09/msg00367.html
 (My patch was partially inspired by his.)
 Also: 
 http://lists.nongnu.org/archive/html/qemu-devel/2005-06/msg00243.html
 (I don't have this system to test.)

 Signed-off-by: Matthew Ogilvie mmogilvi_q...@miniinfo.net
 ---

 Note: checkpatch.pl gives an error about initializing the global
 "int no_spurious_interrupt_hack = 0;", even though existing lines
 near it are doing the same thing.  Should I give precedence to
 checkpatch.pl, or nearby code?

 There was no version 1 of this patch; this was the last thing I had to
 work around to get UNIX running.

 High level symptoms:
1. Despite using this UNIX system for nearly 10 years (ca 1987-1996)
   on an early 80386, I don't remember ever seeing any crash like
   this.  I vaguely remember I may have had one or two crashes for
   which I don't have other explanations that perhaps could have
   been this, but I don't remember the error messages to confirm it.
2. It is somewhat random when UNIX crashes when running in qemu.
- Sometimes it crashes the first time the floppy-based installer
  tries to access the hard disk (partition table?).
- Other times (though fairly rarely), it actually finishes
  formatting and copying the first disk's files to the
  hard disk without crashing.
- On the other hand, I've never seen it successfully boot from
  the hard disk without this patch.  An attempt to boot from
  the hard drive always panics quite early.
3. I tried -win2k-hack instead, thinking maybe the hard disk is just
   responding faster than UNIX expected.  But it doesn't seem
   to have any effect.  UNIX still panics sporadically the same way.
- TANGENT: I was going to see if my patch provides an
  alternative fix for installing Windows 2000, but
  I was unable to reproduce the original -win2k-hack problem at
  all (with neither -win2k-hack NOR this patch).  Maybe
  some other change has fixed it some other way?  Or maybe
  it is only an issue in configurations I didn't test?
  (KVM instead of TCG?  Less RAM?  Something else?)
 It might be worth doing a little more investigation,
  and eliminating the -win2k-hack option if appropriate.
4. If I enable KVM, I get a different error very early in
   bootup (in splx function instead of splint), and this patch
   doesn't help.

 
 My low level analysis of what is going on:

 It is hard to track down all the details, but based on logging a
 lot of qemu IRQ stuff, and setting a breakpoint in the earliest
 panic-related UNIX function using gdb, it looks like:

1. It is near the end of servicing a previous IRQ14 from the
   hard disk.
2. The processor has interrupts disabled (I think), while UNIX
   clears the slave 8259's IMR (mask) register (sets it to 0), allowing
   all interrupts to be passed on to the master.
3. While in that state, IRQ14 is raised (on the slave), which
   gets propagated to the master (IRQ2), but the CPU
   is not interrupted yet.
4. UNIX then masks the slave 8259's IMR register
   completely (sets to 0xff).
5. Because the master elcr register is set (by BIOS; UNIX never
   touches it) to edge trigger for IRQ2, the master latched on
   to IRQ2 earlier, and continues to assert the processor's INT line
   (the env->interrupt_request & CPU_INTERRUPT_HARD bit) even
   after all slave IRQs have been masked off (clearing the input
   IRQ2).
6. Finally, UNIX enables CPU interrupts and the interrupt is delivered
   to the CPU, which ends up as a spurious IRQ15 due to the
   slave's imr register.  UNIX doesn't know what to do with
   that, and panics/halts.
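
The last step of the sequence above can be modelled in a few lines. This is a deliberately tiny sketch, not QEMU's i8259 code: the Pic type and function names are made up, and the in-service register, priority rotation and edge/level handling are all left out. It relies only on the documented 8259A behaviour that an acknowledge cycle with no valid request is answered with the lowest-priority line, 7.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily reduced model of one 8259 controller. */
typedef struct Pic {
    uint8_t irr;  /* interrupt request register: pending lines */
    uint8_t imr;  /* interrupt mask register: 1 = line masked */
} Pic;

/* Highest-priority pending and unmasked line (0 is highest), or -1. */
static int pic_pending(const Pic *p)
{
    uint8_t ready = p->irr & (uint8_t)~p->imr;
    int i;

    for (i = 0; i < 8; i++) {
        if (ready & (1u << i)) {
            return i;
        }
    }
    return -1;
}

/* Interrupt acknowledge: if the request that raised INT is no longer
 * deliverable, the controller answers with line 7 (spurious). */
static int pic_ack(Pic *p)
{
    int irq = pic_pending(p);

    if (irq < 0) {
        return 7;
    }
    p->irr &= (uint8_t)~(1u << irq);
    return irq;
}
```

On the slave controller, line 6 corresponds to IRQ14 and line 7 to IRQ15, so a request on line 6 that gets masked before the acknowledge shows up to the guest as IRQ15.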

 I'm not sure why it only sporadically hits this sequence of events.
 There doesn't seem to be other IRQs asserted or serviced anywhere
 in the near past; the last several were all IRQ14's.  But I can't
 help

[Qemu-devel] [Bug 1042084] [NEW] Windows 7 guest cannot boot after seabios updated

2012-08-27 Thread Vic

Public bug reported:

Hi,

I can no longer boot my Windows 7 guest after this commit (update
seabios to latest master)

http://git.qemu.org/?p=qemu.git;a=commitdiff;h=01afdadc92e71e29700e64f3a5f42c1c543e3cf9

When I tried to boot Windows, it hit a BSOD saying "The BIOS in this system
is not fully ACPI compliant. Please contact your system vendor for an
updated BIOS." Reverting this commit fixes the issue.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1042084

Title:
  Windows 7 guest cannot boot after seabios updated

Status in QEMU:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1042084/+subscriptions

[Qemu-devel] [Bug 1036363] Re: Major network performance problems on AMD hardware

2012-08-27 Thread Ziemowit Pierzycki

Thanks Stefan,

I compiled both 0.15 and 1.0 and they do not have that problem... but the
Fedora package does.  Perhaps it is the way the Fedora package was compiled?
I'm going to grab a source package and attempt to compile from that.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1036363

Title:
  Major network performance problems on AMD hardware

Status in QEMU:
  New
Status in qemu-kvm:
  New

Bug description:
  Hi,

  I am experiencing some major performance problems with all of our
  beefy AMD Opteron 6274 servers running Fedora 17 (kernel
  3.4.4-5.fc17.x86_64, qemu 1.0-17).  The network performance between
  host and the virtual machine is terrible:

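  # iperf -c 10.10.11.22 -r
  ------------------------------------------------------------
  Server listening on TCP port 5001
  TCP window size: 85.3 KByte (default)
  ------------------------------------------------------------
  ------------------------------------------------------------
  Client connecting to 10.10.11.22, TCP port 5001
  TCP window size:  197 KByte (default)
  ------------------------------------------------------------
  [  5] local 10.10.11.199 port 44192 connected with 10.10.11.22 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  5]  0.0-10.0 sec  2.45 GBytes  2.11 Gbits/sec
  [  4] local 10.10.11.199 port 5001 connected with 10.10.11.22 port 42601
  [  4]  0.0-10.0 sec  8.97 GBytes  7.71 Gbits/sec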

  So the VM's receive is super slow.  I would be happy with 7.71 Gbps
  because it's closer to matching the speed of the 10G ethernet adapters
  but the iSCSI drive's write performance is a few times faster than read.
  Now running a similar test on the slowest machine I have, an Intel Core
  i3, I see this:

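  # iperf -c 192.168.7.60 -r
  ------------------------------------------------------------
  Server listening on TCP port 5001
  TCP window size: 85.3 KByte (default)
  ------------------------------------------------------------
  ------------------------------------------------------------
  Client connecting to 192.168.7.60, TCP port 5001
  TCP window size:  306 KByte (default)
  ------------------------------------------------------------
  [  5] local 192.168.7.98 port 53992 connected with 192.168.7.60 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  5]  0.0-10.0 sec  22.5 GBytes  19.3 Gbits/sec
  [  4] local 192.168.7.98 port 5001 connected with 192.168.7.60 port 53339
  [  4]  0.0-10.0 sec  25.1 GBytes  21.5 Gbits/sec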

  As you can imagine, this is a huge difference in network IO.  Most setups
  are identical down to the same versions.  Vhost-net is enabled and it
  appears to use MSI-X on the VM.  I've tried all kinds of settings and
  while they improve performance a little I feel it's just masking a
  bigger problem.  All 12 of my AMD servers have this issue and it
  appears I'm not the only one complaining.  Any help would be
  appreciated.  Thanks.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1036363/+subscriptions

Re: [Qemu-devel] [PATCH 3/9] qbus: remove glib_allocated/qom_allocated and use release hook to free memory

2012-08-27 Thread Andreas Färber

On 26.08.2012 17:51, Anthony Liguori wrote:
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com

That's a really nice solution for cleaning this up, thanks!

Acked-by: Andreas Färber afaer...@suse.de

However, one conceptual detail...

 ---
  hw/pci.c|7 ++-
  hw/qdev.c   |   15 ---
  hw/qdev.h   |7 ---
  hw/sysbus.c |7 ++-
  4 files changed, 12 insertions(+), 24 deletions(-)
[...]
 diff --git a/hw/qdev.c b/hw/qdev.c
 index b5a52ac..6b61daa 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
[...]
 @@ -468,18 +466,6 @@ BusState *qbus_create(const char *typename, DeviceState 
 *parent, const char *nam
  return bus;
  }
  
 -void qbus_free(BusState *bus)
 -{
 -if (bus->qom_allocated) {
 -object_delete(OBJECT(bus));
 -} else {
 -object_finalize(OBJECT(bus));
 -if (bus->glib_allocated) {
 -g_free(bus);
 -}
 -}
 -}
 -
  static char *bus_get_fw_dev_path(BusState *bus, DeviceState *dev)
  {
  BusClass *bc = BUS_GET_CLASS(bus);
 @@ -698,7 +684,6 @@ static void device_finalize(Object *obj)
  if (dev->state == DEV_STATE_INITIALIZED) {
  while (dev->num_child_bus) {
  bus = QLIST_FIRST(&dev->child_bus);
 -qbus_free(bus);
  }
  if (qdev_get_vmsd(dev)) {
  vmstate_unregister(dev, qdev_get_vmsd(dev), dev);

I wonder how this is gonna work: The device used to be in charge of
tearing down its bus children ... now it neither deletes nor finalizes
nor unrefs? Is the while loop even still needed?

Wouldn't the busses still have the device as parent, referencing it,
blocking device_finalize?

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH v2 0/6] Running Microport UNIX (ca 1987)

2012-08-27 Thread malc

On Mon, 27 Aug 2012, Anthony Liguori wrote:

 malc av1...@comtv.ru writes:
 
  On Thu, 23 Aug 2012, Matthew Ogilvie wrote:
 
  After applying this version 2 of this patch series, I can
  successfully run Microport UNIX System V/386, v 2.1 (ca 1987)
  under qemu.  (although not if I try to enable KVM)
  
  Version 1 of this series was posted about 4 weeks ago.  See
  http://patchwork.ozlabs.org/project/qemu-devel/list/?submitter=15654
  
  The patches are all independent, except that the documentation part
  of patch 5 (vga) adds onto patch 4 (retrace=) changes.
 
  [..snip..]
 
  Applied, thanks.
 
 malc, please revert these patches.
 
 They were not adequately reviewed and they also do not qualify for the
 stage of the release we're in.

Number 2 was, and should stay, as the emulation wasn't correct before it,
don't really care about the rest.

-- 
mailto:av1...@comtv.ru

Re: [Qemu-devel] [PATCH v2 0/6] Running Microport UNIX (ca 1987)

2012-08-27 Thread Anthony Liguori

malc av1...@comtv.ru writes:

 On Mon, 27 Aug 2012, Anthony Liguori wrote:

 malc av1...@comtv.ru writes:
 
  On Thu, 23 Aug 2012, Matthew Ogilvie wrote:
 
  After applying this version 2 of this patch series, I can
  successfully run Microport UNIX System V/386, v 2.1 (ca 1987)
  under qemu.  (although not if I try to enable KVM)
  
  Version 1 of this series was posted about 4 weeks ago.  See
  http://patchwork.ozlabs.org/project/qemu-devel/list/?submitter=15654
  
  The patches are all independent, except that the documentation part
  of patch 5 (vga) adds onto patch 4 (retrace=) changes.
 
  [..snip..]
 
  Applied, thanks.
 
 malc, please revert these patches.
 
 They were not adequately reviewed and they also do not qualify for the
 stage of the release we're in.

 Number 2 was, and should stay, as the emulation wasn't correct before it,
 don't really care about the rest.

Okay, please revert the rest then.

Regards,

Anthony Liguori


 -- 
 mailto:av1...@comtv.ru

Re: [Qemu-devel] [PATCH 3/9] qbus: remove glib_allocated/qom_allocated and use release hook to free memory

2012-08-27 Thread Anthony Liguori

Andreas Färber afaer...@suse.de writes:

 On 26.08.2012 17:51, Anthony Liguori wrote:
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com

 That's a really nice solution for cleaning this up, thanks!

 Acked-by: Andreas Färber afaer...@suse.de

 However, one conceptual detail...

 ---
  hw/pci.c|7 ++-
  hw/qdev.c   |   15 ---
  hw/qdev.h   |7 ---
  hw/sysbus.c |7 ++-
  4 files changed, 12 insertions(+), 24 deletions(-)
 [...]
 diff --git a/hw/qdev.c b/hw/qdev.c
 index b5a52ac..6b61daa 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 [...]
 @@ -468,18 +466,6 @@ BusState *qbus_create(const char *typename, DeviceState 
 *parent, const char *nam
      return bus;
  }
  
 -void qbus_free(BusState *bus)
 -{
 -    if (bus->qom_allocated) {
 -        object_delete(OBJECT(bus));
 -    } else {
 -        object_finalize(OBJECT(bus));
 -        if (bus->glib_allocated) {
 -            g_free(bus);
 -        }
 -    }
 -}
 -
  static char *bus_get_fw_dev_path(BusState *bus, DeviceState *dev)
  {
      BusClass *bc = BUS_GET_CLASS(bus);
 @@ -698,7 +684,6 @@ static void device_finalize(Object *obj)
      if (dev->state == DEV_STATE_INITIALIZED) {
          while (dev->num_child_bus) {
              bus = QLIST_FIRST(&dev->child_bus);
 -            qbus_free(bus);
          }
          if (qdev_get_vmsd(dev)) {
              vmstate_unregister(dev, qdev_get_vmsd(dev), dev);

 I wonder how this is gonna work: The device used to be in charge of
 tearing down its bus children ... now it neither deletes nor finalizes
 nor unrefs? Is the while loop even still needed?

 Wouldn't the busses still have the device as parent, referencing it,
 blocking device_finalize?

This has never been right..  Just because a controller goes away, it
doesn't mean that the devices ought to go away too.

There are different types of remove so let's consider each.

1) Guest visible eject: if a controller is ejected, then the guest will
   obviously see everything behind it get removed too.  This is an
   emulation detail, not a QOM thing.

2) Final deletion: this only happens when all references go away.  If
   you eject a controller but there are still children that reference
   it, the controller won't go away.  You actually need to delete each
   individual disk (or whatever is behind it) in order to break the
   reference counting.

The eject notifier could walk the full bus and attempt to break the
connections but honestly, I'd much prefer that we deprecate the current
device_del interface and just do everything through QOM properties.
That would mean manually deleting all of the devices behind the bus if
that's really what you wanted to do.
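
To make the difference between the two kinds of removal concrete, here is
a minimal sketch of the reference-counting behaviour described above. It
is not QOM code: the Obj type, obj_init(), obj_unref() and the finalized
flag are hypothetical names invented for this illustration, and the only
assumption is that each child holds one reference on its controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy object; this is NOT the QOM Object type. */
typedef struct Obj {
    int refcount;
    bool finalized;
    struct Obj *parent;   /* controller this object holds a reference on */
} Obj;

/* Create an object with one reference owned by the caller. */
static void obj_init(Obj *o, Obj *parent)
{
    o->refcount = 1;
    o->finalized = false;
    o->parent = parent;
    if (parent) {
        parent->refcount++;   /* a child pins its controller */
    }
}

/* Drop one reference; finalize only when the last one goes away. */
static void obj_unref(Obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        o->finalized = true;
        if (o->parent) {
            obj_unref(o->parent);   /* release the child's pin */
        }
    }
}
```

In this toy model, dropping the creator's reference on a controller with
two disks behind it leaves the controller alive; it is finalized only
after both disks have been unreferenced as well.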

Regards,

Anthony Liguori


 Regards,
 Andreas

 -- 
 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH v2 6/6] i8259: add -no-spurious-interrupt-hack option

2012-08-27 Thread Paolo Bonzini

On 27/08/2012 15:55, Anthony Liguori wrote:
  This patch provides a way to optionally suppress spurious interrupts,
  as a workaround for systems described below:
 
  Some old operating systems do not handle spurious interrupts well,
  and qemu tends to generate them significantly more often than
  real hardware.
 This is the wrong approach.  You add a LostTickPolicy property to the
 i8259 device.

Isn't the i8254 the one that would need a LostTickPolicy?

But this seems like a bug that is either in the i8259 emulation, or in
the firmware.  Your own suggestion of setting IRQ2 to level-triggered in
SeaBIOS is definitely a good one.

Paolo

Re: [Qemu-devel] [RFC 1/8] move qemu_irq typedef out of cpu-common.h

2012-08-27 Thread Igor Mammedov

- Original Message -
 From: Peter Maydell peter.mayd...@linaro.org
...
 
 I'm not objecting to this patch if it helps us move forwards,
 but adding the #include to sysemu.h is effectively just adding
 the definition to another grabbag header (183 files include
 sysemu.h). It would be nicer long-term to separate out the
 one thing in this header that cares about qemu_irq (the extern
 declaration of qemu_system_powerdown).
 
Is there a preference/suggestion in which header it should be declared?

...

Re: [Qemu-devel] [PATCH v2 0/6] Running Microport UNIX (ca 1987)

2012-08-27 Thread malc

On Mon, 27 Aug 2012, Anthony Liguori wrote:

 malc av1...@comtv.ru writes:
 

[..snip..]

 
  Number 2 was, and should stay, as the emulation wasn't correct before it,
  don't really care about the rest.
 
 Okay, please revert the rest then.
 

Done.

[..snip..]

-- 
mailto:av1...@comtv.ru

Re: [Qemu-devel] [PATCH 3/9] qbus: remove glib_allocated/qom_allocated and use release hook to free memory

2012-08-27 Thread Andreas Färber

On 27.08.2012 16:22, Anthony Liguori wrote:
 Andreas Färber afaer...@suse.de writes:
 
 I wonder how this is gonna work: The device used to be in charge of
 tearing down its bus children ... now it neither deletes nor finalizes
 nor unrefs? Is the while loop even still needed?

 Wouldn't the busses still have the device as parent, referencing it,
 blocking device_finalize?
 
 This has never been right..  Just because a controller goes away, it
 doesn't mean that the devices ought to go away too.
 
 There are different types of remove so let's consider each.
 
 1) Guest visible eject: if a controller is ejected, then the guest will
obviously see everything behind it get removed too.  This is an
emulation detail, not a QOM thing.
 
 2) Final deletion: this only happens when all references go away.  If
you eject a controller but there are still children that reference
it, the controller won't go away.  You actually need to delete each
individual disk (or whatever is behind it) in order to break the
reference counting.
 
 The eject notifier could walk the full bus and attempt to break the
 connections but honestly, I'd much prefer that we deprecate the current
 device_del interface and just do everything through QOM properties.
 That would mean manually deleting all of the devices behind the bus if
 that's really what you wanted to do.

I think we're talking about different scenarios here...

I was thinking

PCIHostState has-a PCIBus

(not PCIBus has-a PCIDevice) and final deletion.

In that case I would expect that it must be guaranteed that the device
that created the bus has access to the bus until it destroys it. But
IIUC the PCIHostState, once unparented from its SysBus (bad example!),
has a refcount of 1 (its PCIBus) thereby not being finalized?

I do understand that your concept of refcounting matches what Java, .NET,
etc. do for objects, but combined with the new QBus I feel this is
blurring the encapsulation and expected semantics of the device-centric
functions we have. To me, the uninitfn means the whole object goes away,
which is incompatible with some of its children staying behind if there
are still stray references to them... we can no longer properly access
them then, since only devices have canonical paths, so we'd risk piling
up garbage at runtime.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCHv2 1/4] linux-headers: update to 3.6-rc3

2012-08-27 Thread Michael S. Tsirkin

On Mon, Aug 27, 2012 at 01:42:03PM +0100, Peter Maydell wrote:
 On 27 August 2012 13:20, Michael S. Tsirkin m...@redhat.com wrote:
  Update linux-headers to version present in Linux 3.6-rc3.
  Header asm-x86/kvm_para.h update is needed for the new PV EOI
  feature.
 
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
  ---
   linux-headers/asm-s390/kvm.h  | 2 +-
   linux-headers/asm-s390/kvm_para.h | 2 +-
   linux-headers/asm-x86/kvm.h   | 1 +
   linux-headers/asm-x86/kvm_para.h  | 7 +++
   linux-headers/linux/kvm.h | 3 +++
   5 files changed, 13 insertions(+), 2 deletions(-)
 
 The latest version of update-linux-headers.sh should have caused
 this update to include asm-generic/kvm_para.h, I think. Did the
 script not pull that header in, or were you maybe using an old
 version of the script or forgot to git add the new file?
 
 thanks
 -- PMM

I have no idea but adding new files is not the same as updating
existing ones.
Why don't you add it when you update headers to a version that
actually uses it?

-- 
MST

Re: [Qemu-devel] [PATCHv2 1/4] linux-headers: update to 3.6-rc3

2012-08-27 Thread Michael S. Tsirkin

On Mon, Aug 27, 2012 at 02:48:57PM +0200, Jan Kiszka wrote:
 On 2012-08-27 14:42, Peter Maydell wrote:
  On 27 August 2012 13:20, Michael S. Tsirkin m...@redhat.com wrote:
  Update linux-headers to version present in Linux 3.6-rc3.
  Header asm-x86/kvm_para.h update is needed for the new PV EOI
  feature.
 
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
  ---
   linux-headers/asm-s390/kvm.h  | 2 +-
   linux-headers/asm-s390/kvm_para.h | 2 +-
   linux-headers/asm-x86/kvm.h   | 1 +
   linux-headers/asm-x86/kvm_para.h  | 7 +++
   linux-headers/linux/kvm.h | 3 +++
   5 files changed, 13 insertions(+), 2 deletions(-)
  
  The latest version of update-linux-headers.sh should have caused
  this update to include asm-generic/kvm_para.h, I think. Did the
  script not pull that header in, or were you maybe using an old
  version of the script or forgot to git add the new file?
 
 To be fair, that is hard to guess. We should add some magic to the
 update script to detect new files and maybe suggest them for addition.
 
 Jan

But why did you add a header to qemu without adding it
to git? That's a cleaner solution and needs no magic scripting.

 -- 
 Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
 Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH] spice: increase the verbosity of spice section in qemu --help

2012-08-27 Thread Eric Blake

On 08/26/2012 12:38 AM, Yonit Halperin wrote:
 On 08/21/2012 03:31 PM, Eric Blake wrote:
 On 08/21/2012 04:54 AM, Yonit Halperin wrote:
 Added all spice options to the help string. This can be used by libvirt
 to determine which spice related features are supported by qemu.

 For older releases, this is true; but for future versions of qemu,
 libvirt would much rather learn this information from QMP commands than
 from scraping -help output.  Can we get at all of this information
 from QMP?

 No, we don't have qmp commands for any of spice config options. I don't
 think it should be in the scope of this patch.

But since we have already declared that 1.2 is the last release where
libvirt will be scraping -help output, and that 1.3 and later will allow
libvirt to query all configuration information via QMP commands, I think
that you really _do_ need to consider QMP commands in the scope of this
patch series, if you expect libvirt to be able to react to this
information in qemu 1.3.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PATCHv2 1/4] linux-headers: update to 3.6-rc3

2012-08-27 Thread Jan Kiszka

On 2012-08-27 16:53, Michael S. Tsirkin wrote:
 On Mon, Aug 27, 2012 at 02:48:57PM +0200, Jan Kiszka wrote:
 On 2012-08-27 14:42, Peter Maydell wrote:
 On 27 August 2012 13:20, Michael S. Tsirkin m...@redhat.com wrote:
 Update linux-headers to version present in Linux 3.6-rc3.
 Header asm-x86/kvm_para.h update is needed for the new PV EOI
 feature.

 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  linux-headers/asm-s390/kvm.h  | 2 +-
  linux-headers/asm-s390/kvm_para.h | 2 +-
  linux-headers/asm-x86/kvm.h   | 1 +
  linux-headers/asm-x86/kvm_para.h  | 7 +++
  linux-headers/linux/kvm.h | 3 +++
  5 files changed, 13 insertions(+), 2 deletions(-)

 The latest version of update-linux-headers.sh should have caused
 this update to include asm-generic/kvm_para.h, I think. Did the
 script not pull that header in, or were you maybe using an old
 version of the script or forgot to git add the new file?

 To be fair, that is hard to guess. We should add some magic to the
 update script to detect new files and maybe suggest them for addition.

 Jan
 
 But why did you add a header to qemu without adding it
 to git? That's a cleaner solution and needs no magic scripting.

Yes, this would have been appropriate. Still, running a simple
'git status -s linux-headers' at the end of the update script can help
remind people in the future.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

[Qemu-devel] [RFC PATCH 01/13] nbd: add more constants

2012-08-27 Thread Paolo Bonzini

Avoid magic numbers and magic size computations; hide them behind #defines.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 nbd.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/nbd.c b/nbd.c
index 0dd60c5..8201b7a 100644
--- a/nbd.c
+++ b/nbd.c
@@ -57,9 +57,12 @@
 
 /* This is all part of the official NBD API */
 
+#define NBD_REQUEST_SIZE        (4 + 4 + 8 + 8 + 4)
 #define NBD_REPLY_SIZE          (4 + 4 + 8)
 #define NBD_REQUEST_MAGIC       0x25609513
 #define NBD_REPLY_MAGIC         0x67446698
+#define NBD_OPTS_MAGIC          0x49484156454F5054LL
+#define NBD_CLIENT_MAGIC        0x420281861253LL
 
 #define NBD_SET_SOCK            _IO(0xab, 0)
 #define NBD_SET_BLKSIZE         _IO(0xab, 1)
@@ -213,7 +216,7 @@ static int nbd_send_negotiate(int csock, off_t size, 
uint32_t flags)
 
     /* Negotiate
         [ 0 ..   7]   passwd   (NBDMAGIC)
-        [ 8 ..  15]   magic    (0x00420281861253)
+        [ 8 ..  15]   magic    (NBD_CLIENT_MAGIC)
         [16 ..  23]   size
         [24 ..  27]   flags
         [28 .. 151]   reserved (0)
@@ -224,7 +227,7 @@ static int nbd_send_negotiate(int csock, off_t size, 
uint32_t flags)
 
     TRACE("Beginning negotiation.");
     memcpy(buf, "NBDMAGIC", 8);
-    cpu_to_be64w((uint64_t*)(buf + 8), 0x00420281861253LL);
+    cpu_to_be64w((uint64_t*)(buf + 8), NBD_CLIENT_MAGIC);
     cpu_to_be64w((uint64_t*)(buf + 16), size);
     cpu_to_be32w((uint32_t*)(buf + 24),
                  flags | NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_TRIM |
@@ -295,7 +298,7 @@ int nbd_receive_negotiate(int csock, const char *name, 
uint32_t *flags,
         uint32_t namesize;
 
         TRACE("Checking magic (opts_magic)");
-        if (magic != 0x49484156454F5054LL) {
+        if (magic != NBD_OPTS_MAGIC) {
             LOG("Bad magic received");
             goto fail;
         }
@@ -334,7 +337,7 @@ int nbd_receive_negotiate(int csock, const char *name, 
uint32_t *flags,
     } else {
         TRACE("Checking magic (cli_magic)");
 
-        if (magic != 0x00420281861253LL) {
+        if (magic != NBD_CLIENT_MAGIC) {
             LOG("Bad magic received");
             goto fail;
         }
@@ -477,7 +480,7 @@ int nbd_client(int fd)
 
 ssize_t nbd_send_request(int csock, struct nbd_request *request)
 {
-    uint8_t buf[4 + 4 + 8 + 8 + 4];
+    uint8_t buf[NBD_REQUEST_SIZE];
     ssize_t ret;
 
     cpu_to_be32w((uint32_t*)buf, NBD_REQUEST_MAGIC);
@@ -504,7 +507,7 @@ ssize_t nbd_send_request(int csock, struct nbd_request 
*request)
 
 static ssize_t nbd_receive_request(int csock, struct nbd_request *request)
 {
-    uint8_t buf[4 + 4 + 8 + 8 + 4];
+    uint8_t buf[NBD_REQUEST_SIZE];
     uint32_t magic;
     ssize_t ret;
 
@@ -582,7 +585,7 @@ ssize_t nbd_receive_reply(int csock, struct nbd_reply 
*reply)
 
 static ssize_t nbd_send_reply(int csock, struct nbd_reply *reply)
 {
-    uint8_t buf[4 + 4 + 8];
+    uint8_t buf[NBD_REPLY_SIZE];
     ssize_t ret;
 
     /* Reply
-- 
1.7.11.2
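
For readers following along, the size macro in the patch above is simply
the sum of the widths of the request's wire fields. The sketch below shows
one way that sum maps onto a buffer. It is an illustration under stated
assumptions, not the code from nbd.c: toy_request, toy_encode_request(),
put_be32() and put_be64() are hypothetical names, and the field order
(magic, type, handle, offset, length) is assumed from the 4 + 4 + 8 + 8 + 4
breakdown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBD_REQUEST_SIZE   (4 + 4 + 8 + 8 + 4)
#define NBD_REQUEST_MAGIC  0x25609513

/* Hypothetical request layout, used only for this sketch. */
typedef struct {
    uint32_t type;
    uint64_t handle;
    uint64_t from;
    uint32_t len;
} toy_request;

/* Store a 32-bit value in big-endian (network) byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Store a 64-bit value in big-endian byte order. */
static void put_be64(uint8_t *p, uint64_t v)
{
    put_be32(p, (uint32_t)(v >> 32));
    put_be32(p + 4, (uint32_t)v);
}

/* Serialize one request; returns the number of bytes written. */
static size_t toy_encode_request(uint8_t buf[NBD_REQUEST_SIZE],
                                 const toy_request *r)
{
    put_be32(buf, NBD_REQUEST_MAGIC);   /* [ 0 ..  3] magic  */
    put_be32(buf + 4, r->type);         /* [ 4 ..  7] type   */
    put_be64(buf + 8, r->handle);       /* [ 8 .. 15] handle */
    put_be64(buf + 16, r->from);        /* [16 .. 23] offset */
    put_be32(buf + 24, r->len);         /* [24 .. 27] length */
    return NBD_REQUEST_SIZE;
}
```

Keeping the size as a single named macro means the buffer declaration and
the encoder cannot silently drift apart.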

[Qemu-devel] [RFC PATCH 06/13] nbd: negotiate with named exports

2012-08-27 Thread Paolo Bonzini

Allow negotiation to receive the name of the requested export from
the client.  Passing a NULL export to nbd_client_new will cause
the server to send the extended negotiation header.  The exp field
is then filled during negotiation.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 nbd.c | 155 +++---
 1 file changed, 140 insertions(+), 15 deletions(-)

diff --git a/nbd.c b/nbd.c
index 1249548..fe7551d 100644
--- a/nbd.c
+++ b/nbd.c
@@ -234,11 +234,23 @@ int unix_socket_outgoing(const char *path)
     return unix_connect(path);
 }
 
-/* Basic flow
+/* Basic flow for negotiation
 
    Server         Client
-
    Negotiate
+
+   or
+
+   Server         Client
+   Negotiate #1
+                  Option
+   Negotiate #2
+
+   
+
+   followed by
+
+   Server         Client
                   Request
    Response
                   Request
@@ -246,20 +258,110 @@ int unix_socket_outgoing(const char *path)
                   ...
    ...
                   Request (type == 2)
+
 */
 
+static int nbd_receive_options(NBDClient *client)
+{
+    int csock = client->sock;
+    char name[256];
+    uint32_t tmp, length;
+    uint64_t magic;
+    int rc;
+
+    /* Client sends:
+        [ 0 ..   3]   reserved (0)
+        [ 4 ..  11]   NBD_OPTS_MAGIC
+        [12 ..  15]   NBD_OPT_EXPORT_NAME
+        [16 ..  19]   length
+        [20 ..  xx]   export name (length bytes)
+     */
+
+    rc = -EINVAL;
+    if (read_sync(csock, &tmp, sizeof(tmp)) != sizeof(tmp)) {
+        LOG("read failed");
+        goto fail;
+    }
+    TRACE("Checking reserved");
+    if (tmp != 0) {
+        LOG("Bad reserved received");
+        goto fail;
+    }
+
+    if (read_sync(csock, &magic, sizeof(magic)) != sizeof(magic)) {
+        LOG("read failed");
+        goto fail;
+    }
+    TRACE("Checking reserved");
+    if (magic != be64_to_cpu(NBD_OPTS_MAGIC)) {
+        LOG("Bad magic received");
+        goto fail;
+    }
+
+    if (read_sync(csock, &tmp, sizeof(tmp)) != sizeof(tmp)) {
+        LOG("read failed");
+        goto fail;
+    }
+    TRACE("Checking option");
+    if (tmp != be32_to_cpu(NBD_OPT_EXPORT_NAME)) {
+        LOG("Bad option received");
+        goto fail;
+    }
+
+    if (read_sync(csock, &length, sizeof(length)) != sizeof(length)) {
+        LOG("read failed");
+        goto fail;
+    }
+    TRACE("Checking length");
+    length = be32_to_cpu(length);
+    if (length > 255) {
+        LOG("Bad length received");
+        goto fail;
+    }
+    if (read_sync(csock, name, length) != length) {
+        LOG("read failed");
+        goto fail;
+    }
+    name[length] = '\0';
+
+    client->exp = nbd_export_find(name);
+    if (!client->exp) {
+        LOG("export not found");
+        goto fail;
+    }
+
+    QTAILQ_INSERT_TAIL(&client->exp->clients, client, next);
+    TRACE("Option negotiation succeeded.");
+    rc = 0;
+fail:
+    return rc;
+}
+
 static int nbd_send_negotiate(NBDClient *client)
 {
     int csock = client->sock;
     char buf[8 + 8 + 8 + 128];
     int rc;
+    const int myflags = (NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_TRIM |
+                         NBD_FLAG_SEND_FLUSH | NBD_FLAG_SEND_FUA);
 
-    /* Negotiate
-        [ 0 ..   7]   passwd   (NBDMAGIC)
-        [ 8 ..  15]   magic    (NBD_CLIENT_MAGIC)
+    /* Negotiation header without options:
+        [ 0 ..   7]   passwd       (NBDMAGIC)
+        [ 8 ..  15]   magic        (NBD_CLIENT_MAGIC)
         [16 ..  23]   size
-        [24 ..  27]   flags
-        [28 .. 151]   reserved (0)
+        [24 ..  25]   server flags (0)
+        [26 ..  27]   export flags
+        [28 .. 151]   reserved     (0)
+
+       Negotiation header with options, part 1:
+        [ 0 ..   7]   passwd       (NBDMAGIC)
+        [ 8 ..  15]   magic        (NBD_OPTS_MAGIC)
+        [16 ..  17]   server flags (0)
+
+       part 2 (after options are sent):
+        [18 ..  25]   size
+        [26 ..  27]   export flags
+        [28 .. 151]   reserved     (0)
      */
 
     socket_set_block(csock);
@@ -267,16 +369,39 @@ static int nbd_send_negotiate(NBDClient *client)
 
     TRACE("Beginning negotiation.");
     memcpy(buf, "NBDMAGIC", 8);
-    cpu_to_be64w((uint64_t*)(buf + 8), NBD_CLIENT_MAGIC);
-    cpu_to_be64w((uint64_t*)(buf + 16), client->exp->size);
-    cpu_to_be32w((uint32_t*)(buf + 24),
-                 client->exp->nbdflags | NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_TRIM |
-                 NBD_FLAG_SEND_FLUSH | NBD_FLAG_SEND_FUA);
+    if (client->exp) {
+        assert ((client->exp->nbdflags & ~65535) == 0);
+        cpu_to_be64w((uint64_t*)(buf + 8), NBD_CLIENT_MAGIC);
+        cpu_to_be64w((uint64_t*)(buf + 16), client->exp->size);
+        cpu_to_be16w((uint16_t*)(buf + 26), client->exp->nbdflags | myflags);
+    } else {
+        cpu_to_be64w((uint64_t*)(buf + 8), NBD_OPTS_MAGIC);
+    }
     memset(buf + 28, 0, 124);
 
-    if (write_sync(csock, buf, sizeof(buf)) != sizeof(buf)) {
-        LOG("write failed");
-        goto fail;
+    if (client->exp)
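
The option block documented in nbd_receive_options() above can also be
checked in one pass over an in-memory buffer, which may be easier to
reason about than the chain of socket reads. The following is only a
sketch of that layout, not the patch's code: toy_parse_export_option(),
get_be32(), get_be64() and TOY_MAX_NAME are hypothetical names, and the
numeric value 1 for the export-name option is an assumption, since the
patch excerpt does not show its definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBD_OPTS_MAGIC       0x49484156454F5054ULL  /* ASCII "IHAVEOPT" */
#define TOY_OPT_EXPORT_NAME  1      /* assumed value, see lead-in */
#define TOY_MAX_NAME         255

/* Load a big-endian 32-bit value. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Load a big-endian 64-bit value. */
static uint64_t get_be64(const uint8_t *p)
{
    return ((uint64_t)get_be32(p) << 32) | get_be32(p + 4);
}

/*
 * Parse:  [ 0 ..  3] reserved (0)
 *         [ 4 .. 11] NBD_OPTS_MAGIC
 *         [12 .. 15] option code
 *         [16 .. 19] length
 *         [20 .. xx] export name (length bytes)
 * Returns 0 and a NUL-terminated name on success, -1 on malformed input.
 */
static int toy_parse_export_option(const uint8_t *buf, size_t buflen,
                                   char name[TOY_MAX_NAME + 1])
{
    uint32_t length;

    if (buflen < 20) {
        return -1;                       /* fixed header incomplete */
    }
    if (get_be32(buf) != 0) {
        return -1;                       /* bad reserved field */
    }
    if (get_be64(buf + 4) != NBD_OPTS_MAGIC) {
        return -1;                       /* bad magic */
    }
    if (get_be32(buf + 12) != TOY_OPT_EXPORT_NAME) {
        return -1;                       /* unsupported option */
    }
    length = get_be32(buf + 16);
    if (length > TOY_MAX_NAME || buflen - 20 < length) {
        return -1;                       /* name too long or truncated */
    }
    memcpy(name, buf + 20, length);
    name[length] = '\0';
    return 0;
}
```

Bounding the length before copying mirrors the length check in the patch
and keeps the terminating NUL inside the caller's buffer.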

[Qemu-devel] [RFC PATCH 09/13] qmp: add NBD server commands

2012-08-27 Thread Paolo Bonzini

Adding an NBD server inside QEMU is trivial, since all the logic is
in nbd.c and can be shared easily between qemu-nbd and QEMU itself.
The main difference is that qemu-nbd serves a single unnamed export,
while QEMU serves named exports.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 Makefile.objs|  2 +-
 blockdev-nbd.c   | 93 
 qapi-schema.json | 69 +
 qmp-commands.hx  | 16 ++
 4 files changed, 179 insertions(+), 1 deletion(-)
 create mode 100644 blockdev-nbd.c

diff --git a/Makefile.objs b/Makefile.objs
index 4412757..c42affc 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -59,7 +59,7 @@ endif
 # suppress *all* target specific code in case of system emulation, i.e. a
 # single QEMU executable should support all CPUs and machines.
 
-common-obj-y = $(block-obj-y) blockdev.o
+common-obj-y = $(block-obj-y) blockdev.o blockdev-nbd.o
 common-obj-y += net.o net/
 common-obj-y += qom/
 common-obj-y += readline.o console.o cursor.o
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
new file mode 100644
index 000..5a415be
--- /dev/null
+++ b/blockdev-nbd.c
@@ -0,0 +1,93 @@
+/*
+ * QEMU host block devices
+ *
+ * Copyright (c) 2003-2008 Fabrice Bellard
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "blockdev.h"
+#include "hw/block-common.h"
+#include "monitor.h"
+#include "qerror.h"
+#include "sysemu.h"
+#include "qmp-commands.h"
+#include "trace.h"
+#include "nbd.h"
+#include "qemu_socket.h"
+
+static int server_fd = -1;
+
+static void nbd_accept(void *opaque)
+{
+    struct sockaddr_in addr;
+    socklen_t addr_len = sizeof(addr);
+
+    int fd = accept(server_fd, (struct sockaddr *)&addr, &addr_len);
+    if (fd >= 0) {
+        nbd_client_new(NULL, fd, NULL);
+    }
+}
+
+static void nbd_server_start(QemuOpts *opts, Error **errp)
+{
+    if (server_fd != -1) {
+        /* TODO: error */
+        return;
+    }
+
+    server_fd = inet_listen_opts(opts, 0, errp);
+    if (server_fd != -1) {
+        qemu_set_fd_handler2(server_fd, NULL, nbd_accept, NULL, NULL);
+    }
+}
+
+void qmp_nbd_server_start(IPSocketAddress *addr, Error **errp)
+{
+    QemuOpts *opts;
+
+    opts = qemu_opts_create(&socket_opts, NULL, 0, NULL);
+    qemu_opt_set(opts, "host", addr->host);
+    qemu_opt_set(opts, "port", addr->port);
+
+    addr->ipv4 |= !addr->has_ipv4;
+    addr->ipv6 |= !addr->has_ipv6;
+    if (!addr->ipv4 || !addr->ipv6) {
+        qemu_opt_set_bool(opts, "ipv4", addr->ipv4);
+        qemu_opt_set_bool(opts, "ipv6", addr->ipv6);
+    }
+
+    nbd_server_start(opts, errp);
+    qemu_opts_del(opts);
+}
+
+
+void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
+                        Error **errp)
+{
+    BlockDriverState *bs;
+    NBDExport *exp;
+
+    bs = bdrv_find(device);
+    if (!bs) {
+        error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+        return;
+    }
+
+    if (nbd_export_find(bdrv_get_device_name(bs))) {
+        /* TODO: error */
+        return;
+    }
+
+    exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY);
+    nbd_export_set_name(exp, device);
+}
+
+void qmp_nbd_server_stop(Error **errp)
+{
+    nbd_export_close_all();
+    qemu_set_fd_handler2(server_fd, NULL, NULL, NULL, NULL);
+    close(server_fd);
+    server_fd = -1;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index 3d2b2d1..d792d2c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2275,6 +2275,30 @@
 'opts': 'NetClientOptions' } }
 
 ##
+# @IPSocketAddress
+#
+# Captures the destination address of an IP socket
+#
+# @host: host part of the address
+#
+# @port: port part of the address, or lowest port if @to is present
+#
+# @to: highest port to try
+#
+# @ipv4: whether to accept IPv4 addresses, default try both IPv4 and IPv6
+#
+# @ipv6: whether to accept IPv6 addresses, default try both IPv4 and IPv6
+#
+# Since 1.3
+##
+{ 'type': 'IPSocketAddress',
+  'data': {
+'host': 'str',
+'port': 'str',
+'*ipv4': 'bool',
+'*ipv6': 'bool' } }
+
+##
 # @getfd:
 #
 # Receive a file descriptor via SCM rights and assign it a name
@@ -2454,3 +2478,46 @@
 #
 ##
 { 'command': 'query-fdsets', 'returns': ['FdsetInfo'] }
+
+##
+# @nbd-server-start:
+#
+# Start an NBD server listening on the given host and port.
+#
+# @addr: Interface on which to listen, nothing for all interfaces.
+#
+# Since: 1.3.0
+#
+##
+{ 'command': 'nbd-server-start',
+  'data': { 'addr': 'IPSocketAddress' } }
+
+##
+# @nbd-server-add:
+#
+# Export a device to QEMU's embedded NBD server.
+#
+# @device: Block device to be exported
+#
+# @writable: Whether clients should be able to write to the device via the
+# NBD connection (default false).
+#
+# Returns: error if the device is already marked for export.
+#
+# Since: 1.3.0
+#
+##
+{ 'command': 'nbd-server-add', 'data': {'device': 'str', '*writable': 'bool'} }
+
+##
+#
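
The address-family defaulting in qmp_nbd_server_start() above is compact
enough to be easy to misread, so here is a small standalone sketch of the
same logic. toy_addr and toy_apply_family_defaults() are hypothetical
names for this illustration; the has_* and value fields are modelled on
the optional members of the patch's IPSocketAddress, and the sketch
assumes an absent option is stored as false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the optional ipv4/ipv6 members. */
typedef struct {
    bool has_ipv4;
    bool ipv4;
    bool has_ipv6;
    bool ipv6;
} toy_addr;

/*
 * An option that was not supplied defaults to true, so leaving both out
 * means "try both families".  Returns true when at least one family ends
 * up disabled, i.e. when the restriction has to be passed on explicitly.
 */
static bool toy_apply_family_defaults(toy_addr *a)
{
    a->ipv4 |= !a->has_ipv4;
    a->ipv6 |= !a->has_ipv6;
    return !a->ipv4 || !a->ipv6;
}
```

With neither option given, both families are enabled and nothing extra is
set; with only ipv4=false given, the result is an IPv6-only listener.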

[Qemu-devel] [RFC PATCH 11/13] hmp: add NBD server commands

2012-08-27 Thread Paolo Bonzini

At the HMP level there is no nbd_server_add command.  nbd_server_start
automatically exposes all of the VM's block devices, and an option -w
makes them writable.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hmp-commands.hx | 29 +
 hmp.c   | 66 +
 hmp.h   |  2 ++
 3 files changed, 97 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index f6104b0..cabb886 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1248,6 +1248,35 @@ Remove all matches from the access control list, and set 
the default
 policy back to @code{deny}.
 ETEXI
 
+    {
+        .name       = "nbd_server_start",
+        .args_type  = "writable:-w,uri:s",
+        .params     = "nbd_server_start [-w] host:port",
+        .help       = "serve block devices on the given host and port",
+        .mhandler.cmd = hmp_nbd_server_start,
+    },
+STEXI
+@item nbd_server_start @var{host}:@var{port}
+@findex nbd_server_start
+Start an NBD server on the given host and/or port, and serve all of the
+virtual machine's block devices that have an inserted media on it.
+The @option{-w} option makes the devices writable.
+ETEXI
+
+    {
+        .name       = "nbd_server_stop",
+        .args_type  = "",
+        .params     = "nbd_server_stop",
+        .help       = "stop serving block devices using the NBD protocol",
+        .mhandler.cmd = hmp_nbd_server_stop,
+    },
+STEXI
+@item nbd_server_stop
+@findex nbd_server_stop
+Stop the QEMU embedded NBD server.
+ETEXI
+
+
 #if defined(TARGET_I386)
 
 {
diff --git a/hmp.c b/hmp.c
index a9d5675..10ff50d 100644
--- a/hmp.c
+++ b/hmp.c
@@ -18,6 +18,7 @@
 #include "qemu-option.h"
 #include "qemu-timer.h"
 #include "qmp-commands.h"
+#include "qemu_socket.h"
 #include "monitor.h"
 
 static void hmp_handle_error(Monitor *mon, Error **errp)
@@ -1102,3 +1103,68 @@ void hmp_closefd(Monitor *mon, const QDict *qdict)
     qmp_closefd(fdname, &errp);
     hmp_handle_error(mon, &errp);
 }
+
+void hmp_nbd_server_start(Monitor *mon, const QDict *qdict)
+{
+    const char *uri = qdict_get_str(qdict, "uri");
+    int writable = qdict_get_try_bool(qdict, "writable", 0);
+    Error *errp = NULL;
+    QemuOpts *opts;
+    BlockDriverState *bs;
+    IPSocketAddress addr;
+
+    /* First check if the address is available and start the server.  */
+    opts = qemu_opts_create(&socket_opts, NULL, 0, NULL);
+    if (inet_parse(opts, uri) != 0) {
+        error_set(&errp, QERR_SOCKET_CREATE_FAILED);
+        goto exit;
+    }
+
+    memset(&addr, 0, sizeof(addr));
+    addr.host = (char *) qemu_opt_get(opts, "host");
+    addr.port = (char *) qemu_opt_get(opts, "port");
+    addr.ipv4 = qemu_opt_get_bool(opts, "ipv4", 0);
+    addr.ipv6 = qemu_opt_get_bool(opts, "ipv6", 0);
+    addr.has_ipv4 = addr.has_ipv6 = true;
+
+    if (addr.host == NULL || addr.port == NULL) {
+        error_set(&errp, QERR_SOCKET_CREATE_FAILED);
+        goto exit;
+    }
+
+    qmp_nbd_server_start(&addr, &errp);
+    if (errp != NULL) {
+        goto exit;
+    }
+
+    /* Then try adding all block devices.  If one fails, close all and
+     * exit.
+     */
+    bs = NULL;
+    while ((bs = bdrv_next(bs))) {
+        if (!bdrv_is_inserted(bs)) {
+            continue;
+        }
+
+        qmp_nbd_server_add(bdrv_get_device_name(bs),
+                           true, !bdrv_is_read_only(bs) && writable,
+                           &errp);
+
+        if (errp != NULL) {
+            qmp_nbd_server_stop(NULL);
+            break;
+        }
+    }
+
+exit:
+    qemu_opts_del(opts);
+    hmp_handle_error(mon, &errp);
+}
+
+void hmp_nbd_server_stop(Monitor *mon, const QDict *qdict)
+{
+    Error *errp = NULL;
+
+    qmp_nbd_server_stop(&errp);
+    hmp_handle_error(mon, &errp);
+}
diff --git a/hmp.h b/hmp.h
index 7dd93bf..89d3960 100644
--- a/hmp.h
+++ b/hmp.h
@@ -71,5 +71,7 @@ void hmp_netdev_add(Monitor *mon, const QDict *qdict);
 void hmp_netdev_del(Monitor *mon, const QDict *qdict);
 void hmp_getfd(Monitor *mon, const QDict *qdict);
 void hmp_closefd(Monitor *mon, const QDict *qdict);
+void hmp_nbd_server_start(Monitor *mon, const QDict *qdict);
+void hmp_nbd_server_stop(Monitor *mon, const QDict *qdict);
 
 #endif
-- 
1.7.11.2

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

2012-08-27 Thread Jan Kiszka

On 2012-08-27 15:19, Anthony Liguori wrote:
 Liu Ping Fan qemul...@gmail.com writes:
 
 From: Liu Ping Fan pingf...@linux.vnet.ibm.com

 Scene:
   obja lies in objA; when objA's ref->0, it will be freed,
 but at that time obja can still be in use.

 The real example is:
 typedef struct PCIIDEState {
 PCIDevice dev;
 IDEBus bus[2]; <-- create in place
 .
 }

 Without big lock protection for mmio-dispatch, we will hold the
 obj's refcnt. So memory_region_init_io() will replace the third
 parameter, void *opaque, with Object *obj.
 With this patch, we can protect PCIIDEState from disappearing during
 mmio-dispatch by holding the IDEBus->ref.

 And the ref circle has been broken when calling qdev_delete_subtree().

 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 
 I think this is solving the wrong problem.  There are many, many
 dependencies a device may have on other devices.  Memory allocation
 isn't the only one.
 
 The problem is that we want to make sure that a device doesn't go away
 while an MMIO dispatch is happening.  This is easy to solve without
 touching referencing counting.
 
 The device will hold a lock while the MMIO is being dispatched.  The
 delete path simply needs to acquire that same lock.  This will ensure
 that a delete operation cannot finish while MMIO is still in flight.

That's a bit too simple. Quite a few MMIO/PIO fast-paths will work
without any device-specific locking, e.g. just to read a simple register
value. So we will need reference counting (for devices using private
locks), but on the front-line object: the memory region. That region
will block its owner from disappearing by waiting on dispatch when
someone tries to unregister it.

Also note that holding a lock is easily said but will be more tricky
in practice. Quite a significant share of our code will continue to run
under BQL, even for devices with their own locks. Init/cleanup functions
will likely fall into this category, simply because the surrounding
logic is hard to convert into fine-grained locking and is also not
performance critical. At the same time, we can't take BQL -> device-lock
as we have to support device-lock -> BQL ordering for (slow-path) calls
into BQL-protected areas while holding a per-device lock (e.g. device
mapping changes).
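
As a rough illustration of reference counting on the memory region, here
is a single-threaded sketch of the bookkeeping involved. It is not QEMU
code and makes several assumptions: toy_region and the three functions
are hypothetical names, a real implementation would need atomic
operations or a lock around the counter, and the unregister path would
block or retry instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical region state; not the QEMU MemoryRegion type. */
typedef struct {
    int in_flight;        /* dispatches currently running */
    bool unregistering;   /* set once removal has been requested */
} toy_region;

/* Enter a dispatch; refused once removal has started. */
static bool toy_dispatch_begin(toy_region *r)
{
    if (r->unregistering) {
        return false;
    }
    r->in_flight++;
    return true;
}

/* Leave a dispatch that was previously entered. */
static void toy_dispatch_end(toy_region *r)
{
    assert(r->in_flight > 0);
    r->in_flight--;
}

/*
 * Request removal.  Returns true only once no dispatch is in flight,
 * which is the point where the owner could safely go away.
 */
static bool toy_try_unregister(toy_region *r)
{
    r->unregistering = true;
    return r->in_flight == 0;
}
```

The fast path only touches the counter, so a register read needs no
device-specific lock, while the owner stays pinned until in-flight
dispatches have drained.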

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

[Qemu-devel] [RFC PATCH 07/13] nbd: do not close BlockDriverState in nbd_export_close

2012-08-27 Thread Paolo Bonzini

This is not desirable when embedding the NBD server inside QEMU.
Move the bdrv_close to qemu-nbd.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 nbd.c  | 1 -
 qemu-nbd.c | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/nbd.c b/nbd.c
index fe7551d..1f65b1f 100644
--- a/nbd.c
+++ b/nbd.c
@@ -893,7 +893,6 @@ void nbd_export_close(NBDExport *exp)
 g_free(exp->name);
 }
 
-bdrv_close(exp->bs);
 g_free(exp);
 }
 
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 1c1cf6a..23392e0 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -586,6 +586,7 @@ int main(int argc, char **argv)
 } while (!sigterm_reported && (persistent || !nbd_started || nb_fds > 0));
 
 nbd_export_close(exp);
+bdrv_close(bs);
 if (sockpath) {
 unlink(sockpath);
 }
-- 
1.7.11.2

Re: [Qemu-devel] [PATCH v2 0/6] Running Microport UNIX (ca 1987)

2012-08-27 Thread Anthony Liguori

malc av1...@comtv.ru writes:

 On Mon, 27 Aug 2012, Anthony Liguori wrote:

 malc av1...@comtv.ru writes:
 

 [..snip..]

 
  Number 2 was, and should stay, as the emulation wasn't correct before it,
  don't really care about the rest.
 
 Okay, please revert the rest then.
 

 Done.

Thank you!

Regards,

Anthony Liguori


 [..snip..]

 -- 
 mailto:av1...@comtv.ru

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

2012-08-27 Thread Anthony Liguori

Jan Kiszka jan.kis...@siemens.com writes:

 On 2012-08-27 15:19, Anthony Liguori wrote:
 Liu Ping Fan qemul...@gmail.com writes:
 
 From: Liu Ping Fan pingf...@linux.vnet.ibm.com

 Scene:
   obja lies in objA, when objA's ref->0, it will be freed,
 but at that time obja can still be in use.

 The real example is:
 typedef struct PCIIDEState {
 PCIDevice dev;
 IDEBus bus[2]; --> create in place
 .
 }

 When without big lock protection for mmio-dispatch, we will hold
 obj's refcnt. So memory_region_init_io() will replace the third para
 void *opaque with Object *obj.
 With this patch, we can protect PCIIDEState from disappearing during
 mmio-dispatch by holding the IDEBus->ref.

 And the ref circle has been broken when calling qdev_delete_subtree().

 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
 
 I think this is solving the wrong problem.  There are many, many
 dependencies a device may have on other devices.  Memory allocation
 isn't the only one.
 
 The problem is that we want to make sure that a device doesn't go away
 while an MMIO dispatch is happening.  This is easy to solve without
 touching reference counting.
 
 The device will hold a lock while the MMIO is being dispatched.  The
 delete path simply needs to acquire that same lock.  This will ensure
 that a delete operation cannot finish while MMIO is still in flight.

 That's a bit too simple. Quite a few MMIO/PIO fast-paths will work
 without any device-specific locking, e.g. just to read a simple register
 value. So we will need reference counting

But then you'll need to acquire a lock to take the reference/remove the
reference which sort of defeats the purpose of trying to fast path.

 (for devices using private
 locks), but on the front-line object: the memory region. That region
 will block its owner from disappearing by waiting on dispatch when
 someone tries to unregister it.

 Also note that holding a lock is easily said but will be more tricky
 in practice. Quite a significant share of our code will continue to run
 under BQL, even for devices with their own locks. Init/cleanup functions
 will likely fall into this category,

I'm not sure I'm convinced of this--but it's hard to tell until we
really start converting.

BTW, I'm pretty sure we have to tackle main loop functions first before
we try to convert any devices off the BQL.

Regards,

Anthony Liguori

 simply because the surrounding
 logic is hard to convert into fine-grained locking and is also not
 performance critical. At the same time, we can't take BQL -> device-lock
 as we have to support device-lock -> BQL ordering for (slow-path) calls
 into BQL-protected areas while holding a per-device lock (e.g. device
 mapping changes).



 Jan

 -- 
 Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
 Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 4/9] object: remove object_finalize

2012-08-27 Thread Andreas Färber

Am 26.08.2012 17:51, schrieb Anthony Liguori:
 Callers should just use object_unref
 
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 ---
  hw/qdev.c             |    4 ----
  include/qemu/object.h |    9 ---------
  qom/object.c          |    2 +-
  3 files changed, 1 insertions(+), 14 deletions(-)
 
 diff --git a/hw/qdev.c b/hw/qdev.c
 index 6b61daa..fdee91f 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 @@ -678,13 +678,9 @@ static void device_initfn(Object *obj)
  static void device_finalize(Object *obj)
  {
  DeviceState *dev = DEVICE(obj);
 -BusState *bus;
  DeviceClass *dc = DEVICE_GET_CLASS(dev);
  
  if (dev->state == DEV_STATE_INITIALIZED) {
 -while (dev->num_child_bus) {
 -bus = QLIST_FIRST(&dev->child_bus);
 -}
  if (qdev_get_vmsd(dev)) {
  vmstate_unregister(dev, qdev_get_vmsd(dev), dev);
  }

This seems to answer my remark on 3/9, should've been squashed into that
one.

 diff --git a/include/qemu/object.h b/include/qemu/object.h
 index 487adcd..8bc9935 100644
 --- a/include/qemu/object.h
 +++ b/include/qemu/object.h
 @@ -490,15 +490,6 @@ void object_initialize_with_type(void *data, Type type);
  void object_initialize(void *obj, const char *typename);
  
  /**
 - * object_finalize:
 - * @obj: The object to finalize.
 - *
 - * This function destroys and object without freeing the memory associated with
 - * it.
 - */
 -void object_finalize(void *obj);
 -
 -/**
   * object_dynamic_cast:
   * @obj: The object to cast.
   * @typename: The @typename to cast to.
 diff --git a/qom/object.c b/qom/object.c
 index 44135c3..1144f79 100644
 --- a/qom/object.c
 +++ b/qom/object.c
 @@ -375,7 +375,7 @@ static void object_deinit(Object *obj, TypeImpl *type)
  }
  }
  
 -void object_finalize(void *data)
 +static void object_finalize(void *data)
  {
  Object *obj = data;
  TypeImpl *ti = obj->class->type;

This is what I was referring to with breaking the encapsulation on 3/9:
When we have a PHB with embedded PCIDevice on its PCIBus, as
demonstrated with i440fx and prep_pci, then when doing object_delete()
on the whole thing I expect the main object's finalizer to call
object_finalize() on its embedded children, forcing their uninit or an
assert if a programming error. Not just an unref that might or might not
finalize it.

If however finalize is called only at refcount 0 then who will unref the
self-created children? Finalize would never be called due to pending
references by its children...

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

2012-08-27 Thread Jan Kiszka

On 2012-08-27 17:14, Anthony Liguori wrote:
 Jan Kiszka jan.kis...@siemens.com writes:
 
 On 2012-08-27 15:19, Anthony Liguori wrote:
 Liu Ping Fan qemul...@gmail.com writes:

 From: Liu Ping Fan pingf...@linux.vnet.ibm.com

 Scene:
   obja lies in objA, when objA's ref->0, it will be freed,
 but at that time obja can still be in use.

 The real example is:
 typedef struct PCIIDEState {
 PCIDevice dev;
 IDEBus bus[2]; --> create in place
 .
 }

 When without big lock protection for mmio-dispatch, we will hold
 obj's refcnt. So memory_region_init_io() will replace the third para
 void *opaque with Object *obj.
 With this patch, we can protect PCIIDEState from disappearing during
 mmio-dispatch by holding the IDEBus->ref.

 And the ref circle has been broken when calling qdev_delete_subtree().

 Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com

 I think this is solving the wrong problem.  There are many, many
 dependencies a device may have on other devices.  Memory allocation
 isn't the only one.

 The problem is that we want to make sure that a device doesn't go away
 while an MMIO dispatch is happening.  This is easy to solve without
 touching reference counting.

 The device will hold a lock while the MMIO is being dispatched.  The
 delete path simply needs to acquire that same lock.  This will ensure
 that a delete operation cannot finish while MMIO is still in flight.

 That's a bit too simple. Quite a few MMIO/PIO fast-paths will work
 without any device-specific locking, e.g. just to read a simple register
 value. So we will need reference counting
 
 But then you'll need to acquire a lock to take the reference/remove the
 reference which sort of defeats the purpose of trying to fast path.

Atomic ops? RCU? This problem won't be solved for the first time.
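
As a rough sketch of the atomic-ops variant, assuming C11 atomics and a hypothetical `RefObj` type (not QOM's Object): a reference is taken with a compare-and-swap loop that refuses to revive an object whose count has already reached zero, so the fast path needs no lock. Keeping the object's memory valid while a reader attempts this (e.g. via RCU) is outside the scope of the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; not QEMU's Object. */
typedef struct RefObj {
    atomic_uint refcnt;
} RefObj;

/* A new object starts with one reference held by its creator. */
static void refobj_init(RefObj *o)
{
    atomic_init(&o->refcnt, 1);
}

/* Take a reference unless the count already dropped to zero. */
static bool refobj_get_unless_zero(RefObj *o)
{
    unsigned int old = atomic_load(&o->refcnt);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1)) {
            return true;
        }
    }
    return false;
}

/* Drop a reference. Returns true if this was the last one, in which
 * case the caller is responsible for finalizing the object. */
static bool refobj_put(RefObj *o)
{
    unsigned int old = atomic_fetch_sub(&o->refcnt, 1);

    assert(old > 0);
    return old == 1;
}
```

A dispatcher would call refobj_get_unless_zero() before touching the device and refobj_put() afterwards. If the get fails, the device is already being torn down and the access is dropped.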

 
 (for devices using private
 locks), but on the front-line object: the memory region. That region
 will block its owner from disappearing by waiting on dispatch when
 someone tries to unregister it.

 Also note that holding a lock is easily said but will be more tricky
 in practice. Quite a significant share of our code will continue to run
 under BQL, even for devices with their own locks. Init/cleanup functions
 will likely fall into this category,
 
 I'm not sure I'm convinced of this--but it's hard to tell until we
 really start converting.
 
 BTW, I'm pretty sure we have to tackle main loop functions first before
 we try to convert any devices off the BQL.

I'm sure we should leave existing code alone wherever possible, focusing
on providing alternative versions for those paths that matter. Example:
Most timers are fine under BQL. But some sensitive devices (RTC or HPET
as clock source) will want their own timers. So the approach is to
instantiate a separate, also prioritizeable instance of the timer
subsystem for them and be done.

We won't convert QEMU in a day, but we surely want results before the
last corner is refactored (which would take years, at best).

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
