[Qemu-devel] [PATCH 3/3] vfio-pci: rework of EOI

2012-07-23 Thread Alexey Kardashevskiy
Originally VFIO is coded to support IOAPIC only (i.e. x86).
The patch adds XICS (POWERPC interrupt controller) and replaces
ioapic_add_gsi_eoi_notifier with unified macro to have as little
#ifdef TARGET_PPC64 as possible.

Still needs some rework to get rid of #ifdef TARGET_PPC64.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/vfio_pci.c |   24 
 hw/vfio_pci.h |1 -
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
index fd65731..cd68fe0 100644
--- a/hw/vfio_pci.c
+++ b/hw/vfio_pci.c
@@ -21,7 +21,6 @@
 #include dirent.h
 #include stdio.h
 #include unistd.h
-#include sys/io.h
 #include sys/ioctl.h
 #include sys/mman.h
 #include sys/types.h
@@ -44,6 +43,15 @@
 #include range.h
 #include vfio_pci.h
 
+#ifndef TARGET_PPC64
+#include sys/io.h
+#include ioapic.h
+#define vfio_irq_add_eoi_notifier   ioapic_add_gsi_eoi_notifier
+#else
+#include xics.h
+#define vfio_irq_add_eoi_notifier   xics_add_eoi_notifier
+#endif
+
 //#define DEBUG_VFIO
 #ifdef DEBUG_VFIO
 #define DPRINTF(fmt, ...) \
@@ -258,7 +266,7 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
 irqfd.fd = event_notifier_get_fd(vdev-intx.interrupt);
 
 qemu_set_fd_handler(irqfd.fd, NULL, NULL, vdev);
-ioapic_remove_gsi_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
+notifier_remove(vdev-intx.eoi);
 vfio_mask_intx(vdev);
 vdev-intx.pending = false;
 qemu_set_irq(vdev-pdev.irq[vdev-intx.pin], 0);
@@ -294,7 +302,7 @@ static void vfio_enable_intx_kvm(VFIODevice *vdev)
 return;
 
 fail:
-ioapic_add_gsi_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
+vfio_irq_add_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
 qemu_set_fd_handler(irqfd.fd, vfio_intx_interrupt, NULL, vdev);
 vfio_unmask_intx(vdev);
 #endif
@@ -341,7 +349,7 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
 
 event_notifier_cleanup(vdev-intx.unmask);
 
-ioapic_add_gsi_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
+vfio_irq_add_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
 qemu_set_fd_handler(irqfd.fd, vfio_intx_interrupt, NULL, vdev);
 vfio_unmask_intx(vdev);
 
@@ -366,7 +374,7 @@ static void vfio_update_irq(Notifier *notify, void *data)
 vdev-host.func, vdev-intx.irq, irq);
 
 vfio_disable_intx_kvm(vdev);
-ioapic_remove_gsi_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
+notifier_remove(vdev-intx.eoi);
 
 vdev-intx.irq = irq;
 
@@ -375,7 +383,7 @@ static void vfio_update_irq(Notifier *notify, void *data)
 return;
 }
 
-ioapic_add_gsi_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
+vfio_irq_add_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
 vfio_enable_intx_kvm(vdev);
 
 /* Re-enable the interrupt in cased we missed an EOI */
@@ -404,7 +412,7 @@ static int vfio_enable_intx(VFIODevice *vdev)
 vdev-intx.pin = pin - 1; /* Pin A (1) - irq[0] */
 vdev-intx.irq = pci_get_irq(vdev-pdev, vdev-intx.pin);
 vdev-intx.eoi.notify = vfio_eoi;
-ioapic_add_gsi_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
+vfio_irq_add_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
 
 vdev-intx.update_irq.notify = vfio_update_irq;
 pci_add_irq_update_notifier(vdev-pdev, vdev-intx.update_irq);
@@ -441,7 +449,7 @@ static void vfio_disable_intx(VFIODevice *vdev)
 vfio_disable_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
 
 pci_remove_irq_update_notifier(vdev-intx.update_irq);
-ioapic_remove_gsi_eoi_notifier(vdev-intx.eoi, vdev-intx.irq);
+notifier_remove(vdev-intx.eoi);
 
 fd = event_notifier_get_fd(vdev-intx.interrupt);
 qemu_set_fd_handler(fd, NULL, NULL, vdev);
diff --git a/hw/vfio_pci.h b/hw/vfio_pci.h
index 00bb3dd..d1a7434 100644
--- a/hw/vfio_pci.h
+++ b/hw/vfio_pci.h
@@ -4,7 +4,6 @@
 #include qemu-common.h
 #include qemu-queue.h
 #include pci.h
-#include ioapic.h
 #include event_notifier.h
 
 typedef struct VFIOPCIHostDevice {
-- 
1.7.10.4




Re: [Qemu-devel] Closing an opened telnet monitor?

2012-07-23 Thread Michael Tokarev
On 23.07.2012 02:37, Erik Rull wrote:
 Hi all,
 
 how can I close an open telnet session to my qemu monitor? I didn't find any 
 possiblity beside killing my telnet client on my remote connected system.
 
 Is there a nicer way of doing that? quit is definitively the wrong 
 command :-)

I guess every telnet client supports Ctrl+[ escape sequence.
So the answer to this is --- Ctrl+[, q.  man telnet.

/mjt



Re: [Qemu-devel] [PATCH v7] kvm: notify host when the guest is panicked

2012-07-23 Thread Wen Congyang
At 07/23/2012 04:19 AM, Sasha Levin Wrote:
 On 07/22/2012 09:22 PM, Anthony Liguori wrote:
 Sasha Levin levinsasha...@gmail.com writes:

 On 07/21/2012 09:12 AM, Wen Congyang wrote:
 +#define KVM_PV_PORT   (0x505UL)
 +
  #ifdef __KERNEL__
  #include asm/processor.h
  
 @@ -221,6 +223,11 @@ static inline void kvm_disable_steal_time(void)
  }
  #endif
  
 +static inline unsigned int kvm_arch_pv_features(void)
 +{
 +  return inl(KVM_PV_PORT);
 +}
 +

 Why is this safe?

 I'm not sure you can just pick any ioport you'd like and use it.

 There are three ways I/O ports get used on a PC:

 1) Platform devices
  - This is well defined since the vast majority of platform devices are
implemented within a single chip.  If you're emulating an i440fx
chipset, the PIIX4 spec has an exhaustive list.

 2) PCI devices
  - Typically, PCI only allocates ports starting at 0x0d00 to avoid
conflicts with ISA devices.

 3) ISA devices
  - ISA uses subtractive decoding so any ISA device can access.  In
theory, an ISA device could attempt to use port 0x0505 but it's
unlikely.  In a modern guest, there aren't really any ISA devices being
added either.

 So yes, picking port 0x0505 is safe for something like this (as long as
 you check to make sure that you really are under KVM).
 
 Is there anything that actually prevents me from using PCI ports lower than 
 0x0d00? As you said in (3), ISA isn't really used anymore (nor is implemented 
 by lkvm for example), so placing PCI below 0x0d00 might even make sense in 
 that case.
 
 Furthermore, I can place one of these brand new virtio-mmio devices which got 
 introduced recently wherever I want right now - Having a device that uses 
 0x505 would cause a pretty non-obvious failure mode.
 
 Either way, If we are going to grab an ioport, then:
 
  - It should be documented well somewhere in Documentation/virt/kvm
  - It should go through request_region() to actually claim those ioports.

Good idea.

  - It should fail gracefully if that port is taken for some reason, instead 
 of not even checking it.

Yes, I agree it.

I will update it.

Thanks
Wen Congyang

 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
 




Re: [Qemu-devel] [PATCH v2 4/4] ARM: exynos4210_pmu: Add software reset support

2012-07-23 Thread Maksim Kozlov

20.07.2012 18:32, Peter Maydell пишет:

On 12 July 2012 17:54, Maksim Kozlovm.koz...@samsung.com  wrote:

Signed-off-by: Maksim Kozlovm.koz...@samsung.com
---
  hw/exynos4210_pmu.c |   40 +---
  1 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/hw/exynos4210_pmu.c b/hw/exynos4210_pmu.c
index 7f09c79..96588d9 100644
--- a/hw/exynos4210_pmu.c
+++ b/hw/exynos4210_pmu.c
@@ -18,13 +18,8 @@
   *  with this program; if not, seehttp://www.gnu.org/licenses/.
   */

-/*
- * This model implements PMU registers just as a bulk of memory. Currently,
- * the only reason this device exists is that secondary CPU boot loader
- * uses PMU INFORM5 register as a holding pen.
- */
-
  #include sysbus.h
+#include sysemu.h

  #ifndef DEBUG_PMU
  #define DEBUG_PMU   0
@@ -230,6 +225,8 @@

  #define EXYNOS4210_PMU_REGS_MEM_SIZE 0x3d0c

+#define SWRESET_SYSTEM_MASK 0x0001
+
  typedef struct Exynos4210PmuReg {
  const char  *name; /* for debug only */
  uint32_t offset;
@@ -458,7 +455,17 @@ static void exynos4210_pmu_write(void *opaque, 
target_phys_addr_t offset,
  PRINT_DEBUG_EXTEND(%s [0x%04x]- 0x%04x\n,
   exynos4210_pmu_regs[index].name, (uint32_t)offset, 
(uint32_t)val);

-s-reg[index] = val;
+switch (offset) {
+case SWRESET:
+if (val  SWRESET_SYSTEM_MASK) {
+s-reg[index] = val;
+qemu_system_reset_request();
+}
+break;
+default:
+s-reg[index] = val;
+break;
+}
  }

  static const MemoryRegionOps exynos4210_pmu_ops = {
@@ -477,9 +484,28 @@ static void exynos4210_pmu_reset(DeviceState *dev)
  Exynos4210PmuState *s =
  container_of(dev, Exynos4210PmuState, busdev.qdev);
  unsigned i;
+uint32_t index = exynos4210_pmu_get_register_index(s, SWRESET);
+uint32_t swreset = s-reg[index];

  /* Set default values for registers */
  for (i = 0; i  PMU_NUM_OF_REGISTERS; i++) {
+if (swreset) {
+switch (exynos4210_pmu_regs[i].offset) {
+case INFORM0:
+case INFORM1:
+case INFORM2:
+case INFORM3:
+case INFORM4:
+case INFORM5:
+case INFORM6:
+case INFORM7:
+case PS_HOLD_CONTROL:
+/* keep these registers during SW reset */
+continue;
+default:
+break;
+}
+}
  s-reg[i] = exynos4210_pmu_regs[i].reset_value;
  }
  }


This patch seems to be trying to make a distinction that QEMU doesn't
support, ie between system wide software reset and system wide
hard reset. I'm not convinced about the wisdom of trying to paper
over this lack with a single-device workaround.


Ok. Thank for review

And I found out that this patch contains a mistake, therefore it should 
be rejected in any case




-- PMM








Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/2] spapr: Add support for -vga option

2012-07-23 Thread Benjamin Herrenschmidt
On Wed, 2012-06-27 at 17:25 -0500, Anthony Liguori wrote:

 If a user asks for something and we can't make it work, we should fail.

Note that QEMU's cirrus works fine with the new kernel cirrusdrmfb, so
I say we should allow it (needs to be able to the powerpc device .mak
as well tho.

It shouldn't be primary because old RHEL's iirc used to have cirrusfb
enabled and that would blow due to guest kernel bugs among others.

I haven't had a chance to try whatever userspace DDX the Xorg folks have
come up with to use on top of cirrusdrmfb, so that might need a bit of
fixing but there's no reason not to allow -vga cirrus at this point.

Note to Matthew: cirrusdrmfb is a LOT SLOWER than offb for a similar
SW only dumb framebuffer, probably has to do with the way it does the
dirty stuff, not sure ...

Why not draw directly into the emulated vram ?

Cheers,
Ben.





Re: [Qemu-devel] [PATCH 4/7] RTC: Update the RTC clock only when reading it

2012-07-23 Thread Paolo Bonzini
Il 23/07/2012 07:17, Juan Quintela ha scritto:
 Paolo Bonzini pbonz...@redhat.com wrote:
 From: Zhang, Yang Z yang.z.zh...@intel.com

 Calculate guest RTC based on the time of the last update, instead of
 using timers.  The formula is

 (base_rtc + guest_time_now - guest_time_last_update + offset)

 Base_rtc is the RTC value when the RTC was last updated.
 Guest_time_now is the guest time when the access happens.
 Guest_time_last_update was the guest time when the RTC was last updated.
 Offset is used when divider reset happens or the set bit is toggled.

 The timer is kept in order to signal interrupts, but it only needs to
 run when either UF or AF is cleared.  When the bits are both set, the
 timer does not run.

 UIP is now synthesized when reading register A.  If the timer is not set,
 or if there is more than one second before it (as is the case at the
 end of this series), the leading edge of UIP is computed and the rising
 edge occurs 220us later.  If the update timer occurs within one second,
 however, the rising edge of the AF and UF bits should coincide withe
 the falling edge of UIP.  We do not know exactly when this will happen
 because there could be delays in the servicing of the timer.  Hence, in
 this case reading register A only computes for the rising edge of UIP,
 and latches the bit until the timer is fired and clears it.

 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 @@ -540,11 +593,12 @@ static const VMStateDescription vmstate_rtc = {
  VMSTATE_INT32(current_tm.tm_mday, RTCState),
  VMSTATE_INT32(current_tm.tm_mon, RTCState),
  VMSTATE_INT32(current_tm.tm_year, RTCState),
 +VMSTATE_UINT64(base_rtc, RTCState),
 +VMSTATE_UINT64(last_update, RTCState),
 +VMSTATE_INT64(offset, RTCState),
VMSTATE_UINT64_V(base_rtc, RTCState, 3)
 same ofr the others.
 
 Normally, new fields are added at the end of the structure.

Doesn't really matter if you bump the minimum version but you're right
that it is more correct.

 
  VMSTATE_TIMER(periodic_timer, RTCState),
  VMSTATE_INT64(next_periodic_time, RTCState),
 -VMSTATE_INT64(next_second_time, RTCState),
 -VMSTATE_TIMER(second_timer, RTCState),
 -VMSTATE_TIMER(second_timer2, RTCState),
 +VMSTATE_TIMER(update_timer, RTCState),
 
 I have to read the rest of the patch to know what is the relation of
 this 4 fields, to see if there is any way to create this in any sane
 way that is compatible.

next_second_time is computed like this (see check_update_timer):

guest_nsec = get_guest_rtc_ns(s) % NSEC_PER_SEC;
next_update_time = qemu_get_clock_ns(rtc_clock)
+ NSEC_PER_SEC - guest_nsec;

One of second_timer and second_timer2 is unset, depending on whether
UIP=1 or UIP=0 respectively.

second_timer if set is next_second_time.

second_timer2 if set is next_second_time + get_ticks_per_sec/100.

 The new fields can go in a different subsection.

But then you would transmit the subsection always, and old QEMU doesn't
know how to skip it.  So it doesn't really help.

This is because base_rtc, last_update and offset need to be transmitted
always.  next_alarm_time only if no alarm fired already, but the common
case is yes.  Only for update_timer we can hack and transmit it in the
slot that was used for next_second_time.  Unless you want to transmit
the subsection only for the new machine version, which is never done in
the QEMU tree.

Paolo



Re: [Qemu-devel] [PATCH] ide scsi: Mess with geometry only for hard disk devices

2012-07-23 Thread Markus Armbruster
Ping?

Markus Armbruster arm...@redhat.com writes:

 Legacy -drive cyls=... are now ignored completely when the drive
 doesn't back a hard disk device.  Before, they were first checked
 against a hard disk's limits, then ignored.

 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  hw/ide/qdev.c  |3 ++-
  hw/scsi-disk.c |3 ++-
  2 files changed, 4 insertions(+), 2 deletions(-)

 diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
 index 22e58df..5ea9b8f 100644
 --- a/hw/ide/qdev.c
 +++ b/hw/ide/qdev.c
 @@ -149,7 +149,8 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind 
 kind)
  }
  
  blkconf_serial(dev-conf, dev-serial);
 -if (blkconf_geometry(dev-conf, dev-chs_trans, 65536, 16, 255)  0) {
 +if (kind != IDE_CD
 + blkconf_geometry(dev-conf, dev-chs_trans, 65536, 16, 255)  
 0) {
  return -1;
  }
  
 diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
 index 525816c..318318c 100644
 --- a/hw/scsi-disk.c
 +++ b/hw/scsi-disk.c
 @@ -1750,7 +1750,8 @@ static int scsi_initfn(SCSIDevice *dev)
  }
  
  blkconf_serial(s-qdev.conf, s-serial);
 -if (blkconf_geometry(dev-conf, NULL, 65535, 255, 255)  0) {
 +if (dev-type == TYPE_DISK
 + blkconf_geometry(dev-conf, NULL, 65535, 255, 255)  0) {
  return -1;
  }



Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/2] spapr: Add support for -vga option

2012-07-23 Thread Benjamin Herrenschmidt
On Mon, 2012-07-23 at 16:40 +1000, Benjamin Herrenschmidt wrote:
 
 Note to Matthew: cirrusdrmfb is a LOT SLOWER than offb for a similar
 SW only dumb framebuffer, probably has to do with the way it does the
 dirty stuff, not sure ...
 
 Why not draw directly into the emulated vram ? 

More note to Matthew ... any objection to doing something similar for
qemu bochs style framebuffer ? (ia -vga std). This is the preferred
one on ppc, does 32-bpp fine and is generally simpler.

Cheers,
Ben.





Re: [Qemu-devel] [PATCH 7/7] RTC: Allow to migrate from old QEMU

2012-07-23 Thread Paolo Bonzini
Il 23/07/2012 07:12, Juan Quintela ha scritto:
 .fields  = (VMStateField []) {
 VMSTATE_BUFFER(cmos_data, RTCState),
 VMSTATE_UINT8(cmos_index, RTCState),
 VMSTATE_INT32(current_tm.tm_sec, RTCState),
 VMSTATE_INT32(current_tm.tm_min, RTCState),
 VMSTATE_INT32(current_tm.tm_hour, RTCState),
 VMSTATE_INT32(current_tm.tm_wday, RTCState),
 VMSTATE_INT32(current_tm.tm_mday, RTCState),
 VMSTATE_INT32(current_tm.tm_mon, RTCState),
 VMSTATE_INT32(current_tm.tm_year, RTCState),
 
 you can change this to:
 
 VMSTATE_UNUSED(7*4);

I think this is not safe.  In the load_old case, we ignore the struct tm
because we call rtc_set_time and check_update_timer.  We need this to
make up some values of base_rtc, last_update and offset.

In the normal case we need both the CMOS data and the timer data.
Though perhaps I can make a further change and remove current_tm
altogether.  I'll take a look this week; however that will make it even
harder to produce the old version.

Paolo



Re: [Qemu-devel] [PATCH v2] MP initialization protocol differs between cpu families, and for P6 and onward models it is up to CPU to decide if it will be BSP using this protocol, so try to model this.

2012-07-23 Thread Igor Mammedov

Hello Gleb,

Is this v2 patch more acceptable then v1?



Re: [Qemu-devel] [PATCH 0/4 v2] target-i386: move tcg intialization inside CPU object

2012-07-23 Thread Igor Mammedov

ping.

On 06/25/2012 03:55 PM, Igor Mammedov wrote:

v2:
   - drop usage of prev_debug_excp_handler consistently in all users
   - split from reset patches to avoid confusion of inter-dependency

Compile  Run tested:
   target-i386: tcg and kvm mode
   i386-linux-user: running of /bin/ls
Compile tested:
   xtensa-softmmu  xtensaeb-softmmu

git tree for testing:
   https://github.com/imammedo/qemu/tree/x86cpu_qom_tcg_v2


Igor Mammedov (4):
   target-i386: drop usage of prev_debug_excp_handler
   target-xtensa: drop usage of prev_debug_excp_handler
   cleanup cpu_set_debug_excp_handler
   target-i386: move tcg initialization into x86_cpu_initfn()

  cpu-exec.c |5 +
  exec-all.h |2 +-
  target-i386/cpu.c  |   10 ++
  target-i386/cpu.h  |1 +
  target-i386/helper.c   |   16 +---
  target-xtensa/helper.c |8 +---
  6 files changed, 15 insertions(+), 27 deletions(-)



--
-
 Igor





Re: [Qemu-devel] [PATCH v2] MP initialization protocol differs between cpu families, and for P6 and onward models it is up to CPU to decide if it will be BSP using this protocol, so try to model this.

2012-07-23 Thread Gleb Natapov
On Mon, Jul 23, 2012 at 09:44:05AM +0200, Igor Mammedov wrote:
 Hello Gleb,
 
 Is this v2 patch more acceptable then v1?
Yes. Sorry for not being explicit about it :)

--
Gleb.



Re: [Qemu-devel] [RFC PATCH 2/2] block: gluster as block backend

2012-07-23 Thread Bharata B Rao
On Sun, Jul 22, 2012 at 04:38:00PM +0100, Stefan Hajnoczi wrote:
 On Sat, Jul 21, 2012 at 9:31 AM, Bharata B Rao
 bhar...@linux.vnet.ibm.com wrote:
  +typedef struct GlusterAIOCB {
  +BlockDriverAIOCB common;
  +QEMUIOVector *qiov;
 
 The qiov field is unused.
 
  +char *bounce;
 
 Unused.

Yes, removed these two.

 
  +struct BDRVGlusterState *s;
 
 You can get this through common.bs-opaque, but if you like having a
 shortcut, that's fine.
 
  +int cancelled;
 
 bool

Ok.

 
  +} GlusterAIOCB;
  +
  +typedef struct GlusterCBKData {
  +GlusterAIOCB *acb;
  +struct BDRVGlusterState *s;
  +int64_t size;
  +int ret;
  +} GlusterCBKData;
 
 I think GlusterCBKData could just be part of GlusterAIOCB.  That would
 simplify the code a little and avoid some malloc/free.

Are you suggesting to put a field

GlusterCBKData gcbk;

inside GlusterAIOCB and use gcbk from there or

Are you suggesting that I make the fields of GlusterCBKData part of
GlusterAIOCB and get rid of GlusterCBKData altogether ? This means I would
have to pass the GlusterAIOCB to gluster async calls and update its fields from
gluster callback routine. I can do this, but I am not sure if you can touch
the fields of GlusterAIOCB in non-QEMU threads (gluster callback thread).

 
  +
  +typedef struct BDRVGlusterState {
  +struct glfs *glfs;
  +int fds[2];
  +int open_flags;
  +struct glfs_fd *fd;
  +int qemu_aio_count;
  +int event_reader_pos;
  +GlusterCBKData *event_gcbk;
  +} BDRVGlusterState;
  +
  +#define GLUSTER_FD_READ 0
  +#define GLUSTER_FD_WRITE 1
  +
  +static void qemu_gluster_complete_aio(GlusterCBKData *gcbk)
  +{
  +GlusterAIOCB *acb = gcbk-acb;
  +int ret;
  +
  +if (acb-cancelled) {
 
 Where does cancelled get set?

I realised that I am not supporting bdrv_aio_cancel(). I guess I will have
to add support for this in next version.

 
  +qemu_aio_release(acb);
  +goto done;
  +}
  +
  +if (gcbk-ret == gcbk-size) {
  +ret = 0; /* Success */
  +} else if (gcbk-ret  0) {
  +ret = gcbk-ret; /* Read/Write failed */
  +} else {
  +ret = -EINVAL; /* Partial read/write - fail it */
 
 EINVAL is for invalid arguments.  EIO would be better.

Ok.

 
  +/*
  + * file=protocol:server@port:volname:image
  + */
  +static int qemu_gluster_parsename(GlusterConf *c, const char *filename)
  +{
  +char *file = g_strdup(filename);
  +char *token, *next_token, *saveptr;
  +char *token_s, *next_token_s, *saveptr_s;
  +int ret = -EINVAL;
  +
  +/* Discard the protocol */
  +token = strtok_r(file, :, saveptr);
  +if (!token) {
  +goto out;
  +}
  +
  +/* server@port */
  +next_token = strtok_r(NULL, :, saveptr);
  +if (!next_token) {
  +goto out;
  +}
  +if (strchr(next_token, '@')) {
  +token_s = strtok_r(next_token, @, saveptr_s);
  +if (!token_s) {
  +goto out;
  +}
  +strncpy(c-server, token_s, HOST_NAME_MAX);
 
 strncpy(3) will not NUL-terminate when token_s is HOST_NAME_MAX
 characters long.  QEMU has cutils.c:pstrcpy().

Will use pstrcpy.

 
 When the argument is too long we should probably report an error
 instead of truncating.

Or should we let gluster APIs to flag an error with truncated
server and volume names ?

 
 Same below.
 
  +next_token_s = strtok_r(NULL, @, saveptr_s);
  +if (!next_token_s) {
  +goto out;
  +}
  +c-port = atoi(next_token_s);
 
 No error checking.  If the input is invalid an error message would
 help the user here.

Fixed.

 
  +static struct glfs *qemu_gluster_init(GlusterConf *c, const char *filename)
  +{
  +struct glfs *glfs = NULL;
  +int ret;
  +
  +ret = qemu_gluster_parsename(c, filename);
  +if (ret  0) {
  +errno = -ret;
  +goto out;
  +}
  +
  +glfs = glfs_new(c-volname);
  +if (!glfs) {
  +goto out;
  +}
  +
  +ret = glfs_set_volfile_server(glfs, socket, c-server, c-port);
  +if (ret  0) {
  +goto out;
  +}
  +
  +/*
  + * TODO: Logging is not necessary but instead nice to have.
  + * Can QEMU optionally log into a standard place ?
 
 QEMU prints to stderr, can you do that here too?  The global log file
 is not okay, especially when multiple QEMU instances are running.

Ok, I can do glfs_set_logging(glfs, /dev/stderr, loglevel);

 
  + * Need to use defines like gf_loglevel_t:GF_LOG_INFO instead of
  + * hard coded values like 7 here.
  + */
  +ret = glfs_set_logging(glfs, /tmp/qemu-gluster.log, 7);
  +if (ret  0) {
  +goto out;
  +}
  +
  +ret = glfs_init(glfs);
  +if (ret  0) {
  +goto out;
  +}
  +return glfs;
  +
  +out:
  +if (glfs) {
  +(void)glfs_fini(glfs);
  +}
  +return NULL;
  +}
  +
  +static int qemu_gluster_open(BlockDriverState *bs, const char *filename,

Re: [Qemu-devel] [RFC PATCH 0/2] GlusterFS support in QEMU - v2

2012-07-23 Thread Bharata B Rao
On Sun, Jul 22, 2012 at 03:42:28PM +0100, Stefan Hajnoczi wrote:
 On Sat, Jul 21, 2012 at 9:29 AM, Bharata B Rao
 bhar...@linux.vnet.ibm.com wrote:
  -drive file=gluster:server@port:volname:image
 
  - Here 'gluster' is the protocol.
  - 'server@port' specifies the server where the volume file specification for
the given volume resides. 'port' is the port number on which gluster
management daemon (glusterd) is listening. This is optional and if not
specified, QEMU will send 0 which will make libgfapi to use the default
port.
 
 'server@port' is weird notation.  Normally it is 'server:port' (e.g.
 URLs).  Can you change it?

I don't like but, but settled for it since port was optional and : was
being used as separator here.

 
 What about the other transports supported by libgfapi: UNIX domain
 sockets and RDMA?  My reading of glfs.h is that there are 3 connection
 options:
 1. 'transport': 'socket' (default), 'unix', 'rdma'
 2. 'host': server hostname for 'socket', path to UNIX domain socket
 for 'unix', or something else for 'rdma'
 3. 'port': TCP port when 'socket' is used.  Ignored otherwise.
 
 Unfortunately QEMU block drivers cannot take custom options yet.  That
 would make it possible to cleanly map these connection options and
 save you from inventing syntax which doesn't expose all options.
 
 In the meantime it would be nice if the syntax exposed all options.

So without the capability to pass custom options to block drivers, am I forced
to keep extending the file= with more and more options ?

file=gluster:transport:server:port:volname:image ?

Looks ugly and not easy to make any particular option optional. If needed I can
support this from GlusterFS backend.

 
  Note that we are no longer using volfiles directly and use volume names
  instead. For this to work, gluster management daemon (glusterd) needs to
  be running on the QEMU node. This limits the QEMU user to access the 
  volumes by
  the default volfiles that are generated by gluster CLI. This should be
  fine as long as gluster CLI provides the capability to generate or 
  regenerate
  volume files for a given volume with the xlator set that QEMU user is
  interested in. GlusterFS developers tell me that this can be provided with
  some enhancements to Gluster CLI/glusterd. Note that the custom volume files
  is typically needed when GlusterFS server is co-located with QEMU in
  which case it would  be beneficial to get rid of client-server overhead and
  RPC communication overhead.
 
 My knowledge of GlusterFS is limited.  Here is what I am thinking:
 
 1. The user cannot specify a local configuration file, you require
 that there is a glusterd running which provides configuration
 information.

Yes. User only specifies a volume name and glusterd is used to fetch
the right volume file for that volume name.

 2. It is currently not possible to bypass RPC because the glusterd
 managed configuration file doesn't support that.

It is possible. Gluster already supports custom extensions
to volume names and it is possible to use the required volfile by specifying
this custom volname extension.

For eg, if I have a volume named test, by default the volfile used for
it will be test-fuse.vol. Currently I can put my own custom volfile into
the standard location and get glusterd pick that up. I can specify
test.rpcbypass as volname and glusterd will pick test.rpcbypass.vol.

What is currently not supported is the ability to create test.rpcbypass.vol
from gluster CLI. I believe that gluster developers are ok with enhancing
gluster CLI to support generating/regenerating volfiles for a given volume
with custom translator set.

 
 I'm not sure if these statements are true?
 
 Would you support local volfiles in the future again?  Why force users
 to run glusterd?

I will let gluster folks on CC to answer this and let us know the benefits
of always depending on glusterd.

I guess running glusterd would be beneficial when supporting migration. QEMU
working from a local volume (with volname=test.rpcbypass) can be easily
restarted on a different node by just changing volname to test. glusterd will
take care of fetching the right volfile automatically for us.

 
  - As mentioned above, the VM image on gluster volume can be specified like
this:
  -drive file=gluster:localhost:testvol:/F17,format=gluster
 
Note that format=gluster is not needed ideally and its a work around I 
  have
until libgfapi provides a working connection cleanup routine 
  (glfs_fini()).
When the format isn't specified, QEMU figures out the format by doing
find_image_format that results in one open and close before opening the
image file long term for standard read and write. Gluster connection
initialization is done from open and connection termination is done from
close. But since glfs_fini() isn't working yet, I am bypassing
find_image_format by specifying format=gluster directly which results in
just one open and 

Re: [Qemu-devel] [RFC seabios PATCH] enumerate APIC IDs directly from CPUs

2012-07-23 Thread Gleb Natapov
On Thu, Jul 19, 2012 at 04:46:35PM -0300, Eduardo Habkost wrote:
 On Thu, Jul 19, 2012 at 11:28:54AM -0300, Eduardo Habkost wrote:
  On Thu, Jul 19, 2012 at 12:58:46PM +0300, Gleb Natapov wrote:
   On Tue, Jul 17, 2012 at 06:56:30PM -0300, Eduardo Habkost wrote:
This patch is an attempt to fix the non-continguous-APIC-ID problem 
without the
FW_CFG_LAPIC_INFO approach I have sent proposed last week.

Basically, this changes Seabios to probe for APIC IDs directly from the
CPUs on boot, instead of getting it using fw_cfg, store the found APIC
IDs on a bitmap, and use that information whe building the MADT, SRAT,
and SSDT ACPI tables.

To do this properly, we have to decide the meaning of CPU IDs in the
QEMU-Seabios interfaces, too. I see two possible approaches:

1) Have Seabios and QEMU agree on a a CPU identifier, that is
   independent from the APIC ID.
2) Always use the Initial APIC ID on all communication between QEMU and
   Seabios.

   We need to be prepared to support more than 255 cpus. With  255 cpus
   comes x2apic and x2apic has 32bit apic ids. HW does not have to support
   all of the bits though, but potentially all the bitmasks can grow
   prohibitedly large.
  
  I see only two solutions:
  
  - Specify an interface/convention for QEMU and Seabios agree upon a CPU
identifier = x2apic = LAPIC ID mapping for all CPUs.
  - Specify new NUMA-information and CPU hotplug interfaces (or extend the
existing ones) based on x2apic ID, when Seabios start supporting
x2apic.
  
  I am not particularly inclined towards any of those two solutions. I
  dislike them equally.  :-)
 
 Oh, it is simpler than I have expected. x2APIC specification:
 
 The local APIC ID is initialized by hardware with a 32 bit ID (x2APIC
 ID). The lowest 8 bits of the x2APIC ID is the legacy local xAPIC ID,
 and is stored in the upper 8 bits of the APIC register for access in
 xAPIC mode.
 
 And the ACPI specification:
 
 Logical processors with APIC ID values of 255 and greater are required
 to have a Processor Device object and must convey the processor’s APIC
 information to OSPM using the Processor Local X2APIC structure. Logical
 processors with APIC ID values less than 255 must use the Processor
 Local APIC structure to convey their APIC information to OSPM.
 
 That means the x2APIC ID and the xAPIC ID are interchangeable, for
 values = 255. That means the QEMU=Seabios communication can be safely
 based on APIC IDs without any ambiguity.
 
Yes for = 255 they interchangeable. That's why we can add +x2apic to
our cpu models without changes to the BIOS.

 The CPU hotplug interface is a bit of a problem because it is based on a
 256-bit bitmap. But on the day it gets extended to support more than 256
 CPUs, it can safely be based on APIC IDs and still keep compatibility
 with systems without x2APIC.
The bitmap will have to be extended if we will go beyond 256 cpus. Using
apic-id to index the bitmap means that the size of the bitmap is a
function of max apic-id we want to support, not max number of cpus.

 
 So, now I am strongly inclined towards the second option from the list
 above: just use APIC IDs everywhere to identify CPUs when QEMU and
 Seabios communicate with each other, and QEMU can completely ignore the
 Processor ID used by Seabios.
I agree with making Processor ID Seabios internal thing.

 
  
  Note that I am more worried about the QEMU-Seabios interfaces. The
  APIC ID bitmap on smp.c, for example, is just an implementation detail:
  if we make Seabios support x2apic, that code can be changed to use a
  different data structure instead.
  
 [...]
To try to make things less likely to break on the common, non-hotplug
case, this patch makes the Processor IDs be chosen this way:

- The CPUs present on boot get contiguous Processor IDs (even if the
  APIC IDs are not contiguous);
- The remaining Processor declarations are going to associated to the
  remaining APIC IDs (immediately after the last present APIC ID),
  sequentially. This means that hotplugged CPUs may not get contiguous
  Processor declarations if their APIC IDs are not contiguous.

   I am curious what will happen if cpu will be hot plugged, than hibernate
   and resume is done. After resume hot plugged cpu will have different
   Processor ID in ACPI. This may or may not be a problem.
  
  True. Keeping those tables stable after hotplug and hibernate may be a
  challenge.
  
  Maybe it would be easier to just leave holes on the MADT and SSDT tables
  (making APIC ID and Processor ID always match), and hope no OS will be
  confused by the holes.
 
 I am inclined to try this approach first (keep APIC ID == ACPI Processor
 ID), to keep things simple in Seabios. I am hoping no OS will have
 problems with the holes in the list of enabled Processor IDs.
 
They shouldn't.

--
Gleb.



Re: [Qemu-devel] [RFC PATCH 2/2] block: gluster as block backend

2012-07-23 Thread Stefan Hajnoczi
On Mon, Jul 23, 2012 at 9:32 AM, Bharata B Rao
bhar...@linux.vnet.ibm.com wrote:
 On Sun, Jul 22, 2012 at 04:38:00PM +0100, Stefan Hajnoczi wrote:
 On Sat, Jul 21, 2012 at 9:31 AM, Bharata B Rao
 bhar...@linux.vnet.ibm.com wrote:
  +} GlusterAIOCB;
  +
  +typedef struct GlusterCBKData {
  +GlusterAIOCB *acb;
  +struct BDRVGlusterState *s;
  +int64_t size;
  +int ret;
  +} GlusterCBKData;

 I think GlusterCBKData could just be part of GlusterAIOCB.  That would
 simplify the code a little and avoid some malloc/free.

 Are you suggesting to put a field

 GlusterCBKData gcbk;

 inside GlusterAIOCB and use gcbk from there or

 Are you suggesting that I make the fields of GlusterCBKData part of
 GlusterAIOCB and get rid of GlusterCBKData altogether ? This means I would
 have to pass the GlusterAIOCB to gluster async calls and update its fields 
 from
 gluster callback routine. I can do this, but I am not sure if you can touch
 the fields of GlusterAIOCB in non-QEMU threads (gluster callback thread).

The fields in GlusterCBKData could become part of GlusterAIOCB.
Different threads can access fields in a struct, they just need to
ensure access is synchronized if they touch the same fields.  In the
case of this code I think there is nothing that requires
synchronization beyond the pipe mechanism that you already use to
complete processing in a QEMU thread.

 When the argument is too long we should probably report an error
 instead of truncating.

 Or should we let gluster APIs to flag an error with truncated
 server and volume names ?

What if the truncated name is a valid but different object?  For example:
Max chars = 5
Objects:
helloworld
hello

If helloworld is truncated to hello we get no error back because
it's a valid object!

We need to either check sizes explicitly without truncating or use a
g_strdup() approach without any size limits and let the gfapi
functions error out if the input string is too long.

  +static struct glfs *qemu_gluster_init(GlusterConf *c, const char 
  *filename)
  +{
  +struct glfs *glfs = NULL;
  +int ret;
  +
  +ret = qemu_gluster_parsename(c, filename);
  +if (ret  0) {
  +errno = -ret;
  +goto out;
  +}
  +
  +glfs = glfs_new(c-volname);
  +if (!glfs) {
  +goto out;
  +}
  +
  +ret = glfs_set_volfile_server(glfs, socket, c-server, c-port);
  +if (ret  0) {
  +goto out;
  +}
  +
  +/*
  + * TODO: Logging is not necessary but instead nice to have.
  + * Can QEMU optionally log into a standard place ?

 QEMU prints to stderr, can you do that here too?  The global log file
 is not okay, especially when multiple QEMU instances are running.

 Ok, I can do glfs_set_logging(glfs, /dev/stderr, loglevel);

Yes.  I think - is best since it is supported by gfapi
(libglusterfs/src/logging.c:gf_log_init).  /dev/stderr is not POSIX.

  + * Need to use defines like gf_loglevel_t:GF_LOG_INFO instead of
  + * hard coded values like 7 here.
  + */
  +ret = glfs_set_logging(glfs, /tmp/qemu-gluster.log, 7);
  +if (ret  0) {
  +goto out;
  +}
  +
  +ret = glfs_init(glfs);
  +if (ret  0) {
  +goto out;
  +}
  +return glfs;
  +
  +out:
  +if (glfs) {
  +(void)glfs_fini(glfs);
  +}
  +return NULL;
  +}
  +
  +static int qemu_gluster_open(BlockDriverState *bs, const char *filename,
  +int bdrv_flags)
  +{
  +BDRVGlusterState *s = bs-opaque;
  +GlusterConf *c = g_malloc(sizeof(GlusterConf));

 Can this be allocated on the stack?

 It consists of PATH_MAX(4096), HOST_NAME_MAX(255) and GLUSTERD_MAX_VOLUME_NAME
 (1000). A bit heavy to be on stack ?

This is userspace, stacks are big but it's up to you.

Stefan



Re: [Qemu-devel] [RFC PATCH 0/2] GlusterFS support in QEMU - v2

2012-07-23 Thread Daniel P. Berrange
On Sat, Jul 21, 2012 at 01:59:17PM +0530, Bharata B Rao wrote:
 Hi,
 
 Here is the v2 patchset for supporting GlusterFS protocol from QEMU.
 
 This set of patches enables QEMU to boot VM images from gluster volumes.
 This is achieved by adding gluster as a new block backend driver in QEMU.
 Its already possible to boot from VM images on gluster volumes, but this
 patchset provides the ability to boot VM images from gluster volumes by
 by-passing the FUSE layer in gluster. In case the image is present on the
 local system, it is possible to even bypass client and server translator and
 hence the RPC overhead.
 
 The major change in this version is to not implement libglusterfs based
 gluster backend within QEMU but instead use libgfapi. libgfapi library
 from GlusterFS project provides APIs to access gluster volumes directly.
 With the use of libgfapi, the specification of gluster backend from QEMU
 matches more closely with the GlusterFS's way of specifying volumes. We now
 specify the gluster backed image like this:
 
 -drive file=gluster:server@port:volname:image
 
 - Here 'gluster' is the protocol.
 - 'server@port' specifies the server where the volume file specification for
   the given volume resides. 'port' is the port number on which gluster
   management daemon (glusterd) is listening. This is optional and if not
   specified, QEMU will send 0 which will make libgfapi to use the default
   port.
 - 'volname' is the name of the gluster volume which contains the VM image.
 - 'image' is the path to the actual VM image in the gluster volume.

I don't think we should be using '@' as a field separator here, when :
can do that job just fine. In addition we already have a precendent set
with the sheepdog driver for using ':' for separating all fields:

  -drive file=sheepdog:example.org:6000:imagename

If you want to allow for port number to be omitted, this can be handled
thus:

  -drive file=sheepdog:example.org::imagename

which is how -chardev deals with omitted port numbers

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [RFC PATCH 0/2] GlusterFS support in QEMU - v2

2012-07-23 Thread Stefan Hajnoczi
On Mon, Jul 23, 2012 at 9:50 AM, Bharata B Rao
bhar...@linux.vnet.ibm.com wrote:
 On Sun, Jul 22, 2012 at 03:42:28PM +0100, Stefan Hajnoczi wrote:
 On Sat, Jul 21, 2012 at 9:29 AM, Bharata B Rao
 bhar...@linux.vnet.ibm.com wrote:
  -drive file=gluster:server@port:volname:image
 
  - Here 'gluster' is the protocol.
  - 'server@port' specifies the server where the volume file specification 
  for
the given volume resides. 'port' is the port number on which gluster
management daemon (glusterd) is listening. This is optional and if not
specified, QEMU will send 0 which will make libgfapi to use the default
port.

 'server@port' is weird notation.  Normally it is 'server:port' (e.g.
 URLs).  Can you change it?

 I don't like but, but settled for it since port was optional and : was
 being used as separator here.


 What about the other transports supported by libgfapi: UNIX domain
 sockets and RDMA?  My reading of glfs.h is that there are 3 connection
 options:
 1. 'transport': 'socket' (default), 'unix', 'rdma'
 2. 'host': server hostname for 'socket', path to UNIX domain socket
 for 'unix', or something else for 'rdma'
 3. 'port': TCP port when 'socket' is used.  Ignored otherwise.

 Unfortunately QEMU block drivers cannot take custom options yet.  That
 would make it possible to cleanly map these connection options and
 save you from inventing syntax which doesn't expose all options.

 In the meantime it would be nice if the syntax exposed all options.

 So without the capability to pass custom options to block drivers, am I forced
 to keep extending the file= with more and more options ?

 file=gluster:transport:server:port:volname:image ?

 Looks ugly and not easy to make any particular option optional. If needed I 
 can
 support this from GlusterFS backend.

Kevin, Markus: Any thoughts on passing options to block drivers?
Encoding GlusterFS options into a filename string is pretty
cumbersome.



Re: [Qemu-devel] [RFC PATCH 0/2] GlusterFS support in QEMU - v2

2012-07-23 Thread ronnie sahlberg
Why not use

-drive file=gluster://server[:port]/volname/image

A great many protocols today use the form

protocol://server:port]/path

so this would make it consistent with a lot of other naming schemes
out there, and imho make
the url more intuitive.


FTP looks like this :   ftp://user:password@host:port/path
NFS looks like this :   nfs://host:porturl-path
CIFS looks like this :
smb://[[[authdomain;]user@]host[:port][/share[/path][/name]]][?context]

For iSCSI we use  :   iscsi://server[:port]/target/lun


(The iscsi syntax was picked explicitely to be consistent with the
de-facto url naming scheme.)


I would argue that this is the de-facto way to create a url for
different protocols,   so it would imho be natural to specify a
glusterfs url in a similar format.


ronnie sahlberg


On Mon, Jul 23, 2012 at 7:16 PM, Daniel P. Berrange berra...@redhat.com wrote:
 On Sat, Jul 21, 2012 at 01:59:17PM +0530, Bharata B Rao wrote:
 Hi,

 Here is the v2 patchset for supporting GlusterFS protocol from QEMU.

 This set of patches enables QEMU to boot VM images from gluster volumes.
 This is achieved by adding gluster as a new block backend driver in QEMU.
 Its already possible to boot from VM images on gluster volumes, but this
 patchset provides the ability to boot VM images from gluster volumes by
 by-passing the FUSE layer in gluster. In case the image is present on the
 local system, it is possible to even bypass client and server translator and
 hence the RPC overhead.

 The major change in this version is to not implement libglusterfs based
 gluster backend within QEMU but instead use libgfapi. libgfapi library
 from GlusterFS project provides APIs to access gluster volumes directly.
 With the use of libgfapi, the specification of gluster backend from QEMU
 matches more closely with the GlusterFS's way of specifying volumes. We now
 specify the gluster backed image like this:

 -drive file=gluster:server@port:volname:image

 - Here 'gluster' is the protocol.
 - 'server@port' specifies the server where the volume file specification for
   the given volume resides. 'port' is the port number on which gluster
   management daemon (glusterd) is listening. This is optional and if not
   specified, QEMU will send 0 which will make libgfapi to use the default
   port.
 - 'volname' is the name of the gluster volume which contains the VM image.
 - 'image' is the path to the actual VM image in the gluster volume.

 I don't think we should be using '@' as a field separator here, when :
 can do that job just fine. In addition we already have a precendent set
 with the sheepdog driver for using ':' for separating all fields:

   -drive file=sheepdog:example.org:6000:imagename

 If you want to allow for port number to be omitted, this can be handled
 thus:

   -drive file=sheepdog:example.org::imagename

 which is how -chardev deals with omitted port numbers

 Regards,
 Daniel
 --
 |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o- http://virt-manager.org :|
 |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|




Re: [Qemu-devel] [RFC PATCH 0/2] GlusterFS support in QEMU - v2

2012-07-23 Thread ronnie sahlberg
Stefan,

in iscsi,  i just specify those extra arguments that are required that
are not part of the url itself as just command line options :


qemu-system-i386 -iscsi initiator-name=iqn.qemu.test:my-initiator \
-boot d -drive file=iscsi://127.0.0.1/iqn.qemu.test/1 \
-cdrom iscsi://127.0.0.1/iqn.qemu.test/2

Here  initiator-name is a custom option to the iscsi layer to tell it
which name to use when identifying/logging in to the target.

Similar concept could be used by glusterfs ?

regards
ronnie sahlberg



On Mon, Jul 23, 2012 at 7:20 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Mon, Jul 23, 2012 at 9:50 AM, Bharata B Rao
 bhar...@linux.vnet.ibm.com wrote:
 On Sun, Jul 22, 2012 at 03:42:28PM +0100, Stefan Hajnoczi wrote:
 On Sat, Jul 21, 2012 at 9:29 AM, Bharata B Rao
 bhar...@linux.vnet.ibm.com wrote:
  -drive file=gluster:server@port:volname:image
 
  - Here 'gluster' is the protocol.
  - 'server@port' specifies the server where the volume file specification 
  for
the given volume resides. 'port' is the port number on which gluster
management daemon (glusterd) is listening. This is optional and if not
specified, QEMU will send 0 which will make libgfapi to use the default
port.

 'server@port' is weird notation.  Normally it is 'server:port' (e.g.
 URLs).  Can you change it?

 I don't like but, but settled for it since port was optional and : was
 being used as separator here.


 What about the other transports supported by libgfapi: UNIX domain
 sockets and RDMA?  My reading of glfs.h is that there are 3 connection
 options:
 1. 'transport': 'socket' (default), 'unix', 'rdma'
 2. 'host': server hostname for 'socket', path to UNIX domain socket
 for 'unix', or something else for 'rdma'
 3. 'port': TCP port when 'socket' is used.  Ignored otherwise.

 Unfortunately QEMU block drivers cannot take custom options yet.  That
 would make it possible to cleanly map these connection options and
 save you from inventing syntax which doesn't expose all options.

 In the meantime it would be nice if the syntax exposed all options.

 So without the capability to pass custom options to block drivers, am I 
 forced
 to keep extending the file= with more and more options ?

 file=gluster:transport:server:port:volname:image ?

 Looks ugly and not easy to make any particular option optional. If needed I 
 can
 support this from GlusterFS backend.

 Kevin, Markus: Any thoughts on passing options to block drivers?
 Encoding GlusterFS options into a filename string is pretty
 cumbersome.




Re: [Qemu-devel] [RFC PATCH 0/2] GlusterFS support in QEMU - v2

2012-07-23 Thread Stefan Hajnoczi
On Mon, Jul 23, 2012 at 10:34 AM, ronnie sahlberg
ronniesahlb...@gmail.com wrote:
 in iscsi,  i just specify those extra arguments that are required that
 are not part of the url itself as just command line options :


 qemu-system-i386 -iscsi initiator-name=iqn.qemu.test:my-initiator \
 -boot d -drive file=iscsi://127.0.0.1/iqn.qemu.test/1 \
 -cdrom iscsi://127.0.0.1/iqn.qemu.test/2

 Here  initiator-name is a custom option to the iscsi layer to tell it
 which name to use when identifying/logging in to the target.

 Similar concept could be used by glusterfs ?

That works for global options only.  If it's per-drive then this
approach can't be used because different drives might need different
option values.

Stefan



Re: [Qemu-devel] [RFC PATCH 0/2] GlusterFS support in QEMU - v2

2012-07-23 Thread Vijay Bellur

On 07/23/2012 02:20 PM, Bharata B Rao wrote:




2. It is currently not possible to bypass RPC because the glusterd
managed configuration file doesn't support that.


It is possible. Gluster already supports custom extensions
to volume names and it is possible to use the required volfile by specifying
this custom volname extension.

For eg, if I have a volume named test, by default the volfile used for
it will be test-fuse.vol. Currently I can put my own custom volfile into
the standard location and get glusterd pick that up. I can specify
test.rpcbypass as volname and glusterd will pick test.rpcbypass.vol.

What is currently not supported is the ability to create test.rpcbypass.vol
from gluster CLI. I believe that gluster developers are ok with enhancing
gluster CLI to support generating/regenerating volfiles for a given volume
with custom translator set.



Yes, this would be the preferred approach. We can tune the volume file 
generation to evolve the desired configuration file.






I'm not sure if these statements are true?

Would you support local volfiles in the future again?  Why force users
to run glusterd?


I will let gluster folks on CC to answer this and let us know the benefits
of always depending on glusterd.

I guess running glusterd would be beneficial when supporting migration. QEMU
working from a local volume (with volname=test.rpcbypass) can be easily
restarted on a different node by just changing volname to test. glusterd will
take care of fetching the right volfile automatically for us.


Yes, running glusterd would be beneficial in migration. Without deriving 
the file from glusterd features like volume tuning, client monitoring 
etc. would not be available to to clients that talk to a gluster volume. 
Additionally, driving configuration generation and management through 
glusterd helps in standardizing and stabilizing gluster configurations.








Has libgfapi been released yet?


Its part of gluster mainline now.


Does it have versioning which will
allow the QEMU GlusterFS block driver to build against different
versions?  I'm just wondering how the pieces will fit together once
distros start shipping them.


I request gluster folks on CC to comment about version and shipping
information.



There is no release that contains libgfapi as yet. Once that is done, we 
can probably have the dependency specified in QEMU as well.


Regards,
Vijay




Re: [Qemu-devel] [PATCH 3/3] migration: block migration when streaming block jobs are running.

2012-07-23 Thread Stefan Hajnoczi
On Fri, Jul 20, 2012 at 8:32 PM,  benoit.ca...@gmail.com wrote:
 From: Benoît Canet ben...@irqsave.net

 Signed-off-by: Benoit Canet ben...@irqsave.net
 ---
  migration.c |5 +
  1 file changed, 5 insertions(+)

 diff --git a/migration.c b/migration.c
 index 8db1b43..dfce680 100644
 --- a/migration.c
 +++ b/migration.c
 @@ -425,6 +425,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
  return;
  }

 +if (bdrv_have_block_jobs()) {
 +error_set(errp, QERR_STREAMING_BLOCKS_MIGRATION);
 +return;
 +}

I think bdrv_have_block_jobs() is too specific and would use
bdrv_in_use(bs) here to give basically an EBUSY-type error.

Stefan



Re: [Qemu-devel] [PATCH 1/3] block: Add bdrv_have_block_jobs() so migration code abort if needed.

2012-07-23 Thread Paolo Bonzini
Il 20/07/2012 21:32, benoit.ca...@gmail.com ha scritto:
 +bool bdrv_have_block_jobs(void)
 +{
 +BlockDriverState *bs;
 +
 +QTAILQ_FOREACH(bs, bdrv_states, list) {
 +if (bs-job || bdrv_in_use(bs)) {
 +return true;
 +}
 +bdrv_close(bs);

Why close the device here?

Paolo

 +}
 +
 +return false;
 +}
 +





Re: [Qemu-devel] [PATCH] ide scsi: Mess with geometry only for hard disk devices

2012-07-23 Thread Paolo Bonzini
Il 23/07/2012 09:25, Markus Armbruster ha scritto:
 Ping?
 
 Markus Armbruster arm...@redhat.com writes:
 
 Legacy -drive cyls=... are now ignored completely when the drive
 doesn't back a hard disk device.  Before, they were first checked
 against a hard disk's limits, then ignored.

 Signed-off-by: Markus Armbruster arm...@redhat.com
 ---
  hw/ide/qdev.c  |3 ++-
  hw/scsi-disk.c |3 ++-
  2 files changed, 4 insertions(+), 2 deletions(-)

 diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
 index 22e58df..5ea9b8f 100644
 --- a/hw/ide/qdev.c
 +++ b/hw/ide/qdev.c
 @@ -149,7 +149,8 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind 
 kind)
  }
  
  blkconf_serial(dev-conf, dev-serial);
 -if (blkconf_geometry(dev-conf, dev-chs_trans, 65536, 16, 255)  0) {
 +if (kind != IDE_CD
 + blkconf_geometry(dev-conf, dev-chs_trans, 65536, 16, 255)  
 0) {
  return -1;
  }
  
 diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
 index 525816c..318318c 100644
 --- a/hw/scsi-disk.c
 +++ b/hw/scsi-disk.c
 @@ -1750,7 +1750,8 @@ static int scsi_initfn(SCSIDevice *dev)
  }
  
  blkconf_serial(s-qdev.conf, s-serial);
 -if (blkconf_geometry(dev-conf, NULL, 65535, 255, 255)  0) {
 +if (dev-type == TYPE_DISK
 + blkconf_geometry(dev-conf, NULL, 65535, 255, 255)  0) {
  return -1;
  }
 
 

Acked-by: Paolo Bonzini pbonz...@redhat.com

Kevin, are you taking this patch?

Paolo




Re: [Qemu-devel] [PATCH 3/3] migration: block migration when streaming block jobs are running.

2012-07-23 Thread Benoît Canet
Would

int bdrv_are_busy(void)
{
BlockDriverState *bs;

QTAILQ_FOREACH(bs, bdrv_states, list) {
if (bs-job || bdrv_in_use(bs)) {
return -EBUSY;
}
}

return 0;
}

be more acceptable ?

Benoît

Le Monday 23 Jul 2012 à 10:55:41 (+0100), Stefan Hajnoczi a écrit :
 On Fri, Jul 20, 2012 at 8:32 PM,  benoit.ca...@gmail.com wrote:
  From: Benoît Canet ben...@irqsave.net
 
  Signed-off-by: Benoit Canet ben...@irqsave.net
  ---
   migration.c |5 +
   1 file changed, 5 insertions(+)
 
  diff --git a/migration.c b/migration.c
  index 8db1b43..dfce680 100644
  --- a/migration.c
  +++ b/migration.c
  @@ -425,6 +425,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool 
  blk,
   return;
   }
 
  +if (bdrv_have_block_jobs()) {
  +error_set(errp, QERR_STREAMING_BLOCKS_MIGRATION);
  +return;
  +}
 
 I think bdrv_have_block_jobs() is too specific and would use
 bdrv_in_use(bs) here to give basically an EBUSY-type error.
 
 Stefan



Re: [Qemu-devel] [PATCH 1/3] block: Add bdrv_have_block_jobs() so migration code abort if needed.

2012-07-23 Thread Benoît Canet
 Why close the device here?

Thanks

Benoît



Re: [Qemu-devel] [PATCH 3/3] migration: block migration when streaming block jobs are running.

2012-07-23 Thread Stefan Hajnoczi
On Mon, Jul 23, 2012 at 11:17 AM, Benoît Canet benoit.ca...@irqsave.net wrote:
 Would

 int bdrv_are_busy(void)
 {
 BlockDriverState *bs;

 QTAILQ_FOREACH(bs, bdrv_states, list) {
 if (bs-job || bdrv_in_use(bs)) {
 return -EBUSY;
 }
 }

 return 0;
 }

 be more acceptable ?

Looks fine to me.

Stefan



[Qemu-devel] [PATCH] pc: Fix max_cpus

2012-07-23 Thread riegamaths
From: Dunrong Huang riegama...@gmail.com

The VCPU count limit in kernel now is 254, defined by KVM_MAX_VCPUS
in kernel's header files. But the count limit in QEMU is 255,
so QEMU will failed to start if user passes -enable-kvm and -smp 255
to it.

This patch intruduces a Macro MAX_VCPUS whose value is KVM_MAX_VCPUS
if CONFIG_KVM is defined. If user do not use kvm, set it's value to 255.

Signed-off-by: Dunrong Huang riegama...@gmail.com
---
 hw/pc_piix.c |   28 +++-
 1 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 0c0096f..49cda51 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -49,6 +49,16 @@
 
 #define MAX_IDE_BUS 2
 
+#ifndef KVM_MAX_VCPUS
+#define KVM_MAX_VCPUS 254
+#endif
+
+#ifdef CONFIG_KVM
+#define MAX_VCPUS KVM_MAX_VCPUS
+#else
+#define MAX_VCPUS 255
+#endif
+
 static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
 static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
 static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
@@ -354,7 +364,7 @@ static QEMUMachine pc_machine_v1_2 = {
 .alias = pc,
 .desc = Standard PC,
 .init = pc_init_pci,
-.max_cpus = 255,
+.max_cpus = MAX_VCPUS,
 .is_default = 1,
 };
 
@@ -381,7 +391,7 @@ static QEMUMachine pc_machine_v1_1 = {
 .name = pc-1.1,
 .desc = Standard PC,
 .init = pc_init_pci,
-.max_cpus = 255,
+.max_cpus = MAX_VCPUS,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_1_1,
 { /* end of list */ }
@@ -416,7 +426,7 @@ static QEMUMachine pc_machine_v1_0 = {
 .name = pc-1.0,
 .desc = Standard PC,
 .init = pc_init_pci,
-.max_cpus = 255,
+.max_cpus = MAX_VCPUS,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_1_0,
 { /* end of list */ }
@@ -431,7 +441,7 @@ static QEMUMachine pc_machine_v0_15 = {
 .name = pc-0.15,
 .desc = Standard PC,
 .init = pc_init_pci,
-.max_cpus = 255,
+.max_cpus = MAX_VCPUS,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_0_15,
 { /* end of list */ }
@@ -463,7 +473,7 @@ static QEMUMachine pc_machine_v0_14 = {
 .name = pc-0.14,
 .desc = Standard PC,
 .init = pc_init_pci,
-.max_cpus = 255,
+.max_cpus = MAX_VCPUS,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_0_14, 
 {
@@ -496,7 +506,7 @@ static QEMUMachine pc_machine_v0_13 = {
 .name = pc-0.13,
 .desc = Standard PC,
 .init = pc_init_pci_no_kvmclock,
-.max_cpus = 255,
+.max_cpus = MAX_VCPUS,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_0_13,
 {
@@ -533,7 +543,7 @@ static QEMUMachine pc_machine_v0_12 = {
 .name = pc-0.12,
 .desc = Standard PC,
 .init = pc_init_pci_no_kvmclock,
-.max_cpus = 255,
+.max_cpus = MAX_VCPUS,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_0_12,
 {
@@ -566,7 +576,7 @@ static QEMUMachine pc_machine_v0_11 = {
 .name = pc-0.11,
 .desc = Standard PC, qemu 0.11,
 .init = pc_init_pci_no_kvmclock,
-.max_cpus = 255,
+.max_cpus = MAX_VCPUS,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_0_11,
 {
@@ -587,7 +597,7 @@ static QEMUMachine pc_machine_v0_10 = {
 .name = pc-0.10,
 .desc = Standard PC, qemu 0.10,
 .init = pc_init_pci_no_kvmclock,
-.max_cpus = 255,
+.max_cpus = MAX_VCPUS,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_0_11,
 {
-- 
1.7.8.6




Re: [Qemu-devel] [PATCH] pc: Fix max_cpus

2012-07-23 Thread Andreas Färber
Am 23.07.2012 12:47, schrieb riegama...@gmail.com:
 From: Dunrong Huang riegama...@gmail.com
 
 The VCPU count limit in kernel now is 254, defined by KVM_MAX_VCPUS
 in kernel's header files. But the count limit in QEMU is 255,
 so QEMU will failed to start if user passes -enable-kvm and -smp 255
 to it.
 
 This patch intruduces a Macro MAX_VCPUS whose value is KVM_MAX_VCPUS
 if CONFIG_KVM is defined. If user do not use kvm, set it's value to 255.
 
 Signed-off-by: Dunrong Huang riegama...@gmail.com
 ---
  hw/pc_piix.c |   28 +++-
  1 files changed, 19 insertions(+), 9 deletions(-)
 
 diff --git a/hw/pc_piix.c b/hw/pc_piix.c
 index 0c0096f..49cda51 100644
 --- a/hw/pc_piix.c
 +++ b/hw/pc_piix.c
 @@ -49,6 +49,16 @@
  
  #define MAX_IDE_BUS 2
  
 +#ifndef KVM_MAX_VCPUS
 +#define KVM_MAX_VCPUS 254
 +#endif
 +
 +#ifdef CONFIG_KVM
 +#define MAX_VCPUS KVM_MAX_VCPUS
 +#else
 +#define MAX_VCPUS 255
 +#endif
 +
  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
  static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
 @@ -354,7 +364,7 @@ static QEMUMachine pc_machine_v1_2 = {
  .alias = pc,
  .desc = Standard PC,
  .init = pc_init_pci,
 -.max_cpus = 255,
 +.max_cpus = MAX_VCPUS,
  .is_default = 1,
  };
  
[snip]

This is not so ideal: -enable-kvm is a runtime switch whereas you are
changing a compile-time limit here. Any chance to change the runtime
usage of .max_cpus instead? Possibly introducing a helper function?

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg





Re: [Qemu-devel] Qemu-devel Digest, Vol 112, Issue 522

2012-07-23 Thread Paulo Arcinas
On Jul 23, 2012 5:36 PM, qemu-devel-requ...@nongnu.org wrote:

 Send Qemu-devel mailing list submissions to
 qemu-devel@nongnu.org

 To subscribe or unsubscribe via the World Wide Web, visit
 https://lists.nongnu.org/mailman/listinfo/qemu-devel
 or, via email, send a message with subject or body 'help' to
 qemu-devel-requ...@nongnu.org

 You can reach the person managing the list at
 qemu-devel-ow...@nongnu.org

 When replying, please edit your Subject line so it is more specific
 than Re: Contents of Qemu-devel digest...


 Today's Topics:

1. Re: [RFC PATCH 2/2] block: gluster as block backend
   (Stefan Hajnoczi)
2. Re: [RFC PATCH 0/2] GlusterFS support in QEMU - v2
   (Daniel P. Berrange)
3. Re: [RFC PATCH 0/2] GlusterFS support in QEMU - v2
   (Stefan Hajnoczi)
4. Re: [RFC PATCH 0/2] GlusterFS support in QEMU - v2
   (ronnie sahlberg)
5. Re: [RFC PATCH 0/2] GlusterFS support in QEMU - v2
   (ronnie sahlberg)
6. Re: [RFC PATCH 0/2] GlusterFS support in QEMU - v2
   (Stefan Hajnoczi)


 --

 Message: 1
 Date: Mon, 23 Jul 2012 10:06:40 +0100
 From: Stefan Hajnoczi stefa...@gmail.com
 To: bhar...@linux.vnet.ibm.com
 Cc: Amar Tumballi ama...@redhat.com, Anand Avati
 aav...@redhat.com,qemu-devel@nongnu.org, Vijay Bellur
 vbel...@redhat.com
 Subject: Re: [Qemu-devel] [RFC PATCH 2/2] block: gluster as block
 backend
 Message-ID:
 
 cajsp0qu+exbjp9e_r2gaewhszkkaj8rpajoqdcouhy5jzpt...@mail.gmail.com
 Content-Type: text/plain; charset=ISO-8859-1

 On Mon, Jul 23, 2012 at 9:32 AM, Bharata B Rao
 bhar...@linux.vnet.ibm.com wrote:
  On Sun, Jul 22, 2012 at 04:38:00PM +0100, Stefan Hajnoczi wrote:
  On Sat, Jul 21, 2012 at 9:31 AM, Bharata B Rao
  bhar...@linux.vnet.ibm.com wrote:
   +} GlusterAIOCB;
   +
   +typedef struct GlusterCBKData {
   +GlusterAIOCB *acb;
   +struct BDRVGlusterState *s;
   +int64_t size;
   +int ret;
   +} GlusterCBKData;
 
  I think GlusterCBKData could just be part of GlusterAIOCB.  That would
  simplify the code a little and avoid some malloc/free.
 
  Are you suggesting to put a field
 
  GlusterCBKData gcbk;
 
  inside GlusterAIOCB and use gcbk from there or
 
  Are you suggesting that I make the fields of GlusterCBKData part of
  GlusterAIOCB and get rid of GlusterCBKData altogether ? This means I
 would
  have to pass the GlusterAIOCB to gluster async calls and update its
 fields from
  gluster callback routine. I can do this, but I am not sure if you can
 touch
  the fields of GlusterAIOCB in non-QEMU threads (gluster callback thread).

 The fields in GlusterCBKData could become part of GlusterAIOCB.
 Different threads can access fields in a struct, they just need to
 ensure access is synchronized if they touch the same fields.  In the
 case of this code I think there is nothing that requires
 synchronization beyond the pipe mechanism that you already use to
 complete processing in a QEMU thread.

  When the argument is too long we should probably report an error
  instead of truncating.
 
  Or should we let gluster APIs to flag an error with truncated
  server and volume names ?

 What if the truncated name is a valid but different object?  For example:
 Max chars = 5
 Objects:
 helloworld
 hello

 If helloworld is truncated to hello we get no error back because
 it's a valid object!

 We need to either check sizes explicitly without truncating or use a
 g_strdup() approach without any size limits and let the gfapi
 functions error out if the input string is too long.

   +static struct glfs *qemu_gluster_init(GlusterConf *c, const char
 *filename)
   +{
   +struct glfs *glfs = NULL;
   +int ret;
   +
   +ret = qemu_gluster_parsename(c, filename);
   +if (ret  0) {
   +errno = -ret;
   +goto out;
   +}
   +
   +glfs = glfs_new(c-volname);
   +if (!glfs) {
   +goto out;
   +}
   +
   +ret = glfs_set_volfile_server(glfs, socket, c-server,
 c-port);
   +if (ret  0) {
   +goto out;
   +}
   +
   +/*
   + * TODO: Logging is not necessary but instead nice to have.
   + * Can QEMU optionally log into a standard place ?
 
  QEMU prints to stderr, can you do that here too?  The global log file
  is not okay, especially when multiple QEMU instances are running.
 
  Ok, I can do glfs_set_logging(glfs, /dev/stderr, loglevel);

 Yes.  I think - is best since it is supported by gfapi
 (libglusterfs/src/logging.c:gf_log_init).  /dev/stderr is not POSIX.

   + * Need to use defines like gf_loglevel_t:GF_LOG_INFO instead of
   + * hard coded values like 7 here.
   + */
   +ret = glfs_set_logging(glfs, /tmp/qemu-gluster.log, 7);
   +if (ret  0) {
   +goto out;
   +}
   +
   +ret = glfs_init(glfs);
   +if (ret  0) {
   +goto out;
   +}
   +return glfs;
   +
   

Re: [Qemu-devel] Qemu-devel Digest, Vol 112, Issue 523

2012-07-23 Thread Paulo Arcinas
On Jul 23, 2012 6:47 PM, qemu-devel-requ...@nongnu.org wrote:

 Send Qemu-devel mailing list submissions to
 qemu-devel@nongnu.org

 To subscribe or unsubscribe via the World Wide Web, visit
 https://lists.nongnu.org/mailman/listinfo/qemu-devel
 or, via email, send a message with subject or body 'help' to
 qemu-devel-requ...@nongnu.org

 You can reach the person managing the list at
 qemu-devel-ow...@nongnu.org

 When replying, please edit your Subject line so it is more specific
 than Re: Contents of Qemu-devel digest...


 Today's Topics:

1. Re: [RFC PATCH 0/2] GlusterFS support in QEMU - v2 (Vijay Bellur)
2. Re: [PATCH 3/3] migration: block migration when streaming
   block jobs are running. (Stefan Hajnoczi)
3. Re: [PATCH 1/3] block: Add bdrv_have_block_jobs() so
   migration code abort if needed. (Paolo Bonzini)
4. Re: [PATCH] ide scsi: Mess with geometry only for harddisk
   devices (Paolo Bonzini)
5. Re: [PATCH 3/3] migration: block migration when streaming
   block jobs are running. (Beno?t Canet)
6. Re: [PATCH 1/3] block: Add bdrv_have_block_jobs() so
   migration code abort if needed. (Beno?t Canet)
7. Re: [PATCH 3/3] migration: block migration when streaming
   block jobs are running. (Stefan Hajnoczi)
8. [PATCH] pc: Fix max_cpus (riegama...@gmail.com)


 --

 Message: 1
 Date: Mon, 23 Jul 2012 15:06:06 +0530
 From: Vijay Bellur vbel...@redhat.com
 To: bhar...@linux.vnet.ibm.com
 Cc: Stefan Hajnoczi stefa...@gmail.com, Anand Avati
 aav...@redhat.com,qemu-devel@nongnu.org, Amar Tumballi
 ama...@redhat.com
 Subject: Re: [Qemu-devel] [RFC PATCH 0/2] GlusterFS support in QEMU -
 v2
 Message-ID: 500d1b06.3020...@redhat.com
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed

 On 07/23/2012 02:20 PM, Bharata B Rao wrote:

 
  2. It is currently not possible to bypass RPC because the glusterd
  managed configuration file doesn't support that.
 
  It is possible. Gluster already supports custom extensions
  to volume names and it is possible to use the required volfile by
specifying
  this custom volname extension.
 
  For eg, if I have a volume named test, by default the volfile used for
  it will be test-fuse.vol. Currently I can put my own custom volfile into
  the standard location and get glusterd pick that up. I can specify
  test.rpcbypass as volname and glusterd will pick test.rpcbypass.vol.
 
  What is currently not supported is the ability to create
test.rpcbypass.vol
  from gluster CLI. I believe that gluster developers are ok with
enhancing
  gluster CLI to support generating/regenerating volfiles for a given
volume
  with custom translator set.


 Yes, this would be the preferred approach. We can tune the volume file
 generation to evolve the desired configuration file.

 
 
  I'm not sure if these statements are true?
 
  Would you support local volfiles in the future again?  Why force users
  to run glusterd?
 
  I will let gluster folks on CC to answer this and let us know the
benefits
  of always depending on glusterd.
 
  I guess running glusterd would be beneficial when supporting migration.
QEMU
  working from a local volume (with volname=test.rpcbypass) can be easily
  restarted on a different node by just changing volname to test.
glusterd will
  take care of fetching the right volfile automatically for us.

 Yes, running glusterd would be beneficial in migration. Without deriving
 the file from glusterd features like volume tuning, client monitoring
 etc. would not be available to to clients that talk to a gluster volume.
 Additionally, driving configuration generation and management through
 glusterd helps in standardizing and stabilizing gluster configurations.

 

 
  Has libgfapi been released yet?
 
  Its part of gluster mainline now.
 
  Does it have versioning which will
  allow the QEMU GlusterFS block driver to build against different
  versions?  I'm just wondering how the pieces will fit together once
  distros start shipping them.
 
  I request gluster folks on CC to comment about version and shipping
  information.
 

 There is no release that contains libgfapi as yet. Once that is done, we
 can probably have the dependency specified in QEMU as well.

 Regards,
 Vijay




 --

 Message: 2
 Date: Mon, 23 Jul 2012 10:55:41 +0100
 From: Stefan Hajnoczi stefa...@gmail.com
 To: benoit.ca...@gmail.com
 Cc: Kevin Wolf kw...@redhat.com, Beno?t Canet ben...@irqsave.net,
 qemu-devel@nongnu.org, stefa...@linux.vnet.ibm.com
 Subject: Re: [Qemu-devel] [PATCH 3/3] migration: block migration when
 streaming block jobs are running.
 Message-ID:
 
cajsp0qx6krazmkxmfvwhxjdizg6x5uvhqzgshvfqk+tsrxe...@mail.gmail.com
 Content-Type: text/plain; charset=ISO-8859-1

 On Fri, Jul 20, 2012 at 8:32 PM,  benoit.ca...@gmail.com wrote:
  From: Beno?t 

Re: [Qemu-devel] [PATCH v2 4/4] ARM: exynos4210_pmu: Add software reset support

2012-07-23 Thread Maksim Kozlov

20.07.2012 18:32, Peter Maydell пишет:

On 12 July 2012 17:54, Maksim Kozlovm.koz...@samsung.com  wrote:

Signed-off-by: Maksim Kozlovm.koz...@samsung.com
---
  hw/exynos4210_pmu.c |   40 +---
  1 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/hw/exynos4210_pmu.c b/hw/exynos4210_pmu.c
index 7f09c79..96588d9 100644
--- a/hw/exynos4210_pmu.c
+++ b/hw/exynos4210_pmu.c
@@ -18,13 +18,8 @@
   *  with this program; if not, seehttp://www.gnu.org/licenses/.
   */

-/*
- * This model implements PMU registers just as a bulk of memory. Currently,
- * the only reason this device exists is that secondary CPU boot loader
- * uses PMU INFORM5 register as a holding pen.
- */
-
  #include sysbus.h
+#include sysemu.h

  #ifndef DEBUG_PMU
  #define DEBUG_PMU   0
@@ -230,6 +225,8 @@

  #define EXYNOS4210_PMU_REGS_MEM_SIZE 0x3d0c

+#define SWRESET_SYSTEM_MASK 0x0001
+
  typedef struct Exynos4210PmuReg {
  const char  *name; /* for debug only */
  uint32_t offset;
@@ -458,7 +455,17 @@ static void exynos4210_pmu_write(void *opaque, 
target_phys_addr_t offset,
  PRINT_DEBUG_EXTEND(%s [0x%04x]- 0x%04x\n,
   exynos4210_pmu_regs[index].name, (uint32_t)offset, 
(uint32_t)val);

-s-reg[index] = val;
+switch (offset) {
+case SWRESET:
+if (val  SWRESET_SYSTEM_MASK) {
+s-reg[index] = val;
+qemu_system_reset_request();
+}
+break;
+default:
+s-reg[index] = val;
+break;
+}
  }

  static const MemoryRegionOps exynos4210_pmu_ops = {
@@ -477,9 +484,28 @@ static void exynos4210_pmu_reset(DeviceState *dev)
  Exynos4210PmuState *s =
  container_of(dev, Exynos4210PmuState, busdev.qdev);
  unsigned i;
+uint32_t index = exynos4210_pmu_get_register_index(s, SWRESET);
+uint32_t swreset = s-reg[index];

  /* Set default values for registers */
  for (i = 0; i  PMU_NUM_OF_REGISTERS; i++) {
+if (swreset) {
+switch (exynos4210_pmu_regs[i].offset) {
+case INFORM0:
+case INFORM1:
+case INFORM2:
+case INFORM3:
+case INFORM4:
+case INFORM5:
+case INFORM6:
+case INFORM7:
+case PS_HOLD_CONTROL:
+/* keep these registers during SW reset */
+continue;
+default:
+break;
+}
+}
  s-reg[i] = exynos4210_pmu_regs[i].reset_value;
  }
  }

This patch seems to be trying to make a distinction that QEMU doesn't
support, ie between system wide software reset and system wide
hard reset. I'm not convinced about the wisdom of trying to paper
over this lack with a single-device workaround.
Peter, if we add callback (*swreset)() into DeviceClass, whether it will 
be an acceptable solution? And devices which have different behavior 
during swreset can register different function for that. What do you 
think about this?


-- PMM




--
Best regards,
Maksim Kozlov
Leading Engineer
Advanced Software Group,
Samsung Moscow Research Center
e-mail: m.koz...@samsung.com
Tel: 7(495) 797-25-00 ext. 3949





Re: [Qemu-devel] [PATCH v2] MP initialization protocol differs between cpu families, and for P6 and onward models it is up to CPU to decide if it will be BSP using this protocol, so try to model this.

2012-07-23 Thread Andreas Färber
Am 12.07.2012 15:22, schrieb Igor Mammedov:
 This patch:
  - moves decision to designate BSP from board into cpu, making cpu
 self-sufficient in this regard. Later it will allow to cleanup hw/pc.c
 and remove cpu_reset and wrappers from there.
  - stores flag that CPU is BSP in IA32_APIC_BASE to model behavior
 described in Inted SDM vol 3a part 1 chapter 8.4.1
  - uses MSR_IA32_APICBASE_BSP flag in apic_base for checking if cpu is BSP
 
 patch is based on Jan Kiszka's proposal:
 http://thread.gmane.org/gmane.comp.emulators.qemu/100806
 
 v2:
   - fix build for i386-linux-user
   spotted-by: Peter Maydell peter.mayd...@linaro.org
 v3:
   - style change requested by Andreas Färber afaer...@suse.de
 
 v4:
   - reuse cpu_is_bsp() rather than open code check if apicbase has BSP bit set
   requested by  Gleb Natapov g...@redhat.com
   - hijacked Andreas' patch [1] to use X86CPU instead of CPUX86State in
 cpu_is_bsp()
 
   1)  http://lists.gnu.org/archive/html/qemu-devel/2012-05/msg03185.html
 
 Signed-off-by: Igor Mammedov imamm...@redhat.com
 ---
  hw/apic.h|5 -
  hw/apic_common.c |   16 +---
  hw/pc.c  |9 -
  target-i386/cpu.c|   18 ++
  target-i386/helper.c |1 -
  target-i386/kvm.c|4 +++-
  6 files changed, 38 insertions(+), 15 deletions(-)
 
 diff --git a/hw/apic.h b/hw/apic.h
 index 62179ce..4da10b6 100644
 --- a/hw/apic.h
 +++ b/hw/apic.h
 @@ -20,9 +20,12 @@ void apic_init_reset(DeviceState *s);
  void apic_sipi(DeviceState *s);
  void apic_handle_tpr_access_report(DeviceState *d, target_ulong ip,
 TPRAccess access);
 +void apic_designate_bsp(DeviceState *d);
  
  /* pc.c */
 -int cpu_is_bsp(CPUX86State *env);
  DeviceState *cpu_get_current_apic(void);
  
 +/* cpu.c */
 +bool cpu_is_bsp(X86CPU *cpu);
 +
  #endif
 diff --git a/hw/apic_common.c b/hw/apic_common.c
 index 60b8259..58e63b0 100644
 --- a/hw/apic_common.c
 +++ b/hw/apic_common.c
 @@ -43,8 +43,8 @@ uint64_t cpu_get_apic_base(DeviceState *d)
  trace_cpu_get_apic_base((uint64_t)s-apicbase);
  return s-apicbase;
  } else {
 -trace_cpu_get_apic_base(0);
 -return 0;
 +trace_cpu_get_apic_base(MSR_IA32_APICBASE_BSP);
 +return MSR_IA32_APICBASE_BSP;
  }
  }
  
 @@ -201,13 +201,23 @@ void apic_init_reset(DeviceState *d)
  s-timer_expiry = -1;
  }
  
 +void apic_designate_bsp(DeviceState *d)
 +{
 +if (d == NULL) {
 +return;
 +}
 +
 +APICCommonState *s = APIC_COMMON(d);
 +s-apicbase |= MSR_IA32_APICBASE_BSP;
 +}
 +
  static void apic_reset_common(DeviceState *d)
  {
  APICCommonState *s = DO_UPCAST(APICCommonState, busdev.qdev, d);
  APICCommonClass *info = APIC_COMMON_GET_CLASS(s);
  bool bsp;
  
 -bsp = cpu_is_bsp(s-cpu_env);
 +bsp = cpu_is_bsp(x86_env_get_cpu(s-cpu_env));
  s-apicbase = 0xfee0 |
  (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
  
 diff --git a/hw/pc.c b/hw/pc.c
 index c7e9ab3..50c1715 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -871,12 +871,6 @@ void pc_init_ne2k_isa(ISABus *bus, NICInfo *nd)
  nb_ne2k++;
  }
  
 -int cpu_is_bsp(CPUX86State *env)
 -{
 -/* We hard-wire the BSP to the first CPU. */
 -return env-cpu_index == 0;
 -}
 -
  DeviceState *cpu_get_current_apic(void)
  {
  if (cpu_single_env) {
 @@ -927,10 +921,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int 
 level)
  static void pc_cpu_reset(void *opaque)
  {
  X86CPU *cpu = opaque;
 -CPUX86State *env = cpu-env;
 -
  cpu_reset(CPU(cpu));
 -env-halted = !cpu_is_bsp(env);
  }
  
  static X86CPU *pc_new_cpu(const char *cpu_model)
 diff --git a/target-i386/cpu.c b/target-i386/cpu.c
 index 5521709..0c38b7f 100644
 --- a/target-i386/cpu.c
 +++ b/target-i386/cpu.c
 @@ -1686,6 +1686,24 @@ static void x86_cpu_reset(CPUState *s)
  env-dr[7] = DR7_FIXED_1;
  cpu_breakpoint_remove_all(env, BP_CPU);
  cpu_watchpoint_remove_all(env, BP_CPU);
 +
 +#if !defined(CONFIG_USER_ONLY)
 +/* We hard-wire the BSP to the first CPU. */
 +if (env-cpu_index == 0) {
 +apic_designate_bsp(env-apic_state);
 +}
 +
 +env-halted = !cpu_is_bsp(cpu);
 +#endif
 +}
 +
 +#ifndef CONFIG_USER_ONLY
 +bool cpu_is_bsp(X86CPU *cpu)
 +{
 +return cpu_get_apic_base(cpu-env.apic_state)  MSR_IA32_APICBASE_BSP;
 +}
 +#endif
 +
  }

I'm okay with this approach too, but I think the above brace is a merge
conflict?

Did you git-grep for cpu_is_bsp to be sure you caught all usages?

Andreas

  
  static void mce_init(X86CPU *cpu)
 diff --git a/target-i386/helper.c b/target-i386/helper.c
 index d3af6ea..b748d90 100644
 --- a/target-i386/helper.c
 +++ b/target-i386/helper.c
 @@ -1191,7 +1191,6 @@ void do_cpu_init(X86CPU *cpu)
  env-interrupt_request = sipi;
  env-pat = pat;
  apic_init_reset(env-apic_state);
 -env-halted = !cpu_is_bsp(env);
  }
  
  void 

Re: [Qemu-devel] [PATCH 2/6] s390: sclp base support

2012-07-23 Thread Christian Borntraeger
Andreas,

thanks for having a look. I no other comments arrive today I will resubmit the 
patch 
set with all comments addressed.

On 20/07/12 16:06, Andreas Färber wrote:
 +#ifdef DEBUG_HELPER
 +printf(KVM: invalid sclp call 0x%x / 0x% PRIx64 x\n, sccb, 
 code);
 
 sccb is a pointer so %x seems wrong for 64-bit host. %p?

yes its broken.
We will probably just remove that, since it will be changed by followon patch.


 +typedef struct S390SCLPDevice {
 +DeviceState qdev;
 
 I note that this is probably the first device directly derived from
 DeviceState. This should be possible now after having merged the QOM
 QBus series. Test case to check is info qdm from the monitor.

info qdm seems to work. what test did you have in mind?

[...]
  //#define DEBUG_S390
  
 @@ -61,6 +62,7 @@
  #define MAX_BLK_DEVS10
  
  static VirtIOS390Bus *s390_bus;
 +SCLPS390Bus *sclp_bus;
 
 I don't like this so much. We shouldn't be following that example and
 adding global state for each bus there. Can't we add a child property to
 the machine for the hosting device so that we can obtain that via QOM
 path and access the bus from there?

Since Blue already compained Heinz already moved that bus into s390-sclp.c as a
static variable without any export since we only touch that there. We also added
an comment that by architecture definition there is only one sclp per machine, 
so 
this should be fine.


 --- a/hw/s390x/Makefile.objs
 +++ b/hw/s390x/Makefile.objs
 @@ -1,3 +1,4 @@
  obj-y = s390-virtio-bus.o s390-virtio.o
 +obj-y += s390-sclp.o
  
  obj-y := $(addprefix ../,$(obj-y))
 
 IIUC we have a dependency on the CPU inside the device code. If so,
 please move the s390-sclp.c file into hw/s390x/ and move the obj-y +=
 line below the addprefix line.

Ok.

 diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
 index 619b202..8900872 100644
 --- a/target-s390x/cpu.c
 +++ b/target-s390x/cpu.c
 @@ -21,9 +21,26 @@
   */
  
  #include cpu.h
 +#include kvm.h
  #include qemu-common.h
  #include qemu-timer.h
  
 +/* service interrupts are floating therefore we must not pass an cpustate */
 +void s390_sclp_extint(uint32_t parm)
 +{
 +S390CPU *dummy_cpu = s390_cpu_addr2state(0);
 +CPUS390XState *env = dummy_cpu-env;
 +
 +if (kvm_enabled()) {
 +#ifdef CONFIG_KVM
 +kvm_s390_interrupt_internal(env, KVM_S390_INT_SERVICE, parm, 0, 1);
 +#endif
 +} else {
 +env-psw.addr += 4;
 +cpu_inject_ext(env, EXT_SERVICE, parm, 0);
 +}
 +}
 
 Not so happy about this placement, it is not called s390_cpu_, the
 prototype is in cpu.h not cpu-qom.h and it operates only indirectly on
 the CPU. Is there another place for it?

Not yet. I will try to introduce a new one, e.g. target-s390x/interrupt.c
We can then move other code there as well. 




Re: [Qemu-devel] [PATCH v2 4/4] ARM: exynos4210_pmu: Add software reset support

2012-07-23 Thread Peter Maydell
On 23 July 2012 12:01, Maksim Kozlov m.koz...@samsung.com wrote:
 20.07.2012 18:32, Peter Maydell пишет:
 This patch seems to be trying to make a distinction that QEMU doesn't
 support, ie between system wide software reset and system wide
 hard reset. I'm not convinced about the wisdom of trying to paper
 over this lack with a single-device workaround.

 Peter, if we add callback (*swreset)() into DeviceClass, whether it will be
 an acceptable solution? And devices which have different behavior during
 swreset can register different function for that. What do you think about
 this?

I think reset is a key part of device lifecycle and adding support
for 'warm reset' or whatever needs general discussion so we can get
the semantics right... cc'ing a couple of people who might have opinions.

-- PMM



Re: [Qemu-devel] [Bug 992067] Re: Windows 2008R2 very slow cold boot when 4GB memory

2012-07-23 Thread Owen Tuz
We have been experiencing this problem for a while now too, using qemu-kvm 
(currently at 1.1.1). 

Unfortunately, hv_relaxed doesn't seem to fix it. The following command line 
produces the issue:

qemu-kvm -nodefaults -m 4096 -smp 8 -cpu host,hv_relaxed -vga cirrus -usbdevice 
tablet -vnc :99 -monitor stdio -hda
test.img

The hardware consists of dual AMD Opteron 6128 processors (16 cores in total) 
and 64GB of memory. This command line was tested on kernel 3.1.4. 

I've also tested with -no-hpet.

What I have seen is much as described: the memory fills out slowly, and top on 
the host will show the process using
100% on all allocated CPU cores. The most extreme case was a machine which took 
something between 6 and 8 hours to boot.

This seems to be related to the assigned memory, as described, but also the 
number of processor cores (which makes
sense if we believe it's a timing issue?). I have seen slow-booting guests 
improved by switching down to a single or even two cores.

Matthew, I agree that this seems to be linked to the number of VMs running - in 
fact, shutting down other VMs on a dedicated test host caused the machine to 
start booting at a normal speed (with no reboot required).

However, the level of contention is never such that this could be explained by 
the host simply being overcommitted.

If it helps anyone, there's an image of the hard drive I've been using to test 
at:

http://46.20.114.253/

It's 5G of gzip file containing a fairly standard Windows 2008 trial 
installation. Since it's in the trial period, anyone who wants to use it may 
have to re-arm the trial: http://support.microsoft.com/kb/948472 

Please let me know if I can provide any more information, or test anything.

Best wishes,

Owen Tuz



[Qemu-devel] [PATCH V2 0/3] Block migration when streaming block jobs are running

2012-07-23 Thread benoit . canet
From: Benoît Canet ben...@irqsave.net

This patchset is designed to avoid starting a live migration while one or more
streaming block jobs are running.

Tested with the following sequence:
QEMU 1.1.50 monitor - type 'help' for more information
(qemu) block_stream virtio0 1k
(qemu) migrate tcp:localhost:
migrate: Streaming blocks migration
(qemu) migrate tcp:localhost:
migrate: Streaming blocks migration
(qemu) block_job_cancel virtio0
(qemu) migrate tcp:localhost:
migrate: Connection can not be completed immediately
(qemu)
= migration then succeed

in v2:
stefanha: Rename bdrv_have_block_jobs() to bdrv_are_busy() and make it return 
-EBUSY.
paolo: remove spurious bdrv_close()

Benoît Canet (3):
  block: Add bdrv_are_busy() so migration code abort if needed.
  qerror: Add error telling that streaming blocks migration
  migration: block migration when streaming block jobs are running.

 block.c |   13 +
 block.h |2 ++
 migration.c |5 +
 qerror.c|4 
 qerror.h|3 +++
 5 files changed, 27 insertions(+)

-- 
1.7.9.5




[Qemu-devel] [PATCH V2 2/3] qerror: Add error telling that streaming blocks migration

2012-07-23 Thread benoit . canet
From: Benoît Canet ben...@irqsave.net

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 qerror.c |4 
 qerror.h |3 +++
 2 files changed, 7 insertions(+)

diff --git a/qerror.c b/qerror.c
index 92c4eff..bcd74b7 100644
--- a/qerror.c
+++ b/qerror.c
@@ -283,6 +283,10 @@ static const QErrorStringTable qerror_table[] = {
 .desc  = Could not set password,
 },
 {
+.error_fmt = QERR_STREAMING_BLOCKS_MIGRATION,
+.desc  = Streaming blocks migration,
+},
+{
 .error_fmt = QERR_TOO_MANY_FILES,
 .desc  = Too many open files,
 },
diff --git a/qerror.h b/qerror.h
index b4c8758..95d9e8d 100644
--- a/qerror.h
+++ b/qerror.h
@@ -233,6 +233,9 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_SET_PASSWD_FAILED \
 { 'class': 'SetPasswdFailed', 'data': {} }
 
+#define QERR_STREAMING_BLOCKS_MIGRATION \
+{ 'class': 'StreamingFeatureBlocksMigration', 'data': {} }
+
 #define QERR_TOO_MANY_FILES \
 { 'class': 'TooManyFiles', 'data': {} }
 
-- 
1.7.9.5




[Qemu-devel] [PATCH V2 1/3] block: Add bdrv_are_busy()

2012-07-23 Thread benoit . canet
From: Benoît Canet ben...@irqsave.net

bdrv_are_busy will be used to check if any of the bs are in use
or if one of them have a running block job.

The first user will be qmp_migrate().

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block.c |   13 +
 block.h |2 ++
 2 files changed, 15 insertions(+)

diff --git a/block.c b/block.c
index ce7eb8f..bc8f160 100644
--- a/block.c
+++ b/block.c
@@ -4027,6 +4027,19 @@ out:
 return ret;
 }
 
+int bdrv_are_busy(void)
+{
+BlockDriverState *bs;
+
+QTAILQ_FOREACH(bs, bdrv_states, list) {
+if (bs-job || bdrv_in_use(bs)) {
+return -EBUSY;
+}
+}
+
+return 0;
+}
+
 void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
int64_t speed, BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
diff --git a/block.h b/block.h
index c89590d..0a3de2f 100644
--- a/block.h
+++ b/block.h
@@ -337,6 +337,8 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
+int bdrv_are_busy(void);
+
 enum BlockAcctType {
 BDRV_ACCT_READ,
 BDRV_ACCT_WRITE,
-- 
1.7.9.5




[Qemu-devel] [PATCH V2 3/3] migration: block migration when streaming block jobs are running.

2012-07-23 Thread benoit . canet
From: Benoît Canet ben...@irqsave.net

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 migration.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/migration.c b/migration.c
index 8db1b43..5196d7e 100644
--- a/migration.c
+++ b/migration.c
@@ -425,6 +425,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }
 
+if (bdrv_are_busy()) {
+error_set(errp, QERR_STREAMING_BLOCKS_MIGRATION);
+return;
+}
+
 s = migrate_init(params);
 
 if (strstart(uri, tcp:, p)) {
-- 
1.7.9.5




Re: [Qemu-devel] [PATCH V2 2/3] qerror: Add error telling that streaming blocks migration

2012-07-23 Thread Peter Maydell
On 23 July 2012 12:23,  benoit.ca...@gmail.com wrote:
 --- a/qerror.c
 +++ b/qerror.c
 @@ -283,6 +283,10 @@ static const QErrorStringTable qerror_table[] = {
  .desc  = Could not set password,
  },
  {
 +.error_fmt = QERR_STREAMING_BLOCKS_MIGRATION,
 +.desc  = Streaming blocks migration,
 +},

An error should be a description of something that went wrong
(this isn't supported, I couldn't do X, couldn't find Y,
that isn't valid on devices of type Z). This text isn't
describing anything that's gone wrong in any comprehensible
way.

In particular I just parsed Streaming blocks migration
as (streaming blocks) migration ie migration of
streaming blocks. If you meant Migration was blocked
by streaming or something similar you need to rephrase.

-- PMM



[Qemu-devel] [PATCH] check for available room when formatting OpenFirmware device path

2012-07-23 Thread Laszlo Ersek

Signed-off-by: Laszlo Ersek ler...@redhat.com
---
 hw/qdev.c |   14 +-
 vl.c  |7 ++-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index af54467..f1e83a4 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -502,6 +502,10 @@ static int qdev_get_fw_dev_path_helper(DeviceState *dev, 
char *p, int size)
 if (dev  dev-parent_bus) {
 char *d;
 l = qdev_get_fw_dev_path_helper(dev-parent_bus-parent, p, size);
+if (l = size) {
+return l;
+}
+
 d = bus_get_fw_dev_path(dev-parent_bus, dev);
 if (d) {
 l += snprintf(p + l, size - l, %s, d);
@@ -509,6 +513,10 @@ static int qdev_get_fw_dev_path_helper(DeviceState *dev, 
char *p, int size)
 } else {
 l += snprintf(p + l, size - l, %s, 
object_get_typename(OBJECT(dev)));
 }
+
+if (l = size) {
+return l;
+}
 }
 l += snprintf(p + l , size - l, /);
 
@@ -520,8 +528,12 @@ char* qdev_get_fw_dev_path(DeviceState *dev)
 char path[128];
 int l;
 
-l = qdev_get_fw_dev_path_helper(dev, path, 128);
+l = qdev_get_fw_dev_path_helper(dev, path, sizeof(path));
 
+assert(l  0);
+if (l = sizeof(path)) {
+return NULL;
+}
 path[l-1] = '\0';
 
 return strdup(path);
diff --git a/vl.c b/vl.c
index 8904db1..78dcc93 100644
--- a/vl.c
+++ b/vl.c
@@ -913,7 +913,12 @@ char *get_boot_devices_list(uint32_t *size)
 
 if (i-dev) {
 devpath = qdev_get_fw_dev_path(i-dev);
-assert(devpath);
+if (devpath == NULL) {
+fprintf(stderr,
+OpenFirmware Device Path too long (boot index %d)\n,
+i-bootindex);
+exit(1);
+}
 }
 
 if (i-suffix  devpath) {
-- 
1.7.1




Re: [Qemu-devel] [seabios PATCH 1/2] acpi: set I/O APIC ID to 0 by default

2012-07-23 Thread Gleb Natapov
On Fri, Jul 20, 2012 at 01:22:43PM -0300, Eduardo Habkost wrote:
 On Fri, Jul 20, 2012 at 12:18:59AM +0300, Gleb Natapov wrote:
  On Thu, Jul 19, 2012 at 05:52:41PM -0300, Eduardo Habkost wrote:
   When resetting an I/O APIC, its ID is set to 0, so set it to 0 on the
   MADT table too.
   
  Actually BIOS needs to configure ioapic id to a uniqe value. This does
  not really matter for KVM though.
 
 Where does this requirement comes from? I am guessing it matters only
 when the I/O APIC is directly connected to the APIC bus (according to
 Intel SDM, that's the case only for old Pentium and P6 CPUs)[1].
 
http://www.intel.com/design/chipsets/datashts/29056601.pdf says that it
should be programmed to unique value. I checked 4 machines on 3 of them
this was the case on one (AMD) there are 3 IOAPICs and they all overlap with
APICs.

 Anyway, even if some hardware has this unique-ID requirement, today
 Seabios does not fulfill it, leaving the I/O APIC ID as 0. The patch at
 least makes the MADT table match reality.
 
The currant code was written when we could only dream supporting 16 cpus
and back than it did correct thing, but now it just broken. QEMU do not
care about IOAPIC ID for sure so I am OK with the patch.
 
 [1] I have checked 3 different machines, and all machines I have looked
 have an I/O APIC ID that conflicts with an existing Local APIC ID, on
 the ACPI MADT table.
 
 Some iasl dumps may be found online by googling for:
  Subtable Type : 01 I/O Apic ID
 
 I looked at 5 or 6 matches, and almost every one have an I/O APIC ID
 conflicting with a Local APIC ID.
1 from 4 for me :)

 
  
   Signed-off-by: Eduardo Habkost ehabk...@redhat.com
   ---
src/acpi.c   |2 +-
src/config.h |2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
   
   diff --git a/src/acpi.c b/src/acpi.c
   index 55e4607..3f55de9 100644
   --- a/src/acpi.c
   +++ b/src/acpi.c
   @@ -335,7 +335,7 @@ build_madt(void)
struct madt_io_apic *io_apic = (void*)apic;
io_apic-type = APIC_IO;
io_apic-length = sizeof(*io_apic);
   -io_apic-io_apic_id = CountCPUs;
   +io_apic-io_apic_id = BUILD_IOAPIC_ID;
io_apic-address = cpu_to_le32(BUILD_IOAPIC_ADDR);
io_apic-interrupt = cpu_to_le32(0);

   diff --git a/src/config.h b/src/config.h
   index 3a70867..878c691 100644
   --- a/src/config.h
   +++ b/src/config.h
   @@ -52,9 +52,11 @@
#define BUILD_PCIMEM64_END0x100ULL

#define BUILD_IOAPIC_ADDR 0xfec0
   +#define BUILD_IOAPIC_ID   0
#define BUILD_HPET_ADDRESS0xfed0
#define BUILD_APIC_ADDR   0xfee0

   +
// Important real-mode segments
#define SEG_IVT  0x
#define SEG_BDA  0x0040
   -- 
   1.7.10.4
  
  --
  Gleb.
  
 
 -- 
 Eduardo

--
Gleb.



[Qemu-devel] [PATCH 1/2] usb: Clean common object and dependency files

2012-07-23 Thread Jan Kiszka
Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 Makefile |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index ab82ef3..1b3882a 100644
--- a/Makefile
+++ b/Makefile
@@ -218,7 +218,7 @@ clean:
rm -Rf .libs
rm -f slirp/*.o slirp/*.d audio/*.o audio/*.d block/*.o block/*.d 
net/*.o net/*.d fsdev/*.o fsdev/*.d ui/*.o ui/*.d qapi/*.o qapi/*.d qga/*.o 
qga/*.d
rm -f qom/*.o qom/*.d
-   rm -f usb/*.o usb/*.d hw/*.o hw/*.d
+   rm -f hw/usb/*.o hw/usb/*.d hw/*.o hw/*.d
rm -f qemu-img-cmds.h
rm -f trace/*.o trace/*.d
rm -f trace-dtrace.dtrace trace-dtrace.dtrace-timestamp
-- 
1.7.3.4



[Qemu-devel] [PATCH 2/2] qom: Clean libuser object and dependency files

2012-07-23 Thread Jan Kiszka
Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 Makefile |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index 1b3882a..964e592 100644
--- a/Makefile
+++ b/Makefile
@@ -217,7 +217,7 @@ clean:
rm -f *.o *.d *.a *.lo $(TOOLS) $(HELPERS-y) qemu-ga TAGS cscope.* 
*.pod *~ */*~
rm -Rf .libs
rm -f slirp/*.o slirp/*.d audio/*.o audio/*.d block/*.o block/*.d 
net/*.o net/*.d fsdev/*.o fsdev/*.d ui/*.o ui/*.d qapi/*.o qapi/*.d qga/*.o 
qga/*.d
-   rm -f qom/*.o qom/*.d
+   rm -f qom/*.o qom/*.d libuser/qom/*.o libuser/qom/*.d
rm -f hw/usb/*.o hw/usb/*.d hw/*.o hw/*.d
rm -f qemu-img-cmds.h
rm -f trace/*.o trace/*.d
-- 
1.7.3.4



[Qemu-devel] [PATCH 03/19] qapi: add test case for deallocating traversal of incomplete structure

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

v3:
- new patch

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 tests/test-qmp-commands.c |   42 ++
 1 file changed, 42 insertions(+)

diff --git a/tests/test-qmp-commands.c b/tests/test-qmp-commands.c
index 60cbf01..dc3c507 100644
--- a/tests/test-qmp-commands.c
+++ b/tests/test-qmp-commands.c
@@ -3,6 +3,9 @@
 #include test-qmp-commands.h
 #include qapi/qmp-core.h
 #include module.h
+#include qapi/qmp-input-visitor.h
+#include tests/test-qapi-types.h
+#include tests/test-qapi-visit.h
 
 void qmp_user_def_cmd(Error **errp)
 {
@@ -123,6 +126,44 @@ static void test_dealloc_types(void)
 qapi_free_UserDefOneList(ud1list);
 }
 
+/* test generated deallocation on an object whose construction was prematurely
+ * terminated due to an error */
+static void test_dealloc_partial(void)
+{
+static const char text[] = don't leak me;
+
+UserDefTwo *ud2 = NULL;
+Error *err = NULL;
+
+/* create partial object */
+{
+QDict *ud2_dict;
+QmpInputVisitor *qiv;
+
+ud2_dict = qdict_new();
+qdict_put_obj(ud2_dict, string, QOBJECT(qstring_from_str(text)));
+
+qiv = qmp_input_visitor_new(QOBJECT(ud2_dict));
+visit_type_UserDefTwo(qmp_input_get_visitor(qiv), ud2, NULL, err);
+qmp_input_visitor_cleanup(qiv);
+QDECREF(ud2_dict);
+}
+
+/* verify partial success */
+assert(ud2 != NULL);
+assert(ud2-string != NULL);
+assert(strcmp(ud2-string, text) == 0);
+assert(ud2-dict.dict.userdef == NULL);
+
+/* confirm  release construction error */
+assert(err != NULL);
+error_free(err);
+
+/* tear down partial object */
+qapi_free_UserDefTwo(ud2);
+}
+
+
 int main(int argc, char **argv)
 {
 g_test_init(argc, argv, NULL);
@@ -131,6 +172,7 @@ int main(int argc, char **argv)
 g_test_add_func(/0.15/dispatch_cmd_error, test_dispatch_cmd_error);
 g_test_add_func(/0.15/dispatch_cmd_io, test_dispatch_cmd_io);
 g_test_add_func(/0.15/dealloc_types, test_dealloc_types);
+g_test_add_func(/0.15/dealloc_partial, test_dealloc_partial);
 
 module_call_init(MODULE_INIT_QAPI);
 g_test_run();
-- 
1.7.10.4




[Qemu-devel] [PATCH 04/19] qapi: generate C types for fixed-width integers

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

(Long line folded using parens:
http://www.python.org/dev/peps/pep-0008/#maximum-line-length.)

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 scripts/qapi.py |4 
 1 file changed, 4 insertions(+)

diff --git a/scripts/qapi.py b/scripts/qapi.py
index e062336..1292476 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -159,6 +159,10 @@ def c_type(name):
 return 'char *'
 elif name == 'int':
 return 'int64_t'
+elif (name == 'int8' or name == 'int16' or name == 'int32' or
+  name == 'int64' or name == 'uint8' or name == 'uint16' or
+  name == 'uint32' or name == 'uint64'):
+return name + '_t'
 elif name == 'bool':
 return 'bool'
 elif name == 'number':
-- 
1.7.10.4




[Qemu-devel] [PATCH 12/19] convert net_init_nic() to NetClientOptions

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

v1-v2:
- NetLegacyNicOptions::vectors is of type uint32

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 net.c |   39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/net.c b/net.c
index af544b2..a62e902 100644
--- a/net.c
+++ b/net.c
@@ -748,12 +748,15 @@ int net_handle_fd_param(Monitor *mon, const char *param)
 return fd;
 }
 
-static int net_init_nic(QemuOpts *opts, const NetClientOptions *new_opts,
+static int net_init_nic(QemuOpts *old_opts, const NetClientOptions *opts,
 const char *name, VLANState *vlan)
 {
 int idx;
 NICInfo *nd;
-const char *netdev;
+const NetLegacyNicOptions *nic;
+
+assert(opts-kind == NET_CLIENT_OPTIONS_KIND_NIC);
+nic = opts-nic;
 
 idx = nic_get_free_idx();
 if (idx == -1 || nb_nics = MAX_NICS) {
@@ -765,10 +768,10 @@ static int net_init_nic(QemuOpts *opts, const 
NetClientOptions *new_opts,
 
 memset(nd, 0, sizeof(*nd));
 
-if ((netdev = qemu_opt_get(opts, netdev))) {
-nd-netdev = qemu_find_netdev(netdev);
+if (nic-has_netdev) {
+nd-netdev = qemu_find_netdev(nic-netdev);
 if (!nd-netdev) {
-error_report(netdev '%s' not found, netdev);
+error_report(netdev '%s' not found, nic-netdev);
 return -1;
 }
 } else {
@@ -778,26 +781,28 @@ static int net_init_nic(QemuOpts *opts, const 
NetClientOptions *new_opts,
 if (name) {
 nd-name = g_strdup(name);
 }
-if (qemu_opt_get(opts, model)) {
-nd-model = g_strdup(qemu_opt_get(opts, model));
+if (nic-has_model) {
+nd-model = g_strdup(nic-model);
 }
-if (qemu_opt_get(opts, addr)) {
-nd-devaddr = g_strdup(qemu_opt_get(opts, addr));
+if (nic-has_addr) {
+nd-devaddr = g_strdup(nic-addr);
 }
 
-if (qemu_opt_get(opts, macaddr) 
-net_parse_macaddr(nd-macaddr.a, qemu_opt_get(opts, macaddr))  0) {
+if (nic-has_macaddr 
+net_parse_macaddr(nd-macaddr.a, nic-macaddr)  0) {
 error_report(invalid syntax for ethernet address);
 return -1;
 }
 qemu_macaddr_default_if_unset(nd-macaddr);
 
-nd-nvectors = qemu_opt_get_number(opts, vectors,
-   DEV_NVECTORS_UNSPECIFIED);
-if (nd-nvectors != DEV_NVECTORS_UNSPECIFIED 
-(nd-nvectors  0 || nd-nvectors  0x7ff)) {
-error_report(invalid # of vectors: %d, nd-nvectors);
-return -1;
+if (nic-has_vectors) {
+if (nic-vectors  0x7ff) {
+error_report(invalid # of vectors: %PRIu32, nic-vectors);
+return -1;
+}
+nd-nvectors = nic-vectors;
+} else {
+nd-nvectors = DEV_NVECTORS_UNSPECIFIED;
 }
 
 nd-used = 1;
-- 
1.7.10.4




[Qemu-devel] [PATCH 09/19] qapi schema: add Netdev types

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

NetdevTapOptions::sndbuf and NetdevDumpOptions::len use the new size
type.

v1-v2:
- NetLegacy::name is optional
- NetLegacyNicOptions::vectors is of type uint32
- NetdevVdeOptions::port and ::mode are of type uint16
- NetLegacy::vlan has type int32

v2-v3:
- NetLegacy::id is allowed and takes precedence over NetLegacy::name
- replace @traits with @opts in NetLegacy  Netdev descriptions

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qapi-schema.json |  278 ++
 1 file changed, 278 insertions(+)

diff --git a/qapi-schema.json b/qapi-schema.json
index d2f8e02..bc55ed2 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1872,6 +1872,284 @@
 { 'command': 'netdev_del', 'data': {'id': 'str'} }
 
 ##
+# @NetdevNoneOptions
+#
+# Use it alone to have zero network devices.
+#
+# Since 1.2
+##
+{ 'type': 'NetdevNoneOptions',
+  'data': { } }
+
+##
+# @NetLegacyNicOptions
+#
+# Create a new Network Interface Card.
+#
+# @netdev: #optional id of -netdev to connect to
+#
+# @macaddr: #optional MAC address
+#
+# @model: #optional device model (e1000, rtl8139, virtio etc.)
+#
+# @addr: #optional PCI device address
+#
+# @vectors: #optional number of MSI-x vectors, 0 to disable MSI-X
+#
+# Since 1.2
+##
+{ 'type': 'NetLegacyNicOptions',
+  'data': {
+'*netdev':  'str',
+'*macaddr': 'str',
+'*model':   'str',
+'*addr':'str',
+'*vectors': 'uint32' } }
+
+##
+# @String
+#
+# A fat type wrapping 'str', to be embedded in lists.
+#
+# Since 1.2
+##
+{ 'type': 'String',
+  'data': {
+'str': 'str' } }
+
+##
+# @NetdevUserOptions
+#
+# Use the user mode network stack which requires no administrator privilege to
+# run.
+#
+# @hostname: #optional client hostname reported by the builtin DHCP server
+#
+# @restrict: #optional isolate the guest from the host
+#
+# @ip: #optional legacy parameter, use net= instead
+#
+# @net: #optional IP address and optional netmask
+#
+# @host: #optional guest-visible address of the host
+#
+# @tftp: #optional root directory of the built-in TFTP server
+#
+# @bootfile: #optional BOOTP filename, for use with tftp=
+#
+# @dhcpstart: #optional the first of the 16 IPs the built-in DHCP server can
+# assign
+#
+# @dns: #optional guest-visible address of the virtual nameserver
+#
+# @smb: #optional root directory of the built-in SMB server
+#
+# @smbserver: #optional IP address of the built-in SMB server
+#
+# @hostfwd: #optional redirect incoming TCP or UDP host connections to guest
+#   endpoints
+#
+# @guestfwd: #optional forward guest TCP connections
+#
+# Since 1.2
+##
+{ 'type': 'NetdevUserOptions',
+  'data': {
+'*hostname':  'str',
+'*restrict':  'bool',
+'*ip':'str',
+'*net':   'str',
+'*host':  'str',
+'*tftp':  'str',
+'*bootfile':  'str',
+'*dhcpstart': 'str',
+'*dns':   'str',
+'*smb':   'str',
+'*smbserver': 'str',
+'*hostfwd':   ['String'],
+'*guestfwd':  ['String'] } }
+
+##
+# @NetdevTapOptions
+#
+# Connect the host TAP network interface name to the VLAN.
+#
+# @ifname: #optional interface name
+#
+# @fd: #optional file descriptor of an already opened tap
+#
+# @script: #optional script to initialize the interface
+#
+# @downscript: #optional script to shut down the interface
+#
+# @helper: #optional command to execute to configure bridge
+#
+# @sndbuf: #optional send buffer limit. Understands [TGMKkb] suffixes.
+#
+# @vnet_hdr: #optional enable the IFF_VNET_HDR flag on the tap interface
+#
+# @vhost: #optional enable vhost-net network accelerator
+#
+# @vhostfd: #optional file descriptor of an already opened vhost net device
+#
+# @vhostforce: #optional vhost on for non-MSIX virtio guests
+#
+# Since 1.2
+##
+{ 'type': 'NetdevTapOptions',
+  'data': {
+'*ifname': 'str',
+'*fd': 'str',
+'*script': 'str',
+'*downscript': 'str',
+'*helper': 'str',
+'*sndbuf': 'size',
+'*vnet_hdr':   'bool',
+'*vhost':  'bool',
+'*vhostfd':'str',
+'*vhostforce': 'bool' } }
+
+##
+# @NetdevSocketOptions
+#
+# Connect the VLAN to a remote VLAN in another QEMU virtual machine using a TCP
+# socket connection.
+#
+# @fd: #optional file descriptor of an already opened socket
+#
+# @listen: #optional port number, and optional hostname, to listen on
+#
+# @connect: #optional port number, and optional hostname, to connect to
+#
+# @mcast: #optional UDP multicast address and port number
+#
+# @localaddr: #optional source address and port for multicast and udp packets
+#
+# @udp: #optional UDP unicast address and port number
+#
+# Since 1.2
+##
+{ 'type': 'NetdevSocketOptions',
+  'data': {
+'*fd':'str',
+'*listen':'str',
+'*connect':   'str',
+'*mcast': 'str',
+'*localaddr': 'str',
+'*udp':   'str' } }
+
+##
+# 

[Qemu-devel] [PATCH 18/19] convert net_init_bridge() to NetClientOptions

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 net/tap.c |   23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index c5563c0..d2736ea 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -513,21 +513,22 @@ static int net_bridge_run_helper(const char *helper, 
const char *bridge)
 return -1;
 }
 
-int net_init_bridge(QemuOpts *opts, const NetClientOptions *new_opts,
+int net_init_bridge(QemuOpts *old_opts, const NetClientOptions *opts,
 const char *name, VLANState *vlan)
 {
+const NetdevBridgeOptions *bridge;
+const char *helper, *br;
+
 TAPState *s;
 int fd, vnet_hdr;
 
-if (!qemu_opt_get(opts, br)) {
-qemu_opt_set(opts, br, DEFAULT_BRIDGE_INTERFACE);
-}
-if (!qemu_opt_get(opts, helper)) {
-qemu_opt_set(opts, helper, DEFAULT_BRIDGE_HELPER);
-}
+assert(opts-kind == NET_CLIENT_OPTIONS_KIND_BRIDGE);
+bridge = opts-bridge;
+
+helper = bridge-has_helper ? bridge-helper : DEFAULT_BRIDGE_HELPER;
+br = bridge-has_br ? bridge-br : DEFAULT_BRIDGE_INTERFACE;
 
-fd = net_bridge_run_helper(qemu_opt_get(opts, helper),
-   qemu_opt_get(opts, br));
+fd = net_bridge_run_helper(helper, br);
 if (fd == -1) {
 return -1;
 }
@@ -542,8 +543,8 @@ int net_init_bridge(QemuOpts *opts, const NetClientOptions 
*new_opts,
 return -1;
 }
 
-snprintf(s-nc.info_str, sizeof(s-nc.info_str), helper=%s,br=%s,
- qemu_opt_get(opts, helper), qemu_opt_get(opts, br));
+snprintf(s-nc.info_str, sizeof(s-nc.info_str), helper=%s,br=%s, helper,
+ br);
 
 return 0;
 }
-- 
1.7.10.4




[Qemu-devel] [PATCH 13/19] convert net_init_dump() to NetClientOptions

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

v1-v2:
- NetdevDumpOptions::len is of type 'size', whose C type was changed to
  uint64_t. Adapt the printf() format specifier macro.

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 net/dump.c |   21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/net/dump.c b/net/dump.c
index 27e9528..f3d2fa9 100644
--- a/net/dump.c
+++ b/net/dump.c
@@ -144,22 +144,35 @@ static int net_dump_init(VLANState *vlan, const char 
*device,
 return 0;
 }
 
-int net_init_dump(QemuOpts *opts, const NetClientOptions *new_opts,
+int net_init_dump(QemuOpts *old_opts, const NetClientOptions *opts,
   const char *name, VLANState *vlan)
 {
 int len;
 const char *file;
 char def_file[128];
+const NetdevDumpOptions *dump;
+
+assert(opts-kind == NET_CLIENT_OPTIONS_KIND_DUMP);
+dump = opts-dump;
 
 assert(vlan);
 
-file = qemu_opt_get(opts, file);
-if (!file) {
+if (dump-has_file) {
+file = dump-file;
+} else {
 snprintf(def_file, sizeof(def_file), qemu-vlan%d.pcap, vlan-id);
 file = def_file;
 }
 
-len = qemu_opt_get_size(opts, len, 65536);
+if (dump-has_len) {
+if (dump-len  INT_MAX) {
+error_report(invalid length: %PRIu64, dump-len);
+return -1;
+}
+len = dump-len;
+} else {
+len = 65536;
+}
 
 return net_dump_init(vlan, dump, name, file, len);
 }
-- 
1.7.10.4




[Qemu-devel] [PATCH 06/19] expose QemuOpt and QemuOpts struct definitions to interested parties

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

The only clients should be the existent qemu-option.c, and the upcoming
qapi/opts-visitor.c.

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qemu-option-internal.h |   53 
 qemu-option.c  |   24 +-
 2 files changed, 54 insertions(+), 23 deletions(-)
 create mode 100644 qemu-option-internal.h

diff --git a/qemu-option-internal.h b/qemu-option-internal.h
new file mode 100644
index 000..19fdc1c
--- /dev/null
+++ b/qemu-option-internal.h
@@ -0,0 +1,53 @@
+/*
+ * Commandline option parsing functions
+ *
+ * Copyright (c) 2003-2008 Fabrice Bellard
+ * Copyright (c) 2009 Kevin Wolf kw...@redhat.com
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef QEMU_OPTIONS_INTERNAL_H
+#define QEMU_OPTIONS_INTERNAL_H
+
+#include qemu-option.h
+
+struct QemuOpt {
+const char   *name;
+const char   *str;
+
+const QemuOptDesc *desc;
+union {
+bool boolean;
+uint64_t uint;
+} value;
+
+QemuOpts *opts;
+QTAILQ_ENTRY(QemuOpt) next;
+};
+
+struct QemuOpts {
+char *id;
+QemuOptsList *list;
+Location loc;
+QTAILQ_HEAD(QemuOptHead, QemuOpt) head;
+QTAILQ_ENTRY(QemuOpts) next;
+};
+
+#endif
diff --git a/qemu-option.c b/qemu-option.c
index bb3886c..8334190 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -29,9 +29,9 @@
 #include qemu-common.h
 #include qemu-error.h
 #include qemu-objects.h
-#include qemu-option.h
 #include error.h
 #include qerror.h
+#include qemu-option-internal.h
 
 /*
  * Extracts the name of an option from the parameter string (p points at the
@@ -511,28 +511,6 @@ void print_option_help(QEMUOptionParameter *list)
 
 /* -- */
 
-struct QemuOpt {
-const char   *name;
-const char   *str;
-
-const QemuOptDesc *desc;
-union {
-bool boolean;
-uint64_t uint;
-} value;
-
-QemuOpts *opts;
-QTAILQ_ENTRY(QemuOpt) next;
-};
-
-struct QemuOpts {
-char *id;
-QemuOptsList *list;
-Location loc;
-QTAILQ_HEAD(QemuOptHead, QemuOpt) head;
-QTAILQ_ENTRY(QemuOpts) next;
-};
-
 static QemuOpt *qemu_opt_find(QemuOpts *opts, const char *name)
 {
 QemuOpt *opt;
-- 
1.7.10.4




Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-23 Thread Lluís Vilanova
Stefan Hajnoczi writes:
[...]
  So I tried trimming down the list of files needed to compile
  qemu tools, and here is a list:
 
  Easy to relicense to LGPLv2+:
  block/raw.c none (GPLv2+: Red Hat, IBM)
  error.c LGPLv2 (Red Hat, IBM, Stefan Weil)
  iov.c   GPLv2 (Red Hat, SuSE/Hannes Reinecke, 
  Michael Tokarev)
  module.cGPLv2 (Red Hat, IBM, Blue Swirl)
  qemu-error.cGPLv2+ (Red Hat, Blue Swirl, IBM)
  trace/control.c GPLv2 (Lluis Vilanova)
  trace/default.c GPLv2 (Lluis Vilanova)
 
  (I added some people to Cc.  Lluis and Michael, can you also look at
  http://wiki.qemu.org/Relicensing if you're willing to relicense
  your past contributions from GPLv2 to GPLv2+?.  Blue Swirl said 
  he'd accept any other GPLv2 or GPLv3 compatible license, which
  should include LGPLv2+).
 
 I have no problems relicensing to GPLv2 or later or GPLv3 or later.

 What about LGPLv2+?  (Note the L.)

I'd prefer to keep it non-lesser. Is it absolutely necessary?


PS: I suppose the + stands for or later

Lluis

-- 
 And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer.
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth



Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Cornelia Huck
On Sat, 21 Jul 2012 15:16:56 +0200
Jan Kiszka jan.kis...@web.de wrote:

 On 2012-07-21 14:57, Peter Maydell wrote:
  On 21 July 2012 13:35, Jan Kiszka jan.kis...@web.de wrote:
  On 2012-07-21 14:17, Peter Maydell wrote:
  You still haven't really explained why we can't just ignore irqfd
  for now. I don't see how it would particularly affect the design
  of the kernel implementation very much, and the userspace interface
  seems to just be an extra ioctl.
 
  I bet you ignored MSI so far. Physical IRQ lines are just a part of the
  whole picture. How are MSIs delivered on the systems you want to add?
  
  You're using random acronyms without defining them again. It looks
  as if MSI is a PCI specific term. That would seem to me to fall
  under the heading of routing across a board model which we can't
  do anyway, because you have no idea how this all wired up, it
  will depend on the details of the SoC and the PCI controller.
 
 For sure you can. You need to discover those wiring, cache it, and then
 let the source inject to the final destination directly. See the INTx
 routing notifier and pci_device_route_intx_to_irq from [1] for that
 simplistic approach we are taking on the x86/PC architecture.

  Once you support the backend (KVM_SET_GSI_ROUTING + KVM_IRQ_LINE),
  adding irqfd is in fact simple.
 
  I don't really understand where KVM_SET_GSI_ROUTING comes into
  this -- the documentation is a bit opaque. It says Sets the GSI
  routing table entries but it doesn't define what a GSI is or
  what we're routing to where. Googling suggests GSI is an APIC
  specific term so it doesn't sound like it's relevant for non-x86.
 
  As I said before: GSI needs to be read as physical or virtual IRQ
  line. The virtual ones can be of any source you define, irqfd is just one.
  
  What's a virtual irq line in this context? We're modelling a physical
  bit of hardware which has N interrupt lines, so I'm not sure what
  a virtual irq line would be or how it would appear to the guest...
 
 A virtual line is an input of the in-kernel IRQ router you configure via
 SET_GSI_ROUTING. A physical line is a potential output of it that goes
 into some in-kernel interrupt controller model. It can also be an
 interrupt message sent to a specific CPU - provided the arch supports
 such a delivery protocol.
 
 Of course, the current router was modeled after x86 and ia64. So I
 wouldn't be surprised if some ARM system configuration cannot be
 expressed this way. Then we need to discuss reasonable extensions. But
 it should provide a sound foundation at least.

OK, so I was reading through this thread since I want to add irqfd
support for s390, but we don't have any kind of irqchip.

The understanding I got so far is that !s390 architectures have some
kind of mechanism that allows them to route an interrupt between a
device and a cpu, meaning that there's a fixed tie-in between a device
and a cpu. If that's correct, I don't see how to model irqfds via
this irqchip infrastructure for s390.

Here's in a nutshell how (external and I/O) interrupts work on s390:

- Interrupts have an internal prioritation, that means different types
of interrupts (external, I/O, machine check, ...) take precedent over
other types

- External and I/O interrupts are floating, i. e. they are not tied
to a specific cpu, but can be delivered to any cpu that has external
resp. I/O interrupts enabled

- Interrupts take payload with them that defines which device they are
for

So, for example, if a specific subchannel (=device) has pending status
and an I/O interrupt is to be generated, this interrupt remains pending
until an arbitrary cpu is enabled for I/O interrupts. If several cpus
are enabled for I/O interrupts, any of them may be interrupted. When an
I/O interrupt is delivered on a cpu, the cpu's lowcore contains the
interrupt payload which defines the subchannel (=device) the interrupt
is for.

Any idea on how this architecture can be married with the irqchip
concept is welcome. If all else fails, would a special irqfd concept
for !irqchip be acceptable?

Cornelia




[Qemu-devel] [PULL 00/14] Migration next

2012-07-23 Thread Juan Quintela
Hi Anthony

This series include the ram_save_live() split.  XBZRLE patches got
dropped until we fix a new bug.  Please apply.

Thanks, Juan.

The following changes since commit 61dc008f3529fa74a63aad1907438dad857e255a:

  Revert audio: Make PC speaker audio card available by default (2012-07-19 
18:25:52 -0500)

are available in the git repository at:

  http://repo.or.cz/r/qemu/quintela.git 

for you to fetch changes up to 6c779f22a93cc6e4565b940ef616e3efc5b50ba5:

  Change ram_save_block to return -1 if there are no more changes (2012-07-23 
14:02:28 +0200)


Juan Quintela (12):
  savevm: Use a struct to pass all handlers
  savevm: Live migration handlers register the struct directly
  savevm: remove SaveSetParamsHandler
  savevm: remove SaveLiveStateHandler
  savevm: Refactor cancel operation in its own operation
  savevm: introduce is_active method
  savevm: split save_live_setup from save_live_state
  savevm: split save_live into stage2 and stage3
  ram: save_live_setup() don't need to sent pages
  ram: save_live_complete() only do one loop
  ram: iterate phase
  ram: save_live_setup() we don't need to synchronize the dirty bitmap.

Orit Wasserman (2):
  Add migration capabilities
  Change ram_save_block to return -1 if there are no more changes

 arch_init.c   |  137 ++-
 block-migration.c |  153 +++--
 hmp-commands.hx   |   16 ++
 hmp.c |   64 ++
 hmp.h |2 +
 migration.c   |   72 -
 migration.h   |6 ++-
 monitor.c |7 +++
 qapi-schema.json  |   53 ++-
 qmp-commands.hx   |   71 +++--
 savevm.c  |   77 +++
 vl.c  |3 +-
 vmstate.h |   18 ---
 13 files changed, 525 insertions(+), 154 deletions(-)

-- 
1.7.10.4




[Qemu-devel] KVM call agenda for Tuesday, July 24

2012-07-23 Thread Juan Quintela

Hi

Please send in any agenda items you are interested in covering.

Later, Juan.



Re: [Qemu-devel] [RFC] introduce a dynamic library to expose qemu block API

2012-07-23 Thread Paolo Bonzini
Il 23/07/2012 13:55, Lluís Vilanova ha scritto:
  I have no problems relicensing to GPLv2 or later or GPLv3 or later.
  What about LGPLv2+?  (Note the L.)
 I'd prefer to keep it non-lesser. Is it absolutely necessary?

This is about making the block layer part of a library.  The block layer
is 90% under a BSD license, but the LGPLv2 is a good compromise.

Paolo



Re: [Qemu-devel] [SeaBIOS PATCH 1/2] acpi: report real I/O APIC ID (0) on MADT table (v2)

2012-07-23 Thread Gleb Natapov
On Fri, Jul 20, 2012 at 02:04:49PM -0300, Eduardo Habkost wrote:
 When resetting an I/O APIC, its ID is set to 0, and SeaBIOS doesn't
 change it, so report it correctly on the MADT table.
 
 Some hardware may require the BIOS to initialize I/O APIC ID to an
 unique value, but SeaBIOS doesn't do that. This patch at least makes the
 MADT table reflect reality.
 
 Changes v1 - v2:
  - Cosmetic: whitespace change (removed extra newline)
  - New patch description
 
 Signed-off-by: Eduardo Habkost ehabk...@redhat.com
 ---
  src/acpi.c   |2 +-
  src/config.h |1 +
  2 files changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/src/acpi.c b/src/acpi.c
 index d39cbd9..da3bc57 100644
 --- a/src/acpi.c
 +++ b/src/acpi.c
 @@ -336,7 +336,7 @@ build_madt(void)
  struct madt_io_apic *io_apic = (void*)apic;
  io_apic-type = APIC_IO;
  io_apic-length = sizeof(*io_apic);
 -io_apic-io_apic_id = CountCPUs;
 +io_apic-io_apic_id = BUILD_IOAPIC_ID;
  io_apic-address = cpu_to_le32(BUILD_IOAPIC_ADDR);
  io_apic-interrupt = cpu_to_le32(0);
  
mptable also have ioapic_id.

 diff --git a/src/config.h b/src/config.h
 index 3a70867..0d4066d 100644
 --- a/src/config.h
 +++ b/src/config.h
 @@ -52,6 +52,7 @@
  #define BUILD_PCIMEM64_END0x100ULL
  
  #define BUILD_IOAPIC_ADDR 0xfec0
 +#define BUILD_IOAPIC_ID   0
  #define BUILD_HPET_ADDRESS0xfed0
  #define BUILD_APIC_ADDR   0xfee0
  
 -- 
 1.7.10.4

--
Gleb.



[Qemu-devel] [PATCH 07/19] qapi: introduce OptsVisitor

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

This visitor supports parsing

  -option [type=]discriminator[,optarg1=val1][,optarg2=val2][,...]

style QemuOpts objects into native C structures. After defining the type
tree in the qapi schema (see below), a root type traversal with this
visitor linked to the underlying QemuOpts object will build the native C
representation of the option.

The type tree in the schema, corresponding to an option with a
discriminator, must have the following structure:

  struct
scalar member for non-discriminated optarg 1 [*]
list for repeating non-discriminated optarg 2 [*]
  wrapper struct
single scalar member
union
  struct for discriminator case 1
scalar member for optarg 3 [*]
list for repeating optarg 4 [*]
  wrapper struct
single scalar member
scalar member for optarg 5 [*]
  struct for discriminator case 2
...

The type optarg name is fixed for the discriminator role. Its schema
representation is union of structures, and each discriminator value must
correspond to a member name in the union.

If the option takes no type descriminator, then the type subtree rooted
at the union must be absent from the schema (including the union itself).

Optarg values can be of scalar types str / bool / integers / size.

Members marked with [*] may be defined as optional in the schema,
describing an optional optarg.

Repeating an optarg is supported; its schema representation must be list
of structure with single mandatory scalar member. If an optarg is not
described as repeating in the schema (ie. it is defined as a scalar field
instead of a list), its last occurrence will take effect. Ordering between
differently named optargs is not preserved.

A mandatory list (or an optional one which is reported to be available),
corresponding to a repeating optarg, has at least one element after
successful parsing.

v1-v2:
- Update opts_type_size() prototype to uint64_t.
- Add opts_type_uint64() for options needing the full uint64_t range.
  (Internals could be extracted to cutils.c.)
- Allow negative values in opts_type_int().
- Rebase to nested Makefiles.

v2-v3:
- Factor opts_visitor_insert() out of opts_start_struct() and call it
  separately for opts_root-id if there's any.
- Don't require non-negative values in opts_type_int()'s error message.
- g_malloc0() may return NULL for zero-sized requests. Support empty
  structures by requesting 1 byte for them instead.

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qapi/Makefile.objs  |2 +-
 qapi/opts-visitor.c |  427 +++
 qapi/opts-visitor.h |   31 
 3 files changed, 459 insertions(+), 1 deletion(-)
 create mode 100644 qapi/opts-visitor.c
 create mode 100644 qapi/opts-visitor.h

diff --git a/qapi/Makefile.objs b/qapi/Makefile.objs
index d0b0c16..5f5846e 100644
--- a/qapi/Makefile.objs
+++ b/qapi/Makefile.objs
@@ -1,3 +1,3 @@
 qapi-obj-y = qapi-visit-core.o qapi-dealloc-visitor.o qmp-input-visitor.o
 qapi-obj-y += qmp-output-visitor.o qmp-registry.o qmp-dispatch.o
-qapi-obj-y += string-input-visitor.o string-output-visitor.o
+qapi-obj-y += string-input-visitor.o string-output-visitor.o opts-visitor.o
diff --git a/qapi/opts-visitor.c b/qapi/opts-visitor.c
new file mode 100644
index 000..a59d306
--- /dev/null
+++ b/qapi/opts-visitor.c
@@ -0,0 +1,427 @@
+/*
+ * Options Visitor
+ *
+ * Copyright Red Hat, Inc. 2012
+ *
+ * Author: Laszlo Ersek ler...@redhat.com
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include opts-visitor.h
+#include qemu-queue.h
+#include qemu-option-internal.h
+#include qapi-visit-impl.h
+
+
+struct OptsVisitor
+{
+Visitor visitor;
+
+/* Ownership remains with opts_visitor_new()'s caller. */
+const QemuOpts *opts_root;
+
+unsigned depth;
+
+/* Non-null iff depth is positive. Each key is a QemuOpt name. Each value
+ * is a non-empty GQueue, enumerating all QemuOpt occurrences with that
+ * name. */
+GHashTable *unprocessed_opts;
+
+/* The list currently being traversed with opts_start_list() /
+ * opts_next_list(). The list must have a struct element type in the
+ * schema, with a single mandatory scalar member. */
+GQueue *repeated_opts;
+bool repeated_opts_first;
+
+/* If opts_root-id is set, reinstantiate it as a fake QemuOpt for
+ * uniformity. Only its name and str fields are set. fake_id_opt does
+ * not survive or escape the OptsVisitor object.
+ */
+QemuOpt *fake_id_opt;
+};
+
+
+static void
+destroy_list(gpointer list)
+{
+  g_queue_free(list);
+}
+
+
+static void
+opts_visitor_insert(GHashTable *unprocessed_opts, const QemuOpt *opt)
+{
+GQueue *list;
+
+list = g_hash_table_lookup(unprocessed_opts, opt-name);
+if (list == NULL) {
+   

Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Avi Kivity
On 07/23/2012 03:04 PM, Cornelia Huck wrote:
 
 OK, so I was reading through this thread since I want to add irqfd
 support for s390, but we don't have any kind of irqchip.
 
 The understanding I got so far is that !s390 architectures have some
 kind of mechanism that allows them to route an interrupt between a
 device and a cpu, meaning that there's a fixed tie-in between a device
 and a cpu. 

It's not fixed, but it changes rarely.  Various interrupt routers can be
programmed to change the route, but guests do so only rarely.

 If that's correct, I don't see how to model irqfds via
 this irqchip infrastructure for s390.
 
 Here's in a nutshell how (external and I/O) interrupts work on s390:
 
 - Interrupts have an internal prioritation, that means different types
 of interrupts (external, I/O, machine check, ...) take precedent over
 other types
 
 - External and I/O interrupts are floating, i. e. they are not tied
 to a specific cpu, but can be delivered to any cpu that has external
 resp. I/O interrupts enabled
 
 - Interrupts take payload with them that defines which device they are
 for
 
 So, for example, if a specific subchannel (=device) has pending status
 and an I/O interrupt is to be generated, this interrupt remains pending
 until an arbitrary cpu is enabled for I/O interrupts. If several cpus
 are enabled for I/O interrupts, any of them may be interrupted.

This may be costly to emulate.  On x86 we do not have access to a
guest's interrupt status while it is running.  Is this not the case for
s390?

Oh, let me guess.  You write some interrupt descriptor in memory
somewhere, issue one of your famous instructions, and the hardware finds
a guest vcpu and injects the interrupt.

x86 has a least priority mode which is similar to what you're
describing, but I don't think we emulate it correctly.

 When an
 I/O interrupt is delivered on a cpu, the cpu's lowcore contains the
 interrupt payload which defines the subchannel (=device) the interrupt
 is for.
 
 Any idea on how this architecture can be married with the irqchip
 concept is welcome. If all else fails, would a special irqfd concept
 for !irqchip be acceptable?

I don't see an issue.  You need an arch-specific irqfd configuration
ioctl (or your own data field in the existing ioctl) that defines the
payload.  Then the kernel installs a poll function on that eventfd that,
when called, does the magic sequence needed to get the interrupt there.
 While you don't have an irqchip, you do have asynchronous interrupt
injection, yes?  That's what irqchip really is all about.

-- 
error compiling committee.c: too many arguments to function





[Qemu-devel] [PATCH 02/19] qapi: fix error propagation

2012-07-23 Thread Stefan Hajnoczi
From: Paolo Bonzini pbonz...@redhat.com

Don't overwrite / leak previously set errors.
Make traversal cope with missing mandatory sub-structs.
Don't try to end a container that could not be started.

v1-v2:
- unchanged

v2-v3:
- instead of examining, assert that we never overwrite errors with
  error_set()
- allow visitors to set a NULL struct pointer successfully, so traversal
  of incomplete objects can continue
- check for a NULL obj before accessing (*obj)-has_XXX (this is not a
  typo, obj != NULL implies *obj != NULL here)
- fix start_struct / end_struct balance for unions as well

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 docs/qapi-code-gen.txt |2 +
 error.c|3 +-
 error.h|2 +-
 qapi/qapi-visit-core.c |   10 ++-
 scripts/qapi-visit.py  |  150 +---
 tests/test-qmp-input-visitor.c |   24 ---
 6 files changed, 121 insertions(+), 70 deletions(-)

diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
index ad11767..cccb11e 100644
--- a/docs/qapi-code-gen.txt
+++ b/docs/qapi-code-gen.txt
@@ -220,6 +220,8 @@ Example:
 #endif
 mdroth@illuin:~/w/qemu2.git$
 
+(The actual structure of the visit_type_* functions is a bit more complex
+in order to propagate errors correctly and avoid leaking memory).
 
 === scripts/qapi-commands.py ===
 
diff --git a/error.c b/error.c
index a52b771..58f55a0 100644
--- a/error.c
+++ b/error.c
@@ -32,6 +32,7 @@ void error_set(Error **errp, const char *fmt, ...)
 if (errp == NULL) {
 return;
 }
+assert(*errp == NULL);
 
 err = g_malloc0(sizeof(*err));
 
@@ -132,7 +133,7 @@ bool error_is_type(Error *err, const char *fmt)
 
 void error_propagate(Error **dst_err, Error *local_err)
 {
-if (dst_err) {
+if (dst_err  !*dst_err) {
 *dst_err = local_err;
 } else if (local_err) {
 error_free(local_err);
diff --git a/error.h b/error.h
index 45ff6c1..3d9d96d 100644
--- a/error.h
+++ b/error.h
@@ -57,7 +57,7 @@ void error_set_field(Error *err, const char *field, const 
char *value);
 /**
  * Propagate an error to an indirect pointer to an error.  This function will
  * always transfer ownership of the error reference and handles the case where
- * dst_err is NULL correctly.
+ * dst_err is NULL correctly.  Errors after the first are discarded.
  */
 void error_propagate(Error **dst_err, Error *local_err);
 
diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 705eca9..d41595e 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -39,9 +39,8 @@ void visit_start_struct(Visitor *v, void **obj, const char 
*kind,
 
 void visit_end_struct(Visitor *v, Error **errp)
 {
-if (!error_is_set(errp)) {
-v-end_struct(v, errp);
-}
+assert(!error_is_set(errp));
+v-end_struct(v, errp);
 }
 
 void visit_start_list(Visitor *v, const char *name, Error **errp)
@@ -62,9 +61,8 @@ GenericList *visit_next_list(Visitor *v, GenericList **list, 
Error **errp)
 
 void visit_end_list(Visitor *v, Error **errp)
 {
-if (!error_is_set(errp)) {
-v-end_list(v, errp);
-}
+assert(!error_is_set(errp));
+v-end_list(v, errp);
 }
 
 void visit_start_optional(Visitor *v, bool *present, const char *name,
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index 8d4e94a..04ef7c4 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -17,32 +17,49 @@ import os
 import getopt
 import errno
 
-def generate_visit_struct_body(field_prefix, members):
-ret = 
+def generate_visit_struct_body(field_prefix, name, members):
+ret = mcgen('''
+if (!error_is_set(errp)) {
+''')
+push_indent()
+
 if len(field_prefix):
 field_prefix = field_prefix + .
+ret += mcgen('''
+Error **errp = err; /* from outer scope */
+Error *err = NULL;
+visit_start_struct(m, NULL, , %(name)s, 0, err);
+''',
+name=name)
+else:
+ret += mcgen('''
+Error *err = NULL;
+visit_start_struct(m, (void **)obj, %(name)s, name, sizeof(%(name)s), err);
+''',
+name=name)
+
+ret += mcgen('''
+if (!err) {
+if (!obj || *obj) {
+''')
+
+push_indent()
+push_indent()
 for argname, argentry, optional, structured in parse_args(members):
 if optional:
 ret += mcgen('''
-visit_start_optional(m, (obj  *obj) ? (*obj)-%(c_prefix)shas_%(c_name)s : 
NULL, %(name)s, errp);
-if ((*obj)-%(prefix)shas_%(c_name)s) {
+visit_start_optional(m, obj ? (*obj)-%(c_prefix)shas_%(c_name)s : NULL, 
%(name)s, err);
+if (obj  (*obj)-%(prefix)shas_%(c_name)s) {
 ''',
  c_prefix=c_var(field_prefix), prefix=field_prefix,
  c_name=c_var(argname), name=argname)
 push_indent()
 
 if structured:
-ret += mcgen('''
-visit_start_struct(m, NULL, 

Re: [Qemu-devel] [SeaBIOS PATCH 2/2] allow CPUs to have non-contiguous Local APIC IDs (v2)

2012-07-23 Thread Gleb Natapov
On Fri, Jul 20, 2012 at 02:04:50PM -0300, Eduardo Habkost wrote:
 Extract Local APIC IDs directly from the CPUs, and instead of check for
 i  CountCPUs, check if the APIC ID was present on boot, when building
 ACPI tables and the MP-Table.
 
 This keeps ACPI Processor ID == APIC ID, but allows the
 hardware-SeaBIOS interface be completely APIC-ID based and not depend
 on any other kind of CPU identifier. This way, SeaBIOS may change the
 way ACPI Processor IDs are chosen in the future.
 
 As currently SeaBIOS supports only xAPIC and not x2APIC, the list of
 present-on-boot APIC IDs is a 256-bit bitmap. If one day SeaBIOS starts
 to support x2APIC, the data structure used to enumerate the APIC IDs
 will have to be changed (but this is an internal implementation detail,
 not visible to the OS or on any hardware=SeaBIOS interface).
 
 For current QEMU versions (that always make the APIC IDs contiguous),
 the OS-visible behavior and resulting ACPI tables should be exactly the
 same. This patch will simply allow QEMU to start setting non-contiguous
 APIC IDs (that is a requirement for some sockets/cores/threads topology
 settings).
 
 Changes v1 - v2:
  - Use size suffixes on all asm instructions on smp.c
  - New patch description
 
 Signed-off-by: Eduardo Habkost ehabk...@redhat.com
 ---
  src/acpi-dsdt.dsl |4 +++-
  src/acpi.c|9 +
  src/mptable.c |2 +-
  src/smp.c |   17 +
  src/util.h|1 +
  5 files changed, 27 insertions(+), 6 deletions(-)
 
 diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl
 index 2060686..72dc7d8 100644
 --- a/src/acpi-dsdt.dsl
 +++ b/src/acpi-dsdt.dsl
 @@ -676,6 +676,7 @@ DefinitionBlock (
  /* Methods called by run-time generated SSDT Processor objects */
  Method (CPMA, 1, NotSerialized) {
  // _MAT method - create an madt apic buffer
 +// Arg0 = Processor ID = Local APIC ID
  // Local0 = CPON flag for this cpu
  Store(DerefOf(Index(CPON, Arg0)), Local0)
  // Local1 = Buffer (in madt apic form) to return
 @@ -688,6 +689,7 @@ DefinitionBlock (
  }
  Method (CPST, 1, NotSerialized) {
  // _STA method - return ON status of cpu
 +// Arg0 = Processor ID = Local APIC ID
  // Local0 = CPON flag for this cpu
  Store(DerefOf(Index(CPON, Arg0)), Local0)
  If (Local0) { Return(0xF) } Else { Return(0x0) }
 @@ -708,7 +710,7 @@ DefinitionBlock (
  Store (PRS, Local5)
  // Local2 = last read byte from bitmap
  Store (Zero, Local2)
 -// Local0 = cpuid iterator
 +// Local0 = Processor ID / APIC ID iterator
  Store (Zero, Local0)
  While (LLess(Local0, SizeOf(CPON))) {
  // Local1 = CPON flag for this cpu
 diff --git a/src/acpi.c b/src/acpi.c
 index da3bc57..39b7172 100644
 --- a/src/acpi.c
 +++ b/src/acpi.c
 @@ -327,7 +327,7 @@ build_madt(void)
  apic-length = sizeof(*apic);
  apic-processor_id = i;
  apic-local_apic_id = i;
 -if (i  CountCPUs)
 +if (apic_id_is_present(apic-local_apic_id))
  apic-flags = cpu_to_le32(1);
  else
  apic-flags = cpu_to_le32(0);
 @@ -445,6 +445,7 @@ build_ssdt(void)
  }
  
  // build Method(NTFY, 2) {If (LEqual(Arg0, 0x00)) {Notify(CP00, Arg1)} 
 ...}
 +// Arg0 = Processor ID = APIC ID
  *(ssdt_ptr++) = 0x14; // MethodOp
  ssdt_ptr = encodeLen(ssdt_ptr, 2+5+(12*acpi_cpus), 2);
  *(ssdt_ptr++) = 'N';
 @@ -477,7 +478,7 @@ build_ssdt(void)
  ssdt_ptr = encodeLen(ssdt_ptr, 2+1+(1*acpi_cpus), 2);
  *(ssdt_ptr++) = acpi_cpus;
  for (i=0; iacpi_cpus; i++)
 -*(ssdt_ptr++) = (i  CountCPUs) ? 0x01 : 0x00;
 +*(ssdt_ptr++) = (apic_id_is_present(i)) ? 0x01 : 0x00;
  
  // store pci io windows: start, end, length
  // this way we don't have to do the math in the dsdt
 @@ -656,10 +657,10 @@ build_srat(void)
  core-proximity_lo = curnode;
  memset(core-proximity_hi, 0, 3);
  core-local_sapic_eid = 0;
 -if (i  CountCPUs)
 +if (apic_id_is_present(i))
  core-flags = cpu_to_le32(1);
  else
 -core-flags = 0;
 +core-flags = cpu_to_le32(0);
  core++;
  }
  
 diff --git a/src/mptable.c b/src/mptable.c
 index 103f462..9406f98 100644
 --- a/src/mptable.c
 +++ b/src/mptable.c
 @@ -59,7 +59,7 @@ mptable_init(void)
  cpu-apicid = i;
  cpu-apicver = apic_version;
  /* cpu flags: enabled, bootstrap cpu */
 -cpu-cpuflag = ((iCountCPUs) ? 0x01 : 0x00) | ((i==0) ? 0x02 : 
 0x00);
 +cpu-cpuflag = (apic_id_is_present(i) ? 0x01 : 0x00) | ((i==0) ? 
 0x02 : 0x00);
  cpu-cpusignature = cpuid_signature;
  cpu-featureflag = cpuid_features;
  cpu++;
 diff --git a/src/smp.c b/src/smp.c
 index 8c077a1..3c36f8c 

[Qemu-devel] [PULL 00/19] Net patches

2012-07-23 Thread Stefan Hajnoczi
The main piece is Laszlo's OptsVisitor and -net/-netdev parsing conversion.

The following changes since commit 61dc008f3529fa74a63aad1907438dad857e255a:

  Revert audio: Make PC speaker audio card available by default (2012-07-19 
18:25:52 -0500)

are available in the git repository at:

  git://github.com/stefanha/qemu.git net

for you to fetch changes up to 1a0c09583df097d62b0580f9073ba45c9d18351a:

  remove unused QemuOpts parameter from net init functions (2012-07-23 11:55:18 
+0100)


Laszlo Ersek (17):
  qapi: add test case for deallocating traversal of incomplete structure
  qapi: generate C types for fixed-width integers
  qapi: introduce size type
  expose QemuOpt and QemuOpts struct definitions to interested parties
  qapi: introduce OptsVisitor
  qapi schema: remove trailing whitespace
  qapi schema: add Netdev types
  hw, net: net_client_type - NetClientOptionsKind (qapi-generated)
  convert net_client_init() to OptsVisitor
  convert net_init_nic() to NetClientOptions
  convert net_init_dump() to NetClientOptions
  convert net_init_slirp() to NetClientOptions
  convert net_init_socket() to NetClientOptions
  convert net_init_vde() to NetClientOptions
  convert net_init_tap() to NetClientOptions
  convert net_init_bridge() to NetClientOptions
  remove unused QemuOpts parameter from net init functions

Paolo Bonzini (1):
  qapi: fix error propagation

Stefan Hajnoczi (1):
  MAINTAINERS: Replace net maintainer Mark McLoughlin with Stefan Hajnoczi

 MAINTAINERS|3 +-
 docs/qapi-code-gen.txt |2 +
 error.c|3 +-
 error.h|2 +-
 hw/cadence_gem.c   |2 +-
 hw/dp8393x.c   |2 +-
 hw/e1000.c |2 +-
 hw/eepro100.c  |2 +-
 hw/etraxfs_eth.c   |2 +-
 hw/lan9118.c   |2 +-
 hw/lance.c |2 +-
 hw/mcf_fec.c   |2 +-
 hw/milkymist-minimac2.c|2 +-
 hw/mipsnet.c   |2 +-
 hw/musicpal.c  |2 +-
 hw/ne2000-isa.c|2 +-
 hw/ne2000.c|2 +-
 hw/opencores_eth.c |2 +-
 hw/pcnet-pci.c |2 +-
 hw/rtl8139.c   |2 +-
 hw/smc91c111.c |2 +-
 hw/spapr_llan.c|2 +-
 hw/stellaris_enet.c|2 +-
 hw/usb/dev-network.c   |2 +-
 hw/vhost_net.c |2 +-
 hw/virtio-net.c|   10 +-
 hw/xen_nic.c   |2 +-
 hw/xgmac.c |2 +-
 hw/xilinx_axienet.c|2 +-
 hw/xilinx_ethlite.c|2 +-
 net.c  |  494 +++-
 net.h  |   16 +-
 net/dump.c |   24 +-
 net/dump.h |5 +-
 net/slirp.c|   96 +++-
 net/slirp.h|4 +-
 net/socket.c   |  124 --
 net/socket.h   |5 +-
 net/tap-aix.c  |2 +-
 net/tap-bsd.c  |2 +-
 net/tap-haiku.c|2 +-
 net/tap-linux.c|9 +-
 net/tap-solaris.c  |2 +-
 net/tap-win32.c|   14 +-
 net/tap.c  |  152 ++---
 net/tap.h  |   10 +-
 net/vde.c  |   20 +-
 net/vde.h  |5 +-
 qapi-schema.json   |  288 ++-
 qapi/Makefile.objs |2 +-
 qapi/opts-visitor.c|  427 ++
 qapi/opts-visitor.h|   31 +++
 qapi/qapi-visit-core.c |   17 +-
 qapi/qapi-visit-core.h |3 +
 qemu-option-internal.h |   53 +
 qemu-option.c  |   24 +-
 scripts/qapi-visit.py  |  150 +++-
 scripts/qapi.py|6 +
 tests/test-qmp-commands.c  |   42 
 tests/test-qmp-input-visitor.c |   24 +-
 60 files changed, 1352 insertions(+), 771 deletions(-)
 create mode 100644 qapi/opts-visitor.c
 create mode 100644 qapi/opts-visitor.h
 create mode 100644 qemu-option-internal.h

-- 
1.7.10.4




Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Peter Maydell
On 23 July 2012 13:18, Avi Kivity a...@redhat.com wrote:
  While you don't have an irqchip, you do have asynchronous interrupt
 injection, yes?  That's what irqchip really is all about.

This seems an odd point of view -- async interrupt injection
doesn't have anything to do with whether we're modelling
the irqchip in the kernel or in QEMU, I thought...

-- PMM



Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Avi Kivity
On 07/21/2012 11:54 AM, Peter Maydell wrote:
 On 21 July 2012 07:57, Jan Kiszka jan.kis...@web.de wrote:
 On 2012-07-20 21:14, Peter Maydell wrote:
 I'm sure this isn't the only x86ism in the KVM generic source
 files. However the thing I'm specifically trying to do is
 nuke all the uses of kvm_irqchip_in_kernel() in common code,

 No, irqchip in kernel is supposed to be a generic concept. We will
 also have it on Power. Not sure what your plans are for ARM, maybe it
 will always be true there.
 
 I agree that irqchip in kernel? is generic (though as you'll see
 below there's disagreement about what that ought to mean or imply).
 irq0_override though seems to me to be absolutely x86 specific.
 
 That said, maybe there is room for discussion about what it means for
 the general KVM code and its users if the irqchip is in the kernel. Two
 things that should be common for every arch:
  - VCPU idle management is done inside the kernel
 
 The trouble is that at the moment QEMU assumes that is the
 irqchip in kernel? == is VCPU idle management in kernel, for
 instance. For ARM, VCPU idle management is in kernel whether
 we're using the kernel's model of the VGIC or not. Alex tells
 me PPC is the same way. It's only x86 that has tied these two
 concepts together.

Really, irqchip in kernel means asynchronous interrupts - you can
inject an interrupt from outside the vcpu thread.  Obviously if the vcpu
is sleeping you need to wake it up and that pulls in idle management.

irqchip for x86 really means the local APIC, which has a synchronous
interface with the cpu core.  local APIC in kernel really is
equivalent to kernel idle management, KVM_IRQ_LINE, and irqfd.  The
ioapic and pit, on the other hand, don't contribute anything to this
(just performance).

So yes, ARM with and without GIC are irqchip_in_kernel, since the
ARM-GIC interface is asynchronous.  Whether to emulate the GIC or not
is just a performance question.

 The reason I want to get rid of common-code uses of kvm_irqchip_in_kernel()
 is because I think they're all similar to this -- the common code is
 using the check as a proxy for something else, and it should be directly
 asking about that something else. The only bits of code that should
 care about is the irqchip in kernel? are:
  * target-specific device/machine setup code which needs to know
which apic/etc to instantiate
  * target-specific x86 code which has this weird synchronous IRQ
delivery model for irqchip-not-in-kernel
 (Obviously I might have missed something, I'm flailing around
 trying to understand this code :-))

Agree naming should be improved.  In fact the early series I pushed to
decompose local apic, ioapic, and pic, but that didn't happen.  If it
did we'd probably not have this conversation.


-- 
error compiling committee.c: too many arguments to function





Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Avi Kivity
On 07/23/2012 03:25 PM, Peter Maydell wrote:
 On 23 July 2012 13:18, Avi Kivity a...@redhat.com wrote:
  While you don't have an irqchip, you do have asynchronous interrupt
 injection, yes?  That's what irqchip really is all about.
 
 This seems an odd point of view -- async interrupt injection
 doesn't have anything to do with whether we're modelling
 the irqchip in the kernel or in QEMU, I thought...

It does on x86.  The relationship between the APIC and the core is
synchronous - the APIC presents the interrupt, the core grabs is when it
is ready (interrupt flag, etc.) and signals the APIC it has done so.
The APIC then clears the interrupt from the interrupt request register
and sets it in the interrupt status register.  This sequence has to be
done while the vcpu is stopped, since we don't have access to the
interrupt flag otherwise.

-- 
error compiling committee.c: too many arguments to function





[Qemu-devel] [PATCH 16/19] convert net_init_vde() to NetClientOptions

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

v1-v2:
- NetdevVdeOptions::port and ::mode are of type uint16. Remove superfluous
  range checks.

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 net/vde.c |   17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/net/vde.c b/net/vde.c
index 8e60f68..703888c 100644
--- a/net/vde.c
+++ b/net/vde.c
@@ -110,20 +110,17 @@ static int net_vde_init(VLANState *vlan, const char 
*model,
 return 0;
 }
 
-int net_init_vde(QemuOpts *opts, const NetClientOptions *new_opts,
+int net_init_vde(QemuOpts *old_opts, const NetClientOptions *opts,
  const char *name, VLANState *vlan)
 {
-const char *sock;
-const char *group;
-int port, mode;
+const NetdevVdeOptions *vde;
 
-sock  = qemu_opt_get(opts, sock);
-group = qemu_opt_get(opts, group);
+assert(opts-kind == NET_CLIENT_OPTIONS_KIND_VDE);
+vde = opts-vde;
 
-port = qemu_opt_get_number(opts, port, 0);
-mode = qemu_opt_get_number(opts, mode, 0700);
-
-if (net_vde_init(vlan, vde, name, sock, port, group, mode) == -1) {
+/* missing optional values have been initialized to all bits zero */
+if (net_vde_init(vlan, vde, name, vde-sock, vde-port, vde-group,
+ vde-has_mode ? vde-mode : 0700) == -1) {
 return -1;
 }
 
-- 
1.7.10.4




Re: [Qemu-devel] [PATCH] pc: Fix max_cpus

2012-07-23 Thread Dunrong Huang
2012/7/23 Andreas Färber afaer...@suse.de:
 Am 23.07.2012 12:47, schrieb riegama...@gmail.com:
 From: Dunrong Huang riegama...@gmail.com

 The VCPU count limit in kernel now is 254, defined by KVM_MAX_VCPUS
 in kernel's header files. But the count limit in QEMU is 255,
 so QEMU will failed to start if user passes -enable-kvm and -smp 255
 to it.

 This patch intruduces a Macro MAX_VCPUS whose value is KVM_MAX_VCPUS
 if CONFIG_KVM is defined. If user do not use kvm, set it's value to 255.

 Signed-off-by: Dunrong Huang riegama...@gmail.com
 ---
  hw/pc_piix.c |   28 +++-
  1 files changed, 19 insertions(+), 9 deletions(-)

 diff --git a/hw/pc_piix.c b/hw/pc_piix.c
 index 0c0096f..49cda51 100644
 --- a/hw/pc_piix.c
 +++ b/hw/pc_piix.c
 @@ -49,6 +49,16 @@

  #define MAX_IDE_BUS 2

 +#ifndef KVM_MAX_VCPUS
 +#define KVM_MAX_VCPUS 254
 +#endif
 +
 +#ifdef CONFIG_KVM
 +#define MAX_VCPUS KVM_MAX_VCPUS
 +#else
 +#define MAX_VCPUS 255
 +#endif
 +
  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
  static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
 @@ -354,7 +364,7 @@ static QEMUMachine pc_machine_v1_2 = {
  .alias = pc,
  .desc = Standard PC,
  .init = pc_init_pci,
 -.max_cpus = 255,
 +.max_cpus = MAX_VCPUS,
  .is_default = 1,
  };

 [snip]

 This is not so ideal: -enable-kvm is a runtime switch whereas you are
 changing a compile-time limit here. Any chance to change the runtime
 usage of .max_cpus instead? Possibly introducing a helper function?


Do you mean do some hacks in smp_parse?
I agree with you, there has codes for checking max_cpus in runtime:
if (max_cpus  255) { // Should be changed to 254 if enable kvm.
fprintf(stderr, Unsupported number of maxcpus\n);
exit(1);
}

I think we also need to change the hard coded value of .max_cpus from 255
to macro which should be synchronized with KVM_MAX_VCPUS or something else.
 Andreas

 --
 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg





-- 
Best Regards,

Dunrong Huang



Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Avi Kivity
On 07/23/2012 03:31 PM, Avi Kivity wrote:
 On 07/23/2012 03:25 PM, Peter Maydell wrote:
 On 23 July 2012 13:18, Avi Kivity a...@redhat.com wrote:
  While you don't have an irqchip, you do have asynchronous interrupt
 injection, yes?  That's what irqchip really is all about.
 
 This seems an odd point of view -- async interrupt injection
 doesn't have anything to do with whether we're modelling
 the irqchip in the kernel or in QEMU, I thought...
 
 It does on x86.  The relationship between the APIC and the core is
 synchronous - the APIC presents the interrupt, the core grabs is when it
 is ready (interrupt flag, etc.) and signals the APIC it has done so.
 The APIC then clears the interrupt from the interrupt request register
 and sets it in the interrupt status register.  This sequence has to be
 done while the vcpu is stopped, since we don't have access to the
 interrupt flag otherwise.

Again, this is just the local APIC.  The IOAPIC (which is x86 equivalent
to the GIC, IIUC) doesn't change this picture and could have been
emulated in userspace even with async interrupts.  As it happens local
APIC emulation pulls in the IOAPIC and PIC.

So my view is that ARM with and without kernel GIC are
irqchip_in_kernel, since whatever is the local APIC in ARM is always
emulated in the kernel.

-- 
error compiling committee.c: too many arguments to function





[Qemu-devel] [PATCH 11/19] convert net_client_init() to OptsVisitor

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

The net_client_init() prototype is kept intact.

Based on is_netdev, the QemuOpts-rooted QemuOpt-list is parsed as a
Netdev or a NetLegacy. The original meat of net_client_init() is moved to
and simplified in net_client_init1():

Fields not common between -net and -netdev are clearly separated. Getting
the name for the init functions is cleaner: Netdev::id is mandatory, and
all init functions handle a NULL NetLegacy::name. NetLegacy::vlan
explicitly depends on -net (see below).

Verifying the type= option for -netdev can be turned into a switch.

Format validation with qemu_opts_validate() can be removed because the
visitor covers it. Relatedly, the net_client_types array is reduced to
an array of init functions that can be directly indexed by opts-kind.
(Help text is available in the schema JSON.)

The outermost negation in the condition around qemu_find_vlan() was
flattened, because it expresses the dependent code's requirements more
clearly.

VLAN lookup is avoided if there's no init function to pass the VLAN to.

Whenever the value of type=... is needed, we substitute
NetClientOptionsKind_lookup[kind].

The individual init functions are not converted yet, thus the original
QemuOpts instance is passed transparently.

v1-v2:
- NetLegacy::name is optional. Tracked it through all init functions: they
  all handle a NULL name. Updated commit message accordingly.

v2-v3:
- NetLegacy::id is allowed and takes precedence over NetLegacy::name.

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 net.c   |  429 +--
 net/dump.c  |3 +-
 net/dump.h  |4 +-
 net/slirp.c |3 +-
 net/slirp.h |4 +-
 net/socket.c|3 +-
 net/socket.h|4 +-
 net/tap-win32.c |3 +-
 net/tap.c   |6 +-
 net/tap.h   |7 +-
 net/vde.c   |3 +-
 net/vde.h   |4 +-
 12 files changed, 127 insertions(+), 346 deletions(-)

diff --git a/net.c b/net.c
index c46695f..af544b2 100644
--- a/net.c
+++ b/net.c
@@ -37,6 +37,9 @@
 #include qmp-commands.h
 #include hw/qdev.h
 #include iov.h
+#include qapi-visit.h
+#include qapi/opts-visitor.h
+#include qapi/qapi-dealloc-visitor.h
 
 /* Net bridge is currently not supported for W32. */
 #if !defined(_WIN32)
@@ -745,7 +748,8 @@ int net_handle_fd_param(Monitor *mon, const char *param)
 return fd;
 }
 
-static int net_init_nic(QemuOpts *opts, const char *name, VLANState *vlan)
+static int net_init_nic(QemuOpts *opts, const NetClientOptions *new_opts,
+const char *name, VLANState *vlan)
 {
 int idx;
 NICInfo *nd;
@@ -802,371 +806,130 @@ static int net_init_nic(QemuOpts *opts, const char 
*name, VLANState *vlan)
 return idx;
 }
 
-#define NET_COMMON_PARAMS_DESC \
-{  \
-.name = type,\
-.type = QEMU_OPT_STRING,   \
-.help = net client type (nic, tap etc.), \
- }, {  \
-.name = vlan,\
-.type = QEMU_OPT_NUMBER,   \
-.help = vlan number, \
- }, {  \
-.name = name,\
-.type = QEMU_OPT_STRING,   \
-.help = identifier for monitor commands, \
- }
-
-typedef int (*net_client_init_func)(QemuOpts *opts,
-const char *name,
-VLANState *vlan);
-
-/* magic number, but compiler will warn if too small */
-#define NET_MAX_DESC 20
-
-static const struct {
-const char *type;
-net_client_init_func init;
-QemuOptDesc desc[NET_MAX_DESC];
-} net_client_types[NET_CLIENT_OPTIONS_KIND_MAX] = {
-[NET_CLIENT_OPTIONS_KIND_NONE] = {
-.type = none,
-.desc = {
-NET_COMMON_PARAMS_DESC,
-{ /* end of list */ }
-},
-},
-[NET_CLIENT_OPTIONS_KIND_NIC] = {
-.type = nic,
-.init = net_init_nic,
-.desc = {
-NET_COMMON_PARAMS_DESC,
-{
-.name = netdev,
-.type = QEMU_OPT_STRING,
-.help = id of -netdev to connect to,
-},
-{
-.name = macaddr,
-.type = QEMU_OPT_STRING,
-.help = MAC address,
-}, {
-.name = model,
-.type = QEMU_OPT_STRING,
-.help = device model (e1000, rtl8139, virtio etc.),
-}, {
-.name = addr,
-.type = QEMU_OPT_STRING,
-.help = PCI device address,
-}, {
-.name = vectors,
-.type = QEMU_OPT_NUMBER,
-.help = number of MSI-x 

Re: [Qemu-devel] [PATCH] check for available room when formatting OpenFirmware device path

2012-07-23 Thread Peter Maydell
On 23 July 2012 12:33, Laszlo Ersek ler...@redhat.com wrote:

 Signed-off-by: Laszlo Ersek ler...@redhat.com

I think it would be much nicer to just rewrite qdev_get_fw_dev_path
so we weren't trying to fill the path into a fixed string buffer
at all. Here is an entirely untested implementation:

char *qdev_get_fw_dev_path(DeviceState *dev)
{
char *path;
char **strarray;
int depth = 0;
DeviceState *d = dev;
for (d = dev; d  d-parent_bus; d = d-parent_bus-parent) {
depth++;
}
depth++;
strarray = g_new(char*, depth);
for (d = dev; d  d-parent_bus; d = d-parent_bus-parent) {
depth--;
strarray[depth] = bus_get_fw_dev_path(dev-parent_bus, dev);
if (!strarray[depth]) {
strarray[depth] = g_strdup(object_get_typename(OBJECT(dev)));
}
}
strarray[0] = g_strdup();
path = g_strjoinv(/, strarray);
g_strfreev(strarray);
return path;
}

Bonus extra patch: check that all the implementations of get_fw_dev_path()
are returning a g_malloc'd string rather than a plain malloc'd one
(both this code and the current implementation assume so but at least
the scsi bus implementation doesn't use the glib functions.)

-- PMM



[Qemu-devel] [PATCH 19/19] remove unused QemuOpts parameter from net init functions

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

v1-v2:
- unchanged

v2-v3:
- keep qemu-option.h included in net/slirp.h

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 net.c   |   14 ++
 net/dump.c  |4 ++--
 net/dump.h  |5 ++---
 net/slirp.c |4 ++--
 net/slirp.h |4 ++--
 net/socket.c|4 ++--
 net/socket.h|5 ++---
 net/tap-win32.c |4 ++--
 net/tap.c   |8 
 net/tap.h   |9 -
 net/vde.c   |4 ++--
 net/vde.h   |5 ++---
 12 files changed, 32 insertions(+), 38 deletions(-)

diff --git a/net.c b/net.c
index a62e902..dbca77b 100644
--- a/net.c
+++ b/net.c
@@ -748,8 +748,8 @@ int net_handle_fd_param(Monitor *mon, const char *param)
 return fd;
 }
 
-static int net_init_nic(QemuOpts *old_opts, const NetClientOptions *opts,
-const char *name, VLANState *vlan)
+static int net_init_nic(const NetClientOptions *opts, const char *name,
+VLANState *vlan)
 {
 int idx;
 NICInfo *nd;
@@ -813,8 +813,7 @@ static int net_init_nic(QemuOpts *old_opts, const 
NetClientOptions *opts,
 
 
 static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
-QemuOpts *old_opts,
-const NetClientOptions *new_opts,
+const NetClientOptions *opts,
 const char *name,
 VLANState *vlan) = {
 [NET_CLIENT_OPTIONS_KIND_NIC]= net_init_nic,
@@ -833,8 +832,7 @@ static int (* const 
net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
 };
 
 
-static int net_client_init1(const void *object, int is_netdev,
-QemuOpts *old_opts, Error **errp)
+static int net_client_init1(const void *object, int is_netdev, Error **errp)
 {
 union {
 const Netdev*netdev;
@@ -885,7 +883,7 @@ static int net_client_init1(const void *object, int 
is_netdev,
 vlan = qemu_find_vlan(u.net-has_vlan ? u.net-vlan : 0, true);
 }
 
-if (net_client_init_fun[opts-kind](old_opts, opts, name, vlan)  0) {
+if (net_client_init_fun[opts-kind](opts, name, vlan)  0) {
 /* TODO push error reporting into init() methods */
 error_set(errp, QERR_DEVICE_INIT_FAILED,
   NetClientOptionsKind_lookup[opts-kind]);
@@ -920,7 +918,7 @@ int net_client_init(QemuOpts *opts, int is_netdev, Error 
**errp)
 }
 
 if (!err) {
-ret = net_client_init1(object, is_netdev, opts, err);
+ret = net_client_init1(object, is_netdev, err);
 }
 
 if (object) {
diff --git a/net/dump.c b/net/dump.c
index f3d2fa9..b575430 100644
--- a/net/dump.c
+++ b/net/dump.c
@@ -144,8 +144,8 @@ static int net_dump_init(VLANState *vlan, const char 
*device,
 return 0;
 }
 
-int net_init_dump(QemuOpts *old_opts, const NetClientOptions *opts,
-  const char *name, VLANState *vlan)
+int net_init_dump(const NetClientOptions *opts, const char *name,
+  VLANState *vlan)
 {
 int len;
 const char *file;
diff --git a/net/dump.h b/net/dump.h
index 85ac00b..0fa2dd7 100644
--- a/net/dump.h
+++ b/net/dump.h
@@ -25,10 +25,9 @@
 #define QEMU_NET_DUMP_H
 
 #include net.h
-#include qemu-common.h
 #include qapi-types.h
 
-int net_init_dump(QemuOpts *opts, const NetClientOptions *new_opts,
-  const char *name, VLANState *vlan);
+int net_init_dump(const NetClientOptions *opts, const char *name,
+  VLANState *vlan);
 
 #endif /* QEMU_NET_DUMP_H */
diff --git a/net/slirp.c b/net/slirp.c
index 44b059f..5c2e6b2 100644
--- a/net/slirp.c
+++ b/net/slirp.c
@@ -702,8 +702,8 @@ net_init_slirp_configs(const StringList *fwd, int flags)
 }
 }
 
-int net_init_slirp(QemuOpts *old_opts, const NetClientOptions *opts,
-   const char *name, VLANState *vlan)
+int net_init_slirp(const NetClientOptions *opts, const char *name,
+   VLANState *vlan)
 {
 struct slirp_config_str *config;
 char *vnet;
diff --git a/net/slirp.h b/net/slirp.h
index ef13a65..e2c71ee 100644
--- a/net/slirp.h
+++ b/net/slirp.h
@@ -31,8 +31,8 @@
 
 #ifdef CONFIG_SLIRP
 
-int net_init_slirp(QemuOpts *opts, const NetClientOptions *new_opts,
-   const char *name, VLANState *vlan);
+int net_init_slirp(const NetClientOptions *opts, const char *name,
+   VLANState *vlan);
 
 void net_slirp_hostfwd_add(Monitor *mon, const QDict *qdict);
 void net_slirp_hostfwd_remove(Monitor *mon, const QDict *qdict);
diff --git a/net/socket.c b/net/socket.c
index e3cba20..600c287 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -586,8 +586,8 @@ static int net_socket_udp_init(VLANState *vlan,
 return 0;
 }
 
-int net_init_socket(QemuOpts *old_opts, const NetClientOptions *opts,
-const char *name, VLANState *vlan)
+int net_init_socket(const NetClientOptions *opts, const char *name,
+VLANState *vlan)
 {
 const 

[Qemu-devel] [PATCH 10/19] hw, net: net_client_type - NetClientOptionsKind (qapi-generated)

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

NET_CLIENT_TYPE_ - NET_CLIENT_OPTIONS_KIND_

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 hw/cadence_gem.c|2 +-
 hw/dp8393x.c|2 +-
 hw/e1000.c  |2 +-
 hw/eepro100.c   |2 +-
 hw/etraxfs_eth.c|2 +-
 hw/lan9118.c|2 +-
 hw/lance.c  |2 +-
 hw/mcf_fec.c|2 +-
 hw/milkymist-minimac2.c |2 +-
 hw/mipsnet.c|2 +-
 hw/musicpal.c   |2 +-
 hw/ne2000-isa.c |2 +-
 hw/ne2000.c |2 +-
 hw/opencores_eth.c  |2 +-
 hw/pcnet-pci.c  |2 +-
 hw/rtl8139.c|2 +-
 hw/smc91c111.c  |2 +-
 hw/spapr_llan.c |2 +-
 hw/stellaris_enet.c |2 +-
 hw/usb/dev-network.c|2 +-
 hw/vhost_net.c  |2 +-
 hw/virtio-net.c |   10 +-
 hw/xen_nic.c|2 +-
 hw/xgmac.c  |2 +-
 hw/xilinx_axienet.c |2 +-
 hw/xilinx_ethlite.c |2 +-
 net.c   |   50 +++
 net.h   |   16 ++-
 net/dump.c  |2 +-
 net/slirp.c |2 +-
 net/socket.c|4 ++--
 net/tap-win32.c |2 +-
 net/tap.c   |   16 +++
 net/vde.c   |2 +-
 34 files changed, 71 insertions(+), 83 deletions(-)

diff --git a/hw/cadence_gem.c b/hw/cadence_gem.c
index 87143ca..a0f51de 100644
--- a/hw/cadence_gem.c
+++ b/hw/cadence_gem.c
@@ -1161,7 +1161,7 @@ static void gem_set_link(VLANClientState *nc)
 }
 
 static NetClientInfo net_gem_info = {
-.type = NET_CLIENT_TYPE_NIC,
+.type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
 .can_receive = gem_can_receive,
 .receive = gem_receive,
diff --git a/hw/dp8393x.c b/hw/dp8393x.c
index 017d074..756d630 100644
--- a/hw/dp8393x.c
+++ b/hw/dp8393x.c
@@ -872,7 +872,7 @@ static void nic_cleanup(VLANClientState *nc)
 }
 
 static NetClientInfo net_dp83932_info = {
-.type = NET_CLIENT_TYPE_NIC,
+.type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
 .can_receive = nic_can_receive,
 .receive = nic_receive,
diff --git a/hw/e1000.c b/hw/e1000.c
index 4573f13..ad24298 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -1206,7 +1206,7 @@ pci_e1000_uninit(PCIDevice *dev)
 }
 
 static NetClientInfo net_e1000_info = {
-.type = NET_CLIENT_TYPE_NIC,
+.type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
 .can_receive = e1000_can_receive,
 .receive = e1000_receive,
diff --git a/hw/eepro100.c b/hw/eepro100.c
index 6279ae3..f343685 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -1845,7 +1845,7 @@ static int pci_nic_uninit(PCIDevice *pci_dev)
 }
 
 static NetClientInfo net_eepro100_info = {
-.type = NET_CLIENT_TYPE_NIC,
+.type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
 .can_receive = nic_can_receive,
 .receive = nic_receive,
diff --git a/hw/etraxfs_eth.c b/hw/etraxfs_eth.c
index 16a0637..45fb40c 100644
--- a/hw/etraxfs_eth.c
+++ b/hw/etraxfs_eth.c
@@ -579,7 +579,7 @@ static void eth_cleanup(VLANClientState *nc)
 }
 
 static NetClientInfo net_etraxfs_info = {
-   .type = NET_CLIENT_TYPE_NIC,
+   .type = NET_CLIENT_OPTIONS_KIND_NIC,
.size = sizeof(NICState),
.can_receive = eth_can_receive,
.receive = eth_receive,
diff --git a/hw/lan9118.c b/hw/lan9118.c
index 7b4fe87..40fb765 100644
--- a/hw/lan9118.c
+++ b/hw/lan9118.c
@@ -1310,7 +1310,7 @@ static void lan9118_cleanup(VLANClientState *nc)
 }
 
 static NetClientInfo net_lan9118_info = {
-.type = NET_CLIENT_TYPE_NIC,
+.type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
 .can_receive = lan9118_can_receive,
 .receive = lan9118_receive,
diff --git a/hw/lance.c b/hw/lance.c
index ce3d46c..91c0e16 100644
--- a/hw/lance.c
+++ b/hw/lance.c
@@ -93,7 +93,7 @@ static void lance_cleanup(VLANClientState *nc)
 }
 
 static NetClientInfo net_lance_info = {
-.type = NET_CLIENT_TYPE_NIC,
+.type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
 .can_receive = pcnet_can_receive,
 .receive = pcnet_receive,
diff --git a/hw/mcf_fec.c b/hw/mcf_fec.c
index ae37bef..4ab4ff5 100644
--- a/hw/mcf_fec.c
+++ b/hw/mcf_fec.c
@@ -450,7 +450,7 @@ static void mcf_fec_cleanup(VLANClientState *nc)
 }
 
 static NetClientInfo net_mcf_fec_info = {
-.type = NET_CLIENT_TYPE_NIC,
+.type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
 .can_receive = mcf_fec_can_receive,
 .receive = mcf_fec_receive,
diff --git a/hw/milkymist-minimac2.c b/hw/milkymist-minimac2.c
index 70bf336..3924b83 100644
--- a/hw/milkymist-minimac2.c
+++ b/hw/milkymist-minimac2.c
@@ -448,7 +448,7 @@ static void milkymist_minimac2_reset(DeviceState *d)
 }
 
 static NetClientInfo 

[Qemu-devel] [PATCH v5 0/3] refactor PC machine, i440fx and piix3 to take advantage of QOM

2012-07-23 Thread Wanpeng Li
This series aggressively refactors the PC machine initialization to be more
modelled and less ad-hoc.  The highlights of this series are:

1) Things like -m and -bios-name are now device model properties

2) The i440fx and piix3 are now modelled in a thorough fashion

3) i440fx_init is trivialized to creating devices and setting properties

4) convert PCI host bridge to QOM

The point (3) is the most important one.  As we refactor in this fashion,
we should quickly get to the point where machine-init disappears completely in
favor of just creating a handful of devices.

The two stage initialization of QOM is important here.  instance_init() is when
composed devices are created which means that after you've created a device, all
of its children are visible in the device model.  This lets you set properties
of the parent and its children.

realize() (which is still called DeviceState::init today) will be called right
before the guest starts up for the first time.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com

Change in v5:
* drop patch convert MemoryRegion to QOM and 
prepare to create HPET, RTC and i8254 through composition
* add Andreas' recent attempt against pci_host

Change in v4:

*rebase patchset

Changes in v3:

* fix coding style issues
* fix rebase error
* add changes log

Changes in v2:

* Rebase patch series of i440fx in Anthony's qom-rebase.12 branch to upstream
* convert MemoryRegion to QOM
* convert pci_host to QOM


Anthony Liguori (3):
  eliminate piix_pci.c and module i440fx and piix3
  merge pc_piix.c to pc.c
  convert pci-host to QOM

 hw/i386/Makefile.objs |3 +-
 hw/i440fx.c   |  434 ++
 hw/i440fx.h   |   77 ++
 hw/pc.c   |  695 +
 hw/pc.h   |   46 +---
 hw/pc_piix.c  |  661 --
 hw/pci_host.c |   14 +
 hw/pci_host.h |2 +
 hw/piix3.c|  234 +
 hw/piix3.h|   69 +
 hw/piix_pci.c |  599 --
 11 files changed, 1481 insertions(+), 1353 deletions(-)
 create mode 100644 hw/i440fx.c
 create mode 100644 hw/i440fx.h
 delete mode 100644 hw/pc_piix.c
 create mode 100644 hw/piix3.c
 create mode 100644 hw/piix3.h
 delete mode 100644 hw/piix_pci.c

-- 
1.7.7.6




[Qemu-devel] [PATCH v5 3/3] convert pci-host to QOM

2012-07-23 Thread Wanpeng Li
From: Anthony Liguori aligu...@us.ibm.com

makes pci_host a proper QOM type.

Changelog:
* against Andreas pci_host branch
* make host bridge TypeInfos const
* use PCI_HOST_BRIDGE() where appropriate 

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com

---
 hw/i440fx.c   |6 +++---
 hw/pc.c   |2 +-
 hw/pci_host.c |   14 ++
 hw/pci_host.h |2 ++
 hw/piix3.c|4 ++--
 5 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/hw/i440fx.c b/hw/i440fx.c
index 720a25a..fdf040b 100644
--- a/hw/i440fx.c
+++ b/hw/i440fx.c
@@ -191,7 +191,7 @@ static const VMStateDescription vmstate_i440fx_pmc = {
 static int i440fx_realize(SysBusDevice *dev)
 {
 I440FXState *s = I440FX(dev);
-PCIHostState *h = PCI_HOST(s);
+PCIHostState *h = PCI_HOST_BRIDGE(s);
 int bios_size, isa_bios_size;
 char *filename;
 int ret;
@@ -401,7 +401,7 @@ static void i440fx_pmc_class_init(ObjectClass *klass, void 
*data)
 dc-vmsd = vmstate_i440fx_pmc;
 }
 
-static TypeInfo i440fx_pmc_info = {
+static const TypeInfo i440fx_pmc_info = {
 .name  = TYPE_I440FX_PMC,
 .parent= TYPE_PCI_DEVICE,
 .instance_size = sizeof(I440FXPMCState),
@@ -418,7 +418,7 @@ static void i440fx_class_init(ObjectClass *klass, void 
*data)
 dc-no_user = 1;
 }
 
-static TypeInfo i440fx_info = {
+static const TypeInfo i440fx_info = {
 .name  = TYPE_I440FX,
 .parent= TYPE_PCI_HOST_BRIDGE,
 .instance_size = sizeof(I440FXState),
diff --git a/hw/pc.c b/hw/pc.c
index d9a0443..f095109 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1217,7 +1217,7 @@ static PCIBus *i440fx_init(I440FXPMCState 
**pi440fx_state, int *piix3_devfn,
 PCIHostState *h;
 
 s = I440FX(object_new(TYPE_I440FX));
-h = PCI_HOST(s);
+h = PCI_HOST_BRIDGE(s);
 
 /* FIXME make a properties */
 h-address_space = address_space_mem;
diff --git a/hw/pci_host.c b/hw/pci_host.c
index 3950e94..4e10042 100644
--- a/hw/pci_host.c
+++ b/hw/pci_host.c
@@ -165,11 +165,25 @@ const MemoryRegionOps pci_host_data_be_ops = {
 .endianness = DEVICE_BIG_ENDIAN,
 };
 
+void pci_host_set_mmio(PCIHostState *s, MemoryRegion *value)
+{
+object_property_set_link(OBJECT(s), OBJECT(value), mmio, NULL);
+}
+
+static void pci_host_initfn(Object *obj)
+{
+PCIHostState *s = PCI_HOST_BRIDGE(obj);
+
+object_property_add_link(obj, mmio, memory-region,
+(Object **)s-address_space, NULL);
+}
+
 static const TypeInfo pci_host_type_info = {
 .name = TYPE_PCI_HOST_BRIDGE,
 .parent = TYPE_SYS_BUS_DEVICE,
 .abstract = true,
 .instance_size = sizeof(PCIHostState),
+.instance_init = pci_host_initfn,
 };
 
 static void pci_host_register_types(void)
diff --git a/hw/pci_host.h b/hw/pci_host.h
index 4b9c300..9f28728 100644
--- a/hw/pci_host.h
+++ b/hw/pci_host.h
@@ -54,6 +54,8 @@ uint32_t pci_host_config_read_common(PCIDevice *pci_dev, 
uint32_t addr,
 void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len);
 uint32_t pci_data_read(PCIBus *s, uint32_t addr, int len);
 
+void pci_host_set_mmio(PCIHostState *s, MemoryRegion *value);
+
 extern const MemoryRegionOps pci_host_conf_le_ops;
 extern const MemoryRegionOps pci_host_conf_be_ops;
 extern const MemoryRegionOps pci_host_data_le_ops;
diff --git a/hw/piix3.c b/hw/piix3.c
index eca6ec8..3b69b15 100644
--- a/hw/piix3.c
+++ b/hw/piix3.c
@@ -204,7 +204,7 @@ static void piix3_class_init(ObjectClass *klass, void *data)
 k-class_id = PCI_CLASS_BRIDGE_ISA;
 }
 
-static TypeInfo piix3_info = {
+static const TypeInfo piix3_info = {
 .name  = TYPE_PIIX3,
 .parent= TYPE_PCI_DEVICE,
 .instance_size = sizeof(PIIX3State),
@@ -219,7 +219,7 @@ static void piix3_xen_class_init(ObjectClass *klass, void 
*data)
 k-config_write = piix3_write_config_xen;
 };
 
-static TypeInfo piix3_xen_info = {
+static const TypeInfo piix3_xen_info = {
 .name  = PIIX3-xen,
 .parent= TYPE_PIIX3,
 .instance_size = sizeof(PIIX3State),
-- 
1.7.5.4




[Qemu-devel] [PATCH v5 2/3] merge pc_piix.c to pc.c

2012-07-23 Thread Wanpeng Li
From: Anthony Liguori aligu...@us.ibm.com

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com

---
 hw/i386/Makefile.objs |1 -
 hw/pc.c   |  695 +
 hw/pc.h   |   46 +---
 hw/pc_piix.c  |  661 --
 4 files changed, 650 insertions(+), 753 deletions(-)
 delete mode 100644 hw/pc_piix.c

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 49b32d0..868020c 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -4,7 +4,6 @@ obj-y += sga.o ioapic_common.o ioapic.o i440fx.o piix3.o
 obj-y += vmport.o
 obj-y += pci-hotplug.o smbios.o wdt_ib700.o
 obj-y += debugcon.o multiboot.o
-obj-y += pc_piix.o
 obj-y += pc_sysfw.o
 obj-$(CONFIG_XEN) += xen_platform.o xen_apic.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
diff --git a/hw/pc.c b/hw/pc.c
index 598267a..d9a0443 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -27,6 +27,7 @@
 #include fdc.h
 #include ide.h
 #include pci.h
+#include usb.h
 #include vmware_vga.h
 #include monitor.h
 #include fw_cfg.h
@@ -48,7 +49,10 @@
 #include ui/qemu-spice.h
 #include memory.h
 #include exec-memory.h
+#include kvm/clock.h
 #include arch_init.h
+#include smbus.h
+#include boards.h
 
 /* output Bochs bios info messages */
 //#define DEBUG_BIOS
@@ -76,6 +80,8 @@
 
 #define E820_NR_ENTRIES16
 
+#define MAX_IDE_BUS 2
+
 struct e820_entry {
 uint64_t address;
 uint64_t length;
@@ -87,10 +93,14 @@ struct e820_table {
 struct e820_entry entry[E820_NR_ENTRIES];
 } QEMU_PACKED __attribute((__aligned__(4)));
 
+static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
+static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
+static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
+
 static struct e820_table e820_table;
 struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 
-void gsi_handler(void *opaque, int n, int level)
+static void gsi_handler(void *opaque, int n, int level)
 {
 GSIState *s = opaque;
 
@@ -108,7 +118,7 @@ static void ioport80_write(void *opaque, uint32_t addr, 
uint32_t data)
 /* MSDOS compatibility mode FPU exception support */
 static qemu_irq ferr_irq;
 
-void pc_register_ferr_irq(qemu_irq irq)
+static void pc_register_ferr_irq(qemu_irq irq)
 {
 ferr_irq = irq;
 }
@@ -323,7 +333,7 @@ static void pc_cmos_init_late(void *opaque)
 qemu_unregister_reset(pc_cmos_init_late, opaque);
 }
 
-void pc_cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
+static void pc_cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
   const char *boot_device,
   ISADevice *floppy, BusState *idebus0, BusState *idebus1,
   ISADevice *s)
@@ -846,7 +856,7 @@ static const int ne2000_irq[NE2000_NB_MAX] = { 9, 10, 11, 
3, 4, 5 };
 static const int parallel_io[MAX_PARALLEL_PORTS] = { 0x378, 0x278, 0x3bc };
 static const int parallel_irq[MAX_PARALLEL_PORTS] = { 7, 7, 7 };
 
-void pc_init_ne2k_isa(ISABus *bus, NICInfo *nd)
+static void pc_init_ne2k_isa(ISABus *bus, NICInfo *nd)
 {
 static int nb_ne2k = 0;
 
@@ -901,7 +911,7 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
 return dev;
 }
 
-void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
+static void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
 {
 CPUX86State *s = opaque;
 
@@ -938,7 +948,7 @@ static X86CPU *pc_new_cpu(const char *cpu_model)
 return cpu;
 }
 
-void pc_cpus_init(const char *cpu_model)
+static void pc_cpus_init(const char *cpu_model)
 {
 int i;
 
@@ -956,55 +966,18 @@ void pc_cpus_init(const char *cpu_model)
 }
 }
 
-void *pc_memory_init(MemoryRegion *system_memory,
+static void *pc_memory_init(MemoryRegion *system_memory,
 const char *kernel_filename,
 const char *kernel_cmdline,
 const char *initrd_filename,
 ram_addr_t below_4g_mem_size,
-ram_addr_t above_4g_mem_size,
-MemoryRegion *rom_memory,
-MemoryRegion **ram_memory)
+ram_addr_t above_4g_mem_size)
 {
 int linux_boot, i;
-MemoryRegion *ram, *option_rom_mr;
-MemoryRegion *ram_below_4g, *ram_above_4g;
 void *fw_cfg;
 
 linux_boot = (kernel_filename != NULL);
 
-/* Allocate RAM.  We allocate it as a single memory region and use
- * aliases to address portions of it, mostly for backwards compatibility
- * with older qemus that used qemu_ram_alloc().
- */
-ram = g_malloc(sizeof(*ram));
-memory_region_init_ram(ram, pc.ram,
-   below_4g_mem_size + above_4g_mem_size);
-vmstate_register_ram_global(ram);
-*ram_memory = ram;
-ram_below_4g = g_malloc(sizeof(*ram_below_4g));
-memory_region_init_alias(ram_below_4g, ram-below-4g, ram,
- 0, 

[Qemu-devel] [PATCH v5 1/3] eliminate piix_pci.c and module i440fx and piix3

2012-07-23 Thread Wanpeng Li
From: Anthony Liguori aligu...@us.ibm.com

The big picture about the patch is shown as follows:

1) pc_init creates an I440FX, any bus devices (ISA serial port, PCI
vga and nics, etc.), sets properties appropriately, and realizes the
devices.
2) I440FX is-a PCIHost, has-a I440FX-PMC, has-a PIIX3
3) PIIX3 has-a RTC, has-a I8042, has-a DMAController, etc.

i440fx-pcihost = i440fx
i440fx = i440fx-pmc

i440fx_pmc is Programmable Memory Controller which integrated in I440FX
chipset, and move ram initialization into i440fx-pmc.

It might seem like a small change, but it better reflects the fact
that the PMC is contained within the i440fx which we will now reflect in
composition in the next few changesets.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com

---
 hw/i386/Makefile.objs |2 +-
 hw/i440fx.c   |  434 +++
 hw/i440fx.h   |   77 +++
 hw/piix3.c|  234 +++
 hw/piix3.h|   69 ++
 hw/piix_pci.c |  599 -
 6 files changed, 815 insertions(+), 600 deletions(-)
 create mode 100644 hw/i440fx.c
 create mode 100644 hw/i440fx.h
 create mode 100644 hw/piix3.c
 create mode 100644 hw/piix3.h
 delete mode 100644 hw/piix_pci.c

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 8c764bb..49b32d0 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -1,6 +1,6 @@
 obj-y += mc146818rtc.o pc.o
 obj-y += apic_common.o apic.o kvmvapic.o
-obj-y += sga.o ioapic_common.o ioapic.o piix_pci.o
+obj-y += sga.o ioapic_common.o ioapic.o i440fx.o piix3.o
 obj-y += vmport.o
 obj-y += pci-hotplug.o smbios.o wdt_ib700.o
 obj-y += debugcon.o multiboot.o
diff --git a/hw/i440fx.c b/hw/i440fx.c
new file mode 100644
index 000..720a25a
--- /dev/null
+++ b/hw/i440fx.c
@@ -0,0 +1,434 @@
+/*
+ * QEMU i440FX PCI Host Bridge Emulation
+ *
+ * Copyright (c) 2006 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include i440fx.h
+#include range.h
+#include xen.h
+#include loader.h
+#include pc.h
+
+#define BIOS_FILENAME bios.bin
+
+/*
+ * I440FX chipset data sheet.
+ * http://download.intel.com/design/chipsets/datashts/29054901.pdf
+ *
+ * The I440FX is a package that contains an integrated PCI Host controller,
+ * memory controller, and is usually packaged with a PCI-ISA bus and super I/O
+ * chipset.
+ *
+ * The i440FX device is the PCI host controller.  On function 0.0, there is a
+ * memory controller called the Programmable Memory Controller (PMC).  On
+ * function 1.0, there is the PCI-ISA bus/super I/O chip called the PIIX3.
+ */
+
+#define I440FX_PMC_PCI_HOLE 0xE000ULL
+#define I440FX_PMC_PCI_HOLE_END 0x1ULL
+
+#define I440FX_PAM  0x59
+#define I440FX_PAM_SIZE 7
+#define I440FX_SMRAM0x72
+
+static void piix3_set_irq(void *opaque, int pirq, int level)
+{
+PIIX3State *piix3 = opaque;
+piix3_set_irq_level(piix3, pirq, level);
+}
+
+/*
+ * return the global irq number corresponding to a given device irq
+ * pin. We could also use the bus number to have a more precise
+ * mapping.
+ */
+static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
+{
+int slot_addend;
+slot_addend = (pci_dev-devfn  3) - 1;
+return (pci_intx + slot_addend)  3;
+}
+
+static void update_pam(I440FXPMCState *d, uint32_t start, uint32_t end, int r,
+   PAMMemoryRegion *mem)
+{
+if (mem-initialized) {
+memory_region_del_subregion(d-system_memory, mem-mem);
+memory_region_destroy(mem-mem);
+}
+
+switch (r) {
+case 3:
+/* RAM */
+memory_region_init_alias(mem-mem, pam-ram, d-ram_memory,
+ start, end - start);
+break;
+case 1:
+/* ROM (XXX: not quite correct) */
+memory_region_init_alias(mem-mem, pam-rom, d-ram_memory,
+   

Re: [Qemu-devel] [PATCH] check for available room when formatting OpenFirmware device path

2012-07-23 Thread Peter Maydell
On 23 July 2012 13:34, Peter Maydell peter.mayd...@linaro.org wrote:
 On 23 July 2012 12:33, Laszlo Ersek ler...@redhat.com wrote:

 Signed-off-by: Laszlo Ersek ler...@redhat.com

 I think it would be much nicer to just rewrite qdev_get_fw_dev_path
 so we weren't trying to fill the path into a fixed string buffer
 at all. Here is an entirely untested implementation:

 char *qdev_get_fw_dev_path(DeviceState *dev)
 {
 char *path;
 char **strarray;
 int depth = 0;
 DeviceState *d = dev;
 for (d = dev; d  d-parent_bus; d = d-parent_bus-parent) {
 depth++;
 }
 depth++;
 strarray = g_new(char*, depth);
 for (d = dev; d  d-parent_bus; d = d-parent_bus-parent) {
 depth--;
 strarray[depth] = bus_get_fw_dev_path(dev-parent_bus, dev);

d not dev here and in the line below, obviously. I said it was
untested :-)

 if (!strarray[depth]) {
 strarray[depth] = g_strdup(object_get_typename(OBJECT(dev)));
 }
 }
 strarray[0] = g_strdup();
 path = g_strjoinv(/, strarray);
 g_strfreev(strarray);
 return path;
 }

-- PMM



[Qemu-devel] [PATCH 08/19] qapi schema: remove trailing whitespace

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qapi-schema.json |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index a92adb1..d2f8e02 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -343,7 +343,7 @@
 # @CPU: the index of the virtual CPU
 #
 # @current: this only exists for backwards compatible and should be ignored
-# 
+#
 # @halted: true if the virtual CPU is in the halt state.  Halt usually refers
 #  to a processor specific low power mode.
 #
@@ -686,7 +686,7 @@
 # @SpiceInfo
 #
 # Information about the SPICE session.
-# 
+#
 # @enabled: true if the SPICE server is enabled, false otherwise
 #
 # @host: #optional The hostname the SPICE server is bound to.  This depends on
@@ -1297,7 +1297,7 @@
 ##
 { 'command': 'human-monitor-command',
   'data': {'command-line': 'str', '*cpu-index': 'int'},
-  'returns': 'str' } 
+  'returns': 'str' }
 
 ##
 # @migrate_cancel
@@ -1458,7 +1458,7 @@
 # @password: the new password
 #
 # @connected: #optional how to handle existing clients when changing the
-#   password.  If nothing is specified, defaults to `keep' 
+#   password.  If nothing is specified, defaults to `keep'
 #   `fail' to fail the command if clients are connected
 #   `disconnect' to disconnect existing clients
 #   `keep' to maintain existing clients
@@ -1598,7 +1598,7 @@
 #  If the argument combination is invalid, InvalidParameterCombination
 #
 # Since: 1.1
-## 
+##
 { 'command': 'block_set_io_throttle',
   'data': { 'device': 'str', 'bps': 'int', 'bps_rd': 'int', 'bps_wr': 'int',
 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int' } }
-- 
1.7.10.4




Re: [Qemu-devel] [PATCH 01/16] net: Add a hub net client

2012-07-23 Thread Laszlo Ersek
Two hairs to split:

On 07/20/12 14:01, Stefan Hajnoczi wrote:

 +static NetHubPort *net_hub_port_new(NetHub *hub, const char *name)
 +{
 +VLANClientState *nc;
 +NetHubPort *port;
 +unsigned int id = hub-num_ports++;

There are projects that don't like to put logic or externally visible
side-effects into initializers. I don't know about qemu.

 diff --git a/qapi-schema.json b/qapi-schema.json
 index bc55ed2..6618eb5 100644
 --- a/qapi-schema.json
 +++ b/qapi-schema.json
 @@ -2094,6 +2094,19 @@
  '*helper': 'str' } }
  
  ##
 +# @NetdevHubPortOptions
 +#
 +# Connect two or more net clients through a software hub.
 +#
 +# @hubid: hub identifier number
 +#
 +# Since 1.2
 +##
 +{ 'type': 'NetdevHubPortOptions',
 +  'data': {
 +'hubid': 'int' } }

I think this should say 'uint32'.

Thanks,
Laszlo



[Qemu-devel] [PATCH 14/19] convert net_init_slirp() to NetClientOptions

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 net/slirp.c |   93 ---
 1 file changed, 25 insertions(+), 68 deletions(-)

diff --git a/net/slirp.c b/net/slirp.c
index 1243d43..44b059f 100644
--- a/net/slirp.c
+++ b/net/slirp.c
@@ -686,89 +686,46 @@ void do_info_usernet(Monitor *mon)
 }
 }
 
-static int net_init_slirp_configs(const char *name, const char *value, void 
*opaque)
+static void
+net_init_slirp_configs(const StringList *fwd, int flags)
 {
-struct slirp_config_str *config;
-
-if (strcmp(name, hostfwd) != 0  strcmp(name, guestfwd) != 0) {
-return 0;
-}
-
-config = g_malloc0(sizeof(*config));
+while (fwd) {
+struct slirp_config_str *config;
 
-pstrcpy(config-str, sizeof(config-str), value);
+config = g_malloc0(sizeof(*config));
+pstrcpy(config-str, sizeof(config-str), fwd-value-str);
+config-flags = flags;
+config-next = slirp_configs;
+slirp_configs = config;
 
-if (!strcmp(name, hostfwd)) {
-config-flags = SLIRP_CFG_HOSTFWD;
+fwd = fwd-next;
 }
-
-config-next = slirp_configs;
-slirp_configs = config;
-
-return 0;
 }
 
-int net_init_slirp(QemuOpts *opts, const NetClientOptions *new_opts,
+int net_init_slirp(QemuOpts *old_opts, const NetClientOptions *opts,
const char *name, VLANState *vlan)
 {
 struct slirp_config_str *config;
-const char *vhost;
-const char *vhostname;
-const char *vdhcp_start;
-const char *vnamesrv;
-const char *tftp_export;
-const char *bootfile;
-const char *smb_export;
-const char *vsmbsrv;
-const char *restrict_opt;
-char *vnet = NULL;
-int restricted = 0;
+char *vnet;
 int ret;
+const NetdevUserOptions *user;
 
-vhost   = qemu_opt_get(opts, host);
-vhostname   = qemu_opt_get(opts, hostname);
-vdhcp_start = qemu_opt_get(opts, dhcpstart);
-vnamesrv= qemu_opt_get(opts, dns);
-tftp_export = qemu_opt_get(opts, tftp);
-bootfile= qemu_opt_get(opts, bootfile);
-smb_export  = qemu_opt_get(opts, smb);
-vsmbsrv = qemu_opt_get(opts, smbserver);
-
-restrict_opt = qemu_opt_get(opts, restrict);
-if (restrict_opt) {
-if (!strcmp(restrict_opt, on) ||
-!strcmp(restrict_opt, yes) || !strcmp(restrict_opt, y)) {
-restricted = 1;
-} else if (strcmp(restrict_opt, off) 
-strcmp(restrict_opt, no)  strcmp(restrict_opt, n)) {
-error_report(invalid option: 'restrict=%s', restrict_opt);
-return -1;
-}
-}
-
-if (qemu_opt_get(opts, ip)) {
-const char *ip = qemu_opt_get(opts, ip);
-int l = strlen(ip) + strlen(/24) + 1;
+assert(opts-kind == NET_CLIENT_OPTIONS_KIND_USER);
+user = opts-user;
 
-vnet = g_malloc(l);
+vnet = user-has_net ? g_strdup(user-net) :
+   user-has_ip  ? g_strdup_printf(%s/24, user-ip) :
+   NULL;
 
-/* emulate legacy ip= parameter */
-pstrcpy(vnet, l, ip);
-pstrcat(vnet, l, /24);
-}
-
-if (qemu_opt_get(opts, net)) {
-if (vnet) {
-g_free(vnet);
-}
-vnet = g_strdup(qemu_opt_get(opts, net));
-}
+/* all optional fields are initialized to all bits zero */
 
-qemu_opt_foreach(opts, net_init_slirp_configs, NULL, 0);
+net_init_slirp_configs(user-hostfwd, SLIRP_CFG_HOSTFWD);
+net_init_slirp_configs(user-guestfwd, 0);
 
-ret = net_slirp_init(vlan, user, name, restricted, vnet, vhost,
- vhostname, tftp_export, bootfile, vdhcp_start,
- vnamesrv, smb_export, vsmbsrv);
+ret = net_slirp_init(vlan, user, name, user-restrict, vnet, user-host,
+ user-hostname, user-tftp, user-bootfile,
+ user-dhcpstart, user-dns, user-smb,
+ user-smbserver);
 
 while (slirp_configs) {
 config = slirp_configs;
-- 
1.7.10.4




[Qemu-devel] [PATCH 05/19] qapi: introduce size type

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

v1-v2:
- fall back to uint64 rather than int

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qapi/qapi-visit-core.c |7 +++
 qapi/qapi-visit-core.h |3 +++
 scripts/qapi.py|2 ++
 3 files changed, 12 insertions(+)

diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index d41595e..7a82b637 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -234,6 +234,13 @@ void visit_type_int64(Visitor *v, int64_t *obj, const char 
*name, Error **errp)
 }
 }
 
+void visit_type_size(Visitor *v, uint64_t *obj, const char *name, Error **errp)
+{
+if (!error_is_set(errp)) {
+(v-type_size ? v-type_size : v-type_uint64)(v, obj, name, errp);
+}
+}
+
 void visit_type_bool(Visitor *v, bool *obj, const char *name, Error **errp)
 {
 if (!error_is_set(errp)) {
diff --git a/qapi/qapi-visit-core.h b/qapi/qapi-visit-core.h
index a19d70c..60aceda 100644
--- a/qapi/qapi-visit-core.h
+++ b/qapi/qapi-visit-core.h
@@ -60,6 +60,8 @@ struct Visitor
 void (*type_int16)(Visitor *v, int16_t *obj, const char *name, Error 
**errp);
 void (*type_int32)(Visitor *v, int32_t *obj, const char *name, Error 
**errp);
 void (*type_int64)(Visitor *v, int64_t *obj, const char *name, Error 
**errp);
+/* visit_type_size() falls back to (*type_uint64)() if type_size is unset 
*/
+void (*type_size)(Visitor *v, uint64_t *obj, const char *name, Error 
**errp);
 };
 
 void visit_start_handle(Visitor *v, void **obj, const char *kind,
@@ -85,6 +87,7 @@ void visit_type_int8(Visitor *v, int8_t *obj, const char 
*name, Error **errp);
 void visit_type_int16(Visitor *v, int16_t *obj, const char *name, Error 
**errp);
 void visit_type_int32(Visitor *v, int32_t *obj, const char *name, Error 
**errp);
 void visit_type_int64(Visitor *v, int64_t *obj, const char *name, Error 
**errp);
+void visit_type_size(Visitor *v, uint64_t *obj, const char *name, Error 
**errp);
 void visit_type_bool(Visitor *v, bool *obj, const char *name, Error **errp);
 void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp);
 void visit_type_number(Visitor *v, double *obj, const char *name, Error 
**errp);
diff --git a/scripts/qapi.py b/scripts/qapi.py
index 1292476..8082af3 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -163,6 +163,8 @@ def c_type(name):
   name == 'int64' or name == 'uint8' or name == 'uint16' or
   name == 'uint32' or name == 'uint64'):
 return name + '_t'
+elif name == 'size':
+return 'uint64_t'
 elif name == 'bool':
 return 'bool'
 elif name == 'number':
-- 
1.7.10.4




[Qemu-devel] [PATCH 01/19] MAINTAINERS: Replace net maintainer Mark McLoughlin with Stefan Hajnoczi

2012-07-23 Thread Stefan Hajnoczi
The net subsystem has lacked an active maintainer since 2009.  I have
built and tested a net-next tree to get the ball rolling again.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 MAINTAINERS |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 30ed56d..2d219d2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -560,9 +560,10 @@ F: monitor.c
 
 Network device layer
 M: Anthony Liguori aligu...@us.ibm.com
-M: Mark McLoughlin mar...@redhat.com
+M: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
 S: Maintained
 F: net/
+T: git git://github.com/stefanha/qemu.git net
 
 Network Block Device (NBD)
 M: Paolo Bonzini pbonz...@redhat.com
-- 
1.7.10.4




Re: [Qemu-devel] [PATCH v5 3/3] convert pci-host to QOM

2012-07-23 Thread Andreas Färber
Am 23.07.2012 14:35, schrieb Wanpeng Li:
 From: Anthony Liguori aligu...@us.ibm.com
 
 makes pci_host a proper QOM type.
 
 Changelog:
 * against Andreas pci_host branch
 * make host bridge TypeInfos const
 * use PCI_HOST_BRIDGE() where appropriate 
 
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 Signed-off-by: Wanpeng Li liw...@linux.vnet.ibm.com
 
 ---
  hw/i440fx.c   |6 +++---
  hw/pc.c   |2 +-
  hw/pci_host.c |   14 ++
  hw/pci_host.h |2 ++
  hw/piix3.c|4 ++--
  5 files changed, 22 insertions(+), 6 deletions(-)
 
 diff --git a/hw/i440fx.c b/hw/i440fx.c
 index 720a25a..fdf040b 100644
 --- a/hw/i440fx.c
 +++ b/hw/i440fx.c
 @@ -191,7 +191,7 @@ static const VMStateDescription vmstate_i440fx_pmc = {
  static int i440fx_realize(SysBusDevice *dev)
  {
  I440FXState *s = I440FX(dev);
 -PCIHostState *h = PCI_HOST(s);
 +PCIHostState *h = PCI_HOST_BRIDGE(s);
  int bios_size, isa_bios_size;
  char *filename;
  int ret;

Either there's a miscommunication or a technical error: My branch surely
is using PCI_HOST_BRIDGE(), so these PCI_HOST - PCI_HOST_BRIDGE changes
look bogus. Did you make sure each patch compiles?

 @@ -401,7 +401,7 @@ static void i440fx_pmc_class_init(ObjectClass *klass, 
 void *data)
  dc-vmsd = vmstate_i440fx_pmc;
  }
  
 -static TypeInfo i440fx_pmc_info = {
 +static const TypeInfo i440fx_pmc_info = {
  .name  = TYPE_I440FX_PMC,
  .parent= TYPE_PCI_DEVICE,
  .instance_size = sizeof(I440FXPMCState),
 @@ -418,7 +418,7 @@ static void i440fx_class_init(ObjectClass *klass, void 
 *data)
  dc-no_user = 1;
  }
  
 -static TypeInfo i440fx_info = {
 +static const TypeInfo i440fx_info = {
  .name  = TYPE_I440FX,
  .parent= TYPE_PCI_HOST_BRIDGE,
  .instance_size = sizeof(I440FXState),

Patch 1/3 does not have const, patch 2/3 adds new TypeInfos without const.

So my guess is you've not rebased this on my pci-host branch [1] but
onto something else? If they're not against master it's advisable to
mark patches [PATCH treename xx/nn] btw, for clarity.

For the new/changed TypeInfos please add const from the start.

Regards,
Andreas

[1] http://repo.or.cz/w/qemu/afaerber.git/shortlog/refs/heads/pci-host
git://repo.or.cz/qemu/afaerber.git pci-host

 diff --git a/hw/pc.c b/hw/pc.c
 index d9a0443..f095109 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -1217,7 +1217,7 @@ static PCIBus *i440fx_init(I440FXPMCState 
 **pi440fx_state, int *piix3_devfn,
  PCIHostState *h;
  
  s = I440FX(object_new(TYPE_I440FX));
 -h = PCI_HOST(s);
 +h = PCI_HOST_BRIDGE(s);
  
  /* FIXME make a properties */
  h-address_space = address_space_mem;
 diff --git a/hw/pci_host.c b/hw/pci_host.c
 index 3950e94..4e10042 100644
 --- a/hw/pci_host.c
 +++ b/hw/pci_host.c
 @@ -165,11 +165,25 @@ const MemoryRegionOps pci_host_data_be_ops = {
  .endianness = DEVICE_BIG_ENDIAN,
  };
  
 +void pci_host_set_mmio(PCIHostState *s, MemoryRegion *value)
 +{
 +object_property_set_link(OBJECT(s), OBJECT(value), mmio, NULL);
 +}
 +
 +static void pci_host_initfn(Object *obj)
 +{
 +PCIHostState *s = PCI_HOST_BRIDGE(obj);
 +
 +object_property_add_link(obj, mmio, memory-region,
 +(Object **)s-address_space, NULL);
 +}
 +
  static const TypeInfo pci_host_type_info = {
  .name = TYPE_PCI_HOST_BRIDGE,
  .parent = TYPE_SYS_BUS_DEVICE,
  .abstract = true,
  .instance_size = sizeof(PCIHostState),
 +.instance_init = pci_host_initfn,
  };
  
  static void pci_host_register_types(void)
 diff --git a/hw/pci_host.h b/hw/pci_host.h
 index 4b9c300..9f28728 100644
 --- a/hw/pci_host.h
 +++ b/hw/pci_host.h
 @@ -54,6 +54,8 @@ uint32_t pci_host_config_read_common(PCIDevice *pci_dev, 
 uint32_t addr,
  void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len);
  uint32_t pci_data_read(PCIBus *s, uint32_t addr, int len);
  
 +void pci_host_set_mmio(PCIHostState *s, MemoryRegion *value);
 +
  extern const MemoryRegionOps pci_host_conf_le_ops;
  extern const MemoryRegionOps pci_host_conf_be_ops;
  extern const MemoryRegionOps pci_host_data_le_ops;
 diff --git a/hw/piix3.c b/hw/piix3.c
 index eca6ec8..3b69b15 100644
 --- a/hw/piix3.c
 +++ b/hw/piix3.c
 @@ -204,7 +204,7 @@ static void piix3_class_init(ObjectClass *klass, void 
 *data)
  k-class_id = PCI_CLASS_BRIDGE_ISA;
  }
  
 -static TypeInfo piix3_info = {
 +static const TypeInfo piix3_info = {
  .name  = TYPE_PIIX3,
  .parent= TYPE_PCI_DEVICE,
  .instance_size = sizeof(PIIX3State),
 @@ -219,7 +219,7 @@ static void piix3_xen_class_init(ObjectClass *klass, void 
 *data)
  k-config_write = piix3_write_config_xen;
  };
  
 -static TypeInfo piix3_xen_info = {
 +static const TypeInfo piix3_xen_info = {
  .name  = PIIX3-xen,
  .parent= TYPE_PIIX3,
  .instance_size = sizeof(PIIX3State),
 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, 

[Qemu-devel] [PATCH 15/19] convert net_init_socket() to NetClientOptions

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

I reverse engineered the following permissions between the -socket
sub-options:

 fd  listen  connect  mcast  udp | localaddr
  fd x   .   ..  .   | .
  listen .   x   ..  .   | .
  connect.   .   x.  .   | .
  mcast  .   .   .x  .   | x
  udp.   .   ..  x   | x
  ---+
  localaddr  .   .   .x  x x

I transformed the code accordingly. The real fix would be to embed fd,
listen, connect, mcast and udp in a separate union. However
OptsVisitor's enum parser only supports the type=XXX QemuOpt instance as
union discriminator.

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 net/socket.c |  119 +-
 1 file changed, 43 insertions(+), 76 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index 563447d0..e3cba20 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -586,101 +586,68 @@ static int net_socket_udp_init(VLANState *vlan,
 return 0;
 }
 
-int net_init_socket(QemuOpts *opts, const NetClientOptions *new_opts,
+int net_init_socket(QemuOpts *old_opts, const NetClientOptions *opts,
 const char *name, VLANState *vlan)
 {
-if (qemu_opt_get(opts, fd)) {
-int fd;
+const NetdevSocketOptions *sock;
 
-if (qemu_opt_get(opts, listen) ||
-qemu_opt_get(opts, connect) ||
-qemu_opt_get(opts, mcast) ||
-qemu_opt_get(opts, localaddr)) {
-error_report(listen=, connect=, mcast= and localaddr= is invalid 
with fd=);
-return -1;
-}
+assert(opts-kind == NET_CLIENT_OPTIONS_KIND_SOCKET);
+sock = opts-socket;
 
-fd = net_handle_fd_param(cur_mon, qemu_opt_get(opts, fd));
-if (fd == -1) {
-return -1;
-}
+if (sock-has_fd + sock-has_listen + sock-has_connect + sock-has_mcast +
+sock-has_udp != 1) {
+error_report(exactly one of fd=, listen=, connect=, mcast= or udp=
+  is required);
+return -1;
+}
 
-if (!net_socket_fd_init(vlan, socket, name, fd, 1)) {
-return -1;
-}
-} else if (qemu_opt_get(opts, listen)) {
-const char *listen;
-
-if (qemu_opt_get(opts, fd) ||
-qemu_opt_get(opts, connect) ||
-qemu_opt_get(opts, mcast) ||
-qemu_opt_get(opts, localaddr)) {
-error_report(fd=, connect=, mcast= and localaddr= is invalid with 
listen=);
-return -1;
-}
+if (sock-has_localaddr  !sock-has_mcast  !sock-has_udp) {
+error_report(localaddr= is only valid with mcast= or udp=);
+return -1;
+}
 
-listen = qemu_opt_get(opts, listen);
+if (sock-has_fd) {
+int fd;
 
-if (net_socket_listen_init(vlan, socket, name, listen) == -1) {
-return -1;
-}
-} else if (qemu_opt_get(opts, connect)) {
-const char *connect;
-
-if (qemu_opt_get(opts, fd) ||
-qemu_opt_get(opts, listen) ||
-qemu_opt_get(opts, mcast) ||
-qemu_opt_get(opts, localaddr)) {
-error_report(fd=, listen=, mcast= and localaddr= is invalid with 
connect=);
+fd = net_handle_fd_param(cur_mon, sock-fd);
+if (fd == -1 || !net_socket_fd_init(vlan, socket, name, fd, 1)) {
 return -1;
 }
+return 0;
+}
 
-connect = qemu_opt_get(opts, connect);
-
-if (net_socket_connect_init(vlan, socket, name, connect) == -1) {
+if (sock-has_listen) {
+if (net_socket_listen_init(vlan, socket, name, sock-listen) == -1) {
 return -1;
 }
-} else if (qemu_opt_get(opts, mcast)) {
-const char *mcast, *localaddr;
+return 0;
+}
 
-if (qemu_opt_get(opts, fd) ||
-qemu_opt_get(opts, connect) ||
-qemu_opt_get(opts, listen)) {
-error_report(fd=, connect= and listen= is invalid with mcast=);
+if (sock-has_connect) {
+if (net_socket_connect_init(vlan, socket, name, sock-connect) ==
+-1) {
 return -1;
 }
+return 0;
+}
 
-mcast = qemu_opt_get(opts, mcast);
-localaddr = qemu_opt_get(opts, localaddr);
-
-if (net_socket_mcast_init(vlan, socket, name, mcast, localaddr) == 
-1) {
-return -1;
-}
-} else if (qemu_opt_get(opts, udp)) {
-const char *udp, *localaddr;
-
-if (qemu_opt_get(opts, fd) ||
-qemu_opt_get(opts, connect) ||
-qemu_opt_get(opts, listen) ||
-qemu_opt_get(opts, mcast)) {
-error_report(fd=, connect=, listen=
-  and mcast= is invalid with udp=);
+if (sock-has_mcast) {
+/* if 

[Qemu-devel] [PATCH 17/19] convert net_init_tap() to NetClientOptions

2012-07-23 Thread Stefan Hajnoczi
From: Laszlo Ersek ler...@redhat.com

Signed-off-by: Laszlo Ersek ler...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 net/tap-aix.c |2 +-
 net/tap-bsd.c |2 +-
 net/tap-haiku.c   |2 +-
 net/tap-linux.c   |9 +++--
 net/tap-solaris.c |2 +-
 net/tap-win32.c   |   11 +++---
 net/tap.c |  111 ++---
 net/tap.h |2 +-
 8 files changed, 71 insertions(+), 70 deletions(-)

diff --git a/net/tap-aix.c b/net/tap-aix.c
index e19aaba..f27c177 100644
--- a/net/tap-aix.c
+++ b/net/tap-aix.c
@@ -31,7 +31,7 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr, 
int vnet_hdr_required
 return -1;
 }
 
-int tap_set_sndbuf(int fd, QemuOpts *opts)
+int tap_set_sndbuf(int fd, const NetdevTapOptions *tap)
 {
 return 0;
 }
diff --git a/net/tap-bsd.c b/net/tap-bsd.c
index 937a94b..a3b717d 100644
--- a/net/tap-bsd.c
+++ b/net/tap-bsd.c
@@ -117,7 +117,7 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr, 
int vnet_hdr_required
 return fd;
 }
 
-int tap_set_sndbuf(int fd, QemuOpts *opts)
+int tap_set_sndbuf(int fd, const NetdevTapOptions *tap)
 {
 return 0;
 }
diff --git a/net/tap-haiku.c b/net/tap-haiku.c
index 91dda8e..34739d1 100644
--- a/net/tap-haiku.c
+++ b/net/tap-haiku.c
@@ -31,7 +31,7 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr, 
int vnet_hdr_required
 return -1;
 }
 
-int tap_set_sndbuf(int fd, QemuOpts *opts)
+int tap_set_sndbuf(int fd, const NetdevTapOptions *tap)
 {
 return 0;
 }
diff --git a/net/tap-linux.c b/net/tap-linux.c
index 41d581b..c6521be 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -98,16 +98,19 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr, 
int vnet_hdr_required
  */
 #define TAP_DEFAULT_SNDBUF 0
 
-int tap_set_sndbuf(int fd, QemuOpts *opts)
+int tap_set_sndbuf(int fd, const NetdevTapOptions *tap)
 {
 int sndbuf;
 
-sndbuf = qemu_opt_get_size(opts, sndbuf, TAP_DEFAULT_SNDBUF);
+sndbuf = !tap-has_sndbuf   ? TAP_DEFAULT_SNDBUF :
+ tap-sndbuf  INT_MAX  ? INT_MAX :
+ tap-sndbuf;
+
 if (!sndbuf) {
 sndbuf = INT_MAX;
 }
 
-if (ioctl(fd, TUNSETSNDBUF, sndbuf) == -1  qemu_opt_get(opts, 
sndbuf)) {
+if (ioctl(fd, TUNSETSNDBUF, sndbuf) == -1  tap-has_sndbuf) {
 error_report(TUNSETSNDBUF ioctl failed: %s, strerror(errno));
 return -1;
 }
diff --git a/net/tap-solaris.c b/net/tap-solaris.c
index cf76463..5d6ac42 100644
--- a/net/tap-solaris.c
+++ b/net/tap-solaris.c
@@ -197,7 +197,7 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr, 
int vnet_hdr_required
 return fd;
 }
 
-int tap_set_sndbuf(int fd, QemuOpts *opts)
+int tap_set_sndbuf(int fd, const NetdevTapOptions *tap)
 {
 return 0;
 }
diff --git a/net/tap-win32.c b/net/tap-win32.c
index b738f45..b6099cd 100644
--- a/net/tap-win32.c
+++ b/net/tap-win32.c
@@ -699,19 +699,20 @@ static int tap_win32_init(VLANState *vlan, const char 
*model,
 return 0;
 }
 
-int net_init_tap(QemuOpts *opts, const NetClientOptions *new_opts,
+int net_init_tap(QemuOpts *old_opts, const NetClientOptions *opts,
  const char *name, VLANState *vlan)
 {
-const char *ifname;
+const NetdevTapOptions *tap;
 
-ifname = qemu_opt_get(opts, ifname);
+assert(opts-kind == NET_CLIENT_OPTIONS_KIND_TAP);
+tap = opts-tap;
 
-if (!ifname) {
+if (!tap-has_ifname) {
 error_report(tap: no interface name);
 return -1;
 }
 
-if (tap_win32_init(vlan, tap, name, ifname) == -1) {
+if (tap_win32_init(vlan, tap, name, tap-ifname) == -1) {
 return -1;
 }
 
diff --git a/net/tap.c b/net/tap.c
index 0fc856c..c5563c0 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -548,29 +548,32 @@ int net_init_bridge(QemuOpts *opts, const 
NetClientOptions *new_opts,
 return 0;
 }
 
-static int net_tap_init(QemuOpts *opts, int *vnet_hdr)
+static int net_tap_init(const NetdevTapOptions *tap, int *vnet_hdr,
+const char *setup_script, char *ifname,
+size_t ifname_sz)
 {
 int fd, vnet_hdr_required;
-char ifname[128] = {0,};
-const char *setup_script;
 
-if (qemu_opt_get(opts, ifname)) {
-pstrcpy(ifname, sizeof(ifname), qemu_opt_get(opts, ifname));
+if (tap-has_ifname) {
+pstrcpy(ifname, ifname_sz, tap-ifname);
+} else {
+assert(ifname_sz  0);
+ifname[0] = '\0';
 }
 
-*vnet_hdr = qemu_opt_get_bool(opts, vnet_hdr, 1);
-if (qemu_opt_get(opts, vnet_hdr)) {
+if (tap-has_vnet_hdr) {
+*vnet_hdr = tap-vnet_hdr;
 vnet_hdr_required = *vnet_hdr;
 } else {
+*vnet_hdr = 1;
 vnet_hdr_required = 0;
 }
 
-TFR(fd = tap_open(ifname, sizeof(ifname), vnet_hdr, vnet_hdr_required));
+TFR(fd = tap_open(ifname, ifname_sz, vnet_hdr, vnet_hdr_required));
 if (fd  0) {
 return -1;
 }
 
-setup_script 

Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Peter Maydell
On 23 July 2012 13:26, Avi Kivity a...@redhat.com wrote:
 Really, irqchip in kernel means asynchronous interrupts - you can
 inject an interrupt from outside the vcpu thread.  Obviously if the vcpu
 is sleeping you need to wake it up and that pulls in idle management.

 irqchip for x86 really means the local APIC, which has a synchronous
 interface with the cpu core.  local APIC in kernel really is
 equivalent to kernel idle management, KVM_IRQ_LINE, and irqfd.  The
 ioapic and pit, on the other hand, don't contribute anything to this
 (just performance).

 So yes, ARM with and without GIC are irqchip_in_kernel, since the
 ARM-GIC interface is asynchronous.  Whether to emulate the GIC or not
 is just a performance question.

So should we be using something other than KVM_CREATE_IRQCHIP to
ask the kernel to create a GIC model for us (and leave KVM_CREATE_IRQCHIP
as a dummy always succeed ioctl)?

 So my view is that ARM with and without kernel GIC are
 irqchip_in_kernel, since whatever is the local APIC in ARM is always
 emulated in the kernel.

I'm not sure ARM has any equivalent to the local APIC -- the GIC
deals with everything and we don't have any equivalent division
of labour to the x86 LAPIC-IOAPIC one.

-- PMM



Re: [Qemu-devel] [PATCHv4 0/4] Sandboxing Qemu guests with Libseccomp

2012-07-23 Thread Eduardo Otubo
On Tue, Jul 17, 2012 at 04:19:11PM -0300, Eduardo Otubo wrote:
 Hello all,
 
 This patch is an effort to sandbox Qemu guests using Libseccomp[0]. The 
 patches
 that follows are pretty simple and straightforward. I added the correct 
 options
 and checks to the configure script and the basic calls to libseccomp in the
 main loop at vl.c. Details of each one are in the emails of the patch set.
 
 This support limits the system call footprint of the entire QEMU process to a
 limited set of syscalls, those that we know QEMU uses. The idea is to limit 
 the
 allowable syscalls, therefore limiting the impact that an attacked guest could
 have on the host system.
 
 It's important to note that the libseccomp itself needs the seccomp mode 2
 feature in the kernel, which is only available in kernel versions older (or
 equal) than 3.5-rc1.
 
 v2: Files separated in qemu-seccomp.c and qemu-seccomp.h for a cleaner
 implementation. The development was tested with the 3.5-rc1 kernel.
 
 v3: As we discussed in previous emails in this mailing list, this feature is
 not supposed to replace existing security feature, but add another layer 
 to
 the whole. The whitelist should contain all the syscalls QEMU needs. And 
 as
 stated by Will Drewry's commit message[1]: Filter programs will be 
 inherited
 across fork/clone and execve., the same white list should be passed 
 along from
 the father process to the child, then execve() shouldn't be a problem. 
 Note
 that there's a feature PR_SET_NO_NEW_PRIVS in seccomp mode 2 in the 
 kernel,
 this prevents processes from gaining privileges on execve. For example, 
 this
 will prevent qemu (if running unprivileged) from executing setuid 
 programs[2].
 
 v4: Introducing debug mode on libseccomp support. The debug mode will set
 the flag SCMP_ACT_TRAP when calling seccomp_start(). It will verbosely
 print a message to the stderr in the form seccomp: illegal system call
 execution trapped: XXX and resume the execution. This is really just 
 used as
 debug mode, it helps users and developers to full fill the whitelist.
 
 As always, comments are more than welcome.

Hello folks,

Does anyone got a chance to take a look at these? Thanks in advance :)

 
 Regards,
 
 [0] - http://sourceforge.net/projects/libseccomp/
 [1] - 
 http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=e2cfabdfd075648216f99c2c03821cf3f47c1727
 [2] - https://lkml.org/lkml/2012/4/12/457
 
 Eduardo Otubo (4):
   Adding support for libseccomp in configure and Makefile
   Adding qemu-seccomp.[ch]
   Adding qemu-seccomp-debug.[ch]
   Adding seccomp calls to vl.c
 
  Makefile.objs|   10 
  configure|   34 ++
  qemu-seccomp-debug.c |   95 +
  qemu-seccomp-debug.h |   38 +++
  qemu-seccomp.c   |  126 
 ++
  qemu-seccomp.h   |   22 +
  vl.c |   31 +
  7 files changed, 356 insertions(+)
  create mode 100644 qemu-seccomp-debug.c
  create mode 100644 qemu-seccomp-debug.h
  create mode 100644 qemu-seccomp.c
  create mode 100644 qemu-seccomp.h
 
 -- 
 1.7.9.5
 

-- 
Eduardo Otubo
Software Engineer
Linux Technology Center
IBM Systems  Technology Group
Mobile: +55 19 8135 0885 
eot...@linux.vnet.ibm.com




Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Cornelia Huck
On Mon, 23 Jul 2012 15:18:49 +0300
Avi Kivity a...@redhat.com wrote:

  So, for example, if a specific subchannel (=device) has pending status
  and an I/O interrupt is to be generated, this interrupt remains pending
  until an arbitrary cpu is enabled for I/O interrupts. If several cpus
  are enabled for I/O interrupts, any of them may be interrupted.
 
 This may be costly to emulate.  On x86 we do not have access to a
 guest's interrupt status while it is running.  Is this not the case for
 s390?
 
 Oh, let me guess.  You write some interrupt descriptor in memory
 somewhere, issue one of your famous instructions, and the hardware finds
 a guest vcpu and injects the interrupt.

Basically, we have some flags in our control block we can set so that
the cpu drops out of SIE whenever external/I/O/... interrupts are
enabled and then have the host do the lowcore updates, psw swaps, etc.

 
 x86 has a least priority mode which is similar to what you're
 describing, but I don't think we emulate it correctly.
 
  When an
  I/O interrupt is delivered on a cpu, the cpu's lowcore contains the
  interrupt payload which defines the subchannel (=device) the interrupt
  is for.
  
  Any idea on how this architecture can be married with the irqchip
  concept is welcome. If all else fails, would a special irqfd concept
  for !irqchip be acceptable?
 
 I don't see an issue.  You need an arch-specific irqfd configuration
 ioctl (or your own data field in the existing ioctl) that defines the
 payload.  Then the kernel installs a poll function on that eventfd that,
 when called, does the magic sequence needed to get the interrupt there.

If extending the existing ioctl is acceptable, I think we will go that
route.

  While you don't have an irqchip, you do have asynchronous interrupt
 injection, yes?  That's what irqchip really is all about.

You mean injection via ioctl() that is asynchronous to vcpu execution?
Yes, although we use a different ioctl than the others.

Cornelia




[Qemu-devel] [PATCH V3 2/3] qerror: Add error telling that streaming blocks migration

2012-07-23 Thread benoit . canet
From: Benoît Canet ben...@irqsave.net

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 qerror.c |4 
 qerror.h |3 +++
 2 files changed, 7 insertions(+)

diff --git a/qerror.c b/qerror.c
index 92c4eff..c7889fe 100644
--- a/qerror.c
+++ b/qerror.c
@@ -283,6 +283,10 @@ static const QErrorStringTable qerror_table[] = {
 .desc  = Could not set password,
 },
 {
+.error_fmt = QERR_MIGRATION_BLOCKED_BY_STREAMING,
+.desc  = Migration is blocked by streaming,
+},
+{
 .error_fmt = QERR_TOO_MANY_FILES,
 .desc  = Too many open files,
 },
diff --git a/qerror.h b/qerror.h
index b4c8758..dca27f7 100644
--- a/qerror.h
+++ b/qerror.h
@@ -233,6 +233,9 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_SET_PASSWD_FAILED \
 { 'class': 'SetPasswdFailed', 'data': {} }
 
+#define QERR_MIGRATION_BLOCKED_BY_STREAMING \
+{ 'class': 'MigrationBlockedByStreaming', 'data': {} }
+
 #define QERR_TOO_MANY_FILES \
 { 'class': 'TooManyFiles', 'data': {} }
 
-- 
1.7.9.5




[Qemu-devel] [PATCH V3 3/3] migration: block migration when streaming block jobs are running.

2012-07-23 Thread benoit . canet
From: Benoît Canet ben...@irqsave.net

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 migration.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/migration.c b/migration.c
index 8db1b43..4ffdcf2 100644
--- a/migration.c
+++ b/migration.c
@@ -425,6 +425,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }
 
+if (bdrv_are_busy()) {
+error_set(errp, QERR_MIGRATION_BLOCKED_BY_STREAMING);
+return;
+}
+
 s = migrate_init(params);
 
 if (strstart(uri, tcp:, p)) {
-- 
1.7.9.5




[Qemu-devel] [PATCH V3 1/3] block: Add bdrv_are_busy()

2012-07-23 Thread benoit . canet
From: Benoît Canet ben...@irqsave.net

bdrv_are_busy will be used to check if any of the bs are in use
or if one of them have a running block job.

The first user will be qmp_migrate().

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block.c |   13 +
 block.h |2 ++
 2 files changed, 15 insertions(+)

diff --git a/block.c b/block.c
index ce7eb8f..bc8f160 100644
--- a/block.c
+++ b/block.c
@@ -4027,6 +4027,19 @@ out:
 return ret;
 }
 
+int bdrv_are_busy(void)
+{
+BlockDriverState *bs;
+
+QTAILQ_FOREACH(bs, bdrv_states, list) {
+if (bs-job || bdrv_in_use(bs)) {
+return -EBUSY;
+}
+}
+
+return 0;
+}
+
 void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
int64_t speed, BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
diff --git a/block.h b/block.h
index c89590d..0a3de2f 100644
--- a/block.h
+++ b/block.h
@@ -337,6 +337,8 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
+int bdrv_are_busy(void);
+
 enum BlockAcctType {
 BDRV_ACCT_READ,
 BDRV_ACCT_WRITE,
-- 
1.7.9.5




[Qemu-devel] [PATCH V3 0/3] Block migration when streaming block jobs are running

2012-07-23 Thread benoit . canet
From: Benoît Canet ben...@irqsave.net

This patchset is designed to avoid starting a live migration while one or more
streaming block jobs are running.

Tested with the following sequence:

QEMU 1.1.50 monitor - type 'help' for more information
(qemu) block_stream virtio0 1k
(qemu) migrate tcp:localhost:
migrate: Migration is blocked by streaming
(qemu)  block_job_cancel virtio0
(qemu)  migrate tcp:localhost:
migrate: Connection can not be completed immediately
(qemu) 
= migration then succeed

in v2:
stefanha: Rename bdrv_have_block_jobs() to bdrv_are_busy() and make it return 
-EBUSY.
paolo: remove spurious bdrv_close()

in v3
pm215: rewrite confusing error message

Benoît Canet (3):
  block: Add bdrv_are_busy()
  qerror: Add error telling that streaming blocks migration
  migration: block migration when streaming block jobs are running.

 block.c |   13 +
 block.h |2 ++
 migration.c |5 +
 qerror.c|4 
 qerror.h|3 +++
 5 files changed, 27 insertions(+)

-- 
1.7.9.5




Re: [Qemu-devel] [PATCH] check for available room when formatting OpenFirmware device path

2012-07-23 Thread Laszlo Ersek
On 07/23/12 14:46, Markus Armbruster wrote:
 Laszlo Ersek ler...@redhat.com writes:
 
 Signed-off-by: Laszlo Ersek ler...@redhat.com
 ---
  hw/qdev.c |   14 +-
  vl.c  |7 ++-
  2 files changed, 19 insertions(+), 2 deletions(-)

 diff --git a/hw/qdev.c b/hw/qdev.c
 index af54467..f1e83a4 100644
 --- a/hw/qdev.c
 +++ b/hw/qdev.c
 @@ -502,6 +502,10 @@ static int qdev_get_fw_dev_path_helper(DeviceState 
 *dev, char *p, int size)
  if (dev  dev-parent_bus) {
  char *d;
  l = qdev_get_fw_dev_path_helper(dev-parent_bus-parent, p, size);
 +if (l = size) {
 +return l;
 +}
 +
  d = bus_get_fw_dev_path(dev-parent_bus, dev);
  if (d) {
  l += snprintf(p + l, size - l, %s, d);
 @@ -509,6 +513,10 @@ static int qdev_get_fw_dev_path_helper(DeviceState 
 *dev, char *p, int size)
  } else {
  l += snprintf(p + l, size - l, %s, 
 object_get_typename(OBJECT(dev)));
  }
 +
 +if (l = size) {
 +return l;
 +}
  }
  l += snprintf(p + l , size - l, /);
  
 
 If the return value is less than the size argument, it's the length of
 the string written into p[].  Else, it means p[] has insufficient
 space.

Yes. (snprintf() returns the number of bytes it would store, excluding
the terminating NUL, had there been enough room.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04)

Did I make a mistake?

Supposing snprintf() encounters no error, it returns a positive value P
in the above.

P = snprintf(..., size - l0, ...)
l1 = l0 + P;

 l1 = size
-  l0 + P = size
-   P = size - l0


The return value of qdev_get_fw_dev_path_helper() comes from another
invocation of itself, or from the trailing snprintf(), so it behaves
like snprintf.

Or what do you have in mind?


 
 @@ -520,8 +528,12 @@ char* qdev_get_fw_dev_path(DeviceState *dev)
  char path[128];
  int l;
  
 -l = qdev_get_fw_dev_path_helper(dev, path, 128);
 +l = qdev_get_fw_dev_path_helper(dev, path, sizeof(path));
  
 +assert(l  0);
 +if (l = sizeof(path)) {
 +return NULL;
 +}
 
 Failure mode could be avoided with the common technique: make
 qdev_get_fw_dev_path_helper() return the true length.  If it fit into
 path[], return strdup(path).  Else, allocate a suitable buffer and try
 again.
 
 What do you think?

You're right of course, but I didn't have (or wanted to spend) the time
to change the code that much (and to test it...), especially because it
would have been fixed already if it had caused actual problems. I think
the patch could work as a last resort, small improvement, but I don't
mind if someone posts a better fix :)

Laszlo



[Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_MONITORS_CONFIG_ASYNC

2012-07-23 Thread Alon Levy
bumps spice-protocol to 0.12.0 for new IO.
revision bumped to 4 for new IO support, enabled for spice-server = 0.11.1

RHBZ: 770842

Signed-off-by: Alon Levy al...@redhat.com
---
 configure|2 +-
 hw/qxl.c |   29 -
 hw/qxl.h |4 
 trace-events |1 +
 4 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index cef0a71..5fcd315 100755
--- a/configure
+++ b/configure
@@ -2630,7 +2630,7 @@ EOF
   spice_cflags=$($pkg_config --cflags spice-protocol spice-server 2/dev/null)
   spice_libs=$($pkg_config --libs spice-protocol spice-server 2/dev/null)
   if $pkg_config --atleast-version=0.8.2 spice-server /dev/null 21  \
- $pkg_config --atleast-version=0.8.1 spice-protocol  /dev/null 21  \
+ $pkg_config --atleast-version=0.12.0 spice-protocol  /dev/null 21  \
  compile_prog $spice_cflags $spice_libs ; then
 spice=yes
 libs_softmmu=$libs_softmmu $spice_libs
diff --git a/hw/qxl.c b/hw/qxl.c
index 3a883ce..5b5d380 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -249,6 +249,18 @@ static void qxl_spice_destroy_surfaces(PCIQXLDevice *qxl, 
qxl_async_io async)
 }
 }
 
+#if SPICE_SERVER_VERSION = 0x000b01 /* 0.11.1 */
+static void qxl_spice_monitors_config_async(PCIQXLDevice *qxl)
+{
+trace_qxl_spice_monitors_config(qxl-id);
+spice_qxl_monitors_config_async(qxl-ssd.qxl,
+qxl-ram-monitors_config,
+MEMSLOT_GROUP_GUEST,
+(uintptr_t)qxl_cookie_new(QXL_COOKIE_TYPE_IO,
+  QXL_IO_MONITORS_CONFIG_ASYNC));
+}
+#endif
+
 void qxl_spice_reset_image_cache(PCIQXLDevice *qxl)
 {
 trace_qxl_spice_reset_image_cache(qxl-id);
@@ -538,6 +550,7 @@ static const char *io_port_to_string(uint32_t io_port)
 = QXL_IO_DESTROY_ALL_SURFACES_ASYNC,
 [QXL_IO_FLUSH_SURFACES_ASYNC]   = QXL_IO_FLUSH_SURFACES_ASYNC,
 [QXL_IO_FLUSH_RELEASE]  = QXL_IO_FLUSH_RELEASE,
+[QXL_IO_MONITORS_CONFIG_ASYNC]  = QXL_IO_MONITORS_CONFIG_ASYNC,
 };
 return io_port_to_string[io_port];
 }
@@ -1333,7 +1346,7 @@ static void ioport_write(void *opaque, target_phys_addr_t 
addr,
 io_port, io_port_to_string(io_port));
 /* be nice to buggy guest drivers */
 if (io_port = QXL_IO_UPDATE_AREA_ASYNC 
-io_port = QXL_IO_DESTROY_ALL_SURFACES_ASYNC) {
+io_port = QXL_IO_MONITORS_CONFIG_ASYNC) {
 qxl_send_events(d, QXL_INTERRUPT_IO_CMD);
 }
 return;
@@ -1361,6 +1374,9 @@ static void ioport_write(void *opaque, target_phys_addr_t 
addr,
 io_port = QXL_IO_DESTROY_ALL_SURFACES;
 goto async_common;
 case QXL_IO_FLUSH_SURFACES_ASYNC:
+#if SPICE_SERVER_VERSION = 0x000b01 /* 0.11.1 */
+case QXL_IO_MONITORS_CONFIG_ASYNC:
+#endif
 async_common:
 async = QXL_ASYNC;
 qemu_mutex_lock(d-async_lock);
@@ -1490,6 +1506,11 @@ async_common:
 d-mode = QXL_MODE_UNDEFINED;
 qxl_spice_destroy_surfaces(d, async);
 break;
+#if SPICE_SERVER_VERSION = 0x000b01 /* 0.11.1 */
+case QXL_IO_MONITORS_CONFIG_ASYNC:
+qxl_spice_monitors_config_async(d);
+break;
+#endif
 default:
 qxl_set_guest_bug(d, %s: unexpected ioport=0x%x\n, __func__, 
io_port);
 }
@@ -1785,9 +1806,15 @@ static int qxl_init_common(PCIQXLDevice *qxl)
 io_size = 16;
 break;
 case 3: /* qxl-3 */
+pci_device_rev = QXL_REVISION_STABLE_V10;
+io_size = 32; /* PCI region size must be pow2 */
+break;
+#if SPICE_SERVER_VERSION = 0x000b01 /* 0.11.1 */
+case 4: /* qxl-4 */
 pci_device_rev = QXL_DEFAULT_REVISION;
 io_size = msb_mask(QXL_IO_RANGE_SIZE * 2 - 1);
 break;
+#endif
 default:
 error_report(Invalid revision %d for qxl device (max %d),
  qxl-revision, QXL_DEFAULT_REVISION);
diff --git a/hw/qxl.h b/hw/qxl.h
index 172baf6..c1aadaa 100644
--- a/hw/qxl.h
+++ b/hw/qxl.h
@@ -128,7 +128,11 @@ typedef struct PCIQXLDevice {
 }   \
 } while (0)
 
+#if SPICE_SERVER_VERSION = 0x000b01 /* 0.11.1 */
+#define QXL_DEFAULT_REVISION QXL_REVISION_STABLE_V12
+#else
 #define QXL_DEFAULT_REVISION QXL_REVISION_STABLE_V10
+#endif
 
 /* qxl.c */
 void *qxl_phys2virt(PCIQXLDevice *qxl, QXLPHYSICAL phys, int group_id);
diff --git a/trace-events b/trace-events
index 2a5f074..3db04ad 100644
--- a/trace-events
+++ b/trace-events
@@ -954,6 +954,7 @@ qxl_spice_destroy_surfaces(int qid, int async) %d async=%d
 qxl_spice_destroy_surface_wait_complete(int qid, uint32_t id) %d sid=%d
 qxl_spice_destroy_surface_wait(int qid, uint32_t id, int async) %d sid=%d 
async=%d
 qxl_spice_flush_surfaces_async(int qid, uint32_t surface_count, uint32_t 
num_free_res) %d s#=%d, res#=%d
+qxl_spice_monitors_config(int id) %d
 qxl_spice_loadvm_commands(int qid, void *ext, uint32_t count) %d ext=%p 

[Qemu-devel] [PATCH 1/2] qxl: disallow unknown revisions

2012-07-23 Thread Alon Levy
Signed-off-by: Alon Levy al...@redhat.com
---
 hw/qxl.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index c2dd3b4..3a883ce 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1785,10 +1785,13 @@ static int qxl_init_common(PCIQXLDevice *qxl)
 io_size = 16;
 break;
 case 3: /* qxl-3 */
-default:
 pci_device_rev = QXL_DEFAULT_REVISION;
 io_size = msb_mask(QXL_IO_RANGE_SIZE * 2 - 1);
 break;
+default:
+error_report(Invalid revision %d for qxl device (max %d),
+ qxl-revision, QXL_DEFAULT_REVISION);
+return -1;
 }
 
 pci_set_byte(config[PCI_REVISION_ID], pci_device_rev);
-- 
1.7.10.1




[Qemu-devel] [PATCH v5 5/6] block: Convert close calls to qemu_close

2012-07-23 Thread Corey Bryant
This patch converts all block layer close calls, that correspond
to qemu_open calls, to qemu_close.

v5:
 -This patch is new in v5. (kw...@redhat.com, ebl...@redhat.com)

Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com
---
 block/raw-posix.c |   24 
 block/raw-win32.c |2 +-
 block/vmdk.c  |4 ++--
 block/vpc.c   |2 +-
 block/vvfat.c |   12 ++--
 osdep.c   |5 +
 qemu-common.h |1 +
 savevm.c  |4 ++--
 8 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7408a42..a172de3 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char 
*filename,
 out_free_buf:
 qemu_vfree(s-aligned_buf);
 out_close:
-close(fd);
+qemu_close(fd);
 return -errno;
 }
 
@@ -376,7 +376,7 @@ static void raw_close(BlockDriverState *bs)
 {
 BDRVRawState *s = bs-opaque;
 if (s-fd = 0) {
-close(s-fd);
+qemu_close(s-fd);
 s-fd = -1;
 if (s-aligned_buf != NULL)
 qemu_vfree(s-aligned_buf);
@@ -580,7 +580,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options)
 if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
 result = -errno;
 }
-if (close(fd) != 0) {
+if (qemu_close(fd) != 0) {
 result = -errno;
 }
 }
@@ -850,7 +850,7 @@ static int hdev_open(BlockDriverState *bs, const char 
*filename, int flags)
 if (fd  0) {
 bsdPath[strlen(bsdPath)-1] = '1';
 } else {
-close(fd);
+qemu_close(fd);
 }
 filename = bsdPath;
 }
@@ -889,7 +889,7 @@ static int fd_open(BlockDriverState *bs)
 last_media_present = (s-fd = 0);
 if (s-fd = 0 
 (get_clock() - s-fd_open_time) = FD_OPEN_TIMEOUT) {
-close(s-fd);
+qemu_close(s-fd);
 s-fd = -1;
 #ifdef DEBUG_FLOPPY
 printf(Floppy closed\n);
@@ -988,7 +988,7 @@ static int hdev_create(const char *filename, 
QEMUOptionParameter *options)
 else if (lseek(fd, 0, SEEK_END)  total_size * BDRV_SECTOR_SIZE)
 ret = -ENOSPC;
 
-close(fd);
+qemu_close(fd);
 return ret;
 }
 
@@ -1038,7 +1038,7 @@ static int floppy_open(BlockDriverState *bs, const char 
*filename, int flags)
 return ret;
 
 /* close fd so that we can reopen it as needed */
-close(s-fd);
+qemu_close(s-fd);
 s-fd = -1;
 s-fd_media_changed = 1;
 
@@ -1070,7 +1070,7 @@ static int floppy_probe_device(const char *filename)
 prio = 100;
 
 outc:
-close(fd);
+qemu_close(fd);
 out:
 return prio;
 }
@@ -1105,14 +1105,14 @@ static void floppy_eject(BlockDriverState *bs, bool 
eject_flag)
 int fd;
 
 if (s-fd = 0) {
-close(s-fd);
+qemu_close(s-fd);
 s-fd = -1;
 }
 fd = qemu_open(bs-filename, s-open_flags | O_NONBLOCK);
 if (fd = 0) {
 if (ioctl(fd, FDEJECT, 0)  0)
 perror(FDEJECT);
-close(fd);
+qemu_close(fd);
 }
 }
 
@@ -1173,7 +1173,7 @@ static int cdrom_probe_device(const char *filename)
 prio = 100;
 
 outc:
-close(fd);
+qemu_close(fd);
 out:
 return prio;
 }
@@ -1281,7 +1281,7 @@ static int cdrom_reopen(BlockDriverState *bs)
  * FreeBSD seems to not notice sometimes...
  */
 if (s-fd = 0)
-close(s-fd);
+qemu_close(s-fd);
 fd = qemu_open(bs-filename, s-open_flags, 0644);
 if (fd  0) {
 s-fd = -1;
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 8d7838d..c56bf83 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -261,7 +261,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options)
 return -EIO;
 set_sparse(fd);
 ftruncate(fd, total_size * 512);
-close(fd);
+qemu_close(fd);
 return 0;
 }
 
diff --git a/block/vmdk.c b/block/vmdk.c
index 557dc1b..daee426 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1258,7 +1258,7 @@ static int vmdk_create_extent(const char *filename, 
int64_t filesize,
 
 ret = 0;
  exit:
-close(fd);
+qemu_close(fd);
 return ret;
 }
 
@@ -1506,7 +1506,7 @@ static int vmdk_create(const char *filename, 
QEMUOptionParameter *options)
 }
 ret = 0;
 exit:
-close(fd);
+qemu_close(fd);
 return ret;
 }
 
diff --git a/block/vpc.c b/block/vpc.c
index 60ebf5a..c0b82c4 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -744,7 +744,7 @@ static int vpc_create(const char *filename, 
QEMUOptionParameter *options)
 }
 
  fail:
-close(fd);
+qemu_close(fd);
 return ret;
 }
 
diff --git a/block/vvfat.c b/block/vvfat.c
index 22b586a..59d3c5b 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1105,7 +1105,7 @@ static inline void 
vvfat_close_current_file(BDRVVVFATState *s)
 if(s-current_mapping) {

Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Avi Kivity
On 07/23/2012 03:58 PM, Peter Maydell wrote:
 On 23 July 2012 13:26, Avi Kivity a...@redhat.com wrote:
 Really, irqchip in kernel means asynchronous interrupts - you can
 inject an interrupt from outside the vcpu thread.  Obviously if the vcpu
 is sleeping you need to wake it up and that pulls in idle management.

 irqchip for x86 really means the local APIC, which has a synchronous
 interface with the cpu core.  local APIC in kernel really is
 equivalent to kernel idle management, KVM_IRQ_LINE, and irqfd.  The
 ioapic and pit, on the other hand, don't contribute anything to this
 (just performance).
 
 So yes, ARM with and without GIC are irqchip_in_kernel, since the
 ARM-GIC interface is asynchronous.  Whether to emulate the GIC or not
 is just a performance question.
 
 So should we be using something other than KVM_CREATE_IRQCHIP to
 ask the kernel to create a GIC model for us (and leave KVM_CREATE_IRQCHIP
 as a dummy always succeed ioctl)?

Some time ago I suggested using the parameter to KVM_CREATE_IRQCHIP to
select the irqchip type.

 
 So my view is that ARM with and without kernel GIC are
 irqchip_in_kernel, since whatever is the local APIC in ARM is always
 emulated in the kernel.
 
 I'm not sure ARM has any equivalent to the local APIC -- the GIC
 deals with everything and we don't have any equivalent division
 of labour to the x86 LAPIC-IOAPIC one.

It's probably a tiny part of the core with no name.  The point is that
the x86-lapic interface is synchronous and bidirectional, while the
lapic-ioapic interface is asynchronous (it is still bidirectional, but
not in a stop-the-vcpu way).  I assume the ARM-GIC interface is
unidirectional?


-- 
error compiling committee.c: too many arguments to function





Re: [Qemu-devel] [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

2012-07-23 Thread Avi Kivity
On 07/23/2012 04:06 PM, Cornelia Huck wrote:
 On Mon, 23 Jul 2012 15:18:49 +0300
 Avi Kivity a...@redhat.com wrote:
 
  So, for example, if a specific subchannel (=device) has pending status
  and an I/O interrupt is to be generated, this interrupt remains pending
  until an arbitrary cpu is enabled for I/O interrupts. If several cpus
  are enabled for I/O interrupts, any of them may be interrupted.
 
 This may be costly to emulate.  On x86 we do not have access to a
 guest's interrupt status while it is running.  Is this not the case for
 s390?
 
 Oh, let me guess.  You write some interrupt descriptor in memory
 somewhere, issue one of your famous instructions, and the hardware finds
 a guest vcpu and injects the interrupt.
 
 Basically, we have some flags in our control block we can set so that
 the cpu drops out of SIE whenever external/I/O/... interrupts are
 enabled and then have the host do the lowcore updates, psw swaps, etc.

Can you write them from a different cpu and expect them to take effect?

How do you emulate an interrupt with a large guest?  You have to update
the flags in the control blocks for all vcpus; then restore them when
the interrupt is delivered?

 I don't see an issue.  You need an arch-specific irqfd configuration
 ioctl (or your own data field in the existing ioctl) that defines the
 payload.  Then the kernel installs a poll function on that eventfd that,
 when called, does the magic sequence needed to get the interrupt there.
 
 If extending the existing ioctl is acceptable, I think we will go that
 route.

Sure.

  While you don't have an irqchip, you do have asynchronous interrupt
 injection, yes?  That's what irqchip really is all about.
 
 You mean injection via ioctl() that is asynchronous to vcpu execution?
 Yes, although we use a different ioctl than the others.

Ok.  The difference between irqfd and the ioctl is that with irqfd
everything (the payload in your case, the interrupt number for others)
is prepared beforehand, while with the ioctl the extra information is
delivered with the ioctl.

-- 
error compiling committee.c: too many arguments to function





  1   2   3   >