date:20100614

[Qemu-devel] [RFC PATCH 4/5] eepro100: Add a dev field to eeprom new/free functions

2010-06-14 Thread Alex Williamson

This allows us to create a more meaningful savevm string.

Signed-off-by: Alex Williamson alex.william...@redhat.com
---

 hw/eepro100.c   |4 ++--
 hw/eeprom93xx.c |8 
 hw/eeprom93xx.h |4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/eepro100.c b/hw/eepro100.c
index 92cfea7..c5731cc 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -1836,7 +1836,7 @@ static int pci_nic_uninit(PCIDevice *pci_dev)
 
 cpu_unregister_io_memory(s-mmio_index);
 vmstate_unregister(pci_dev-qdev, s-vmstate, s);
-eeprom93xx_free(s-eeprom);
+eeprom93xx_free(pci_dev-qdev, s-eeprom);
 qemu_del_vlan_client(s-nic-nc);
 return 0;
 }
@@ -1863,7 +1863,7 @@ static int e100_nic_init(PCIDevice *pci_dev)
 
 /* Add 64 * 2 EEPROM. i82557 and i82558 support a 64 word EEPROM,
  * i82559 and later support 64 or 256 word EEPROM. */
-s-eeprom = eeprom93xx_new(EEPROM_SIZE);
+s-eeprom = eeprom93xx_new(pci_dev-qdev, EEPROM_SIZE);
 
 /* Handler for memory-mapped I/O */
 s-mmio_index =
diff --git a/hw/eeprom93xx.c b/hw/eeprom93xx.c
index 6ba546f..660b28f 100644
--- a/hw/eeprom93xx.c
+++ b/hw/eeprom93xx.c
@@ -289,7 +289,7 @@ void eeprom93xx_reset(eeprom_t *eeprom)
 }
 #endif
 
-eeprom_t *eeprom93xx_new(uint16_t nwords)
+eeprom_t *eeprom93xx_new(DeviceState *dev, uint16_t nwords)
 {
 /* Add a new EEPROM (with 16, 64 or 256 words). */
 eeprom_t *eeprom;
@@ -316,15 +316,15 @@ eeprom_t *eeprom93xx_new(uint16_t nwords)
 /* Output DO is tristate, read results in 1. */
 eeprom-eedo = 1;
 logout(eeprom = 0x%p, nwords = %u\n, eeprom, nwords);
-vmstate_register(NULL, 0, vmstate_eeprom, eeprom);
+vmstate_register(dev, 0, vmstate_eeprom, eeprom);
 return eeprom;
 }
 
-void eeprom93xx_free(eeprom_t *eeprom)
+void eeprom93xx_free(DeviceState *dev, eeprom_t *eeprom)
 {
 /* Destroy EEPROM. */
 logout(eeprom = 0x%p\n, eeprom);
-vmstate_unregister(NULL, vmstate_eeprom, eeprom);
+vmstate_unregister(dev, vmstate_eeprom, eeprom);
 qemu_free(eeprom);
 }
 
diff --git a/hw/eeprom93xx.h b/hw/eeprom93xx.h
index 47282d3..8ba0e28 100644
--- a/hw/eeprom93xx.h
+++ b/hw/eeprom93xx.h
@@ -23,10 +23,10 @@
 typedef struct _eeprom_t eeprom_t;
 
 /* Create a new EEPROM with (nwords * 2) bytes. */
-eeprom_t *eeprom93xx_new(uint16_t nwords);
+eeprom_t *eeprom93xx_new(DeviceState *dev, uint16_t nwords);
 
 /* Destroy an existing EEPROM. */
-void eeprom93xx_free(eeprom_t *eeprom);
+void eeprom93xx_free(DeviceState *dev, eeprom_t *eeprom);
 
 /* Read from the EEPROM. */
 uint16_t eeprom93xx_read(eeprom_t *eeprom);

[Qemu-devel] [RFC PATCH 5/5] virtio-net: Incorporate a DeviceState pointer and let savevm track instances

2010-06-14 Thread Alex Williamson

Stuff a pointer to the DeviceState into the VirtIONet structure so that
we can easily remove the vmstate entry later.  Also, let vmstate track
the instance number (it should always be zero internally since the
device path should now be unique).

Signed-off-by: Alex Williamson alex.william...@redhat.com
---

 hw/virtio-net.c |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index e9768e0..f41db45 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -60,6 +60,7 @@ typedef struct VirtIONet
 uint8_t *macs;
 } mac_table;
 uint32_t *vlans;
+DeviceState *qdev;
 } VirtIONet;
 
 /* TODO
@@ -890,7 +891,6 @@ static void virtio_net_vmstate_change(void *opaque, int 
running, int reason)
 VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
 {
 VirtIONet *n;
-static int virtio_net_id;
 
 n = (VirtIONet *)virtio_common_init(virtio-net, VIRTIO_ID_NET,
 sizeof(struct virtio_net_config),
@@ -923,7 +923,8 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf 
*conf)
 
 n-vlans = qemu_mallocz(MAX_VLAN  3);
 
-register_savevm(NULL, virtio-net, virtio_net_id++, VIRTIO_NET_VM_VERSION,
+n-qdev = dev;
+register_savevm(dev, virtio-net, -1, VIRTIO_NET_VM_VERSION,
 virtio_net_save, virtio_net_load, n);
 n-vmstate = qemu_add_vm_change_state_handler(virtio_net_vmstate_change, 
n);
 
@@ -941,7 +942,7 @@ void virtio_net_exit(VirtIODevice *vdev)
 
 qemu_purge_queued_packets(n-nic-nc);
 
-unregister_savevm(NULL, virtio-net, n);
+unregister_savevm(n-qdev, virtio-net, n);
 
 qemu_free(n-mac_table.macs);
 qemu_free(n-vlans);

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Markus Armbruster

Alex Williamson alex.william...@redhat.com writes:

 qdev_get_dev_path() is intended to be the canonical utility for creating
 a string representing the qdev hierarchy of a device.  The path consists
 of bus and device names as well as identified properties of the immediate
 parent bus and device.  This results in strings like:

 /main-system-bus/pci.0,addr=00.0/i440FX
 /main-system-bus/pci.0,addr=01.0/PIIX3
 /main-system-bus/pci.0,addr=02.0/cirrus-vga
 /main-system-bus/pci.0/isa.0/mc146818rtc
 /main-system-bus/pci.0/isa.0/isa-serial
 /main-system-bus/pci.0/isa.0/i8042
 /main-system-bus/pci.0/isa.0/isa-fdc
 /main-system-bus/pci.0,addr=03.0/i82551,mac=52:54:00:12:34:56
 /main-system-bus/pci.0,addr=04.0/virtio-net-pci,mac=52:54:00:12:34:57
 /main-system-bus/pci.0,addr=05.0/e1000,mac=52:54:00:12:34:58
 /main-system-bus/pci.0,addr=06.0/rtl8139,mac=52:54:00:12:34:59
 /main-system-bus/pci.0,addr=07.0/pcnet,mac=52:54:00:12:34:5a
 /main-system-bus/pci.0,addr=01.1/piix3-ide
 /main-system-bus/pci.0,addr=01.3/PIIX4_PM
 /main-system-bus/pci.0,addr=08.0/lsi53c895a
 /main-system-bus/pci.0,addr=09.0/virtio-blk-pci

 There are two primary targets for these strings.  The first is vmsave, where
 we currently use a device provided string plus instance number to track
 SaveStateEntries.  This fails when we introduce device hotplug, particularly
 in a case were we have gaps in the instance numbers that cannot easily be
 reproduced on a migration target.  The second is for naming RAMBlocks.  For
 these, we would like a string that matches the vmstate string.

Could you explain why you add identified properties of the immediate
parent bus and device?  They make the result ver much *not* a dev
path in the qdev sense...

[Qemu-devel] [PATCH] hpet: Init capability register only once

2010-06-14 Thread Jan Kiszka

From: Jan Kiszka jan.kis...@siemens.com

The capability register is read-only from guest POV, so we do not need
to update it on reset.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/hpet.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index 93fc399..e9b585c 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -648,10 +648,6 @@ static void hpet_reset(DeviceState *d)
 
 s-hpet_counter = 0ULL;
 s-hpet_offset = 0ULL;
-/* 64-bit main counter; LegacyReplacementRoute. */
-s-capability = 0x8086a001ULL;
-s-capability |= (s-num_timers - 1)  HPET_ID_NUM_TIM_SHIFT;
-s-capability |= ((HPET_CLK_PERIOD)  32);
 s-config = 0ULL;
 if (count  0) {
 /* we don't enable pit when hpet_reset is first called (by hpet_init)
@@ -696,6 +692,11 @@ static int hpet_init(SysBusDevice *dev)
 timer-state = s;
 }
 
+/* 64-bit main counter; LegacyReplacementRoute. */
+s-capability = 0x8086a001ULL;
+s-capability |= (s-num_timers - 1)  HPET_ID_NUM_TIM_SHIFT;
+s-capability |= ((HPET_CLK_PERIOD)  32);
+
 isa_reserve_irq(RTC_ISA_IRQ);
 qdev_init_gpio_in(dev-qdev, hpet_handle_rtc_irq, 1);
 
-- 
1.6.0.2

[Qemu-devel] Re: [RFC PATCH 0/5] Introduce canonical device hierarchy string

2010-06-14 Thread Gerd Hoffmann


  Hi,


My premise with this attempt is that we walk the hierarchy and use the
names to create the base of the path.  As we get to the device,
particularly to the parent bus of the device, we need to start looking at
properties to ensure uniqueness.


You'll need that for every bus along the way down to the device.  Create 
a virtual machine with two lsi scsi host adapters, then attach a disk 
with scsi id 0 to each.  Just the scsi id isn't good enougth to identify 
the device.  You'll need the lsi pci address too.



For now, the only properties I've tagged as path
properties are PCI bus addresses and MAC addresses.


mac address isn't needed here.  You need the property which specifies 
the bus address.  For PCI this obviously is the PCI address.  For scsi 
the scsi id.  For ISA you can use the I/O port base.  virtio-serial the 
port number, ...


cheers,
  Gerd

[Qemu-devel] [PATCHv2] pass info about hpets to seabios.]

2010-06-14 Thread Gleb Natapov

Currently HPET ACPI table is created regardless of whether qemu actually
created hpet device. This may confuse some guests that don't check that
hpet is functional before using it. Solve this by passing info about
hpets in qemu to seabios via fw config interface. Additional benefit is
that seabios no longer uses hard coded hpet configuration. Proposed
interface supports up to 8 hpets. This is the number defined by hpet 
spec.

Signed-off-by: Gleb Natapov g...@redhat.com
diff --git a/hw/hpet.c b/hw/hpet.c
index 93fc399..704fed1 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -71,8 +71,11 @@ typedef struct HPETState {
 uint64_t config;/* configuration */
 uint64_t isr;   /* interrupt status reg */
 uint64_t hpet_counter;  /* main counter */
+uint8_t  hpet_id;   /* instance id */
 } HPETState;
 
+struct hpet_fw_config hpet_cfg = {.count = ~0};
+
 static uint32_t hpet_in_legacy_mode(HPETState *s)
 {
 return s-config  HPET_CFG_LEGACY;
@@ -228,6 +231,7 @@ static int hpet_post_load(void *opaque, int version_id)
 /* Push number of timers into capability returned via HPET_ID */
 s-capability = ~HPET_ID_NUM_TIM_MASK;
 s-capability |= (s-num_timers - 1)  HPET_ID_NUM_TIM_SHIFT;
+hpet_cfg.hpet[s-hpet_id].event_timer_block_id = (uint32_t)s-capability;
 
 /* Derive HPET_MSI_SUPPORT from the capability of the first timer. */
 s-flags = ~(1  HPET_MSI_SUPPORT);
@@ -661,6 +665,8 @@ static void hpet_reset(DeviceState *d)
  */
 hpet_pit_enable();
 }
+hpet_cfg.hpet[s-hpet_id].event_timer_block_id = (uint32_t)s-capability;
+hpet_cfg.hpet[s-hpet_id].address = sysbus_from_qdev(d)-mmio[0].addr;
 count = 1;
 }
 
@@ -680,6 +686,16 @@ static int hpet_init(SysBusDevice *dev)
 int i, iomemtype;
 HPETTimer *timer;
 
+if (hpet_cfg.count == ~0) /* first instance */
+hpet_cfg.count = 0;
+
+if (hpet_cfg.count == 8) {
+fprintf(stderr, Only 8 instances of HPET is allowed\n);
+return -1;
+}
+
+s-hpet_id = hpet_cfg.count++;
+
 for (i = 0; i  HPET_NUM_IRQ_ROUTES; i++) {
 sysbus_init_irq(dev, s-irqs[i]);
 }
diff --git a/hw/hpet_emul.h b/hw/hpet_emul.h
index d7bc102..8bf312a 100644
--- a/hw/hpet_emul.h
+++ b/hw/hpet_emul.h
@@ -53,4 +53,19 @@
 #define HPET_TN_INT_ROUTE_CAP_SHIFT 32
 #define HPET_TN_CFG_BITS_READONLY_OR_RESERVED 0x80b1U
 
+struct hpet_fw_entry
+{
+uint32_t event_timer_block_id;
+uint64_t address;
+uint16_t min_tick;
+uint8_t page_prot;
+} __attribute__ ((packed));
+
+struct hpet_fw_config
+{
+uint8_t count;
+struct hpet_fw_entry hpet[8];
+} __attribute__ ((packed));
+
+extern struct hpet_fw_config hpet_cfg;
 #endif
diff --git a/hw/pc.c b/hw/pc.c
index 1491129..d14d657 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -61,6 +61,7 @@
 #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1)
 #define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2)
 #define FW_CFG_E820_TABLE (FW_CFG_ARCH_LOCAL + 3)
+#define FW_CFG_HPET (FW_CFG_ARCH_LOCAL + 4)
 
 #define E820_NR_ENTRIES16
 
@@ -484,6 +485,8 @@ static void *bochs_bios_init(void)
 fw_cfg_add_bytes(fw_cfg, FW_CFG_E820_TABLE, (uint8_t *)e820_table,
  sizeof(struct e820_table));
 
+fw_cfg_add_bytes(fw_cfg, FW_CFG_HPET, (uint8_t *)hpet_cfg,
+ sizeof(struct hpet_fw_config));
 /* allocate memory for the NUMA channel: one (64bit) word for the number
  * of nodes, one word for each VCPU-node and one word for each node to
  * hold the amount of memory.
--
Gleb.

[Qemu-devel] [PATCHv2] load hpet info for HPET ACPI table from qemu

2010-06-14 Thread Gleb Natapov

Load hpet info for HPET ACPI table from qemu instead of using hardcoded
values. Use hardcoded values anyway if old qemu is detected. 

Signed-off-by: Gleb Natapov g...@redhat.com
diff --git a/src/acpi.c b/src/acpi.c
index 0559443..864f1a8 100644
--- a/src/acpi.c
+++ b/src/acpi.c
@@ -469,7 +469,7 @@ build_ssdt(void)
 
 #define HPET_SIGNATURE 0x54455048 //HPET
 static void*
-build_hpet(void)
+build_hpet(struct hpet_fw_entry *e, u8 id)
 {
 struct acpi_20_hpet *hpet = malloc_high(sizeof(*hpet));
 if (!hpet) {
@@ -478,11 +478,11 @@ build_hpet(void)
 }
 
 memset(hpet, 0, sizeof(*hpet));
-/* Note timer_block_id value must be kept in sync with value advertised by
- * emulated hpet
- */
-hpet-timer_block_id = cpu_to_le32(0x8086a201);
-hpet-addr.address = cpu_to_le32(ACPI_HPET_ADDRESS);
+hpet-timer_block_id = cpu_to_le32(e-event_timer_block_id);
+hpet-addr.address = cpu_to_le32(e-address);
+hpet-min_tick = cpu_to_le32(e-min_tick);
+hpet-hpet_number = id;
+hpet-page_protect = e-page_prot;
 build_header((void*)hpet, HPET_SIGNATURE, sizeof(*hpet), 1);
 
 return hpet;
@@ -637,9 +637,27 @@ acpi_bios_init(void)
 ACPI_INIT_TABLE(build_fadt(bdf));
 ACPI_INIT_TABLE(build_ssdt());
 ACPI_INIT_TABLE(build_madt());
-ACPI_INIT_TABLE(build_hpet());
 ACPI_INIT_TABLE(build_srat());
 
+u8 hpet_id, c =  qemu_cfg_hpet_entries();
+struct hpet_fw_entry e;
+
+if (c == ~0) {
+/* qemu do not provide hpet description */
+e.event_timer_block_id = 0x8086a201;
+e.address = ACPI_HPET_ADDRESS;
+e.min_tick = 0;
+c = 1;
+} else if (c != 0)
+qemu_cfg_hpet_load_next(e);
+
+while (c--) {
+ACPI_INIT_TABLE(build_hpet(e, hpet_id++));
+if (c)
+qemu_cfg_hpet_load_next(e);
+}
+
+
 u16 i, external_tables = qemu_cfg_acpi_additional_tables();
 
 for(i = 0; i  external_tables; i++) {
diff --git a/src/paravirt.c b/src/paravirt.c
index 5c77b5c..458ab08 100644
--- a/src/paravirt.c
+++ b/src/paravirt.c
@@ -149,6 +149,23 @@ void* qemu_cfg_e820_load_next(void *addr)
 return addr;
 }
 
+u8 qemu_cfg_hpet_entries(void)
+{
+u8 cnt;
+
+if (!qemu_cfg_present)
+return 0;
+
+/* read valid flags */
+qemu_cfg_read_entry(cnt, QEMU_CFG_HPET, sizeof(cnt));
+return cnt;
+}
+
+void qemu_cfg_hpet_load_next(struct hpet_fw_entry *e)
+{
+qemu_cfg_read((u8*)e, sizeof(*e));
+}
+
 struct smbios_header {
 u16 length;
 u8 type;
diff --git a/src/paravirt.h b/src/paravirt.h
index c46418f..272af81 100644
--- a/src/paravirt.h
+++ b/src/paravirt.h
@@ -37,6 +37,7 @@ static inline int kvm_para_available(void)
 #define QEMU_CFG_SMBIOS_ENTRIES(QEMU_CFG_ARCH_LOCAL + 1)
 #define QEMU_CFG_IRQ0_OVERRIDE (QEMU_CFG_ARCH_LOCAL + 2)
 #define QEMU_CFG_E820_TABLE(QEMU_CFG_ARCH_LOCAL + 3)
+#define QEMU_CFG_HPET  (QEMU_CFG_ARCH_LOCAL + 4)
 
 extern int qemu_cfg_present;
 
@@ -68,10 +69,20 @@ struct e820_reservation {
 u32 type;
 };
 
+struct hpet_fw_entry
+{
+u32 event_timer_block_id;
+u64 address;
+u16 min_tick;
+u8 page_prot;
+} __attribute__ ((packed));
+
 u16 qemu_cfg_first_file(QemuCfgFile *entry);
 u16 qemu_cfg_next_file(QemuCfgFile *entry);
 u32 qemu_cfg_read_file(QemuCfgFile *entry, void *dst, u32 maxlen);
 u32 qemu_cfg_e820_entries(void);
 void* qemu_cfg_e820_load_next(void *addr);
+u8 qemu_cfg_hpet_entries(void);
+void qemu_cfg_hpet_load_next(struct hpet_fw_entry *e);
 
 #endif
--
Gleb.

Re: [Qemu-devel] [PATCH v2 2/7] ioapic: convert to qdev

2010-06-14 Thread Markus Armbruster

Blue Swirl blauwir...@gmail.com writes:

 Convert to qdev.

 Signed-off-by: Blue Swirl blauwir...@gmail.com
 ---
  hw/apic.h|2 --
  hw/ioapic.c  |   45 ++---
  hw/pc.h  |4 +++-
  hw/pc_piix.c |   19 ++-
  4 files changed, 51 insertions(+), 19 deletions(-)

 diff --git a/hw/apic.h b/hw/apic.h
 index e1954f4..dc41400 100644
 --- a/hw/apic.h
 +++ b/hw/apic.h
 @@ -1,7 +1,6 @@
  #ifndef APIC_H
  #define APIC_H

 -typedef struct IOAPICState IOAPICState;
  void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
   uint8_t delivery_mode,
   uint8_t vector_num, uint8_t polarity,
 @@ -10,7 +9,6 @@ int apic_init(CPUState *env);
  int apic_accept_pic_intr(CPUState *env);
  void apic_deliver_pic_intr(CPUState *env, int level);
  int apic_get_interrupt(CPUState *env);
 -qemu_irq *ioapic_init(void);
  void apic_reset_irq_delivered(void);
  int apic_get_irq_delivered(void);

 diff --git a/hw/ioapic.c b/hw/ioapic.c
 index e3f8a46..0336dbd 100644
 --- a/hw/ioapic.c
 +++ b/hw/ioapic.c
 @@ -25,6 +25,7 @@
  #include apic.h
  #include qemu-timer.h
  #include host-utils.h
 +#include sysbus.h

  //#define DEBUG_IOAPIC

 @@ -35,7 +36,6 @@
  #define DPRINTF(fmt, ...)
  #endif

 -#define IOAPIC_NUM_PINS  0x18
  #define IOAPIC_LVT_MASKED(116)

  #define IOAPIC_TRIGGER_EDGE  0
 @@ -50,7 +50,10 @@
  #define IOAPIC_DM_SIPI   0x5
  #define IOAPIC_DM_EXTINT 0x7

 +typedef struct IOAPICState IOAPICState;
 +
  struct IOAPICState {
 +SysBusDevice busdev;
  uint8_t id;
  uint8_t ioregsel;

 @@ -209,12 +212,14 @@ static const VMStateDescription vmstate_ioapic = {
  }
  };

 -static void ioapic_reset(void *opaque)
 +static void ioapic_reset(DeviceState *d)
  {
 -IOAPICState *s = opaque;
 +IOAPICState *s = container_of(d, IOAPICState, busdev.qdev);

DO_UPCAST()?  I hate it, but it's what the other devices do...

[...]

Re: [Qemu-devel] [PATCH 2/2] Return usb device to host on exit

2010-06-14 Thread Gerd Hoffmann


@@ -1066,6 +1077,7 @@ USBDevice *usb_host_device_open(const char *devname)
  qdev_prop_set_uint32(dev-qdev, vendorid,  filter.vendor_id);
  qdev_prop_set_uint32(dev-qdev, productid, filter.product_id);
  qdev_init_nofail(dev-qdev);
+atexit(usb_host_cleanup);
  return dev;


You'll register atexit multiple times here (once per device).

I still think this should use exit notifiers, see 
http://patchwork.ozlabs.org/patch/54571/ (doesn't apply cleanly and 
more, will post a rebased version later today).


cheers,
  Gerd

[Qemu-devel] [PATCH 0/5] [QEMU-KVM]: Add BSG backstore using struct sg_io_v4

2010-06-14 Thread Nicholas A. Bellinger

From: Nicholas Bellinger n...@linux-iscsi.org

Quick resend with project subject for cover letter..

Greetings Gerd, Hannes and co,

This series adds initial support for a hw/scsi-bsg.c backstore for scsi-bus
compatible HBA emulation in QEMU-KVM on Linux hosts supporting the BSG driver.
This code is available from the scsi-bsg branch in the megasas/scsi friendly 
QEMU-KVM tree at:

http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=shortlog;h=refs/heads/scsi-bsg

Note that this initial code is being posted for review and to see how useful a 
BSG backstore
would be for QEMU-KVM and Linux hosts.  Note that in order for BSG I/O to 
function using vectored
AIO a kernel patch to linux/block/bsg.c:bsg_map_hdr() is currently required 
running on a bit paired
user/kernel enviroment.  The kernel patch in question is here:

http://marc.info/?l=linux-scsim=127649585524598w=2

The first three patches involve updating block code to support the BSG 
backstore for scsi-bsg.

The forth patch adds the minor changes to hw/scsi-bus.c and hw/scsi-disk.c in 
order to
function with scsi-bsg.

And the fifth patch adds the main hw/scsi-bsg.c logic necessary to run the new 
struct
SCSIDeviceInfo and for BSG AIO using struct iovec and paio_submit_len() to 
function.

So far this patch series has been tested with a Linux based x86_64 KVM host and 
guest
using the hw/megasas.c 8708EM2 HBA Emulation with TCM_Loop virtual SAS Port 
LUNs.

Comments are welcome,

Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org

Nicholas Bellinger (5):
  [block]: Add top level BSG support
  [block]: Add BSG qemu_open() in block/raw.c:raw_open()
  [block]: Add paio_submit_len() non sector sized AIO
  [scsi]: Add BSG support for scsi-bus and scsi-disk
  [scsi-bsg]: Add initial support for BSG based SCSIDeviceInfo

 Makefile.objs |2 +-
 block.c   |   23 ++-
 block.h   |1 +
 block/raw-posix-aio.h |3 +
 block/raw-posix.c |   17 ++-
 block/raw.c   |   20 ++
 block_int.h   |5 +
 hw/scsi-bsg.c |  588 +
 hw/scsi-bus.c |3 +-
 hw/scsi-disk.c|4 +
 posix-aio-compat.c|   28 +++
 11 files changed, 687 insertions(+), 7 deletions(-)
 create mode 100644 hw/scsi-bsg.c

Re: [Qemu-devel] [PATCH v2 7/7] apic: qdev conversion cleanup

2010-06-14 Thread Markus Armbruster

Blue Swirl blauwir...@gmail.com writes:

 Make APICState completely private to apic.c by using DeviceState
 in external APIs.

Could you explain why this is an improvement?

[Qemu-devel] [PATCH 4/5] [scsi]: Add BSG support for scsi-bus and scsi-disk

2010-06-14 Thread Nicholas A. Bellinger

From: Nicholas Bellinger n...@linux-iscsi.org

This patch updates hw/scsi-bus.c:scsi_bus_legacy_add_drive() to check for the 
scsi-bsg backstore.

It also updates hw/scsi-disk.c:scsi_disk_initfn() to check for when 
bdrv_is_bsg() is present and
we need to fail for the fileio backed scsi-disk code.

Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
---
 hw/scsi-bus.c  |3 ++-
 hw/scsi-disk.c |4 
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index 7d80405..b5f5fbb 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -90,7 +90,8 @@ SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, DriveInfo 
*dinfo, int unit)
 const char *driver;
 DeviceState *dev;
 
-driver = bdrv_is_sg(dinfo-bdrv) ? scsi-generic : scsi-disk;
+driver = bdrv_is_bsg(dinfo-bdrv) ? scsi-bsg :
+ bdrv_is_sg(dinfo-bdrv) ? scsi-generic : scsi-disk;
 dev = qdev_create(bus-qbus, driver);
 qdev_prop_set_uint32(dev, scsi-id, unit);
 qdev_prop_set_drive(dev, drive, dinfo);
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index fb3c189..290a2e7 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -704,6 +704,10 @@ static int scsi_disk_initfn(SCSIDevice *dev)
 error_report(scsi-disk: unwanted /dev/sg*);
 return -1;
 }
+if (bdrv_is_bsg(s-bs)) {
+error_report(scsi-disk: unwanted BSG);
+return -1;
+} 
 
 if (bdrv_get_type_hint(s-bs) == BDRV_TYPE_CDROM) {
 s-qdev.blocksize = 2048;
-- 
1.5.6.5

[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2010-06-14 Thread Dave Walker

Changing to libvirt as commentary here, and on the upstream bug report
by Cole indicate a fix has been commit that improves this performance.

** Package changed: qemu-kvm (Ubuntu) = libvirt (Ubuntu)

** Also affects: libvirt via
   https://bugzilla.redhat.com/show_bug.cgi?id=599091
   Importance: Unknown
   Status: Unknown

-- 
virsh save is very slow
https://bugs.launchpad.net/bugs/524447
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.

Status in libvirt virtualization API: Unknown
Status in QEMU: Invalid
Status in “libvirt” package in Ubuntu: Confirmed

Bug description:
As reported here: 
http://www.redhat.com/archives/libvir-list/2009-December/msg00203.html

virsh save is very slow - it writes the image at around 1MB/sec on my test 
system.

(I think I saw a bug report for this issue on Fedora's bugzilla, but I can't 
find it now...)

Confirmed under Karmic.

[Qemu-devel] [PATCH 3/5] [block]: Add paio_submit_len() non sector sized AIO

2010-06-14 Thread Nicholas A. Bellinger

From: Nicholas Bellinger n...@linux-iscsi.org

This patch adds posix-aio-compat.c:paio_submit_len(), which is a identical
to paio_submit() expect that in expected nb_len instead of nb_sectors (* 512)
so that it can be used by BSG AIO for write()/read() of struct sg_io_v4.

Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
---
 block/raw-posix-aio.h |3 +++
 posix-aio-compat.c|   28 
 2 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/block/raw-posix-aio.h b/block/raw-posix-aio.h
index dfc63b8..29df842 100644
--- a/block/raw-posix-aio.h
+++ b/block/raw-posix-aio.h
@@ -30,6 +30,9 @@ int paio_init(void);
 BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
 int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
 BlockDriverCompletionFunc *cb, void *opaque, int type);
+BlockDriverAIOCB *paio_submit_len(BlockDriverState *bs, int fd,
+int64_t sector_num, QEMUIOVector *qiov, int nb_len,
+BlockDriverCompletionFunc *cb, void *opaque, int type);
 BlockDriverAIOCB *paio_ioctl(BlockDriverState *bs, int fd,
 unsigned long int req, void *buf,
 BlockDriverCompletionFunc *cb, void *opaque);
diff --git a/posix-aio-compat.c b/posix-aio-compat.c
index 272e998..ac9276c 100644
--- a/posix-aio-compat.c
+++ b/posix-aio-compat.c
@@ -585,6 +585,34 @@ BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
 return acb-common;
 }
 
+BlockDriverAIOCB *paio_submit_len(BlockDriverState *bs, int fd,
+int64_t sector_num, QEMUIOVector *qiov, int nb_len,
+BlockDriverCompletionFunc *cb, void *opaque, int type)
+{
+struct qemu_paiocb *acb;
+
+acb = qemu_aio_get(raw_aio_pool, bs, cb, opaque);
+if (!acb)
+return NULL;
+acb-aio_type = type;
+acb-aio_fildes = fd;
+acb-ev_signo = SIGUSR2;
+acb-async_context_id = get_async_context_id();
+
+if (qiov) {
+acb-aio_iov = qiov-iov;
+acb-aio_niov = qiov-niov;
+}
+acb-aio_nbytes = nb_len;
+acb-aio_offset = 0;
+
+acb-next = posix_aio_state-first_aio;
+posix_aio_state-first_aio = acb;
+
+qemu_paio_submit(acb);
+return acb-common;
+}
+
 BlockDriverAIOCB *paio_ioctl(BlockDriverState *bs, int fd,
 unsigned long int req, void *buf,
 BlockDriverCompletionFunc *cb, void *opaque)
-- 
1.5.6.5

Re: [Qemu-devel] [PATCH] qemu-iotests: qcow2 error path tests

2010-06-14 Thread Kevin Wolf

Am 14.06.2010 11:38, schrieb Christoph Hellwig:
 On Fri, Jun 04, 2010 at 07:35:24PM +0200, Kevin Wolf wrote:
 This adds test cases for qcow2 error paths (using blkdebug)
 
 Thanks, applied.
 
 What's the plan for getting the fixes this tests into mainline?

The test passes in the block branch, but Anthony hasn't pulled for a
while now.

I'm not sure how to handle it, I have about 50 patches in the queue now
- if I want to send pull requests of resonable sizes it will take two
more pulls after the currently pending pull request of the week before
last. Of course I could just send all 50 patches at once, but I guess
he'd be even more uncomfortable with pulling then...

Anthony, what's your plan and what can I do to support you?

Kevin

[Qemu-devel] [PATCH 0/5] * SUBJECT HERE *

2010-06-14 Thread Nicholas A. Bellinger

From: Nicholas Bellinger n...@linux-iscsi.org

Greetings Gerd, Hannes and co,

This series adds initial support for a hw/scsi-bsg.c backstore for scsi-bus
compatible HBA emulation in QEMU-KVM on Linux hosts supporting the BSG driver.
This code is available from the scsi-bsg branch in the megasas/scsi friendly 
QEMU-KVM tree at:

http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=shortlog;h=refs/heads/scsi-bsg

Note that this initial code is being posted for review and to see how useful a 
BSG backstore
would be for QEMU-KVM and Linux hosts.  Note that in order for BSG I/O to 
function using vectored
AIO a kernel patch to linux/block/bsg.c:bsg_map_hdr() is currently required 
running on a bit paired
user/kernel enviroment.  The kernel patch in question is here:

http://marc.info/?l=linux-scsim=127649585524598w=2

The first three patches involve updating block code to support the BSG 
backstore for scsi-bsg.

The forth patch adds the minor changes to hw/scsi-bus.c and hw/scsi-disk.c in 
order to
function with scsi-bsg.

And the fifth patch adds the main hw/scsi-bsg.c logic necessary to run the new 
struct
SCSIDeviceInfo and for BSG AIO using struct iovec and paio_submit_len() to 
function.

So far this patch series has been tested with a Linux based x86_64 KVM host and 
guest
using the hw/megasas.c 8708EM2 HBA Emulation with TCM_Loop virtual SAS Port 
LUNs.

Comments are welcome,

Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org

Nicholas Bellinger (5):
  [block]: Add top level BSG support
  [block]: Add BSG qemu_open() in block/raw.c:raw_open()
  [block]: Add paio_submit_len() non sector sized AIO
  [scsi]: Add BSG support for scsi-bus and scsi-disk
  [scsi-bsg]: Add initial support for BSG based SCSIDeviceInfo

 Makefile.objs |2 +-
 block.c   |   23 ++-
 block.h   |1 +
 block/raw-posix-aio.h |3 +
 block/raw-posix.c |   17 ++-
 block/raw.c   |   20 ++
 block_int.h   |5 +
 hw/scsi-bsg.c |  588 +
 hw/scsi-bus.c |3 +-
 hw/scsi-disk.c|4 +
 posix-aio-compat.c|   28 +++
 11 files changed, 687 insertions(+), 7 deletions(-)
 create mode 100644 hw/scsi-bsg.c

[Qemu-devel] [PATCH 5/5] [scsi-bsg]: Add initial support for BSG based SCSIDeviceInfo

2010-06-14 Thread Nicholas A. Bellinger

From: Nicholas Bellinger n...@linux-iscsi.org

This patch adds initial support for using the Linux BSG interface with 
write/read vectored
AIO as a QEMU backstore (SCSIDeviceInfo) with hw/scsi-bus.c compatible HBA 
emulation.

So far it has been tested with x86_64 host and guest using hw/megasas.c and 
TCM_Loop LLD
Port LUNs.  Because this path uses struct iovec for struct 
sg_io_v4-d[out,in]_xferp payloads,
which currently requires a patch to linux/block/bsg.c:bsg_map_hdr() in order to 
setup the
user - kernel iovecs.  This also will only currently work with paired 
user/kernel
(eg: 64bit user / 64bit kernel) because of different pointer sizes in struct 
iovec-iov_base.

There are also two FIXMEs in hw/scsi-bsg.c:bsg_generic_initfn() related to 
extraction of
SCSI LUN and device type values using BSG and required by QEMU-KVM.

Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
---
 Makefile.objs |2 +-
 hw/scsi-bsg.c |  588 +
 2 files changed, 589 insertions(+), 1 deletions(-)
 create mode 100644 hw/scsi-bsg.c

diff --git a/Makefile.objs b/Makefile.objs
index 188d617..c4fcb72 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -197,7 +197,7 @@ hw-obj-$(CONFIG_IDE_CMD646) += ide/cmd646.o
 hw-obj-$(CONFIG_IDE_MACIO) += ide/macio.o
 
 # SCSI layer
-hw-obj-y += scsi-disk.o scsi-generic.o
+hw-obj-y += scsi-disk.o scsi-generic.o scsi-bsg.o
 hw-obj-y += lsi53c895a.o megasas.o
 hw-obj-$(CONFIG_ESP) += esp.o
 
diff --git a/hw/scsi-bsg.c b/hw/scsi-bsg.c
new file mode 100644
index 000..fc76b76
--- /dev/null
+++ b/hw/scsi-bsg.c
@@ -0,0 +1,588 @@
+/*
+ * block layer implementation of the sg v4 interface for Linux hosts
+ *
+ * Copyright (c) 2010 Rising Tide Systems
+ * Written by Nicholas A. Bellinger n...@linux-iscsi.org
+ *
+ * Based on hw/scsi-generic code by Laurent Vivier, Paul Brook, and Fabrice 
Bellard
+ *
+ * This code is licenced under the LGPL.
+ */
+
+#include qemu-common.h
+#include qemu-error.h
+#include block.h
+#include scsi.h
+#include dma.h
+#include block/raw-posix-aio.h
+
+#ifdef __linux__
+
+#define DEBUG_BSG
+#undef DEBUG_BSG_IO
+#undef DEBUG_BSG_MAP
+
+#ifdef DEBUG_BSG
+#define DPRINTF(fmt, ...) \
+do { printf(scsi-bsg:  fmt , ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while(0)
+#endif
+
+#define BADF(fmt, ...) \
+do { fprintf(stderr, scsi-bsg:  fmt , ## __VA_ARGS__); } while (0)
+
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include sys/epoll.h
+#include unistd.h
+#include scsi/sg.h
+#include linux/bsg.h
+#include scsi-defs.h
+
+#define SCSI_SENSE_BUF_SIZE 96
+
+#define SG_ERR_DRIVER_TIMEOUT 0x06
+#define SG_ERR_DRIVER_SENSE 0x08
+
+#ifndef MAX_UINT
+#define MAX_UINT ((unsigned int)-1)
+#endif
+
+typedef struct SCSIBSGState SCSIBSGState;
+
+typedef struct SCSIBSGReq {
+SCSIRequest req;
+uint8_t *buf;
+int buflen;
+QEMUIOVector iov;
+QEMUIOVector aio_iov;
+struct sg_io_v4 bsg_hdr;
+} SCSIBSGReq;
+
+struct SCSIBSGState {
+SCSIDevice qdev;
+BlockDriverState *bs;
+int lun;
+int driver_status;
+uint8_t sensebuf[SCSI_SENSE_BUF_SIZE];
+uint8_t senselen;
+};
+
+static int bsg_read(int fd, void *p_read, int to_read)
+{
+int err;
+
+while (to_read  0) {
+err = read(fd, p_read, to_read);
+if (err = 0) {
+to_read -= err;
+p_read += err;
+} else if (errno == EINTR)
+continue;
+else {
+printf(bsg device %d read failed, errno: %d\n,
+fd, errno);
+return errno;
+}
+}
+return 0;
+}
+
+static SCSIBSGReq *bsg_new_request(SCSIDevice *d, uint32_t tag, uint32_t lun)
+{
+SCSIRequest *req;
+SCSIBSGReq *r;
+
+req = scsi_req_alloc(sizeof(SCSIBSGReq), d, tag, lun);
+r = DO_UPCAST(SCSIBSGReq, req, req);
+qemu_iovec_init(r-iov, 1);
+qemu_iovec_init(r-aio_iov, 1);
+return r;
+}
+
+static void bsg_remove_request(SCSIBSGReq *r)
+{
+qemu_free(r-buf);
+qemu_iovec_destroy(r-iov);
+qemu_iovec_destroy(r-aio_iov);
+scsi_req_free(r-req);
+}
+
+static void bsg_command_complete(void *opaque, int ret)
+{
+SCSIBSGReq *r = (SCSIBSGReq *)opaque;
+SCSIBSGState *s = DO_UPCAST(SCSIBSGState, qdev, r-req.dev);
+
+s-driver_status = r-bsg_hdr.driver_status;
+if (s-driver_status)
+s-senselen = SCSI_SENSE_BUF_SIZE;
+
+if (ret != 0) {
+scsi_req_print(r-req);
+fprintf(stderr, %s: ret %d (%s)\n, __FUNCTION__,
+ret, strerror(-ret));
+s-senselen = scsi_build_sense(SENSE_CODE(INVALID_FIELD),
+s-sensebuf, SCSI_SENSE_BUF_SIZE, 0);
+s-driver_status = SG_ERR_DRIVER_SENSE;
+r-req.status = CHECK_CONDITION;
+} else {
+if (s-driver_status  SG_ERR_DRIVER_TIMEOUT) {
+scsi_req_print(r-req);
+fprintf(stderr, %s: timeout\n, __FUNCTION__);
+r-req.status = BUSY  1;
+} else if

[Qemu-devel] [PATCH 1/5] [block]: Add top level BSG support

2010-06-14 Thread Nicholas A. Bellinger

From: Nicholas Bellinger n...@linux-iscsi.org

This patch adds top level BSG support to QEMU-KVM block and adds the
BDS_* prefixed defines for SG_IO and BSG.

It adds the BDS_SCSI_GENERIC and BDS_BSG assignments in 
block/raw-posix.c:hdev_open()
using S_ISCHR() and major(st.st_rdev) in order to determine when we are dealing 
with
scsi-generic or scsi-bsg backstores.

It also adds a special case BSG check in find_protocol() using the BSG
major in order to avoid the strchr() for ':' as we expect filenames to
contain /dev/bsg/H:C:T:L.

This path also adds a struct BlockDriverState-fd to save a opened file 
descriptor
with format_name = 'raw' for use by scsi-bsg.

Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
---
 block.c   |   23 ---
 block.h   |1 +
 block/raw-posix.c |   17 +++--
 block_int.h   |5 +
 4 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 88dbc00..3cd18ec 100644
--- a/block.c
+++ b/block.c
@@ -285,8 +285,12 @@ static BlockDriver *find_protocol(const char *filename)
 {
 BlockDriver *drv1;
 char protocol[128];
-int len;
+int len, bsg = 0;
 const char *p;
+#if defined(__linux__)
+struct stat st;
+#endif
+
 
 /* TODO Drivers without bdrv_file_open must be specified explicitly */
 
@@ -296,7 +300,15 @@ static BlockDriver *find_protocol(const char *filename)
 return bdrv_find_format(file);
 #endif
 p = strchr(filename, ':');
-if (!p) {
+#if defined(__linux__)
+if (stat(filename, st)  0)
+return NULL;
+/* This is not yet defined in include/linux/major.h.. */
+if (S_ISCHR(st.st_mode)  major(st.st_rdev) == 254)
+bsg = 1;
+#endif
+
+if (!p || bsg) {
 drv1 = find_hdev_driver(filename);
 if (!drv1) {
 drv1 = bdrv_find_format(file);
@@ -1209,7 +1221,12 @@ int bdrv_is_read_only(BlockDriverState *bs)
 
 int bdrv_is_sg(BlockDriverState *bs)
 {
-return bs-sg;
+return bs-sg == BDS_SCSI_GENERIC;
+}
+
+int bdrv_is_bsg(BlockDriverState *bs)
+{
+return bs-sg == BDS_BSG;
 }
 
 int bdrv_enable_write_cache(BlockDriverState *bs)
diff --git a/block.h b/block.h
index f87d24e..4faeedc 100644
--- a/block.h
+++ b/block.h
@@ -146,6 +146,7 @@ int bdrv_get_translation_hint(BlockDriverState *bs);
 int bdrv_is_removable(BlockDriverState *bs);
 int bdrv_is_read_only(BlockDriverState *bs);
 int bdrv_is_sg(BlockDriverState *bs);
+int bdrv_is_bsg(BlockDriverState *bs);
 int bdrv_enable_write_cache(BlockDriverState *bs);
 int bdrv_is_inserted(BlockDriverState *bs);
 int bdrv_media_changed(BlockDriverState *bs);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 1515ca9..d349109 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -52,6 +52,7 @@
 #include sys/ioctl.h
 #include linux/cdrom.h
 #include linux/fd.h
+#include linux/major.h
 #endif
 #if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
 #include signal.h
@@ -843,6 +844,9 @@ static int hdev_probe_device(const char *filename)
 static int hdev_open(BlockDriverState *bs, const char *filename, int flags)
 {
 BDRVRawState *s = bs-opaque;
+#if defined(__linux__)
+struct stat st;
+#endif
 
 #ifdef CONFIG_COCOA
 if (strstart(filename, /dev/cdrom, NULL)) {
@@ -873,8 +877,17 @@ static int hdev_open(BlockDriverState *bs, const char 
*filename, int flags)
 
 s-type = FTYPE_FILE;
 #if defined(__linux__)
-if (strstart(filename, /dev/sg, NULL)) {
-bs-sg = 1;
+if (stat(filename, st)  0) {
+printf(stat() failed errno: %d\n, errno);
+return -1;
+}
+if (S_ISCHR(st.st_mode)) {
+if (major(st.st_rdev) == SCSI_GENERIC_MAJOR) {
+bs-sg = BDS_SCSI_GENERIC;
+} else if (major(st.st_rdev) == 254) {
+/* This is not yet defined in include/linux/major.h.. */
+bs-sg = BDS_BSG;
+}
 }
 #endif
 
diff --git a/block_int.h b/block_int.h
index 1a7240c..74bcb1a 100644
--- a/block_int.h
+++ b/block_int.h
@@ -40,6 +40,10 @@
 #define BLOCK_OPT_CLUSTER_SIZE  cluster_size
 #define BLOCK_OPT_PREALLOC  preallocation
 
+#define BDS_NONE0
+#define BDS_SCSI_GENERIC1
+#define BDS_BSG 2
+
 typedef struct AIOPool {
 void (*cancel)(BlockDriverAIOCB *acb);
 int aiocb_size;
@@ -141,6 +145,7 @@ struct BlockDriverState {
 int encrypted; /* if true, the media is encrypted */
 int valid_key; /* if true, a valid encryption key has been set */
 int sg;/* if true, the device is a /dev/sg* */
+int fd;/* Used for BSG file descriptor */
 /* event callback when inserting/removing */
 void (*change_cb)(void *opaque);
 void *change_opaque;
-- 
1.5.6.5

[Qemu-devel] [PATCH 2/5] [block]: Add BSG qemu_open() in block/raw.c:raw_open()

2010-06-14 Thread Nicholas A. Bellinger

From: Nicholas Bellinger n...@linux-iscsi.org

This patch adds a BSG specific qemu_open() call in block/raw.c:raw_open() that
saves the opened file descriptor for BSG AIO into BlockDriverState-fd.

It also adds the reverse close() call to block/raw.c:raw_close()

Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
---
 block/raw.c |   20 
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/block/raw.c b/block/raw.c
index 4406b8c..7820c78 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -5,7 +5,25 @@
 
 static int raw_open(BlockDriverState *bs, int flags)
 {
+int fd, ret;
+
 bs-sg = bs-file-sg;
+/*
+ * scsi-generic and other raw types do not call qemu_open()
+  */
+if (bs-sg != BDS_BSG)
+return 0;
+/*
+ * Obtain a file descriptor for the underlying BSG device for AIO w/ iovecs
+ */
+fd = qemu_open(bs-filename, flags, 0644);
+if (fd  0) {
+ret = -errno;
+if (ret == -EROFS)
+ret = -EACCES;
+return ret;
+}
+bs-fd = fd;
 return 0;
 }
 
@@ -37,6 +55,8 @@ static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
 
 static void raw_close(BlockDriverState *bs)
 {
+if (bs-fd  0)
+close(bs-fd);
 }
 
 static void raw_flush(BlockDriverState *bs)
-- 
1.5.6.5

Re: [Qemu-devel] [PATCH] qemu-iotests: qcow2 error path tests

2010-06-14 Thread Christoph Hellwig

On Fri, Jun 04, 2010 at 07:35:24PM +0200, Kevin Wolf wrote:
 This adds test cases for qcow2 error paths (using blkdebug)

Thanks, applied.

What's the plan for getting the fixes this tests into mainline?

Re: [Qemu-devel] [Bug 524447] Re: virsh save is very slow

2010-06-14 Thread Michael Tokarev


14.06.2010 13:37, Dave Walker wrote:

Changing to libvirt as commentary here, and on the upstream bug report
by Cole indicate a fix has been commit that improves this performance.


Um.  This is not that simple, apparently.

I did some tests after this bug were discussed/mentioned last time,
and see absolutely no difference between 512 and 1M dd block size.
Well, there IS some difference, like several percents, but it is
still very slow, and is doing nothing -- the machine is 99% idle
while saving the state.  Also, when using, say, gzip or lzop in
place of dd (both of them perform input in units larger than 512
bytes), it is still as slow.

strace shows one of kvm threads is slowly poking eventfd all the
time, while others are doing right to nothing.

It looks like some bad timing issue.  The fact that changing block
size helps sometimes - I think it's because the improvements of
timing (mix-n-match), not because of real improvements.


** Package changed: qemu-kvm (Ubuntu) =  libvirt (Ubuntu)


In any way, the problem is definitely NOT in libvirt, since the
same slowness is observed without libvirt.

Thanks!

/mjt

[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2010-06-14 Thread Dave Walker

Re-introducing qemu-kvm, as commentary on qemu-devel mailing list
suggest there could be a timing concern meaning poor performance.
Leaving Libvirt on this report, as upstream libvirt have quoted improved
performance adjusting the block size for dd.  However, Qemu feel that
the real issue is in the qemu/kvm code.

Thanks.

** Also affects: qemu-kvm (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: qemu-kvm (Ubuntu)
   Status: New = Confirmed

** Changed in: qemu-kvm (Ubuntu)
   Importance: Undecided = Wishlist

-- 
virsh save is very slow
https://bugs.launchpad.net/bugs/524447
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.

Status in libvirt virtualization API: Unknown
Status in QEMU: Invalid
Status in “libvirt” package in Ubuntu: Confirmed
Status in “qemu-kvm” package in Ubuntu: Confirmed

Bug description:
As reported here: 
http://www.redhat.com/archives/libvir-list/2009-December/msg00203.html

virsh save is very slow - it writes the image at around 1MB/sec on my test 
system.

(I think I saw a bug report for this issue on Fedora's bugzilla, but I can't 
find it now...)

Confirmed under Karmic.

[Qemu-devel] Re: [PATCH 3/5] [block]: Add paio_submit_len() non sector sized AIO

2010-06-14 Thread Christoph Hellwig

On Mon, Jun 14, 2010 at 02:44:31AM -0700, Nicholas A. Bellinger wrote:
 From: Nicholas Bellinger n...@linux-iscsi.org
 
 This patch adds posix-aio-compat.c:paio_submit_len(), which is a identical
 to paio_submit() expect that in expected nb_len instead of nb_sectors (* 512)
 so that it can be used by BSG AIO for write()/read() of struct sg_io_v4.

Jusre remove the nb_sectors argument, we already get the length passed
in the iovec.

[Qemu-devel] [PATCH] SeaBIOS: Fix bvprintf() to respect padding for hex printing.

2010-06-14 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Fix bvprintf to respect space padding when printing hex numbers
and the caller specifies alignment without zero padding, eg. %2x
as opposed to %02x

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 src/output.c |   27 +--
 1 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/src/output.c b/src/output.c
index 3de565a..a0d07e3 100644
--- a/src/output.c
+++ b/src/output.c
@@ -195,7 +195,7 @@ putsinglehex(struct putcinfo *action, u32 val)
 
 // Output an integer in hexadecimal.
 static void
-puthex(struct putcinfo *action, u32 val, int width)
+puthex(struct putcinfo *action, u32 val, int width, int spacepad)
 {
 if (!width) {
 u32 tmp = val;
@@ -210,6 +210,17 @@ puthex(struct putcinfo *action, u32 val, int width)
 }
 if (tmp  0xf)
 width += 1;
+} else if (spacepad)  {
+int count = 1;
+u32 tmp = val;
+while (tmp = 4) {
+count++;
+}
+count = width - count;
+width -= count;
+while (count--) {
+putc(action, ' ');
+}
 }
 
 switch (width) {
@@ -244,11 +255,15 @@ bvprintf(struct putcinfo *action, const char *fmt, 
va_list args)
 }
 const char *n = s+1;
 int field_width = 0;
+int spacepad = 1;
 for (;;) {
 c = GET_GLOBAL(*(u8*)n);
 if (!isdigit(c))
 break;
-field_width = field_width * 10 + c - '0';
+if (!field_width  (c == '0'))
+spacepad = 0;
+else
+field_width = field_width * 10 + c - '0';
 n++;
 }
 if (c == 'l') {
@@ -281,7 +296,7 @@ bvprintf(struct putcinfo *action, const char *fmt, va_list 
args)
 field_width = 8;
 case 'x':
 val = va_arg(args, s32);
-puthex(action, val, field_width);
+puthex(action, val, field_width, spacepad);
 break;
 case 'c':
 val = va_arg(args, int);
@@ -333,7 +348,7 @@ __dprintf(const char *fmt, ...)
 if (cur != MainThread) {
 // Show thread id for this debug message.
 putc_debug(debuginfo, '|');
-puthex(debuginfo, (u32)cur, 8);
+puthex(debuginfo, (u32)cur, 8, 0);
 putc_debug(debuginfo, '|');
 putc_debug(debuginfo, ' ');
 }
@@ -411,12 +426,12 @@ hexdump(const void *d, int len)
 while (len  0) {
 if (count % 8 == 0) {
 putc(debuginfo, '\n');
-puthex(debuginfo, count*4, 8);
+puthex(debuginfo, count*4, 8, 0);
 putc(debuginfo, ':');
 } else {
 putc(debuginfo, ' ');
 }
-puthex(debuginfo, *(u32*)d, 8);
+puthex(debuginfo, *(u32*)d, 8, 0);
 count++;
 len-=4;
 d+=4;
-- 
1.6.5.2

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Alex Williamson

On Mon, 2010-06-14 at 08:39 +0200, Markus Armbruster wrote:
 Alex Williamson alex.william...@redhat.com writes:
 
  qdev_get_dev_path() is intended to be the canonical utility for creating
  a string representing the qdev hierarchy of a device.  The path consists
  of bus and device names as well as identified properties of the immediate
  parent bus and device.  This results in strings like:
 
  /main-system-bus/pci.0,addr=00.0/i440FX
  /main-system-bus/pci.0,addr=01.0/PIIX3
  /main-system-bus/pci.0,addr=02.0/cirrus-vga
  /main-system-bus/pci.0/isa.0/mc146818rtc
  /main-system-bus/pci.0/isa.0/isa-serial
  /main-system-bus/pci.0/isa.0/i8042
  /main-system-bus/pci.0/isa.0/isa-fdc
  /main-system-bus/pci.0,addr=03.0/i82551,mac=52:54:00:12:34:56
  /main-system-bus/pci.0,addr=04.0/virtio-net-pci,mac=52:54:00:12:34:57
  /main-system-bus/pci.0,addr=05.0/e1000,mac=52:54:00:12:34:58
  /main-system-bus/pci.0,addr=06.0/rtl8139,mac=52:54:00:12:34:59
  /main-system-bus/pci.0,addr=07.0/pcnet,mac=52:54:00:12:34:5a
  /main-system-bus/pci.0,addr=01.1/piix3-ide
  /main-system-bus/pci.0,addr=01.3/PIIX4_PM
  /main-system-bus/pci.0,addr=08.0/lsi53c895a
  /main-system-bus/pci.0,addr=09.0/virtio-blk-pci
 
  There are two primary targets for these strings.  The first is vmsave, where
  we currently use a device provided string plus instance number to track
  SaveStateEntries.  This fails when we introduce device hotplug, particularly
  in a case were we have gaps in the instance numbers that cannot easily be
  reproduced on a migration target.  The second is for naming RAMBlocks.  For
  these, we would like a string that matches the vmstate string.
 
 Could you explain why you add identified properties of the immediate
 parent bus and device?  They make the result ver much *not* a dev
 path in the qdev sense...

In order to try to get a unique string.  Without looking into device
properties, two e1000s would both be:

/main-system-bus/pci.0/e1000
/main-system-bus/pci.0/e1000

Which is no better than simply e1000 and would require us to fall back
to instance ids again.  The goal here is that anything that makes use of
passing a dev when registering a vmstate gets an instance id of zero.

Alex

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Paul Brook

   /main-system-bus/pci.0,addr=09.0/virtio-blk-pci

There's a device missing between the main system bus and the pci bus.  Should 
be something like:

/main-system-bus/piix4-pcihost/pci.0/_09.0

  Could you explain why you add identified properties of the immediate
  parent bus and device?  They make the result ver much *not* a dev
  path in the qdev sense...
 
 In order to try to get a unique string.  Without looking into device
 properties, two e1000s would both be:
 
 /main-system-bus/pci.0/e1000
 /main-system-bus/pci.0/e1000
 
 Which is no better than simply e1000 and would require us to fall back
 to instance ids again.  The goal here is that anything that makes use of
 passing a dev when registering a vmstate gets an instance id of zero.

You already got the information you need, you just put it in the wrong place. 
The canonical ID for the device could be its bus address. In practice we'd 
probably want to allow the user to specify it by name, provided these are 
unique. e.g. in the above machine we could accept [...]/virtiio-blk-pci would 
as an aias for [...]:_09.0. Device names have a restricted namespace, so we 
can use an initial prefix to disambiguate a name/label from a bus address.

For busses that don't have a consistent addressing scheme then some sort of 
instance ID is unavoidable. I guess it may be possible to invent something 
based on other device properties (e.g. address of the first IO port/memory 
region).

Paul

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Jan Kiszka

Alex Williamson wrote:
 On Mon, 2010-06-14 at 08:39 +0200, Markus Armbruster wrote:
 Alex Williamson alex.william...@redhat.com writes:

 qdev_get_dev_path() is intended to be the canonical utility for creating
 a string representing the qdev hierarchy of a device.  The path consists
 of bus and device names as well as identified properties of the immediate
 parent bus and device.  This results in strings like:

 /main-system-bus/pci.0,addr=00.0/i440FX
 /main-system-bus/pci.0,addr=01.0/PIIX3
 /main-system-bus/pci.0,addr=02.0/cirrus-vga
 /main-system-bus/pci.0/isa.0/mc146818rtc
 /main-system-bus/pci.0/isa.0/isa-serial
 /main-system-bus/pci.0/isa.0/i8042
 /main-system-bus/pci.0/isa.0/isa-fdc
 /main-system-bus/pci.0,addr=03.0/i82551,mac=52:54:00:12:34:56
 /main-system-bus/pci.0,addr=04.0/virtio-net-pci,mac=52:54:00:12:34:57
 /main-system-bus/pci.0,addr=05.0/e1000,mac=52:54:00:12:34:58
 /main-system-bus/pci.0,addr=06.0/rtl8139,mac=52:54:00:12:34:59
 /main-system-bus/pci.0,addr=07.0/pcnet,mac=52:54:00:12:34:5a
 /main-system-bus/pci.0,addr=01.1/piix3-ide
 /main-system-bus/pci.0,addr=01.3/PIIX4_PM
 /main-system-bus/pci.0,addr=08.0/lsi53c895a
 /main-system-bus/pci.0,addr=09.0/virtio-blk-pci

 There are two primary targets for these strings.  The first is vmsave, where
 we currently use a device provided string plus instance number to track
 SaveStateEntries.  This fails when we introduce device hotplug, particularly
 in a case were we have gaps in the instance numbers that cannot easily be
 reproduced on a migration target.  The second is for naming RAMBlocks.  For
 these, we would like a string that matches the vmstate string.
 Could you explain why you add identified properties of the immediate
 parent bus and device?  They make the result ver much *not* a dev
 path in the qdev sense...
 
 In order to try to get a unique string.  Without looking into device
 properties, two e1000s would both be:
 
 /main-system-bus/pci.0/e1000
 /main-system-bus/pci.0/e1000
 
 Which is no better than simply e1000 and would require us to fall back
 to instance ids again.  The goal here is that anything that makes use of
 passing a dev when registering a vmstate gets an instance id of zero.

Will soon (re-)post a patch that adds per-bus instance numbers to deal
with identically named devices. That's required as not all buses have
canonical node IDs (e.g. ISA or the main system bus).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] ARM/system mode/stdin

2010-06-14 Thread Christophe LYON


Hello,

I am trying to use qemu-system-arm (0.12.3) to execute an ARM bare 
machine program (not a Linux kernel), and I have some trouble when the 
program in question tries to read from stdin.


The program does use ARM semihosting to communicate with the host.

Here is the sample code:
==
#include stdio.h
int main() {
  int i, j;
  char tab[81];
  printf(enter a number\n);
  scanf(%d, i);
  printf(enter a string (max: 80 char)\n);
  scanf(%80s, tab);
  printf(enter a number\n);
  scanf(%d, j);

  printf(You entered:\n%d\n%s\n%d\n, i, tab, j);

  return 0;
}
===
(compiled with arm-none-eabi-gcc -O0 -mcpu=cortex-a9)

To execute it, I use:
$ qemu-system-arm -semihosting -cpu cortex-a9 -nographic -kernel ./input.exe
In this case, it does not wait for me to type something:
enter a number
enter a string (max: 80 char)
enter a number
You entered:
134217720

1

Now, if
$ cat data
4 hello 5
$ qemu-system-arm -semihosting -cpu cortex-a9 -nographic -kernel 
./input.exe  data

enter a number
enter a string (max: 80 char)
enter a number
You entered:
134217720
ello
5

as you can see, the first 3 chars of 'data' have been changed/swallowed.

If I remove
fcntl(0, F_SETFL, O_NONBLOCK);
from term_init() (qemu-char.c), then I get expected behaviour when I 
don't redirect stdin, but this has no effect on the latter case.



What should I do/patch to achieve the behaviour I'd like?

Thanks!

Christophe.

[Qemu-devel] Q35 qemu repository?

2010-06-14 Thread Matthew Garrett

Hi there,

I'm currently doing some work on tidying up Linux's PCIe ASPM support, 
and one thing that would be useful would be to be able to instrument 
reads and writes made by Windows to PCIe space. I noticed that you've 
been working on Q35 support for qemu - is there a public repository 
which contains your patch sets?

Thank you,
-- 
Matthew Garrett | mj...@srcf.ucam.org

[Qemu-devel] Re: [PATCH v3 0/5] Add QMP migration events

2010-06-14 Thread Anthony Liguori


On 06/12/2010 06:14 AM, Juan Quintela wrote:

Anthony Liguorianth...@codemonkey.ws  wrote:
   

On 06/11/2010 09:30 AM, Luiz Capitulino wrote:
 

On Thu, 10 Jun 2010 12:44:55 +0200
Juan Quintelaquint...@redhat.com   wrote:
   
   

I think we've more or less agreed that MIGRATION_CONNECTED is really
the event we want.
 

This had the problem of migrating to a file/whatever.
   


With migrating to exec, the MIGRATION_STARTED event fires as soon as you 
launch qemu which is also very unlikely to actually be when a QMP client 
is connected.  IOW, I'm fairly certain you'll never see a 
MIGRATION_STARTED event with QMP for exec migration anyway.


It's not useful information.  MIGRATION_CONNECTED is useful information 
and something we'll want to support long term.



Yes.  Today, we should just generate a MIGRATION_DONE event and let a
client poll for failure status.
 

I disagree completely.  It just defeat the reason for this.

MIGRATION_ENDED on destination machine: go ahead, everything is ok.
MIGRATION_FAILED: Uh, oh, something got wrong, we are in the slow path.

With MIGRATION_DONE + polling, we are delaying the success case just
to avoid having a new event.  I don't buy it.
   


This is an event that we plan on deprecating immediately.  Why introduce 
two events that are going to be immediately deprecated when we can just 
introduce one?



I still think that we want the 4 events that I described.  My
understanding is that libvirt people also would love to have that 4
events.

Answer here is that: you can do this workaround and this other
workaround and you can get that information.

About semantics of messages, I don't see anytime soon that migration are
going to change from:

Start migration and then end with success or failure.

The only one that we could change/remove is MIGRATION_CANCEL to
MIGRATION_FAILURE(User canceled) it.  But that is it.

Why have to do a polling when none is needed?
If you preffer to change the MIGRATION_ENDED + MIGRATION_FAILURE(error)
to MIGRATION_ENDED(result code), and you have to check the error code, I
can also live with that.  But that is it.
   


We will have a MIGRATION_ENDED(result code) event for 0.14.  It'll be a 
proper async migrate command.  But we have no way to do this sanely 
because we can't generate rich errors during migrate and we can't pass 
QErrors over async events.


For 0.13, we need to focus on introducing the least disruptive change 
that addresses the fundamental requirement--allow clients to avoid a 
polling loop for determining when migration ends.  Having a single event 
with no payload is an extremely simple change.


Regards,

Anthony Liguori


Later, Juan.

[Qemu-devel] Re: [SeaBIOS] [PATCHv2] load hpet info for HPET ACPI table from qemu

2010-06-14 Thread Kevin O'Connor

On Mon, Jun 14, 2010 at 11:30:53AM +0300, Gleb Natapov wrote:
 Load hpet info for HPET ACPI table from qemu instead of using hardcoded
 values. Use hardcoded values anyway if old qemu is detected.

The current code does a lot of mixing of qemu provided and seabios
provided data to build the acpi tables.  This patch extends that.
Unfortunately, I find it confusing because someone then needs to look
through both the seabios and qemu code to understand how the acpi
tables are generated.

Could we just have qemu build the hpet tables and pass them through to
seabios?  Perhaps using the qemu_cfg_acpi_additional_tables() method.

-Kevin

[Qemu-devel] Re: [PATCH v3 0/5] Add QMP migration events

2010-06-14 Thread Luiz Capitulino

On Mon, 14 Jun 2010 08:58:19 -0500
Anthony Liguori aligu...@linux.vnet.ibm.com wrote:

 For 0.13, we need to focus on introducing the least disruptive change 
 that addresses the fundamental requirement--allow clients to avoid a 
 polling loop for determining when migration ends.  Having a single event 
 with no payload is an extremely simple change.

 Agreed.

 Actually, I'm slightly against introducing the done event too. While the
polling is undesired, I don't think that having an event only for a release
is worth it.

[Qemu-devel] Re: [PATCH v3 0/5] Add QMP migration events

2010-06-14 Thread Anthony Liguori


On 06/14/2010 09:24 AM, Luiz Capitulino wrote:

On Mon, 14 Jun 2010 08:58:19 -0500
Anthony Liguorialigu...@linux.vnet.ibm.com  wrote:

   

For 0.13, we need to focus on introducing the least disruptive change
that addresses the fundamental requirement--allow clients to avoid a
polling loop for determining when migration ends.  Having a single event
with no payload is an extremely simple change.
 

  Agreed.

  Actually, I'm slightly against introducing the done event too. While the
polling is undesired, I don't think that having an event only for a release
is worth it.
   


I don't think we can realistically remove it.  In my mind, what 
deprecate means with respect to QMP is that we never add additional 
features to the deprecated commands.


Right now, 'migrate', 'info migrate', and MIGRATE_DONE do not provide 
very rich error reporting.  They really just provide a boolean 
failed/succeeded.


For 0.14, we'll have to introduce a 'migrate2' command and deprecate the 
old 'migrate' command.  'migrate2' will provide rich error reporting and 
all new features will be added to 'migrate2'.


BTW, I lack any kind of naming creativity so 'migrate2' is not 
necessarily what it ought to be called.


Regards,

Anthony Liguori

[Qemu-devel] Re: [PATCH v3 0/5] Add QMP migration events

2010-06-14 Thread Luiz Capitulino

On Sat, 12 Jun 2010 13:20:54 +0200
Juan Quintela quint...@redhat.com wrote:

 Luiz Capitulino lcapitul...@redhat.com wrote:
  On Fri, 11 Jun 2010 09:38:42 -0500
  Anthony Liguori anth...@codemonkey.ws wrote:
 
 1. QMP only returns the response when the command is finished, eg:
  
C: { execute: migrate, id: foo ... }
/* nothing is returned, other commands are issued, after several 
   hours... */
S: { return: ... id: foo }
  
  
  This is how just about every RPC mechanism works...
 
   Let's go for it then.
 
   - MIGRATION_STARTED:  somebody started a migration, it is emited on
  source and target, all monitors receive this event.

 The client has started the migration, it knows it. Why is the event 
   needed?
  
  
  I think we've more or less agreed that MIGRATION_CONNECTED is really the 
  event we want.
 
   Is it useful in the source or in the target?
 
 Both.

 What does it report in the source?

   - MIGRATION_ENDED: migration ended with sucess, all needed data is in
  target machine.  Also emitted in all monitors on source and target.
  
   - MIGRATION_CANCELED: in one of the source monitors somebody typed:
  migrate_cancel.  It is only emmited on the source monitors, target
  monitors will receive a MIGRATION_FAILED event.
  
   - MIGRATION_FAILED (with this error).  At this point we don't have
  neither the QMP infraestructure for sending (with this error) nor
  migration infrastructure to put there anything different than -1.

 Aren't all the three events above duplicating the async response?
  
  
  Yes.  Today, we should just generate a MIGRATION_DONE event and let a 
  client poll for failure status.
 
  [...]
 
  MIGRATION_DONE gets deprecated for 0.14.
 
   Yeah, this removes the need for polling in 0.13, but I was wondering if
  it's worth it. If I'm not mistaken, libvirt does the polling when working
  with the text Monitor and I believe it's not a big deal to keep it until 
  0.14.
 
 It makes things slower for no good reason.
 
 This reasoning is sneaky at least: 
 - qemu didn't give interfaces to libvirt for do what libvirt wanted
 - libvirt uses workarounds
 - qemu tells libvirt that they are using workarounds that they shouldn't
 - libvirt tells qemu why they need the new interface
 - qemu tells libvirt that they could continue to use its workarounds.
 
 I am losing something?  The whole point of live migration is that they
 need to be as fast as possible.  For some scenaries (shared storage with
 funny locking) libvirt needs to move from shared locks to normal locks
 as far as migration ends on target.  We are telling them to do
 workarounds becauese qemu don't want to tell libvirt on destination when
 migration has ended.
 
 Why can't we just tell them that migration has ended with success as
 fast as possible?
 
 I can't understand what I am missing here.  I can't believe that
 libvirt(management app in general) could came with a simple request that
 would make its live better.  And to make things worse, it is _trivial_
 to implement, semantics are clear, has other uses, .

 I'd be ok in having the done event if the polling is shown to be that
problematic, but from some private talks I had with libvirt guys it seemed
to me that it's more like a wish have.

 For us, having this event means that we'll have to carry it for (at least)
several releases. I really would like to avoid adding one-time workarounds
in a stable interface like QMP.

[Qemu-devel] [PATCH 1/2] qcow2: Simplify image creation

2010-06-14 Thread Kevin Wolf

Instead of doing lots of magic for setting up initial refcount blocks and stuff
create a minimal (inconsistent) image, open it and initialize the rest with
regular qcow2 functions.

This is a complete rewrite of the image creation function. The old
implementating is #ifdef'd out and will be removed by the next patch (removing
it here would have made the diff unreadable because diff tries to find
similarities when it's really a rewrite)

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |  126 -
 1 files changed, 125 insertions(+), 1 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 33fa9a9..acb850c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -27,6 +27,7 @@
 #include zlib.h
 #include aes.h
 #include block/qcow2.h
+#include qemu-error.h
 
 /*
   Differences with QCOW:
@@ -767,6 +768,7 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
 return qcow2_update_ext_header(bs, backing_file, backing_fmt);
 }
 
+#if 0
 static int get_bits_from_size(size_t size)
 {
 int res = 0;
@@ -787,6 +789,7 @@ static int get_bits_from_size(size_t size)
 
 return res;
 }
+#endif
 
 
 static int preallocate(BlockDriverState *bs)
@@ -839,6 +842,7 @@ static int preallocate(BlockDriverState *bs)
 return 0;
 }
 
+#if 0
 static int qcow_create2(const char *filename, int64_t total_size,
 const char *backing_file, const char *backing_format,
 int flags, size_t cluster_size, int prealloc)
@@ -1036,6 +1040,126 @@ exit:
 
 return ret;
 }
+#else
+static int qcow_create2(const char *filename, int64_t total_size,
+const char *backing_file, const char *backing_format,
+int flags, size_t cluster_size, int prealloc,
+QEMUOptionParameter *options)
+{
+/* Calulate cluster_bits */
+int cluster_bits;
+cluster_bits = ffs(cluster_size) - 1;
+if (cluster_bits  MIN_CLUSTER_BITS || cluster_bits  MAX_CLUSTER_BITS ||
+(1  cluster_bits) != cluster_size)
+{
+error_report(
+Cluster size must be a power of two between %d and %dk\n,
+1  MIN_CLUSTER_BITS, 1  (MAX_CLUSTER_BITS - 10));
+return -EINVAL;
+}
+
+/*
+ * Open the image file and write a minimal qcow2 header.
+ *
+ * We keep things simple and start with a zero-sized image. We also
+ * do without refcount blocks or a L1 table for now. We'll fix the
+ * inconsistency later.
+ *
+ * We do need a refcount table because growing the refcount table means
+ * allocating two new refcount blocks - the seconds of which would be at
+ * 2 GB for 64k clusters, and we don't want to have a 2 GB initial file
+ * size for any qcow2 image.
+ */
+BlockDriverState* bs;
+QCowHeader header;
+uint8_t* refcount_table;
+int ret;
+
+ret = bdrv_create_file(filename, options);
+if (ret  0) {
+return ret;
+}
+
+ret = bdrv_file_open(bs, filename, BDRV_O_RDWR);
+if (ret  0) {
+return ret;
+}
+
+/* Write the header */
+memset(header, 0, sizeof(header));
+header.magic = cpu_to_be32(QCOW_MAGIC);
+header.version = cpu_to_be32(QCOW_VERSION);
+header.cluster_bits = cpu_to_be32(cluster_bits);
+header.size = cpu_to_be64(0);
+header.l1_table_offset = cpu_to_be64(0);
+header.l1_size = cpu_to_be32(0);
+header.refcount_table_offset = cpu_to_be64(cluster_size);
+header.refcount_table_clusters = cpu_to_be32(1);
+
+if (flags  BLOCK_FLAG_ENCRYPT) {
+header.crypt_method = cpu_to_be32(QCOW_CRYPT_AES);
+} else {
+header.crypt_method = cpu_to_be32(QCOW_CRYPT_NONE);
+}
+
+ret = bdrv_pwrite(bs, 0, header, sizeof(header));
+if (ret  0) {
+return ret;
+}
+
+/* Write an empty refcount table */
+refcount_table = qemu_mallocz(cluster_size);
+ret = bdrv_pwrite(bs, cluster_size, refcount_table, cluster_size);
+qemu_free(refcount_table);
+
+if (ret  0) {
+return ret;
+}
+
+bdrv_close(bs);
+
+/*
+ * And now open the image and make it consistent first (i.e. increase the
+ * refcount of the cluster that is occupied by the header and the refcount
+ * table)
+ */
+BlockDriver* drv = bdrv_find_format(qcow2);
+assert(drv != NULL);
+ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv);
+if (ret  0) {
+return ret;
+}
+
+ret = qcow2_alloc_clusters(bs, 2 * cluster_size);
+if (ret  0) {
+return ret;
+} else if (ret != 0) {
+error_report(Huh, first cluster in empty image is already in use?);
+abort();
+}
+
+/* Okay, now that we have a valid image, let's give it the right size */
+ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
+if (ret  0) {
+return ret;
+}
+
+/* Want a backing file? There you go.*/
+if

[Qemu-devel] [PATCH 2/2] qcow2: Remove old image creation function

2010-06-14 Thread Kevin Wolf

They have been #ifdef'd out by the previous patch.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |  224 -
 1 files changed, 0 insertions(+), 224 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index acb850c..6f26564 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -768,30 +768,6 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
 return qcow2_update_ext_header(bs, backing_file, backing_fmt);
 }
 
-#if 0
-static int get_bits_from_size(size_t size)
-{
-int res = 0;
-
-if (size == 0) {
-return -1;
-}
-
-while (size != 1) {
-/* Not a power of two */
-if (size  1) {
-return -1;
-}
-
-size = 1;
-res++;
-}
-
-return res;
-}
-#endif
-
-
 static int preallocate(BlockDriverState *bs)
 {
 uint64_t nb_sectors;
@@ -842,205 +818,6 @@ static int preallocate(BlockDriverState *bs)
 return 0;
 }
 
-#if 0
-static int qcow_create2(const char *filename, int64_t total_size,
-const char *backing_file, const char *backing_format,
-int flags, size_t cluster_size, int prealloc)
-{
-
-int fd, header_size, backing_filename_len, l1_size, i, shift, l2_bits;
-int ref_clusters, reftable_clusters, backing_format_len = 0;
-int rounded_ext_bf_len = 0;
-QCowHeader header;
-uint64_t tmp, offset;
-uint64_t old_ref_clusters;
-QCowCreateState s1, *s = s1;
-QCowExtension ext_bf = {0, 0};
-int ret;
-
-memset(s, 0, sizeof(*s));
-
-fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
-if (fd  0)
-return -errno;
-memset(header, 0, sizeof(header));
-header.magic = cpu_to_be32(QCOW_MAGIC);
-header.version = cpu_to_be32(QCOW_VERSION);
-header.size = cpu_to_be64(total_size * 512);
-header_size = sizeof(header);
-backing_filename_len = 0;
-if (backing_file) {
-if (backing_format) {
-ext_bf.magic = QCOW_EXT_MAGIC_BACKING_FORMAT;
-backing_format_len = strlen(backing_format);
-ext_bf.len = backing_format_len;
-rounded_ext_bf_len = (sizeof(ext_bf) + ext_bf.len + 7)  ~7;
-header_size += rounded_ext_bf_len;
-}
-header.backing_file_offset = cpu_to_be64(header_size);
-backing_filename_len = strlen(backing_file);
-header.backing_file_size = cpu_to_be32(backing_filename_len);
-header_size += backing_filename_len;
-}
-
-/* Cluster size */
-s-cluster_bits = get_bits_from_size(cluster_size);
-if (s-cluster_bits  MIN_CLUSTER_BITS ||
-s-cluster_bits  MAX_CLUSTER_BITS)
-{
-fprintf(stderr, Cluster size must be a power of two between 
-%d and %dk\n,
-1  MIN_CLUSTER_BITS,
-1  (MAX_CLUSTER_BITS - 10));
-return -EINVAL;
-}
-s-cluster_size = 1  s-cluster_bits;
-
-header.cluster_bits = cpu_to_be32(s-cluster_bits);
-header_size = (header_size + 7)  ~7;
-if (flags  BLOCK_FLAG_ENCRYPT) {
-header.crypt_method = cpu_to_be32(QCOW_CRYPT_AES);
-} else {
-header.crypt_method = cpu_to_be32(QCOW_CRYPT_NONE);
-}
-l2_bits = s-cluster_bits - 3;
-shift = s-cluster_bits + l2_bits;
-l1_size = (((total_size * 512) + (1LL  shift) - 1)  shift);
-offset = align_offset(header_size, s-cluster_size);
-s-l1_table_offset = offset;
-header.l1_table_offset = cpu_to_be64(s-l1_table_offset);
-header.l1_size = cpu_to_be32(l1_size);
-offset += align_offset(l1_size * sizeof(uint64_t), s-cluster_size);
-
-/* count how many refcount blocks needed */
-
-#define NUM_CLUSTERS(bytes) \
-(((bytes) + (s-cluster_size) - 1) / (s-cluster_size))
-
-ref_clusters = NUM_CLUSTERS(NUM_CLUSTERS(offset) * sizeof(uint16_t));
-
-do {
-uint64_t image_clusters;
-old_ref_clusters = ref_clusters;
-
-/* Number of clusters used for the refcount table */
-reftable_clusters = NUM_CLUSTERS(ref_clusters * sizeof(uint64_t));
-
-/* Number of clusters that the whole image will have */
-image_clusters = NUM_CLUSTERS(offset) + ref_clusters
-+ reftable_clusters;
-
-/* Number of refcount blocks needed for the image */
-ref_clusters = NUM_CLUSTERS(image_clusters * sizeof(uint16_t));
-
-} while (ref_clusters != old_ref_clusters);
-
-s-refcount_table = qemu_mallocz(reftable_clusters * s-cluster_size);
-
-s-refcount_table_offset = offset;
-header.refcount_table_offset = cpu_to_be64(offset);
-header.refcount_table_clusters = cpu_to_be32(reftable_clusters);
-offset += (reftable_clusters * s-cluster_size);
-s-refcount_block_offset = offset;
-
-for (i=0; i  ref_clusters; i++) {
-s-refcount_table[i] = cpu_to_be64(offset);
-offset += s-cluster_size;
-}
-
-s-refcount_block = qemu_mallocz(ref_clusters * s-cluster_size);
-
-

[Qemu-devel] [PATCH 1/2] Remove unused DEBUG defines from hw/msix.c

2010-06-14 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Remove unused DEBUG defines from hw/msix.c to avoid having anything
define the word DEBUG without any additions such as MSIX_DEBUG.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 hw/msix.c |9 -
 1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/hw/msix.c b/hw/msix.c
index 1613bb4..d99403a 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -42,15 +42,6 @@
 #define MSIX_MAX_ENTRIES 32
 
 
-#ifdef MSIX_DEBUG
-#define DEBUG(fmt, ...)   \
-do {  \
-  fprintf(stderr, %s:  fmt, __func__ , __VA_ARGS__);\
-} while (0)
-#else
-#define DEBUG(fmt, ...) do { } while(0)
-#endif
-
 /* Flag for interrupt controller to declare MSI-X support */
 int msix_supported;
 
-- 
1.6.6.1

[Qemu-devel] [PATCH 2/2] Change #define DEBUG to #define E1000_DEBUG in hw/e1000.c

2010-06-14 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Change #define DEBUG to #define E1000_DEBUG in hw/e1000.c to make
it possible to build QEMU with -DDEBUG

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 hw/e1000.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index 96d045d..0da65f9 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -33,9 +33,9 @@
 
 #include e1000_hw.h
 
-#define DEBUG
+#define E1000_DEBUG
 
-#ifdef DEBUG
+#ifdef E1000_DEBUG
 enum {
 DEBUG_GENERAL, DEBUG_IO,   DEBUG_MMIO, DEBUG_INTERRUPT,
 DEBUG_RX,  DEBUG_TX,   DEBUG_MDIC, DEBUG_EEPROM,
-- 
1.6.6.1

[Qemu-devel] [RESENT PATCH 2/2] vnc: sync lock modifier state on connect.

2010-06-14 Thread Gerd Hoffmann


Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 vnc.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/vnc.c b/vnc.c
index b25b6a1..039fb21 100644
--- a/vnc.c
+++ b/vnc.c
@@ -2300,6 +2300,7 @@ static void vnc_connect(VncDisplay *vd, int csock)
 if (vs-vd-lock_key_sync) {
 vs-led_notifier.notify = kbd_leds;
 qemu_add_led_event_notifier(vs-led_notifier);
+kbd_leds(vs-led_notifier);
 }
 
 vs-mouse_mode_notifier.notify = check_pointer_type_change;
-- 
1.6.5.2

[Qemu-devel] [RESENT PATCH 1/2] switch keyboard led state notification to notifiers.

2010-06-14 Thread Gerd Hoffmann


Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 console.h |   11 +++
 input.c   |   37 -
 vnc.c |   13 -
 vnc.h |2 +-
 4 files changed, 28 insertions(+), 35 deletions(-)

diff --git a/console.h b/console.h
index cac959f..171b32c 100644
--- a/console.h
+++ b/console.h
@@ -35,12 +35,6 @@ typedef struct QEMUPutMouseEntry {
 QTAILQ_ENTRY(QEMUPutMouseEntry) node;
 } QEMUPutMouseEntry;
 
-typedef struct QEMUPutLEDEntry {
-QEMUPutLEDEvent *put_led;
-void *opaque;
-QTAILQ_ENTRY(QEMUPutLEDEntry) next;
-} QEMUPutLEDEntry;
-
 void qemu_add_kbd_event_handler(QEMUPutKBDEvent *func, void *opaque);
 QEMUPutMouseEntry *qemu_add_mouse_event_handler(QEMUPutMouseEvent *func,
 void *opaque, int absolute,
@@ -48,11 +42,12 @@ QEMUPutMouseEntry 
*qemu_add_mouse_event_handler(QEMUPutMouseEvent *func,
 void qemu_remove_mouse_event_handler(QEMUPutMouseEntry *entry);
 void qemu_activate_mouse_event_handler(QEMUPutMouseEntry *entry);
 
-QEMUPutLEDEntry *qemu_add_led_event_handler(QEMUPutLEDEvent *func, void 
*opaque);
-void qemu_remove_led_event_handler(QEMUPutLEDEntry *entry);
+void qemu_add_led_event_notifier(Notifier *notify);
+void qemu_remove_led_event_notifier(Notifier *notify);
 
 void kbd_put_keycode(int keycode);
 void kbd_put_ledstate(int ledstate);
+int kbd_get_ledstate(void);
 void kbd_mouse_event(int dx, int dy, int dz, int buttons_state);
 
 /* Does the current mouse generate absolute events */
diff --git a/input.c b/input.c
index 651442d..af178d9 100644
--- a/input.c
+++ b/input.c
@@ -30,11 +30,13 @@
 
 static QEMUPutKBDEvent *qemu_put_kbd_event;
 static void *qemu_put_kbd_event_opaque;
-static QTAILQ_HEAD(, QEMUPutLEDEntry) led_handlers = 
QTAILQ_HEAD_INITIALIZER(led_handlers);
 static QTAILQ_HEAD(, QEMUPutMouseEntry) mouse_handlers =
 QTAILQ_HEAD_INITIALIZER(mouse_handlers);
 static NotifierList mouse_mode_notifiers = 
 NOTIFIER_LIST_INITIALIZER(mouse_mode_notifiers);
+static NotifierList led_event_notifiers =
+NOTIFIER_LIST_INITIALIZER(led_event_notifiers);
+static int ledstate;
 
 void qemu_add_kbd_event_handler(QEMUPutKBDEvent *func, void *opaque)
 {
@@ -100,25 +102,14 @@ void qemu_remove_mouse_event_handler(QEMUPutMouseEntry 
*entry)
 check_mode_change();
 }
 
-QEMUPutLEDEntry *qemu_add_led_event_handler(QEMUPutLEDEvent *func,
-void *opaque)
+void qemu_add_led_event_notifier(Notifier *notify)
 {
-QEMUPutLEDEntry *s;
-
-s = qemu_mallocz(sizeof(QEMUPutLEDEntry));
-
-s-put_led = func;
-s-opaque = opaque;
-QTAILQ_INSERT_TAIL(led_handlers, s, next);
-return s;
+notifier_list_add(led_event_notifiers, notify);
 }
 
-void qemu_remove_led_event_handler(QEMUPutLEDEntry *entry)
+void qemu_remove_led_event_notifier(Notifier *notify)
 {
-if (entry == NULL)
-return;
-QTAILQ_REMOVE(led_handlers, entry, next);
-qemu_free(entry);
+notifier_list_remove(led_event_notifiers, notify);
 }
 
 void kbd_put_keycode(int keycode)
@@ -128,15 +119,19 @@ void kbd_put_keycode(int keycode)
 }
 }
 
-void kbd_put_ledstate(int ledstate)
+void kbd_put_ledstate(int l)
 {
-QEMUPutLEDEntry *cursor;
-
-QTAILQ_FOREACH(cursor, led_handlers, next) {
-cursor-put_led(cursor-opaque, ledstate);
+if (ledstate != l) {
+ledstate = l;
+notifier_list_notify(led_event_notifiers);
 }
 }
 
+int kbd_get_ledstate(void)
+{
+return ledstate;
+}
+
 void kbd_mouse_event(int dx, int dy, int dz, int buttons_state)
 {
 QEMUPutMouseEntry *entry;
diff --git a/vnc.c b/vnc.c
index ed0e096..b25b6a1 100644
--- a/vnc.c
+++ b/vnc.c
@@ -988,7 +988,7 @@ static void vnc_disconnect_finish(VncState *vs)
 qemu_remove_mouse_mode_change_notifier(vs-mouse_mode_notifier);
 vnc_remove_timer(vs-vd);
 if (vs-vd-lock_key_sync)
-qemu_remove_led_event_handler(vs-led);
+qemu_remove_led_event_notifier(vs-led_notifier);
 qemu_free(vs);
 }
 
@@ -1381,9 +1381,10 @@ static void press_key(VncState *vs, int keysym)
 kbd_put_keycode(keycode | SCANCODE_UP);
 }
 
-static void kbd_leds(void *opaque, int ledstate)
+static void kbd_leds(Notifier *notifier)
 {
-VncState *vs = opaque;
+VncState *vs = container_of(notifier, VncState, led_notifier);
+int ledstate = kbd_get_ledstate();
 int caps, num;
 
 caps = ledstate  QEMU_CAPS_LOCK_LED ? 1 : 0;
@@ -2296,8 +2297,10 @@ static void vnc_connect(VncDisplay *vd, int csock)
 vnc_flush(vs);
 vnc_read_when(vs, protocol_version, 12);
 reset_keys(vs);
-if (vs-vd-lock_key_sync)
-vs-led = qemu_add_led_event_handler(kbd_leds, vs);
+if (vs-vd-lock_key_sync) {
+vs-led_notifier.notify = kbd_leds;
+qemu_add_led_event_notifier(vs-led_notifier);
+}
 
 vs-mouse_mode_notifier.notify = check_pointer_type_change;
 qemu_add_mouse_mode_change_notifier(vs-mouse_mode_notifier);
diff

[Qemu-devel] [RfC PATCH] add pflib: PixelFormat conversion library.

2010-06-14 Thread Gerd Hoffmann


Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 Makefile.objs |1 +
 pflib.c   |  204 +
 pflib.h   |6 ++
 3 files changed, 211 insertions(+), 0 deletions(-)
 create mode 100644 pflib.c
 create mode 100644 pflib.h

diff --git a/Makefile.objs b/Makefile.objs
index 2dad0f9..6201675 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -83,6 +83,7 @@ common-obj-y += qemu-char.o savevm.o #aio.o
 common-obj-y += msmouse.o ps2.o
 common-obj-y += qdev.o qdev-properties.o
 common-obj-y += block-migration.o
+common-obj-y += pflib.o
 
 common-obj-$(CONFIG_BRLAPI) += baum.o
 common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
diff --git a/pflib.c b/pflib.c
new file mode 100644
index 000..1c04378
--- /dev/null
+++ b/pflib.c
@@ -0,0 +1,204 @@
+#include qemu-common.h
+#include console.h
+#include pflib.h
+
+typedef struct QemuPixel QemuPixel;
+
+typedef void (*pf_convert)(QemuPfConv *conv,
+   void *dst, void *src, uint32_t cnt);
+typedef void (*pf_convert_from)(PixelFormat *pf,
+QemuPixel *dst, void *src, uint32_t cnt);
+typedef void (*pf_convert_to)(PixelFormat *pf,
+  void *dst, QemuPixel *src, uint32_t cnt);
+
+struct QemuPfConv {
+pf_convertconvert;
+PixelFormat   src;
+PixelFormat   dst;
+
+/* for copy_generic() */
+pf_convert_from   conv_from;
+pf_convert_to conv_to;
+QemuPixel *conv_buf;
+uint32_t  conv_cnt;
+};
+
+struct QemuPixel {
+uint8_t red;
+uint8_t green;
+uint8_t blue;
+uint8_t alpha;
+};
+
+/* --- */
+/* PixelFormat - QemuPixel conversions*/
+
+static void conv_16_to_pixel(PixelFormat *pf,
+ QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint16_t *src16 = src;
+
+while (cnt  0) {
+dst-red   = ((*src16  pf-rmask)  pf-rshift)  (8 - pf-rbits);
+dst-green = ((*src16  pf-gmask)  pf-gshift)  (8 - pf-gbits);
+dst-blue  = ((*src16  pf-bmask)  pf-bshift)  (8 - pf-bbits);
+dst-alpha = ((*src16  pf-amask)  pf-ashift)  (8 - pf-abits);
+dst++, src16++, cnt--;
+}
+}
+
+/* assumes pf-{r,g,b,a}bits == 8 */
+static void conv_32_to_pixel_fast(PixelFormat *pf,
+  QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint32_t *src32 = src;
+
+while (cnt  0) {
+dst-red   = (*src32  pf-rmask)  pf-rshift;
+dst-green = (*src32  pf-gmask)  pf-gshift;
+dst-blue  = (*src32  pf-bmask)  pf-bshift;
+dst-alpha = (*src32  pf-amask)  pf-ashift;
+dst++, src32++, cnt--;
+}
+}
+
+static void conv_32_to_pixel_generic(PixelFormat *pf,
+ QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint32_t *src32 = src;
+
+while (cnt  0) {
+if (pf-rbits  8) {
+dst-red   = ((*src32  pf-rmask)  pf-rshift)  (8 - 
pf-rbits);
+} else {
+dst-red   = ((*src32  pf-rmask)  pf-rshift)  (pf-rbits - 
8);
+}
+if (pf-gbits  8) {
+dst-green = ((*src32  pf-gmask)  pf-gshift)  (8 - 
pf-gbits);
+} else {
+dst-green = ((*src32  pf-gmask)  pf-gshift)  (pf-gbits - 
8);
+}
+if (pf-bbits  8) {
+dst-blue  = ((*src32  pf-bmask)  pf-bshift)  (8 - 
pf-bbits);
+} else {
+dst-blue  = ((*src32  pf-bmask)  pf-bshift)  (pf-bbits - 
8);
+}
+if (pf-abits  8) {
+dst-alpha = ((*src32  pf-amask)  pf-ashift)  (8 - 
pf-abits);
+} else {
+dst-alpha = ((*src32  pf-amask)  pf-ashift)  (pf-abits - 
8);
+}
+dst++, src32++, cnt--;
+}
+}
+
+/* --- */
+/* QemuPixel - PixelFormat conversions*/
+
+static void conv_pixel_to_16(PixelFormat *pf,
+ void *dst, QemuPixel *src, uint32_t cnt)
+{
+uint16_t *dst16 = dst;
+
+while (cnt  0) {
+*dst16  = ((uint16_t)src-red(8 - pf-rbits))  pf-rshift;
+*dst16 |= ((uint16_t)src-green  (8 - pf-gbits))  pf-gshift;
+*dst16 |= ((uint16_t)src-blue   (8 - pf-bbits))  pf-bshift;
+*dst16 |= ((uint16_t)src-alpha  (8 - pf-abits))  pf-ashift;
+dst16++, src++, cnt--;
+}
+}
+
+static void conv_pixel_to_32(PixelFormat *pf,
+ void *dst, QemuPixel *src, uint32_t cnt)
+{
+uint32_t *dst32 = dst;
+
+while (cnt  0) {
+*dst32  = ((uint32_t)src-red(8 - pf-rbits))  pf-rshift;
+*dst32 |= ((uint32_t)src-green  (8 - pf-gbits))  pf-gshift;
+*dst32 |= ((uint32_t)src-blue   (8 - pf-bbits))  pf-bshift;
+*dst32 |= ((uint32_t)src-alpha  (8 - pf-abits))  pf-ashift;
+dst32++,

[Qemu-devel] [RESENT PATCH] Add exit notifiers.

2010-06-14 Thread Gerd Hoffmann

Hook up any cleanup work which needs to be done here.  Advantages over
using atexit(3):

  (1) You get passed in a pointer to the notifier.  If you embed that
  into your state struct you can use container_of() to get get your
  state info.
  (2) You can unregister, say when un-plugging a device.

[ v2: move code out of #ifndef _WIN32 ]

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 sysemu.h |4 
 vl.c |   19 +++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/sysemu.h b/sysemu.h
index 346cccd..0bfbd6a 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -6,6 +6,7 @@
 #include qemu-option.h
 #include qemu-queue.h
 #include qemu-timer.h
+#include notify.h
 
 #ifdef _WIN32
 #include windows.h
@@ -56,6 +57,9 @@ int qemu_powerdown_requested(void);
 extern qemu_irq qemu_system_powerdown;
 void qemu_system_reset(void);
 
+void qemu_add_exit_notifier(Notifier *notify);
+void qemu_remove_exit_notifier(Notifier *notify);
+
 void do_savevm(Monitor *mon, const QDict *qdict);
 int load_vmstate(const char *name);
 void do_delvm(Monitor *mon, const QDict *qdict);
diff --git a/vl.c b/vl.c
index 9cf5334..5daeb7b 100644
--- a/vl.c
+++ b/vl.c
@@ -238,6 +238,9 @@ uint8_t qemu_uuid[16];
 static QEMUBootSetHandler *boot_set_handler;
 static void *boot_set_opaque;
 
+static NotifierList exit_notifiers =
+NOTIFIER_LIST_INITIALIZER(exit_notifiers);
+
 int kvm_allowed = 0;
 uint32_t xen_domid;
 enum xen_mode xen_mode = XEN_EMULATE;
@@ -1956,6 +1959,21 @@ static int balloon_parse(const char *arg)
 return -1;
 }
 
+void qemu_add_exit_notifier(Notifier *notify)
+{
+notifier_list_add(exit_notifiers, notify);
+}
+
+void qemu_remove_exit_notifier(Notifier *notify)
+{
+notifier_list_remove(exit_notifiers, notify);
+}
+
+static void qemu_run_exit_notifiers(void)
+{
+notifier_list_notify(exit_notifiers);
+}
+
 char *qemu_find_file(int type, const char *name)
 {
 int len;
@@ -2287,6 +2305,7 @@ int main(int argc, char **argv, char **envp)
 int show_vnc_port = 0;
 int defconfig = 1;
 
+atexit(qemu_run_exit_notifiers);
 error_set_progname(argv[0]);
 
 init_clocks();
-- 
1.6.5.2

[Qemu-devel] [RESENT PATCH 3/3] Fix and simplify gui timer logic.

2010-06-14 Thread Gerd Hoffmann

Kill nographic timer.  Have a global gui_timer instead.  Have the gui
timer enabled unconditionally.  We need a timer running anyway for mmio
flush, so the whole have-gui-timer-only-when-needed logic is pretty
pointless.  It also simplifies displaylisteners coming and going at
runtime, we don't need to care about the timer then as it runs anyway.

Don't allocate the timer twice in case we have two display listeners.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 console.h |1 -
 vl.c  |   38 +++---
 2 files changed, 11 insertions(+), 28 deletions(-)

diff --git a/console.h b/console.h
index a0da498..6dad8d0 100644
--- a/console.h
+++ b/console.h
@@ -173,7 +173,6 @@ struct DisplayAllocator {
 struct DisplayState {
 struct DisplaySurface *surface;
 void *opaque;
-struct QEMUTimer *gui_timer;
 
 struct DisplayAllocator* allocator;
 QLIST_HEAD(, DisplayChangeListener) listeners;
diff --git a/vl.c b/vl.c
index 2cfeab1..5181d16 100644
--- a/vl.c
+++ b/vl.c
@@ -231,7 +231,7 @@ int nb_numa_nodes;
 uint64_t node_mem[MAX_NODES];
 uint64_t node_cpumask[MAX_NODES];
 
-static QEMUTimer *nographic_timer;
+static QEMUTimer *gui_timer;
 
 uint8_t qemu_uuid[16];
 
@@ -1532,22 +1532,17 @@ static void gui_update(void *opaque)
 DisplayChangeListener *dcl;
 
 qemu_flush_coalesced_mmio_buffer();
-dpy_refresh(ds);
 
-QLIST_FOREACH(dcl, ds-listeners, next) {
-if (dcl-gui_timer_interval 
-dcl-gui_timer_interval  interval)
-interval = dcl-gui_timer_interval;
+if (ds != NULL  !QLIST_EMPTY(ds-listeners)) {
+dpy_refresh(ds);
+QLIST_FOREACH(dcl, ds-listeners, next) {
+if (dcl-gui_timer_interval 
+dcl-gui_timer_interval  interval)
+interval = dcl-gui_timer_interval;
+}
 }
-qemu_mod_timer(ds-gui_timer, interval + qemu_get_clock(rt_clock));
-}
-
-static void nographic_update(void *opaque)
-{
-uint64_t interval = GUI_REFRESH_INTERVAL;
 
-qemu_flush_coalesced_mmio_buffer();
-qemu_mod_timer(nographic_timer, interval + qemu_get_clock(rt_clock));
+qemu_mod_timer(gui_timer, interval + qemu_get_clock(rt_clock));
 }
 
 struct vm_change_state_entry {
@@ -2290,7 +2285,6 @@ int main(int argc, char **argv, char **envp)
 const char *kernel_filename, *kernel_cmdline;
 char boot_devices[33] = cad; /* default to HD-floppy-CD-ROM */
 DisplayState *ds;
-DisplayChangeListener *dcl;
 int cyls, heads, secs, translation;
 QemuOpts *hda_opts = NULL, *opts;
 int optind;
@@ -3407,18 +3401,8 @@ int main(int argc, char **argv, char **envp)
 }
 dpy_resize(ds);
 
-QLIST_FOREACH(dcl, ds-listeners, next) {
-if (dcl-dpy_refresh != NULL) {
-ds-gui_timer = qemu_new_timer(rt_clock, gui_update, ds);
-qemu_mod_timer(ds-gui_timer, qemu_get_clock(rt_clock));
-break;
-}
-}
-
-if (display_type == DT_NOGRAPHIC || display_type == DT_VNC) {
-nographic_timer = qemu_new_timer(rt_clock, nographic_update, NULL);
-qemu_mod_timer(nographic_timer, qemu_get_clock(rt_clock));
-}
+gui_timer = qemu_new_timer(rt_clock, gui_update, ds);
+qemu_mod_timer(gui_timer, qemu_get_clock(rt_clock));
 
 text_consoles_set_display(ds);
 
-- 
1.6.5.2

[Qemu-devel] [RESENT PATCH 2/3] add unregister_displaychangelistener

2010-06-14 Thread Gerd Hoffmann


Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 console.h |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/console.h b/console.h
index 3a80dca..a0da498 100644
--- a/console.h
+++ b/console.h
@@ -227,6 +227,11 @@ static inline void 
register_displaychangelistener(DisplayState *ds, DisplayChang
 QLIST_INSERT_HEAD(ds-listeners, dcl, next);
 }
 
+static inline void unregister_displaychangelistener(DisplayChangeListener *dcl)
+{
+QLIST_REMOVE(dcl, next);
+}
+
 static inline void dpy_update(DisplayState *s, int x, int y, int w, int h)
 {
 struct DisplayChangeListener *dcl;
-- 
1.6.5.2

Re: [Qemu-devel] [RfC PATCH] add pflib: PixelFormat conversion library.

2010-06-14 Thread Anthony Liguori


On 06/14/2010 10:25 AM, Gerd Hoffmann wrote:

Signed-off-by: Gerd Hoffmannkra...@redhat.com
---
  Makefile.objs |1 +
  pflib.c   |  204 +
  pflib.h   |6 ++
  3 files changed, 211 insertions(+), 0 deletions(-)
  create mode 100644 pflib.c
  create mode 100644 pflib.h

diff --git a/Makefile.objs b/Makefile.objs
index 2dad0f9..6201675 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -83,6 +83,7 @@ common-obj-y += qemu-char.o savevm.o #aio.o
  common-obj-y += msmouse.o ps2.o
  common-obj-y += qdev.o qdev-properties.o
  common-obj-y += block-migration.o
+common-obj-y += pflib.o

  common-obj-$(CONFIG_BRLAPI) += baum.o
  common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
diff --git a/pflib.c b/pflib.c
new file mode 100644
index 000..1c04378
--- /dev/null
+++ b/pflib.c
   


Good idea but needs copyright/license.

Regards,

Anthony Liguori


@@ -0,0 +1,204 @@
+#include qemu-common.h
+#include console.h
+#include pflib.h
+
+typedef struct QemuPixel QemuPixel;
+
+typedef void (*pf_convert)(QemuPfConv *conv,
+   void *dst, void *src, uint32_t cnt);
+typedef void (*pf_convert_from)(PixelFormat *pf,
+QemuPixel *dst, void *src, uint32_t cnt);
+typedef void (*pf_convert_to)(PixelFormat *pf,
+  void *dst, QemuPixel *src, uint32_t cnt);
+
+struct QemuPfConv {
+pf_convertconvert;
+PixelFormat   src;
+PixelFormat   dst;
+
+/* for copy_generic() */
+pf_convert_from   conv_from;
+pf_convert_to conv_to;
+QemuPixel *conv_buf;
+uint32_t  conv_cnt;
+};
+
+struct QemuPixel {
+uint8_t red;
+uint8_t green;
+uint8_t blue;
+uint8_t alpha;
+};
+
+/* --- */
+/* PixelFormat -  QemuPixel conversions*/
+
+static void conv_16_to_pixel(PixelFormat *pf,
+ QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint16_t *src16 = src;
+
+while (cnt  0) {
+dst-red   = ((*src16  pf-rmask)  pf-rshift)  (8 - pf-rbits);
+dst-green = ((*src16  pf-gmask)  pf-gshift)  (8 - pf-gbits);
+dst-blue  = ((*src16  pf-bmask)  pf-bshift)  (8 - pf-bbits);
+dst-alpha = ((*src16  pf-amask)  pf-ashift)  (8 - pf-abits);
+dst++, src16++, cnt--;
+}
+}
+
+/* assumes pf-{r,g,b,a}bits == 8 */
+static void conv_32_to_pixel_fast(PixelFormat *pf,
+  QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint32_t *src32 = src;
+
+while (cnt  0) {
+dst-red   = (*src32  pf-rmask)  pf-rshift;
+dst-green = (*src32  pf-gmask)  pf-gshift;
+dst-blue  = (*src32  pf-bmask)  pf-bshift;
+dst-alpha = (*src32  pf-amask)  pf-ashift;
+dst++, src32++, cnt--;
+}
+}
+
+static void conv_32_to_pixel_generic(PixelFormat *pf,
+ QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint32_t *src32 = src;
+
+while (cnt  0) {
+if (pf-rbits  8) {
+dst-red   = ((*src32  pf-rmask)  pf-rshift)  (8 - 
pf-rbits);
+} else {
+dst-red   = ((*src32  pf-rmask)  pf-rshift)  (pf-rbits - 
8);
+}
+if (pf-gbits  8) {
+dst-green = ((*src32  pf-gmask)  pf-gshift)  (8 - 
pf-gbits);
+} else {
+dst-green = ((*src32  pf-gmask)  pf-gshift)  (pf-gbits - 
8);
+}
+if (pf-bbits  8) {
+dst-blue  = ((*src32  pf-bmask)  pf-bshift)  (8 - 
pf-bbits);
+} else {
+dst-blue  = ((*src32  pf-bmask)  pf-bshift)  (pf-bbits - 
8);
+}
+if (pf-abits  8) {
+dst-alpha = ((*src32  pf-amask)  pf-ashift)  (8 - 
pf-abits);
+} else {
+dst-alpha = ((*src32  pf-amask)  pf-ashift)  (pf-abits - 
8);
+}
+dst++, src32++, cnt--;
+}
+}
+
+/* --- */
+/* QemuPixel -  PixelFormat conversions*/
+
+static void conv_pixel_to_16(PixelFormat *pf,
+ void *dst, QemuPixel *src, uint32_t cnt)
+{
+uint16_t *dst16 = dst;
+
+while (cnt  0) {
+*dst16  = ((uint16_t)src-red  (8 - pf-rbits))  pf-rshift;
+*dst16 |= ((uint16_t)src-green  (8 - pf-gbits))  pf-gshift;
+*dst16 |= ((uint16_t)src-blue  (8 - pf-bbits))  pf-bshift;
+*dst16 |= ((uint16_t)src-alpha  (8 - pf-abits))  pf-ashift;
+dst16++, src++, cnt--;
+}
+}
+
+static void conv_pixel_to_32(PixelFormat *pf,
+ void *dst, QemuPixel *src, uint32_t cnt)
+{
+uint32_t *dst32 = dst;
+
+while (cnt  0) {
+*dst32  = ((uint32_t)src-red  (8 - pf-rbits))  pf-rshift;
+*dst32 |= ((uint32_t)src-green  (8 - pf-gbits))  pf-gshift;
+*dst32 |=

[Qemu-devel] [RESENT PATCH 1/3] QLIST-ify display change listeners.

2010-06-14 Thread Gerd Hoffmann


Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 console.h  |   72 +++
 hw/xenfb.c |2 +-
 vl.c   |9 ++-
 3 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/console.h b/console.h
index cac959f..3a80dca 100644
--- a/console.h
+++ b/console.h
@@ -161,7 +161,7 @@ struct DisplayChangeListener {
  int w, int h, uint32_t c);
 void (*dpy_text_cursor)(struct DisplayState *s, int x, int y);
 
-struct DisplayChangeListener *next;
+QLIST_ENTRY(DisplayChangeListener) next;
 };
 
 struct DisplayAllocator {
@@ -176,7 +176,7 @@ struct DisplayState {
 struct QEMUTimer *gui_timer;
 
 struct DisplayAllocator* allocator;
-struct DisplayChangeListener* listeners;
+QLIST_HEAD(, DisplayChangeListener) listeners;
 
 void (*mouse_set)(int x, int y, int on);
 void (*cursor_define)(QEMUCursor *cursor);
@@ -224,72 +224,76 @@ static inline int is_buffer_shared(DisplaySurface 
*surface)
 
 static inline void register_displaychangelistener(DisplayState *ds, 
DisplayChangeListener *dcl)
 {
-dcl-next = ds-listeners;
-ds-listeners = dcl;
+QLIST_INSERT_HEAD(ds-listeners, dcl, next);
 }
 
 static inline void dpy_update(DisplayState *s, int x, int y, int w, int h)
 {
-struct DisplayChangeListener *dcl = s-listeners;
-while (dcl != NULL) {
+struct DisplayChangeListener *dcl;
+QLIST_FOREACH(dcl, s-listeners, next) {
 dcl-dpy_update(s, x, y, w, h);
-dcl = dcl-next;
 }
 }
 
 static inline void dpy_resize(DisplayState *s)
 {
-struct DisplayChangeListener *dcl = s-listeners;
-while (dcl != NULL) {
+struct DisplayChangeListener *dcl;
+QLIST_FOREACH(dcl, s-listeners, next) {
 dcl-dpy_resize(s);
-dcl = dcl-next;
 }
 }
 
 static inline void dpy_setdata(DisplayState *s)
 {
-struct DisplayChangeListener *dcl = s-listeners;
-while (dcl != NULL) {
-if (dcl-dpy_setdata) dcl-dpy_setdata(s);
-dcl = dcl-next;
+struct DisplayChangeListener *dcl;
+QLIST_FOREACH(dcl, s-listeners, next) {
+if (dcl-dpy_setdata) {
+dcl-dpy_setdata(s);
+}
 }
 }
 
 static inline void dpy_refresh(DisplayState *s)
 {
-struct DisplayChangeListener *dcl = s-listeners;
-while (dcl != NULL) {
-if (dcl-dpy_refresh) dcl-dpy_refresh(s);
-dcl = dcl-next;
+struct DisplayChangeListener *dcl;
+QLIST_FOREACH(dcl, s-listeners, next) {
+if (dcl-dpy_refresh) {
+dcl-dpy_refresh(s);
+}
 }
 }
 
 static inline void dpy_copy(struct DisplayState *s, int src_x, int src_y,
- int dst_x, int dst_y, int w, int h) {
-struct DisplayChangeListener *dcl = s-listeners;
-while (dcl != NULL) {
-if (dcl-dpy_copy)
+ int dst_x, int dst_y, int w, int h)
+{
+struct DisplayChangeListener *dcl;
+QLIST_FOREACH(dcl, s-listeners, next) {
+if (dcl-dpy_copy) {
 dcl-dpy_copy(s, src_x, src_y, dst_x, dst_y, w, h);
-else /* TODO */
+} else { /* TODO */
 dcl-dpy_update(s, dst_x, dst_y, w, h);
-dcl = dcl-next;
+}
 }
 }
 
 static inline void dpy_fill(struct DisplayState *s, int x, int y,
- int w, int h, uint32_t c) {
-struct DisplayChangeListener *dcl = s-listeners;
-while (dcl != NULL) {
-if (dcl-dpy_fill) dcl-dpy_fill(s, x, y, w, h, c);
-dcl = dcl-next;
+ int w, int h, uint32_t c)
+{
+struct DisplayChangeListener *dcl;
+QLIST_FOREACH(dcl, s-listeners, next) {
+if (dcl-dpy_fill) {
+dcl-dpy_fill(s, x, y, w, h, c);
+}
 }
 }
 
-static inline void dpy_cursor(struct DisplayState *s, int x, int y) {
-struct DisplayChangeListener *dcl = s-listeners;
-while (dcl != NULL) {
-if (dcl-dpy_text_cursor) dcl-dpy_text_cursor(s, x, y);
-dcl = dcl-next;
+static inline void dpy_cursor(struct DisplayState *s, int x, int y)
+{
+struct DisplayChangeListener *dcl;
+QLIST_FOREACH(dcl, s-listeners, next) {
+if (dcl-dpy_text_cursor) {
+dcl-dpy_text_cursor(s, x, y);
+}
 }
 }
 
diff --git a/hw/xenfb.c b/hw/xenfb.c
index da5297b..5076beb 100644
--- a/hw/xenfb.c
+++ b/hw/xenfb.c
@@ -706,7 +706,7 @@ static void xenfb_update(void *opaque)
if (xenfb_queue_full(xenfb))
return;
 
-for (l = xenfb-c.ds-listeners; l != NULL; l = l-next) {
+QLIST_FOREACH(l, xenfb-c.ds-listeners, next) {
 if (l-idle)
 continue;
 idle = 0;
diff --git a/vl.c b/vl.c
index 5daeb7b..2cfeab1 100644
--- a/vl.c
+++ b/vl.c
@@ -1529,16 +1529,15 @@ static void gui_update(void *opaque)
 {
 uint64_t interval = GUI_REFRESH_INTERVAL;
 DisplayState *ds = opaque;
-DisplayChangeListener *dcl = ds-listeners;
+DisplayChangeListener *dcl;

[Qemu-devel] [RfC PATCH] Fix vnc memory corruption with width = 1400

2010-06-14 Thread Gerd Hoffmann

vnc assumes that the screen width is a multiple of 16 in several places.
If this is not the case vnc will overrun buffers, corrupt memory, make
qemu crash.

This is the minimum fix for this bug. It makes sure we don't overrun the
scanline, thereby fixing the segfault.  The rendering is *not* correct
though, there is a black border at the right side of the screen, 8
pixels wide because 1400 % 16 == 8.


Not sure what the best way to fix that for real is.  qemu allways sends
updates which are 16 (or a multiple of 16) pixels wide.  I'm not sure
whenever this is something specified in the protocol or just in the qemu
implementation.

jpeg encodes 16x16 blocks too.  How does vnc+jpeg-encoding handle odd
screen sizes?


Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 vnc.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/vnc.c b/vnc.c
index 142e769..a4187a1 100644
--- a/vnc.c
+++ b/vnc.c
@@ -2205,7 +2205,7 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
 guest_ptr  = guest_row;
 server_ptr = server_row;
 
-for (x = 0; x  vd-guest.ds-width;
+for (x = 0; x + 15  vd-guest.ds-width;
 x += 16, guest_ptr += cmp_bytes, server_ptr += cmp_bytes) {
 if (!vnc_get_bit(vd-guest.dirty[y], (x / 16)))
 continue;
-- 
1.6.5.2

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Alex Williamson

On Mon, 2010-06-14 at 14:09 +0100, Paul Brook wrote:
/main-system-bus/pci.0,addr=09.0/virtio-blk-pci
 
 There's a device missing between the main system bus and the pci bus.  Should 
 be something like:
 
 /main-system-bus/piix4-pcihost/pci.0/_09.0

Ok, I can easily come up with:

/System/main-system-bus/i440FX-pcihost/PCI/pci.0/_09.0/virtio-blk-pci/virtio-blk

to expand the previous output.  Code below.

   Could you explain why you add identified properties of the immediate
   parent bus and device?  They make the result ver much *not* a dev
   path in the qdev sense...
  
  In order to try to get a unique string.  Without looking into device
  properties, two e1000s would both be:
  
  /main-system-bus/pci.0/e1000
  /main-system-bus/pci.0/e1000
  
  Which is no better than simply e1000 and would require us to fall back
  to instance ids again.  The goal here is that anything that makes use of
  passing a dev when registering a vmstate gets an instance id of zero.
 
 You already got the information you need, you just put it in the wrong place. 
 The canonical ID for the device could be its bus address. In practice we'd 
 probably want to allow the user to specify it by name, provided these are 
 unique. e.g. in the above machine we could accept [...]/virtiio-blk-pci would 
 as an aias for [...]:_09.0.

Sure, if there was a guaranteed unique char* on a DeviceState that was
consistent between guest invocations, we could just use that instead.  I
suppose all devices could have a global system id property and if that
was present we'd use that instead of creating the device path.

 Device names have a restricted namespace, so we 
 can use an initial prefix to disambiguate a name/label from a bus address.

I'm not sure it's necessary.  It seems like it would only be necessary
to differentiate the two if we wanted to translate between namespaces.
But I think it's reasonable to require that if a global name is
provided, it must always be provided for the life of the VM.

 For busses that don't have a consistent addressing scheme then some sort of 
 instance ID is unavoidable. I guess it may be possible to invent something 
 based on other device properties (e.g. address of the first IO port/memory 
 region).

Instance IDs aren't always bad, we just run into trouble with hotplug
and maybe creating unique ramblock names.  But, such busses typically
don't support hotplug and don't have multiple instances of the same
device on the bus, so I don't expect us to hit many issues there as long
as we can get a stable address path.  Thanks,

Alex

Signed-off-by: Alex Williamson alex.william...@redhat.com
---

static int qdev_strprint_path_props(DeviceState *dev, Property *props,
char *buf, size_t len)
{
int offset = 0;
char pbuf[64];

if (!props)
return 0;

while (props-name) {
if (props-info-flags  PROP_FLAG_PATH) {
if (props-info-print) {
props-info-print(dev, props, pbuf, sizeof(pbuf));
offset += snprintf(buf + offset, len - offset, /_%s, pbuf);
}
}
props++;
}
return offset;
}

static int qdev_strprint_dev(DeviceState *dev, char *buf, size_t len);

static int qdev_strprint_bus(DeviceState *dev, char *buf, size_t len)
{
int offset = 0;

if (dev-parent_bus-parent)
offset += qdev_strprint_dev(dev-parent_bus-parent, buf, len);

offset += snprintf(buf + offset, len - offset, /%s/%s,
   dev-parent_bus-info-name, dev-parent_bus-name);

offset += qdev_strprint_path_props(dev, dev-parent_bus-info-props,
   buf + offset, len - offset);

return offset;
}

static int qdev_strprint_dev(DeviceState *dev, char *buf, size_t len)
{
int offset = 0;

if (dev-parent_bus)
offset += qdev_strprint_bus(dev, buf, len);

offset += snprintf(buf + offset, len - offset, /%s, dev-info-name);

offset += qdev_strprint_path_props(dev, dev-info-props,
   buf + offset, len - offset);
if (dev-id)
offset += snprintf(buf + offset, len - offset, /%s, dev-id);

return offset;

}

char *qdev_get_dev_path(DeviceState *dev)
{
char buf[256] = ;

if (!dev)
return NULL;

qdev_strprint_dev(dev, buf, sizeof(buf));
return qemu_strdup(buf);
}

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Paul Brook

 On Mon, 2010-06-14 at 14:09 +0100, Paul Brook wrote:
 /main-system-bus/pci.0,addr=09.0/virtio-blk-pci
  
  There's a device missing between the main system bus and the pci bus. 
  Should be something like:
  
  /main-system-bus/piix4-pcihost/pci.0/_09.0
 
 Ok, I can easily come up with:
 
 /System/main-system-bus/i440FX-pcihost/PCI/pci.0/_09.0/virtio-blk-pci/virti
 o-blk

No. Now you've got way to many elements in the path.  My point is that you 
replace the name of the device with the bus address of the device.

However you determine the element names the path should be /busname/devicename 
pairs. I'm undecided whether the root element should be the main system bus, 
or a system device node that contains the main system bus.

  You already got the information you need, you just put it in the wrong
  place. The canonical ID for the device could be its bus address. In
  practice we'd probably want to allow the user to specify it by name,
  provided these are unique. e.g. in the above machine we could accept
  [...]/virtiio-blk-pci would as an aias for [...]:_09.0.
 
 Sure, if there was a guaranteed unique char* on a DeviceState that was
 consistent between guest invocations, we could just use that instead.  I
 suppose all devices could have a global system id property and if that
 was present we'd use that instead of creating the device path.

All we require is some way of uniquely addressing each device on the bus. For 
PCI that's trivial (devfn). For other busses we get to pick something 
appropriate.
 
  Device names have a restricted namespace, so we
  can use an initial prefix to disambiguate a name/label from a bus
  address.
 
 I'm not sure it's necessary.  It seems like it would only be necessary
 to differentiate the two if we wanted to translate between namespaces.
 But I think it's reasonable to require that if a global name is
 provided, it must always be provided for the life of the VM.

I don't think requiring that all devices are given a globally unique name is 
going to fly. locally (per-bus) unique user-specified are still a PITA, but 
may be acceptable. Having a canonical addressing system that's independent of 
user assigned IDs seems like a good thing.

  For busses that don't have a consistent addressing scheme then some sort
  of instance ID is unavoidable. I guess it may be possible to invent
  something based on other device properties (e.g. address of the first IO
  port/memory region).
 
 Instance IDs aren't always bad, we just run into trouble with hotplug
 and maybe creating unique ramblock names.  But, such busses typically
 don't support hotplug and don't have multiple instances of the same
 device on the bus, so I don't expect us to hit many issues there as long
 as we can get a stable address path.  Thanks,

Simple consecutive instance IDs are inherently unstable. They depend on teh 
order of device creation. The ID really wants to be something that can be 
reliably determined from the device bus itself (and/or its interaction with 
its parent bus).

Paul

[Qemu-devel] Re: [PATCH v3 0/5] Add QMP migration events

2010-06-14 Thread Juan Quintela

Luiz Capitulino lcapitul...@redhat.com wrote:
 On Sat, 12 Jun 2010 13:20:54 +0200
 Juan Quintela quint...@redhat.com wrote:

 Both.

  What does it report in the source?

That migration has started :)  Nothing else, nothing less.

Think again multiple monitors and/or audit.

 Why can't we just tell them that migration has ended with success as
 fast as possible?
 
 I can't understand what I am missing here.  I can't believe that
 libvirt(management app in general) could came with a simple request that
 would make its live better.  And to make things worse, it is _trivial_
 to implement, semantics are clear, has other uses, .

  I'd be ok in having the done event if the polling is shown to be that
 problematic, but from some private talks I had with libvirt guys it seemed
 to me that it's more like a wish have.

  For us, having this event means that we'll have to carry it for (at least)
 several releases. I really would like to avoid adding one-time workarounds
 in a stable interface like QMP.

For me, this is going to be there forever.  It has clear semantics, it
is useful and all other alternatives given to date are convoluted at
least.

Later, Juan.

[Qemu-devel] [PATCH v2] add pflib: PixelFormat conversion library.

2010-06-14 Thread Gerd Hoffmann


Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 Makefile.objs |1 +
 pflib.c   |  213 +
 pflib.h   |6 ++
 3 files changed, 220 insertions(+), 0 deletions(-)
 create mode 100644 pflib.c
 create mode 100644 pflib.h

diff --git a/Makefile.objs b/Makefile.objs
index 2dad0f9..6201675 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -83,6 +83,7 @@ common-obj-y += qemu-char.o savevm.o #aio.o
 common-obj-y += msmouse.o ps2.o
 common-obj-y += qdev.o qdev-properties.o
 common-obj-y += block-migration.o
+common-obj-y += pflib.o
 
 common-obj-$(CONFIG_BRLAPI) += baum.o
 common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
diff --git a/pflib.c b/pflib.c
new file mode 100644
index 000..1154d0c
--- /dev/null
+++ b/pflib.c
@@ -0,0 +1,213 @@
+/*
+ * PixelFormat conversion library.
+ *
+ * Author: Gerd Hoffmann kra...@redhat.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+#include qemu-common.h
+#include console.h
+#include pflib.h
+
+typedef struct QemuPixel QemuPixel;
+
+typedef void (*pf_convert)(QemuPfConv *conv,
+   void *dst, void *src, uint32_t cnt);
+typedef void (*pf_convert_from)(PixelFormat *pf,
+QemuPixel *dst, void *src, uint32_t cnt);
+typedef void (*pf_convert_to)(PixelFormat *pf,
+  void *dst, QemuPixel *src, uint32_t cnt);
+
+struct QemuPfConv {
+pf_convertconvert;
+PixelFormat   src;
+PixelFormat   dst;
+
+/* for copy_generic() */
+pf_convert_from   conv_from;
+pf_convert_to conv_to;
+QemuPixel *conv_buf;
+uint32_t  conv_cnt;
+};
+
+struct QemuPixel {
+uint8_t red;
+uint8_t green;
+uint8_t blue;
+uint8_t alpha;
+};
+
+/* --- */
+/* PixelFormat - QemuPixel conversions*/
+
+static void conv_16_to_pixel(PixelFormat *pf,
+ QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint16_t *src16 = src;
+
+while (cnt  0) {
+dst-red   = ((*src16  pf-rmask)  pf-rshift)  (8 - pf-rbits);
+dst-green = ((*src16  pf-gmask)  pf-gshift)  (8 - pf-gbits);
+dst-blue  = ((*src16  pf-bmask)  pf-bshift)  (8 - pf-bbits);
+dst-alpha = ((*src16  pf-amask)  pf-ashift)  (8 - pf-abits);
+dst++, src16++, cnt--;
+}
+}
+
+/* assumes pf-{r,g,b,a}bits == 8 */
+static void conv_32_to_pixel_fast(PixelFormat *pf,
+  QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint32_t *src32 = src;
+
+while (cnt  0) {
+dst-red   = (*src32  pf-rmask)  pf-rshift;
+dst-green = (*src32  pf-gmask)  pf-gshift;
+dst-blue  = (*src32  pf-bmask)  pf-bshift;
+dst-alpha = (*src32  pf-amask)  pf-ashift;
+dst++, src32++, cnt--;
+}
+}
+
+static void conv_32_to_pixel_generic(PixelFormat *pf,
+ QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint32_t *src32 = src;
+
+while (cnt  0) {
+if (pf-rbits  8) {
+dst-red   = ((*src32  pf-rmask)  pf-rshift)  (8 - 
pf-rbits);
+} else {
+dst-red   = ((*src32  pf-rmask)  pf-rshift)  (pf-rbits - 
8);
+}
+if (pf-gbits  8) {
+dst-green = ((*src32  pf-gmask)  pf-gshift)  (8 - 
pf-gbits);
+} else {
+dst-green = ((*src32  pf-gmask)  pf-gshift)  (pf-gbits - 
8);
+}
+if (pf-bbits  8) {
+dst-blue  = ((*src32  pf-bmask)  pf-bshift)  (8 - 
pf-bbits);
+} else {
+dst-blue  = ((*src32  pf-bmask)  pf-bshift)  (pf-bbits - 
8);
+}
+if (pf-abits  8) {
+dst-alpha = ((*src32  pf-amask)  pf-ashift)  (8 - 
pf-abits);
+} else {
+dst-alpha = ((*src32  pf-amask)  pf-ashift)  (pf-abits - 
8);
+}
+dst++, src32++, cnt--;
+}
+}
+
+/* --- */
+/* QemuPixel - PixelFormat conversions*/
+
+static void conv_pixel_to_16(PixelFormat *pf,
+ void *dst, QemuPixel *src, uint32_t cnt)
+{
+uint16_t *dst16 = dst;
+
+while (cnt  0) {
+*dst16  = ((uint16_t)src-red(8 - pf-rbits))  pf-rshift;
+*dst16 |= ((uint16_t)src-green  (8 - pf-gbits))  pf-gshift;
+*dst16 |= ((uint16_t)src-blue   (8 - pf-bbits))  pf-bshift;
+*dst16 |= ((uint16_t)src-alpha  (8 - pf-abits))  pf-ashift;
+dst16++, src++, cnt--;
+}
+}
+
+static void conv_pixel_to_32(PixelFormat *pf,
+ void *dst, QemuPixel *src, uint32_t cnt)
+{
+uint32_t *dst32 = dst;
+
+while (cnt  0) {
+*dst32  = ((uint32_t)src-red(8 - pf-rbits))  pf-rshift;
+

Re: [Qemu-devel] [PATCH v6 4/6] Add support for marking memory to not be migrated. On migration, memory is checked for the NO_MIGRATION_FLAG.

2010-06-14 Thread Anthony Liguori


On 06/04/2010 04:45 PM, Cam Macdonell wrote:

This is useful for devices that do not want to take memory regions data with 
them on migration.
---
  arch_init.c  |   28 
  cpu-all.h|2 ++
  cpu-common.h |2 ++
  exec.c   |   12 
  4 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index cfc03ea..7a234fa 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -118,18 +118,21 @@ static int ram_save_block(QEMUFile *f)
  current_addr + TARGET_PAGE_SIZE,
  MIGRATION_DIRTY_FLAG);

-p = qemu_get_ram_ptr(current_addr);
-
-if (is_dup_page(p, *p)) {
-qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS);
-qemu_put_byte(f, *p);
-} else {
-qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
-qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
-}
+if (!cpu_physical_memory_get_dirty(current_addr,
+NO_MIGRATION_FLAG)) {
+p = qemu_get_ram_ptr(current_addr);
+
+if (is_dup_page(p, *p)) {
+qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS);
+qemu_put_byte(f, *p);
+} else {
+qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
+qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+}

-found = 1;
-break;
+found = 1;
+break;
+}
  }
   


Shouldn't we just disable live migration out right?

I would rather that the device mark migration as impossible having the 
user hot remove the device before migration and then add it again after 
migration.  Device assignment could also use this functionality.


What this does is make migration possible but fundamentally broken which 
is not a good thing.


Regards,

Anthony Liguori


  addr += TARGET_PAGE_SIZE;
  current_addr = (saved_addr + addr) % last_ram_offset;
@@ -146,7 +149,8 @@ static ram_addr_t ram_save_remaining(void)
  ram_addr_t count = 0;

  for (addr = 0; addr  last_ram_offset; addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+if (!cpu_physical_memory_get_dirty(addr, NO_MIGRATION_FLAG)
+cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
  count++;
  }
  }
diff --git a/cpu-all.h b/cpu-all.h
index 9080cc7..4df00ab 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -887,6 +887,8 @@ extern int mem_prealloc;
  #define CODE_DIRTY_FLAG  0x02
  #define MIGRATION_DIRTY_FLAG 0x08

+#define NO_MIGRATION_FLAG 0x10
+
  #define DIRTY_ALL_FLAG  (VGA_DIRTY_FLAG | CODE_DIRTY_FLAG | 
MIGRATION_DIRTY_FLAG)

  /* read dirty bit (return 0 or 1) */
diff --git a/cpu-common.h b/cpu-common.h
index 4b0ba60..a1ebbbe 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -39,6 +39,8 @@ static inline void 
cpu_register_physical_memory(target_phys_addr_t start_addr,
  cpu_register_physical_memory_offset(start_addr, size, phys_offset, 0);
  }

+void cpu_mark_pages_no_migrate(ram_addr_t start, uint64_t size);
+
  ram_addr_t cpu_get_physical_page_desc(target_phys_addr_t addr);
  ram_addr_t qemu_ram_map(ram_addr_t size, void *host);
  ram_addr_t qemu_ram_alloc(ram_addr_t);
diff --git a/exec.c b/exec.c
index 39c18a7..c11d22f 100644
--- a/exec.c
+++ b/exec.c
@@ -2786,6 +2786,18 @@ static void *file_ram_alloc(ram_addr_t memory, const 
char *path)
  }
  #endif

+void cpu_mark_pages_no_migrate(ram_addr_t start, uint64_t length)
+{
+int i, len;
+uint8_t *p;
+
+len = length  TARGET_PAGE_BITS;
+p = phys_ram_flags + (start  TARGET_PAGE_BITS);
+for (i = 0; i  len; i++) {
+p[i] |= NO_MIGRATION_FLAG;
+}
+}
+
  ram_addr_t qemu_ram_map(ram_addr_t size, void *host)
  {
  RAMBlock *new_block;

Re: [Qemu-devel] Re: [PATCH v6 0/6] Inter-VM Shared Memory Device with migration support

2010-06-14 Thread Anthony Liguori


On 06/11/2010 05:03 PM, Cam Macdonell wrote:

Hi Anthony,

Is my implementation of master/peer roles acceptable?


Yes, it looks good.


   I realize with
Alex's RAMList changes I may need to modify my patch, but is the
approach of marking memory non-migratable an acceptable
implementation?
   


Please make sure to address some of the CODING_STYLE comments too.

Regards,

Anthony Liguori


Thanks,
Cam

On Fri, Jun 4, 2010 at 3:45 PM, Cam Macdonellc...@cs.ualberta.ca  wrote:
   

Latest patch for PCI shared memory device that maps a host shared memory object
to be shared between guests

new in this series
- migration support with 'master' and 'peer' roles for guest to determine
  who owns memory.  With 'master', the guest has the canonical copy of
  the shared memory and will copy it with it on migration.  With 
'role=peer',
  the guest will not copy the shared memory, but attach to what is on the
  destination machine.
- modified phys_ram_dirty array for marking memory as not to be migrated
- add support for non-migrated memory regions

v5:
- fixed segfault for non-server case
- code style fixes
- removed limit on the number of guests
- shared memory server is now in qemu.git/contrib
- made ioeventfd setup function generic
- removed interrupts when guest joined (let application handle it)

v4:
- moved to single Doorbell register and use datamatch to trigger different
  VMs rather than one register per eventfd
- remove writing arbitrary values to eventfds.  Only values of 1 are now
  written to ensure correct usage

Cam Macdonell (6):
  Device specification for shared memory PCI device
  Adds two new functions for assigning ioeventfd and irqfds.
  Change phys_ram_dirty to phys_ram_status
  Add support for marking memory to not be migrated.  On migration,
memory is checked for the NO_MIGRATION_FLAG.
  Inter-VM shared memory PCI device
  the stand-alone shared memory server for inter-VM shared memory

  Makefile.target |3 +
  arch_init.c |   28 +-
  contrib/ivshmem-server/Makefile |   16 +
  contrib/ivshmem-server/README   |   30 ++
  contrib/ivshmem-server/ivshmem_server.c |  353 +
  contrib/ivshmem-server/send_scm.c   |  208 
  contrib/ivshmem-server/send_scm.h   |   19 +
  cpu-all.h   |   18 +-
  cpu-common.h|2 +
  docs/specs/ivshmem_device_spec.txt  |   96 
  exec.c  |   48 ++-
  hw/ivshmem.c|  852 +++
  kvm-all.c   |   32 ++
  kvm.h   |1 +
  qemu-char.c |6 +
  qemu-char.h |3 +
  qemu-doc.texi   |   32 ++
  17 files changed, 1710 insertions(+), 37 deletions(-)
  create mode 100644 contrib/ivshmem-server/Makefile
  create mode 100644 contrib/ivshmem-server/README
  create mode 100644 contrib/ivshmem-server/ivshmem_server.c
  create mode 100644 contrib/ivshmem-server/send_scm.c
  create mode 100644 contrib/ivshmem-server/send_scm.h
  create mode 100644 docs/specs/ivshmem_device_spec.txt
  create mode 100644 hw/ivshmem.c

Re: [Qemu-devel] [PATCH v6 6/6] the stand-alone shared memory server for inter-VM shared memory

2010-06-14 Thread Anthony Liguori


On 06/04/2010 04:45 PM, Cam Macdonell wrote:

this code is a standalone server which will pass file descriptors for the shared
memory region and eventfds to support interrupts between guests using inter-VM
shared memory.
---
  contrib/ivshmem-server/Makefile |   16 ++
  contrib/ivshmem-server/README   |   30 +++
  contrib/ivshmem-server/ivshmem_server.c |  353 +++
  contrib/ivshmem-server/send_scm.c   |  208 ++
  contrib/ivshmem-server/send_scm.h   |   19 ++
  5 files changed, 626 insertions(+), 0 deletions(-)
  create mode 100644 contrib/ivshmem-server/Makefile
  create mode 100644 contrib/ivshmem-server/README
  create mode 100644 contrib/ivshmem-server/ivshmem_server.c
  create mode 100644 contrib/ivshmem-server/send_scm.c
  create mode 100644 contrib/ivshmem-server/send_scm.h

diff --git a/contrib/ivshmem-server/Makefile b/contrib/ivshmem-server/Makefile
new file mode 100644
index 000..da40ffa
--- /dev/null
+++ b/contrib/ivshmem-server/Makefile
@@ -0,0 +1,16 @@
+CC = gcc
+CFLAGS = -O3 -Wall -Werror
+LIBS = -lrt
+
+# a very simple makefile to build the inter-VM shared memory server
+
+all: ivshmem_server
+
+.c.o:
+   $(CC) $(CFLAGS) -c $^ -o $@
+
+ivshmem_server: ivshmem_server.o send_scm.o
+   $(CC) $(CFLAGS) -o $@ $^ $(LIBS)
+
+clean:
+   rm -f *.o ivshmem_server
diff --git a/contrib/ivshmem-server/README b/contrib/ivshmem-server/README
new file mode 100644
index 000..b1fc2a2
--- /dev/null
+++ b/contrib/ivshmem-server/README
@@ -0,0 +1,30 @@
+Using the ivshmem shared memory server
+--
+
+This server is only supported on Linux.
+
+To use the shared memory server, first compile it.  Running 'make' should
+accomplish this.  An executable named 'ivshmem_server' will be built.
+
+to display the options run:
+
+./ivshmem_server -h
+
+Options
+---
+
+-h  print help message
+
+-ppath on host
+unix socket to listen on.  The qemu-kvm chardev needs to connect on
+this socket. (default: '/tmp/ivshmem_socket')
+
+-sstring
+POSIX shared object to create that is the shared memory (default: 
'ivshmem')
+
+-m#
+size of the POSIX object in MBs (default: 1)
+
+-n#
+number of eventfds for each guest.  This number must match the
+'vectors' argument passed the ivshmem device. (default: 1)
diff --git a/contrib/ivshmem-server/ivshmem_server.c 
b/contrib/ivshmem-server/ivshmem_server.c
new file mode 100644
index 000..e0a7b98
--- /dev/null
+++ b/contrib/ivshmem-server/ivshmem_server.c
   


There's no licensing here.  I don't think this belongs in the qemu tree 
either to be honest.  If it were to be included, it ought to use all of 
the existing qemu infrastructure like the other qemu-* tools.


Regards,

Anthony Liguori


@@ -0,0 +1,353 @@
+/*
+ * A stand-alone shared memory server for inter-VM shared memory for KVM
+*/
+
+#includeerrno.h
+#includestring.h
+#includesys/types.h
+#includesys/socket.h
+#includesys/un.h
+#includeunistd.h
+#includesys/types.h
+#includesys/stat.h
+#includefcntl.h
+#includesys/eventfd.h
+#includesys/mman.h
+#includesys/select.h
+#includestdio.h
+#includestdlib.h
+#include send_scm.h
+
+#define DEFAULT_SOCK_PATH /tmp/ivshmem_socket
+#define DEFAULT_SHM_OBJ ivshmem
+
+#define DEBUG 1
+
+typedef struct server_state {
+vmguest_t *live_vms;
+int nr_allocated_vms;
+int shm_size;
+long live_count;
+long total_count;
+int shm_fd;
+char * path;
+char * shmobj;
+int maxfd, conn_socket;
+long msi_vectors;
+} server_state_t;
+
+void usage(char const *prg);
+int find_set(fd_set * readset, int max);
+void print_vec(server_state_t * s, const char * c);
+
+void add_new_guest(server_state_t * s);
+void parse_args(int argc, char **argv, server_state_t * s);
+int create_listening_socket(char * path);
+
+int main(int argc, char ** argv)
+{
+fd_set readset;
+server_state_t * s;
+
+s = (server_state_t *)calloc(1, sizeof(server_state_t));
+
+s-live_count = 0;
+s-total_count = 0;
+parse_args(argc, argv, s);
+
+/* open shared memory file  */
+if ((s-shm_fd = shm_open(s-shmobj, O_CREAT|O_RDWR, S_IRWXU))  0)
+{
+fprintf(stderr, kvm_ivshmem: could not open shared file\n);
+exit(-1);
+}
+
+ftruncate(s-shm_fd, s-shm_size);
+
+s-conn_socket = create_listening_socket(s-path);
+
+s-maxfd = s-conn_socket;
+
+for(;;) {
+int ret, handle, i;
+char buf[1024];
+
+print_vec(s, vm_sockets);
+
+FD_ZERO(readset);
+/* conn socket is in Live_vms at posn 0 */
+FD_SET(s-conn_socket,readset);
+for (i = 0; i  s-total_count; i++) {
+if (s-live_vms[i].alive != 0) {
+FD_SET(s-live_vms[i].sockfd,readset);
+}
+}
+
+printf(\nWaiting (maxfd = %d)\n, s-maxfd);
+
+ret = select(s-maxfd + 1,readset, NULL, NULL, NULL);
+
+if

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Jan Kiszka

Alex Williamson wrote:
 On Mon, 2010-06-14 at 14:09 +0100, Paul Brook wrote:
 /main-system-bus/pci.0,addr=09.0/virtio-blk-pci
 There's a device missing between the main system bus and the pci bus.  
 Should 
 be something like:

 /main-system-bus/piix4-pcihost/pci.0/_09.0
 
 Ok, I can easily come up with:
 
 /System/main-system-bus/i440FX-pcihost/PCI/pci.0/_09.0/virtio-blk-pci/virtio-blk

First two elements are redundant, '/' already stands for the main system
bus. Then I don't get what last element expresses (the target device is
virtio-blk-pci). Is this due to some vmstate layout? But that should not
be part into qdev paths (maybe I'm missing your use case).

And instead of introducing another hierarchy level with the bus address,
I would also prefer to add this as prefix or suffix to the device name,
e.g. driver.busaddr.

 
 to expand the previous output.  Code below.
 
 Could you explain why you add identified properties of the immediate
 parent bus and device?  They make the result ver much *not* a dev
 path in the qdev sense...
 In order to try to get a unique string.  Without looking into device
 properties, two e1000s would both be:

 /main-system-bus/pci.0/e1000
 /main-system-bus/pci.0/e1000

 Which is no better than simply e1000 and would require us to fall back
 to instance ids again.  The goal here is that anything that makes use of
 passing a dev when registering a vmstate gets an instance id of zero.
 You already got the information you need, you just put it in the wrong 
 place. 
 The canonical ID for the device could be its bus address. In practice we'd 
 probably want to allow the user to specify it by name, provided these are 
 unique. e.g. in the above machine we could accept [...]/virtiio-blk-pci 
 would 
 as an aias for [...]:_09.0.
 
 Sure, if there was a guaranteed unique char* on a DeviceState that was
 consistent between guest invocations, we could just use that instead.  I
 suppose all devices could have a global system id property and if that
 was present we'd use that instead of creating the device path.
 
 Device names have a restricted namespace, so we 
 can use an initial prefix to disambiguate a name/label from a bus address.
 
 I'm not sure it's necessary.  It seems like it would only be necessary
 to differentiate the two if we wanted to translate between namespaces.
 But I think it's reasonable to require that if a global name is
 provided, it must always be provided for the life of the VM.
 
 For busses that don't have a consistent addressing scheme then some sort of 
 instance ID is unavoidable. I guess it may be possible to invent something 
 based on other device properties (e.g. address of the first IO port/memory 
 region).
 
 Instance IDs aren't always bad, we just run into trouble with hotplug
 and maybe creating unique ramblock names.  But, such busses typically
 don't support hotplug and don't have multiple instances of the same
 device on the bus, so I don't expect us to hit many issues there as long
 as we can get a stable address path.  Thanks,
 

If stable instance numbers are required, we could simply keep them in
DeviceState and search for the smallest free one on additions. Actually,
I'm more in favor of this direction than including the bus address. That
way we could keep a canonical format across all buses and do not have to
provide possibly complex ID generation rules for each of them.

BTW, the problem isn't truly solved by printing paths. We also need to
parse them. It would be counterproductive if such paths ever see the
light of day (ie. leak outside vmstate) and a user tries to hack it
into the monitor or pass it on the command line. With my series, I'm
currently able to process paths like this one:

/i440FX-pcihost.0/pci.0/PIIX4_PM.0/i2c/smbus-eeprom.6

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 00/10][PULL]: QMP/Monitor queue

2010-06-14 Thread Anthony Liguori


On 06/11/2010 02:58 PM, Luiz Capitulino wrote:

  Hi Anthony,

  The following QMP/Monitor patches have been sent to the list and look good
to me. I have also tested most of them.

  The changes (since 0e2029a063405091ee34170ef71aa321715e4357) are available in
the following repository:

 git://repo.or.cz/qemu/qmp-unstable.git for-anthony
   


Pulled.  Thanks.

Regards,

Anthony Liguori


Jan Kiszka (1):
   hxtool: Fix line number reporting on SQMP/EQMP errors

Luiz Capitulino (6):
   json-lexer: Initialize 'x' and 'y'
   json-lexer: Handle missing escapes
   qjson: Handle \f
   check-qjson: Add more escape tests
   json-lexer: Drop 'buf'
   json-streamer: Don't use qdict_put_obj()

Paolo Bonzini (3):
   add some tests for invalid JSON
   implement optional lookahead in json lexer
   remove unnecessary lookaheads

  check-qjson.c   |  112 -
  hxtool  |2 +
  json-lexer.c|  114 ++
  json-parser.c   |4 ++
  json-streamer.c |8 ++--
  qjson.c |3 +
  6 files changed, 177 insertions(+), 66 deletions(-)

[Qemu-devel] Re: [PATCH v3 0/5] Add QMP migration events

2010-06-14 Thread Juan Quintela

Anthony Liguori aligu...@linux.vnet.ibm.com wrote:
 On 06/12/2010 06:05 AM, Juan Quintela wrote:
 Luiz Capitulinolcapitul...@redhat.com  wrote:

 The monitor that did it knows it, nobody else knows it.  At destination
 time, I guess you agree this is important, i.e. the management app knows
 that migration has started.


 Dual monitors is a slippery slope argument because even if you had
 these events, it doesn't give you nearly enough information to
 implement anything safely.

Security folks here needed to do logging of qemu events, here.  Guest
what ones: vm_start, vm_stop, vm_continue, and vm_migrate.

Insteod of a nice: write this small qmp client, and listen for this
four events, I just had to point them where to hack their logging.

About libvirt, I would rreally, really like to be able to use libvirt to
launch a guest, and then let im me launch another monitor and stoup
libvirt for continuing with it.  There is no way for a monitor that
there has been doing write operations in other monitors.  I see this
as a useful feature for all write (i.e. not query) commands.

 If you truly want to support dual uncooperative monitors, you
 basically need to mirror every monitor session so that each monitor
 can see what the other monitors are doing.  You also need a
 transaction mechanism to allow a client to do operations in a non-racy
 way.

For now, I will set with just knowing that other write operations has 
happened.

 Since we're not likely to ever implement these things, dual monitor
 support is really not a viable argument for such a change.

As showed before for the audit logging, A read only monitor would have
been useful for me now, i.e. not I wish, whatever.

 QMP is the wrong mechanism for this.  Merging the tracing code and
 then adding trace points to migration is the right solution for this
 problem.

 The problem is, all of the arguments you're using to justify this for
 the migrate command is applicable for every other command we have.
 Why do this for migrate and not for commit or savevm?

 Do we really want to introduce 4 events for every command that we support?

Migration commands have a feature that dont' have other commands: they
invosve two machines.

And I would also liked to have that events for all the write commands.
Migration is more interesting becaues it needs synchronization.


Later, Juan.

Re: [Qemu-devel] Re: [SeaBIOS] [PATCHv2] load hpet info for HPET ACPI table from qemu

2010-06-14 Thread Gleb Natapov

On Mon, Jun 14, 2010 at 03:40:16PM +0100, Jamie Lokier wrote:
 Gleb Natapov wrote:
  On Mon, Jun 14, 2010 at 09:54:25AM -0400, Kevin O'Connor wrote:
   Could we just have qemu build the hpet tables and pass them through to
   seabios?  Perhaps using the qemu_cfg_acpi_additional_tables() method.
   
  Possible, and I considered that. I personally prefer to pass minimum
  information required for seabios to discover underlying HW and leave
  ACPI table creation to seabios. That is how things done for HW that
  seabios can actually detect. If we will go your way pretty soon we will
  move creation of ACPI/SMBIOS/MP tables into qemu and IMHO this will be
  step backworkds.
 
 Why would creation of all the tables in qemu be a bad thing or a step
 in the wrong direction?
 
See Avi's answer. All those tables are firmware/OS interface and as such
may (and some definitely do) contain information known only to BIOS.

--
Gleb.

Re: [Qemu-devel] [PATCH v6 4/6] Add support for marking memory to not be migrated. On migration, memory is checked for the NO_MIGRATION_FLAG.

2010-06-14 Thread Anthony Liguori


On 06/14/2010 11:08 AM, Cam Macdonell wrote:

On Mon, Jun 14, 2010 at 9:51 AM, Anthony Liguorianth...@codemonkey.ws  wrote:
   

On 06/04/2010 04:45 PM, Cam Macdonell wrote:
 

This is useful for devices that do not want to take memory regions data
with them on migration.
---
  arch_init.c  |   28 
  cpu-all.h|2 ++
  cpu-common.h |2 ++
  exec.c   |   12 
  4 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index cfc03ea..7a234fa 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -118,18 +118,21 @@ static int ram_save_block(QEMUFile *f)
  current_addr +
TARGET_PAGE_SIZE,
  MIGRATION_DIRTY_FLAG);

-p = qemu_get_ram_ptr(current_addr);
-
-if (is_dup_page(p, *p)) {
-qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS);
-qemu_put_byte(f, *p);
-} else {
-qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
-qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
-}
+if (!cpu_physical_memory_get_dirty(current_addr,
+NO_MIGRATION_FLAG)) {
+p = qemu_get_ram_ptr(current_addr);
+
+if (is_dup_page(p, *p)) {
+qemu_put_be64(f, current_addr |
RAM_SAVE_FLAG_COMPRESS);
+qemu_put_byte(f, *p);
+} else {
+qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
+qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+}

-found = 1;
-break;
+found = 1;
+break;
+}
  }

   

Shouldn't we just disable live migration out right?
 

I'm confused, as you seemed insistent on migration before.  Do you
want to support static migration (suspend/resume), but not live
migration?  What information do the master/peer roles represent then?
   


When role=master, you should not disable live migration and you should 
still migrate the contents of the data.  Otherwise, when you migrate, 
you lose the contents of shared memory and since the role is master, 
it's the one responsible for the data.



I would rather that the device mark migration as impossible having the user
hot remove the device before migration and then add it again after
migration.  Device assignment could also use this functionality.
 

Would marking migration impossible be a new mechanism or are there
other devices that mark migration impossible? or something added to
QMP Sorry, you can't migrate with device 'x' attached?
   


We don't have such a mechanism today.

Regards,

Anthony Liguori


Cam

   

  addr += TARGET_PAGE_SIZE;
  current_addr = (saved_addr + addr) % last_ram_offset;
@@ -146,7 +149,8 @@ static ram_addr_t ram_save_remaining(void)
  ram_addr_t count = 0;

  for (addr = 0; addrlast_ram_offset; addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
+if (!cpu_physical_memory_get_dirty(addr, NO_MIGRATION_FLAG)
+cpu_physical_memory_get_dirty(addr,
MIGRATION_DIRTY_FLAG)) {
  count++;
  }
  }
diff --git a/cpu-all.h b/cpu-all.h
index 9080cc7..4df00ab 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -887,6 +887,8 @@ extern int mem_prealloc;
  #define CODE_DIRTY_FLAG  0x02
  #define MIGRATION_DIRTY_FLAG 0x08

+#define NO_MIGRATION_FLAG 0x10
+
  #define DIRTY_ALL_FLAG  (VGA_DIRTY_FLAG | CODE_DIRTY_FLAG |
MIGRATION_DIRTY_FLAG)

  /* read dirty bit (return 0 or 1) */
diff --git a/cpu-common.h b/cpu-common.h
index 4b0ba60..a1ebbbe 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -39,6 +39,8 @@ static inline void
cpu_register_physical_memory(target_phys_addr_t start_addr,
  cpu_register_physical_memory_offset(start_addr, size, phys_offset,
0);
  }

+void cpu_mark_pages_no_migrate(ram_addr_t start, uint64_t size);
+
  ram_addr_t cpu_get_physical_page_desc(target_phys_addr_t addr);
  ram_addr_t qemu_ram_map(ram_addr_t size, void *host);
  ram_addr_t qemu_ram_alloc(ram_addr_t);
diff --git a/exec.c b/exec.c
index 39c18a7..c11d22f 100644
--- a/exec.c
+++ b/exec.c
@@ -2786,6 +2786,18 @@ static void *file_ram_alloc(ram_addr_t memory,
const char *path)
  }
  #endif

+void cpu_mark_pages_no_migrate(ram_addr_t start, uint64_t length)
+{
+int i, len;
+uint8_t *p;
+
+len = lengthTARGET_PAGE_BITS;
+p = phys_ram_flags + (startTARGET_PAGE_BITS);
+for (i = 0; ilen; i++) {
+p[i] |= NO_MIGRATION_FLAG;
+}
+}
+
  ram_addr_t qemu_ram_map(ram_addr_t size, void *host)
  {
  RAMBlock *new_block;

[Qemu-devel] ARM Cortex-M3 Resets

2010-06-14 Thread Christopher Johnson

I am attempting to run FreeRTOS under qemu-system-arm 0.12.  I am compiling
from source.  At the current time arm-test works fine.  It uses a boot
loader with the expectation that the PC=0 after Reset is de-asserted.

The CORTEXT-M3 reference states:
NVIC resets, holds core in reset NVIC clears most of its registers. The
processor is in Thread mode, priority is
 privileged, and the stack is set to Main.
NVIC releases core from resetNVIC releases core from reset.
Core sets stack  Core reads the start SP, SP_main, from
vector-table offset 0.
Core sets PC and LR  Core reads the start PC from vector-table
offset. LR is set to 0x.
Reset routine runs   NVIC has interrupts disabled, and NMI and
Hard Fault are not disabled.

My translation of this is that the NVIC vector table is located at 0x0.
Therefore SP=Word at location 0 of physical memory.  LR=0x, and
PC=Word at location 4 of physical memory.

This matches what I see in the LM3S811 example code from TI, it also matches
what I see in the FreeRTOS code.

In looking at target-arm/helper.c I did not see anything that seemed to set
the PC, SP or LR.  I added some code to the reset functions and moved what I
think is the PC set.

--- helper.c.ORG2010-05-04 11:27:48.0 -0400
+++ helper.c2010-06-13 19:46:58.0 -0400
@@ -134,6 +134,15 @@
 set_feature(env, ARM_FEATURE_V7);
 set_feature(env, ARM_FEATURE_M);
 set_feature(env, ARM_FEATURE_DIV);
+/* R13 = SP, R14=LR, R15=PC */
+{
+uint8_t buf[8];
+cpu_physical_memory_read(0x0, buf, sizeof(buf));
+env-regs[13]=(buf[0]24) | (buf[1]  16) | (buf[2]  8) | bu
f[3];
+env-regs[14]=0x;
+env-regs[15]=(buf[4]24) | (buf[5]  16) | (buf[6]  8) | bu
f[7];
+fprintf(stderr,cpu_reset_model_id: SP=0x%X, LR=0x%X, PC=0x%X\n,
env-regs[13]
, env-regs[14], env-regs[15]);
+}
 break;
 case ARM_CPUID_ANY: /* For userspace emulation.  */
 set_feature(env, ARM_FEATURE_V6);
@@ -193,11 +202,13 @@
 qemu_log(CPU Reset (CPU %d)\n, env-cpu_index);
 log_cpu_state(env, 0);
 }
+fprintf(stderr,CPU Reset (CPU %d)\n, env-cpu_index);

 id = env-cp15.c0_cpuid;
 memset(env, 0, offsetof(CPUARMState, breakpoints));
 if (id)
 cpu_reset_model_id(env, id);
+env-regs[15] = 0;
 #if defined (CONFIG_USER_ONLY)
 env-uncached_cpsr = ARM_CPU_MODE_USR;
 env-vfp.xregs[ARM_VFP_FPEXC] = 1  30;
@@ -211,7 +222,6 @@
 env-vfp.xregs[ARM_VFP_FPEXC] = 0;
 env-cp15.c2_base_mask = 0xc000u;
 #endif
-env-regs[15] = 0;
 tlb_flush(env, 1);
 }

The problem is that It does not seem to load from the NVIC vector and even
when I do a monitor command: system_reset, the arm does not reset.

I'm looking for help in accessing rom memory to set SP and PC on reset.  Of
implementing system_reset for the ARM processor(s).

Thank you,
-Chris

[Qemu-devel] Re: [PATCH v3 0/5] Add QMP migration events

2010-06-14 Thread Anthony Liguori


On 06/14/2010 11:02 AM, Juan Quintela wrote:

Anthony Liguorialigu...@linux.vnet.ibm.com  wrote:
   

On 06/12/2010 06:05 AM, Juan Quintela wrote:
 

Luiz Capitulinolcapitul...@redhat.com   wrote:
   
   

The monitor that did it knows it, nobody else knows it.  At destination
time, I guess you agree this is important, i.e. the management app knows
that migration has started.

   

Dual monitors is a slippery slope argument because even if you had
these events, it doesn't give you nearly enough information to
implement anything safely.
 

Security folks here needed to do logging of qemu events, here.  Guest
what ones: vm_start, vm_stop, vm_continue, and vm_migrate.
   


Why do security folks need this?  Why are they not interested in other 
things like savevm?  Why are they talkign to qemu and not libvirt (they 
shouldn't trust qemu).



Insteod of a nice: write this small qmp client, and listen for this
four events, I just had to point them where to hack their logging.

About libvirt, I would rreally, really like to be able to use libvirt to
launch a guest, and then let im me launch another monitor and stoup
libvirt for continuing with it.  There is no way for a monitor that
there has been doing write operations in other monitors.  I see this
as a useful feature for all write (i.e. not query) commands.
   


Yeah, but if we want to do this, then we need to discuss this with the 
libvirt folks and come up with a proposal that works for all commands.  
Sneaking in a few migration events is not going to help.



QMP is the wrong mechanism for this.  Merging the tracing code and
then adding trace points to migration is the right solution for this
problem.
 
   

The problem is, all of the arguments you're using to justify this for
the migrate command is applicable for every other command we have.
Why do this for migrate and not for commit or savevm?

Do we really want to introduce 4 events for every command that we support?
 

Migration commands have a feature that dont' have other commands: they
invosve two machines.

And I would also liked to have that events for all the write commands.
Migration is more interesting becaues it needs synchronization.
   


I'm still fundamentally confused about what you think you can do with 
these events.  But do you really intend on introducing events for every 
non-query QMP command?  Does that seem a bit unrealistic?


Wouldn't it be better to have a mechanism to snoop on monitor traffic in 
a more proper way?


Regards,

Anthony Liguori


Later, Juan.

Re: [Qemu-devel] [PATCH v6 4/6] Add support for marking memory to not be migrated. On migration, memory is checked for the NO_MIGRATION_FLAG.

2010-06-14 Thread Cam Macdonell

On Mon, Jun 14, 2010 at 9:51 AM, Anthony Liguori anth...@codemonkey.ws wrote:
 On 06/04/2010 04:45 PM, Cam Macdonell wrote:

 This is useful for devices that do not want to take memory regions data
 with them on migration.
 ---
  arch_init.c  |   28 
  cpu-all.h    |    2 ++
  cpu-common.h |    2 ++
  exec.c       |   12 
  4 files changed, 32 insertions(+), 12 deletions(-)

 diff --git a/arch_init.c b/arch_init.c
 index cfc03ea..7a234fa 100644
 --- a/arch_init.c
 +++ b/arch_init.c
 @@ -118,18 +118,21 @@ static int ram_save_block(QEMUFile *f)
                                              current_addr +
 TARGET_PAGE_SIZE,
                                              MIGRATION_DIRTY_FLAG);

 -            p = qemu_get_ram_ptr(current_addr);
 -
 -            if (is_dup_page(p, *p)) {
 -                qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS);
 -                qemu_put_byte(f, *p);
 -            } else {
 -                qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
 -                qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
 -            }
 +            if (!cpu_physical_memory_get_dirty(current_addr,
 +                                                    NO_MIGRATION_FLAG)) {
 +                p = qemu_get_ram_ptr(current_addr);
 +
 +                if (is_dup_page(p, *p)) {
 +                    qemu_put_be64(f, current_addr |
 RAM_SAVE_FLAG_COMPRESS);
 +                    qemu_put_byte(f, *p);
 +                } else {
 +                    qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
 +                    qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
 +                }

 -            found = 1;
 -            break;
 +                found = 1;
 +                break;
 +            }
          }


 Shouldn't we just disable live migration out right?

I'm confused, as you seemed insistent on migration before.  Do you
want to support static migration (suspend/resume), but not live
migration?  What information do the master/peer roles represent then?


 I would rather that the device mark migration as impossible having the user
 hot remove the device before migration and then add it again after
 migration.  Device assignment could also use this functionality.

Would marking migration impossible be a new mechanism or are there
other devices that mark migration impossible? or something added to
QMP Sorry, you can't migrate with device 'x' attached?

Cam

          addr += TARGET_PAGE_SIZE;
          current_addr = (saved_addr + addr) % last_ram_offset;
 @@ -146,7 +149,8 @@ static ram_addr_t ram_save_remaining(void)
      ram_addr_t count = 0;

      for (addr = 0; addr  last_ram_offset; addr += TARGET_PAGE_SIZE) {
 -        if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
 +        if (!cpu_physical_memory_get_dirty(addr, NO_MIGRATION_FLAG)
 +                cpu_physical_memory_get_dirty(addr,
 MIGRATION_DIRTY_FLAG)) {
              count++;
          }
      }
 diff --git a/cpu-all.h b/cpu-all.h
 index 9080cc7..4df00ab 100644
 --- a/cpu-all.h
 +++ b/cpu-all.h
 @@ -887,6 +887,8 @@ extern int mem_prealloc;
  #define CODE_DIRTY_FLAG      0x02
  #define MIGRATION_DIRTY_FLAG 0x08

 +#define NO_MIGRATION_FLAG 0x10
 +
  #define DIRTY_ALL_FLAG  (VGA_DIRTY_FLAG | CODE_DIRTY_FLAG |
 MIGRATION_DIRTY_FLAG)

  /* read dirty bit (return 0 or 1) */
 diff --git a/cpu-common.h b/cpu-common.h
 index 4b0ba60..a1ebbbe 100644
 --- a/cpu-common.h
 +++ b/cpu-common.h
 @@ -39,6 +39,8 @@ static inline void
 cpu_register_physical_memory(target_phys_addr_t start_addr,
      cpu_register_physical_memory_offset(start_addr, size, phys_offset,
 0);
  }

 +void cpu_mark_pages_no_migrate(ram_addr_t start, uint64_t size);
 +
  ram_addr_t cpu_get_physical_page_desc(target_phys_addr_t addr);
  ram_addr_t qemu_ram_map(ram_addr_t size, void *host);
  ram_addr_t qemu_ram_alloc(ram_addr_t);
 diff --git a/exec.c b/exec.c
 index 39c18a7..c11d22f 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -2786,6 +2786,18 @@ static void *file_ram_alloc(ram_addr_t memory,
 const char *path)
  }
  #endif

 +void cpu_mark_pages_no_migrate(ram_addr_t start, uint64_t length)
 +{
 +    int i, len;
 +    uint8_t *p;
 +
 +    len = length  TARGET_PAGE_BITS;
 +    p = phys_ram_flags + (start  TARGET_PAGE_BITS);
 +    for (i = 0; i  len; i++) {
 +        p[i] |= NO_MIGRATION_FLAG;
 +    }
 +}
 +
  ram_addr_t qemu_ram_map(ram_addr_t size, void *host)
  {
      RAMBlock *new_block;

Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.

2010-06-14 Thread Richard Henderson

On 06/13/2010 03:23 PM, Alexander Graf wrote:
 On 13.06.2010, at 18:44, Aurelien Jarno wrote:
 Is KVM in 31-bit mode actually functional?
 
 I'm not aware of anything preventing it to be. But I honestly haven't
 tried. As long as all hypercall parameters stay within the first
 32/31 bits, things should be safe.

On the other hand, is there any point in supporting it?
This does seem to be a case for which it does seem like
having the extra VM space (KVM+TCG) and the extra register
size (TCG) is extremely helpful.

I'd be just as happy to adjust the configury to actively
prevent compilation in 31-bit mode, rather than pretend
it might work and never bother to test it...


r~

Re: [Qemu-devel] [PATCH] qemu-option: Fix uninitialized value in append_option_parameter

2010-06-14 Thread Anthony Liguori


On 06/11/2010 03:19 AM, Kevin Wolf wrote:

When dest is NULL, i.e. a new copy of the list is created, we don't get a
properly terminated list after the realloc. Initialize it as an empty list.

Signed-off-by: Kevin Wolfkw...@redhat.com
   


Applied.  Thanks.

Regards,

Anthony Liguori


---

Xudong, can you please try this one? I think it should fix your qemu-img
problem.

  qemu-option.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/qemu-option.c b/qemu-option.c
index acd74f9..f884865 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -378,6 +378,7 @@ QEMUOptionParameter 
*append_option_parameters(QEMUOptionParameter *dest,
  num_options += count_option_parameters(list);

  dest = qemu_realloc(dest, (num_options + 1) * 
sizeof(QEMUOptionParameter));
+dest[num_dest_options].name = NULL;

  while (list  list-name) {
  if (get_option_parameter(dest, list-name) == NULL) {

Re: [Qemu-devel] [PATCH] Make netdev_del delete the netdev even when it's in use

2010-06-14 Thread Anthony Liguori


On 06/11/2010 07:21 AM, Markus Armbruster wrote:

To hot-unplug guest and host part of a network device, you do:

 device_del NIC-ID
 netdev_del NETDEV-ID

For PCI devices, device_del merely tells ACPI to unplug the device.
The device goes away for real only after the guest processed the ACPI
unplug event.

You have to wait until then (e.g. by polling info pci) before you can
unplug the netdev.  Not good.

Fix by removing the in use check from do_netdev_del().  Deleting a
netdev while it's in use is safe; packets simply get routed to the bit
bucket.

Signed-off-by: Markus Armbrusterarm...@redhat.com
   


Applied.  Thanks.

Regards,

Anthony Liguori

---
I can add a force option instead, if you think that's better.

  net.c |4 
  1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/net.c b/net.c
index 4cb93ed..0703698 100644
--- a/net.c
+++ b/net.c
@@ -1221,10 +1221,6 @@ int do_netdev_del(Monitor *mon, const QDict *qdict, 
QObject **ret_data)
  qerror_report(QERR_DEVICE_NOT_FOUND, id);
  return -1;
  }
-if (vc-peer) {
-qerror_report(QERR_DEVICE_IN_USE, id);
-return -1;
-}
  qemu_del_vlan_client(vc);
  qemu_opts_del(qemu_opts_find(qemu_netdev_opts, id));
  return 0;

Re: [Qemu-devel] [PATCH v2] ram_blocks: Convert to a QLIST

2010-06-14 Thread Anthony Liguori


On 06/11/2010 12:11 PM, Alex Williamson wrote:

This makes the RAM block list easier to manipulate.  Also incorporate
relevant variables into the RAMList struct.

Signed-off-by: Alex Williamsonalex.william...@redhat.com
Acked-by: Chris Wrightchr...@redhat.com
   


Applied.  Thanks.

Regards,

Anthony Liguori

---

v2: For qemu.git this time

  arch_init.c |   14 +--
  cpu-all.h   |   28 +++--
  exec.c  |   78 ---
  3 files changed, 54 insertions(+), 66 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 8e849a8..eb5b67c 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -110,7 +110,7 @@ static int ram_save_block(QEMUFile *f)
  ram_addr_t addr = 0;
  int bytes_sent = 0;

-while (addr  last_ram_offset) {
+while (addr  ram_list.last_offset) {
  if (cpu_physical_memory_get_dirty(current_addr, 
MIGRATION_DIRTY_FLAG)) {
  uint8_t *p;

@@ -133,7 +133,7 @@ static int ram_save_block(QEMUFile *f)
  break;
  }
  addr += TARGET_PAGE_SIZE;
-current_addr = (saved_addr + addr) % last_ram_offset;
+current_addr = (saved_addr + addr) % ram_list.last_offset;
  }

  return bytes_sent;
@@ -146,7 +146,7 @@ static ram_addr_t ram_save_remaining(void)
  ram_addr_t addr;
  ram_addr_t count = 0;

-for (addr = 0; addr  last_ram_offset; addr += TARGET_PAGE_SIZE) {
+for (addr = 0; addr  ram_list.last_offset; addr += TARGET_PAGE_SIZE) {
  if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
  count++;
  }
@@ -167,7 +167,7 @@ uint64_t ram_bytes_transferred(void)

  uint64_t ram_bytes_total(void)
  {
-return last_ram_offset;
+return ram_list.last_offset;
  }

  int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
@@ -191,7 +191,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
  bytes_transferred = 0;

  /* Make sure all dirty bits are set */
-for (addr = 0; addr  last_ram_offset; addr += TARGET_PAGE_SIZE) {
+for (addr = 0; addr  ram_list.last_offset; addr += TARGET_PAGE_SIZE) {
  if (!cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
  cpu_physical_memory_set_dirty(addr);
  }
@@ -200,7 +200,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
  /* Enable dirty memory tracking */
  cpu_physical_memory_set_dirty_tracking(1);

-qemu_put_be64(f, last_ram_offset | RAM_SAVE_FLAG_MEM_SIZE);
+qemu_put_be64(f, ram_list.last_offset | RAM_SAVE_FLAG_MEM_SIZE);
  }

  bytes_transferred_last = bytes_transferred;
@@ -259,7 +259,7 @@ int ram_load(QEMUFile *f, void *opaque, int version_id)
  addr= TARGET_PAGE_MASK;

  if (flags  RAM_SAVE_FLAG_MEM_SIZE) {
-if (addr != last_ram_offset) {
+if (addr != ram_list.last_offset) {
  return -EINVAL;
  }
  }
diff --git a/cpu-all.h b/cpu-all.h
index 77eaf85..e31c2de 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -859,9 +859,21 @@ target_phys_addr_t cpu_get_phys_page_debug(CPUState *env, 
target_ulong addr);
  /* memory API */

  extern int phys_ram_fd;
-extern uint8_t *phys_ram_dirty;
  extern ram_addr_t ram_size;
-extern ram_addr_t last_ram_offset;
+
+typedef struct RAMBlock {
+uint8_t *host;
+ram_addr_t offset;
+ram_addr_t length;
+QLIST_ENTRY(RAMBlock) next;
+} RAMBlock;
+
+typedef struct RAMList {
+uint8_t *phys_dirty;
+ram_addr_t last_offset;
+QLIST_HEAD(ram, RAMBlock) blocks;
+} RAMList;
+extern RAMList ram_list;

  extern const char *mem_path;
  extern int mem_prealloc;
@@ -891,29 +903,29 @@ extern int mem_prealloc;
  /* read dirty bit (return 0 or 1) */
  static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
  {
-return phys_ram_dirty[addr  TARGET_PAGE_BITS] == 0xff;
+return ram_list.phys_dirty[addr  TARGET_PAGE_BITS] == 0xff;
  }

  static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
  {
-return phys_ram_dirty[addr  TARGET_PAGE_BITS];
+return ram_list.phys_dirty[addr  TARGET_PAGE_BITS];
  }

  static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
  int dirty_flags)
  {
-return phys_ram_dirty[addr  TARGET_PAGE_BITS]  dirty_flags;
+return ram_list.phys_dirty[addr  TARGET_PAGE_BITS]  dirty_flags;
  }

  static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
  {
-phys_ram_dirty[addr  TARGET_PAGE_BITS] = 0xff;
+ram_list.phys_dirty[addr  TARGET_PAGE_BITS] = 0xff;
  }

  static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
int dirty_flags)
  {
-return phys_ram_dirty[addr  TARGET_PAGE_BITS] |= dirty_flags;
+return ram_list.phys_dirty[addr  TARGET_PAGE_BITS] |= dirty_flags;
  }

  static

Re: [Qemu-devel] [PATCH] hpet: Init capability register only once

2010-06-14 Thread Anthony Liguori


On 06/14/2010 01:40 AM, Jan Kiszka wrote:

From: Jan Kiszkajan.kis...@siemens.com

The capability register is read-only from guest POV, so we do not need
to update it on reset.

Signed-off-by: Jan Kiszkajan.kis...@siemens.com
   


Applied.  Thanks.

Regards,

Anthony Liguori

---
  hw/hpet.c |9 +
  1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index 93fc399..e9b585c 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -648,10 +648,6 @@ static void hpet_reset(DeviceState *d)

  s-hpet_counter = 0ULL;
  s-hpet_offset = 0ULL;
-/* 64-bit main counter; LegacyReplacementRoute. */
-s-capability = 0x8086a001ULL;
-s-capability |= (s-num_timers - 1)  HPET_ID_NUM_TIM_SHIFT;
-s-capability |= ((HPET_CLK_PERIOD)  32);
  s-config = 0ULL;
  if (count  0) {
  /* we don't enable pit when hpet_reset is first called (by hpet_init)
@@ -696,6 +692,11 @@ static int hpet_init(SysBusDevice *dev)
  timer-state = s;
  }

+/* 64-bit main counter; LegacyReplacementRoute. */
+s-capability = 0x8086a001ULL;
+s-capability |= (s-num_timers - 1)  HPET_ID_NUM_TIM_SHIFT;
+s-capability |= ((HPET_CLK_PERIOD)  32);
+
  isa_reserve_irq(RTC_ISA_IRQ);
  qdev_init_gpio_in(dev-qdev, hpet_handle_rtc_irq, 1);

Re: [Qemu-devel] [PATCH 1/2] Remove unused DEBUG defines from hw/msix.c

2010-06-14 Thread Anthony Liguori


On 06/14/2010 10:05 AM, jes.soren...@redhat.com wrote:

From: Jes Sorensenjes.soren...@redhat.com

Remove unused DEBUG defines from hw/msix.c to avoid having anything
define the word DEBUG without any additions such as MSIX_DEBUG.

Signed-off-by: Jes Sorensenjes.soren...@redhat.com
   


Applied.  Thanks.

Regards,

Anthony Liguori


---
  hw/msix.c |9 -
  1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/hw/msix.c b/hw/msix.c
index 1613bb4..d99403a 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -42,15 +42,6 @@
  #define MSIX_MAX_ENTRIES 32


-#ifdef MSIX_DEBUG
-#define DEBUG(fmt, ...)   \
-do {  \
-  fprintf(stderr, %s:  fmt, __func__ , __VA_ARGS__);\
-} while (0)
-#else
-#define DEBUG(fmt, ...) do { } while(0)
-#endif
-
  /* Flag for interrupt controller to declare MSI-X support */
  int msix_supported;

Re: [Qemu-devel] [PATCHv2] pass info about hpets to seabios.]

2010-06-14 Thread Anthony Liguori


On 06/14/2010 03:29 AM, Gleb Natapov wrote:

Currently HPET ACPI table is created regardless of whether qemu actually
created hpet device. This may confuse some guests that don't check that
hpet is functional before using it. Solve this by passing info about
hpets in qemu to seabios via fw config interface. Additional benefit is
that seabios no longer uses hard coded hpet configuration. Proposed
interface supports up to 8 hpets. This is the number defined by hpet
spec.

Signed-off-by: Gleb Natapovg...@redhat.com
   


Applied.  Thanks.

Please let me know when Kevin merges the SeaBIOS side of this and I'll 
update the binary in the tree.


Regards,

Anthony Liguori

diff --git a/hw/hpet.c b/hw/hpet.c
index 93fc399..704fed1 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -71,8 +71,11 @@ typedef struct HPETState {
  uint64_t config;/* configuration */
  uint64_t isr;   /* interrupt status reg */
  uint64_t hpet_counter;  /* main counter */
+uint8_t  hpet_id;   /* instance id */
  } HPETState;

+struct hpet_fw_config hpet_cfg = {.count = ~0};
+
  static uint32_t hpet_in_legacy_mode(HPETState *s)
  {
  return s-config  HPET_CFG_LEGACY;
@@ -228,6 +231,7 @@ static int hpet_post_load(void *opaque, int version_id)
  /* Push number of timers into capability returned via HPET_ID */
  s-capability= ~HPET_ID_NUM_TIM_MASK;
  s-capability |= (s-num_timers - 1)  HPET_ID_NUM_TIM_SHIFT;
+hpet_cfg.hpet[s-hpet_id].event_timer_block_id = (uint32_t)s-capability;

  /* Derive HPET_MSI_SUPPORT from the capability of the first timer. */
  s-flags= ~(1  HPET_MSI_SUPPORT);
@@ -661,6 +665,8 @@ static void hpet_reset(DeviceState *d)
   */
  hpet_pit_enable();
  }
+hpet_cfg.hpet[s-hpet_id].event_timer_block_id = (uint32_t)s-capability;
+hpet_cfg.hpet[s-hpet_id].address = sysbus_from_qdev(d)-mmio[0].addr;
  count = 1;
  }

@@ -680,6 +686,16 @@ static int hpet_init(SysBusDevice *dev)
  int i, iomemtype;
  HPETTimer *timer;

+if (hpet_cfg.count == ~0) /* first instance */
+hpet_cfg.count = 0;
+
+if (hpet_cfg.count == 8) {
+fprintf(stderr, Only 8 instances of HPET is allowed\n);
+return -1;
+}
+
+s-hpet_id = hpet_cfg.count++;
+
  for (i = 0; i  HPET_NUM_IRQ_ROUTES; i++) {
  sysbus_init_irq(dev,s-irqs[i]);
  }
diff --git a/hw/hpet_emul.h b/hw/hpet_emul.h
index d7bc102..8bf312a 100644
--- a/hw/hpet_emul.h
+++ b/hw/hpet_emul.h
@@ -53,4 +53,19 @@
  #define HPET_TN_INT_ROUTE_CAP_SHIFT 32
  #define HPET_TN_CFG_BITS_READONLY_OR_RESERVED 0x80b1U

+struct hpet_fw_entry
+{
+uint32_t event_timer_block_id;
+uint64_t address;
+uint16_t min_tick;
+uint8_t page_prot;
+} __attribute__ ((packed));
+
+struct hpet_fw_config
+{
+uint8_t count;
+struct hpet_fw_entry hpet[8];
+} __attribute__ ((packed));
+
+extern struct hpet_fw_config hpet_cfg;
  #endif
diff --git a/hw/pc.c b/hw/pc.c
index 1491129..d14d657 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -61,6 +61,7 @@
  #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1)
  #define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2)
  #define FW_CFG_E820_TABLE (FW_CFG_ARCH_LOCAL + 3)
+#define FW_CFG_HPET (FW_CFG_ARCH_LOCAL + 4)

  #define E820_NR_ENTRIES   16

@@ -484,6 +485,8 @@ static void *bochs_bios_init(void)
  fw_cfg_add_bytes(fw_cfg, FW_CFG_E820_TABLE, (uint8_t *)e820_table,
   sizeof(struct e820_table));

+fw_cfg_add_bytes(fw_cfg, FW_CFG_HPET, (uint8_t *)hpet_cfg,
+ sizeof(struct hpet_fw_config));
  /* allocate memory for the NUMA channel: one (64bit) word for the number
   * of nodes, one word for each VCPU-node and one word for each node to
   * hold the amount of memory.
--
Gleb.

Re: [Qemu-devel] [PATCH v2] add pflib: PixelFormat conversion library.

2010-06-14 Thread Anthony Liguori


On 06/14/2010 10:46 AM, Gerd Hoffmann wrote:

Signed-off-by: Gerd Hoffmannkra...@redhat.com
---
  Makefile.objs |1 +
  pflib.c   |  213 +
  pflib.h   |6 ++
  3 files changed, 220 insertions(+), 0 deletions(-)
  create mode 100644 pflib.c
  create mode 100644 pflib.h

diff --git a/Makefile.objs b/Makefile.objs
index 2dad0f9..6201675 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -83,6 +83,7 @@ common-obj-y += qemu-char.o savevm.o #aio.o
  common-obj-y += msmouse.o ps2.o
  common-obj-y += qdev.o qdev-properties.o
  common-obj-y += block-migration.o
+common-obj-y += pflib.o

  common-obj-$(CONFIG_BRLAPI) += baum.o
  common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
diff --git a/pflib.c b/pflib.c
new file mode 100644
index 000..1154d0c
--- /dev/null
+++ b/pflib.c
@@ -0,0 +1,213 @@
+/*
+ * PixelFormat conversion library.
+ *
+ * Author: Gerd Hoffmannkra...@redhat.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+#include qemu-common.h
+#include console.h
+#include pflib.h
+
+typedef struct QemuPixel QemuPixel;
+
+typedef void (*pf_convert)(QemuPfConv *conv,
+   void *dst, void *src, uint32_t cnt);
+typedef void (*pf_convert_from)(PixelFormat *pf,
+QemuPixel *dst, void *src, uint32_t cnt);
+typedef void (*pf_convert_to)(PixelFormat *pf,
+  void *dst, QemuPixel *src, uint32_t cnt);
+
+struct QemuPfConv {
+pf_convertconvert;
+PixelFormat   src;
+PixelFormat   dst;
+
+/* for copy_generic() */
+pf_convert_from   conv_from;
+pf_convert_to conv_to;
+QemuPixel *conv_buf;
+uint32_t  conv_cnt;
+};
+
+struct QemuPixel {
+uint8_t red;
+uint8_t green;
+uint8_t blue;
+uint8_t alpha;
+};
+
+/* --- */
+/* PixelFormat -  QemuPixel conversions*/
+
+static void conv_16_to_pixel(PixelFormat *pf,
+ QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint16_t *src16 = src;
+
+while (cnt  0) {
+dst-red   = ((*src16  pf-rmask)  pf-rshift)  (8 - pf-rbits);
+dst-green = ((*src16  pf-gmask)  pf-gshift)  (8 - pf-gbits);
+dst-blue  = ((*src16  pf-bmask)  pf-bshift)  (8 - pf-bbits);
+dst-alpha = ((*src16  pf-amask)  pf-ashift)  (8 - pf-abits);
+dst++, src16++, cnt--;
+}
+}
+
+/* assumes pf-{r,g,b,a}bits == 8 */
+static void conv_32_to_pixel_fast(PixelFormat *pf,
+  QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint32_t *src32 = src;
+
+while (cnt  0) {
+dst-red   = (*src32  pf-rmask)  pf-rshift;
+dst-green = (*src32  pf-gmask)  pf-gshift;
+dst-blue  = (*src32  pf-bmask)  pf-bshift;
+dst-alpha = (*src32  pf-amask)  pf-ashift;
+dst++, src32++, cnt--;
+}
+}
+
+static void conv_32_to_pixel_generic(PixelFormat *pf,
+ QemuPixel *dst, void *src, uint32_t cnt)
+{
+uint32_t *src32 = src;
+
+while (cnt  0) {
+if (pf-rbits  8) {
+dst-red   = ((*src32  pf-rmask)  pf-rshift)  (8 - 
pf-rbits);
+} else {
+dst-red   = ((*src32  pf-rmask)  pf-rshift)  (pf-rbits - 
8);
+}
+if (pf-gbits  8) {
+dst-green = ((*src32  pf-gmask)  pf-gshift)  (8 - 
pf-gbits);
+} else {
+dst-green = ((*src32  pf-gmask)  pf-gshift)  (pf-gbits - 
8);
+}
+if (pf-bbits  8) {
+dst-blue  = ((*src32  pf-bmask)  pf-bshift)  (8 - 
pf-bbits);
+} else {
+dst-blue  = ((*src32  pf-bmask)  pf-bshift)  (pf-bbits - 
8);
+}
+if (pf-abits  8) {
+dst-alpha = ((*src32  pf-amask)  pf-ashift)  (8 - 
pf-abits);
+} else {
+dst-alpha = ((*src32  pf-amask)  pf-ashift)  (pf-abits - 
8);
+}
+dst++, src32++, cnt--;
+}
+}
+
+/* --- */
+/* QemuPixel -  PixelFormat conversions*/
+
+static void conv_pixel_to_16(PixelFormat *pf,
+ void *dst, QemuPixel *src, uint32_t cnt)
+{
+uint16_t *dst16 = dst;
+
+while (cnt  0) {
+*dst16  = ((uint16_t)src-red  (8 - pf-rbits))  pf-rshift;
+*dst16 |= ((uint16_t)src-green  (8 - pf-gbits))  pf-gshift;
+*dst16 |= ((uint16_t)src-blue  (8 - pf-bbits))  pf-bshift;
+*dst16 |= ((uint16_t)src-alpha  (8 - pf-abits))  pf-ashift;
+dst16++, src++, cnt--;
+}
+}
+
+static void conv_pixel_to_32(PixelFormat *pf,
+ void *dst, QemuPixel *src, uint32_t cnt)
+{
+uint32_t *dst32 = dst;
+
+while (cnt  0) {
+*dst32  =

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Alex Williamson

On Mon, 2010-06-14 at 18:00 +0200, Jan Kiszka wrote:
 Alex Williamson wrote:
  On Mon, 2010-06-14 at 14:09 +0100, Paul Brook wrote:
  /main-system-bus/pci.0,addr=09.0/virtio-blk-pci
  There's a device missing between the main system bus and the pci bus.  
  Should 
  be something like:
 
  /main-system-bus/piix4-pcihost/pci.0/_09.0
  
  Ok, I can easily come up with:
  
  /System/main-system-bus/i440FX-pcihost/PCI/pci.0/_09.0/virtio-blk-pci/virtio-blk
 
 First two elements are redundant, '/' already stands for the main system
 bus.

Ok, we can start printing after the system bus.

  Then I don't get what last element expresses (the target device is
 virtio-blk-pci). Is this due to some vmstate layout? But that should not
 be part into qdev paths (maybe I'm missing your use case).

Sorry, I wasn't clear about that.  My printf is in the savevm code,
after it's already appended the idstr passed in from the device.  The
device path in the example above ends at virtio-blk-pci.  That's the
dev-info-name of the device passed into this function.

 And instead of introducing another hierarchy level with the bus address,
 I would also prefer to add this as prefix or suffix to the device name,
 e.g. driver.busaddr.

That's what I had started with.  The first post in this thread has
pci.0,addr=09.0 as a single hierarchy level.  The addr= may be
unnecessary there, but I also prefer something along those lines.  For
PCI it'd make sense to have name:addr, which comes out to
pci.0:09.0.  (Maybe rather than flagging properties as being relevant
to the path and printing them generically, we should extract specific
properties based on the bus type.)

  For busses that don't have a consistent addressing scheme then some sort 
  of 
  instance ID is unavoidable. I guess it may be possible to invent something 
  based on other device properties (e.g. address of the first IO port/memory 
  region).
  
  Instance IDs aren't always bad, we just run into trouble with hotplug
  and maybe creating unique ramblock names.  But, such busses typically
  don't support hotplug and don't have multiple instances of the same
  device on the bus, so I don't expect us to hit many issues there as long
  as we can get a stable address path.  Thanks,
  
 
 If stable instance numbers are required, we could simply keep them in
 DeviceState and search for the smallest free one on additions. Actually,
 I'm more in favor of this direction than including the bus address. That
 way we could keep a canonical format across all buses and do not have to
 provide possibly complex ID generation rules for each of them.

I started down that path, but it still breaks for hotplug.  If we start
a VM with two e1000 NICs, then remove the first, we can no longer
migrate because the target can't represent having a single e1000 with a
non-zero instance ID.

 BTW, the problem isn't truly solved by printing paths. We also need to
 parse them. It would be counterproductive if such paths ever see the
 light of day (ie. leak outside vmstate) and a user tries to hack it
 into the monitor or pass it on the command line. With my series, I'm
 currently able to process paths like this one:
 
 /i440FX-pcihost.0/pci.0/PIIX4_PM.0/i2c/smbus-eeprom.6

Yes, these are only intended for internal use, but we should get them to
sync up as much as possible.  Thanks,

Alex

[Qemu-devel] [PATCH] lsi53c895a: fix Phase Mismatch Jump

2010-06-14 Thread Paolo Bonzini

lsi_bad_phase has a bug in the choice of pmjad1/pmjad2.  This does
not matter with Linux guests because it uses just one routine for
both, but it breaks Windows 64-bit guests.  This is the text
from the spec:

   [The PMJCTL] bit controls which decision mechanism is used
   when jumping on phase mismatch. When this bit is cleared the
   LSI53C895A will use Phase Mismatch Jump Address 1 (PMJAD1) when
   the WSR bit is cleared and Phase Mismatch Jump Address 2 (PMJAD2)
   when the WSR bit is set.  When this bit is set the LSI53C895A will
   use jump address one (PMJAD1) on data out (data out, command,
   message out) transfers and jump address two (PMJAD2) on data in
   (data in, status, message in) transfers.

Which means:

CCNTL0.PMJCTL
0  SCNTL2.WSR = 0 PMJAD1
0  SCNTL2.WSR = 1 PMJAD2
1out  PMJAD1
1in   PMJAD2

In qemu, what you get instead is:

CCNTL0.PMJCTL
0out  PMJAD1
0in   PMJAD2
1out  PMJAD1
1in   PMJAD1

Considering that qemu always has SCNTL2.WSR cleared, the two marked cases
(corresponding to phase mismatch on input) are always jumping to the
wrong PMJAD register.  The patch implements the correct semantics.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/lsi53c895a.c |   12 +---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index f5a91ba..00df2bd 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -490,11 +490,14 @@ static void lsi_bad_phase(LSIState *s, int out, int 
new_phase)
 {
 /* Trigger a phase mismatch.  */
 if (s-ccntl0  LSI_CCNTL0_ENPMJ) {
-if ((s-ccntl0  LSI_CCNTL0_PMJCTL) || out) {
-s-dsp = s-pmjad1;
+int dest;
+if ((s-ccntl0  LSI_CCNTL0_PMJCTL)) {
+dest = out ? 1 : 2;
 } else {
-s-dsp = s-pmjad2;
+dest = (s-scntl2  LSI_SCNTL2_WSR ? 2 : 1);
 }
+
+s-dsp = (dest == 1) ? s-pmjad1 : s-pmjad2;
 DPRINTF(Data phase mismatch jump to %08x\n, s-dsp);
 } else {
 DPRINTF(Phase mismatch interrupt\n);
-- 
1.7.0.1

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Jan Kiszka

Alex Williamson wrote:
 On Mon, 2010-06-14 at 18:00 +0200, Jan Kiszka wrote:
 Alex Williamson wrote:
 On Mon, 2010-06-14 at 14:09 +0100, Paul Brook wrote:
 /main-system-bus/pci.0,addr=09.0/virtio-blk-pci
 There's a device missing between the main system bus and the pci bus.  
 Should 
 be something like:

 /main-system-bus/piix4-pcihost/pci.0/_09.0
 Ok, I can easily come up with:

 /System/main-system-bus/i440FX-pcihost/PCI/pci.0/_09.0/virtio-blk-pci/virtio-blk
 First two elements are redundant, '/' already stands for the main system
 bus.
 
 Ok, we can start printing after the system bus.
 
  Then I don't get what last element expresses (the target device is
 virtio-blk-pci). Is this due to some vmstate layout? But that should not
 be part into qdev paths (maybe I'm missing your use case).
 
 Sorry, I wasn't clear about that.  My printf is in the savevm code,
 after it's already appended the idstr passed in from the device.  The
 device path in the example above ends at virtio-blk-pci.  That's the
 dev-info-name of the device passed into this function.
 
 And instead of introducing another hierarchy level with the bus address,
 I would also prefer to add this as prefix or suffix to the device name,
 e.g. driver.busaddr.
 
 That's what I had started with.  The first post in this thread has
 pci.0,addr=09.0 as a single hierarchy level.  The addr= may be
 unnecessary there, but I also prefer something along those lines.  For
 PCI it'd make sense to have name:addr, which comes out to
 pci.0:09.0.  (Maybe rather than flagging properties as being relevant
 to the path and printing them generically, we should extract specific
 properties based on the bus type.)

Not bus.addr, driver.addr. We only have one PCI bus here, not as many as
there are slots on that bus.

 
 For busses that don't have a consistent addressing scheme then some sort 
 of 
 instance ID is unavoidable. I guess it may be possible to invent something 
 based on other device properties (e.g. address of the first IO port/memory 
 region).
 Instance IDs aren't always bad, we just run into trouble with hotplug
 and maybe creating unique ramblock names.  But, such busses typically
 don't support hotplug and don't have multiple instances of the same
 device on the bus, so I don't expect us to hit many issues there as long
 as we can get a stable address path.  Thanks,

 If stable instance numbers are required, we could simply keep them in
 DeviceState and search for the smallest free one on additions. Actually,
 I'm more in favor of this direction than including the bus address. That
 way we could keep a canonical format across all buses and do not have to
 provide possibly complex ID generation rules for each of them.
 
 I started down that path, but it still breaks for hotplug.  If we start
 a VM with two e1000 NICs, then remove the first, we can no longer
 migrate because the target can't represent having a single e1000 with a
 non-zero instance ID.

That's indeed a good point.

Still, I'm worried about having to define numbering schemes for all the
buses qemu supports. Maybe we can run a mixture: address-based for
hotplug-capably buses (they tend to be cooperative in this regard) and
simple instance numbers for the rest, likely the majority.

 
 BTW, the problem isn't truly solved by printing paths. We also need to
 parse them. It would be counterproductive if such paths ever see the
 light of day (ie. leak outside vmstate) and a user tries to hack it
 into the monitor or pass it on the command line. With my series, I'm
 currently able to process paths like this one:

 /i440FX-pcihost.0/pci.0/PIIX4_PM.0/i2c/smbus-eeprom.6
 
 Yes, these are only intended for internal use, but we should get them to
 sync up as much as possible.  Thanks,

Unless there is a good reason, the match should be 100%.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 0/5] Add '-device help' output for device params and help text

2010-06-14 Thread Anthony Liguori


On 06/08/2010 12:21 AM, Amit Shah wrote:

On (Mon) Jun 07 2010 [11:09:32], Anthony Liguori wrote:
   

On 05/31/2010 07:41 AM, Amit Shah wrote:
 

Hello,

This patch series adds support to specify some descriptive help text
to qdev device parameters. This series adds some help text to the
virtserialport and net family of devices as an example, and the new
output is shown in the respective commits.

This series also adds a new '-device help' option that lists all the
available qdev devices (which is avl. via -device ? now), and adds
each device's parameters to the output listing. This output also shows
the descriptive text.

The idea is to auto-generate documentation from code and to populate
some wiki / qemu-doc.texi using this new target.
   

I really dislike having options print their own help.
 

Why? I really like it, coders can embed help exactly in the same place
they're adding / changing options and there's much less chance of
someone missing updating help strings when updating / adding options.
   


Because it's inconsistent from a UI perspective.  -device help, -M ?, etc.

How's a user supposed to know which options can display help and what 
the magic invocation is that is used to display it?


-help device/-help machine

Provides a consistent, self discoverable interface for users to get 
detailed inline help.



Maybe we can introduce a proper -help option that takes an argument
that can display subsystem specific help?

For instance:

qemu -help device

Would display the help output in this series.
 

What I'm adding here is similar; instead of -help device, I'm adding
-device help, and I think it's nicer because it fits directly in the
qdev code.
   


The UI should not be a consequence of the implementation.


My other concern is that we now have a big mess of properties that
don't have help text.  What are the chances that anyone is going to
go through and do this?

I'd rather we bite the bullet and add help everywhere before merging
any of this because experience has shown that existing code usually
never gets converted if not converted all at once.
 

Let me put my community contributor hat on: we're not really giving away
vibes that qemu has to be 100% enterprise-ready at each commit, are we?
qemu is a *development* project, and development happens one small
commit at a time, commit early and often. If I'm going to have to
maintiain hundreds of small help-related patches, it's soon going to
grow stale as people change related code and in the end I'll grow
frustrated and drop then entire exercise. Not the first time that
would've happened.
   


The problem with this argument is that people introduce things like this 
with this exact argument and then nothing every ends up getting help 
options.



On the other hand, we commit this right away, and interested developers
in their own domains start contributing the small one-liners which
brings their subsystem up to the mark for documentation.
   


It simply doesn't happen or else it already would have.  The best way to 
make this is a one time painful conversion followed by strong 
enforcement that all new options take help strings.


Regards,

Anthony Liguori


I think this is the best way to contribute such patches.

If you think some devices are going to go documentation-less, instead of
empty strings, I can do description needed strings, which can even
invite first-time contributors grep through this and contribute the
one-liners.

If the enterprise people want this fixed before any enterprise release,
they better commit resources to ensure they don't release anything that
has a 'documentation needed!' line. :-)

Amit

Re: [Qemu-devel] [PATCH] monitor: Add force option support to pci_del command

2010-06-14 Thread Anthony Liguori


On 06/09/2010 09:27 AM, Gerd Hoffmann wrote:

  Hi,


This make sense when you mistakenly add a pci device on a -s -S
scenario, like the scenario described on the following bug:
https://bugs.launchpad.net/qemu/+bug/544367.


It doesn't IMHO.


When ACPI-based hotplug support is present on the guest and we run
pci_del with the force option, the hotplug events will still be
generated to the guest and the guest still will trigger the EJx event,
which will end by calling pciej_write() on qemu side. This function will
do nothing on a -f and pci hotplug support scenario, as the pci device
was previously removed by pci_del.


And in case the guest wants to do anything (like flushing dirty 
buffers) before triggering the EJx event it will horribly fail.


If the guest is stopped while unplugging the device the unplug should 
happen as soon as the guest is unpaused.


This is a case where the fundamental problem is that the pci_del command 
should block until the guest has actually responded to the request.


pci_del returning with no error and yet not having the operation 
complete is certainly a usability issue.


Regards,

Anthony Liguori


cheers,
  Gerd

Re: [Qemu-devel] [PATCH] Add optional dump of default config file paths..

2010-06-14 Thread Anthony Liguori


On 06/09/2010 03:05 AM, john cooper wrote:

This patch adds the ability to determine the build-configured
runtime config file paths from the command line.  After
support for cpu model definitions were added to the default
runtime target- config file, testing of this feature has
tripped over an unintentionally mis-installed config file
enough to indicate some help is needed resolving such issues.

As no general verbose flag is currently available, specifying
-readconfig ? on the command line will maintain the default
(config file) disposition but additionally emit diagnostic info.
This mode is optional, otherwise the existing startup behavior
is identical.
   


I assume this is something requested by libvirt?  I'd prefer we support 
this via Daniel's capabilities patchset instead of adding yet another 
hidden help output and having libvirt parse that output.


Regards,

Anthony Liguori


Signed-off-by: john cooperjohn.coo...@redhat.com
---

diff --git a/qemu-config.c b/qemu-config.c
index 5a4e61b..a490603 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -518,21 +518,29 @@ out:
  return res;
  }

-int qemu_read_config_file(const char *filename)
+/* attempt to open and parse config file, report problems if vflag
+ */
+int qemu_read_config_file(const char *filename, int vflag)
  {
  FILE *f = fopen(filename, r);
-int ret;
+int rv = 0;

  if (f == NULL) {
-return -errno;
+rv = -errno;
  }
-
-ret = qemu_config_parse(f, vm_config_groups, filename);
-fclose(f);
-
-if (ret == 0) {
-return 0;
-} else {
-return -EINVAL;
+else if (qemu_config_parse(f, vm_config_groups, filename) != 0) {
+rv = -EINVAL;
+}
+else if (vflag) {
+fprintf(stderr, read config file %s\n, filename);
  }
+if (f) {
+fclose(f);
+}
+if (rv  vflag) {
+fprintf(stderr, can't read config file %s: %s\n,
+filename, strerror(-rv));
+}
+return rv;
  }
+
diff --git a/qemu-config.h b/qemu-config.h
index dca69d4..2e15556 100644
--- a/qemu-config.h
+++ b/qemu-config.h
@@ -23,6 +23,6 @@ void qemu_add_globals(void);
  void qemu_config_write(FILE *fp);
  int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char *fname);

-int qemu_read_config_file(const char *filename);
+int qemu_read_config_file(const char *filename, int vflag);

  #endif /* QEMU_CONFIG_H */
diff --git a/vl.c b/vl.c
index 7121cd0..23c7276 100644
--- a/vl.c
+++ b/vl.c
@@ -2582,6 +2582,7 @@ int main(int argc, char **argv, char **envp)
  #endif
  int show_vnc_port = 0;
  int defconfig = 1;
+int defconfig_verbose = 0; 

  error_set_progname(argv[0]);

@@ -2657,6 +2658,10 @@ int main(int argc, char **argv, char **envp)
  case QEMU_OPTION_nodefconfig:
  defconfig=0;
  break;
+case QEMU_OPTION_readconfig:
+if (!strcmp(optarg, ?))
+defconfig_verbose = 1;
+break;
  }
  }
  }
@@ -2664,12 +2669,14 @@ int main(int argc, char **argv, char **envp)
  if (defconfig) {
  int ret;

-ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR /qemu.conf);
+ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR /qemu.conf,
+defconfig_verbose);
  if (ret  0  ret != -ENOENT) {
  exit(1);
  }

-ret = qemu_read_config_file(arch_config_name);
+ret = qemu_read_config_file(arch_config_name,
+defconfig_verbose);
  if (ret  0  ret != -ENOENT) {
  exit(1);
  }
@@ -3386,15 +3393,9 @@ int main(int argc, char **argv, char **envp)
  xen_mode = XEN_ATTACH;
  break;
  case QEMU_OPTION_readconfig:
-{
-int ret = qemu_read_config_file(optarg);
-if (ret  0) {
-fprintf(stderr, read config %s: %s\n, optarg,
-strerror(-ret));
-exit(1);
-}
-break;
-}
+if (!defconfig_verbose  qemu_read_config_file(optarg, 1)  0)
+exit(1);
+break;
  case QEMU_OPTION_writeconfig:
  {
  FILE *fp;

[Qemu-devel] Re: [PATCH] lsi53c895a: fix Phase Mismatch Jump

2010-06-14 Thread Jan Kiszka

Paolo Bonzini wrote:
 lsi_bad_phase has a bug in the choice of pmjad1/pmjad2.  This does
 not matter with Linux guests because it uses just one routine for
 both, but it breaks Windows 64-bit guests.  This is the text
 from the spec:
 
[The PMJCTL] bit controls which decision mechanism is used
when jumping on phase mismatch. When this bit is cleared the
LSI53C895A will use Phase Mismatch Jump Address 1 (PMJAD1) when
the WSR bit is cleared and Phase Mismatch Jump Address 2 (PMJAD2)
when the WSR bit is set.  When this bit is set the LSI53C895A will
use jump address one (PMJAD1) on data out (data out, command,
message out) transfers and jump address two (PMJAD2) on data in
(data in, status, message in) transfers.
 
 Which means:
 
 CCNTL0.PMJCTL
 0  SCNTL2.WSR = 0 PMJAD1
 0  SCNTL2.WSR = 1 PMJAD2
 1out  PMJAD1
 1in   PMJAD2
 
 In qemu, what you get instead is:
 
 CCNTL0.PMJCTL
 0out  PMJAD1
 0in   PMJAD2
 1out  PMJAD1
 1in   PMJAD1
 
 Considering that qemu always has SCNTL2.WSR cleared, the two marked cases
 (corresponding to phase mismatch on input) are always jumping to the
 wrong PMJAD register.  The patch implements the correct semantics.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/lsi53c895a.c |   12 +---
  1 files changed, 9 insertions(+), 3 deletions(-)
 
 diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
 index f5a91ba..00df2bd 100644
 --- a/hw/lsi53c895a.c
 +++ b/hw/lsi53c895a.c
 @@ -490,11 +490,14 @@ static void lsi_bad_phase(LSIState *s, int out, int 
 new_phase)
  {
  /* Trigger a phase mismatch.  */
  if (s-ccntl0  LSI_CCNTL0_ENPMJ) {
 -if ((s-ccntl0  LSI_CCNTL0_PMJCTL) || out) {
 -s-dsp = s-pmjad1;
 +int dest;
 +if ((s-ccntl0  LSI_CCNTL0_PMJCTL)) {
 +dest = out ? 1 : 2;
  } else {
 -s-dsp = s-pmjad2;
 +dest = (s-scntl2  LSI_SCNTL2_WSR ? 2 : 1);
  }
 +
 +s-dsp = (dest == 1) ? s-pmjad1 : s-pmjad2;
  DPRINTF(Data phase mismatch jump to %08x\n, s-dsp);
  } else {
  DPRINTF(Phase mismatch interrupt\n);

Looks correct. But why not assigning s-pmjad[12] directly? Would
improve readability IMO.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH] Add optional dump of default config file paths..

2010-06-14 Thread Daniel P. Berrange

On Mon, Jun 14, 2010 at 12:01:42PM -0500, Anthony Liguori wrote:
 On 06/09/2010 03:05 AM, john cooper wrote:
 This patch adds the ability to determine the build-configured
 runtime config file paths from the command line.  After
 support for cpu model definitions were added to the default
 runtime target- config file, testing of this feature has
 tripped over an unintentionally mis-installed config file
 enough to indicate some help is needed resolving such issues.
 
 As no general verbose flag is currently available, specifying
 -readconfig ? on the command line will maintain the default
 (config file) disposition but additionally emit diagnostic info.
 This mode is optional, otherwise the existing startup behavior
 is identical.

 
 I assume this is something requested by libvirt?  I'd prefer we support 
 this via Daniel's capabilities patchset instead of adding yet another 
 hidden help output and having libvirt parse that output.

No, libvirt has no need for this patch. We now forcably disable
all default configs with -nodefconfig.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Re: [Qemu-devel] [PATCH] Fix and simplify gui timer logic.

2010-06-14 Thread Anthony Liguori


On 06/08/2010 08:18 AM, Gerd Hoffmann wrote:

On 06/08/10 13:50, Paul Brook wrote:

Kill nographic timer.  Have a global gui_timer instead.  Have the gui
timer enabled unconditionally.  We need a timer running anyway for 
mmio

flush, so the whole have-gui-timer-only-when-needed logic is pretty
pointless.  It also simplifies displaylisteners coming and going at
runtime, we don't need to care about the timer then as it runs 
anyway.


Linking mmio flushes to the gui is completely bogus.  The fact that 
we're
doing arbitrary periodic mmio flushes suggests something else is 
horribly

broken.


Was added by commit

http://git.qemu.org/qemu.git/commit/?id=62a2744ca09a0b44b8406ea0c430c4c67a2 


c3231

Patch description makes sense to me.  Of course we could have a 
separate

timer for the mmio flushes instead of re-using the gui timer.  But we
would have more wakeups then ...


This suggests we are incorrectly coalescing writes, and we should 
actually be
notifying qemu when (at least) he first write occurs. If we aren't 
outputting
anything we don't want to be waking up periodically just to flush an 
empty

MMIO buffer.


That is completely unrelated to this patch though.  The patch doesn't 
change mmio flush behaviour at all.  And the periodic wakeup was there 
even before the mmio flush patch was added.  Even without gui, 
although I can't see a obvious reason for it ...


Agreed.  Regardless of the periodic mmio flush, having a separate 
nographic timer isn't terribly helpful IMHO.


Regards,

Anthony Liguori


cheers,
  Gerd

Re: [Qemu-devel] Re: [PATCH] lsi53c895a: fix Phase Mismatch Jump

2010-06-14 Thread Michal Novotny


On 06/14/2010 07:05 PM, Jan Kiszka wrote:

Paolo Bonzini wrote:
   

lsi_bad_phase has a bug in the choice of pmjad1/pmjad2.  This does
not matter with Linux guests because it uses just one routine for
both, but it breaks Windows 64-bit guests.  This is the text
from the spec:

[The PMJCTL] bit controls which decision mechanism is used
when jumping on phase mismatch. When this bit is cleared the
LSI53C895A will use Phase Mismatch Jump Address 1 (PMJAD1) when
the WSR bit is cleared and Phase Mismatch Jump Address 2 (PMJAD2)
when the WSR bit is set.  When this bit is set the LSI53C895A will
use jump address one (PMJAD1) on data out (data out, command,
message out) transfers and jump address two (PMJAD2) on data in
(data in, status, message in) transfers.

Which means:

 CCNTL0.PMJCTL
 0  SCNTL2.WSR = 0 PMJAD1
 0  SCNTL2.WSR = 1 PMJAD2
 1out  PMJAD1
 1in   PMJAD2

In qemu, what you get instead is:

 CCNTL0.PMJCTL
 0out  PMJAD1
 0in   PMJAD2
 1out  PMJAD1
 1in   PMJAD1

Considering that qemu always has SCNTL2.WSR cleared, the two marked cases
(corresponding to phase mismatch on input) are always jumping to the
wrong PMJAD register.  The patch implements the correct semantics.

Signed-off-by: Paolo Bonzinipbonz...@redhat.com
---
  hw/lsi53c895a.c |   12 +---
  1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index f5a91ba..00df2bd 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -490,11 +490,14 @@ static void lsi_bad_phase(LSIState *s, int out, int 
new_phase)
  {
  /* Trigger a phase mismatch.  */
  if (s-ccntl0  LSI_CCNTL0_ENPMJ) {
-if ((s-ccntl0  LSI_CCNTL0_PMJCTL) || out) {
-s-dsp = s-pmjad1;
+int dest;
+if ((s-ccntl0  LSI_CCNTL0_PMJCTL)) {
+dest = out ? 1 : 2;
  } else {
-s-dsp = s-pmjad2;
+dest = (s-scntl2  LSI_SCNTL2_WSR ? 2 : 1);
  }
+
+s-dsp = (dest == 1) ? s-pmjad1 : s-pmjad2;
  DPRINTF(Data phase mismatch jump to %08x\n, s-dsp);
  } else {
  DPRINTF(Phase mismatch interrupt\n);
 

Looks correct. But why not assigning s-pmjad[12] directly? Would
improve readability IMO.

Jan

   

Jan,
I think this is better since if something goes wrong it could be easier 
to just put dest variable to DPRINTF() macro, like:


DPRINTF(Data phase mismatch jump to %08x (== pmjad%d)\n, s-dsp, dest);

rather than implementing it some other way. Now it could be easier to 
just know what the problem is - i.e. whether it's accessing the wrong 
register or now.


Michal


--
Michal Novotnyminov...@redhat.com, RHCE
Virtualization Team (xen userspace), Red Hat

[Qemu-devel] [PATCH v2] lsi53c895a: fix Phase Mismatch Jump

2010-06-14 Thread Paolo Bonzini

lsi_bad_phase has a bug in the choice of pmjad1/pmjad2.  This does
not matter with Linux guests because it uses just one routine for
both, but it breaks Windows 64-bit guests.  This is the text
from the spec:

   [The PMJCTL] bit controls which decision mechanism is used
   when jumping on phase mismatch. When this bit is cleared the
   LSI53C895A will use Phase Mismatch Jump Address 1 (PMJAD1) when
   the WSR bit is cleared and Phase Mismatch Jump Address 2 (PMJAD2)
   when the WSR bit is set.  When this bit is set the LSI53C895A will
   use jump address one (PMJAD1) on data out (data out, command,
   message out) transfers and jump address two (PMJAD2) on data in
   (data in, status, message in) transfers.

Which means:

CCNTL0.PMJCTL
0  SCNTL2.WSR = 0 PMJAD1
0  SCNTL2.WSR = 1 PMJAD2
1out  PMJAD1
1in   PMJAD2

In qemu, what you get instead is:

CCNTL0.PMJCTL
0out  PMJAD1
0in   PMJAD2
1out  PMJAD1
1in   PMJAD1

Considering that qemu always has SCNTL2.WSR cleared, the two marked cases
(corresponding to phase mismatch on input) are always jumping to the
wrong PMJAD register.  The patch implements the correct semantics.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 Looks correct. But why not assigning s-pmjad[12] directly? Would
 improve readability IMO.

No particular reason, hence fine by me.

 hw/lsi53c895a.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index f5a91ba..9a37fed 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -490,10 +490,10 @@ static void lsi_bad_phase(LSIState *s, int out, int 
new_phase)
 {
 /* Trigger a phase mismatch.  */
 if (s-ccntl0  LSI_CCNTL0_ENPMJ) {
-if ((s-ccntl0  LSI_CCNTL0_PMJCTL) || out) {
-s-dsp = s-pmjad1;
+if ((s-ccntl0  LSI_CCNTL0_PMJCTL)) {
+s-dsp = out ? s-pmjad1 : s-pmjad2;
 } else {
-s-dsp = s-pmjad2;
+s-dsp = (s-scntl2  LSI_SCNTL2_WSR ? s-pmjad2 : s-pmjad1);
 }
 DPRINTF(Data phase mismatch jump to %08x\n, s-dsp);
 } else {
-- 
1.7.0.1

Re: [Qemu-devel] Re: [PATCH] lsi53c895a: fix Phase Mismatch Jump

2010-06-14 Thread Jan Kiszka

Michal Novotny wrote:
 On 06/14/2010 07:05 PM, Jan Kiszka wrote:
 Paolo Bonzini wrote:

 lsi_bad_phase has a bug in the choice of pmjad1/pmjad2.  This does
 not matter with Linux guests because it uses just one routine for
 both, but it breaks Windows 64-bit guests.  This is the text
 from the spec:

 [The PMJCTL] bit controls which decision mechanism is used
 when jumping on phase mismatch. When this bit is cleared the
 LSI53C895A will use Phase Mismatch Jump Address 1 (PMJAD1) when
 the WSR bit is cleared and Phase Mismatch Jump Address 2 (PMJAD2)
 when the WSR bit is set.  When this bit is set the LSI53C895A will
 use jump address one (PMJAD1) on data out (data out, command,
 message out) transfers and jump address two (PMJAD2) on data in
 (data in, status, message in) transfers.

 Which means:

  CCNTL0.PMJCTL
  0  SCNTL2.WSR = 0 PMJAD1
  0  SCNTL2.WSR = 1 PMJAD2
  1out  PMJAD1
  1in   PMJAD2

 In qemu, what you get instead is:

  CCNTL0.PMJCTL
  0out  PMJAD1
  0in   PMJAD2
  1out  PMJAD1
  1in   PMJAD1

 Considering that qemu always has SCNTL2.WSR cleared, the two marked cases
 (corresponding to phase mismatch on input) are always jumping to the
 wrong PMJAD register.  The patch implements the correct semantics.

 Signed-off-by: Paolo Bonzinipbonz...@redhat.com
 ---
   hw/lsi53c895a.c |   12 +---
   1 files changed, 9 insertions(+), 3 deletions(-)

 diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
 index f5a91ba..00df2bd 100644
 --- a/hw/lsi53c895a.c
 +++ b/hw/lsi53c895a.c
 @@ -490,11 +490,14 @@ static void lsi_bad_phase(LSIState *s, int out, int 
 new_phase)
   {
   /* Trigger a phase mismatch.  */
   if (s-ccntl0  LSI_CCNTL0_ENPMJ) {
 -if ((s-ccntl0  LSI_CCNTL0_PMJCTL) || out) {
 -s-dsp = s-pmjad1;
 +int dest;
 +if ((s-ccntl0  LSI_CCNTL0_PMJCTL)) {
 +dest = out ? 1 : 2;
   } else {
 -s-dsp = s-pmjad2;
 +dest = (s-scntl2  LSI_SCNTL2_WSR ? 2 : 1);
   }
 +
 +s-dsp = (dest == 1) ? s-pmjad1 : s-pmjad2;
   DPRINTF(Data phase mismatch jump to %08x\n, s-dsp);
   } else {
   DPRINTF(Phase mismatch interrupt\n);
  
 Looks correct. But why not assigning s-pmjad[12] directly? Would
 improve readability IMO.

 Jan


 Jan,
 I think this is better since if something goes wrong it could be easier 
 to just put dest variable to DPRINTF() macro, like:
 
 DPRINTF(Data phase mismatch jump to %08x (== pmjad%d)\n, s-dsp, dest);
 
 rather than implementing it some other way. Now it could be easier to 
 just know what the problem is - i.e. whether it's accessing the wrong 
 register or now.

I don't mind. But if you have a use case for that separate variable,
then include it. No one can read your mind, and even less once this
patch is long merged.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] Re: [PATCH] lsi53c895a: fix Phase Mismatch Jump

2010-06-14 Thread Michal Novotny


On 06/14/2010 07:05 PM, Jan Kiszka wrote:

Paolo Bonzini wrote:
   

lsi_bad_phase has a bug in the choice of pmjad1/pmjad2.  This does
not matter with Linux guests because it uses just one routine for
both, but it breaks Windows 64-bit guests.  This is the text
from the spec:

[The PMJCTL] bit controls which decision mechanism is used
when jumping on phase mismatch. When this bit is cleared the
LSI53C895A will use Phase Mismatch Jump Address 1 (PMJAD1) when
the WSR bit is cleared and Phase Mismatch Jump Address 2 (PMJAD2)
when the WSR bit is set.  When this bit is set the LSI53C895A will
use jump address one (PMJAD1) on data out (data out, command,
message out) transfers and jump address two (PMJAD2) on data in
(data in, status, message in) transfers.

Which means:

 CCNTL0.PMJCTL
 0  SCNTL2.WSR = 0 PMJAD1
 0  SCNTL2.WSR = 1 PMJAD2
 1out  PMJAD1
 1in   PMJAD2

In qemu, what you get instead is:

 CCNTL0.PMJCTL
 0out  PMJAD1
 0in   PMJAD2
 1out  PMJAD1
 1in   PMJAD1

Considering that qemu always has SCNTL2.WSR cleared, the two marked cases
(corresponding to phase mismatch on input) are always jumping to the
wrong PMJAD register.  The patch implements the correct semantics.

Signed-off-by: Paolo Bonzinipbonz...@redhat.com
---
  hw/lsi53c895a.c |   12 +---
  1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index f5a91ba..00df2bd 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -490,11 +490,14 @@ static void lsi_bad_phase(LSIState *s, int out, int 
new_phase)
  {
  /* Trigger a phase mismatch.  */
  if (s-ccntl0  LSI_CCNTL0_ENPMJ) {
-if ((s-ccntl0  LSI_CCNTL0_PMJCTL) || out) {
-s-dsp = s-pmjad1;
+int dest;
+if ((s-ccntl0  LSI_CCNTL0_PMJCTL)) {
+dest = out ? 1 : 2;
  } else {
-s-dsp = s-pmjad2;
+dest = (s-scntl2  LSI_SCNTL2_WSR ? 2 : 1);
  }
+
+s-dsp = (dest == 1) ? s-pmjad1 : s-pmjad2;
  DPRINTF(Data phase mismatch jump to %08x\n, s-dsp);
  } else {
  DPRINTF(Phase mismatch interrupt\n);
 

Looks correct. But why not assigning s-pmjad[12] directly? Would
improve readability IMO.

Jan

   
Jan, I think this is better readable since something goes wrong it could 
be easier to just put dest to DPRINTF() macro, like:


DPRINTF(Data phase mismatch jump to %08x (== pmjad%d)\n, s-dsp, dest);


rather than implementing it some other way.

Michal

--
Michal Novotnyminov...@redhat.com, RHCE
Virtualization Team (xen userspace), Red Hat

Re: [Qemu-devel] Re: [PATCH] lsi53c895a: fix Phase Mismatch Jump

2010-06-14 Thread Michal Novotny


On 06/14/2010 07:31 PM, Jan Kiszka wrote:

Michal Novotny wrote:
   

On 06/14/2010 07:05 PM, Jan Kiszka wrote:
 

Paolo Bonzini wrote:

   

lsi_bad_phase has a bug in the choice of pmjad1/pmjad2.  This does
not matter with Linux guests because it uses just one routine for
both, but it breaks Windows 64-bit guests.  This is the text
from the spec:

 [The PMJCTL] bit controls which decision mechanism is used
 when jumping on phase mismatch. When this bit is cleared the
 LSI53C895A will use Phase Mismatch Jump Address 1 (PMJAD1) when
 the WSR bit is cleared and Phase Mismatch Jump Address 2 (PMJAD2)
 when the WSR bit is set.  When this bit is set the LSI53C895A will
 use jump address one (PMJAD1) on data out (data out, command,
 message out) transfers and jump address two (PMJAD2) on data in
 (data in, status, message in) transfers.

Which means:

  CCNTL0.PMJCTL
  0  SCNTL2.WSR = 0 PMJAD1
  0  SCNTL2.WSR = 1 PMJAD2
  1out  PMJAD1
  1in   PMJAD2

In qemu, what you get instead is:

  CCNTL0.PMJCTL
  0out  PMJAD1
  0in   PMJAD2
  1out  PMJAD1
  1in   PMJAD1

Considering that qemu always has SCNTL2.WSR cleared, the two marked cases
(corresponding to phase mismatch on input) are always jumping to the
wrong PMJAD register.  The patch implements the correct semantics.

Signed-off-by: Paolo Bonzinipbonz...@redhat.com
---
   hw/lsi53c895a.c |   12 +---
   1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index f5a91ba..00df2bd 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -490,11 +490,14 @@ static void lsi_bad_phase(LSIState *s, int out, int 
new_phase)
   {
   /* Trigger a phase mismatch.  */
   if (s-ccntl0   LSI_CCNTL0_ENPMJ) {
-if ((s-ccntl0   LSI_CCNTL0_PMJCTL) || out) {
-s-dsp = s-pmjad1;
+int dest;
+if ((s-ccntl0   LSI_CCNTL0_PMJCTL)) {
+dest = out ? 1 : 2;
   } else {
-s-dsp = s-pmjad2;
+dest = (s-scntl2   LSI_SCNTL2_WSR ? 2 : 1);
   }
+
+s-dsp = (dest == 1) ? s-pmjad1 : s-pmjad2;
   DPRINTF(Data phase mismatch jump to %08x\n, s-dsp);
   } else {
   DPRINTF(Phase mismatch interrupt\n);

 

Looks correct. But why not assigning s-pmjad[12] directly? Would
improve readability IMO.

Jan


   

Jan,
I think this is better since if something goes wrong it could be easier
to just put dest variable to DPRINTF() macro, like:

DPRINTF(Data phase mismatch jump to %08x (== pmjad%d)\n, s-dsp, dest);

rather than implementing it some other way. Now it could be easier to
just know what the problem is - i.e. whether it's accessing the wrong
register or now.
 

I don't mind. But if you have a use case for that separate variable,
then include it. No one can read your mind, and even less once this
patch is long merged.

Jan

   
This is not my patch, it's Paolo's but I'm just telling you I can see it 
useful. If it's not used in the DPRINTF() it's being optimized by gcc 
anyway so not a big deal ;)


Michal

--
Michal Novotnyminov...@redhat.com, RHCE
Virtualization Team (xen userspace), Red Hat

Re: [Qemu-devel] [PATCH v2 2/7] ioapic: convert to qdev

2010-06-14 Thread Blue Swirl

On Mon, Jun 14, 2010 at 9:33 AM, Markus Armbruster arm...@redhat.com wrote:
 Blue Swirl blauwir...@gmail.com writes:

 Convert to qdev.

 Signed-off-by: Blue Swirl blauwir...@gmail.com
 ---
  hw/apic.h    |    2 --
  hw/ioapic.c  |   45 ++---
  hw/pc.h      |    4 +++-
  hw/pc_piix.c |   19 ++-
  4 files changed, 51 insertions(+), 19 deletions(-)

 diff --git a/hw/apic.h b/hw/apic.h
 index e1954f4..dc41400 100644
 --- a/hw/apic.h
 +++ b/hw/apic.h
 @@ -1,7 +1,6 @@
  #ifndef APIC_H
  #define APIC_H

 -typedef struct IOAPICState IOAPICState;
  void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
                               uint8_t delivery_mode,
                               uint8_t vector_num, uint8_t polarity,
 @@ -10,7 +9,6 @@ int apic_init(CPUState *env);
  int apic_accept_pic_intr(CPUState *env);
  void apic_deliver_pic_intr(CPUState *env, int level);
  int apic_get_interrupt(CPUState *env);
 -qemu_irq *ioapic_init(void);
  void apic_reset_irq_delivered(void);
  int apic_get_irq_delivered(void);

 diff --git a/hw/ioapic.c b/hw/ioapic.c
 index e3f8a46..0336dbd 100644
 --- a/hw/ioapic.c
 +++ b/hw/ioapic.c
 @@ -25,6 +25,7 @@
  #include apic.h
  #include qemu-timer.h
  #include host-utils.h
 +#include sysbus.h

  //#define DEBUG_IOAPIC

 @@ -35,7 +36,6 @@
  #define DPRINTF(fmt, ...)
  #endif

 -#define IOAPIC_NUM_PINS                      0x18
  #define IOAPIC_LVT_MASKED            (116)

  #define IOAPIC_TRIGGER_EDGE          0
 @@ -50,7 +50,10 @@
  #define IOAPIC_DM_SIPI                       0x5
  #define IOAPIC_DM_EXTINT             0x7

 +typedef struct IOAPICState IOAPICState;
 +
  struct IOAPICState {
 +    SysBusDevice busdev;
      uint8_t id;
      uint8_t ioregsel;

 @@ -209,12 +212,14 @@ static const VMStateDescription vmstate_ioapic = {
      }
  };

 -static void ioapic_reset(void *opaque)
 +static void ioapic_reset(DeviceState *d)
  {
 -    IOAPICState *s = opaque;
 +    IOAPICState *s = container_of(d, IOAPICState, busdev.qdev);

 DO_UPCAST()?  I hate it, but it's what the other devices do...

Both are used:
grep container_of hw/*.[ch]|wc -l
81
grep DO_UPCAST hw/*.[ch]|wc -l
246

Re: [Qemu-devel] [PATCH 15/35] tcg-s390: Query instruction extensions that are installed.

2010-06-14 Thread Alexander Graf



Am 14.06.2010 um 18:20 schrieb Richard Henderson r...@twiddle.net:


On 06/13/2010 03:23 PM, Alexander Graf wrote:

On 13.06.2010, at 18:44, Aurelien Jarno wrote:

Is KVM in 31-bit mode actually functional?


I'm not aware of anything preventing it to be. But I honestly haven't
tried. As long as all hypercall parameters stay within the first
32/31 bits, things should be safe.


On the other hand, is there any point in supporting it?
This does seem to be a case for which it does seem like
having the extra VM space (KVM+TCG) and the extra register
size (TCG) is extremely helpful.

I'd be just as happy to adjust the configury to actively
prevent compilation in 31-bit mode, rather than pretend
it might work and never bother to test it...


Yeah, I agree. 31-bit is really deprecated by now anyways. Aurelien  
was the one interested in it.


Alex

Re: [Qemu-devel] [PATCH v2 7/7] apic: qdev conversion cleanup

2010-06-14 Thread Blue Swirl

On Mon, Jun 14, 2010 at 9:36 AM, Markus Armbruster arm...@redhat.com wrote:
 Blue Swirl blauwir...@gmail.com writes:

 Make APICState completely private to apic.c by using DeviceState
 in external APIs.

 Could you explain why this is an improvement?

Outside of apic.c, there is no need to access APICState fields so we
can remove that privilege. We can move the device instantiation to the
board level where it belongs.

Re: [Qemu-devel] [PATCH 5/5] [scsi-bsg]: Add initial support for BSG based SCSIDeviceInfo

2010-06-14 Thread Blue Swirl

On Mon, Jun 14, 2010 at 9:44 AM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:
 From: Nicholas Bellinger n...@linux-iscsi.org

 This patch adds initial support for using the Linux BSG interface with 
 write/read vectored
 AIO as a QEMU backstore (SCSIDeviceInfo) with hw/scsi-bus.c compatible HBA 
 emulation.

Did I miss the docs?


 So far it has been tested with x86_64 host and guest using hw/megasas.c and 
 TCM_Loop LLD
 Port LUNs.  Because this path uses struct iovec for struct 
 sg_io_v4-d[out,in]_xferp payloads,
 which currently requires a patch to linux/block/bsg.c:bsg_map_hdr() in order 
 to setup the
 user - kernel iovecs.  This also will only currently work with paired 
 user/kernel
 (eg: 64bit user / 64bit kernel) because of different pointer sizes in struct 
 iovec-iov_base.

 There are also two FIXMEs in hw/scsi-bsg.c:bsg_generic_initfn() related to 
 extraction of
 SCSI LUN and device type values using BSG and required by QEMU-KVM.

 Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
 ---
  Makefile.objs |    2 +-
  hw/scsi-bsg.c |  588 
 +
  2 files changed, 589 insertions(+), 1 deletions(-)
  create mode 100644 hw/scsi-bsg.c

 diff --git a/Makefile.objs b/Makefile.objs
 index 188d617..c4fcb72 100644
 --- a/Makefile.objs
 +++ b/Makefile.objs
 @@ -197,7 +197,7 @@ hw-obj-$(CONFIG_IDE_CMD646) += ide/cmd646.o
  hw-obj-$(CONFIG_IDE_MACIO) += ide/macio.o

  # SCSI layer
 -hw-obj-y += scsi-disk.o scsi-generic.o
 +hw-obj-y += scsi-disk.o scsi-generic.o scsi-bsg.o

Instead of '#ifdef __linux__' (which should be '#ifdef CONFIG_LINUX'),
please compile the object only if CONFIG_LINUX is set, something like:
hw-obj-$(CONFIG_LINUX) += scsi-bsg.o

Please also check if this could be compiled in common-obj set.

  hw-obj-y += lsi53c895a.o megasas.o
  hw-obj-$(CONFIG_ESP) += esp.o

 diff --git a/hw/scsi-bsg.c b/hw/scsi-bsg.c
 new file mode 100644
 index 000..fc76b76
 --- /dev/null
 +++ b/hw/scsi-bsg.c
 @@ -0,0 +1,588 @@
 +/*
 + * block layer implementation of the sg v4 interface for Linux hosts
 + *
 + * Copyright (c) 2010 Rising Tide Systems
 + * Written by Nicholas A. Bellinger n...@linux-iscsi.org
 + *
 + * Based on hw/scsi-generic code by Laurent Vivier, Paul Brook, and Fabrice 
 Bellard
 + *
 + * This code is licenced under the LGPL.
 + */
 +
 +#include qemu-common.h
 +#include qemu-error.h
 +#include block.h
 +#include scsi.h
 +#include dma.h
 +#include block/raw-posix-aio.h
 +
 +#ifdef __linux__
 +
 +#define DEBUG_BSG
 +#undef DEBUG_BSG_IO
 +#undef DEBUG_BSG_MAP

This should be
//#define DEBUG_BSG
//#define DEBUG_BSG_IO
//#define DEBUG_BSG_MAP

 +
 +#ifdef DEBUG_BSG
 +#define DPRINTF(fmt, ...) \
 +do { printf(scsi-bsg:  fmt , ## __VA_ARGS__); } while (0)
 +#else
 +#define DPRINTF(fmt, ...) do {} while(0)
 +#endif
 +
 +#define BADF(fmt, ...) \
 +do { fprintf(stderr, scsi-bsg:  fmt , ## __VA_ARGS__); } while (0)
 +
 +#include stdio.h
 +#include sys/types.h
 +#include sys/stat.h
 +#include sys/epoll.h
 +#include unistd.h
 +#include scsi/sg.h
 +#include linux/bsg.h
 +#include scsi-defs.h
 +
 +#define SCSI_SENSE_BUF_SIZE 96
 +
 +#define SG_ERR_DRIVER_TIMEOUT 0x06
 +#define SG_ERR_DRIVER_SENSE 0x08
 +
 +#ifndef MAX_UINT
 +#define MAX_UINT ((unsigned int)-1)

The standard macro is UINT_MAX.

 +#endif
 +
 +typedef struct SCSIBSGState SCSIBSGState;
 +
 +typedef struct SCSIBSGReq {
 +    SCSIRequest req;
 +    uint8_t *buf;
 +    int buflen;
 +    QEMUIOVector iov;
 +    QEMUIOVector aio_iov;
 +    struct sg_io_v4 bsg_hdr;
 +} SCSIBSGReq;
 +
 +struct SCSIBSGState {
 +    SCSIDevice qdev;
 +    BlockDriverState *bs;
 +    int lun;
 +    int driver_status;
 +    uint8_t sensebuf[SCSI_SENSE_BUF_SIZE];
 +    uint8_t senselen;
 +};
 +
 +static int bsg_read(int fd, void *p_read, int to_read)
 +{
 +    int err;
 +
 +    while (to_read  0) {
 +        err = read(fd, p_read, to_read);
 +        if (err = 0) {
 +            to_read -= err;
 +            p_read += err;
 +        } else if (errno == EINTR)
 +            continue;
 +        else {
 +            printf(bsg device %d read failed, errno: %d\n,
 +                    fd, errno);

DPRINTF?

 +            return errno;
 +        }
 +    }
 +    return 0;
 +}
 +
 +static SCSIBSGReq *bsg_new_request(SCSIDevice *d, uint32_t tag, uint32_t lun)
 +{
 +    SCSIRequest *req;
 +    SCSIBSGReq *r;
 +
 +    req = scsi_req_alloc(sizeof(SCSIBSGReq), d, tag, lun);
 +    r = DO_UPCAST(SCSIBSGReq, req, req);
 +    qemu_iovec_init(r-iov, 1);
 +    qemu_iovec_init(r-aio_iov, 1);
 +    return r;
 +}
 +
 +static void bsg_remove_request(SCSIBSGReq *r)
 +{
 +    qemu_free(r-buf);
 +    qemu_iovec_destroy(r-iov);
 +    qemu_iovec_destroy(r-aio_iov);
 +    scsi_req_free(r-req);
 +}
 +
 +static void bsg_command_complete(void *opaque, int ret)
 +{
 +    SCSIBSGReq *r = (SCSIBSGReq *)opaque;

Useless cast in C.

 +    SCSIBSGState *s = DO_UPCAST(SCSIBSGState, qdev, r-req.dev);
 +
 +    s-driver_status =

Re: [Qemu-devel] [PATCH] Add optional dump of default config file paths..

2010-06-14 Thread john cooper

Anthony Liguori wrote:
 On 06/09/2010 03:05 AM, john cooper wrote:
 This patch adds the ability to determine the build-configured
 runtime config file paths from the command line.  After
 support for cpu model definitions were added to the default
 runtime target- config file, testing of this feature has
 tripped over an unintentionally mis-installed config file
 enough to indicate some help is needed resolving such issues.

 As no general verbose flag is currently available, specifying
 -readconfig ? on the command line will maintain the default
 (config file) disposition but additionally emit diagnostic info.
 This mode is optional, otherwise the existing startup behavior
 is identical.

 
 I assume this is something requested by libvirt?

Not requested by libvirt but rather intended for the qemu
CLI facing user.  The alternatives of trying to puzzle
out built-in config file paths via strace or strings when
config problems surface is rather awkward.  This also
seems a general need for test of config file related
functionality.

Thanks,

-john

 I'd prefer we support
 this via Daniel's capabilities patchset instead of adding yet another
 hidden help output and having libvirt parse that output.
 
 Regards,
 
 Anthony Liguori
 
 Signed-off-by: john cooperjohn.coo...@redhat.com
 ---

 diff --git a/qemu-config.c b/qemu-config.c
 index 5a4e61b..a490603 100644
 --- a/qemu-config.c
 +++ b/qemu-config.c
 @@ -518,21 +518,29 @@ out:
   return res;
   }

 -int qemu_read_config_file(const char *filename)
 +/* attempt to open and parse config file, report problems if vflag
 + */
 +int qemu_read_config_file(const char *filename, int vflag)
   {
   FILE *f = fopen(filename, r);
 -int ret;
 +int rv = 0;

   if (f == NULL) {
 -return -errno;
 +rv = -errno;
   }
 -
 -ret = qemu_config_parse(f, vm_config_groups, filename);
 -fclose(f);
 -
 -if (ret == 0) {
 -return 0;
 -} else {
 -return -EINVAL;
 +else if (qemu_config_parse(f, vm_config_groups, filename) != 0) {
 +rv = -EINVAL;
 +}
 +else if (vflag) {
 +fprintf(stderr, read config file %s\n, filename);
   }
 +if (f) {
 +fclose(f);
 +}
 +if (rv  vflag) {
 +fprintf(stderr, can't read config file %s: %s\n,
 +filename, strerror(-rv));
 +}
 +return rv;
   }
 +
 diff --git a/qemu-config.h b/qemu-config.h
 index dca69d4..2e15556 100644
 --- a/qemu-config.h
 +++ b/qemu-config.h
 @@ -23,6 +23,6 @@ void qemu_add_globals(void);
   void qemu_config_write(FILE *fp);
   int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char
 *fname);

 -int qemu_read_config_file(const char *filename);
 +int qemu_read_config_file(const char *filename, int vflag);

   #endif /* QEMU_CONFIG_H */
 diff --git a/vl.c b/vl.c
 index 7121cd0..23c7276 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -2582,6 +2582,7 @@ int main(int argc, char **argv, char **envp)
   #endif
   int show_vnc_port = 0;
   int defconfig = 1;
 +int defconfig_verbose = 0;   

   error_set_progname(argv[0]);

 @@ -2657,6 +2658,10 @@ int main(int argc, char **argv, char **envp)
   case QEMU_OPTION_nodefconfig:
   defconfig=0;
   break;
 +case QEMU_OPTION_readconfig:
 +if (!strcmp(optarg, ?))
 +defconfig_verbose = 1;
 +break;
   }
   }
   }
 @@ -2664,12 +2669,14 @@ int main(int argc, char **argv, char **envp)
   if (defconfig) {
   int ret;

 -ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR /qemu.conf);
 +ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR /qemu.conf,
 +defconfig_verbose);
   if (ret  0  ret != -ENOENT) {
   exit(1);
   }

 -ret = qemu_read_config_file(arch_config_name);
 +ret = qemu_read_config_file(arch_config_name,
 +defconfig_verbose);
   if (ret  0  ret != -ENOENT) {
   exit(1);
   }
 @@ -3386,15 +3393,9 @@ int main(int argc, char **argv, char **envp)
   xen_mode = XEN_ATTACH;
   break;
   case QEMU_OPTION_readconfig:
 -{
 -int ret = qemu_read_config_file(optarg);
 -if (ret  0) {
 -fprintf(stderr, read config %s: %s\n, optarg,
 -strerror(-ret));
 -exit(1);
 -}
 -break;
 -}
 +if (!defconfig_verbose 
 qemu_read_config_file(optarg, 1)  0)
 +exit(1);
 +break;
   case QEMU_OPTION_writeconfig:
   {
   FILE *fp;


 


-- 
john.coo...@redhat.com

[Qemu-devel] Re: [SeaBIOS] [PATCHv2] load hpet info for HPET ACPI table from qemu

2010-06-14 Thread Kevin O'Connor

On Mon, Jun 14, 2010 at 05:51:27PM +0300, Avi Kivity wrote:
 On 06/14/2010 05:09 PM, Gleb Natapov wrote:
 Could we just have qemu build the hpet tables and pass them through to
 seabios?  Perhaps using the qemu_cfg_acpi_additional_tables() method.
 
 Possible, and I considered that. I personally prefer to pass minimum
 information required for seabios to discover underlying HW and leave
 ACPI table creation to seabios. That is how things done for HW that
 seabios can actually detect. If we will go your way pretty soon we will
 move creation of ACPI/SMBIOS/MP tables into qemu and IMHO this will be
 step backworkds.
 
 I agree.  ACPI is a firmware/OS interface.  If we move ACPI table
 generation into qemu, it becomes a mixed hardware/firmware/OS
 interface.

This seems to be a philosophical distinction.  Lets go over the
practical implications.

It seems there was a change in qemu to the hpet functionality.
Although the change is solely between qemu and the OS, it's necessary
to patch both qemu and seabios for the OS to see the change.  This
means creating and reviewing patches for two separate repos.  This
also requires release coordination - the seabios change has to be
committed and released, and then qemu needs to be released with the
new seabios.  Additional changes in seabios tip will get merged into
qemu, which could complicate testing.

 Better keep those interfaces separate: hardware/firmware (fwcfg) and
 firmware/OS (acpi).

One could look at the current hpet patch as implementing:
qemu - struct hpet_fw_entry - seabios - struct acpi_20_hpet - OS.

I'm suggesting that we do the following instead:
qemu - struct acpi_20_hpet - seabios - struct acpi_20_hpet - OS.

I'm not suggesting a radical rethink of fwcfg, but I fail to see the
advantage in introducing the arbitrary struct hpet_fw_entry when
there is a perfectly good, well defined, struct acpi_20_hpet that
already exists.  This new arbitrary intermediate format just
introduces make work for all of us.

-Kevin

[Qemu-devel] Re: [PATCH v3 0/5] Add QMP migration events

2010-06-14 Thread Juan Quintela

Anthony Liguori aligu...@linux.vnet.ibm.com wrote:
 On 06/14/2010 11:02 AM, Juan Quintela wrote:
 Anthony Liguorialigu...@linux.vnet.ibm.com  wrote:

 On 06/12/2010 06:05 AM, Juan Quintela wrote:
  
 Luiz Capitulinolcapitul...@redhat.com   wrote:


 The monitor that did it knows it, nobody else knows it.  At destination
 time, I guess you agree this is important, i.e. the management app knows
 that migration has started.


 Dual monitors is a slippery slope argument because even if you had
 these events, it doesn't give you nearly enough information to
 implement anything safely.
  
 Security folks here needed to do logging of qemu events, here.  Guest
 what ones: vm_start, vm_stop, vm_continue, and vm_migrate.


 Why do security folks need this?  Why are they not interested in other
 things like savevm?  Why are they talkign to qemu and not libvirt
 (they shouldn't trust qemu).

No clue about last one.  I just was asked to provide that list of
events.  will ask back.

 Insteod of a nice: write this small qmp client, and listen for this
 four events, I just had to point them where to hack their logging.

 About libvirt, I would rreally, really like to be able to use libvirt to
 launch a guest, and then let im me launch another monitor and stoup
 libvirt for continuing with it.  There is no way for a monitor that
 there has been doing write operations in other monitors.  I see this
 as a useful feature for all write (i.e. not query) commands.


 Yeah, but if we want to do this, then we need to discuss this with the
 libvirt folks and come up with a proposal that works for all commands.
 Sneaking in a few migration events is not going to help.

Fully agree.

 Migration commands have a feature that dont' have other commands: they
 invosve two machines.

 And I would also liked to have that events for all the write commands.
 Migration is more interesting becaues it needs synchronization.


 I'm still fundamentally confused about what you think you can do with
 these events.  But do you really intend on introducing events for
 every non-query QMP command?  Does that seem a bit unrealistic?

Ok. lets stop here.  My definitions:

Event: this important thing happened (important has several meanings).

Migration events fully enter in this definition.  Furthermore, migration
events happens from actions that are issued in machine A and event
happens in machine A and machine B. (I.e. they are so special as they
can get).

Now convenience.  I think it would be convenient to also know in the
other monitors when any write command happens.  About how to implement
this, if there are more uses or no,  that is clearly open to
discussion.  I think that this enter fully in the politics vs mechanism
discussions, events allow you to notify when things happen, and
management app can do anything that it sees fit.

As principle, I think that important happenings (to not repeat the
event word) should be published in a very clear way.  Migration
start/end are a basic example of that.  It is not as if Migration is
going to stop having a start or an end any time soon.  Making the
app polling to know that is too cumbersome for the normal good case.
This kind of things should be plublished somehow.  The same that
happens when a machine start/stops.  That are improntant events.

 Wouldn't it be better to have a mechanism to snoop on monitor traffic
 in a more proper way?

I haven't looked at the trace commands that you pointed (so no comment
there), but I see it interesting to be able to know what commands have
been issued from other monitor.

In my ideal world, I would get a monitor from libvirt, where I could
play.  And libvirt will look at the commands that I issue and decided if
he can continue or just leave the guest for me.

Notice the ideal world part O:-)

Adiso, Juan.

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

2010-06-14 Thread Alex Williamson

On Mon, 2010-06-14 at 18:49 +0200, Jan Kiszka wrote:
 Alex Williamson wrote:
  On Mon, 2010-06-14 at 18:00 +0200, Jan Kiszka wrote: 
  And instead of introducing another hierarchy level with the bus address,
  I would also prefer to add this as prefix or suffix to the device name,
  e.g. driver.busaddr.
  
  That's what I had started with.  The first post in this thread has
  pci.0,addr=09.0 as a single hierarchy level.  The addr= may be
  unnecessary there, but I also prefer something along those lines.  For
  PCI it'd make sense to have name:addr, which comes out to
  pci.0:09.0.  (Maybe rather than flagging properties as being relevant
  to the path and printing them generically, we should extract specific
  properties based on the bus type.)
 
 Not bus.addr, driver.addr. We only have one PCI bus here, not as many as
 there are slots on that bus.

Ok, I can get it down to something like:

/i440FX-pcihost/pci.0/virtio-blk-pci,09.0

The addr on the device is initially a little non-intuitive to me since
it's a property of the bus, but I guess it make sense if we think of
that level as slot, which includes an address and driver.

  For busses that don't have a consistent addressing scheme then some sort 
  of 
  instance ID is unavoidable. I guess it may be possible to invent 
  something 
  based on other device properties (e.g. address of the first IO 
  port/memory 
  region).
  Instance IDs aren't always bad, we just run into trouble with hotplug
  and maybe creating unique ramblock names.  But, such busses typically
  don't support hotplug and don't have multiple instances of the same
  device on the bus, so I don't expect us to hit many issues there as long
  as we can get a stable address path.  Thanks,
 
  If stable instance numbers are required, we could simply keep them in
  DeviceState and search for the smallest free one on additions. Actually,
  I'm more in favor of this direction than including the bus address. That
  way we could keep a canonical format across all buses and do not have to
  provide possibly complex ID generation rules for each of them.
  
  I started down that path, but it still breaks for hotplug.  If we start
  a VM with two e1000 NICs, then remove the first, we can no longer
  migrate because the target can't represent having a single e1000 with a
  non-zero instance ID.
 
 That's indeed a good point.
 
 Still, I'm worried about having to define numbering schemes for all the
 buses qemu supports. Maybe we can run a mixture: address-based for
 hotplug-capably buses (they tend to be cooperative in this regard) and
 simple instance numbers for the rest, likely the majority.

Yep, I share that concern, which is I say instance numbers aren't always
bad.  If we have devices that we don't care about doing hotplug on, we
can live with instance numbers.  Thanks,

Alex

[Qemu-devel] [PATCH] block: fix a warning and possible truncation

2010-06-14 Thread Blue Swirl

Fix a warning from OpenBSD gcc (3.3.5 (propolice)):
/src/qemu/block.c: In function `bdrv_info_stats_bs':
/src/qemu/block.c:1548: warning: long long int format, long unsigned
int arg (arg 6)

There may be also truncation effects.

Signed-off-by: Blue Swirl blauwir...@gmail.com
---
Alternatively 'ULL' prefix could be appended to BDRV_SECTOR_SIZE
definition but that may have other side effects.

 block.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/block.c b/block.c
index cacf11b..a7ab0b4 100644
--- a/block.c
+++ b/block.c
@@ -1545,7 +1545,8 @@ static QObject* bdrv_info_stats_bs(BlockDriverState *bs)
  } },
  bs-rd_bytes, bs-wr_bytes,
  bs-rd_ops, bs-wr_ops,
- bs-wr_highest_sector * (long)BDRV_SECTOR_SIZE);
+ bs-wr_highest_sector *
+ (uint64_t)BDRV_SECTOR_SIZE);
 dict  = qobject_to_qdict(res);

 if (*bs-device_name) {
-- 
1.7.1

[Qemu-devel] Re: [SeaBIOS] [PATCHv2] load hpet info for HPET ACPI table from qemu

2010-06-14 Thread Gleb Natapov

On Mon, Jun 14, 2010 at 02:25:21PM -0400, Kevin O'Connor wrote:
 On Mon, Jun 14, 2010 at 05:51:27PM +0300, Avi Kivity wrote:
  On 06/14/2010 05:09 PM, Gleb Natapov wrote:
  Could we just have qemu build the hpet tables and pass them through to
  seabios?  Perhaps using the qemu_cfg_acpi_additional_tables() method.
  
  Possible, and I considered that. I personally prefer to pass minimum
  information required for seabios to discover underlying HW and leave
  ACPI table creation to seabios. That is how things done for HW that
  seabios can actually detect. If we will go your way pretty soon we will
  move creation of ACPI/SMBIOS/MP tables into qemu and IMHO this will be
  step backworkds.
  
  I agree.  ACPI is a firmware/OS interface.  If we move ACPI table
  generation into qemu, it becomes a mixed hardware/firmware/OS
  interface.
 
 This seems to be a philosophical distinction.  Lets go over the
 practical implications.
 
 It seems there was a change in qemu to the hpet functionality.
 Although the change is solely between qemu and the OS, it's necessary
 to patch both qemu and seabios for the OS to see the change.  This
 means creating and reviewing patches for two separate repos.  This
 also requires release coordination - the seabios change has to be
 committed and released, and then qemu needs to be released with the
 new seabios.  Additional changes in seabios tip will get merged into
 qemu, which could complicate testing.
 
My patch is completely unrelated to functionality change in qemu. In
fact I wrote it before the change and had to rebase. Seabios/qemu have
a bug that HPET is always advertise through ACPI table even when qemu
haven't created one. So your description above is not accurate.
But even if it was accurate propagation features through software stack is
common operation everywhere. Think about adding system call to the
kernel and updating libc, or adding feature to kvm kernel module and
adding patch to use it in qemu.

  Better keep those interfaces separate: hardware/firmware (fwcfg) and
  firmware/OS (acpi).
 
 One could look at the current hpet patch as implementing:
 qemu - struct hpet_fw_entry - seabios - struct acpi_20_hpet - OS.
 
 I'm suggesting that we do the following instead:
 qemu - struct acpi_20_hpet - seabios - struct acpi_20_hpet - OS.
 
 I'm not suggesting a radical rethink of fwcfg, but I fail to see the
 advantage in introducing the arbitrary struct hpet_fw_entry when
 there is a perfectly good, well defined, struct acpi_20_hpet that
 already exists.  This new arbitrary intermediate format just
 introduces make work for all of us.
 
Then qemu will have to create ACPI header too and will have to fill
details like oem_id/oem_table_id and so on. Now if I want to change them
in my bios version it is not enough to edit CONFIG_APPNAME in seabios.
If seabios will decide to move to more resent version of ACPI spec it
will not be able to do so since it will not fully control table creation.

--
Gleb.

[Qemu-devel] Re: [PATCH v3 0/5] Add QMP migration events

2010-06-14 Thread Anthony Liguori


On 06/14/2010 01:35 PM, Juan Quintela wrote:

Anthony Liguorialigu...@linux.vnet.ibm.com  wrote:
   

On 06/14/2010 11:02 AM, Juan Quintela wrote:
 

Anthony Liguorialigu...@linux.vnet.ibm.com   wrote:

   

On 06/12/2010 06:05 AM, Juan Quintela wrote:

 

Luiz Capitulinolcapitul...@redhat.comwrote:

   


   

The monitor that did it knows it, nobody else knows it.  At destination
time, I guess you agree this is important, i.e. the management app knows
that migration has started.


   

Dual monitors is a slippery slope argument because even if you had
these events, it doesn't give you nearly enough information to
implement anything safely.

 

Security folks here needed to do logging of qemu events, here.  Guest
what ones: vm_start, vm_stop, vm_continue, and vm_migrate.

   

Why do security folks need this?  Why are they not interested in other
things like savevm?  Why are they talkign to qemu and not libvirt
(they shouldn't trust qemu).
 

No clue about last one.  I just was asked to provide that list of
events.  will ask back.

   

Insteod of a nice: write this small qmp client, and listen for this
four events, I just had to point them where to hack their logging.

About libvirt, I would rreally, really like to be able to use libvirt to
launch a guest, and then let im me launch another monitor and stoup
libvirt for continuing with it.  There is no way for a monitor that
there has been doing write operations in other monitors.  I see this
as a useful feature for all write (i.e. not query) commands.

   

Yeah, but if we want to do this, then we need to discuss this with the
libvirt folks and come up with a proposal that works for all commands.
Sneaking in a few migration events is not going to help.
 

Fully agree.

   

Migration commands have a feature that dont' have other commands: they
invosve two machines.

And I would also liked to have that events for all the write commands.
Migration is more interesting becaues it needs synchronization.

   

I'm still fundamentally confused about what you think you can do with
these events.  But do you really intend on introducing events for
every non-query QMP command?  Does that seem a bit unrealistic?
 

Ok. lets stop here.  My definitions:

Event: this important thing happened (important has several meanings).

Migration events fully enter in this definition.  Furthermore, migration
events happens from actions that are issued in machine A and event
happens in machine A and machine B. (I.e. they are so special as they
can get).
   


I think you've got too narrow a view.  Migration doesn't always involve 
two machines.  Migration can involve just the source writing via 
exec:dd of=foo.img and this is in fact an important use case for libvirt.



Now convenience.  I think it would be convenient to also know in the
other monitors when any write command happens.  About how to implement
this, if there are more uses or no,  that is clearly open to
discussion.  I think that this enter fully in the politics vs mechanism
discussions, events allow you to notify when things happen, and
management app can do anything that it sees fit.

As principle, I think that important happenings (to not repeat the
event word) should be published in a very clear way.  Migration
start/end are a basic example of that.  It is not as if Migration is
going to stop having a start or an end any time soon.  Making the
app polling to know that is too cumbersome for the normal good case.
This kind of things should be plublished somehow.  The same that
happens when a machine start/stops.  That are improntant events.
   


What makes migration important and not savevm?

It's not that I don't agree that migration is important and that it's 
important for tools to be able to know about it.  I disagree that 
migration is *more* important than most of the other things that happen 
in the monitor and I want to make sure that we come up with a solution 
that solves the broader problem.


Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH] Add optional dump of default config file paths..

2010-06-14 Thread Anthony Liguori


On 06/14/2010 12:59 PM, john cooper wrote:

Anthony Liguori wrote:
   

On 06/09/2010 03:05 AM, john cooper wrote:
 

This patch adds the ability to determine the build-configured
runtime config file paths from the command line.  After
support for cpu model definitions were added to the default
runtime target- config file, testing of this feature has
tripped over an unintentionally mis-installed config file
enough to indicate some help is needed resolving such issues.

As no general verbose flag is currently available, specifying
-readconfig ? on the command line will maintain the default
(config file) disposition but additionally emit diagnostic info.
This mode is optional, otherwise the existing startup behavior
is identical.

   

I assume this is something requested by libvirt?
 

Not requested by libvirt but rather intended for the qemu
CLI facing user.  The alternatives of trying to puzzle
out built-in config file paths via strace or strings when
config problems surface is rather awkward.  This also
seems a general need for test of config file related
functionality.
   


I wouldn't mind spitting out the ./configure line in qemu -version like 
gcc does.  That has the benefit of being a bit more obviously discoverable.


Regards,

Anthony Liguori


Thanks,

-john

   

I'd prefer we support
this via Daniel's capabilities patchset instead of adding yet another
hidden help output and having libvirt parse that output.

Regards,

Anthony Liguori

 

Signed-off-by: john cooperjohn.coo...@redhat.com
---

diff --git a/qemu-config.c b/qemu-config.c
index 5a4e61b..a490603 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -518,21 +518,29 @@ out:
   return res;
   }

-int qemu_read_config_file(const char *filename)
+/* attempt to open and parse config file, report problems if vflag
+ */
+int qemu_read_config_file(const char *filename, int vflag)
   {
   FILE *f = fopen(filename, r);
-int ret;
+int rv = 0;

   if (f == NULL) {
-return -errno;
+rv = -errno;
   }
-
-ret = qemu_config_parse(f, vm_config_groups, filename);
-fclose(f);
-
-if (ret == 0) {
-return 0;
-} else {
-return -EINVAL;
+else if (qemu_config_parse(f, vm_config_groups, filename) != 0) {
+rv = -EINVAL;
+}
+else if (vflag) {
+fprintf(stderr, read config file %s\n, filename);
   }
+if (f) {
+fclose(f);
+}
+if (rv   vflag) {
+fprintf(stderr, can't read config file %s: %s\n,
+filename, strerror(-rv));
+}
+return rv;
   }
+
diff --git a/qemu-config.h b/qemu-config.h
index dca69d4..2e15556 100644
--- a/qemu-config.h
+++ b/qemu-config.h
@@ -23,6 +23,6 @@ void qemu_add_globals(void);
   void qemu_config_write(FILE *fp);
   int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char
*fname);

-int qemu_read_config_file(const char *filename);
+int qemu_read_config_file(const char *filename, int vflag);

   #endif /* QEMU_CONFIG_H */
diff --git a/vl.c b/vl.c
index 7121cd0..23c7276 100644
--- a/vl.c
+++ b/vl.c
@@ -2582,6 +2582,7 @@ int main(int argc, char **argv, char **envp)
   #endif
   int show_vnc_port = 0;
   int defconfig = 1;
+int defconfig_verbose = 0;

   error_set_progname(argv[0]);

@@ -2657,6 +2658,10 @@ int main(int argc, char **argv, char **envp)
   case QEMU_OPTION_nodefconfig:
   defconfig=0;
   break;
+case QEMU_OPTION_readconfig:
+if (!strcmp(optarg, ?))
+defconfig_verbose = 1;
+break;
   }
   }
   }
@@ -2664,12 +2669,14 @@ int main(int argc, char **argv, char **envp)
   if (defconfig) {
   int ret;

-ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR /qemu.conf);
+ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR /qemu.conf,
+defconfig_verbose);
   if (ret   0   ret != -ENOENT) {
   exit(1);
   }

-ret = qemu_read_config_file(arch_config_name);
+ret = qemu_read_config_file(arch_config_name,
+defconfig_verbose);
   if (ret   0   ret != -ENOENT) {
   exit(1);
   }
@@ -3386,15 +3393,9 @@ int main(int argc, char **argv, char **envp)
   xen_mode = XEN_ATTACH;
   break;
   case QEMU_OPTION_readconfig:
-{
-int ret = qemu_read_config_file(optarg);
-if (ret   0) {
-fprintf(stderr, read config %s: %s\n, optarg,
-strerror(-ret));
-exit(1);
-}
-break;
-}
+if (!defconfig_verbose
qemu_read_config_file(optarg, 1)   0)
+exit(1);
+break;
   case QEMU_OPTION_writeconfig:

[Qemu-devel] [PATCH] stop cpus before forking.

2010-06-14 Thread Glauber Costa

This patch fixes a bug that happens with kvm, irqchip-in-kernel,
while adding a netdev. Despite the situations of reproduction being
specific to kvm, I believe this fix is pretty generic, and fits here.
Specially if we ever want to have our own irqchip in kernel too.

The problem happens after the fork system call, and although it is not
100 % reproduceable, happens pretty often. After fork, the memory where
the apic is mapped is present in both processes. It ends up confusing
the vcpus somewhere in the irq - ack path, and qemu hangs, with no
irqs being delivered at all from that point on.

Making sure the vcpus are stopped before forking makes the problem go
away. Besides, this is a pretty unfrequent operation, which already hangs
the io-thread for a while. So it should not hurt performance.

Signed-off-by: Glauber Costa glom...@redhat.com
---
 net/tap.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index 0147dab..f34dd9c 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -330,6 +330,9 @@ static int launch_script(const char *setup_script, const 
char *ifname, int fd)
 sigaddset(mask, SIGCHLD);
 sigprocmask(SIG_BLOCK, mask, oldmask);
 
+/* make sure no cpus are running, so the apic does not
+ * get confused */
+vm_stop(0);
 /* try to launch network script */
 pid = fork();
 if (pid == 0) {
@@ -350,6 +353,7 @@ static int launch_script(const char *setup_script, const 
char *ifname, int fd)
 execv(setup_script, args);
 _exit(1);
 } else if (pid  0) {
+vm_start();
 while (waitpid(pid, status, 0) != pid) {
 /* loop */
 }
-- 
1.7.0.1

Re: [Qemu-devel] Fwd: [PATCH 1/4] Make configure find uuid functions in Mac OS X by looking into libSystem.B

2010-06-14 Thread Anthony Liguori


On 06/03/2010 04:05 PM, C.W. Betts wrote:

Since this didn't seem to get to the mailing list, I'm forwarding it.


It's lacking a Signed-off-by and as a quoted message, it's not possible 
to apply it.


Regards,

Anthony Liguori


Begin forwarded message:

*From: *C.W. Betts computer...@hotmail.com 
mailto:computer...@hotmail.com

*Date: *May 16, 2010 12:47:33 PM MDT
*To: *qemu-de...@nongnu.org mailto:qemu-devel@nongnu.org
*Cc: *C.W. Betts computer...@hotmail.com 
mailto:computer...@hotmail.com
*Subject: **[PATCH 1/4] Make configure find uuid functions in Mac OS 
X by looking into libSystem.B*


---
configure |6 +-
1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 966cd7d..ecc3317 100755
--- a/configure
+++ b/configure
@@ -1198,7 +1198,11 @@ fi
##
# uuid_generate() probe, used for vdi block driver
if test $uuid != no ; then
-  uuid_libs=-luuid
+  if test $darwin == yes; then
+uuid_libs=
+  else
+uuid_libs=-luuid
+  fi
  cat  $TMPC  EOF
#include uuid/uuid.h
int main(void)
--
1.6.5.5

Re: [Qemu-devel] [PATCH] stop cpus before forking.

2010-06-14 Thread Anthony Liguori


On 06/14/2010 02:27 PM, Glauber Costa wrote:

This patch fixes a bug that happens with kvm, irqchip-in-kernel,
while adding a netdev. Despite the situations of reproduction being
specific to kvm, I believe this fix is pretty generic, and fits here.
Specially if we ever want to have our own irqchip in kernel too.

The problem happens after the fork system call, and although it is not
100 % reproduceable, happens pretty often. After fork, the memory where
the apic is mapped is present in both processes. It ends up confusing
the vcpus somewhere in the irq-  ack path, and qemu hangs, with no
irqs being delivered at all from that point on.

Making sure the vcpus are stopped before forking makes the problem go
away. Besides, this is a pretty unfrequent operation, which already hangs
the io-thread for a while. So it should not hurt performance.

Signed-off-by: Glauber Costaglom...@redhat.com
   


This doesn't make very much sense to me but smells like a kernel bug to me.

Even if it isn't, I can't rationalize why stopping the vm like this is 
enough to fix such a problem.  Is the problem that the KVM VCPU threads 
get duplicated while potentially running or something like that?


Regards,

Anthony Liguori


---
  net/tap.c |4 
  1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index 0147dab..f34dd9c 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -330,6 +330,9 @@ static int launch_script(const char *setup_script, const 
char *ifname, int fd)
  sigaddset(mask, SIGCHLD);
  sigprocmask(SIG_BLOCK,mask,oldmask);

+/* make sure no cpus are running, so the apic does not
+ * get confused */
+vm_stop(0);
  /* try to launch network script */
  pid = fork();
  if (pid == 0) {
@@ -350,6 +353,7 @@ static int launch_script(const char *setup_script, const 
char *ifname, int fd)
  execv(setup_script, args);
  _exit(1);
  } else if (pid  0) {
+vm_start();
  while (waitpid(pid,status, 0) != pid) {
  /* loop */
  }

[Qemu-devel] Re: [SeaBIOS] [PATCHv2] load hpet info for HPET ACPI table from qemu

2010-06-14 Thread Anthony Liguori


On 06/14/2010 01:25 PM, Kevin O'Connor wrote:

On Mon, Jun 14, 2010 at 05:51:27PM +0300, Avi Kivity wrote:
   

On 06/14/2010 05:09 PM, Gleb Natapov wrote:
 

Could we just have qemu build the hpet tables and pass them through to
seabios?  Perhaps using the qemu_cfg_acpi_additional_tables() method.

 

Possible, and I considered that. I personally prefer to pass minimum
information required for seabios to discover underlying HW and leave
ACPI table creation to seabios. That is how things done for HW that
seabios can actually detect. If we will go your way pretty soon we will
move creation of ACPI/SMBIOS/MP tables into qemu and IMHO this will be
step backworkds.
   

I agree.  ACPI is a firmware/OS interface.  If we move ACPI table
generation into qemu, it becomes a mixed hardware/firmware/OS
interface.
 

This seems to be a philosophical distinction.  Lets go over the
practical implications.

It seems there was a change in qemu to the hpet functionality.
Although the change is solely between qemu and the OS, it's necessary
to patch both qemu and seabios for the OS to see the change.  This
means creating and reviewing patches for two separate repos.  This
also requires release coordination - the seabios change has to be
committed and released, and then qemu needs to be released with the
new seabios.  Additional changes in seabios tip will get merged into
qemu, which could complicate testing.
   


I think we can be pretty flexible as long as we're careful about 
releases.  For instance, I've applied Gleb's current patch but won't 
update SeaBIOS until the interface is worked out.  If we decide to 
implement a new interface, there's no harm since we've never had a qemu 
build that had a combination of SeaBIOS and fw_cfg that didn't work.


Regards,

Anthony Liguori

1 2 >

1 - 100 of 160 matches

Mail list logo