date:20140707

Re: [Qemu-devel] [PATCH v1 1/2] virtio-blk: data-plane: fix save/set .complete_request in start

2014-07-07 Thread Fam Zheng

On Sat, 07/05 12:18, Ming Lei wrote:
 The callback has to be saved and reset in virtio_blk_data_plane_start(),
 otherwise dataplane's requests will be completed in qemu aio context.

Yes, the cb is wrong once virtio_blk_data_plane_stop is called (device reset,
etc.).

Reviewed-by: Fam Zheng f...@redhat.com

 
 Signed-off-by: Ming Lei ming@canonical.com
 ---
  hw/block/dataplane/virtio-blk.c |7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)
 
 diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
 index 227bb15..e88862d 100644
 --- a/hw/block/dataplane/virtio-blk.c
 +++ b/hw/block/dataplane/virtio-blk.c
 @@ -125,7 +125,6 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, 
 VirtIOBlkConf *blk,
Error **errp)
  {
  VirtIOBlockDataPlane *s;
 -VirtIOBlock *vblk = VIRTIO_BLK(vdev);
  Error *local_err = NULL;
  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
  VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 @@ -178,8 +177,6 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, 
 VirtIOBlkConf *blk,
  bdrv_op_block_all(blk->conf.bs, s->blocker);
  
  *dataplane = s;
 -s->saved_complete_request = vblk->complete_request;
 -vblk->complete_request = complete_request_vring;
  }
  
  /* Context: QEMU global mutex held */
 @@ -201,6 +198,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
  {
  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
  VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 +VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
  VirtQueue *vq;
  
  if (s->started) {
 @@ -234,6 +232,9 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
  }
  s->host_notifier = *virtio_queue_get_host_notifier(vq);
  
 +s->saved_complete_request = vblk->complete_request;
 +vblk->complete_request = complete_request_vring;
 +
  s->starting = false;
  s->started = true;
  trace_virtio_blk_data_plane_start(s);
 -- 
 1.7.9.5
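
The save-and-swap pattern the patch moves into the start path can be sketched in isolation. This is a minimal illustration, not the QEMU code: `Device`, `DataPlane`, `dataplane_start()` and `dataplane_stop()` are hypothetical stand-ins, and it assumes the stop path restores the saved callback, as the review comment about virtio_blk_data_plane_stop suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for VirtIOBlock and VirtIOBlockDataPlane. */
typedef void (*complete_fn)(int status);

typedef struct Device {
    complete_fn complete_request;        /* callback currently in effect */
} Device;

typedef struct DataPlane {
    Device *dev;
    complete_fn saved_complete_request;  /* callback to restore on stop */
    bool started;
} DataPlane;

static int last_path;                    /* records which callback ran */

static void complete_main_loop(int status) { (void)status; last_path = 1; }
static void complete_vring(int status)     { (void)status; last_path = 2; }

/* Save and replace the callback on every start, not once at creation. */
static void dataplane_start(DataPlane *s)
{
    assert(s != NULL && s->dev != NULL);
    if (s->started) {
        return;
    }
    s->saved_complete_request = s->dev->complete_request;
    s->dev->complete_request = complete_vring;
    s->started = true;
}

/* Assumption: stop puts the saved callback back. */
static void dataplane_stop(DataPlane *s)
{
    assert(s != NULL && s->dev != NULL);
    if (!s->started) {
        return;
    }
    s->dev->complete_request = s->saved_complete_request;
    s->started = false;
}
```

If the swap happened only at creation time, a start after a stop would leave the main-loop callback installed, which is the situation the commit message describes.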

[Qemu-devel] [RFC PATCH V5 1/3] xen: pass kernel initrd to qemu

2014-07-07 Thread Chunyan Liu

xen side patch to support xen HVM direct kernel boot:
support 'kernel', 'ramdisk', 'cmdline' (and 'root', 'extra' as well
which would be deprecated later) in HVM config file, parse config file,
pass -kernel, -initrd, -append parameters to qemu.

It's working with qemu-xen when using the default BIOS (seabios).

[HVM config example]
name=sles11_sp2
description=None
uuid=5c84adcc-bd59-788a-96d2-195f9b599cfe
memory=512
maxmem=512
vcpus=4
on_poweroff=destroy
on_reboot=restart
on_crash=destroy
localtime=0
keymap=en-us
builder=hvm
device_model_override=/home/cyliu/git/qemu/x86_64-softmmu/qemu-system-x86_64
kernel=/mnt/vmlinuz-3.0.13-0.27-default
ramdisk=/mnt/initrd-3.0.13-0.27-default
root=/dev/hda2
extra=console=tty0 console=ttyS0
disk=[ 'file:/mnt/images/sles11_sp2/disk0.raw,hda,w', ]
vif=[ 'mac=00:16:3e:56:af:69,bridge=br0,type=netfront', ]
stdvga=0
vnc=1
vncunused=1
viridian=0
acpi=1
pae=1
serial=pty


Signed-off-by: Chunyan Liu cy...@suse.com
---
Changes:
  - update b_info->u.pv.kernel compatibility work:
turn b_info->u.pv.kernel to b_info->kernel in 
libxl__domain_build_info_setdefault, handle b_info->kernel
in later processing.
  - include LIBXL_HAVE_BUILDINFO_KERNEL only instead of each
for kernel, ramdisk and cmdline.
  - update examples in commit message

 docs/man/xl.cfg.pod.5  | 53 ++--
 tools/libxl/libxl.h| 15 +++
 tools/libxl/libxl_bootloader.c | 18 ++---
 tools/libxl/libxl_create.c | 19 +
 tools/libxl/libxl_dm.c | 15 +++
 tools/libxl/libxl_types.idl|  3 +++
 tools/libxl/xl_cmdimpl.c   | 61 --
 7 files changed, 129 insertions(+), 55 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index ff9ea77..c4a6589 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -304,6 +304,37 @@ Action to take if the domain crashes.  Default is 
Cdestroy.
 
 =back
 
+=head3 Direct Kernel Boot
+
+Direct kernel boot allows booting directly from a kernel and initrd
+stored in the host physical machine OS, allowing command line arguments
+to be passed directly. PV guest direct kernel boot is supported. HVM
+guest direct kernel boot is supported with limitation (it's supported
+when using qemu-xen and default BIOS 'seabios'; not supported in case of
+stubdom-dm and old rombios.)
+
+=over 4
+
+=item B<kernel="PATHNAME">
+
+Load the specified file as the kernel image.
+
+=item B<ramdisk="PATHNAME">
+
+Load the specified file as the ramdisk.
+
+=item B<root="STRING">
+
+Append B<root="STRING"> to the kernel command line (Note: it is guest
+specific what meaning this has).
+
+=item B<extra="STRING">
+
+Append B<STRING> to the kernel command line. (Note: it is guest
+specific what meaning this has).
+
+=back
+
 =head3 Other Options
 
 =over 4
@@ -646,20 +677,12 @@ The following options apply only to Paravirtual guests.
 
 =over 4
 
-=item Bkernel=PATHNAME
-
-Load the specified file as the kernel image.  Either Bkernel or
-Bbootloader must be specified for PV guests.
-
-=item Bramdisk=PATHNAME
-
-Load the specified file as the ramdisk.
-
 =item Bbootloader=PROGRAM
 
 Run CPROGRAM to find the kernel image and ramdisk to use.  Normally
 CPROGRAM would be Cpygrub, which is an emulation of
-grub/grub2/syslinux.
+grub/grub2/syslinux. Either Bkernel or Bbootloader must be specified
+for PV guests.
 
 =item Bbootloader_args=[ ARG, ARG, ...]
 
@@ -667,16 +690,6 @@ Append BARGs to the arguments to the Bbootloader
 program. Alternatively if the argument is a simple string then it will
 be split into words at whitespace (this second option is deprecated).
 
-=item Broot=STRING
-
-Append Broot=STRING to the kernel command line (Note: it is guest
-specific what meaning this has).
-
-=item Bextra=STRING
-
-Append BSTRING to the kernel command line. Note: it is guest
-specific what meaning this has).
-
 =item Be820_host=BOOLEAN
 
 Selects whether to expose the host e820 (memory map) to the guest via
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 459557d..3a1be8d 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -530,6 +530,21 @@ typedef struct libxl__ctx libxl_ctx;
  */
 #define LIBXL_HAVE_DEVICE_PCI_SEIZE 1
 
+/*
+ * LIBXL_HAVE_BUILDINFO_KERNEL
+ *
+ * If this is defined, then the libxl_domain_build_info structure will
+ * contain 'kernel', 'ramdisk', 'cmdline' fields. 'kernel' is a string
+ * to indicate kernel image location, 'ramdisk' is a string to indicate
+ * ramdisk location, 'cmdline' is a string to indicate the parameters which
+ * would be appended to kernel image.
+ *
+ * Both PV guest and HVM guest can use these fields for direct kernel boot.
+ * But for compatibility reason, u.pv.kernel, u.pv.ramdisk and u.pv.cmdline
+ * still exist.
+ */
+#define LIBXL_HAVE_BUILDINFO_KERNEL 1
+
 /* Functions annotated with LIBXL_EXTERNAL_CALLERS_ONLY may not be
  * called from within libxl itself. Callers outside libxl, who
  * do not #include

[Qemu-devel] [RFC PATCH V5 0/3] Support xen HVM direct kernel boot

2014-07-07 Thread Chunyan Liu

Updated current patch series for working with qemu-xen and default
BIOS (seabios), to make it in good shape. Stubdom support will be
continued.
  
xen side patches: 
* pass kernel/initrd/append parameters to qemu-dm
* add 'cmdline' in xl.cfg
qemu side patch: reuse load_linux() for xen HVM direct kernel boot.
Unlike pc_memory_init(), which does a lot of RAM allocation and
ROM/BIOS loading, the xen path only needs to initialize a basic
fw_cfg device. load_linux() stores the addresses in fw_cfg, and
linuxboot.bin/multiboot.bin retrieves them from there. After
load_linux() has run, linuxboot.bin/multiboot.bin is added to the
system option ROMs. SeaBIOS handles everything else.

Changes:
  xen side patch:
- add 'cmdline' in xl.cfg (as a separate patch)
- update u.pv.kernel compatibility work: fill in b_info.kernel from
  the old u.pv.kernel in libxl__domain_build_info_setdefault, and
  handle only b_info.kernel afterwards.
- update libxl.h to include LIBXL_HAVE_BUILDINFO_KERNEL only
  rather than three
  qemu side patch:
- no change to v4.

v4 is here:
http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00026.html

v3 is here:
https://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg04903.html

v2 is here:
http://thread.gmane.org/gmane.comp.emulators.qemu/277514

v1 is here:
http://lists.gnu.org/archive/html/qemu-devel/2014-05/msg06233.html

Chunyan Liu (3):
  xen: pass kernel initrd to qemu
  xl.cfg: add 'cmdline' in config file
  qemu: support xen HVM direct kernel boot

-- 
1.8.4.5

[Qemu-devel] [RFC PATCH V5 2/3] xl.cfg: add 'cmdline' in config file

2014-07-07 Thread Chunyan Liu

Currently xl.cfg uses 'root' and 'extra' to generate the kernel command
line. 'cmdline' is a more generic equivalent, so add 'cmdline' to
xl.cfg and prefer it. 'root' and 'extra' still work, but when
'cmdline' is specified they are ignored.

[HVM config example]
[snip]
builder=hvm
device_model_override=/home/cyliu/git/qemu/x86_64-softmmu/qemu-system-x86_64
kernel=/mnt/vmlinuz-3.0.13-0.27-default
ramdisk=/mnt/initrd-3.0.13-0.27-default
root=/dev/hda2
extra=console=tty0 console=ttyS0
[snip]

or:

[snip]
builder=hvm
device_model_override=/home/cyliu/git/qemu/x86_64-softmmu/qemu-system-x86_64
kernel=/mnt/vmlinuz-3.0.13-0.27-default
ramdisk=/mnt/initrd-3.0.13-0.27-default
cmdline=root=/dev/hda2 console=tty0 console=ttyS0
[snip]


Signed-off-by: Chunyan Liu cy...@suse.com
---
Changes:
  - add back the 'cmdline' in xl.cfg, but as separate patch

 docs/man/xl.cfg.pod.5|  7 +++
 tools/libxl/xl_cmdimpl.c | 20 ++--
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index c4a6589..cb5b76b 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -323,6 +323,13 @@ Load the specified file as the kernel image.
 
 Load the specified file as the ramdisk.
 
+=item B<cmdline="STRING">
+
+Append B<cmdline="STRING"> to the kernel command line. (Note: it is
+guest specific what meaning this has). It can replace B<root="STRING">
+plus B<extra="STRING"> and is preferred. When B<cmdline="STRING"> is set,
+B<root="STRING"> and B<extra="STRING"> will be ignored.
+
 =item Broot=STRING
 
 Append Broot=STRING to the kernel command line (Note: it is guest
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index d4cd50b..cfe13e3 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -693,19 +693,27 @@ static void parse_top_level_sdl_options(XLU_Config 
*config,
 static char *parse_cmdline(XLU_Config *config)
 {
 char *cmdline = NULL;
-const char *root = NULL, *extra = "";
+const char *root = NULL, *extra = NULL, *buf = NULL;
 
+xlu_cfg_get_string (config, "cmdline", &buf, 0);
 xlu_cfg_get_string (config, "root", &root, 0);
 xlu_cfg_get_string (config, "extra", &extra, 0);
 
-if (root) {
-if (asprintf(&cmdline, "root=%s %s", root, extra) == -1)
-cmdline = NULL;
+if (buf) {
+cmdline = strdup(buf);
+if (root || extra)
+fprintf(stderr, "Warning: ignoring root= and extra= "
+"in favour of cmdline=\n");
 } else {
-cmdline = strdup(extra);
+if (root) {
+if (asprintf(&cmdline, "root=%s %s", root, extra) == -1)
+cmdline = NULL;
+} else if (extra) {
+cmdline = strdup(extra);
+}
 }
 
-if ((root || extra) && !cmdline) {
+if ((buf || root || extra) && !cmdline) {
 fprintf(stderr, "Failed to allocate memory for cmdline\n");
 exit(1);
 }
-- 
1.8.4.5
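
The precedence rule in this patch ('cmdline' wins; otherwise 'root' plus 'extra'; otherwise 'extra' alone) can be tried outside libxl with a small stand-alone sketch. `build_cmdline()` and `dup_string()` are hypothetical helpers, not xl functions. One deliberate difference from the patch: a missing 'extra' is treated as an empty string when 'root' is set, which is an assumption made here so that a NULL pointer is never passed to the formatter.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: portable replacement for strdup(). */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *out = malloc(len);

    assert(out != NULL);   /* sketch only: allocation failure is fatal */
    memcpy(out, s, len);
    return out;
}

/*
 * Hypothetical helper: returns a malloc'ed kernel command line, or NULL
 * when none of the three options is set. The caller frees the result.
 */
static char *build_cmdline(const char *cmdline, const char *root,
                           const char *extra)
{
    if (cmdline) {
        if (root || extra) {
            fprintf(stderr, "Warning: ignoring root= and extra= "
                            "in favour of cmdline=\n");
        }
        return dup_string(cmdline);
    }
    if (root) {
        /* Assumption: an unset 'extra' behaves like an empty string. */
        const char *tail = extra ? extra : "";
        size_t len = strlen("root=") + strlen(root) + 1 + strlen(tail) + 1;
        char *out = malloc(len);

        assert(out != NULL);
        snprintf(out, len, "root=%s %s", root, tail);
        return out;
    }
    if (extra) {
        return dup_string(extra);
    }
    return NULL;
}
```

With the two config examples from the commit message, both forms produce the same string: "root=/dev/hda2 console=tty0 console=ttyS0".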

[Qemu-devel] [RFC PATCH V5 3/3] qemu: support xen hvm direct kernel boot

2014-07-07 Thread Chunyan Liu

qemu side patch to support xen HVM direct kernel boot:
if -kernel exists, calls xen_load_linux(), which will read kernel/initrd
and add a linuxboot.bin or multiboot.bin option rom. The
linuxboot.bin/multiboot.bin will load kernel/initrd and jump to execute
kernel directly. It's working when xen uses seabios.

While doing this work, I found that 'kvmvapic' is in the option_rom list;
it should not be there in the xen case. Set s->vapic_control = 0 in
xen_apic_realize() to handle that.

Signed-off-by: Chunyan Liu cy...@suse.com
Acked-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Acked-by: Michael S. Tsirkin m...@redhat.com
---
 hw/i386/pc.c   | 25 +
 hw/i386/pc_piix.c  |  7 +++
 hw/i386/xen/xen_apic.c |  1 +
 include/hw/i386/pc.h   |  5 +
 4 files changed, 38 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2cf22b1..9e58982 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1190,6 +1190,31 @@ void pc_acpi_init(const char *default_dsdt)
 }
 }
 
+FWCfgState *xen_load_linux(const char *kernel_filename,
+   const char *kernel_cmdline,
+   const char *initrd_filename,
+   ram_addr_t below_4g_mem_size,
+   PcGuestInfo *guest_info)
+{
+int i;
+FWCfgState *fw_cfg;
+
+assert(kernel_filename != NULL);
+
+fw_cfg = fw_cfg_init(BIOS_CFG_IOPORT, BIOS_CFG_IOPORT + 1, 0, 0);
+rom_set_fw(fw_cfg);
+
+load_linux(fw_cfg, kernel_filename, initrd_filename,
+   kernel_cmdline, below_4g_mem_size);
+for (i = 0; i < nb_option_roms; i++) {
+assert(!strcmp(option_rom[i].name, "linuxboot.bin") ||
+   !strcmp(option_rom[i].name, "multiboot.bin"));
+rom_add_option(option_rom[i].name, option_rom[i].bootindex);
+}
+guest_info->fw_cfg = fw_cfg;
+return fw_cfg;
+}
+
 FWCfgState *pc_memory_init(MachineState *machine,
MemoryRegion *system_memory,
ram_addr_t below_4g_mem_size,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 2dccb34..63e2198 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -180,6 +180,13 @@ static void pc_init1(MachineState *machine,
 fw_cfg = pc_memory_init(machine, system_memory,
 below_4g_mem_size, above_4g_mem_size,
 rom_memory, ram_memory, guest_info);
+} else if (machine->kernel_filename != NULL) {
+/* For xen HVM direct kernel boot, load linux here */
+fw_cfg = xen_load_linux(machine->kernel_filename,
+machine->kernel_cmdline,
+machine->initrd_filename,
+below_4g_mem_size,
+guest_info);
 }
 
 gsi_state = g_malloc0(sizeof(*gsi_state));
diff --git a/hw/i386/xen/xen_apic.c b/hw/i386/xen/xen_apic.c
index 63bb7f7..f5acd6a 100644
--- a/hw/i386/xen/xen_apic.c
+++ b/hw/i386/xen/xen_apic.c
@@ -40,6 +40,7 @@ static void xen_apic_realize(DeviceState *dev, Error **errp)
 {
 APICCommonState *s = APIC_COMMON(dev);
 
+s->vapic_control = 0;
 memory_region_init_io(&s->io_memory, OBJECT(s), &xen_apic_io_ops, s,
   "xen-apic-msi", APIC_SPACE_SIZE);
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 1c0c382..b47aaa9 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -187,6 +187,11 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t 
below_4g_mem_size,
 void pc_pci_as_mapping_init(Object *owner, MemoryRegion *system_memory,
 MemoryRegion *pci_address_space);
 
+FWCfgState *xen_load_linux(const char *kernel_filename,
+   const char *kernel_cmdline,
+   const char *initrd_filename,
+   ram_addr_t below_4g_mem_size,
+   PcGuestInfo *guest_info);
 FWCfgState *pc_memory_init(MachineState *machine,
MemoryRegion *system_memory,
ram_addr_t below_4g_mem_size,
-- 
1.8.4.5

Re: [Qemu-devel] [PATCH V2 for 2.1 2/3] memory: add errp parameter to memory_region_init_ram() and memory_region_init_ram_ptr()

2014-07-07 Thread Michael S. Tsirkin

On Mon, Jul 07, 2014 at 10:58:07AM +0800, Hu Tao wrote:
 This patch reintroduces memory_region_init_ram() and
 memory_region_init_ram_ptr() which are almost the same as the renamed
 ones in the previous patch, except that an errp parameter is introduced
 to let callers handle error.
 
 In hostmem-ram.c we call memory_region_init_ram() now rather than
 memory_region_init_ram_nofail() so that error can be handled.
 
 This patch fixes a problem where qemu simply exits when the monitor
 command object_add is used to add a memory backend whose size is far too
 large. In that case it is better to report an error and keep the guest running.
 
 The problem can be reproduced as follows:
 
 1. run qemu
 2. (monitor)object_add memory-backend-ram,size=10G,id=ram0
 
 Signed-off-by: Hu Tao hu...@cn.fujitsu.com
 ---
  backends/hostmem-ram.c  |  4 ++--
  exec.c  | 32 -
  hw/block/pflash_cfi01.c |  5 -
  hw/block/pflash_cfi02.c |  5 -
  include/exec/memory.h   | 39 +++-
  include/exec/ram_addr.h |  4 ++--
  memory.c| 53 
 +
  7 files changed, 122 insertions(+), 20 deletions(-)
 
 diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
 index 9d4960b..a67a134 100644
 --- a/backends/hostmem-ram.c
 +++ b/backends/hostmem-ram.c
 @@ -26,8 +26,8 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error 
 **errp)
  }
  
  path = object_get_canonical_path_component(OBJECT(backend));
 -memory_region_init_ram_nofail(&backend->mr, OBJECT(backend), path,
 -  backend->size);
 +memory_region_init_ram(&backend->mr, OBJECT(backend), path,
 +   backend->size, errp);
  g_free(path);
  }
  
 diff --git a/exec.c b/exec.c
 index 5a2a25e..ca7741b 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -1224,7 +1224,7 @@ static int memory_try_enable_merging(void *addr, size_t 
 len)
  return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
  }
  
 -static ram_addr_t ram_block_add(RAMBlock *new_block)
 +static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp)
  {
  RAMBlock *block;
  ram_addr_t old_ram_size, new_ram_size;
 @@ -1241,9 +1241,11 @@ static ram_addr_t ram_block_add(RAMBlock *new_block)
  } else {
  new_block->host = phys_mem_alloc(new_block->length);
  if (!new_block->host) {
 -fprintf(stderr, "Cannot set up guest memory '%s': %s\n",
 -new_block->mr->name, strerror(errno));
 -exit(1);
 +error_setg_errno(errp, errno,
 + "cannot set up guest memory '%s'",
 + new_block->mr->name);
 +qemu_mutex_unlock_ramlist();
 +return -1;
  }
  memory_try_enable_merging(new_block->host, new_block->length);
  }
 @@ -1294,6 +1296,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
 MemoryRegion *mr,
  Error **errp)
  {
  RAMBlock *new_block;
 +ram_addr_t addr;
  
  if (xen_enabled()) {
  error_setg(errp, "-mem-path not supported with Xen");
 @@ -1323,14 +1326,20 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
 MemoryRegion *mr,
  return -1;
  }
  
 -return ram_block_add(new_block);
 +addr = ram_block_add(new_block, errp);
 +if (errp && *errp) {
 +g_free(new_block);
 +return -1;
 +}
 +return addr;
  }
  #endif
  
  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
 -   MemoryRegion *mr)
 +   MemoryRegion *mr, Error **errp)
  {
  RAMBlock *new_block;
 +ram_addr_t addr;
  
  size = TARGET_PAGE_ALIGN(size);
  new_block = g_malloc0(sizeof(*new_block));
 @@ -1341,12 +1350,17 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, 
 void *host,
  if (host) {
  new_block->flags |= RAM_PREALLOC;
  }
 -return ram_block_add(new_block);
 +addr = ram_block_add(new_block, errp);
 +if (errp && *errp) {
 +g_free(new_block);
 +return -1;
 +}
 +return addr;
  }
  
 -ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
 +ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr, Error **errp)
  {
 -return qemu_ram_alloc_from_ptr(size, NULL, mr);
 +return qemu_ram_alloc_from_ptr(size, NULL, mr, errp);
  }
  
  void qemu_ram_free_from_ptr(ram_addr_t addr)
 diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
 index f9507b4..92b8b87 100644
 --- a/hw/block/pflash_cfi01.c
 +++ b/hw/block/pflash_cfi01.c
 @@ -770,7 +770,10 @@ static void pflash_cfi01_realize(DeviceState *dev, Error 
 **errp)
  memory_region_init_rom_device(
  pfl-mem, OBJECT(dev),
  pfl-be ? pflash_cfi01_ops_be : pflash_cfi01_ops_le, pfl,
 -pfl-name, total_len);
 +
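
The core idea of the quoted patch, reporting an allocation failure through an errp out-parameter instead of calling exit(1), can be sketched without any QEMU headers. Everything here is a simplified assumption: `Error`, `set_error()` and `alloc_backend()` are hypothetical stand-ins for QEMU's Error API and RAM allocation path, and the `limit` parameter merely simulates the host refusing an oversized request.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for QEMU's Error object. */
typedef struct Error {
    char msg[128];
} Error;

/* Store an error for the caller; a NULL errp means "caller ignores it". */
static void set_error(Error **errp, int err, const char *what)
{
    Error *e;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);   /* never overwrite a pending error */
    e = malloc(sizeof(*e));
    if (!e) {
        abort();
    }
    snprintf(e->msg, sizeof(e->msg), "%s: %s", what, strerror(err));
    *errp = e;
}

/*
 * Hypothetical allocator: returns the buffer, or NULL with *errp set.
 * It never terminates the process, so the caller can keep running.
 */
static void *alloc_backend(size_t size, size_t limit, Error **errp)
{
    void *p = size <= limit ? malloc(size) : NULL;

    if (!p) {
        set_error(errp, ENOMEM, "cannot set up guest memory");
        return NULL;
    }
    return p;
}
```

A monitor command handler built on this shape can print the message and return, leaving the guest running, which is the behaviour the commit message asks for.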

Re: [Qemu-devel] [PATCH for-2.1 2/2] qdev: Fix crash when using non-device class name on -global

2014-07-07 Thread Igor Mammedov

On Thu,  3 Jul 2014 16:45:35 -0300
Eduardo Habkost ehabk...@redhat.com wrote:

 This fixes the following crash:
 
 $ qemu-system-x86_64 -global container.xxx=y
 hw/core/qdev-properties-system.c:399:qdev_add_one_global: Object 
 0x7f7eff234100 is not an instance of type device
 Aborted (core dumped)
 
 New behavior will be to just warn, just like when non-existing class
 names are used:
 
 $ qemu-system-x86_64 -global container.xxx=y
 qemu-system-x86_64: Warning: -global container.xxx=y not used
 
 Signed-off-by: Eduardo Habkost ehabk...@redhat.com
 ---
  hw/core/qdev-properties-system.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/hw/core/qdev-properties-system.c 
 b/hw/core/qdev-properties-system.c
 index 8e140af..ae0900f 100644
 --- a/hw/core/qdev-properties-system.c
 +++ b/hw/core/qdev-properties-system.c
 @@ -394,7 +394,8 @@ static int qdev_add_one_global(QemuOpts *opts, void 
 *opaque)
  g->driver   = qemu_opt_get(opts, "driver");
  g->property = qemu_opt_get(opts, "property");
  g->value= qemu_opt_get(opts, "value");
 -oc = object_class_by_name(g->driver);
 +oc = object_class_dynamic_cast(object_class_by_name(g->driver),
 +   TYPE_DEVICE);
  if (oc) {
  DeviceClass *dc = DEVICE_CLASS(oc);
  

Reviewed-by: Igor Mammedov imamm...@redhat.com
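
Why wrapping the lookup in a dynamic cast avoids the abort can be shown with a toy class table. This is a sketch under stated assumptions, not QOM: `Class`, `class_by_name()`, `class_dynamic_cast()` and `find_device_class()` are hypothetical, and the four classes are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single-inheritance class record. */
typedef struct Class {
    const char *name;
    const struct Class *parent;
} Class;

static const Class class_object    = { "object",    NULL };
static const Class class_device    = { "device",    &class_object };
static const Class class_container = { "container", &class_object };
static const Class class_e1000     = { "e1000",     &class_device };

static const Class *all_classes[] = {
    &class_object, &class_device, &class_container, &class_e1000,
};

/* Look a class up by name; NULL when no such class exists. */
static const Class *class_by_name(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(all_classes) / sizeof(all_classes[0]); i++) {
        if (strcmp(all_classes[i]->name, name) == 0) {
            return all_classes[i];
        }
    }
    return NULL;
}

/* Return c if it is 'type' or derives from it, otherwise NULL. */
static const Class *class_dynamic_cast(const Class *c, const char *type)
{
    const Class *p;

    for (p = c; p != NULL; p = p->parent) {
        if (strcmp(p->name, type) == 0) {
            return c;
        }
    }
    return NULL;
}

/* Existing non-device classes and unknown names both yield NULL. */
static const Class *find_device_class(const char *name)
{
    return class_dynamic_cast(class_by_name(name), "device");
}
```

Because both failure cases collapse to NULL, the caller's existing "if (oc)" check covers them, and "container" is handled the same way as a name that does not exist.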

Re: [Qemu-devel] virtualize sparc developer workstation?

2014-07-07 Thread Markus Armbruster

Cc'ing the SPARC maintainer.

dennis luehring dl.so...@gmx.net writes:

 i want to virtualize (under a linux x86 host) my noisy sparc workstation
 with the help of qemu and want to know if its possible to
 get an system image to run or what other options available
 for sparc virtualization

 im developing a low-traffic network communication software
 and sparc is one of my test platforms

 is the sparc emulation good enough for
 (low-traffic) network communication tests?

 my current system specs:
 -SUN Ultra 10 SPARC Workstation
 -384 MB RAM
 -Operating System: Solaris 10 November 2006 - SunOS Release 5.10
 Generic_118833-33 64-bit
 -Desktop: CDE 1.6.3, X11 Version 6.6.2
 -Compiler: SunStudio 12 - Sun C/C++ v5.9 2007/05/03 (and gcc v.3.4.6)
 -Workstation-Info: SUNW,Ultra-5_10;sparc;sun4u

 it would be nice to have the desktop running
 but i can compile & test my software fully on command line

 questions are:
 -can i boot an image of my machine into qemu?
 -what bios do i need - OpenBios, OpenBoot, original (how can i get
 this from my machine?)
 -use qemu git head or other version?

 thx for any help

Re: [Qemu-devel] [PATCH] qmp: show QOM properties in device-list-properties

2014-07-07 Thread Paolo Bonzini


Il 06/07/2014 21:03, Cole Robinson ha scritto:

On 07/05/2014 05:14 AM, Paolo Bonzini wrote:

Il 20/05/2014 14:29, Stefan Hajnoczi ha scritto:

Devices can use a mix of qdev and QOM properties.  Currently only the
qdev properties are displayed by device-list-properties.

This patch extends the property enumeration algorithm to also display
QOM properties (excluding the implicit type, realized,
hotpluggable, and parent_bus properties).

When a qdev property exists, use the qdev type name to preserve
backwards compatibility.  QOM type names can be different for bool (qdev
on/off) and str (used by qdev pointers).

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com




Stefan, was this never applied?



I assume you CC'd me in reference to the bug I reported:

https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00882.html

I tested this patch, but it doesn't fix the specific bit I mentioned (lack of
'bootindex' in -device virtio-blk,? )


Yes, it doesn't, but does libvirt work then?  I'm not sure if libvirt 
still uses -device or rather device-list-properties (which lets you 
start a single QEMU process and do multiple probes).


Paolo

[Qemu-devel] [PATCH 0/7] machvirt dynamic sysbus device instantiation

2014-07-07 Thread Eric Auger

This RFC enables machvirt to dynamically instantiate sysbus devices
from command line.

the RFC relies on
- Alex Graf's Dynamic sysbus device allocation support
  http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html

On top of Alex's sysbus device modifications, the RFC reuses the code he
developed in the PPC e500.c machine file. That code was first moved into a
separate module (hw/misc/platform_devices), and machvirt was then
adapted to call those helper routines.

It is also proposed to add a new method to SysBusDeviceClass, named
fdt_add_node, whose role is to create the device tree node. It is
meant to be specialized by devices that support dynamic instantiation.

In practice two specializations are needed: one for the device and one
for the board. PlatformDevtreeData is assumed to provide for the board
adaptation. However, the latter may need to be augmented: typically
some clock handles may need to be provided.

Best Regards

Eric

Eric Auger (7):
  hw/misc/platform_devices: helpers for dynamic instantiation of
platform devices
  hw/arm/boot: load_dtb becomes non static
  hw/arm/virt: add new add_fdt_xxx_node functions
  hw/arm/virt: Support dynamically spawned sysbus devices
  hw/core/sysbus: add fdt_add_node method
  hw/misc/platform_devices: add call to sysbus fdt_add_node
  hw/misc/platform_devices: Add platform_bus_base to PlatformDevtreeData

 hw/arm/boot.c  |   2 +-
 hw/arm/virt.c  | 125 -
 hw/core/sysbus.c   |  12 +++
 hw/misc/Makefile.objs  |   1 +
 hw/misc/platform_devices.c | 215 +
 include/hw/arm/arm.h   |   1 +
 include/hw/misc/platform_devices.h |  62 +++
 include/hw/sysbus.h|   2 +
 8 files changed, 395 insertions(+), 25 deletions(-)
 create mode 100644 hw/misc/platform_devices.c
 create mode 100644 include/hw/misc/platform_devices.h

-- 
1.8.3.2

[Qemu-devel] [PATCH 1/7] hw/misc/platform_devices: helpers for dynamic instantiation of platform devices

2014-07-07 Thread Eric Auger

This new module implements routines which help in dynamic instantiation
of sysbus devices. Machine files can use those generic routines.

---

Dynamic sysbus device allocation fully written by Alex Graf.

[Eric Auger]
Those functions were initially in ppc e500 machine file. Now moved to a
separate module.

PPCE500Params is replaced by a generic struct named PlatformParams

Signed-off-by: Alexander Graf ag...@suse.de
Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/misc/Makefile.objs  |   1 +
 hw/misc/platform_devices.c | 217 +
 include/hw/misc/platform_devices.h |  61 +++
 3 files changed, 279 insertions(+)
 create mode 100644 hw/misc/platform_devices.c
 create mode 100644 include/hw/misc/platform_devices.h

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index e47fea8..d081606 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -40,3 +40,4 @@ obj-$(CONFIG_SLAVIO) += slavio_misc.o
 obj-$(CONFIG_ZYNQ) += zynq_slcr.o
 
 obj-$(CONFIG_PVPANIC) += pvpanic.o
+obj-y += platform_devices.o
diff --git a/hw/misc/platform_devices.c b/hw/misc/platform_devices.c
new file mode 100644
index 000..96ab272
--- /dev/null
+++ b/hw/misc/platform_devices.c
@@ -0,0 +1,217 @@
+#include "hw/misc/platform_devices.h"
+#include "hw/sysbus.h"
+#include "qemu/error-report.h"
+
+#define PAGE_SHIFT 12
+
+int sysbus_device_create_devtree(Object *obj, void *opaque)
+{
+PlatformDevtreeData *data = opaque;
+Object *dev;
+SysBusDevice *sbdev;
+bool matched = false;
+
+dev = object_dynamic_cast(obj, TYPE_SYS_BUS_DEVICE);
+sbdev = (SysBusDevice *)dev;
+
+if (!sbdev) {
+/* Container, traverse it for children */
+return object_child_foreach(obj, sysbus_device_create_devtree, data);
+}
+
+if (!matched) {
+error_report("Device %s is not supported by this machine yet.",
+ qdev_fw_name(DEVICE(dev)));
+exit(1);
+}
+
+return 0;
+}
+
+void platform_bus_create_devtree(PlatformParams *params, void *fdt,
+const char *mpic)
+{
+gchar *node = g_strdup_printf("/platform@%"PRIx64,
+  params->platform_bus_base);
+const char platcomp[] = "qemu,platform\0simple-bus";
+PlatformDevtreeData data;
+Object *container;
+uint64_t addr = params->platform_bus_base;
+uint64_t size = params->platform_bus_size;
+int irq_start = params->platform_bus_first_irq;
+
+/* Create a /platform node that we can put all devices into */
+
+qemu_fdt_add_subnode(fdt, node);
+qemu_fdt_setprop(fdt, node, "compatible", platcomp, sizeof(platcomp));
+
+/* Our platform bus region is less than 32bit big, so 1 cell is enough for
+   address and size */
+qemu_fdt_setprop_cells(fdt, node, "#size-cells", 1);
+qemu_fdt_setprop_cells(fdt, node, "#address-cells", 1);
+qemu_fdt_setprop_cells(fdt, node, "ranges", 0, addr >> 32, addr, size);
+
+qemu_fdt_setprop_phandle(fdt, node, "interrupt-parent", mpic);
+
+/* Loop through all devices and create nodes for known ones */
+data.fdt = fdt;
+data.mpic = mpic;
+data.irq_start = irq_start;
+data.node = node;
+
+container = container_get(qdev_get_machine(), "/peripheral");
+sysbus_device_create_devtree(container, &data);
+container = container_get(qdev_get_machine(), "/peripheral-anon");
+sysbus_device_create_devtree(container, &data);
+
+g_free(node);
+}
+
+int platform_bus_map_irq(PlatformParams *params, SysBusDevice *sbdev,
+ int n, unsigned long *used_irqs,
+ qemu_irq *platform_irqs)
+{
+int max_irqs = params->platform_bus_num_irqs;
+char *prop = g_strdup_printf("irq[%d]", n);
+int irqn = object_property_get_int(OBJECT(sbdev), prop, NULL);
+
+if (irqn == SYSBUS_DYNAMIC) {
+/* Find the first available IRQ */
+irqn = find_first_zero_bit(used_irqs, max_irqs);
+}
+
+if ((irqn >= max_irqs) || test_and_set_bit(irqn, used_irqs)) {
+hw_error("IRQ %d is already allocated or no free IRQ left", irqn);
+}
+
+sysbus_connect_irq(sbdev, n, platform_irqs[irqn]);
+object_property_set_int(OBJECT(sbdev), irqn, prop, NULL);
+
+g_free(prop);
+return 0;
+}
+
+int platform_bus_map_mmio(PlatformParams *params, SysBusDevice *sbdev,
+  int n, unsigned long *used_mem,
+  MemoryRegion *pmem)
+{
+MemoryRegion *device_mem = sbdev->mmio[n].memory;
+uint64_t size = memory_region_size(device_mem);
+uint64_t page_size = (1 << PAGE_SHIFT);
+uint64_t page_mask = page_size - 1;
+uint64_t size_pages = (size + page_mask) >> PAGE_SHIFT;
+uint64_t max_size = params->platform_bus_size;
+uint64_t max_pages = max_size >> PAGE_SHIFT;
+char *prop = g_strdup_printf("mmio[%d]", n);
+hwaddr addr = object_property_get_int(OBJECT(sbdev), prop, NULL);
+int page;
+int i;
+
+page = addr
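
The IRQ assignment in platform_bus_map_irq() above (use the requested line, or the first free one when the device asks for a dynamic IRQ, and refuse a line that is taken or out of range) can be sketched with a plain bitmap. The names below are hypothetical: `DYNAMIC_IRQ` stands in for SYSBUS_DYNAMIC with an assumed value, and `alloc_irq()` returns -1 on failure where the patch calls hw_error().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DYNAMIC_IRQ   (-1)  /* hypothetical stand-in for SYSBUS_DYNAMIC */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Index of the first clear bit, or nbits when every bit is set. */
static int first_zero_bit(const unsigned long *map, int nbits)
{
    int i;

    for (i = 0; i < nbits; i++) {
        if (!(map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))) {
            return i;
        }
    }
    return nbits;
}

/* Set a bit and report whether it was already set. */
static int test_and_set(unsigned long *map, int bit)
{
    unsigned long mask = 1UL << (bit % BITS_PER_WORD);
    unsigned long *word = &map[bit / BITS_PER_WORD];
    int was_set = (*word & mask) != 0;

    *word |= mask;
    return was_set;
}

/* Returns the IRQ number that was reserved, or -1 when none is available. */
static int alloc_irq(unsigned long *used, int max_irqs, int requested)
{
    int irqn = requested;

    assert(used != NULL && max_irqs > 0);
    if (irqn == DYNAMIC_IRQ) {
        irqn = first_zero_bit(used, max_irqs);
    }
    /* Check the range first so the bitmap is never indexed out of bounds. */
    if (irqn < 0 || irqn >= max_irqs || test_and_set(used, irqn)) {
        return -1;
    }
    return irqn;
}
```

Checking the range before test_and_set matters: when the bitmap is full, the first-free search returns max_irqs, and that value must be rejected rather than used as an index.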

[Qemu-devel] [PATCH 4/7] hw/arm/virt: Support dynamically spawned sysbus devices

2014-07-07 Thread Eric Auger

Allows sysbus devices to be instantiated from command line by
using -device option

---

Inspired from what Alex Graf did in ppc e500
https://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00012.html

Signed-off-by: Alexander Graf ag...@suse.de
Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/arm/virt.c | 58 +-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index eeecdbf..3a21db4 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -40,6 +40,8 @@
 #include "exec/address-spaces.h"
 #include "qemu/bitops.h"
 #include "qemu/error-report.h"
+#include "hw/misc/platform_devices.h"
+#include "hw/vfio/vfio-platform.h"
 
 #define NUM_VIRTIO_TRANSPORTS 32
 
@@ -57,6 +59,14 @@
 #define GIC_FDT_IRQ_PPI_CPU_START 8
 #define GIC_FDT_IRQ_PPI_CPU_WIDTH 8
 
+#define MACHVIRT_PLATFORM_BASE 0xa004000
+#define MACHVIRT_PLATFORM_HOLE (128ULL * 1024 * 1024) /* 128 MB */
+#define MACHVIRT_PLATFORM_PAGE_SHIFT   12
+#define MACHVIRT_PLATFORM_HOLE_PAGES   (MACHVIRT_PLATFORM_HOLE >> \
+MACHVIRT_PLATFORM_PAGE_SHIFT)
+#define MACHVIRT_PLATFORM_FIRST_IRQ48
+#define MACHVIRT_PLATFORM_NUM_IRQS 20
+
 enum {
 VIRT_FLASH,
 VIRT_MEM,
@@ -66,6 +76,7 @@ enum {
 VIRT_UART,
 VIRT_MMIO,
 VIRT_RTC,
+VIRT_PLATFORM,
 };
 
 typedef struct MemMapEntry {
@@ -108,6 +119,7 @@ static const MemMapEntry a15memmap[] = {
 [VIRT_MMIO] = { 0xa00, 0x200 },
 /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
 /* 0x1000 .. 0x4000 reserved for PCI */
+[VIRT_PLATFORM] = {MACHVIRT_PLATFORM_BASE , MACHVIRT_PLATFORM_HOLE},
 [VIRT_MEM] = { 0x4000, 30ULL * 1024 * 1024 * 1024 },
 };
 
@@ -115,6 +127,15 @@ static const int a15irqmap[] = {
 [VIRT_UART] = 1,
 [VIRT_RTC] = 2,
 [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
+[VIRT_PLATFORM] = MACHVIRT_PLATFORM_FIRST_IRQ,
+};
+
+static PlatformParams machvirt_params = {
+.has_platform_bus = true,
+.platform_bus_base = MACHVIRT_PLATFORM_BASE,
+.platform_bus_size = MACHVIRT_PLATFORM_HOLE,
+.platform_bus_first_irq = MACHVIRT_PLATFORM_FIRST_IRQ,
+.platform_bus_num_irqs = MACHVIRT_PLATFORM_NUM_IRQS
 };
 
 static VirtBoardInfo machines[] = {
@@ -437,6 +458,18 @@ static void create_virtio_devices(const VirtBoardInfo 
*vbi, qemu_irq *pic)
 fdt_add_virtio_nodes(vbi);
 }
 
+static void machvirt_prep_device_tree(VirtBoardInfo *vbi)
+{
+create_fdt(vbi);
+fdt_add_timer_nodes(vbi);
+fdt_add_cpu_nodes(vbi);
+fdt_add_psci_node(vbi);
+fdt_add_gic_node(vbi);
+fdt_add_uart_node(vbi);
+fdt_add_rtc_node(vbi);
+fdt_add_virtio_nodes(vbi);
+}
+
 static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
 {
 const VirtBoardInfo *board = (const VirtBoardInfo *)binfo;
@@ -445,14 +478,27 @@ static void *machvirt_dtb(const struct arm_boot_info 
*binfo, int *fdt_size)
 return board->fdt;
 }
 
+static void machvirt_reset_device_tree(void *opaque)
+{
+VirtBoardInfo *board = (VirtBoardInfo *)opaque;
+struct arm_boot_info *info = &board->bootinfo;
+hwaddr dtb_start = QEMU_ALIGN_UP(info->initrd_start + info->initrd_size,
+ 4096);
+machvirt_prep_device_tree(board);
+platform_bus_create_devtree(&machvirt_params, board->fdt, "/intc");
+
+load_dtb(dtb_start, info);
+}
+
 static void machvirt_init(MachineState *machine)
 {
-qemu_irq pic[NUM_IRQS];
+qemu_irq *pic = g_new(qemu_irq, NUM_IRQS);
 MemoryRegion *sysmem = get_system_memory();
 int n;
 MemoryRegion *ram = g_new(MemoryRegion, 1);
 const char *cpu_model = machine->cpu_model;
 VirtBoardInfo *vbi;
+PlatformBusNotifier *notifier;
 
 if (!cpu_model) {
 cpu_model = "cortex-a15";
@@ -526,6 +572,13 @@ static void machvirt_init(MachineState *machine)
  */
 create_virtio_devices(vbi, pic);
 
+notifier = g_new(PlatformBusNotifier, 1);
+notifier->notifier.notify = platform_bus_init_notify;
+notifier->address_space_mem = sysmem;
+notifier->mpic = pic;
+notifier->params = machvirt_params;
+qemu_add_machine_init_done_notifier(&notifier->notifier);
+
 vbi->bootinfo.ram_size = machine->ram_size;
 vbi->bootinfo.kernel_filename = machine->kernel_filename;
 vbi->bootinfo.kernel_cmdline = machine->kernel_cmdline;
@@ -535,6 +588,8 @@ static void machvirt_init(MachineState *machine)
 vbi->bootinfo.loader_start = vbi->memmap[VIRT_MEM].base;
 vbi->bootinfo.get_dtb = machvirt_dtb;
 arm_load_kernel(ARM_CPU(first_cpu), &vbi->bootinfo);
+
+qemu_register_reset(machvirt_reset_device_tree, vbi);
 }
 
 static QEMUMachine machvirt_a15_machine = {
@@ -542,6 +597,7 @@ static QEMUMachine machvirt_a15_machine = {
 .desc = "ARM Virtual Machine",
 .init = machvirt_init,
 .max_cpus = 4,
+.has_dynamic_sysbus = true,
 };
 
 static void machvirt_machine_init(void)

[Qemu-devel] [PATCH 3/7] hw/arm/virt: add new add_fdt_xxx_node functions

2014-07-07 Thread Eric Auger

Create new functions:
- add_fdt_uart_node
- add_fdt_rtc_node
- add_fdt_virtio_nodes

They will be used for dynamic sysbus instantiation.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/arm/virt.c | 67 +++
 1 file changed, 44 insertions(+), 23 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 405c61d..eeecdbf 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -330,18 +330,15 @@ static void create_gic(const VirtBoardInfo *vbi, qemu_irq 
*pic)
 fdt_add_gic_node(vbi);
 }
 
-static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic)
+static void fdt_add_uart_node(const VirtBoardInfo *vbi)
 {
-char *nodename;
 hwaddr base = vbi->memmap[VIRT_UART].base;
 hwaddr size = vbi->memmap[VIRT_UART].size;
 int irq = vbi->irqmap[VIRT_UART];
 const char compat[] = "arm,pl011\0arm,primecell";
 const char clocknames[] = "uartclk\0apb_pclk";
+char *nodename = g_strdup_printf("/pl011@%" PRIx64, base);
 
-sysbus_create_simple("pl011", base, pic[irq]);
-
-nodename = g_strdup_printf("/pl011@%" PRIx64, base);
 qemu_fdt_add_subnode(vbi->fdt, nodename);
 /* Note that we can't use setprop_string because of the embedded NUL */
 qemu_fdt_setprop(vbi->fdt, nodename, "compatible",
@@ -358,17 +355,23 @@ static void create_uart(const VirtBoardInfo *vbi, 
qemu_irq *pic)
 g_free(nodename);
 }
 
-static void create_rtc(const VirtBoardInfo *vbi, qemu_irq *pic)
+static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic)
+{
+hwaddr base = vbi->memmap[VIRT_UART].base;
+int irq = vbi->irqmap[VIRT_UART];
+
+sysbus_create_simple("pl011", base, pic[irq]);
+fdt_add_uart_node(vbi);
+}
+
+static void fdt_add_rtc_node(const VirtBoardInfo *vbi)
 {
-char *nodename;
 hwaddr base = vbi->memmap[VIRT_RTC].base;
 hwaddr size = vbi->memmap[VIRT_RTC].size;
 int irq = vbi->irqmap[VIRT_RTC];
 const char compat[] = "arm,pl031\0arm,primecell";
+char *nodename = g_strdup_printf("/pl031@%" PRIx64, base);
 
-sysbus_create_simple("pl031", base, pic[irq]);
-
-nodename = g_strdup_printf("/pl031@%" PRIx64, base);
 qemu_fdt_add_subnode(vbi->fdt, nodename);
 qemu_fdt_setprop(vbi->fdt, nodename, "compatible", compat, sizeof(compat));
 qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
@@ -381,22 +384,20 @@ static void create_rtc(const VirtBoardInfo *vbi, qemu_irq 
*pic)
 g_free(nodename);
 }
 
-static void create_virtio_devices(const VirtBoardInfo *vbi, qemu_irq *pic)
+static void create_rtc(const VirtBoardInfo *vbi, qemu_irq *pic)
 {
-int i;
-hwaddr size = vbi->memmap[VIRT_MMIO].size;
+hwaddr base = vbi->memmap[VIRT_RTC].base;
+int irq = vbi->irqmap[VIRT_RTC];
 
-/* Note that we have to create the transports in forwards order
- * so that command line devices are inserted lowest address first,
- * and then add dtb nodes in reverse order so that they appear in
- * the finished device tree lowest address first.
- */
-for (i = 0; i < NUM_VIRTIO_TRANSPORTS; i++) {
-int irq = vbi->irqmap[VIRT_MMIO] + i;
-hwaddr base = vbi->memmap[VIRT_MMIO].base + i * size;
+sysbus_create_simple("pl031", base, pic[irq]);
 
-sysbus_create_simple("virtio-mmio", base, pic[irq]);
-}
+fdt_add_rtc_node(vbi);
+}
+
+static void fdt_add_virtio_nodes(const VirtBoardInfo *vbi)
+{
+int i;
+hwaddr size = vbi->memmap[VIRT_MMIO].size;
 
 for (i = NUM_VIRTIO_TRANSPORTS - 1; i >= 0; i--) {
 char *nodename;
@@ -416,6 +417,26 @@ static void create_virtio_devices(const VirtBoardInfo 
*vbi, qemu_irq *pic)
 }
 }
 
+static void create_virtio_devices(const VirtBoardInfo *vbi, qemu_irq *pic)
+{
+int i;
+hwaddr size = vbi->memmap[VIRT_MMIO].size;
+
+/* Note that we have to create the transports in forwards order
+ * so that command line devices are inserted lowest address first,
+ * and then add dtb nodes in reverse order so that they appear in
+ * the finished device tree lowest address first.
+ */
+for (i = 0; i < NUM_VIRTIO_TRANSPORTS; i++) {
+int irq = vbi->irqmap[VIRT_MMIO] + i;
+hwaddr base = vbi->memmap[VIRT_MMIO].base + i * size;
+
+sysbus_create_simple("virtio-mmio", base, pic[irq]);
+}
+
+fdt_add_virtio_nodes(vbi);
+}
+
 static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
 {
 const VirtBoardInfo *board = (const VirtBoardInfo *)binfo;
-- 
1.8.3.2

[Qemu-devel] [PATCH 2/7] hw/arm/boot: load_dtb becomes non static

2014-07-07 Thread Eric Auger

load_dtb will be used by machvirt for dynamic instantiation of
platform devices
---
 hw/arm/boot.c| 2 +-
 include/hw/arm/arm.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 3d1f4a2..314bbfd 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -312,7 +312,7 @@ static void set_kernel_args_old(const struct arm_boot_info 
*info)
 }
 }
 
-static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo)
+int load_dtb(hwaddr addr, const struct arm_boot_info *binfo)
 {
 void *fdt = NULL;
 int size, rc;
diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
index cbbf4ca..fe58dc0 100644
--- a/include/hw/arm/arm.h
+++ b/include/hw/arm/arm.h
@@ -68,6 +68,7 @@ struct arm_boot_info {
 hwaddr entry;
 };
 void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info);
+int load_dtb(hwaddr addr, const struct arm_boot_info *binfo);
 
 /* Multiplication factor to convert from system clock ticks to qemu timer
ticks.  */
-- 
1.8.3.2

[Qemu-devel] [PATCH 5/7] hw/core/sysbus: add fdt_add_node method

2014-07-07 Thread Eric Auger

This method is meant to be called on sysbus device dynamic
instantiation (-device option). Devices that support this
kind of instantiation must implement this method.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/core/sysbus.c| 12 
 include/hw/sysbus.h |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
index aacc446..c1c0009 100644
--- a/hw/core/sysbus.c
+++ b/hw/core/sysbus.c
@@ -289,11 +289,23 @@ MemoryRegion *sysbus_address_space(SysBusDevice *dev)
 return get_system_memory();
 }
 
+/*
+ * to be specialized in sysbus devices that support dynamic instantiation
+ */
+void sysbus_fdt_add_node(SysBusDevice *dev, void *data)
+{
+error_report("Dynamic instantiation of Device %s "
+ "is not implemented",
+ qdev_fw_name(DEVICE(dev)));
+}
+
 static void sysbus_device_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *k = DEVICE_CLASS(klass);
+SysBusDeviceClass *sbdevk = SYS_BUS_DEVICE_CLASS(klass);
 k->init = sysbus_device_init;
 k->bus_type = TYPE_SYSTEM_BUS;
+sbdevk->fdt_add_node = sysbus_fdt_add_node;
 }
 
 static const TypeInfo sysbus_device_type_info = {
diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
index 533184a..df514f9 100644
--- a/include/hw/sysbus.h
+++ b/include/hw/sysbus.h
@@ -40,6 +40,7 @@ typedef struct SysBusDeviceClass {
 /* public */
 
 int (*init)(SysBusDevice *dev);
+void (*fdt_add_node)(SysBusDevice *dev, void *data);
 } SysBusDeviceClass;
 
 struct SysBusDevice {
@@ -73,6 +74,7 @@ void sysbus_mmio_map_overlap(SysBusDevice *dev, int n, hwaddr 
addr,
 void sysbus_add_io(SysBusDevice *dev, hwaddr addr,
MemoryRegion *mem);
 void sysbus_del_io(SysBusDevice *dev, MemoryRegion *mem);
+void sysbus_fdt_add_node(SysBusDevice *dev, void *mem);
 MemoryRegion *sysbus_address_space(SysBusDevice *dev);
 
 /* Legacy helper function for creating devices.  */
-- 
1.8.3.2
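
The default-method arrangement in the patch above can be sketched outside of QEMU roughly as follows. This is only an illustration: Device, DeviceClass, plain_class and demo_class are hypothetical stand-ins for SysBusDevice and SysBusDeviceClass, and unlike the patch (where fdt_add_node returns void) the method here returns an int so that the two outcomes can be told apart.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for SysBusDevice / SysBusDeviceClass. */
typedef struct Device Device;

typedef struct DeviceClass {
    const char *name;
    /* Returns 0 on success, -1 if the device cannot describe itself. */
    int (*fdt_add_node)(Device *dev, void *data);
} DeviceClass;

struct Device {
    const DeviceClass *klass;
};

/* Default method: dynamic instantiation is not implemented. */
int default_fdt_add_node(Device *dev, void *data)
{
    (void)data;
    fprintf(stderr, "Dynamic instantiation of device %s is not implemented\n",
            dev->klass->name);
    return -1;
}

/* A device that supports dynamic instantiation overrides the method. */
int demo_fdt_add_node(Device *dev, void *data)
{
    int *nodes_added = data;

    (void)dev;
    (*nodes_added)++;   /* stands in for writing a device tree node */
    return 0;
}

const DeviceClass plain_class = { "plain-device", default_fdt_add_node };
const DeviceClass demo_class  = { "demo-device",  demo_fdt_add_node };

/* Caller side: dispatch through the class instead of matching device types. */
int create_devtree_node(Device *dev, void *data)
{
    assert(dev && dev->klass && dev->klass->fdt_add_node);
    return dev->klass->fdt_add_node(dev, data);
}
```

The point of the default is that a device which does not override the method fails with a clear message rather than being silently skipped.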

[Qemu-devel] [PATCH 6/7] hw/misc/platform_devices: add call to sysbus fdt_add_node

2014-07-07 Thread Eric Auger

Creation of the node in the device tree relies on the new sysbus
fdt_add_node method.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/misc/platform_devices.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/hw/misc/platform_devices.c b/hw/misc/platform_devices.c
index 96ab272..a054606 100644
--- a/hw/misc/platform_devices.c
+++ b/hw/misc/platform_devices.c
@@ -9,7 +9,8 @@ int sysbus_device_create_devtree(Object *obj, void *opaque)
 PlatformDevtreeData *data = opaque;
 Object *dev;
 SysBusDevice *sbdev;
-bool matched = false;
+SysBusDeviceClass *k;
+
 
 dev = object_dynamic_cast(obj, TYPE_SYS_BUS_DEVICE);
 sbdev = (SysBusDevice *)dev;
@@ -19,12 +20,8 @@ int sysbus_device_create_devtree(Object *obj, void *opaque)
 return object_child_foreach(obj, sysbus_device_create_devtree, data);
 }
 
-if (!matched) {
-error_report("Device %s is not supported by this machine yet.",
- qdev_fw_name(DEVICE(dev)));
-exit(1);
-}
-
+k = SYS_BUS_DEVICE_GET_CLASS(dev);
+k->fdt_add_node(sbdev, data);
 return 0;
 }
 
-- 
1.8.3.2

[Qemu-devel] [PATCH 7/7] hw/misc/platform_devices: Add platform_bus_base to PlatformDevtreeData

2014-07-07 Thread Eric Auger

The base address of the platform bus sometimes is used to build the
reg property.

---

Actually I did not succeed in doing it another way with Calxeda xgmac.
If someone knows how to do without, please advise.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/misc/platform_devices.c | 1 +
 include/hw/misc/platform_devices.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/misc/platform_devices.c b/hw/misc/platform_devices.c
index a054606..f194b1d 100644
--- a/hw/misc/platform_devices.c
+++ b/hw/misc/platform_devices.c
@@ -54,6 +54,7 @@ void platform_bus_create_devtree(PlatformParams *params, void 
*fdt,
 data.fdt = fdt;
 data.mpic = mpic;
 data.irq_start = irq_start;
+data.platform_bus_base = addr;
 data.node = node;
 
 container = container_get(qdev_get_machine(), "/peripheral");
diff --git a/include/hw/misc/platform_devices.h 
b/include/hw/misc/platform_devices.h
index ab79346..6d228ca 100644
--- a/include/hw/misc/platform_devices.h
+++ b/include/hw/misc/platform_devices.h
@@ -11,6 +11,7 @@ typedef struct PlatformDevtreeData {
 const char *mpic;
 int irq_start;
 const char *node;
+hwaddr platform_bus_base;
 } PlatformDevtreeData;
 
 typedef struct {
-- 
1.8.3.2

Re: [Qemu-devel] [PATCH] qmp: show QOM properties in device-list-properties

2014-07-07 Thread Markus Armbruster

Paolo Bonzini pbonz...@redhat.com writes:

 On 06/07/2014 21:03, Cole Robinson wrote:
 On 07/05/2014 05:14 AM, Paolo Bonzini wrote:
 On 20/05/2014 14:29, Stefan Hajnoczi wrote:
 Devices can use a mix of qdev and QOM properties.  Currently only the
 qdev properties are displayed by device-list-properties.

 This patch extends the property enumeration algorithm to also display
 QOM properties (excluding the implicit "type", "realized",
 "hotpluggable", and "parent_bus" properties).

 When a qdev property exists, use the qdev type name to preserve
 backwards compatibility.  QOM type names can be different for bool (qdev
 on/off) and str (used by qdev pointers).

 Signed-off-by: Stefan Hajnoczi stefa...@redhat.com


 Stefan, was this never applied?


 I assume you CC'd me in reference to the bug I reported:

 https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00882.html

 I tested this patch, but it doesn't fix the specific bit I mentioned (lack of
 'bootindex' in -device virtio-blk,? )

 Yes, it doesn't, but does libvirt work then?  I'm not sure if libvirt
 still uses -device or rather device-list-properties (which lets you
 start a single QEMU process and do multiple probes).

Valid question, but of course we need to fix the -device FOO,help
regression regardless.

Re: [Qemu-devel] [Bug 1333651] [NEW] qemu-2.0 occasionally segfaults with Windows 2012R2

2014-07-07 Thread Stefan Hajnoczi

On Tue, Jun 24, 2014 at 11:17:51AM -, dev-zero wrote:
 (gdb) bt
 #0  event_notifier_set (e=0x124) at 
 /var/tmp/portage/app-emulation/qemu-2.0.0/work/qemu-2.0.0/util/event_notifier-posix.c:97
 #1  0x7f5c457145d1 in ?? () from /usr/lib64/libgfapi.so.0
 #2  0x7f5c454d1d0a in synctask_wrap () from /usr/lib64/libglusterfs.so.0
 #3  0x7f5c4259d760 in ?? () from /lib64/libc.so.6
 #4  0x in ?? ()

e=0x124 is an invalid address.  This crash is probably fixed by:

  commit 924fe1293c3e7a3c787bbdfb351e7f168caee3e9
  Author: Stefan Hajnoczi stefa...@redhat.com
  Date:   Tue Jun 3 11:21:01 2014 +0200

  aio: fix qemu_bh_schedule() bh->ctx race condition

Please apply the patch or try QEMU 2.1-rc0:
http://wiki.qemu.org/download/qemu-2.1.0-rc0.tar.bz2

Stefan



Re: [Qemu-devel] [PATCH v5 01/10] raw-posix: Fix raw_getlength() to always return -errno on error

2014-07-07 Thread Stefan Hajnoczi

On Thu, Jun 26, 2014 at 01:23:16PM +0200, Markus Armbruster wrote:
 We got a merry mix of -1 and -errno here.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com
 Reviewed-by: Benoit Canet ben...@irqsave.net
 ---
  block/raw-posix.c | 28 ++--
  1 file changed, 22 insertions(+), 6 deletions(-)

Thanks, applied to my block tree for QEMU 2.1:
https://github.com/stefanha/qemu/commits/block
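
The "-1 versus -errno" mix mentioned in the patch description can be illustrated with a small standalone helper. This is a sketch only, not the raw_getlength() code from block/raw-posix.c; file_getlength() is a hypothetical name, and it assumes a plain POSIX file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the size of the file in bytes, or a
 * negative errno value on failure.  lseek() itself only reports -1,
 * so the errno value has to be captured right away.
 * Note that this moves the file offset to the end of the file.
 */
int64_t file_getlength(int fd)
{
    off_t size = lseek(fd, 0, SEEK_END);

    if (size < 0) {
        return -errno;
    }
    return size;
}

/* Usage: an invalid descriptor yields a specific error, not a bare -1. */
void file_getlength_example(void)
{
    assert(file_getlength(-1) == -EBADF);
}
```

With a bare -1 the caller cannot tell EBADF from, say, ESPIPE; returning -errno keeps that information.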

Stefan



Re: [Qemu-devel] [PATCH v2 2/2] block: bump coroutine pool size for drives

2014-07-07 Thread Stefan Hajnoczi

On Fri, Jul 04, 2014 at 12:03:27PM +0200, Markus Armbruster wrote:
 Stefan Hajnoczi stefa...@redhat.com writes:
  @@ -2112,6 +2115,7 @@ void bdrv_detach_dev(BlockDriverState *bs, void *dev)
   bs->dev_ops = NULL;
   bs->dev_opaque = NULL;
   bs->guest_block_size = 512;
  +qemu_coroutine_adjust_pool_size(-64);
   }
   
   /* TODO change to return DeviceState * when all users are qdevified */
 
 This enlarges the pool regardless of how the device model uses the block
 layer.  Isn't this a bit crude?
 
 Have you considered adapting the number of coroutines to actual demand?
 Within reasonable limits, of course.

I picked the simplest algorithm because I couldn't think of one which is
clearly better.  We cannot predict future coroutine usage so any
algorithm will have pathological cases.

In this case we might as well stick to the simplest implementation.
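
For reference, the "simplest algorithm" amounts to a fixed adjustment per attached device. A minimal standalone sketch, assuming a baseline of 64 and hypothetical names (pool_adjust_size, device_attach, device_detach); only the +64 / -64 step comes from the patch:

```c
#include <assert.h>

/* Hypothetical names; only the +64 / -64 adjustment comes from the patch. */
enum {
    POOL_DEFAULT_SIZE = 64,   /* assumed baseline pool size */
    POOL_BATCH_SIZE   = 64,   /* extra coroutines kept per attached device */
};

static int pool_max_size = POOL_DEFAULT_SIZE;

/* Grow (n > 0) or shrink (n < 0) the upper bound of the pool. */
void pool_adjust_size(int n)
{
    /* A detach must never take the bound below zero. */
    assert(pool_max_size + n >= 0);
    pool_max_size += n;
}

void device_attach(void) { pool_adjust_size(POOL_BATCH_SIZE); }
void device_detach(void) { pool_adjust_size(-POOL_BATCH_SIZE); }

int pool_size(void) { return pool_max_size; }
```

The bound follows the number of attached devices and nothing else, which is what makes it easy to reason about and also what makes it crude.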

Stefan



Re: [Qemu-devel] [PATCH] ide: fix double free

2014-07-07 Thread ChenLiang

On 2014/7/3 18:41, Paolo Bonzini wrote:

 On 03/07/2014 04:23, ChenLiang wrote:
 Program received signal SIGABRT, Aborted.
 0x7fd548355b55 in raise () from /lib64/libc.so.6
 (gdb) bt
 #0  0x7fd548355b55 in raise () from /lib64/libc.so.6
 #1  0x7fd548357131 in abort () from /lib64/libc.so.6
 #2  0x7fd548393e0f in __libc_message () from /lib64/libc.so.6
 #3  0x7fd548399618 in malloc_printerr () from /lib64/libc.so.6
 #4  0x7fd54b15e80e in free_and_trace (mem=0x7fd54beb2230) at vl.c:2815
 #5  0x7fd54b3453cd in qemu_aio_release (p=0x7fd54beb2230) at block.c:4813
 #6  0x7fd54b15717d in dma_complete (dbs=0x7fd54beb2230, ret=0) at 
 dma-helpers.c:132
 #7  0x7fd54b157253 in dma_bdrv_cb (opaque=0x7fd54beb2230, ret=0) at 
 dma-helpers.c:148
 #8  0x7fd54b344db8 in bdrv_co_em_bh (opaque=0x7fd54bea4b30) at 
 block.c:4676
 #9  0x7fd54b335a72 in aio_bh_poll (ctx=0x7fd54bcec990) at async.c:81
 #10 0x7fd54b34b1b4 in aio_poll (ctx=0x7fd54bcec990, blocking=false) at 
 aio-posix.c:188
 #11 0x7fd54b335ee0 in aio_ctx_dispatch (source=0x7fd54bcec990, 
 callback=0x0, user_data=0x0) at async.c:211
 #12 0x7fd549e3669a in g_main_context_dispatch () from 
 /usr/lib64/libglib-2.0.so.0
 #13 0x7fd54b348c45 in glib_pollfds_poll () at main-loop.c:190
 #14 0x7fd54b348d3d in os_host_main_loop_wait (timeout=0) at 
 main-loop.c:235
 #15 0x7fd54b348e2f in main_loop_wait (nonblocking=0) at main-loop.c:484
 #16 0x7fd54b15b0f8 in main_loop () at vl.c:2007
 #17 0x7fd54b162a35 in main (argc=57, argv=0x7fff152720a8, 
 envp=0x7fff15272278) at vl.c:4526

 (gdb) bt
 #0  qemu_aio_release (p=0x7f86420ebec0) at block.c:4811
 #1  0x7f86412b617d in dma_complete (dbs=0x7f86420ebec0, ret=0) at 
 dma-helpers.c:132
 #2  0x7f86412b65ab in dma_aio_cancel (acb=0x7f86420ebec0) at 
 dma-helpers.c:192
 #3  0x7f86414a3996 in bdrv_aio_cancel (acb=0x7f86420ebec0) at 
 block.c:4559
 #4  0x7f86413906af in ide_bus_reset (bus=0x7f8641fe3a20) at 
 hw/ide/core.c:2056
 #5  0x7f86413967d6 in piix3_reset (opaque=0x7f8641fe32a0) at 
 hw/ide/piix.c:114
 #6  0x7f86412b9a37 in qemu_devices_reset () at vl.c:1829
 #7  0x7f86412b9aef in qemu_system_reset (report=true) at vl.c:1842
 #8  0x7f86412b9fe2 in main_loop_should_exit () at vl.c:1971
 #9  0x7f86412ba100 in main_loop () at vl.c:2011
 #10 0x7f86412c1a35 in main (argc=57, argv=0x7fff2e827d38, 
 envp=0x7fff2e827f08) at vl.c:4526
 
 Ok, this is the same as your previous backtrace.  The bug is still the same: 
 dma_bdrv_cb must not be called after dma_aio_cancel has finished the recursive call 
 to bdrv_aio_cancel.
 
 BTW, is it better to rename dbs->in_cancel to dbs->canceled ?
 
 If we were to apply my patch, yes.  But with the current logic in_cancel 
 says "are we inside the recursive call to bdrv_aio_cancel" so I think the 
 answer is no.  My patch is just a band-aid, I don't think it should be 
 applied.
 
 Paolo
 
 .
 

Hi,
virtio_blk_reset uses bdrv_drain_all too.

static void virtio_blk_reset(VirtIODevice *vdev)
{
    VirtIOBlock *s = VIRTIO_BLK(vdev);

#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
    if (s->dataplane) {
        virtio_blk_data_plane_stop(s->dataplane);
    }
#endif

    /*
     * This should cancel pending requests, but can't do nicely until there
     * are per-device request lists.
     */
    bdrv_drain_all();
    bdrv_set_enable_write_cache(s->bs, s->original_wce);
}

Best regards
Chenliang

Re: [Qemu-devel] [PATCH v2 2/2] block: bump coroutine pool size for drives

2014-07-07 Thread Stefan Hajnoczi

On Fri, Jul 04, 2014 at 12:36:13PM +0200, Lluís Vilanova wrote:
 Stefan Hajnoczi writes:
  @@ -2093,6 +2093,9 @@ int bdrv_attach_dev(BlockDriverState *bs, void *dev)
   }
   bs->dev = dev;
   bdrv_iostatus_reset(bs);
  +
  +/* We're expecting I/O from the device so bump up coroutine pool size 
  */
  +qemu_coroutine_adjust_pool_size(64);
   return 0;
   }
  
  @@ -2112,6 +2115,7 @@ void bdrv_detach_dev(BlockDriverState *bs, void *dev)
   bs->dev_ops = NULL;
   bs->dev_opaque = NULL;
   bs->guest_block_size = 512;
  +qemu_coroutine_adjust_pool_size(-64);
   }
  
   /* TODO change to return DeviceState * when all users are qdevified */
 
 Small nitpick. Wouldn't it be better to refactor that constant to a 
 define/enum
 (like in POOL_DEFAULT_SIZE)?

You are right.

Stefan



Re: [Qemu-devel] [PATCH v5 00/10] Clean up around bdrv_getlength()

2014-07-07 Thread Stefan Hajnoczi

On Thu, Jun 26, 2014 at 01:23:15PM +0200, Markus Armbruster wrote:
 Issues addressed in this series:
 
 * BlockDriver method bdrv_getlength() generally returns -errno, but
   some implementations return -1 instead.  Fix them [PATCH 1].
 
 * Frequent conversions between sectors and bytes complicate the code
   needlessly.  Clean up some [PATCH 2-7].
 
 * bdrv_getlength() always returns a multiple of BDRV_SECTOR_SIZE, but
   some places appear to be confused about that, and align the result
   up or down.  Don't [PATCH 8].
 
 * bdrv_get_geometry() hides errors.  Don't use it in places where
   errors should be detected [PATCH 9+10].
 
 Issues not addressed:
 
 * We want to move away from counting in arbitrary units of 512 bytes
   we call "sector", even though it's not really related to either
   guest or host sector size.  My patches mostly move sideways:
 
   - Sector-based bdrv_get_geometry() gets partly replaced by new
 bdrv_nb_sectors(), still sector-based.
 
   - Some sector-based places get converted from bdrv_getlength() to
 bdrv_nb_sectors().  At least, this de-duplicates the conversion
 from bytes to sectors.
 
   - Two places get converted from bdrv_get_geometry() to
 bdrv_getlength().  Two baby steps forward.
 
 * There are quite a few literals left in the code where
   BDRV_SECTOR_SIZE, BDRV_SECTOR_BITS or BDRV_SECTOR_MASK should be
   used instead.
 
 * Error handling is missing in places, but it's not always obvious
   whether errors can actually happen, and if yes, how to handle them.
 
 * Several calls of bdrv_get_geometry() remain in hw/.  I wanted to
   replace them all, but ran out of steam.
 
 v5:
 * Straightforward rebase, only 10/10 conflicted
 v4:
 * Trivially rebased
 * Set ret on error paths in img_compare() and img_rebase() in PATCH 10
   [Benoît]
 v3:
 * Trivially rebased
 * Correct silly g_new() vs. g_new0() mistake in PATCH 09 [Max]
 v2:
 * Trivially rebased
 * Correct silly bdrv_getlength() vs. bdrv_nb_sectors() mistake in
   PATCH 03
 * Split PATCH 03 into 03-07 [Kevin]
 * Conversion of bs_sectors to array in PATCH 05 had a subscript off by
   one, fix [Kevin]
 * Split PATCH 05 into 09-10 [Kevin]
 
 Markus Armbruster (10):
   raw-posix: Fix raw_getlength() to always return -errno on error
   block: New bdrv_nb_sectors()
   block: Use bdrv_nb_sectors() in bdrv_make_zero()
   block: Use bdrv_nb_sectors() in bdrv_aligned_preadv()
   block: Use bdrv_nb_sectors() in bdrv_co_get_block_status()
   block: Use bdrv_nb_sectors() in img_convert()
   block: Use bdrv_nb_sectors() where sectors, not bytes are wanted
   block: Drop superfluous aligning of bdrv_getlength()'s value
   qemu-img: Make img_convert() get image size just once per image
   block: Avoid bdrv_get_geometry() where errors should be detected
 
  block-migration.c |  9 +++--
  block.c   | 81 +++---
  block/qapi.c  | 14 +---
  block/qcow2.c |  3 +-
  block/raw-posix.c | 28 +++
  block/vmdk.c  |  5 ++-
  include/block/block.h |  1 +
  qemu-img.c| 98 
 +++
  8 files changed, 152 insertions(+), 87 deletions(-)

Thanks, applied Patches 2-10 to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next
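
The bytes-to-sectors conversion that the series de-duplicates can be sketched as below. length_to_sectors() is a hypothetical standalone helper, not the bdrv_nb_sectors() added by the series; it only assumes the convention from the cover letter that a length is either a byte count or a negative errno value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SECTOR_BITS 9
#define SECTOR_SIZE (1LL << SECTOR_BITS)   /* 512 bytes */

/*
 * Hypothetical helper: 'length' is a size in bytes or a negative errno.
 * Returns the size in 512-byte sectors, or the same negative errno.
 */
int64_t length_to_sectors(int64_t length)
{
    return length < 0 ? length : length / SECTOR_SIZE;
}

/* Usage: sizes convert, errors pass through unchanged. */
void length_to_sectors_example(void)
{
    assert(length_to_sectors(4096) == 8);
    assert(length_to_sectors(-EIO) == -EIO);
}
```

Doing the division in one place means the error check cannot be forgotten at each call site.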

Stefan



Re: [Qemu-devel] [PATCH V2 for 2.1 0/3] bug fixs for memory backend

2014-07-07 Thread Hu Tao

On Mon, Jul 07, 2014 at 06:35:42AM +0300, Michael S. Tsirkin wrote:
 On Mon, Jul 07, 2014 at 06:24:59AM +0300, Michael S. Tsirkin wrote:
  On Mon, Jul 07, 2014 at 10:58:05AM +0800, Hu Tao wrote:
   This series includes three patches to fix bugs of memory backend. Patch
   1 prepares for next patches, patch 2 and patch 3 fix two bugs
   respectively, see each patch for the bugs and how to reproduce them.
   
   changes to v1:
   
 - split patch 1 in v1 into 2 patches
 - don't rely on ram_block_add to return -1
 - error message tweak in file_ram_alloc
 - add error messages reported by qemu to commit message of patch 3
   
   Hu Tao (3):
 memory: rename memory_region_init_ram() and
   memory_region_init_ram_ptr()
 memory: add errp parameter to memory_region_init_ram() and
   memory_region_init_ram_ptr()
 exec: improve error handling and reporting in file_ram_alloc() and
   gethugepagesize()
  
  I fixed up some minor issues and applied this, thanks.
 
 And reverted.
 
 Build fails, and a simple check after applying patch 1 gives me:
 git grep memory_region_init_ram |grep -v nofail|wc -l
 132

Thanks for catching this! I should have built all targets.

 
 Apparently you fixed up about 10% of the files using this function.
 So forget about me merging patch 1.
 
 Add a new
 memory_region_init_ram_may_fail
 and
 memory_region_init_ram_ptr_may_fail
 
 and use it specifically for the new stuff.

Thanks for the change!

 
 Do the rename on top in two steps:
 memory_region_init_ram -> memory_region_init_ram_nofail
 memory_region_init_ram_may_fail -> memory_region_init_ram
 
 Paolo can then merge it when he prefers, though I'd say 2.2
 is more reasonable.
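
The may_fail / nofail split proposed above can be sketched in isolation as follows. Region, region_init_ram_may_fail() and region_init_ram_nofail() are hypothetical stand-ins, not QEMU's MemoryRegion API, and the const char ** error argument is a simplification of an Error **errp parameter.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a RAM-backed region; not QEMU's MemoryRegion. */
typedef struct Region {
    void *ptr;
    size_t size;
} Region;

/* Fallible variant: returns 0 on success, -1 on failure with *errp set. */
int region_init_ram_may_fail(Region *r, size_t size, const char **errp)
{
    assert(r && errp);
    r->ptr = NULL;
    r->size = 0;
    if (size == 0) {
        *errp = "size must be non-zero";
        return -1;
    }
    r->ptr = malloc(size);
    if (!r->ptr) {
        *errp = "out of memory";
        return -1;
    }
    r->size = size;
    return 0;
}

/* Infallible variant for existing callers: any failure is fatal. */
void region_init_ram_nofail(Region *r, size_t size)
{
    const char *err = NULL;

    if (region_init_ram_may_fail(r, size, &err) < 0) {
        fprintf(stderr, "region init failed: %s\n", err);
        abort();
    }
}
```

Existing callers keep the abort-on-failure behaviour through the wrapper, so only new code has to handle the error path.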
 
 
backends/hostmem-ram.c   |  2 +-
exec.c   | 51 ---
hw/block/pflash_cfi01.c  |  5 -
hw/block/pflash_cfi02.c  |  5 -
hw/core/loader.c |  2 +-
hw/display/vga.c |  2 +-
hw/display/vmware_vga.c  |  3 ++-
hw/i386/kvm/pci-assign.c |  9 
hw/i386/pc.c |  2 +-
hw/i386/pc_sysfw.c   |  4 ++--
hw/misc/ivshmem.c|  9 
hw/misc/vfio.c   |  3 ++-
hw/pci/pci.c |  2 +-
include/exec/memory.h| 43 +---
include/exec/ram_addr.h  |  4 ++--
memory.c | 57 
   +++-
numa.c   |  4 ++--
17 files changed, 158 insertions(+), 49 deletions(-)
   
   -- 
   1.9.3

Re: [Qemu-devel] for-2.1 (was Re: [PATCH] ahci: map memory via device's address space instead of address_space_memory)

2014-07-07 Thread Stefan Hajnoczi

On Sun, Jul 06, 2014 at 09:15:38AM +0300, Michael S. Tsirkin wrote:
 On Thu, Jul 03, 2014 at 11:28:52AM +0300, Michael S. Tsirkin wrote:
  On Thu, Jul 03, 2014 at 04:26:27PM +0800, Le Tan wrote:
   In map_page() in hw/ide/ahci.c, replace cpu_physical_memory_map() and
   cpu_physical_memory_unmap() with dma_memory_map() and dma_memory_unmap(),
   because ahci devices should not access memory directly but via their 
   address
   space. Add an AddressSpace parameter to map_page(). In order to call
   map_page(), we should pass the AHCIState.as as the AddressSpace argument.
   
   Signed-off-by: Le Tan tamlokv...@gmail.com
  
  Makes sense
  Reviewed-by: Michael S. Tsirkin m...@redhat.com
  
 
 Stefan, Kevin, you are going to pick this one?

Done.

Stefan



Re: [Qemu-devel] [PATCH] ahci: map memory via device's address space instead of address_space_memory

2014-07-07 Thread Stefan Hajnoczi

On Thu, Jul 03, 2014 at 04:26:27PM +0800, Le Tan wrote:
 In map_page() in hw/ide/ahci.c, replace cpu_physical_memory_map() and
 cpu_physical_memory_unmap() with dma_memory_map() and dma_memory_unmap(),
 because ahci devices should not access memory directly but via their address
 space. Add an AddressSpace parameter to map_page(). In order to call
 map_page(), we should pass the AHCIState.as as the AddressSpace argument.
 
 Signed-off-by: Le Tan tamlokv...@gmail.com
 ---
  hw/ide/ahci.c |   21 +++--
  1 file changed, 11 insertions(+), 10 deletions(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan



Re: [Qemu-devel] [PATCH for-2.1?!?] AioContext: speed up aio_notify

2014-07-07 Thread Stefan Hajnoczi

On Thu, Jul 03, 2014 at 06:59:20PM +0200, Paolo Bonzini wrote:
 In many cases, the call to event_notifier_set in aio_notify is unnecessary.
 In particular, if we are executing aio_dispatch, or if aio_poll is not
 blocking, we know that we will soon get to the next loop iteration (if
 necessary); the thread that hosts the AioContext's event loop does not
 need any nudging.
 
 The patch includes a Promela formal model that shows that this really
 works and does not need any further complication such as generation
 counts.  It needs a memory barrier though.
 
 The generation counts are not needed because we only care about the state
 of ctx->dispatching at the time the memory barrier happens.  If
 ctx->dispatching is one at the time the memory barrier happens,
 the aio_notify is not needed even if it afterwards becomes zero.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
 It should work, but I think this is a bit too tricky for 2.1.
 
  aio-posix.c |  34 +++-
  async.c |  13 +-
  docs/aio_notify.promela | 104 
 
  include/block/aio.h |   9 +
  4 files changed, 158 insertions(+), 2 deletions(-)
  create mode 100644 docs/aio_notify.promela

I can test rbd and gluster.

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block
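
The notifier side of the idea can be sketched with C11 atomics. This is an illustration under assumptions, not the code from the patch: LoopCtx and loop_notify() are hypothetical, a counter stands in for event_notifier_set(), and the matching barrier on the event loop side is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of an event loop context. */
typedef struct LoopCtx {
    atomic_bool dispatching;   /* true while the loop thread will not block */
    atomic_int  wakeups;       /* stands in for event_notifier_set() */
} LoopCtx;

/*
 * Producer side: publish the work first (e.g. mark a bottom half as
 * scheduled), then call this.  Returns true if a wakeup was sent.
 */
bool loop_notify(LoopCtx *ctx)
{
    assert(ctx != NULL);
    /* Order the earlier "work is scheduled" store before the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    if (!atomic_load(&ctx->dispatching)) {
        atomic_fetch_add(&ctx->wakeups, 1);
        return true;
    }
    /* The loop is about to iterate again anyway; no nudge is needed. */
    return false;
}
```

Only the value of dispatching at the time of the fence matters, which is why the description says no generation count is required.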

Stefan



[Qemu-devel] Where to get precompiled qga-vss.dll from ?

2014-07-07 Thread Puneet Bakshi

Hi,

I want to work with guest-fsfreeze-* commands in Windows 2008 guest VM.
Host is CentOS 6.4.

Windows 2008 is running QEMU VSS provider. When guest-fsfreeze-* commands
are invoked from the host, the response received is "This is not supported."

I am following
http://lists.gnu.org/archive/html/qemu-devel/2013-02/msg01963.html.

Regards,
~Puneet

Re: [Qemu-devel] [PATCH v6 0/3] linux-aio: introduce submit I/O as a batch

2014-07-07 Thread Stefan Hajnoczi

On Fri, Jul 04, 2014 at 06:04:32PM +0800, Ming Lei wrote:
 Hi,
 
 The commit 580b6b2aa2(dataplane: use the QEMU block layer for I/O)
 introduces ~40% throughput regression on virtio-blk dataplane, and
 one of causes is that submitting I/O as a batch is removed.
 
 This patchset tries to introduce this mechanism on block, at least,
 linux-aio can benefit from that.
 
 With these patches, it is observed that throughput on virtio-blk
 dataplane can be improved a lot, see data in commit log of patch
 3/3.
 
 It should be possible to apply the batch mechanism to other devices
 (such as virtio-scsi) too.
 
 TODO:
   - support queuing I/O to multi files for scsi devices, which
 need some changes to linux-aio
 
 V6:
   - fix requests leak if part of them aren't submitted successfully,
   pointed by Stefan
   - linux-aio.c coding style fix
 
 V5:
   - rebase on v2.1.0-rc0 of qemu.git/master
   - block/linux-aio.c code style fix
   - don't flush io queue before flush, pointed by Paolo
 
 V4:
   - support other non-raw formats with under-optimized performance
   - use reference counter for plug & unplug
   - flush io queue before sending flush command
 
 V3:
   - only support submitting I/O as a batch for raw format, pointed by
 Kevin
 
 V2:
   - define return value of bdrv_io_unplug as void, suggested by Paolo
   - avoid busy-wait for handling io_submit
 V1:
   - move queuing io stuff into linux-aio.c as suggested by Paolo

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

In Patch 2 we should complete requests with -EIO if io_submit() returned
0 <= ret < len.  I fixed this up when applying because the patch was
completing with a bogus ret value.
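
A rough standalone sketch of that fixup, for illustration only: fail_unsubmitted() and the results array are hypothetical and do not mirror the structures in block/linux-aio.c. Passing a negative ret through as the status for every request is an assumption made here, not something stated above.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper.  'ret' is what the submission call returned for a
 * batch of 'len' requests: a negative errno, or the number of requests
 * actually accepted.  Every request that was not accepted gets a negative
 * status in results[]; accepted ones are left untouched.
 * Returns the number of requests that were failed.
 */
int fail_unsubmitted(int ret, int len, int *results)
{
    int first = ret < 0 ? 0 : ret;
    /* A non-negative ret is a count, not an error code: use -EIO. */
    int status = ret < 0 ? ret : -EIO;
    int failed = 0;

    assert(len >= 0 && ret <= len);
    for (int i = first; i < len; i++) {
        results[i] = status;
        failed++;
    }
    return failed;
}
```

For a partial submission of 2 out of 4 requests, the last two complete with -EIO instead of the positive count 2.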

Stefan



Re: [Qemu-devel] [PATCH] qmp: show QOM properties in device-list-properties

2014-07-07 Thread Stefan Hajnoczi

On Sat, Jul 5, 2014 at 11:14 AM, Paolo Bonzini pbonz...@redhat.com wrote:
 On 20/05/2014 14:29, Stefan Hajnoczi wrote:
 Stefan, was this never applied?

Just discovered this too.  We need it for 2.1.

As Markus says, the -device FOO,? regression fix is still necessary.

Stefan

Re: [Qemu-devel] [PATCH V2 for 2.1 0/3] bug fixs for memory backend

2014-07-07 Thread Michael S. Tsirkin

On Mon, Jul 07, 2014 at 04:17:30PM +0800, Hu Tao wrote:
 On Mon, Jul 07, 2014 at 06:35:42AM +0300, Michael S. Tsirkin wrote:
  On Mon, Jul 07, 2014 at 06:24:59AM +0300, Michael S. Tsirkin wrote:
   On Mon, Jul 07, 2014 at 10:58:05AM +0800, Hu Tao wrote:
This series includes three patches to fix bugs of memory backend. Patch
1 prepares for next patches, patch 2 and patch 3 fix two bugs
respectively, see each patch for the bugs and how to reproduce them.

changes to v1:

  - split patch 1 in v1 into 2 patches
  - don't rely on ram_block_add to return -1
  - error message tweak in file_ram_alloc
  - add error messages reported by qemu to commit message of patch 3

Hu Tao (3):
  memory: rename memory_region_init_ram() and
memory_region_init_ram_ptr()
  memory: add errp parameter to memory_region_init_ram() and
memory_region_init_ram_ptr()
  exec: improve error handling and reporting in file_ram_alloc() and
gethugepagesize()
   
   I fixed up some minor issues and applied this, thanks.
  
  And reverted.
  
  Build fails, and a simple check after applying patch 1 gives me:
  git grep memory_region_init_ram |grep -v nofail|wc -l
  132
 
 Thanks for catching this! I should have built all targets.
 
  
  Apparently you fixed up about 10% of the files using this function.
  So forget about me merging patch 1.
  
  Add a new
  memory_region_init_ram_may_fail
  and
  memory_region_init_ram_ptr_may_fail
  
  and use it specifically for the new stuff.
 
 Thanks for the change!

To clarify: I didn't apply this change.  I simply reverted the whole
patchset.  If you want this patchset applied please do it, at this point
I'm making no promises on whether it will get into 2.1.


  
  Do the rename on top in two steps:
  memory_region_init_ram -> memory_region_init_ram_nofail
  memory_region_init_ram_may_fail -> memory_region_init_ram
  
  Paolo can then merge it when he prefers, though I'd say 2.2
  is more reasonable.
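
The naming scheme suggested above can be sketched roughly as follows.
`Region` and both functions are hypothetical stand-ins, not QEMU's
MemoryRegion API; treating a zero size as an error is an assumption made
only so the failure path is easy to exercise.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a RAM-backed region. */
typedef struct Region {
    void *ram;
    size_t size;
} Region;

/* Fallible variant: reports failure to the caller instead of exiting. */
static int region_init_ram_may_fail(Region *r, size_t size)
{
    assert(r != NULL);
    r->ram = NULL;
    r->size = 0;
    if (size == 0) {
        return -1;              /* treated as an error in this sketch */
    }
    r->ram = calloc(1, size);
    if (r->ram == NULL) {
        return -1;
    }
    r->size = size;
    return 0;
}

/* Old behaviour kept for existing callers: abort instead of returning. */
static void region_init_ram_nofail(Region *r, size_t size)
{
    if (region_init_ram_may_fail(r, size) < 0) {
        fprintf(stderr, "cannot allocate %zu bytes of RAM\n", size);
        abort();
    }
}
```

New code that wants error reporting calls the fallible variant; everything
else keeps the aborting one, so no existing call site has to change at once.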
  
  
 backends/hostmem-ram.c   |  2 +-
 exec.c   | 51 
---
 hw/block/pflash_cfi01.c  |  5 -
 hw/block/pflash_cfi02.c  |  5 -
 hw/core/loader.c |  2 +-
 hw/display/vga.c |  2 +-
 hw/display/vmware_vga.c  |  3 ++-
 hw/i386/kvm/pci-assign.c |  9 
 hw/i386/pc.c |  2 +-
 hw/i386/pc_sysfw.c   |  4 ++--
 hw/misc/ivshmem.c|  9 
 hw/misc/vfio.c   |  3 ++-
 hw/pci/pci.c |  2 +-
 include/exec/memory.h| 43 +---
 include/exec/ram_addr.h  |  4 ++--
 memory.c | 57 
+++-
 numa.c   |  4 ++--
 17 files changed, 158 insertions(+), 49 deletions(-)

-- 
1.9.3

Re: [Qemu-devel] bootindex dropped from -device virtio-blk, ? output, upsets libvirt

2014-07-07 Thread Stefan Hajnoczi

On Fri, Jul 04, 2014 at 02:47:41PM -0400, Cole Robinson wrote:
 qemu-2.1-rc0 upsets some of libvirt's qemu feature introspection, the example
 I hit is with bootindex support. qemu -device virtio-blk,? no longer lists the
 bootindex= property, so libvirt thinks that qemu doesn't support it, and fails
 to launch a VM with per-device boot order configuration.
 
 The qemu culprit is:
 
 commit caffdac363801cd2cf2bf01ad013a8c1e1e43800
 Author: Stefan Hajnoczi stefa...@redhat.com
 Date:   Wed Jun 18 17:58:33 2014 +0800
 
 virtio-blk: use aliases instead of duplicate qdev properties
 
 These alias properties aren't printed in qdev-monitor.c:qdev_device_help. In
 fact I'm not sure if aliases are even accessible in that function, since
 they are only registered at instance init time, and I don't think any device
 has actually been initialized when qdev_device_help is called. That's my
 reading anyways.
 
 Thoughts?

I will send a fix.

Thanks,
Stefan



Re: [Qemu-devel] Any one notice my patch about Support vhd type VHD_DIFFERENCING

2014-07-07 Thread Stefan Hajnoczi

On Sun, Jul 06, 2014 at 10:09:47PM +0800, ssdxiao wrote:
 I have submitted a patch that adds read and write support for the
 VHD_DIFFERENCING type. Is anyone interested?
 
 
 http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg07385.html
 http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg07386.html

It is QEMU 2.1 hard freeze right now.  This means only bug fixes are
being merged into qemu.git/master.

Most code review is focussed on bug fixes at the moment and new features
are lower priority until the QEMU 2.2 release cycle opens up at the end
of July.

Hopefully you'll get feedback sooner than that but I just wanted to let
you know why review is slow :).

Stefan



[Qemu-devel] [RFC PATCH 2/5] bootindex: reset bootindex when vm reset

2014-07-07 Thread arei.gonglei

From: Chenliang chenlian...@huawei.com

Reset the bootindex when the VM reboots. This prepares for modifying the
boot order while the VM is running.

Signed-off-by: Chenliang chenlian...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/nvram/fw_cfg.c | 53 ---
 include/hw/nvram/fw_cfg.h |  2 ++
 2 files changed, 48 insertions(+), 7 deletions(-)
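
A standalone sketch of the replace-or-append lookup that
fw_cfg_modify_file() performs in this patch. `FileTable`, `MAX_SLOTS` and
`table_modify_file()` are illustrative names only and do not mirror the
real fw_cfg structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 8   /* illustrative; not the real FW_CFG_FILE_SLOTS */

typedef struct FileEntry {
    const char *name;
    void *data;
    size_t len;
} FileEntry;

typedef struct FileTable {
    FileEntry f[MAX_SLOTS];
    int count;
} FileTable;

/*
 * Replace the payload of an existing entry and hand the previous payload
 * back to the caller, who owns it and may free it. If no entry with that
 * name exists, append a new one and return NULL.
 */
static void *table_modify_file(FileTable *t, const char *name,
                               void *data, size_t len)
{
    assert(t != NULL && name != NULL);
    for (int i = 0; i < t->count; i++) {
        if (strcmp(name, t->f[i].name) == 0) {
            void *old = t->f[i].data;
            t->f[i].data = data;
            t->f[i].len = len;
            return old;
        }
    }
    assert(t->count < MAX_SLOTS);
    t->f[t->count].name = name;
    t->f[t->count].data = data;
    t->f[t->count].len = len;
    t->count++;
    return NULL;
}
```

Returning the old pointer lets the reset handler free the previous
"bootorder" buffer each time the list is regenerated.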

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index b71d251..97d4951 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -56,7 +56,6 @@ struct FWCfgState {
 FWCfgFiles *files;
 uint16_t cur_entry;
 uint32_t cur_offset;
-Notifier machine_ready;
 };
 
 #define JPG_FILE 0
@@ -402,6 +401,25 @@ static void fw_cfg_add_bytes_read_callback(FWCfgState *s, 
uint16_t key,
 s->entries[arch][key].callback_opaque = callback_opaque;
 }
 
+static void* fw_cfg_modify_bytes_read(FWCfgState *s, uint16_t key,
+  void *data, size_t len)
+{
+void *ptr;
+int arch = !!(key & FW_CFG_ARCH_LOCAL);
+
+key &= FW_CFG_ENTRY_MASK;
+
+assert(key < FW_CFG_MAX_ENTRY && len < UINT32_MAX);
+
+ptr = s->entries[arch][key].data;
+s->entries[arch][key].data = data;
+s->entries[arch][key].len = len;
+s->entries[arch][key].callback_opaque = NULL;
+s->entries[arch][key].callback = NULL;
+
+return ptr;
+}
+
 void fw_cfg_add_bytes(FWCfgState *s, uint16_t key, void *data, size_t len)
 {
 fw_cfg_add_bytes_read_callback(s, key, NULL, NULL, data, len);
@@ -499,13 +517,36 @@ void fw_cfg_add_file(FWCfgState *s,  const char *filename,
 fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len);
 }
 
-static void fw_cfg_machine_ready(struct Notifier *n, void *data)
+void *fw_cfg_modify_file(FWCfgState *s, const char *filename,
+void *data, size_t len)
+{
+int i, index;
+
+assert(s->files);
+
+index = be32_to_cpu(s->files->count);
+assert(index < FW_CFG_FILE_SLOTS);
+
+for (i = 0; i < index; i++) {
+if (strcmp(filename, s->files->f[i].name) == 0) {
+return fw_cfg_modify_bytes_read(s, FW_CFG_FILE_FIRST + i,
+ data, len);
+}
+}
+/* add new one */
+fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len);
+return NULL;
+}
+
+static void fw_cfg_machine_reset(void *opaque)
 {
+void *ptr;
 size_t len;
-FWCfgState *s = container_of(n, FWCfgState, machine_ready);
+FWCfgState *s = opaque;
 char *bootindex = get_boot_devices_list(&len, false);
 
-fw_cfg_add_file(s, "bootorder", (uint8_t*)bootindex, len);
+ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t*)bootindex, len);
+g_free(ptr);
 }
 
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
@@ -542,9 +583,7 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t 
data_port,
 fw_cfg_bootsplash(s);
 fw_cfg_reboot(s);
 
-s->machine_ready.notify = fw_cfg_machine_ready;
-qemu_add_machine_init_done_notifier(&s->machine_ready);
-
+qemu_register_reset(fw_cfg_machine_reset, s);
 return s;
 }
 
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index 72b1549..56e1ed7 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -76,6 +76,8 @@ void fw_cfg_add_file(FWCfgState *s, const char *filename, 
void *data,
 void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
   FWCfgReadCallback callback, void 
*callback_opaque,
   void *data, size_t len);
+void *fw_cfg_modify_file(FWCfgState *s, const char *filename, void *data,
+ size_t len);
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
 hwaddr crl_addr, hwaddr data_addr);
 
-- 
1.7.12.4

[Qemu-devel] [RFC PATCH 4/5] bootindex: add qmp to set boot index when vm is running

2014-07-07 Thread arei.gonglei

From: Chenliang chenlian...@huawei.com

Signed-off-by: Chenliang chenlian...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hmp.c| 11 +++
 hmp.h|  1 +
 qapi-schema.json | 16 
 qmp-commands.hx  | 16 
 qmp.c| 14 ++
 5 files changed, 58 insertions(+)
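
The QMP table entry in this patch marks `suffix` as optional
(`suffix:s?`), while qmp_set_bootindex() calls strlen(suffix)
unconditionally. A minimal sketch of normalizing the argument first,
assuming an absent suffix reaches the handler as NULL (an assumption; the
patch does not say so). `normalize_suffix()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Map a missing or empty suffix to NULL; pass anything else through. */
static const char *normalize_suffix(const char *suffix)
{
    if (suffix == NULL || suffix[0] == '\0') {
        return NULL;
    }
    return suffix;
}

/* Small self-check covering the three cases. */
static void normalize_suffix_selfcheck(void)
{
    const char *s = "/disk@0";

    assert(normalize_suffix(NULL) == NULL);
    assert(normalize_suffix("") == NULL);
    assert(normalize_suffix(s) == s);
}
```

With such a helper the handler could pass the result straight to
modify_boot_device_path() without dereferencing a NULL pointer.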

diff --git a/hmp.c b/hmp.c
index dc3d279..b2c6b6c 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1713,3 +1713,14 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 
 monitor_printf(mon, "\n");
 }
+
+void hmp_set_bootindex(Monitor *mon, const QDict *qdict)
+{
+Error *err = NULL;
+const char *id = qdict_get_str(qdict, "id");
+int64_t bootindex = qdict_get_int(qdict, "bootindex");
+char *suffix = qdict_get_try_str(qdict, "suffix");
+
+qmp_set_bootindex(id, bootindex, suffix, &err);
+hmp_handle_error(mon, &err);
+}
diff --git a/hmp.h b/hmp.h
index 4fd3c4a..eb2641a 100644
--- a/hmp.h
+++ b/hmp.h
@@ -94,6 +94,7 @@ void hmp_cpu_add(Monitor *mon, const QDict *qdict);
 void hmp_object_add(Monitor *mon, const QDict *qdict);
 void hmp_object_del(Monitor *mon, const QDict *qdict);
 void hmp_info_memdev(Monitor *mon, const QDict *qdict);
+void hmp_set_bootindex(Monitor *mon, const QDict *qdict);
 void object_add_completion(ReadLineState *rs, int nb_args, const char *str);
 void object_del_completion(ReadLineState *rs, int nb_args, const char *str);
 void device_add_completion(ReadLineState *rs, int nb_args, const char *str);
diff --git a/qapi-schema.json b/qapi-schema.json
index e7727a1..b414cae 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1694,6 +1694,22 @@
 { 'command': 'device_del', 'data': {'id': 'str'} }
 
 ##
+# @set_bootindex:
+#
+# set bootindex of a device
+#
+# @id: the name of the device
+# @bootindex: the bootindex of the device
+# @suffix: the suffix of the device
+#
+# Returns: Nothing on success
+#  If @id is not a valid device, DeviceNotFound
+#
+# Since: 2.1
+##
+{ 'command': 'set_bootindex', 'data': {'id': 'str', 'bootindex': 'int', 
'suffix': 'str'} }
+
+##
 # @DumpGuestMemoryFormat:
 #
 # An enumeration of guest-memory-dump's format.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index e4a1c80..03645b6 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3661,3 +3661,19 @@ Example:
  { slot: 3, slot-type: DIMM, source: 0, status: 0}
]}
 EQMP
+
+SQMP
+@set-bootindex
+
+
+Set boot index of one device
+
+Example:
+-> { "execute": "set-bootindex", "arguments": { "id": "ide0-0-1", "bootindex": 1, "suffix": "/disk@0"}}
+
+EQMP
+{
+.name   = "set-bootindex",
+.args_type  = "id:s,bootindex:l,suffix:s?",
+.mhandler.cmd_new = qmp_marshal_input_set_bootindex,
+},
diff --git a/qmp.c b/qmp.c
index dca6efb..5ac9401 100644
--- a/qmp.c
+++ b/qmp.c
@@ -659,3 +659,17 @@ ACPIOSTInfoList *qmp_query_acpi_ospm_status(Error **errp)
 
 return head;
 }
+
+void qmp_set_bootindex(const char *id, int64_t bootindex,
+   const char *suffix, Error **errp)
+{
+DeviceState *dev;
+
+dev = qdev_find_recursive(sysbus_get_default(), id);
+if (NULL == dev) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, id);
+return;
+}
+
+modify_boot_device_path(bootindex, dev, strlen(suffix) ? suffix : NULL);
+}
-- 
1.7.12.4

[Qemu-devel] [RFC PATCH 0/5] modify boot order when vm is running

2014-07-07 Thread arei.gonglei

From: Chenliang chenlian...@huawei.com

Sometimes we want to modify the boot order of a VM without shutting it down.
This set of patches adds a QMP command to achieve that, and fixes some small
bugs when a device is hotplugged.

Chenliang (5):
  bootindex: add *_boot_device_path function
  bootindex: reset bootindex when vm reset
  bootindex: delete boot index when device is removed
  bootindex: add qmp to set boot index when vm is running
  bootindex: fix memory leak when ppc sets boot index

 hmp.c | 11 ++
 hmp.h |  1 +
 hw/block/virtio-blk.c |  1 +
 hw/i386/kvm/pci-assign.c  |  1 +
 hw/misc/vfio.c|  1 +
 hw/net/e1000.c|  1 +
 hw/net/eepro100.c |  1 +
 hw/net/ne2000.c   |  1 +
 hw/net/rtl8139.c  |  1 +
 hw/net/virtio-net.c   |  1 +
 hw/net/vmxnet3.c  |  1 +
 hw/nvram/fw_cfg.c | 53 +++--
 hw/ppc/spapr.c|  1 +
 hw/scsi/scsi-generic.c|  1 +
 hw/usb/dev-network.c  |  1 +
 hw/usb/host-libusb.c  |  1 +
 hw/usb/redirect.c |  1 +
 include/hw/nvram/fw_cfg.h |  2 ++
 include/sysemu/sysemu.h   |  4 
 qapi-schema.json  | 16 ++
 qmp-commands.hx   | 16 ++
 qmp.c | 14 
 vl.c  | 55 +++
 23 files changed, 179 insertions(+), 7 deletions(-)

-- 
1.7.12.4

[Qemu-devel] [RFC PATCH 1/5] bootindex: add *_boot_device_path function

2014-07-07 Thread arei.gonglei

From: Chenliang chenlian...@huawei.com

Add del_boot_device_path and modify_boot_device_path. A device should
be removed from the boot device list by del_boot_device_path when it is
hot-unplugged. modify_boot_device_path is used to modify a device's boot
order.

Signed-off-by: Chenliang chenlian...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
 include/sysemu/sysemu.h |  4 
 vl.c| 55 +
 2 files changed, 59 insertions(+)
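
A self-contained sketch of the delete-then-add approach that
modify_boot_device_path() takes in this patch. It uses a plain singly
linked list keyed by a path string instead of QEMU's QTAILQ and
DeviceState; all names are hypothetical, and unlike the patch it removes
every matching entry rather than only the first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct BootEntry {
    int bootindex;
    char *path;                 /* stand-in for the firmware device path */
    struct BootEntry *next;
} BootEntry;

/* Drop entries for this device, and any entry already using bootindex. */
static void boot_list_del(BootEntry **head, int bootindex, const char *path)
{
    BootEntry **pp = head;

    assert(head != NULL && path != NULL);
    while (*pp != NULL) {
        BootEntry *e = *pp;
        int same_dev = strcmp(e->path, path) == 0;
        int same_index = bootindex != -1 && e->bootindex == bootindex;

        if (same_dev || same_index) {
            *pp = e->next;
            free(e->path);
            free(e);
        } else {
            pp = &e->next;
        }
    }
}

/* Insert sorted by bootindex; -1 means "no boot index", so add nothing. */
static void boot_list_add(BootEntry **head, int bootindex, const char *path)
{
    BootEntry **pp = head;
    BootEntry *e;
    size_t n;

    assert(head != NULL && path != NULL);
    if (bootindex == -1) {
        return;
    }
    n = strlen(path) + 1;
    e = malloc(sizeof(*e));
    if (e == NULL) {
        abort();
    }
    e->path = malloc(n);
    if (e->path == NULL) {
        abort();
    }
    memcpy(e->path, path, n);
    e->bootindex = bootindex;
    while (*pp != NULL && (*pp)->bootindex < bootindex) {
        pp = &(*pp)->next;
    }
    e->next = *pp;
    *pp = e;
}

static void boot_list_modify(BootEntry **head, int bootindex, const char *path)
{
    boot_list_del(head, bootindex, path);
    boot_list_add(head, bootindex, path);
}
```

Deleting first keeps the list free of duplicate devices and duplicate
indexes, so the add step needs no special cases.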

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 285c45b..38ef1cd 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -204,6 +204,10 @@ void usb_info(Monitor *mon, const QDict *qdict);
 
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix);
+void del_boot_device_path(int32_t bootindex, DeviceState *dev,
+  const char *suffix);
+void modify_boot_device_path(int32_t bootindex, DeviceState *dev,
+ const char *suffix);
 char *get_boot_devices_list(size_t *size, bool ignore_suffixes);
 
 DeviceState *get_boot_device(uint32_t position);
diff --git a/vl.c b/vl.c
index a1686ef..6264e11 100644
--- a/vl.c
+++ b/vl.c
@@ -1247,6 +1247,61 @@ void add_boot_device_path(int32_t bootindex, DeviceState 
*dev,
 QTAILQ_INSERT_TAIL(&fw_boot_order, node, link);
 }
 
+static bool is_same_fw_dev_path(DeviceState *src, DeviceState *target)
+{
+bool ret = false;
+char *devpath_src = qdev_get_fw_dev_path(src);
+char *devpath_target = qdev_get_fw_dev_path(target);
+
+if (!strcmp(devpath_src, devpath_target)) {
+ret = true;
+}
+
+g_free(devpath_src);
+g_free(devpath_target);
+return ret;
+}
+
+void del_boot_device_path(int32_t bootindex, DeviceState *dev,
+  const char *suffix)
+{
+FWBootEntry *i;
+
+assert(dev != NULL);
+
+QTAILQ_FOREACH(i, &fw_boot_order, link) {
+if (is_same_fw_dev_path(i->dev, dev)) {
+if (suffix && i->suffix && strcmp(i->suffix, suffix)) {
+continue;
+}
+QTAILQ_REMOVE(&fw_boot_order, i, link);
+g_free(i->suffix);
+g_free(i);
+break;
+}
+}
+
+if (bootindex == -1) {
+return;
+}
+
+QTAILQ_FOREACH(i, &fw_boot_order, link) {
+if (i->bootindex == bootindex) {
+QTAILQ_REMOVE(&fw_boot_order, i, link);
+g_free(i->suffix);
+g_free(i);
+break;
+}
+}
+}
+
+void modify_boot_device_path(int32_t bootindex, DeviceState *dev,
+ const char *suffix)
+{
+del_boot_device_path(bootindex, dev, suffix);
+add_boot_device_path(bootindex, dev, suffix);
+}
+
 DeviceState *get_boot_device(uint32_t position)
 {
 uint32_t counter = 0;
-- 
1.7.12.4

[Qemu-devel] [RFC PATCH 3/5] bootindex: delete boot index when device is removed

2014-07-07 Thread arei.gonglei

From: Chenliang chenlian...@huawei.com

A device should be removed from the boot list when it is hot-unplugged.

Signed-off-by: Chenliang chenlian...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/block/virtio-blk.c| 1 +
 hw/i386/kvm/pci-assign.c | 1 +
 hw/misc/vfio.c   | 1 +
 hw/net/e1000.c   | 1 +
 hw/net/eepro100.c| 1 +
 hw/net/ne2000.c  | 1 +
 hw/net/rtl8139.c | 1 +
 hw/net/virtio-net.c  | 1 +
 hw/net/vmxnet3.c | 1 +
 hw/scsi/scsi-generic.c   | 1 +
 hw/usb/dev-network.c | 1 +
 hw/usb/host-libusb.c | 1 +
 hw/usb/redirect.c| 1 +
 13 files changed, 13 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 08562ea..3395a42 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -756,6 +756,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev, 
Error **errp)
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 VirtIOBlock *s = VIRTIO_BLK(dev);
 
+del_boot_device_path(-1, dev, "/disk@0,0");
 #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
 remove_migration_state_change_notifier(&s->migration_state_notifier);
 virtio_blk_data_plane_destroy(s->dataplane);
diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index de33657..4dcd78c 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1853,6 +1853,7 @@ static void assigned_exitfn(struct PCIDevice *pci_dev)
 {
 AssignedDevice *dev = DO_UPCAST(AssignedDevice, dev, pci_dev);
 
+del_boot_device_path(-1, &pci_dev->qdev, NULL);
 deassign_device(dev);
 free_assigned_device(dev);
 }
diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 7437c2e..2f4bec5 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -4220,6 +4220,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
 VFIOGroup *group = vdev->group;
 
+del_boot_device_path(-1, &pdev->qdev, NULL);
 vfio_unregister_err_notifier(vdev);
 pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
 vfio_disable_interrupts(vdev);
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 0fc29a0..2ca1acd 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1492,6 +1492,7 @@ pci_e1000_uninit(PCIDevice *dev)
 {
 E1000State *d = E1000(dev);
 
+del_boot_device_path(-1, DEVICE(dev), "/ethernet-phy@0");
 timer_del(d->autoneg_timer);
 timer_free(d->autoneg_timer);
 timer_del(d->mit_timer);
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index aaa3ff2..9b9734d 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -1843,6 +1843,7 @@ static void pci_nic_uninit(PCIDevice *pci_dev)
 {
 EEPRO100State *s = DO_UPCAST(EEPRO100State, dev, pci_dev);
 
+del_boot_device_path(-1, &pci_dev->qdev, "/ethernet-phy@0");
 memory_region_destroy(&s->mmio_bar);
 memory_region_destroy(&s->io_bar);
 memory_region_destroy(&s->flash_bar);
diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c
index d558b8c..780b74c 100644
--- a/hw/net/ne2000.c
+++ b/hw/net/ne2000.c
@@ -748,6 +748,7 @@ static void pci_ne2000_exit(PCIDevice *pci_dev)
 PCINE2000State *d = DO_UPCAST(PCINE2000State, dev, pci_dev);
 NE2000State *s = &d->ne2000;
 
+del_boot_device_path(-1, &pci_dev->qdev, "/ethernet-phy@0");
 memory_region_destroy(&s->io);
 qemu_del_nic(s->nic);
 qemu_free_irq(s->irq);
diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 90bc5ec..fe637da 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -3462,6 +3462,7 @@ static void pci_rtl8139_uninit(PCIDevice *dev)
 {
 RTL8139State *s = RTL8139(dev);
 
+del_boot_device_path(-1, DEVICE(dev), "/ethernet-phy@0");
 memory_region_destroy(&s->bar_io);
 memory_region_destroy(&s->bar_mem);
 if (s->cplus_txbuffer) {
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 00b5e07..ebe5394 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1626,6 +1626,7 @@ static void virtio_net_device_unrealize(DeviceState *dev, 
Error **errp)
 virtio_net_set_status(vdev, 0);
 
 unregister_savevm(dev, "virtio-net", n);
+del_boot_device_path(-1, dev, "/ethernet-phy@0");
 
 g_free(n->netclient_name);
 n->netclient_name = NULL;
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 77bea6f..1e0573f 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2176,6 +2176,7 @@ static void vmxnet3_pci_uninit(PCIDevice *pci_dev)
 VMW_CBPRN("Starting uninit...");
 
 unregister_savevm(dev, "vmxnet3-msix", s);
+del_boot_device_path(-1, dev, "/ethernet-phy@0");
 
 vmxnet3_net_uninit(s);
 
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 3733d2c..b567319 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -388,6 +388,7 @@ static void scsi_generic_reset(DeviceState *dev)
 
 static void scsi_destroy(SCSIDevice *s)
 {
+del_boot_device_path(-1, &s->qdev, NULL);
 scsi_device_purge_requests(s, SENSE_CODE(NO_SENSE));
 blockdev_mark_auto_del(s->conf.bs);
 }
diff --git a/hw/usb/dev-network.c b/hw/usb/dev-network.c
index

[Qemu-devel] [RFC PATCH 5/5] bootindex: fix memory leak when set boot index

2014-07-07 Thread arei.gonglei

From: Chenliang chenlian...@huawei.com

get_boot_devices_list() allocates memory that spapr_finalize_fdt()
never frees.

Signed-off-by: Chenliang chenlian...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/ppc/spapr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 82f183f..502868e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -795,6 +795,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
 
 }
 ret = fdt_setprop_string(fdt, offset, "qemu,boot-list", bootlist);
+g_free(bootlist);
 }
 
 if (!spapr-has_graphics) {
-- 
1.7.12.4

Re: [Qemu-devel] [RFC PATCH 0/5] modify boot order when vm is running

2014-07-07 Thread Michael S. Tsirkin

On Mon, Jul 07, 2014 at 05:10:56PM +0800, arei.gong...@huawei.com wrote:
 From: Chenliang chenlian...@huawei.com
 
 Sometimes we want to modify the boot order of a VM without shutting it down.
 This set of patches adds a QMP command to achieve that, and fixes some small
 bugs when a device is hotplugged.
 
 Chenliang (5):
   bootindex: add *_boot_device_path function
   bootindex: reset bootindex when vm reset
   bootindex: delete boot index when device is removed
   bootindex: add qmp to set boot index when vm is running
   bootindex: fix memory leak when ppc sets boot index

Unfortunately at least for PC, boot order is exposed
in fw cfg which can not change while guest is running.
I suspect we need to change how we report boot order to guests.
While we are at it, maybe we can fix the silly bootindex
convention: I think people really want to specify boot *order*,
not boot index.



  hmp.c | 11 ++
  hmp.h |  1 +
  hw/block/virtio-blk.c |  1 +
  hw/i386/kvm/pci-assign.c  |  1 +
  hw/misc/vfio.c|  1 +
  hw/net/e1000.c|  1 +
  hw/net/eepro100.c |  1 +
  hw/net/ne2000.c   |  1 +
  hw/net/rtl8139.c  |  1 +
  hw/net/virtio-net.c   |  1 +
  hw/net/vmxnet3.c  |  1 +
  hw/nvram/fw_cfg.c | 53 +++--
  hw/ppc/spapr.c|  1 +
  hw/scsi/scsi-generic.c|  1 +
  hw/usb/dev-network.c  |  1 +
  hw/usb/host-libusb.c  |  1 +
  hw/usb/redirect.c |  1 +
  include/hw/nvram/fw_cfg.h |  2 ++
  include/sysemu/sysemu.h   |  4 
  qapi-schema.json  | 16 ++
  qmp-commands.hx   | 16 ++
  qmp.c | 14 
  vl.c  | 55 
 +++
  23 files changed, 179 insertions(+), 7 deletions(-)
 
 -- 
 1.7.12.4

Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes

2014-07-07 Thread Gerd Hoffmann

On So, 2014-07-06 at 17:22 +1000, Benjamin Herrenschmidt wrote:
 On Sun, 2014-07-06 at 17:05 +1000, Benjamin Herrenschmidt wrote:
  On Sun, 2014-07-06 at 16:46 +1000, Benjamin Herrenschmidt wrote:
   At this point, I'm tempted to just revert that commit. What do you
   think Gerd ?
  
  I mean that hunk of the commit... I missed that the commit itself
  added a whole lot more bound checking.
 
 So we have a number of other problems :-)

[ list snipped ]

Nobody really uses host cursor and relative mouse mode together today.  

UIs basically assume that the guest will render the mouse pointer when
in relative mouse mode, because that is what qxl is doing.  virtio-gpu
(not upstream) has similar issues in relative mouse mode.

I will go look into these, but I suspect it will take quite some time to
sort out as it involves quite some testing with all the UIs we have.

I think the best short-term solution for cirrus is to simply keep the
existing hwcursor emulation code and force shadow mode when the guest
enables the hwcursor.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v4 15/33] target-arm: add NSACR register

2014-07-07 Thread Aggeler Fabian


On 01 Jul 2014, at 01:09, greg.bell...@linaro.org wrote:

 From: Fabian Aggeler aggel...@ethz.ch
 
 Implements NSACR register with corresponding read/write functions
 for ARMv7 and ARMv8.
 

Actually, in this patch we could add a check in cpu_get_tb_cpu_state() (cpu.h) 
to not set 
the ARM_TBFLAG_CPACR_FPEN_MASK if NSACR disables it. 

What do you think?

 Signed-off-by: Sergey Fedorov s.fedo...@samsung.com
 Signed-off-by: Fabian Aggeler aggel...@ethz.ch
 Signed-off-by: Greg Bellows greg.bell...@linaro.org
 ---
 target-arm/cpu.h|  6 +
 target-arm/helper.c | 68 -
 2 files changed, 73 insertions(+), 1 deletion(-)
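
A rough sketch of the masking idea in the cpacr_write() hunk of this
patch: when the access is non-secure, the NSACR bits remove ASEDIS and
D32DIS from the set of writable CPACR bits. The NSACR_* values are the
ones the patch defines; `CPACR_ASEDIS`, `CPACR_D32DIS` and
`cpacr_ns_write_mask()` are names invented here for illustration, and the
feature checks of the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSACR_NSASEDIS (1U << 15)
#define NSACR_NSD32DIS (1U << 14)

/* Illustrative names for CPACR bits [31] and [30]. */
#define CPACR_ASEDIS   (1U << 31)
#define CPACR_D32DIS   (1U << 30)

/*
 * Start from the bits a write may normally change and strip the ones
 * that NSACR protects from non-secure writes.
 */
static uint32_t cpacr_ns_write_mask(uint32_t mask, uint32_t nsacr, bool secure)
{
    if (!secure) {
        if (nsacr & NSACR_NSASEDIS) {
            mask &= ~CPACR_ASEDIS;
        }
        if (nsacr & NSACR_NSD32DIS) {
            mask &= ~CPACR_D32DIS;
        }
    }
    return mask;
}

/* Small self-check: secure writes are never restricted by NSACR. */
static void cpacr_ns_write_mask_selfcheck(void)
{
    uint32_t full = CPACR_ASEDIS | CPACR_D32DIS | (0xfU << 20);

    assert(cpacr_ns_write_mask(full, NSACR_NSASEDIS, true) == full);
    assert(cpacr_ns_write_mask(full, NSACR_NSASEDIS, false) ==
           (CPACR_D32DIS | (0xfU << 20)));
}
```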
 
 diff --git a/target-arm/cpu.h b/target-arm/cpu.h
 index 1e8d5ee..4625088 100644
 --- a/target-arm/cpu.h
 +++ b/target-arm/cpu.h
 @@ -182,6 +182,7 @@ typedef struct CPUARMState {
 uint64_t c1_coproc; /* Coprocessor access register.  */
 uint32_t c1_xscaleauxcr; /* XScale auxiliary control register.  */
 uint32_t c1_scr; /* secure config register.  */
 +uint32_t c1_nsacr; /* Non-secure access control register. */
 uint64_t ttbr0_el1; /* MMU translation table base 0. */
 uint64_t ttbr1_el1; /* MMU translation table base 1. */
 uint64_t c2_control; /* MMU translation table base control.  */
 @@ -609,6 +610,11 @@ static inline void xpsr_write(CPUARMState *env, uint32_t 
 val, uint32_t mask)
 #define SCR_RES1_MASK (3U << 4)
 #define SCR_MASK  (0x3fff & ~SCR_RES1_MASK)
 
 +#define NSACR_NSTRCDIS (1U << 20)
 +#define NSACR_RFR  (1U << 19)
 +#define NSACR_NSASEDIS (1U << 15)
 +#define NSACR_NSD32DIS (1U << 14)
 +
 /* Return the current FPSCR value.  */
 uint32_t vfp_get_fpscr(CPUARMState *env);
 void vfp_set_fpscr(CPUARMState *env, uint32_t val);
 diff --git a/target-arm/helper.c b/target-arm/helper.c
 index e43545a..6342dbf 100644
 --- a/target-arm/helper.c
 +++ b/target-arm/helper.c
 @@ -489,7 +489,19 @@ static void cpacr_write(CPUARMState *env, const 
 ARMCPRegInfo *ri,
 /* VFP coprocessor: cp10 & cp11 [23:20] */
 mask |= (1 << 31) | (1 << 30) | (0xf << 20);
 
 -if (!arm_feature(env, ARM_FEATURE_NEON)) {
 +if (arm_feature(env, ARM_FEATURE_NEON)) {
 +/* NSACR can disable non-secure writes to
 + * ASEDIS [31] or D32DIS [30]
 + */
 +if (arm_feature(env, ARM_FEATURE_EL3) && !arm_is_secure(env)) {
 +if ((env->cp15.c1_nsacr & NSACR_NSASEDIS)) {
 +mask &= ~(1 << 31);
 +}
 +if ((env->cp15.c1_nsacr & NSACR_NSD32DIS)) {
 +mask &= ~(1 << 30);
 +}
 +}
 +} else {
 /* ASEDIS [31] bit is RAO/WI */
 value |= (1 << 31);
 }
 @@ -501,6 +513,7 @@ static void cpacr_write(CPUARMState *env, const 
 ARMCPRegInfo *ri,
 !arm_feature(env, ARM_FEATURE_VFP3)) {
 /* D32DIS [30] is RAO/WI if D16-31 are not implemented. */
 value |= (1 << 30);
 +mask |= (1 << 30);
 }
 }
 value &= mask;
 @@ -2195,6 +2208,55 @@ static void scr_write(CPUARMState *env, const 
 ARMCPRegInfo *ri, uint64_t value)
 raw_write(env, ri, value);
 }
 
 +static void nsacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 +  uint64_t value)
 +{
 +uint32_t mask = 0;
 +
 +/* Pre ARMv8 some bits are RAO or UNK/SBZP */
 +if (!arm_feature(env, ARM_FEATURE_V8)) {
 +
 +if (arm_feature(env, ARM_FEATURE_VFP)) {
 +mask |= NSACR_NSASEDIS | NSACR_NSD32DIS;
 +
 +if (!arm_feature(env, ARM_FEATURE_NEON)) {
 +/* NSASEDIS are RAO/WI */
 +value |= NSACR_NSASEDIS;
 +}
 +
 +/* VFPv3 and upwards with NEON implement 32 double precision
 + * registers (D0-D31).
 + */
 +if (!arm_feature(env, ARM_FEATURE_NEON) ||
 +!arm_feature(env, ARM_FEATURE_VFP3)) {
 +/* NSD32DIS is RAO/WI if D16-31 are not implemented. */
 +value |= NSACR_NSD32DIS;
 +}
 +}
 +
 +/* cpn bits [13:0] */
 +mask = 0x3fff;
 +
 +value = mask;
 +}
 +
 +raw_write(env, ri, value);
 +}
 +
 +static uint64_t nsacr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 +{
 +uint64_t ret = raw_read(env, ri);
 +
 +if (arm_feature(env, ARM_FEATURE_V8)) {
 +if (!arm_feature(env, ARM_FEATURE_EL3) || (
 +arm_el_is_aa64(env, 3) && !is_a64(env) &&
 +arm_current_pl(env) != 3)) {
 +ret = 0xC00;
 +}
 +}
 +return ret;
 +}
 +
 static const ARMCPRegInfo v8_el3_cp_reginfo[] = {
 { .name = "ELR_EL3", .state = ARM_CP_STATE_AA64,
   .type = ARM_CP_NO_MIGRATE,
 @@ -2228,6 +2290,10 @@ static const ARMCPRegInfo v7_el3_cp_reginfo[] =

Re: [Qemu-devel] [RFC PATCH 0/5] modify boot order when vm is running

2014-07-07 Thread Laszlo Ersek

On 07/07/14 11:29, Michael S. Tsirkin wrote:
 On Mon, Jul 07, 2014 at 05:10:56PM +0800, arei.gong...@huawei.com wrote:
 From: Chenliang chenlian...@huawei.com

 Sometimes we want to modify the boot order of a VM without shutting it down.
 This set of patches adds a QMP command to achieve that, and fixes some small
 bugs when a device is hotplugged.

 Chenliang (5):
   bootindex: add *_boot_device_path function
   bootindex: reset bootindex when vm reset
   bootindex: delete boot index when device is removed
   bootindex: add qmp to set boot index when vm is running
   bootindex: fix memory leak when ppc sets boot index
 
 Unfortunately at least for PC, boot order is exposed
 in fw cfg which can not change while guest is running.
 I suspect we need to change how we report boot order to guests.
 While we are at it, maybe we can fix the silly bootindex
 convention: I think people really want to specify boot *order*,
 not boot index.

Please preserve the bootorder fw_cfg file, and its format.

I don't have any request in relation to the new (== dynamic) feature ATM.

Thanks
Laszlo

Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes

2014-07-07 Thread Gerd Hoffmann

  Hi,

  The problem is that when using relative mice, we can't really assume that
  there is any relationship between the absolute position of the host cursor
  vs. the guest cursor, we should only operate in deltas and even then, we
  probably want to dampen them to compensate for the guest's own acceleration.
 
 The guest's own acceleration can easily be non-linear, so we can't 
 really tell. However, FWIW we basically have 2 modes
 
1) absolute pointing device (usb tablet for example or vmmouse)
2) relative pointing device
 
 In case 1, we can keep using the host cursor, and just tell the guest 
 where exactly the cursor is in absolute coordinates. This works very 
 well with VNC too ;).

Yep, #1 is the easy case and only that works reasonably well in qemu
today.

 In case 2, we can't tell anything at all. We can calculate the delta and 
 hope for the best. That's why with any backend that supports it, we 
 enable mouse grabbing here. In mouse grabbing mode we behave like any 
 game that may do whatever it likes with the mouse delta information.

There is dpy_mouse_set() which the gfx emulation can use to tell the UI
where the cursor actually is.  So we can move the host cursor to that
point, and sdl/gtk UIs attempt to do that.

Problem is that this has some latency due to the round-trip to the
guest.  Also things like mouse acceleration easily can make the mouse
pointer a bit jumpy.  I think it needs some experimentation to see
whether operating in that mode can be made to work well in practice.

On top of that I think we currently have some kind of feedback loop in
there, which makes the pointer quickly go totally wonky.  To be
debugged ...

  But that means that the guest HW cursor is never quite where the host cursor
  is. So unless the guest draws its own (or something like VNC can draw one),
  we have a problem.
 
 VNC can explicitly draw the host cursor at specific locations IIRC. You 
 can just send a packet where the cursor is at the moment.

Guess we should actually do that in vnc_mouse_set() then ;)

 I don't know 
 about SDL or GTK+ though.

See above.

  I'm thinking that for relative mouse, we should probably draw a cursor 
  ourselves
  by moving / drawing the cursor pixmap on top of the display pixmap at the UI
  backend (gtk/SDL) level... Or am I missing a big part of the puzzle ?
 
 Can't we just always draw it ourselves with a second surface on top of 
 our normal guest screen?

We might want to create some helper functions to do that, so UIs which need
it can use them.  I don't think we should do that unconditionally, UI
support for RGBA cursors isn't that bad.

 Then we can make the real cursor for GTK+ / 
 SDL / VNC be a 100% alpha cursor as soon as we enable this self-drawn 
 surface and can expose hardware pointers that the respective backend 
 couldn't support.
 
 For example, IIRC VNC only supports 1-bit cursors. We certainly want 
 more fancy ones :).

SDL1 has 1bit cursors.
SDL2 has full RGBA cursors.
gtk has full RGBA cursors.
vnc has RGB cursors + 1-bit mask (with VNC_FEATURE_RICH_CURSOR).
spice has full RGBA cursors.

cheers,
  Gerd

[Qemu-devel] [PULL 0/2] build QEMU with Xen support on ARM

2014-07-07 Thread Stefano Stabellini

The following changes since commit 9d9de254c2b81b68cd48f2324cc753a570a4cdd8:

  MAINTAINERS: seccomp: change email contact for Eduardo Otubo (2014-07-03 12:36:15 +0100)

are available in the git repository at:

  git://xenbits.xen.org/people/sstabellini/qemu-dm.git xen_arm_20140707

for you to fetch changes up to 643f59322432d77165329dfabe2d040d7e30dae8:

  xen: build on ARM (2014-07-07 10:37:40 +0000)


Stefano Stabellini (2):
  xen_backend: introduce xenstore_read_uint64 and xenstore_read_fe_uint64
  xen: build on ARM

 hw/display/xenfb.c   |   18 ++
 hw/xen/xen_backend.c |   18 ++
 include/hw/xen/xen_backend.h |2 ++
 xen-hvm.c|2 +-
 xen-mapcache.c   |4 ++--
 5 files changed, 33 insertions(+), 11 deletions(-)

[Qemu-devel] [PULL 1/2] xen_backend: introduce xenstore_read_uint64 and xenstore_read_fe_uint64

2014-07-07 Thread Stefano Stabellini

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/xen/xen_backend.c |   18 ++
 include/hw/xen/xen_backend.h |2 ++
 2 files changed, 20 insertions(+)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 3cd45b4..b2cb22b 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -111,6 +111,19 @@ int xenstore_read_int(const char *base, const char *node, int *ival)
     return rc;
 }
 
+int xenstore_read_uint64(const char *base, const char *node, uint64_t *uval)
+{
+    char *val;
+    int rc = -1;
+
+    val = xenstore_read_str(base, node);
+    if (val && 1 == sscanf(val, "%"SCNu64, uval)) {
+        rc = 0;
+    }
+    g_free(val);
+    return rc;
+}
+
 int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const char *val)
 {
     return xenstore_write_str(xendev->be, node, val);
@@ -146,6 +159,11 @@ int xenstore_read_fe_int(struct XenDevice *xendev, const char *node, int *ival)
     return xenstore_read_int(xendev->fe, node, ival);
 }
 
+int xenstore_read_fe_uint64(struct XenDevice *xendev, const char *node, uint64_t *uval)
+{
+    return xenstore_read_uint64(xendev->fe, node, uval);
+}
+
 /* - */
 
 const char *xenbus_strstate(enum xenbus_state state)
diff --git a/include/hw/xen/xen_backend.h b/include/hw/xen/xen_backend.h
index 3b7d96d..3b4125e 100644
--- a/include/hw/xen/xen_backend.h
+++ b/include/hw/xen/xen_backend.h
@@ -74,6 +74,8 @@ char *xenstore_read_be_str(struct XenDevice *xendev, const char *node);
 int xenstore_read_be_int(struct XenDevice *xendev, const char *node, int *ival);
 char *xenstore_read_fe_str(struct XenDevice *xendev, const char *node);
 int xenstore_read_fe_int(struct XenDevice *xendev, const char *node, int *ival);
+int xenstore_read_uint64(const char *base, const char *node, uint64_t *uval);
+int xenstore_read_fe_uint64(struct XenDevice *xendev, const char *node, uint64_t *uval);
 
 const char *xenbus_strstate(enum xenbus_state state);
 struct XenDevice *xen_be_find_xendev(const char *type, int dom, int dev);
-- 
1.7.10.4
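
For readers who want to try the parsing step outside of QEMU, here is a minimal standalone sketch. parse_uint64() is a hypothetical stand-in that takes the string directly instead of fetching it from xenstore; only the sscanf()/SCNu64 part corresponds to the patch above:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the xenstore helper: parse a decimal string
 * into a uint64_t.  Returns 0 on success, -1 if val is NULL or does not
 * start with a number.
 */
int parse_uint64(const char *val, uint64_t *uval)
{
    if (val && sscanf(val, "%" SCNu64, uval) == 1) {
        return 0;
    }
    return -1;
}
```

The point of the 64-bit variant is that values above INT_MAX, which an int-based reader cannot represent, still parse correctly.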

[Qemu-devel] [PULL 2/2] xen: build on ARM

2014-07-07 Thread Stefano Stabellini

Collection of fixes to build QEMU with Xen support on ARM:
- use xenstore_read_fe_uint64 to retrieve the page-ref (xenfb);
- use xen_pfn_t instead of unsigned long in xenfb;
- use xen_pfn_t instead of unsigned long in xen_remove_from_physmap;
- in xen-mapcache.c use HOST_LONG_BITS to check for QEMU's address space
size.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/display/xenfb.c |   18 ++
 xen-hvm.c  |2 +-
 xen-mapcache.c |4 ++--
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 032eb7a..07ddc9d 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -93,10 +93,12 @@ struct XenFB {
 
 static int common_bind(struct common *c)
 {
-    int mfn;
+    uint64_t mfn;
 
-    if (xenstore_read_fe_int(&c->xendev, "page-ref", &mfn) == -1)
+    if (xenstore_read_fe_uint64(&c->xendev, "page-ref", &mfn) == -1)
         return -1;
+    assert(mfn == (xen_pfn_t)mfn);
+
     if (xenstore_read_fe_int(&c->xendev, "event-channel", &c->xendev.remote_port) == -1)
         return -1;
 
@@ -107,7 +109,7 @@ static int common_bind(struct common *c)
         return -1;
 
     xen_be_bind_evtchn(&c->xendev);
-    xen_be_printf(&c->xendev, 1, "ring mfn %d, remote-port %d, local-port %d\n",
+    xen_be_printf(&c->xendev, 1, "ring mfn %"PRIx64", remote-port %d, local-port %d\n",
                   mfn, c->xendev.remote_port, c->xendev.local_port);
 
     return 0;
@@ -409,7 +411,7 @@ static void input_event(struct XenDevice *xendev)
 
 /*  */
 
-static void xenfb_copy_mfns(int mode, int count, unsigned long *dst, void *src)
+static void xenfb_copy_mfns(int mode, int count, xen_pfn_t *dst, void *src)
 {
     uint32_t *src32 = src;
     uint64_t *src64 = src;
@@ -424,8 +426,8 @@ static int xenfb_map_fb(struct XenFB *xenfb)
     struct xenfb_page *page = xenfb->c.page;
     char *protocol = xenfb->c.xendev.protocol;
     int n_fbdirs;
-    unsigned long *pgmfns = NULL;
-    unsigned long *fbmfns = NULL;
+    xen_pfn_t *pgmfns = NULL;
+    xen_pfn_t *fbmfns = NULL;
     void *map, *pd;
     int mode, ret = -1;
 
@@ -483,8 +485,8 @@ static int xenfb_map_fb(struct XenFB *xenfb)
     n_fbdirs = xenfb->fbpages * mode / 8;
     n_fbdirs = (n_fbdirs + (XC_PAGE_SIZE - 1)) / XC_PAGE_SIZE;
 
-    pgmfns = g_malloc0(sizeof(unsigned long) * n_fbdirs);
-    fbmfns = g_malloc0(sizeof(unsigned long) * xenfb->fbpages);
+    pgmfns = g_malloc0(sizeof(xen_pfn_t) * n_fbdirs);
+    fbmfns = g_malloc0(sizeof(xen_pfn_t) * xenfb->fbpages);
 
     xenfb_copy_mfns(mode, n_fbdirs, pgmfns, pd);
     map = xc_map_foreign_pages(xen_xc, xenfb->c.xendev.dom,
diff --git a/xen-hvm.c b/xen-hvm.c
index bafdf12..c928b36 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -390,7 +390,7 @@ static int xen_remove_from_physmap(XenIOState *state,
     start_addr >>= TARGET_PAGE_BITS;
     phys_offset >>= TARGET_PAGE_BITS;
     for (i = 0; i < size; i++) {
-        unsigned long idx = start_addr + i;
+        xen_pfn_t idx = start_addr + i;
         xen_pfn_t gpfn = phys_offset + i;
 
         rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, idx, gpfn);
diff --git a/xen-mapcache.c b/xen-mapcache.c
index eda914a..66da1a6 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -33,10 +33,10 @@
 #  define DPRINTF(fmt, ...) do { } while (0)
 #endif
 
-#if defined(__i386__)
+#if HOST_LONG_BITS == 32
 #  define MCACHE_BUCKET_SHIFT 16
 #  define MCACHE_MAX_SIZE (1UL<<31) /* 2GB Cap */
-#elif defined(__x86_64__)
+#else
 #  define MCACHE_BUCKET_SHIFT 20
 #  define MCACHE_MAX_SIZE (1UL<<35) /* 32GB Cap */
 #endif
-- 
1.7.10.4
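
The assert(mfn == (xen_pfn_t)mfn) added to common_bind() is a narrowing check: it only holds if the 64-bit value read from xenstore survives the conversion to the host's frame-number type. A small sketch of the idea, assuming two illustrative widths (pfn32_t and pfn64_t are made-up typedefs, not the real xen_pfn_t definitions):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for a narrow and a wide frame-number type. */
typedef uint32_t pfn32_t;
typedef uint64_t pfn64_t;

/* True if mfn can be stored in a 32-bit frame number without truncation. */
bool fits_pfn32(uint64_t mfn)
{
    return mfn == (pfn32_t)mfn;
}

/* With a 64-bit frame number the round trip always succeeds. */
bool fits_pfn64(uint64_t mfn)
{
    return mfn == (pfn64_t)mfn;
}
```

This is also why hard-coding int or unsigned long for frame numbers breaks once the frame-number type is wider than the host's long.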

[Qemu-devel] [PATCH V3 for 2.1 2/2] exec: improve error handling and reporting in file_ram_alloc() and gethugepagesize()

2014-07-07 Thread Hu Tao

This patch fixes two problems of memory-backend-file:

1. If user adds a memory-backend-file object using object_add command,
   specifying a non-existing directory for property mem-path, qemu
   will core dump with message:

 /nonexistingdir: No such file or directory
 Bad ram offset f000
 Aborted (core dumped)

   with this patch, qemu reports error message like:

 qemu-system-x86_64: -object 
memory-backend-file,mem-path=/nonexistingdir,id=mem-file0,size=128M:
 failed to stat file /nonexistingdir: No such file or directory

2. If user adds a memory-backend-file object using object_add command,
   specifying a size that is less than huge page size, qemu
   will core dump with message:

 Bad ram offset f000
 Aborted (core dumped)

   with this patch, qemu reports error message like:

 qemu-system-x86_64: -object memory-backend-file,mem-path=/hugepages,id=mem-file0,size=1M:
 memory size 0x10 should be equal or larger than huge page size 0x20

Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 exec.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/exec.c b/exec.c
index ca7741b..bb97b15 100644
--- a/exec.c
+++ b/exec.c
@@ -996,7 +996,7 @@ void qemu_mutex_unlock_ramlist(void)
 
 #define HUGETLBFS_MAGIC       0x958458f6
 
-static long gethugepagesize(const char *path)
+static long gethugepagesize(const char *path, Error **errp)
 {
     struct statfs fs;
     int ret;
@@ -1006,7 +1006,7 @@ static long gethugepagesize(const char *path)
     } while (ret != 0 && errno == EINTR);
 
     if (ret != 0) {
-        perror(path);
+        error_setg_errno(errp, errno, "failed to get size of file %s", path);
         return 0;
     }
 
@@ -1024,17 +1024,20 @@ static void *file_ram_alloc(RAMBlock *block,
     char *filename;
     char *sanitized_name;
     char *c;
-    void *area;
+    void *area = NULL;
     int fd;
     unsigned long hpagesize;
 
-    hpagesize = gethugepagesize(path);
-    if (!hpagesize) {
+    hpagesize = gethugepagesize(path, errp);
+    if (errp && *errp) {
         goto error;
     }
 
     if (memory < hpagesize) {
-        return NULL;
+        error_setg(errp, "memory size 0x" RAM_ADDR_FMT " must be equal to "
+                   "or larger than huge page size 0x%" PRIx64,
+                   memory, hpagesize);
+        goto error;
     }
 
     if (kvm_enabled() && !kvm_has_sync_mmu()) {
@@ -1094,8 +1097,8 @@ static void *file_ram_alloc(RAMBlock *block,
     return area;
 
 error:
-    if (mem_prealloc) {
-        exit(1);
+    if (area && area != MAP_FAILED) {
+        munmap(area, memory);
     }
     return NULL;
 }
-- 
1.9.3
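
The second fix boils down to "report instead of abort" when the backend is smaller than one huge page. A standalone sketch of that check, assuming a plain character buffer in place of QEMU's Error object (check_backend_size() is a made-up name for illustration):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical check mirroring the size validation in the patch:
 * returns 0 if the backend covers at least one huge page, otherwise
 * writes a message into err and returns -1.
 */
int check_backend_size(uint64_t memory, uint64_t hpagesize,
                       char *err, size_t errlen)
{
    if (memory < hpagesize) {
        snprintf(err, errlen,
                 "memory size 0x%" PRIx64 " must be equal to or larger than "
                 "huge page size 0x%" PRIx64, memory, hpagesize);
        return -1;
    }
    return 0;
}
```

The caller decides what to do with the message, so a monitor command can print it and keep the guest running.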

[Qemu-devel] [PATCH V3 for 2.1 1/2] memory: add memory_region_init_ram_may_fail() and memory_region_init_ram_ptr_may_fail()

2014-07-07 Thread Hu Tao

These two are almost the same as memory_region_init_ram() and
memory_region_init_ram_ptr() except that they have an extra errp
parameter to let callers handle error.

In hostmem-ram.c we call memory_region_init_ram_may_fail() now rather
than memory_region_init_ram() so that error can be handled.

This patch solves a problem that qemu just exits when using monitor
command object_add to add a memory backend whose size is way too large.
In the case we'd better give an error message and keep guest running.

The problem can be reproduced as follows:

1. run qemu
2. (monitor)object_add memory-backend-ram,size=10G,id=ram0

Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 backends/hostmem-ram.c  |  4 ++--
 exec.c  | 32 ++
 hw/block/pflash_cfi01.c |  5 +++-
 hw/block/pflash_cfi02.c |  5 +++-
 include/exec/memory.h   | 40 +++-
 include/exec/ram_addr.h |  4 ++--
 memory.c| 61 +++--
 7 files changed, 123 insertions(+), 28 deletions(-)

diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
index d9a8290..d0e92ad 100644
--- a/backends/hostmem-ram.c
+++ b/backends/hostmem-ram.c
@@ -26,8 +26,8 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     }
 
     path = object_get_canonical_path_component(OBJECT(backend));
-    memory_region_init_ram(&backend->mr, OBJECT(backend), path,
-                           backend->size);
+    memory_region_init_ram_may_fail(&backend->mr, OBJECT(backend), path,
+                                    backend->size, errp);
     g_free(path);
 }
 
diff --git a/exec.c b/exec.c
index 5a2a25e..ca7741b 100644
--- a/exec.c
+++ b/exec.c
@@ -1224,7 +1224,7 @@ static int memory_try_enable_merging(void *addr, size_t len)
     return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
 }
 
-static ram_addr_t ram_block_add(RAMBlock *new_block)
+static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp)
 {
     RAMBlock *block;
     ram_addr_t old_ram_size, new_ram_size;
@@ -1241,9 +1241,11 @@ static ram_addr_t ram_block_add(RAMBlock *new_block)
         } else {
             new_block->host = phys_mem_alloc(new_block->length);
             if (!new_block->host) {
-                fprintf(stderr, "Cannot set up guest memory '%s': %s\n",
-                        new_block->mr->name, strerror(errno));
-                exit(1);
+                error_setg_errno(errp, errno,
+                                 "cannot set up guest memory '%s'",
+                                 new_block->mr->name);
+                qemu_mutex_unlock_ramlist();
+                return -1;
             }
             memory_try_enable_merging(new_block->host, new_block->length);
         }
@@ -1294,6 +1296,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
                                     Error **errp)
 {
     RAMBlock *new_block;
+    ram_addr_t addr;
 
     if (xen_enabled()) {
         error_setg(errp, "-mem-path not supported with Xen");
@@ -1323,14 +1326,20 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
         return -1;
     }
 
-    return ram_block_add(new_block);
+    addr = ram_block_add(new_block, errp);
+    if (errp && *errp) {
+        g_free(new_block);
+        return -1;
+    }
+    return addr;
 }
 #endif
 
 ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
-                                   MemoryRegion *mr)
+                                   MemoryRegion *mr, Error **errp)
 {
     RAMBlock *new_block;
+    ram_addr_t addr;
 
     size = TARGET_PAGE_ALIGN(size);
     new_block = g_malloc0(sizeof(*new_block));
@@ -1341,12 +1350,17 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
     if (host) {
         new_block->flags |= RAM_PREALLOC;
     }
-    return ram_block_add(new_block);
+    addr = ram_block_add(new_block, errp);
+    if (errp && *errp) {
+        g_free(new_block);
+        return -1;
+    }
+    return addr;
 }
 
-ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
+ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr, Error **errp)
 {
-    return qemu_ram_alloc_from_ptr(size, NULL, mr);
+    return qemu_ram_alloc_from_ptr(size, NULL, mr, errp);
 }
 
 void qemu_ram_free_from_ptr(ram_addr_t addr)
diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
index f9507b4..92b8b87 100644
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -770,7 +770,10 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp)
     memory_region_init_rom_device(
         &pfl->mem, OBJECT(dev),
         pfl->be ? &pflash_cfi01_ops_be : &pflash_cfi01_ops_le, pfl,
-        pfl->name, total_len);
+        pfl->name, total_len, errp);
+    if (errp && *errp) {
+        return;
+    }
     vmstate_register_ram(&pfl->mem, DEVICE(pfl));
     pfl->storage = memory_region_get_ram_ptr(&pfl->mem);
     sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem);
diff --git
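
The pattern this patch applies throughout (the callee sets an error and returns, the caller checks and cleans up) can be tried in isolation. The sketch below uses a deliberately simplified Error struct and a made-up set_error() helper; it is not QEMU's real error API, and alloc_ram_may_fail() and add_backend() are hypothetical names:

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for QEMU's Error object; not the real API. */
typedef struct Error {
    char msg[128];
} Error;

/* Record msg in *errp unless the caller passed NULL or an error is set. */
static void set_error(Error **errp, const char *msg)
{
    if (errp && !*errp) {
        *errp = calloc(1, sizeof(**errp));
        if (*errp) {
            snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
        }
    }
}

/* Hypothetical allocator: fails with an error instead of calling exit(). */
void *alloc_ram_may_fail(size_t size, size_t limit, Error **errp)
{
    if (size > limit) {
        set_error(errp, "cannot set up guest memory");
        return NULL;
    }
    return malloc(size);
}

/* Caller: report the failure and keep running.  Returns 0 or -1. */
int add_backend(size_t size, size_t limit)
{
    Error *err = NULL;
    void *ram = alloc_ram_may_fail(size, limit, &err);

    if (err) {
        fprintf(stderr, "%s\n", err->msg);
        free(err);
        return -1;
    }
    free(ram);
    return 0;
}
```

With this shape an oversized request from the monitor produces a message and a failed command rather than terminating the process.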

[Qemu-devel] [PATCH V3 for 2.1 0/2] bug fixs for memory backend

2014-07-07 Thread Hu Tao

This series includes two patches to fix bugs of memory backend. See each
patch for the bugs and how to reproduce them.

changes to v2:

- introduce memory_region_init_ram_may_fail and
  memory_region_init_ram_ptr_may_fail

- address comments by MST

- the function renaming is still missing; will send it later.

Hu Tao (2):
  memory: add memory_region_init_ram_may_fail() and
memory_region_init_ram_ptr_may_fail()
  exec: improve error handling and reporting in file_ram_alloc() and
gethugepagesize()

 backends/hostmem-ram.c  |  4 ++--
 exec.c  | 51 +++--
 hw/block/pflash_cfi01.c |  5 +++-
 hw/block/pflash_cfi02.c |  5 +++-
 include/exec/memory.h   | 40 +++-
 include/exec/ram_addr.h |  4 ++--
 memory.c| 61 +++--
 7 files changed, 134 insertions(+), 36 deletions(-)

-- 
1.9.3

Re: [Qemu-devel] [RFC PATCH 0/5] modify boot order when vm is running

2014-07-07 Thread Gonglei (Arei)

 -Original Message-
 From: Michael S. Tsirkin [mailto:m...@redhat.com]
 Sent: Monday, July 07, 2014 5:29 PM
 To: Gonglei (Arei)
 Cc: qemu-devel@nongnu.org; afaer...@suse.de; ag...@suse.de;
 stefa...@redhat.com; ak...@redhat.com; a...@ozlabs.ru;
 alex.william...@redhat.com; arm...@redhat.com; ebl...@redhat.com;
 kw...@redhat.com; peter.mayd...@linaro.org; lcapitul...@redhat.com;
 pbonz...@redhat.com; ler...@redhat.com; kra...@redhat.com;
 imamm...@redhat.com; dmi...@daynix.com; marce...@redhat.com;
 peter.crosthwa...@xilinx.com; r...@twiddle.net; so...@cmu.edu;
 Huangweidong (C); Luonengjun; Huangpeng (Peter); chenliang (T)
 Subject: Re: [RFC PATCH 0/5] modify boot order when vm is running

 On Mon, Jul 07, 2014 at 05:10:56PM +0800, arei.gong...@huawei.com wrote:
  From: Chenliang chenlian...@huawei.com

  Sometimes we want to modify the boot order of a vm without shutting it down.
  This set of patches adds one qmp command to achieve that, and fixes some
  small bugs that show up when a device is hotplugged.

  Chenliang (5):
bootindex: add *_boot_device_path function
bootindex: reset bootindex when vm reset
bootindex: delete boot index when device is removed
bootindex: add qmp to set boot index when vm is running
bootindex: fix memory leak when ppc sets boot index

 Unfortunately at least for PC, boot order is exposed
 in fw cfg which can not change while guest is running.

Yes, so we should ensure that it takes effect after the guest reboots.

 I suspect we need to change how we report boot order to guests.
 While we are at it, maybe we can fix the silly bootindex
 convention: I think people really want to specify boot *order*,
 not boot index.

Agreed.

But at present the boot index can express the boot order in a way the
-boot command line cannot. -boot can only select booting from hard
disk, network, floppy, etc.; it cannot choose among several hard disks
or PXE network cards, which is not enough for many scenarios, such as
P2V or a guest with two different system disks (vda/sda/hda).
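
To illustrate the difference between an index and an order, here is a small sketch that derives an order from per-device indexes. It assumes that a lower bootindex boots first and that a negative value means "unset, try last"; BootDev and sort_boot_order() are made-up names, not QEMU code:

```c
#include <stdlib.h>

/* Hypothetical device entry: a name plus its configured bootindex. */
typedef struct BootDev {
    const char *path;
    int bootindex;   /* negative means no bootindex was given */
} BootDev;

/* Order by ascending bootindex; entries without an index sort last. */
static int cmp_bootindex(const void *a, const void *b)
{
    const BootDev *x = a;
    const BootDev *y = b;

    if ((x->bootindex < 0) != (y->bootindex < 0)) {
        return x->bootindex < 0 ? 1 : -1;
    }
    return (x->bootindex > y->bootindex) - (x->bootindex < y->bootindex);
}

void sort_boot_order(BootDev *devs, size_t n)
{
    qsort(devs, n, sizeof(*devs), cmp_bootindex);
}
```

Changing one device's index at run time then only requires re-sorting the list that is handed to the firmware on the next reboot.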

Best regards,
-Gonglei

Re: [Qemu-devel] [RFC PATCH 0/5] modify boot order when vm is running

2014-07-07 Thread Gonglei (Arei)

 -Original Message-
 From: Laszlo Ersek [mailto:ler...@redhat.com]
 Sent: Monday, July 07, 2014 6:04 PM
 To: Michael S. Tsirkin; Gonglei (Arei)
 Cc: qemu-devel@nongnu.org; afaer...@suse.de; ag...@suse.de;
 stefa...@redhat.com; ak...@redhat.com; a...@ozlabs.ru;
 alex.william...@redhat.com; arm...@redhat.com; ebl...@redhat.com;
 kw...@redhat.com; peter.mayd...@linaro.org; lcapitul...@redhat.com;
 pbonz...@redhat.com; kra...@redhat.com; imamm...@redhat.com;
 dmi...@daynix.com; marce...@redhat.com; peter.crosthwa...@xilinx.com;
 r...@twiddle.net; so...@cmu.edu; Huangweidong (C); Luonengjun;
 Huangpeng (Peter); chenliang (T)
 Subject: Re: [RFC PATCH 0/5] modify boot order when vm is running
 
 On 07/07/14 11:29, Michael S. Tsirkin wrote:
  On Mon, Jul 07, 2014 at 05:10:56PM +0800, arei.gong...@huawei.com
 wrote:
  From: Chenliang chenlian...@huawei.com
 
  Sometimes we want to modify the boot order of a vm without shutting it down.
  This set of patches adds one qmp command to achieve that, and fixes some
  small bugs that show up when a device is hotplugged.
 
  Chenliang (5):
bootindex: add *_boot_device_path function
bootindex: reset bootindex when vm reset
bootindex: delete boot index when device is removed
bootindex: add qmp to set boot index when vm is running
bootindex: fix memory leak when ppc sets boot index
 
  Unfortunately at least for PC, boot order is exposed
  in fw cfg which can not change while guest is running.
  I suspect we need to change how we report boot order to guests.
  While we are at it, maybe we can fix the silly bootindex
  convention: I think people really want to specify boot *order*,
  not boot index.
 
 Please preserve the bootorder fw_cfg file, and its format.
 
 I don't have any request in relation to the new (== dynamic) feature ATM.
 
Sorry, I don't quite understand what you mean.
Could you explain it? Thanks!

 Thanks
 Laszlo

Best regards,
-Gonglei

[Qemu-devel] another locking issue in current dataplane code?

2014-07-07 Thread Christian Borntraeger

Folks,

with current 2.1-rc0 (
+  dataplane: do not free VirtQueueElement in vring_push()
+  virtio-blk: avoid dataplane VirtIOBlockReq early free
+ some not-ready yet s390 patches for migration
)

I am still having issues with dataplane during managedsave (without dataplane 
everything seems to work fine):

With 1 CPU and 1 disk (and some workload, e.g. a simple dd on the disk) I get:


Thread 3 (Thread 0x3fff90fd910 (LWP 27218)):
#0  0x03fffcdb7ba0 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x03fffcdbac0c in __pthread_mutex_cond_lock () from 
/lib64/libpthread.so.0
#2  0x03fffcdb399a in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
#3  0x801fff06 in qemu_cond_wait (cond=optimized out, 
mutex=mutex@entry=0x8037f788 qemu_global_mutex) at 
/home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:135
#4  0x800472f4 in qemu_kvm_wait_io_event (cpu=optimized out) at 
/home/cborntra/REPOS/qemu/cpus.c:843
#5  qemu_kvm_cpu_thread_fn (arg=0x809ad6b0) at 
/home/cborntra/REPOS/qemu/cpus.c:879
#6  0x03fffcdaf412 in start_thread () from /lib64/libpthread.so.0
#7  0x03fffba350ae in thread_start () from /lib64/libc.so.6

Thread 2 (Thread 0x3fff88fd910 (LWP 27219)):
#0  0x03fffba2a8e0 in ppoll () from /lib64/libc.so.6
#1  0x801af250 in ppoll (__ss=0x0, __timeout=0x0, __nfds=optimized 
out, __fds=optimized out) at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=fds@entry=0x3fff40010c0, nfds=nfds@entry=3, timeout=-1) 
at /home/cborntra/REPOS/qemu/qemu-timer.c:314
#3  0x801b0702 in aio_poll (ctx=0x807f2230, 
blocking=blocking@entry=true) at /home/cborntra/REPOS/qemu/aio-posix.c:221
#4  0x800be3c4 in iothread_run (opaque=0x807f20d8) at 
/home/cborntra/REPOS/qemu/iothread.c:41
#5  0x03fffcdaf412 in start_thread () from /lib64/libpthread.so.0
#6  0x03fffba350ae in thread_start () from /lib64/libc.so.6

Thread 1 (Thread 0x3fff9c529b0 (LWP 27215)):
#0  0x03fffcdb38f0 in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
#1  0x801fff06 in qemu_cond_wait (cond=cond@entry=0x807f22c0, 
mutex=mutex@entry=0x807f2290) at 
/home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:135
#2  0x80212906 in rfifolock_lock (r=r@entry=0x807f2290) at 
/home/cborntra/REPOS/qemu/util/rfifolock.c:59
#3  0x8019e536 in aio_context_acquire (ctx=ctx@entry=0x807f2230) at 
/home/cborntra/REPOS/qemu/async.c:295
#4  0x801a34e6 in bdrv_drain_all () at 
/home/cborntra/REPOS/qemu/block.c:1907
#5  0x80048e24 in do_vm_stop (state=RUN_STATE_PAUSED) at 
/home/cborntra/REPOS/qemu/cpus.c:538
#6  vm_stop (state=state@entry=RUN_STATE_PAUSED) at 
/home/cborntra/REPOS/qemu/cpus.c:1221
#7  0x800e6338 in qmp_stop (errp=errp@entry=0x3a9dc00) at 
/home/cborntra/REPOS/qemu/qmp.c:98
#8  0x800e1314 in qmp_marshal_input_stop (mon=optimized out, 
qdict=optimized out, ret=optimized out) at qmp-marshal.c:2806
#9  0x8004b91a in qmp_call_cmd (cmd=optimized out, params=0x8096cf50, 
mon=0x8080b8a0) at /home/cborntra/REPOS/qemu/monitor.c:5038
#10 handle_qmp_command (parser=optimized out, tokens=optimized out) at 
/home/cborntra/REPOS/qemu/monitor.c:5104
#11 0x801faf16 in json_message_process_token (lexer=0x8080b7c0, 
token=0x808f2610, type=optimized out, x=optimized out, y=6) at 
/home/cborntra/REPOS/qemu/qobject/json-streamer.c:87
#12 0x80212bac in json_lexer_feed_char (lexer=lexer@entry=0x8080b7c0, 
ch=optimized out, flush=flush@entry=false) at 
/home/cborntra/REPOS/qemu/qobject/json-lexer.c:303
#13 0x80212cfe in json_lexer_feed (lexer=0x8080b7c0, buffer=optimized 
out, size=optimized out) at 
/home/cborntra/REPOS/qemu/qobject/json-lexer.c:356
#14 0x801fb10e in json_message_parser_feed (parser=optimized out, 
buffer=optimized out, size=optimized out) at 
/home/cborntra/REPOS/qemu/qobject/json-streamer.c:110
#15 0x80049f28 in monitor_control_read (opaque=optimized out, 
buf=optimized out, size=optimized out) at 
/home/cborntra/REPOS/qemu/monitor.c:5125
#16 0x800c8636 in qemu_chr_be_write (len=1, buf=0x3a9e010 
}[B\377\373\251\372\b, s=0x807f5af0) at 
/home/cborntra/REPOS/qemu/qemu-char.c:213
#17 tcp_chr_read (chan=optimized out, cond=optimized out, 
opaque=0x807f5af0) at /home/cborntra/REPOS/qemu/qemu-char.c:2690
#18 0x03fffcc9f05a in g_main_context_dispatch () from 
/lib64/libglib-2.0.so.0
#19 0x801ae3e0 in glib_pollfds_poll () at 
/home/cborntra/REPOS/qemu/main-loop.c:190
#20 os_host_main_loop_wait (timeout=optimized out) at 
/home/cborntra/REPOS/qemu/main-loop.c:235
#21 main_loop_wait (nonblocking=optimized out) at 
/home/cborntra/REPOS/qemu/main-loop.c:484
#22 0x800169e2 in main_loop () at /home/cborntra/REPOS/qemu/vl.c:2024
#23 main (argc=optimized out, argv=optimized out, envp=optimized out) at 
/home/cborntra/REPOS/qemu/vl.c:4551

Now. If aio_poll never returns, we have a deadlock here. 
To me it looks like, that aio_poll could be called

[Qemu-devel] Strange behaviour with MSR?

2014-07-07 Thread François

Hello,

I'm not sure I'm on the right list to post, sorry about that, but I
tried on IRC and got no answer.

I'm working on a low level piece of system, which has to change PSR
values on ARM.

I use qemu-system-arm v 2.0.0, with the command : qemu-system-arm
-nographic -s -S -m 1024 -M vexpress-a9 -kernel ./bootstrap


My issue is the following: Just before the MSR call, I have an LR value.
When executing MSR, the LR value gets nulled.
After a second iteration, MSR does *not* set this value to 0.

I really can't tell whether this comes from qemu. I'm writing this
message because I think it does not come from my code: the reset caused
by lr = 0 restarts the system, and thus reinitializes the context
with the same values and the same call graph.

Here is a gdb trace from the issue :


(gdb) target remote 127.0.0.1:1234
Remote debugging using 127.0.0.1:1234
0x6000 in ?? ()
(gdb) b *0x61005814
Breakpoint 1 at 0x61005814
(gdb) c
Continuing.

Breakpoint 1, 0x61005814 in ?? ()
(gdb) x /i $pc
=> 0x61005814:  msr CPSR_fsxc, r3
(gdb) info reg
r0             0xe1a010b2       -509603662
r1             0x0              0
r2             0x2822001        42082305
r3             0xe1a010b2       -509603662
r4             0x0              0
r5             0x6100ec04       1627450372
r6             0x0              0
r7             0x0              0
r8             0x0              0
r9             0x0              0
r10            0x0              0
r11            0x60340be4       1614023652
r12            0x0              0
sp             0x60340bc0       0x60340bc0
lr             0x610057cc       1627412428
pc             0x61005814       0x61005814
cpsr           0x80000013       -2147483629
(gdb) si
0x61005818 in ?? ()
(gdb) info reg
r0             0xe1a010b2       -509603662
r1             0x0              0
r2             0x2822001        42082305
r3             0xe1a010b2       -509603662
r4             0x0              0
r5             0x6100ec04       1627450372
r6             0x0              0
r7             0x0              0
r8             0x0              0
r9             0x0              0
r10            0x0              0
r11            0x60340be4       1614023652
r12            0x0              0
sp             0x0              0x0
lr             0x0              0
pc             0x61005818       0x61005818
cpsr           0xe0000092       -536870766
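
One thing worth checking in the two dumps, offered as a possible reading rather than a confirmed diagnosis: the low five bits of cpsr (the mode field) change from 0x13 to 0x12 across the MSR. On ARM those values are Supervisor and IRQ mode, and sp and lr are banked per mode, so a debugger would show a different pair of registers after the switch. A small helper to decode the field (the function names are made up for illustration):

```c
#include <stdint.h>

/* Name of the ARM processor mode encoded in CPSR bits [4:0]. */
const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr & 0x1f) {
    case 0x10: return "usr";
    case 0x11: return "fiq";
    case 0x12: return "irq";
    case 0x13: return "svc";
    case 0x17: return "abt";
    case 0x1b: return "und";
    case 0x1f: return "sys";
    default:   return "unknown";
    }
}

/* 1 if the two CPSR values select different modes, 0 otherwise. */
int cpsr_mode_changed(uint32_t before, uint32_t after)
{
    return (before & 0x1f) != (after & 0x1f);
}
```

If that is what happens here, the value written from r3 is the thing to look at, since all four CPSR fields are being written.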


Thanks in advance for any piece of advice :)

--
François




Re: [Qemu-devel] [RFC PATCH 00/11] Adding FreeBSD's Capsicum security framework (part 1)

2014-07-07 Thread Paolo Bonzini


On 07/07/2014 12:29, David Drysdale wrote:

I think that's more easily done by opening the file as O_RDONLY/O_WRONLY
/O_RDWR.   You could do it by running the file descriptor's seccomp-bpf
program once per iocb with synthesized syscall numbers and argument
vectors.


Right, but generating the equivalent seccomp input environment for an
equivalent single-fd syscall is going to be subtle and complex (which
are worrying words to mention in a security context).  And how many
other syscalls are going to need similar special-case processing?
(poll? select? send[m]msg? ...)


Yeah, the difficult part is getting the right balance between:

1) limitations due to seccomp's inability to chase pointers (which 
is not something that can be lifted, as it's required for correctness)


2) subtlety and complexity of the resulting code.

The problem arises when you have a single syscall operating on 
multiple file descriptors.  So for example among the cases you mention 
poll and select are problematic; sendm?msg are not.  They would be if 
Capsicum had a capability for SCM_RIGHTS file descriptor passing, but I 
cannot find it.


But then you also have to strike the right balance between a complete 
solution and an overengineered one.


For example, even though poll and select are problematic, I wonder what 
the point would really be in blocking them; poll/select are 
level-triggered, and calling them should be idempotent as far as the 
file descriptor is concerned.  You might want to prevent a process/thread 
from issuing blocking system calls, but you'd do that with a per-process 
filter, not with per-file-descriptor filters or capabilities.



Capsicum capabilities are associated with the file descriptor (a la
F_GETFD), not the open file itself -- different FDs with different
associated rights can map to the same underlying open file.
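
The F_GETFD analogy can be checked with plain POSIX calls, without any Capsicum API. The sketch below (function name made up for illustration) duplicates a descriptor and shows that the close-on-exec flag is kept per descriptor while the file offset lives in the shared open file:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Returns 1 if two descriptors for the same open file keep separate
 * descriptor flags while sharing one file offset, 0 if not, and -1 if
 * the temporary file could not be set up.
 */
int fd_flags_are_per_descriptor(void)
{
    FILE *tmp = tmpfile();
    if (!tmp) {
        return -1;
    }
    int fd1 = fileno(tmp);
    int fd2 = dup(fd1);
    if (fd2 < 0) {
        fclose(tmp);
        return -1;
    }

    /* Descriptor flags: clear on fd1, set on fd2 only. */
    int ok = fcntl(fd1, F_SETFD, 0) == 0
          && fcntl(fd2, F_SETFD, FD_CLOEXEC) == 0
          && (fcntl(fd2, F_GETFD) & FD_CLOEXEC) != 0
          && (fcntl(fd1, F_GETFD) & FD_CLOEXEC) == 0;

    /* File offset: moving it through fd1 is visible through fd2. */
    ok = ok && lseek(fd1, 42, SEEK_SET) == 42
            && lseek(fd2, 0, SEEK_CUR) == 42;

    close(fd2);
    fclose(tmp);
    return ok ? 1 : 0;
}
```

Per-descriptor rights would sit on the same side of that split as FD_CLOEXEC, which is what lets two descriptors for one open file carry different rights.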


Good to know, thanks.  I suppose you have testcases that cover this.

Paolo

Re: [Qemu-devel] [PATCH V3 for 2.1 1/2] memory: add memory_region_init_ram_may_fail() and memory_region_init_ram_ptr_may_fail()

2014-07-07 Thread Michael S. Tsirkin

On Mon, Jul 07, 2014 at 06:55:27PM +0800, Hu Tao wrote:
 These two are almost the same as memory_region_init_ram() and
 memory_region_init_ram_ptr() except that they have an extra errp
 parameter to let callers handle error.
 
 In hostmem-ram.c we call memory_region_init_ram_may_fail() now rather
 than memory_region_init_ram() so that error can be handled.
 
 This patch solves a problem that qemu just exits when using monitor
 command object_add to add a memory backend whose size is way too large.
 In the case we'd better give an error message and keep guest running.
 
 The problem can be reproduced as follows:
 
 1. run qemu
 2. (monitor)object_add memory-backend-ram,size=10G,id=ram0
 
 Signed-off-by: Hu Tao hu...@cn.fujitsu.com
 ---
  backends/hostmem-ram.c  |  4 ++--
  exec.c  | 32 ++
  hw/block/pflash_cfi01.c |  5 +++-
  hw/block/pflash_cfi02.c |  5 +++-

Why do we need to patch pflash?
We can't trigger it with object_add, can we?

  include/exec/memory.h   | 40 +++-
  include/exec/ram_addr.h |  4 ++--
  memory.c| 61 
 +++--
  7 files changed, 123 insertions(+), 28 deletions(-)
 
 diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
 index d9a8290..d0e92ad 100644
 --- a/backends/hostmem-ram.c
 +++ b/backends/hostmem-ram.c
 @@ -26,8 +26,8 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error 
 **errp)
  }
  
  path = object_get_canonical_path_component(OBJECT(backend));
 -memory_region_init_ram(backend-mr, OBJECT(backend), path,
 -   backend-size);
 +memory_region_init_ram_may_fail(backend-mr, OBJECT(backend), path,
 +backend-size, errp);
  g_free(path);
  }
  
 diff --git a/exec.c b/exec.c
 index 5a2a25e..ca7741b 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -1224,7 +1224,7 @@ static int memory_try_enable_merging(void *addr, size_t 
 len)
  return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
  }
  
 -static ram_addr_t ram_block_add(RAMBlock *new_block)
 +static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp)
  {
  RAMBlock *block;
  ram_addr_t old_ram_size, new_ram_size;
 @@ -1241,9 +1241,11 @@ static ram_addr_t ram_block_add(RAMBlock *new_block)
  } else {
  new_block-host = phys_mem_alloc(new_block-length);
  if (!new_block-host) {
 -fprintf(stderr, Cannot set up guest memory '%s': %s\n,
 -new_block-mr-name, strerror(errno));
 -exit(1);
 +error_setg_errno(errp, errno,
 + cannot set up guest memory '%s',
 + new_block-mr-name);
 +qemu_mutex_unlock_ramlist();
 +return -1;
  }
  memory_try_enable_merging(new_block-host, new_block-length);
  }
 @@ -1294,6 +1296,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
 MemoryRegion *mr,
  Error **errp)
  {
  RAMBlock *new_block;
 +ram_addr_t addr;
  
  if (xen_enabled()) {
  error_setg(errp, -mem-path not supported with Xen);
 @@ -1323,14 +1326,20 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
 MemoryRegion *mr,
  return -1;
  }
  
 -return ram_block_add(new_block);
 +addr = ram_block_add(new_block, errp);
 +if (errp  *errp) {
 +g_free(new_block);
 +return -1;
 +}
 +return addr;
  }
  #endif
  
  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
 -   MemoryRegion *mr)
 +   MemoryRegion *mr, Error **errp)
  {
  RAMBlock *new_block;
 +ram_addr_t addr;
  
  size = TARGET_PAGE_ALIGN(size);
  new_block = g_malloc0(sizeof(*new_block));
 @@ -1341,12 +1350,17 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, 
 void *host,
  if (host) {
  new_block-flags |= RAM_PREALLOC;
  }
 -return ram_block_add(new_block);
 +addr = ram_block_add(new_block, errp);
 +if (errp  *errp) {
 +g_free(new_block);
 +return -1;
 +}
 +return addr;
  }
  
 -ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
 +ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr, Error **errp)
  {
 -return qemu_ram_alloc_from_ptr(size, NULL, mr);
 +return qemu_ram_alloc_from_ptr(size, NULL, mr, errp);
  }
  
  void qemu_ram_free_from_ptr(ram_addr_t addr)
 diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
 index f9507b4..92b8b87 100644
 --- a/hw/block/pflash_cfi01.c
 +++ b/hw/block/pflash_cfi01.c
 @@ -770,7 +770,10 @@ static void pflash_cfi01_realize(DeviceState *dev, Error 
 **errp)
  memory_region_init_rom_device(
  pfl-mem, OBJECT(dev),
  pfl-be ? pflash_cfi01_ops_be : pflash_cfi01_ops_le, pfl,
 -pfl-name,

[Qemu-devel] [PULL for-2.1 00/11] Block patches

2014-07-07 Thread Stefan Hajnoczi

Bug fixes for QEMU 2.1-rc1.

The following changes since commit 9d9de254c2b81b68cd48f2324cc753a570a4cdd8:

  MAINTAINERS: seccomp: change email contact for Eduardo Otubo (2014-07-03 
12:36:15 +0100)

are available in the git repository at:

  git://github.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to f4eb32b590bf58c1c67570775eb78beb09964fad:

  qmp: show QOM properties in device-list-properties (2014-07-07 11:10:05 +0200)


Block pull request


Benoît Canet (1):
  qemu-iotests: Disable Quorum testing in 041 when Quorum is not builtin

Chunyan Liu (1):
  Fix nocow typos in manpage

Kevin Wolf (1):
  mirror: Fix qiov size for short requests

Le Tan (1):
  ahci: map memory via device's address space instead of 
address_space_memory

Markus Armbruster (1):
  raw-posix: Fix raw_getlength() to always return -errno on error

Ming Lei (3):
  block: block: introduce APIs for submitting IO as a batch
  linux-aio: implement io plug, unplug and flush io queue
  dataplane: submit I/O as a batch

Reza Jelveh (1):
  ahci.c: mask unused flags when reading size PRDT DBC

Stefan Hajnoczi (2):
  MAINTAINERS: add Stefan Hajnoczi to IDE maintainers
  qmp: show QOM properties in device-list-properties

 MAINTAINERS |  1 +
 block.c | 31 +
 block/linux-aio.c   | 96 ++-
 block/mirror.c  |  4 +-
 block/raw-aio.h |  2 +
 block/raw-posix.c   | 73 +++---
 hw/block/dataplane/virtio-blk.c |  2 +
 hw/ide/ahci.c   | 32 +++--
 hw/ide/ahci.h   |  2 +
 include/block/block.h   |  4 ++
 include/block/block_int.h   |  5 +++
 qemu-doc.texi   |  4 +-
 qemu-img.texi   |  4 +-
 qmp.c   | 99 +++--
 tests/qemu-iotests/041  | 46 ++-
 tests/qemu-iotests/041.out  |  4 +-
 16 files changed, 356 insertions(+), 53 deletions(-)

-- 
1.9.3

[Qemu-devel] [PULL for-2.1 01/11] Fix nocow typos in manpage

2014-07-07 Thread Stefan Hajnoczi

From: Chunyan Liu cy...@suse.com

Signed-off-by: Chunyan Liu cy...@suse.com
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 qemu-doc.texi | 4 ++--
 qemu-img.texi | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/qemu-doc.texi b/qemu-doc.texi
index ad92c85..551619a 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -590,7 +590,7 @@ check -r all} is required, which may take some time.
 This option can only be enabled if @code{compat=1.1} is specified.
 
 @item nocow
-If this option is set to @code{on}, it will trun off COW of the file. It's only
+If this option is set to @code{on}, it will turn off COW of the file. It's only
 valid on btrfs, no effect on other file systems.
 
 Btrfs has low performance when hosting a VM image file, even more when the 
guest
@@ -603,7 +603,7 @@ does.
 Note: this option is only valid to new or empty files. If there is an existing
 file which is COW and has data blocks already, it couldn't be changed to NOCOW
 by setting @code{nocow=on}. One can issue @code{lsattr filename} to check if
-the NOCOW flag is set or not (Capitabl 'C' is NOCOW flag).
+the NOCOW flag is set or not (Capital 'C' is NOCOW flag).
 
 @end table
 
diff --git a/qemu-img.texi b/qemu-img.texi
index 8496f3b..514be90 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -475,7 +475,7 @@ check -r all} is required, which may take some time.
 This option can only be enabled if @code{compat=1.1} is specified.
 
 @item nocow
-If this option is set to @code{on}, it will trun off COW of the file. It's only
+If this option is set to @code{on}, it will turn off COW of the file. It's only
 valid on btrfs, no effect on other file systems.
 
 Btrfs has low performance when hosting a VM image file, even more when the 
guest
@@ -488,7 +488,7 @@ does.
 Note: this option is only valid to new or empty files. If there is an existing
 file which is COW and has data blocks already, it couldn't be changed to NOCOW
 by setting @code{nocow=on}. One can issue @code{lsattr filename} to check if
-the NOCOW flag is set or not (Capitabl 'C' is NOCOW flag).
+the NOCOW flag is set or not (Capital 'C' is NOCOW flag).
 
 @end table
 
-- 
1.9.3

[Qemu-devel] [PULL for-2.1 02/11] mirror: Fix qiov size for short requests

2014-07-07 Thread Stefan Hajnoczi

From: Kevin Wolf kw...@redhat.com

When mirroring an image of a size that is not a multiple of the
mirror job granularity, the last request would have the right nb_sectors
argument, but a qiov that is rounded up to the next multiple of the
granularity. Don't do this.

This fixes a segfault that is caused by raw-posix being confused by this
and allocating a buffer with request length, but operating on it with
qiov length.

[s/Driver/Drive/ in qemu-iotests 041 as suggested by Eric
--Stefan]

Reported-by: Eric Blake ebl...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
Tested-by: Eric Blake ebl...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 block/mirror.c | 4 +++-
 tests/qemu-iotests/041 | 5 +
 tests/qemu-iotests/041.out | 4 ++--
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 6c3ee70..c7a655f 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -265,9 +265,11 @@ static uint64_t coroutine_fn 
mirror_iteration(MirrorBlockJob *s)
     next_sector = sector_num;
     while (nb_chunks-- > 0) {
         MirrorBuffer *buf = QSIMPLEQ_FIRST(&s->buf_free);
+        size_t remaining = (nb_sectors * BDRV_SECTOR_SIZE) - op->qiov.size;
+
         QSIMPLEQ_REMOVE_HEAD(&s->buf_free, next);
         s->buf_free_count--;
-        qemu_iovec_add(&op->qiov, buf, s->granularity);
+        qemu_iovec_add(&op->qiov, buf, MIN(s->granularity, remaining));
 
         /* Advance the HBitmapIter in parallel, so that we do not examine
          * the same sector twice.
diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 0815e19..005090e 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -217,6 +217,11 @@ class TestSingleDriveZeroLength(TestSingleDrive):
 test_small_buffer2 = None
 test_large_cluster = None
 
+class TestSingleDriveUnalignedLength(TestSingleDrive):
+image_len = 1025 * 1024
+test_small_buffer2 = None
+test_large_cluster = None
+
 class TestMirrorNoBacking(ImageMirroringTestCase):
 image_len = 2 * 1024 * 1024 # MB
 
diff --git a/tests/qemu-iotests/041.out b/tests/qemu-iotests/041.out
index 42147c0..24093bc 100644
--- a/tests/qemu-iotests/041.out
+++ b/tests/qemu-iotests/041.out
@@ -1,5 +1,5 @@
-..
+..
 --
-Ran 46 tests
+Ran 54 tests
 
 OK
-- 
1.9.3
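
The clamping described in the mirror fix above can be sketched in isolation. This is a minimal illustration, not QEMU code: `fill_chunk_lengths` is a hypothetical helper, and `total` plays the role that `op->qiov.size` plays in the patch. Each chunk is limited to the bytes still remaining, so the summed chunk lengths equal the request length instead of being rounded up to the granularity.

```c
#include <stddef.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper: split a request of request_len bytes into chunks of
 * at most granularity bytes, storing each chunk length in lens[].
 * Returns the number of chunks written (never more than max_chunks).
 */
static size_t fill_chunk_lengths(size_t request_len, size_t granularity,
                                 size_t *lens, size_t max_chunks)
{
    size_t total = 0;   /* bytes covered so far */
    size_t n = 0;

    while (total < request_len && n < max_chunks) {
        size_t remaining = request_len - total;

        /* The last chunk is clamped instead of rounded up. */
        lens[n] = MIN(granularity, remaining);
        total += lens[n];
        n++;
    }
    return n;
}
```

With the 1025 KiB image length used by the new test case and an assumed 1 MiB granularity, the second chunk is 1 KiB rather than a full 1 MiB.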

[Qemu-devel] [PULL for-2.1 09/11] linux-aio: implement io plug, unplug and flush io queue

2014-07-07 Thread Stefan Hajnoczi

From: Ming Lei ming@canonical.com

This patch implements .bdrv_io_plug, .bdrv_io_unplug and
.bdrv_flush_io_queue callbacks for linux-aio Block Drivers,
so that submitting I/O as a batch can be supported on linux-aio.

[Unprocessed requests are completed with -EIO instead of a bogus ret
value.
--Stefan]

Signed-off-by: Ming Lei ming@canonical.com
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 block/linux-aio.c | 96 +--
 block/raw-aio.h   |  2 ++
 block/raw-posix.c | 45 ++
 3 files changed, 141 insertions(+), 2 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index f0a2c08..4867369 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -25,6 +25,8 @@
  */
 #define MAX_EVENTS 128
 
+#define MAX_QUEUED_IO  128
+
 struct qemu_laiocb {
 BlockDriverAIOCB common;
 struct qemu_laio_state *ctx;
@@ -36,9 +38,19 @@ struct qemu_laiocb {
 QLIST_ENTRY(qemu_laiocb) node;
 };
 
+typedef struct {
+struct iocb *iocbs[MAX_QUEUED_IO];
+int plugged;
+unsigned int size;
+unsigned int idx;
+} LaioQueue;
+
 struct qemu_laio_state {
 io_context_t ctx;
 EventNotifier e;
+
+/* io queue for submit at batch */
+LaioQueue io_q;
 };
 
 static inline ssize_t io_event_ret(struct io_event *ev)
@@ -135,6 +147,79 @@ static const AIOCBInfo laio_aiocb_info = {
 .cancel = laio_cancel,
 };
 
+static void ioq_init(LaioQueue *io_q)
+{
+    io_q->size = MAX_QUEUED_IO;
+    io_q->idx = 0;
+    io_q->plugged = 0;
+}
+
+static int ioq_submit(struct qemu_laio_state *s)
+{
+    int ret, i = 0;
+    int len = s->io_q.idx;
+
+    do {
+        ret = io_submit(s->ctx, len, s->io_q.iocbs);
+    } while (i++ < 3 && ret == -EAGAIN);
+
+    /* empty io queue */
+    s->io_q.idx = 0;
+
+    if (ret < 0) {
+        i = 0;
+    } else {
+        i = ret;
+    }
+
+    for (; i < len; i++) {
+        struct qemu_laiocb *laiocb =
+            container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
+
+        laiocb->ret = (ret < 0) ? ret : -EIO;
+        qemu_laio_process_completion(s, laiocb);
+    }
+    return ret;
+}
+
+static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
+{
+    unsigned int idx = s->io_q.idx;
+
+    s->io_q.iocbs[idx++] = iocb;
+    s->io_q.idx = idx;
+
+    /* submit immediately if queue is full */
+    if (idx == s->io_q.size) {
+        ioq_submit(s);
+    }
+}
+
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
+{
+    struct qemu_laio_state *s = aio_ctx;
+
+    s->io_q.plugged++;
+}
+
+int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
+{
+    struct qemu_laio_state *s = aio_ctx;
+    int ret = 0;
+
+    assert(s->io_q.plugged > 0 || !unplug);
+
+    if (unplug && --s->io_q.plugged > 0) {
+        return 0;
+    }
+
+    if (s->io_q.idx > 0) {
+        ret = ioq_submit(s);
+    }
+
+    return ret;
+}
+
 BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
 int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
 BlockDriverCompletionFunc *cb, void *opaque, int type)
@@ -168,8 +253,13 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void 
*aio_ctx, int fd,
 }
     io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
 
-    if (io_submit(s->ctx, 1, &iocbs) < 0)
-        goto out_free_aiocb;
+    if (!s->io_q.plugged) {
+        if (io_submit(s->ctx, 1, &iocbs) < 0) {
+            goto out_free_aiocb;
+        }
+    } else {
+        ioq_enqueue(s, iocbs);
+    }
     return &laiocb->common;
 
 out_free_aiocb:
@@ -204,6 +294,8 @@ void *laio_init(void)
 goto out_close_efd;
 }
 
+    ioq_init(&s->io_q);
+
 return s;
 
 out_close_efd:
diff --git a/block/raw-aio.h b/block/raw-aio.h
index 8cf084e..e18c975 100644
--- a/block/raw-aio.h
+++ b/block/raw-aio.h
@@ -40,6 +40,8 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void 
*aio_ctx, int fd,
 BlockDriverCompletionFunc *cb, void *opaque, int type);
 void laio_detach_aio_context(void *s, AioContext *old_context);
 void laio_attach_aio_context(void *s, AioContext *new_context);
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
+int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug);
 #endif
 
 #ifdef _WIN32
diff --git a/block/raw-posix.c b/block/raw-posix.c
index fa005b3..a857def 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1057,6 +1057,36 @@ static BlockDriverAIOCB *raw_aio_submit(BlockDriverState 
*bs,
cb, opaque, type);
 }
 
+static void raw_aio_plug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        laio_io_plug(bs, s->aio_ctx);
+    }
+#endif
+}
+
+static void raw_aio_unplug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        laio_io_unplug(bs, s->aio_ctx, true);
+    }
+#endif
+}
+
+static void raw_aio_flush_io_queue(BlockDriverState *bs)
+{
+#ifdef
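
The plug/unplug counting in the linux-aio patch above can be sketched without libaio. This is a simplified model under stated assumptions: `BatchQueue` and the `batch_*` functions are hypothetical names, the queue holds plain integers instead of `struct iocb` pointers, and a "submission" only bumps counters. It shows the two rules the patch relies on: requests are queued only while the plug count is non-zero, and only the outermost unplug (or a full queue) flushes the batch.

```c
#define QUEUE_MAX 4   /* illustrative; the patch uses MAX_QUEUED_IO = 128 */

/* Hypothetical stand-in for the per-context queue state. */
typedef struct {
    int pending[QUEUE_MAX];  /* queued request ids */
    unsigned int idx;        /* number of queued requests */
    int plugged;             /* nesting depth of plug calls */
    int batches;             /* how many submissions happened */
    int submitted;           /* how many requests were submitted in total */
} BatchQueue;

/* Submit everything queued so far as one batch and empty the queue. */
static void batch_flush(BatchQueue *q)
{
    if (q->idx == 0) {
        return;
    }
    q->submitted += (int)q->idx;
    q->batches++;
    q->idx = 0;
}

static void batch_plug(BatchQueue *q)
{
    q->plugged++;
}

/* Only the outermost unplug flushes the queue. */
static void batch_unplug(BatchQueue *q)
{
    if (--q->plugged > 0) {
        return;
    }
    batch_flush(q);
}

/* Queue while plugged, otherwise submit at once. */
static void batch_submit(BatchQueue *q, int request_id)
{
    if (!q->plugged) {
        q->submitted++;
        q->batches++;
        return;
    }
    q->pending[q->idx++] = request_id;
    if (q->idx == QUEUE_MAX) {
        batch_flush(q);   /* queue full: submit immediately */
    }
}
```

Three requests issued between one plug and one unplug result in a single batch; a request issued afterwards goes out on its own.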

[Qemu-devel] [PULL for-2.1 06/11] raw-posix: Fix raw_getlength() to always return -errno on error

2014-07-07 Thread Stefan Hajnoczi

From: Markus Armbruster arm...@redhat.com

We got a merry mix of -1 and -errno here.

Signed-off-by: Markus Armbruster arm...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Reviewed-by: Benoit Canet ben...@irqsave.net
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 block/raw-posix.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 825a0c8..fa005b3 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1133,12 +1133,12 @@ static int64_t raw_getlength(BlockDriverState *bs)
     struct stat st;
 
     if (fstat(fd, &st))
-        return -1;
+        return -errno;
     if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
         struct disklabel dl;
 
         if (ioctl(fd, DIOCGDINFO, &dl))
-            return -1;
+            return -errno;
         return (uint64_t)dl.d_secsize *
             dl.d_partitions[DISKPART(st.st_rdev)].p_size;
     } else
@@ -1152,7 +1152,7 @@ static int64_t raw_getlength(BlockDriverState *bs)
     struct stat st;
 
     if (fstat(fd, &st))
-        return -1;
+        return -errno;
     if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
         struct dkwedge_info dkw;
 
@@ -1162,7 +1162,7 @@ static int64_t raw_getlength(BlockDriverState *bs)
             struct disklabel dl;
 
             if (ioctl(fd, DIOCGDINFO, &dl))
-                return -1;
+                return -errno;
             return (uint64_t)dl.d_secsize *
                 dl.d_partitions[DISKPART(st.st_rdev)].p_size;
         }
@@ -1175,6 +1175,7 @@ static int64_t raw_getlength(BlockDriverState *bs)
     BDRVRawState *s = bs->opaque;
     struct dk_minfo minfo;
     int ret;
+    int64_t size;
 
     ret = fd_open(bs);
     if (ret < 0) {
@@ -1193,7 +1194,11 @@ static int64_t raw_getlength(BlockDriverState *bs)
      * There are reports that lseek on some devices fails, but
      * irc discussion said that contingency on contingency was overkill.
      */
-    return lseek(s->fd, 0, SEEK_END);
+    size = lseek(s->fd, 0, SEEK_END);
+    if (size < 0) {
+        return -errno;
+    }
+    return size;
 }
 #elif defined(CONFIG_BSD)
 static int64_t raw_getlength(BlockDriverState *bs)
@@ -1231,6 +1236,9 @@ again:
         size = LLONG_MAX;
 #else
         size = lseek(fd, 0LL, SEEK_END);
+        if (size < 0) {
+            return -errno;
+        }
 #endif
 #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
         switch(s->type) {
@@ -1247,6 +1255,9 @@ again:
 #endif
     } else {
         size = lseek(fd, 0, SEEK_END);
+        if (size < 0) {
+            return -errno;
+        }
     }
     return size;
 }
@@ -1255,13 +1266,18 @@ static int64_t raw_getlength(BlockDriverState *bs)
 {
     BDRVRawState *s = bs->opaque;
     int ret;
+    int64_t size;
 
     ret = fd_open(bs);
     if (ret < 0) {
         return ret;
     }
 
-    return lseek(s->fd, 0, SEEK_END);
+    size = lseek(s->fd, 0, SEEK_END);
+    if (size < 0) {
+        return -errno;
+    }
+    return size;
 }
 #endif
 
-- 
1.9.3
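
The convention the raw_getlength() patch above enforces can be shown with a small standalone function. This is a sketch, not QEMU code: `fd_length` is a hypothetical name. The point is that a bare -1 carries no information (and, where EPERM is 1, reads as -EPERM), whereas -errno tells the caller what actually went wrong.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the length of the file behind fd in bytes,
 * or a negative errno value on failure.
 */
static int64_t fd_length(int fd)
{
    off_t size = lseek(fd, 0, SEEK_END);

    if (size < 0) {
        return -errno;   /* e.g. -EBADF, never a bare -1 */
    }
    return (int64_t)size;
}
```

Calling it with an invalid descriptor such as -1 yields -EBADF, which a caller can pass to strerror() after negating it.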

[Qemu-devel] [PULL for-2.1 04/11] ahci.c: mask unused flags when reading size PRDT DBC

2014-07-07 Thread Stefan Hajnoczi

From: Reza Jelveh reza.jel...@tuhh.de

The data byte count (DBC) read from the description information is defined
for bits 21:00. Bits 30:22 are reserved and bit 31 is the Interrupt on
Completion (I) flag.

QEMU triggers completion interrupts after every transaction instead of on
the I flag. tbl_entry_size is a signed integer, so reading the DBC without
masking leads to a negative offset that causes sglist allocation to fail.

Signed-off-by: Reza Jelveh reza.jel...@tuhh.de
Reviewed-by: Alexander Graf ag...@suse.de
Reviewed-by: Kevin Wolf kw...@redhat.com
Reviewed-by: John Snow js...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 hw/ide/ahci.c | 11 ---
 hw/ide/ahci.h |  2 ++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 9bae22e..cd140d1 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -639,6 +639,11 @@ static void ahci_write_fis_d2h(AHCIDevice *ad, uint8_t 
*cmd_fis)
 }
 }
 
+static int prdt_tbl_entry_size(const AHCI_SG *tbl)
+{
+    return (le32_to_cpu(tbl->flags_size) & AHCI_PRDT_SIZE_MASK) + 1;
+}
+
 static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
 {
     AHCICmdHdr *cmd = ad->cur_cmd;
@@ -681,7 +686,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList 
*sglist, int offset)
         sum = 0;
         for (i = 0; i < sglist_alloc_hint; i++) {
             /* flags_size is zero-based */
-            tbl_entry_size = (le32_to_cpu(tbl[i].flags_size) + 1);
+            tbl_entry_size = prdt_tbl_entry_size(&tbl[i]);
             if (offset <= (sum + tbl_entry_size)) {
                 off_idx = i;
                 off_pos = offset - sum;
@@ -700,12 +705,12 @@ static int ahci_populate_sglist(AHCIDevice *ad, 
QEMUSGList *sglist, int offset)
         qemu_sglist_init(sglist, qbus->parent, (sglist_alloc_hint - off_idx),
                          ad->hba->as);
         qemu_sglist_add(sglist, le64_to_cpu(tbl[off_idx].addr + off_pos),
-                        le32_to_cpu(tbl[off_idx].flags_size) + 1 - off_pos);
+                        prdt_tbl_entry_size(&tbl[off_idx]) - off_pos);
 
         for (i = off_idx + 1; i < sglist_alloc_hint; i++) {
             /* flags_size is zero-based */
             qemu_sglist_add(sglist, le64_to_cpu(tbl[i].addr),
-                            le32_to_cpu(tbl[i].flags_size) + 1);
+                            prdt_tbl_entry_size(&tbl[i]));
         }
     }
 
diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h
index 9a4064f..f418b30 100644
--- a/hw/ide/ahci.h
+++ b/hw/ide/ahci.h
@@ -201,6 +201,8 @@
 
 #define AHCI_COMMAND_TABLE_ACMD    0x40
 
+#define AHCI_PRDT_SIZE_MASK        0x3fffff
+
 #define IDE_FEATURE_DMA            1
 
 #define READ_FPDMA_QUEUED  0x60
-- 
1.9.3
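
The effect of the mask in the ahci patch above can be checked on plain integers. This is a minimal sketch with hypothetical names (`PRDT_DBC_MASK`, `PRDT_IOC_FLAG`, `prdt_byte_count`); it assumes the word has already been converted to host byte order and uses the bit layout given in the commit message (DBC in bits 21:0, I flag in bit 31, zero-based count).

```c
#include <stdint.h>

#define PRDT_DBC_MASK 0x3fffffu    /* bits 21:0, the data byte count */
#define PRDT_IOC_FLAG (1u << 31)   /* bit 31, Interrupt on Completion */

/*
 * Hypothetical helper: byte count of one PRDT entry, given its
 * host-endian flags_size word. The stored count is zero-based.
 */
static uint32_t prdt_byte_count(uint32_t flags_size)
{
    return (flags_size & PRDT_DBC_MASK) + 1;
}
```

An entry describing 512 bytes yields 512 whether or not the guest sets the I flag; without the mask, the flagged value would exceed INT32_MAX and turn negative once stored in a signed int.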

[Qemu-devel] [RFC v4 04/13] hw/vfio/pci: introduce VFIODevice

2014-07-07 Thread Eric Auger

Introduce the VFIODevice struct that is going to be shared by
VFIOPCIDevice and VFIOPlatformDevice.

Additional fields will be added there later on for review
convenience.

The group's device_list becomes a list of VFIODevice.

This requires reworking the reset_handler, which becomes generic and
calls VFIODevice ops that are specialized in each parent object.
Functions that iterate on this list must also take into account that
the devices can be something other than VFIOPCIDevice. The type field
is used to discriminate between them.

This step is also used to change the prototypes of
vfio_unmask_intx, vfio_mask_intx and vfio_disable_irqindex, which now
apply to VFIODevice. They are renamed as *_irqindex.
The index is passed as a parameter in anticipation of their use for
platform IRQs.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/vfio/pci.c | 243 +++---
 1 file changed, 149 insertions(+), 94 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index a7df3de..d0bee62 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -44,6 +44,11 @@
 #define VFIO_ALLOW_KVM_MSI 1
 #define VFIO_ALLOW_KVM_MSIX 1
 
+enum {
+VFIO_DEVICE_TYPE_PCI = 0,
+VFIO_DEVICE_TYPE_PLATFORM = 1,
+};
+
 struct VFIOPCIDevice;
 
 typedef struct VFIOQuirk {
@@ -173,9 +178,27 @@ typedef struct VFIOMSIXInfo {
 void *mmap;
 } VFIOMSIXInfo;
 
+typedef struct VFIODeviceOps VFIODeviceOps;
+
+typedef struct VFIODevice {
+QLIST_ENTRY(VFIODevice) next;
+struct VFIOGroup *group;
+char *name;
+int fd;
+int type;
+bool reset_works;
+bool needs_reset;
+VFIODeviceOps *ops;
+} VFIODevice;
+
+struct VFIODeviceOps {
+bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
+int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+};
+
 typedef struct VFIOPCIDevice {
 PCIDevice pdev;
-int fd;
+VFIODevice vbasedev;
 VFIOINTx intx;
 unsigned int config_size;
 uint8_t *emulated_config_bits; /* QEMU emulated bits, little-endian */
@@ -191,20 +214,16 @@ typedef struct VFIOPCIDevice {
 VFIOBAR bars[PCI_NUM_REGIONS - 1]; /* No ROM */
 VFIOVGA vga; /* 0xa0000, 0x3b0, 0x3c0 */
 PCIHostDeviceAddress host;
-QLIST_ENTRY(VFIOPCIDevice) next;
-struct VFIOGroup *group;
 EventNotifier err_notifier;
 uint32_t features;
 #define VFIO_FEATURE_ENABLE_VGA_BIT 0
 #define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT)
 int32_t bootindex;
 uint8_t pm_cap;
-bool reset_works;
 bool has_vga;
 bool pci_aer;
 bool has_flr;
 bool has_pm_reset;
-bool needs_reset;
 bool rom_read_failed;
 } VFIOPCIDevice;
 
@@ -212,7 +231,7 @@ typedef struct VFIOGroup {
 int fd;
 int groupid;
 VFIOContainer *container;
-QLIST_HEAD(, VFIOPCIDevice) device_list;
+QLIST_HEAD(, VFIODevice) device_list;
 QLIST_ENTRY(VFIOGroup) next;
 QLIST_ENTRY(VFIOGroup) container_next;
 } VFIOGroup;
@@ -265,7 +284,7 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool 
enabled);
 /*
  * Common VFIO interrupt disable
  */
-static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
+static void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
@@ -275,37 +294,37 @@ static void vfio_disable_irqindex(VFIOPCIDevice *vdev, 
int index)
 .count = 0,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 
 /*
  * INTx
  */
-static void vfio_unmask_intx(VFIOPCIDevice *vdev)
+static void vfio_unmask_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
 .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
-.index = VFIO_PCI_INTX_IRQ_INDEX,
+.index = index,
 .start = 0,
 .count = 1,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 
 #ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_intx(VFIOPCIDevice *vdev)
+static void vfio_mask_irqindex(VFIODevice *vbasedev, int index)
 {
 struct vfio_irq_set irq_set = {
 .argsz = sizeof(irq_set),
 .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
-.index = VFIO_PCI_INTX_IRQ_INDEX,
+.index = index,
 .start = 0,
 .count = 1,
 };
 
-ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 }
 #endif
 
@@ -369,7 +388,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
-vfio_unmask_intx(vdev);
+vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
 }
 
 static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
@@ -392,7 +411,7 @@ static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
 
 /* Get to a known interrupt state */
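
The embedding used by the VFIODevice patch above (a base struct stored by value inside the PCI-specific struct) can be sketched as follows. The struct and function names are hypothetical, cut-down stand-ins, and the `container_of` macro here is a simplified version written for this example rather than QEMU's own definition.

```c
#include <stddef.h>

/* Simplified container_of, written for this sketch. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, cut-down versions of the structs in the patch. */
typedef struct BaseDevice {
    int fd;
    int type;
} BaseDevice;

typedef struct PciDevice {
    int config_size;
    BaseDevice base;    /* embedded by value, like vbasedev */
} PciDevice;

/* Generic code sees only the base part ... */
static int base_get_fd(const BaseDevice *b)
{
    return b->fd;
}

/* ... and type-specific code recovers the enclosing object from it. */
static PciDevice *pci_from_base(BaseDevice *b)
{
    return container_of(b, PciDevice, base);
}
```

Generic helpers such as the renamed *_irqindex functions only need the base pointer, while PCI-only code can still get back to its own fields.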

[Qemu-devel] [RFC v4 06/13] hw/vfio/pci: split vfio_get_device

2014-07-07 Thread Eric Auger

vfio_get_device now takes a VFIODevice as argument. The function is split
into 4 functional parts: dev_info query, device check, region populate
and interrupt populate. The last 3 are specialized by the parent device
and are added to VFIODeviceOps.

3 new fields are introduced in VFIODevice to store dev_info.

vfio_put_base_device is created.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/vfio/pci.c | 181 +++---
 1 file changed, 121 insertions(+), 60 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 5f0164a..d228cf8 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -194,12 +194,18 @@ typedef struct VFIODevice {
 bool reset_works;
 bool needs_reset;
 VFIODeviceOps *ops;
+unsigned int num_irqs;
+unsigned int num_regions;
+unsigned int flags;
 } VFIODevice;
 
 struct VFIODeviceOps {
 bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
 int (*vfio_hot_reset_multi)(VFIODevice *vdev);
 void (*vfio_eoi)(VFIODevice *vdev);
+int (*vfio_check_device)(VFIODevice *vdev);
+int (*vfio_populate_regions)(VFIODevice *vdev);
+int (*vfio_populate_interrupts)(VFIODevice *vdev);
 };
 
 typedef struct VFIOPCIDevice {
@@ -286,6 +292,10 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, 
uint32_t addr, int len);
 static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
   uint32_t val, int len);
 static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
+static void vfio_put_base_device(VFIODevice *vbasedev);
+static int vfio_check_device(VFIODevice *vbasedev);
+static int vfio_populate_regions(VFIODevice *vbasedev);
+static int vfio_populate_interrupts(VFIODevice *vbasedev);
 
 /*
  * Common VFIO interrupt disable
@@ -3585,6 +3595,9 @@ static VFIODeviceOps vfio_pci_ops = {
 .vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
 .vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
 .vfio_eoi = vfio_eoi,
+.vfio_check_device = vfio_check_device,
+.vfio_populate_regions = vfio_populate_regions,
+.vfio_populate_interrupts = vfio_populate_interrupts,
 };
 
 static void vfio_reset_handler(void *opaque)
@@ -3927,54 +3940,53 @@ static void vfio_put_group(VFIOGroup *group)
 }
 }
 
-static int vfio_get_device(VFIOGroup *group, const char *name,
-                           VFIOPCIDevice *vdev)
+static int vfio_check_device(VFIODevice *vbasedev)
 {
-    struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
-    struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
-    struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
-    int ret, i;
-
-    ret = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
-    if (ret < 0) {
-        error_report("vfio: error getting device %s from group %d: %m",
-                     name, group->groupid);
-        error_printf("Verify all devices in group %d are bound to vfio-pci "
-                     "or pci-stub and not already in use\n", group->groupid);
-        return ret;
+    if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PCI)) {
+        error_report("vfio: Um, this isn't a PCI device");
+        goto error;
     }
-
-    vdev->vbasedev.fd = ret;
-    vdev->vbasedev.group = group;
-    QLIST_INSERT_HEAD(&group->device_list, &vdev->vbasedev, next);
-
-    /* Sanity check device */
-    ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_INFO, &dev_info);
-    if (ret) {
-        error_report("vfio: error getting device info: %m");
+    if (vbasedev->num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
+        error_report("vfio: unexpected number of io regions %u",
+                     vbasedev->num_regions);
         goto error;
     }
-
-    DPRINTF("Device %s flags: %u, regions: %u, irgs: %u\n", name,
-            dev_info.flags, dev_info.num_regions, dev_info.num_irqs);
-
-    if (!(dev_info.flags & VFIO_DEVICE_FLAGS_PCI)) {
-        error_report("vfio: Um, this isn't a PCI device");
+    if (vbasedev->num_irqs < VFIO_PCI_MSIX_IRQ_INDEX + 1) {
+        error_report("vfio: unexpected number of irqs %u",
+                     vbasedev->num_irqs);
         goto error;
     }
+   return 0;
+error:
+    vfio_put_base_device(vbasedev);
+    return -errno;
+}
 
-    vdev->vbasedev.reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
+static int vfio_populate_interrupts(VFIODevice *vbasedev)
+{
+    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+    int ret;
+    struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
+    irq_info.index = VFIO_PCI_ERR_IRQ_INDEX;
 
-    if (dev_info.num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
-        error_report("vfio: unexpected number of io regions %u",
-                     dev_info.num_regions);
-        goto error;
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
+    if (ret) {
+        /* This can fail for an old kernel or legacy PCI dev */
+        DPRINTF("VFIO_DEVICE_GET_IRQ_INFO failure: %m\n");
+    } else if
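
The split described in this patch (a generic caller plus three per-device callbacks) can be sketched with a small ops table. Everything here is illustrative and assumed rather than taken from the series: `Device`, `DeviceOps`, `device_setup` and the `demo_*` callbacks are hypothetical names, and `steps_done` exists only so the call order can be observed.

```c
#include <stddef.h>

typedef struct Device Device;

/* Hypothetical ops table; the patch adds three similar callbacks. */
typedef struct DeviceOps {
    int (*check_device)(Device *dev);
    int (*populate_regions)(Device *dev);
    int (*populate_interrupts)(Device *dev);
} DeviceOps;

struct Device {
    unsigned int num_regions;
    unsigned int num_irqs;
    int steps_done;             /* bookkeeping for this sketch only */
    const DeviceOps *ops;
};

/* Example callbacks for an imaginary device type. */
static int demo_check(Device *dev)
{
    return dev->num_regions >= 1 ? 0 : -1;
}

static int demo_regions(Device *dev)
{
    (void)dev;
    return 0;
}

static int demo_irqs(Device *dev)
{
    (void)dev;
    return 0;
}

static const DeviceOps demo_ops = {
    .check_device = demo_check,
    .populate_regions = demo_regions,
    .populate_interrupts = demo_irqs,
};

/* Generic driver: run the type-specific steps in order, stop on error. */
static int device_setup(Device *dev)
{
    int ret;

    ret = dev->ops->check_device(dev);
    if (ret) {
        return ret;
    }
    dev->steps_done++;

    ret = dev->ops->populate_regions(dev);
    if (ret) {
        return ret;
    }
    dev->steps_done++;

    ret = dev->ops->populate_interrupts(dev);
    if (ret) {
        return ret;
    }
    dev->steps_done++;
    return 0;
}
```

A second device type would supply its own table and reuse `device_setup` unchanged, which is the property the patch is after for the PCI and platform variants.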

[Qemu-devel] [RFC v4 11/13] hw/vfio/platform: Add irqfd support

2014-07-07 Thread Eric Auger

This patch aims at optimizing IRQ handling using the irqfd framework.

Instead of handling the eventfds on the user side, they are handled on
the kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.

The virtual IRQ completion is trapped at the interrupt controller
instead of on the guest's first access to any region after an IRQ hit.
This removes the need for fast/slow path swap.

Overall this brings significant performance improvements.

It depends on the host kernel's KVM irqfd/GSI routing capability.

Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
Signed-off-by: Eric Auger eric.au...@linaro.org

---

v3 - v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.

v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.
---
 hw/vfio/platform.c | 95 ++
 1 file changed, 95 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index a5fc22b..fb0f7c9 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -381,6 +381,101 @@ static int vfio_populate_interrupts(VFIODevice *vbasedev)
 return 0;
 }
 
+static void vfio_enable_intp_kvm(VFIOINTp *intp)
+{
+#ifdef CONFIG_KVM
+    struct kvm_irqfd irqfd = {
+        .fd = event_notifier_get_fd(&intp->interrupt),
+        .gsi = intp->virtualID,
+        .flags = KVM_IRQFD_FLAG_RESAMPLE,
+    };
+
+    struct vfio_irq_set *irq_set;
+    int ret, argsz;
+    int32_t *pfd;
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+
+    if (!kvm_irqfds_enabled() ||
+        !kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
+        return;
+    }
+
+    /* Get to a known interrupt state */
+    qemu_set_fd_handler(irqfd.fd, NULL, NULL, NULL);
+    vfio_mask_irqindex(vbasedev, intp->pin);
+    intp->state = VFIO_IRQ_INACTIVE;
+    qemu_set_irq(intp->qemuirq, 0);
+
+    /* Get an eventfd for resample/unmask */
+    if (event_notifier_init(&intp->unmask, 0)) {
+        error_report("vfio: Error: event_notifier_init failed eoi");
+        goto fail;
+    }
+
+    /* KVM triggers it, VFIO listens for it */
+    irqfd.resamplefd = event_notifier_get_fd(&intp->unmask);
+
+    if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
+        error_report("vfio: Error: Failed to setup resample irqfd: %m");
+        goto fail_irqfd;
+    }
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+    irq_set->index = intp->pin;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+
+    *pfd = irqfd.resamplefd;
+
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    g_free(irq_set);
+    if (ret) {
+        error_report("vfio: Error: Failed to setup INTx unmask fd: %m");
+        goto fail_vfio;
+    }
+
+    /* Let'em rip */
+    vfio_unmask_irqindex(vbasedev, intp->pin);
+
+    intp->kvm_accel = true;
+
+    DPRINTF("%s irqfd pin=%d to virtID = %d fd=%d, resamplefd=%d)\n",
+            __func__, intp->pin, intp->virtualID,
+            irqfd.fd, irqfd.resamplefd);
+
+    return;
+
+fail_vfio:
+    irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
+    kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+fail_irqfd:
+    event_notifier_cleanup(&intp->unmask);
+fail:
+    qemu_set_fd_handler(irqfd.fd, vfio_intp_interrupt, NULL, intp);
+    vfio_unmask_irqindex(vbasedev, intp->pin);
+#endif
+}
+
+void vfio_setup_irqfd(SysBusDevice *s, int index, int virq)
+{
+    VFIOPlatformDevice *vdev = container_of(s, VFIOPlatformDevice, sbdev);
+    VFIOINTp *intp;
+
+    QLIST_FOREACH(intp, &vdev->intp_list, next) {
+        if (intp->pin == index) {
+            intp->virtualID = virq;
+            DPRINTF("enable irqfd for irq index %d (virtual IRQ %d)\n",
+                    index, virq);
+            vfio_enable_intp_kvm(intp);
+        }
+    }
+}
+
 static VFIODeviceOps vfio_platform_ops = {
 .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
 .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
-- 
1.8.3.2
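
The fail_vfio / fail_irqfd / fail labels in vfio_enable_intp_kvm() above follow the usual C unwinding ladder: each label undoes one earlier step and falls through to the next. A minimal sketch of that pattern, with hypothetical names and with plain flags standing in for the real notifier and irqfd resources:

```c
/* Hypothetical resources, tracked as flags so the unwinding is observable. */
typedef struct {
    int notifier_ready;
    int irqfd_assigned;
    int accel_enabled;
} SetupState;

/*
 * fail_at selects which step fails (0 = none, 1 = notifier setup,
 * 2 = irqfd assignment, 3 = the final registration).
 * Returns 0 on success, -1 on failure.
 */
static int setup_accel(SetupState *st, int fail_at)
{
    if (fail_at == 1) {
        goto fail;
    }
    st->notifier_ready = 1;

    if (fail_at == 2) {
        goto fail_notifier;
    }
    st->irqfd_assigned = 1;

    if (fail_at == 3) {
        goto fail_irqfd;
    }
    st->accel_enabled = 1;
    return 0;

fail_irqfd:
    st->irqfd_assigned = 0;     /* undo step 2 */
fail_notifier:
    st->notifier_ready = 0;     /* undo step 1 */
fail:
    return -1;
}
```

Whichever step fails, the state ends up as it was before the call, which mirrors how the patch falls back to the non-accelerated handler.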

[Qemu-devel] [PULL 03/12] pc-dimm: error out if memory hotplug is not enabled

2014-07-07 Thread Michael S. Tsirkin

From: Igor Mammedov imamm...@redhat.com

Fixes a QEMU abort when it is started without memory
hotplug enabled.

As a result of the fix, QEMU prints the following messages:

-device pc-dimm,id=d1,memdev=m1: memory hotplug is not enabled, enable it on 
startup
-device pc-dimm,id=d1,memdev=m1: Device 'pc-dimm' could not be initialized


Also fix up the assert condition to detect hotplug address
space overflow.

Signed-off-by: Igor Mammedov imamm...@redhat.com
Reported-by:  Hu Tao hu...@cn.fujitsu.com
Reviewed-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/mem/pc-dimm.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index ad176b7..08f49ed 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -146,7 +146,13 @@ uint64_t pc_dimm_get_free_addr(uint64_t 
address_space_start,
     uint64_t new_addr, ret = 0;
     uint64_t address_space_end = address_space_start + address_space_size;
 
-    assert(address_space_end > address_space_size);
+    if (!address_space_size) {
+        error_setg(errp, "memory hotplug is not enabled, "
+                         "please add maxmem option");
+        goto out;
+    }
+
+    assert(address_space_end > address_space_start);
     object_child_foreach(qdev_get_machine(), pc_dimm_built_list, &list);
 
 if (hint) {
-- 
MST
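
The two checks in the pc-dimm patch above can be illustrated on plain integers. This is a sketch with a hypothetical helper name (`hotplug_range_ok`), not the QEMU function: a zero size means the hotplug area was never configured, and comparing the end against the start (rather than against the size, as the old assert did) is what detects 64-bit wrap-around.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when [start, start + size) is a non-empty
 * range that does not wrap around the 64-bit address space.
 */
static bool hotplug_range_ok(uint64_t start, uint64_t size)
{
    uint64_t end = start + size;   /* unsigned arithmetic wraps on overflow */

    if (size == 0) {
        return false;              /* hotplug space not configured */
    }
    return end > start;            /* false if the addition wrapped */
}
```

In the patch the zero-size case becomes a reported error, while the wrap-around case stays an assertion because it indicates a programming mistake rather than a user mistake.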

Re: [Qemu-devel] [PATCH] qmp: show QOM properties in device-list-properties

2014-07-07 Thread Christian Borntraeger

On 07/07/14 09:29, Markus Armbruster wrote:
 Paolo Bonzini pbonz...@redhat.com writes:
 
 Il 06/07/2014 21:03, Cole Robinson ha scritto:
 On 07/05/2014 05:14 AM, Paolo Bonzini wrote:
 Il 20/05/2014 14:29, Stefan Hajnoczi ha scritto:
 Devices can use a mix of qdev and QOM properties.  Currently only the
 qdev properties are displayed by device-list-properties.

 This patch extends the property enumeration algorithm to also display
 QOM properties (excluding the implicit type, realized,
 hotpluggable, and parent_bus properties).

 When a qdev property exists, use the qdev type name to preserve
 backwards compatibility.  QOM type names can be different for bool (qdev
 on/off) and str (used by qdev pointers).

 Signed-off-by: Stefan Hajnoczi stefa...@redhat.com


 Stefan, was this never applied?


 I assume you CC'd me in reference to the bug I reported:

 https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00882.html

 I tested this patch, but it doesn't fix the specific bit I mentioned (lack 
 of
 'bootindex' in -device virtio-blk,? )

 Yes, it doesn't, but does libvirt work then?  I'm not sure if libvirt
 still uses -device or rather device-list-properties (which lets you
 start a single QEMU process and do multiple probes).

Libvirt still seems to use -device FOO,help
With current qemu master (plus this fix), I get
unsupported configuration: hypervisor lacks deviceboot feature


 
 Valid question, but of course we need to fix the -device FOO,help
 regression regardless.

Re: [Qemu-devel] [PATCH V3 for 2.1 2/2] exec: improve error handling and reporting in file_ram_alloc() and gethugepagesize()

2014-07-07 Thread Michael S. Tsirkin

On Mon, Jul 07, 2014 at 06:55:28PM +0800, Hu Tao wrote:
 This patch fixes two problems of memory-backend-file:
 
 1. If user adds a memory-backend-file object using object_add command,
specifying a non-existing directory for property mem-path, qemu
will core dump with message:
 
  /nonexistingdir: No such file or directory
  Bad ram offset f000
  Aborted (core dumped)
 
with this patch, qemu reports error message like:
 
  qemu-system-x86_64: -object 
 memory-backend-file,mem-path=/nonexistingdir,id=mem-file0,size=128M:
  failed to stat file /nonexistingdir: No such file or directory
 
 2. If user adds a memory-backend-file object using object_add command,
specifying a size that is less than huge page size, qemu
will core dump with message:
 
  Bad ram offset f000
  Aborted (core dumped)
 
with this patch, qemu reports error message like:
 
  qemu-system-x86_64: -object 
 memory-backend-file,mem-path=/hugepages,id=mem-file0,size=1M: memory
  size 0x10 should be euqal or larger than huge page size 0x20
 
 Signed-off-by: Hu Tao hu...@cn.fujitsu.com

Build fails on 32 bit host
/scm/qemu/exec.c:1037:9: error: format ‘%llx’ expects argument of type
‘long long unsigned int’, but argument 5 has type ‘long unsigned int’
[-Werror=format=]
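
For illustration only (this is not the fix that was applied): one portable way to avoid this class of -Werror=format failure is to pair each argument with a conversion that matches its type on every host, e.g. PRIx64 for a uint64_t and %lx for an unsigned long. The helper name format_size_error() and the message text below are assumptions made up for the sketch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: build the error text
 * without assuming how wide "unsigned long" is on the build host.
 * A uint64_t is printed with PRIx64, an unsigned long with %lx.
 */
static int format_size_error(char *buf, size_t len,
                             uint64_t memory, unsigned long hpagesize)
{
    return snprintf(buf, len,
                    "memory size 0x%" PRIx64 " must be equal to or larger "
                    "than huge page size 0x%lx",
                    memory, hpagesize);
}
```

The same format string then compiles cleanly on both 32-bit and 64-bit hosts, because neither conversion depends on the size of long.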


 ---
  exec.c | 19 +++
  1 file changed, 11 insertions(+), 8 deletions(-)
 
 diff --git a/exec.c b/exec.c
 index ca7741b..bb97b15 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -996,7 +996,7 @@ void qemu_mutex_unlock_ramlist(void)
  
  #define HUGETLBFS_MAGIC   0x958458f6
  
 -static long gethugepagesize(const char *path)
 +static long gethugepagesize(const char *path, Error **errp)
  {
  struct statfs fs;
  int ret;
 @@ -1006,7 +1006,7 @@ static long gethugepagesize(const char *path)
  } while (ret != 0 && errno == EINTR);
  
  if (ret != 0) {
 -perror(path);
 +error_setg_errno(errp, errno, "failed to get size of file %s", path);
  return 0;
  }
  
 @@ -1024,17 +1024,20 @@ static void *file_ram_alloc(RAMBlock *block,
  char *filename;
  char *sanitized_name;
  char *c;
 -void *area;
 +void *area = NULL;
  int fd;
  unsigned long hpagesize;
  
 -hpagesize = gethugepagesize(path);
 -if (!hpagesize) {
 +hpagesize = gethugepagesize(path, errp);
 +if (errp && *errp) {
  goto error;
  }
  
  if (memory < hpagesize) {
 -return NULL;
 +error_setg(errp, "memory size 0x" RAM_ADDR_FMT " must be euqal to "
 +   "or larger than huge page size 0x%" PRIx64,
 +   memory, hpagesize);
 +goto error;
  }
  
  if (kvm_enabled() && !kvm_has_sync_mmu()) {
 @@ -1094,8 +1097,8 @@ static void *file_ram_alloc(RAMBlock *block,
  return area;
  
  error:
 -if (mem_prealloc) {
 -exit(1);
 +if (area && area != MAP_FAILED) {
 +munmap(area, memory);
  }
  return NULL;
  }
 -- 
 1.9.3

Re: [Qemu-devel] [PATCH] PPC: e500: Actually install u-boot.e500

2014-07-07 Thread Alexander Graf



On 04.07.14 21:43, Cole Robinson wrote:

Signed-off-by: Cole Robinson crobi...@redhat.com


Let me get that brown paperbag :).

Thanks, applied to ppc-next (for 2.1)


Alex

[Qemu-devel] [PULL 00/12] pc,vhost,virtio fixes, test

2014-07-07 Thread Michael S. Tsirkin

I will merge only high priority fixes from here on. High priority means
regression fixes or fixes in a new feature if there is no workaround.

The following changes since commit 9d9de254c2b81b68cd48f2324cc753a570a4cdd8:

  MAINTAINERS: seccomp: change email contact for Eduardo Otubo (2014-07-03 
12:36:15 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 3f0838ab8557c6071a5931183b2d7fed568cd35c:

  qemu-char: add chr_add_watch support in mux chardev (2014-07-06 09:13:54 
+0300)


pc,vhost,virtio fixes, test

Bugfixes all over the place.

There's one non-bugfix here: re-enabling the vhost-user test,
though the patch just brings back functionality that
I disabled earlier to fix mingw build failures.
This is now sorted, and keeping the unit test enabled
seems important since the feature relies on an external
server to work, so isn't easy to test.

Signed-off-by: Michael S. Tsirkin m...@redhat.com


Eduardo Habkost (2):
  qdev: Don't abort() in case globals can't be set
  qdev: Fix crash when using non-device class name on -global

Hu Tao (1):
  numa: check for busy memory backend

Igor Mammedov (2):
  pc-dimm: error out if memory hotplug is not enabled
  acpi: fix typo in memory hotplug MMIO region name

Kirill Batuzov (2):
  Handle G_IO_HUP in tcp_chr_read for tcp chardev
  qemu-char: add chr_add_watch support in mux chardev

Le Tan (1):
  pci: assign devfn to pci_dev before calling 
pci_device_iommu_address_space()

Ming Lei (2):
  virtio: move common virtio properties to bus class device
  hw/virtio: enable common virtio feature for mmio device

Nikolay Nikolaev (1):
  qtest: enable vhost-user-test

Paolo Bonzini (1):
  virtio-pci: fix MSI memory region use after free

 include/hw/virtio/virtio-blk.h   |  3 ---
 include/hw/virtio/virtio-net.h   |  1 -
 include/hw/virtio/virtio-scsi.h  |  1 -
 include/sysemu/char.h|  1 -
 hw/acpi/memory_hotplug.c |  2 +-
 hw/core/qdev-properties-system.c |  3 ++-
 hw/core/qdev.c   |  8 +++-
 hw/mem/pc-dimm.c |  8 +++-
 hw/pci/pci.c |  2 +-
 hw/s390x/s390-virtio-bus.c   |  2 ++
 hw/s390x/virtio-ccw.c| 11 ++-
 hw/virtio/virtio-mmio.c  |  6 ++
 hw/virtio/virtio-pci.c   | 16 
 numa.c   |  8 
 qemu-char.c  | 36 +++-
 tests/Makefile   |  8 +---
 16 files changed, 68 insertions(+), 48 deletions(-)

[Qemu-devel] [PULL 01/12] qtest: enable vhost-user-test

2014-07-07 Thread Michael S. Tsirkin

From: Nikolay Nikolaev n.nikol...@virtualopensystems.com

Use qtest-obj-y to get the right library order. CONFIG_POSIX ensures
mingw compilation won't break.

Signed-off-by: Nikolay Nikolaev n.nikol...@virtualopensystems.com
Acked-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com

MST: whitespace tweak
---
 tests/Makefile | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tests/Makefile b/tests/Makefile
index 7e53d0d..1fcd633 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -158,7 +158,7 @@ gcov-files-i386-y += hw/usb/hcd-ehci.c
 gcov-files-i386-y += hw/usb/hcd-uhci.c
 gcov-files-i386-y += hw/usb/dev-hid.c
 gcov-files-i386-y += hw/usb/dev-storage.c
-#check-qtest-i386-y += tests/vhost-user-test$(EXESUF)
+check-qtest-i386-$(CONFIG_POSIX) += tests/vhost-user-test$(EXESUF)
 check-qtest-x86_64-y = $(check-qtest-i386-y)
 gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c
 gcov-files-x86_64-y = $(subst 
i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y))
@@ -333,11 +333,13 @@ tests/es1370-test$(EXESUF): tests/es1370-test.o
 tests/intel-hda-test$(EXESUF): tests/intel-hda-test.o
 tests/ioh3420-test$(EXESUF): tests/ioh3420-test.o
 tests/usb-hcd-ehci-test$(EXESUF): tests/usb-hcd-ehci-test.o $(libqos-pc-obj-y)
-tests/vhost-user-test$(EXESUF): tests/vhost-user-test.o qemu-char.o 
qemu-timer.o libqemuutil.a libqemustub.a
+tests/vhost-user-test$(EXESUF): tests/vhost-user-test.o qemu-char.o 
qemu-timer.o $(qtest-obj-y)
 tests/qemu-iotests/socket_scm_helper$(EXESUF): 
tests/qemu-iotests/socket_scm_helper.o
 tests/test-qemu-opts$(EXESUF): tests/test-qemu-opts.o libqemuutil.a 
libqemustub.a
 
-#LIBS+= -lutil
+ifeq ($(CONFIG_POSIX),y)
+LIBS += -lutil
+endif
 
 # QTest rules
 
-- 
MST

[Qemu-devel] [PULL 02/12] numa: check for busy memory backend

2014-07-07 Thread Michael S. Tsirkin

From: Hu Tao hu...@cn.fujitsu.com

Specifying the same memory backend twice leads to an assert:

./x86_64-softmmu/qemu-system-x86_64 -m 512M -enable-kvm -object
memory-backend-ram,size=256M,id=ram0 -numa node,nodeid=0,memdev=ram0
-numa node,nodeid=1,memdev=ram0
qemu-system-x86_64: /scm/qemu/memory.c:1506:
memory_region_add_subregion_common: Assertion `!subregion->container'
failed.
Aborted (core dumped)

Detect and exit with an error message instead.

Reviewed-by: Igor Mammedov imamm...@redhat.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
Reviewed-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 numa.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/numa.c b/numa.c
index 2fde740..7bf7834 100644
--- a/numa.c
+++ b/numa.c
@@ -301,6 +301,14 @@ void memory_region_allocate_system_memory(MemoryRegion 
*mr, Object *owner,
 exit(1);
 }
 
+if (memory_region_is_mapped(seg)) {
+char *path = object_get_canonical_path_component(OBJECT(backend));
+error_report("memory backend %s is used multiple times. Each "
+ "-numa option must use a different memdev value.",
+ path);
+exit(1);
+}
+
 memory_region_add_subregion(mr, addr, seg);
 vmstate_register_ram_global(seg);
 addr += size;
-- 
MST
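
As a rough, standalone sketch of the idea behind this check (not QEMU code): mark each backend as mapped when a node first uses it, and report an error if a later node names a backend that is already mapped. The Backend type and find_reused_backend() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a memory backend object. */
typedef struct Backend {
    const char *id;   /* e.g. the "id=" given on the command line */
    bool mapped;      /* set once a NUMA node has claimed the backend */
} Backend;

/*
 * Walk the per-node backend references in order. Return the id of the
 * first backend that is claimed a second time, or NULL if every node
 * uses a distinct backend.
 */
static const char *find_reused_backend(Backend **nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (nodes[i]->mapped) {
            return nodes[i]->id;
        }
        nodes[i]->mapped = true;
    }
    return NULL;
}
```

A caller would print the returned id in its error message and exit, rather than letting a later assertion fire.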

[Qemu-devel] [PULL 12/12] qemu-char: add chr_add_watch support in mux chardev

2014-07-07 Thread Michael S. Tsirkin

From: Kirill Batuzov batuz...@ispras.ru

Forward chr_add_watch call from mux chardev to underlying
implementation.

This should fix bug #1335444

Signed-off-by: Kirill Batuzov batuz...@ispras.ru
Acked-by: Paolo Bonzini pbonz...@redhat.com
Acked-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 qemu-char.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/qemu-char.c b/qemu-char.c
index 22a9777..55e372c 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -581,6 +581,12 @@ static Notifier muxes_realize_notify = {
 .notify = muxes_realize_done,
 };
 
+static GSource *mux_chr_add_watch(CharDriverState *s, GIOCondition cond)
+{
+MuxDriver *d = s->opaque;
+return d->drv->chr_add_watch(d->drv, cond);
+}
+
 static CharDriverState *qemu_chr_open_mux(CharDriverState *drv)
 {
 CharDriverState *chr;
@@ -597,6 +603,9 @@ static CharDriverState *qemu_chr_open_mux(CharDriverState 
*drv)
 chr->chr_accept_input = mux_chr_accept_input;
 /* Frontend guest-open / -close notification is not support with muxes */
 chr->chr_set_fe_open = NULL;
+if (drv->chr_add_watch) {
+chr->chr_add_watch = mux_chr_add_watch;
+}
 /* only default to opened state if we've realized the initial
  * set of muxes
  */
-- 
MST
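
The forwarding pattern used here can be shown in isolation. The following is a minimal sketch with made-up types (Driver, mux_init(), demo_backend_watch() are all hypothetical, not QEMU API): the wrapper only installs its own callback when the wrapped driver provides one, so callers that test the pointer for NULL keep working.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver type with one optional callback. */
typedef struct Driver Driver;
struct Driver {
    void *opaque;                            /* for a mux: the backend */
    int (*add_watch)(Driver *d, int cond);   /* may be NULL */
};

/* Forward the call from the mux to the driver it wraps. */
static int mux_add_watch(Driver *mux, int cond)
{
    Driver *backend = mux->opaque;
    return backend->add_watch(backend, cond);
}

/* Expose add_watch on the mux only if the backend implements it. */
static void mux_init(Driver *mux, Driver *backend)
{
    mux->opaque = backend;
    mux->add_watch = backend->add_watch ? mux_add_watch : NULL;
}

/* Toy backend callback used to exercise the forwarding. */
static int demo_backend_watch(Driver *d, int cond)
{
    (void)d;
    return cond + 1;
}
```

Installing the forwarder unconditionally would instead crash with a NULL call when the backend has no watch support, which is why the patch guards the assignment.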

Re: [Qemu-devel] [RFC v4 08/13] hw/vfio/common: Add EXEC_FLAG to VFIO DMA mappings

2014-07-07 Thread Peter Maydell

On 7 July 2014 13:27, Eric Auger eric.au...@linaro.org wrote:
 From: Alvise Rigo a.r...@virtualopensystems.com

 The flag is mandatory for the ARM SMMU so we always add it if the MMIO
 handles it.

 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  hw/vfio/common.c  | 9 +
  include/hw/vfio/vfio-common.h | 1 +
  linux-headers/linux/vfio.h| 2 ++
  3 files changed, 12 insertions(+)

 diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
 index 26c218e..b13f7d3 100644
 --- a/linux-headers/linux/vfio.h
 +++ b/linux-headers/linux/vfio.h
 @@ -30,6 +30,7 @@
   */
  #define VFIO_DMA_CC_IOMMU  4

 +#define VFIO_IOMMU_PROT_EXEC   5
  /*
   * The IOCTL interface is designed for extensibility by embedding the
   * structure length (argsz) and flags into structures passed between
 @@ -398,6 +399,7 @@ struct vfio_iommu_type1_dma_map {
 __u32   flags;
  #define VFIO_DMA_MAP_FLAG_READ (1 << 0)        /* readable from device */
  #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)       /* writable from device */
 +#define VFIO_DMA_MAP_FLAG_EXEC (1 << 2)        /* executable from device */
 __u64   vaddr;  /* Process virtual address */
 __u64   iova;   /* IO virtual address */
 __u64   size;   /* Size of mapping (bytes) */

You shouldn't change linux-headers/ files except by syncing them from
a kernel tree using scripts/update-linux-headers.sh. Those changes
should always be in a separate commit that includes the kernel tree
and commit hash synced against in its commit message. For an RFC
patchseries where the equivalent kernel changes haven't been
accepted upstream yet it's ok to sync against a local tree (and
clearly note in the commit message that it's not for committing
to upstream qemu), but the changes should still be in their own patch.

thanks
-- PMM

[Qemu-devel] [Bug 1338563] [NEW] README refers to a non-extant file

2014-07-07 Thread Daniel U. Thibault

Public bug reported:

The current stable QEMU release (1.4.2-89400a8) README consists of a
single line telling the new user to read the documentation in
qemu-doc.html or on http://wiki.qemu.org.  The distribution includes no
qemu-doc.html, just a qemu-doc.texi.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1338563

Title:
  README refers to a non-extant file

Status in QEMU:
  New

Bug description:
  The current stable QEMU release (1.4.2-89400a8) README consists of a
  single line telling the new user to read the documentation in
  qemu-doc.html or on http://wiki.qemu.org.  The distribution includes no
  qemu-doc.html, just a qemu-doc.texi.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1338563/+subscriptions

[Qemu-devel] [PULL for-2.1 05/11] qemu-iotests: Disable Quorum testing in 041 when Quorum is not builtin

2014-07-07 Thread Stefan Hajnoczi

From: Benoît Canet benoit.ca...@irqsave.net

This avoids breaking tests on RHEL6 where gnutls is too old for quorum to be
built by default.

Signed-off-by: Benoit Canet ben...@irqsave.net
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 tests/qemu-iotests/041 | 41 +++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 005090e..5dbd4ee 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -740,6 +740,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 image_len = 1 * 1024 * 1024 # MB
 IMAGES = [ quorum_img1, quorum_img2, quorum_img3 ]
 
+def has_quorum(self):
+return 'quorum' in iotests.qemu_img_pipe('--help')
+
 def setUp(self):
 self.vm = iotests.VM()
 
@@ -757,8 +760,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 #assemble the quorum block device from the individual files
 args = { "options" : { "driver": "quorum", "id": "quorum0",
  "vote-threshold": 2, "children": [ "img0", "img1", "img2" ] } }
-result = self.vm.qmp("blockdev-add", **args)
-self.assert_qmp(result, 'return', {})
+if self.has_quorum():
+result = self.vm.qmp("blockdev-add", **args)
+self.assert_qmp(result, 'return', {})
 
 
 def tearDown(self):
@@ -771,6 +775,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 pass
 
 def test_complete(self):
+if not self.has_quorum():
+return
+
 self.assert_no_active_block_jobs()
 
 result = self.vm.qmp('drive-mirror', device='quorum0', sync='full',
@@ -789,6 +796,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 'target image does not match source after mirroring')
 
 def test_cancel(self):
+if not self.has_quorum():
+return
+
 self.assert_no_active_block_jobs()
 
 result = self.vm.qmp('drive-mirror', device='quorum0', sync='full',
@@ -805,6 +815,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 self.vm.shutdown()
 
 def test_cancel_after_ready(self):
+if not self.has_quorum():
+return
+
 self.assert_no_active_block_jobs()
 
 result = self.vm.qmp('drive-mirror', device='quorum0', sync='full',
@@ -823,6 +836,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 'target image does not match source after mirroring')
 
 def test_pause(self):
+if not self.has_quorum():
+return
+
 self.assert_no_active_block_jobs()
 
 result = self.vm.qmp('drive-mirror', device='quorum0', sync='full',
@@ -851,6 +867,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 'target image does not match source after mirroring')
 
 def test_medium_not_found(self):
+if not self.has_quorum():
+return
+
 result = self.vm.qmp('drive-mirror', device='ide1-cd0', sync='full',
  node_name='repair0',
  replaces='img1',
@@ -858,6 +877,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 self.assert_qmp(result, 'error/class', 'GenericError')
 
 def test_image_not_found(self):
+if not self.has_quorum():
+return
+
 result = self.vm.qmp('drive-mirror', device='quorum0', sync='full',
  node_name='repair0',
  replaces='img1',
@@ -866,6 +888,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 self.assert_qmp(result, 'error/class', 'GenericError')
 
 def test_device_not_found(self):
+if not self.has_quorum():
+return
+
 result = self.vm.qmp('drive-mirror', device='nonexistent', sync='full',
  node_name='repair0',
  replaces='img1',
@@ -873,6 +898,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 self.assert_qmp(result, 'error/class', 'DeviceNotFound')
 
 def test_wrong_sync_mode(self):
+if not self.has_quorum():
+return
+
 result = self.vm.qmp('drive-mirror', device='quorum0',
  node_name='repair0',
  replaces='img1',
@@ -880,12 +908,18 @@ class TestRepairQuorum(ImageMirroringTestCase):
 self.assert_qmp(result, 'error/class', 'GenericError')
 
 def test_no_node_name(self):
+if not self.has_quorum():
+return
+
 result = self.vm.qmp('drive-mirror', device='quorum0', sync='full',
  replaces='img1',
  target=quorum_repair_img, format=iotests.imgfmt)
 self.assert_qmp(result, 'error/class', 'GenericError')
 
 def test_unexistant_replaces(self):
+if not self.has_quorum():
+return
+
 result = self.vm.qmp('drive-mirror', device='quorum0', sync='full',

[Qemu-devel] [RFC v4 05/13] hw/vfio/pci: Introduce VFIORegion

2014-07-07 Thread Eric Auger

This structure is going to be shared by VFIOPCIDevice and
VFIOPlatformDevice. VFIOBAR includes it.

vfio_eoi becomes an ops of VFIODevice specialized by parent device.
This makes it possible to transform vfio_bar_write/read into generic
vfio_region_write/read that will be used by VFIOPlatformDevice too.

vfio_mmap_bar becomes vfio_map_region

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/vfio/pci.c | 169 --
 1 file changed, 93 insertions(+), 76 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index d0bee62..5f0164a 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -74,15 +74,20 @@ typedef struct VFIOQuirk {
 } data;
 } VFIOQuirk;
 
-typedef struct VFIOBAR {
-off_t fd_offset; /* offset of BAR within device fd */
-int fd; /* device fd, allows us to pass VFIOBAR as opaque data */
+typedef struct VFIORegion {
+struct VFIODevice *vbasedev;
+off_t fd_offset; /* offset of region within device fd */
+int fd; /* device fd, allows us to pass VFIORegion as opaque data */
 MemoryRegion mem; /* slow, read/write access */
 MemoryRegion mmap_mem; /* direct mapped access */
 void *mmap;
 size_t size;
 uint32_t flags; /* VFIO region flags (rd/wr/mmap) */
-uint8_t nr; /* cache the BAR number for debug */
+uint8_t nr; /* cache the region number for debug */
+} VFIORegion;
+
+typedef struct VFIOBAR {
+VFIORegion region;
 bool ioport;
 bool mem64;
 QLIST_HEAD(, VFIOQuirk) quirks;
@@ -194,6 +199,7 @@ typedef struct VFIODevice {
 struct VFIODeviceOps {
 bool (*vfio_compute_needs_reset)(VFIODevice *vdev);
 int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+void (*vfio_eoi)(VFIODevice *vdev);
 };
 
 typedef struct VFIOPCIDevice {
@@ -377,8 +383,10 @@ static void vfio_intx_interrupt(void *opaque)
 }
 }
 
-static void vfio_eoi(VFIOPCIDevice *vdev)
+static void vfio_eoi(VFIODevice *vbasedev)
 {
+VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+
 if (!vdev->intx.pending) {
 return;
 }
@@ -388,7 +396,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
 
 vdev->intx.pending = false;
 pci_irq_deassert(&vdev->pdev);
-vfio_unmask_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
+vfio_unmask_irqindex(vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
 }
 
 static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
@@ -543,7 +551,7 @@ static void vfio_update_irq(PCIDevice *pdev)
 vfio_enable_intx_kvm(vdev);
 
 /* Re-enable the interrupt in cased we missed an EOI */
-vfio_eoi(vdev);
+vfio_eoi(&vdev->vbasedev);
 }
 
 static int vfio_enable_intx(VFIOPCIDevice *vdev)
@@ -1073,10 +1081,11 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)
 /*
  * IO Port/MMIO - Beware of the endians, VFIO is always little endian
  */
-static void vfio_bar_write(void *opaque, hwaddr addr,
+static void vfio_region_write(void *opaque, hwaddr addr,
uint64_t data, unsigned size)
 {
-VFIOBAR *bar = opaque;
+VFIORegion *region = opaque;
+VFIODevice *vbasedev = region->vbasedev;
 union {
 uint8_t byte;
 uint16_t word;
@@ -1099,19 +1108,16 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
 break;
 }
 
-if (pwrite(bar->fd, &buf, size, bar->fd_offset + addr) != size) {
+if (pwrite(region->fd, &buf, size, region->fd_offset + addr) != size) {
 error_report("%s(,0x%"HWADDR_PRIx", 0x%"PRIx64", %d) failed: %m",
  __func__, addr, data, size);
 }
 
 #ifdef DEBUG_VFIO
 {
-VFIOPCIDevice *vdev = container_of(bar, VFIOPCIDevice, bars[bar->nr]);
-
-DPRINTF("%s(%04x:%02x:%02x.%x:BAR%d+0x%"HWADDR_PRIx", 0x%"PRIx64
-", %d)\n", __func__, vdev->host.domain, vdev->host.bus,
-vdev->host.slot, vdev->host.function, bar->nr, addr,
-data, size);
+DPRINTF("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
+", %d)\n", __func__, vbasedev->name,
+region->nr, addr, data, size);
 }
 #endif
 
@@ -1123,13 +1129,15 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
  * which access will service the interrupt, so we're potentially
  * getting quite a few host interrupts per guest interrupt.
  */
-vfio_eoi(container_of(bar, VFIOPCIDevice, bars[bar->nr]));
+vbasedev->ops->vfio_eoi(vbasedev);
+
 }
 
-static uint64_t vfio_bar_read(void *opaque,
+static uint64_t vfio_region_read(void *opaque,
   hwaddr addr, unsigned size)
 {
-VFIOBAR *bar = opaque;
+VFIORegion *region = opaque;
+VFIODevice *vbasedev = region->vbasedev;
 union {
 uint8_t byte;
 uint16_t word;
@@ -1138,7 +1146,7 @@ static uint64_t vfio_bar_read(void *opaque,
 } buf;
 uint64_t data = 0;
 
-if (pread(bar->fd, &buf, size, bar->fd_offset + addr) != size) {
+if (pread(region->fd, &buf, size, region->fd_offset + addr) != size) {
 error_report(%s(,0x%HWADDR_PRIx, %d)

Re: [Qemu-devel] [RFC v4 08/13] hw/vfio/common: Add EXEC_FLAG to VFIO DMA mappings

2014-07-07 Thread Will Deacon

On Mon, Jul 07, 2014 at 01:27:18PM +0100, Eric Auger wrote:
 From: Alvise Rigo a.r...@virtualopensystems.com
 
 The flag is mandatory for the ARM SMMU so we always add it if the MMIO
 handles it.

I thought the logic of this flag was changing (so that you request an NX
mapping instead), so I'd hold off on this change until the kernel has
decided what it's doing.

Also, the ARM SMMU doesn't mandate any flags, you probably need this as
a result of using a PL330, which is an odd case of a DMA master that
spits out EXEC transactions (instruction fetch).

Will

Re: [Qemu-devel] [RFC PATCH 0/5] modify boot order when vm is running

2014-07-07 Thread Michael S. Tsirkin

On Mon, Jul 07, 2014 at 11:08:32AM +, Gonglei (Arei) wrote:
  -Original Message-
  From: Michael S. Tsirkin [mailto:m...@redhat.com]
  Sent: Monday, July 07, 2014 5:29 PM
  To: Gonglei (Arei)
  Cc: qemu-devel@nongnu.org; afaer...@suse.de; ag...@suse.de;
  stefa...@redhat.com; ak...@redhat.com; a...@ozlabs.ru;
  alex.william...@redhat.com; arm...@redhat.com; ebl...@redhat.com;
  kw...@redhat.com; peter.mayd...@linaro.org; lcapitul...@redhat.com;
  pbonz...@redhat.com; ler...@redhat.com; kra...@redhat.com;
  imamm...@redhat.com; dmi...@daynix.com; marce...@redhat.com;
  peter.crosthwa...@xilinx.com; r...@twiddle.net; so...@cmu.edu;
  Huangweidong (C); Luonengjun; Huangpeng (Peter); chenliang (T)
  Subject: Re: [RFC PATCH 0/5] modify boot order when vm is running

  On Mon, Jul 07, 2014 at 05:10:56PM +0800, arei.gong...@huawei.com wrote:
   From: Chenliang chenlian...@huawei.com

   Sometimes we want to modify the boot order of a VM without shutting it
   down. This set of patches adds one QMP command to achieve that, and fixes
   some small bugs that occur when a device is hotplugged.

   Chenliang (5):
 bootindex: add *_boot_device_path function
 bootindex: reset bootindex when vm reset
 bootindex: delete boot index when device is removed
 bootindex: add qmp to set boot index when vm is running
 bootindex: fix memory leak when ppc sets boot index

  Unfortunately at least for PC, boot order is exposed
  in fw cfg which can not change while guest is running.

 Yes, so we should ensure it takes effect after the guest reboots.

Does this patch do it like this?
I didn't get it. How is this handled?
Maybe more code comments would be helpful to make this
clear to readers.

  I suspect we need to change how we report boot order to guests.
  While we are at it, maybe we can fix the silly bootindex
  convention: I think people really want to specify boot *order*,
  not boot index.

 Agreed.

 But at present, bootindex is what can be used to set the boot order,
 apart from the -boot command line option. -boot can only make the guest
 boot from hard disk, network, floppy, etc.; it cannot select a particular
 hard disk or PXE network card, which is not enough for many scenarios,
 such as P2V or two different system hard disks (vda/sda/hda).

 Best regards,
 -Gonglei

[Qemu-devel] [RFC v4 00/13] KVM platform device passthrough

2014-07-07 Thread Eric Auger

This RFC series aims at enabling KVM platform device passthrough.
It implements a VFIO platform device which is bound to be dynamically
instantiated using -device option.

The VFIO platform device uses a host VFIO platform driver which must
be bound to the assigned device prior to the QEMU system start.

- the guest can directly access the device register space
- assigned device IRQs are transparently routed to the guest by
  QEMU/KVM (2 methods currently are supported)
- iommu is transparently programmed to prevent the device from
  accessing physical pages outside of the guest address space

The patch series was fully reworked between v3 and v4 to ease the
review of PCI modifications. Dynamic instantiation from command
line was cleaned up thanks to Alex Graf's "Dynamic sysbus device
allocation support" patch series and its porting onto machvirt.

the patch relies on the following QEMU patch series:
- Alex Graf's Dynamic sysbus device allocation support
  http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html
- machvirt dynamic sysbus device instantiation
  Port Alex mechanics from e500 to virt. Propose to implement
  device tree generation in devices instead of machine file

The patch series is made of the following patch files:

1-6) Modifications to PCI code to prepare for VFIO platform device:
7) split of PCI specific code and generic code (move)
8) EXEC_FLAG setting
9) creation of the VFIO platform device, without irqfd support
   (MMIO direct access and IRQ assignment).
10-11) addition of irqfd/virqfd support
12) capability to dynamically instantiate the device
13) example derived VFIO device: calxeda xgmac

v3-v4 changes (Eric Auger, Alvise Rigo)
- rebase on last VFIO PCI code (v2.1.0-rc0)
- full git history rework to ease PCI code change review
- mv include files in hw/vfio
- DPRINTF reformatting temporarily moved out
- support of VFIO virq (removal of resamplefd handler on user-side)
- integration with sysbus dynamic instantiation framework
- removal of unrealize and cleanup routines until it is better
  understood what is really needed
- Support of VFIO for Amba devices should be handled in an inherited
  device to specialize the device tree generation (clock handle currently
  missing in framework however)
- Always use eventfd as notifying mechanism temporarily moved out
- static instantiation is not mainstream (although it remains possible)
  note if static instantiation is used, irqfd must be setup in machine file
  when virtual IRQ is known
- create the GSI routing table on qemu side

v2-v3 changes (Alvise Rigo, Eric Auger):
- Following Alex W's recommendations, further efforts to factorize the
  code between PCI and platform: introduction of VFIODevice and VFIORegion
  as base classes
- unique reset handler for platform and PCI
- cleanup following Kim's comments
- multiple IRQ support mechanics should be in place although not
  tested
- Better handling of MMIO multiple regions
- New features and fixes by Alvise (multiple compat string, exec
  flag, force eventfd usage, amba device tree support)
- irqfd support

v1-v2 changes (Kim Phillips, Eric Auger):
- IRQ initial support (legacy mode where eventfds are handled on
  user side)
- hacked dynamic instantiation

v1 (Kim Phillips):
- initial split between PCI and platform
- MMIO support only
- static instantiation

This patch has the following kernel side dependencies:

- [RFC Patch v6 0/20] VFIO support for platform devices
https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html
- [Patch] ARM: KVM: Handle IPA unmapping on memory region deletion
https://patches.linaro.org/27691/
- [PATCH v3] ARM: KVM: add irqfd and irq routing support
https://patches.linaro.org/32261/
- [PATCH] ARM: KVM: Enable the KVM-VFIO device
https://lists.cs.columbia.edu/pipermail/kvmarm/2014-March/008629.html
- [PATCH v2] ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping
http://www.spinics.net/lists/kvm/msg105083.html

The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
is assigned to KVM host while the second one is assigned to the guest.

Unfortunately a single IRQ is exercised.

Next steps:
- use of ARM: Forwarding physical interrupts to a guest VM
- unbind/migration/reset problematics

Here are the instructions to test on a Calxeda Midway:

https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway

git://git.linaro.org/people/eric.auger/linux.git (branch irqfd_integ_v3)
git://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v4)

Best Regards

Eric


Alvise Rigo (1):
  hw/vfio/common: Add EXEC_FLAG to VFIO DMA mappings

Eric Auger (11):
  hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice
  hw/vfio/pci: Remove unneeded include files
  hw/vfio/pci: introduce VFIODevice
  hw/vfio/pci: Introduce VFIORegion
  hw/vfio/pci: split vfio_get_device
  hw/vfio: create common module
  hw/vfio/platform: add vfio-platform support
  hw/intc/arm_gic_kvm: enable irqfd and set routing table
  hw/vfio/platform:

[Qemu-devel] [RFC v4 10/13] hw/intc/arm_gic_kvm: enable irqfd and set routing table

2014-07-07 Thread Eric Auger

Makes it possible to use KVM irqfd. An identity GSI routing table
is defined so that virtual IRQ injection can happen.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/intc/arm_gic_kvm.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
index 5038885..29b9236 100644
--- a/hw/intc/arm_gic_kvm.c
+++ b/hw/intc/arm_gic_kvm.c
@@ -576,6 +576,17 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error 
**errp)
 KVM_DEV_ARM_VGIC_GRP_ADDR,
 KVM_VGIC_V2_ADDR_TYPE_CPU,
 s->dev_fd);
+
+/* set up irq routing */
+kvm_init_irq_routing(kvm_state);
+for (i = 0; i < s->num_irq - GIC_INTERNAL; ++i) {
+kvm_irqchip_add_irq_route(kvm_state, i, 0, i);
+}
+
+kvm_irqfds_allowed = true;
+kvm_gsi_routing_allowed = true;
+
+kvm_irqchip_commit_routes(kvm_state);
 }
 
 static void kvm_arm_gic_class_init(ObjectClass *klass, void *data)
-- 
1.8.3.2

[Qemu-devel] [PATCH v3 0/2] coroutine: dynamically scale pool size

2014-07-07 Thread Stefan Hajnoczi

v3:
 * Use COROUTINE_POOL_RESERVATION constant in block.c [Lluis]

v2:
 * Assert that callers never reduce pool below default size [eblake]

The coroutine pool reuses exited coroutines to make qemu_coroutine_create()
cheap.  The size of the pool is capped to prevent it from hogging memory after
a period of high coroutine activity.  Previously the max size was hardcoded to
64 but this doesn't scale with guest size.

A guest with lots of disks can do more parallel I/O and therefore requires a
larger coroutine pool size.  This series tries to solve the problem by scaling
pool size according to the number of drives.

Ming has confirmed that this patch series, together with his block plug/unplug
series, solves the dataplane performance regression in QEMU 2.1.

Stefan Hajnoczi (2):
  coroutine: make pool size dynamic
  block: bump coroutine pool size for drives

 block.c   |  6 ++
 include/block/coroutine.h | 11 +++
 qemu-coroutine.c  | 26 +++---
 3 files changed, 40 insertions(+), 3 deletions(-)

-- 
1.9.3
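
The approach described in this cover letter can be sketched as follows. This is a single-threaded illustration under assumed names (pool_adjust_max(), pool_try_release()), not the code from the series; the real pool is protected by a mutex and holds actual coroutine objects, both of which are left out here.

```c
#include <assert.h>

enum {
    POOL_DEFAULT_SIZE = 64,
};

/* Cap on how many freed objects the pool may keep for reuse. */
static unsigned int pool_max_size = POOL_DEFAULT_SIZE;
/* Number of freed objects currently held in the pool. */
static unsigned int pool_size;

/*
 * Raise (n > 0) or lower (n < 0) the cap. Callers must only give back
 * what they reserved earlier, so the cap never drops below the default.
 */
static void pool_adjust_max(int n)
{
    assert((int)pool_max_size + n >= POOL_DEFAULT_SIZE);
    pool_max_size = (unsigned int)((int)pool_max_size + n);
}

/*
 * Called when an object is released. Returns 1 if the pool kept it for
 * reuse, 0 if the pool is full and the caller must free it.
 */
static int pool_try_release(void)
{
    if (pool_size < pool_max_size) {
        pool_size++;
        return 1;
    }
    return 0;
}
```

A block layer user would call pool_adjust_max() with a positive reservation when a drive is added and with the matching negative value when it goes away.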

[Qemu-devel] [PATCH v3 1/2] coroutine: make pool size dynamic

2014-07-07 Thread Stefan Hajnoczi

Allow coroutine users to adjust the pool size.  For example, if the
guest has multiple emulated disk drives we should keep around more
coroutines.
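
For readers skimming the thread, here is a minimal standalone sketch of the idea, not the QEMU code itself: a free list whose cap can be raised or lowered at run time, trimming surplus entries when the cap drops.  The names Pool, Item, pool_put(), pool_get() and pool_adjust_max() are made up for illustration, and locking is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_DEFAULT_SIZE 64

/* Hypothetical pooled object; stands in for a Coroutine. */
typedef struct Item {
    struct Item *next;
} Item;

typedef struct Pool {
    Item *head;            /* free list of reusable items */
    unsigned int size;     /* items currently on the free list */
    unsigned int max_size; /* adjustable cap on the free list */
} Pool;

static void pool_init(Pool *p)
{
    p->head = NULL;
    p->size = 0;
    p->max_size = POOL_DEFAULT_SIZE;
}

/* Return an item to the pool, or free it if the pool is full. */
static void pool_put(Pool *p, Item *it)
{
    if (p->size < p->max_size) {
        it->next = p->head;
        p->head = it;
        p->size++;
    } else {
        free(it);
    }
}

/* Take an item from the pool, falling back to a fresh allocation. */
static Item *pool_get(Pool *p)
{
    Item *it = p->head;
    if (it) {
        p->head = it->next;
        p->size--;
        return it;
    }
    return calloc(1, sizeof(*it));
}

/* Raise (n > 0) or lower (n < 0) the cap, trimming surplus items. */
static void pool_adjust_max(Pool *p, int n)
{
    p->max_size += n;
    /* Callers should never take away more than they added */
    assert(p->max_size >= POOL_DEFAULT_SIZE);
    while (p->size > p->max_size) {
        Item *it = p->head;
        p->head = it->next;
        p->size--;
        free(it);
    }
}
```

A caller that expects heavy use would call pool_adjust_max(&p, 64) up front and pool_adjust_max(&p, -64) when done, which mirrors the reserve/release pairing described above.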

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 include/block/coroutine.h | 11 +++
 qemu-coroutine.c  | 26 +++---
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/include/block/coroutine.h b/include/block/coroutine.h
index a1797ae..07eeb3c 100644
--- a/include/block/coroutine.h
+++ b/include/block/coroutine.h
@@ -223,4 +223,15 @@ void coroutine_fn co_aio_sleep_ns(AioContext *ctx, QEMUClockType type,
  * Note that this function clobbers the handlers for the file descriptor.
  */
 void coroutine_fn yield_until_fd_readable(int fd);
+
+/**
+ * Add or subtract from the coroutine pool size
+ *
+ * The coroutine implementation keeps a pool of coroutines to be reused by
+ * qemu_coroutine_create().  This makes coroutine creation cheap.  Heavy
+ * coroutine users should call this to reserve pool space.  Call it again with
+ * a negative number to release pool space.
+ */
+void qemu_coroutine_adjust_pool_size(int n);
+
 #endif /* QEMU_COROUTINE_H */
diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index 4708521..bd574aa 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -19,14 +19,14 @@
 #include "block/coroutine_int.h"
 
 enum {
-/* Maximum free pool size prevents holding too many freed coroutines */
-POOL_MAX_SIZE = 64,
+POOL_DEFAULT_SIZE = 64,
 };
 
 /** Free list to speed up creation */
 static QemuMutex pool_lock;
 static QSLIST_HEAD(, Coroutine) pool = QSLIST_HEAD_INITIALIZER(pool);
 static unsigned int pool_size;
+static unsigned int pool_max_size = POOL_DEFAULT_SIZE;
 
 Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
 {
@@ -55,7 +55,7 @@ static void coroutine_delete(Coroutine *co)
 {
 if (CONFIG_COROUTINE_POOL) {
 qemu_mutex_lock(&pool_lock);
-if (pool_size < POOL_MAX_SIZE) {
+if (pool_size < pool_max_size) {
 QSLIST_INSERT_HEAD(&pool, co, pool_next);
 co->caller = NULL;
 pool_size++;
@@ -137,3 +137,23 @@ void coroutine_fn qemu_coroutine_yield(void)
 self->caller = NULL;
 coroutine_swap(self, to);
 }
+
+void qemu_coroutine_adjust_pool_size(int n)
+{
+qemu_mutex_lock(&pool_lock);
+
+pool_max_size += n;
+
+/* Callers should never take away more than they added */
+assert(pool_max_size >= POOL_DEFAULT_SIZE);
+
+/* Trim oversized pool down to new max */
+while (pool_size > pool_max_size) {
+Coroutine *co = QSLIST_FIRST(&pool);
+QSLIST_REMOVE_HEAD(&pool, pool_next);
+pool_size--;
+qemu_coroutine_delete(co);
+}
+
+qemu_mutex_unlock(&pool_lock);
+}
-- 
1.9.3

[Qemu-devel] [PATCH v3 2/2] block: bump coroutine pool size for drives

2014-07-07 Thread Stefan Hajnoczi

When a BlockDriverState is associated with a storage controller
DeviceState we expect guest I/O.  Use this opportunity to bump the
coroutine pool size by 64.

This patch ensures that the coroutine pool size scales with the number
of drives attached to the guest.  It should increase coroutine pool
usage (which makes qemu_coroutine_create() fast) without hogging too
much memory when fewer drives are attached.

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 block.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/block.c b/block.c
index f80e2b2..3a6dcd8 100644
--- a/block.c
+++ b/block.c
@@ -57,6 +57,8 @@ struct BdrvDirtyBitmap {
 
 #define NOT_DONE 0x7fff /* used while emulated sync operation in progress */
 
+#define COROUTINE_POOL_RESERVATION 64 /* number of coroutines to reserve */
+
 static void bdrv_dev_change_media_cb(BlockDriverState *bs, bool load);
 static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
 int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
@@ -2093,6 +2095,9 @@ int bdrv_attach_dev(BlockDriverState *bs, void *dev)
 }
 bs->dev = dev;
 bdrv_iostatus_reset(bs);
+
+/* We're expecting I/O from the device so bump up coroutine pool size */
+qemu_coroutine_adjust_pool_size(COROUTINE_POOL_RESERVATION);
 return 0;
 }
 
@@ -2112,6 +2117,7 @@ void bdrv_detach_dev(BlockDriverState *bs, void *dev)
 bs->dev_ops = NULL;
 bs->dev_opaque = NULL;
 bs->guest_block_size = 512;
+qemu_coroutine_adjust_pool_size(-COROUTINE_POOL_RESERVATION);
 }
 
 /* TODO change to return DeviceState * when all users are qdevified */
-- 
1.9.3

[Qemu-devel] [PATCH 3/4] test-aio: fix GSource-based timer test

2014-07-07 Thread Paolo Bonzini

The current test depends too much on the implementation of the AioContext
GSource.  Just iterate on the main loop until the callback has been invoked
the right number of times.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 tests/test-aio.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/tests/test-aio.c b/tests/test-aio.c
index e5f8b55..264dab9 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -806,17 +806,16 @@ static void test_source_timer_schedule(void)
 g_usleep(1 * G_USEC_PER_SEC);
 g_assert_cmpint(data.n, ==, 0);
 
-g_assert(g_main_context_iteration(NULL, false));
+g_assert(g_main_context_iteration(NULL, true));
 g_assert_cmpint(data.n, ==, 1);
+expiry += data.ns;
 
-/* The comment above was not kidding when it said this wakes up itself */
-do {
-g_assert(g_main_context_iteration(NULL, true));
-} while (qemu_clock_get_ns(data.clock_type) <= expiry);
-g_usleep(1 * G_USEC_PER_SEC);
-g_main_context_iteration(NULL, false);
+while (data.n < 2) {
+g_main_context_iteration(NULL, true);
+}
 
 g_assert_cmpint(data.n, ==, 2);
+g_assert(qemu_clock_get_ns(data.clock_type) > expiry);
 
 aio_set_fd_handler(ctx, pipefd[0], NULL, NULL, NULL);
 close(pipefd[0]);
-- 
1.8.3.1

[Qemu-devel] [PATCH 1/4] block: prefer aio_poll to qemu_aio_wait

2014-07-07 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 block.c| 2 +-
 blockjob.c | 2 +-
 qemu-io-cmds.c | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index f80e2b2..591a913 100644
--- a/block.c
+++ b/block.c
@@ -471,7 +471,7 @@ int bdrv_create(BlockDriver *drv, const char* filename,
 co = qemu_coroutine_create(bdrv_create_co_entry);
 qemu_coroutine_enter(co, &cco);
 while (cco.ret == NOT_DONE) {
-qemu_aio_wait();
+aio_poll(qemu_get_aio_context(), true);
 }
 }
 
diff --git a/blockjob.c b/blockjob.c
index 67a64ea..ca0b4e2 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -187,7 +187,7 @@ int block_job_cancel_sync(BlockJob *job)
 job->opaque = &data;
 block_job_cancel(job);
 while (data.ret == -EINPROGRESS) {
-qemu_aio_wait();
+aio_poll(bdrv_get_aio_context(bs), true);
 }
 return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
 }
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 60c1ceb..c503fc6 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -483,7 +483,7 @@ static int do_co_write_zeroes(BlockDriverState *bs, int64_t offset, int count,
 co = qemu_coroutine_create(co_write_zeroes_entry);
 qemu_coroutine_enter(co, &data);
 while (!data.done) {
-qemu_aio_wait();
+aio_poll(bdrv_get_aio_context(bs), true);
 }
 if (data.ret < 0) {
 return data.ret;
@@ -2027,7 +2027,7 @@ static const cmdinfo_t resume_cmd = {
 static int wait_break_f(BlockDriverState *bs, int argc, char **argv)
 {
 while (!bdrv_debug_is_suspended(bs, argv[1])) {
-qemu_aio_wait();
+aio_poll(bdrv_get_aio_context(bs), true);
 }
 
 return 0;
-- 
1.8.3.1

[Qemu-devel] [PATCH 2/4] block: drop aio functions that operate on the main AioContext

2014-07-07 Thread Paolo Bonzini

The main AioContext should be accessed explicitly via qemu_get_aio_context().
Most of the time, using it is not the right thing to do.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 aio-posix.c   |  4 ++--
 aio-win32.c   |  6 +++---
 include/block/aio.h   | 17 ++---
 include/block/blockjob.h  |  4 ++--
 include/block/coroutine.h |  2 +-
 main-loop.c   | 21 -
 tests/test-thread-pool.c  |  4 ++--
 7 files changed, 12 insertions(+), 46 deletions(-)

diff --git a/aio-posix.c b/aio-posix.c
index f921d4f..44c4df3 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -125,7 +125,7 @@ static bool aio_dispatch(AioContext *ctx)
 bool progress = false;
 
 /*
- * We have to walk very carefully in case qemu_aio_set_fd_handler is
+ * We have to walk very carefully in case aio_set_fd_handler is
  * called while we're walking.
  */
 node = QLIST_FIRST(&ctx->aio_handlers);
@@ -183,7 +183,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 /*
  * If there are callbacks left that have been queued, we need to call them.
  * Do not call select in this case, because it is possible that the caller
- * does not need a complete flush (as is the case for qemu_aio_wait loops).
+ * does not need a complete flush (as is the case for aio_poll loops).
  */
 if (aio_bh_poll(ctx)) {
 blocking = false;
diff --git a/aio-win32.c b/aio-win32.c
index 23f4e5b..c12f61e 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -102,7 +102,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 /*
  * If there are callbacks left that have been queued, we need to call then.
  * Do not call select in this case, because it is possible that the caller
- * does not need a complete flush (as is the case for qemu_aio_wait loops).
+ * does not need a complete flush (as is the case for aio_poll loops).
  */
 if (aio_bh_poll(ctx)) {
 blocking = false;
@@ -115,7 +115,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 /*
  * Then dispatch any pending callbacks from the GSource.
  *
- * We have to walk very carefully in case qemu_aio_set_fd_handler is
+ * We have to walk very carefully in case aio_set_fd_handler is
  * called while we're walking.
  */
 node = QLIST_FIRST(&ctx->aio_handlers);
@@ -177,7 +177,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 blocking = false;
 
 /* we have to walk very carefully in case
- * qemu_aio_set_fd_handler is called while we're walking */
+ * aio_set_fd_handler is called while we're walking */
 node = QLIST_FIRST(&ctx->aio_handlers);
 while (node) {
 AioHandler *tmp;
diff --git a/include/block/aio.h b/include/block/aio.h
index a92511b..d81250c 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -220,7 +220,7 @@ bool aio_poll(AioContext *ctx, bool blocking);
 #ifdef CONFIG_POSIX
 /* Register a file descriptor and associated callbacks.  Behaves very similarly
  * to qemu_set_fd_handler2.  Unlike qemu_set_fd_handler2, these callbacks will
- * be invoked when using qemu_aio_wait().
+ * be invoked when using aio_poll().
  *
  * Code that invokes AIO completion functions should rely on this function
  * instead of qemu_set_fd_handler[2].
@@ -234,7 +234,7 @@ void aio_set_fd_handler(AioContext *ctx,
 
 /* Register an event notifier and associated callbacks.  Behaves very similarly
  * to event_notifier_set_handler.  Unlike event_notifier_set_handler, these callbacks
- * will be invoked when using qemu_aio_wait().
+ * will be invoked when using aio_poll().
  *
  * Code that invokes AIO completion functions should rely on this function
  * instead of event_notifier_set_handler.
@@ -251,19 +251,6 @@ GSource *aio_get_g_source(AioContext *ctx);
 /* Return the ThreadPool bound to this AioContext */
 struct ThreadPool *aio_get_thread_pool(AioContext *ctx);
 
-/* Functions to operate on the main QEMU AioContext.  */
-
-bool qemu_aio_wait(void);
-void qemu_aio_set_event_notifier(EventNotifier *notifier,
- EventNotifierHandler *io_read);
-
-#ifdef CONFIG_POSIX
-void qemu_aio_set_fd_handler(int fd,
- IOHandler *io_read,
- IOHandler *io_write,
- void *opaque);
-#endif
-
 /**
  * aio_timer_new:
  * @ctx: the aio context
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index f3cf63f..60aa835 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -74,7 +74,7 @@ struct BlockJob {
  * Set to true if the job should cancel itself.  The flag must
  * always be tested just before toggling the busy flag from false
  * to true.  After a job has been cancelled, it should only yield
- * if #qemu_aio_wait will (sooner or later) reenter the coroutine.
+ * if #aio_poll will (sooner or later) reenter the coroutine.
  */
 bool

[Qemu-devel] [PATCH for 2.1 0/4] AioContext cleanups and optimizations

2014-07-07 Thread Paolo Bonzini

These patches do some cleanup and optimization in AioContext land.

The first two drop AIO functions that operate on the main AioContext.
These are not needed anymore now that each BlockDriverState explicitly
operates on its own AioContext.  They are independent, and can be
skipped if the maintainers prefer doing so or have comments.

Patch 3 is a testsuite change for the aio_notify optimization, and
patch 4 is the aio_notify patch, now with a better comment
about the smp_mb optimization.

Paolo Bonzini (4):
  block: prefer aio_poll to qemu_aio_wait
  block: drop aio functions that operate on the main AioContext
  test-aio: fix GSource-based timer test
  AioContext: speed up aio_notify

 aio-posix.c   |  38 +++--
 aio-win32.c   |   6 +--
 async.c   |  19 -
 block.c   |   2 +-
 blockjob.c|   2 +-
 docs/aio_notify.promela   | 104 ++
 include/block/aio.h   |  26 +---
 include/block/blockjob.h  |   4 +-
 include/block/coroutine.h |   2 +-
 main-loop.c   |  21 --
 qemu-io-cmds.c|   4 +-
 tests/test-aio.c  |  13 +++---
 tests/test-thread-pool.c  |   4 +-
 13 files changed, 186 insertions(+), 59 deletions(-)
 create mode 100644 docs/aio_notify.promela

-- 
1.8.3.1

[Qemu-devel] [PATCH 4/4] AioContext: speed up aio_notify

2014-07-07 Thread Paolo Bonzini

In many cases, the call to event_notifier_set in aio_notify is unnecessary.
In particular, if we are executing aio_dispatch, or if aio_poll is not
blocking, we know that we will soon get to the next loop iteration (if
necessary); the thread that hosts the AioContext's event loop does not
need any nudging.

The patch includes a Promela formal model that shows that this really
works and does not need any further complication such as generation
counts.  It needs a memory barrier though.

The generation counts are not needed because any change to
ctx->dispatching after the memory barrier is okay for aio_notify.
If it changes from zero to one, it is the right thing to skip
event_notifier_set.  If it changes from one to zero, the
event_notifier_set is unnecessary but harmless.
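
To make the barrier pairing concrete, here is a minimal sketch in portable C11 atomics rather than QEMU's own primitives.  Ctx, ctx_set_dispatching() and ctx_schedule_and_notify() are made-up names, the "scheduled" flag stands in for something like a bottom half's scheduled bit, and a counter replaces the real event notifier.  Each side writes its own flag, issues a full fence, then reads the other side's flag, so at least one of the two threads is guaranteed to observe the other's write.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the AioContext fields the patch touches. */
typedef struct Ctx {
    atomic_bool dispatching; /* true while the loop will re-check work anyway */
    atomic_bool scheduled;   /* stands in for e.g. bh->scheduled */
    atomic_int  wakeups;     /* counts simulated event_notifier_set() calls */
} Ctx;

/* Loop thread: publish whether a wakeup may be skipped. */
static void ctx_set_dispatching(Ctx *ctx, bool dispatching)
{
    atomic_store_explicit(&ctx->dispatching, dispatching,
                          memory_order_relaxed);
    if (!dispatching) {
        /* Write dispatching before the loop reads scheduled work. */
        atomic_thread_fence(memory_order_seq_cst);
    }
}

/* Any thread: schedule work, then wake the loop only if it might block.
 * Returns true if a wakeup was sent. */
static bool ctx_schedule_and_notify(Ctx *ctx)
{
    atomic_store_explicit(&ctx->scheduled, true, memory_order_relaxed);
    /* Write scheduled before reading dispatching. */
    atomic_thread_fence(memory_order_seq_cst);
    if (!atomic_load_explicit(&ctx->dispatching, memory_order_relaxed)) {
        atomic_fetch_add(&ctx->wakeups, 1); /* simulated notifier kick */
        return true;
    }
    return false;
}
```

If the flag flips from false to true after the notifier has read it, the extra wakeup is merely redundant, which is the "unnecessary but harmless" case above.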

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 aio-posix.c |  34 +++-
 async.c |  19 -
 docs/aio_notify.promela | 104 
 include/block/aio.h |   9 +
 4 files changed, 164 insertions(+), 2 deletions(-)
 create mode 100644 docs/aio_notify.promela

diff --git a/aio-posix.c b/aio-posix.c
index 44c4df3..2eada2e 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -175,11 +175,38 @@ static bool aio_dispatch(AioContext *ctx)
 bool aio_poll(AioContext *ctx, bool blocking)
 {
 AioHandler *node;
+bool was_dispatching;
 int ret;
 bool progress;
 
+was_dispatching = ctx->dispatching;
 progress = false;
 
+/* aio_notify can avoid the expensive event_notifier_set if
+ * everything (file descriptors, bottom halves, timers) will
+ * be re-evaluated before the next blocking poll().  This happens
+ * in two cases:
+ *
+ * 1) when aio_poll is called with blocking == false
+ *
+ * 2) when we are called after poll().  If we are called before
+ *poll(), bottom halves will not be re-evaluated and we need
+ *aio_notify() if blocking == true.
+ *
+ * The first aio_dispatch() only does something when AioContext is
+ * running as a GSource, and in that case aio_poll is used only
+ * with blocking == false, so this optimization is already quite
+ * effective.  However, the code is ugly and should be restructured
+ * to have a single aio_dispatch() call.  To do this, we need to
+ * reorganize aio_poll into a prepare/poll/dispatch model like
+ * glib's.
+ *
+ * If we're in a nested event loop, ctx->dispatching might be true.
+ * In that case we can restore it just before returning, but we
+ * have to clear it now.
+ */
+aio_set_dispatching(ctx, !blocking);
+
 /*
  * If there are callbacks left that have been queued, we need to call them.
  * Do not call select in this case, because it is possible that the caller
@@ -190,12 +217,14 @@ bool aio_poll(AioContext *ctx, bool blocking)
 progress = true;
 }
 
+/* Re-evaluate condition (1) above.  */
+aio_set_dispatching(ctx, !blocking);
 if (aio_dispatch(ctx)) {
 progress = true;
 }
 
 if (progress && !blocking) {
-return true;
+goto out;
 }
 
 ctx->walking_handlers++;
@@ -234,9 +263,12 @@ bool aio_poll(AioContext *ctx, bool blocking)
 }
 
 /* Run dispatch even if there were no readable fds to run timers */
+aio_set_dispatching(ctx, true);
 if (aio_dispatch(ctx)) {
 progress = true;
 }
 
+out:
+aio_set_dispatching(ctx, was_dispatching);
 return progress;
 }
diff --git a/async.c b/async.c
index 5b6fe6b..34af0b2 100644
--- a/async.c
+++ b/async.c
@@ -26,6 +26,7 @@
 #include "block/aio.h"
 #include "block/thread-pool.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"
 
 /***/
 /* bottom halves (can be seen as timers which expire ASAP) */
@@ -247,9 +248,25 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
 return ctx->thread_pool;
 }
 
+void aio_set_dispatching(AioContext *ctx, bool dispatching)
+{
+ctx->dispatching = dispatching;
+if (!dispatching) {
+/* Write ctx->dispatching before reading e.g. bh->scheduled.
+ * Optimization: this is only needed when we're entering the unsafe
+ * phase where other threads must call event_notifier_set.
+ */
+smp_mb();
+}
+}
+
 void aio_notify(AioContext *ctx)
 {
-event_notifier_set(&ctx->notifier);
+/* Write e.g. bh->scheduled before reading ctx->dispatching.  */
+smp_mb();
+if (!ctx->dispatching) {
+event_notifier_set(&ctx->notifier);
+}
 }
 
 static void aio_timerlist_notify(void *opaque)
diff --git a/docs/aio_notify.promela b/docs/aio_notify.promela
new file mode 100644
index 000..ad3f6f0
--- /dev/null
+++ b/docs/aio_notify.promela
@@ -0,0 +1,104 @@
+/*
+ * This model describes the interaction between aio_set_dispatching()
+ * and aio_notify().
+ *
+ * Author: Paolo Bonzini pbonz...@redhat.com
+ *
+ * This file is in the public

[Qemu-devel] [RFC v4 08/13] hw/vfio/common: Add EXEC_FLAG to VFIO DMA mappings

2014-07-07 Thread Eric Auger

From: Alvise Rigo a.r...@virtualopensystems.com

The flag is mandatory for the ARM SMMU so we always add it if the MMIO
handles it.

Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
---
 hw/vfio/common.c  | 9 +
 include/hw/vfio/vfio-common.h | 1 +
 linux-headers/linux/vfio.h| 2 ++
 3 files changed, 12 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ed93cf3..e22f326 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -233,6 +233,11 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
 map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
 }
 
+/* add exec flag */
+if (container->iommu_data.has_exec_cap) {
+map.flags |= VFIO_DMA_MAP_FLAG_EXEC;
+}
+
 /*
  * Try the mapping, if it fails with EBUSY, unmap the region and try
  * again.  This shouldn't be necessary, but we sometimes see it in
@@ -688,6 +693,10 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
 goto free_container_exit;
 }
 
+if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_IOMMU_PROT_EXEC)) {
+container->iommu_data.has_exec_cap = true;
+}
+
 container->iommu_data.type1.listener = vfio_memory_listener;
 container->iommu_data.release = vfio_listener_release;
 
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index d19622b..e670ae3 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -76,6 +76,7 @@ typedef struct VFIOContainer {
 union {
 VFIOType1 type1;
 };
+bool has_exec_cap; /* support of exec capability by the IOMMU */
 void (*release)(struct VFIOContainer *);
 } iommu_data;
 QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 26c218e..b13f7d3 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -30,6 +30,7 @@
  */
 #define VFIO_DMA_CC_IOMMU  4
 
+#define VFIO_IOMMU_PROT_EXEC   5
 /*
  * The IOCTL interface is designed for extensibility by embedding the
  * structure length (argsz) and flags into structures passed between
@@ -398,6 +399,7 @@ struct vfio_iommu_type1_dma_map {
__u32   flags;
 #define VFIO_DMA_MAP_FLAG_READ (1 << 0) /* readable from device */
 #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)   /* writable from device */
+#define VFIO_DMA_MAP_FLAG_EXEC (1 << 2) /* executable from device */
__u64   vaddr;  /* Process virtual address */
__u64   iova;   /* IO virtual address */
__u64   size;   /* Size of mapping (bytes) */
-- 
1.8.3.2

[Qemu-devel] [PULL 10/12] qdev: Fix crash when using non-device class name on -global

2014-07-07 Thread Michael S. Tsirkin

From: Eduardo Habkost ehabk...@redhat.com

This fixes the following crash:

$ qemu-system-x86_64 -global container.xxx=y
hw/core/qdev-properties-system.c:399:qdev_add_one_global: Object 0x7f7eff234100 is not an instance of type device
Aborted (core dumped)

New behavior will be to just warn, just like when non-existing class
names are used:

$ qemu-system-x86_64 -global container.xxx=y
qemu-system-x86_64: Warning: -global container.xxx=y not used
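
The fix relies on a checked class cast that yields NULL instead of aborting.  Below is a small standalone sketch of that pattern; it is not the QOM implementation, and Class, class_by_name(), class_dynamic_cast() and the four registered classes are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature class registry; QOM itself is far richer. */
typedef struct Class {
    const char *name;
    const struct Class *parent;
} Class;

static const Class class_object    = { "object",    NULL };
static const Class class_device    = { "device",    &class_object };
static const Class class_container = { "container", &class_object };
static const Class class_e1000     = { "e1000",     &class_device };

static const Class *const registry[] = {
    &class_object, &class_device, &class_container, &class_e1000,
};

/* Look a class up by name; NULL if no such class exists. */
static const Class *class_by_name(const char *name)
{
    for (size_t i = 0; i < sizeof(registry) / sizeof(registry[0]); i++) {
        if (strcmp(registry[i]->name, name) == 0) {
            return registry[i];
        }
    }
    return NULL;
}

/* Return cls if it is, or derives from, the named type; otherwise NULL.
 * A NULL cls is passed through, so the two lookups can be chained. */
static const Class *class_dynamic_cast(const Class *cls, const char *type)
{
    for (const Class *c = cls; c != NULL; c = c->parent) {
        if (strcmp(c->name, type) == 0) {
            return cls;
        }
    }
    return NULL;
}
```

With this shape, class_dynamic_cast(class_by_name("container"), "device") and a lookup of an unknown name both give NULL, so the caller can take the same "warn and ignore" path in either case.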

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
Reviewed-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Eduardo Habkost ehabk...@redhat.com
Tested-by: Don Slutz dsl...@verizon.com
---
 hw/core/qdev-properties-system.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 8e140af..ae0900f 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -394,7 +394,8 @@ static int qdev_add_one_global(QemuOpts *opts, void *opaque)
 g->driver   = qemu_opt_get(opts, "driver");
 g->property = qemu_opt_get(opts, "property");
 g->value= qemu_opt_get(opts, "value");
-oc = object_class_by_name(g->driver);
+oc = object_class_dynamic_cast(object_class_by_name(g->driver),
+   TYPE_DEVICE);
 if (oc) {
 DeviceClass *dc = DEVICE_CLASS(oc);
 
-- 
MST

Re: [Qemu-devel] [RFC v4 08/13] hw/vfio/common: Add EXEC_FLAG to VFIO DMA mappings

2014-07-07 Thread Alvise Rigo

On 07/07/2014 14:49, Will Deacon wrote:
> On Mon, Jul 07, 2014 at 01:27:18PM +0100, Eric Auger wrote:
>> From: Alvise Rigo a.r...@virtualopensystems.com
>>
>> The flag is mandatory for the ARM SMMU so we always add it if the MMIO
>> handles it.
>
> I thought the logic of this flag was changing (so that you request an NX
> mapping instead), so I'd hold off on this change until the kernel has
> decided what it's doing.

Yes, you are right.
This patch is not needed anymore, in fact it was dropped in my last
patch series.
It should not be here, please ignore it.

Regards,
alvise

 
> Also, the ARM SMMU doesn't mandate any flags, you probably need this as
> a result of using a PL330, which is an odd case of a DMA master that
> spits out EXEC transactions (instruction fetch).
>
> Will

[Qemu-devel] [RFC v4 09/13] hw/vfio/platform: add vfio-platform support

2014-07-07 Thread Eric Auger

Minimal VFIO platform implementation supporting
- register space user mapping,
- IRQ assignment based on eventfds handled on qemu side.

irqfd kernel acceleration comes in a subsequent patch.

Signed-off-by: Kim Phillips kim.phill...@linaro.org
Signed-off-by: Eric Auger eric.au...@linaro.org

---

v3 -> v4:
[Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device"
  to get a fully functional patch, although performance is limited.
- removal of the unrealize function, since I currently understand
  it is only used with the device hot-plug feature.

v2 -> v3:
[Eric Auger]
- further factorization between PCI and platform (VFIORegion,
  VFIODevice). Same level of functionality.

<= v2:
[Kim Phillips]
- Initial creation of the device, supporting register space mapping
---
 hw/vfio/Makefile.objs   |   1 +
 hw/vfio/platform.c  | 528 
 include/hw/vfio/vfio-platform.h |  74 ++
 3 files changed, 603 insertions(+)
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-platform.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index e31f30e..c5c76fe 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,5 @@
 ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_SOFTMMU) += platform.o
 endif
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
new file mode 100644
index 000..a5fc22b
--- /dev/null
+++ b/hw/vfio/platform.c
@@ -0,0 +1,528 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips kim.phill...@linaro.org
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#include <linux/vfio.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+#include "qemu/error-report.h"
+#include "qemu/range.h"
+#include "sysemu/sysemu.h"
+#include "hw/vfio/vfio-platform.h"
+
+extern const MemoryRegionOps vfio_region_ops;
+extern const MemoryListener vfio_memory_listener;
+extern QLIST_HEAD(, VFIOGroup) group_list;
+extern QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces;
+
+static void vfio_put_device(VFIOPlatformDevice *vdev)
+{
+unsigned int i;
+VFIODevice *vbasedev = &vdev->vbasedev;
+
+for (i = 0; i < vbasedev->num_regions; i++) {
+g_free(vdev->regions[i]);
+}
+g_free(vdev->regions);
+vfio_put_base_device(&vdev->vbasedev);
+}
+
+/*
+ * It is mandatory to pass a VFIOPlatformDevice since VFIODevice
+ * is not a QOM Object and cannot be passed to memory region functions
+*/
+static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
+{
+VFIORegion *region = vdev->regions[nr];
+unsigned size = region->size;
+char name[64];
+
+snprintf(name, sizeof(name), "VFIO %s region %d",
+ vdev->vbasedev.name, nr);
+
+/* A slow read/write mapping underlies all regions */
+memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
+  region, name, size);
+
+strncat(name, " mmap", sizeof(name) - strlen(name) - 1);
+
+if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
+ &region->mmap_mem, &region->mmap, size, 0, name)) {
+error_report("%s unsupported. Performance may be slow", name);
+}
+}
+
+static void print_regions(VFIOPlatformDevice *vdev)
+{
+int i;
+
+DPRINTF("Device \"%s\" counts %d region(s):\n",
+ vdev->vbasedev.name, vdev->vbasedev.num_regions);
+
+for (i = 0; i < vdev->vbasedev.num_regions; i++) {
+DPRINTF("- region %d flags = 0x%lx, size = 0x%lx, "
+"fd= %d, offset = 0x%lx\n",
+vdev->regions[i]->nr,
+(unsigned long)vdev->regions[i]->flags,
+(unsigned long)vdev->regions[i]->size,
+vdev->regions[i]->fd,
+(unsigned long)vdev->regions[i]->fd_offset);
+}
+}
+
+static int vfio_populate_regions(VFIODevice *vbasedev)
+{
+struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
+int i, ret = errno;
+VFIOPlatformDevice *vdev =
+container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+vdev->regions = g_malloc0(sizeof(VFIORegion *) * vbasedev->num_regions);
+
+for (i = 0; i < vbasedev->num_regions; i++) {
+vdev->regions[i] = g_malloc0(sizeof(VFIORegion));
+reg_info.index = i;
+ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
+if (ret) {
+error_report("vfio: Error getting region %d info: %m", i);
+goto error;
+}
+
+vdev->regions[i]->flags = reg_info.flags;
+vdev->regions[i]->size = reg_info.size;
+vdev->regions[i]->fd_offset = reg_info.offset;
+vdev->regions[i]->fd = vbasedev->fd;
+vdev->regions[i]->nr = i;
+

[Qemu-devel] [PULL 06/12] pci: assign devfn to pci_dev before calling pci_device_iommu_address_space()

2014-07-07 Thread Michael S. Tsirkin

From: Le Tan tamlokv...@gmail.com

In function do_pci_register_device() in file hw/pci/pci.c, move the assignment
of pci_dev->devfn to the position before the call to
pci_device_iommu_address_space(pci_dev) which will use the value of
pci_dev->devfn.
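
The ordering problem can be shown with a tiny standalone sketch.  Dev, lookup_address_space() and register_device() are made-up names, and the lookup formula is arbitrary; the only point is that a helper keyed on devfn gives a useless answer unless the field has been filled in first.

```c
#include <stdint.h>

typedef struct Dev {
    int bus;       /* hypothetical bus number */
    int32_t devfn; /* -1 means "not assigned yet" */
} Dev;

/* Hypothetical helper that, like an IOMMU lookup, keys on devfn.
 * Returns -1 when devfn has not been assigned. */
static int lookup_address_space(const Dev *d)
{
    return d->devfn < 0 ? -1 : d->bus * 256 + d->devfn;
}

/* Fill in every field the lookup depends on before calling it. */
static int register_device(Dev *d, int bus, int devfn)
{
    d->bus = bus;
    d->devfn = devfn; /* must precede the lookup */
    return lookup_address_space(d);
}
```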

Fixes: 9eda7d373e9c691c070eddcbe3467b991f67f6bd
pci: Introduce helper to retrieve a PCI device's DMA address space

Cc: qemu-sta...@nongnu.org
Signed-off-by: Le Tan tamlokv...@gmail.com
Reviewed-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 17ed510..351d320 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -827,6 +827,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
 }
 
 pci_dev->bus = bus;
+pci_dev->devfn = devfn;
 dma_as = pci_device_iommu_address_space(pci_dev);
 
 memory_region_init_alias(&pci_dev->bus_master_enable_region,
@@ -836,7 +837,6 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
 address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_enable_region,
 name);
 
-pci_dev->devfn = devfn;
 pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
 pci_dev->irq_state = 0;
 pci_config_alloc(pci_dev);
-- 
MST

[Qemu-devel] [PULL 08/12] hw/virtio: enable common virtio feature for mmio device

2014-07-07 Thread Michael S. Tsirkin

From: Ming Lei ming@canonical.com

Both 'indirect_desc' and 'event_idx' are bus independent features,
and they should be enabled for mmio devices too.

On an arm64 quad-core VM (qemu-kvm), the patch increases block I/O
performance considerably with the latest Linux tree:
- without the patch: 14K IOPS
- with the patch: 34K IOPS

fio script:
[global]
direct=1
bsrange=4k-4k
timeout=10
numjobs=4
ioengine=libaio
iodepth=64

filename=/dev/vdc
group_reporting=1

[f1]
rw=randread

Cc: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Ming Lei ming@canonical.com
Acked-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/virtio/virtio-mmio.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 8829eb0..18c6e5b 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -369,10 +369,16 @@ static void virtio_mmio_realizefn(DeviceState *d, Error **errp)
 sysbus_init_mmio(sbd, &proxy->iomem);
 }
 
+static Property virtio_mmio_properties[] = {
+DEFINE_VIRTIO_COMMON_FEATURES(VirtIOMMIOProxy, host_features),
+DEFINE_PROP_END_OF_LIST(),
+};
+
 static void virtio_mmio_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
+dc->props = virtio_mmio_properties;
 dc->realize = virtio_mmio_realizefn;
 dc->reset = virtio_mmio_reset;
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
-- 
MST

[Qemu-devel] [PULL for-2.1 08/11] block: block: introduce APIs for submitting IO as a batch

2014-07-07 Thread Stefan Hajnoczi

From: Ming Lei ming@canonical.com

This patch introduces three APIs so that following
patches can support queuing I/O requests and submitting them
as a batch for improving I/O performance.
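
A rough standalone sketch of how plug/unplug batching can work is given below.  It is an illustration under assumptions, not the driver-side implementation from the follow-up patches: IoQueue and the ioq_*() helpers are made-up names, and both the nesting counter and the flush-when-full rule are design choices of this sketch.

```c
#include <stddef.h>

#define MAX_BATCH 32

/* Hypothetical request queue; not the QEMU data structure. */
typedef struct IoQueue {
    int pending[MAX_BATCH]; /* request ids waiting to be submitted */
    size_t n_pending;
    unsigned int plugged;   /* nesting depth of plug calls */
    size_t n_submitted;     /* requests handed to the backend so far */
    size_t n_batches;       /* number of backend submissions */
} IoQueue;

/* Hand everything queued to the backend in one submission. */
static void ioq_flush(IoQueue *q)
{
    if (q->n_pending == 0) {
        return;
    }
    q->n_submitted += q->n_pending;
    q->n_batches++;
    q->n_pending = 0;
}

/* Start (or nest) a batching section. */
static void ioq_plug(IoQueue *q)
{
    q->plugged++;
}

/* Drop one plug level; the outermost unplug submits the batch. */
static void ioq_unplug(IoQueue *q)
{
    if (q->plugged > 0 && --q->plugged == 0) {
        ioq_flush(q);
    }
}

/* Queue a request; submit at once when unplugged or when the queue fills. */
static void ioq_submit(IoQueue *q, int id)
{
    q->pending[q->n_pending++] = id;
    if (q->plugged == 0 || q->n_pending == MAX_BATCH) {
        ioq_flush(q);
    }
}
```

A device model would bracket the processing of one virtqueue kick with ioq_plug() and ioq_unplug(), so that all requests found in that pass reach the backend in a single submission; ioq_flush() plays the role of an explicit drain.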

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Ming Lei ming@canonical.com
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 block.c   | 31 +++
 include/block/block.h |  4 
 include/block/block_int.h |  5 +
 3 files changed, 40 insertions(+)

diff --git a/block.c b/block.c
index f80e2b2..8800a6b 100644
--- a/block.c
+++ b/block.c
@@ -1905,6 +1905,7 @@ void bdrv_drain_all(void)
 bool bs_busy;
 
 aio_context_acquire(aio_context);
+bdrv_flush_io_queue(bs);
 bdrv_start_throttled_reqs(bs);
 bs_busy = bdrv_requests_pending(bs);
 bs_busy |= aio_poll(aio_context, bs_busy);
@@ -5782,3 +5783,33 @@ BlockDriverState *check_to_replace_node(const char *node_name, Error **errp)
 
 return to_replace_bs;
 }
+
+void bdrv_io_plug(BlockDriverState *bs)
+{
+BlockDriver *drv = bs->drv;
+if (drv && drv->bdrv_io_plug) {
+drv->bdrv_io_plug(bs);
+} else if (bs->file) {
+bdrv_io_plug(bs->file);
+}
+}
+
+void bdrv_io_unplug(BlockDriverState *bs)
+{
+BlockDriver *drv = bs->drv;
+if (drv && drv->bdrv_io_unplug) {
+drv->bdrv_io_unplug(bs);
+} else if (bs->file) {
+bdrv_io_unplug(bs->file);
+}
+}
+
+void bdrv_flush_io_queue(BlockDriverState *bs)
+{
+BlockDriver *drv = bs->drv;
+if (drv && drv->bdrv_flush_io_queue) {
+drv->bdrv_flush_io_queue(bs);
+} else if (bs->file) {
+bdrv_flush_io_queue(bs->file);
+}
+}
diff --git a/include/block/block.h b/include/block/block.h
index baecc26..32d3676 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -584,4 +584,8 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs);
  */
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context);
 
+void bdrv_io_plug(BlockDriverState *bs);
+void bdrv_io_unplug(BlockDriverState *bs);
+void bdrv_flush_io_queue(BlockDriverState *bs);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8f8e65e..f6c3bef 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -261,6 +261,11 @@ struct BlockDriver {
 void (*bdrv_attach_aio_context)(BlockDriverState *bs,
 AioContext *new_context);
 
+/* io queue for linux-aio */
+void (*bdrv_io_plug)(BlockDriverState *bs);
+void (*bdrv_io_unplug)(BlockDriverState *bs);
+void (*bdrv_flush_io_queue)(BlockDriverState *bs);
+
 QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
1.9.3

[Qemu-devel] [RFC v4 03/13] hw/vfio/pci: Remove unneeded include files

2014-07-07 Thread Eric Auger

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/vfio/pci.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 5c7bfd5..a7df3de 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -18,26 +18,14 @@
  *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
  */
 
-#include <dirent.h>
 #include <linux/vfio.h>
 #include <sys/ioctl.h>
 #include <sys/mman.h>
-#include <sys/stat.h>
-#include <sys/types.h>
-#include <unistd.h>
-
-#include "config.h"
 #include "exec/address-spaces.h"
-#include "exec/memory.h"
 #include "hw/pci/msi.h"
 #include "hw/pci/msix.h"
-#include "hw/pci/pci.h"
-#include "qemu-common.h"
 #include "qemu/error-report.h"
-#include "qemu/event_notifier.h"
-#include "qemu/queue.h"
 #include "qemu/range.h"
-#include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "hw/vfio/vfio.h"
 
-- 
1.8.3.2

Re: [Qemu-devel] [RFC v4 08/13] hw/vfio/common: Add EXEC_FLAG to VFIO DMA mappings

2014-07-07 Thread Eric Auger

On 07/07/2014 03:25 PM, Alvise Rigo wrote:
> On 07/07/2014 14:49, Will Deacon wrote:
>> On Mon, Jul 07, 2014 at 01:27:18PM +0100, Eric Auger wrote:
>>> From: Alvise Rigo a.r...@virtualopensystems.com
>>>
>>> The flag is mandatory for the ARM SMMU so we always add it if the MMIO
>>> handles it.
>>
>> I thought the logic of this flag was changing (so that you request an NX
>> mapping instead), so I'd hold off on this change until the kernel has
>> decided what it's doing.
>
> Yes, you are right.
> This patch is not needed anymore, in fact it was dropped in my last
> patch series.
> It should not be here, please ignore it.

OK. My apologies.

Best Regards

Eric

>
> Regards,
> alvise
>
>>
>> Also, the ARM SMMU doesn't mandate any flags, you probably need this as
>> a result of using a PL330, which is an odd case of a DMA master that
>> spits out EXEC transactions (instruction fetch).
>>
>> Will

[Qemu-devel] [PULL 07/12] acpi: fix typo in memory hotplug MMIO region name

2014-07-07 Thread Michael S. Tsirkin

From: Igor Mammedov imamm...@redhat.com

Reported-by: Sergey Fionov fio...@gmail.com
Signed-off-by: Igor Mammedov imamm...@redhat.com
Reviewed-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
---
 hw/acpi/memory_hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 38ca415..ed39241 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -159,7 +159,7 @@ void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner,
 
 state->devs = g_malloc0(sizeof(*state->devs) * state->dev_count);
 memory_region_init_io(&state->io, owner, &acpi_memory_hotplug_ops, state,
-  "apci-mem-hotplug", ACPI_MEMORY_HOTPLUG_IO_LEN);
+  "acpi-mem-hotplug", ACPI_MEMORY_HOTPLUG_IO_LEN);
 memory_region_add_subregion(as, ACPI_MEMORY_HOTPLUG_BASE, &state->io);
 }
 
-- 
MST
