[Qemu-devel] [Bug 1242963] Re: QEMU loadvm causes guest OS freeze

2014-04-09 Thread NanothylL
Hi guys,

This bug has reported in a very long time.  I have tested v1.7.0, v1.7.1 and 
even v2.0.0rc1, v2.0.0rc2.
The bug is still there. This is very easy to reproduce. I dunno why it is not 
fixed until today.

I think living-snapshot is a very important feature for Qemu. Somebody can take 
a look at this issue.
Or at least, give me some tips to let me know how to fix this?

Thanks!

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1242963

Title:
  QEMU loadvm causes guest OS freeze

Status in QEMU:
  New

Bug description:
  HOST: ubuntu 13.10 x64
  GUEST: winxp sp 3 x86

  AFFECT QEMU(tested): v1.5.2, v1.5.3,  v1.6.0, v1.6.1

  I compile QEMU by myself with ./configure --target-list=i386-softmmu  make 
 make install.
  After installing a winxp sp3 into the qemu-system-i386 with command line:
   qemu-system-i386 -m 512 -hda xp.img -net user -net nic,model=rtl8139 -rtc 
base=localtime,clock=vm

  I use monitor to create a live snapshot: 
   stop
   savevm xxx
   cont

  And then I load this snapshot (I also try it in commad line: -loadvm xxx):
   loadvm xxx
   cont

  After that, the windows system is freeze (don't accept any keyboard or
  mouse input, although I knew vcpu is still working).

  If I compile with -enable-kvm and launch qemu-system-i386 with -enable-kvm, 
it looks like everything works well.
  I think it is a bug for qemu system.

  BTW: freeze is not appearing 100%, but in my test, 95% cases would
  cause system freeze.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1242963/+subscriptions



[Qemu-devel] [PATCH] qmp: synchronize state of the cpu before memory reading

2014-04-09 Thread Andrey Karpov
calling memsave' through the qmp makes a zero filled dump. the hmp still works 
properly because hmp_memsave calls cpu_synchronize_state before calling 
qmp_memsave.

---

Andrey Karpov (1):
  synchronize state of the cpu before memory reading


 cpus.c |1 +
 1 file changed, 1 insertion(+)

--
Signature



[Qemu-devel] [PATCH] synchronize state of the cpu before memory reading

2014-04-09 Thread Andrey Karpov

---
 cpus.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/cpus.c b/cpus.c
index 1104d61..4c53747 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1417,6 +1417,7 @@ void qmp_memsave(int64_t addr, int64_t size, const char 
*filename,
 return;
 }
 
+cpu_synchronize_state(cpu);
 while (size != 0) {
 l = sizeof(buf);
 if (l  size)




[Qemu-devel] [PATCH v8 3/4] raw-posix: Add full image preallocation option

2014-04-09 Thread Hu Tao
This patch adds a new option preallocation for raw format, and implements
full preallocation.

Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 block/raw-posix.c | 61 ---
 1 file changed, 54 insertions(+), 7 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a363b71..44a1b4a 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1240,6 +1240,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 int fd;
 int result = 0;
 int64_t total_size = 0;
+PreallocMode prealloc = PREALLOC_MODE_OFF;
 
 strstart(filename, file:, filename);
 
@@ -1247,6 +1248,18 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 while (options  options-name) {
 if (!strcmp(options-name, BLOCK_OPT_SIZE)) {
 total_size = options-value.n  BDRV_SECTOR_MASK;
+} else if (!strcmp(options-name, BLOCK_OPT_PREALLOC)) {
+if (!options-value.s || !strcmp(options-value.s, off)) {
+prealloc = PREALLOC_MODE_OFF;
+} else if (!strcmp(options-value.s, metadata)) {
+prealloc = PREALLOC_MODE_METADATA;
+} else if (!strcmp(options-value.s, full)) {
+prealloc = PREALLOC_MODE_FULL;
+} else {
+error_setg(errp, Invalid preallocation mode: '%s',
+   options-value.s);
+return -EINVAL;
+}
 }
 options++;
 }
@@ -1256,16 +1269,45 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 if (fd  0) {
 result = -errno;
 error_setg_errno(errp, -result, Could not create file);
-} else {
-if (ftruncate(fd, total_size) != 0) {
-result = -errno;
-error_setg_errno(errp, -result, Could not resize file);
+goto out;
+}
+if (ftruncate(fd, total_size) != 0) {
+result = -errno;
+error_setg_errno(errp, -result, Could not resize file);
+goto out_close;
+}
+if (prealloc == PREALLOC_MODE_METADATA) {
+/* posix_fallocate() doesn't set errno. */
+result = -posix_fallocate(fd, 0, total_size);
+if (result != 0) {
+error_setg_errno(errp, -result,
+ Could not preallocate data for the new file);
 }
-if (qemu_close(fd) != 0) {
-result = -errno;
-error_setg_errno(errp, -result, Could not close the new file);
+} else if (prealloc == PREALLOC_MODE_FULL) {
+char *buf = g_malloc0(65536);
+int64_t num = 0, left = total_size;
+
+while (left  0) {
+num = MIN(left, 65536);
+result = write(fd, buf, num);
+if (result  0) {
+result = -errno;
+error_setg_errno(errp, -result,
+ Could not write to the new file);
+g_free(buf);
+goto out_close;
+}
+left -= num;
 }
+fsync(fd);
+g_free(buf);
+}
+out_close:
+if (qemu_close(fd) != 0) {
+result = -errno;
+error_setg_errno(errp, -result, Could not close the new file);
 }
+out:
 return result;
 }
 
@@ -1416,6 +1458,11 @@ static QEMUOptionParameter raw_create_options[] = {
 .type = OPT_SIZE,
 .help = Virtual disk size
 },
+{
+.name = BLOCK_OPT_PREALLOC,
+.type = OPT_STRING,
+.help = Preallocation mode (allowed values: off, metadata, full)
+},
 { NULL }
 };
 
-- 
1.8.5.2.229.g4448466




[Qemu-devel] [PATCH v8 2/4] raw, qcow2: don't convert file size to sector size

2014-04-09 Thread Hu Tao
and avoid converting it back later.

Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 block/qcow2.c | 8 
 block/raw-posix.c | 4 ++--
 block/raw-win32.c | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 3377007..b2ff9c3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1713,7 +1713,7 @@ static int qcow2_create2(const char *filename, int64_t 
total_size,
 }
 
 /* Okay, now that we have a valid image, let's give it the right size */
-ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
+ret = bdrv_truncate(bs, total_size);
 if (ret  0) {
 error_setg_errno(errp, -ret, Could not resize image);
 goto out;
@@ -1766,7 +1766,7 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 {
 const char *backing_file = NULL;
 const char *backing_fmt = NULL;
-uint64_t sectors = 0;
+uint64_t size = 0;
 int flags = 0;
 size_t cluster_size = DEFAULT_CLUSTER_SIZE;
 PreallocMode prealloc = PREALLOC_MODE_OFF;
@@ -1777,7 +1777,7 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 /* Read out options */
 while (options  options-name) {
 if (!strcmp(options-name, BLOCK_OPT_SIZE)) {
-sectors = options-value.n / 512;
+size = options-value.n  BDRV_SECTOR_MASK;
 } else if (!strcmp(options-name, BLOCK_OPT_BACKING_FILE)) {
 backing_file = options-value.s;
 } else if (!strcmp(options-name, BLOCK_OPT_BACKING_FMT)) {
@@ -1828,7 +1828,7 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 return -EINVAL;
 }
 
-ret = qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
+ret = qcow2_create2(filename, size, backing_file, backing_fmt, flags,
 cluster_size, prealloc, options, version, local_err);
 if (local_err) {
 error_propagate(errp, local_err);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 1688e16..a363b71 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1246,7 +1246,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 /* Read out options */
 while (options  options-name) {
 if (!strcmp(options-name, BLOCK_OPT_SIZE)) {
-total_size = options-value.n / BDRV_SECTOR_SIZE;
+total_size = options-value.n  BDRV_SECTOR_MASK;
 }
 options++;
 }
@@ -1257,7 +1257,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 result = -errno;
 error_setg_errno(errp, -result, Could not create file);
 } else {
-if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
+if (ftruncate(fd, total_size) != 0) {
 result = -errno;
 error_setg_errno(errp, -result, Could not resize file);
 }
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 48cb2c2..9fe60f3 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -486,7 +486,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 /* Read out options */
 while (options  options-name) {
 if (!strcmp(options-name, BLOCK_OPT_SIZE)) {
-total_size = options-value.n / 512;
+total_size = options-value.n  BDRV_SECTOR_MASK;
 }
 options++;
 }
@@ -498,7 +498,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options,
 return -EIO;
 }
 set_sparse(fd);
-ftruncate(fd, total_size * 512);
+ftruncate(fd, total_size);
 qemu_close(fd);
 return 0;
 }
-- 
1.8.5.2.229.g4448466




[Qemu-devel] [PATCH v8 1/4] qapi: introduce PreallocMode and a new PreallocMode full.

2014-04-09 Thread Hu Tao
This patch prepares for the subsequent patches.

Reviewed-by: Fam Zheng f...@redhat.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 block/qcow2.c|  8 
 qapi-schema.json | 14 ++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index e903d97..3377007 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1592,7 +1592,7 @@ static int preallocate(BlockDriverState *bs)
 
 static int qcow2_create2(const char *filename, int64_t total_size,
  const char *backing_file, const char *backing_format,
- int flags, size_t cluster_size, int prealloc,
+ int flags, size_t cluster_size, PreallocMode prealloc,
  QEMUOptionParameter *options, int version,
  Error **errp)
 {
@@ -1769,7 +1769,7 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 uint64_t sectors = 0;
 int flags = 0;
 size_t cluster_size = DEFAULT_CLUSTER_SIZE;
-int prealloc = 0;
+PreallocMode prealloc = PREALLOC_MODE_OFF;
 int version = 3;
 Error *local_err = NULL;
 int ret;
@@ -1790,9 +1790,9 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 }
 } else if (!strcmp(options-name, BLOCK_OPT_PREALLOC)) {
 if (!options-value.s || !strcmp(options-value.s, off)) {
-prealloc = 0;
+prealloc = PREALLOC_MODE_OFF;
 } else if (!strcmp(options-value.s, metadata)) {
-prealloc = 1;
+prealloc = PREALLOC_MODE_METADATA;
 } else {
 error_setg(errp, Invalid preallocation mode: '%s',
options-value.s);
diff --git a/qapi-schema.json b/qapi-schema.json
index 391356f..9e6221a 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4689,3 +4689,17 @@
   'btn' : 'InputBtnEvent',
   'rel' : 'InputMoveEvent',
   'abs' : 'InputMoveEvent' } }
+
+##
+# @PreallocMode
+#
+# Preallocation mode of QEMU image file
+#
+# @off: no preallocation
+# @metadata: preallocate only for metadata
+# @full: preallocate all data, including metadata
+#
+# Since 2.0
+##
+{ 'enum': 'PreallocMode',
+  'data': [ 'off', 'metadata', 'full' ] }
-- 
1.8.5.2.229.g4448466




[Qemu-devel] [PATCH v4 2/6] register: Add Register API

2014-04-09 Thread Peter Crosthwaite
This API provides some encapsulation of registers and factors our some
common functionality to common code. Bits of device state (usually MMIO
registers), often have all sorts of access restrictions and semantics
associated with them. This API allow you to define what those
restrictions are on a bit-by-bit basis.

Helper functions are then used to access the register which observe the
semantics defined by the RegisterAccessInfo struct.

Some features:
Bits can be marked as read_only (ro field)
Bits can be marked as write-1-clear (w1c field)
Bits can be marked as reserved (rsvd field)
Reset values can be defined (reset)
Bits can throw guest errors when written certain values (ge0, ge1)
Bits can throw unimp errors when written certain values (ui0, ui1)
Bits can be marked clear on read (cor)
Pre and post action callbacks can be added to read and write ops
Verbose debugging info can be enabled/disabled

Useful for defining device register spaces in a data driven way. Cuts
down on a lot of the verbosity and repetition in the switch-case blocks
in the standard foo_mmio_read/write functions.

Also useful for automated generation of device models from hardware
design sources.

Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
---
changed from v2:
Simplified! Removed pre-read, nwx, wo
Removed byte loops (Gerd Review)
Made data pointer optional
Added fast paths for simple registers
Moved into hw/core and include/hw (Paolo Review)
changed from v1:
Rebranded as the Register API - I think thats probably what it is.
Near total rewrite of implementation.
De-arrayified reset (this is client/Memory APIs job).
Moved out of bitops into its own file (Blue review)
Added debug, the register pointer, and prefix to a struct (Blue Review)
Made 64-bit to play friendlier with memory API (Blue review)
Made backend storage uint8_t (MST review)
Added read/write callbacks (Blue review)
Added ui0, ui1 (Blue review)
Moved re-purposed width (now byte width defining actual storage size)
Arrayified ge0, ge1 (ui0, ui1 too) and added .reason
Added wo field (not an April fools joke - this has genuine meaning here)
Added we mask to write accessor

 hw/core/Makefile.objs |   1 +
 hw/core/register.c| 197 ++
 include/hw/register.h | 131 +
 3 files changed, 329 insertions(+)
 create mode 100644 hw/core/register.c
 create mode 100644 include/hw/register.h

diff --git a/hw/core/Makefile.objs b/hw/core/Makefile.objs
index 5377d05..7839f46 100644
--- a/hw/core/Makefile.objs
+++ b/hw/core/Makefile.objs
@@ -4,6 +4,7 @@ common-obj-y += fw-path-provider.o
 # irq.o needed for qdev GPIO handling:
 common-obj-y += irq.o
 common-obj-y += hotplug.o
+common-obj-y += register.o
 
 common-obj-$(CONFIG_EMPTY_SLOT) += empty_slot.o
 common-obj-$(CONFIG_XILINX_AXI) += stream.o
diff --git a/hw/core/register.c b/hw/core/register.c
new file mode 100644
index 000..96712f2
--- /dev/null
+++ b/hw/core/register.c
@@ -0,0 +1,197 @@
+/*
+ * Register Definition API
+ *
+ * Copyright (c) 2013 Xilinx Inc.
+ * Copyright (c) 2013 Peter Crosthwaite peter.crosthwa...@xilinx.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include hw/register.h
+#include qemu/log.h
+
+static inline void register_write_log(RegisterInfo *reg, int dir, uint64_t val,
+  int mask, const char *msg,
+  const char *reason)
+{
+qemu_log_mask(mask, %s:%s bits %# PRIx64  %s write of %d%s%s\n,
+  reg-prefix, reg-access-name, val, msg, dir,
+  reason ? :  : , reason ? reason : );
+}
+
+static inline void register_write_val(RegisterInfo *reg, uint64_t val)
+{
+if (!reg-data) {
+return;
+}
+switch (reg-data_size) {
+case 1:
+*(uint8_t *)reg-data = val;
+break;
+case 2:
+*(uint16_t *)reg-data = val;
+break;
+case 4:
+*(uint32_t *)reg-data = val;
+break;
+case 8:
+*(uint64_t *)reg-data = val;
+break;
+default:
+abort();
+}
+}
+
+static inline uint64_t register_read_val(RegisterInfo *reg)
+{
+switch (reg-data_size) {
+case 1:
+return *(uint8_t *)reg-data;
+case 2:
+return *(uint16_t *)reg-data;
+case 4:
+return *(uint32_t *)reg-data;
+case 8:
+return *(uint64_t *)reg-data;
+default:
+abort();
+}
+return 0; /* unreachable */
+}
+
+void register_write(RegisterInfo *reg, uint64_t val, uint64_t we)
+{
+uint64_t old_val, new_val, test, no_w_mask;
+const RegisterAccessInfo *ac;
+const RegisterAccessError *rae;
+
+assert(reg);
+
+ac = reg-access;
+if (reg-write_lite  !~we) { /* fast path!! */
+new_val = val;
+goto register_write_fast;
+}
+
+old_val = reg-data ? register_read_val(reg) : ac-reset;
+

[Qemu-devel] [PATCH v4 3/6] register: Add Memory API glue

2014-04-09 Thread Peter Crosthwaite
Add memory io handlers that glue the register API to the memory API.
Just translation functions at this stage. Although it does allow for
devices to be created without all-in-one mmio r/w handlers.

Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
---
changed from v2:
Added fast path to register_write_memory to skip endianness bitbashing

 hw/core/register.c| 48 
 include/hw/register.h | 12 
 2 files changed, 60 insertions(+)

diff --git a/hw/core/register.c b/hw/core/register.c
index 96712f2..29e5f04 100644
--- a/hw/core/register.c
+++ b/hw/core/register.c
@@ -195,3 +195,51 @@ void register_reset(RegisterInfo *reg)
 
 register_write_val(reg, reg-access-reset);
 }
+
+static inline void register_write_memory(void *opaque, hwaddr addr,
+ uint64_t value, unsigned size, bool 
be)
+{
+RegisterInfo *reg = opaque;
+uint64_t we = ~0;
+int shift = 0;
+
+if (reg-data_size != size) {
+we = (size == 8) ? ~0ull : (1ull  size * 8) - 1;
+shift = 8 * (be ? reg-data_size - size - addr : addr);
+}
+
+assert(size + addr = reg-data_size);
+register_write(reg, value  shift, we  shift);
+}
+
+void register_write_memory_be(void *opaque, hwaddr addr, uint64_t value,
+  unsigned size)
+{
+register_write_memory(opaque, addr, value, size, true);
+}
+
+
+void register_write_memory_le(void *opaque, hwaddr addr, uint64_t value,
+  unsigned size)
+{
+register_write_memory(opaque, addr, value, size, false);
+}
+
+static inline uint64_t register_read_memory(void *opaque, hwaddr addr,
+unsigned size, bool be)
+{
+RegisterInfo *reg = opaque;
+int shift = 8 * (be ? reg-data_size - size - addr : addr);
+
+return register_read(reg)  shift;
+}
+
+uint64_t register_read_memory_be(void *opaque, hwaddr addr, unsigned size)
+{
+return register_read_memory(opaque, addr, size, true);
+}
+
+uint64_t register_read_memory_le(void *opaque, hwaddr addr, unsigned size)
+{
+return register_read_memory(opaque, addr, size, false);
+}
diff --git a/include/hw/register.h b/include/hw/register.h
index f747a7f..c595ed6 100644
--- a/include/hw/register.h
+++ b/include/hw/register.h
@@ -87,6 +87,8 @@ struct RegisterAccessInfo {
  * @prefix: String prefix for log and debug messages
  *
  * @opaque: Opaque data for the register
+ *
+ * @mem: optional Memory region for the register
  */
 
 struct RegisterInfo {
@@ -102,6 +104,8 @@ struct RegisterInfo {
 /* private */
 bool read_lite;
 bool write_lite;
+
+MemoryRegion mem;
 };
 
 /**
@@ -128,4 +132,12 @@ uint64_t register_read(RegisterInfo *reg);
 
 void register_reset(RegisterInfo *reg);
 
+void register_write_memory_be(void *opaque, hwaddr addr, uint64_t value,
+  unsigned size);
+void register_write_memory_le(void *opaque, hwaddr addr, uint64_t value,
+  unsigned size);
+
+uint64_t register_read_memory_be(void *opaque, hwaddr addr, unsigned size);
+uint64_t register_read_memory_le(void *opaque, hwaddr addr, unsigned size);
+
 #endif
-- 
1.9.1.1.gbb9f595




[Qemu-devel] [PATCH v4 0/6] Data Driven device registers + Zynq DEVCFG

2014-04-09 Thread Peter Crosthwaite
Hi All. This is a new scheme i've come up with handling device registers in a
data driven way. My motivation for this is to factor out a lot of the access
checking that seems to be replicated in every device. See P2 commit message for
further discussion.

P1 is a trivial addition to bitops.h
P2 is the main patch, adds the register definition functionality
P3 adds helpers that glue the register API to the Memory API
P4 is an example new device (the Xilinx Zynq devcfg) that uses this scheme.
P5 adds devcfg to the Zynq machine model

This devcfg device was particularly finnicky with per-bit restrictions which
prompted all this. I'm also looking for a higher-than-usual modelling fidelity
on the register space, with semantics defined for random reserved bits
in-between otherwise consistent fields.

Here's an example of the qemu_log output for the devcfg device. This is produced
by now generic sharable code:

/machine/unattached/device[44]:Addr 0x08:CFG: write of value 0508
/machine/unattached/device[44]:Addr 0x80:MCTRL: write of value 00800010
/machine/unattached/device[44]:Addr 0x10:INT_MASK: write of value 
/machine/unattached/device[44]:Addr :CTRL: write of value 0c00607f

And an example of a rogue guest banging on a bad bit:

/machine/unattached/device[44]:Addr 0x14:STATUS bits 0x01 may not be \
written to 1

Changed from v3:
Rebased (its been a while)
Added reserved bits.
Cleaner separation of decode and access components (Patch 3)
Changed from v2:
Fixed for hw/ re-orginisation (Paolo review)
Simplified and optimized (PMM and Gerd review)
Changed from v1:
Added ONES macro patch
Dropped bogus former patch 1 (PMM review)
Addressed Blue, Gerd and MST comments.
Simplified to be more Memory API compatible.
Added Memory API helpers.
Please see discussion already on list and commit msgs for more detail.


Peter Crosthwaite (6):
  bitops: Add ONES macro
  register: Add Register API
  register: Add Memory API glue
  register: Add support for decoding information
  xilinx_devcfg: Zynq devcfg device model
  xilinx_zynq: added devcfg to machine model

 default-configs/arm-softmmu.mak |   1 +
 hw/arm/xilinx_zynq.c|   8 +
 hw/core/Makefile.objs   |   1 +
 hw/core/register.c  | 245 +
 hw/dma/Makefile.objs|   1 +
 hw/dma/xilinx_devcfg.c  | 462 
 include/hw/register.h   | 164 ++
 include/qemu/bitops.h   |   2 +
 8 files changed, 884 insertions(+)
 create mode 100644 hw/core/register.c
 create mode 100644 hw/dma/xilinx_devcfg.c
 create mode 100644 include/hw/register.h

-- 
1.9.1.1.gbb9f595




[Qemu-devel] [PATCH v4 4/6] register: Add support for decoding information

2014-04-09 Thread Peter Crosthwaite
Allow defining of optional address decoding information in register
definitions. This is useful for clients that want to associate
registers with specific addresses.

Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
---

 include/hw/register.h | 21 +
 1 file changed, 21 insertions(+)

diff --git a/include/hw/register.h b/include/hw/register.h
index c595ed6..d7f602f 100644
--- a/include/hw/register.h
+++ b/include/hw/register.h
@@ -15,6 +15,7 @@
 
 typedef struct RegisterInfo RegisterInfo;
 typedef struct RegisterAccessInfo RegisterAccessInfo;
+typedef struct RegisterDecodeInfo RegisterDecodeInfo;
 
 /**
  * A register access error message
@@ -54,6 +55,11 @@ typedef struct RegisterAccessError {
  * allowing this function to modify the value before return to the client.
  */
 
+#define REG_DECODE_READ (1  0)
+#define REG_DECODE_WRITE (1  1)
+#define REG_DECODE_EXECUTE (1  2)
+#define REG_DECODE_RW (REG_DECODE_READ | REG_DECODE_WRITE)
+
 struct RegisterAccessInfo {
 const char *name;
 uint64_t ro;
@@ -71,6 +77,14 @@ struct RegisterAccessInfo {
 void (*post_write)(RegisterInfo *reg, uint64_t val);
 
 uint64_t (*post_read)(RegisterInfo *reg, uint64_t val);
+
+size_t storage;
+int data_size;
+
+struct {
+hwaddr addr;
+uint8_t flags;
+} decode;
 };
 
 /**
@@ -108,6 +122,13 @@ struct RegisterInfo {
 MemoryRegion mem;
 };
 
+
+struct RegisterDecodeInfo {
+RegisterInfo *reg;
+hwaddr addr;
+unsigned len;
+};
+
 /**
  * write a value to a register, subject to its restrictions
  * @reg: register to write to
-- 
1.9.1.1.gbb9f595




Re: [Qemu-devel] [V3 PATCH 1/2] target-ppc: Fix target_disas

2014-04-09 Thread Peter Maydell
On 8 April 2014 20:26, Tom Musta tommu...@gmail.com wrote:
 Inspect only bit 16 for the Little Endian test.  Correct comment preceding
 the target_disas() function.

 Signed-off-by: Tom Musta tommu...@gmail.com

Reviewed-by: Peter Maydell peter.mayd...@linaro.org



[Qemu-devel] [PATCH v8 0/4] qemu-img: add preallocation=full

2014-04-09 Thread Hu Tao
The purpose of this series is to use posix_fallocate() when creating
img file to ensure there are disk space for it which is way fast than
acturally writing to disk. But this only works in file system level.
For cases like thin provisioning, an option full preallocation is
added to write zeros to storage to ensure disk space.

Hu Tao (4):
  qapi: introduce PreallocMode and a new PreallocMode full.
  raw, qcow2: don't convert file size to sector size
  raw-posix: Add full image preallocation option
  qcow2: Add full image preallocation option

 block/qcow2.c  | 95 --
 block/raw-posix.c  | 63 ++
 block/raw-win32.c  |  4 +-
 qapi-schema.json   | 14 +++
 tests/qemu-iotests/082.out | 54 +-
 5 files changed, 182 insertions(+), 48 deletions(-)

-- 
1.8.5.2.229.g4448466




Re: [Qemu-devel] [V3 PATCH 2/2] monitor: QEMU Monitor Instruction Disassembly Incorrect for PowerPC LE Mode

2014-04-09 Thread Peter Maydell
On 8 April 2014 20:26, Tom Musta tommu...@gmail.com wrote:
 The monitor support for disassembling instructions does not honor the MSR[LE]
 bit for PowerPC processors.

 This change enhances the monitor_disas() routine by supporting a flag bit
 for Little Endian mode.  Bit 16 is used since that bit was used in the
 analagous guest disassembly routine target_disas().

 Also, to be consistent with target_disas(), the disassembler bfd_mach field
 can be passed in the flags argument.

 Reported-by: Anton Blanchard an...@samba.org
 Signed-off-by: Tom Musta tommu...@gmail.com
 ---

 V2: Look specifically at bit 16 for LE.  Support machine configuration in 
 flags.

 V3: Changed documentation of 'flags' argument to simply refer to the 
 target_disas
   description (per Peter Maydell's review).

 The bug can be easily observed by dumping the first few instructions of an
 interrupt vector (0x300 is the Data Storage Interrupt handler in PPC)

   (qemu) xp/8i 0x300
   0x0300:  .long 0x60
   0x0304:  lhzur18,-19843(r3)
   0x0308:  .long 0x60
   0x030c:  lhzur18,-20099(r2)
   0x0310:  lwz r0,11769(0)
   0x0314:  lhzur23,8317(r2)
   0x0318:  .long 0x7813427c
   0x031c:  lbz r0,19961(0)

 With the patch applied, the disassembly now looks correct:

   (qemu) xp/8i 0x300
   0x0300:  nop
   0x0304:  mtsprg  2,r13
   0x0308:  nop
   0x030c:  mfsprg  r13,1
   0x0310:  std r9,128(r13)
   0x0314:  mfspr   r9,896
   0x0318:  mr  r2,r2
   0x031c:  std r10,136(r13)
  disas.c   |   14 --
  monitor.c |4 
  2 files changed, 16 insertions(+), 2 deletions(-)

 diff --git a/disas.c b/disas.c
 index a427d18..f300102 100644
 --- a/disas.c
 +++ b/disas.c
 @@ -445,6 +445,8 @@ monitor_fprintf(FILE *stream, const char *fmt, ...)
  return 0;
  }

 +/* Disassembler for the monitor.  See target_disas for a description of 
 'flags'.
 + */

You could just put that */ on the previous line, right? Or is it too
long if you do?

  void monitor_disas(Monitor *mon, CPUArchState *env,
 target_ulong pc, int nb_insn, int is_physical, int flags)
  {
 @@ -485,11 +487,19 @@ void monitor_disas(Monitor *mon, CPUArchState *env,
  s.info.mach = bfd_mach_sparc_v9b;
  #endif
  #elif defined(TARGET_PPC)
 +if (flags  0x) {
 +/* If we have a precise definitions of the instructions set, use it 
 */

definition, instruction.

Otherwise:
Reviewed-by: Peter Maydell peter.mayd...@linaro.org

thanks
-- PMM



[Qemu-devel] [PATCH v4 1/6] bitops: Add ONES macro

2014-04-09 Thread Peter Crosthwaite
Little macro that just gives you N ones (justified to LSB).

Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
---

 include/qemu/bitops.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h
index 340b1e7..b9fd60f 100644
--- a/include/qemu/bitops.h
+++ b/include/qemu/bitops.h
@@ -403,4 +403,6 @@ static inline uint64_t deposit64(uint64_t value, int start, 
int length,
 return (value  ~mask) | ((fieldval  start)  mask);
 }
 
+#define ONES(num) ((num) == 64 ? ~0ull : (1ull  (num)) - 1)
+
 #endif
-- 
1.9.1.1.gbb9f595




Re: [Qemu-devel] [PATCH for-2.0] configure: add option to disable -fstack-protector flags

2014-04-09 Thread Peter Maydell
On 8 April 2014 21:47, Noonan, Steven snoo...@amazon.com wrote:
 On Tue, Apr 08, 2014 at 09:37:27PM +0100, Peter Maydell wrote:
 This is bad because we have that framework argument as part of our
 linker flags. Effectively this means that clang won't warn about the
 argument at link time but will warn for every .c-.o compile (as well
 as ending up with no stack protection).

 So -framework is designed to transparently add the appropriate -I and
 -L/-l flags, pointing to the insides of a .framework bundle.

 To me, the -framework arguments belong in CFLAGS and LIBS, but not
 LDFLAGS. In the context of QEMU's configure script, I think it'd be
 QEMU_INCLUDES and LIBS.

Unfortunately, putting -framework CoreFoundation in CFLAGS
produces a different warning:

manooth$ clang -o /tmp/zz9.o -Werror -fstack-protector -c /tmp/zz9.c
-framework CoreFoundation
clang: error: -framework CoreFoundation: 'linker' input unused

which would seem to imply that you shouldn't be passing it on
the .c-.o compile command line.

thanks
-- PMM



[Qemu-devel] [PATCH v8 4/4] qcow2: Add full image preallocation option

2014-04-09 Thread Hu Tao
This adds a preallocation=full mode to qcow2 image creation, which
creates a non-sparse image file.

Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 block/qcow2.c  | 79 --
 tests/qemu-iotests/082.out | 54 +++
 2 files changed, 103 insertions(+), 30 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index b2ff9c3..1b0034d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1596,6 +1596,7 @@ static int qcow2_create2(const char *filename, int64_t 
total_size,
  QEMUOptionParameter *options, int version,
  Error **errp)
 {
+QEMUOptionParameter *alloc_options = NULL;
 /* Calculate cluster_bits */
 int cluster_bits;
 cluster_bits = ffs(cluster_size) - 1;
@@ -1625,10 +1626,78 @@ static int qcow2_create2(const char *filename, int64_t 
total_size,
 Error *local_err = NULL;
 int ret;
 
+if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_METADATA) {
+int64_t meta_size = 0;
+unsigned nreftablee, nrefblocke, nl1e, nl2e;
+BlockDriver *drv;
+
+total_size = align_offset(total_size, cluster_size);
+
+drv = bdrv_find_protocol(filename, true);
+if (drv == NULL) {
+error_setg(errp, Could not find protocol for file '%s', 
filename);
+return -ENOENT;
+}
+
+alloc_options = append_option_parameters(alloc_options,
+ drv-create_options);
+alloc_options = append_option_parameters(alloc_options, options);
+
+/* header: 1 cluster */
+meta_size += cluster_size;
+
+/* total size of L2 tables */
+nl2e = total_size / cluster_size;
+nl2e = align_offset(nl2e, cluster_size / sizeof(uint64_t));
+meta_size += nl2e * sizeof(uint64_t);
+
+/* total size of L1 tables */
+nl1e = nl2e * sizeof(uint64_t) / cluster_size;
+nl1e = align_offset(nl1e, cluster_size / sizeof(uint64_t));
+meta_size += nl1e * sizeof(uint64_t);
+
+/* total size of refcount blocks
+ *
+ * note: every host cluster is reference-counted, including metadata
+ * (even refcount blocks are recursively included).
+ * Let:
+ *   a = total_size (this is the guest disk size)
+ *   m = meta size not including refcount blocks and refcount tables
+ *   c = cluster size
+ *   y1 = number of refcount blocks entries
+ *   y2 = meta size including everything
+ * then,
+ *   y1 = (y2 + a)/c
+ *   y2 = y1 * sizeof(u16) + y1 * sizeof(u16) * sizeof(u64) / c + m
+ * we can get y1:
+ *   y1 = (a + m) / (c - sizeof(u16) - sizeof(u16) * sizeof(u64) / c)
+ */
+nrefblocke = (total_size + meta_size + cluster_size) /
+(cluster_size - sizeof(uint16_t) -
+ 1.0 * sizeof(uint16_t) * sizeof(uint64_t) / cluster_size);
+nrefblocke = align_offset(nrefblocke, cluster_size / sizeof(uint16_t));
+meta_size += nrefblocke * sizeof(uint16_t);
+
+/* total size of refcount tables */
+nreftablee = nrefblocke * sizeof(uint16_t) / cluster_size;
+nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t));
+meta_size += nreftablee * sizeof(uint64_t);
+
+set_option_parameter_int(alloc_options, BLOCK_OPT_SIZE,
+ total_size + meta_size);
+if (prealloc == PREALLOC_MODE_FULL) {
+set_option_parameter(alloc_options, BLOCK_OPT_PREALLOC, full);
+} else if (prealloc == PREALLOC_MODE_METADATA) {
+set_option_parameter(alloc_options, BLOCK_OPT_PREALLOC, 
metadata);
+}
+
+options = alloc_options;
+}
+
 ret = bdrv_create_file(filename, options, local_err);
 if (ret  0) {
 error_propagate(errp, local_err);
-return ret;
+goto out_options;
 }
 
 bs = NULL;
@@ -1636,7 +1705,7 @@ static int qcow2_create2(const char *filename, int64_t 
total_size,
 NULL, local_err);
 if (ret  0) {
 error_propagate(errp, local_err);
-return ret;
+goto out_options;
 }
 
 /* Write the header */
@@ -1758,6 +1827,8 @@ out:
 if (bs) {
 bdrv_unref(bs);
 }
+out_options:
+free_option_parameters(alloc_options);
 return ret;
 }
 
@@ -1793,6 +1864,8 @@ static int qcow2_create(const char *filename, 
QEMUOptionParameter *options,
 prealloc = PREALLOC_MODE_OFF;
 } else if (!strcmp(options-value.s, metadata)) {
 prealloc = PREALLOC_MODE_METADATA;
+} else if (!strcmp(options-value.s, full)) {
+prealloc = PREALLOC_MODE_FULL;
 } else {
 error_setg(errp, Invalid preallocation mode: '%s',
options-value.s);
@@ -2358,7 +2431,7 @@ static 

[Qemu-devel] [PATCH v4 6/6] xilinx_zynq: added devcfg to machine model

2014-04-09 Thread Peter Crosthwaite
Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
---
Changed since v3:
Author reset.
Changed since v1:
Added manual parenting of devcfg node (evil but needed for early access
to canonical path by devcfgs realize fn).

 hw/arm/xilinx_zynq.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 9ee21e7..e3505fa 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -231,6 +231,14 @@ static void zynq_init(QEMUMachineInitArgs *args)
 sysbus_connect_irq(busdev, n + 1, pic[dma_irqs[n] - IRQ_OFFSET]);
 }
 
+dev = qdev_create(NULL, xlnx.ps7-dev-cfg);
+object_property_add_child(qdev_get_machine(), xilinx-devcfg, OBJECT(dev),
+  NULL);
+qdev_init_nofail(dev);
+busdev = SYS_BUS_DEVICE(dev);
+sysbus_connect_irq(busdev, 0, pic[40-IRQ_OFFSET]);
+sysbus_mmio_map(busdev, 0, 0xF8007000);
+
 zynq_binfo.ram_size = ram_size;
 zynq_binfo.kernel_filename = kernel_filename;
 zynq_binfo.kernel_cmdline = kernel_cmdline;
-- 
1.9.1.1.gbb9f595




Re: [Qemu-devel] [PULL for-2.0] dsdt: tweak ACPI ID for hotplug resource device

2014-04-09 Thread Igor Mammedov
On Tue,  8 Apr 2014 15:24:18 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 ACPI0004 seems too new:
 Windows XP complains about an unrecognized device.
 This is a regression since 1.7.
 Use PNP0A06 instead - Generic Container Device.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 Reviewed-By: Igor Mammedov imamm...@redhat.com
 ---
  hw/i386/acpi-dsdt-cpu-hotplug.dsl | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/hw/i386/acpi-dsdt-cpu-hotplug.dsl 
 b/hw/i386/acpi-dsdt-cpu-hotplug.dsl
 index dee4843..34aab5a 100644
 --- a/hw/i386/acpi-dsdt-cpu-hotplug.dsl
 +++ b/hw/i386/acpi-dsdt-cpu-hotplug.dsl
 @@ -93,7 +93,7 @@ Scope(\_SB) {
  }
  
  Device(CPU_HOTPLUG_RESOURCE_DEVICE) {
 -Name(_HID, ACPI0004)
 +Name(_HID, EisaId(PNP0A06))
  
  Name(_CRS, ResourceTemplate() {
  IO(Decode16, CPU_STATUS_BASE, CPU_STATUS_BASE, 0, CPU_STATUS_LEN)

There is a need to update test cases with new DSDTs, since it breaks them.



[Qemu-devel] [Bug 1242963] Re: QEMU loadvm causes guest OS freeze

2014-04-09 Thread NanothylL
I found something, hope this could be useful.

(qemu) info cpus
* CPU #0: pc=0x8054b7f1 thread_id=20447
(qemu) savevm x
(qemu) loadvm x
(qemu) info cpus
* CPU #0: pc=0x806f69ba (halted) thread_id=20447

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1242963

Title:
  QEMU loadvm causes guest OS freeze

Status in QEMU:
  New

Bug description:
  HOST: ubuntu 13.10 x64
  GUEST: winxp sp 3 x86

  AFFECT QEMU(tested): v1.5.2, v1.5.3,  v1.6.0, v1.6.1

  I compile QEMU by myself with ./configure --target-list=i386-softmmu  make 
 make install.
  After installing a winxp sp3 into the qemu-system-i386 with command line:
   qemu-system-i386 -m 512 -hda xp.img -net user -net nic,model=rtl8139 -rtc 
base=localtime,clock=vm

  I use monitor to create a live snapshot: 
   stop
   savevm xxx
   cont

  And then I load this snapshot (I also try it in commad line: -loadvm xxx):
   loadvm xxx
   cont

  After that, the windows system is freeze (don't accept any keyboard or
  mouse input, although I knew vcpu is still working).

  If I compile with -enable-kvm and launch qemu-system-i386 with -enable-kvm, 
it looks like everything works well.
  I think it is a bug for qemu system.

  BTW: freeze is not appearing 100%, but in my test, 95% cases would
  cause system freeze.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1242963/+subscriptions



Re: [Qemu-devel] [RFC PATCH] target-ppc: enable migration within the same CPU family

2014-04-09 Thread Alexander Graf


 Am 09.04.2014 um 02:41 schrieb Alexey Kardashevskiy a...@ozlabs.ru:
 
 On 04/09/2014 12:59 AM, Alexander Graf wrote:
 On 04/08/2014 02:19 PM, Michael Mueller wrote:
 On Tue, 08 Apr 2014 21:47:39 +1000
 Alexey Kardashevskiy a...@ozlabs.ru wrote:
 
 On 04/08/2014 08:32 PM, Michael Mueller wrote:
 On Tue, 08 Apr 2014 20:04:42 +1000
 Alexey Kardashevskiy a...@ozlabs.ru wrote:
 
 On 04/08/2014 07:47 PM, Michael Mueller wrote:
 On Tue, 08 Apr 2014 11:23:14 +1000
 Alexey Kardashevskiy a...@ozlabs.ru wrote:
 
 On 04/08/2014 04:53 AM, Andreas Färber wrote:
 Am 07.04.2014 05:27, schrieb Alexey Kardashevskiy:
 On 04/04/2014 11:28 PM, Alexander Graf wrote:
 On 04/04/2014 07:17 AM, Alexey Kardashevskiy wrote:
 On 03/24/2014 04:28 PM, Alexey Kardashevskiy wrote:
 Currently only migration fails if CPU version is different even
 a bit.
 For example, migration from POWER7 v2.0 to POWER7 v2.1 fails
 because of
 that. Since there is no difference between CPU versions which
 could
 affect migration stream, we can safely enable it.
 
 This adds a helper to find the closest POWERPC family class
 (i.e. first
 abstract class in hierarchy).
 
 This replaces VMSTATE_UINTTL_EQUAL statement with a custom
 handler which
 checks if the source and destination CPUs belong to the same
 family and
 fails if they are not.
 
 This adds a PVR reset to the default value as it will be
 overwritten
 by VMSTATE_UINTTL_ARRAY(env.spr, PowerPCCPU, 1024).
 
 Since the actual migration format is not changed by this patch,
 @version_id of vmstate_ppc_cpu does not have to be changed either.
 
 Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 Ping?
 Can't we just always allow migration to succeed? It's a problem
 of the tool
 stack above if it allows migration to an incompatible host, no?
 This is not how libvirt works. It simply sends the source XML,
 reconstructs
 a guest on the destination side and then migrates. hoping that the
 migration will fail is something (which only QEMU has knowledge
 of) is
 incompatible. The new guest will start with -cpu host (as the
 source) but
 it will create diffrent CPU class and do different things. If we
 do not
 check PVR (and cpu_dt_id and chip_id - the latter is coming soon) and
 migrate power8-power7, we can easily get a broken guest.
 The response is very simple: -cpu host is not supported for migration.
 Same as for x86 hosts.
 Is there any good reason to limit ourselves on POWERPC?
 
 As you say, the domain config is transferred by libvirt:
 If you use -cpu POWER7, you can migrate from POWER7 to POWER8 and
 back;
 if you use -cpu POWER8, you can only migrate on POWER8.
 -cpu other that host is not supported by HV KVM, only compat which
 upstream QEMU does not have yet. So you are saying that the
 migration is
 not supported by upstream QEMU for at least SPAPR. Well, ok, it is dead
 anyway so I am fine :)
 With s390x we have a similar situation. Thus we came up with a
 mechanism to limit
 the CPU functionality of a possible target system. Our patch
 implements CPU models
 based on TYPE and GA like 2817-ga1, etc. (GA represents a CPU
 facility set and an IBC
 value (Instruction Blocking Control, reduces the instruction set to
 the requested
 level)) When a guest is started, it receives its CPU model by means
 of option -cpu.
 host equates the configuration of the current system. We
 implemented query-cpu-model
 returning the actual model, here maybe { name: 2817-ga1 }. To find
 a suitable
 migration target in a remote CEC, libvirt has to
 query-cpu-definitions returning a
 list of models supported by the target system {{name: 2827-ga2},
 {name: 2827-ga1},
 {name: 2817-ga2},...]. A match means the system is suitable and can
 be used
 as migration target.
 Sorry, I do not follow you. You hacked libvirt to run the destination
 QEMU
 with a specific CPU model? Or it is in QEMU? Where? What I see now is
 this:
 
 static const VMStateDescription vmstate_s390_cpu = {
 .name = cpu,
 .unmigratable = 1,
 };
 
 Does not look like it supports migration :) Thanks!
 The code you're missing is not upstream yet. The s390x guest can be
 migrated in the meantime.
 Yes, libvirt currently gets an extension to be able to identify and
 startup suitable migration
 targets for s390x on behalf of the mentioned qemu cpu model. BTW can
 you point me to the above
 mentioned SPAPR stuff...
 
 Mmm. What stuff? :) At the moment POWERPC guests migrate if PVR (processor
 version register) value is exactly the same. I am trying to relax this
 limitation to any version within same CPU family, like power7 v1.0 and
 v2.1.
 With stuff I referred to to term sPAPR not realizing it relates to
 the Power Architecture Platform Requirements, got it now. :-)
 
 I see, ppc currently has this limitation to enforce compatibility
 VMSTATE_UINTTL_EQUAL(env.spr[SPR_PVR], PowerPCCPU),
 
 Yes, but the s390 approach is a lot cleaner and I'd rather like to move
 into that direction.
 
 Then we 

Re: [Qemu-devel] [PATCH v1 1/1] vl.c: Allow sysbus devices to be attached via commandline

2014-04-09 Thread Markus Armbruster
Alistair Francis alistair.fran...@xilinx.com writes:

 On Wed, Apr 9, 2014 at 11:28 AM, Peter Crosthwaite
 peter.crosthwa...@xilinx.com wrote:
 On Wed, Mar 26, 2014 at 10:47 AM, Alistair Francis
 alistair.fran...@xilinx.com wrote:
 This patch introduces a new command line argument that allows
 sysbus devices to be attached via the command line.

 This allows devices to be added after the machine init but
 before anything is booted

 The new argument is -sysbusdev

 A new argument is being used to avoid confusion and user
 errors that would appear if the same command for hot-plugging
 was used instead. This could be changed to use -device and
 specify sysbus by using something like 'bus=sysbus' if that
 is preferred


 Can you be more specific about the confusion issue you are seeing? If
 you went for -device bus=root (meaning the default singleton root
 sysbus), where are the possible issues?


 Using -device would be possible (and easy to implement). The reason I
 started with
 a different argument name was just to avoid confusion and to make sure
 I didn't interfere with hot-pluggable devices. I still feel that separating
 hot-pluggable devices from bus devices is a good idea. They are initialised
 in slightly different parts of the code (although this wouldn't
 be hard to fix). The name 'sysbusdev' is confusing though, as it doesn't
 have to connect to the sysbus.

-device / device_add already covers both hot plug and cold plug.  In
fact, only device_add can hot plug; all -device ever does is cold plug.

Plugging in a device involves wiring it up.  Hot plug additionally
involves communication with guest software, but let's ignore that for
now.

-device uses generic wiring code, guided by configuration: property
bus.  This suffices for straightforward cases like PCI and USB
devices.  It doesn't for sysbus devices, because these may require
arbitrarily complex wirings.  Therefore, sysbus devices are unavailable
with -device; see commit 837d371.

What we need is a way for configuration to guide more general wiring.
Perhaps -device can be extended.  If that turns out to be impractical,
we need something more expressive that also covers everything -device
does now, and can cover everything eventually.  Andreas may have ideas
here.

What we don't need, in my opinion, is more special-case options :)

[...]



[Qemu-devel] [PATCH v4 5/6] xilinx_devcfg: Zynq devcfg device model

2014-04-09 Thread Peter Crosthwaite
Minimal device model for devcfg module of Zynq. DMA capabilities and
interrupt generation supported.

Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
---
Changed since v3:
Stylistic updates.
Changed over to new decoding scheme.
Use .rsvd in definitions as appropriate.
Author reset (s/petalogix/xilinx).
Changed since v2:
Some QOM styling updates.
Re-implemented nw0 for lock register as pre_write
Changed since v1:
Rebased against new version of Register API.
Use action callbacks for side effects rather than switch.
Documented reasons for ge0, ge1 (Verbatim from TRM)
Added ui1 definitions for unimplemented major features
Removed dead lock code

 default-configs/arm-softmmu.mak |   1 +
 hw/dma/Makefile.objs|   1 +
 hw/dma/xilinx_devcfg.c  | 462 
 3 files changed, 464 insertions(+)
 create mode 100644 hw/dma/xilinx_devcfg.c

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index f3513fa..e0e518c 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -62,6 +62,7 @@ CONFIG_PXA2XX=y
 CONFIG_BITBANG_I2C=y
 CONFIG_FRAMEBUFFER=y
 CONFIG_XILINX_SPIPS=y
+CONFIG_ZYNQ_DEVCFG=y
 
 CONFIG_ARM11SCU=y
 CONFIG_A9SCU=y
diff --git a/hw/dma/Makefile.objs b/hw/dma/Makefile.objs
index 0e65ed0..96025cf 100644
--- a/hw/dma/Makefile.objs
+++ b/hw/dma/Makefile.objs
@@ -5,6 +5,7 @@ common-obj-$(CONFIG_PL330) += pl330.o
 common-obj-$(CONFIG_I82374) += i82374.o
 common-obj-$(CONFIG_I8257) += i8257.o
 common-obj-$(CONFIG_XILINX_AXI) += xilinx_axidma.o
+common-obj-$(CONFIG_ZYNQ_DEVCFG) += xilinx_devcfg.o
 common-obj-$(CONFIG_ETRAXFS) += etraxfs_dma.o
 common-obj-$(CONFIG_STP2000) += sparc32_dma.o
 common-obj-$(CONFIG_SUN4M) += sun4m_iommu.o
diff --git a/hw/dma/xilinx_devcfg.c b/hw/dma/xilinx_devcfg.c
new file mode 100644
index 000..c85f808
--- /dev/null
+++ b/hw/dma/xilinx_devcfg.c
@@ -0,0 +1,462 @@
+/*
+ * QEMU model of the Xilinx Devcfg Interface
+ *
+ * (C) 2011 PetaLogix Pty Ltd
+ * (C) 2014 Xilinx Inc.
+ * Written by Peter Crosthwaite peter.crosthwa...@xilinx.com
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include sysemu/sysemu.h
+#include sysemu/dma.h
+#include hw/sysbus.h
+#include hw/ptimer.h
+#include qemu/bitops.h
+#include hw/register.h
+#include qemu/bitops.h
+
+#define TYPE_XILINX_DEVCFG xlnx.ps7-dev-cfg
+
+#define XILINX_DEVCFG(obj) \
+OBJECT_CHECK(XilinxDevcfg, (obj), TYPE_XILINX_DEVCFG)
+
+/* FIXME: get rid of hardcoded nastiness */
+
+#define FREQ_HZ 9
+
+#define BTT_MAX 0x400 /* bytes to transfer per delay inteval */
+#define CYCLES_BTT_MAX 1 /*approximate 10k cycles per delay interval */
+
+#ifndef XILINX_DEVCFG_ERR_DEBUG
+#define XILINX_DEVCFG_ERR_DEBUG 0
+#endif
+#define DB_PRINT(...) do { \
+if (XILINX_DEVCFG_ERR_DEBUG) { \
+fprintf(stderr,  : %s: , __func__); \
+fprintf(stderr, ## __VA_ARGS__); \
+} \
+} while (0);
+
+#define R_CTRL(0x00/4)
+#define FORCE_RST(1  31) /* Not supported, writes ignored */
+#define PCAP_PR  (1  27) /* Forced to 0 on bad unlock */
+#define PCAP_MODE(1  26)
+#define MULTIBOOT_EN (1  24)
+#define USER_MODE(1  15)
+#define PCFG_AES_FUSE(1  12) /* locked by AES_FUSE_LOCK */
+#define PCFG_AES_EN_SHIFT9 /* locked by AES_EN_LOCK */
+#define PCFG_AES_EN_LEN  3 /* locked by AES_EN_LOCK */
+#define PCFG_AES_EN_MASK (((1  PCFG_AES_EN_LEN) - 1) \
+ PCFG_AES_EN_SHIFT)
+#define SEU_EN   (1  8) /* locked by SEU_LOCK */
+#define SEC_EN   (1  7) /* locked by SEC_LOCK */
+#define SPNIDEN  (1  6) /* locked by DBG_LOCK */
+#define SPIDEN   (1  5) /* locked by DBG_LOCK */
+#define NIDEN(1  4) /* locked by DBG_LOCK */
+#define DBGEN(1  3) 

Re: [Qemu-devel] [PATCH v2 4/6] qemu-img: Enable progress output for commit

2014-04-09 Thread Kevin Wolf
Am 08.04.2014 um 18:53 hat Eric Blake geschrieben:
 On 04/08/2014 09:34 AM, Kevin Wolf wrote:
  Am 08.04.2014 um 14:50 hat Max Reitz geschrieben:
  Implement progress output for the commit command by querying the
  progress of the block job.
 
  Signed-off-by: Max Reitz mre...@redhat.com
  Reviewed-by: Eric Blake ebl...@redhat.com
  ---
 
   
   info = block_job_query(job);
   
  +if (info-offset) {
  +if (!mod_offset) {
  
  On a fully populated image this doesn't look entirely right. I think the
  first 2 MB (or whatever the buffer size is) will be disregarded in the
  calculation, even though they are real work that is done.
 
 Is there any (easy) way to calculate how many dirty sectors remain to be
 committed, to use that rather than the image size as the job percentage
 remaining?

The very first thing that mirror_run() does is building the dirty bitmap
depending on whether clusters are allocated or not. After this loop we
have the necessary information, but we don't really communicate it to
the caller. Max tries to guess the point at which the loop completed by
checking when we make progress for the first time, but that means that
we ignore the first actual buffer.

We could probably simply change mirror_run() to update s-common.len by
subtracting the parts that haven't been marked dirty. Then the caller
could use the values as they are, and query-block-jobs would get more
accurate values for mirroring jobs, too.

commit_run(), for a commit somewhere down in the backing chain, we don't
have this initialising loop yet, but we could add it without adding too
much overhead, I think. qemu-img convert was changed in December to do
the same and noone has complained yet (though it's not in a release yet,
perhaps people will complain once 2.0 is out).

Kevin


pgpksJm2RJ8vu.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH v1 1/1] zynq_slcr: Change the comma to a underscore

2014-04-09 Thread Markus Armbruster
Alistair Francis alistair.fran...@xilinx.com writes:

 On Wed, Apr 9, 2014 at 11:14 AM, Peter Crosthwaite
 peter.crosthwa...@xilinx.com wrote:
 On Wed, Mar 26, 2014 at 1:05 PM, Alistair Francis
 alistair.fran...@xilinx.com wrote:
 This patch changes the comma in the xilinx,zynq_slcr to an
 underscore. This matches every other xilinx* peripheral and
 also makes parsing the device via the command line possible.


 I think its actually a case of this being slightly ahead and everyone
 else being behind. The comma is probably ultimately wrong and I'm
 guessing its awkward for your command-line work due to command line
 character escaping. I am in favor of the xlnx.foo styling that is
 more widely adopted:

 [qemu]$ git grep -c xlnx\.
 hw/arm/xilinx_zynq.c:1
 hw/char/xilinx_uartlite.c:2
 hw/dma/xilinx_axidma.c:2
 hw/intc/xilinx_intc.c:2
 hw/microblaze/petalogix_ml605_mmu.c:6
 hw/microblaze/petalogix_s3adsp1800_mmu.c:5
 hw/net/xilinx_axienet.c:3
 hw/net/xilinx_ethlite.c:2
 hw/ppc/virtex_ml507.c:2
 hw/ssi/xilinx_spi.c:1
 hw/ssi/xilinx_spips.c:2
 hw/timer/xilinx_timer.c:2
 target-microblaze/cpu.c:1

 will xlnx.zynq-slcr work? (fix the underscore while at it).

 Full stops are fine, just as long as it is not a comma

The most common separator in device model names is '-'.  There's a fair
number of '.', some '_', and a few ','.

The comma are probably rooted in device tree usage.  I doubt that buys
us anything but confusing command line trouble.

I suspect period breaks -global.

I very much recommend picking '-' whenever practical.



Re: [Qemu-devel] [PATCH] HMP: support specifying dump format for dump-guest-memory

2014-04-09 Thread Markus Armbruster
Qiao Nuohan qiaonuo...@cn.fujitsu.com writes:

 Dumping guest memory is available to specify the dump format now. This patch
 adds options '-z|-l|-s' to HMP command dump-guest-memory to specify dumping in
 kdump-compression format, with zlib/lzo/snappy compression. And without these
 options ELF format will be used.

 The discussion about this feature is here:

 http://lists.nongnu.org/archive/html/qemu-devel/2014-03/msg04235.html

 Signed-off-by: Qiao Nuohan qiaonuo...@cn.fujitsu.com
 Suggested-by: Christian Borntraeger borntrae...@de.ibm.com
 ---
  hmp-commands.hx | 11 +++
  hmp.c   | 25 -
  2 files changed, 31 insertions(+), 5 deletions(-)

 diff --git a/hmp-commands.hx b/hmp-commands.hx
 index f3fc514..4b9989f 100644
 --- a/hmp-commands.hx
 +++ b/hmp-commands.hx
 @@ -998,8 +998,8 @@ ETEXI
  
  {
  .name   = dump-guest-memory,
 -.args_type  = paging:-p,filename:F,begin:i?,length:i?,
 -.params = [-p] filename [begin] [length],
 +.args_type  = 
 paging:-p,zlib:-z,lzo:-l,snappy:-s,filename:F,begin:i?,length:i?,
 +.params = [-p] [-z|-l|-s] filename [begin] [length],

Since you're touching it anyway, please fix it to [begin length].

  .help   = dump guest memory to file
\n\t\t\t begin(optional): the starting physical 
 address
\n\t\t\t length(optional): the memory size, in bytes,

.help doesn't cover the flags.  Please fix that.

Both (optional) are redundant, because help also prints .params.

 @@ -1008,12 +1008,15 @@ ETEXI
  
  
  STEXI
 -@item dump-guest-memory [-p] @var{protocol} @var{begin} @var{length}
 +@item dump-guest-memory [-p] [-z|-l|-s] @var{protocol} @var{begin} 
 @var{length}
  @findex dump-guest-memory
  Dump guest memory to @var{protocol}. The file can be processed with crash or
 -gdb.
 +gdb. Without -z|-l|-s, the dump format is ELF.
filename: dump file name

This was called protocol above.  Not your fault, but please fix it to
filename anyway.

  paging: do paging to get guest's memory mapping

This should be -p:, not paging:.  Again, not your fault, but please
fix it.

 +  zlib: dump in kdump-compressed format, with zlib compression
 +   lzo: dump in kdump-compressed format, with lzo compression
 +snappy: dump in kdump-compressed format, with snappy compression

These should be -z:, -l: and -s:.

   begin: the starting physical address. It's optional, and should be
  specified with length together.
  length: the memory size, in bytes. It's optional, and should be specified
   with begin together.

Please document that -p and begin/length both conflict with -z, -l, -s.

Here's a rather formal way to do that:

@item dump-guest-memory [[-p] @var{filename} @var{begin} @var{length} | 
[-z|-l|-s] @var{filename}]

Awkward.  This might come out better:

@item dump-guest-memory [-p] @var{filename} @var{begin} @var{length}
@item dump-guest-memory [-z|-l|-s] @var{filename}

Alternatively describe the restrictions in the text rather than the 

Suggest to try and check the resulting .info.

In case you decide to touch the lines on begin and length for some
reason: with length together should be together with length.

I'd probably fix the preexisting issues in a separate patch first.

 diff --git a/hmp.c b/hmp.c
 index 2f279c4..37c3961 100644
 --- a/hmp.c
 +++ b/hmp.c
 @@ -1308,16 +1308,39 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict 
 *qdict)
  {
  Error *errp = NULL;
  int paging = qdict_get_try_bool(qdict, paging, 0);
 +int zlib = qdict_get_try_bool(qdict, zlib, 0);
 +int lzo = qdict_get_try_bool(qdict, lzo, 0);
 +int snappy = qdict_get_try_bool(qdict, snappy, 0);
  const char *file = qdict_get_str(qdict, filename);
  bool has_begin = qdict_haskey(qdict, begin);
  bool has_length = qdict_haskey(qdict, length);
 -/* kdump-compressed format is not supported for HMP */
  bool has_format = false;

Suggest to drop has_format, and pass true to qmp_dump_guest_memory()
instead.

  int64_t begin = 0;
  int64_t length = 0;
  enum DumpGuestMemoryFormat dump_format = DUMP_GUEST_MEMORY_FORMAT_ELF;
  char *prot;
  
 +if ((zlib + lzo + snappy)  1) {

Superfluous parentheses around sum, please drop them.

 +error_setg(errp, only one of '-z|-l|-s' can be set);
 +hmp_handle_error(mon, errp);
 +return;
 +}
 +
 +if (zlib) {
 +has_format = true;
 +dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_ZLIB;
 +}
 +
 +if (lzo) {
 +has_format = true;
 +dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_LZO;
 +}
 +
 +if (snappy) {
 +has_format = true;
 +dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_SNAPPY;
 +}
 +
  if (has_begin) {
  begin = qdict_get_int(qdict, begin);
  }



Re: [Qemu-devel] [PATCH v1 1/1] zynq_slcr: Change the comma to a underscore

2014-04-09 Thread Markus Armbruster
Markus Armbruster arm...@redhat.com writes:

 Alistair Francis alistair.fran...@xilinx.com writes:

 On Wed, Apr 9, 2014 at 11:14 AM, Peter Crosthwaite
 peter.crosthwa...@xilinx.com wrote:
 On Wed, Mar 26, 2014 at 1:05 PM, Alistair Francis
 alistair.fran...@xilinx.com wrote:
 This patch changes the comma in the xilinx,zynq_slcr to an
 underscore. This matches every other xilinx* peripheral and
 also makes parsing the device via the command line possible.


 I think its actually a case of this being slightly ahead and everyone
 else being behind. The comma is probably ultimately wrong and I'm
 guessing its awkward for your command-line work due to command line
 character escaping. I am in favor of the xlnx.foo styling that is
 more widely adopted:

 [qemu]$ git grep -c xlnx\.
 hw/arm/xilinx_zynq.c:1
 hw/char/xilinx_uartlite.c:2
 hw/dma/xilinx_axidma.c:2
 hw/intc/xilinx_intc.c:2
 hw/microblaze/petalogix_ml605_mmu.c:6
 hw/microblaze/petalogix_s3adsp1800_mmu.c:5
 hw/net/xilinx_axienet.c:3
 hw/net/xilinx_ethlite.c:2
 hw/ppc/virtex_ml507.c:2
 hw/ssi/xilinx_spi.c:1
 hw/ssi/xilinx_spips.c:2
 hw/timer/xilinx_timer.c:2
 target-microblaze/cpu.c:1

 will xlnx.zynq-slcr work? (fix the underscore while at it).

 Full stops are fine, just as long as it is not a comma

 The most common separator in device model names is '-'.  There's a fair
 number of '.', some '_', and a few ','.

 The comma are probably rooted in device tree usage.  I doubt that buys
 us anything but confusing command line trouble.

 I suspect period breaks -global.

Yes, it does: -global cfi.pflash01.name=foo gets parsed as driver =
cfi, property = pflash01.name, value = foo.  I'm going to start a
new thread for this issue.

 I very much recommend picking '-' whenever practical.

Still do :)



Re: [Qemu-devel] [PATCH for-2.0] configure: add option to disable -fstack-protector flags

2014-04-09 Thread Peter Maydell
On 9 April 2014 10:29, Noonan, Steven snoo...@amazon.com wrote:
 So in your case all you probably need is to drop the -framework
 arguments from CFLAGS and plop them into LIBS, and you're probably good
 to go.

This is where they are already. The problem is that putting
-framework on the linker command line causes clang to
fail to reject -fsome-random-unknown-thing, which means
we can't do configure detection of -fsomething arguments
using compile_prog  :-(

I think the only thing we can do about this is to make sure
that our configure code to check for -fsomething does a
compile-only check and not a link.

thanks
-- PMM



[Qemu-devel] '.' IDs and certain names breaks -global and -set

2014-04-09 Thread Markus Armbruster
We have a number of device model names containing '.'.  They're unusable
with -global.  That's because -global A.B.C=foo gets parsed as

driver = A
property = B.C
value = foo.

Wrong when the device model name is really A.B and the property is C,
e.g. -global cfi.pflash01.name.

Related problem: QemuOpts ID may contain '.'.  Such IDs are in fact
common practice.  Unfortunately, they're unusable with -set.  For
instance, -set A.B.C.D=foo gets parsed as

group = A
id = B
arg = C.D
value = foo

Wrong when the id is really B.C and the arg is D, e.g. -device
isa-serial,id=s.0 -set device.s.0.chardev=c0.  This issue isn't limited
to devices.

Related historical problem: commit b560a9a.

Possible solutions:

* Outlaw '.' in QemuOpts parameter names (and thus device model names
  and property names, because QemuOpts is a major way for users to
  specify them).

* Resolve the syntactic ambiguity semantically (ugh)

* Extend the syntax of -global and -set to let users avoid the issue, or
  replace them outright

* Use the old ostrich algorithm

Andreas, what restrictions does QOM place on QOM path component names?
No '/', obviously.  Anything else?



[Qemu-devel] [PATCH] bochs: Fix memory leak in bochs_open() error path

2014-04-09 Thread Kevin Wolf
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/bochs.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/bochs.c b/block/bochs.c
index 826ec12..50b84a9 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -150,11 +150,13 @@ static int bochs_open(BlockDriverState *bs, QDict 
*options, int flags,
 s-extent_size = le32_to_cpu(bochs.extent);
 if (s-extent_size == 0) {
 error_setg(errp, Extent size may not be zero);
-return -EINVAL;
+ret = -EINVAL;
+goto fail;
 } else if (s-extent_size  0x80) {
 error_setg(errp, Extent size % PRIu32  is too large,
s-extent_size);
-return -EINVAL;
+ret = -EINVAL;
+goto fail;
 }
 
 if (s-catalog_size  bs-total_sectors / s-extent_size) {
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] bochs: Fix memory leak in bochs_open() error path

2014-04-09 Thread Laszlo Ersek
On 04/09/14 11:20, Kevin Wolf wrote:
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  block/bochs.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)
 
 diff --git a/block/bochs.c b/block/bochs.c
 index 826ec12..50b84a9 100644
 --- a/block/bochs.c
 +++ b/block/bochs.c
 @@ -150,11 +150,13 @@ static int bochs_open(BlockDriverState *bs, QDict 
 *options, int flags,
  s-extent_size = le32_to_cpu(bochs.extent);
  if (s-extent_size == 0) {
  error_setg(errp, Extent size may not be zero);
 -return -EINVAL;
 +ret = -EINVAL;
 +goto fail;
  } else if (s-extent_size  0x80) {
  error_setg(errp, Extent size % PRIu32  is too large,
 s-extent_size);
 -return -EINVAL;
 +ret = -EINVAL;
 +goto fail;
  }
  
  if (s-catalog_size  bs-total_sectors / s-extent_size) {
 

Reviewed-by: Laszlo Ersek ler...@redhat.com



Re: [Qemu-devel] [PATCH 2/2] pci-assign: Fix memory out of bound when MSI-X table not fit in a single page

2014-04-09 Thread Gonglei (Arei)
Hi,

  QEMU only mmap MSIX_PAGE_SIZE memory for all pci devices in
  assigned_dev_register_msix_mmio(), meanwhile the set the one
  page memmory to zero, so the rest memory will be random value
  (maybe etnry.data is not 0). In the assigned_dev_update_msix_mmio()
  maybe occur the issue of entry_nr  256, and the kmod reports
  the EINVAL error.
 
  Signed-off-by: Gonglei arei.gong...@huawei.com
 
 Okay so this kind of works because guest does not try
 to use so many vectors.
 
Yes. It's true.

 But I think it's better not to give guest more entries
 than we can actually support.
 
 How about tweaking MSIX capability exposed to guest to limit table size?
 
IMHO, the MSIX table entry size got form the physic NIC hardware. The software
layer shouldn't tweak the capability of real hardware, neither QEMU nor KVM. 


Best regards,
-Gonglei

  ---
   hw/i386/kvm/pci-assign.c | 15 ---
   1 file changed, 12 insertions(+), 3 deletions(-)
 
  diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
  index 570333f..d25a19e 100644
  --- a/hw/i386/kvm/pci-assign.c
  +++ b/hw/i386/kvm/pci-assign.c
  @@ -37,6 +37,7 @@
   #include hw/pci/pci.h
   #include hw/pci/msi.h
   #include kvm_i386.h
  +#include qemu/osdep.h
 
   #define MSIX_PAGE_SIZE 0x1000
 
  @@ -59,6 +60,9 @@
   #define DEBUG(fmt, ...)
   #endif
 
  +/* the msix-table size readed from pci device config */
  +static int msix_table_size;
  +
   typedef struct PCIRegion {
   int type;   /* Memory or port I/O */
   int valid;
  @@ -1604,7 +1608,12 @@ static void
 assigned_dev_msix_reset(AssignedDevice *dev)
 
   static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
   {
  -dev-msix_table = mmap(NULL, MSIX_PAGE_SIZE,
 PROT_READ|PROT_WRITE,
  +msix_table_size = ROUND_UP(dev-msix_max * sizeof(struct
 MSIXTableEntry),
  +MSIX_PAGE_SIZE);
  +
  +DEBUG(msix_table_size: 0x%x\n, msix_table_size);
  +
  +dev-msix_table = mmap(NULL, msix_table_size,
 PROT_READ|PROT_WRITE,
  MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
   if (dev-msix_table == MAP_FAILED) {
   error_report(fail allocate msix_table! %s, strerror(errno));
  @@ -1615,7 +1624,7 @@ static int
 assigned_dev_register_msix_mmio(AssignedDevice *dev)
   assigned_dev_msix_reset(dev);
 
   memory_region_init_io(dev-mmio, OBJECT(dev),
 assigned_dev_msix_mmio_ops,
  -  dev, assigned-dev-msix,
 MSIX_PAGE_SIZE);
  +  dev, assigned-dev-msix, msix_table_size);
   return 0;
   }
 
  @@ -1627,7 +1636,7 @@ static void
 assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
 
   memory_region_destroy(dev-mmio);
 
  -if (munmap(dev-msix_table, MSIX_PAGE_SIZE) == -1) {
  +if (munmap(dev-msix_table, msix_table_size) == -1) {
   error_report(error unmapping msix_table! %s, strerror(errno));
   }
   dev-msix_table = NULL;
  --
  1.7.12.4
 
 



[Qemu-devel] [PATCH for-2.0] configure: use do_cc when checking for -fstack-protector support

2014-04-09 Thread Peter Maydell
MacOSX clang silently swallows unrecognized -f options when doing a link
with '-framework' also on the command line, so to detect support for
the various -fstack-protector options we must do a plain .c to .o compile,
not a complete compile-and-link.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
This should be a pretty safe change and it prevents clang/MacOSX
builds from spewing a warning on every C file compilation, so I'd
like to get it into 2.0.

 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index eb0e7bb..c85475f 100755
--- a/configure
+++ b/configure
@@ -1448,7 +1448,7 @@ done
 if test $stack_protector != no ; then
   gcc_flags=-fstack-protector-strong -fstack-protector-all
   for flag in $gcc_flags; do
-if compile_prog -Werror $flag  ; then
+if do_cc $QEMU_CFLAGS -Werror $flag -c -o $TMPO $TMPC ; then
   QEMU_CFLAGS=$QEMU_CFLAGS $flag
   LIBTOOLFLAGS=$LIBTOOLFLAGS -Wc,$flag
   break
-- 
1.8.5.4




[Qemu-devel] [PATCH] bochs: Fix catalog size check

2014-04-09 Thread Kevin Wolf
The old check was off by a factor of 512 and didn't consider cases where
we don't get an exact division. This could lead to an out-of-bounds
array access in seek_to_sector().

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/bochs.c  | 14 +++---
 tests/qemu-iotests/078 |  6 +-
 tests/qemu-iotests/078.out |  6 --
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/block/bochs.c b/block/bochs.c
index 50b84a9..eacf956 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -148,8 +148,14 @@ static int bochs_open(BlockDriverState *bs, QDict 
*options, int flags,
 s-extent_blocks = 1 + (le32_to_cpu(bochs.extent) - 1) / 512;
 
 s-extent_size = le32_to_cpu(bochs.extent);
-if (s-extent_size == 0) {
-error_setg(errp, Extent size may not be zero);
+if (s-extent_size  BDRV_SECTOR_SIZE) {
+/* bximage actually never creates extents smaller than 4k */
+error_setg(errp, Extent size must be at least 512);
+ret = -EINVAL;
+goto fail;
+} else if (!is_power_of_2(s-extent_size)) {
+error_setg(errp, Extent size % PRIu32  is not a power of two,
+   s-extent_size);
 ret = -EINVAL;
 goto fail;
 } else if (s-extent_size  0x80) {
@@ -159,7 +165,9 @@ static int bochs_open(BlockDriverState *bs, QDict *options, 
int flags,
 goto fail;
 }
 
-if (s-catalog_size  bs-total_sectors / s-extent_size) {
+if (s-catalog_size  DIV_ROUND_UP(bs-total_sectors,
+   s-extent_size / BDRV_SECTOR_SIZE))
+{
 error_setg(errp, Catalog size is too small for this disk size);
 ret = -EINVAL;
 goto fail;
diff --git a/tests/qemu-iotests/078 b/tests/qemu-iotests/078
index 872e734..d4d6da7 100755
--- a/tests/qemu-iotests/078
+++ b/tests/qemu-iotests/078
@@ -69,10 +69,14 @@ _use_sample_img empty.bochs.bz2
 poke_file $TEST_IMG $disk_size_offset \x00\xc0\x0f\x00\x00\x00\x00\x7f
 { $QEMU_IO -c read 2T 4k $TEST_IMG; } 21 | _filter_qemu_io | 
_filter_testdir
 
+_use_sample_img empty.bochs.bz2
+poke_file $TEST_IMG $catalog_size_offset \x10\x00\x00\x00
+{ $QEMU_IO -c read 0xfbe00 512 $TEST_IMG; } 21 | _filter_qemu_io | 
_filter_testdir
+
 echo
 echo == Negative extent size ==
 _use_sample_img empty.bochs.bz2
-poke_file $TEST_IMG $extent_size_offset \xff\xff\xff\xff
+poke_file $TEST_IMG $extent_size_offset \x00\x00\x00\x80
 { $QEMU_IO -c read 768k 512 $TEST_IMG; } 21 | _filter_qemu_io | 
_filter_testdir
 
 echo
diff --git a/tests/qemu-iotests/078.out b/tests/qemu-iotests/078.out
index ea95ffd..ca18d2e 100644
--- a/tests/qemu-iotests/078.out
+++ b/tests/qemu-iotests/078.out
@@ -15,12 +15,14 @@ no file open, try 'help open'
 == Too small catalog bitmap for image size ==
 qemu-io: can't open device TEST_DIR/empty.bochs: Catalog size is too small for 
this disk size
 no file open, try 'help open'
+qemu-io: can't open device TEST_DIR/empty.bochs: Catalog size is too small for 
this disk size
+no file open, try 'help open'
 
 == Negative extent size ==
-qemu-io: can't open device TEST_DIR/empty.bochs: Extent size 4294967295 is too 
large
+qemu-io: can't open device TEST_DIR/empty.bochs: Extent size 2147483648 is too 
large
 no file open, try 'help open'
 
 == Zero extent size ==
-qemu-io: can't open device TEST_DIR/empty.bochs: Extent size may not be zero
+qemu-io: can't open device TEST_DIR/empty.bochs: Extent size must be at least 
512
 no file open, try 'help open'
 *** done
-- 
1.8.3.1




[Qemu-devel] [PATCH v4 2/5] kvm: add kvm_enable_cap_{vm,vcpu}

2014-04-09 Thread Cornelia Huck
Provide helper functions for enabling capabilities (on a vcpu and on a vm).

Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 include/sysemu/kvm.h |4 
 kvm-all.c|   19 ++-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 0bee1e8..d89911c 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -294,6 +294,10 @@ bool kvm_arch_stop_on_emulation_error(CPUState *cpu);
 
 int kvm_check_extension(KVMState *s, unsigned int extension);
 
+int kvm_enable_cap_vm(KVMState *s, unsigned int capability);
+
+int kvm_enable_cap_vcpu(CPUState *cpu, unsigned int capability);
+
 uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function,
   uint32_t index, int reg);
 
diff --git a/kvm-all.c b/kvm-all.c
index cd4111d..c32eeff 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -501,7 +501,24 @@ int kvm_check_extension(KVMState *s, unsigned int 
extension)
 return ret;
 }
 
-static int kvm_set_ioeventfd_mmio(int fd, hwaddr addr, uint32_t val,
+int kvm_enable_cap_vm(KVMState *s, unsigned int capability)
+{
+struct kvm_enable_cap cap = {};
+
+cap.cap = capability;
+return kvm_vm_ioctl(s, KVM_ENABLE_CAP, cap);
+}
+
+int kvm_enable_cap_vcpu(CPUState *cpu, unsigned int capability)
+{
+struct kvm_enable_cap cap = {};
+
+cap.cap = capability;
+return kvm_vcpu_ioctl(cpu, KVM_ENABLE_CAP, cap);
+}
+
+
+static int kvm_set_ioeventfd_mmio(int fd, uint32_t addr, uint32_t val,
   bool assign, uint32_t size, bool datamatch)
 {
 int ret;
-- 
1.7.9.5




[Qemu-devel] [PATCH v4 5/5] s390x/virtio-ccw: Wire up irq routing and irqfds.

2014-04-09 Thread Cornelia Huck
Make use of the new s390 adapter irq routing support to enable real
in-kernel irqfds for virtio-ccw with adapter interrupts.

Note that s390 doesn't provide the common KVM_CAP_IRQCHIP capability, but
rather needs KVM_CAP_S390_IRQCHIP to be enabled. This is to ensure backward
compatibility.

Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 hw/s390x/virtio-ccw.c  |  165 
 hw/s390x/virtio-ccw.h  |2 +
 include/hw/s390x/adapter.h |   23 ++
 include/qemu/typedefs.h|1 +
 include/sysemu/kvm.h   |2 +
 kvm-all.c  |   38 +-
 kvm-stub.c |5 ++
 target-s390x/kvm.c |5 ++
 8 files changed, 228 insertions(+), 13 deletions(-)
 create mode 100644 include/hw/s390x/adapter.h

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 69efa6c..5612ccc 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -21,6 +21,7 @@
 #include hw/sysbus.h
 #include qemu/bitops.h
 #include hw/virtio/virtio-bus.h
+#include hw/s390x/adapter.h
 
 #include ioinst.h
 #include css.h
@@ -48,7 +49,7 @@ static IndAddr *get_indicator(hwaddr ind_addr, int len)
 return indicator;
 }
 
-static void release_indicator(IndAddr *indicator)
+static void release_indicator(uint32_t adapter_id, IndAddr *indicator)
 {
 assert(indicator-refcnt  0);
 indicator-refcnt--;
@@ -56,9 +57,31 @@ static void release_indicator(IndAddr *indicator)
 return;
 }
 QTAILQ_REMOVE(indicator_addresses, indicator, sibling);
+if (indicator-map) {
+s390_io_adapter_map(adapter_id, indicator-map, false);
+}
 g_free(indicator);
 }
 
+static int map_indicator(uint32_t adapter_id, IndAddr *indicator)
+{
+int ret;
+
+if (indicator-map) {
+return 0; /* already mapped is not an error */
+}
+indicator-map = indicator-addr;
+ret = s390_io_adapter_map(adapter_id, indicator-map, true);
+if ((ret != 0)  (ret != -ENOSYS)) {
+goto out_err;
+}
+return 0;
+
+out_err:
+indicator-map = 0;
+return -EFAULT;
+}
+
 static void virtio_ccw_bus_new(VirtioBusState *bus, size_t bus_size,
VirtioCcwDevice *dev);
 
@@ -733,7 +756,7 @@ static int virtio_ccw_exit(VirtioCcwDevice *dev)
 g_free(sch);
 }
 if (dev-indicators) {
-release_indicator(dev-indicators);
+release_indicator(dev-adapter_id, dev-indicators);
 dev-indicators = NULL;
 }
 return 0;
@@ -1034,15 +1057,15 @@ static void virtio_ccw_reset(DeviceState *d)
 virtio_reset(vdev);
 css_reset_sch(dev-sch);
 if (dev-indicators) {
-release_indicator(dev-indicators);
+release_indicator(dev-adapter_id, dev-indicators);
 dev-indicators = NULL;
 }
 if (dev-indicators2) {
-release_indicator(dev-indicators2);
+release_indicator(dev-adapter_id, dev-indicators2);
 dev-indicators2 = NULL;
 }
 if (dev-summary_indicator) {
-release_indicator(dev-summary_indicator);
+release_indicator(dev-adapter_id, dev-summary_indicator);
 dev-summary_indicator = NULL;
 }
 }
@@ -1078,6 +1101,100 @@ static int virtio_ccw_set_host_notifier(DeviceState *d, 
int n, bool assign)
 return virtio_ccw_set_guest2host_notifier(dev, n, assign, false);
 }
 
+static int virtio_ccw_get_adapter_info(VirtioCcwDevice *dev,
+   AdapterInfo *adapter)
+{
+int r;
+
+if (!dev-sch-thinint_active) {
+return -EINVAL;
+}
+
+r = map_indicator(dev-adapter_id, dev-summary_indicator);
+if (r) {
+return r;
+}
+r = map_indicator(dev-adapter_id, dev-indicators);
+if (r) {
+return r;
+}
+adapter-summary_addr = dev-summary_indicator-map;
+adapter-ind_addr = dev-indicators-map;
+adapter-ind_offset = dev-ind_bit;
+adapter-summary_offset = 7;
+adapter-adapter_id = dev-adapter_id;
+
+return 0;
+}
+
+static int virtio_ccw_setup_irqroutes(VirtioCcwDevice *dev, int nvqs)
+{
+int i;
+VirtIODevice *vdev = virtio_bus_get_device(dev-bus);
+int ret;
+AdapterInfo adapter;
+
+ret = virtio_ccw_get_adapter_info(dev, adapter);
+if (ret) {
+return ret;
+}
+for (i = 0; i  nvqs; i++) {
+if (!virtio_queue_get_num(vdev, i)) {
+break;
+}
+ret = kvm_irqchip_add_adapter_route(kvm_state, adapter);
+if (ret  0) {
+goto out_undo;
+}
+dev-gsi[i] = ret;
+adapter.ind_offset++;
+}
+return 0;
+out_undo:
+while (--i = 0) {
+kvm_irqchip_release_virq(kvm_state, dev-gsi[i]);
+dev-gsi[i] = -1;
+}
+return ret;
+}
+
+static void virtio_ccw_release_irqroutes(VirtioCcwDevice *dev, int nvqs)
+{
+int i;
+VirtIODevice *vdev = virtio_bus_get_device(dev-bus);
+
+for (i = 0; i  nvqs; i++) {
+

Re: [Qemu-devel] [PATCH v4 1/1] Make qemu_peek_buffer loop until it gets it's data

2014-04-09 Thread 陈梁

 *  (chenliang0...@icloud.com) wrote:
 
 ?? 2014??4??810:29??Dr. David Alan Gilbert (git) 
 dgilb...@redhat.com ??
 
 From: Dr. David Alan Gilbert dgilb...@redhat.com
 
 Make qemu_peek_buffer repeatedly call fill_buffer until it gets
 all the data it requires, or until there is an error.
 
 At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there
 isn't enough data waiting, however the kernel is entitled to return
 just a few bytes, and still leave qemu_peek_buffer with less bytes
 than it needed.  I've seen this fail in a dev world, and I think it
 could theoretically fail in the peeking of the subsection headers in
 the current world.
 hmm, I also have got some errors(infrequently). Maybe It is one point.
 Could you show some messages about the error?
 
 I've only seen the errors in my visitor/ber world where I use the peek_buffer
 a lot more; but the one place it is used in the existing code is in
 the code to check if we have the start of a subsection; if that goes wrong
 I'm not sure what error would be produced.
 
 Dave
In my observation, error is more likely to happen if migration thread is 
blocked too long.
BTW, xbzrle and auto convergence are disable.

Best regards
ChenLiang
 
 
 However, this patch avoids one potential issue, thanks.
 
 Reviewed-by: ChenLiang chenliang0...@icloud.com
 
 Comment qemu_peek_byte to point out it's not guaranteed to work for
 non-continuous peeks
 
 Signed-off-by: Dr. David Alan Gilbert dgilb...@redhat.com
 ---
 
 v4:
 Limit the size qemu_get_buffer passes to qemu_fill_buffer on each loop
 
 v3:
 Make qemu_fill_buffer return ssize_t
 Remove the other size_t cleanup from this patch - I'll get back to them 
 later
 Comment additions/cleanups from review
 Stretch my -ve
 
 include/migration/qemu-file.h |  5 
 qemu-file.c   | 53 
 +++
 2 files changed, 54 insertions(+), 4 deletions(-)
 
 diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
 index a191fb6..c90f529 100644
 --- a/include/migration/qemu-file.h
 +++ b/include/migration/qemu-file.h
 @@ -123,6 +123,11 @@ void qemu_put_be32(QEMUFile *f, unsigned int v);
 void qemu_put_be64(QEMUFile *f, uint64_t v);
 int qemu_peek_buffer(QEMUFile *f, uint8_t *buf, int size, size_t offset);
 int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size);
 +/*
 + * Note that you can only peek continuous bytes from where the current 
 pointer
 + * is; you aren't guaranteed to be able to peak to +n bytes unless you've
 + * previously peeked +n-1.
 + */
 int qemu_peek_byte(QEMUFile *f, int offset);
 int qemu_get_byte(QEMUFile *f);
 void qemu_file_skip(QEMUFile *f, int size);
 diff --git a/qemu-file.c b/qemu-file.c
 index e5ec798..6ae37d0 100644
 --- a/qemu-file.c
 +++ b/qemu-file.c
 @@ -529,7 +529,15 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t 
 block_offset,
return RAM_SAVE_CONTROL_NOT_SUPP;
 }
 
 -static void qemu_fill_buffer(QEMUFile *f)
 +/*
 + * Attempt to fill the buffer from the underlying file
 + * Returns the number of bytes read, or negative value for an error.
 + *
 + * Note that it can return a partially full buffer even in a not error/not 
 EOF
 + * case if the underlying file descriptor gives a short read, and that can
 + * happen even on a blocking fd.
 + */
 +static ssize_t qemu_fill_buffer(QEMUFile *f)
 {
int len;
int pending;
 @@ -553,6 +561,8 @@ static void qemu_fill_buffer(QEMUFile *f)
} else if (len != -EAGAIN) {
qemu_file_set_error(f, len);
}
 +
 +return len;
 }
 
 int qemu_get_fd(QEMUFile *f)
 @@ -683,17 +693,39 @@ void qemu_file_skip(QEMUFile *f, int size)
}
 }
 
 +/*
 + * Read 'size' bytes from file (at 'offset') into buf without moving the
 + * pointer.
 + *
 + * It will return size bytes unless there was an error, in which case it 
 will
 + * return as many as it managed to read (assuming blocking fd's which
 + * all current QEMUFile are)
 + */
 int qemu_peek_buffer(QEMUFile *f, uint8_t *buf, int size, size_t offset)
 {
int pending;
int index;
 
assert(!qemu_file_is_writable(f));
 +assert(offset  IO_BUF_SIZE);
 +assert(size = IO_BUF_SIZE - offset);
 
 +/* The 1st byte to read from */
index = f-buf_index + offset;
 +/* The number of available bytes starting at index */
pending = f-buf_size - index;
 -if (pending  size) {
 -qemu_fill_buffer(f);
 +
 +/*
 + * qemu_fill_buffer might return just a few bytes, even when there 
 isn't
 + * an error, so loop collecting them until we get enough.
 + */
 +while (pending  size) {
 +int received = qemu_fill_buffer(f);
 +
 +if (received = 0) {
 +break;
 +}
 +
index = f-buf_index + offset;
pending = f-buf_size - index;
}
 @@ -709,6 +741,14 @@ int qemu_peek_buffer(QEMUFile *f, uint8_t *buf, int 
 size, size_t offset)
return size;
 }
 
 +/*
 + * Read 'size' bytes of data from the file into buf.
 + 

Re: [Qemu-devel] '.' IDs and certain names breaks -global and -set

2014-04-09 Thread Peter Crosthwaite
On Wed, Apr 9, 2014 at 7:56 PM, Markus Armbruster arm...@redhat.com wrote:
 We have a number of device model names containing '.'.  They're unusable
 with -global.  That's because -global A.B.C=foo gets parsed as

 driver = A
 property = B.C
 value = foo.

 Wrong when the device model name is really A.B and the property is C,
 e.g. -global cfi.pflash01.name.

 Related problem: QemuOpts ID may contain '.'.  Such IDs are in fact
 common practice.  Unfortunately, they're unusable with -set.  For
 instance, -set A.B.C.D=foo gets parsed as

 group = A
 id = B
 arg = C.D
 value = foo

 Wrong when the id is really B.C and the arg is D, e.g. -device
 isa-serial,id=s.0 -set device.s.0.chardev=c0.  This issue isn't limited
 to devices.

 Related historical problem: commit b560a9a.

 Possible solutions:

 * Outlaw '.' in QemuOpts parameter names (and thus device model names
   and property names, because QemuOpts is a major way for users to
   specify them).


I like the .s because they mean this is where the vendor stops and
the IP name starts. Whereas - just delimits multiple words. We seem
to be running our of characters however with these overloading
problems. But AFAICT : is largely unused (within this context
anyway). So how about:

- seperates multiple words in a name
: splits logical groupings of multiple words as appropriate ( e.g.
some-vendor:their-device )
, demilits multiple command line arguments
. accesses property within an object
/ walks the canonical path

I do also need to confess to my alternate agenda of device tree driven
QEMU - I am still parsing device trees and directly mapping the
compatible strings to QOM type names for machine creation. Ambiguity
of - as meaning the , or the - is hard to deal with.

Regards,
Peter

 * Resolve the syntactic ambiguity semantically (ugh)

 * Extend the syntax of -global and -set to let users avoid the issue, or
   replace them outright

 * Use the old ostrich algorithm

 Andreas, what restrictions does QOM place on QOM path component names?
 No '/', obviously.  Anything else?




Re: [Qemu-devel] [PATCH v4 1/1] Make qemu_peek_buffer loop until it gets it's data

2014-04-09 Thread Dr. David Alan Gilbert
*  (chenliang0...@icloud.com) wrote:
 
  *  (chenliang0...@icloud.com) wrote:
  
  ?? 2014??4??810:29??Dr. David Alan Gilbert (git) 
  dgilb...@redhat.com ??
  
  From: Dr. David Alan Gilbert dgilb...@redhat.com
  
  Make qemu_peek_buffer repeatedly call fill_buffer until it gets
  all the data it requires, or until there is an error.
  
  At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there
  isn't enough data waiting, however the kernel is entitled to return
  just a few bytes, and still leave qemu_peek_buffer with less bytes
  than it needed.  I've seen this fail in a dev world, and I think it
  could theoretically fail in the peeking of the subsection headers in
  the current world.
  hmm, I also have got some errors(infrequently). Maybe It is one point.
  Could you show some messages about the error?
  
  I've only seen the errors in my visitor/ber world where I use the 
  peek_buffer
  a lot more; but the one place it is used in the existing code is in
  the code to check if we have the start of a subsection; if that goes wrong
  I'm not sure what error would be produced.
  
  Dave
 In my observation, error is more likely to happen if migration thread is 
 blocked too long.
 BTW, xbzrle and auto convergence are disable.

What error do you see?

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.

2014-04-09 Thread Cornelia Huck
Register an I/O adapter interrupt source for when virtio-ccw devices start
using adapter interrupts.

Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 hw/intc/s390_flic.c   |   59 +
 hw/s390x/css.c|   51 ++
 hw/s390x/css.h|4 
 hw/s390x/virtio-ccw.c |4 
 hw/s390x/virtio-ccw.h |1 +
 target-s390x/cpu.h|   33 +++
 6 files changed, 152 insertions(+)

diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c
index b2ef3e3..c033c8a 100644
--- a/hw/intc/s390_flic.c
+++ b/hw/intc/s390_flic.c
@@ -21,6 +21,11 @@
 #define FLIC_FAILED (-1UL)
 #define FLIC_SAVEVM_VERSION 1
 
+static KVMS390FLICState *s390_get_flic(void)
+{
+return KVM_S390_FLIC(object_resolve_path(/machine/s390-flic, NULL));
+}
+
 void s390_flic_init(void)
 {
 DeviceState *dev;
@@ -148,6 +153,60 @@ static int __get_all_irqs(KVMS390FLICState *flic,
 return r;
 }
 
+int kvm_s390_register_io_adapter(uint32_t id, uint8_t isc, bool swap,
+ bool is_maskable)
+{
+struct kvm_s390_io_adapter adapter = {
+.id = id,
+.isc = isc,
+.maskable = is_maskable,
+.swap = swap,
+};
+KVMS390FLICState *flic = s390_get_flic();
+int r, ret;
+struct kvm_device_attr attr = {
+.group = KVM_DEV_FLIC_ADAPTER_REGISTER,
+.addr = (uint64_t)adapter,
+};
+
+if (!flic) {
+return -ENOSYS;
+}
+if (!kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING)) {
+return -ENOSYS;
+}
+
+r = ioctl(flic-fd, KVM_SET_DEVICE_ATTR, attr);
+
+ret = r ? -errno : 0;
+return ret;
+}
+
+int kvm_s390_io_adapter_map(uint32_t id, uint64_t map_addr, bool do_map)
+{
+struct kvm_s390_io_adapter_req req = {
+.id = id,
+.type = do_map ? KVM_S390_IO_ADAPTER_MAP : KVM_S390_IO_ADAPTER_UNMAP,
+.addr = map_addr,
+};
+KVMS390FLICState *flic = s390_get_flic();
+struct kvm_device_attr attr = {
+.group = KVM_DEV_FLIC_ADAPTER_MODIFY,
+.addr = (uint64_t)req,
+};
+int r;
+
+if (!flic) {
+return -ENOSYS;
+}
+if (!kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING)) {
+return -ENOSYS;
+}
+
+r = ioctl(flic-fd, KVM_SET_DEVICE_ATTR, attr);
+return r ? -errno : 0;
+}
+
 /**
  * kvm_flic_save - Save pending floating interrupts
  * @f: QEMUFile containing migration state
diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index 7074d2b..a6d173f 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -39,6 +39,13 @@ typedef struct CssImage {
 ChpInfo chpids[MAX_CHPID + 1];
 } CssImage;
 
+typedef struct IoAdapter {
+uint32_t id;
+uint8_t type;
+uint8_t isc;
+QTAILQ_ENTRY(IoAdapter) sibling;
+} IoAdapter;
+
 typedef struct ChannelSubSys {
 QTAILQ_HEAD(, CrwContainer) pending_crws;
 bool do_crw_mchk;
@@ -49,6 +56,7 @@ typedef struct ChannelSubSys {
 uint64_t chnmon_area;
 CssImage *css[MAX_CSSID + 1];
 uint8_t default_cssid;
+QTAILQ_HEAD(, IoAdapter) io_adapters;
 } ChannelSubSys;
 
 static ChannelSubSys *channel_subsys;
@@ -69,6 +77,48 @@ int css_create_css_image(uint8_t cssid, bool default_image)
 return 0;
 }
 
+int css_register_io_adapter(uint8_t type, uint8_t isc, bool swap,
+bool maskable, uint32_t *id)
+{
+IoAdapter *adapter;
+bool found = false;
+int ret;
+
+*id = 0;
+QTAILQ_FOREACH(adapter, channel_subsys-io_adapters, sibling) {
+if ((adapter-type == type)  (adapter-isc == isc)) {
+*id = adapter-id;
+found = true;
+ret = 0;
+break;
+}
+if (adapter-id = *id) {
+*id = adapter-id + 1;
+}
+}
+if (found) {
+goto out;
+}
+adapter = g_new0(IoAdapter, 1);
+ret = s390_register_io_adapter(*id, isc, swap, maskable);
+if (ret == -ENOSYS) {
+/* Keep adapter even if we didn't register it anywhere. */
+ret = 0;
+}
+if (ret == 0) {
+adapter-id = *id;
+adapter-isc = isc;
+adapter-type = type;
+QTAILQ_INSERT_TAIL(channel_subsys-io_adapters, adapter, sibling);
+} else {
+g_free(adapter);
+fprintf(stderr, Unexpected error %d when registering adapter %d\n,
+ret, *id);
+}
+out:
+return ret;
+}
+
 uint16_t css_build_subchannel_id(SubchDev *sch)
 {
 if (channel_subsys-max_cssid  0) {
@@ -1239,6 +1289,7 @@ static void css_init(void)
 channel_subsys-do_crw_mchk = true;
 channel_subsys-crws_lost = false;
 channel_subsys-chnmon_active = false;
+QTAILQ_INIT(channel_subsys-io_adapters);
 }
 machine_init(css_init);
 
diff --git a/hw/s390x/css.h b/hw/s390x/css.h
index e9b4405..380e8e7 100644
--- a/hw/s390x/css.h
+++ b/hw/s390x/css.h
@@ -99,4 +99,8 @@ void 

[Qemu-devel] [PATCH v4 4/5] s390x/virtio-ccw: reference-counted indicators

2014-04-09 Thread Cornelia Huck
Make code using the same indicators point to a single allocated structure
that is freed when the last user goes away.

This will be used by the irqfd code to unmap addresses after the last user
is gone.

Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 hw/s390x/virtio-ccw.c |   80 ++---
 hw/s390x/virtio-ccw.h |   13 ++--
 2 files changed, 73 insertions(+), 20 deletions(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 1193682..69efa6c 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -27,6 +27,38 @@
 #include virtio-ccw.h
 #include trace.h
 
+static QTAILQ_HEAD(, IndAddr) indicator_addresses =
+QTAILQ_HEAD_INITIALIZER(indicator_addresses);
+
+static IndAddr *get_indicator(hwaddr ind_addr, int len)
+{
+IndAddr *indicator;
+
+QTAILQ_FOREACH(indicator, indicator_addresses, sibling) {
+if (indicator-addr == ind_addr) {
+indicator-refcnt++;
+return indicator;
+}
+}
+indicator = g_new0(IndAddr, 1);
+indicator-addr = ind_addr;
+indicator-len = len;
+indicator-refcnt = 1;
+QTAILQ_INSERT_TAIL(indicator_addresses, indicator, sibling);
+return indicator;
+}
+
+static void release_indicator(IndAddr *indicator)
+{
+assert(indicator-refcnt  0);
+indicator-refcnt--;
+if (indicator-refcnt  0) {
+return;
+}
+QTAILQ_REMOVE(indicator_addresses, indicator, sibling);
+g_free(indicator);
+}
+
 static void virtio_ccw_bus_new(VirtioBusState *bus, size_t bus_size,
VirtioCcwDevice *dev);
 
@@ -445,7 +477,7 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 ret = -EFAULT;
 } else {
 indicators = ldq_phys(address_space_memory, ccw.cda);
-dev-indicators = indicators;
+dev-indicators = get_indicator(indicators, sizeof(uint64_t));
 sch-curr_status.scsw.count = ccw.count - sizeof(indicators);
 ret = 0;
 }
@@ -465,7 +497,7 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 ret = -EFAULT;
 } else {
 indicators = ldq_phys(address_space_memory, ccw.cda);
-dev-indicators2 = indicators;
+dev-indicators2 = get_indicator(indicators, sizeof(uint64_t));
 sch-curr_status.scsw.count = ccw.count - sizeof(indicators);
 ret = 0;
 }
@@ -517,8 +549,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 ret = -EFAULT;
 } else {
 len = hw_len;
-dev-summary_indicator = thinint-summary_indicator;
-dev-indicators = thinint-device_indicator;
+dev-summary_indicator =
+get_indicator(thinint-summary_indicator, sizeof(uint8_t));
+dev-indicators = get_indicator(thinint-device_indicator,
+thinint-ind_bit / 8 + 1);
 dev-thinint_isc = thinint-isc;
 dev-ind_bit = thinint-ind_bit;
 cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len);
@@ -526,8 +560,8 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
   dev-thinint_isc, true, false,
   dev-adapter_id);
 assert(ret == 0);
-sch-thinint_active = ((dev-indicators != 0) 
-   (dev-summary_indicator != 0));
+sch-thinint_active = ((dev-indicators != NULL) 
+   (dev-summary_indicator != NULL));
 sch-curr_status.scsw.count = ccw.count - len;
 ret = 0;
 }
@@ -558,7 +592,7 @@ static int virtio_ccw_device_init(VirtioCcwDevice *dev, 
VirtIODevice *vdev)
 sch-driver_data = dev;
 dev-sch = sch;
 
-dev-indicators = 0;
+dev-indicators = NULL;
 
 /* Initialize subchannel structure. */
 sch-channel_prog = 0x0;
@@ -698,7 +732,10 @@ static int virtio_ccw_exit(VirtioCcwDevice *dev)
 css_subch_assign(sch-cssid, sch-ssid, sch-schid, sch-devno, NULL);
 g_free(sch);
 }
-dev-indicators = 0;
+if (dev-indicators) {
+release_indicator(dev-indicators);
+dev-indicators = NULL;
+}
 return 0;
 }
 
@@ -955,17 +992,17 @@ static void virtio_ccw_notify(DeviceState *d, uint16_t 
vector)
  * ind_bit indicates the start of the indicators in a big
  * endian notation.
  */
-virtio_set_ind_atomic(sch, dev-indicators +
+virtio_set_ind_atomic(sch, dev-indicators-addr +
   (dev-ind_bit + vector) / 8,
   0x80  ((dev-ind_bit + vector) % 8));
-if (!virtio_set_ind_atomic(sch, dev-summary_indicator,
+if 

Re: [Qemu-devel] [PATCH v4 1/1] Make qemu_peek_buffer loop until it gets it's data

2014-04-09 Thread 陈梁

 *  (chenliang0...@icloud.com) wrote:
 
 *  (chenliang0...@icloud.com) wrote:
 
 ?? 2014??4??810:29??Dr. David Alan Gilbert (git) 
 dgilb...@redhat.com ??
 
 From: Dr. David Alan Gilbert dgilb...@redhat.com
 
 Make qemu_peek_buffer repeatedly call fill_buffer until it gets
 all the data it requires, or until there is an error.
 
 At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there
 isn't enough data waiting, however the kernel is entitled to return
 just a few bytes, and still leave qemu_peek_buffer with less bytes
 than it needed.  I've seen this fail in a dev world, and I think it
 could theoretically fail in the peeking of the subsection headers in
 the current world.
 hmm, I also have got some errors(infrequently). Maybe It is one point.
 Could you show some messages about the error?
 
 I've only seen the errors in my visitor/ber world where I use the 
 peek_buffer
 a lot more; but the one place it is used in the existing code is in
 the code to check if we have the start of a subsection; if that goes wrong
 I'm not sure what error would be produced.
 
 Dave
 In my observation, error is more likely to happen if migration thread is 
 blocked too long.
 BTW, xbzrle and auto convergence are disable.
 
 What error do you see?
 
 Dave
It likes section id 1 load failed at dest side.
 --
 Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[Qemu-devel] [PATCH v4 0/5] qemu: irqfds for s390x

2014-04-09 Thread Cornelia Huck
Here's my current qemu s390x irqfd patchset. Unless there are objections,
I'll send a pull request once 2.0 has been released.

Changes from v3:
- rebased against current master
- first patch is now a proper kernel header update
- in patch 5, make indicators-map a uint64_t instead of void *; this saves
  useless casting and fixes a 32 bit compile error

Changes from v2:
- rebased against current master
- use object_resolve_path() to grab the flic
- more explicit return code check for enabling KVM_CAP_S390_IRQCHIP

Changes from v1:
- rebased against current master
- pick up changed capability numbers from the kvm patchset
- use c99 struct initializers in patch 3
- adapter interrupts are already upstream

The git branch at

https://github.com/cohuck/qemu.git s390x-irqfd

has been updated accordingly.

Cornelia Huck (5):
  linux-headers: update
  kvm: add kvm_enable_cap_{vm,vcpu}
  s390x: Add I/O adapter registration.
  s390x/virtio-ccw: reference-counted indicators
  s390x/virtio-ccw: Wire up irq routing and irqfds.

 hw/intc/s390_flic.c  |   59 +++
 hw/s390x/css.c   |   51 +
 hw/s390x/css.h   |4 +
 hw/s390x/virtio-ccw.c|  239 +-
 hw/s390x/virtio-ccw.h|   16 ++-
 include/hw/s390x/adapter.h   |   23 
 include/qemu/typedefs.h  |1 +
 include/sysemu/kvm.h |6 ++
 kvm-all.c|   57 +-
 kvm-stub.c   |5 +
 linux-headers/asm-s390/kvm.h |   24 +
 linux-headers/linux/kvm.h|   17 +++
 target-s390x/cpu.h   |   33 ++
 target-s390x/kvm.c   |5 +
 14 files changed, 511 insertions(+), 29 deletions(-)
 create mode 100644 include/hw/s390x/adapter.h

-- 
1.7.9.5




[Qemu-devel] [PATCH v4 1/5] linux-headers: update

2014-04-09 Thread Cornelia Huck
Base is 7cbb39d4d4d530dff12f2ff06ed6c85c504ba91a.

Gets several new interfaces:
Per-vm capability enablement, adapter interrupt sources, irq routing on s390.

Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 linux-headers/asm-s390/kvm.h |   24 
 linux-headers/linux/kvm.h|   17 +
 2 files changed, 41 insertions(+)

diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index cb4c1eb..c003c6a 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -22,6 +22,8 @@
 #define KVM_DEV_FLIC_CLEAR_IRQS3
 #define KVM_DEV_FLIC_APF_ENABLE4
 #define KVM_DEV_FLIC_APF_DISABLE_WAIT  5
+#define KVM_DEV_FLIC_ADAPTER_REGISTER  6
+#define KVM_DEV_FLIC_ADAPTER_MODIFY7
 /*
  * We can have up to 4*64k pending subchannels + 8 adapter interrupts,
  * as well as up  to ASYNC_PF_PER_VCPU*KVM_MAX_VCPUS pfault done interrupts.
@@ -32,6 +34,26 @@
 #define KVM_S390_MAX_FLOAT_IRQS266250
 #define KVM_S390_FLIC_MAX_BUFFER   0x200
 
+struct kvm_s390_io_adapter {
+   __u32 id;
+   __u8 isc;
+   __u8 maskable;
+   __u8 swap;
+   __u8 pad;
+};
+
+#define KVM_S390_IO_ADAPTER_MASK 1
+#define KVM_S390_IO_ADAPTER_MAP 2
+#define KVM_S390_IO_ADAPTER_UNMAP 3
+
+struct kvm_s390_io_adapter_req {
+   __u32 id;
+   __u8 type;
+   __u8 mask;
+   __u16 pad0;
+   __u64 addr;
+};
+
 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {
/* general purpose regs for s390 */
@@ -76,4 +98,6 @@ struct kvm_sync_regs {
 #define KVM_REG_S390_PFTOKEN   (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x5)
 #define KVM_REG_S390_PFCOMPARE (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x6)
 #define KVM_REG_S390_PFSELECT  (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x7)
+#define KVM_REG_S390_PP(KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x8)
+#define KVM_REG_S390_GBEA  (KVM_REG_S390 | KVM_REG_SIZE_U64 | 0x9)
 #endif
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index e27a4b3..b278ab3 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -740,6 +740,9 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_SPAPR_MULTITCE 94
 #define KVM_CAP_EXT_EMUL_CPUID 95
 #define KVM_CAP_HYPERV_TIME 96
+#define KVM_CAP_IOAPIC_POLARITY_IGNORED 97
+#define KVM_CAP_ENABLE_CAP_VM 98
+#define KVM_CAP_S390_IRQCHIP 99
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -755,9 +758,18 @@ struct kvm_irq_routing_msi {
__u32 pad;
 };
 
+struct kvm_irq_routing_s390_adapter {
+   __u64 ind_addr;
+   __u64 summary_addr;
+   __u64 ind_offset;
+   __u32 summary_offset;
+   __u32 adapter_id;
+};
+
 /* gsi routing entry types */
 #define KVM_IRQ_ROUTING_IRQCHIP 1
 #define KVM_IRQ_ROUTING_MSI 2
+#define KVM_IRQ_ROUTING_S390_ADAPTER 3
 
 struct kvm_irq_routing_entry {
__u32 gsi;
@@ -767,6 +779,7 @@ struct kvm_irq_routing_entry {
union {
struct kvm_irq_routing_irqchip irqchip;
struct kvm_irq_routing_msi msi;
+   struct kvm_irq_routing_s390_adapter adapter;
__u32 pad[8];
} u;
 };
@@ -1075,6 +1088,10 @@ struct kvm_s390_ucas_mapping {
 /* Available with KVM_CAP_DEBUGREGS */
 #define KVM_GET_DEBUGREGS _IOR(KVMIO,  0xa1, struct kvm_debugregs)
 #define KVM_SET_DEBUGREGS _IOW(KVMIO,  0xa2, struct kvm_debugregs)
+/*
+ * vcpu version available with KVM_ENABLE_CAP
+ * vm version available with KVM_CAP_ENABLE_CAP_VM
+ */
 #define KVM_ENABLE_CAP_IOW(KVMIO,  0xa3, struct kvm_enable_cap)
 /* Available with KVM_CAP_XSAVE */
 #define KVM_GET_XSAVE_IOR(KVMIO,  0xa4, struct kvm_xsave)
-- 
1.7.9.5




[Qemu-devel] PL022 VMSD versions bump

2014-04-09 Thread Peter Crosthwaite
Hi Peter,

Im looking to do some cleanup around pl022 and I would like to use it
as my lead example of code using the new generic FIFO (as I think I
have test cases handy for it). But the VMSD description for this
device is a bit ugly as it is:

VMSTATE_INT32(tx_fifo_head, PL022State),
VMSTATE_INT32(rx_fifo_head, PL022State),
VMSTATE_INT32(tx_fifo_len, PL022State),
VMSTATE_INT32(rx_fifo_len, PL022State),
VMSTATE_UINT16(tx_fifo[0], PL022State),
VMSTATE_UINT16(rx_fifo[0], PL022State),
VMSTATE_UINT16(tx_fifo[1], PL022State),
VMSTATE_UINT16(rx_fifo[1], PL022State),
VMSTATE_UINT16(tx_fifo[2], PL022State),
VMSTATE_UINT16(rx_fifo[2], PL022State),
VMSTATE_UINT16(tx_fifo[3], PL022State),
VMSTATE_UINT16(rx_fifo[3], PL022State),
VMSTATE_UINT16(tx_fifo[4], PL022State),
VMSTATE_UINT16(rx_fifo[4], PL022State),
VMSTATE_UINT16(tx_fifo[5], PL022State),
VMSTATE_UINT16(rx_fifo[5], PL022State),
VMSTATE_UINT16(tx_fifo[6], PL022State),
VMSTATE_UINT16(rx_fifo[6], PL022State),
VMSTATE_UINT16(tx_fifo[7], PL022State),
VMSTATE_UINT16(rx_fifo[7], PL022State),

There is a way to covert to Fifo while maintaining full backwards
compat (in much the same way as I did for serial), although as you can
imagine, it will be messy. Is it worth the VMSD version bump to get it
cleaned up?

Regards,
Peter



Re: [Qemu-devel] [PATCH v3 21/26] tcg-aarch64: Introduce tcg_out_insn_3507

2014-04-09 Thread Claudio Fontana
On 03.04.2014 21:56, Richard Henderson wrote:
 Cleaning up the implementation of REV and REV16 at the same time.
 
 Reviewed-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Richard Henderson r...@twiddle.net
 ---
  tcg/aarch64/tcg-target.c | 22 ++
  1 file changed, 14 insertions(+), 8 deletions(-)

During testing I found this patch causes a regression for big endian targets 
(sparc).

Can you take a look?
I think it might be related to the extended form of the REV instruction needing
an additional 0x400. See below.

 
 diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c
 index caaf8a2..de7490d 100644
 --- a/tcg/aarch64/tcg-target.c
 +++ b/tcg/aarch64/tcg-target.c
 @@ -327,6 +327,10 @@ typedef enum {
  I3506_CSEL  = 0x1a80,
  I3506_CSINC = 0x1a800400,
  
 +/* Data-processing (1 source) instructions.  */
 +I3507_REV16 = 0x5ac00400,
 +I3507_REV   = 0x5ac00800,
 +
  /* Data-processing (2 source) instructions.  */
  I3508_LSLV  = 0x1ac02000,
  I3508_LSRV  = 0x1ac02400,
 @@ -545,6 +549,12 @@ static void tcg_out_insn_3506(TCGContext *s, AArch64Insn 
 insn, TCGType ext,
| tcg_cond_to_aarch64[c]  12);
  }
  
 +static void tcg_out_insn_3507(TCGContext *s, AArch64Insn insn, TCGType ext,
 +  TCGReg rd, TCGReg rn)
 +{
 +tcg_out32(s, insn | ext  31 | rn  5 | rd);
 +}
 +
  static void tcg_out_insn_3509(TCGContext *s, AArch64Insn insn, TCGType ext,
TCGReg rd, TCGReg rn, TCGReg rm, TCGReg ra)
  {
 @@ -961,19 +971,15 @@ static void tcg_out_brcond(TCGContext *s, TCGMemOp ext, 
 TCGCond c, TCGArg a,
  }
  
  static inline void tcg_out_rev(TCGContext *s, TCGType ext,
 -   TCGReg rd, TCGReg rm)
 +   TCGReg rd, TCGReg rn)
  {
 -/* using REV 0x5ac00800 */
 -unsigned int base = ext ? 0xdac00c00 : 0x5ac00800;

see the extended form 0xdac00c00 -


 -tcg_out32(s, base | rm  5 | rd);
 +tcg_out_insn(s, 3507, REV, ext, rd, rn);
  }
  
  static inline void tcg_out_rev16(TCGContext *s, TCGType ext,
 - TCGReg rd, TCGReg rm)
 + TCGReg rd, TCGReg rn)
  {
 -/* using REV16 0x5ac00400 */
 -unsigned int base = ext ? 0xdac00400 : 0x5ac00400;
 -tcg_out32(s, base | rm  5 | rd);
 +tcg_out_insn(s, 3507, REV16, ext, rd, rn);

while this does not have it.

  }
  
  static inline void tcg_out_sxt(TCGContext *s, TCGType ext, TCGMemOp s_bits,
 

Ciao,

Claudio




Re: [Qemu-devel] [PATCH v4 1/1] Make qemu_peek_buffer loop until it gets it's data

2014-04-09 Thread Dr. David Alan Gilbert
*  (chenliang0...@icloud.com) wrote:
 
  *  (chenliang0...@icloud.com) wrote:
  
  *  (chenliang0...@icloud.com) wrote:
  
  ?? 2014??4??810:29??Dr. David Alan Gilbert (git) 
  dgilb...@redhat.com ??
  
  From: Dr. David Alan Gilbert dgilb...@redhat.com
  
  Make qemu_peek_buffer repeatedly call fill_buffer until it gets
  all the data it requires, or until there is an error.
  
  At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there
  isn't enough data waiting, however the kernel is entitled to return
  just a few bytes, and still leave qemu_peek_buffer with less bytes
  than it needed.  I've seen this fail in a dev world, and I think it
  could theoretically fail in the peeking of the subsection headers in
  the current world.
  hmm, I also have got some errors(infrequently). Maybe It is one point.
  Could you show some messages about the error?
  
  I've only seen the errors in my visitor/ber world where I use the 
  peek_buffer
  a lot more; but the one place it is used in the existing code is in
  the code to check if we have the start of a subsection; if that goes wrong
  I'm not sure what error would be produced.
  
  Dave
  In my observation, error is more likely to happen if migration thread is 
  blocked too long.
  BTW, xbzrle and auto convergence are disable.
  
  What error do you see?
  
  Dave
 It likes section id 1 load failed at dest side.

Our migration errors aren't particularly descriptive :-(
However, try it with this patch - I'd be interested if it makes
your problem go away.

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [libvirt] [PATCH 1/1] Enable QEMU_CAPS_PCI_MULTIBUS capability for QEMU2.0 forward.

2014-04-09 Thread Eric Blake
[adding qemu]

On 04/08/2014 11:36 PM, Li Zhang wrote:
 On 2014年04月09日 11:20, Eric Blake wrote:
 On 04/08/2014 08:03 PM, Li Zhang wrote:
 From: Li Zhang zhlci...@linux.vnet.ibm.com

 For QEMU2.0 forward version, it supports PCI multiBUS.
 Currently, libvirt still disables it which causes an error
 Bus 'pci' not found.

 Signed-off-by: Li Zhang zhlci...@linux.vnet.ibm.com
 ---
   src/qemu/qemu_capabilities.c | 3 +++
   1 file changed, 3 insertions(+)

 diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
 index 2c8ec10..b49398f 100644
 --- a/src/qemu/qemu_capabilities.c
 +++ b/src/qemu/qemu_capabilities.c
 @@ -3019,6 +3019,9 @@ virQEMUCapsInitQMPMonitor(virQEMUCapsPtr qemuCaps,
   if (qemuCaps-version = 1006000)
   virQEMUCapsSet(qemuCaps, QEMU_CAPS_DEVICE_VIDEO_PRIMARY);
   +if (qemuCaps-version = 200)
 +virQEMUCapsSet(qemuCaps, QEMU_CAPS_PCI_MULTIBUS);
 +
 This is a version number check, which is bad.  Is there a QMP command we
 can issue that gives a more reliable answer of whether the feature is
 present?

 Hi Eric,
 
 I can't find any QMP command for MULTIBUS in QEMU.
 There is only one query-pci to list PCI bus and devices information.

Maybe it's my fault for not being more vocal when the bus rename was
being proposed on the qemu list, but libvirt really does want a way to
probe via QMP whether the 'pci.0' (multibus) vs. 'pci' naming should be
used for a given machine.  Is there an existing command we can use, or
is this something that we need to fix early on in qemu 2.1 and backport
to 2.0-stable?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] bochs: Fix catalog size check

2014-04-09 Thread Laszlo Ersek
On 04/09/14 13:14, Kevin Wolf wrote:
 The old check was off by a factor of 512 and didn't consider cases where
 we don't get an exact division. This could lead to an out-of-bounds
 array access in seek_to_sector().
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  block/bochs.c  | 14 +++---
  tests/qemu-iotests/078 |  6 +-
  tests/qemu-iotests/078.out |  6 --
  3 files changed, 20 insertions(+), 6 deletions(-)
 
 diff --git a/block/bochs.c b/block/bochs.c
 index 50b84a9..eacf956 100644
 --- a/block/bochs.c
 +++ b/block/bochs.c
 @@ -148,8 +148,14 @@ static int bochs_open(BlockDriverState *bs, QDict 
 *options, int flags,
  s-extent_blocks = 1 + (le32_to_cpu(bochs.extent) - 1) / 512;
  
  s-extent_size = le32_to_cpu(bochs.extent);
 -if (s-extent_size == 0) {
 -error_setg(errp, Extent size may not be zero);
 +if (s-extent_size  BDRV_SECTOR_SIZE) {
 +/* bximage actually never creates extents smaller than 4k */
 +error_setg(errp, Extent size must be at least 512);
 +ret = -EINVAL;
 +goto fail;
 +} else if (!is_power_of_2(s-extent_size)) {
 +error_setg(errp, Extent size % PRIu32  is not a power of two,
 +   s-extent_size);
  ret = -EINVAL;
  goto fail;
  } else if (s-extent_size  0x80) {

OK, so this is based on your other

  [PATCH] bochs: Fix memory leak in bochs_open() error path

; good thus far.

 @@ -159,7 +165,9 @@ static int bochs_open(BlockDriverState *bs, QDict 
 *options, int flags,
  goto fail;
  }
  
 -if (s-catalog_size  bs-total_sectors / s-extent_size) {
 +if (s-catalog_size  DIV_ROUND_UP(bs-total_sectors,
 +   s-extent_size / BDRV_SECTOR_SIZE))
 +{
  error_setg(errp, Catalog size is too small for this disk size);
  ret = -EINVAL;
  goto fail;

Nice! If I may voice my liking.

Reviewed-by: Laszlo Ersek ler...@redhat.com

Thanks
Laszlo

 diff --git a/tests/qemu-iotests/078 b/tests/qemu-iotests/078
 index 872e734..d4d6da7 100755
 --- a/tests/qemu-iotests/078
 +++ b/tests/qemu-iotests/078
 @@ -69,10 +69,14 @@ _use_sample_img empty.bochs.bz2
  poke_file $TEST_IMG $disk_size_offset \x00\xc0\x0f\x00\x00\x00\x00\x7f
  { $QEMU_IO -c read 2T 4k $TEST_IMG; } 21 | _filter_qemu_io | 
 _filter_testdir
  
 +_use_sample_img empty.bochs.bz2
 +poke_file $TEST_IMG $catalog_size_offset \x10\x00\x00\x00
 +{ $QEMU_IO -c read 0xfbe00 512 $TEST_IMG; } 21 | _filter_qemu_io | 
 _filter_testdir
 +
  echo
  echo == Negative extent size ==
  _use_sample_img empty.bochs.bz2
 -poke_file $TEST_IMG $extent_size_offset \xff\xff\xff\xff
 +poke_file $TEST_IMG $extent_size_offset \x00\x00\x00\x80
  { $QEMU_IO -c read 768k 512 $TEST_IMG; } 21 | _filter_qemu_io | 
 _filter_testdir
  
  echo
 diff --git a/tests/qemu-iotests/078.out b/tests/qemu-iotests/078.out
 index ea95ffd..ca18d2e 100644
 --- a/tests/qemu-iotests/078.out
 +++ b/tests/qemu-iotests/078.out
 @@ -15,12 +15,14 @@ no file open, try 'help open'
  == Too small catalog bitmap for image size ==
  qemu-io: can't open device TEST_DIR/empty.bochs: Catalog size is too small 
 for this disk size
  no file open, try 'help open'
 +qemu-io: can't open device TEST_DIR/empty.bochs: Catalog size is too small 
 for this disk size
 +no file open, try 'help open'
  
  == Negative extent size ==
 -qemu-io: can't open device TEST_DIR/empty.bochs: Extent size 4294967295 is 
 too large
 +qemu-io: can't open device TEST_DIR/empty.bochs: Extent size 2147483648 is 
 too large
  no file open, try 'help open'
  
  == Zero extent size ==
 -qemu-io: can't open device TEST_DIR/empty.bochs: Extent size may not be zero
 +qemu-io: can't open device TEST_DIR/empty.bochs: Extent size must be at 
 least 512
  no file open, try 'help open'
  *** done
 




Re: [Qemu-devel] PL022 VMSD versions bump

2014-04-09 Thread Peter Maydell
On 9 April 2014 13:42, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote:
 Hi Peter,

 Im looking to do some cleanup around pl022 and I would like to use it
 as my lead example of code using the new generic FIFO (as I think I
 have test cases handy for it). But the VMSD description for this
 device is a bit ugly as it is:

This will just be because it's a transliteration of a previous
save/load function implementation.

 There is a way to covert to Fifo while maintaining full backwards
 compat (in much the same way as I did for serial), although as you can
 imagine, it will be messy. Is it worth the VMSD version bump to get it
 cleaned up?

A version bump is fine with me.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 2/2] pci-assign: Fix memory out of bound when MSI-X table not fit in a single page

2014-04-09 Thread Michael S. Tsirkin
On Wed, Apr 09, 2014 at 10:56:57AM +, Gonglei (Arei) wrote:
 Hi,
 
   QEMU only mmap MSIX_PAGE_SIZE memory for all pci devices in
   assigned_dev_register_msix_mmio(), meanwhile the set the one
   page memmory to zero, so the rest memory will be random value
   (maybe etnry.data is not 0). In the assigned_dev_update_msix_mmio()
   maybe occur the issue of entry_nr  256, and the kmod reports
   the EINVAL error.
  
   Signed-off-by: Gonglei arei.gong...@huawei.com
  
  Okay so this kind of works because guest does not try
  to use so many vectors.
  
 Yes. It's true.
 
  But I think it's better not to give guest more entries
  than we can actually support.
  
  How about tweaking MSIX capability exposed to guest to limit table size?
  
 IMHO, the MSIX table entry size got form the physic NIC hardware. The software
 layer shouldn't tweak the capability of real hardware, neither QEMU nor KVM. 
 
 
 Best regards,
 -Gonglei

Should be easy to fix though. Does the following help?

--

pci-assign: limit # of msix vectors


Signed-off-by: Michael S. Tsirkin m...@redhat.com


diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index a825871..76aa86e 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1258,6 +1258,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev)
 if (pos != 0  kvm_device_msix_supported(kvm_state)) {
 int bar_nr;
 uint32_t msix_table_entry;
+uint16_t msix_max;
 
 if (!check_irqchip_in_kernel()) {
 return -ENOTSUP;
@@ -1269,9 +1270,10 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev)
 }
 pci_dev-msix_cap = pos;
 
-pci_set_word(pci_dev-config + pos + PCI_MSIX_FLAGS,
- pci_get_word(pci_dev-config + pos + PCI_MSIX_FLAGS) 
- PCI_MSIX_FLAGS_QSIZE);
+msix_max = (pci_get_word(pci_dev-config + pos + PCI_MSIX_FLAGS) 
+PCI_MSIX_FLAGS_QSIZE) + 1;
+msix_max = MIN(msix_max, KVM_MAX_MSIX_PER_DEV);
+pci_set_word(pci_dev-config + pos + PCI_MSIX_FLAGS, msix_max - 1);
 
 /* Only enable and function mask bits are writable */
 pci_set_word(pci_dev-wmask + pos + PCI_MSIX_FLAGS,
@@ -1281,9 +1283,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev)
 bar_nr = msix_table_entry  PCI_MSIX_FLAGS_BIRMASK;
 msix_table_entry = ~PCI_MSIX_FLAGS_BIRMASK;
 dev-msix_table_addr = pci_region[bar_nr].base_addr + msix_table_entry;
-dev-msix_max = pci_get_word(pci_dev-config + pos + PCI_MSIX_FLAGS);
-dev-msix_max = PCI_MSIX_FLAGS_QSIZE;
-dev-msix_max += 1;
+dev-msix_max = msix_max;
 }
 
 /* Minimal PM support, nothing writable, device appears to NAK changes */



Re: [Qemu-devel] [PATCH v4 2/5] kvm: add kvm_enable_cap_{vm,vcpu}

2014-04-09 Thread Alexander Graf


On 09.04.14 13:34, Cornelia Huck wrote:

Provide helper functions for enabling capabilities (on a vcpu and on a vm).

Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com


I think it makes sense to convert all ENABLE_CAP callers of the code 
base to this helper as well. Once you do that, you'll realize that we 
may want to have additional argument parameters ;).



Alex


---
  include/sysemu/kvm.h |4 
  kvm-all.c|   19 ++-
  2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 0bee1e8..d89911c 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -294,6 +294,10 @@ bool kvm_arch_stop_on_emulation_error(CPUState *cpu);
  
  int kvm_check_extension(KVMState *s, unsigned int extension);
  
+int kvm_enable_cap_vm(KVMState *s, unsigned int capability);

+
+int kvm_enable_cap_vcpu(CPUState *cpu, unsigned int capability);
+
  uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function,
uint32_t index, int reg);
  
diff --git a/kvm-all.c b/kvm-all.c

index cd4111d..c32eeff 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -501,7 +501,24 @@ int kvm_check_extension(KVMState *s, unsigned int 
extension)
  return ret;
  }
  
-static int kvm_set_ioeventfd_mmio(int fd, hwaddr addr, uint32_t val,

+int kvm_enable_cap_vm(KVMState *s, unsigned int capability)
+{
+struct kvm_enable_cap cap = {};
+
+cap.cap = capability;
+return kvm_vm_ioctl(s, KVM_ENABLE_CAP, cap);
+}
+
+int kvm_enable_cap_vcpu(CPUState *cpu, unsigned int capability)
+{
+struct kvm_enable_cap cap = {};
+
+cap.cap = capability;
+return kvm_vcpu_ioctl(cpu, KVM_ENABLE_CAP, cap);
+}
+
+
+static int kvm_set_ioeventfd_mmio(int fd, uint32_t addr, uint32_t val,
bool assign, uint32_t size, bool datamatch)
  {
  int ret;





Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.

2014-04-09 Thread Alexander Graf


On 09.04.14 13:34, Cornelia Huck wrote:

Register an I/O adapter interrupt source for when virtio-ccw devices start
using adapter interrupts.

Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
  hw/intc/s390_flic.c   |   59 +
  hw/s390x/css.c|   51 ++
  hw/s390x/css.h|4 
  hw/s390x/virtio-ccw.c |4 
  hw/s390x/virtio-ccw.h |1 +
  target-s390x/cpu.h|   33 +++
  6 files changed, 152 insertions(+)

diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c
index b2ef3e3..c033c8a 100644
--- a/hw/intc/s390_flic.c
+++ b/hw/intc/s390_flic.c
@@ -21,6 +21,11 @@
  #define FLIC_FAILED (-1UL)
  #define FLIC_SAVEVM_VERSION 1
  
+static KVMS390FLICState *s390_get_flic(void)

+{
+return KVM_S390_FLIC(object_resolve_path(/machine/s390-flic, NULL));
+}
+
  void s390_flic_init(void)
  {
  DeviceState *dev;
@@ -148,6 +153,60 @@ static int __get_all_irqs(KVMS390FLICState *flic,
  return r;
  }
  
+int kvm_s390_register_io_adapter(uint32_t id, uint8_t isc, bool swap,

+ bool is_maskable)
+{
+struct kvm_s390_io_adapter adapter = {
+.id = id,
+.isc = isc,
+.maskable = is_maskable,
+.swap = swap,
+};
+KVMS390FLICState *flic = s390_get_flic();
+int r, ret;
+struct kvm_device_attr attr = {
+.group = KVM_DEV_FLIC_ADAPTER_REGISTER,
+.addr = (uint64_t)adapter,
+};
+
+if (!flic) {
+return -ENOSYS;
+}
+if (!kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING)) {
+return -ENOSYS;
+}
+
+r = ioctl(flic-fd, KVM_SET_DEVICE_ATTR, attr);
+
+ret = r ? -errno : 0;
+return ret;
+}
+
+int kvm_s390_io_adapter_map(uint32_t id, uint64_t map_addr, bool do_map)
+{
+struct kvm_s390_io_adapter_req req = {
+.id = id,
+.type = do_map ? KVM_S390_IO_ADAPTER_MAP : KVM_S390_IO_ADAPTER_UNMAP,
+.addr = map_addr,
+};
+KVMS390FLICState *flic = s390_get_flic();
+struct kvm_device_attr attr = {
+.group = KVM_DEV_FLIC_ADAPTER_MODIFY,
+.addr = (uint64_t)req,
+};
+int r;
+
+if (!flic) {
+return -ENOSYS;
+}
+if (!kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING)) {
+return -ENOSYS;
+}
+
+r = ioctl(flic-fd, KVM_SET_DEVICE_ATTR, attr);
+return r ? -errno : 0;
+}
+
  /**
   * kvm_flic_save - Save pending floating interrupts
   * @f: QEMUFile containing migration state
diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index 7074d2b..a6d173f 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -39,6 +39,13 @@ typedef struct CssImage {
  ChpInfo chpids[MAX_CHPID + 1];
  } CssImage;
  
+typedef struct IoAdapter {

+uint32_t id;
+uint8_t type;
+uint8_t isc;
+QTAILQ_ENTRY(IoAdapter) sibling;
+} IoAdapter;
+
  typedef struct ChannelSubSys {
  QTAILQ_HEAD(, CrwContainer) pending_crws;
  bool do_crw_mchk;
@@ -49,6 +56,7 @@ typedef struct ChannelSubSys {
  uint64_t chnmon_area;
  CssImage *css[MAX_CSSID + 1];
  uint8_t default_cssid;
+QTAILQ_HEAD(, IoAdapter) io_adapters;
  } ChannelSubSys;
  
  static ChannelSubSys *channel_subsys;

@@ -69,6 +77,48 @@ int css_create_css_image(uint8_t cssid, bool default_image)
  return 0;
  }
  
+int css_register_io_adapter(uint8_t type, uint8_t isc, bool swap,

+bool maskable, uint32_t *id)
+{
+IoAdapter *adapter;
+bool found = false;
+int ret;
+
+*id = 0;
+QTAILQ_FOREACH(adapter, channel_subsys-io_adapters, sibling) {
+if ((adapter-type == type)  (adapter-isc == isc)) {
+*id = adapter-id;
+found = true;
+ret = 0;
+break;
+}
+if (adapter-id = *id) {
+*id = adapter-id + 1;
+}
+}
+if (found) {
+goto out;
+}
+adapter = g_new0(IoAdapter, 1);
+ret = s390_register_io_adapter(*id, isc, swap, maskable);
+if (ret == -ENOSYS) {
+/* Keep adapter even if we didn't register it anywhere. */
+ret = 0;
+}
+if (ret == 0) {
+adapter-id = *id;
+adapter-isc = isc;
+adapter-type = type;
+QTAILQ_INSERT_TAIL(channel_subsys-io_adapters, adapter, sibling);
+} else {
+g_free(adapter);
+fprintf(stderr, Unexpected error %d when registering adapter %d\n,
+ret, *id);
+}
+out:
+return ret;
+}
+
  uint16_t css_build_subchannel_id(SubchDev *sch)
  {
  if (channel_subsys-max_cssid  0) {
@@ -1239,6 +1289,7 @@ static void css_init(void)
  channel_subsys-do_crw_mchk = true;
  channel_subsys-crws_lost = false;
  channel_subsys-chnmon_active = false;
+QTAILQ_INIT(channel_subsys-io_adapters);
  }
  machine_init(css_init);
  
diff --git a/hw/s390x/css.h b/hw/s390x/css.h

index e9b4405..380e8e7 

Re: [Qemu-devel] [PATCH 2/2] pci-assign: Fix memory out of bound when MSI-X table not fit in a single page

2014-04-09 Thread Laszlo Ersek
On 04/03/14 07:18, arei.gong...@huawei.com wrote:
 From: Gonglei arei.gong...@huawei.com
 
 QEMU only mmap MSIX_PAGE_SIZE memory for all pci devices in
 assigned_dev_register_msix_mmio(), meanwhile the set the one
 page memmory to zero, so the rest memory will be random value
 (maybe etnry.data is not 0). In the assigned_dev_update_msix_mmio()
 maybe occur the issue of entry_nr  256, and the kmod reports
 the EINVAL error.
 
 Signed-off-by: Gonglei arei.gong...@huawei.com
 ---
  hw/i386/kvm/pci-assign.c | 15 ---
  1 file changed, 12 insertions(+), 3 deletions(-)
 
 diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
 index 570333f..d25a19e 100644
 --- a/hw/i386/kvm/pci-assign.c
 +++ b/hw/i386/kvm/pci-assign.c
 @@ -37,6 +37,7 @@
  #include hw/pci/pci.h
  #include hw/pci/msi.h
  #include kvm_i386.h
 +#include qemu/osdep.h
  
  #define MSIX_PAGE_SIZE 0x1000
  
 @@ -59,6 +60,9 @@
  #define DEBUG(fmt, ...)
  #endif
  
 +/* the msix-table size readed from pci device config */
 +static int msix_table_size;
 +
  typedef struct PCIRegion {
  int type;   /* Memory or port I/O */
  int valid;
 @@ -1604,7 +1608,12 @@ static void assigned_dev_msix_reset(AssignedDevice 
 *dev)
  
  static int assigned_dev_register_msix_mmio(AssignedDevice *dev)
  {
 -dev-msix_table = mmap(NULL, MSIX_PAGE_SIZE, PROT_READ|PROT_WRITE,
 +msix_table_size = ROUND_UP(dev-msix_max * sizeof(struct MSIXTableEntry),
 +MSIX_PAGE_SIZE);
 +
 +DEBUG(msix_table_size: 0x%x\n, msix_table_size);
 +
 +dev-msix_table = mmap(NULL, msix_table_size, PROT_READ|PROT_WRITE,
 MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
  if (dev-msix_table == MAP_FAILED) {
  error_report(fail allocate msix_table! %s, strerror(errno));
 @@ -1615,7 +1624,7 @@ static int 
 assigned_dev_register_msix_mmio(AssignedDevice *dev)
  assigned_dev_msix_reset(dev);
  
  memory_region_init_io(dev-mmio, OBJECT(dev), 
 assigned_dev_msix_mmio_ops,
 -  dev, assigned-dev-msix, MSIX_PAGE_SIZE);
 +  dev, assigned-dev-msix, msix_table_size);
  return 0;
  }
  
 @@ -1627,7 +1636,7 @@ static void 
 assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
  
  memory_region_destroy(dev-mmio);
  
 -if (munmap(dev-msix_table, MSIX_PAGE_SIZE) == -1) {
 +if (munmap(dev-msix_table, msix_table_size) == -1) {
  error_report(error unmapping msix_table! %s, strerror(errno));
  }
  dev-msix_table = NULL;
 

My only interest in this patch is RHBZ 616415. Namely, I have to improve
the propagation of human-readable errors in PCI device assignment
(through QMP as well). This patchset has hunks that look capable of
conflicting with my work (not started yet), so I think maybe I should
delay my work until this patchset is hashed out and applied.

Anyway, I just don't want to wait, so here are some review comments (I'm
not a PCI assignment expert by any means!):

(1) The patch introduces msix_table_size as an object with static
storage duration (ie. as a global variable). This is wrong; different
passthru devices will have different MSI-X table sizes. The patch causes
breakage as soon as two devices with different (rounded up) table sizes
are assigned, and then at least one of them is unassigned (because
assigned_dev_unregister_msix_mmio() will unmap either too much or too
little).

There are two ways to fix this:
- Make msix_table_size a *sibling* field of msix_table and
msix_max, in struct AssignedDevice. Or,
- Recompute the value of msix_table_size (to be introduced as a local
variable in all relevant functions) from dev-msix_max each time you
need it.

(2) The other problem is that not all uses of MSIX_PAGE_SIZE are
updated. Namely, assigned_dev_msix_reset() clears the first page, then
overwrites the first dev-msix_table entries (masking them). However, if
we've allocated at least two pages due to *inexact* division (ie.
rounding up), then the trailing portion of the last page continues to
contain garbage.

Now, this
(a) either matters to KVM (because it wants to have sensible values in
*all* entries that fit into the pages that it sees), or
(b) it doesn't (because KVM complies with entries_nr in
assigned_dev_update_msix_mmio(), which is bounded by adev-msix_max).

I don't know which one of (a) or (b) is the case. But, the code seems
incorrect (or at least misleading) in both cases:

In case (a), the memset() in assigned_dev_msix_reset() should be updated
so that it covers the entire allocation (all full pages, including the
last one).

In case (b), the memset() should be dropped from
assigned_dev_msix_reset(), because the loop just beneath it will
populate the entire set that KVM will look at.

Thanks
Laszlo



Re: [Qemu-devel] [PATCH 1/2] pci-assign: Fix a bug when map MSI-X table memory failed

2014-04-09 Thread Michael S. Tsirkin
On Thu, Apr 03, 2014 at 01:18:23PM +0800, arei.gong...@huawei.com wrote:
 From: Gonglei arei.gong...@huawei.com
 
 when map MSI-X table memory failed, the dev-msix_table not be
 set to NULL, the assigned_dev_unregister_msix_mmio() will case
 a segfault when munmap the failed dev-msix_table.
 
 Signed-off-by: Gonglei arei.gong...@huawei.com


Reviewed-by: Michael S. Tsirkin m...@redhat.com

 ---
  hw/i386/kvm/pci-assign.c | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
 index a825871..570333f 100644
 --- a/hw/i386/kvm/pci-assign.c
 +++ b/hw/i386/kvm/pci-assign.c
 @@ -1608,6 +1608,7 @@ static int 
 assigned_dev_register_msix_mmio(AssignedDevice *dev)
 MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
  if (dev-msix_table == MAP_FAILED) {
  error_report(fail allocate msix_table! %s, strerror(errno));
 +dev-msix_table = NULL;
  return -EFAULT;
  }
  
 -- 
 1.7.12.4
 



Re: [Qemu-devel] [PATCH v4 2/5] kvm: add kvm_enable_cap_{vm,vcpu}

2014-04-09 Thread Cornelia Huck
On Wed, 09 Apr 2014 15:58:55 +0200
Alexander Graf ag...@suse.de wrote:

 
 On 09.04.14 13:34, Cornelia Huck wrote:
  Provide helper functions for enabling capabilities (on a vcpu and on a vm).
 
  Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
  Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
 
 I think it makes sense to convert all ENABLE_CAP callers of the code 
 base to this helper as well. Once you do that, you'll realize that we 
 may want to have additional argument parameters ;).

OK, I'll do the additional conversions as a follow-up, so I can add a
good interface here :)




Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.

2014-04-09 Thread Cornelia Huck
On Wed, 09 Apr 2014 16:05:00 +0200
Alexander Graf ag...@suse.de wrote:

 
 On 09.04.14 13:34, Cornelia Huck wrote:
  Register an I/O adapter interrupt source for when virtio-ccw devices start
  using adapter interrupts.
 
  Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
  Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
  ---
hw/intc/s390_flic.c   |   59 
  +
hw/s390x/css.c|   51 ++
hw/s390x/css.h|4 
hw/s390x/virtio-ccw.c |4 
hw/s390x/virtio-ccw.h |1 +
target-s390x/cpu.h|   33 +++
6 files changed, 152 insertions(+)
 

  diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
  index 2bf0af8..1193682 100644
  --- a/hw/s390x/virtio-ccw.c
  +++ b/hw/s390x/virtio-ccw.c
  @@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
dev-thinint_isc = thinint-isc;
dev-ind_bit = thinint-ind_bit;
cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len);
  +ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO,
  +  dev-thinint_isc, true, 
  false,
  +  dev-adapter_id);
 
 In all other machines the machine file is the one creating the link 
 between a device and the interrupt controller. Can we do something 
 similar for s390?

Hm. This would imply we'd need to add a virtio I/O adapter for each isc
(0-7) at startup, regardless whether the guest enables adapter
interrupts on any of those iscs. Moreover, we'd need to do the same for
each type (on each isc) if we add more types of I/O adapters later
(should we want to support one of the other adapter-interrupt using
devices). I'd prefer to add an I/O adapter only when needed.




Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.

2014-04-09 Thread Alexander Graf


On 09.04.14 16:24, Cornelia Huck wrote:

On Wed, 09 Apr 2014 16:05:00 +0200
Alexander Graf ag...@suse.de wrote:


On 09.04.14 13:34, Cornelia Huck wrote:

Register an I/O adapter interrupt source for when virtio-ccw devices start
using adapter interrupts.

Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
   hw/intc/s390_flic.c   |   59 
+
   hw/s390x/css.c|   51 ++
   hw/s390x/css.h|4 
   hw/s390x/virtio-ccw.c |4 
   hw/s390x/virtio-ccw.h |1 +
   target-s390x/cpu.h|   33 +++
   6 files changed, 152 insertions(+)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 2bf0af8..1193682 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
   dev-thinint_isc = thinint-isc;
   dev-ind_bit = thinint-ind_bit;
   cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len);
+ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO,
+  dev-thinint_isc, true, false,
+  dev-adapter_id);

In all other machines the machine file is the one creating the link
between a device and the interrupt controller. Can we do something
similar for s390?

Hm. This would imply we'd need to add a virtio I/O adapter for each isc
(0-7) at startup, regardless whether the guest enables adapter
interrupts on any of those iscs. Moreover, we'd need to do the same for
each type (on each isc) if we add more types of I/O adapters later
(should we want to support one of the other adapter-interrupt using
devices). I'd prefer to add an I/O adapter only when needed.


I'm not sure I can follow you here. Instead of registering the interrupt 
vector on the fly, you would still register it on the fly, but after the 
virtio-ccw device got created, no?



Alex




[Qemu-devel] [PULL for-2.0 1/1] tests/acpi: update expected DSDT files

2014-04-09 Thread Michael S. Tsirkin
commit f2ccc311df55ec026a8f8ea9df998f26314f22b2
dsdt: tweak ACPI ID for hotplug resource device
changes the DSDT, update test expected files to match

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Reported-by: Igor Mammedov imamm...@redhat.com
---
 tests/acpi-test-data/pc/DSDT  | Bin 4485 - 4480 bytes
 tests/acpi-test-data/q35/DSDT | Bin 7383 - 7378 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/tests/acpi-test-data/pc/DSDT b/tests/acpi-test-data/pc/DSDT
index 
8b14a5f6f2012dc9bfd01f13931100f002e99391..d0bb3de79d2ff98736e8a54e6e8859ab4efeaf03
 100644
GIT binary patch
delta 45
zcmZowZcyfO33dr-5M*Fr+`Ex$AwQ$jTd;+_%#E9T!T5{Jv?1_94~OOOo~z7XT?9
B4Z8pU

delta 50
zcmZorZdK-T33dr-6=Yywe7lirAwQ%0Td;+MDzlJT!T5{Jv?1_9i0O_4GauS7$z?k
GxC;Pq#typx

diff --git a/tests/acpi-test-data/q35/DSDT b/tests/acpi-test-data/q35/DSDT
index 
a76ea9a418bea57c2d685b3a1b0221defd02b447..fc5b9700097d621ac76c59cf2596ead9d61c20a6
 100644
GIT binary patch
delta 45
zcmca^dC8K?CDk8k_-a_W8p?FHEBkt$%fJ|_%#E9T!T5{Jv?1_94~OOP1coJ1pqk4
B4IBUf

delta 50
zcmca)dEJuBCDk8x(ovYHC(xYSN7ElMSU`i0B0bxdwB@dw9C=Iywh8WRuFih5v
GSp@)qeGVJ|

-- 
MST




[Qemu-devel] [PULL for-2.0 0/1] acpi test: DSDT update

2014-04-09 Thread Michael S. Tsirkin
The following changes since commit efcc87d9aedb590b8506cd1a7c8abe557c760f9e:

  Update version for v2.0.0-rc2 release (2014-04-08 18:52:06 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 50329d3418c959ebce63c59d4c21473ac102d88f:

  tests/acpi: update expected DSDT files (2014-04-09 17:52:08 +0300)


acpi test: DSDT update

Not a must but as close to zero risk as one
can get: removes an annoying warning produced
when running make check.

No code changes so it really can't break anything ...

Signed-off-by: Michael S. Tsirkin m...@redhat.com


Michael S. Tsirkin (1):
  tests/acpi: update expected DSDT files

 tests/acpi-test-data/pc/DSDT  | Bin 4485 - 4480 bytes
 tests/acpi-test-data/q35/DSDT | Bin 7383 - 7378 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)




[Qemu-devel] [RFC v2 1/6] hw/arm/virt: add a xgmac device

2014-04-09 Thread Eric Auger
From: Kim Phillips kim.phill...@linaro.org

This is a hack and only serves as an example of what needs to be
done to make the next RFC - add vfio-platform support - work
for development purposes on a Calxeda Midway system.  We don't want
mach-virt to always create this ethernet device - DO NOT APPLY, etc.

Initial attempts to convince QEMU to create a memory mapped device
on the command line (e.g., -device vfio-platform,name=fff51000.ethernet)
would fail with Parameter 'driver' expects pluggable device type.
Any guidance as to how to overcome this apparent design limitation
is welcome.

RAM is reduced from 30 to 1GiB such as to not overlap the xgmac device's
physical address.  Not sure if the 30GiB RAM (or whatever the user sets
it to with -m) could be set up above 0x1__, but there is probably
extra work needed to resolve this type of conflict.

note: vfio-platform interrupt support development may want interrupt
property data filled; here it's omitted for the time being.

Not-signed-off-by: Kim Phillips kim.phill...@linaro.org
---
 hw/arm/virt.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 2bbc931..5d43cf0 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -65,6 +65,7 @@ enum {
 VIRT_GIC_CPU,
 VIRT_UART,
 VIRT_MMIO,
+VIRT_ETHERNET,
 };
 
 typedef struct MemMapEntry {
@@ -106,7 +107,8 @@ static const MemMapEntry a15memmap[] = {
 [VIRT_MMIO] = { 0xa00, 0x200 },
 /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
 /* 0x1000 .. 0x4000 reserved for PCI */
-[VIRT_MEM] = { 0x4000, 30ULL * 1024 * 1024 * 1024 },
+[VIRT_MEM] = { 0x4000, 1ULL * 1024 * 1024 * 1024 },
+[VIRT_ETHERNET] = { 0xfff51000, 0x1000 },
 };
 
 static const int a15irqmap[] = {
@@ -291,6 +293,25 @@ static void create_uart(const VirtBoardInfo *vbi, qemu_irq 
*pic)
 g_free(nodename);
 }
 
+static void create_ethernet(const VirtBoardInfo *vbi, qemu_irq *pic)
+{
+char *nodename;
+hwaddr base = vbi-memmap[VIRT_ETHERNET].base;
+hwaddr size = vbi-memmap[VIRT_ETHERNET].size;
+const char compat[] = calxeda,hb-xgmac;
+
+sysbus_create_simple(vfio-platform, base, NULL);
+
+nodename = g_strdup_printf(/ethernet@% PRIx64, base);
+qemu_fdt_add_subnode(vbi-fdt, nodename);
+
+/* Note that we can't use setprop_string because of the embedded NUL */
+qemu_fdt_setprop(vbi-fdt, nodename, compatible, compat, sizeof(compat));
+qemu_fdt_setprop_sized_cells(vbi-fdt, nodename, reg, 2, base, 2, size);
+
+g_free(nodename);
+}
+
 static void create_virtio_devices(const VirtBoardInfo *vbi, qemu_irq *pic)
 {
 int i;
@@ -425,6 +446,7 @@ static void machvirt_init(QEMUMachineInitArgs *args)
 }
 
 create_uart(vbi, pic);
+create_ethernet(vbi, pic);
 
 /* Create mmio transports, so the user can create virtio backends
  * (which will be automatically plugged in to the transports). If
-- 
1.8.3.2




[Qemu-devel] [RFC v2 3/6] vfio: add vfio-platform support

2014-04-09 Thread Eric Auger
From: Kim Phillips kim.phill...@linaro.org

Functions for which PCI and platform device support share are moved
into common.c.  The common vfio_{get,put}_group() get an additional
argument, a pointer to a vfio_reset_handler(), for which to pass on to
qemu_register_reset, but only if it exists (the platform device code
currently passes a NULL as its reset_handler).

For the platform device code, we basically use SysBusDevice
instead of PCIDevice.  Since realize() returns void, unlike
PCIDevice's initfn, error codes are moved into the
error message text with %m.

Currently only MMIO access is supported at this time.

The perceived path for future QEMU development is:

- add support for interrupts
- verify and test platform dev unmap path
- test existing PCI path for any regressions
- add support for creating platform devices on the qemu command line
  - currently device address specification is hardcoded for test
development on Calxeda Midway's fff51000.ethernet device
- reset is not supported and registration of reset functions is
  bypassed for platform devices.
  - there is no standard means of resetting a platform device,
unsure if it suffices to be handled at device--VFIO binding time

Signed-off-by: Kim Phillips kim.phill...@linaro.org

[1] http://www.spinics.net/lists/kvm-arm/msg08195.html
---
 hw/vfio/Makefile.objs |   2 +
 hw/vfio/common.c  | 486 ++
 hw/vfio/pci.c | 480 ++---
 hw/vfio/platform.c| 381 +++
 hw/vfio/vfio-common.h |  55 ++
 5 files changed, 937 insertions(+), 467 deletions(-)
 create mode 100644 hw/vfio/common.c
 create mode 100644 hw/vfio/platform.c
 create mode 100644 hw/vfio/vfio-common.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index 31c7dab..c5c76fe 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,3 +1,5 @@
 ifeq ($(CONFIG_LINUX), y)
+obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_SOFTMMU) += platform.o
 endif
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
new file mode 100644
index 000..9d1f723
--- /dev/null
+++ b/hw/vfio/common.c
@@ -0,0 +1,486 @@
+/*
+ * vfio based device assignment support
+ *
+ * Copyright Red Hat, Inc. 2012
+ *
+ * Authors:
+ *  Alex Williamson alex.william...@redhat.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on qemu-kvm device-assignment:
+ *  Adapted for KVM by Qumranet.
+ *  Copyright (c) 2007, Neocleus, Alex Novik (a...@neocleus.com)
+ *  Copyright (c) 2007, Neocleus, Guy Zana (g...@neocleus.com)
+ *  Copyright (C) 2008, Qumranet, Amit Shah (amit.s...@qumranet.com)
+ *  Copyright (C) 2008, Red Hat, Amit Shah (amit.s...@redhat.com)
+ *  Copyright (C) 2008, IBM, Muli Ben-Yehuda (m...@il.ibm.com)
+ */
+
+#include dirent.h
+#include linux/vfio.h
+#include sys/ioctl.h
+#include sys/mman.h
+#include sys/stat.h
+#include sys/types.h
+#include unistd.h
+
+#include config.h
+#include exec/address-spaces.h
+#include exec/memory.h
+#include hw/pci/msi.h
+#include hw/pci/msix.h
+#include hw/pci/pci.h
+#include qemu-common.h
+#include qemu/error-report.h
+#include qemu/event_notifier.h
+#include qemu/queue.h
+#include qemu/range.h
+#include sysemu/kvm.h
+#include sysemu/sysemu.h
+
+#include vfio-common.h
+
+#define DEBUG_VFIO
+#ifdef DEBUG_VFIO
+#define DPRINTF(fmt, ...) \
+do { fprintf(stderr, vfio:  fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+do { } while (0)
+#endif
+
+static QLIST_HEAD(, VFIOContainer)
+container_list = QLIST_HEAD_INITIALIZER(container_list);
+
+QLIST_HEAD(, VFIOGroup)
+group_list = QLIST_HEAD_INITIALIZER(group_list);
+
+
+struct VFIODevice;
+
+#ifdef CONFIG_KVM
+/*
+ * We have a single VFIO pseudo device per KVM VM.  Once created it lives
+ * for the life of the VM.  Closing the file descriptor only drops our
+ * reference to it and the device's reference to kvm.  Therefore once
+ * initialized, this file descriptor is only released on QEMU exit and
+ * we'll re-use it should another vfio device be attached before then.
+ */
+static int vfio_kvm_device_fd = -1;
+#endif
+
+/*
+ * DMA - Mapping and unmapping for the type1 IOMMU interface used on x86
+ */
+static int vfio_dma_unmap(VFIOContainer *container,
+  hwaddr iova, ram_addr_t size)
+{
+struct vfio_iommu_type1_dma_unmap unmap = {
+.argsz = sizeof(unmap),
+.flags = 0,
+.iova = iova,
+.size = size,
+};
+
+if (ioctl(container-fd, VFIO_IOMMU_UNMAP_DMA, unmap)) {
+DPRINTF(VFIO_UNMAP_DMA: %d\n, -errno);
+return -errno;
+}
+
+return 0;
+}
+
+static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
+ram_addr_t size, void *vaddr, bool readonly)
+{
+struct vfio_iommu_type1_dma_map map = {
+.argsz = 

[Qemu-devel] [RFC v2 5/6] virt: Assign a VFIO platform device to a virt VM in QEMU command line

2014-04-09 Thread Eric Auger
This patch aims at allowing the end-user to specify the device he
wants to directly assign to his virt VM in the QEMU command line.
The QEMU platform device becomes generic.

Current choice is to reuse the -device option.

For example when assigning Calxeda Midway xgmac device this option is
used:
-device vfio-platform,vfio_device=fff51000.ethernet,\
compat=calxeda/hb-xgmac,mmap-timeout-ms=1000

where
- fff51000.ethernet is the name of the device in
  /sys/bus/platform/devices/
- calxeda/hb-xgma is the compatibility where the standard coma
  separator is replaced by / since coma is specifically used by QEMU
  command line parser
- mmap-timeout-ms is minimal amount of time (ms) during which the IP
  register space stays MMIO mapped after an IRQ triggers in order to
  trap the end of interrupt (EOI). This is an optional parameter
  (default value set to 1100 ms).

virt machine was modified to interpret this line and automatically
- map the device at a chosen guest physical address in
  [0xa004000, 0x1000],
- map the device IRQs after 48,
- create the associated guest device tree with the provided
  compatibility.

The -device option underlying implementation is not standard
which can be argued. Indeed normaly it induces the call to the QEMU
device realize function once after the virtual machine init execution.
In this case QDEV mappings and device tree creation must happen.
Since virt is the place where the whole memory and IRQ mapping is
known and device tree is created, it was chosen to interpret the option
line there. This means the realize function is called twice, once in
virt.c and once after machine init return. The second call returns
immediatly since the QEMU device is recognized as already existing.
Another way to implement this would be to create a new option in QEMU.

Acknowledgements:
- a single compatibility currently is supported
- IRQ properties set in the device tree should be refined
- More generally devices with more complex device tree nodes must be
  studied and are not currently handled
- cases where multiple VFIO devices are assigned could not be tested

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/arm/virt.c  | 178 +++--
 hw/vfio/platform.c |  43 ++---
 2 files changed, 181 insertions(+), 40 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 31ae7d2..1fb66ef 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -40,6 +40,17 @@
 #include exec/address-spaces.h
 #include qemu/bitops.h
 #include qemu/error-report.h
+#include monitor/qdev.h
+#include qemu/config-file.h
+
+/*
+ * this function is implemented in vfio/platform.c
+ * it returns the name, compatibility, IRQ number and register set size.
+ * the function only is implemented for VFIO platform devices
+ */
+void vfio_get_props(SysBusDevice *s, char **pname, char **pcompat,
+int *pnum_irqs, size_t *psize);
+
 
 #define NUM_VIRTIO_TRANSPORTS 32
 
@@ -65,7 +76,7 @@ enum {
 VIRT_GIC_CPU,
 VIRT_UART,
 VIRT_MMIO,
-VIRT_ETHERNET,
+VIRT_VFIO,
 };
 
 typedef struct MemMapEntry {
@@ -79,7 +90,10 @@ typedef struct VirtBoardInfo {
 const char *qdevname;
 const char *gic_compatible;
 const MemMapEntry *memmap;
+qemu_irq pic[NUM_IRQS];
 const int *irqmap;
+hwaddr avail_vfio_base;
+int avail_vfio_irq;
 int smp_cpus;
 void *fdt;
 int fdt_size;
@@ -105,16 +119,16 @@ static const MemMapEntry a15memmap[] = {
 [VIRT_GIC_CPU] = { 0x8002000, 0x1000 },
 [VIRT_UART] = { 0x900, 0x1000 },
 [VIRT_MMIO] = { 0xa00, 0x200 },
+[VIRT_VFIO] = { 0xa004000, 0x0 }, /* size is dynamically populated */
 /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
 /* 0x1000 .. 0x4000 reserved for PCI */
-[VIRT_MEM] = { 0x4000, 1ULL * 1024 * 1024 * 1024 },
-[VIRT_ETHERNET] = { 0xfff41000, 0x1000 },
+[VIRT_MEM] = { 0x4000, 30ULL * 1024 * 1024 * 1024 },
 };
 
 static const int a15irqmap[] = {
 [VIRT_UART] = 1,
 [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
-[VIRT_ETHERNET] = 77,
+[VIRT_VFIO] = 48,
 };
 
 static VirtBoardInfo machines[] = {
@@ -266,7 +280,7 @@ static void fdt_add_gic_node(const VirtBoardInfo *vbi)
 qemu_fdt_setprop_cell(vbi-fdt, /intc, phandle, gic_phandle);
 }
 
-static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic)
+static void create_uart(const VirtBoardInfo *vbi)
 {
 char *nodename;
 hwaddr base = vbi-memmap[VIRT_UART].base;
@@ -275,7 +289,7 @@ static void create_uart(const VirtBoardInfo *vbi, qemu_irq 
*pic)
 const char compat[] = arm,pl011\0arm,primecell;
 const char clocknames[] = uartclk\0apb_pclk;
 
-sysbus_create_simple(pl011, base, pic[irq]);
+sysbus_create_simple(pl011, base, vbi-pic[irq]);
 
 nodename = g_strdup_printf(/pl011@% PRIx64, base);
 qemu_fdt_add_subnode(vbi-fdt, nodename);
@@ -294,34 +308,133 @@ static void 

[Qemu-devel] [RFC v2 4/6] vfio: Add initial IRQ support in QEMU platform device

2014-04-09 Thread Eric Auger
This work is inspired of PCI INTx. The code was prepared to support
multiple IRQs but this was not tested at that stage. Similarly to what
is done on PCI, the device register space is RAM unmapped on IRQ hit
in order to trap the end of interrupt (EOI). On mmap timer hit, if the
EOI was trapped, the mmapping is restored. the physical interrupt is
unmasked on the first read/write with the MMIO register space.

Tested on Calxeda Midway xgmac which can be directly assigned to one
virt VM.

Acknowledgements:
- vfio device name, guest physical address and IRQ indexes are
  hardcoded. This will be improved in another patch
- currently tested on a single IRQ (xgmac main IRQ)
- a KVM patch is required to invalidate stage2 entries on RAM memory
  region destruction (see [PATCH] ARM: KVM: Handle IPA unmapping on
  memory region deletion)

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/arm/virt.c  |  13 ++-
 hw/vfio/platform.c | 316 +
 2 files changed, 304 insertions(+), 25 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 5d43cf0..31ae7d2 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -108,12 +108,13 @@ static const MemMapEntry a15memmap[] = {
 /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
 /* 0x1000 .. 0x4000 reserved for PCI */
 [VIRT_MEM] = { 0x4000, 1ULL * 1024 * 1024 * 1024 },
-[VIRT_ETHERNET] = { 0xfff51000, 0x1000 },
+[VIRT_ETHERNET] = { 0xfff41000, 0x1000 },
 };
 
 static const int a15irqmap[] = {
 [VIRT_UART] = 1,
 [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
+[VIRT_ETHERNET] = 77,
 };
 
 static VirtBoardInfo machines[] = {
@@ -299,8 +300,12 @@ static void create_ethernet(const VirtBoardInfo *vbi, 
qemu_irq *pic)
 hwaddr base = vbi-memmap[VIRT_ETHERNET].base;
 hwaddr size = vbi-memmap[VIRT_ETHERNET].size;
 const char compat[] = calxeda,hb-xgmac;
+int irqm = vbi-irqmap[VIRT_ETHERNET];
+int irqp = irqm+1;
+int irqlp = irqm+2;
 
-sysbus_create_simple(vfio-platform, base, NULL);
+sysbus_create_varargs(vfio-platform, base,
+  pic[irqm], pic[irqp], pic[irqlp], NULL);
 
 nodename = g_strdup_printf(/ethernet@% PRIx64, base);
 qemu_fdt_add_subnode(vbi-fdt, nodename);
@@ -308,6 +313,10 @@ static void create_ethernet(const VirtBoardInfo *vbi, 
qemu_irq *pic)
 /* Note that we can't use setprop_string because of the embedded NUL */
 qemu_fdt_setprop(vbi-fdt, nodename, compatible, compat, sizeof(compat));
 qemu_fdt_setprop_sized_cells(vbi-fdt, nodename, reg, 2, base, 2, size);
+qemu_fdt_setprop_cells(vbi-fdt, nodename, interrupts,
+0x0, irqm, 0x4,
+0x0, irqp, 0x4,
+0x0, irqlp, 0x4);
 
 g_free(nodename);
 }
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 138fb13..f148edd 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -24,6 +24,7 @@
 #include config.h
 #include exec/address-spaces.h
 #include exec/memory.h
+
 #include qemu-common.h
 #include qemu/error-report.h
 #include qemu/event_notifier.h
@@ -31,6 +32,7 @@
 #include qemu/range.h
 #include sysemu/kvm.h
 #include sysemu/sysemu.h
+
 #include hw/qdev-properties.h
 #include migration/vmstate.h
 #include hw/hw.h
@@ -41,12 +43,15 @@
 #define DEBUG_VFIO
 #ifdef DEBUG_VFIO
 #define DPRINTF(fmt, ...) \
-do { fprintf(stderr, vfio:  fmt, ## __VA_ARGS__); } while (0)
+do { fprintf(stderr, vfio: %s:  fmt, __func__, ## __VA_ARGS__); } \
+while (0)
 #else
 #define DPRINTF(fmt, ...) \
 do { } while (0)
 #endif
 
+#define PLATFORM_NUM_REGIONS 10
+
 /* Extra debugging, trap acceleration paths for more logging */
 #define VFIO_ALLOW_MMAP 1
 
@@ -63,16 +68,240 @@ typedef struct VFIORegion {
 uint8_t nr; /* cache the region number for debug */
 } VFIORegion;
 
+
+#define VFIO_INT_INTp 4
+
+typedef struct VFIOINTp {
+QLIST_ENTRY(VFIOINTp) next;
+EventNotifier interrupt; /* eventfd triggered on interrupt */
+EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
+qemu_irq qemuirq;
+struct VFIODevice *vdev; /* back pointer to device */
+bool pending; /* interrupt pending */
+bool kvm_accel; /* set when QEMU bypass through KVM enabled */
+uint8_t pin; /* index */
+uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
+QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
+} VFIOINTp;
+
+
+
 typedef struct VFIODevice {
 SysBusDevice sbdev;
 int fd;
 int num_regions;
-VFIORegion *regions;
+int num_irqs;
+int interrupt; /* type of the interrupt, might disappear */
+char *name;
+uint32_t mmap_timeout; /* mmap timeout value in ms */
+VFIORegion regions[PLATFORM_NUM_REGIONS];
 QLIST_ENTRY(VFIODevice) next;
 struct VFIOGroup *group;
-char *name;
+QLIST_HEAD(, VFIOINTp) intp_list;
 

[Qemu-devel] [RFC v2 2/6] vfio: move hw/misc/vfio.c to hw/vfio/pci.c

2014-04-09 Thread Eric Auger
From: Kim Phillips kim.phill...@linaro.org

This is done in preparation for the addition of VFIO platform
device support.

Signed-off-by: Kim Phillips kim.phill...@linaro.org
---
 LICENSE| 2 +-
 MAINTAINERS| 2 +-
 hw/Makefile.objs   | 1 +
 hw/misc/Makefile.objs  | 1 -
 hw/vfio/Makefile.objs  | 3 +++
 hw/{misc/vfio.c = vfio/pci.c} | 0
 6 files changed, 6 insertions(+), 3 deletions(-)
 create mode 100644 hw/vfio/Makefile.objs
 rename hw/{misc/vfio.c = vfio/pci.c} (100%)

diff --git a/LICENSE b/LICENSE
index da70e94..0e0b4b9 100644
--- a/LICENSE
+++ b/LICENSE
@@ -11,7 +11,7 @@ option) any later version.
 
 As of July 2013, contributions under version 2 of the GNU General Public
 License (and no later version) are only accepted for the following files
-or directories: bsd-user/, linux-user/, hw/misc/vfio.c, hw/xen/xen_pt*.
+or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*.
 
 3) The Tiny Code Generator (TCG) is released under the BSD license
(see license headers in files).
diff --git a/MAINTAINERS b/MAINTAINERS
index 7d17f83..9b5d569 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -605,7 +605,7 @@ F: hw/usb/*
 VFIO
 M: Alex Williamson alex.william...@redhat.com
 S: Supported
-F: hw/misc/vfio.c
+F: hw/vfio/*
 
 vhost
 M: Michael S. Tsirkin m...@redhat.com
diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index d178b65..b16dada 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -26,6 +26,7 @@ devices-dirs-$(CONFIG_SOFTMMU) += ssi/
 devices-dirs-$(CONFIG_SOFTMMU) += timer/
 devices-dirs-$(CONFIG_TPM) += tpm/
 devices-dirs-$(CONFIG_SOFTMMU) += usb/
+devices-dirs-$(CONFIG_SOFTMMU) += vfio/
 devices-dirs-$(CONFIG_VIRTIO) += virtio/
 devices-dirs-$(CONFIG_SOFTMMU) += watchdog/
 devices-dirs-$(CONFIG_SOFTMMU) += xen/
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index f674365..656570c 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -21,7 +21,6 @@ common-obj-$(CONFIG_MACIO) += macio/
 
 ifeq ($(CONFIG_PCI), y)
 obj-$(CONFIG_KVM) += ivshmem.o
-obj-$(CONFIG_LINUX) += vfio.o
 endif
 
 obj-$(CONFIG_REALVIEW) += arm_sysctl.o
diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
new file mode 100644
index 000..31c7dab
--- /dev/null
+++ b/hw/vfio/Makefile.objs
@@ -0,0 +1,3 @@
+ifeq ($(CONFIG_LINUX), y)
+obj-$(CONFIG_PCI) += pci.o
+endif
diff --git a/hw/misc/vfio.c b/hw/vfio/pci.c
similarity index 100%
rename from hw/misc/vfio.c
rename to hw/vfio/pci.c
-- 
1.8.3.2




[Qemu-devel] [RFC v2 6/6] vfio: add exit function and IRQ disable functions

2014-04-09 Thread Eric Auger
this patch brings misc improvements:
- improve comments
- remove interrupt field in VFIODevice
- add IRQ disable routines used by new exitfn function

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 hw/vfio/platform.c | 167 -
 1 file changed, 127 insertions(+), 40 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 8f30d41..c4a4286 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -40,7 +40,6 @@
 
 #include vfio-common.h
 
-#define DEBUG_VFIO
 #ifdef DEBUG_VFIO
 #define DPRINTF(fmt, ...) \
 do { fprintf(stderr, vfio: %s:  fmt, __func__, ## __VA_ARGS__); } \
@@ -69,7 +68,9 @@ typedef struct VFIORegion {
 } VFIORegion;
 
 
-#define VFIO_INT_INTp 4
+/*
+ * The IRQ structure inspired from PCI VFIOINTx
+ */
 
 typedef struct VFIOINTp {
 QLIST_ENTRY(VFIOINTp) next;
@@ -91,9 +92,8 @@ typedef struct VFIODevice {
 int fd;
 int num_regions;
 int num_irqs;
-int interrupt; /* type of the interrupt, might disappear */
 char *name;
-char *compat;
+char *compat; /* compatibility string */
 uint32_t mmap_timeout; /* mmap timeout value in ms */
 VFIORegion regions[PLATFORM_NUM_REGIONS];
 QLIST_ENTRY(VFIODevice) next;
@@ -101,9 +101,11 @@ typedef struct VFIODevice {
 QLIST_HEAD(, VFIOINTp) intp_list;
 } VFIODevice;
 
+
 /*
  * returns properties from a QEMU VFIO device such as
  * name, compatibility, num IRQs, size of the register set
+ * currently used by virt machine
  */
 void vfio_get_props(SysBusDevice *s, char **pname,
 char **pcompat, int *pnum_irqs, size_t *psize);
@@ -118,6 +120,20 @@ void vfio_get_props(SysBusDevice *s, char **pname,
  *psize = vdev-regions[0].size;
 }
 
+static void vfio_disable_irqindex(VFIODevice *vdev, int index)
+{
+struct vfio_irq_set irq_set = {
+.argsz = sizeof(irq_set),
+.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER,
+.index = index,
+.start = 0,
+.count = 0,
+};
+
+ioctl(vdev-fd, VFIO_DEVICE_SET_IRQS, irq_set);
+}
+
+
 
 static void vfio_unmask_intp(VFIODevice *vdev, int index)
 {
@@ -132,12 +148,17 @@ static void vfio_unmask_intp(VFIODevice *vdev, int index)
 ioctl(vdev-fd, VFIO_DEVICE_SET_IRQS, irq_set);
 }
 
-
+/*
+ * Checks whether the IRQ was EOI'ed. In the positive the fast path
+ * is restored (reg space is mmaped). In the negative the reg space
+ * stays as an MMIO region (ops) and the mmap timer is reprogrammed
+ * to check the same condition after mmap_timeout ms
+ */
 
 
 static void vfio_intp_mmap_enable(void *opaque)
 {
-VFIOINTp * intp = (VFIOINTp *)opaque;
+VFIOINTp *intp = (VFIOINTp *)opaque;
 VFIODevice *vdev = intp-vdev;
 
 if (intp-pending) {
@@ -153,6 +174,10 @@ static void vfio_intp_mmap_enable(void *opaque)
 }
 
 
+/*
+ * The fd handler
+ */
+
 
 static void vfio_intp_interrupt(void *opaque)
 {
@@ -176,11 +201,14 @@ static void vfio_intp_interrupt(void *opaque)
  */
 
 VFIORegion *region = vdev-regions[0];
+
+/* register space is unmapped to trap EOI */
 memory_region_set_enabled(region-mmap_mem, false);
 
+/* trigger the virtual IRQ */
 qemu_set_irq(intp-qemuirq, 1);
 
-/* schedule the mmap timer which will restote mmap path after EOI*/
+/* schedule the mmap timer which will restore mmap path after EOI*/
 if (intp-mmap_timeout) {
 timer_mod(intp-mmap_timer,
   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + intp-mmap_timeout);
@@ -200,13 +228,18 @@ static void vfio_irq_eoi(VFIODevice *vdev)
 if (intp-pending) {
 if (eoi_done) {
 DPRINTF(several IRQ pending simultaneously: \
- this is not a supported case yet\n);
+ this case is not tested yet\n);
 }
+
 DPRINTF(EOI IRQ #%d fd=%d\n,
 intp-pin, event_notifier_get_fd(intp-interrupt));
+
 intp-pending = false;
+
+/* deassert the virtual IRQ and unmask physical one */
 qemu_set_irq(intp-qemuirq, 0);
 vfio_unmask_intp(vdev, intp-pin);
+
 eoi_done = true;
 }
 }
@@ -217,22 +250,6 @@ static void vfio_irq_eoi(VFIODevice *vdev)
 
 
 
-#if 0
-static void vfio_list_intp(VFIODevice *vdev)
-{
-VFIOINTp *intp;
-int i = 0;
-QLIST_FOREACH(intp, vdev-intp_list, next) {
-DPRINTF(IRQ #%d\n, i);
-DPRINTF(- pin = %d\n, intp-pin);
-DPRINTF(- fd = %d\n, event_notifier_get_fd(intp-interrupt));
-DPRINTF(- pending = %d\n, (int)intp-pending);
-DPRINTF(- kvm_accel = %d\n, (int)intp-kvm_accel);
-i++;
-}
-}
-#endif
-
 static int vfio_enable_intp(VFIODevice *vdev, unsigned int index)
 {
 struct vfio_irq_set *irq_set; /* irq structure passed to vfio kernel */
@@ -242,8 +259,6 @@ static int vfio_enable_intp(VFIODevice *vdev, unsigned int 
index)
 int device = vdev-fd;
 

Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.

2014-04-09 Thread Cornelia Huck
On Wed, 09 Apr 2014 16:30:33 +0200
Alexander Graf ag...@suse.de wrote:

 
 On 09.04.14 16:24, Cornelia Huck wrote:
  On Wed, 09 Apr 2014 16:05:00 +0200
  Alexander Graf ag...@suse.de wrote:
 
  On 09.04.14 13:34, Cornelia Huck wrote:
  Register an I/O adapter interrupt source for when virtio-ccw devices start
  using adapter interrupts.
 
  Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
  Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
  ---
 hw/intc/s390_flic.c   |   59 
  +
 hw/s390x/css.c|   51 ++
 hw/s390x/css.h|4 
 hw/s390x/virtio-ccw.c |4 
 hw/s390x/virtio-ccw.h |1 +
 target-s390x/cpu.h|   33 +++
 6 files changed, 152 insertions(+)
 
  diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
  index 2bf0af8..1193682 100644
  --- a/hw/s390x/virtio-ccw.c
  +++ b/hw/s390x/virtio-ccw.c
  @@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 dev-thinint_isc = thinint-isc;
 dev-ind_bit = thinint-ind_bit;
 cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len);
  +ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO,
  +  dev-thinint_isc, true, 
  false,
  +  dev-adapter_id);
  In all other machines the machine file is the one creating the link
  between a device and the interrupt controller. Can we do something
  similar for s390?
  Hm. This would imply we'd need to add a virtio I/O adapter for each isc
  (0-7) at startup, regardless whether the guest enables adapter
  interrupts on any of those iscs. Moreover, we'd need to do the same for
  each type (on each isc) if we add more types of I/O adapters later
  (should we want to support one of the other adapter-interrupt using
  devices). I'd prefer to add an I/O adapter only when needed.
 
 I'm not sure I can follow you here. Instead of registering the interrupt 
 vector on the fly, you would still register it on the fly, but after the 
 virtio-ccw device got created, no?

You mean register-at-device-creation instead of
register-while-interpreting-ccw? We'd end up with the same problem: We
don't know which isc the guest wants to use for adapter interrupts at
that point in time, so we would need to register for all iscs. I don't
think that is what we want.




[Qemu-devel] Error propagation in generated visitors and command marshallers

2014-04-09 Thread Markus Armbruster
I stumbled over this while trying to purge error_is_set() from the code.


Here's how we commonly use the Error API:

Error *err = NULL;

foo(arg, err)
if (err) {
goto out;
}
bar(arg, err)
if (err) {
goto out;
}

This ensures that err is null on entry, both for foo() and for bar().
Many functions rely on that, like this:

void foo(ArgType arg, Error **errp)
{
if (frobnicate(arg)  0) {
error_setg(errp, Can't frobnicate);
// This asserts errp != NULL
}
}


Here's how some of our visitor code uses the Error API (for real code,
check out generated qmp-marshal.c):

Error *err = NULL;
QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args));
Visitor *v = qmp_input_get_visitor(mi);
char *foo = NULL;
char *bar = NULL;

visit_type_str(v, foo, foo, err);
visit_type_str(v, bar, bar, err);
if (err) {
goto out;
}

Unlike above, this may pass a non-null errp to the second
visit_type_str(), namely when the first one fails.

The visitor functions guard against that, like this:

void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp)
{
if (!error_is_set(errp)) {
v-type_str(v, obj, name, errp);
}
}

As discussed before, error_is_set() is almost almost wrong, fragile or
unclean.  What if errp is null?  Then we fail to stop visiting after an
error.

The function could be improved like this:

void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp)
{
assert(errp);
if (!*errp) {
v-type_str(v, obj, name, errp);
}
}


But: is it a good idea to have both patterns in the code?  Should we
perhaps use the common pattern for visiting, too?  Like this:

visit_type_str(v, foo, foo, err);
if (err) {
goto out;
}
visit_type_str(v, bar, bar, err);
if (err) {
goto out;
}

Then we can assume *errp is clear on function entry, like this:

void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp)
{
v-type_str(v, obj, name, errp);
}

Should execute roughly the same number of conditional branches.

Tedious repetition of if (err) goto out in the caller, but that's what
we do elsewhere, and unlike elsewhere, these one's are generated.

Opinions?



Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.

2014-04-09 Thread Alexander Graf


 Am 09.04.2014 um 17:35 schrieb Cornelia Huck cornelia.h...@de.ibm.com:
 
 On Wed, 09 Apr 2014 16:30:33 +0200
 Alexander Graf ag...@suse.de wrote:
 
 
 On 09.04.14 16:24, Cornelia Huck wrote:
 On Wed, 09 Apr 2014 16:05:00 +0200
 Alexander Graf ag...@suse.de wrote:
 
 On 09.04.14 13:34, Cornelia Huck wrote:
 Register an I/O adapter interrupt source for when virtio-ccw devices start
 using adapter interrupts.
 
 Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
 Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
 ---
   hw/intc/s390_flic.c   |   59 
 +
   hw/s390x/css.c|   51 ++
   hw/s390x/css.h|4 
   hw/s390x/virtio-ccw.c |4 
   hw/s390x/virtio-ccw.h |1 +
   target-s390x/cpu.h|   33 +++
   6 files changed, 152 insertions(+)
 
 diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
 index 2bf0af8..1193682 100644
 --- a/hw/s390x/virtio-ccw.c
 +++ b/hw/s390x/virtio-ccw.c
 @@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
   dev-thinint_isc = thinint-isc;
   dev-ind_bit = thinint-ind_bit;
   cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len);
 +ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO,
 +  dev-thinint_isc, true, 
 false,
 +  dev-adapter_id);
 In all other machines the machine file is the one creating the link
 between a device and the interrupt controller. Can we do something
 similar for s390?
 Hm. This would imply we'd need to add a virtio I/O adapter for each isc
 (0-7) at startup, regardless whether the guest enables adapter
 interrupts on any of those iscs. Moreover, we'd need to do the same for
 each type (on each isc) if we add more types of I/O adapters later
 (should we want to support one of the other adapter-interrupt using
 devices). I'd prefer to add an I/O adapter only when needed.
 
 I'm not sure I can follow you here. Instead of registering the interrupt 
 vector on the fly, you would still register it on the fly, but after the 
 virtio-ccw device got created, no?
 
 You mean register-at-device-creation instead of
 register-while-interpreting-ccw? We'd end up with the same problem: We
 don't know which isc the guest wants to use for adapter interrupts at
 that point in time, so we would need to register for all iscs. I don't
 think that is what we want.

Well, if we only assign a route to the interrupt delivery path, the guest can 
then configure that previously set up route to the respective isc, or am I 
missing something obvious?

Alex


[Qemu-devel] [RFC v2 0/6] KVM platform device passthrough

2014-04-09 Thread Eric Auger
This is an RFC about QEMU VFIO platform device implementation.
This work aims at enabling KVM platform device passthrough,
where a guest OS is directly assigned a platform device: meaning
it can access the register space without trapping, and receive IRQ
from the device. Also the system MMU downstream to the device is
programmed properly to make sure the device can only see the VM
assigned physical address space.

This work was tested on Calxeda Midway where one xgmac is assigned to
KVM host while the second one is assigned to the guest. The current
status is the guest is able to ping the gateway meaning the basic
functionality is achieved for this device.

Nethertheless some key work remains to be done:
- unbind/migration
- multi-instantiation testing
- multiple IRQ testing
- management of platform devices with more complex device tree node

- The 1st RFC patch works around the limitation of QEMU not being able
to create platform devices from the command line, and should only
be used for testing the second patch.  A discussion of how to address
the limitation is encouraged, however, in the context of the first patch.

- The 2d patch addresses A.Williamson's comment
to have the platform device code separated from the PCI device
code, by firstly moving the existing VFIO PCI code into its
own hw/vfio/.

- the 3d RFC patch achieves MMIO direct access to the device.

- the 4th RFC brings basic IRQ support

- the 5th RFC enables the QEMU end-user to pass the device he wants
  to assign to his guest in the command line.

- the last RFC brings some cleanup and paves the way for cleaner exit

Here are the instructions to test on a Calxeda Midway:

https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway

The code is based on today's: git://git.qemu.org/qemu.git

some patches need to be applied on host linux to correct issues
not yet upstreamed. They can be found on vfio-dev branch of
git://git.linaro.org/people/eric.auger/linux.git

- commit 997691b: enables unmapping stage2 entries on memory region deletion
- commit 731e308: corrects an in read/write function of vfio platform driver
- iommu/arm-smmu commit serie and especially bf5a852 fixing an issue with
  Calxeda Midway sMMU.

QEMU commits can be found on vfio-dev-integ
git://git.linaro.org/people/eric.auger/qemu.git

Best Regards

Eric


Eric Auger (3):
  vfio: Add initial IRQ support in QEMU platform device
  virt: Assign a VFIO platform device to a virt VM in QEMU command line
  vfio: add exit function and IRQ disable functions

Kim Phillips (3):
  hw/arm/virt: add a xgmac device
  vfio: move hw/misc/vfio.c to hw/vfio/pci.c
  vfio: add vfio-platform support

 LICENSE|   2 +-
 MAINTAINERS|   2 +-
 hw/Makefile.objs   |   1 +
 hw/arm/virt.c  | 161 -
 hw/misc/Makefile.objs  |   1 -
 hw/vfio/Makefile.objs  |   5 +
 hw/vfio/common.c   | 486 ++
 hw/{misc/vfio.c = vfio/pci.c} | 480 +-
 hw/vfio/platform.c | 765 +
 hw/vfio/vfio-common.h  |  55 +++
 10 files changed, 1480 insertions(+), 478 deletions(-)
 create mode 100644 hw/vfio/Makefile.objs
 create mode 100644 hw/vfio/common.c
 rename hw/{misc/vfio.c = vfio/pci.c} (89%)
 create mode 100644 hw/vfio/platform.c
 create mode 100644 hw/vfio/vfio-common.h

-- 
1.8.3.2




Re: [Qemu-devel] [PATCH v4 3/5] s390x: Add I/O adapter registration.

2014-04-09 Thread Cornelia Huck
On Wed, 9 Apr 2014 17:53:40 +0200
Alexander Graf ag...@suse.de wrote:

 
 
  Am 09.04.2014 um 17:35 schrieb Cornelia Huck cornelia.h...@de.ibm.com:
  
  On Wed, 09 Apr 2014 16:30:33 +0200
  Alexander Graf ag...@suse.de wrote:
  
  
  On 09.04.14 16:24, Cornelia Huck wrote:
  On Wed, 09 Apr 2014 16:05:00 +0200
  Alexander Graf ag...@suse.de wrote:
  
  On 09.04.14 13:34, Cornelia Huck wrote:
  Register an I/O adapter interrupt source for when virtio-ccw devices 
  start
  using adapter interrupts.
  
  Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
  Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
  ---
hw/intc/s390_flic.c   |   59 
  +
hw/s390x/css.c|   51 
  ++
hw/s390x/css.h|4 
hw/s390x/virtio-ccw.c |4 
hw/s390x/virtio-ccw.h |1 +
target-s390x/cpu.h|   33 +++
6 files changed, 152 insertions(+)
  
  diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
  index 2bf0af8..1193682 100644
  --- a/hw/s390x/virtio-ccw.c
  +++ b/hw/s390x/virtio-ccw.c
  @@ -522,6 +522,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
dev-thinint_isc = thinint-isc;
dev-ind_bit = thinint-ind_bit;
cpu_physical_memory_unmap(thinint, hw_len, 0, hw_len);
  +ret = css_register_io_adapter(CSS_IO_ADAPTER_VIRTIO,
  +  dev-thinint_isc, true, 
  false,
  +  dev-adapter_id);
  In all other machines the machine file is the one creating the link
  between a device and the interrupt controller. Can we do something
  similar for s390?
  Hm. This would imply we'd need to add a virtio I/O adapter for each isc
  (0-7) at startup, regardless whether the guest enables adapter
  interrupts on any of those iscs. Moreover, we'd need to do the same for
  each type (on each isc) if we add more types of I/O adapters later
  (should we want to support one of the other adapter-interrupt using
  devices). I'd prefer to add an I/O adapter only when needed.
  
  I'm not sure I can follow you here. Instead of registering the interrupt 
  vector on the fly, you would still register it on the fly, but after the 
  virtio-ccw device got created, no?
  
  You mean register-at-device-creation instead of
  register-while-interpreting-ccw? We'd end up with the same problem: We
  don't know which isc the guest wants to use for adapter interrupts at
  that point in time, so we would need to register for all iscs. I don't
  think that is what we want.
 
 Well, if we only assign a route to the interrupt delivery path, the guest can 
 then configure that previously set up route to the respective isc, or am I 
 missing something obvious?

The guest might want to use different iscs for different devices.
Currently, the first caller for an isc registers the adapter, further
users just use it. If we had all devices register for (say) isc 0, we'd
have one adapter for that. If the first guest devices uses isc 3, we
could switch this one to isc 3, but it would still be one adapter. If
now another guest device uses isc 7, we'd need to register a new
adapter for isc 7 anyway.




Re: [Qemu-devel] Error propagation in generated visitors and command marshallers

2014-04-09 Thread Eric Blake
On 04/09/2014 09:48 AM, Markus Armbruster wrote:
 I stumbled over this while trying to purge error_is_set() from the code.
 

 But: is it a good idea to have both patterns in the code?  Should we
 perhaps use the common pattern for visiting, too?  Like this:
 
 visit_type_str(v, foo, foo, err);
 if (err) {
 goto out;
 }
 visit_type_str(v, bar, bar, err);
 if (err) {
 goto out;
 }
 
 Then we can assume *errp is clear on function entry, like this:
 
 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 v-type_str(v, obj, name, errp);
 }
 
 Should execute roughly the same number of conditional branches.
 
 Tedious repetition of if (err) goto out in the caller, but that's what
 we do elsewhere, and unlike elsewhere, these one's are generated.
 
 Opinions?

Putting the tedium into the generated code is WHY we have generated
code; so that the rest of the code that is hand-written can be concise.
 I like this latter idea of letting the visitors assume that errp is
clean on entry with the caller responsible for checking err after each step.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] Error propagation in generated visitors and command marshallers

2014-04-09 Thread Anthony Liguori
On Wed, Apr 9, 2014 at 8:48 AM, Markus Armbruster arm...@redhat.com wrote:
 I stumbled over this while trying to purge error_is_set() from the code.


 Here's how we commonly use the Error API:

 Error *err = NULL;

 foo(arg, err)
 if (err) {
 goto out;
 }
 bar(arg, err)
 if (err) {
 goto out;
 }

 This ensures that err is null on entry, both for foo() and for bar().
 Many functions rely on that, like this:

 void foo(ArgType arg, Error **errp)
 {
 if (frobnicate(arg)  0) {
 error_setg(errp, Can't frobnicate);
 // This asserts errp != NULL
 }
 }


 Here's how some of our visitor code uses the Error API (for real code,
 check out generated qmp-marshal.c):

 Error *err = NULL;
 QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args));
 Visitor *v = qmp_input_get_visitor(mi);
 char *foo = NULL;
 char *bar = NULL;

 visit_type_str(v, foo, foo, err);
 visit_type_str(v, bar, bar, err);
 if (err) {
 goto out;
 }

 Unlike above, this may pass a non-null errp to the second
 visit_type_str(), namely when the first one fails.

 The visitor functions guard against that, like this:

 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 if (!error_is_set(errp)) {
 v-type_str(v, obj, name, errp);
 }
 }

 As discussed before, error_is_set() is almost almost wrong, fragile or
 unclean.  What if errp is null?  Then we fail to stop visiting after an
 error.

 The function could be improved like this:

 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 assert(errp);
 if (!*errp) {
 v-type_str(v, obj, name, errp);
 }
 }


 But: is it a good idea to have both patterns in the code?  Should we
 perhaps use the common pattern for visiting, too?  Like this:

 visit_type_str(v, foo, foo, err);
 if (err) {
 goto out;
 }
 visit_type_str(v, bar, bar, err);
 if (err) {
 goto out;
 }

 Then we can assume *errp is clear on function entry, like this:

 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 v-type_str(v, obj, name, errp);
 }

 Should execute roughly the same number of conditional branches.

 Tedious repetition of if (err) goto out in the caller, but that's what
 we do elsewhere, and unlike elsewhere, these one's are generated.

 Opinions?

The original visiting code was loosely based on ASN1 marshaling code
from Samba which used the if error, bail out at the top style of
error handling.

As use of Error has evolved in QEMU, I agree that the paradigm of
bail out as soon as you see an error and fail fast is better so I'd
vote for changing the generated code to do that.

Regards,

Anthony Liguori



[Qemu-devel] [PATCH V4 1/1] qmp: add pmemload command

2014-04-09 Thread Baojun Wang
Signed-off-by: Baojun Wang wan...@gmail.com
---
 cpus.c   | 29 +
 hmp-commands.hx  | 13 +
 hmp.c| 11 +++
 hmp.h|  1 +
 qapi-schema.json | 18 ++
 qmp-commands.hx  | 27 +++
 6 files changed, 99 insertions(+)

diff --git a/cpus.c b/cpus.c
index 1104d61..9617d43 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1467,6 +1467,35 @@ exit:
 fclose(f);
 }
 
+void qmp_pmemload(int64_t addr, int64_t size, const char *filename,
+  Error **errp)
+{
+FILE *f;
+uint32_t l;
+uint8_t buf[1024];
+
+f = fopen(filename, rb);
+if (!f) {
+error_setg_file_open(errp, errno, filename);
+return;
+}
+
+while (size != 0) {
+l = fread(buf, 1, sizeof(buf), f);
+if (l != sizeof(buf)) {
+error_set(errp, QERR_IO_ERROR);
+goto exit;
+}
+if (l  size)
+l = size;
+cpu_physical_memory_rw(addr, buf, l, 1);
+addr += l;
+size -= l;
+}
+
+exit:
+fclose(f);
+}
 void qmp_inject_nmi(Error **errp)
 {
 #if defined(TARGET_I386)
diff --git a/hmp-commands.hx b/hmp-commands.hx
index f3fc514..18604a6 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -809,6 +809,19 @@ save to disk physical memory dump starting at @var{addr} 
of size @var{size}.
 ETEXI
 
 {
+.name   = pmemload,
+.args_type  = val:l,size:i,filename:s,
+.params = addr size file,
+.help   = load from disk physical memory dump starting at 'addr' 
of size 'size',
+.mhandler.cmd = hmp_pmemload,
+},
+
+STEXI
+@item pmemload @var{addr} @var{size} @var{file}
+@findex pmemload
+load from disk physical memory dump starting at @var{addr} of size @var{size}.
+ETEXI
+{
 .name   = boot_set,
 .args_type  = bootdevice:s,
 .params = bootdevice,
diff --git a/hmp.c b/hmp.c
index 2f279c4..6e932f9 100644
--- a/hmp.c
+++ b/hmp.c
@@ -767,6 +767,17 @@ void hmp_pmemsave(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, errp);
 }
 
+void hmp_pmemload(Monitor *mon, const QDict *qdict)
+{
+uint32_t size = qdict_get_int(qdict, size);
+const char *filename = qdict_get_str(qdict, filename);
+uint64_t addr = qdict_get_int(qdict, val);
+Error *errp = NULL;
+
+qmp_pmemload(addr, size, filename, errp);
+hmp_handle_error(mon, errp);
+}
+
 void hmp_ringbuf_write(Monitor *mon, const QDict *qdict)
 {
 const char *chardev = qdict_get_str(qdict, device);
diff --git a/hmp.h b/hmp.h
index ed58f0e..f5f2a16 100644
--- a/hmp.h
+++ b/hmp.h
@@ -44,6 +44,7 @@ void hmp_system_powerdown(Monitor *mon, const QDict *qdict);
 void hmp_cpu(Monitor *mon, const QDict *qdict);
 void hmp_memsave(Monitor *mon, const QDict *qdict);
 void hmp_pmemsave(Monitor *mon, const QDict *qdict);
+void hmp_pmemload(Monitor *mon, const QDict *qdict);
 void hmp_ringbuf_write(Monitor *mon, const QDict *qdict);
 void hmp_ringbuf_read(Monitor *mon, const QDict *qdict);
 void hmp_cont(Monitor *mon, const QDict *qdict);
diff --git a/qapi-schema.json b/qapi-schema.json
index 391356f..60f9782 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1708,6 +1708,24 @@
   'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
 
 ##
+# @pmemload:
+#
+# Load a portion of guest physical memory from a file.
+#
+# @val: the physical address of the guest to start from
+#
+# @size: the size of memory region to load
+#
+# @filename: the file to load the memory from as binary data
+#
+# Returns: Nothing on success
+#
+# Since: 2.1
+##
+{ 'command': 'pmemload',
+  'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
+
+##
 # @cont:
 #
 # Resume guest VCPU execution.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index ed3ab92..584d6cf 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -468,6 +468,33 @@ Example:
 EQMP
 
 {
+.name   = pmemload,
+.args_type  = val:l,size:i,filename:s,
+.mhandler.cmd_new = qmp_marshal_input_pmemload,
+},
+
+SQMP
+pmemload
+
+
+load from disk physical memory dump starting at 'val' of size 'size'.
+
+Arguments:
+
+- val: the starting address (json-int)
+- size: the memory size, in bytes (json-int)
+- filename: file path (json-string)
+
+Example:
+
+- { execute: pmemload,
+ arguments: { val: 10,
+size: 100,
+filename: /tmp/physical-mem-dump } }
+- { return: {} }
+
+EQMP
+{
 .name   = inject-nmi,
 .args_type  = ,
 .mhandler.cmd_new = qmp_marshal_input_inject_nmi,
-- 
1.9.1




Re: [Qemu-devel] [PATCH V4 1/1] qmp: add pmemload command

2014-04-09 Thread Eric Blake
On 04/09/2014 10:54 AM, Baojun Wang wrote:
 Signed-off-by: Baojun Wang wan...@gmail.com

You lost this part of your commit message, which gives more details
about the 'why' (the subject line covers the 'what', but it is often the
'why' that is most needed when reviewing a commit later).  Your earlier
mail had:

I found this could be useful to have qemu-softmmu as a cross debugger
(launch
with -s -S command line option), then if we can have a command to load guest
physical memory, we can use cross gdb to do some target debug which gdb
cannot
do directly.


  
 +void qmp_pmemload(int64_t addr, int64_t size, const char *filename,
 +  Error **errp)
 +{
 +FILE *f;
 +uint32_t l;
 +uint8_t buf[1024];
 +
 +f = fopen(filename, rb);

I still think that fopen()/fread() is wrong, and that you should be
using qemu_open()/read() - that way, libvirt could pass in a file
descriptor and use qemu's already-existing /dev/fdset/nnn support rather
than forcing qemu to open something from the filesystem.


 +++ b/hmp-commands.hx
 @@ -809,6 +809,19 @@ save to disk physical memory dump starting at @var{addr} 
 of size @var{size}.
  ETEXI
  
  {
 +.name   = pmemload,
 +.args_type  = val:l,size:i,filename:s,
 +.params = addr size file,
 +.help   = load from disk physical memory dump starting at 
 'addr' of size 'size',
 +.mhandler.cmd = hmp_pmemload,

And I still think that you should list filename first, with val and size
optional at the end.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v8 1/4] qapi: introduce PreallocMode and a new PreallocMode full.

2014-04-09 Thread Eric Blake
On 04/09/2014 01:12 AM, Hu Tao wrote:
 This patch prepares for the subsequent patches.
 
 Reviewed-by: Fam Zheng f...@redhat.com
 Signed-off-by: Hu Tao hu...@cn.fujitsu.com
 ---
  block/qcow2.c|  8 
  qapi-schema.json | 14 ++
  2 files changed, 18 insertions(+), 4 deletions(-)
 

 +++ b/qapi-schema.json
 @@ -4689,3 +4689,17 @@
'btn' : 'InputBtnEvent',
'rel' : 'InputMoveEvent',
'abs' : 'InputMoveEvent' } }
 +
 +##
 +# @PreallocMode
 +#
 +# Preallocation mode of QEMU image file
 +#
 +# @off: no preallocation
 +# @metadata: preallocate only for metadata
 +# @full: preallocate all data, including metadata
 +#
 +# Since 2.0

You've missed 2.0.  This should be 2.1.  With that fixed,
Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH] Re: snapshot: fixed bdrv_get_full_backing_filename can not get correct full_backing_filename

2014-04-09 Thread Jun Li
Thanks Eric's analysis and review firstly. As not so clear to the application 
context, so the first patch can not cover symlink scenarios.
In this patch, will check the backing_filename is a symlink or not firstly, 
then return the full(absolute) path via realpath.
If this patch has something not coverd, please give me more suggestions.
Thx.

Signed-off-by: Jun Li junm...@gmail.com
---
 block.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 990a754..8566b75 100644
--- a/block.c
+++ b/block.c
@@ -304,10 +304,26 @@ void path_combine(char *dest, int dest_size,
 
 void bdrv_get_full_backing_filename(BlockDriverState *bs, char *dest, size_t 
sz)
 {
+struct stat sb;
+char *linkname;
+
 if (bs-backing_file[0] == '\0' || path_has_protocol(bs-backing_file)) {
 pstrcpy(dest, sz, bs-backing_file);
 } else {
-path_combine(dest, sz, bs-filename, bs-backing_file);
+if (lstat(bs-backing_file, sb) == -1) {
+perror(lstat);
+exit(EXIT_FAILURE);
+}
+
+/* Check linkname is a link or not */
+if (S_ISLNK(sb.st_mode)) {
+linkname = malloc(sb.st_size + 1);
+readlink(bs-backing_file, linkname, sb.st_size + 1);
+linkname[sb.st_size] = '\0';
+realpath(linkname, dest);
+} else {
+realpath(bs-backing_file, dest);
+}
 }
 }
 
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v3 21/26] tcg-aarch64: Introduce tcg_out_insn_3507

2014-04-09 Thread Richard Henderson
On 04/09/2014 05:54 AM, Claudio Fontana wrote:
 During testing I found this patch causes a regression for big endian targets 
 (sparc).
 
 Can you take a look?
 I think it might be related to the extended form of the REV instruction 
 needing
 an additional 0x400. See below.

You're right.  It's disassembling as rev32 x0, x0.

Bizzarely, sparc32 bios was working.  I guess it only uses 64-bit load/store
for ldd/std for register pair save and restore.  And since we rev'ed them the
same way for load/store, it worked.

Uploading a full mipseb system image to test now.


r~



Re: [Qemu-devel] Error propagation in generated visitors and command marshallers

2014-04-09 Thread Dr. David Alan Gilbert
* Markus Armbruster (arm...@redhat.com) wrote:
 I stumbled over this while trying to purge error_is_set() from the code.

 Here's how we commonly use the Error API:
 
 Error *err = NULL;
 
 foo(arg, err)
 if (err) {
 goto out;
 }
 bar(arg, err)
 if (err) {
 goto out;
 }
 
 This ensures that err is null on entry, both for foo() and for bar().
 Many functions rely on that, like this:
 
 void foo(ArgType arg, Error **errp)
 {
 if (frobnicate(arg)  0) {
 error_setg(errp, Can't frobnicate);
 // This asserts errp != NULL
 }
 }
 
 
 Here's how some of our visitor code uses the Error API (for real code,
 check out generated qmp-marshal.c):
 
 Error *err = NULL;
 QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args));
 Visitor *v = qmp_input_get_visitor(mi);
 char *foo = NULL;
 char *bar = NULL;
 
 visit_type_str(v, foo, foo, err);
 visit_type_str(v, bar, bar, err);
 if (err) {
 goto out;
 }
 
 Unlike above, this may pass a non-null errp to the second
 visit_type_str(), namely when the first one fails.

Right, one of the problems is you just have long strings of visit_* calls
and adding a check to each one hides what you're actually doing in a sea
of checks.  The downside is that if one of those visit's fails then you've
got no chance of figuring out which one it was.

In my BER world I've got some macros along the lines of:

#define LOCAL_ERR_REPORT(fallout) \
if (local_err) { \
fallout \
}

and at least then I can do things like:
   visit_type_str(v, foo, foo, err);
   LOCAL_ERR_REPORT( goto out; )
   visit_type_str(v, bar, bar, err);
   LOCAL_ERR_REPORT( goto out; )

which while not nice, means that you can actually follow the code, and
I can also add a printf to the macro to record the function/line so
that when one of them fails I can see which visit was the cause of the problem
(something that's currently very difficult).

 The visitor functions guard against that, like this:
 
 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 if (!error_is_set(errp)) {
 v-type_str(v, obj, name, errp);
 }
 }
 
 As discussed before, error_is_set() is almost almost wrong, fragile or
 unclean.  What if errp is null?  Then we fail to stop visiting after an
 error.
 
 The function could be improved like this:
 
 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 assert(errp);
 if (!*errp) {
 v-type_str(v, obj, name, errp);
 }
 }
 
 
 But: is it a good idea to have both patterns in the code?  Should we
 perhaps use the common pattern for visiting, too?  Like this:
 
 visit_type_str(v, foo, foo, err);
 if (err) {
 goto out;
 }
 visit_type_str(v, bar, bar, err);
 if (err) {
 goto out;
 }
 
 Then we can assume *errp is clear on function entry, like this:
 
 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 v-type_str(v, obj, name, errp);
 }
 
 Should execute roughly the same number of conditional branches.
 
 Tedious repetition of if (err) goto out in the caller, but that's what
 we do elsewhere, and unlike elsewhere, these one's are generated.

The other problem is I had a tendency to typo some of the cases to
if (*err)  and it's quite hard to spot and you wonder what's going on.

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] qemu projects

2014-04-09 Thread Pradeep Kiruvale
Hi All,

 I am new to qemu-kvm.But have good knowledge about virtulization and cloud
computing.I would like to do some projects in qemu-kvm.Please let me know
how can I start.

Please suggest me some small projects which I can take up and finish in 2-3
weeks.

Regards,
Pradeep


[Qemu-devel] TCG x86-64 'bt' insn

2014-04-09 Thread Clemens Kolbitsch
Hi guys,

I have to revive a rather old thread [1,2]. A quick summary of the issue:

TCG emulates the BT (bit-test) instruction using a SAR to re-compute
eflags. While SAR affects all flags, BT only affects the C-flag and leaves
the Z-flag unaffected/unchanged [3].

According to the x86 manual, the current behavior is correct (although not
perfect performance-wise), but NOT correct according to newer Intel CPU
instruction sets (such as x86-64). What has caused some confusion in the
past it seems is that AMD's manual mentions that all flags other than the
C-flag are undefined (just like in Intel's old x86 manual).

A patch has been suggested (although not applied) before [2] and I was
trying to update and try the patch to the QEMU 2.0 RC2, but it seems not to
be working (causes a BSOD on Windows guests).

BTW: I have a program that seg-faults on Windows clients (32 as well as 64
bit guest OSes) in vanilla QEMU 2.0 RC2 (just like in previous versions),
so this is not just of theoretical importance :)

Can somebody say if there is something that I'm doing obviously wrong?

*NON-FUNCTIONAL PATCH!*

--- ../orig/qemu-2.0.0-rc2/target-i386/translate.c  2014-04-08
12:38:58.0 -0700
+++ target-i386/translate.c 2014-04-09 10:08:43.252084947 -0700
@@ -6710,8 +6710,14 @@
 tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1  (3 + ot)) - 1);
 switch(op) {
 case 0:
-tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]);
-tcg_gen_movi_tl(cpu_cc_dst, 0);
+/* whatever the last CC-op is - recompute now so we can OR-in
+ * updated results */
+gen_update_cc_op(s);
+gen_compute_eflags(s);
+tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
+tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C);
+tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4);
+set_cc_op(s, CC_OP_EFLAGS);
 break;
 case 1:
 tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
@@ -6734,8 +6740,8 @@
 tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
 break;
 }
-set_cc_op(s, CC_OP_SARB + ot);
 if (op != 0) {
+set_cc_op(s, CC_OP_SARB + ot);
 if (mod != 3) {
 gen_op_st_v(s, ot, cpu_T[0], cpu_A0);
 } else {

Thanks, I'd really appreciate any input you can provide.
-Clemens


[1] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00442.html
[2] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00764.html
[3] page 148
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf

-- 
Clemens Kolbitsch
Security Researcher
kolbit...@lastline.com
Mobile +1 (206) 356-7745
Land +1 (805) 456-7076

Lastline, Inc.
6950 Hollister Avenue, Suite 101
Goleta, CA 93117

www.lastline.com


[Qemu-devel] [PATCH V3 4/5] machine: replace QEMUMachine by MachineClass in accelerator configuration

2014-04-09 Thread Marcel Apfelbaum
This minimizes QEMUMachine usage, as part of machine QOM-ification.

Signed-off-by: Marcel Apfelbaum marce...@redhat.com
---
 include/hw/xen/xen.h|  2 +-
 include/qemu/typedefs.h |  1 +
 include/sysemu/kvm.h|  2 +-
 include/sysemu/qtest.h  |  2 +-
 kvm-all.c   |  6 +++---
 kvm-stub.c  |  2 +-
 qtest.c |  2 +-
 vl.c| 10 +-
 xen-all.c   |  2 +-
 xen-stub.c  |  2 +-
 10 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index 9d549fc..85fda3d 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -36,7 +36,7 @@ void xen_cmos_set_s3_resume(void *opaque, int irq, int level);
 
 qemu_irq *xen_interrupt_controller_init(void);
 
-int xen_init(QEMUMachine *machine);
+int xen_init(MachineClass *mc);
 int xen_hvm_init(MemoryRegion **ram_memory);
 void xenstore_store_pv_console_info(int i, struct CharDriverState *chr);
 
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index bf8daac..86bab12 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -31,6 +31,7 @@ typedef struct MemoryListener MemoryListener;
 typedef struct MemoryMappingList MemoryMappingList;
 
 typedef struct QEMUMachine QEMUMachine;
+typedef struct MachineClass MachineClass;
 typedef struct NICInfo NICInfo;
 typedef struct HCIInfo HCIInfo;
 typedef struct AudioState AudioState;
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 0bee1e8..518655c 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -152,7 +152,7 @@ extern KVMState *kvm_state;
 
 /* external API */
 
-int kvm_init(QEMUMachine *machine);
+int kvm_init(MachineClass *mc);
 
 int kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
diff --git a/include/sysemu/qtest.h b/include/sysemu/qtest.h
index 224131f..95c9ade 100644
--- a/include/sysemu/qtest.h
+++ b/include/sysemu/qtest.h
@@ -26,7 +26,7 @@ static inline bool qtest_enabled(void)
 
 bool qtest_driver(void);
 
-int qtest_init_accel(QEMUMachine *machine);
+int qtest_init_accel(MachineClass *mc);
 void qtest_init(const char *qtest_chrdev, const char *qtest_log, Error **errp);
 
 static inline int qtest_available(void)
diff --git a/kvm-all.c b/kvm-all.c
index cd4111d..1bb979d 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1341,7 +1341,7 @@ static int kvm_max_vcpus(KVMState *s)
 return (ret) ? ret : kvm_recommended_vcpus(s);
 }
 
-int kvm_init(QEMUMachine *machine)
+int kvm_init(MachineClass *mc)
 {
 static const char upgrade_note[] =
 Please upgrade to at least kernel 2.6.29 or recent kvm-kmod\n
@@ -1433,8 +1433,8 @@ int kvm_init(QEMUMachine *machine)
 }
 
 kvm_type = qemu_opt_get(qemu_get_machine_opts(), kvm-type);
-if (machine-kvm_type) {
-type = machine-kvm_type(kvm_type);
+if (mc-kvm_type) {
+type = mc-kvm_type(kvm_type);
 } else if (kvm_type) {
 fprintf(stderr, Invalid argument kvm-type=%s\n, kvm_type);
 goto err;
diff --git a/kvm-stub.c b/kvm-stub.c
index ccdba62..8acda86 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -34,7 +34,7 @@ int kvm_init_vcpu(CPUState *cpu)
 return -ENOSYS;
 }
 
-int kvm_init(QEMUMachine *machine)
+int kvm_init(MachineClass *mc)
 {
 return -ENOSYS;
 }
diff --git a/qtest.c b/qtest.c
index 0ac9f42..2aba20d 100644
--- a/qtest.c
+++ b/qtest.c
@@ -500,7 +500,7 @@ static void qtest_event(void *opaque, int event)
 }
 }
 
-int qtest_init_accel(QEMUMachine *machine)
+int qtest_init_accel(MachineClass *mc)
 {
 configure_icount(0);
 
diff --git a/vl.c b/vl.c
index dc8d515..5119937 100644
--- a/vl.c
+++ b/vl.c
@@ -2720,7 +2720,7 @@ static MachineClass *machine_parse(const char *name)
 exit(!name || !is_help_option(name));
 }
 
-static int tcg_init(QEMUMachine *machine)
+static int tcg_init(MachineClass *mc)
 {
 tcg_exec_init(tcg_tb_size * 1024 * 1024);
 return 0;
@@ -2730,7 +2730,7 @@ static struct {
 const char *opt_name;
 const char *name;
 int (*available)(void);
-int (*init)(QEMUMachine *);
+int (*init)(MachineClass *mc);
 bool *allowed;
 } accel_list[] = {
 { tcg, tcg, tcg_available, tcg_init, tcg_allowed },
@@ -2739,7 +2739,7 @@ static struct {
 { qtest, QTest, qtest_available, qtest_init_accel, qtest_allowed },
 };
 
-static int configure_accelerator(QEMUMachine *machine)
+static int configure_accelerator(MachineClass *mc)
 {
 const char *p;
 char buf[10];
@@ -2766,7 +2766,7 @@ static int configure_accelerator(QEMUMachine *machine)
 continue;
 }
 *(accel_list[i].allowed) = true;
-ret = accel_list[i].init(machine);
+ret = accel_list[i].init(mc);
 if (ret  0) {
 init_failed = true;
 fprintf(stderr, failed to initialize %s: %s\n,
@@ -4187,7 +4187,7 @@ int main(int argc, char **argv, char **envp)
 exit(0);
 }
 
-

[Qemu-devel] [PATCH V3 0/5] remove QEMUMachine indirection from MachineClass

2014-04-09 Thread Marcel Apfelbaum
Cc: Andreas Färber afaer...@suse.de

V2 - V3:
  - Addressed Andreas's comments:
- Dropped QEMUMachineInitArgs's 'next' obsoleted field
  in a separate patch
- Revision the separation into patches:
  - Started using MachineClass for .machine early (3/5).
  - Merged hw/ppc changes with QEMUMachine indirection removal
  - Ensured that git bisect is not affected
  - Rebased to master.

V1 - V2:
  - Addressed Paolo's comments:
- replaced commas by semicolons on patch 4/5.
  - Rebased to master.

This is a continuation of 'QEMU Machine as QOM object' effort.
The scope of this series is to allow machine QOM-ification
of all machines gradually, by removing the need for QEMUMachine registration 
through vl.c .

Now we will have 2 paths:
1. Non QOM-ified machines will be converted to QOM on the fly
   in vl.c by qemu machine registration.
2. QOM-ified machines will behave as regular QOM classes setting
   MachineClass fields in class_init.
   - Patch 4/5 demonstrates this.

Next steps:
 - Replace QemuOpts queries by MachineState fields.
 - Follow Paolo's suggestions to get rid of QEMUMachineInitArgs.

Comments are appreciated,

Thanks,
Marcel

Marcel Apfelbaum (5):
  hw/boards.h: remove obsoleted field from QEMUMachine
  vl.c: copy QEMUMachine's fields to MachineClass
  vl.c: Replace QEMUMachine with MachineClass in QEMUMachineInitArgs
  machine: replace QEMUMachine by MachineClass in accelerator
configuration
  machine: remove QEMUMachine indirection from MachineClass

 device-hotplug.c|   2 +-
 hw/ppc/spapr.c  |  26 +--
 include/hw/boards.h |  30 +++--
 include/hw/xen/xen.h|   2 +-
 include/qemu/typedefs.h |   1 +
 include/sysemu/kvm.h|   2 +-
 include/sysemu/qtest.h  |   2 +-
 kvm-all.c   |   6 +--
 kvm-stub.c  |   2 +-
 qmp.c   |   4 +-
 qtest.c |   2 +-
 vl.c| 114 +++-
 xen-all.c   |   2 +-
 xen-stub.c  |   2 +-
 14 files changed, 116 insertions(+), 81 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH V3 1/5] hw/boards.h: remove obsoleted field from QEMUMachine

2014-04-09 Thread Marcel Apfelbaum
This field shouldn't be used any more since we
adopted the QOM way of iterating over the types.

The commit that obsoleted it is:
commit 261747f176f6f2d88f8268aaebfdd1a1afe887e2
vl: Use MachineClass instead of global QEMUMachine list

The machine registration flow is refactored to use the QOM functionality.
Instead of linking the machines into a list, each machine has a type
and the types can be traversed in the QOM way.

Signed-off-by: Marcel Apfelbaum marce...@redhat.com
---
 include/hw/boards.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index dd2c70d..871 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -46,7 +46,6 @@ struct QEMUMachine {
 const char *default_machine_opts;
 const char *default_boot_order;
 GlobalProperty *compat_props;
-struct QEMUMachine *next;
 const char *hw_version;
 };
 
-- 
1.8.3.1




[Qemu-devel] [PATCH V3 3/5] vl.c: Replace QEMUMachine with MachineClass in QEMUMachineInitArgs

2014-04-09 Thread Marcel Apfelbaum
QEMUMachine's fields are already in MachineClass. We can safely
make the switch because we copy them in machine_class_init.

Signed-off-by: Marcel Apfelbaum marce...@redhat.com
---
 include/hw/boards.h | 5 +++--
 vl.c| 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 138346d..51211a6 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -7,8 +7,10 @@
 #include hw/qdev.h
 #include qom/object.h
 
+typedef struct MachineClass MachineClass;
+
 typedef struct QEMUMachineInitArgs {
-const QEMUMachine *machine;
+const MachineClass *machine;
 ram_addr_t ram_size;
 const char *boot_order;
 const char *kernel_filename;
@@ -62,7 +64,6 @@ int qemu_register_machine(QEMUMachine *m);
 OBJECT_CLASS_CHECK(MachineClass, (klass), TYPE_MACHINE)
 
 typedef struct MachineState MachineState;
-typedef struct MachineClass MachineClass;
 
 MachineClass *find_default_machine(void);
 extern MachineState *current_machine;
diff --git a/vl.c b/vl.c
index 0b0202d..dc8d515 100644
--- a/vl.c
+++ b/vl.c
@@ -4394,7 +4394,7 @@ int main(int argc, char **argv, char **envp)
 
 qdev_machine_init();
 
-QEMUMachineInitArgs args = { .machine = machine,
+QEMUMachineInitArgs args = { .machine = machine_class,
  .ram_size = ram_size,
  .boot_order = boot_order,
  .kernel_filename = kernel_filename,
-- 
1.8.3.1




[Qemu-devel] [PATCH V3 2/5] vl.c: copy QEMUMachine's fields to MachineClass

2014-04-09 Thread Marcel Apfelbaum
In order to eliminate the QEMUMachine indirection,
add its fields directly to MachineClass.
Do not remove yet qemu_machine field because it is
still in use by sparpr.

Signed-off-by: Marcel Apfelbaum marce...@redhat.com
---
 include/hw/boards.h | 23 +++
 vl.c| 23 +++
 2 files changed, 46 insertions(+)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 871..138346d 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -77,6 +77,29 @@ struct MachineClass {
 /* public */
 
 QEMUMachine *qemu_machine;
+const char *name;
+const char *alias;
+const char *desc;
+
+void (*init)(QEMUMachineInitArgs *args);
+void (*reset)(void);
+void (*hot_add_cpu)(const int64_t id, Error **errp);
+int (*kvm_type)(const char *arg);
+
+BlockInterfaceType block_default_type;
+int max_cpus;
+unsigned int no_serial:1,
+no_parallel:1,
+use_virtcon:1,
+use_sclp:1,
+no_floppy:1,
+no_cdrom:1,
+no_sdcard:1;
+int is_default;
+const char *default_machine_opts;
+const char *default_boot_order;
+GlobalProperty *compat_props;
+const char *hw_version;
 };
 
 /**
diff --git a/vl.c b/vl.c
index 9975e5a..0b0202d 100644
--- a/vl.c
+++ b/vl.c
@@ -1583,8 +1583,31 @@ MachineState *current_machine;
 static void machine_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
+QEMUMachine *qm = data;
 
 mc-qemu_machine = data;
+
+mc-name = qm-name;
+mc-alias = qm-alias;
+mc-desc = qm-desc;
+mc-init = qm-init;
+mc-reset = qm-reset;
+mc-hot_add_cpu = qm-hot_add_cpu;
+mc-kvm_type = qm-kvm_type;
+mc-block_default_type = qm-block_default_type;
+mc-max_cpus = qm-max_cpus;
+mc-no_serial = qm-no_serial;
+mc-no_parallel = qm-no_parallel;
+mc-use_virtcon = qm-use_virtcon;
+mc-use_sclp = qm-use_sclp;
+mc-no_floppy = qm-no_floppy;
+mc-no_cdrom = qm-no_cdrom;
+mc-no_sdcard = qm-no_sdcard;
+mc-is_default = qm-is_default;
+mc-default_machine_opts = qm-default_machine_opts;
+mc-default_boot_order = qm-default_boot_order;
+mc-compat_props = qm-compat_props;
+mc-hw_version = qm-hw_version;
 }
 
 int qemu_register_machine(QEMUMachine *m)
-- 
1.8.3.1




[Qemu-devel] [PATCH V3 5/5] machine: remove QEMUMachine indirection from MachineClass

2014-04-09 Thread Marcel Apfelbaum
No need to go through qemu_machine field. Use
MachineClass fields directly.

Signed-off-by: Marcel Apfelbaum marce...@redhat.com
---
 device-hotplug.c|  2 +-
 hw/ppc/spapr.c  | 26 --
 include/hw/boards.h |  1 -
 qmp.c   |  4 +--
 vl.c| 79 -
 5 files changed, 50 insertions(+), 62 deletions(-)

diff --git a/device-hotplug.c b/device-hotplug.c
index ebfa6b1..eecb08e 100644
--- a/device-hotplug.c
+++ b/device-hotplug.c
@@ -40,7 +40,7 @@ DriveInfo *add_init_drive(const char *optstr)
 return NULL;
 
 mc = MACHINE_GET_CLASS(current_machine);
-dinfo = drive_init(opts, mc-qemu_machine-block_default_type);
+dinfo = drive_init(opts, mc-block_default_type);
 if (!dinfo) {
 qemu_opts_del(opts);
 return NULL;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a11e121..b4ce950 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1419,19 +1419,6 @@ static int spapr_kvm_type(const char *vm_type)
 exit(1);
 }
 
-static QEMUMachine spapr_machine = {
-.name = pseries,
-.desc = pSeries Logical Partition (PAPR compliant),
-.is_default = 1,
-.init = ppc_spapr_init,
-.reset = ppc_spapr_reset,
-.block_default_type = IF_SCSI,
-.max_cpus = MAX_CPUS,
-.no_parallel = 1,
-.default_boot_order = NULL,
-.kvm_type = spapr_kvm_type,
-};
-
 /*
  * Implementation of an interface to adjust firmware patch
  * for the bootindex property handling.
@@ -1494,7 +1481,17 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
 MachineClass *mc = MACHINE_CLASS(oc);
 FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(oc);
 
-mc-qemu_machine = data;
+mc-name = pseries;
+mc-desc = pSeries Logical Partition (PAPR compliant);
+mc-is_default = 1;
+mc-init = ppc_spapr_init;
+mc-reset = ppc_spapr_reset;
+mc-block_default_type = IF_SCSI;
+mc-max_cpus = MAX_CPUS;
+mc-no_parallel = 1;
+mc-default_boot_order = NULL;
+mc-kvm_type = spapr_kvm_type;
+
 fwc-get_dev_path = spapr_get_fw_dev_path;
 }
 
@@ -1502,7 +1499,6 @@ static const TypeInfo spapr_machine_info = {
 .name  = TYPE_SPAPR_MACHINE,
 .parent= TYPE_MACHINE,
 .class_init= spapr_machine_class_init,
-.class_data= spapr_machine,
 .interfaces = (InterfaceInfo[]) {
 { TYPE_FW_PATH_PROVIDER },
 { }
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 51211a6..66ee98a 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -77,7 +77,6 @@ struct MachineClass {
 ObjectClass parent_class;
 /* public */
 
-QEMUMachine *qemu_machine;
 const char *name;
 const char *alias;
 const char *desc;
diff --git a/qmp.c b/qmp.c
index 87a28f7..26eb589 100644
--- a/qmp.c
+++ b/qmp.c
@@ -117,8 +117,8 @@ void qmp_cpu_add(int64_t id, Error **errp)
 MachineClass *mc;
 
 mc = MACHINE_GET_CLASS(current_machine);
-if (mc-qemu_machine-hot_add_cpu) {
-mc-qemu_machine-hot_add_cpu(id, errp);
+if (mc-hot_add_cpu) {
+mc-hot_add_cpu(id, errp);
 } else {
 error_setg(errp, Not supported);
 }
diff --git a/vl.c b/vl.c
index 5119937..190bd5d 100644
--- a/vl.c
+++ b/vl.c
@@ -1585,8 +1585,6 @@ static void machine_class_init(ObjectClass *oc, void 
*data)
 MachineClass *mc = MACHINE_CLASS(oc);
 QEMUMachine *qm = data;
 
-mc-qemu_machine = data;
-
 mc-name = qm-name;
 mc-alias = qm-alias;
 mc-desc = qm-desc;
@@ -1634,12 +1632,12 @@ static MachineClass *find_machine(const char *name)
 for (el = machines; el; el = el-next) {
 MachineClass *temp = el-data;
 
-if (!strcmp(temp-qemu_machine-name, name)) {
+if (!strcmp(temp-name, name)) {
 mc = temp;
 break;
 }
-if (temp-qemu_machine-alias 
-!strcmp(temp-qemu_machine-alias, name)) {
+if (temp-alias 
+!strcmp(temp-alias, name)) {
 mc = temp;
 break;
 }
@@ -1657,7 +1655,7 @@ MachineClass *find_default_machine(void)
 for (el = machines; el; el = el-next) {
 MachineClass *temp = el-data;
 
-if (temp-qemu_machine-is_default) {
+if (temp-is_default) {
 mc = temp;
 break;
 }
@@ -1671,27 +1669,25 @@ MachineInfoList *qmp_query_machines(Error **errp)
 {
 GSList *el, *machines = object_class_get_list(TYPE_MACHINE, false);
 MachineInfoList *mach_list = NULL;
-QEMUMachine *m;
 
 for (el = machines; el; el = el-next) {
 MachineClass *mc = el-data;
 MachineInfoList *entry;
 MachineInfo *info;
 
-m = mc-qemu_machine;
 info = g_malloc0(sizeof(*info));
-if (m-is_default) {
+if (mc-is_default) {
 info-has_is_default = true;
 info-is_default = true;
 }
 
-if (m-alias) {
+if (mc-alias) {
 

[Qemu-devel] [PATCH V5 1/1] qmp: add pmemload command

2014-04-09 Thread Baojun Wang
I found this could be useful to have qemu-softmmu as a cross debugger (launch
with -s -S command line option), then if we can have a command to load guest
physical memory, we can use cross gdb to do some target debug which gdb cannot
do directly.

Many thanks to Eric Blake for review the patch and suggestions.

Signed-off-by: Baojun Wang wan...@gmail.com
---
 cpus.c   | 29 +
 hmp-commands.hx  | 13 +
 hmp.c| 11 +++
 hmp.h|  1 +
 qapi-schema.json | 18 ++
 qmp-commands.hx  | 27 +++
 6 files changed, 99 insertions(+)

diff --git a/cpus.c b/cpus.c
index 1104d61..230664f 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1467,6 +1467,35 @@ exit:
 fclose(f);
 }
 
+void qmp_pmemload(const char* filename, int64_t addr, int64_t size,
+  Error **errp)
+{
+int fd;
+ssize_t l;
+uint8_t buf[4096];
+
+fd = qemu_open(filename, O_RDONLY | O_BINARY);
+if (fd  0) {
+error_setg_file_open(errp, errno, filename);
+return;
+}
+
+while (size != 0) {
+l = read(fd, buf, sizeof(buf));
+if (l != sizeof(buf)) {
+error_set(errp, QERR_IO_ERROR);
+goto exit;
+}
+if (l  size)
+l = size;
+cpu_physical_memory_rw(addr, buf, l, 1);
+addr += l;
+size -= l;
+}
+
+exit:
+qemu_close(fd);
+}
 void qmp_inject_nmi(Error **errp)
 {
 #if defined(TARGET_I386)
diff --git a/hmp-commands.hx b/hmp-commands.hx
index f3fc514..1eae6e3 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -809,6 +809,19 @@ save to disk physical memory dump starting at @var{addr} 
of size @var{size}.
 ETEXI
 
 {
+.name   = pmemload,
+.args_type  = filename:s,val:l,size:i,
+.params = file addr size,
+.help   = load from disk physical memory dump starting at 'addr' 
of size 'size',
+.mhandler.cmd = hmp_pmemload,
+},
+
+STEXI
+@item pmemload @var{file} @var{addr} @var{size}
+@findex pmemload
+load from disk physical memory dump starting at @var{addr} of size @var{size}.
+ETEXI
+{
 .name   = boot_set,
 .args_type  = bootdevice:s,
 .params = bootdevice,
diff --git a/hmp.c b/hmp.c
index 2f279c4..745c087 100644
--- a/hmp.c
+++ b/hmp.c
@@ -767,6 +767,17 @@ void hmp_pmemsave(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, errp);
 }
 
+void hmp_pmemload(Monitor *mon, const QDict *qdict)
+{
+uint32_t size = qdict_get_int(qdict, size);
+const char *filename = qdict_get_str(qdict, filename);
+uint64_t addr = qdict_get_int(qdict, val);
+Error *errp = NULL;
+
+qmp_pmemload(filename, addr, size, errp);
+hmp_handle_error(mon, errp);
+}
+
 void hmp_ringbuf_write(Monitor *mon, const QDict *qdict)
 {
 const char *chardev = qdict_get_str(qdict, device);
diff --git a/hmp.h b/hmp.h
index ed58f0e..f5f2a16 100644
--- a/hmp.h
+++ b/hmp.h
@@ -44,6 +44,7 @@ void hmp_system_powerdown(Monitor *mon, const QDict *qdict);
 void hmp_cpu(Monitor *mon, const QDict *qdict);
 void hmp_memsave(Monitor *mon, const QDict *qdict);
 void hmp_pmemsave(Monitor *mon, const QDict *qdict);
+void hmp_pmemload(Monitor *mon, const QDict *qdict);
 void hmp_ringbuf_write(Monitor *mon, const QDict *qdict);
 void hmp_ringbuf_read(Monitor *mon, const QDict *qdict);
 void hmp_cont(Monitor *mon, const QDict *qdict);
diff --git a/qapi-schema.json b/qapi-schema.json
index 391356f..7198242 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1708,6 +1708,24 @@
   'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
 
 ##
+# @pmemload:
+#
+# Load a portion of guest physical memory from a file.
+#
+# @filename: the file to load the memory from as binary data
+#
+# @val: the physical address of the guest to start from
+#
+# @size: the size of memory region to load
+#
+# Returns: Nothing on success
+#
+# Since: 2.1
+##
+{ 'command': 'pmemload',
+  'data': {'filename': 'str', 'val': 'int', 'size': 'int'} }
+
+##
 # @cont:
 #
 # Resume guest VCPU execution.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index ed3ab92..2831038 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -468,6 +468,33 @@ Example:
 EQMP
 
 {
+.name   = pmemload,
+.args_type  = filename:s,val:l,size:i,
+.mhandler.cmd_new = qmp_marshal_input_pmemload,
+},
+
+SQMP
+pmemload
+
+
+load from disk physical memory dump starting at 'val' of size 'size'.
+
+Arguments:
+
+- filename: file path (json-string)
+- val: the starting address (json-int)
+- size: the memory size, in bytes (json-int)
+
+Example:
+
+- { execute: pmemload,
+ arguments: { filename: /tmp/physical-mem-dump,
+val: 10,
+size: 100 } }
+- { return: {} }
+
+EQMP
+{
 .name   = inject-nmi,
 .args_type  = ,
 .mhandler.cmd_new = qmp_marshal_input_inject_nmi,
-- 
1.9.1

Re: [Qemu-devel] TCG x86-64 'bt' insn

2014-04-09 Thread Clemens Kolbitsch
Hi,

quick follow-up. *As always* you find a problem right after asking for help
:). The updated patch does not cause BSOD on Windows guests, but neither
does it fix the actual problem (of the program seg-faulting)

I would really appreciate any feedback on the proposed patch below - the
difference to the previous patch is that we clear out undefined flags and
only keep the Z-flag (and update the C-flag)

--- ../orig/qemu-2.0.0-rc2/target-i386/translate.c  2014-04-08
12:38:58.0 -0700
+++ target-i386/translate.c 2014-04-09 10:48:25.264200230 -0700
@@ -6710,8 +6710,15 @@
 tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1  (3 + ot)) - 1);
 switch(op) {
 case 0:
-tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]);
-tcg_gen_movi_tl(cpu_cc_dst, 0);
+/* whatever the last CC-op is - recompute now so we can OR-in
+ * updated results */
+gen_update_cc_op(s); // ? needed ?
+gen_compute_eflags(s);
+tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
+tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C);
+tcg_gen_andi_tl(cpu_cc_src, cpu_cc_src, CC_Z);
+tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4);
+set_cc_op(s, CC_OP_EFLAGS);
 break;
 case 1:
 tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
@@ -6734,8 +6741,8 @@
 tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
 break;
 }
-set_cc_op(s, CC_OP_SARB + ot);
 if (op != 0) {
+set_cc_op(s, CC_OP_SARB + ot);
 if (mod != 3) {
 gen_op_st_v(s, ot, cpu_T[0], cpu_A0);
 } else {

Last another general question: Does TCG make any assumptions that undefined
flags are set to 0? I see that most flag-computations set undefined flags
to 0 - is this just a convention or really a requirement?

Thanks guys!
-Clemens




On Wed, Apr 9, 2014 at 10:33 AM, Clemens Kolbitsch
kolbit...@lastline.comwrote:

 Hi guys,

 I have to revive a rather old thread [1,2]. A quick summary of the issue:

 TCG emulates the BT (bit-test) instruction using a SAR to re-compute
 eflags. While SAR affects all flags, BT only affects the C-flag and leaves
 the Z-flag unaffected/unchanged [3].

 According to the x86 manual, the current behavior is correct (although not
 perfect performance-wise), but NOT correct according to newer Intel CPU
 instruction sets (such as x86-64). What has caused some confusion in the
 past it seems is that AMD's manual mentions that all flags other than the
 C-flag are undefined (just like in Intel's old x86 manual).

 A patch has been suggested (although not applied) before [2] and I was
 trying to update and try the patch to the QEMU 2.0 RC2, but it seems not to
 be working (causes a BSOD on Windows guests).

 BTW: I have a program that seg-faults on Windows clients (32 as well as 64
 bit guest OSes) in vanilla QEMU 2.0 RC2 (just like in previous versions),
 so this is not just of theoretical importance :)

 Can somebody say if there is something that I'm doing obviously wrong?

 *NON-FUNCTIONAL PATCH!*

 --- ../orig/qemu-2.0.0-rc2/target-i386/translate.c  2014-04-08
 12:38:58.0 -0700
 +++ target-i386/translate.c 2014-04-09 10:08:43.252084947 -0700
 @@ -6710,8 +6710,14 @@
  tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1  (3 + ot)) - 1);
  switch(op) {
  case 0:
 -tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]);
 -tcg_gen_movi_tl(cpu_cc_dst, 0);
 +/* whatever the last CC-op is - recompute now so we can OR-in
 + * updated results */
 +gen_update_cc_op(s);
 +gen_compute_eflags(s);
 +tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
 +tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C);
 +tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4);
 +set_cc_op(s, CC_OP_EFLAGS);
  break;
  case 1:
  tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
 @@ -6734,8 +6740,8 @@
  tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
  break;
  }
 -set_cc_op(s, CC_OP_SARB + ot);
  if (op != 0) {
 +set_cc_op(s, CC_OP_SARB + ot);
  if (mod != 3) {
  gen_op_st_v(s, ot, cpu_T[0], cpu_A0);
  } else {

 Thanks, I'd really appreciate any input you can provide.
 -Clemens


 [1] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00442.html
 [2] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00764.html
 [3] page 148
 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf



Re: [Qemu-devel] [PATCH v17 08/14] block: Support dropping active in bdrv_drop_intermediate

2014-04-09 Thread Jeff Cody
On Tue, Apr 08, 2014 at 05:07:38PM +0800, Fam Zheng wrote:
 On Tue, 04/08 10:15, Markus Armbruster wrote:
  Jeff Cody jc...@redhat.com writes:
  
   On Mon, Mar 10, 2014 at 03:26:04PM +0800, Fam Zheng wrote:
   Dropping intermediate could be useful both for commit and stream, and
   BDS refcnt plus bdrv_swap could do most of the job nicely. It also needs
   to work with op blockers.
   
   Signed-off-by: Fam Zheng f...@redhat.com
   ---
block.c| 139 
   -
block/commit.c |   2 +-
2 files changed, 70 insertions(+), 71 deletions(-)
   
   diff --git a/block.c b/block.c
   index 05f7766..0af7c62 100644
   --- a/block.c
   +++ b/block.c
   @@ -2503,115 +2503,114 @@ BlockDriverState 
   *bdrv_find_overlay(BlockDriverState *active,
return overlay;
}

   -typedef struct BlkIntermediateStates {
   -BlockDriverState *bs;
   -QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
   -} BlkIntermediateStates;
   -
   -
/*
   - * Drops images above 'base' up to and including 'top', and sets the 
   image
   - * above 'top' to have base as its backing file.
   + * Drops images above 'base' up to and including 'top', and sets new 
   'base' as
   + * backing_hd of top's overlay (the image orignally has 'top' as 
   backing file).
   + * top's overlay may be NULL if 'top' is active, no such update needed.
   + * Requires that the top's overlay to 'top' is opened r/w.
   + *
   + * 1) This will convert the following chain:
   + *
   + * ... - base - ... - top - overlay -... - active
 *
   - * Requires that the overlay to 'top' is opened r/w, so that the 
   backing file
   - * information in 'bs' can be properly updated.
   + * to
   + *
   + * ... - base - overlay - active
   + *
   + * 2) It is allowed for bottom==base, in which case it converts:
 *
   - * E.g., this will convert the following chain:
   - * bottom - base - intermediate - top - active
   + * base - ... - top - overlay - ... - active
 *
 * to
 *
   - * bottom - base - active
   + * base - overlay - active
 *
   - * It is allowed for bottom==base, in which case it converts:
   + * 2) It also allows active==top, in which case it converts:
 *
   - * base - intermediate - top - active
   + * ... - base - ... - top (active)
 *
 * to
 *
   - * base - active
   + * ... - base == active == top
   + *
   + * i.e. only base and lower remains: *top == *base when return.
   + *
   + * 3) If base==NULL, it will drop all the BDS below overlay and set its
   + * backing_hd to NULL. I.e.:
 *
   - * Error conditions:
   - *  if active == top, that is considered an error
   + * base(NULL) - ... - overlay - ... - active
   + *
   + * to
   + *
   + * overlay - ... - active
 *
 */
int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState 
   *top,
   BlockDriverState *base)
{
   -BlockDriverState *intermediate;
   -BlockDriverState *base_bs = NULL;
   -BlockDriverState *new_top_bs = NULL;
   -BlkIntermediateStates *intermediate_state, *next;
   -int ret = -EIO;
   -
   -QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) 
   states_to_delete;
   -QSIMPLEQ_INIT(states_to_delete);
   +BlockDriverState *drop_start, *overlay, *bs;
   +int ret = -EINVAL;

   -if (!top-drv || !base-drv) {
   +assert(active);
   +assert(top);
   +/* Verify that top is in backing chain of active */
   +bs = active;
   +while (bs  bs != top) {
   +bs = bs-backing_hd;
   +}
   +if (!bs) {
goto exit;
}
   +/* Verify that base is in backing chain of top */
   +if (base) {
   +while (bs  bs != base) {
   +bs = bs-backing_hd;
   +}
   +if (bs != base) {
   +goto exit;
   +}
   +}

   -new_top_bs = bdrv_find_overlay(active, top);
   -
   -if (new_top_bs == NULL) {
   -/* we could not find the image above 'top', this is an error */
   +if (!top-drv || (base  !base-drv)) {
goto exit;
}
   -
   -/* special case of new_top_bs-backing_hd already pointing to base 
   - nothing
   - * to do, no intermediate images */
   -if (new_top_bs-backing_hd == base) {
   +if (top == base) {
   +ret = 0;
   +goto exit;
   +} else if (top == active) {
   +assert(base);
   +drop_start = active-backing_hd;
   +bdrv_swap(active, base);
  
   This will assert in block.c, in bdrv_swap, on the test for
   anonymity of active.  (For testing, I changed the active layer commit
   in mirror to use bdrv_drop_intermediate()).
 
 Jeff, you're right, because bdrv_swap requires first argument to be a new
 BDS, while we are passing in an top BDS.
 
 But what happens if we write bdrv_swap(base, active)?


That seems like it could work - I did a quick 

Re: [Qemu-devel] [PATCH v17 06/14] block: Add backing_blocker in BlockDriverState

2014-04-09 Thread Jeff Cody
On Mon, Mar 10, 2014 at 03:26:02PM +0800, Fam Zheng wrote:
 This makes use of op_blocker and blocks all the operations except for
 commit target, on each BlockDriverState-backing_hd.
 
 The asserts for op_blocker in bdrv_swap are removed because with this
 change, the target of block commit has at least the backing blocker of
 its child, so the assertion is not true. Callers should do their check.
 
 Signed-off-by: Fam Zheng f...@redhat.com
 ---
  block.c   | 24 
  block/mirror.c|  1 +
  include/block/block_int.h |  3 +++
  3 files changed, 24 insertions(+), 4 deletions(-)
 
 diff --git a/block.c b/block.c
 index 64738dc..95247c8 100644
 --- a/block.c
 +++ b/block.c
 @@ -1050,16 +1050,33 @@ fail:
  void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
  {
  
 +if (bs-backing_hd) {
 +assert(error_is_set(bs-backing_blocker));
 +bdrv_op_unblock_all(bs-backing_hd, bs-backing_blocker);
 +} else if (backing_hd) {
 +error_setg(bs-backing_blocker,
 +   device is used as backing hd of '%s',
 +   bs-device_name);
 +}
 +
  bs-backing_hd = backing_hd;
  if (!backing_hd) {
  bs-backing_file[0] = '\0';
  bs-backing_format[0] = '\0';
 +if (error_is_set(bs-backing_blocker)) {
 +error_free(bs-backing_blocker);
 +}
  goto out;
  }
  bs-open_flags = ~BDRV_O_NO_BACKING;
  pstrcpy(bs-backing_file, sizeof(bs-backing_file), 
 backing_hd-filename);
  pstrcpy(bs-backing_format, sizeof(bs-backing_format),
  backing_hd-drv ? backing_hd-drv-format_name : );
 +
 +bdrv_op_block_all(bs-backing_hd, bs-backing_blocker);
 +/* Otherwise we won't be able to commit due to check in bdrv_commit */
 +bdrv_op_unblock(bs-backing_hd, BLOCK_OP_TYPE_COMMIT,
 +bs-backing_blocker);
  out:
  bdrv_refresh_limits(bs);
  }
 @@ -1699,8 +1716,9 @@ void bdrv_close(BlockDriverState *bs)
  
  if (bs-drv) {
  if (bs-backing_hd) {
 -bdrv_unref(bs-backing_hd);
 -bs-backing_hd = NULL;
 +BlockDriverState *backing_hd = bs-backing_hd;
 +bdrv_set_backing_hd(bs, NULL);
 +bdrv_unref(backing_hd);
  }
  bs-drv-bdrv_close(bs);
  g_free(bs-opaque);
 @@ -1908,7 +1926,6 @@ void bdrv_swap(BlockDriverState *bs_new, 
 BlockDriverState *bs_old)
  assert(QLIST_EMPTY(bs_new-dirty_bitmaps));
  assert(bs_new-job == NULL);
  assert(bs_new-dev == NULL);
 -assert(bdrv_op_blocker_is_empty(bs_new));
  assert(bs_new-io_limits_enabled == false);
  assert(!throttle_have_timer(bs_new-throttle_state));
  
 @@ -1927,7 +1944,6 @@ void bdrv_swap(BlockDriverState *bs_new, 
 BlockDriverState *bs_old)
  /* Check a few fields that should remain attached to the device */
  assert(bs_new-dev == NULL);
  assert(bs_new-job == NULL);
 -assert(bdrv_op_blocker_is_empty(bs_new));
  assert(bs_new-io_limits_enabled == false);
  assert(!throttle_have_timer(bs_new-throttle_state));
  
 diff --git a/block/mirror.c b/block/mirror.c
 index dd5ee05..6dc84e8 100644
 --- a/block/mirror.c
 +++ b/block/mirror.c
 @@ -487,6 +487,7 @@ immediate_exit:
   * trigger the unref from the top one */
  BlockDriverState *p = s-base-backing_hd;
  s-base-backing_hd = NULL;
 +bdrv_op_unblock_all(p, s-base-backing_blocker);
  bdrv_unref(p);

FYI, this is what I changed in my testing, to try out the active layer
case in bdrv_drop_intermediate().  Since you'll need to respin anyway,
might as well clean this up to use the updated
bdrv_drop_intermediate():


--- a/block/mirror.c
+++ b/block/mirror.c
@@ -490,15 +490,8 @@ immediate_exit:
 if (bdrv_get_flags(s-target) != bdrv_get_flags(s-common.bs)) {
 bdrv_reopen(s-target, bdrv_get_flags(s-common.bs), NULL);
 }
-bdrv_swap(s-target, s-common.bs);
-if (s-common.driver-job_type == BLOCK_JOB_TYPE_COMMIT) {
-/* drop the bs loop chain formed by the swap: break the loop then
- * trigger the unref from the top one */
-BlockDriverState *p = s-base-backing_hd;
-s-base-backing_hd = NULL;
-bdrv_op_unblock_all(p, s-base-backing_blocker);
-bdrv_unref(p);
-}
+bdrv_op_unblock_all(s-common.bs-backing_hd, 
s-common.bs-backing_blocker);
+bdrv_drop_intermediate(s-common.bs, s-common.bs, s-target);
 }
 bdrv_unref(s-target);
 block_job_completed(s-common, ret);



[Qemu-devel] [Bug 1303926] Re: qemu-system-x86_64 crashed with SIGABRT

2014-04-09 Thread Launchpad Bug Tracker
This bug was fixed in the package qemu - 2.0.0~rc1+dfsg-0ubuntu3

---
qemu (2.0.0~rc1+dfsg-0ubuntu3) trusty; urgency=medium

  * d/p/ubuntu/kvm_physical_sync_dirty_bitmap-ignore-ENOENT-from-kv.patch
don't abort() just because the kernel has no dirty bitmap.
(LP: #1303926)
 -- Serge Hallyn serge.hal...@ubuntu.com   Tue, 08 Apr 2014 22:32:00 -0500

** Changed in: qemu (Ubuntu)
   Status: Triaged = Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1303926

Title:
  qemu-system-x86_64 crashed with SIGABRT

Status in QEMU:
  New
Status in “qemu” package in Ubuntu:
  Fix Released

Bug description:
  I've been getting this periodically since my upgrade to qemu 2.0 in
  trusty this morning.

  ProblemType: Crash
  DistroRelease: Ubuntu 14.04
  Package: qemu-system-x86 2.0.0~rc1+dfsg-0ubuntu1
  ProcVersionSignature: Ubuntu 3.13.0-23.45-generic 3.13.8
  Uname: Linux 3.13.0-23-generic x86_64
  ApportVersion: 2.14.1-0ubuntu1
  Architecture: amd64
  Date: Mon Apr  7 13:31:53 2014
  ExecutablePath: /usr/bin/qemu-system-x86_64
  InstallationDate: Installed on 2013-11-26 (131 days ago)
  InstallationMedia: Ubuntu 13.10 Saucy Salamander - Release amd64 
(20131016.1)
  ProcEnviron: PATH=(custom, no user)
  Registers: No symbol table is loaded.  Use the file command.
  Signal: 6
  SourcePackage: qemu
  StacktraceTop:
   
  Title: qemu-system-x86_64 crashed with SIGABRT
  UpgradeStatus: Upgraded to trusty on 2014-01-17 (79 days ago)
  UserGroups:

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1303926/+subscriptions



Re: [Qemu-devel] [PATCH v2] PPC: Clean up DECR implementation

2014-04-09 Thread Tom Musta
On 4/8/2014 2:58 PM, Alexander Graf wrote:
 On 04/08/2014 09:56 PM, Tom Musta wrote:
 On 4/6/2014 3:55 PM, Alexander Graf wrote:
 snip

 @@ -806,6 +838,10 @@ clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, 
 uint32_t freq)
   tb_env = g_malloc0(sizeof(ppc_tb_t));
   env-tb_env = tb_env;
   tb_env-flags = PPC_DECR_UNDERFLOW_TRIGGERED;
 +if (env-insns_flags  PPC_SEGMENT_64B) {
 +/* All Book3S 64bit CPUs implement level based DEC logic */
 +tb_env-flags |= PPC_DECR_UNDERFLOW_LEVEL;
 +}
   /* Create new timer */
   tb_env-decr_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, 
 cpu_ppc_decr_cb, cpu);
   if (0) {
 Equating Book3S with PPC_SEGMENT_64B is clever ... is it too clever?  
 Especially since
 the SLB Bridge is in the phased-out category and consequently we should 
 expect future
 Book3S implementations to not support this instruction category.
 
 Maybe it's too clever :). I'm very open to suggestions on how to figure this 
 out otherwise. Or maybe we should just rework the way timers get created and 
 make them be part of the core itself?
 
 
 Alex
 

A somewhat more practical approach than redesigning timer init:  The phrasing 
introduced into Book3S
that corresponds to your UNDERFLOW_LEVEL flag has existed at least since ISA 
2.03.  And 2.03 introduced
some new features, like SPE and Altivec.  So ...

if (env-insns_flags  (PPC_SEGMENT_64B | PPC_SPE | PPC_ALTIVEC)) {
/* All Book3S 64bit CPUs implement level based DEC logic */
tb_env-flags |= PPC_DECR_UNDERFLOW_LEVEL;
}

would catch a few more.  I'm not sure we get into this code for Book3E 
machines, but if you are worried
about that you could also ensure that insns_flags doesn't have PPC_BOOKE on.




[Qemu-devel] [PATCH 1/2] target-ppc: Fix target_disas

2014-04-09 Thread Tom Musta
Inspect only bit 16 for the Little Endian test.  Correct comment preceding
the target_disas() function.  Correct grammar in comment for flags processing.

Signed-off-by: Tom Musta tommu...@gmail.com
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
V4: Correct grammar definitions of the instructions set == definition of
the instruction set

 disas.c |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/disas.c b/disas.c
index 79e6944..1397167 100644
--- a/disas.c
+++ b/disas.c
@@ -191,7 +191,8 @@ static int print_insn_od_target(bfd_vma pc, 
disassemble_info *info)
values:
 i386 - 1 means 16 bit code, 2 means 64 bit code
 arm  - bit 0 = thumb, bit 1 = reverse endian, bit 2 = A64
-ppc  - nonzero means little endian
+ppc  - bits 0:15 specify (optionally) the machine instruction set;
+   bit 16 indicates little endian.
 other targets - unused
  */
 void target_disas(FILE *out, CPUArchState *env, target_ulong code,
@@ -251,11 +252,11 @@ void target_disas(FILE *out, CPUArchState *env, 
target_ulong code,
 s.info.mach = bfd_mach_sparc_v9b;
 #endif
 #elif defined(TARGET_PPC)
-if (flags  16) {
+if ((flags  16)  1) {
 s.info.endian = BFD_ENDIAN_LITTLE;
 }
 if (flags  0x) {
-/* If we have a precise definitions of the instructions set, use it */
+/* If we have a precise definition of the instruction set, use it. */
 s.info.mach = flags  0x;
 } else {
 #ifdef TARGET_PPC64
-- 
1.7.1




[Qemu-devel] [PATCH 2/2] monitor: QEMU Monitor Instruction Disassembly Incorrect for PowerPC LE Mode

2014-04-09 Thread Tom Musta
The monitor support for disassembling instructions does not honor the MSR[LE]
bit for PowerPC processors.

This change enhances the monitor_disas() routine by supporting a flag bit
for Little Endian mode.  Bit 16 is used since that bit was used in the
analagous guest disassembly routine target_disas().

Also, to be consistent with target_disas(), the disassembler bfd_mach field
can be passed in the flags argument.

Reported-by: Anton Blanchard an...@samba.org
Signed-off-by: Tom Musta tommu...@gmail.com
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
V2: Look specifically at bit 16 for LE.  Support machine configuration in flags.
V3: Changed documentation of 'flags' argument to simply refer to the 
target_disas
  description (per Peter Maydell's review).
V4: Corrections to comments.

The bug can be easily observed by dumping the first few instructions of an
interrupt vector (0x300 is the Data Storage Interrupt handler in PPC)

  (qemu) xp/8i 0x300
  0x0300:  .long 0x60
  0x0304:  lhzur18,-19843(r3)
  0x0308:  .long 0x60
  0x030c:  lhzur18,-20099(r2)
  0x0310:  lwz r0,11769(0)
  0x0314:  lhzur23,8317(r2)
  0x0318:  .long 0x7813427c
  0x031c:  lbz r0,19961(0)

With the patch applied, the disassembly now looks correct:

  (qemu) xp/8i 0x300
  0x0300:  nop
  0x0304:  mtsprg  2,r13
  0x0308:  nop
  0x030c:  mfsprg  r13,1
  0x0310:  std r9,128(r13)
  0x0314:  mfspr   r9,896
  0x0318:  mr  r2,r2
  0x031c:  std r10,136(r13)

 disas.c   |   14 --
 monitor.c |4 
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/disas.c b/disas.c
index 1397167..44a019a 100644
--- a/disas.c
+++ b/disas.c
@@ -445,6 +445,8 @@ monitor_fprintf(FILE *stream, const char *fmt, ...)
 return 0;
 }
 
+/* Disassembler for the monitor.
+   See target_disas for a description of flags. */
 void monitor_disas(Monitor *mon, CPUArchState *env,
target_ulong pc, int nb_insn, int is_physical, int flags)
 {
@@ -485,11 +487,19 @@ void monitor_disas(Monitor *mon, CPUArchState *env,
 s.info.mach = bfd_mach_sparc_v9b;
 #endif
 #elif defined(TARGET_PPC)
+if (flags  0x) {
+/* If we have a precise definition of the instruction set, use it. */
+s.info.mach = flags  0x;
+} else {
 #ifdef TARGET_PPC64
-s.info.mach = bfd_mach_ppc64;
+s.info.mach = bfd_mach_ppc64;
 #else
-s.info.mach = bfd_mach_ppc;
+s.info.mach = bfd_mach_ppc;
 #endif
+}
+if ((flags  16)  1) {
+s.info.endian = BFD_ENDIAN_LITTLE;
+}
 print_insn = print_insn_ppc;
 #elif defined(TARGET_M68K)
 print_insn = print_insn_m68k;
diff --git a/monitor.c b/monitor.c
index 342e83b..3ae561b 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1309,6 +1309,10 @@ static void memory_dump(Monitor *mon, int count, int 
format, int wsize,
 }
 }
 #endif
+#ifdef TARGET_PPC
+flags = msr_le  16;
+flags |= env-bfd_mach;
+#endif
 monitor_disas(mon, env, addr, count, is_physical, flags);
 return;
 }
-- 
1.7.1




[Qemu-devel] [PATCH 0/2] QEMU Monitor Instruction Disassembly Incorrect for PPC LE

2014-04-09 Thread Tom Musta
Fix disassembly in the QEMU monitor for Little Endian codes.  Also fix the 
comment
and tighten up a flag check in the closely related disassembler code for
tracing.

V2: Fixed target_disas comment and bit decoding.
V3: Make monitor_disas flag documentation refer to target_disas documentation.
V4: Minor corrections to comments.

Tom Musta (2):
  target-ppc: Fix target_disas
  monitor: QEMU Monitor Instruction Disassembly Incorrect for PowerPC
LE Mode

 disas.c   |   21 -
 monitor.c |4 
 2 files changed, 20 insertions(+), 5 deletions(-)




[Qemu-devel] Turning off default storage devices?

2014-04-09 Thread Andy Lutomirski
Currently, -M q35 boots linux quite a bit slower than the default
machine type.  This seems to be because it takes a few hundred ms to
determine that there's nothing attached to the AHCI controller.

In virtio setups, there will probably never be anything attached to
the AHCI controller.  Would it be possible to add something like
-machine default_storage=off to turn off default storage devices?
This could include the AHCI on q35 and the cdrom and such on pc.

There's precedent: -machine usb=off turns off the default USB
controllers, which is great for setups that use xhci.

Thanks,
Andy



Re: [Qemu-devel] [PATCH v2] PPC: Clean up DECR implementation

2014-04-09 Thread Alexander Graf


 Am 09.04.2014 um 21:33 schrieb Tom Musta tommu...@gmail.com:
 
 On 4/8/2014 2:58 PM, Alexander Graf wrote:
 On 04/08/2014 09:56 PM, Tom Musta wrote:
 On 4/6/2014 3:55 PM, Alexander Graf wrote:
 snip
 
 @@ -806,6 +838,10 @@ clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, 
 uint32_t freq)
  tb_env = g_malloc0(sizeof(ppc_tb_t));
  env-tb_env = tb_env;
  tb_env-flags = PPC_DECR_UNDERFLOW_TRIGGERED;
 +if (env-insns_flags  PPC_SEGMENT_64B) {
 +/* All Book3S 64bit CPUs implement level based DEC logic */
 +tb_env-flags |= PPC_DECR_UNDERFLOW_LEVEL;
 +}
  /* Create new timer */
  tb_env-decr_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, 
 cpu_ppc_decr_cb, cpu);
  if (0) {
 Equating Book3S with PPC_SEGMENT_64B is clever ... is it too clever?  
 Especially since
 the SLB Bridge is in the phased-out category and consequently we should 
 expect future
 Book3S implementations to not support this instruction category.
 
 Maybe it's too clever :). I'm very open to suggestions on how to figure this 
 out otherwise. Or maybe we should just rework the way timers get created and 
 make them be part of the core itself?
 
 
 Alex
 
 A somewhat more practical approach than redesigning timer init:  The phrasing 
 introduced into Book3S
 that corresponds to your UNDERFLOW_LEVEL flag has existed at least since ISA 
 2.03.  And 2.03 introduced
 some new features, like SPE and Altivec.  So ...

e6500 supports Altivec :)

 
if (env-insns_flags  (PPC_SEGMENT_64B | PPC_SPE | PPC_ALTIVEC)) {
/* All Book3S 64bit CPUs implement level based DEC logic */
tb_env-flags |= PPC_DECR_UNDERFLOW_LEVEL;
}
 
 would catch a few more.  I'm not sure we get into this code for Book3E 
 machines, but if you are worried
 about that you could also ensure that insns_flags doesn't have PPC_BOOKE on.

Phew, I'm not quite sure. We can do it as an interim solution, but really the 
timer is a core property rather than a machine device :). So eventually this 
should get reworked.


Alex




Re: [Qemu-devel] [PATCH target-arm v1 1/1] net: cadence_gem: Make phy respond to broadcast

2014-04-09 Thread Beniamino Galvani
On Thu, Apr 03, 2014 at 11:55:19PM -0700, Peter Crosthwaite wrote:
 Phys must respond to address 0 by specification. Implement.
 
 Signed-off-by: Nathan Rossi nathan.ro...@xilinx.com
 Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
 ---
 
  hw/net/cadence_gem.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
 index 92dc2f2..e34b25e 100644
 --- a/hw/net/cadence_gem.c
 +++ b/hw/net/cadence_gem.c
 @@ -1093,7 +1093,7 @@ static uint64_t gem_read(void *opaque, hwaddr offset, 
 unsigned size)
  uint32_t phy_addr, reg_num;
  
  phy_addr = (retval  GEM_PHYMNTNC_ADDR)  
 GEM_PHYMNTNC_ADDR_SHFT;
 -if (phy_addr == BOARD_PHY_ADDRESS) {
 +if (phy_addr == BOARD_PHY_ADDRESS || phy_addr == 0) {
  reg_num = (retval  GEM_PHYMNTNC_REG)  
 GEM_PHYMNTNC_REG_SHIFT;
  retval = 0x;
  retval |= gem_phy_read(s, reg_num);
 @@ -1193,7 +1193,7 @@ static void gem_write(void *opaque, hwaddr offset, 
 uint64_t val,
  uint32_t phy_addr, reg_num;
  
  phy_addr = (val  GEM_PHYMNTNC_ADDR)  GEM_PHYMNTNC_ADDR_SHFT;
 -if (phy_addr == BOARD_PHY_ADDRESS) {
 +if (phy_addr == BOARD_PHY_ADDRESS || phy_addr == 0) {
  reg_num = (val  GEM_PHYMNTNC_REG)  GEM_PHYMNTNC_REG_SHIFT;
  gem_phy_write(s, reg_num, val);
  }


Although 802.3 standard dictates that PHYs must always respond to
address 0, AFAIK not all PHYs do this.

In this case, if the Marvell PHY complies with standard:

Reviewed-by: Beniamino Galvani b.galv...@gmail.com



[Qemu-devel] [PATCH] target-i386: Preserve the Z bit for bt/bts/btr/btc

2014-04-09 Thread Richard Henderson
Older Intel manuals (pre-2010) describe Z as undefined, but AMD and
newer Intel manuals describe Z as unchanged.

Signed-off-by: Richard Henderson r...@twiddle.net
---
 target-i386/translate.c | 40 +++-
 1 file changed, 31 insertions(+), 9 deletions(-)

---
Clemens, your patch fails to fix flags computation for bts/btr/btc,
which should be done similarly to bt.  

And to answer your question, no, QEMU does not make any assumptions
about undefined flags.  We often set them to zero, just because that
is easier than any other setting.


r~



diff --git a/target-i386/translate.c b/target-i386/translate.c
index 02625e3..032b0fd 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -6708,41 +6708,63 @@ static target_ulong disas_insn(CPUX86State *env, 
DisasContext *s,
 }
 bt_op:
 tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1  (3 + ot)) - 1);
+tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
 switch(op) {
 case 0:
-tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]);
-tcg_gen_movi_tl(cpu_cc_dst, 0);
 break;
 case 1:
-tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
 tcg_gen_movi_tl(cpu_tmp0, 1);
 tcg_gen_shl_tl(cpu_tmp0, cpu_tmp0, cpu_T[1]);
 tcg_gen_or_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
 break;
 case 2:
-tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
 tcg_gen_movi_tl(cpu_tmp0, 1);
 tcg_gen_shl_tl(cpu_tmp0, cpu_tmp0, cpu_T[1]);
-tcg_gen_not_tl(cpu_tmp0, cpu_tmp0);
-tcg_gen_and_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
+tcg_gen_andc_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
 break;
 default:
 case 3:
-tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]);
 tcg_gen_movi_tl(cpu_tmp0, 1);
 tcg_gen_shl_tl(cpu_tmp0, cpu_tmp0, cpu_T[1]);
 tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0);
 break;
 }
-set_cc_op(s, CC_OP_SARB + ot);
 if (op != 0) {
 if (mod != 3) {
 gen_op_st_v(s, ot, cpu_T[0], cpu_A0);
 } else {
 gen_op_mov_reg_v(ot, rm, cpu_T[0]);
 }
+}
+
+/* Delay all CC updates until after the store above.  Note that
+   C is the result of the test, Z is unchanged, and the others
+   are all undefined.  */
+switch (s-cc_op) {
+case CC_OP_MULB ... CC_OP_MULQ:
+case CC_OP_ADDB ... CC_OP_ADDQ:
+case CC_OP_ADCB ... CC_OP_ADCQ:
+case CC_OP_SUBB ... CC_OP_SUBQ:
+case CC_OP_SBBB ... CC_OP_SBBQ:
+case CC_OP_LOGICB ... CC_OP_LOGICQ:
+case CC_OP_INCB ... CC_OP_INCQ:
+case CC_OP_DECB ... CC_OP_DECQ:
+case CC_OP_SHLB ... CC_OP_SHLQ:
+case CC_OP_SARB ... CC_OP_SARQ:
+case CC_OP_BMILGB ... CC_OP_BMILGQ:
+/* Z was going to be computed from the non-zero status of CC_DST.
+   We can get that same Z value (and the new C value) by leaving
+   CC_DST alone, setting CC_SRC, and using a CC_OP_SAR of the
+   same width.  */
 tcg_gen_mov_tl(cpu_cc_src, cpu_tmp4);
-tcg_gen_movi_tl(cpu_cc_dst, 0);
+set_cc_op(s, ((s-cc_op - CC_OP_MULB)  3) + CC_OP_SARB);
+break;
+default:
+/* Otherwise, generate EFLAGS and replace the C bit.  */
+gen_compute_eflags(s);
+tcg_gen_deposit_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4,
+   ctz32(CC_C), 1);
+break;
 }
 break;
 case 0x1bc: /* bsf / tzcnt */
-- 
1.9.0




Re: [Qemu-devel] [PATCH microblaze v1 1/1] net: xilinx_axienet.c: Add phy soft reset bit clearing

2014-04-09 Thread Beniamino Galvani
On Tue, Apr 08, 2014 at 06:52:39PM -0700, Peter Crosthwaite wrote:
 From: Nathan Rossi nathan.ro...@xilinx.com
 
 Clear the BMCR Reset when writing to registers.
 
 Signed-off-by: Nathan Rossi nathan.ro...@xilinx.com
 [ PC:
  * Trivial style fixes to commit message
 ]
 Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
 ---
 
  hw/net/xilinx_axienet.c | 3 +++
  1 file changed, 3 insertions(+)
 
 diff --git a/hw/net/xilinx_axienet.c b/hw/net/xilinx_axienet.c
 index 839d97c..0f485a0 100644
 --- a/hw/net/xilinx_axienet.c
 +++ b/hw/net/xilinx_axienet.c
 @@ -142,6 +142,9 @@ tdk_write(struct PHY *phy, unsigned int req, unsigned int 
 data)
  phy-regs[regnum] = data;
  break;
  }
 +
 +/* Unconditionally clear regs[BMCR][BMCR_RESET] */
 +phy-regs[0] = ~0x8000;
  }
  
  static void
 -- 

Reviewed-by: Beniamino Galvani b.galv...@gmail.com

Ideally we should also restore default values of registers after a
reset, but probably it is not required for the guest to operate
properly.

Beniamino



  1   2   >