date:20131121

[Qemu-devel] [Bug 1248854] Re: The guest cannot boot up with parameter "-no-acpi"

2013-11-21 Thread chao zhou

** Changed in: qemu
   Status: Confirmed => Fix Committed

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1248854

Title:
  The guest cannot boot up with parameter "-no-acpi"

Status in QEMU:
  Fix Released

Bug description:
  Environment:
  
  Host OS (ia32/ia32e/IA64):ia32e
  Guest OS (ia32/ia32e/IA64):ia32e
  Guest OS Type (Linux/Windows):Linux
  kvm.git Commit:81e87e26796782e014fd1f2bb9cd8fb6ce4021a8
  qemu.git Commit:a126050a103c924b03388a9a64ce9af8c96b0969
  Host Kernel Version:3.12.0-rc5
  Hardware:Romley_EP ,Ivytowm_EP

  
  Bug detailed description:
  --
  when create guest with parameter "-no-acpi", the guest cannot boot up.

  note: this should be a qemu bug
  kvm  + qemu  =  result
  81e87e26 + a126050a  =  bad
  81e87e26 + b8616055  =  good

  Reproduce steps:
  
  1. create guest
  qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -net none -no-acpi rhel6u4.qcow

  Current result:
  
  guest cannot boot up

  Expected result:
  
  guest boot up fine.

  Basic root-causing log:
  --
  [root@vt-snb9 qemu]#qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -net none 
/root/rhel6u4.qcow -no-acpi
  VNC server running on `::1:5900'
  qemu-system-x86_64: /root/qemu/hw/i386/acpi-build.c:135: acpi_get_pm_info: 
Assertion `obj' failed.
  Aborted (core dumped)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1248854/+subscriptions

[Qemu-devel] [Bug 1248854] Re: The guest cannot boot up with parameter "-no-acpi"

2013-11-21 Thread chao zhou

kvm.git + qemu.git: ede58222_5c5432e7
test on Romley_EP, Ivytown_EP, create guest with parameter "-no-acpi", the 
guest boot up fine.

** Changed in: qemu
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1248854

Title:
  The guest cannot boot up with parameter "-no-acpi"

Status in QEMU:
  Confirmed

Bug description:
  Environment:
  
  Host OS (ia32/ia32e/IA64):ia32e
  Guest OS (ia32/ia32e/IA64):ia32e
  Guest OS Type (Linux/Windows):Linux
  kvm.git Commit:81e87e26796782e014fd1f2bb9cd8fb6ce4021a8
  qemu.git Commit:a126050a103c924b03388a9a64ce9af8c96b0969
  Host Kernel Version:3.12.0-rc5
  Hardware:Romley_EP ,Ivytowm_EP

  
  Bug detailed description:
  --
  when create guest with parameter "-no-acpi", the guest cannot boot up.

  note: this should be a qemu bug
  kvm  + qemu  =  result
  81e87e26 + a126050a  =  bad
  81e87e26 + b8616055  =  good

  Reproduce steps:
  
  1. create guest
  qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -net none -no-acpi rhel6u4.qcow

  Current result:
  
  guest cannot boot up

  Expected result:
  
  guest boot up fine.

  Basic root-causing log:
  --
  [root@vt-snb9 qemu]#qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -net none 
/root/rhel6u4.qcow -no-acpi
  VNC server running on `::1:5900'
  qemu-system-x86_64: /root/qemu/hw/i386/acpi-build.c:135: acpi_get_pm_info: 
Assertion `obj' failed.
  Aborted (core dumped)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1248854/+subscriptions

[Qemu-devel] [RFC PATCH v0 3/3] gluster: Add support for creating zero-filled image

2013-11-21 Thread Bharata B Rao

GlusterFS supports creation of zero-filled file on GlusterFS volume
by means of an API called glfs_zerofill(). Use this API from QEMU to
create an image that is filled with zeroes by using the preallocation
option of qemu-img.

qemu-img create gluster://server/volume/image -o preallocation=full 10G

The allowed values for preallocation are 'full' and 'off'. By default
preallocation is off and image is not zero-filled.

glfs_zerofill() offloads the writing of zeroes to the server and if
the storage supports SCSI WRITESAME, GlusterFS server can issue
BLKZEROOUT ioctl to achieve the zeroing.

Signed-off-by: Bharata B Rao 
---
 block/gluster.c | 45 +
 1 file changed, 45 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index 15f5dfb..2368997 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -381,6 +381,29 @@ out:
 qemu_aio_release(acb);
 return ret;
 }
+
+static inline int gluster_supports_zerofill(void)
+{
+return 1;
+}
+
+static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
+int64_t size)
+{
+return glfs_zerofill(fd, offset, size);
+}
+
+#else
+static inline int gluster_supports_zerofill(void)
+{
+return 0;
+}
+
+static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
+int64_t size)
+{
+return 0;
+}
 #endif
 
 static int qemu_gluster_create(const char *filename,
@@ -389,6 +412,7 @@ static int qemu_gluster_create(const char *filename,
 struct glfs *glfs;
 struct glfs_fd *fd;
 int ret = 0;
+int prealloc = 0;
 int64_t total_size = 0;
 GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
 
@@ -401,6 +425,18 @@ static int qemu_gluster_create(const char *filename,
 while (options && options->name) {
 if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
 total_size = options->value.n / BDRV_SECTOR_SIZE;
+} else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
+if (!options->value.s || !strcmp(options->value.s, "off")) {
+prealloc = 0;
+} else if (!strcmp(options->value.s, "full") &&
+gluster_supports_zerofill()) {
+prealloc = 1;
+} else {
+error_setg(errp, "Invalid preallocation mode: '%s'"
+" or GlusterFS doesn't support zerofill API",
+   options->value.s);
+return -EINVAL;
+}
 }
 options++;
 }
@@ -413,6 +449,10 @@ static int qemu_gluster_create(const char *filename,
 if (glfs_ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
 ret = -errno;
 }
+if (prealloc && qemu_gluster_zerofill(fd, 0,
+total_size * BDRV_SECTOR_SIZE)) {
+ret = -errno;
+}
 if (glfs_close(fd) != 0) {
 ret = -errno;
 }
@@ -595,6 +635,11 @@ static QEMUOptionParameter qemu_gluster_create_options[] = 
{
 .type = OPT_SIZE,
 .help = "Virtual disk size"
 },
+{
+.name = BLOCK_OPT_PREALLOC,
+.type = OPT_STRING,
+.help = "Preallocation mode (allowed values: off, on)"
+},
 { NULL }
 };
 
-- 
1.7.11.7

[Qemu-devel] [RFC PATCH v0 2/3] gluster: Implement .bdrv_co_write_zeroes for gluster

2013-11-21 Thread Bharata B Rao

Support .bdrv_co_write_zeroes() from gluster driver by using GlusterFS API
glfs_zerofill() that off-loads the writing of zeroes to GlusterFS server.

Signed-off-by: Bharata B Rao 
---
 block/gluster.c | 101 
 configure   |   8 +
 2 files changed, 81 insertions(+), 28 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 9f85228..15f5dfb 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -250,6 +250,34 @@ static void qemu_gluster_complete_aio(void *opaque)
 qemu_aio_release(acb);
 }
 
+/*
+ * AIO callback routine called from GlusterFS thread.
+ */
+static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
+{
+GlusterAIOCB *acb = (GlusterAIOCB *)arg;
+
+acb->ret = ret;
+acb->bh = qemu_bh_new(qemu_gluster_complete_aio, acb);
+qemu_bh_schedule(acb->bh);
+}
+
+static void qemu_gluster_aio_cancel(BlockDriverAIOCB *blockacb)
+{
+GlusterAIOCB *acb = (GlusterAIOCB *)blockacb;
+bool finished = false;
+
+acb->finished = &finished;
+while (!finished) {
+qemu_aio_wait();
+}
+}
+
+static const AIOCBInfo gluster_aiocb_info = {
+.aiocb_size = sizeof(GlusterAIOCB),
+.cancel = qemu_gluster_aio_cancel,
+};
+
 /* TODO Convert to fine grained options */
 static QemuOptsList runtime_opts = {
 .name = "gluster",
@@ -322,6 +350,39 @@ out:
 return ret;
 }
 
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+static int qemu_gluster_co_write_zeroes(BlockDriverState *bs,
+int64_t sector_num, int nb_sectors)
+{
+int ret;
+GlusterAIOCB *acb;
+BDRVGlusterState *s = bs->opaque;
+off_t size;
+off_t offset;
+
+offset = sector_num * BDRV_SECTOR_SIZE;
+size = nb_sectors * BDRV_SECTOR_SIZE;
+
+acb = qemu_aio_get(&gluster_aiocb_info, bs, NULL, NULL);
+acb->size = size;
+acb->ret = 0;
+acb->finished = NULL;
+acb->coroutine = qemu_coroutine_self();
+
+ret = glfs_zerofill_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
+if (ret < 0) {
+goto out;
+}
+
+qemu_coroutine_yield();
+return acb->ret;
+
+out:
+qemu_aio_release(acb);
+return ret;
+}
+#endif
+
 static int qemu_gluster_create(const char *filename,
 QEMUOptionParameter *options, Error **errp)
 {
@@ -364,34 +425,6 @@ out:
 return ret;
 }
 
-static void qemu_gluster_aio_cancel(BlockDriverAIOCB *blockacb)
-{
-GlusterAIOCB *acb = (GlusterAIOCB *)blockacb;
-bool finished = false;
-
-acb->finished = &finished;
-while (!finished) {
-qemu_aio_wait();
-}
-}
-
-static const AIOCBInfo gluster_aiocb_info = {
-.aiocb_size = sizeof(GlusterAIOCB),
-.cancel = qemu_gluster_aio_cancel,
-};
-
-/*
- * AIO callback routine called from GlusterFS thread.
- */
-static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
-{
-GlusterAIOCB *acb = (GlusterAIOCB *)arg;
-
-acb->ret = ret;
-acb->bh = qemu_bh_new(qemu_gluster_complete_aio, acb);
-qemu_bh_schedule(acb->bh);
-}
-
 static coroutine_fn int qemu_gluster_aio_rw(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
 {
@@ -583,6 +616,9 @@ static BlockDriver bdrv_gluster = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_discard  = qemu_gluster_co_discard,
 #endif
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
+#endif
 .create_options   = qemu_gluster_create_options,
 };
 
@@ -604,6 +640,9 @@ static BlockDriver bdrv_gluster_tcp = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_discard  = qemu_gluster_co_discard,
 #endif
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
+#endif
 .create_options   = qemu_gluster_create_options,
 };
 
@@ -625,6 +664,9 @@ static BlockDriver bdrv_gluster_unix = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_discard  = qemu_gluster_co_discard,
 #endif
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
+#endif
 .create_options   = qemu_gluster_create_options,
 };
 
@@ -646,6 +688,9 @@ static BlockDriver bdrv_gluster_rdma = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_discard  = qemu_gluster_co_discard,
 #endif
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
+#endif
 .create_options   = qemu_gluster_create_options,
 };
 
diff --git a/configure b/configure
index 508f6a5..3c267a4 100755
--- a/configure
+++ b/configure
@@ -255,6 +255,7 @@ coroutine_pool=""
 seccomp=""
 glusterfs=""
 glusterfs_discard="no"
+glusterfs_zerofill="no"
 virtio_blk_data_plane=""
 gtk=""
 gtkabi="2.0"
@@ -2670,6 +2671,9 @@ if test "$glusterfs" != "no" ; then
 if $pkg_config --atleast-version=5 glusterfs-api; then
   glusterfs_discard="yes"
 fi
+if $pkg_config --atlea

[Qemu-devel] [RFC PATCH v0 1/3] gluster: Convert aio routines into coroutines

2013-11-21 Thread Bharata B Rao

Convert the read, write, flush and discard implementations from aio-based
ones to coroutine based ones.

Signed-off-by: Bharata B Rao 
---
 block/gluster.c | 168 +---
 1 file changed, 63 insertions(+), 105 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 877686a..9f85228 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -26,11 +26,11 @@ typedef struct GlusterAIOCB {
 int ret;
 bool *finished;
 QEMUBH *bh;
+Coroutine *coroutine;
 } GlusterAIOCB;
 
 typedef struct BDRVGlusterState {
 struct glfs *glfs;
-int fds[2];
 struct glfs_fd *fd;
 int event_reader_pos;
 GlusterAIOCB *event_acb;
@@ -231,46 +231,23 @@ out:
 return NULL;
 }
 
-static void qemu_gluster_complete_aio(GlusterAIOCB *acb, BDRVGlusterState *s)
+static void qemu_gluster_complete_aio(void *opaque)
 {
-int ret;
-bool *finished = acb->finished;
-BlockDriverCompletionFunc *cb = acb->common.cb;
-void *opaque = acb->common.opaque;
-
-if (!acb->ret || acb->ret == acb->size) {
-ret = 0; /* Success */
-} else if (acb->ret < 0) {
-ret = acb->ret; /* Read/Write failed */
-} else {
-ret = -EIO; /* Partial read/write - fail it */
-}
+GlusterAIOCB *acb = (GlusterAIOCB *)opaque;
 
-qemu_aio_release(acb);
-cb(opaque, ret);
-if (finished) {
-*finished = true;
+if (acb->ret == acb->size) {
+acb->ret = 0;
+} else if (acb->ret > 0) {
+acb->ret = -EIO; /* Partial read/write - fail it */
 }
-}
 
-static void qemu_gluster_aio_event_reader(void *opaque)
-{
-BDRVGlusterState *s = opaque;
-ssize_t ret;
-
-do {
-char *p = (char *)&s->event_acb;
-
-ret = read(s->fds[GLUSTER_FD_READ], p + s->event_reader_pos,
-   sizeof(s->event_acb) - s->event_reader_pos);
-if (ret > 0) {
-s->event_reader_pos += ret;
-if (s->event_reader_pos == sizeof(s->event_acb)) {
-s->event_reader_pos = 0;
-qemu_gluster_complete_aio(s->event_acb, s);
-}
-}
-} while (ret < 0 && errno == EINTR);
+qemu_bh_delete(acb->bh);
+acb->bh = NULL;
+qemu_coroutine_enter(acb->coroutine, NULL);
+if (acb->finished) {
+*acb->finished = true;
+}
+qemu_aio_release(acb);
 }
 
 /* TODO Convert to fine grained options */
@@ -309,7 +286,6 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict 
*options,
 
 filename = qemu_opt_get(opts, "filename");
 
-
 s->glfs = qemu_gluster_init(gconf, filename);
 if (!s->glfs) {
 ret = -errno;
@@ -329,18 +305,8 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict 
*options,
 s->fd = glfs_open(s->glfs, gconf->image, open_flags);
 if (!s->fd) {
 ret = -errno;
-goto out;
 }
 
-ret = qemu_pipe(s->fds);
-if (ret < 0) {
-ret = -errno;
-goto out;
-}
-fcntl(s->fds[GLUSTER_FD_READ], F_SETFL, O_NONBLOCK);
-qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ],
-qemu_gluster_aio_event_reader, NULL, s);
-
 out:
 qemu_opts_del(opts);
 qemu_gluster_gconf_free(gconf);
@@ -414,28 +380,20 @@ static const AIOCBInfo gluster_aiocb_info = {
 .cancel = qemu_gluster_aio_cancel,
 };
 
+/*
+ * AIO callback routine called from GlusterFS thread.
+ */
 static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
 {
 GlusterAIOCB *acb = (GlusterAIOCB *)arg;
-BlockDriverState *bs = acb->common.bs;
-BDRVGlusterState *s = bs->opaque;
-int retval;
 
 acb->ret = ret;
-retval = qemu_write_full(s->fds[GLUSTER_FD_WRITE], &acb, sizeof(acb));
-if (retval != sizeof(acb)) {
-/*
- * Gluster AIO callback thread failed to notify the waiting
- * QEMU thread about IO completion.
- */
-error_report("Gluster AIO completion failed: %s", strerror(errno));
-abort();
-}
+acb->bh = qemu_bh_new(qemu_gluster_complete_aio, acb);
+qemu_bh_schedule(acb->bh);
 }
 
-static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs,
-int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-BlockDriverCompletionFunc *cb, void *opaque, int write)
+static coroutine_fn int qemu_gluster_aio_rw(BlockDriverState *bs,
+int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
 {
 int ret;
 GlusterAIOCB *acb;
@@ -446,10 +404,11 @@ static BlockDriverAIOCB 
*qemu_gluster_aio_rw(BlockDriverState *bs,
 offset = sector_num * BDRV_SECTOR_SIZE;
 size = nb_sectors * BDRV_SECTOR_SIZE;
 
-acb = qemu_aio_get(&gluster_aiocb_info, bs, cb, opaque);
+acb = qemu_aio_get(&gluster_aiocb_info, bs, NULL, NULL);
 acb->size = size;
 acb->ret = 0;
 acb->finished = NULL;
+acb->coroutine = qemu_coroutine_self();
 
 if (write) {
 ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,
@@ -462,11 +421

[Qemu-devel] [RFC PATCH v0 0/3] gluster: conversion to coroutines and supporting write_zeroes

2013-11-21 Thread Bharata B Rao

Hi,

This series is about converting all the bdrv_aio* implementations in gluster
driver to coroutine based implementations. Read, write, flush and discard
routines are converted.

This also adds support for .bdrv_co_write_zeroes() in gluster and provides
a new preallocation option with qemu-img (-o preallocation=full) that can
be used for raw images on GlusterFS backend to create fully allocated and
zero-filled images.

Bharata B Rao (3):
  gluster: Convert aio routines into coroutines
  gluster: Implement .bdrv_co_write_zeroes for gluster
  gluster: Add support for creating zero-filled image

 block/gluster.c | 298 
 configure   |   8 ++
 2 files changed, 181 insertions(+), 125 deletions(-)

-- 
1.7.11.7

Re: [Qemu-devel] [PATCH V5 3/5] qemu-iotests: add 058 internal snapshot export with qemu-nbd case

2013-11-21 Thread Wenchao Xia


于 2013/11/19 19:29, Kevin Wolf 写道:

Am 10.11.2013 um 23:03 hat Wenchao Xia geschrieben:

Signed-off-by: Wenchao Xia 
---
  tests/qemu-iotests/058 |  102 
  tests/qemu-iotests/058.out |   32 ++
  tests/qemu-iotests/check   |1 +
  tests/qemu-iotests/group   |1 +
  4 files changed, 136 insertions(+), 0 deletions(-)
  create mode 100755 tests/qemu-iotests/058
  create mode 100644 tests/qemu-iotests/058.out


I think this should have:

_supported_proto nbd

Or otherwise the check in common needs to be changed. At least for me
the test failed without an error message because I didn't have a
qemu-nbd symlink in place.

Kevin


  I found a little problem:
$QEMU_IMG snapshot -c sn1 "$TEST_IMG"
can't work when IMGPROTO=nbd, since in common.rc the image was not
exported as raw:

if [ $IMGPROTO = "nbd" ]; then
eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810  $TEST_IMG_FILE &"
QEMU_NBD_PID=$!
sleep 1 # FIXME: qemu-nbd needs to be listening before we continue
fi

So later the test case think $TEST_IMG as a raw image. All case used
$QEMU_IMG snapshot will fail, such as 015? Do you think this is a bug?

  Instead if change common.rc to exported as raw(I am not sure whether
it is done on purpose), how about add a check called:

_required_util

in common.rc?

Re: [Qemu-devel] [PATCH for 1.7] .gitignore: Ignore config.status

2013-11-21 Thread Stefan Weil

Am 22.11.2013 07:39, schrieb Fam Zheng:
> Signed-off-by: Fam Zheng 
> ---
>  .gitignore | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/.gitignore b/.gitignore
> index 5584b5f..1c9d63d 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -3,6 +3,7 @@ config-all-devices.*
>  config-all-disas.*
>  config-host.*
>  config-target.*
> +config.status
>  trace/generated-tracers.h
>  trace/generated-tracers.c
>  trace/generated-tracers-dtrace.h

Reviewed-by: Stefan Weil 

This is a small bug fix for the build environment without any risk which
might be added to QEMU 1.7.

Stefan

Re: [Qemu-devel] Buildbot failure: MinGW (was: qga/vss-win32/requester.h compile error)

2013-11-21 Thread Stefan Weil

Am 21.11.2013 09:40, schrieb Stefan Hajnoczi:
> Excellent, thanks!  The latest buildbot has compiled successfully:
> http://buildbot.b1-systems.de/qemu/builders/default_mingw32/builds/784
>
> It's now hitting a make check failure unrelated to VSS.
>
> Thanks,
> Stefan

'make check' for MinGW cross builds needs an installation of wine to run
the resulting exe files.

I also had to fix tests/Makefile to exclude some POSIX-only tests and to
makesome smaller code fixes.
Now I can run 'make check' for MinGW on a Linux host.

Patches will follow after QEMU 1.7 - maybe you can prepare the buildbot
with wine.

Regards,
Stefan

[Qemu-devel] [PATCH] .gitignore: Ignore config.status

2013-11-21 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index 5584b5f..1c9d63d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,7 @@ config-all-devices.*
 config-all-disas.*
 config-host.*
 config-target.*
+config.status
 trace/generated-tracers.h
 trace/generated-tracers.c
 trace/generated-tracers-dtrace.h
-- 
1.8.4.2

Re: [Qemu-devel] [RFC PATCH v3 0/2] Point-in-time snapshot exporting over NBD

2013-11-21 Thread Fam Zheng


On 2013年11月20日 10:32, Ian Main wrote:

On Thu, Oct 17, 2013 at 01:36:41PM +0800, Fam Zheng wrote:

This series adds for point-in-time snapshot NBD exporting based on
blockdev-backup (variant of drive-backup with existing device as target).


In general this seems to work great.  I did a bunch of testing and was
able to mount filesystems over NBD, do many writes (on backed up
image)/reads (on target), intense IO etc and all held up fine.

There are definitely some issues with some of the commands allowing you
to blow things up.  I'm interested to hear opinions on whether this is a
showstopper at this time or not.



Thanks very much for your testing, Ian. QEMU should report error instead 
of crashing with invalid operations. I added an "operation blocker" and 
incorporated into the next version.


In the future, we still want to review more on those commands and try to 
enable safe ones, and possibly also allow multiple block jobs on a BDS. 
For now it is good enough to only allow blockdev-backup on the source 
and NBD export on the target (which is safe, and all what we need for 
image fleecing, as this version shows). With that being a working 
implementation, I've posted v4. Please test and free to make comments.



We get a thin point-in-time snapshot by COW mechanism of drive-backup, and
export it through built in NBD server. The steps are as below:

  1. (SHELL) qemu-img create -f qcow2 BACKUP.qcow2 

 (Alternatively we can use -o backing_file=RUNNING-VM.img to omit explicitly
 providing the size by ourselves, but it's risky because RUNNING-VM.qcow2 is
 used r/w by guest. Whether or not setting backing file in the image file
 doesn't matter, as we are going to override the backing hd in the next
 step)

  2. (QMP) blockdev-add backing=source-drive file.driver=file 
file.filename=BACKUP.qcow2 id=target0 if=none driver=qcow2


I had to create a custom python script to make these commands work as
they require nested dicts.  Is that normal or did I miss something
entirely?



Yes, I use a python script locally to test it too.

Fam

[Qemu-devel] [PATCH v4 5/7] block: Parse "backing" option to reference existing BDS

2013-11-21 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 block.c   | 37 -
 include/block/block_int.h |  3 +++
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 3bf4c8a..a5da656 100644
--- a/block.c
+++ b/block.c
@@ -1166,11 +1166,33 @@ int bdrv_open(BlockDriverState *bs, const char 
*filename, QDict *options,
 /* If there is a backing file, use it */
 if ((flags & BDRV_O_NO_BACKING) == 0) {
 QDict *backing_options;
-
-qdict_extract_subqdict(options, &backing_options, "backing.");
-ret = bdrv_open_backing_file(bs, backing_options, &local_err);
-if (ret < 0) {
-goto close_and_fail;
+const char *backing_bs;
+
+backing_bs = qdict_get_try_str(options, "backing");
+if (backing_bs) {
+bs->backing_hd = bdrv_find(backing_bs);
+qdict_del(options, "backing");
+if (bs->backing_hd) {
+bdrv_ref(bs->backing_hd);
+assert(!bs->backing_blocker);
+error_setg(&bs->backing_blocker,
+   "device is used as backing hd of '%s'",
+   bs->device_name);
+bdrv_op_block_all(bs->backing_hd, bs->backing_blocker);
+pstrcpy(bs->backing_file, sizeof(bs->backing_file),
+bs->backing_hd->filename);
+pstrcpy(bs->backing_format, sizeof(bs->backing_format),
+bs->backing_hd->drv->format_name);
+} else {
+error_setg(errp, "Backing device not found: %s", backing_bs);
+goto close_and_fail;
+}
+} else {
+qdict_extract_subqdict(options, &backing_options, "backing.");
+ret = bdrv_open_backing_file(bs, backing_options, &local_err);
+if (ret < 0) {
+goto close_and_fail;
+}
 }
 }
 
@@ -1461,6 +1483,11 @@ void bdrv_close(BlockDriverState *bs)
 
 if (bs->drv) {
 if (bs->backing_hd) {
+if (error_is_set(&bs->backing_blocker)) {
+bdrv_op_unblock_all(bs->backing_hd,
+bs->backing_blocker);
+error_free(bs->backing_blocker);
+}
 bdrv_unref(bs->backing_hd);
 bs->backing_hd = NULL;
 }
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 60edc80..6db30c9 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -316,6 +316,9 @@ struct BlockDriverState {
 BlockJob *job;
 
 QDict *options;
+
+/* For backing reference, block the operations of named backing device */
+Error *backing_blocker;
 };
 
 int get_tmp_filename(char *filename, int size);
-- 
1.8.4.2

[Qemu-devel] [PATCH v4 7/7] block: Allow backup on referenced named BlockDriverState

2013-11-21 Thread Fam Zheng

Drive backup is a read only operation on source bs. We want to allow
this specific case to enable image-fleecing. Note that when
image-fleecing job starts, the job still add its blocker to source bs,
and any other operation on it will be blocked by that.

Signed-off-by: Fam Zheng 
---
 block.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block.c b/block.c
index a5da656..d30be51 100644
--- a/block.c
+++ b/block.c
@@ -1179,6 +1179,8 @@ int bdrv_open(BlockDriverState *bs, const char *filename, 
QDict *options,
"device is used as backing hd of '%s'",
bs->device_name);
 bdrv_op_block_all(bs->backing_hd, bs->backing_blocker);
+bdrv_op_unblock(bs->backing_hd, BLOCK_OP_TYPE_BACKUP,
+bs->backing_blocker);
 pstrcpy(bs->backing_file, sizeof(bs->backing_file),
 bs->backing_hd->filename);
 pstrcpy(bs->backing_format, sizeof(bs->backing_format),
-- 
1.8.4.2

[Qemu-devel] [PATCH v4 4/7] block: Add checks of blocker in block operations

2013-11-21 Thread Fam Zheng

Before operate on a BlockDriverState, respective types are checked
against bs->op_blockers and it will error out if there's a blocker.

Signed-off-by: Fam Zheng 
---
 blockdev-nbd.c |  4 
 blockdev.c | 25 +
 2 files changed, 29 insertions(+)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 922cf56..c49feae 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -92,6 +92,10 @@ void qmp_nbd_server_add(const char *device, bool 
has_writable, bool writable,
 return;
 }
 
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_NBD_SERVER_ADD, errp)) {
+return;
+}
+
 if (!has_writable) {
 writable = false;
 }
diff --git a/blockdev.c b/blockdev.c
index eb55b74..1921d27 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1001,6 +1001,11 @@ SnapshotInfo 
*qmp_blockdev_snapshot_delete_internal_sync(const char *device,
 return NULL;
 }
 
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE,
+   errp)) {
+return NULL;
+}
+
 ret = bdrv_snapshot_find_by_id_and_name(bs, id, name, &sn, &local_err);
 if (error_is_set(&local_err)) {
 error_propagate(errp, local_err);
@@ -1098,6 +1103,10 @@ static void 
internal_snapshot_prepare(BlkTransactionState *common,
 return;
 }
 
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT, errp)) {
+return;
+}
+
 if (!bdrv_is_inserted(bs)) {
 error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
 return;
@@ -1485,6 +1494,10 @@ void qmp_block_passwd(const char *device, const char 
*password, Error **errp)
 return;
 }
 
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_PASSWD, errp)) {
+return;
+}
+
 err = bdrv_set_key(bs, password);
 if (err == -EINVAL) {
 error_set(errp, QERR_DEVICE_NOT_ENCRYPTED, bdrv_get_device_name(bs));
@@ -1536,6 +1549,10 @@ void qmp_change_blockdev(const char *device, const char 
*filename,
 return;
 }
 
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_CHANGE, errp)) {
+return;
+}
+
 if (format) {
 drv = bdrv_find_whitelisted_format(format, bs->read_only);
 if (!drv) {
@@ -1586,6 +1603,10 @@ void qmp_block_set_io_throttle(const char *device, 
int64_t bps, int64_t bps_rd,
 return;
 }
 
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_SET_IO_THROTTLE, errp)) {
+return;
+}
+
 memset(&cfg, 0, sizeof(cfg));
 cfg.buckets[THROTTLE_BPS_TOTAL].avg = bps;
 cfg.buckets[THROTTLE_BPS_READ].avg  = bps_rd;
@@ -1684,6 +1705,10 @@ void qmp_block_resize(const char *device, int64_t size, 
Error **errp)
 return;
 }
 
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_RESIZE, errp)) {
+return;
+}
+
 if (size < 0) {
 error_set(errp, QERR_INVALID_PARAMETER_VALUE, "size", "a >0 size");
 return;
-- 
1.8.4.2

[Qemu-devel] [PATCH v4 6/7] qmp: add command 'blockdev-backup'

2013-11-21 Thread Fam Zheng

Similar to drive-backup, but this command uses a device id as target
instead of creating/opening an image file.

Also add blocker on target bs, since the target is also a named device
now.

Signed-off-by: Fam Zheng 
---
 block/backup.c   | 22 ++
 blockdev.c   | 46 ++
 qapi-schema.json | 49 +
 qmp-commands.hx  | 46 ++
 4 files changed, 163 insertions(+)

diff --git a/block/backup.c b/block/backup.c
index cad14c9..b608f08 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -338,6 +338,7 @@ static void coroutine_fn backup_run(void *opaque)
 hbitmap_free(job->bitmap);
 
 bdrv_iostatus_disable(target);
+bdrv_op_unblock_all(target, job->common.blocker);
 bdrv_unref(target);
 
 block_job_completed(&job->common, ret);
@@ -363,6 +364,24 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 return;
 }
 
+if (!bdrv_is_inserted(bs)) {
+error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, bs->device_name);
+return;
+}
+
+if (!bdrv_is_inserted(target)) {
+error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, target->device_name);
+return;
+}
+
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP, errp)) {
+return;
+}
+
+if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP, errp)) {
+return;
+}
+
 len = bdrv_getlength(bs);
 if (len < 0) {
 error_setg_errno(errp, -len, "unable to get length for '%s'",
@@ -376,6 +395,9 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 return;
 }
 
+bdrv_op_block_all(target, job->common.blocker);
+bdrv_op_unblock(target, BLOCK_OP_TYPE_NBD_SERVER_ADD, job->common.blocker);
+
 job->on_source_error = on_source_error;
 job->on_target_error = on_target_error;
 job->target = target;
diff --git a/blockdev.c b/blockdev.c
index 1921d27..e62fb35 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1900,6 +1900,8 @@ void qmp_drive_backup(const char *device, const char 
*target,
 return;
 }
 
+/* Although backup_run has this check too, we need to use bs->drv below, so
+ * do an early check redundantly. */
 if (!bdrv_is_inserted(bs)) {
 error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
 return;
@@ -1974,6 +1976,50 @@ void qmp_drive_backup(const char *device, const char 
*target,
 }
 }
 
+void qmp_blockdev_backup(const char *device, const char *target,
+ enum MirrorSyncMode sync,
+ bool has_speed, int64_t speed,
+ bool has_on_source_error,
+ BlockdevOnError on_source_error,
+ bool has_on_target_error,
+ BlockdevOnError on_target_error,
+ Error **errp)
+{
+BlockDriverState *bs;
+BlockDriverState *target_bs;
+Error *local_err = NULL;
+
+if (!has_speed) {
+speed = 0;
+}
+if (!has_on_source_error) {
+on_source_error = BLOCKDEV_ON_ERROR_REPORT;
+}
+if (!has_on_target_error) {
+on_target_error = BLOCKDEV_ON_ERROR_REPORT;
+}
+
+bs = bdrv_find(device);
+if (!bs) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+return;
+}
+
+target_bs = bdrv_find(target);
+if (!target_bs) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, target);
+return;
+}
+
+bdrv_ref(target_bs);
+backup_start(bs, target_bs, speed, sync, on_source_error, on_target_error,
+ block_job_cb, bs, &local_err);
+if (local_err != NULL) {
+bdrv_unref(target_bs);
+error_propagate(errp, local_err);
+}
+}
+
 #define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
 
 void qmp_drive_mirror(const char *device, const char *target,
diff --git a/qapi-schema.json b/qapi-schema.json
index 4656e8c..bd3c55c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1844,6 +1844,40 @@
 '*on-target-error': 'BlockdevOnError' } }
 
 ##
+# @BlockdevBackup
+#
+# @device: the name of the device which should be copied.
+#
+# @target: the name of the backup target device
+#
+# @sync: what parts of the disk image should be copied to the destination
+#(all the disk, only the sectors allocated in the topmost image, or
+#only new I/O).
+#
+# @speed: #optional the maximum speed, in bytes per second
+#
+# @on-source-error: #optional the action to take on an error on the source,
+#   default 'report'.  'stop' and 'enospc' can only be used
+#   if the block device supports io-status (see BlockInfo).
+#
+# @on-target-error: #optional the action to take on an error on the target,
+#   default 'report' (no limitations, since this applies to
+#   a different block device than @device).
+#
+# Note that @on-source-error and @on-

[Qemu-devel] [PATCH v4 3/7] block: Replace in_use with operation blocker

2013-11-21 Thread Fam Zheng

This drops BlockDriverState.in_use with op_blockers:

  - Call bdrv_op_block_all in place of bdrv_set_in_use(bs, 1).
  - Call bdrv_op_unblock_all in place of bdrv_set_in_use(bs, 0).
  - Check bdrv_op_is_blocked() in place of bdrv_in_use(bs).
The specific types are used, e.g. in place of starting block backup,
bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP, ...).
  - Check bdrv_op_blocker_is_empty() in place of assert(!bs->in_use).

Note: there is no block for only specific types for now, i.e. a caller
blocks all or none. So although the checks are type specific, the above
changes can still be seen as identical in logic.

Signed-off-by: Fam Zheng 
---
 block-migration.c   |  8 ++--
 block.c | 34 +-
 blockdev.c  | 29 +++--
 blockjob.c  | 12 +++-
 hw/block/dataplane/virtio-blk.c | 16 ++--
 include/block/block.h   |  2 --
 include/block/block_int.h   |  1 -
 include/block/blockjob.h|  3 +++
 8 files changed, 66 insertions(+), 39 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index daf9ec1..12afcfa 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -58,6 +58,8 @@ typedef struct BlkMigDevState {
 /* Protected by block migration lock.  */
 unsigned long *aio_bitmap;
 int64_t completed_sectors;
+
+Error *blocker;
 } BlkMigDevState;
 
 typedef struct BlkMigBlock {
@@ -336,7 +338,8 @@ static void init_blk_migration_it(void *opaque, 
BlockDriverState *bs)
 bmds->completed_sectors = 0;
 bmds->shared_base = block_mig_state.shared_base;
 alloc_aio_bitmap(bmds);
-bdrv_set_in_use(bs, 1);
+error_setg(&bmds->blocker, "block device is in use by migration");
+bdrv_op_block_all(bs, bmds->blocker);
 bdrv_ref(bs);
 
 block_mig_state.total_sector_sum += sectors;
@@ -574,7 +577,8 @@ static void blk_mig_cleanup(void)
 blk_mig_lock();
 while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
 QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
-bdrv_set_in_use(bmds->bs, 0);
+bdrv_op_unblock_all(bmds->bs, bmds->blocker);
+error_free(bmds->blocker);
 bdrv_unref(bmds->bs);
 g_free(bmds->aio_bitmap);
 g_free(bmds);
diff --git a/block.c b/block.c
index 2b18a43..3bf4c8a 100644
--- a/block.c
+++ b/block.c
@@ -1628,15 +1628,17 @@ static void bdrv_move_feature_fields(BlockDriverState 
*bs_dest,
 bs_dest->refcnt = bs_src->refcnt;
 
 /* job */
-bs_dest->in_use = bs_src->in_use;
 bs_dest->job= bs_src->job;
 
 /* keep the same entry in bdrv_states */
 pstrcpy(bs_dest->device_name, sizeof(bs_dest->device_name),
 bs_src->device_name);
+memcpy(bs_dest->op_blockers, bs_src->op_blockers,
+   sizeof(bs_dest->op_blockers));
 bs_dest->list = bs_src->list;
 }
 
+static bool bdrv_op_blocker_is_empty(BlockDriverState *bs);
 /*
  * Swap bs contents for two image chains while they are live,
  * while keeping required fields on the BlockDriverState that is
@@ -1658,7 +1660,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState 
*bs_old)
 assert(bs_new->dirty_bitmap == NULL);
 assert(bs_new->job == NULL);
 assert(bs_new->dev == NULL);
-assert(bs_new->in_use == 0);
+assert(bdrv_op_blocker_is_empty(bs_new));
 assert(bs_new->io_limits_enabled == false);
 assert(!throttle_have_timer(&bs_new->throttle_state));
 
@@ -1677,7 +1679,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState 
*bs_old)
 /* Check a few fields that should remain attached to the device */
 assert(bs_new->dev == NULL);
 assert(bs_new->job == NULL);
-assert(bs_new->in_use == 0);
+assert(bdrv_op_blocker_is_empty(bs_new));
 assert(bs_new->io_limits_enabled == false);
 assert(!throttle_have_timer(&bs_new->throttle_state));
 
@@ -1714,7 +1716,7 @@ static void bdrv_delete(BlockDriverState *bs)
 {
 assert(!bs->dev);
 assert(!bs->job);
-assert(!bs->in_use);
+assert(bdrv_op_blocker_is_empty(bs));
 assert(!bs->refcnt);
 
 bdrv_close(bs);
@@ -1887,6 +1889,7 @@ int bdrv_commit(BlockDriverState *bs)
 int ret = 0;
 uint8_t *buf;
 char filename[PATH_MAX];
+Error *local_err;
 
 if (!drv)
 return -ENOMEDIUM;
@@ -1895,7 +1898,9 @@ int bdrv_commit(BlockDriverState *bs)
 return -ENOTSUP;
 }
 
-if (bdrv_in_use(bs) || bdrv_in_use(bs->backing_hd)) {
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_COMMIT, &local_err) ||
+bdrv_op_is_blocked(bs->backing_hd, BLOCK_OP_TYPE_COMMIT, &local_err)) {
+error_free(local_err);
 return -EBUSY;
 }
 
@@ -2831,6 +2836,7 @@ int coroutine_fn bdrv_co_write_zeroes(BlockDriverState 
*bs,
 int bdrv_truncate(BlockDriverState *bs, int64_t offset)
 {
 BlockDriver *drv = bs->d

[Qemu-devel] [PATCH v4 2/7] block: Introduce op_blockers to BlockDriverState

2013-11-21 Thread Fam Zheng

BlockDriverState.op_blockers is an array of list with BLOCK_OP_TYPE_MAX
elements. Each list is a list of blockers of an operation type
(BlockOpType), that marks this BDS is currently blocked for certain type
of operation with reason errors stored in the list. The rule of usage
is:

 * BDS user who wants to take an operation should check if there's any
   blocker of the type with bdrv_op_is_blocked().

 * BDS user who wants to block certain types of operation, should call
   bdrv_op_block (or bdrv_op_block_all to block all types of operations,
   which is similar to bdrv_set_in_use of now).

 * A blocker is only referenced by op_blockers, so the lifecycle is
   managed by caller, and shouldn't be lost until unblock, so typically
   a caller does these:

   - Allocate a blocker with error_setg or similar, call bdrv_op_block()
 to block some operations.
   - Hold the blocker, do his job.
   - Unblock operations that it blocked, with the same reason pointer
 passed to bdrv_op_unblock().
   - Release the blocker with error_free().

Signed-off-by: Fam Zheng 
---
 block.c   | 55 +++
 include/block/block.h |  6 ++
 include/block/block_int.h |  5 +
 3 files changed, 66 insertions(+)

diff --git a/block.c b/block.c
index 382ea71..2b18a43 100644
--- a/block.c
+++ b/block.c
@@ -4425,6 +4425,61 @@ void bdrv_unref(BlockDriverState *bs)
 }
 }
 
+struct BdrvOpBlocker {
+Error *reason;
+QLIST_ENTRY(BdrvOpBlocker) list;
+};
+
+bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp)
+{
+BdrvOpBlocker *blocker;
+assert(op >=0 && op < BLOCK_OP_TYPE_MAX);
+if (!QLIST_EMPTY(&bs->op_blockers[op])) {
+blocker = QLIST_FIRST(&bs->op_blockers[op]);
+*errp = error_copy(blocker->reason);
+return true;
+}
+return false;
+}
+
+void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
+{
+BdrvOpBlocker *blocker;
+assert(op >=0 && op < BLOCK_OP_TYPE_MAX);
+
+blocker = g_malloc0(sizeof(BdrvOpBlocker));
+blocker->reason = reason;
+QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
+}
+
+void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason)
+{
+BdrvOpBlocker *blocker, *next;
+assert(op >=0 && op < BLOCK_OP_TYPE_MAX);
+QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
+if (blocker->reason == reason) {
+QLIST_REMOVE(blocker, list);
+g_free(blocker);
+}
+}
+}
+
+void bdrv_op_block_all(BlockDriverState *bs, Error *reason)
+{
+int i;
+for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
+bdrv_op_block(bs, i, reason);
+}
+}
+
+void bdrv_op_unblock_all(BlockDriverState *bs, Error *reason)
+{
+int i;
+for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
+bdrv_op_unblock(bs, i, reason);
+}
+}
+
 void bdrv_set_in_use(BlockDriverState *bs, int in_use)
 {
 assert(bs->in_use != in_use);
diff --git a/include/block/block.h b/include/block/block.h
index 3560deb..693d305 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -403,6 +403,12 @@ void bdrv_unref(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
+bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
+void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
+void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason);
+void bdrv_op_block_all(BlockDriverState *bs, Error *reason);
+void bdrv_op_unblock_all(BlockDriverState *bs, Error *reason);
+
 #ifdef CONFIG_LINUX_AIO
 int raw_get_aio_fd(BlockDriverState *bs);
 #else
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1666066..d8cef85 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -230,6 +230,8 @@ struct BlockDriver {
 QLIST_ENTRY(BlockDriver) list;
 };
 
+typedef struct BdrvOpBlocker BdrvOpBlocker;
+
 /*
  * Note: the function bdrv_append() copies and swaps contents of
  * BlockDriverStates, so if you add new fields to this struct, please
@@ -308,6 +310,9 @@ struct BlockDriverState {
 
 QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
 
+/* operation blockers */
+QLIST_HEAD(, BdrvOpBlocker) op_blockers[BLOCK_OP_TYPE_MAX];
+
 /* long-running background operation */
 BlockJob *job;
 
-- 
1.8.4.2

[Qemu-devel] [PATCH v4 0/7] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD

2013-11-21 Thread Fam Zheng

This series adds for point-in-time snapshot NBD exporting based on
blockdev-backup (variant of drive-backup with existing device as target).

We get a thin point-in-time snapshot by COW mechanism of drive-backup, and
export it through built in NBD server. The steps are as below:

 1. (SHELL) qemu-img create -f qcow2 BACKUP.qcow2 

(Alternatively we can use -o backing_file=RUNNING-VM.img to omit explicitly
providing the size by ourselves, but it's risky because RUNNING-VM.qcow2 is
used r/w by guest. Whether or not setting backing file in the image file
doesn't matter, as we are going to override the backing hd in the next
step)

 2. (QMP) blockdev-add backing=source-drive file.driver=file 
file.filename=BACKUP.qcow2 id=target0 if=none driver=qcow2

(where ide0-hd0 is the running BlockDriverState name for
RUNNING-VM.img. This patch implements "backing=" option to override
backing_hd for added drive)

 3. (QMP) blockdev-backup device=source-drive sync=none target=target0

(this is the QMP command introduced by this series, which use a named
device as target of drive-backup)

 4. (QMP) nbd-server-add device=target0

When image fleecing done:

 1. (QMP) block-job-complete device=ide0-hd0

 2. (HMP) drive_del target0

 3. (SHELL) rm BACKUP.qcow2

v4: Dropping RFC, this series tries to address the crashing cases with an added
safety mechanics.

In the first half of series, replace the in_use flag with an operation
blocker list, and block all operations on named backing hd:

[01/07] qapi: Add BlockOperationType enum
[02/07] block: Introduce op_blockers to BlockDriverState
[03/07] block: Replace in_use with operation blocker
[04/07] block: Add checks of blocker in block operations

The second half enables point in time snapshot over NBD, with fixes from
last revision:

[05/07] block: Parse "backing" option to reference existing BDS
Fix NULL dereference if device not found.
[06/07] qmp: add command 'blockdev-backup'
Moved some checks into backup_run. (Paolo)
[07/07] block: Allow backup on referenced named BlockDriverState
New. Removes one specific blocker on backing referenced BDS, so we
can start a backup job on it.

v3: Base on blockdev-add.
The syntax blockdev-add backing= is new: This will make referenced BDS
in the middle of a backing chain, which has significant effects over all
existing block operations in QMP. It needs to be reviewed and tested 
carefully.

I'm listing the commands here that can take a device id as parameter (but
may not be complete).

These commands do not mutate the backing chain (not directly, at least) and
should be safe:
block_passwd
block_set_io_throttle
block-job-set-speed
block-job-cancel
block-job-pause
block-job-resume
block-job-complete
drive-backup
blockdev-snapshot-sync
blockdev-snapshot-internal-sync
blockdev-snapshot-delete-internal-sync: These should be safe.
nbd-server-add

These can mutates the chain (removing, closing or swapping BDS), need more
work to convert them to safe operations with a BDS in the middle.
device_del
eject: it does bdrv_close the device.
change: internally calls eject.
block-commit: it deletes intermediate BDSes, which may include other 
named BDS.
block-stream: 
drive-mirror: it swaps active with target when complete.

Resizing a middle BDS need to be reviewed too:
block_resize: TBD.

Adding and backing HD referencing will put other BDS in middle, but should
not immediately break things:
blockdev-add


Fam Zheng (7):
  qapi: Add BlockOperationType enum
  block: Introduce op_blockers to BlockDriverState
  block: Replace in_use with operation blocker
  block: Add checks of blocker in block operations
  block: Parse "backing" option to reference existing BDS
  qmp: add command 'blockdev-backup'
  block: Allow backup on referenced named BlockDriverState

 block-migration.c   |   8 ++-
 block.c | 124 ++--
 block/backup.c  |  22 +++
 blockdev-nbd.c  |   4 ++
 blockdev.c  | 100 
 blockjob.c  |  12 ++--
 hw/block/dataplane/virtio-blk.c |  16 --
 include/block/block.h   |   8 ++-
 include/block/block_int.h   |   9 ++-
 include/block/blockjob.h|   3 +
 qapi-schema.json|  74 
 qmp-commands.hx |  46 +++
 12 files changed, 384 insertions(+), 42 deletions(-)

-- 
1.8.4.2

[Qemu-devel] [PATCH v4 1/7] qapi: Add BlockOperationType enum

2013-11-21 Thread Fam Zheng

This adds the enum of all the operations that can be taken on a block
device.

Signed-off-by: Fam Zheng 
---
 qapi-schema.json | 25 +
 1 file changed, 25 insertions(+)

diff --git a/qapi-schema.json b/qapi-schema.json
index 83fa485..4656e8c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1440,6 +1440,31 @@
   'data': ['commit', 'stream', 'mirror', 'backup'] }
 
 ##
+# @BlockOperationType
+# Type of a block operation.
+#
+# Since: 1.8
+##
+{ 'enum': 'BlockOpType',
+  'data': [
+'backup',
+'change',
+'commit',
+'dataplane',
+'drive-del',
+'eject',
+'external-snapshot',
+'internal-snapshot',
+'internal-snapshot-delete',
+'mirror',
+'nbd-server-add',
+'passwd',
+'resize',
+'set-io-throttle',
+'stream'
+] }
+
+##
 # @BlockJobInfo:
 #
 # Information about a long-running block device operation.
-- 
1.8.4.2

[Qemu-devel] [PATCH V16 07/11] NUMA: expand MAX_NODES from 64 to 128

2013-11-21 Thread Wanlong Gao

libnuma choosed 128 for MAX_NODES, so we follow libnuma here.

Signed-off-by: Wanlong Gao 
---
 include/sysemu/sysemu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 291aa6a..807619e 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -132,7 +132,7 @@ extern size_t boot_splash_filedata_size;
 extern uint8_t qemu_extra_params_fw[2];
 extern QEMUClockType rtc_clock;
 
-#define MAX_NODES 64
+#define MAX_NODES 128
 #define MAX_CPUMASK_BITS 255
 extern int nb_numa_nodes;
 extern int nb_numa_mem_nodes;
-- 
1.8.5.rc3

[Qemu-devel] [PATCH V16 09/11] NUMA: set guest numa nodes memory policy

2013-11-21 Thread Wanlong Gao

Set the guest numa nodes memory policies using the mbind(2)
system call node by node.
After this patch, we are able to set guest nodes memory policies
through the QEMU options, this arms to solve the guest cross
nodes memory access performance issue.
And as you all know, if PCI-passthrough is used,
direct-attached-device uses DMA transfer between device and qemu process.
All pages of the guest will be pinned by get_user_pages().

KVM_ASSIGN_PCI_DEVICE ioctl
  kvm_vm_ioctl_assign_device()
=>kvm_assign_device()
  => kvm_iommu_map_memslots()
=> kvm_iommu_map_pages()
   => kvm_pin_pages()

So, with direct-attached-device, all guest page's page count will be +1 and
any page migration will not work. AutoNUMA won't too.

So, we should set the guest nodes memory allocation policies before
the pages are really mapped.

Signed-off-by: Andre Przywara 
Signed-off-by: Wanlong Gao 
---
 numa.c | 86 ++
 1 file changed, 86 insertions(+)

diff --git a/numa.c b/numa.c
index da4dbbd..915a67a 100644
--- a/numa.c
+++ b/numa.c
@@ -27,6 +27,16 @@
 #include "qapi-visit.h"
 #include "qapi/opts-visitor.h"
 #include "qapi/dealloc-visitor.h"
+#include "exec/memory.h"
+
+#ifdef __linux__
+#include 
+#ifndef MPOL_F_RELATIVE_NODES
+#define MPOL_F_RELATIVE_NODES (1 << 14)
+#define MPOL_F_STATIC_NODES   (1 << 15)
+#endif
+#endif
+
 QemuOptsList qemu_numa_opts = {
 .name = "numa",
 .implied_opt_name = "type",
@@ -228,6 +238,75 @@ void set_numa_nodes(void)
 }
 }
 
+#ifdef __linux__
+static int node_parse_bind_mode(unsigned int nodeid)
+{
+int bind_mode;
+
+switch (numa_info[nodeid].policy) {
+case NUMA_NODE_POLICY_DEFAULT:
+case NUMA_NODE_POLICY_PREFERRED:
+case NUMA_NODE_POLICY_MEMBIND:
+case NUMA_NODE_POLICY_INTERLEAVE:
+bind_mode = numa_info[nodeid].policy;
+break;
+default:
+bind_mode = NUMA_NODE_POLICY_DEFAULT;
+return bind_mode;
+}
+
+bind_mode |= numa_info[nodeid].relative ?
+MPOL_F_RELATIVE_NODES : MPOL_F_STATIC_NODES;
+
+return bind_mode;
+}
+#endif
+
+static int set_node_mem_policy(int nodeid)
+{
+#ifdef __linux__
+void *ram_ptr;
+RAMBlock *block;
+ram_addr_t len, ram_offset = 0;
+int bind_mode;
+int i;
+
+QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+if (!strcmp(block->mr->name, "pc.ram")) {
+break;
+}
+}
+
+if (block->host == NULL) {
+return -1;
+}
+
+ram_ptr = block->host;
+for (i = 0; i < nodeid; i++) {
+len = numa_info[i].node_mem;
+ram_offset += len;
+}
+
+len = numa_info[nodeid].node_mem;
+bind_mode = node_parse_bind_mode(nodeid);
+unsigned long *nodes = numa_info[nodeid].host_mem;
+
+/* This is a workaround for a long standing bug in Linux'
+ * mbind implementation, which cuts off the last specified
+ * node. To stay compatible should this bug be fixed, we
+ * specify one more node and zero this one out.
+ */
+unsigned long maxnode = find_last_bit(nodes, MAX_NODES);
+if (syscall(SYS_mbind, ram_ptr + ram_offset, len, bind_mode,
+nodes, maxnode + 2, 0)) {
+perror("mbind");
+return -1;
+}
+#endif
+
+return 0;
+}
+
 void set_numa_modes(void)
 {
 CPUState *cpu;
@@ -240,4 +319,11 @@ void set_numa_modes(void)
 }
 }
 }
+
+for (i = 0; i < nb_numa_nodes; i++) {
+if (set_node_mem_policy(i) == -1) {
+fprintf(stderr,
+"qemu: can not set host memory policy for node%d\n", i);
+}
+}
 }
-- 
1.8.5.rc3

[Qemu-devel] [PATCH] fixup

2013-11-21 Thread Wanlong Gao

Signed-off-by: Wanlong Gao 
---
 hw/i386/pc.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 50ed4cc..74c1f16 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1072,8 +1072,12 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t 
below_4g_mem_size,
 guest_info->apic_id_limit = pc_apic_id_limit(max_cpus);
 guest_info->apic_xrupt_override = kvm_allows_irq0_override();
 guest_info->numa_nodes = nb_numa_nodes;
-guest_info->node_mem = g_memdup(node_mem, guest_info->numa_nodes *
+guest_info->node_mem = g_malloc0(guest_info->numa_nodes *
 sizeof *guest_info->node_mem);
+for (i = 0; i < nb_numa_nodes; i++) {
+guest_info->node_mem[i] = numa_info[i].node_mem;
+}
+
 guest_info->node_cpu = g_malloc0(guest_info->apic_id_limit *
  sizeof *guest_info->node_cpu);
 
@@ -1081,7 +1085,7 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t 
below_4g_mem_size,
 unsigned int apic_id = x86_cpu_apic_id_from_index(i);
 assert(apic_id < guest_info->apic_id_limit);
 for (j = 0; j < nb_numa_nodes; j++) {
-if (test_bit(i, node_cpumask[j])) {
+if (test_bit(i, numa_info[j].node_cpu)) {
 guest_info->node_cpu[apic_id] = j;
 break;
 }
-- 
1.8.5.rc0.44.gf26f72d

[Qemu-devel] [PATCH V16 01/11] NUMA: move numa related code to new file numa.c

2013-11-21 Thread Wanlong Gao

Signed-off-by: Wanlong Gao 
---
 Makefile.target |   2 +-
 cpus.c  |  14 
 include/sysemu/cpus.h   |   1 -
 include/sysemu/sysemu.h |   3 +
 numa.c  | 182 
 vl.c| 139 +---
 6 files changed, 187 insertions(+), 154 deletions(-)
 create mode 100644 numa.c

diff --git a/Makefile.target b/Makefile.target
index af6ac7e..0197c17 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -109,7 +109,7 @@ endif #CONFIG_BSD_USER
 #
 # System emulator target
 ifdef CONFIG_SOFTMMU
-obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
+obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
 obj-y += qtest.o
 obj-y += hw/
 obj-$(CONFIG_FDT) += device_tree.o
diff --git a/cpus.c b/cpus.c
index 01d128d..53360b0 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1297,20 +1297,6 @@ static void tcg_exec_all(void)
 exit_request = 0;
 }
 
-void set_numa_modes(void)
-{
-CPUState *cpu;
-int i;
-
-CPU_FOREACH(cpu) {
-for (i = 0; i < nb_numa_nodes; i++) {
-if (test_bit(cpu->cpu_index, node_cpumask[i])) {
-cpu->numa_node = i;
-}
-}
-}
-}
-
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg)
 {
 /* XXX: implement xxx_cpu_list for targets that still miss it */
diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
index 6502488..4f79081 100644
--- a/include/sysemu/cpus.h
+++ b/include/sysemu/cpus.h
@@ -23,7 +23,6 @@ extern int smp_threads;
 #define smp_threads 1
 #endif
 
-void set_numa_modes(void);
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg);
 
 #endif
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 495dae8..2509649 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -136,6 +136,9 @@ extern QEMUClockType rtc_clock;
 extern int nb_numa_nodes;
 extern uint64_t node_mem[MAX_NODES];
 extern unsigned long *node_cpumask[MAX_NODES];
+void numa_add(const char *optarg);
+void set_numa_nodes(void);
+void set_numa_modes(void);
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/numa.c b/numa.c
new file mode 100644
index 000..ce7736a
--- /dev/null
+++ b/numa.c
@@ -0,0 +1,182 @@
+/*
+ * QEMU System Emulator
+ *
+ * Copyright (c) 2013 Fujitsu Ltd.
+ * Author: Wanlong Gao 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "sysemu/sysemu.h"
+
+static void numa_node_parse_cpus(int nodenr, const char *cpus)
+{
+char *endptr;
+unsigned long long value, endvalue;
+
+/* Empty CPU range strings will be considered valid, they will simply
+ * not set any bit in the CPU bitmap.
+ */
+if (!*cpus) {
+return;
+}
+
+if (parse_uint(cpus, &value, &endptr, 10) < 0) {
+goto error;
+}
+if (*endptr == '-') {
+if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
+goto error;
+}
+} else if (*endptr == '\0') {
+endvalue = value;
+} else {
+goto error;
+}
+
+if (endvalue >= MAX_CPUMASK_BITS) {
+endvalue = MAX_CPUMASK_BITS - 1;
+fprintf(stderr,
+"qemu: NUMA: A max of %d VCPUs are supported\n",
+ MAX_CPUMASK_BITS);
+}
+
+if (endvalue < value) {
+goto error;
+}
+
+bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+return;
+
+error:
+fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
+exit(1);
+}
+
+void numa_add(const char *optarg)
+{
+char option[128];
+char *endptr;
+unsigned long long nodenr;
+
+optarg = get_opt_name(option, 128, optarg, ',');
+if (*optarg == ',') {
+optarg++;
+}
+if (!strcmp(option, "node")) {
+
+if (nb_numa_nodes >= MAX_NODES) {
+fprintf(s

[Qemu-devel] [PATCH V16 11/11] NUMA: convert hmp command info_numa to use qmp command query_numa

2013-11-21 Thread Wanlong Gao

Reviewed-by: Luiz Capitulino 
Signed-off-by: Wanlong Gao 
---
 hmp.c | 57 +
 hmp.h |  1 +
 monitor.c | 21 +
 3 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/hmp.c b/hmp.c
index 32ee285..d6dedd2 100644
--- a/hmp.c
+++ b/hmp.c
@@ -24,6 +24,10 @@
 #include "ui/console.h"
 #include "block/qapi.h"
 #include "qemu-io.h"
+#include "qapi-visit.h"
+#include "qapi/opts-visitor.h"
+#include "qapi/dealloc-visitor.h"
+#include "sysemu/sysemu.h"
 
 static void hmp_handle_error(Monitor *mon, Error **errp)
 {
@@ -1564,3 +1568,56 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 
 hmp_handle_error(mon, &err);
 }
+
+void hmp_info_numa(Monitor *mon, const QDict *qdict)
+{
+NUMANodeList *node_list, *node;
+uint16List *head;
+int nodeid;
+char *policy_str = NULL;
+
+node_list = qmp_query_numa(NULL);
+
+monitor_printf(mon, "%d nodes\n", nb_numa_nodes);
+for (node = node_list; node; node = node->next) {
+nodeid = node->value->nodeid;
+monitor_printf(mon, "node %d cpus:", nodeid);
+head = node->value->cpus;
+for (head = node->value->cpus; head != NULL; head = head->next) {
+monitor_printf(mon, " %d", (int)head->value);
+}
+monitor_printf(mon, "\n");
+monitor_printf(mon, "node %d size: %" PRId64 " MB\n",
+   nodeid, node->value->memory >> 20);
+switch (node->value->policy) {
+case NUMA_NODE_POLICY_DEFAULT:
+policy_str = g_strdup("default");
+break;
+case NUMA_NODE_POLICY_PREFERRED:
+policy_str = g_strdup("preferred");
+break;
+case NUMA_NODE_POLICY_MEMBIND:
+policy_str = g_strdup("membind");
+break;
+case NUMA_NODE_POLICY_INTERLEAVE:
+policy_str = g_strdup("interleave");
+break;
+default:
+break;
+}
+monitor_printf(mon, "node %d policy: %s\n",
+   nodeid, policy_str ? : " ");
+if (policy_str) {
+free(policy_str);
+}
+monitor_printf(mon, "node %d relative: %s\n", nodeid,
+   node->value->relative ? "true" : "false");
+monitor_printf(mon, "node %d host-nodes:", nodeid);
+for (head = node->value->host_nodes; head != NULL; head = head->next) {
+monitor_printf(mon, " %d", (int)head->value);
+}
+monitor_printf(mon, "\n");
+}
+
+qapi_free_NUMANodeList(node_list);
+}
diff --git a/hmp.h b/hmp.h
index 54cf71f..4f8d39b 100644
--- a/hmp.h
+++ b/hmp.h
@@ -37,6 +37,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict);
 void hmp_info_pci(Monitor *mon, const QDict *qdict);
 void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
 void hmp_info_tpm(Monitor *mon, const QDict *qdict);
+void hmp_info_numa(Monitor *mon, const QDict *qdict);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/monitor.c b/monitor.c
index b97b7d3..f747a48 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1989,25 +1989,6 @@ static void do_info_mtree(Monitor *mon, const QDict 
*qdict)
 mtree_info((fprintf_function)monitor_printf, mon);
 }
 
-static void do_info_numa(Monitor *mon, const QDict *qdict)
-{
-int i;
-CPUState *cpu;
-
-monitor_printf(mon, "%d nodes\n", nb_numa_nodes);
-for (i = 0; i < nb_numa_nodes; i++) {
-monitor_printf(mon, "node %d cpus:", i);
-CPU_FOREACH(cpu) {
-if (cpu->numa_node == i) {
-monitor_printf(mon, " %d", cpu->cpu_index);
-}
-}
-monitor_printf(mon, "\n");
-monitor_printf(mon, "node %d size: %" PRId64 " MB\n", i,
-numa_info[i].node_mem >> 20);
-}
-}
-
 #ifdef CONFIG_PROFILER
 
 int64_t qemu_time;
@@ -2775,7 +2756,7 @@ static mon_cmd_t info_cmds[] = {
 .args_type  = "",
 .params = "",
 .help   = "show NUMA information",
-.mhandler.cmd = do_info_numa,
+.mhandler.cmd = hmp_info_numa,
 },
 {
 .name   = "usb",
-- 
1.8.5.rc3

Re: [Qemu-devel] [PATCH] fixup

2013-11-21 Thread Wanlong Gao

Sorry, please ignore this patch.

Thanks,
Wanlong Gao

> Signed-off-by: Wanlong Gao 
> ---
>  hw/i386/pc.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 50ed4cc..74c1f16 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1072,8 +1072,12 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t 
> below_4g_mem_size,
>  guest_info->apic_id_limit = pc_apic_id_limit(max_cpus);
>  guest_info->apic_xrupt_override = kvm_allows_irq0_override();
>  guest_info->numa_nodes = nb_numa_nodes;
> -guest_info->node_mem = g_memdup(node_mem, guest_info->numa_nodes *
> +guest_info->node_mem = g_malloc0(guest_info->numa_nodes *
>  sizeof *guest_info->node_mem);
> +for (i = 0; i < nb_numa_nodes; i++) {
> +guest_info->node_mem[i] = numa_info[i].node_mem;
> +}
> +
>  guest_info->node_cpu = g_malloc0(guest_info->apic_id_limit *
>   sizeof *guest_info->node_cpu);
>  
> @@ -1081,7 +1085,7 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t 
> below_4g_mem_size,
>  unsigned int apic_id = x86_cpu_apic_id_from_index(i);
>  assert(apic_id < guest_info->apic_id_limit);
>  for (j = 0; j < nb_numa_nodes; j++) {
> -if (test_bit(i, node_cpumask[j])) {
> +if (test_bit(i, numa_info[j].node_cpu)) {
>  guest_info->node_cpu[apic_id] = j;
>  break;
>  }
>

[Qemu-devel] [PATCH V16 02/11] NUMA: check if the total numa memory size is equal to ram_size

2013-11-21 Thread Wanlong Gao

If the total number of the assigned numa nodes memory is not
equal to the assigned ram size, it will write the wrong data
to ACPI talb, then the guest will ignore the wrong ACPI table
and recognize all memory to one node. It's buggy, we should
check it to ensure that we write the right data to ACPI table.

Signed-off-by: Wanlong Gao 
---
 numa.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/numa.c b/numa.c
index ce7736a..beda80e 100644
--- a/numa.c
+++ b/numa.c
@@ -150,6 +150,16 @@ void set_numa_nodes(void)
 node_mem[i] = ram_size - usedmem;
 }
 
+uint64_t numa_total = 0;
+for (i = 0; i < nb_numa_nodes; i++) {
+numa_total += node_mem[i];
+}
+if (numa_total != ram_size) {
+fprintf(stderr, "qemu: numa nodes total memory size "
+"should equal to ram_size\n");
+exit(1);
+}
+
 for (i = 0; i < nb_numa_nodes; i++) {
 if (!bitmap_empty(node_cpumask[i], MAX_CPUMASK_BITS)) {
 break;
-- 
1.8.5.rc3

[Qemu-devel] [PATCH V16 10/11] NUMA: add qmp command query-numa

2013-11-21 Thread Wanlong Gao

Add qmp command query-numa to show guest NUMA information.

Reviewed-by: Luiz Capitulino 
Signed-off-by: Wanlong Gao 
---
 numa.c   | 66 
 qapi-schema.json | 36 +++
 qmp-commands.hx  | 49 +
 3 files changed, 151 insertions(+)

diff --git a/numa.c b/numa.c
index 915a67a..b392190 100644
--- a/numa.c
+++ b/numa.c
@@ -28,6 +28,7 @@
 #include "qapi/opts-visitor.h"
 #include "qapi/dealloc-visitor.h"
 #include "exec/memory.h"
+#include "qmp-commands.h"
 
 #ifdef __linux__
 #include 
@@ -327,3 +328,68 @@ void set_numa_modes(void)
 }
 }
 }
+
+NUMANodeList *qmp_query_numa(Error **errp)
+{
+NUMANodeList *head = NULL, *cur_item = NULL;
+CPUState *cpu;
+int i;
+
+for (i = 0; i < nb_numa_nodes; i++) {
+NUMANodeList *info;
+uint16List *cur_cpu_item = NULL;
+info = g_malloc0(sizeof(*info));
+info->value = g_malloc0(sizeof(*info->value));
+info->value->nodeid = i;
+CPU_FOREACH(cpu) {
+if (cpu->numa_node == i) {
+uint16List *node_cpu = g_malloc0(sizeof(*node_cpu));
+node_cpu->value = cpu->cpu_index;
+
+if (!cur_cpu_item) {
+info->value->cpus = cur_cpu_item = node_cpu;
+} else {
+cur_cpu_item->next = node_cpu;
+cur_cpu_item = node_cpu;
+}
+}
+}
+info->value->memory = numa_info[i].node_mem;
+
+#ifdef __linux__
+info->value->policy = numa_info[i].policy;
+info->value->relative = numa_info[i].relative;
+
+unsigned long first, next;
+next = first = find_first_bit(numa_info[i].host_mem, MAX_NODES);
+if (first == MAX_NODES) {
+goto end;
+}
+uint16List *cur_node_item = g_malloc0(sizeof(*cur_node_item));
+cur_node_item->value = first;
+info->value->host_nodes = cur_node_item;
+do {
+next = find_next_bit(numa_info[i].host_mem, MAX_NODES,
+ next + 1);
+if (next == MAX_NODES) {
+break;
+}
+
+uint16List *host_node = g_malloc0(sizeof(*host_node));
+host_node->value = next;
+cur_node_item->next = host_node;
+cur_node_item = host_node;
+} while (true);
+end:
+#endif
+
+if (!cur_item) {
+head = cur_item = info;
+} else {
+cur_item->next = info;
+cur_item = info;
+}
+}
+
+return head;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index c0dad81..af947e2 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4289,3 +4289,39 @@
'*policy': 'NumaNodePolicy',
'*relative':   'bool',
'*host-nodes': ['uint16'] }}
+
+##
+# @NUMANode:
+#
+# Information of guest NUMA node
+#
+# @nodeid: NUMA node ID
+#
+# @cpus: VCPUs contained in this node
+#
+# @memory: memory size of this node
+#
+# @policy: memory policy of this node
+#
+# @relative: if host nodes are relative for memory policy
+#
+# @host-nodes: host nodes for its memory policy
+#
+# Since: 1.7
+#
+##
+{ 'type': 'NUMANode',
+  'data': {'nodeid': 'uint16', 'cpus': ['uint16'], 'memory': 'uint64',
+   'policy': 'NumaNodePolicy', 'relative': 'bool',
+   'host-nodes': ['uint16'] }}
+
+##
+# @query-numa:
+#
+# Returns a list of information about each guest node.
+#
+# Returns: a list of @NUMANode for each guest node
+#
+# Since: 1.7
+##
+{ 'command': 'query-numa', 'returns': ['NUMANode'] }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index fba15cd..c2bc508 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3295,3 +3295,52 @@ Example (2):
 <- { "return": {} }
 
 EQMP
+
+{
+.name = "query-numa",
+.args_type = "",
+.mhandler.cmd_new = qmp_marshal_input_query_numa,
+},
+
+SQMP
+query-numa
+-
+
+Show NUMA information.
+
+Return a json-array. Each NUMA node is represented by a json-object,
+which contains:
+
+- "nodeid": NUMA node ID (json-int)
+- "cpus": a json-arry of contained VCPUs
+- "memory": amount of memory in each node in Byte (json-int)
+- "policy": memory policy of this node (json-string)
+- "relative": if host nodes is relative for its memory policy (json-bool)
+- "host-nodes": a json-array of host nodes for its memory policy
+
+Arguments:
+
+Example:
+
+-> { "excute": "query-numa" }
+<- { "return":[
+{
+"nodeid": 0,
+"cpus": [0, 1],
+"memory": 536870912,
+"policy": "membind",
+"relative": false,
+"host-nodes": [0, 1]
+},
+{
+"nodeid": 1,
+"cpus": [2, 3],
+"memory": 536870912,
+"policy": "interleave",
+"relative": false,
+"host-nodes": [1]
+}
+ ]
+   }
+
+EQMP
--

[Qemu-devel] [PATCH V16 04/11] NUMA: convert -numa option to use OptsVisitor

2013-11-21 Thread Wanlong Gao

Signed-off-by: Wanlong Gao 
---
 include/sysemu/sysemu.h |   3 +-
 numa.c  | 148 +++-
 qapi-schema.json|  30 ++
 vl.c|  11 +++-
 4 files changed, 114 insertions(+), 78 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index d873b42..20b05a3 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -140,9 +140,10 @@ typedef struct node_info {
 DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
 } NodeInfo;
 extern NodeInfo numa_info[MAX_NODES];
-void numa_add(const char *optarg);
 void set_numa_nodes(void);
 void set_numa_modes(void);
+extern QemuOptsList qemu_numa_opts;
+int numa_init_func(QemuOpts *opts, void *opaque);
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/numa.c b/numa.c
index 1bc0fad..c4fa665 100644
--- a/numa.c
+++ b/numa.c
@@ -24,101 +24,97 @@
  */
 
 #include "sysemu/sysemu.h"
-
-static void numa_node_parse_cpus(int nodenr, const char *cpus)
+#include "qapi-visit.h"
+#include "qapi/opts-visitor.h"
+#include "qapi/dealloc-visitor.h"
+QemuOptsList qemu_numa_opts = {
+.name = "numa",
+.implied_opt_name = "type",
+.head = QTAILQ_HEAD_INITIALIZER(qemu_numa_opts.head),
+.desc = { { 0 } } /* validated with OptsVisitor */
+};
+
+static int numa_node_parse(NumaNodeOptions *opts)
 {
-char *endptr;
-unsigned long long value, endvalue;
-
-/* Empty CPU range strings will be considered valid, they will simply
- * not set any bit in the CPU bitmap.
- */
-if (!*cpus) {
-return;
-}
+uint16_t nodenr;
+uint16List *cpus = NULL;
 
-if (parse_uint(cpus, &value, &endptr, 10) < 0) {
-goto error;
-}
-if (*endptr == '-') {
-if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
-goto error;
-}
-} else if (*endptr == '\0') {
-endvalue = value;
+if (opts->has_nodeid) {
+nodenr = opts->nodeid;
 } else {
-goto error;
+nodenr = nb_numa_nodes;
 }
 
-if (endvalue >= MAX_CPUMASK_BITS) {
-endvalue = MAX_CPUMASK_BITS - 1;
-fprintf(stderr,
-"qemu: NUMA: A max of %d VCPUs are supported\n",
- MAX_CPUMASK_BITS);
+if (nodenr >= MAX_NODES) {
+fprintf(stderr, "qemu: Max number of NUMA nodes reached: %"
+PRIu16 "\n", nodenr);
+return -1;
 }
 
-if (endvalue < value) {
-goto error;
+for (cpus = opts->cpus; cpus; cpus = cpus->next) {
+if (cpus->value > MAX_CPUMASK_BITS) {
+fprintf(stderr, "qemu: cpu number %" PRIu16 " is bigger than %d",
+cpus->value, MAX_CPUMASK_BITS);
+continue;
+}
+bitmap_set(numa_info[nodenr].node_cpu, cpus->value, 1);
 }
 
-bitmap_set(numa_info[nodenr].node_cpu, value, endvalue-value+1);
-return;
+if (opts->has_mem) {
+int64_t mem_size;
+char *endptr;
+mem_size = strtosz(opts->mem, &endptr);
+if (mem_size < 0 || *endptr) {
+fprintf(stderr, "qemu: invalid numa mem size: %s\n", opts->mem);
+return -1;
+}
+numa_info[nodenr].node_mem = mem_size;
+}
 
-error:
-fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
-exit(1);
+return 0;
 }
 
-void numa_add(const char *optarg)
+int numa_init_func(QemuOpts *opts, void *opaque)
 {
-char option[128];
-char *endptr;
-unsigned long long nodenr;
-
-optarg = get_opt_name(option, 128, optarg, ',');
-if (*optarg == ',') {
-optarg++;
+NumaOptions *object = NULL;
+Error *err = NULL;
+int ret = 0;
+
+{
+OptsVisitor *ov = opts_visitor_new(opts);
+visit_type_NumaOptions(opts_get_visitor(ov), &object, NULL, &err);
+opts_visitor_cleanup(ov);
 }
-if (!strcmp(option, "node")) {
-
-if (nb_numa_nodes >= MAX_NODES) {
-fprintf(stderr, "qemu: too many NUMA nodes\n");
-exit(1);
-}
 
-if (get_param_value(option, 128, "nodeid", optarg) == 0) {
-nodenr = nb_numa_nodes;
-} else {
-if (parse_uint_full(option, &nodenr, 10) < 0) {
-fprintf(stderr, "qemu: Invalid NUMA nodeid: %s\n", option);
-exit(1);
-}
-}
-
-if (nodenr >= MAX_NODES) {
-fprintf(stderr, "qemu: invalid NUMA nodeid: %llu\n", nodenr);
-exit(1);
-}
+if (error_is_set(&err)) {
+fprintf(stderr, "qemu: %s\n", error_get_pretty(err));
+error_free(err);
+ret = -1;
+goto error;
+}
 
-if (get_param_value(option, 128, "mem", optarg) == 0) {
-numa_info[nodenr].node_mem = 0;
-} else {
-int64_t sval;
-sval = strtosz(option, &endptr);
-if (sval < 0 || *endptr) {
-fprintf(stderr, "qemu: invalid numa mem size: %s\n"

[Qemu-devel] [PATCH V16 03/11] NUMA: Add numa_info structure to contain numa nodes info

2013-11-21 Thread Wanlong Gao

Add the numa_info structure to contain the numa nodes memory,
VCPUs information and the future added numa nodes host memory
policies.

Reviewed-by: Eduardo Habkost 
Signed-off-by: Andre Przywara 
Signed-off-by: Wanlong Gao 
---
 hw/i386/pc.c| 12 
 include/sysemu/sysemu.h |  8 ++--
 monitor.c   |  2 +-
 numa.c  | 23 ---
 vl.c|  7 +++
 5 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 12c436e..74c1f16 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -670,14 +670,14 @@ static FWCfgState *bochs_bios_init(void)
 unsigned int apic_id = x86_cpu_apic_id_from_index(i);
 assert(apic_id < apic_id_limit);
 for (j = 0; j < nb_numa_nodes; j++) {
-if (test_bit(i, node_cpumask[j])) {
+if (test_bit(i, numa_info[j].node_cpu)) {
 numa_fw_cfg[apic_id + 1] = cpu_to_le64(j);
 break;
 }
 }
 }
 for (i = 0; i < nb_numa_nodes; i++) {
-numa_fw_cfg[apic_id_limit + 1 + i] = cpu_to_le64(node_mem[i]);
+numa_fw_cfg[apic_id_limit + 1 + i] = 
cpu_to_le64(numa_info[i].node_mem);
 }
 fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, numa_fw_cfg,
  (1 + apic_id_limit + nb_numa_nodes) *
@@ -1072,8 +1072,12 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t 
below_4g_mem_size,
 guest_info->apic_id_limit = pc_apic_id_limit(max_cpus);
 guest_info->apic_xrupt_override = kvm_allows_irq0_override();
 guest_info->numa_nodes = nb_numa_nodes;
-guest_info->node_mem = g_memdup(node_mem, guest_info->numa_nodes *
+guest_info->node_mem = g_malloc0(guest_info->numa_nodes *
 sizeof *guest_info->node_mem);
+for (i = 0; i < nb_numa_nodes; i++) {
+guest_info->node_mem[i] = numa_info[i].node_mem;
+}
+
 guest_info->node_cpu = g_malloc0(guest_info->apic_id_limit *
  sizeof *guest_info->node_cpu);
 
@@ -1081,7 +1085,7 @@ PcGuestInfo *pc_guest_info_init(ram_addr_t 
below_4g_mem_size,
 unsigned int apic_id = x86_cpu_apic_id_from_index(i);
 assert(apic_id < guest_info->apic_id_limit);
 for (j = 0; j < nb_numa_nodes; j++) {
-if (test_bit(i, node_cpumask[j])) {
+if (test_bit(i, numa_info[j].node_cpu)) {
 guest_info->node_cpu[apic_id] = j;
 break;
 }
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 2509649..d873b42 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -9,6 +9,7 @@
 #include "qapi-types.h"
 #include "qemu/notify.h"
 #include "qemu/main-loop.h"
+#include "qemu/bitmap.h"
 
 /* vl.c */
 
@@ -134,8 +135,11 @@ extern QEMUClockType rtc_clock;
 #define MAX_NODES 64
 #define MAX_CPUMASK_BITS 255
 extern int nb_numa_nodes;
-extern uint64_t node_mem[MAX_NODES];
-extern unsigned long *node_cpumask[MAX_NODES];
+typedef struct node_info {
+uint64_t node_mem;
+DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
+} NodeInfo;
+extern NodeInfo numa_info[MAX_NODES];
 void numa_add(const char *optarg);
 void set_numa_nodes(void);
 void set_numa_modes(void);
diff --git a/monitor.c b/monitor.c
index 845f608..b97b7d3 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2004,7 +2004,7 @@ static void do_info_numa(Monitor *mon, const QDict *qdict)
 }
 monitor_printf(mon, "\n");
 monitor_printf(mon, "node %d size: %" PRId64 " MB\n", i,
-node_mem[i] >> 20);
+numa_info[i].node_mem >> 20);
 }
 }
 
diff --git a/numa.c b/numa.c
index beda80e..1bc0fad 100644
--- a/numa.c
+++ b/numa.c
@@ -61,7 +61,7 @@ static void numa_node_parse_cpus(int nodenr, const char *cpus)
 goto error;
 }
 
-bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+bitmap_set(numa_info[nodenr].node_cpu, value, endvalue-value+1);
 return;
 
 error:
@@ -101,7 +101,7 @@ void numa_add(const char *optarg)
 }
 
 if (get_param_value(option, 128, "mem", optarg) == 0) {
-node_mem[nodenr] = 0;
+numa_info[nodenr].node_mem = 0;
 } else {
 int64_t sval;
 sval = strtosz(option, &endptr);
@@ -109,7 +109,7 @@ void numa_add(const char *optarg)
 fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
 exit(1);
 }
-node_mem[nodenr] = sval;
+numa_info[nodenr].node_mem = sval;
 }
 if (get_param_value(option, 128, "cpus", optarg) != 0) {
 numa_node_parse_cpus(nodenr, option);
@@ -134,7 +134,7 @@ void set_numa_nodes(void)
  * and distribute the available memory equally across all nodes
  */
 for (i = 0; i < nb_numa_nodes; i++) {
-if (node_mem[i] != 0)
+if (numa_info[i].node_mem != 0)
 break;
 }
 if (i == nb_

[Qemu-devel] [PATCH V16 06/11] NUMA: add "-numa mem," options

2013-11-21 Thread Wanlong Gao

Add "-numa mem," option like following as Paolo suggested:

-numa mem,nodeid=0,size=1G

This new option will make later coming memory hotplug better.

We will use the new options to specify nodes memory info,
and just remain "-numa node,mem=xx" as legacy.

Reviewed-by: Laszlo Ersek 
Signed-off-by: Wanlong Gao 
---
 include/sysemu/sysemu.h |  1 +
 numa.c  | 36 
 qemu-options.hx |  6 --
 vl.c|  2 ++
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 20b05a3..291aa6a 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -135,6 +135,7 @@ extern QEMUClockType rtc_clock;
 #define MAX_NODES 64
 #define MAX_CPUMASK_BITS 255
 extern int nb_numa_nodes;
+extern int nb_numa_mem_nodes;
 typedef struct node_info {
 uint64_t node_mem;
 DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
diff --git a/numa.c b/numa.c
index c4fa665..c676c5e 100644
--- a/numa.c
+++ b/numa.c
@@ -74,6 +74,31 @@ static int numa_node_parse(NumaNodeOptions *opts)
 return 0;
 }
 
+static int numa_mem_parse(NumaMemOptions *opts)
+{
+uint16_t nodenr;
+uint64_t mem_size;
+
+if (opts->has_nodeid) {
+nodenr = opts->nodeid;
+} else {
+nodenr = nb_numa_mem_nodes;
+}
+
+if (nodenr >= MAX_NODES) {
+fprintf(stderr, "qemu: Max number of NUMA nodes reached: %"
+PRIu16 "\n", nodenr);
+return -1;
+}
+
+if (opts->has_size) {
+mem_size = opts->size;
+numa_info[nodenr].node_mem = mem_size;
+}
+
+return 0;
+}
+
 int numa_init_func(QemuOpts *opts, void *opaque)
 {
 NumaOptions *object = NULL;
@@ -101,6 +126,13 @@ int numa_init_func(QemuOpts *opts, void *opaque)
 }
 nb_numa_nodes++;
 break;
+case NUMA_OPTIONS_KIND_MEM:
+ret = numa_mem_parse(object->mem);
+if (ret) {
+goto error;
+}
+nb_numa_mem_nodes++;
+break;
 default:
 fprintf(stderr, "qemu: Invalid NUMA options type.\n");
 ret = -1;
@@ -119,6 +151,10 @@ error:
 
 void set_numa_nodes(void)
 {
+if (nb_numa_mem_nodes > nb_numa_nodes) {
+nb_numa_nodes = nb_numa_mem_nodes;
+}
+
 if (nb_numa_nodes > 0) {
 int i;
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 8b94264..e6afb6f 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -95,11 +95,13 @@ specifies the maximum number of hotpluggable CPUs.
 ETEXI
 
 DEF("numa", HAS_ARG, QEMU_OPTION_numa,
-"-numa node[,mem=size][,cpus=cpu[-cpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
+"-numa node[,nodeid=node][,cpus=cpu[-cpu]]\n"
+"-numa mem[,nodeid=node][,size=size]\n"
+, QEMU_ARCH_ALL)
 STEXI
 @item -numa @var{opts}
 @findex -numa
-Simulate a multi node NUMA system. If mem and cpus are omitted, resources
+Simulate a multi node NUMA system. If @var{size} and @var{cpus} are omitted, 
resources
 are split equally.
 ETEXI
 
diff --git a/vl.c b/vl.c
index e67f34a..064b821 100644
--- a/vl.c
+++ b/vl.c
@@ -250,6 +250,7 @@ static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
 QTAILQ_HEAD_INITIALIZER(fw_boot_order);
 
 int nb_numa_nodes;
+int nb_numa_mem_nodes;
 NodeInfo numa_info[MAX_NODES];
 
 uint8_t qemu_uuid[16];
@@ -2817,6 +2818,7 @@ int main(int argc, char **argv, char **envp)
 }
 
 nb_numa_nodes = 0;
+nb_numa_mem_nodes = 0;
 nb_nics = 0;
 
 bdrv_init_with_whitelist();
-- 
1.8.5.rc3

[Qemu-devel] [PATCH V16 00/11] Add support for binding guest numa nodes to host numa nodes

2013-11-21 Thread Wanlong Gao

As you know, QEMU can't direct it's memory allocation now, this may cause
guest cross node access performance regression.
And, the worse thing is that if PCI-passthrough is used,
direct-attached-device uses DMA transfer between device and qemu process.
All pages of the guest will be pinned by get_user_pages().

KVM_ASSIGN_PCI_DEVICE ioctl
  kvm_vm_ioctl_assign_device()
=>kvm_assign_device()
  => kvm_iommu_map_memslots()
=> kvm_iommu_map_pages()
   => kvm_pin_pages()

So, with direct-attached-device, all guest page's page count will be +1 and
any page migration will not work. AutoNUMA won't too.

So, we should set the guest nodes memory allocation policy before
the pages are really mapped.

According to this patch set, we are able to set guest nodes memory policy
like following:

 -numa node,nodeid=0,cpus=0, \
 -numa mem,size=1024M,policy=membind,host-nodes=0-1 \
 -numa node,nodeid=1,cpus=1 \
 -numa mem,size=1024M,policy=interleave,host-nodes=1

This supports 
"policy={default|membind|interleave|preferred},relative=true,host-nodes=N-N" 
like format.

And add a QMP command "query-numa" to show numa info through
this API.

And convert the "info numa" monitor command to use this
QMP command "query-numa".

This version removes "set-mem-policy" qmp and hmp commands temporarily
as Marcelo and Paolo suggested.


The simple test is like following:
=
Before:
# numactl -H && /qemu/x86_64-softmmu/qemu-system-x86_64 -m 4096  -smp 2 -numa 
node,nodeid=0,cpus=0,mem=2048 -numa node,nodeid=1,cpus=1,mem=2048 -hda 
6u4ga2.qcow2 -enable-kvm -device 
pci-assign,host=07:00.1,id=hostdev0,bus=pci.0,addr=0x7 & sleep 40 && numactl -H
[1] 13320
available: 2 nodes (0-1)
node 0 cpus: 0 2
node 0 size: 5111 MB
node 0 free: 4653 MB
node 1 cpus: 1 3
node 1 size: 5120 MB
node 1 free: 4764 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 
available: 2 nodes (0-1)
node 0 cpus: 0 2
node 0 size: 5111 MB
node 0 free: 4317 MB
node 1 cpus: 1 3
node 1 size: 5120 MB
node 1 free: 876 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 



After:
# numactl -H && /qemu/x86_64-softmmu/qemu-system-x86_64 -m 4096 -smp 4 -numa 
node,nodeid=0,cpus=0,cpus=2 -numa mem,size=2048M,policy=membind,host-nodes=0 
-numa node,nodeid=0,cpus=1,cpus=3 -numa 
mem,size=2048M,policy=membind,host-nodes=1 -hda 6u4ga2.qcow2 -enable-kvm 
-device pci-assign,host=07:00.1,id=hostdev0,bus=pci.0,addr=0x7 & sleep 40 && 
numactl -H
[1] 10862
available: 2 nodes (0-1)
node 0 cpus: 0 2
node 0 size: 5111 MB
node 0 free: 4718 MB
node 1 cpus: 1 3
node 1 size: 5120 MB
node 1 free: 4799 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 
available: 2 nodes (0-1)
node 0 cpus: 0 2
node 0 size: 5111 MB
node 0 free: 2544 MB
node 1 cpus: 1 3
node 1 size: 5120 MB
node 1 free: 2725 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 
===


V1->V2:
change to use QemuOpts in numa options (Paolo)
handle Error in mpol parser (Paolo)
change qmp command format to mem-policy=membind,mem-hostnode=0-1 like 
(Paolo)
V2->V3:
also handle Error in cpus parser (5/10)
split out common parser from cpus and hostnode parser (Bandan 6/10)
V3-V4:
rebase to request for comments
V4->V5:
use OptVisitor and split -numa option (Paolo)
 - s/set-mpol/set-mem-policy (Andreas)
 - s/mem-policy/policy
 - s/mem-hostnode/host-nodes
fix hmp command process after error (Luiz)
add qmp command query-numa and convert info numa to it (Luiz)
V5->V6:
remove tabs in json file (Laszlo, Paolo)
add back "-numa node,mem=xxx" as legacy (Paolo)
change cpus and host-nodes to array (Laszlo, Eric)
change "nodeid" to "uint16"
add NumaMemPolicy enum type (Eric)
rebased on Laszlo's "OptsVisitor: support / flatten integer ranges for 
repeating options" patch set, thanks for Laszlo's help
V6-V7:
change UInt16 to uint16 (Laszlo)
fix a typo in adding qmp command set-mem-policy
V7-V8:
rebase to current master with Laszlo's V2 of OptsVisitor patch set
fix an adding white space line error
V8->V9:
rebase to current master
check if total numa memory size is equal to ram_size (Paolo)
add comments to the OptsVisitor stuff in qapi-schema.json (Eric, Laszlo)
replace the use of numa_num_configured_nodes() (Andrew)
avoid abusing the fact i==nodeid (Andrew)
V9->V10:
rebase to current master
remove libnuma (Andrew)
MAX_NODES=64 -> MAX_NODES=128 since libnuma selected 128 (Andrew)
use MAX_NODES instead of MAX_CPUMASK_BITS for host_mem bitmap (Andrew)
remove a useless clear_bit() operation (Andrew)
V10->V11:
rebase to current master
fix "maxnode" argument of mbind(2)
V11->V12:
rebase to current master
split patch 02/11 of V11 (Eduardo)
add some max value check (Eduardo)
split MAX_NODES change patch (Eduardo)
V12->V13:
rebase to current master
thanks for L

[Qemu-devel] [PATCH V16 05/11] NUMA: introduce NumaMemOptions

2013-11-21 Thread Wanlong Gao

Signed-off-by: Wanlong Gao 
---
 qapi-schema.json | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index db539b6..1043e57 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4223,7 +4223,8 @@
 ##
 { 'union': 'NumaOptions',
   'data': {
-'node': 'NumaNodeOptions' }}
+'node': 'NumaNodeOptions',
+'mem' : 'NumaMemOptions' }}
 
 ##
 # @NumaNodeOptions
@@ -4243,3 +4244,19 @@
'*nodeid': 'uint16',
'*cpus':   ['uint16'],
'*mem':'str' }}
+
+##
+# @NumaMemOptions
+#
+# Set memory information of guest NUMA node. (for OptsVisitor)
+#
+# @nodeid: #optional NUMA node ID
+#
+# @size: #optional memory size of this node
+#
+# Since 1.7
+##
+{ 'type': 'NumaMemOptions',
+  'data': {
+   '*nodeid': 'uint16',
+   '*size':   'size' }}
-- 
1.8.5.rc3

[Qemu-devel] [PATCH V16 08/11] NUMA: parse guest numa nodes memory policy

2013-11-21 Thread Wanlong Gao

The memory policy setting format is like:
policy={default|membind|interleave|preferred}[,relative=true],host-nodes=N-N
And we are adding this setting as a suboption of "-numa mem,",
the memory policy then can be set like following:
-numa node,nodeid=0,cpus=0 \
-numa node,nodeid=1,cpus=1 \
-numa mem,nodeid=0,size=1G,policy=membind,host-nodes=0-1 \
-numa mem,nodeid=1,size=1G,policy=interleave,relative=true,host-nodes=1

Signed-off-by: Wanlong Gao 
---
 include/sysemu/sysemu.h |  3 +++
 numa.c  | 18 ++
 qapi-schema.json| 33 +++--
 vl.c|  3 +++
 4 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 807619e..82f1447 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -139,6 +139,9 @@ extern int nb_numa_mem_nodes;
 typedef struct node_info {
 uint64_t node_mem;
 DECLARE_BITMAP(node_cpu, MAX_CPUMASK_BITS);
+DECLARE_BITMAP(host_mem, MAX_NODES);
+NumaNodePolicy policy;
+bool relative;
 } NodeInfo;
 extern NodeInfo numa_info[MAX_NODES];
 void set_numa_nodes(void);
diff --git a/numa.c b/numa.c
index c676c5e..da4dbbd 100644
--- a/numa.c
+++ b/numa.c
@@ -78,6 +78,7 @@ static int numa_mem_parse(NumaMemOptions *opts)
 {
 uint16_t nodenr;
 uint64_t mem_size;
+uint16List *nodes;
 
 if (opts->has_nodeid) {
 nodenr = opts->nodeid;
@@ -96,6 +97,23 @@ static int numa_mem_parse(NumaMemOptions *opts)
 numa_info[nodenr].node_mem = mem_size;
 }
 
+if (opts->has_policy) {
+numa_info[nodenr].policy = opts->policy;
+}
+
+if (opts->has_relative) {
+numa_info[nodenr].relative = opts->relative;
+}
+
+for (nodes = opts->host_nodes; nodes; nodes = nodes->next) {
+if (nodes->value > MAX_NODES) {
+fprintf(stderr, "qemu: node number %" PRIu16 " is bigger than 
%d\n",
+nodes->value, MAX_NODES);
+continue;
+}
+bitmap_set(numa_info[nodenr].host_mem, nodes->value, 1);
+}
+
 return 0;
 }
 
diff --git a/qapi-schema.json b/qapi-schema.json
index 1043e57..c0dad81 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4246,6 +4246,26 @@
'*mem':'str' }}
 
 ##
+# @NumaNodePolicy
+#
+# NUMA node policy types
+#
+# @default: restore default policy, remove any nondefault policy
+#
+# @preferred: set the preferred node for allocation
+#
+# @membind: a strict policy that restricts memory allocation to the
+#   nodes specified
+#
+# @interleave: the page allocations is interleaved across the set
+#  of nodes specified
+#
+# Since 1.7
+##
+{ 'enum': 'NumaNodePolicy',
+  'data': [ 'default', 'preferred', 'membind', 'interleave' ] }
+
+##
 # @NumaMemOptions
 #
 # Set memory information of guest NUMA node. (for OptsVisitor)
@@ -4254,9 +4274,18 @@
 #
 # @size: #optional memory size of this node
 #
+# @policy: #optional memory policy of this node
+#
+# @relative: #optional if the nodes specified are relative
+#
+# @host-nodes: #optional host nodes for its memory policy
+#
 # Since 1.7
 ##
 { 'type': 'NumaMemOptions',
   'data': {
-   '*nodeid': 'uint16',
-   '*size':   'size' }}
+   '*nodeid': 'uint16',
+   '*size':   'size',
+   '*policy': 'NumaNodePolicy',
+   '*relative':   'bool',
+   '*host-nodes': ['uint16'] }}
diff --git a/vl.c b/vl.c
index 064b821..95d03f5 100644
--- a/vl.c
+++ b/vl.c
@@ -2815,6 +2815,9 @@ int main(int argc, char **argv, char **envp)
 for (i = 0; i < MAX_NODES; i++) {
 numa_info[i].node_mem = 0;
 bitmap_zero(numa_info[i].node_cpu, MAX_CPUMASK_BITS);
+bitmap_zero(numa_info[i].host_mem, MAX_NODES);
+numa_info[i].policy = NUMA_NODE_POLICY_DEFAULT;
+numa_info[i].relative = false;
 }
 
 nb_numa_nodes = 0;
-- 
1.8.5.rc3

[Qemu-devel] [PATCH] net: Update netdev peer on link change

2013-11-21 Thread Vlad Yasevich

When a link change occurs on a backend (like tap), we currently do
not propage such change to the nic.  As a result, when someone turns
off a link on a tap device, for instance, then a guest doesn't see
that change and continues to try to send traffic or run DHCP even
though the lower-layer is disconnected.  This is OK when the network
is set up as a HUB since the the guest may be connected to other HUB
ports too, but when it's set up as a netdev, it makes thinkgs worse.

The patch addresses this by setting the peers link down only when the
peer is not a HUBPORT device.  With this patch, in the following config
  -netdev tap,id=net0 -device e1000,mac=X,netdev=net0
when net0 link is turned off, the guest e1000 shows lower-layer link
down. This allows guests to boot much faster in such configurations.
With windows guest, it also allows the network to recover properly
since windows will not configure the link-local IPv4 address, and
when the link is turned on, the proper address address is configured.

Signed-off-by: Vlad Yasevich 
---
 net/net.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/net/net.c b/net/net.c
index 0a88e68..8a084bf 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1065,15 +1065,23 @@ void qmp_set_link(const char *name, bool up, Error 
**errp)
 nc->info->link_status_changed(nc);
 }
 
-/* Notify peer. Don't update peer link status: this makes it possible to
- * disconnect from host network without notifying the guest.
- * FIXME: is disconnected link status change operation useful?
- *
- * Current behaviour is compatible with qemu vlans where there could be
- * multiple clients that can still communicate with each other in
- * disconnected mode. For now maintain this compatibility. */
-if (nc->peer && nc->peer->info->link_status_changed) {
-nc->peer->info->link_status_changed(nc->peer);
+if (nc->peer) {
+/* Change peer link only if the peer is NIC and then notify peer.
+ * If the peer is a HUBPORT or a backend, we do not change the
+ * link status.
+ *
+ * This behavior is compatible with qemu vlans where there could be
+ * multiple clients that can still communicate with each other in
+ * disconnected mode. For now maintain this compatibility.
+ */
+if (nc->peer->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
+for (i = 0; i < queues; i++) {
+ncs[i]->peer->link_down = !up;
+}
+}
+if (nc->peer->info->link_status_changed) {
+nc->peer->info->link_status_changed(nc->peer);
+}
 }
 }
 
-- 
1.8.4.2

Re: [Qemu-devel] [PATCH RFC 0/3] add direct support of event in qapi schema

2013-11-21 Thread Wenchao Xia

于 2013/11/13 9:44, Wenchao Xia 写道:
> This series add support for tag/keyword 'event' in qapi-schema.
> The implemention doesn't generate a struture and visit function
> in the background for every event, so it doesn't support nested
> structure in the define to avoid trouble.
> 
> It is on top of series:
> http://lists.nongnu.org/archive/html/qemu-devel/2013-11/msg01574.html
> 
> Wenchao Xia (3):
>1 os-posix: include sys/time.h
>2 qapi script: add support of event
>3 tests: add test cases for qapi event support
> 
>   Makefile|9 +-
>   Makefile.objs   |2 +-
>   include/sysemu/os-posix.h   |2 +
>   qapi/Makefile.objs  |1 +
>   scripts/qapi-event.py   |  355 
> +++
>   tests/Makefile  |   14 +-
>   tests/qapi-schema/qapi-schema-test.json |   12 +
>   tests/qapi-schema/qapi-schema-test.out  |   10 +-
>   tests/test-qmp-event.c  |  250 ++
>   9 files changed, 646 insertions(+), 9 deletions(-)
>   create mode 100644 scripts/qapi-event.py
>   create mode 100644 tests/test-qmp-event.c
> 
  Luiz, do you think this series is in the right direction?

Re: [Qemu-devel] [PATCH V5 2/5] qemu-nbd: support internal snapshot export

2013-11-21 Thread Wenchao Xia


于 2013/11/19 19:26, Kevin Wolf 写道:

Am 10.11.2013 um 23:03 hat Wenchao Xia geschrieben:

Now it is possible to directly export an internal snapshot, which
can be used to probe the snapshot's contents without qemu-img
convert.

Signed-off-by: Wenchao Xia 
---
  block/snapshot.c |   18 +
  include/block/snapshot.h |8 +++
  qemu-nbd.c   |   47 -
  qemu-nbd.texi|8 ++-
  4 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index e51a7db..7cc45fa 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -25,6 +25,24 @@
  #include "block/snapshot.h"
  #include "block/block_int.h"

+QemuOptsList internal_snapshot_opts = {
+.name = "snapshot",
+.head = QTAILQ_HEAD_INITIALIZER(internal_snapshot_opts.head),
+.desc = {
+{
+.name = SNAPSHOT_OPT_ID,
+.type = QEMU_OPT_STRING,
+.help = "snapshot id"
+},{
+.name = SNAPSHOT_OPT_NAME,
+.type = QEMU_OPT_STRING,
+.help = "snapshot name"
+},{
+/* end of list */
+}
+},
+};
+
  int bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info,
 const char *name)
  {
diff --git a/include/block/snapshot.h b/include/block/snapshot.h
index d05bea7..770d9bb 100644
--- a/include/block/snapshot.h
+++ b/include/block/snapshot.h
@@ -27,6 +27,14 @@

  #include "qemu-common.h"
  #include "qapi/error.h"
+#include "qemu/option.h"
+
+
+#define SNAPSHOT_OPT_BASE   "snapshot."
+#define SNAPSHOT_OPT_ID "snapshot.id"
+#define SNAPSHOT_OPT_NAME   "snapshot.name"
+
+extern QemuOptsList internal_snapshot_opts;

  typedef struct QEMUSnapshotInfo {
  char id_str[128]; /* unique snapshot id */
diff --git a/qemu-nbd.c b/qemu-nbd.c
index c26c98e..f934eaa 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -20,6 +20,7 @@
  #include "block/block.h"
  #include "block/nbd.h"
  #include "qemu/main-loop.h"
+#include "block/snapshot.h"

  #include 
  #include 
@@ -79,7 +80,14 @@ static void usage(const char *name)
  "\n"
  "Block device options:\n"
  "  -r, --read-only  export read-only\n"
-"  -s, --snapshot   use snapshot file\n"
+"  -s, --snapshot   use FILE as an external snapshot, create a temporary\n"
+"   file with backing_file=FILE, redirect the write to\n"
+"   the temporary one\n"
+"  -l, --load-snapshot=SNAPSHOT_PARAM\n"
+"   load an internal snapshot inside FILE and export it\n"
+"   as an read-only device, SNAPSHOT_PARAM format is\n"
+"   'snapshot.id=[ID],snapshot.name=[NAME]', or\n"
+"   '[ID_OR_NAME]'\n"
  "  -n, --nocachedisable host cache\n"
  "  --cache=MODE set cache mode (none, writeback, ...)\n"
  #ifdef CONFIG_LINUX_AIO
@@ -315,7 +323,9 @@ int main(int argc, char **argv)
  char *device = NULL;
  int port = NBD_DEFAULT_PORT;
  off_t fd_size;
-const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:t";
+QemuOpts *sn_opts = NULL;
+const char *sn_id_or_name = NULL;
+const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:";
  struct option lopt[] = {
  { "help", 0, NULL, 'h' },
  { "version", 0, NULL, 'V' },
@@ -328,6 +338,7 @@ int main(int argc, char **argv)
  { "connect", 1, NULL, 'c' },
  { "disconnect", 0, NULL, 'd' },
  { "snapshot", 0, NULL, 's' },
+{ "load-snapshot", 1, NULL, 'l' },
  { "nocache", 0, NULL, 'n' },
  { "cache", 1, NULL, QEMU_NBD_OPT_CACHE },
  #ifdef CONFIG_LINUX_AIO
@@ -428,6 +439,18 @@ int main(int argc, char **argv)
  errx(EXIT_FAILURE, "Offset must be positive `%s'", optarg);
  }
  break;
+case 'l':
+if (strncmp(optarg, SNAPSHOT_OPT_BASE, strlen(SNAPSHOT_OPT_BASE))
+== 0) {


You can avoid this ugly line break by using strstart():

 if (strstart(optarg, SNAPSHOT_OPT_BASE, NULL)) {
 ...

Kevin


  Will use strstart(), thanks for tipping.

Re: [Qemu-devel] [PATCH V5 1/5] snapshot: distinguish id and name in load_tmp

2013-11-21 Thread Wenchao Xia


于 2013/11/19 19:20, Kevin Wolf 写道:

Am 10.11.2013 um 23:03 hat Wenchao Xia geschrieben:

Since later this function will be used so improve it. The only caller of it
now is qemu-img, and it is not impacted by introduce function
bdrv_snapshot_load_tmp_by_id_or_name() that call bdrv_snapshot_load_tmp()
twice to keep old search logic. bdrv_snapshot_load_tmp_by_id_or_name() return
int to let caller know the errno, and errno will be used later.
Also fix a typo in comments of bdrv_snapshot_delete().

Signed-off-by: Wenchao Xia 
---
  block/qcow2-snapshot.c|   16 ++-
  block/qcow2.h |5 +++-
  block/snapshot.c  |   60 ++--
  include/block/block_int.h |4 ++-
  include/block/snapshot.h  |7 -
  qemu-img.c|8 -
  6 files changed, 90 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 3529c68..aeb5efd 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -675,7 +675,10 @@ int qcow2_snapshot_list(BlockDriverState *bs, 
QEMUSnapshotInfo **psn_tab)
  return s->nb_snapshots;
  }

-int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name)
+int qcow2_snapshot_load_tmp(BlockDriverState *bs,
+const char *snapshot_id,
+const char *name,
+Error **errp)
  {
  int i, snapshot_index;
  BDRVQcowState *s = bs->opaque;
@@ -683,12 +686,17 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const 
char *snapshot_name)
  uint64_t *new_l1_table;
  int new_l1_bytes;
  int ret;
+const char *device = bdrv_get_device_name(bs);


This is wrong, low-level functions shouldn't need the device name for
anything.


  assert(bs->read_only);

  /* Search the snapshot */
-snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_name);
+snapshot_index = find_snapshot_by_id_and_name(bs, snapshot_id, name);
  if (snapshot_index < 0) {
+error_setg(errp,
+   "Can't find a snapshot with ID '%s' and name '%s' "
+   "on device '%s'",
+   STR_OR_NULL(snapshot_id), STR_OR_NULL(name), device);
  return -ENOENT;
  }
  sn = &s->snapshots[snapshot_index];


I think we already discussed the same thing in the context of a
different series: The caller knows which device and which snapshot name
or ID he used. The only information he really needs is:

 error_setg(errp, "Can't find snapshot");

If in the context of the caller's caller this isn't enough, the caller
can add this information. I doubt that it's even necessary in this case.



  I remember that, will follow this rule.


@@ -699,6 +707,10 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const 
char *snapshot_name)

  ret = bdrv_pread(bs->file, sn->l1_table_offset, new_l1_table, 
new_l1_bytes);
  if (ret < 0) {
+error_setg(errp,
+   "Failed to read l1 table for snapshot with ID '%s' and name 
"
+   "'%s' on device '%s'",
+   sn->id_str, sn->name, device);
  g_free(new_l1_table);
  return ret;
  }
diff --git a/block/qcow2.h b/block/qcow2.h
index 922e190..303eb26 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -488,7 +488,10 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
const char *name,
Error **errp);
  int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab);
-int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name);
+int qcow2_snapshot_load_tmp(BlockDriverState *bs,
+const char *snapshot_id,
+const char *name,
+Error **errp);

  void qcow2_free_snapshots(BlockDriverState *bs);
  int qcow2_read_snapshots(BlockDriverState *bs);
diff --git a/block/snapshot.c b/block/snapshot.c
index a05c0c0..e51a7db 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -194,7 +194,7 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
   * If only @snapshot_id is specified, delete the first one with id
   * @snapshot_id.
   * If only @name is specified, delete the first one with name @name.
- * if none is specified, return -ENINVAL.
+ * if none is specified, return -EINVAL.
   *
   * Returns: 0 on success, -errno on failure. If @bs is not inserted, return
   * -ENOMEDIUM. If @snapshot_id and @name are both NULL, return -EINVAL. If @bs
@@ -265,18 +265,72 @@ int bdrv_snapshot_list(BlockDriverState *bs,
  return -ENOTSUP;
  }

+/**
+ * Temporarily load an internal snapshot by @snapshot_id and @name.
+ * @bs: block device used in the operation
+ * @snapshot_id: unique snapshot ID, or NULL
+ * @name: snapshot name, or NULL
+ * @errp: location to store error
+ *
+ * If both @snapshot_id and @name are specified, load the first one with
+ * id @snapshot_id and name @name.
+ *

Re: [Qemu-devel] Is there any new progress or plans about "ram live snapshot feature"?

2013-11-21 Thread Wenchao Xia


于 2013/11/21 19:02, Zhanghailiang 写道:

Hi,

Now qemu ram live snapshot feature has some problems, it is based on
‘ram live migration’.

The time of snapshot depends on completion time of migration, which is
not measurable. Also It may can’t achieve migrate in some situation.

I have seen discussion about that in your previous emails, and there
seems to be a feasible scheme.

I want to know if there is any new progress or somebody is trying to
realize the program.

Best Regards

Hailiang Zhang


Hi, Pavel
  Are you going to implement this feat? If no I'd like to add it in my
TODO list.

Re: [Qemu-devel] [PATCH 16/27] acpi: ich9: allow guest to clear SCI rised by GPE

2013-11-21 Thread Li Guang


Igor Mammedov wrote:

On Thu, 21 Nov 2013 16:32:27 +0800
Li Guang  wrote:

   

Michael S. Tsirkin wrote:
 

On Thu, Nov 21, 2013 at 04:18:45PM +0800, Li Guang wrote:

   

Hu Tao wrote:

 

On Thu, Nov 21, 2013 at 09:14:18AM +0200, Michael S. Tsirkin wrote:

   

On Thu, Nov 21, 2013 at 03:38:37AM +0100, Igor Mammedov wrote:

 

it fixes IRQ storm since guest isn't able to lower SCI IRQ
after it has been handled when it clears GPE event.

Signed-off-by: Igor Mammedov

   

The storm is only on memory hotplug right?

 

IIRC, it happens on cpu hotplug, too.




   

:-), that made remember EC implementation,
with EC, SCI will be safer, I think.

 

Hmm you are saying let's use EC for memory hotplug?



   

It can be a bridge between guest and QEMU,
with it, we may don't have to bother ASL writing
and south-bridge hardware related work(or very
little) if we implement EC correctly.

 

Wouldn't it require guest driver though?
Beauty of ASL/GPE it's that it supported by Windows and Linux
out of box.

   


it did require guest driver, but as a ACPI standard device,
the driver is natively implemented by ACPI compatible OS.

Re: [Qemu-devel] [PATCH 16/27] acpi: ich9: allow guest to clear SCI rised by GPE

2013-11-21 Thread Li Guang


Michael S. Tsirkin wrote:

On Thu, Nov 21, 2013 at 04:32:27PM +0800, Li Guang wrote:
   

Michael S. Tsirkin wrote:
 

On Thu, Nov 21, 2013 at 04:18:45PM +0800, Li Guang wrote:
   

Hu Tao wrote:
 

On Thu, Nov 21, 2013 at 09:14:18AM +0200, Michael S. Tsirkin wrote:
   

On Thu, Nov 21, 2013 at 03:38:37AM +0100, Igor Mammedov wrote:
 

it fixes IRQ storm since guest isn't able to lower SCI IRQ
after it has been handled when it clears GPE event.

Signed-off-by: Igor Mammedov
   

The storm is only on memory hotplug right?
 

IIRC, it happens on cpu hotplug, too.



   

:-), that made remember EC implementation,
with EC, SCI will be safer, I think.
 

Hmm you are saying let's use EC for memory hotplug?


   

It can be a bridge between guest and QEMU,
with it, we may don't have to bother ASL writing
and south-bridge hardware related work(or very
little) if we implement EC correctly.


 


I'd like to see that. Can you write a document (just text)
for an imaginary EC support for memory hotplug?



   


Hmm..., with EC,

For memory hotplug, at least,
ASL at [PATCH 27/27] can be replaced
by a simple Method(_Qx) under EC device,
IO base operations at [PATCH 15/27]
are dispensable,  we can relay data
by standard operations of EC space

and also for SCI, all device changes want to
notify guest OS can share same SCI with EC,
and the operations are specified at ACPI SPEC.

likewise, for CPU hotplug, pvpanic,
and even debugcon.

and, for odd devices, like pvpanic, guest OS may complain
about it, and we may also have to bother on maintaining state of
it at QEMU, and writing a driver for guest OS,
with EC, functions of device like pvpanic may be implemented silently,
and EC is ACPI standard device, each ACPI compatible OS will
have a driver for it natively.

[Qemu-devel] [PULL 06/11] target-i386: Fix build by providing stub kvm_arch_get_supported_cpuid()

2013-11-21 Thread Paolo Bonzini

From: Peter Maydell 

Fix build failures with clang when KVM is not enabled by
providing a stub version of kvm_arch_get_supported_cpuid().
We retain the compile time check that this function isn't
called when CONFIG_KVM is not set by guarding the stub with
ifndef __OPTIMIZE__ (we assume that an optimizing build will
do sufficient constant folding and dead code elimination to
remove the calls before linking).

Signed-off-by: Peter Maydell 
Signed-off-by: Paolo Bonzini 
---
 target-i386/kvm-stub.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/target-i386/kvm-stub.c b/target-i386/kvm-stub.c
index 11429c4..2b9e801 100644
--- a/target-i386/kvm-stub.c
+++ b/target-i386/kvm-stub.c
@@ -16,3 +16,15 @@ bool kvm_allows_irq0_override(void)
 {
 return 1;
 }
+
+#ifndef __OPTIMIZE__
+/* This function is only called inside conditionals which we
+ * rely on the compiler to optimize out when CONFIG_KVM is not
+ * defined.
+ */
+uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function,
+  uint32_t index, int reg)
+{
+abort();
+}
+#endif
-- 
1.8.3.1

[Qemu-devel] [PATCH] vfio-pci: Release all MSI-X vectors when disabled

2013-11-21 Thread Alex Williamson

We were relying on msix_unset_vector_notifiers() to release all the
vectors when we disable MSI-X, but this only happens when MSI-X is
still enabled on the device.  Perform further cleanup by releasing
any remaining vectors listed as in-use after this call.  This caused
a leak of IRQ routes on hotplug depending on how the guest OS prepared
the device for removal.

Signed-off-by: Alex Williamson 
Cc: qemu-sta...@nongnu.org
---
 hw/misc/vfio.c |   12 
 1 file changed, 12 insertions(+)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index f7f8a19..355b018 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -878,8 +878,20 @@ static void vfio_disable_msi_common(VFIODevice *vdev)
 
 static void vfio_disable_msix(VFIODevice *vdev)
 {
+int i;
+
 msix_unset_vector_notifiers(&vdev->pdev);
 
+/*
+ * MSI-X will only release vectors if MSI-X is still enabled on the
+ * device, check through the rest and release it ourselves if necessary.
+ */
+for (i = 0; i < vdev->nr_vectors; i++) {
+if (vdev->msi_vectors[i].use) {
+vfio_msix_vector_release(&vdev->pdev, i);
+}
+}
+
 if (vdev->nr_vectors) {
 vfio_disable_irqindex(vdev, VFIO_PCI_MSIX_IRQ_INDEX);
 }

Re: [Qemu-devel] [qemu PATCH] hw/i386/pc_sysfw: support more than one flash drive

2013-11-21 Thread Laszlo Ersek

On 11/21/13 23:26, Eric Blake wrote:
> On 11/21/2013 03:21 PM, Laszlo Ersek wrote:
>> This patch allows the user to usefully specify
>>
>>   -drive file=img_1,if=pflash,format=raw,readonly \
>>   -drive file=img_2,if=pflash,format=raw
>>
>> on the command line. The flash images will be mapped under 4G in their
>> reverse unit order -- that is, with their base addresses progressing
>> downwards, in increasing unit order.
>>
> 
>> +/* This function maps flash drives from 4G downward, in order of their unit
>> + * numbers. Addressing within one flash drive is of course not reversed.
>> + *
>> + * The drive with unit number 0 is mapped at the highest address, and it is
>> + * passed to pc_isa_bios_init(). Merging severral drives for isa-bios is not
> 
> s/severral/several/
> 

I may have meant "very seveal" :)

(Will fix if patch is deemed worthy otherwise.)

Thanks!
Laszlo

Re: [Qemu-devel] [qemu PATCH] hw/i386/pc_sysfw: support more than one flash drive

2013-11-21 Thread Eric Blake

On 11/21/2013 03:21 PM, Laszlo Ersek wrote:
> This patch allows the user to usefully specify
> 
>   -drive file=img_1,if=pflash,format=raw,readonly \
>   -drive file=img_2,if=pflash,format=raw
> 
> on the command line. The flash images will be mapped under 4G in their
> reverse unit order -- that is, with their base addresses progressing
> downwards, in increasing unit order.
> 

> +/* This function maps flash drives from 4G downward, in order of their unit
> + * numbers. Addressing within one flash drive is of course not reversed.
> + *
> + * The drive with unit number 0 is mapped at the highest address, and it is
> + * passed to pc_isa_bios_init(). Merging severral drives for isa-bios is not

s/severral/several/

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH for-1.7] Revert "e1000/rtl8139: update HMP NIC when every bit is written"

2013-11-21 Thread Michael S. Tsirkin

On Mon, Nov 18, 2013 at 04:32:34PM -0700, Alex Williamson wrote:
> On Mon, 2013-11-18 at 17:55 -0500, Vlad Yasevich wrote:
> > On 11/18/2013 05:40 PM, Alex Williamson wrote:
> > > On Mon, 2013-11-18 at 17:07 -0500, Vlad Yasevich wrote:
> > >> On 11/18/2013 04:33 PM, Alex Williamson wrote:
> > >>> On Mon, 2013-11-18 at 15:57 -0500, Vlad Yasevich wrote:
> >  On 11/18/2013 03:33 PM, Alex Williamson wrote:
> > > On Mon, 2013-11-18 at 15:09 -0500, Vlad Yasevich wrote:
> > >> On 11/18/2013 02:58 PM, Alex Williamson wrote:
> > >>> On Mon, 2013-11-18 at 21:47 +0200, Michael S. Tsirkin wrote:
> >  This reverts commit cd5be5829c1ce87aa6b3a7806524fac07ac9a757.
> >  Digging into hardware specs shows this does not
> >  actually make QEMU behave more like hardware.
> >  Let's stick to the tried heuristic for 1.7 and
> >  possibly revisit for 1.8.
> > >>>
> > >>> If this is broken, then so are these:
> > >>>
> > >>> 23c37c37f0280761072c23bf67d3a4f3c0ff25aa
> > >>> 7c36507c2b8776266f50c5e2739bd18279953b93
> > >>
> > >> These aren't really broken.  They just assume that the high order
> > >> writes will happen after the low order writes.
> > >>
> > >> In the case of e1000, this is a little more then an assumption
> > >> (particularly in the case of nic initilization).
> > >
> > > But AIUI there's also a valid bit in that high order byte on e1000, so
> > > reverting cd5be582 means we stuff a new mac into qemu less often, but
> > > it's still only accurate some of the time.
> > 
> >  Yes, there is a slight issue with validity of mac at the time of
> >  processing packets.  I have an outstanding question on the Intel
> >  list about this behavior with real HW.  But, with e1000, the validity
> >  bit provides a much higher guarantee that a guest that will be
> >  setting the mac address will write the high register second to
> >  guarantee that when the valid bit is written, the mac is fully
> >  valid.  As a result we don't really need the e1000 part of the
> >  cd5be5829.
> > >>>
> > >>> But doesn't that valid bit mean that a mac update will start and end
> > >>> with a write to the high order register?  So we're assuming:
> > >>>
> > >>> a) write RA + 1 (invalidate)
> > >>> b) write RA (write low)
> > >>> c) write RA + 1 (write high + valid)
> > >>>
> > >>
> > >> No. On update, only steps b and c typically happen.  Thus my question
> > >> to the on the intel list.
> > > 
> > > So perhaps the bit is some kind of data latch bit and the mac address
> > > fields within those registers are effectively scratch until that bit is
> > > written?
> > > 
> > >>> Without cd5be5829 the only change is that we don't store a new mac into
> > >>> the monitor at b).  The mac stored in the monitor is still wrong from a)
> > >>> until c).  So it's ever so slightly less broken without cd5be5829.
> > >>
> > >> Since there is really no a), the mac in the monitor is only different
> > >> after step b).  since it's is incomplete and we expect step c), there
> > >> is really no point in updating it.
> > > 
> > > Great, so I have no argument against reverting, or just fixing, that
> > > chunk of cd5be5829.  Let's implement the latch bit too.
> > > 
> > >
> > >> In the case of RTL nic, it is just an  assumption, but it hasn't
> > >> been shown faulty yet.  We do plan to address this a bit more
> > >> thoroughly.
> > >
> > > So how is RTL less broken without cd5be582?  AIUI the valid bit is off
> > > in a separate register on RTL, so we have no guarantee about order of
> > > updating the mac.  Without cd5be582 the info in the monitor may be
> > > permanently broken if the guest uses a write order other than what we
> > > assume.
> > >
> > 
> >  This one is actually not as bad either.  RTL spec requires that
> >  receive register writes happen as 32 bit word writes.  This is
> >  what linux and bsd drivers do, so from driver perspective, the
> >  issue is the same.  What our emulation layer does is turn these
> >  32 bit writes into 4 8-bit writes.  This is likely due to some
> >  very broken and very old drivers, but I am not sure.
> > 
> >  So, the information in the monitor will be broken if the guest
> >  does: write_hi(); write_lo();  A part of me would really like
> >  to see a guest that does this :)
> > >>>
> > >>> So the argument for reverting here is that it seems unlikely that the
> > >>> dwords get written in the hi->lo order and we'd rather the monitor get
> > >>> stuck with the wrong mac forever
> > >>
> > >> For how many/which guests?  I know it's not linux or BSD.  I need to
> > >> boot windows to see what it does, but I think it does the right thing.
> > > 
> > > How many guests do you plan to test?
> > 
> > I think the proposal was to see if anyone reports an issue :)
> > 
> > > 
> > >>> than it show

[Qemu-devel] [edk2 PATCH] OvmfPkg: split the variable store to a separate file

2013-11-21 Thread Laszlo Ersek

Split the variable store off to a separate file when SPLIT_VARSTORE is
defined.

Even in this case, the preexistent PCDs' values don't change. Qemu must
take care of contiguously mapping NVVARSTORE.fd + OVMF.fd so that when
concatenated they end exactly at 4GB.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek 
---
 Generated with 8 lines of context for easier in-patch verification with
 the help of the clipboard.

 OvmfPkg/OvmfPkgIa32.fdf| 48 ++
 OvmfPkg/OvmfPkgIa32X64.fdf | 48 ++
 OvmfPkg/OvmfPkgX64.fdf | 48 ++
 OvmfPkg/README | 16 
 4 files changed, 160 insertions(+)

diff --git a/OvmfPkg/OvmfPkgIa32.fdf b/OvmfPkg/OvmfPkgIa32.fdf
index c50709c..310d6a9 100644
--- a/OvmfPkg/OvmfPkgIa32.fdf
+++ b/OvmfPkg/OvmfPkgIa32.fdf
@@ -23,31 +23,51 @@
 #
 [Defines]
 !if $(TARGET) == RELEASE
 !ifndef $(FD_SIZE_2MB)
 DEFINE FD_SIZE_1MB=
 !endif
 !endif
 
+!ifdef $(SPLIT_VARSTORE)
+!ifdef $(FD_SIZE_1MB)
+[FD.NvVarStore]
+BaseAddress   = 0xFFF0|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFdBaseAddress
+SET gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareFdSize = 0x0010
+Size  = 0x2
+ErasePolarity = 1
+BlockSize = 0x1000|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareBlockSize
+NumBlocks = 0x20
+!else
+[FD.NvVarStore]
+BaseAddress   = 0xFFE0|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFdBaseAddress
+SET gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareFdSize = 0x0020
+Size  = 0x2
+ErasePolarity = 1
+BlockSize = 0x1000|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareBlockSize
+NumBlocks = 0x20
+!endif
+!else
 !ifdef $(FD_SIZE_1MB)
 [FD.OVMF]
 BaseAddress   = 0xFFF0|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFdBaseAddress
 Size  = 0x0010|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareFdSize
 ErasePolarity = 1
 BlockSize = 0x1000|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareBlockSize
 NumBlocks = 0x100
 !else
 [FD.OVMF]
 BaseAddress   = 0xFFE0|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFdBaseAddress
 Size  = 0x0020|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareFdSize
 ErasePolarity = 1
 BlockSize = 0x1000|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareBlockSize
 NumBlocks = 0x200
 !endif
+!endif
 
 0x|0xe000
 
gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFlashNvStorageVariableBase|gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageVariableSize
 #NV_VARIABLE_STORE
 DATA = {
   ## This is the EFI_FIRMWARE_VOLUME_HEADER
   # ZeroVector []
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
@@ -106,30 +126,58 @@ DATA = {
   # WriteQueueSize: UINT64
   0xE0, 0x0F, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
 }
 
 0x0001|0x0001
 #NV_FTW_SPARE
 
gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFlashNvStorageFtwSpareBase|gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageFtwSpareSize
 
+!ifdef $(SPLIT_VARSTORE)
+!ifdef $(FD_SIZE_1MB)
+[FD.OVMF]
+BaseAddress   = 0xFFF2
+Size  = 0x000E
+ErasePolarity = 1
+BlockSize = 0x1000
+NumBlocks = 0xE0
+
+0x|0x000CC000
+FV = FVMAIN_COMPACT
+0x000CC000|0x14000
+FV = SECFV
+!else
+[FD.OVMF]
+BaseAddress   = 0xFFE2
+Size  = 0x001E
+ErasePolarity = 1
+BlockSize = 0x1000
+NumBlocks = 0x1E0
+
+0x|0x001AC000
+FV = FVMAIN_COMPACT
+0x001AC000|0x34000
+FV = SECFV
+!endif
+!else
 !ifdef $(FD_SIZE_1MB)
 0x0002|0x000CC000
 FV = FVMAIN_COMPACT
 
 0x000EC000|0x14000
 FV = SECFV
 
 !else
 0x0002|0x001AC000
 FV = FVMAIN_COMPACT
 
 0x001CC000|0x34000
 FV = SECFV
 !endif
+!endif
 
 

 
 [FD.MEMFD]
 BaseAddress   = 0x80|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfMemFvBase
 Size  = 0x80|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfMemFvSize
 ErasePolarity = 1
 BlockSize = 0x1
diff --git a/OvmfPkg/OvmfPkgIa32X64.fdf b/OvmfPkg/OvmfPkgIa32X64.fdf
index d65f40f..8877a89 100644
--- a/OvmfPkg/OvmfPkgIa32X64.fdf
+++ b/OvmfPkg/OvmfPkgIa32X64.fdf
@@ -23,31 +23,51 @@
 #
 [Defines]
 !if $(TARGET) == RELEASE
 !ifndef $(FD_SIZE_2MB)
 DEFINE FD_SIZE_1MB=
 !endif
 !endif
 
+!ifdef $(SPLIT_VARSTORE)
+!ifdef $(FD_SIZE_1MB)
+[FD.NvVarStore]
+BaseAddress   = 0xFFF0|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFdBaseAddress
+SET gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareFdSize = 0x0010
+Size  = 0x2
+ErasePolarity = 1
+BlockSize = 0x1000|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareBlockSize
+NumBlocks = 0x20
+!else
+[FD.NvVarStore]
+BaseAddress   = 0xFFE0|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFdBaseAddress
+SET gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareFdSize = 0x0020
+Size  = 0x2
+ErasePolarity = 1
+BlockSize = 0x1000|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFirmwareBlockSize
+NumBlocks = 0x20
+!endif
+!else
 !ifdef $(FD_SIZE_1MB)
 [FD.OVMF]
 BaseAddress   = 0xFFF0|gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFdBaseAddress
 Size  = 0x0010|gUefi

[Qemu-devel] [qemu PATCH] hw/i386/pc_sysfw: support more than one flash drive

2013-11-21 Thread Laszlo Ersek

This patch allows the user to usefully specify

  -drive file=img_1,if=pflash,format=raw,readonly \
  -drive file=img_2,if=pflash,format=raw

on the command line. The flash images will be mapped under 4G in their
reverse unit order -- that is, with their base addresses progressing
downwards, in increasing unit order.

(The unit number increases with command line order if not explicitly
specified.)

This accommodates the following use case: suppose that OVMF is split in
two parts, a writeable host file for non-volatile variable storage, and a
read-only part for bootstrap and decompressible executable code.

The binary code part would be read-only, centrally managed on the host
system, and passed in as unit 0. The variable store would be writeable,
VM-specific, and passed in as unit 1.

  ffe0-ffe1 (prio 0, R-): system.flash1
  ffe2- (prio 0, R-): system.flash0

(If the guest tries to write to the flash range that is backed by the
read-only drive, bdrv_write() in pflash_update() will correctly deny the
write with -EACCES, and pflash_update() won't care, which suits us well.)

Signed-off-by: Laszlo Ersek 
---
 hw/i386/pc_sysfw.c | 60 --
 1 file changed, 45 insertions(+), 15 deletions(-)

diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index e917c83..1c3e3d6 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -72,35 +72,65 @@ static void pc_isa_bios_init(MemoryRegion *rom_memory,
 memory_region_set_readonly(isa_bios, true);
 }
 
+/* This function maps flash drives from 4G downward, in order of their unit
+ * numbers. Addressing within one flash drive is of course not reversed.
+ *
+ * The drive with unit number 0 is mapped at the highest address, and it is
+ * passed to pc_isa_bios_init(). Merging severral drives for isa-bios is not
+ * supported.
+ *
+ * The caller is responsible to pass in the non-NULL @pflash_drv that
+ * corresponds to the flash drive with unit number 0.
+ */
 static void pc_system_flash_init(MemoryRegion *rom_memory,
  DriveInfo *pflash_drv)
 {
+int unit = 0;
 BlockDriverState *bdrv;
 int64_t size;
-hwaddr phys_addr;
+hwaddr phys_addr = 0x1ULL;
 int sector_bits, sector_size;
 pflash_t *system_flash;
 MemoryRegion *flash_mem;
+char name[64];
 
-bdrv = pflash_drv->bdrv;
-size = bdrv_getlength(pflash_drv->bdrv);
 sector_bits = 12;
 sector_size = 1 << sector_bits;
 
-if ((size % sector_size) != 0) {
-fprintf(stderr,
-"qemu: PC system firmware (pflash) must be a multiple of 
0x%x\n",
-sector_size);
-exit(1);
-}
+do {
+bdrv = pflash_drv->bdrv;
+size = bdrv_getlength(bdrv);
+if ((size % sector_size) != 0) {
+fprintf(stderr,
+"qemu: PC system firmware (pflash segment %d) must be a "
+"multiple of 0x%x\n", unit, sector_size);
+exit(1);
+}
+if (size > phys_addr) {
+fprintf(stderr, "qemu: pflash segments must fit under 4G "
+"cumulatively\n");
+exit(1);
+}
 
-phys_addr = 0x1ULL - size;
-system_flash = pflash_cfi01_register(phys_addr, NULL, "system.flash", size,
- bdrv, sector_size, size >> 
sector_bits,
- 1, 0x, 0x, 0x, 0x, 0);
-flash_mem = pflash_cfi01_get_memory(system_flash);
+phys_addr -= size;
 
-pc_isa_bios_init(rom_memory, flash_mem, size);
+/* pflash_cfi01_register() creates a deep copy of the name */
+snprintf(name, sizeof name, "system.flash%d", unit);
+system_flash = pflash_cfi01_register(phys_addr, NULL /* qdev */, name,
+ size, bdrv, sector_size,
+ size >> sector_bits,
+ 1  /* width */,
+ 0x /* id0 */,
+ 0x /* id1 */,
+ 0x /* id2 */,
+ 0x /* id3 */,
+ 0  /* be */);
+if (unit == 0) {
+flash_mem = pflash_cfi01_get_memory(system_flash);
+pc_isa_bios_init(rom_memory, flash_mem, size);
+}
+pflash_drv = drive_get(IF_PFLASH, 0, ++unit);
+} while (pflash_drv != NULL);
 }
 
 static void old_pc_system_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 16/27] acpi: ich9: allow guest to clear SCI rised by GPE

2013-11-21 Thread Igor Mammedov

On Thu, 21 Nov 2013 16:32:27 +0800
Li Guang  wrote:

> Michael S. Tsirkin wrote:
> > On Thu, Nov 21, 2013 at 04:18:45PM +0800, Li Guang wrote:
> >
> >> Hu Tao wrote:
> >>  
> >>> On Thu, Nov 21, 2013 at 09:14:18AM +0200, Michael S. Tsirkin wrote:
> >>>
>  On Thu, Nov 21, 2013 at 03:38:37AM +0100, Igor Mammedov wrote:
>   
> > it fixes IRQ storm since guest isn't able to lower SCI IRQ
> > after it has been handled when it clears GPE event.
> >
> > Signed-off-by: Igor Mammedov
> >
>  The storm is only on memory hotplug right?
>   
> >>> IIRC, it happens on cpu hotplug, too.
> >>>
> >>>
> >>>
> >>>
> >> :-), that made remember EC implementation,
> >> with EC, SCI will be safer, I think.
> >>  
> > Hmm you are saying let's use EC for memory hotplug?
> >
> >
> >
> It can be a bridge between guest and QEMU,
> with it, we may don't have to bother ASL writing
> and south-bridge hardware related work(or very
> little) if we implement EC correctly.
> 
Wouldn't it require guest driver though?
Beauty of ASL/GPE it's that it supported by Windows and Linux
out of box.

Re: [Qemu-devel] [PATCH 00/27 RFC v7] ACPI memory hotplug

2013-11-21 Thread Michael S. Tsirkin

On Thu, Nov 21, 2013 at 02:39:10PM +0100, Igor Mammedov wrote:
> On Thu, 21 Nov 2013 08:20:56 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Thu, Nov 21, 2013 at 03:38:21AM +0100, Igor Mammedov wrote:
> > > ---
> > > What's new since v6:
> > > 
> > > * DIMM device is split to backend and frontend. Therefore following
> > >   command/options were added for supporting it:
> > > 
> > >   For memdev backend:
> > >   CLI: -memdev-add
> > >   Monitor/QMP: memdev-add
> > >   with options: 'id' and 'size'
> > >   For dimm frontend:
> > >   option "size" became readonly, pulling it's size from attached 
> > > backend
> > >   added option "memdev" for specifying backend by 'id'
> > > 
> > > * Added Q35 support
> > > * Added support for 32 bit guests
> > > * build for i386 emulator (not tested)
> > 
> > OK so a large patchset so did not review yet.  One question
> > due to the dependency on bios honouring etc/reserved-memory-end: is
> > there some way to detect old BIOS and fail memory hotplug?
> version negotiation between ASL and hardware could be used to that effect.
> 
> QEMU could start with present but disabled memory hotplug and if
> QEMU and BIOS ASL could come in agreement that both support it in
> compatible way, it could be enabled.

So at the moment there's no negotiation, is there?


> > 
> > > ---
> > > 
> > > This series allows to hotplug 'arbitrary' DIMM devices specifying size,
> > > NUMA node mapping (guest side), slot and address where to map it, at 
> > > runtime.
> > > 
> > > Due to ACPI limitation there is need to specify a number of possible
> > > DIMM devices. For this task -m option was extended to support
> > > following format:
> > > 
> > >   -m [mem=]RamSize[,slots=N,maxmem=M]
> > > 
> > > To allow memory hotplug user must specify a pair of additional parameters:
> > > 'slots' - number of possible increments
> > > 'maxmem' - max possible total memory size QEMU is allowed to use,
> > >including RamSize.
> > > 
> > > minimal monitor command syntax to hotplug DIMM device:
> > > 
> > >   memdev-add id=memX,size=1G
> > >   device_add dimm,id=dimmX,memdev=memX
> > > 
> > > DIMM device provides following properties that could be used with
> > > device_add / -device to alter default behavior:
> > > 
> > >   id- unique string identifying device [mandatory]
> > >   slot  - number in range [0-slots) [optional], if not specified
> > >   the first free slot is used
> > >   node  - NUMA node id [optional] (default: 0)
> > >   size  - amount of memory to add, readonly derived from backing memdev
> > >   start - guest's physical address where to plug DIMM [optional],
> > >   if not specified the first gap in hotplug memory region
> > >   that fits DIMM is used
> > > 
> > >  -device option could be used for adding potentially hotunplugable DIMMs
> > > and also for specifying hotplugged DIMMs in migration case.
> > > 
> > > Tested guests:
> > >  - RHEL 6x64
> > >  - Windows 2012DCx64
> > >  - Windows 2008DCx64
> > >  - Windows 2008DCx32
> > > 
> > > Known limitations/bugs/TODOs:
> > >  - only hot-add supported
> > >  - max number of supported DIMM devices 255 (due to ACPI object name
> > >limit), could be increased creating several containers and putting
> > >DIMMs there. (exercise for future) 
> > >  - failed hotplug action consumes 1 slot (device_add doesn't delete
> > >device properly if realize failed)
> > >  - e820 table doesn't include DIMM devices added with -device /
> > >(or after reboot devices added with device_add)
> > >  - Windows 2008 remembers DIMM configuration, so if DIMM with other
> > >start/size is added into the same slot, it refuses to use it insisting
> > >on old mapping.
> > > 
> > > Series is based on mst's PCI tree and requires following SeaBIOS patch:
> > >   http://patchwork.ozlabs.org/patch/292614/
> > >   on top of patches to load ACPI tables from QEMU
> > >   working SeaBIOS tree for testing is available at:
> > > https://github.com/imammedo/seabios/commits/memhp-wip
> > > 
> > > QEMU git tree for testing is available at:
> > >   https://github.com/imammedo/qemu/commits/memory-hotplug-v7
> > > 
> > > Example QEMU cmd line:
> > >   qemu-system-x86_64 -enable-kvm -monitor unix:/tmp/mon,server,nowait \ 
> > >  -m 4096,slots=4,maxmem=8G -L /custome_seabios guest.img
> > > 
> > > PS:
> > >   Windows guest requires SRAT table for hotplug to work so add extra 
> > > option:
> > >-numa node
> > >   to QEMU command line.
> > > 
> > > 
> > > Igor Mammedov (26):
> > >   acpi: factor out common pm_update_sci() into acpi core
> > >   rename pci_hotplug_fn to hotplug_fn and make it available for other
> > > devices
> > >   pc: add 'etc/reserved-memory-end' fw_cfg interface for SeaBIOS
> > >   vl: convert -m to qemu_opts_parse()
> > >   qapi: add SIZE type parser to string_input_visitor
> > >   get reference to /backend container via qemu_get_backend()
> > >   add memdev backen

Re: [Qemu-devel] [RFC PATCH 2/2] rtl8139: update HMP only when the address is fully written

2013-11-21 Thread Eric Blake

On 11/21/2013 01:04 PM, Vlad Yasevich wrote:
> rtl8139 hardware requires 9346 config register to be set into
> write mode before mac address can be changed even though it is
> not documented.  Every driver inspected so far appears to do
> this along with comments that this is an undocumented requirement.
> 
> We can use this to help us identify when the mac address has been
> completely written.  Simple set a flag whenever mac has changed

s/Simple/Simply/

> and at the next transition of 9346 register from Write to Normal
> mode, we update the HMP.
> 
> Signed-off-by: Vlad Yasevich 
> ---

Comment-only review (ie. I didn't validate the code, just fixing grammar)

> +} else if (opmode == Cfg9346_Normal && s->mac_changed) {
> +/* Even though it is not documented, it is required to set
> + * opmode to Cfg9346_ConfigWrite when changing the mac address
> + * of the card and to set to Cfg9346_Normal when done.  We
> + * can use this as an idication to kick off the notification event.

s/idication/indication/

> -qemu_format_nic_info_str(qemu_get_queue(s->nic), s->phys);
> +if (s->compat_flags & RTL8139_FLAG_MAC_COMPLETE) {
> +s->mac_changed = true;
> +} else if (addr == MAC0+5) {

Doesn't coding style recommend s/MAC0+5/MAC0 + 5/

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] os x boot broken by commit 11948748495841bd54721b250d68c7b3cb0475ef

2013-11-21 Thread Gabriel L. Somlo

Added qemu-devel, since that is where this stuff belongs now. Everyone
else, sorry for the dupe...

On Thu, Nov 21, 2013 at 07:14:27PM +0100, Paolo Bonzini wrote:
> Can you remind us about your DSDT modifications?  It should be possible
> to patch the HPET and applesmc bits appropriately from QEMU (or to move
> them from the DSDT to an SSDT that is built entirely in QEMU).
> 
> It actually isn't impossible that Mac OS X would boot just fine with 1.8...

My current DSDT patch (against QEMU) is enclosed below. The HPET
basically needs "IRQNoFlags() {2, 8}", which causes XP to bluescreen.

So, I've made it conditional on the SMC STA method returning success
(0x0B).

The SMC node's STA method returns 0x0B unconditionally on real
hardware. So I was planning on figuring out what's easier in the
context of the most recent QEMU code base:

1. dynamically generating (during qemu runtime initialization)
 a DSDT entry for SMC with hardcoded 0x0B STA method, whenever
"--device isa-applesmc" is present on the qemu command line

or

2. writing a static (compile-time) SMC node but with a slightly
smarter _STA method, which returns 0x0B when "--device isa-applesmc"
was given on the cmdline, or which returns 0x00 in the absence
of "--device isa-applesmc".

Either 1. or 2. could be used with HPET -- I can make inclusion of
IRQNoFlags dependent on either the success or on the presence of
SMC._STA() :)

Let me know what you think.

Thanks,
--Gabriel

###
# Modify DSDT entry for HPET: conditionally insert "IRQNoFlags() {2, 8}" into
# _CRS method only if an AppleSMC DSDT node is also present and enabled (it
# otherwise causes WinXP to BSOD).
###
diff --git a/hw/i386/acpi-dsdt-hpet.dsl b/hw/i386/acpi-dsdt-hpet.dsl
index dfde174..205cf05 100644
--- a/hw/i386/acpi-dsdt-hpet.dsl
+++ b/hw/i386/acpi-dsdt-hpet.dsl
@@ -38,14 +38,23 @@ Scope(\_SB) {
 }
 Return (0x0F)
 }
-Name(_CRS, ResourceTemplate() {
-#if 0   /* This makes WinXP BSOD for not yet figured reasons. */
-IRQNoFlags() {2, 8}
-#endif
+Name(RESP, ResourceTemplate() {
 Memory32Fixed(ReadOnly,
 0xFED0, // Address Base
 0x0400, // Address Length
 )
 })
+Name(RESI, ResourceTemplate() {
+IRQNoFlags() {2, 8}
+})
+Method(_CRS, 0) {
+Store(\_SB.PCI0.ISA.SMC._STA(), Local0)
+If (LEqual(Local0, 0x0B)) {// AppleSMC present, add IRQ
+ConcatenateResTemplate(RESP, RESI, Local1)
+Return (Local1)
+} else {
+Return (RESP)
+}
+}
 }
 }
###
# Add DSDT entry for AppleSMC;
# TODO: find a way to make the _STA method return 0x0b only if QEMU command
# line contains "-device isa-applesmc", and 0x00 otherwise!
###
diff --git a/hw/i386/acpi-dsdt-isa.dsl b/hw/i386/acpi-dsdt-isa.dsl
index 89caa16..b7a27bb 100644
--- a/hw/i386/acpi-dsdt-isa.dsl
+++ b/hw/i386/acpi-dsdt-isa.dsl
@@ -16,6 +16,28 @@
 /* Common legacy ISA style devices. */
 Scope(\_SB.PCI0.ISA) {

+Device (SMC) {
+Name(_HID, EisaId("APP0001"))
+OperationRegion(SMC, SystemIO, 0x0300, 0x20)
+Field(SMC, ByteAcc, NoLock, Preserve) {
+Offset(0x04),
+CMDP, 8,
+}
+Method(_STA, 0) {
+//Store(0x10, CMDP)// APPLESMC_READ_CMD
+//Store(CMDP, Local0)
+//If (LEqual(Local0, 0x0c)) {
+Return (0x0B)
+//} Else {
+//Return (0x00)
+//}
+}
+Name (_CRS, ResourceTemplate () {
+IO (Decode16, 0x0300, 0x0300, 0x01, 0x20)
+IRQNoFlags() { 6 }
+})
+}
+
 Device(RTC) {
 Name(_HID, EisaId("PNP0B00"))
 Name(_CRS, ResourceTemplate() {

Re: [Qemu-devel] [PATCH for-1.7 0/5] acpi unit-test: added tests

2013-11-21 Thread Erik Rull


Marcel Apfelbaum wrote:

On Thu, 2013-11-21 at 22:20 +0100, Erik Rull wrote:

Marcel Apfelbaum wrote:

Added 2 tests:
   1. Basic check of FACS table (missed on prev submission)
   2. Compare DSDT and SSDT tables against expected values

Test 2:
   - runs only if iasl is installed on the host machine.
   - the test plan:
 1. Dumps the ACPI tables as AML on the disk.
 2. Runs iasl to disassembly the tables into ASL files.
 3. Compares them with expected offline ASL files.

   - the test runs for both default machine and q35.
   - in case the test fails, it can be easily tweaked to
 show the differences between the ASL files and
 understand the issue.

Patches:
   1/5 - test 1
   2/5 - some infrastructure improvements
   3/5 - expected asl files for test 2
   4/5 - creates links for the expected files
 if the build directory is not current
   5/5 - test 2



Which iasl Version is needed for the ACPI compilation and testing? I have
an IASL installed on my build machine, but when trying to compile the ACPI
stuff, it fails. Maybe it's just too old, but I didn't find a way to
disable the iasl access. Must I uninstall iasl on my machine to get qemu
compiled again?

I would use the latest version, version 20130823, from 
https://acpica.org/downloads
or the git from git://github.com/acpica/acpica.git

I don't think you need iasl on your computer to build qemu.

Hope I helped,
Marcel


Thanks.
But then I don't understand the error that appears:

  CPP x86_64-softmmu/acpi-dsdt.dsl.i.orig
  ACPI_PREPROCESS x86_64-softmmu/acpi-dsdt.dsl.i
  IASL x86_64-softmmu/acpi-dsdt.dsl.i
make[1]: *** [hw/i386/acpi-dsdt.hex] Error 1
make: *** [subdir-x86_64-softmmu] Error 2

I don't find a chance to disable this access/compilation within configure. 
If I just missed a possible option, it would be great to point me at it.


I found also when grep'ing through the sources that there is an "if" for 
check whether iasl is present or not. But setting --iasl= (empty) to force 
a removal of iasl for the qemu compilation gives a configure error.


Best regards,

Erik






The IASL version is:
Intel ACPI Component Architecture
ASL Optimizing Compiler version 20060912 [Dec 20 2006]
Copyright (C) 2000 - 2006 Intel Corporation
Supports ACPI Specification Revision 3.0a

Thanks for your support.

Best regards,

Erik

Re: [Qemu-devel] [PATCH for-1.7 0/5] acpi unit-test: added tests

2013-11-21 Thread Marcel Apfelbaum

On Thu, 2013-11-21 at 22:20 +0100, Erik Rull wrote:
> Marcel Apfelbaum wrote:
> > Added 2 tests:
> >   1. Basic check of FACS table (missed on prev submission)
> >   2. Compare DSDT and SSDT tables against expected values
> >
> > Test 2:
> >   - runs only if iasl is installed on the host machine.
> >   - the test plan:
> > 1. Dumps the ACPI tables as AML on the disk.
> > 2. Runs iasl to disassembly the tables into ASL files.
> > 3. Compares them with expected offline ASL files.
> >
> >   - the test runs for both default machine and q35.
> >   - in case the test fails, it can be easily tweaked to
> > show the differences between the ASL files and
> > understand the issue.
> >
> > Patches:
> >   1/5 - test 1
> >   2/5 - some infrastructure improvements
> >   3/5 - expected asl files for test 2
> >   4/5 - creates links for the expected files
> > if the build directory is not current
> >   5/5 - test 2
> >
> 
> Which iasl Version is needed for the ACPI compilation and testing? I have 
> an IASL installed on my build machine, but when trying to compile the ACPI 
> stuff, it fails. Maybe it's just too old, but I didn't find a way to 
> disable the iasl access. Must I uninstall iasl on my machine to get qemu 
> compiled again?
I would use the latest version, version 20130823, from 
https://acpica.org/downloads 
or the git from git://github.com/acpica/acpica.git

I don't think you need iasl on your computer to build qemu.

Hope I helped,
Marcel

> The IASL version is:
> Intel ACPI Component Architecture
> ASL Optimizing Compiler version 20060912 [Dec 20 2006]
> Copyright (C) 2000 - 2006 Intel Corporation
> Supports ACPI Specification Revision 3.0a
> 
> Thanks for your support.
> 
> Best regards,
> 
> Erik
> 
> 
>

[Qemu-devel] [PULL 02/11] configure: Explicitly set ARFLAGS so we can build with GNU Make 4.0

2013-11-21 Thread Paolo Bonzini

From: Peter Maydell 

Our rules.mak adds '-rR' to MAKEFLAGS to indicate that we will be
explicitly specifying everything and not relying on any default
variables or rules. However we were accidentally relying on the
default ARFLAGS ("rv"). This went unnoticed because of a bug in
GNU Make 3.82 and earlier which meant that adding -rR to MAKEFLAGS
only affected submakes, not the currently running instance.
Explicitly set ARFLAGS in config-host.mak, in the same way we
handle CFLAGS and LDFLAGS; this will allow us to work with
Make 4.0.

Thanks to Paul Smith for analyzing this bug for us.

Cc: qemu-sta...@nongnu.org
Reported-by: Ken Moffat 
Signed-off-by: Peter Maydell 
Signed-off-by: Paolo Bonzini 
---
 configure | 5 +
 1 file changed, 5 insertions(+)

diff --git a/configure b/configure
index 508f6a5..ad12688 100755
--- a/configure
+++ b/configure
@@ -325,6 +325,9 @@ query_pkg_config() {
 pkg_config=query_pkg_config
 sdl_config="${SDL_CONFIG-${cross_prefix}sdl-config}"
 
+# If the user hasn't specified ARFLAGS, default to 'rv', just as make does.
+ARFLAGS="${ARFLAGS-rv}"
+
 # default flags for all hosts
 QEMU_CFLAGS="-fno-strict-aliasing $QEMU_CFLAGS"
 QEMU_CFLAGS="-Wall -Wundef -Wwrite-strings -Wmissing-prototypes $QEMU_CFLAGS"
@@ -3695,6 +3698,7 @@ echo "C compiler$cc"
 echo "Host C compiler   $host_cc"
 echo "C++ compiler  $cxx"
 echo "Objective-C compiler $objcc"
+echo "ARFLAGS   $ARFLAGS"
 echo "CFLAGS$CFLAGS"
 echo "QEMU_CFLAGS   $QEMU_CFLAGS"
 echo "LDFLAGS   $LDFLAGS"
@@ -4276,6 +4280,7 @@ echo "HOST_CC=$host_cc" >> $config_host_mak
 echo "CXX=$cxx" >> $config_host_mak
 echo "OBJCC=$objcc" >> $config_host_mak
 echo "AR=$ar" >> $config_host_mak
+echo "ARFLAGS=$ARFLAGS" >> $config_host_mak
 echo "AS=$as" >> $config_host_mak
 echo "CPP=$cpp" >> $config_host_mak
 echo "OBJCOPY=$objcopy" >> $config_host_mak
-- 
1.8.3.1

[Qemu-devel] [PULL 04/11] atomic.h: Fix build with clang

2013-11-21 Thread Paolo Bonzini

From: Peter Maydell 

clang defines __ATOMIC_SEQ_CST but its implementation of the
__atomic_exchange() builtin differs from that of gcc. Move the
__clang__ branch of the ifdef ladder to the top and fix its
implementation (there is no such builtin as __sync_exchange),
so we can compile with clang again.

Signed-off-by: Peter Maydell 
Signed-off-by: Paolo Bonzini 
---
 include/qemu/atomic.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
index 0aa8913..492bce1 100644
--- a/include/qemu/atomic.h
+++ b/include/qemu/atomic.h
@@ -168,14 +168,14 @@
 #endif
 
 #ifndef atomic_xchg
-#ifdef __ATOMIC_SEQ_CST
+#if defined(__clang__)
+#define atomic_xchg(ptr, i)__sync_swap(ptr, i)
+#elif defined(__ATOMIC_SEQ_CST)
 #define atomic_xchg(ptr, i)({   \
 typeof(*ptr) _new = (i), _old;  \
 __atomic_exchange(ptr, &_new, &_old, __ATOMIC_SEQ_CST); \
 _old;   \
 })
-#elif defined __clang__
-#define atomic_xchg(ptr, i)__sync_exchange(ptr, i)
 #else
 /* __sync_lock_test_and_set() is documented to be an acquire barrier only.  */
 #define atomic_xchg(ptr, i)(smp_mb(), __sync_lock_test_and_set(ptr, i))
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH for-1.7 0/5] acpi unit-test: added tests

2013-11-21 Thread Erik Rull


Marcel Apfelbaum wrote:

Added 2 tests:
  1. Basic check of FACS table (missed on prev submission)
  2. Compare DSDT and SSDT tables against expected values

Test 2:
  - runs only if iasl is installed on the host machine.
  - the test plan:
1. Dumps the ACPI tables as AML on the disk.
2. Runs iasl to disassembly the tables into ASL files.
3. Compares them with expected offline ASL files.

  - the test runs for both default machine and q35.
  - in case the test fails, it can be easily tweaked to
show the differences between the ASL files and
understand the issue.

Patches:
  1/5 - test 1
  2/5 - some infrastructure improvements
  3/5 - expected asl files for test 2
  4/5 - creates links for the expected files
if the build directory is not current
  5/5 - test 2



Which iasl Version is needed for the ACPI compilation and testing? I have 
an IASL installed on my build machine, but when trying to compile the ACPI 
stuff, it fails. Maybe it's just too old, but I didn't find a way to 
disable the iasl access. Must I uninstall iasl on my machine to get qemu 
compiled again?

The IASL version is:
Intel ACPI Component Architecture
ASL Optimizing Compiler version 20060912 [Dec 20 2006]
Copyright (C) 2000 - 2006 Intel Corporation
Supports ACPI Specification Revision 3.0a

Thanks for your support.

Best regards,

Erik

[Qemu-devel] [Bug 1253777] [NEW] OpenBSD VM running on OpenBSD host has sleep calls taking twice as long as they should

2013-11-21 Thread Martin van den Nieuwelaar

Public bug reported:

Running a script like

while [ 1 ]
do
  date
  sleep 1
done

on the VM will result in the (correct) date being displayed, but it is
displayed only every two (!) seconds.  We have also noticed that if we
connect to the VM's console using VNC, and move the mouse pointer
constantly in the VNC window, the script runs normally with updates
every second!  Note that the script doesn't have to be running on the
VM's console - it's also possible to (say) ssh to the VM from a separate
machine and run the script and it will display the '2 second' issue, but
as soon as you move the mouse pointer constantly in the VNC console
window the script starts behaving normally with updates every second.

I have only seen this bug when running an OpenBSD VM on an OpenBSD host.
Running an OpenBSD VM on a Linux host does not exhibit the problem for
me.  I also belive (am told) that a Linux VM running on an OpenBSD host
does not exhibit the problem.

I have been using the OpenBSD 5.4 64 bit distro which comes with qemu
1.5.1 in a package, however I tried compiling qemu 1.6.1 and that has
the same bug.  In fact older OpenBSD distros have the same issue - going
back to OpenBSD distros from two years ago still have the problem.  This
is not a 'new' bug recently introduced.

Initially I wondered if it could be traced to an incorrectly set command
line option, but I've since gone through many of the options in the man
page simply trying different values (eg. different CPU types ( -cpu) ,
different emulated PC (-M)) but so far the problem remains.

I'm quite happy to run tests in order to track this bug down better.  We
use qemu running on OpenBSD extensively and find it very useful!

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: openbsd sleep time

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1253777

Title:
  OpenBSD VM running on OpenBSD host has sleep calls taking twice as
  long as they should

Status in QEMU:
  New

Bug description:
  Running a script like

  while [ 1 ]
  do
date
sleep 1
  done

  on the VM will result in the (correct) date being displayed, but it is
  displayed only every two (!) seconds.  We have also noticed that if we
  connect to the VM's console using VNC, and move the mouse pointer
  constantly in the VNC window, the script runs normally with updates
  every second!  Note that the script doesn't have to be running on the
  VM's console - it's also possible to (say) ssh to the VM from a
  separate machine and run the script and it will display the '2 second'
  issue, but as soon as you move the mouse pointer constantly in the VNC
  console window the script starts behaving normally with updates every
  second.

  I have only seen this bug when running an OpenBSD VM on an OpenBSD
  host.  Running an OpenBSD VM on a Linux host does not exhibit the
  problem for me.  I also belive (am told) that a Linux VM running on an
  OpenBSD host does not exhibit the problem.

  I have been using the OpenBSD 5.4 64 bit distro which comes with qemu
  1.5.1 in a package, however I tried compiling qemu 1.6.1 and that has
  the same bug.  In fact older OpenBSD distros have the same issue -
  going back to OpenBSD distros from two years ago still have the
  problem.  This is not a 'new' bug recently introduced.

  Initially I wondered if it could be traced to an incorrectly set
  command line option, but I've since gone through many of the options
  in the man page simply trying different values (eg. different CPU
  types ( -cpu) , different emulated PC (-M)) but so far the problem
  remains.

  I'm quite happy to run tests in order to track this bug down better.
  We use qemu running on OpenBSD extensively and find it very useful!

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1253777/+subscriptions

Re: [Qemu-devel] [RFC PATCH 1/2] e1000: Use Address_Available bit as HW latch

2013-11-21 Thread Eric Blake

On 11/21/2013 01:04 PM, Vlad Yasevich wrote:
> e1000 provides a E1000_RAH_AV bit on every complete write
> to the Receive Address Register.  We can use this bit
> 2 ways:
>  1) To trigger HMP notifications.  When the bit is set the
> mac address is fully set and we can update the HMP.
> 
>  2) We can turn off he bit on the write to low order bits of

s/he/the/

> the Receive Address Register, so that we would not try
> to match received traffic to this address when it is
> not completely set.
> 
> Signed-off-by: Vlad Yasevich 
> ---
>  hw/net/e1000.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 

>  
> -if (index == RA || index == RA + 1) {
> +switch (index) {
> +case RA:
> +/* Mask off AV bit on the write of the low dword.  The write of
> + * the high dword will set the bit.  This way a half-written
> + * mac address will not be used to filter on rx.
> + */
> +s->mac_reg[RA+1] &= ~E1000_RAH_AV;

Does real hardware also auto-clear this bit when writing the low word
(thus forcing all drivers to write high word last to make a change take
effect)?  Or are we risking the case of a driver that writes high word
first including the bit, and where real hardware just glitches over the
temporary half-written address where our emulation locks the user out
entirely?  (Asked by someone that has not read the datasheet, so take
with a grain of salt)

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCHv2 3/6] ui/vnc: optimize dirty bitmap tracking

2013-11-21 Thread Eric Blake

On 11/21/2013 01:51 AM, Peter Lieven wrote:
> vnc_update_client currently scans the dirty bitmap of each client
> bitwise which is a very costly operation if only few bits are dirty.
> vnc_refresh_server_surface does almost the same.
> this patch optimizes both by utilizing the heavily optimized
> function find_next_bit to find the offset of the next dirty
> bit in the dirty bitmaps.
> 
> The following artifical test (just the bitmap operation part) running

s/artifical/artificial/

> vnc_update_client 65536 times on a 2560x2048 surface illustrates the
> performance difference:
> 
> All bits clean - vnc_update_client_new: 0.07 secs
>  vnc_update_client_old: 10.98 secs
> 
> All bits dirty - vnc_update_client_new: 11.26 secs
>  vnc_update_client_old: 20.19 secs
> 
> Few bits dirty - vnc_update_client_new: 0.08 secs
>  vnc_update_client_old: 10.98 secs
> 
> The case for all bits dirty is still rather slow, this
> is due to the implementation of find_and_clear_dirty_height.
> This will be addresses in a separate patch.

s/addresses/addressed/

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [RFC PATCH 2/2] rtl8139: update HMP only when the address is fully written

2013-11-21 Thread Vlad Yasevich

rtl8139 hardware requires 9346 config register to be set into
write mode before mac address can be changed even though it is
not documented.  Every driver inspected so far appears to do
this along with comments that this is an undocumented requirement.

We can use this to help us identify when the mac address has been
completely written.  Simple set a flag whenever mac has changed
and at the next transition of 9346 register from Write to Normal
mode, we update the HMP.

Signed-off-by: Vlad Yasevich 
---
 hw/i386/pc_piix.c|  4 
 hw/i386/pc_q35.c |  4 
 hw/net/rtl8139.c | 50 +-
 include/hw/i386/pc.h |  8 
 4 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 2daa111..731ae3b 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -372,6 +372,10 @@ static QEMUMachine pc_i440fx_machine_v1_7 = {
 PC_I440FX_1_7_MACHINE_OPTIONS,
 .name = "pc-i440fx-1.7",
 .init = pc_init_pci_1_7,
+.compat_props = (GlobalProperty[]) {
+PC_COMPAT_1_7,
+{ /* end of list */ }
+},
 };
 
 #define PC_I440FX_1_6_MACHINE_OPTIONS PC_I440FX_1_7_MACHINE_OPTIONS
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index e2b8907..7b6aedf 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -292,6 +292,10 @@ static QEMUMachine pc_q35_machine_v1_7 = {
 PC_Q35_1_7_MACHINE_OPTIONS,
 .name = "pc-q35-1.7",
 .init = pc_q35_init_1_7,
+.compat_props = (GlobalProperty[]) {
+PC_COMPAT_1_7,
+{ /* end of list */ }
+},
 };
 
 #define PC_Q35_1_6_MACHINE_OPTIONS PC_Q35_1_7_MACHINE_OPTIONS
diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 7f2b4db..5c4caec 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -476,6 +476,7 @@ typedef struct RTL8139State {
 
 uint16_t CpCmd;
 uint8_t  TxThresh;
+bool mac_changed;
 
 NICState *nic;
 NICConf conf;
@@ -515,6 +516,9 @@ typedef struct RTL8139State {
 
 /* Support migration to/from old versions */
 int rtl8139_mmio_io_addr_dummy;
+#define RTL8139_FLAG_MAC_BIT 0
+#define RTL8139_FLAG_MAC_COMPLETE (1 << RTL8139_FLAG_MAC_BIT)
+uint32_tcompat_flags;
 } RTL8139State;
 
 /* Writes tally counters to memory via DMA */
@@ -1215,6 +1219,7 @@ static void rtl8139_reset(DeviceState *d)
 /* restore MAC address */
 memcpy(s->phys, s->conf.macaddr.a, 6);
 qemu_format_nic_info_str(qemu_get_queue(s->nic), s->phys);
+s->mac_changed = false;
 
 /* reset interrupt mask */
 s->IntrStatus = 0;
@@ -1563,6 +1568,14 @@ static void rtl8139_Cfg9346_write(RTL8139State *s, 
uint32_t val)
 /* Reset.  */
 val = 0;
 rtl8139_reset(d);
+} else if (opmode == Cfg9346_Normal && s->mac_changed) {
+/* Even though it is not documented, it is required to set
+ * opmode to Cfg9346_ConfigWrite when changing the mac address
+ * of the card and to set to Cfg9346_Normal when done.  We
+ * can use this as an idication to kick off the notification event.
+ */
+qemu_format_nic_info_str(qemu_get_queue(s->nic), s->phys);
+s->mac_changed = false;
 }
 
 s->Cfg9346 = val;
@@ -2743,7 +2756,12 @@ static void rtl8139_io_writeb(void *opaque, uint8_t 
addr, uint32_t val)
 {
 case MAC0 ... MAC0+5:
 s->phys[addr - MAC0] = val;
-qemu_format_nic_info_str(qemu_get_queue(s->nic), s->phys);
+if (s->compat_flags & RTL8139_FLAG_MAC_COMPLETE) {
+s->mac_changed = true;
+} else if (addr == MAC0+5) {
+/* Emulate old style updates on the last write */
+qemu_format_nic_info_str(qemu_get_queue(s->nic), s->phys);
+}
 break;
 case MAC0+6 ... MAC0+7:
 /* reserved */
@@ -3256,6 +3274,13 @@ static int rtl8139_post_load(void *opaque, int 
version_id)
  * to link status bit in BasicModeStatus */
 qemu_get_queue(s->nic)->link_down = (s->BasicModeStatus & 0x04) == 0;
 
+/* Emulate old behavior if we don't support mac change completion
+ * tracking
+ */
+if (!(s->compat_flags & RTL8139_FLAG_MAC_COMPLETE)) {
+s->mac_changed = false;
+}
+
 return 0;
 }
 
@@ -3286,6 +3311,24 @@ static void rtl8139_pre_save(void *opaque)
 s->rtl8139_mmio_io_addr_dummy = 0;
 }
 
+static bool rtl8139_mac_state_needed(void *opaque)
+{
+RTL8139State *s = opaque;
+
+return (s->compat_flags & RTL8139_FLAG_MAC_COMPLETE) && s->mac_changed;
+}
+
+static const VMStateDescription vmstate_rtl8139_mac_state ={
+.name = "rtl8139/mac_state",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField []) {
+VMSTATE_BOOL(mac_changed, RTL8139State),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_rtl8139 = {
 .name = "rtl8139",
 .version_id = 4,
@@ -3371,6 +3414,9 @

Re: [Qemu-devel] [PATCH 0/3] [PULL for 1.7?] qemu-kvm.git uq/master queue

2013-11-21 Thread Stefan Weil

Am 21.11.2013 14:28, schrieb Gleb Natapov:
> The following changes since commit fc8ead74674b7129e8f31c2595c76658e5622197:
>
>   Merge remote-tracking branch 'qemu-kvm/uq/master' into staging (2013-10-18 
> 10:03:24 -0700)
>
> are available in the git repository at:
>
>
>   git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master
>
> for you to fetch changes up to ef4cbe14342c1f63b3c754e306218f004f4e26c4:
>
>   kvm: Fix uninitialized cpuid_data (2013-11-07 13:14:56 +0200)
>
> 
> Jan Kiszka (1):
>   pci-assign: Remove dead code for direct I/O region access from userspace
>
> Paolo Bonzini (1):
>   KVM: x86: fix typo in KVM_GET_XCRS
>
> Stefan Weil (1):
>   kvm: Fix uninitialized cpuid_data
>
>  hw/i386/kvm/pci-assign.c | 56 
> +---
>  target-i386/kvm.c| 13 ---
>  2 files changed, 14 insertions(+), 55 deletions(-)
>

I think these patches should be included in QEMU 1.7, too.

Stefan

Re: [Qemu-devel] [PATCH for-1.7 1/2] acpi-build: fix build on glib < 2.22

2013-11-21 Thread Paolo Bonzini

Il 21/11/2013 13:17, Michael S. Tsirkin ha scritto:
> g_string_vprintf was only introduced in 2.24 so switch to vsnprintf
> instead.  A bit uglier but name size is fixed at 4 bytes here so it's
> easy.
> 
> Reported-by: Richard Henderson 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/i386/acpi-build.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 486e705..59a17df 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -287,16 +287,17 @@ static inline void build_append_array(GArray *array, 
> GArray *val)
>  
>  static void build_append_nameseg(GArray *array, const char *format, ...)
>  {
> -GString *s = g_string_new("");
> +/* It would be nicer to use g_string_vprintf but it's only there in 2.22 
> */
> +char s[] = "";
> +int len;
>  va_list args;
>  
>  va_start(args, format);
> -g_string_vprintf(s, format, args);
> +len = vsnprintf(s, sizeof s, format, args);
>  va_end(args);
>  
> -assert(s->len == 4);
> -g_array_append_vals(array, s->str, s->len);
> -g_string_free(s, true);
> +assert(len == 4);
> +g_array_append_vals(array, s, len);
>  }
>  
>  /* 5.4 Definition Block Encoding */
> 

Reviewed-by: Paolo Bonzini

[Qemu-devel] [PATCH 2/3] pci-assign: Remove dead code for direct I/O region access from userspace

2013-11-21 Thread Gleb Natapov

From: Jan Kiszka 

This feature was already deprecated back then in qemu-kvm, ie. before
pci-assign went upstream. assigned_dev_ioport_rw will never be invoked
with resource_fd < 0.

Signed-off-by: Jan Kiszka 
Acked-by: Alex Williamson 
Signed-off-by: Gleb Natapov 
---
 hw/i386/kvm/pci-assign.c | 56 +---
 1 file changed, 10 insertions(+), 46 deletions(-)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 011764f..4e65110 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -154,55 +154,19 @@ static uint64_t assigned_dev_ioport_rw(AssignedDevRegion 
*dev_region,
 uint64_t val = 0;
 int fd = dev_region->region->resource_fd;
 
-if (fd >= 0) {
-if (data) {
-DEBUG("pwrite data=%" PRIx64 ", size=%d, e_phys=" TARGET_FMT_plx
-  ", addr="TARGET_FMT_plx"\n", *data, size, addr, addr);
-if (pwrite(fd, data, size, addr) != size) {
-error_report("%s - pwrite failed %s",
- __func__, strerror(errno));
-}
-} else {
-if (pread(fd, &val, size, addr) != size) {
-error_report("%s - pread failed %s",
- __func__, strerror(errno));
-val = (1UL << (size * 8)) - 1;
-}
-DEBUG("pread val=%" PRIx64 ", size=%d, e_phys=" TARGET_FMT_plx
-  ", addr=" TARGET_FMT_plx "\n", val, size, addr, addr);
+if (data) {
+DEBUG("pwrite data=%" PRIx64 ", size=%d, e_phys=" TARGET_FMT_plx
+  ", addr="TARGET_FMT_plx"\n", *data, size, addr, addr);
+if (pwrite(fd, data, size, addr) != size) {
+error_report("%s - pwrite failed %s", __func__, strerror(errno));
 }
 } else {
-uint32_t port = addr + dev_region->u.r_baseport;
-
-if (data) {
-DEBUG("out data=%" PRIx64 ", size=%d, e_phys=" TARGET_FMT_plx
-  ", host=%x\n", *data, size, addr, port);
-switch (size) {
-case 1:
-outb(*data, port);
-break;
-case 2:
-outw(*data, port);
-break;
-case 4:
-outl(*data, port);
-break;
-}
-} else {
-switch (size) {
-case 1:
-val = inb(port);
-break;
-case 2:
-val = inw(port);
-break;
-case 4:
-val = inl(port);
-break;
-}
-DEBUG("in data=%" PRIx64 ", size=%d, e_phys=" TARGET_FMT_plx
-  ", host=%x\n", val, size, addr, port);
+if (pread(fd, &val, size, addr) != size) {
+error_report("%s - pread failed %s", __func__, strerror(errno));
+val = (1UL << (size * 8)) - 1;
 }
+DEBUG("pread val=%" PRIx64 ", size=%d, e_phys=" TARGET_FMT_plx
+  ", addr=" TARGET_FMT_plx "\n", val, size, addr, addr);
 }
 return val;
 }
-- 
1.8.4.rc3

[Qemu-devel] [PATCH 0/3] [PULL] qemu-kvm.git uq/master queue

2013-11-21 Thread Gleb Natapov

The following changes since commit fc8ead74674b7129e8f31c2595c76658e5622197:

  Merge remote-tracking branch 'qemu-kvm/uq/master' into staging (2013-10-18 
10:03:24 -0700)

are available in the git repository at:


  git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master

for you to fetch changes up to ef4cbe14342c1f63b3c754e306218f004f4e26c4:

  kvm: Fix uninitialized cpuid_data (2013-11-07 13:14:56 +0200)


Jan Kiszka (1):
  pci-assign: Remove dead code for direct I/O region access from userspace

Paolo Bonzini (1):
  KVM: x86: fix typo in KVM_GET_XCRS

Stefan Weil (1):
  kvm: Fix uninitialized cpuid_data

 hw/i386/kvm/pci-assign.c | 56 +---
 target-i386/kvm.c| 13 ---
 2 files changed, 14 insertions(+), 55 deletions(-)

Re: [Qemu-devel] [PATCH v3 0/6] Add cache mode option to qemu-iotests, and change default mode to "writeback"

2013-11-21 Thread Stefan Hajnoczi

On Wed, Nov 20, 2013 at 03:44:11PM +0800, Fam Zheng wrote:
> This series adds cache mode option in the iotests framework. Test cases are
> updated to make use of cache mode and mask supported modes.
> 
> v3: Change _unsupported_qemu_io_options to _supported_cache_modes.
> Change default mode to "writeback".
> Clean up some whitespaces in the end of series.
> Fix "026.out.nocache" case.
> Fix 048 case on tmpfs.
> 
> 
> Fam Zheng (6):
>   qemu-iotests: Add "-c " option
>   qemu-iotests: Honour cache mode in iotests.py
>   qemu-iotests: Add _supported_cache_modes
>   qemu-iotests: Change default cache mode to "writeback"
>   qemu-iotests: Force qcow2 in error path test in 048
>   qemu-iotests: Clean up spaces in usage output
> 
>  tests/qemu-iotests/026|  2 +-
>  tests/qemu-iotests/039|  2 +-
>  tests/qemu-iotests/048|  8 +++-
>  tests/qemu-iotests/052|  3 +--
>  tests/qemu-iotests/check  |  2 +-
>  tests/qemu-iotests/common | 37 +++--
>  tests/qemu-iotests/common.rc  | 20 ++--
>  tests/qemu-iotests/iotests.py |  3 ++-
>  8 files changed, 50 insertions(+), 27 deletions(-)

In principle I'm happy with the series but there are two instances where
you are silently changing what the test does (overriding cache mode and
image format).

Please skip tests that cannot run instead of silently testing something
else.

Stefan

Re: [Qemu-devel] [PATCH 16/27] acpi: ich9: allow guest to clear SCI rised by GPE

2013-11-21 Thread Igor Mammedov

On Thu, 21 Nov 2013 16:28:40 +0800
Hu Tao  wrote:

> On Thu, Nov 21, 2013 at 10:26:36AM +0200, Michael S. Tsirkin wrote:
> > On Thu, Nov 21, 2013 at 04:12:22PM +0800, Hu Tao wrote:
> > > On Thu, Nov 21, 2013 at 09:14:18AM +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Nov 21, 2013 at 03:38:37AM +0100, Igor Mammedov wrote:
> > > > > it fixes IRQ storm since guest isn't able to lower SCI IRQ
> > > > > after it has been handled when it clears GPE event.
> > > > > 
> > > > > Signed-off-by: Igor Mammedov 
> > > > 
> > > > The storm is only on memory hotplug right?
> > > 
> > > IIRC, it happens on cpu hotplug, too.
> > 
> > So this is a bugfix then? We should include this in 1.7?
> 
> Yes. but cpu hotplug support is not added for q35, so the bug doesn't
> affect anyone, yet.
> 

I was thinking about CPU hotplug for q35 as well, thus patch

1/27 "acpi: factor out common pm_update_sci() into acpi core"

moves common for CPU/Memory hotplug ACPI bits into generic code,
so making CPU hotplug for q35 would require only unifying CPU
hotplug bits and plumbing it in q35 its done for mem hotplug in:

 [PATCH 17/27] acpi: initialize memory hotplug ACPI ICH9 hardware

[Qemu-devel] [PATCH v3 for-1.7] s390x: fix flat file load on 32 bit systems

2013-11-21 Thread Michael S. Tsirkin

pc-bios/s390-zipl.rom is a flat image so it's expected that
loading it as elf will fail.
It should fall back on loading a flat file, but doesn't
on 32 bit systems, instead it fails printing:
qemu: hardware error: could not load bootloader 's390-zipl.rom'

The result is boot failure.

The reason is that a 64 bit unsigned interger which is set
to -1 on error is compared to -1UL which on a 32 bit system
with gcc is a 32 bit unsigned interger.
Since both are unsigned, no sign extension takes place and
comparison evaluates to non-equal.

There's no reason to do clever tricks: all functions
we call actually return int so just use int.
And then we can use == -1 everywhere, consistently.

Reviewed-by: Alexander Graf 
Signed-off-by: Michael S. Tsirkin 
---

changes from v2:
drop cleanups that aren't bugfixes, not needed for 1.7
changes from v1:
fix more places

 hw/s390x/ipl.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index d69adb2..65d39da 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -62,10 +62,10 @@ typedef struct S390IPLState {
 static int s390_ipl_init(SysBusDevice *dev)
 {
 S390IPLState *ipl = S390_IPL(dev);
-ram_addr_t kernel_size = 0;
+int kernel_size;
 
 if (!ipl->kernel) {
-ram_addr_t bios_size = 0;
+int bios_size;
 char *bios_filename;
 
 /* Load zipl bootloader */
@@ -80,7 +80,7 @@ static int s390_ipl_init(SysBusDevice *dev)
 
 bios_size = load_elf(bios_filename, NULL, NULL, &ipl->start_addr, NULL,
  NULL, 1, ELF_MACHINE, 0);
-if (bios_size == -1UL) {
+if (bios_size == -1) {
 bios_size = load_image_targphys(bios_filename, ZIPL_IMAGE_START,
 4096);
 ipl->start_addr = ZIPL_IMAGE_START;
@@ -90,17 +90,17 @@ static int s390_ipl_init(SysBusDevice *dev)
 }
 g_free(bios_filename);
 
-if ((long)bios_size < 0) {
+if (bios_size == -1) {
 hw_error("could not load bootloader '%s'\n", bios_name);
 }
 return 0;
 } else {
 kernel_size = load_elf(ipl->kernel, NULL, NULL, NULL, NULL,
NULL, 1, ELF_MACHINE, 0);
-if (kernel_size == -1UL) {
+if (kernel_size == -1) {
 kernel_size = load_image_targphys(ipl->kernel, 0, ram_size);
 }
-if (kernel_size == -1UL) {
+if (kernel_size == -1) {
 fprintf(stderr, "could not load kernel '%s'\n", ipl->kernel);
 return -1;
 }
@@ -115,7 +115,8 @@ static int s390_ipl_init(SysBusDevice *dev)
 ipl->start_addr = KERN_IMAGE_START;
 }
 if (ipl->initrd) {
-ram_addr_t initrd_offset, initrd_size;
+ram_addr_t initrd_offset;
+int initrd_size;
 
 initrd_offset = INITRD_START;
 while (kernel_size + 0x10 > initrd_offset) {
@@ -123,7 +124,7 @@ static int s390_ipl_init(SysBusDevice *dev)
 }
 initrd_size = load_image_targphys(ipl->initrd, initrd_offset,
   ram_size - initrd_offset);
-if (initrd_size == -1UL) {
+if (initrd_size == -1) {
 fprintf(stderr, "qemu: could not load initrd '%s'\n", ipl->initrd);
 exit(1);
 }
-- 
MST

[Qemu-devel] [PATCH v2 for-1.7] s390x: fix flat file load on 32 bit systems

2013-11-21 Thread Michael S. Tsirkin

pc-bios/s390-zipl.rom is a flat image so it's expected that
loading it as elf will fail.
It should fall back on loading a flat file, but doesn't
on 32 bit systems, instead it fails printing:
qemu: hardware error: could not load bootloader 's390-zipl.rom'

The result is boot failure.

The reason is that a 64 bit unsigned interger which is set
to -1 on error is compared to -1UL which on a 32 bit system
with gcc is a 32 bit unsigned interger.
Since both are unsigned, no sign extension takes place and
comparison evaluates to non-equal.

There's no reason to do clever tricks: all functions
we call actually return int so just use int.
In fact ram_addr_t dos not make any sense -
it's meaning is "memory handle for migration".

And then we can use == -1 everywhere, consistently.

Signed-off-by: Michael S. Tsirkin 
---

Changes from v1:
better fix: use int everywhere
fix all places with same bug (e.g. -kernel was broken too)

 hw/s390x/ipl.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index d69adb2..9570912 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -62,10 +62,9 @@ typedef struct S390IPLState {
 static int s390_ipl_init(SysBusDevice *dev)
 {
 S390IPLState *ipl = S390_IPL(dev);
-ram_addr_t kernel_size = 0;
 
 if (!ipl->kernel) {
-ram_addr_t bios_size = 0;
+int bios_size;
 char *bios_filename;
 
 /* Load zipl bootloader */
@@ -80,7 +79,7 @@ static int s390_ipl_init(SysBusDevice *dev)
 
 bios_size = load_elf(bios_filename, NULL, NULL, &ipl->start_addr, NULL,
  NULL, 1, ELF_MACHINE, 0);
-if (bios_size == -1UL) {
+if (bios_size == -1) {
 bios_size = load_image_targphys(bios_filename, ZIPL_IMAGE_START,
 4096);
 ipl->start_addr = ZIPL_IMAGE_START;
@@ -90,17 +89,19 @@ static int s390_ipl_init(SysBusDevice *dev)
 }
 g_free(bios_filename);
 
-if ((long)bios_size < 0) {
+if (bios_size == -1) {
 hw_error("could not load bootloader '%s'\n", bios_name);
 }
 return 0;
 } else {
+int kernel_size;
+
 kernel_size = load_elf(ipl->kernel, NULL, NULL, NULL, NULL,
NULL, 1, ELF_MACHINE, 0);
-if (kernel_size == -1UL) {
+if (kernel_size == -1) {
 kernel_size = load_image_targphys(ipl->kernel, 0, ram_size);
 }
-if (kernel_size == -1UL) {
+if (kernel_size == -1) {
 fprintf(stderr, "could not load kernel '%s'\n", ipl->kernel);
 return -1;
 }
@@ -115,7 +116,8 @@ static int s390_ipl_init(SysBusDevice *dev)
 ipl->start_addr = KERN_IMAGE_START;
 }
 if (ipl->initrd) {
-ram_addr_t initrd_offset, initrd_size;
+hwaddr initrd_offset;
+int initrd_size;
 
 initrd_offset = INITRD_START;
 while (kernel_size + 0x10 > initrd_offset) {
@@ -123,7 +125,7 @@ static int s390_ipl_init(SysBusDevice *dev)
 }
 initrd_size = load_image_targphys(ipl->initrd, initrd_offset,
   ram_size - initrd_offset);
-if (initrd_size == -1UL) {
+if (initrd_size == -1) {
 fprintf(stderr, "qemu: could not load initrd '%s'\n", ipl->initrd);
 exit(1);
 }
-- 
MST

Re: [Qemu-devel] [PATCH for-1.7 2/2] acpi-build: fix build on glib < 2.14

2013-11-21 Thread Paolo Bonzini

Il 21/11/2013 13:17, Michael S. Tsirkin ha scritto:
> g_array_get_element_size was only added in glib 2.14.
> Fortunately we don't use it for any arrays where
> element size is > 1, so just add an assert.
> 
> Reported-by: Richard Henderson 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/i386/acpi-build.c | 5 -
>  hw/i386/bios-linker-loader.c | 8 
>  2 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 59a17df..5f36e7e 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -425,7 +425,10 @@ static inline void *acpi_data_push(GArray *table_data, 
> unsigned size)
>  
>  static unsigned acpi_data_len(GArray *table)
>  {
> -return table->len * g_array_get_element_size(table);
> +#if GLIB_CHECK_VERSION(2, 14, 0)
> +assert(g_array_get_element_size(table) == 1);
> +#endif
> +return table->len;
>  }
>  
>  static void acpi_align_size(GArray *blob, unsigned align)

This looks good, but is the below part necessary?  It's ugly!

Paolo

> diff --git a/hw/i386/bios-linker-loader.c b/hw/i386/bios-linker-loader.c
> index 0833853..fd23611 100644
> --- a/hw/i386/bios-linker-loader.c
> +++ b/hw/i386/bios-linker-loader.c
> @@ -90,7 +90,7 @@ enum {
>  
>  GArray *bios_linker_loader_init(void)
>  {
> -return g_array_new(false, true /* clear */, 
> sizeof(BiosLinkerLoaderEntry));
> +return g_array_new(false, true /* clear */, 1);
>  }
>  
>  /* Free linker wrapper and return the linker array. */
> @@ -115,7 +115,7 @@ void bios_linker_loader_alloc(GArray *linker,
>  BIOS_LINKER_LOADER_ALLOC_ZONE_HIGH);
>  
>  /* Alloc entries must come first, so prepend them */
> -g_array_prepend_val(linker, entry);
> +g_array_prepend_vals(linker, &entry, sizeof entry);
>  }
>  
>  void bios_linker_loader_add_checksum(GArray *linker, const char *file,
> @@ -132,7 +132,7 @@ void bios_linker_loader_add_checksum(GArray *linker, 
> const char *file,
>  entry.cksum.start = cpu_to_le32((uint8_t *)start - (uint8_t *)table);
>  entry.cksum.length = cpu_to_le32(size);
>  
> -g_array_append_val(linker, entry);
> +g_array_append_vals(linker, &entry, sizeof entry);
>  }
>  
>  void bios_linker_loader_add_pointer(GArray *linker,
> @@ -154,5 +154,5 @@ void bios_linker_loader_add_pointer(GArray *linker,
>  assert(pointer_size == 1 || pointer_size == 2 ||
> pointer_size == 4 || pointer_size == 8);
>  
> -g_array_append_val(linker, entry);
> +g_array_append_vals(linker, &entry, sizeof entry);
>  }
>

[Qemu-devel] [PATCH 2/3] usb-host-libusb: Add alloc / free streams ops

2013-11-21 Thread Hans de Goede

Signed-off-by: Hans de Goede 
---
 hw/usb/host-libusb.c | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index 0dd60b2..894875b 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -1280,6 +1280,54 @@ static void usb_host_handle_reset(USBDevice *udev)
 }
 }
 
+static int usb_host_alloc_streams(USBDevice *udev, USBEndpoint **eps,
+  int nr_eps, int streams)
+{
+#if LIBUSBX_API_VERSION >= 0x01000103
+USBHostDevice *s = USB_HOST_DEVICE(udev);
+unsigned char endpoints[30];
+int i, rc;
+
+for (i = 0; i < nr_eps; i++) {
+endpoints[i] = eps[i]->nr;
+if (eps[i]->pid == USB_TOKEN_IN) {
+endpoints[i] |= 0x80;
+}
+}
+rc = libusb_alloc_streams(s->dh, streams, endpoints, nr_eps);
+if (rc < 0) {
+usb_host_libusb_error("libusb_alloc_streams", rc);
+} else if (rc != streams) {
+fprintf(stderr,
+"libusb_alloc_streams: got less streams then requested %d < %d\n",
+rc, streams);
+}
+
+return (rc == streams) ? 0 : -1;
+#else
+fprintf(stderr, "libusb_alloc_streams: error not implemented\n");
+return -1;
+#endif
+}
+
+static void usb_host_free_streams(USBDevice *udev, USBEndpoint **eps,
+  int nr_eps)
+{
+#if LIBUSBX_API_VERSION >= 0x01000103
+USBHostDevice *s = USB_HOST_DEVICE(udev);
+unsigned char endpoints[30];
+int i;
+
+for (i = 0; i < nr_eps; i++) {
+endpoints[i] = eps[i]->nr;
+if (eps[i]->pid == USB_TOKEN_IN) {
+endpoints[i] |= 0x80;
+}
+}
+libusb_free_streams(s->dh, endpoints, nr_eps);
+#endif
+}
+
 /*
  * This is *NOT* about restoring state.  We have absolutely no idea
  * what state the host device is in at the moment and whenever it is
@@ -1361,6 +1409,8 @@ static void usb_host_class_initfn(ObjectClass *klass, 
void *data)
 uc->handle_reset   = usb_host_handle_reset;
 uc->handle_destroy = usb_host_handle_destroy;
 uc->flush_ep_queue = usb_host_flush_ep_queue;
+uc->alloc_streams  = usb_host_alloc_streams;
+uc->free_streams   = usb_host_free_streams;
 dc->vmsd = &vmstate_usb_host;
 dc->props = usb_host_dev_properties;
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
-- 
1.8.4.2

Re: [Qemu-devel] [PATCH v3 5/6] qemu-iotests: Force qcow2 in error path test in 048

2013-11-21 Thread Fam Zheng


On 2013年11月21日 20:41, Stefan Hajnoczi wrote:

On Wed, Nov 20, 2013 at 03:44:16PM +0800, Fam Zheng wrote:

The "raw" doesn't always work on certain file systems (e.g. tmpfs). Use
qcow2 to make the allocation status explicit.

Signed-off-by: Fam Zheng 
---
  tests/qemu-iotests/048 | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/048 b/tests/qemu-iotests/048
index 9def7fc..0fb5d2c 100755
--- a/tests/qemu-iotests/048
+++ b/tests/qemu-iotests/048
@@ -88,7 +88,13 @@ event = "read_aio"
  errno = "5"
  once ="off"
  EOF
-_make_test_img $size
+if [ "$IMGFMT" = "raw" ]; then
+# For raw the allocation status for new image depends on file system, force
+# qcow2 in this case, since the tested code is in block layer
+IMGFMT=qcow2 _make_test_img $size | _filter_imgfmt


If it can't be tested, skip the test.  Don't silently test something
else.



OK. Do you think I should split the test? Since we don't have 
conditional reference output.


Fam

Re: [Qemu-devel] [PATCH for-1.7] s390x: fix flat rom load on 32 bit systems

2013-11-21 Thread Alexander Graf


On 21.11.2013, at 13:24, Cornelia Huck  wrote:

> On Thu, 21 Nov 2013 14:08:22 +0200
> "Michael S. Tsirkin"  wrote:
> 
>> pc-bios/s390-zipl.rom is a flat image so it's expected that
>> loading it as elf will fail.
>> It should fall back on loading a flat file, but doesn't
>> on 32 bit systems, instead it fails printing:
>>qemu: hardware error: could not load bootloader 's390-zipl.rom'
>> 
>> The result is boot failure.
>> 
>> The reason is that a 64 bit unsigned interger which is set
>> to -1 on error is compared to -1UL which on a 32 bit system
>> with gcc is a 32 bit unsigned interger.
>> Since both are unsigned, no sign extension takes place and
>> comparison evaluates to non-equal.
>> 
>> There's no reason to do clever tricks: -1 will cause
>> sign extension to happen correctly automatically.
>> 
>> Signed-off-by: Michael S. Tsirkin 
>> ---
>> hw/s390x/ipl.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
>> index d69adb2..88115e9 100644
>> --- a/hw/s390x/ipl.c
>> +++ b/hw/s390x/ipl.c
>> @@ -80,7 +80,7 @@ static int s390_ipl_init(SysBusDevice *dev)
>> 
>> bios_size = load_elf(bios_filename, NULL, NULL, &ipl->start_addr, 
>> NULL,
>>  NULL, 1, ELF_MACHINE, 0);
>> -if (bios_size == -1UL) {
>> +if (bios_size == -1) {
>> bios_size = load_image_targphys(bios_filename, ZIPL_IMAGE_START,
>> 4096);
>> ipl->start_addr = ZIPL_IMAGE_START;
> 
> Makes sense, but doesn't the kernel loader just below suffer from just
> the same problem?

Yes, initrd too.


Alex

Re: [Qemu-devel] [PATCH 00/27 RFC v7] ACPI memory hotplug

2013-11-21 Thread Igor Mammedov

On Thu, 21 Nov 2013 08:20:56 +0200
"Michael S. Tsirkin"  wrote:

> On Thu, Nov 21, 2013 at 03:38:21AM +0100, Igor Mammedov wrote:
> > ---
> > What's new since v6:
> > 
> > * DIMM device is split to backend and frontend. Therefore following
> >   command/options were added for supporting it:
> > 
> >   For memdev backend:
> >   CLI: -memdev-add
> >   Monitor/QMP: memdev-add
> >   with options: 'id' and 'size'
> >   For dimm frontend:
> >   option "size" became readonly, pulling it's size from attached backend
> >   added option "memdev" for specifying backend by 'id'
> > 
> > * Added Q35 support
> > * Added support for 32 bit guests
> > * build for i386 emulator (not tested)
> 
> OK so a large patchset so did not review yet.  One question
> due to the dependency on bios honouring etc/reserved-memory-end: is
> there some way to detect old BIOS and fail memory hotplug?
version negotiation between ASL and hardware could be used to that effect.

QEMU could start with present but disabled memory hotplug and if
QEMU and BIOS ASL could come in agreement that both support it in
compatible way, it could be enabled.

> 
> > ---
> > 
> > This series allows to hotplug 'arbitrary' DIMM devices specifying size,
> > NUMA node mapping (guest side), slot and address where to map it, at 
> > runtime.
> > 
> > Due to ACPI limitation there is need to specify a number of possible
> > DIMM devices. For this task -m option was extended to support
> > following format:
> > 
> >   -m [mem=]RamSize[,slots=N,maxmem=M]
> > 
> > To allow memory hotplug user must specify a pair of additional parameters:
> > 'slots' - number of possible increments
> > 'maxmem' - max possible total memory size QEMU is allowed to use,
> >including RamSize.
> > 
> > minimal monitor command syntax to hotplug DIMM device:
> > 
> >   memdev-add id=memX,size=1G
> >   device_add dimm,id=dimmX,memdev=memX
> > 
> > DIMM device provides following properties that could be used with
> > device_add / -device to alter default behavior:
> > 
> >   id- unique string identifying device [mandatory]
> >   slot  - number in range [0-slots) [optional], if not specified
> >   the first free slot is used
> >   node  - NUMA node id [optional] (default: 0)
> >   size  - amount of memory to add, readonly derived from backing memdev
> >   start - guest's physical address where to plug DIMM [optional],
> >   if not specified the first gap in hotplug memory region
> >   that fits DIMM is used
> > 
> >  -device option could be used for adding potentially hotunplugable DIMMs
> > and also for specifying hotplugged DIMMs in migration case.
> > 
> > Tested guests:
> >  - RHEL 6x64
> >  - Windows 2012DCx64
> >  - Windows 2008DCx64
> >  - Windows 2008DCx32
> > 
> > Known limitations/bugs/TODOs:
> >  - only hot-add supported
> >  - max number of supported DIMM devices 255 (due to ACPI object name
> >limit), could be increased creating several containers and putting
> >DIMMs there. (exercise for future) 
> >  - failed hotplug action consumes 1 slot (device_add doesn't delete
> >device properly if realize failed)
> >  - e820 table doesn't include DIMM devices added with -device /
> >(or after reboot devices added with device_add)
> >  - Windows 2008 remembers DIMM configuration, so if DIMM with other
> >start/size is added into the same slot, it refuses to use it insisting
> >on old mapping.
> > 
> > Series is based on mst's PCI tree and requires following SeaBIOS patch:
> >   http://patchwork.ozlabs.org/patch/292614/
> >   on top of patches to load ACPI tables from QEMU
> >   working SeaBIOS tree for testing is available at:
> > https://github.com/imammedo/seabios/commits/memhp-wip
> > 
> > QEMU git tree for testing is available at:
> >   https://github.com/imammedo/qemu/commits/memory-hotplug-v7
> > 
> > Example QEMU cmd line:
> >   qemu-system-x86_64 -enable-kvm -monitor unix:/tmp/mon,server,nowait \ 
> >  -m 4096,slots=4,maxmem=8G -L /custome_seabios guest.img
> > 
> > PS:
> >   Windows guest requires SRAT table for hotplug to work so add extra option:
> >-numa node
> >   to QEMU command line.
> > 
> > 
> > Igor Mammedov (26):
> >   acpi: factor out common pm_update_sci() into acpi core
> >   rename pci_hotplug_fn to hotplug_fn and make it available for other
> > devices
> >   pc: add 'etc/reserved-memory-end' fw_cfg interface for SeaBIOS
> >   vl: convert -m to qemu_opts_parse()
> >   qapi: add SIZE type parser to string_input_visitor
> >   get reference to /backend container via qemu_get_backend()
> >   add memdev backend infrastructure
> >   dimm: map DimmDevice into DimBus provided address space
> >   dimm: add busy slot check and slot auto-allocation
> >   dimm: add busy address check and address auto-allocation
> >   dimm: add hotplug callback to DimmBus
> >   acpi: memory hotplug ACPI hardware implementation
> >   acpi: initialize memory hotplug ACPI PIIX4 hardware
> >

Re: [Qemu-devel] [PATCH for-1.7] s390x: fix flat rom load on 32 bit systems

2013-11-21 Thread Michael S. Tsirkin

On Thu, Nov 21, 2013 at 01:24:13PM +0100, Cornelia Huck wrote:
> On Thu, 21 Nov 2013 14:08:22 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > pc-bios/s390-zipl.rom is a flat image so it's expected that
> > loading it as elf will fail.
> > It should fall back on loading a flat file, but doesn't
> > on 32 bit systems, instead it fails printing:
> > qemu: hardware error: could not load bootloader 's390-zipl.rom'
> > 
> > The result is boot failure.
> > 
> > The reason is that a 64 bit unsigned interger which is set
> > to -1 on error is compared to -1UL which on a 32 bit system
> > with gcc is a 32 bit unsigned interger.
> > Since both are unsigned, no sign extension takes place and
> > comparison evaluates to non-equal.
> > 
> > There's no reason to do clever tricks: -1 will cause
> > sign extension to happen correctly automatically.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  hw/s390x/ipl.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
> > index d69adb2..88115e9 100644
> > --- a/hw/s390x/ipl.c
> > +++ b/hw/s390x/ipl.c
> > @@ -80,7 +80,7 @@ static int s390_ipl_init(SysBusDevice *dev)
> > 
> >  bios_size = load_elf(bios_filename, NULL, NULL, &ipl->start_addr, 
> > NULL,
> >   NULL, 1, ELF_MACHINE, 0);
> > -if (bios_size == -1UL) {
> > +if (bios_size == -1) {
> >  bios_size = load_image_targphys(bios_filename, 
> > ZIPL_IMAGE_START,
> >  4096);
> >  ipl->start_addr = ZIPL_IMAGE_START;
> 
> Makes sense, but doesn't the kernel loader just below suffer from just
> the same problem?

Yes, v2 fixes this.

[Qemu-devel] [PATCH 1/3] KVM: x86: fix typo in KVM_GET_XCRS

2013-11-21 Thread Gleb Natapov

From: Paolo Bonzini 

Only the first item of the array was ever looked at.  No
practical effect, but still worth fixing.

Signed-off-by: Paolo Bonzini 
Signed-off-by: Gleb Natapov 
---
 target-i386/kvm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 749aa09..27071e3 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1314,8 +1314,8 @@ static int kvm_get_xcrs(X86CPU *cpu)
 
 for (i = 0; i < xcrs.nr_xcrs; i++) {
 /* Only support xcr0 now */
-if (xcrs.xcrs[0].xcr == 0) {
-env->xcr0 = xcrs.xcrs[0].value;
+if (xcrs.xcrs[i].xcr == 0) {
+env->xcr0 = xcrs.xcrs[i].value;
 break;
 }
 }
-- 
1.8.4.rc3

Re: [Qemu-devel] [PATCH v3 5/6] qemu-iotests: Force qcow2 in error path test in 048

2013-11-21 Thread Stefan Hajnoczi

On Wed, Nov 20, 2013 at 03:44:16PM +0800, Fam Zheng wrote:
> The "raw" doesn't always work on certain file systems (e.g. tmpfs). Use
> qcow2 to make the allocation status explicit.
> 
> Signed-off-by: Fam Zheng 
> ---
>  tests/qemu-iotests/048 | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/048 b/tests/qemu-iotests/048
> index 9def7fc..0fb5d2c 100755
> --- a/tests/qemu-iotests/048
> +++ b/tests/qemu-iotests/048
> @@ -88,7 +88,13 @@ event = "read_aio"
>  errno = "5"
>  once ="off"
>  EOF
> -_make_test_img $size
> +if [ "$IMGFMT" = "raw" ]; then
> +# For raw the allocation status for new image depends on file system, 
> force
> +# qcow2 in this case, since the tested code is in block layer
> +IMGFMT=qcow2 _make_test_img $size | _filter_imgfmt

If it can't be tested, skip the test.  Don't silently test something
else.

Stefan

[Qemu-devel] [PATCH for-1.7 5/5] acpi unit-test: compare DSDT and SSDT tables against expected values

2013-11-21 Thread Marcel Apfelbaum

This test will run only if iasl is installed on the host machine.
The test plan:
 1. Dumps the ACPI tables as AML on the disk.
 2. Runs iasl to disassembly the tables into ASL files.
 3. Compares them with expected offline ASL files.

The test runs for both default machine and q35.
In case the test fails, it can be easily tweaked to
show the differences between the ASL files and
understand the issue.

Signed-off-by: Marcel Apfelbaum 
---
 tests/acpi-test.c | 231 +-
 1 file changed, 211 insertions(+), 20 deletions(-)

diff --git a/tests/acpi-test.c b/tests/acpi-test.c
index ca83b1d..d08720d 100644
--- a/tests/acpi-test.c
+++ b/tests/acpi-test.c
@@ -18,14 +18,26 @@
 #include "qemu/compiler.h"
 #include "hw/i386/acpi-defs.h"
 
+#define MACHINE_PC "pc"
+#define MACHINE_Q35 "q35"
+
 /* DSDT and SSDTs format */
 typedef struct {
 AcpiTableHeader header;
-uint8_t *aml;
-int aml_len;
-} AcpiSdtTable;
+gchar *aml;/* aml bytecode from guest */
+gsize aml_len;
+gchar *aml_file;
+gchar *asl;/* asl code generated from aml */
+gsize asl_len;
+gchar *asl_file;
+gchar *exp_aml;/* expected asl from test file */
+gsize exp_aml_len;
+gchar *exp_asl;/* expected asl from test file */
+gsize exp_asl_len;
+} QEMU_PACKED AcpiSdtTable;
 
 typedef struct {
+const char *machine;
 uint32_t rsdp_addr;
 AcpiRsdpDescriptor rsdp_table;
 AcpiRsdtDescriptorRev1 rsdt_table;
@@ -33,8 +45,7 @@ typedef struct {
 AcpiFacsDescriptorRev1 facs_table;
 uint32_t *rsdt_tables_addr;
 int rsdt_tables_nr;
-AcpiSdtTable dsdt_table;
-GArray *ssdt_tables;
+GArray *ssdt_tables; /* first is DSDT */
 } test_data;
 
 #define LOW(x) ((x) & 0xff)
@@ -91,8 +102,10 @@ typedef struct {
 
 /* Boot sector code: write SIGNATURE into memory,
  * then halt.
+ * Q35 machine requires a minimum 0x7e000 bytes disk.
+ * (bug or feature?)
  */
-static uint8_t boot_sector[0x200] = {
+static uint8_t boot_sector[0x7e000] = {
 /* 7c00: mov $0xdead,%ax */
 [0x00] = 0xb8,
 [0x01] = LOW(SIGNATURE),
@@ -117,17 +130,37 @@ static uint8_t boot_sector[0x200] = {
 };
 
 static const char *disk = "tests/acpi-test-disk.raw";
+static const char *data_dir = "tests/acpi-test-data";
 
 static void free_test_data(test_data *data)
 {
+AcpiSdtTable *temp;
 int i;
 
 g_free(data->rsdt_tables_addr);
+
 for (i = 0; i < data->ssdt_tables->len; ++i) {
-g_free(g_array_index(data->ssdt_tables, AcpiSdtTable, i).aml);
+temp = &g_array_index(data->ssdt_tables, AcpiSdtTable, i);
+if (temp->aml) {
+g_free(temp->aml);
+}
+if (temp->aml_file) {
+unlink(temp->aml_file);
+g_free(temp->aml_file);
+}
+if (temp->asl) {
+g_free(temp->asl);
+}
+if (temp->asl_file) {
+unlink(temp->asl_file);
+g_free(temp->asl_file);
+}
+if (temp->exp_asl) {
+g_free(temp->exp_asl);
+}
 }
+
 g_array_free(data->ssdt_tables, false);
-g_free(data->dsdt_table.aml);
 }
 
 static uint8_t acpi_checksum(const uint8_t *data, int len)
@@ -292,34 +325,178 @@ static void test_dst_table(AcpiSdtTable *sdt_table, 
uint32_t addr)
 ACPI_READ_ARRAY_PTR(sdt_table->aml, sdt_table->aml_len, addr);
 
 checksum = acpi_checksum((uint8_t *)sdt_table, sizeof(AcpiTableHeader)) +
-   acpi_checksum(sdt_table->aml, sdt_table->aml_len);
+   acpi_checksum((uint8_t *)sdt_table->aml, sdt_table->aml_len);
 g_assert(!checksum);
 }
 
 static void test_acpi_dsdt_table(test_data *data)
 {
-AcpiSdtTable *dsdt_table = &data->dsdt_table;
+AcpiSdtTable dsdt_table;
 uint32_t addr = data->fadt_table.dsdt;
 
-test_dst_table(dsdt_table, addr);
-g_assert_cmphex(dsdt_table->header.signature, ==, ACPI_DSDT_SIGNATURE);
+memset(&dsdt_table, 0, sizeof(dsdt_table));
+data->ssdt_tables = g_array_new(false, true, sizeof(AcpiSdtTable));
+
+test_dst_table(&dsdt_table, addr);
+g_assert_cmphex(dsdt_table.header.signature, ==, ACPI_DSDT_SIGNATURE);
+
+/* Place DSDT first */
+g_array_append_val(data->ssdt_tables, dsdt_table);
 }
 
 static void test_acpi_ssdt_tables(test_data *data)
 {
-GArray *ssdt_tables;
 int ssdt_tables_nr = data->rsdt_tables_nr - 1; /* fadt is first */
 int i;
 
-ssdt_tables = g_array_sized_new(false, true, sizeof(AcpiSdtTable),
-ssdt_tables_nr);
 for (i = 0; i < ssdt_tables_nr; i++) {
 AcpiSdtTable ssdt_table;
+
+memset(&ssdt_table, 0 , sizeof(ssdt_table));
 uint32_t addr = data->rsdt_tables_addr[i + 1]; /* fadt is first */
 test_dst_table(&ssdt_table, addr);
-g_array_append_val(ssdt_tables, ssdt_table);
+g_array_append_val(data->ssdt_tables, ssdt_table);
+}
+}
+
+static bool iasl_installed(void)

Re: [Qemu-devel] GIT master fails compilation for ACPI

2013-11-21 Thread Erik Rull


Hu Tao wrote:

On Thu, Nov 21, 2013 at 09:06:43AM +0100, Erik Rull wrote:

Hi all,

when doing a git clone on the latest master, it fails compiling:

   CCx86_64-softmmu/memory_mapping.o
   CCx86_64-softmmu/dump.o
   CCx86_64-softmmu/xen-stub.o
   CCx86_64-softmmu/hw/i386/multiboot.o
   CCx86_64-softmmu/hw/i386/smbios.o
   CCx86_64-softmmu/hw/i386/pc.o
   CCx86_64-softmmu/hw/i386/pc_piix.o
   CCx86_64-softmmu/hw/i386/pc_q35.o
   CCx86_64-softmmu/hw/i386/pc_sysfw.o
   CCx86_64-softmmu/hw/i386/kvmvapic.o
   CPP x86_64-softmmu/acpi-dsdt.dsl.i.orig
   ACPI_PREPROCESS x86_64-softmmu/acpi-dsdt.dsl.i
   IASL x86_64-softmmu/acpi-dsdt.dsl.i
make[1]: *** [hw/i386/acpi-dsdt.hex] Error 1
make: *** [subdir-x86_64-softmmu] Error 2
erik@debian:~/qemu-test/qemu$


My configure options are:
./configure --prefix= --target-list=x86_64-softmmu --disable-vnc-png


Are you sure about --prefix= and...


--disable-vnc-jpeg --disable-vnc-tls --disable-vnc-sasl  --audio-drv-list=


... --audio-drv-list= ?


--enable-sdl --disable-xen --disable-brlapi --disable-bluez --disable-curl
--disable-guest-agent --disable-spice


I can't even do a configure with exactly the above options, it gives an error 
message:

   ERROR: unknown option --disable-vnc-sasl

While removing --prefix= and --audio-drv-list=, the compilation goes
fine.



No, sorry, the error is still the same! I did a make distclean and redid 
the configure without the options you pointed out.


Any further ideas? Are there more possibilities to debug why the 
compilation fails?


Best regards,

Erik

[Qemu-devel] [RFC PATCH 1/2] e1000: Use Address_Available bit as HW latch

2013-11-21 Thread Vlad Yasevich

e1000 provides a E1000_RAH_AV bit on every complete write
to the Receive Address Register.  We can use this bit
2 ways:
 1) To trigger HMP notifications.  When the bit is set the
mac address is fully set and we can update the HMP.

 2) We can turn off he bit on the write to low order bits of
the Receive Address Register, so that we would not try
to match received traffic to this address when it is
not completely set.

Signed-off-by: Vlad Yasevich 
---
 hw/net/e1000.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index ae63591..82978ea 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1106,10 +1106,19 @@ mac_writereg(E1000State *s, int index, uint32_t val)
 
 s->mac_reg[index] = val;
 
-if (index == RA || index == RA + 1) {
+switch (index) {
+case RA:
+/* Mask off AV bit on the write of the low dword.  The write of
+ * the high dword will set the bit.  This way a half-written
+ * mac address will not be used to filter on rx.
+ */
+s->mac_reg[RA+1] &= ~E1000_RAH_AV;
+break;
+case (RA + 1):
 macaddr[0] = cpu_to_le32(s->mac_reg[RA]);
 macaddr[1] = cpu_to_le32(s->mac_reg[RA + 1]);
 qemu_format_nic_info_str(qemu_get_queue(s->nic), (uint8_t *)macaddr);
+break;
 }
 }
 
-- 
1.8.4.2

[Qemu-devel] [RFC PATCH 0/2] Update HMP only upon mac change completion.

2013-11-21 Thread Vlad Yasevich

Recent threads regarding e1000/rtl8139 and mac address change notifications
prompted some research into the respecitive hw data sheets as well as
available drivers.  What I found is that each hw has a mechanism that
can be used by our emulation layer to determine when the mac address
change has completed (when the OS finished writing the mac address),
and we can use these mechanisms to trigger HMP notifications.

This is only an RFC series.  It's been tested and works well.
I've split e1000 and rtl8139 changes as they are sufficiently
different.  e1000 make this very clean and easy, but rtl8139
isn't as nice.

Please take a look and I'd like to hear your comments.

Thanks
-vlad

Vlad Yasevich (2):
  e1000: Use Address_Available bit as HW latch
  rtl8139: update HMP only when the address is fully written

 hw/i386/pc_piix.c|  4 
 hw/i386/pc_q35.c |  4 
 hw/net/e1000.c   | 11 ++-
 hw/net/rtl8139.c | 50 +-
 include/hw/i386/pc.h |  8 
 5 files changed, 75 insertions(+), 2 deletions(-)

-- 
1.8.4.2

Re: [Qemu-devel] [PATCH for-1.7] qdev-properties-system.c: Allow vlan or netdev for -device, not both

2013-11-21 Thread Paolo Bonzini

Il 21/11/2013 20:47, Vlad Yasevich ha scritto:
> On 11/11/2013 02:50 AM, Paolo Bonzini wrote:
>> Il 11/11/2013 06:18, Jason Wang ha scritto:
>>> On 11/08/2013 10:13 AM, Vlad Yasevich wrote:
 It is currently possible to specify things like:
-device e1000,netdev=foo,vlan=1
 With this usage, whichever argument was specified last (vlan or netdev)
 overwrites what was previousely set and results in a non-working
 configuration.  Even worse, when used with multiqueue devices,
 it causes a segmentation fault on exit in qemu_free_net_client.

 That patch treates the above command line options as invalid and
 generates an error at start-up.

 Signed-off-by: Vlad Yasevich 
 ---
  hw/core/qdev-properties-system.c | 9 +
  1 file changed, 9 insertions(+)

 diff --git a/hw/core/qdev-properties-system.c 
 b/hw/core/qdev-properties-system.c
 index 0eada32..729efa8 100644
 --- a/hw/core/qdev-properties-system.c
 +++ b/hw/core/qdev-properties-system.c
 @@ -205,6 +205,11 @@ static int parse_netdev(DeviceState *dev, const char 
 *str, void **ptr)
  goto err;
  }
  
 +if (ncs[i]) {
 +ret = -EINVAL;
 +goto err;
 +}
 +
  ncs[i] = peers[i];
  ncs[i]->queue_index = i;
  }
 @@ -301,6 +306,10 @@ static void set_vlan(Object *obj, Visitor *v, void 
 *opaque,
  *ptr = NULL;
  return;
  }
 +if (*ptr) {
 +error_set_from_qdev_prop_error(errp, -EINVAL, dev, prop, name);
 +return;
 +}
  
  hubport = net_hub_port_find(id);
  if (!hubport) {
>>>
>>> Acked-by: Jason Wang 
>>
>> This patch is good for 1.7.
>>
>> Paolo
>>
> 
> Hi All
> 
> Just wondering what's to become of this patch?  It has an ACK, but has
> been abandoned.  It does a fix a crash in qemu.

I missed this when going through pending patches.  Stefan, are you
picking it up?

Paolo

Re: [Qemu-devel] [PATCH 1/2] block: Enable BDRV_O_SNAPSHOT with driver-specific options

2013-11-21 Thread Kevin Wolf

Am 21.11.2013 um 15:33 hat Stefan Hajnoczi geschrieben:
> On Tue, Nov 19, 2013 at 04:37:27PM +0100, Kevin Wolf wrote:
> > @@ -1114,6 +1093,24 @@ int bdrv_open(BlockDriverState *bs, const char 
> > *filename, QDict *options,
> >  goto fail;
> >  }
> >  
> > +/* Prepare a new options QDict for the temporary file, where user
> > + * options refer to the backing file */
> > +if (!options) {
> > +options = qdict_new();
> > +}
> 
> You can drop this because options is never NULL:
> 
>   options = qdict_clone_shallow(options);
> 
>   /* For snapshot=on, create a temporary qcow2 overlay */
>   if (flags & BDRV_O_SNAPSHOT) {
>   ...

You're right, I'll remove it in v2.

> > +if (filename) {
> > +qdict_put(options, "file.filename", 
> > qstring_from_str(filename));
> > +}
> > +if (drv) {
> > +qdict_put(options, "driver", 
> > qstring_from_str(drv->format_name));
> > +}
> > +
> > +snapshot_options = qdict_new();
> > +qdict_put(snapshot_options, "backing", options);
> > +qdict_flatten(snapshot_options);
> > +
> > +options = snapshot_options;
> 
> One thing I'm not sure about after these operations have been performed:
> 
>   bs->options = options;
>   ...
>   if (flags & BDRV_O_SNAPSHOT) {
>   ...
> 
> So bs->options does not reflect what we ended up with in the
> BDRV_O_SNAPSHOT case.
> 
> But git grep -- '->options' shows no users of this field.  Therefore it
> won't cause a problem yet.  But can you explain what's going on here?
> Either we should keep bs->options up-to-date or we should drop the
> field.

I think bs->options was meant to be used in cases where we reopen a BDS.
If it's currently unused, I guess this means we have some bugs. :-)

Should bs->options be the original options passed by the user or the
ones modified by BDRV_O_SNAPSHOT? I'm not completely sure, but I think
the modified ones make more sense; it would also make it easier to move
the whole BDRV_O_SNAPSHOT logic out to drive_init (which I plan to do in
a next step).

Kevin

[Qemu-devel] [PULL 11/11] qga: Fix compiler warnings (missing format attribute, wrong format strings)

2013-11-21 Thread Paolo Bonzini

From: Stefan Weil 

gcc 4.8.2 reports this warning when extra warnings are enabled (-Wextra):

  CCqga/commands.o
qga/commands.c: In function ‘slog’:
qga/commands.c:28:5: error:
 function might be possible candidate for ‘gnu_printf’ format attribute 
[-Werror=suggest-attribute=format]
 g_logv("syslog", G_LOG_LEVEL_INFO, fmt, ap);
 ^

gcc 4.8.2 reports this warning when slog is declared with the
gnu_printf format attribute:

qga/commands-posix.c: In function ‘qmp_guest_file_open’:
qga/commands-posix.c:404:5: warning:
 format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘int64_t’ 
[-Wformat=]
 slog("guest-file-open, handle: %d", handle);
 ^

On 32 bit hosts there are three more warnings which are also fixed here.

Signed-off-by: Stefan Weil 
Signed-off-by: Paolo Bonzini 
---
 qga/commands-posix.c   | 8 
 qga/guest-agent-core.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 10682f5..8100bee 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -401,7 +401,7 @@ int64_t qmp_guest_file_open(const char *path, bool 
has_mode, const char *mode, E
 return -1;
 }
 
-slog("guest-file-open, handle: %d", handle);
+slog("guest-file-open, handle: %" PRId64, handle);
 return handle;
 }
 
@@ -410,7 +410,7 @@ void qmp_guest_file_close(int64_t handle, Error **err)
 GuestFileHandle *gfh = guest_file_handle_find(handle, err);
 int ret;
 
-slog("guest-file-close called, handle: %ld", handle);
+slog("guest-file-close called, handle: %" PRId64, handle);
 if (!gfh) {
 return;
 }
@@ -451,7 +451,7 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, 
bool has_count,
 read_count = fread(buf, 1, count, fh);
 if (ferror(fh)) {
 error_setg_errno(err, errno, "failed to read file");
-slog("guest-file-read failed, handle: %ld", handle);
+slog("guest-file-read failed, handle: %" PRId64, handle);
 } else {
 buf[read_count] = 0;
 read_data = g_malloc0(sizeof(GuestFileRead));
@@ -496,7 +496,7 @@ GuestFileWrite *qmp_guest_file_write(int64_t handle, const 
char *buf_b64,
 write_count = fwrite(buf, 1, count, fh);
 if (ferror(fh)) {
 error_setg_errno(err, errno, "failed to write to file");
-slog("guest-file-write failed, handle: %ld", handle);
+slog("guest-file-write failed, handle: %" PRId64, handle);
 } else {
 write_data = g_malloc0(sizeof(GuestFileWrite));
 write_data->count = write_count;
diff --git a/qga/guest-agent-core.h b/qga/guest-agent-core.h
index 624a559..e422208 100644
--- a/qga/guest-agent-core.h
+++ b/qga/guest-agent-core.h
@@ -29,7 +29,7 @@ GACommandState *ga_command_state_new(void);
 bool ga_logging_enabled(GAState *s);
 void ga_disable_logging(GAState *s);
 void ga_enable_logging(GAState *s);
-void slog(const gchar *fmt, ...);
+void GCC_FMT_ATTR(1, 2) slog(const gchar *fmt, ...);
 void ga_set_response_delimited(GAState *s);
 bool ga_is_frozen(GAState *s);
 void ga_set_frozen(GAState *s);
-- 
1.8.3.1

[Qemu-devel] [PULL 01/11] sun4m: Add FCode ROM for TCX framebuffer

2013-11-21 Thread Paolo Bonzini

From: Mark Cave-Ayland 

Upstream OpenBIOS now implements SBus probing in order to determine the
contents of a physical bus slot, which is required to allow OpenBIOS to
identify the framebuffer without help from the fw_cfg interface.

SBus probing works by detecting the presence of an FCode program
(effectively tokenised Forth) at the base address of each slot, and if
present executes it so that it creates its own device node in the
OpenBIOS device tree.

The FCode ROM is generated as part of the OpenBIOS build and should
generally be updated at the same time.

Signed-off-by: Mark Cave-Ayland 
CC: Blue Swirl 
CC: Bob Breuer 
CC: Artyom Tarasenko 
Signed-off-by: Paolo Bonzini 
---
 Makefile |   2 +-
 hw/display/tcx.c |  26 +-
 hw/sparc/sun4m.c |  17 ++---
 pc-bios/QEMU,tcx.bin | Bin 0 -> 1242 bytes
 pc-bios/README   |   4 ++--
 5 files changed, 38 insertions(+), 11 deletions(-)
 create mode 100644 pc-bios/QEMU,tcx.bin

diff --git a/Makefile b/Makefile
index 3321b98..bdff4e4 100644
--- a/Makefile
+++ b/Makefile
@@ -293,7 +293,7 @@ ifdef INSTALL_BLOBS
 BLOBS=bios.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
 vgabios-stdvga.bin vgabios-vmware.bin vgabios-qxl.bin \
 acpi-dsdt.aml q35-acpi-dsdt.aml \
-ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc \
+ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin \
 pxe-e1000.rom pxe-eepro100.rom pxe-ne2k_pci.rom \
 pxe-pcnet.rom pxe-rtl8139.rom pxe-virtio.rom \
 efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
diff --git a/hw/display/tcx.c b/hw/display/tcx.c
index 24876d3..873b82c 100644
--- a/hw/display/tcx.c
+++ b/hw/display/tcx.c
@@ -25,8 +25,12 @@
 #include "qemu-common.h"
 #include "ui/console.h"
 #include "ui/pixel_ops.h"
+#include "hw/loader.h"
 #include "hw/sysbus.h"
 
+#define TCX_ROM_FILE "QEMU,tcx.bin"
+#define FCODE_MAX_ROM_SIZE 0x1
+
 #define MAXX 1024
 #define MAXY 768
 #define TCX_DAC_NREGS 16
@@ -43,6 +47,8 @@ typedef struct TCXState {
 QemuConsole *con;
 uint8_t *vram;
 uint32_t *vram24, *cplane;
+hwaddr prom_addr;
+MemoryRegion rom;
 MemoryRegion vram_mem;
 MemoryRegion vram_8bit;
 MemoryRegion vram_24bit;
@@ -529,14 +535,31 @@ static int tcx_init1(SysBusDevice *dev)
 {
 TCXState *s = TCX(dev);
 ram_addr_t vram_offset = 0;
-int size;
+int size, ret;
 uint8_t *vram_base;
+char *fcode_filename;
 
 memory_region_init_ram(&s->vram_mem, OBJECT(s), "tcx.vram",
s->vram_size * (1 + 4 + 4));
 vmstate_register_ram_global(&s->vram_mem);
 vram_base = memory_region_get_ram_ptr(&s->vram_mem);
 
+/* FCode ROM */
+memory_region_init_ram(&s->rom, NULL, "tcx.prom", FCODE_MAX_ROM_SIZE);
+vmstate_register_ram_global(&s->rom);
+memory_region_set_readonly(&s->rom, true);
+sysbus_init_mmio(dev, &s->rom);
+
+fcode_filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, TCX_ROM_FILE);
+if (fcode_filename) {
+ret = load_image_targphys(fcode_filename, s->prom_addr,
+  FCODE_MAX_ROM_SIZE);
+if (ret < 0 || ret > FCODE_MAX_ROM_SIZE) {
+fprintf(stderr, "tcx: could not load prom '%s'\n", TCX_ROM_FILE);
+return -1;
+}
+}
+
 /* 8-bit plane */
 s->vram = vram_base;
 size = s->vram_size;
@@ -598,6 +621,7 @@ static Property tcx_properties[] = {
 DEFINE_PROP_UINT16("width",TCXState, width, -1),
 DEFINE_PROP_UINT16("height",   TCXState, height,-1),
 DEFINE_PROP_UINT16("depth",TCXState, depth, -1),
+DEFINE_PROP_HEX64("prom_addr", TCXState, prom_addr, -1),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
index a0d366c..94f7950 100644
--- a/hw/sparc/sun4m.c
+++ b/hw/sparc/sun4m.c
@@ -537,24 +537,27 @@ static void tcx_init(hwaddr addr, int vram_size, int 
width,
 qdev_prop_set_uint16(dev, "width", width);
 qdev_prop_set_uint16(dev, "height", height);
 qdev_prop_set_uint16(dev, "depth", depth);
+qdev_prop_set_uint64(dev, "prom_addr", addr);
 qdev_init_nofail(dev);
 s = SYS_BUS_DEVICE(dev);
+/* FCode ROM */
+sysbus_mmio_map(s, 0, addr);
 /* 8-bit plane */
-sysbus_mmio_map(s, 0, addr + 0x0080ULL);
+sysbus_mmio_map(s, 1, addr + 0x0080ULL);
 /* DAC */
-sysbus_mmio_map(s, 1, addr + 0x0020ULL);
+sysbus_mmio_map(s, 2, addr + 0x0020ULL);
 /* TEC (dummy) */
-sysbus_mmio_map(s, 2, addr + 0x0070ULL);
+sysbus_mmio_map(s, 3, addr + 0x0070ULL);
 /* THC 24 bit: NetBSD writes here even with 8-bit display: dummy */
-sysbus_mmio_map(s, 3, addr + 0x00301000ULL);
+sysbus_mmio_map(s, 4, addr + 0x00301000ULL);
 if (depth == 24) {
 /* 24-bit plane */
-sysbus_mmio_map(s, 4, addr + 0x0200ULL);
+sysbus_mmio_map(s, 5, addr + 0x0200ULL);
 /* Control plane */
-sysbus_mmio_map(s, 5, addr + 0x0a00

Re: [Qemu-devel] [PATCH for-1.7] qdev-properties-system.c: Allow vlan or netdev for -device, not both

2013-11-21 Thread Vlad Yasevich

On 11/11/2013 02:50 AM, Paolo Bonzini wrote:
> Il 11/11/2013 06:18, Jason Wang ha scritto:
>> On 11/08/2013 10:13 AM, Vlad Yasevich wrote:
>>> It is currently possible to specify things like:
>>> -device e1000,netdev=foo,vlan=1
>>> With this usage, whichever argument was specified last (vlan or netdev)
>>> overwrites what was previousely set and results in a non-working
>>> configuration.  Even worse, when used with multiqueue devices,
>>> it causes a segmentation fault on exit in qemu_free_net_client.
>>>
>>> That patch treates the above command line options as invalid and
>>> generates an error at start-up.
>>>
>>> Signed-off-by: Vlad Yasevich 
>>> ---
>>>  hw/core/qdev-properties-system.c | 9 +
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/hw/core/qdev-properties-system.c 
>>> b/hw/core/qdev-properties-system.c
>>> index 0eada32..729efa8 100644
>>> --- a/hw/core/qdev-properties-system.c
>>> +++ b/hw/core/qdev-properties-system.c
>>> @@ -205,6 +205,11 @@ static int parse_netdev(DeviceState *dev, const char 
>>> *str, void **ptr)
>>>  goto err;
>>>  }
>>>  
>>> +if (ncs[i]) {
>>> +ret = -EINVAL;
>>> +goto err;
>>> +}
>>> +
>>>  ncs[i] = peers[i];
>>>  ncs[i]->queue_index = i;
>>>  }
>>> @@ -301,6 +306,10 @@ static void set_vlan(Object *obj, Visitor *v, void 
>>> *opaque,
>>>  *ptr = NULL;
>>>  return;
>>>  }
>>> +if (*ptr) {
>>> +error_set_from_qdev_prop_error(errp, -EINVAL, dev, prop, name);
>>> +return;
>>> +}
>>>  
>>>  hubport = net_hub_port_find(id);
>>>  if (!hubport) {
>>
>> Acked-by: Jason Wang 
> 
> This patch is good for 1.7.
> 
> Paolo
> 

Hi All

Just wondering what's to become of this patch?  It has an ACK, but has
been abandoned.  It does a fix a crash in qemu.

Thanks
-vlad

[Qemu-devel] [PATCH] seccomp: add kill() to the syscall whitelist

2013-11-21 Thread Paul Moore

The kill() syscall is triggered with the following command:

 # qemu -sandbox on -monitor stdio \
-device intel-hda -device hda-duplex -vnc :0

The resulting syslog/audit message:

 # ausearch -m SECCOMP
 
 time->Wed Nov 20 09:52:08 2013
 type=SECCOMP msg=audit(1384912328.482:6656): auid=0 uid=0 gid=0 ses=854
  subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=12087
  comm="qemu-kvm" sig=31 syscall=62 compat=0 ip=0x7f7a1d2abc67 code=0x0
 # scmp_sys_resolver 62
 kill

Reported-by: CongLi 
Tested-by: CongLi 
Signed-off-by: Paul Moore 
---
 qemu-seccomp.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index 69cee44..cf07869 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -114,6 +114,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(write), 244 },
 { SCMP_SYS(fcntl), 243 },
 { SCMP_SYS(tgkill), 242 },
+{ SCMP_SYS(kill), 242 },
 { SCMP_SYS(rt_sigaction), 242 },
 { SCMP_SYS(pipe2), 242 },
 { SCMP_SYS(munmap), 242 },

[Qemu-devel] [PATCH for-1.7 4/5] configure: added acpi unit-test files

2013-11-21 Thread Marcel Apfelbaum

Ensure configure will set-up links for the files
if the build is created in other directory.

Signed-off-by: Marcel Apfelbaum 
---
 configure | 4 
 1 file changed, 4 insertions(+)

diff --git a/configure b/configure
index 9addff1..127e287 100755
--- a/configure
+++ b/configure
@@ -4665,6 +4665,10 @@ for bios_file in \
 do
 FILES="$FILES pc-bios/`basename $bios_file`"
 done
+for test_file in `find $source_path/tests/acpi-test-data -type f`
+do
+FILES="$FILES tests/acpi-test-data`echo $test_file | sed -e 
's/.*acpi-test-data//'`"
+done
 mkdir -p $DIRS
 for f in $FILES ; do
 if [ -e "$source_path/$f" ] && [ "$source_path" != `pwd` ]; then
-- 
1.8.3.1

[Qemu-devel] [PATCH for-1.7 1/5] acpi unit-test: load and check facs table

2013-11-21 Thread Marcel Apfelbaum

FACS table does not have a checksum, so we can
check at least the signature (existence).

Signed-off-by: Marcel Apfelbaum 
---
 tests/acpi-test.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/tests/acpi-test.c b/tests/acpi-test.c
index d6ff66f..43775cd 100644
--- a/tests/acpi-test.c
+++ b/tests/acpi-test.c
@@ -30,6 +30,7 @@ typedef struct {
 AcpiRsdpDescriptor rsdp_table;
 AcpiRsdtDescriptorRev1 rsdt_table;
 AcpiFadtDescriptorRev1 fadt_table;
+AcpiFacsDescriptorRev1 facs_table;
 uint32_t *rsdt_tables_addr;
 int rsdt_tables_nr;
 AcpiSdtTable dsdt_table;
@@ -252,6 +253,22 @@ static void test_acpi_fadt_table(test_data *data)
 g_assert(!acpi_checksum((uint8_t *)fadt_table, fadt_table->length));
 }
 
+static void test_acpi_facs_table(test_data *data)
+{
+AcpiFacsDescriptorRev1 *facs_table = &data->facs_table;
+uint32_t addr = data->fadt_table.firmware_ctrl;
+
+ACPI_READ_FIELD(facs_table->signature, addr);
+ACPI_READ_FIELD(facs_table->length, addr);
+ACPI_READ_FIELD(facs_table->hardware_signature, addr);
+ACPI_READ_FIELD(facs_table->firmware_waking_vector, addr);
+ACPI_READ_FIELD(facs_table->global_lock, addr);
+ACPI_READ_FIELD(facs_table->flags, addr);
+ACPI_READ_ARRAY(facs_table->resverved3, addr);
+
+g_assert_cmphex(facs_table->signature, ==, ACPI_FACS_SIGNATURE);
+}
+
 static void test_dst_table(AcpiSdtTable *sdt_table, uint32_t addr)
 {
 uint8_t checksum;
@@ -329,6 +346,7 @@ static void test_acpi_one(const char *params)
 test_acpi_rsdp_table(&data);
 test_acpi_rsdt_table(&data);
 test_acpi_fadt_table(&data);
+test_acpi_facs_table(data);
 test_acpi_dsdt_table(&data);
 test_acpi_ssdt_tables(&data);
 
-- 
1.8.3.1

[Qemu-devel] [PATCH for-1.7 0/5] acpi unit-test: added tests

2013-11-21 Thread Marcel Apfelbaum

Added 2 tests:
 1. Basic check of FACS table (missed on prev submission)
 2. Compare DSDT and SSDT tables against expected values

Test 2:
 - runs only if iasl is installed on the host machine.
 - the test plan:
   1. Dumps the ACPI tables as AML on the disk.
   2. Runs iasl to disassembly the tables into ASL files.
   3. Compares them with expected offline ASL files.

 - the test runs for both default machine and q35.
 - in case the test fails, it can be easily tweaked to
   show the differences between the ASL files and
   understand the issue.

Patches:
 1/5 - test 1
 2/5 - some infrastructure improvements
 3/5 - expected asl files for test 2
 4/5 - creates links for the expected files
   if the build directory is not current
 5/5 - test 2

Marcel Apfelbaum (5):
  acpi unit-test: load and check facs table
  acpi unit-test: adjust the test data structure for better handling
  acpi unit-test: add test files
  configure: added acpi unit-test files
  acpi unit-test: compare DSDT and SSDT tables against expected values

 configure |4 +
 tests/acpi-test-data/pc/APIC.dsl  |  103 ++
 tests/acpi-test-data/pc/DSDT.dsl  | 1870 ++
 tests/acpi-test-data/pc/FACP.dsl  |   99 ++
 tests/acpi-test-data/pc/FACS.dsl  |   32 +
 tests/acpi-test-data/pc/HPET.dsl  |   43 +
 tests/acpi-test-data/pc/SSDT.dsl  |  634 
 tests/acpi-test-data/q35/APIC.dsl |  103 ++
 tests/acpi-test-data/q35/DSDT.dsl | 3197 +
 tests/acpi-test-data/q35/FACP.dsl |   99 ++
 tests/acpi-test-data/q35/FACS.dsl |   32 +
 tests/acpi-test-data/q35/HPET.dsl |   43 +
 tests/acpi-test-data/q35/MCFG.dsl |   36 +
 tests/acpi-test-data/q35/SSDT.dsl |  665 
 tests/acpi-test.c |  282 +++-
 15 files changed, 7210 insertions(+), 32 deletions(-)
 create mode 100644 tests/acpi-test-data/pc/APIC.dsl
 create mode 100644 tests/acpi-test-data/pc/DSDT.dsl
 create mode 100644 tests/acpi-test-data/pc/FACP.dsl
 create mode 100644 tests/acpi-test-data/pc/FACS.dsl
 create mode 100644 tests/acpi-test-data/pc/HPET.dsl
 create mode 100644 tests/acpi-test-data/pc/SSDT.dsl
 create mode 100644 tests/acpi-test-data/q35/APIC.dsl
 create mode 100644 tests/acpi-test-data/q35/DSDT.dsl
 create mode 100644 tests/acpi-test-data/q35/FACP.dsl
 create mode 100644 tests/acpi-test-data/q35/FACS.dsl
 create mode 100644 tests/acpi-test-data/q35/HPET.dsl
 create mode 100644 tests/acpi-test-data/q35/MCFG.dsl
 create mode 100644 tests/acpi-test-data/q35/SSDT.dsl

-- 
1.8.3.1

[Qemu-devel] [PATCH for-1.7 2/5] acpi unit-test: adjust the test data structure for better handling

2013-11-21 Thread Marcel Apfelbaum

Ensure more then one instance of test_data may exist
at a given time. It will help to compare different
acpi table versions.

Signed-off-by: Marcel Apfelbaum 
---
 tests/acpi-test.c | 55 ---
 1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/tests/acpi-test.c b/tests/acpi-test.c
index 43775cd..ca83b1d 100644
--- a/tests/acpi-test.c
+++ b/tests/acpi-test.c
@@ -34,7 +34,7 @@ typedef struct {
 uint32_t *rsdt_tables_addr;
 int rsdt_tables_nr;
 AcpiSdtTable dsdt_table;
-AcpiSdtTable *ssdt_tables;
+GArray *ssdt_tables;
 } test_data;
 
 #define LOW(x) ((x) & 0xff)
@@ -118,6 +118,18 @@ static uint8_t boot_sector[0x200] = {
 
 static const char *disk = "tests/acpi-test-disk.raw";
 
+static void free_test_data(test_data *data)
+{
+int i;
+
+g_free(data->rsdt_tables_addr);
+for (i = 0; i < data->ssdt_tables->len; ++i) {
+g_free(g_array_index(data->ssdt_tables, AcpiSdtTable, i).aml);
+}
+g_array_free(data->ssdt_tables, false);
+g_free(data->dsdt_table.aml);
+}
+
 static uint8_t acpi_checksum(const uint8_t *data, int len)
 {
 int i;
@@ -295,30 +307,30 @@ static void test_acpi_dsdt_table(test_data *data)
 
 static void test_acpi_ssdt_tables(test_data *data)
 {
-AcpiSdtTable *ssdt_tables;
+GArray *ssdt_tables;
 int ssdt_tables_nr = data->rsdt_tables_nr - 1; /* fadt is first */
 int i;
 
-ssdt_tables = g_new0(AcpiSdtTable, ssdt_tables_nr);
+ssdt_tables = g_array_sized_new(false, true, sizeof(AcpiSdtTable),
+ssdt_tables_nr);
 for (i = 0; i < ssdt_tables_nr; i++) {
-AcpiSdtTable *ssdt_table = &ssdt_tables[i];
+AcpiSdtTable ssdt_table;
 uint32_t addr = data->rsdt_tables_addr[i + 1]; /* fadt is first */
-
-test_dst_table(ssdt_table, addr);
+test_dst_table(&ssdt_table, addr);
+g_array_append_val(ssdt_tables, ssdt_table);
 }
 data->ssdt_tables = ssdt_tables;
 }
 
-static void test_acpi_one(const char *params)
+static void test_acpi_one(const char *params, test_data *data)
 {
 char *args;
 uint8_t signature_low;
 uint8_t signature_high;
 uint16_t signature;
 int i;
-test_data data;
 
-memset(&data, 0, sizeof(data));
+memset(data, 0, sizeof(*data));
 args = g_strdup_printf("-net none -display none %s %s",
params ? params : "", disk);
 qtest_start(args);
@@ -342,20 +354,13 @@ static void test_acpi_one(const char *params)
 }
 g_assert_cmphex(signature, ==, SIGNATURE);
 
-test_acpi_rsdp_address(&data);
-test_acpi_rsdp_table(&data);
-test_acpi_rsdt_table(&data);
-test_acpi_fadt_table(&data);
+test_acpi_rsdp_address(data);
+test_acpi_rsdp_table(data);
+test_acpi_rsdt_table(data);
+test_acpi_fadt_table(data);
 test_acpi_facs_table(data);
-test_acpi_dsdt_table(&data);
-test_acpi_ssdt_tables(&data);
-
-g_free(data.rsdt_tables_addr);
-for (i = 0; i < (data.rsdt_tables_nr - 1); ++i) {
-g_free(data.ssdt_tables[i].aml);
-}
-g_free(data.ssdt_tables);
-g_free(data.dsdt_table.aml);
+test_acpi_dsdt_table(data);
+test_acpi_ssdt_tables(data);
 
 qtest_quit(global_qtest);
 g_free(args);
@@ -363,10 +368,14 @@ static void test_acpi_one(const char *params)
 
 static void test_acpi_tcg(void)
 {
+test_data data;
+
 /* Supplying -machine accel argument overrides the default (qtest).
  * This is to make guest actually run.
  */
-test_acpi_one("-machine accel=tcg");
+test_acpi_one("-machine accel=tcg", &data);
+
+free_test_data(&data);
 }
 
 int main(int argc, char *argv[])
-- 
1.8.3.1

[Qemu-devel] [PATCH 3/6] qemu-option: Add qemu_config_parse_qdict()

2013-11-21 Thread Max Reitz

This function basically parses command-line options given as a QDict
replacing a config file.

For instance, the QDict {"section.opt1": 42, "section.opt2": 23}
corresponds to the config file:

[section]
opt1 = 42
opt2 = 23

It is possible to specify multiple sections and also multiple sections
of the same type. On the command line, this looks like the following:

inject-error.0.event=reftable_load,\
inject-error.1.event=l2_load,\
set-state.event=l1_update

This would correspond to the following config file:

[inject-error "inject-error.0"]
event = reftable_load

[inject-error "inject-error.1"]
event = l2_load

[set-state]
event = l1_update

Signed-off-by: Max Reitz 
---
 include/qemu/config-file.h |   6 +++
 util/qemu-config.c | 111 +
 2 files changed, 117 insertions(+)

diff --git a/include/qemu/config-file.h b/include/qemu/config-file.h
index 508428f..dbd97c4 100644
--- a/include/qemu/config-file.h
+++ b/include/qemu/config-file.h
@@ -4,6 +4,7 @@
 #include 
 #include "qemu/option.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
 
 QemuOptsList *qemu_find_opts(const char *group);
 QemuOptsList *qemu_find_opts_err(const char *group, Error **errp);
@@ -18,6 +19,11 @@ int qemu_config_parse(FILE *fp, QemuOptsList **lists, const 
char *fname);
 
 int qemu_read_config_file(const char *filename);
 
+/* Parse QDict options as a replacement for a config file (allowing multiple
+   enumerated (0..(n-1)) configuration "sections") */
+void qemu_config_parse_qdict(QDict *options, QemuOptsList **lists,
+ Error **errp);
+
 /* Read default QEMU config files
  */
 int qemu_read_default_config_files(bool userconfig);
diff --git a/util/qemu-config.c b/util/qemu-config.c
index 04da942..80d82d4 100644
--- a/util/qemu-config.c
+++ b/util/qemu-config.c
@@ -356,3 +356,114 @@ int qemu_read_config_file(const char *filename)
 return -EINVAL;
 }
 }
+
+static void config_parse_qdict_section(QDict *options, QemuOptsList *opts,
+   Error **errp)
+{
+QemuOpts *subopts, *subopts_i;
+QDict *subqdict, *subqdict_i = NULL;
+char *prefix = g_strdup_printf("%s.", opts->name);
+Error *local_err = NULL;
+size_t orig_size, enum_size;
+
+qdict_extract_subqdict(options, &subqdict, prefix);
+orig_size = qdict_size(subqdict);
+if (!orig_size) {
+goto out;
+}
+
+subopts = qemu_opts_create_nofail(opts);
+qemu_opts_absorb_qdict(subopts, subqdict, &local_err);
+if (error_is_set(&local_err)) {
+error_propagate(errp, local_err);
+goto out;
+}
+
+enum_size = qdict_size(subqdict);
+if (enum_size < orig_size && enum_size) {
+error_setg(errp, "Unknown option '%s' for '%s'",
+   qdict_first(subqdict)->key, opts->name);
+goto out;
+}
+
+if (enum_size) {
+/* Multiple, enumerated rules */
+int i;
+
+/* Not required anymore */
+qemu_opts_del(subopts);
+
+for (i = 0; i < INT_MAX; i++) {
+char i_prefix[32], opt_name[48];
+size_t snprintf_ret;
+
+snprintf_ret = snprintf(i_prefix, 32, "%i.", i);
+assert(snprintf_ret < 32);
+
+snprintf_ret = snprintf(opt_name, 48, "%s.%i", opts->name, i);
+assert(snprintf_ret < 48);
+
+qdict_extract_subqdict(subqdict, &subqdict_i, i_prefix);
+if (!qdict_size(subqdict_i)) {
+break;
+}
+
+subopts_i = qemu_opts_create(opts, opt_name, 1, NULL);
+if (!subopts_i) {
+error_setg(errp, "Option ID '%s' already in use", opt_name);
+goto out;
+}
+
+qemu_opts_absorb_qdict(subopts_i, subqdict_i, &local_err);
+if (error_is_set(&local_err)) {
+error_propagate(errp, local_err);
+qemu_opts_del(subopts_i);
+goto out;
+}
+
+if (qdict_size(subqdict_i)) {
+error_setg(errp, "[%s] rules don't support the option '%s'",
+   opts->name, qdict_first(subqdict_i)->key);
+qemu_opts_del(subopts_i);
+goto out;
+}
+
+/*
+if (add_rule(subopts_i, (void *)ard) < 0) {
+error_setg(errp, "Could not add [%s] rule %i", opts->name, i);
+goto out;
+}
+*/
+
+QDECREF(subqdict_i);
+subqdict_i = NULL;
+}
+
+if (qdict_size(subqdict)) {
+error_setg(errp, "Unused option '%s' for [%s]",
+   qdict_first(subqdict)->key, opts->name);
+goto out;
+}
+}
+
+out:
+QDECREF(subqdict);
+QDECREF(subqdict_i);
+
+free(prefix);
+}
+
+void qemu_config_parse_qdict(QDict *options, QemuOptsList **lists,
+ Error **errp)
+{
+int i;
+

[Qemu-devel] [PATCH 6/6] blkverify: Don't require protocol filename

2013-11-21 Thread Max Reitz

If the filename is not prefixed by "blkverify:" in
blkverify_parse_filename(), the blkverify driver was not selected
through that protocol prefix, but by an explicit command line option
(like file.driver=blkverify). Contrary to the current reaction, this is
not really a problem; the whole filename just has to be stored (in the
x-image option) and the user has to manually specify the x-raw option.

Signed-off-by: Max Reitz 
---
The x-raw option is undocumented and may be removed at any point in
time; but I don't see a reason why this should happen. Passing the
reference filename through an option maybe was a bit dirty in the past;
but by “exposing” it in the way this patch does, it becomes probably the
only clean solution.

The only problem I can see right now is the naming of the option, but we
should always be able to leave it as an alias, if necessary.
---
 block/blkverify.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/blkverify.c b/block/blkverify.c
index 3c63528..bdbdd68 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -78,7 +78,9 @@ static void blkverify_parse_filename(const char *filename, 
QDict *options,
 
 /* Parse the blkverify: prefix */
 if (!strstart(filename, "blkverify:", &filename)) {
-error_setg(errp, "File name string must start with 'blkverify:'");
+/* There was no prefix; therefore, all options have to be already
+   present in the QDict (except for the filename) */
+qdict_put(options, "x-image", qstring_from_str(filename));
 return;
 }
 
-- 
1.8.4.2

[Qemu-devel] [PATCH 0/6] blkdebug/blkverify: Allow command-line configuration

2013-11-21 Thread Max Reitz

Currently, the configuration of blkdebug and blkverify is done through
the "filename" alone. There is now way of manually choosing blkdebug or
blkverify as a driver and using a normal image filename.

In the case of blkdebug, the filename starts with the protocol prefix,
follows up with the name of a configuration file and ends with the name
of the image file.

In the case of blkverify, the filename starts with the protocol prefix,
follows up with the raw reference image filename and ends with the name
of the image file.

This patch allows the configuration of both drivers completely through
command-line options. The driver has to be selected through the
file.driver option (or similar), the image filename has to be given as
the filename (obviously) and, depending on the driver, further options
have to be given to control the behavior.

In case of blkverify, the x-raw option specifies the name of the raw
reference image file.

In case of blkdebug, one may either set the config option to the
filename of a configuration file, or the contents of the configuration
file may be given directly on the command line (see description of patch
3 for an example).


Max Reitz (6):
  blkdebug: Use errp for read_config()
  blkdebug: Don't require sophisticated filename
  qemu-option: Add qemu_config_parse_qdict()
  blkdebug: Always call read_config()
  blkdebug: Use command-line in read_config()
  blkverify: Don't require protocol filename

 block/blkdebug.c   |  50 +---
 block/blkverify.c  |   4 +-
 include/qemu/config-file.h |   6 +++
 util/qemu-config.c | 111 +
 4 files changed, 154 insertions(+), 17 deletions(-)

-- 
1.8.4.2

Re: [Qemu-devel] [ANNOUNCE] QEMU 1.7.0-rc1 is now available

2013-11-21 Thread Anthony Liguori

Sorry for the delays around this release.  I finally have proper
internet access at home.

I've updated the wiki and compressed the schedule for 1.7.0 should go
out on Wednesday.  I went through patches this morning and believe I
have everything committed that was targeted for 1.7 (minus Paolo's
pull request).

Please confirm and ping anything that isn't committed as appropriate.

Regards,

Anthony Liguori

On Thu, Nov 21, 2013 at 10:42 AM, Anthony Liguori  wrote:
> Hi,
>
> On behalf of the QEMU Team, I'd like to announce the availability of the
> second release candidate for the QEMU 1.7 release.  This release is meant
> for testing purposes and should not be used in a production environment.
>
> http://wiki.qemu.org/download/qemu-1.7.0-rc1.tar.bz2
>
> You can help improve the quality of the QEMU 1.7 release by testing this
> release and reporting bugs on Launchpad:
>
> https://bugs.launchpad.net/qemu/
>
> The release plan for the 1.7 release is available at:
>
> http://wiki.qemu.org/Planning/1.7
>
> Please add entries to the ChangeLog for the 1.7 release below:
>
> http://wiki.qemu.org/ChangeLog/Next
>
> The following changes have been made since v1.7.0-rc0:
>
>  - vfio-pci: Fix multifunction=on (Alex Williamson)
>  - target-i386: Fix addr32 prefix in gen_lea_modrm (Richard Henderson)
>  - atomic.h: Fix build with clang (Peter Maydell)
>  - target-i386: do not override nr_cores for -cpu host (Paolo Bonzini)
>  - mips jazz: do not raise data bus exception when accessing invalid 
> addresses (Hervé Poussineau)
>  - target-i386: yield to another VCPU on PAUSE (Paolo Bonzini)
>  - rng-egd: offset the point when repeatedly read from the buffer (Amos Kong)
>  - rng-egd: remove redundant free (Amos Kong)
>  - virtio-rng: add check of period (Amos Kong)
>  - s390x: fix flat file load on 32 bit systems (Michael S. Tsirkin)
>  - acpi-build: fix build on glib < 2.14 (Michael S. Tsirkin)
>  - acpi-build: fix build on glib < 2.22 (Michael S. Tsirkin)
>  - target-openrisc: Correct carry flag check of l.addc and l.addic test cases 
> (Sebastian Macke)
>  - target-openrisc: Correct memory bounds checking for the tlb buffers 
> (Sebastian Macke)
>  - openrisc-timer: Reduce overhead, Separate clock update functions 
> (Sebastian Macke)
>  - target-openrisc: Correct wrong epcr register in interrupt handler 
> (Sebastian Macke)
>  - target-openrisc: Remove executable flag for every page (Sebastian Macke)
>  - target-openrisc: Remove unnecessary code generated by jump instructions 
> (Sebastian Macke)
>  - target-openrisc: Speed up move instruction (Sebastian Macke)
>  - The calculation of bytes_xfer in qemu_put_buffer() is wrong (Wangting 
> (Kathy))
>  - migration: drop MADVISE_DONT_NEED for incoming zero pages (Peter Lieven)
>  - qom: Fix memory leak in object_property_set_link() (Vlad Yasevich)
>  - qtest: Use -display none by default (Andreas Färber)
>  - virtio-net: fix the memory leak in rxfilter_notify() (Amos Kong)
>  - doc: fix hardcoded helper path (Amos Kong)
>  - tcg-ia64: Introduce tcg_opc_bswap64_i (Richard Henderson)
>  - tcg-ia64: Introduce tcg_opc_ext_i (Richard Henderson)
>  - tcg-ia64: Introduce tcg_opc_movi_a (Richard Henderson)
>  - tcg-ia64: Introduce tcg_opc_mov_a (Richard Henderson)
>  - tcg-ia64: Use A3 form of logical operations (Richard Henderson)
>  - tcg-ia64: Use SUB_A3 and ADDS_A4 for subtraction (Richard Henderson)
>  - tcg-ia64: Use ADDS for small addition (Richard Henderson)
>  - tcg-ia64: Avoid unnecessary stop bit in tcg_out_alu (Richard Henderson)
>  - tcg-ia64: Move AREG0 to R32 (Richard Henderson)
>  - tcg-ia64: Simplify brcond (Richard Henderson)
>  - tcg-ia64: Handle constant calls (Richard Henderson)
>  - tcg-ia64: Use shortcuts for nop insns (Richard Henderson)
>  - tcg-ia64: Use TCGMemOp within qemu_ldst routines (Richard Henderson)
>  - hw/i386/Makefile.obj: use $(PYTHON) to run .py scripts consistently 
> (Michael Tokarev)
>  - configure: Use -B switch only for Python versions which support it (Stefan 
> Weil)
>  - qga: Fix shutdown command of guest agent to work with SysV (whitearchey)
>  - block: Fail if requested driver is not available (Kevin Wolf)
>  - MAINTAINERS: add block driver sub-maintainers (Stefan Hajnoczi)
>  - qemu-img: Fix overwriting 'ret' before using (Fam Zheng)
>  - qemu-iotests: Test qcow2 count_contiguous_clusters() (Kevin Wolf)
>  - smc91c111: Fix receive starvation (Sebastian Huber)
>  - qcow2: fix possible corruption when reading multiple clusters (Peter 
> Lieven)
>  - qmp: access the local QemuOptsLists for drive option (Amos Kong)
>  - MAINTAINERS: add block tree repo URLs (Stefan Hajnoczi)
>  - qemu-iotests: Extend 041 for unbacked mirroring (Max Reitz)
>  - block/drive-mirror: Check for NULL backing_hd (Max Reitz)
>  - qapi-schema: Update description for NewImageMode (Max Reitz)
>  - block: Print its file name if backing file opening failed (Fam Zheng)
>  - pc: disable pci-info (Igor Mammedov)
>  - console: Remove unused debug code (Stefan Weil)
>

[Qemu-devel] [PULL for-1.7 00/11] Miscellaneous -rc patches

2013-11-21 Thread Paolo Bonzini

The following changes since commit 394cfa39ba24dd838ace1308ae24961243947fb8:

  Merge remote-tracking branch 'quintela/migration.next' into staging 
(2013-11-19 13:03:06 -0800)

are available in the git repository at:


  git://github.com/bonzini/qemu.git tags/for-anthony

for you to fetch changes up to d607a52364e7bfc1cd6d3e425b898e86be4e525d:

  qga: Fix compiler warnings (missing format attribute, wrong format strings) 
(2013-11-21 17:39:25 +0100)

You do not have my key yet, but you can verify it is the same that I use
for example in Linux pull requests.  Please contact me offlist if you
wish to do keysigning.


Here are a bunch of 1.7-tagged patches that I was afraid
were getting forgotten or that did not have a clear maintainer responsible
for making a pull request.


Alex Williamson (1):
  vfio-pci: Fix multifunction=on

Amos Kong (2):
  rng-egd: remove redundant free
  rng-egd: offset the point when repeatedly read from the buffer

Hervé Poussineau (1):
  mips jazz: do not raise data bus exception when accessing invalid 
addresses

Mark Cave-Ayland (1):
  sun4m: Add FCode ROM for TCX framebuffer

Paolo Bonzini (2):
  pc: get rid of builtin pvpanic for "-M pc-1.5"
  target-i386: yield to another VCPU on PAUSE

Peter Maydell (3):
  configure: Explicitly set ARFLAGS so we can build with GNU Make 4.0
  atomic.h: Fix build with clang
  target-i386: Fix build by providing stub kvm_arch_get_supported_cpuid()

Stefan Weil (1):
  qga: Fix compiler warnings (missing format attribute, wrong format 
strings)

 Makefile  |   2 +-
 backends/rng-egd.c|   5 +++--
 configure |   5 +
 hw/display/tcx.c  |  26 +-
 hw/i386/pc_piix.c |   7 ---
 hw/i386/pc_q35.c  |   7 ---
 hw/mips/mips_jazz.c   |  24 
 hw/misc/pvpanic.c |   5 -
 hw/misc/vfio.c|   7 +++
 hw/sparc/sun4m.c  |  17 ++---
 include/hw/i386/pc.h  |   1 -
 include/qemu/atomic.h |   6 +++---
 pc-bios/QEMU,tcx.bin  | Bin 0 -> 1242 bytes
 pc-bios/README|   4 ++--
 qga/commands-posix.c  |   8 
 qga/guest-agent-core.h|   2 +-
 target-i386/helper.h  |   1 +
 target-i386/kvm-stub.c|  12 
 target-i386/misc_helper.c |  22 --
 target-i386/translate.c   |   5 -
 20 files changed, 122 insertions(+), 44 deletions(-)
 create mode 100644 pc-bios/QEMU,tcx.bin
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 21/27] pc: add memory hotplug 440fx machine

2013-11-21 Thread Igor Mammedov

On Thu, 21 Nov 2013 15:13:12 +0100
Andreas Färber  wrote:

> Am 21.11.2013 06:48, schrieb Li Guang:
> > Hi, Igor
> > 
> > Igor Mammedov wrote:
> >> Add DimmBus for memory hotplug below 4Gb or above 4Gb depending
> >> on initial memory size and hotplug memory size.
> >>
> > ...
> >> +static
> >> +void pc_hotplug_memory_init_impl(Object *owner,
> >> + MemoryRegion *system_memory,
> >> + ram_addr_t low_hotplug_mem_start,
> >> + ram_addr_t low_hotplug_mem_end,
> >> + DimmBus *hotplug_mem_bus,
> >> + ram_addr_t *high_mem_end)
> >> +{
> >> +QemuOpts *opts = qemu_opts_find(qemu_find_opts("memory-opts"),
> >> NULL);
> >> +ram_addr_t ram_size = qemu_opt_get_size(opts, "mem", 0);
> >> +ram_addr_t maxmem = qemu_opt_get_size(opts, "maxmem", 0);
> >> +ram_addr_t hotplug_mem_size;
> >> +
> >> +if (maxmem<= ram_size) {
> >> +/* Disable ACPI migration code and creation of memory devices
> >> in SSDT */
> >>
> > 
> > Why not give the memory that not be hot-added a chance to be placed on
> > one memory slot?
> 
> Seconded, I believe I requested that on the previous version already.
Because current initial memory allocation is a mess and this already
large series would become even more large and intrusive, so far series
it relatively not intrusive and self contained.

I believe re-factoring of initial memory to use Dimm devices should be
done later on top of infrastructure this series provides.

> Andreas
> 
> > if all memory can be hot-added and hot-removed, then we can bring in
> > more flexibility for
> > memory hotplug feature.
>

[Qemu-devel] [Bug 1253563] [NEW] bad performance with rng-egd backend

2013-11-21 Thread Launchpad Bug Tracker

You have been subscribed to a public bug by Amos Kong (amoskong):


1. create listen socket
# cat /dev/random | nc -l localhost 1024

2. start vm with rng-egd backend

./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -mon 
chardev=qmp,mode=control,pretty=on -chardev 
socket,id=qmp,host=localhost,port=1234,server,nowait -m 2000 -device 
virtio-net-pci,netdev=h1,id=vnet0 -netdev tap,id=h1 -vnc :0 -drive 
file=/images/RHEL-64-virtio.qcow2 \
-chardev socket,host=localhost,port=1024,id=chr0 \
-object rng-egd,chardev=chr0,id=rng0 \
-device virtio-rng-pci,rng=rng0,max-bytes=1024000,period=1000

(guest) # dd if=/dev/hwrng of=/dev/null

note: cancelling dd process by Ctrl+c, it will return the read speed.

Problem:   the speed is around 1k/s

===

If I use rng-random backend (filename=/dev/random), the speed is about
350k/s).

It seems that when the request entry is added to the list, we don't read the 
data from queue list immediately.
The chr_read() is delayed, the virtio_notify() is delayed.  the next request 
will also be delayed. It effects the speed.

I tried to change rng_egd_chr_can_read() always returns 1,  the speed is
improved to (about 400k/s)

Problem: we can't poll the content in time currently


Any thoughts?

Thanks, Amos

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
bad performance with rng-egd backend
https://bugs.launchpad.net/bugs/1253563
You received this bug notification because you are a member of qemu-devel-ml, 
which is subscribed to the bug report.

[Qemu-devel] [PATCH] vl: verify if combination of cpus, sockets, cores and threads is sane

2013-11-21 Thread Peter Lieven

Signed-off-by: Peter Lieven 
---
 vl.c |   34 --
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/vl.c b/vl.c
index 8d5d874..dc0b41a 100644
--- a/vl.c
+++ b/vl.c
@@ -1385,35 +1385,41 @@ static QemuOptsList qemu_smp_opts = {
 static void smp_parse(QemuOpts *opts)
 {
 if (opts) {
-
 unsigned cpus= qemu_opt_get_number(opts, "cpus", 0);
 unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
 unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
 unsigned threads = qemu_opt_get_number(opts, "threads", 0);
 
 /* compute missing values, prefer sockets over cores over threads */
-if (cpus == 0 || sockets == 0) {
+if (cpus == 0) {
 sockets = sockets > 0 ? sockets : 1;
 cores = cores > 0 ? cores : 1;
 threads = threads > 0 ? threads : 1;
-if (cpus == 0) {
-cpus = cores * threads * sockets;
-}
+cpus = cores * threads * sockets;
+} else if (sockets == 0) {
+cores = cores > 0 ? cores : 1;
+threads = threads > 0 ? threads : 1;
+sockets = cpus / (cores * threads);
+} else if (cores == 0) {
+threads = threads > 0 ? threads : 1;
+cores = cpus / (sockets * threads);
 } else {
-if (cores == 0) {
-threads = threads > 0 ? threads : 1;
-cores = cpus / (sockets * threads);
-} else {
-threads = cpus / (cores * sockets);
-}
+threads = cpus / (sockets * cores);
 }
 
-max_cpus = qemu_opt_get_number(opts, "maxcpus", 0);
+if (cpus != sockets * cores * threads) {
+fprintf(stderr, "Illegal CPU layout: %d cpus with %d sockets,"
+" %d cores per socket and %d threads per core"
+" (cpus != sockets * cores * threads)\n",
+cpus, sockets, cores, threads);
+exit(1);
+}
 
 smp_cpus = cpus;
-smp_cores = cores > 0 ? cores : 1;
-smp_threads = threads > 0 ? threads : 1;
+smp_cores = cores;
+smp_threads = threads;
 
+max_cpus = qemu_opt_get_number(opts, "maxcpus", 0);
 }
 
 if (max_cpus == 0) {
-- 
1.7.9.5

Re: [Qemu-devel] [PATCH 1/2] block: Enable BDRV_O_SNAPSHOT with driver-specific options

2013-11-21 Thread Stefan Hajnoczi

On Tue, Nov 19, 2013 at 04:37:27PM +0100, Kevin Wolf wrote:
> @@ -1114,6 +1093,24 @@ int bdrv_open(BlockDriverState *bs, const char 
> *filename, QDict *options,
>  goto fail;
>  }
>  
> +/* Prepare a new options QDict for the temporary file, where user
> + * options refer to the backing file */
> +if (!options) {
> +options = qdict_new();
> +}

You can drop this because options is never NULL:

  options = qdict_clone_shallow(options);

  /* For snapshot=on, create a temporary qcow2 overlay */
  if (flags & BDRV_O_SNAPSHOT) {
  ...

> +if (filename) {
> +qdict_put(options, "file.filename", qstring_from_str(filename));
> +}
> +if (drv) {
> +qdict_put(options, "driver", qstring_from_str(drv->format_name));
> +}
> +
> +snapshot_options = qdict_new();
> +qdict_put(snapshot_options, "backing", options);
> +qdict_flatten(snapshot_options);
> +
> +options = snapshot_options;

One thing I'm not sure about after these operations have been performed:

  bs->options = options;
  ...
  if (flags & BDRV_O_SNAPSHOT) {
  ...

So bs->options does not reflect what we ended up with in the
BDRV_O_SNAPSHOT case.

But git grep -- '->options' shows no users of this field.  Therefore it
won't cause a problem yet.  But can you explain what's going on here?
Either we should keep bs->options up-to-date or we should drop the
field.

Stefan

Re: [Qemu-devel] Buildbot failure: multiboot.o link error on OpenBSD current

2013-11-21 Thread Brad Smith


On 20/11/13 4:15 AM, Stefan Hajnoczi wrote:

Sorry I forgot to CC qemu-devel on this:

On Wed, Nov 20, 2013 at 10:06 AM, Stefan Hajnoczi  wrote:

Hi Brad,
The QEMU buildbot is failing on brad_openbsd_current as follows:

   Building optionrom/multiboot.img
ld: multiboot.o: relocation R_X86_64_16 can not be used when making a
shared object; recompile with -fPIC
multiboot.o: could not read symbols: Bad value

http://buildbot.b1-systems.de/qemu/builders/default_openbsd_current/builds/590

Do you have time to investigate?

If not, please CC people who could help.

Thanks,
Stefan




I have been meaning to try to attempt to come up with a patch to resolve 
this issue but haven't come up with anything. I have a patch in the 
OpenBSD ports tree that allows QEMU to build but its just taking that 
and having it fit within the configure/Makefile infrastructure and 
applied for OpenBSD only.


http://www.openbsd.org/cgi-bin/cvsweb/~checkout~/ports/emulators/qemu/patches/patch-pc-bios_optionrom_Makefile?rev=1.1;content-type=text%2Fplain

The issue has come up as OpenBSD 5.3 and newer now builds all binaries 
by default on almost all of our architectures with PIE by default; so 
the ROMs need to be built with PIE disabled.


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

1 2 3 >

1 - 100 of 230 matches

Mail list logo