[Qemu-devel] [Bug 1368815] Re: qemu-img convert intermittently corrupts output images

2014-12-01 Thread Michael Steffens
Tested qemu-utils 2.0.0+dfsg-2ubuntu1.8. Successful.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1368815

Title:
  qemu-img convert intermittently corrupts output images

Status in OpenStack Compute (Nova):
  In Progress
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Trusty:
  Fix Committed
Status in qemu source package in Utopic:
  Fix Committed
Status in qemu source package in Vivid:
  Fix Released

Bug description:
  ==
  Impact: occasional image corruption (any format on local filesystem)
  Test case: see the qemu-img command below
  Regression potential: this cherry-picks a patch from upstream to a 
significantly older qemu source tree. While the cherry-pick seems sane, 
it's possible that there are subtle interactions with the other delta. I'd 
really like a full qa-regression-test qemu testcase to be run against this 
package.
  ==

  -- Found in releases qemu-2.0.0, qemu-2.0.2, qemu-2.1.0. Tested on
  Ubuntu 14.04 using Ext4 filesystems.

  The command

    qemu-img convert -O raw inputimage.qcow2 outputimage.raw

  intermittently creates corrupted output images when the input image
  is not yet fully synchronized to disk. While the issue was actually
  discovered in the operation of OpenStack Nova, it can be
  reproduced "easily" on the command line using

    cat $SRC_PATH > $TMP_PATH && $QEMU_IMG_PATH convert -O raw $TMP_PATH
  $DST_PATH && cksum $DST_PATH

  on filesystems exposing this behavior. (The difficult part of this
  exercise is to prepare a filesystem that reliably triggers this race. On
  my test machine some filesystems are affected while others aren't, and
  unfortunately I haven't yet found the relevant difference between them.
  Possibly it's a timing issue completely outside userspace control
  ...)

  The root cause, however, is the same as in

    http://lists.gnu.org/archive/html/coreutils/2011-04/msg00069.html

  and it can be solved the same way as suggested in

    http://lists.gnu.org/archive/html/coreutils/2011-04/msg00102.html

  In qemu, file block/raw-posix.c should use FIEMAP_FLAG_SYNC, i.e. change

  f.fm.fm_flags = 0;

  to

  f.fm.fm_flags = FIEMAP_FLAG_SYNC;

  As discussed in the thread mentioned above, retrieving a page cache
  coherent map of file extents is possible only after fsync on that
  file.

  See also

    https://bugs.launchpad.net/nova/+bug/1350766

  In that bug report, filed against Nova, it had been suggested that the
  fsync be performed by the framework invoking qemu-img. However, as the
  choice of fiemap -- implying this otherwise unneeded fsync of a temporary
  file -- is made not by the caller but by qemu-img, I agree with the
  Nova bug reviewer's objection to putting it into Nova. The fsync should
  instead be triggered by qemu-img using FIEMAP_FLAG_SYNC, which is
  specifically intended for that purpose.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1368815/+subscriptions



[Qemu-devel] [Bug 1368815] Re: qemu-img convert intermittently corrupts output images

2014-12-01 Thread Michael Steffens
** Tags removed: verification-needed
** Tags added: verification-done-trusty




[Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission

2014-12-01 Thread Ming Lei
The 1st patch fixes batch submission.

The 2nd one fixes -EAGAIN for non-batch case.

The 3rd one is a cleanup.

This patchset is split from the previous patchset (dataplane: optimization
and multi virtqueue support), as suggested by Stefan.

V7:
- add protection for aborting in laio_attach_aio_context(), as suggested
by Stefan, 1/3
- patch style, return real aborting failure to caller, as suggested
by Kevin, 1/3
- track pending I/O and only handle -EAGAIN if there is pending I/O,
as pointed out by Kevin, 2/3

V6:
- don't pass ioq_submit() return value to ioq_enqueue(), as suggested
by Stefan
- fix one build failure introduced in V5, reported by Stefan

V5:
- in case of submission failure, return -EIO for newly arriving requests
until aborting is handled
- in patch2, follow Paolo's suggestion about ioq_enqueue() changes

V4:
- abort requests in BH to avoid potential "Co-routine re-entered 
recursively"
- remove 'enqueue' parameter to ioq_submit() to simplify the change
- beautify code as suggested by Paolo

V3:
- rebase on QEMU master
V2:
- code style fix and commit log fix as suggested by Benoît Canet
V1:
- rebase on latest QEMU master

 block/linux-aio.c |  139 -
 1 file changed, 116 insertions(+), 23 deletions(-)


Thanks,
Ming Lei





[Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch

2014-12-01 Thread Ming Lei
In the submit path we can't complete requests directly, otherwise
"Co-routine re-entered recursively" may be caused, so this patch
fixes the issue with the following ideas:

- for -EAGAIN or partial submission, retry the submission
in the following completion cb, which runs in BH context
- for partial submission, update the io queue too
- in case the io queue is full, submit queued requests
immediately and return failure to the caller
- for other failures, abort all queued requests in BH
context, and don't allow new requests to be submitted until
aborting is handled

Reviewed-by: Paolo Bonzini 
Signed-off-by: Ming Lei 
---
 block/linux-aio.c |  116 -
 1 file changed, 97 insertions(+), 19 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index d92513b..53c5616 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -38,11 +38,21 @@ struct qemu_laiocb {
 QLIST_ENTRY(qemu_laiocb) node;
 };
 
+/*
+ * TODO: support to batch I/O from multiple bs in one same
+ * AIO context, one important use case is multi-lun scsi,
+ * so in future the IO queue should be per AIO context.
+ */
 typedef struct {
 struct iocb *iocbs[MAX_QUEUED_IO];
 int plugged;
 unsigned int size;
 unsigned int idx;
+
+/* abort queued requests in BH context */
+QEMUBH *abort_bh;
+bool aborting;
+int abort_ret;
 } LaioQueue;
 
 struct qemu_laio_state {
@@ -59,6 +69,8 @@ struct qemu_laio_state {
 int event_max;
 };
 
+static int ioq_submit(struct qemu_laio_state *s);
+
 static inline ssize_t io_event_ret(struct io_event *ev)
 {
 return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
@@ -135,6 +147,11 @@ static void qemu_laio_completion_bh(void *opaque)
 
 qemu_laio_process_completion(s, laiocb);
 }
+
+/* Handle -EAGAIN or partial submission */
+if (s->io_q.idx) {
+ioq_submit(s);
+}
 }
 
 static void qemu_laio_completion_cb(EventNotifier *e)
@@ -175,47 +192,100 @@ static void ioq_init(LaioQueue *io_q)
 io_q->size = MAX_QUEUED_IO;
 io_q->idx = 0;
 io_q->plugged = 0;
+io_q->aborting = false;
 }
 
+/* Always return >= 0 and it means how many requests are submitted */
 static int ioq_submit(struct qemu_laio_state *s)
 {
-int ret, i = 0;
+int ret;
 int len = s->io_q.idx;
 
-do {
-ret = io_submit(s->ctx, len, s->io_q.iocbs);
-} while (i++ < 3 && ret == -EAGAIN);
-
-/* empty io queue */
-s->io_q.idx = 0;
+if (!len) {
+return 0;
+}
 
+ret = io_submit(s->ctx, len, s->io_q.iocbs);
 if (ret < 0) {
-i = 0;
-} else {
-i = ret;
+/* retry in following completion cb */
+if (ret == -EAGAIN) {
+return 0;
+}
+
+/*
+ * Abort in BH context for avoiding Co-routine re-entered,
+ * and update io queue at that time
+ */
+s->io_q.aborting = true;
+s->io_q.abort_ret = ret;
+qemu_bh_schedule(s->io_q.abort_bh);
+ret = 0;
 }
 
-for (; i < len; i++) {
-struct qemu_laiocb *laiocb =
-container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
+/*
+ * update io queue, and retry will be started automatically
+ * in following completion cb for the remainder
+ */
+if (ret > 0) {
+if (ret < len) {
+memmove(&s->io_q.iocbs[0], &s->io_q.iocbs[ret],
+(len - ret) * sizeof(struct iocb *));
+}
+s->io_q.idx -= ret;
+}
+
+return ret;
+}
 
-laiocb->ret = (ret < 0) ? ret : -EIO;
+static void ioq_abort_bh(void *opaque)
+{
+struct qemu_laio_state *s = opaque;
+int i;
+
+for (i = 0; i < s->io_q.idx; i++) {
+struct qemu_laiocb *laiocb = container_of(s->io_q.iocbs[i],
+  struct qemu_laiocb,
+  iocb);
+laiocb->ret = s->io_q.abort_ret;
 qemu_laio_process_completion(s, laiocb);
 }
-return ret;
+
+s->io_q.idx = 0;
+s->io_q.aborting = false;
 }
 
-static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
+static int ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
 {
 unsigned int idx = s->io_q.idx;
 
+/* Request can't be allowed to submit until aborting is handled */
+if (unlikely(s->io_q.aborting)) {
+return -EIO;
+}
+
+if (unlikely(idx == s->io_q.size)) {
+ioq_submit(s);
+
+if (unlikely(s->io_q.aborting)) {
+return -EIO;
+}
+idx = s->io_q.idx;
+}
+
+/* It has to return now if queue is still full */
+if (unlikely(idx == s->io_q.size)) {
+return -EAGAIN;
+}
+
 s->io_q.iocbs[idx++] = iocb;
 s->io_q.idx = idx;
 
-/* submit immediately if queue is full */
-if (idx == s->io_q.size) {
+/* submit immediately if queue depth is above 2/3 */
+if (idx >

[Qemu-devel] [PATCH v7 3/3] linux-aio: remove 'node' from 'struct qemu_laiocb'

2014-12-01 Thread Ming Lei
No one uses the 'node' field any more, so remove it
from 'struct qemu_laiocb'; this saves 16 bytes in
the struct on 64-bit arches.

Reviewed-by: Kevin Wolf 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Ming Lei 
---
 block/linux-aio.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index 9403b17..cad3848 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -35,7 +35,6 @@ struct qemu_laiocb {
 size_t nbytes;
 QEMUIOVector *qiov;
 bool is_read;
-QLIST_ENTRY(qemu_laiocb) node;
 };
 
 /*
-- 
1.7.9.5




[Qemu-devel] [PATCH v7 2/3] linux-aio: handling -EAGAIN for !s->io_q.plugged case

2014-12-01 Thread Ming Lei
Previously -EAGAIN was simply ignored in the !s->io_q.plugged case,
and it could easily cause -EIO to be reported to the VM, for example
with an NVMe device.

This patch handles -EAGAIN via the io queue for the !s->io_q.plugged
case, and the request will be retried in the following aio completion cb.

Most of the time -EAGAIN only happens when there is pending I/O, but
in the Linux kernel AIO implementation io_submit() may also return it
when kmem_cache_alloc(GFP_KERNEL) returns NULL. So 'pending' in
'struct qemu_laio_state' is introduced for tracking active I/O, and
-EAGAIN is only handled when there is pending I/O.

Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Paolo Bonzini 
Suggested-by: Paolo Bonzini 
Signed-off-by: Ming Lei 
---
 block/linux-aio.c |   32 
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index 53c5616..9403b17 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -56,6 +56,7 @@ typedef struct {
 } LaioQueue;
 
 struct qemu_laio_state {
+unsigned long pending;
 io_context_t ctx;
 EventNotifier e;
 
@@ -98,6 +99,7 @@ static void qemu_laio_process_completion(struct 
qemu_laio_state *s,
 }
 }
 }
+s->pending--;
 laiocb->common.cb(laiocb->common.opaque, ret);
 
 qemu_aio_unref(laiocb);
@@ -179,6 +181,7 @@ static void laio_cancel(BlockAIOCB *blockacb)
 return;
 }
 
+laiocb->ctx->pending--;
 laiocb->common.cb(laiocb->common.opaque, laiocb->ret);
 }
 
@@ -280,8 +283,13 @@ static int ioq_enqueue(struct qemu_laio_state *s, struct 
iocb *iocb)
 s->io_q.iocbs[idx++] = iocb;
 s->io_q.idx = idx;
 
-/* submit immediately if queue depth is above 2/3 */
-if (idx > s->io_q.size * 2 / 3) {
+/*
+ * This is reached in two cases: queue not plugged but io_submit
+ * returned -EAGAIN, or queue plugged.  In the latter case, start
+ * submitting some I/O if the queue is getting too full.  In the
+ * former case, instead, wait until an I/O operation is completed.
+ */
+if (s->io_q.plugged && unlikely(idx > s->io_q.size * 2 / 3)) {
 ioq_submit(s);
 }
 
@@ -346,15 +354,23 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, void 
*aio_ctx, int fd,
 }
 io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
 
-if (!s->io_q.plugged) {
-if (io_submit(s->ctx, 1, &iocbs) < 0) {
-goto out_free_aiocb;
-}
-} else {
-if (ioq_enqueue(s, iocbs) < 0) {
+/* Switch to queue mode until -EAGAIN is handled */
+if (!s->io_q.plugged && !s->io_q.idx) {
+int ret = io_submit(s->ctx, 1, &iocbs);
+if (ret >= 0) {
+return &laiocb->common;
+} else if (ret != -EAGAIN || (ret == -EAGAIN && !s->pending)) {
 goto out_free_aiocb;
 }
+/*
+ * In case of -EAGAIN, only queue the req if there is pending
+ * I/O and it is resubmitted in completion of pending I/O
+ */
+}
+if (ioq_enqueue(s, iocbs) < 0) {
+goto out_free_aiocb;
 }
+s->pending++;
 return &laiocb->common;
 
 out_free_aiocb:
-- 
1.7.9.5




[Qemu-devel] [for-2.2] Re: [PATCH] vhost: Fix vhostfd leak in error branch

2014-12-01 Thread Michael S. Tsirkin
On Fri, Nov 28, 2014 at 05:26:29PM +0800, arei.gong...@huawei.com wrote:
> From: Gonglei 
> 
> Signed-off-by: Gonglei 

Peter, could you pick this up for 2.2 please?

Reviewed-by: Michael S. Tsirkin 


> ---
>  hw/scsi/vhost-scsi.c | 1 +
>  hw/virtio/vhost.c| 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> index 308b393..dcb2bc5 100644
> --- a/hw/scsi/vhost-scsi.c
> +++ b/hw/scsi/vhost-scsi.c
> @@ -233,6 +233,7 @@ static void vhost_scsi_realize(DeviceState *dev, Error 
> **errp)
> vhost_dummy_handle_output);
>  if (err != NULL) {
>  error_propagate(errp, err);
> +close(vhostfd);
>  return;
>  }
>  
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 5d7c40a..5a12861 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -817,10 +817,12 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>  int i, r;
>  
>  if (vhost_set_backend_type(hdev, backend_type) < 0) {
> +close((uintptr_t)opaque);
>  return -1;
>  }
>  
>  if (hdev->vhost_ops->vhost_backend_init(hdev, opaque) < 0) {
> +close((uintptr_t)opaque);
>  return -errno;
>  }
>  
> -- 
> 1.7.12.4
> 



Re: [Qemu-devel] [BUG] Redhat-6.4_64bit-guest kernel panic with cpu-passthrough and guest numa

2014-12-01 Thread Paolo Bonzini


On 28/11/2014 03:38, Gonglei wrote:
>> > Can you find what line of kernel/sched.c it is?
> Yes, of course. See below please:
> "sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) / group->cpu_power; "
> in update_sg_lb_stats(), file sched.c, line 4094
> And I can share the cause we found. After commit 787aaf57 (target-i386:
> forward CPUID cache leaves when -cpu host is used), the guest gets the CPU
> cache topology from the host when -cpu host is used. But if we configure
> guest NUMA:
>   node 0 cpus 0~7
>   node 1 cpus 8~15
> then the NUMA nodes lie within the same host CPU cache (cpus 0~15).
> When the guest OS boots and calculates group->cpu_power, the guest finds
> that those two different nodes share the same cache, so node1's
> group->cpu_power is never assigned and keeps its initial value '0'. When a
> vcpu is scheduled, the division by 0 causes a kernel panic.

Thanks.  Please open a Red Hat bugzilla with the information, and Cc
Larry Woodman  who fixed a few instances of this in
the past.

Paolo



Re: [Qemu-devel] [PATCH] target-mips: add CPU definition for MIPS-II

2014-12-01 Thread Vasileios Kalintiris


From: Vasileios Kalintiris
Sent: 25 November 2014 11:04
To: qemu-devel@nongnu.org
Cc: Leon Alrae; aurel...@aurel32.net
Subject: [PATCH] target-mips: add CPU definition for MIPS-II

Add mips2-generic among CPU definitions for MIPS.

Signed-off-by: Vasileios Kalintiris 
---
 target-mips/translate_init.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/target-mips/translate_init.c b/target-mips/translate_init.c
index 148b394..d4b1cd8 100644
--- a/target-mips/translate_init.c
+++ b/target-mips/translate_init.c
@@ -108,6 +108,29 @@ struct mips_def_t {
 static const mips_def_t mips_defs[] =
 {
 {
+/* A generic CPU providing MIPS-II features.
+   FIXME: Eventually this should be replaced by a real CPU model. */
+.name = "mips2-generic",
+.CP0_PRid = 0x00018000,
+.CP0_Config0 = MIPS_CONFIG0 | (MMU_TYPE_R4000 << CP0C0_MT),
+.CP0_Config1 = MIPS_CONFIG1 | (1 << CP0C1_FP) | (15 << CP0C1_MMU) |
+  (0 << CP0C1_IS) | (3 << CP0C1_IL) | (1 << CP0C1_IA) |
+  (0 << CP0C1_DS) | (3 << CP0C1_DL) | (1 << CP0C1_DA) |
+  (0 << CP0C1_CA),
+.CP0_Config2 = MIPS_CONFIG2,
+.CP0_Config3 = MIPS_CONFIG3,
+.CP0_LLAddr_rw_bitmask = 0,
+.CP0_LLAddr_shift = 4,
+.SYNCI_Step = 32,
+.CCRes = 2,
+.CP0_Status_rw_bitmask = 0x3011,
+.CP1_fcr0 = (1 << FCR0_W) | (1 << FCR0_D) | (1 << FCR0_S),
+.SEGBITS = 32,
+.PABITS = 32,
+.insn_flags = CPU_MIPS2,
+.mmu_type = MMU_TYPE_R4000,
+},
+{
 .name = "4Kc",
 .CP0_PRid = 0x00018000,
 .CP0_Config0 = MIPS_CONFIG0 | (MMU_TYPE_R4000 << CP0C0_MT),
--

ping


Re: [Qemu-devel] Help: Convert HDD to QCOW2 img

2014-12-01 Thread Stefan Hajnoczi
On Mon, Dec 1, 2014 at 9:40 AM, Halsey Pian  wrote:

Please keep qemu-devel@nongnu.org CCed so the discussion stays on the
mailing list.  I have added it back.

> Hi Stefan, I don't know if there is a similar module; currently I have not
> seen one. If there is, please forgive me. And if the program is unique,
> there should be some policies for involving the QEMU team, right?
> Thanks.

QEMU does not have something directly equivalent to VMware's SDK for storage.

But there is a very powerful API called libguestfs.  Maybe it does
what you want:
http://libguestfs.org/

libvirt has APIs for snapshotting and managing storage:
http://libvirt.org/html/libvirt-libvirt-storage.html

QEMU's qemu-img supports JSON output to make it easy to parse.
qemu-nbd can be used for read-write access.

There was an attempt to create something called libqblock but the work
was never completed.  I guess your approach is similar:
https://lists.gnu.org/archive/html/qemu-devel/2013-02/msg02356.html

Stefan



Re: [Qemu-devel] MinGW build

2014-12-01 Thread Liviu Ionescu

On 28 Nov 2014, at 09:03, Stefan Weil  wrote:

> This is my build script:
> http://qemu.weilnetz.de/results/make-installers-all.

We finally have a functional Windows version, installed with a setup, as you 
recommended.

The build procedure is fully documented at:

http://gnuarmeclipse.livius.net/wiki/How_to_build_QEMU

(as a Windows cross build on Debian)


The build procedure itself is moderately complex, but fixing the prerequisite 
details was nightmarish; the official wiki page is schematic, to say the least.

Not to mention the note, hidden somewhere, that only older versions of glib 
are supported. Perhaps an update to a newer glib would be useful.

Other details worth noting related to the lack of clear separation between 
the host build and the cross build.

The procedure to detect the presence of packages with pkg-config is great, 
but it seems it is not used consistently; for example, detecting libz is done 
not with pkg-config but with the compiler, and this required some extra flags 
to configure.

To accommodate the details of my Windows setup, I also had to add a new 
QEMU_NSI_FILE variable to the Makefile, so I can redefine it externally.

 
regards,

Liviu




Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.

2014-12-01 Thread Stefan Hajnoczi
On Fri, Nov 28, 2014 at 04:28:57PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 21.11.2014 19:55, Stefan Hajnoczi wrote:
> >Active dirty bitmaps should migrate too.  I'm thinking now that the
> >appropriate thing is to add live migration of dirty bitmaps to QEMU
> >(regardless of whether they are active or not).
> I think we should migrate named dirty bitmaps which are not in use. If
> some external mechanism uses the bitmap (for example, backup), we
> actually can't migrate that process, because we would need to restore the
> whole backup structure including a pointer to the bitmap, which is too hard
> and involves more than just bitmap migration.
> So, if a named bitmap is enabled but not used (only bdrv_aligned_pwritev
> writes to it), it can be migrated. For this I see the following solutions:
> 
> 1) Just save all corresponding pieces of named bitmaps with every migrated
> block. The block size is 1 MiB, so the overhead of additionally migrating a
> bitmap with 64 KiB granularity would be 2 bytes, and it would be 256 bytes
> for a bitmap with 512-byte granularity. This approach needs additional
> fields in BlkMigBlock for saving bitmap pieces.

block-migration.c is not used for all live migration.  So it's important
not to tie dirty bitmap migration to block-migration.c, at least there
needs to be a way to skip actually copying disk contents in
block-migration.c.

(When there is shared storage that both source and destination hosts can
access then block-migration.c is not used.  Also, there is a newer
non-shared storage migration mechanism that is used instead of
block-migration.c which is not tied into the live migration data stream,
so block-migration.c is optional.)

> 2) Add a DIRTY flag to the migrated block flags, to distinguish blocks
> which became dirty while migrating. Save all the bitmaps separately, and
> also update them in block_load when we receive a block with the DIRTY flag
> on. Some information will be lost; migrated dirty bitmaps may be "more
> dirty" than the original ones. This approach needs an additional field
> "bool dirty" in BlkMigBlock, and saving of this flag in blk_send.
> 
> These solutions don't depend on the "persistence" of dirty bitmaps or on a
> persistent bitmap file format.

That's an important characteristic since we probably want to migrate
named dirty bitmaps, whether they are persistent or not.

Stefan




Re: [Qemu-devel] [RFC PATCH 1/3] qemu-img bench

2014-12-01 Thread Stefan Hajnoczi
On Fri, Nov 28, 2014 at 01:19:59PM +0100, Kevin Wolf wrote:
> Am 28.11.2014 um 12:49 hat Stefan Hajnoczi geschrieben:
> > On Wed, Nov 26, 2014 at 03:46:42PM +0100, Kevin Wolf wrote:
> > > +while (data.n > 0) {
> > > +main_loop_wait(false);
> > > +}
> > 
> > Why is this false (non-blocking)?  This is why you get the main loop
> > spun warning message.
> > 
> > Using true (blocking) seems like the right thing.  data.n changes as
> > part of the callback, which is invoked from the main loop.  There is no
> > need to be non-blocking.
> 
> I think the parameter has exactly the opposite meaning as what you
> describe:
> 
> int main_loop_wait(int nonblocking)
> 
> If it were true, you would get timeout = 0. qemu-io and qemu-nbd also
> pass false here.

Oops, you are right!  Sorry, I was confused.

Stefan




Re: [Qemu-devel] [PATCH] block: do not use get_clock()

2014-12-01 Thread Stefan Hajnoczi
On Fri, Nov 28, 2014 at 02:41:54PM +0100, Markus Armbruster wrote:
> Stefan Hajnoczi  writes:
> 
> > On Wed, Nov 26, 2014 at 03:01:02PM +0100, Paolo Bonzini wrote:
> >> Use the external qemu-timer API instead.
> >> 
> >> Cc: kw...@redhat.com
> >> Cc: stefa...@redhat.com
> >> Signed-off-by: Paolo Bonzini 
> >> ---
> >>  block/accounting.c | 6 --
> >>  block/raw-posix.c  | 8 
> >>  2 files changed, 8 insertions(+), 6 deletions(-)
> >
> > Thanks, applied to my block-next tree:
> > https://github.com/stefanha/qemu/commits/block-next
> 
> Please wait for Paolo's v2 with rationale in the commit message.

Got it.

Stefan




Re: [Qemu-devel] [PATCH RFC for-2.2] virtio-blk: force 1st s/g to match header

2014-12-01 Thread Peter Maydell
On 30 November 2014 at 16:43, Michael S. Tsirkin  wrote:
> The result of this is host mapping leak.
> What effect does this have? Can this DOS host?

I don't think we can DOS the host here.

If Xen, we crash (but you can't use virtio-blk with Xen anyway)
Otherwise, if you managed to get address_space_map() to hand you
the bounce-buffer (by asking for dma to something other than RAM)
then we'll either hit an assertion or just end up never allowing
dma to/from non-RAM ever again for this guest.
The usual case would be that this was dma to/from ram, in
which case it's harmless if the virtio-backend never wrote to
the memory, and will fail to update dirty bitmaps for migration
etc if the backend did write.

In any case I think that none of these outcomes are worse
than the "exit(1)" the current patch proposes.

-- PMM



Re: [Qemu-devel] [PATCH RFC for-2.2] virtio-blk: force 1st s/g to match header

2014-12-01 Thread Michael S. Tsirkin
On Mon, Dec 01, 2014 at 12:07:07PM +, Peter Maydell wrote:
> On 30 November 2014 at 16:43, Michael S. Tsirkin  wrote:
> > The result of this is host mapping leak.
> > What effect does this have? Can this DOS host?
> 
> I don't think we can DOS the host here.
> 
> If Xen, we crash (but you can't use virtio-blk with Xen anyway)
> Otherwise, if you managed to get address_space_map() to hand you
> the bounce-buffer (by asking for dma to something other than RAM)
> then we'll either hit an assertion or just end up never allowing
> dma to/from non-RAM ever again for this guest.
> The usual case would be that this was dma to/from ram, in
> which case it's harmless if the virtio-backend never wrote to
> the memory, and will fail to update dirty bitmaps for migration
> etc if the backend did write.
> 
> In any case I think that none of these outcomes are worse
> than the "exit(1)" the current patch proposes.
> 
> -- PMM

Fair enough.
Pls disregard the patch then, and we'll fix it properly
for 2.3 when we set ANY_LAYOUT.



Re: [Qemu-devel] [PATCH 3/7] test-coroutine: avoid overflow on 32-bit systems

2014-12-01 Thread Paolo Bonzini


On 01/12/2014 02:28, Ming Lei wrote:
>> > -   (unsigned long)(10 * duration) / maxcycles);
>> > +   (unsigned long)(10.0 * duration / maxcycles));
> One more single bracket.

I don't understand?

Paolo



Re: [Qemu-devel] [PATCH] block: do not use get_clock()

2014-12-01 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> On Fri, Nov 28, 2014 at 02:41:54PM +0100, Markus Armbruster wrote:
>> Stefan Hajnoczi  writes:
>> 
>> > On Wed, Nov 26, 2014 at 03:01:02PM +0100, Paolo Bonzini wrote:
>> >> Use the external qemu-timer API instead.
>> >> 
>> >> Cc: kw...@redhat.com
>> >> Cc: stefa...@redhat.com
>> >> Signed-off-by: Paolo Bonzini 
>> >> ---
>> >>  block/accounting.c | 6 --
>> >>  block/raw-posix.c  | 8 
>> >>  2 files changed, 8 insertions(+), 6 deletions(-)
>> >
>> > Thanks, applied to my block-next tree:
>> > https://github.com/stefanha/qemu/commits/block-next
>> 
>> Please wait for Paolo's v2 with rationale in the commit message.
>
> Got it.

https://github.com/stefanha/qemu/commit/1800094faabbadb017acb15bc0b1bd6dde283f45
looks good, thanks!



[Qemu-devel] [Bug 1383857] Re: aarch64: virtio disks don't show up in guest (neither blk nor scsi)

2014-12-01 Thread Richard Jones
Still happening with latest upstream kernel.  It seems to involve using
the -initrd option at all, with any cpio file, even a tiny one.  More
results posted here:

https://lists.cs.columbia.edu/pipermail/kvmarm/2014-December/012557.html

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1383857

Title:
  aarch64: virtio disks don't show up in guest (neither blk nor scsi)

Status in QEMU:
  New

Bug description:
  kernel-3.18.0-0.rc1.git0.1.rwmj5.fc22.aarch64 (3.18 rc1 + some hardware 
enablement)
  qemu from git today

  When I create a guest with virtio-scsi disks, they don't show up inside the 
guest.
  Literally after the virtio_mmio.ko and virtio_scsi.ko modules are loaded, 
there are
  no messages about disks, and of course nothing else works.

  Really long command line (generated by libvirt):

  HOME=/home/rjones USER=rjones LOGNAME=rjones QEMU_AUDIO_DRV=none
  TMPDIR=/home/rjones/d/libguestfs/tmp
  /home/rjones/d/qemu/aarch64-softmmu/qemu-system-aarch64 -name guestfs-
  oqv29um3jp03kpjf -S -machine virt,accel=tcg,usb=off -cpu cortex-a57 -m
  500 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid
  a5f1a15d-2bc7-46df-9974-1d1f643b2449 -nographic -no-user-config
  -nodefaults -chardev
  socket,id=charmonitor,path=/home/rjones/.config/libvirt/qemu/lib
  /guestfs-oqv29um3jp03kpjf.monitor,server,nowait -mon
  chardev=charmonitor,id=monitor,mode=control -rtc
  base=utc,driftfix=slew -no-reboot -boot strict=on -kernel
  /home/rjones/d/libguestfs/tmp/.guestfs-1000/appliance.d/kernel -initrd
  /home/rjones/d/libguestfs/tmp/.guestfs-1000/appliance.d/initrd -append
  panic=1 console=ttyAMA0 earlyprintk=pl011,0x900 ignore_loglevel
  efi-rtc=noprobe udevtimeout=6000 udev.event-timeout=6000
  no_timer_check lpj=50 acpi=off printk.time=1 cgroup_disable=memory
  root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm-256color -device
  virtio-scsi-device,id=scsi0 -device virtio-serial-device,id=virtio-
  serial0 -usb -drive
  file=/home/rjones/d/libguestfs/tmp/libguestfs4GxfQ9/scratch.1,if=none,id
  =drive-scsi0-0-0-0,format=raw,cache=unsafe -device scsi-
  hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-
  scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -drive
  file=/home/rjones/d/libguestfs/tmp/libguestfs4GxfQ9/overlay2,if=none,id
  =drive-scsi0-0-1-0,format=qcow2,cache=unsafe -device scsi-
  hd,bus=scsi0.0,channel=0,scsi-id=1,lun=0,drive=drive-
  scsi0-0-1-0,id=scsi0-0-1-0 -serial
  unix:/home/rjones/d/libguestfs/tmp/libguestfs4GxfQ9/console.sock
  -chardev
  
socket,id=charchannel0,path=/home/rjones/d/libguestfs/tmp/libguestfs4GxfQ9/guestfsd.sock
  -device virtserialport,bus=virtio-
  serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.libguestfs.channel.0
  -msg timestamp=on

  There are no kernel messages about the disks, they just are not seen.

  Worked with kernel 3.16 so I suspect this could be a kernel bug rather than a
  qemu bug, but I've no idea where to report those.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1383857/+subscriptions



Re: [Qemu-devel] [PATCH] mips: Fix the 64-bit case for microMIPS MOVE16 and MOVEP

2014-12-01 Thread Maciej W. Rozycki
On Mon, 24 Nov 2014, Leon Alrae wrote:

> All the patches up to this one have been applied to mips-next branch
> (available at git://github.com/lalrae/qemu.git), thanks. I'll go through
> the remaining soon.

 Thanks.  I am now back from a week's vacation and will continue posting 
outstanding changes.

 There will be changes made to generic code, specifically soft-float, 
and consequently other platforms' code, to suit the MIPS implementation 
of IEEE 754-2008 NaN handling recommendation.  Regrettably I see the 
delivery of the 2.2 release has been delayed and I do hope the new 
estimate of Dec 5th will stand so that the changes can make their way to 
trunk in a timely manner.

 I have some other stuff beyond 2008-NaN support as well, but I'll be 
giving the latter a priority as I know you look forward to seeing it and 
the rest is valuable, but a bit less important (and furthermore it 
relies on some changes to ABI configuration that we may need to discuss 
before we find a satisfactory solution).

  Maciej



Re: [Qemu-devel] [for-2.2] Re: [PATCH] vhost: Fix vhostfd leak in error branch

2014-12-01 Thread Peter Maydell
On 1 December 2014 at 09:37, Michael S. Tsirkin  wrote:
> On Fri, Nov 28, 2014 at 05:26:29PM +0800, arei.gong...@huawei.com wrote:
>> From: Gonglei 
>>
>> Signed-off-by: Gonglei 
>
> Peter, could you pick this up for 2.2 please?
>
> Reviewed-by: Michael S. Tsirkin 

Applied, thanks.

-- PMM



Re: [Qemu-devel] [Xen-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

2014-12-01 Thread Don Slutz

On 11/27/14 05:48, Stefano Stabellini wrote:

On Wed, 26 Nov 2014, Don Slutz wrote:

On 11/26/14 13:17, Stefano Stabellini wrote:

On Tue, 25 Nov 2014, Andrew Cooper wrote:

On 25/11/14 17:45, Stefano Stabellini wrote:

Increase maxmem before calling xc_domain_populate_physmap_exact to avoid
the risk of running out of guest memory. This way we can also avoid
complex memory calculations in libxl at domain construction time.

This patch fixes an abort() when assigning more than 4 NICs to a VM.

Signed-off-by: Stefano Stabellini 

diff --git a/xen-hvm.c b/xen-hvm.c
index 5c69a8d..38e08c3 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -218,6 +218,7 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t
size, MemoryRegion *mr)
   unsigned long nr_pfn;
   xen_pfn_t *pfn_list;
   int i;
+xc_dominfo_t info;
 if (runstate_check(RUN_STATE_INMIGRATE)) {
   /* RAM already populated in Xen */
@@ -240,6 +241,13 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t
size, MemoryRegion *mr)
   pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;
   }
   +if (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) < 0) {

xc_domain_getinfo()'s interface is mad, and provides no guarantee that
it returns the information for the domain you requested.  It also won't
return -1 on error.  The correct error handing is:

(xc_domain_getinfo(xen_xc, xen_domid, 1, &info) != 1) || (info.domid !=
xen_domid)

It might be wiser to switch to xc_domain_getinfolist

Either needs the same tests, since both return a vector of info.

Right



~Andrew


+hw_error("xc_domain_getinfo failed");
+}
+if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
+(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {

There are two big issues and 1 minor one with this.
1) You will allocate the videoram again.
2) You will never use the 1 MB already allocated for option ROMs.

And the minor one is that you can increase maxmem more than is needed.

I don't understand: are you aware that setmaxmem doesn't allocate any
memory, just raises the maximum amount of memory allowed for the domain
to have?


Yes.


But you are right that we would raise the limit more than it could be,
specifically the videoram would get accounted for twice and we wouldn't
need LIBXL_MAXMEM_CONSTANT. I guess we would have to write a patch for
that.




Here is a better if:

-if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
-(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
+max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
+free_pages = max_pages - info.nr_pages;
+need_pages = nr_pfn - free_pages;
+if ((free_pages < nr_pfn) &&
+   (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
+(need_pages * XC_PAGE_SIZE / 1024)) < 0)) {
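
The arithmetic in the proposed check can be isolated into a pure helper and sanity-checked on its own. A minimal sketch under the thread's units (max_memkb in KiB, page counts in pages); the standalone helper, its name, and its signature are illustrative, not part of the patch:

```c
#include <assert.h>

/* Sketch of the "raise maxmem only when needed" computation proposed
 * above. Returns the new max_memkb value to pass to
 * xc_domain_setmaxmem(), or 0 if the current limit already leaves room
 * for nr_pfn more pages. */
static unsigned long maxmem_needed_kb(unsigned long max_memkb,
                                      unsigned long nr_pages_allocated,
                                      unsigned long nr_pfn,
                                      unsigned long page_size)
{
    unsigned long max_pages  = max_memkb * 1024 / page_size;
    unsigned long free_pages = max_pages - nr_pages_allocated;

    if (free_pages >= nr_pfn) {
        return 0;               /* enough headroom, don't touch maxmem */
    }
    /* only grow by the shortfall, not by the full allocation size */
    return max_memkb + (nr_pfn - free_pages) * page_size / 1024;
}
```

With 4 KiB pages, a domain capped at 1024 KiB (256 pages) that has 200 pages populated and needs 100 more is short 44 pages, so the limit grows by 176 KiB to 1200 KiB.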

That's an interesting idea, but I am not sure if it is safe in all
configurations.

It could make QEMU work better with older libxl and avoid increasing
maxmem more than necessary.
On the other hand I guess it could break things when PoD is used, or in
general when the user purposely sets maxmem on the vm config file.



Works fine in both claim modes and with PoD used (maxmem > memory).  Do
not know how to test with tmem.  I do not see how it would be worse than
the current code that does not auto increase.  I.e. even without a xen
change, I think something like this could be done.



My testing shows 32 free pages that I am not sure where they come from.  But
the code above is passing my 8 NICs of e1000.

I think that raising maxmem a bit higher than necessary is not too bad.
If we really care about it, we could lower the limit after QEMU's
initialization is completed.


Ok.  I did find the 32: it is VGA_HOLE_SIZE.  So here is what I have,
which includes a lot of extra printf.


--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -67,6 +67,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t 
*shared_page, int vcpu)

 #endif

 #define BUFFER_IO_MAX_DELAY  100
+#define VGA_HOLE_SIZE (0x20)

 typedef struct XenPhysmap {
 hwaddr start_addr;
@@ -219,6 +220,11 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t 
size, MemoryRegion *mr)

 xen_pfn_t *pfn_list;
 int i;
 xc_dominfo_t info;
+unsigned long max_pages, free_pages, real_free;
+long need_pages;
+uint64_t tot_pages, pod_cache_pages, pod_entries;
+
+trace_xen_ram_alloc(ram_addr, size, mr->name);

 if (runstate_check(RUN_STATE_INMIGRATE)) {
 /* RAM already populated in Xen */
@@ -232,13 +238,6 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t 
size, MemoryRegion *mr)

 return;
 }

-fprintf(stderr, "%s: alloc "RAM_ADDR_FMT
-" bytes (%ld Kib) of ram at "RAM_ADDR_FMT
-" mr.name=%s\n",
-__func__, size, (long)(size>>10), ram_addr, mr->name);
-
-trace_xen_ram_alloc(ram_addr, size);
-
 nr_pfn = size >> TARGET_PAGE_BITS;
 pfn_list = g_malloc(sizeof (*pfn_list) * nr_pfn);

@

Re: [Qemu-devel] [RFC 1/6] bitmap: add atomic set functions

2014-12-01 Thread Stefan Hajnoczi
On Thu, Nov 27, 2014 at 05:42:41PM +0100, Paolo Bonzini wrote:
> 
> 
> On 27/11/2014 13:29, Stefan Hajnoczi wrote:
> > +void bitmap_set_atomic(unsigned long *map, long start, long nr)
> > +{
> > +unsigned long *p = map + BIT_WORD(start);
> > +const long size = start + nr;
> > +int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
> > +unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
> > +
> > +while (nr - bits_to_set >= 0) {
> > +atomic_or(p, mask_to_set);
> 
> atomic_or is unnecessary while mask_to_set is ~0UL.  I think not even a
> smp_wmb() is necessary.

Okay, I can split this into the ~0UL case and the atomic case.
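
That split could look like the following sketch, built on the gcc `__atomic` builtins; the bitmap mask macros are reproduced in kernel style so the snippet is self-contained, and the function name is illustrative rather than QEMU's final code. For full words, OR-ing with an all-ones mask yields ~0UL regardless of the previous contents, so a plain store replaces the atomic RMW; the partial head and tail words keep the atomic OR.

```c
#include <limits.h>

#define BITS_PER_LONG   (CHAR_BIT * sizeof(unsigned long))
#define BIT_WORD(nr)    ((nr) / BITS_PER_LONG)
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits)  (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

void bitmap_set_atomic_sketch(unsigned long *map, long start, long nr)
{
    unsigned long *p = map + BIT_WORD(start);
    const long size = start + nr;
    int bits_to_set = BITS_PER_LONG - (start & (BITS_PER_LONG - 1));
    unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);

    /* Partial first word: other bits in the word may be updated
     * concurrently, so OR atomically. */
    if (nr - bits_to_set >= 0 && mask_to_set != ~0UL) {
        __atomic_fetch_or(p, mask_to_set, __ATOMIC_SEQ_CST);
        nr -= bits_to_set;
        bits_to_set = BITS_PER_LONG;
        mask_to_set = ~0UL;
        p++;
    }

    /* Full words: OR with ~0UL always produces ~0UL, so a plain
     * store is equivalent and cheaper than an atomic RMW. */
    while (nr - bits_to_set >= 0) {
        __atomic_store_n(p, ~0UL, __ATOMIC_RELAXED);
        nr -= bits_to_set;
        p++;
    }

    /* Partial last word. */
    if (nr) {
        mask_to_set &= BITMAP_LAST_WORD_MASK(size);
        __atomic_fetch_or(p, mask_to_set, __ATOMIC_SEQ_CST);
    }
}
```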




[Qemu-devel] xorg crash on Fedora 21 xen hvm domUs with qxl (log with backtrace included)

2014-12-01 Thread Fabio Fantoni
On my latest test with xen 4.5 and qemu 2.2 (from git), the linux hvm domU
(Fedora 21) no longer crashed without any useful information in the log;
only xorg crashed on start:



[20.653] (EE)
[20.653] (EE) Backtrace:
[20.668] (EE) 0: /usr/libexec/Xorg.bin (OsLookupColor+0x119) 
[0x59bf79]

[20.669] (EE) 1: /lib64/libc.so.6 (__restore_rt+0x0) [0x7fbba151a94f]
[20.669] (EE) 2: /lib64/libpixman-1.so.0 
(_pixman_internal_only_get_implementation+0x2ce2b) [0x7fbba26d249b]
[20.670] (EE) 3: /lib64/libpixman-1.so.0 
(_pixman_internal_only_get_implementation+0x2cf79) [0x7fbba26d2899]
[20.670] (EE) 4: /lib64/libpixman-1.so.0 
(pixman_image_composite32+0x451) [0x7fbba261d711]
[20.670] (EE) 5: /usr/lib64/xorg/modules/drivers/qxl_drv.so 
(_init+0x47d0) [0x7fbb9c687d20]
[20.670] (EE) 6: /usr/lib64/xorg/modules/drivers/qxl_drv.so 
(_init+0x48df) [0x7fbb9c687e9f]
[20.671] (EE) 7: /usr/lib64/xorg/modules/drivers/qxl_drv.so 
(_init+0xfa3a) [0x7fbb9c69e1aa]
[20.671] (EE) 8: /usr/lib64/xorg/modules/drivers/qxl_drv.so 
(_init+0x1937d) [0x7fbb9c6b142d]
[20.671] (EE) 9: /usr/lib64/xorg/modules/drivers/qxl_drv.so 
(_init+0x12c47) [0x7fbb9c6a4587]
[20.671] (EE) 10: /usr/libexec/Xorg.bin 
(DamageRegionAppend+0x18b5) [0x5212e5]
[20.671] (EE) 11: /usr/libexec/Xorg.bin (miPaintWindow+0x1f6) 
[0x579c16]
[20.671] (EE) 12: /usr/libexec/Xorg.bin (miWindowExposures+0x18f) 
[0x57a4ef]
[20.672] (EE) 13: /usr/libexec/Xorg.bin 
(miHandleValidateExposures+0x68) [0x590108]

[20.672] (EE) 14: /usr/libexec/Xorg.bin (MapWindow+0x18a) [0x4656fa]
[20.672] (EE) 15: /usr/libexec/Xorg.bin (ProcBadRequest+0x5f5) 
[0x433d85]
[20.672] (EE) 16: /usr/libexec/Xorg.bin (SendErrorToClient+0x2f7) 
[0x439027]
[20.672] (EE) 17: /usr/libexec/Xorg.bin (remove_fs_handlers+0x416) 
[0x43d186]
[20.673] (EE) 18: /lib64/libc.so.6 (__libc_start_main+0xf0) 
[0x7fbba1505fe0]

[20.673] (EE) 19: /usr/libexec/Xorg.bin (_start+0x29) [0x42761e]
[20.673] (EE) 20: ? (?+0x29) [0x29]
[20.673] (EE)
[20.673] (EE) Segmentation fault at address 0x7fbb9c67a000
[20.673] (EE)
Fatal server error:
[20.673] (EE) Caught signal 11 (Segmentation fault). Server aborting
[20.673] (EE)
[20.673] (EE)


Full xorg log in attachment.
Can someone help me to solve this problem please?

If you need more informations/tests tell me and I'll post them.

Thanks for any reply and sorry for my bad english.




Re: [Qemu-devel] [RFC 2/6] bitmap: add atomic test and clear

2014-12-01 Thread Stefan Hajnoczi
On Thu, Nov 27, 2014 at 05:43:56PM +0100, Paolo Bonzini wrote:
> 
> 
> On 27/11/2014 13:29, Stefan Hajnoczi wrote:
> > +bool bitmap_test_and_clear_atomic(unsigned long *map, long start, long nr)
> > +{
> > +unsigned long *p = map + BIT_WORD(start);
> > +const long size = start + nr;
> > +int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
> > +unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
> > +unsigned long dirty = 0;
> > +unsigned long old_bits;
> > +
> > +while (nr - bits_to_clear >= 0) {
> > +old_bits = atomic_fetch_and(p, ~mask_to_clear);
> > +dirty |= old_bits & mask_to_clear;
> > +nr -= bits_to_clear;
> > +bits_to_clear = BITS_PER_LONG;
> > +mask_to_clear = ~0UL;
> > +p++;
> > +}
> > +if (nr) {
> > +mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
> > +old_bits = atomic_fetch_and(p, ~mask_to_clear);
> > +dirty |= old_bits & mask_to_clear;
> > +}
> > +
> > +return dirty;
> > +}
> 
> Same here; you can use atomic_xchg, which is faster because on x86
> atomic_fetch_and must do a compare-and-swap loop.

Will fix in v2.
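
For reference, the xchg variant could be sketched as below with the gcc `__atomic` builtins (mask macros reproduced kernel-style for self-containment; names are illustrative, not QEMU's final code): full words are cleared with a single exchange against 0, which also hands back the old contents, while the partial head and tail words keep the fetch-and.

```c
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_LONG   (CHAR_BIT * sizeof(unsigned long))
#define BIT_WORD(nr)    ((nr) / BITS_PER_LONG)
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits)  (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

bool bitmap_test_and_clear_atomic_sketch(unsigned long *map, long start,
                                         long nr)
{
    unsigned long *p = map + BIT_WORD(start);
    const long size = start + nr;
    int bits_to_clear = BITS_PER_LONG - (start & (BITS_PER_LONG - 1));
    unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
    unsigned long dirty = 0;

    /* Partial first word: must preserve the other bits, so use an
     * atomic fetch-and (a compare-and-swap loop on x86). */
    if (nr - bits_to_clear >= 0 && mask_to_clear != ~0UL) {
        dirty |= __atomic_fetch_and(p, ~mask_to_clear, __ATOMIC_SEQ_CST)
                 & mask_to_clear;
        nr -= bits_to_clear;
        bits_to_clear = BITS_PER_LONG;
        mask_to_clear = ~0UL;
        p++;
    }

    /* Full words: one xchg clears the word and returns the old value,
     * avoiding the compare-and-swap loop entirely. */
    while (nr - bits_to_clear >= 0) {
        dirty |= __atomic_exchange_n(p, 0UL, __ATOMIC_SEQ_CST);
        nr -= bits_to_clear;
        p++;
    }

    /* Partial last word. */
    if (nr) {
        mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
        dirty |= __atomic_fetch_and(p, ~mask_to_clear, __ATOMIC_SEQ_CST)
                 & mask_to_clear;
    }
    return dirty != 0;
}
```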




Re: [Qemu-devel] Announcing QEMU Advent Calendar 2014

2014-12-01 Thread François Revol
On 24/11/2014 17:15, Stefan Hajnoczi wrote:
> The QEMU Advent Calendar is launching on December 1st 2014:
> 
> http://www.qemu-advent-calendar.org/
> 
> Each day until Christmas (or until we run out) a new QEMU disk image
> will be posted for you to enjoy.
> 
> The disk images showcase interesting guest operating systems, demos,
> and software that runs under QEMU.
> 
> Want to contribute a disk image?  Send a description, QEMU
> command-line, and the disk image (or download link) to
> stefa...@gmail.com.  Disk images must be freely redistributable.

I can probably propose a Haiku nightly :)


Btw, that would be a good opportunity to relaunch the Free Live OS Zoo:
http://floz.sourceforge.net/

François.



Re: [Qemu-devel] [RFC 4/6] migration: move dirty bitmap sync to ram_addr.h

2014-12-01 Thread Stefan Hajnoczi
On Thu, Nov 27, 2014 at 04:29:06PM +, Dr. David Alan Gilbert wrote:
> * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > The dirty memory bitmap is managed by ram_addr.h and copied to
> > migration_bitmap[] periodically during live migration.
> > 
> > Move the code to sync the bitmap to ram_addr.h where related code lives.
> 
> Is this sync code going to need to gain a barrier (although I'm not quite
> sure which) to ensure it's picked up all changes?

gcc makes these operations a full barrier:
https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html

Stefan




[Qemu-devel] [PATCH] arm: dtb: Align dtb to 64K because some kernels use 64K page size.

2014-12-01 Thread Richard W.M. Jones
Resolves: https://bugs.launchpad.net/qemu/+bug/1383857
Signed-off-by: Richard W.M. Jones 
---
 hw/arm/boot.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 0014c34..a859922 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -632,11 +632,11 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
*info)
  */
 if (have_dtb(info)) {
 /* Place the DTB after the initrd in memory. Note that some
- * kernels will trash anything in the 4K page the initrd
+ * kernels will trash anything in the page the initrd
  * ends in, so make sure the DTB isn't caught up in that.
  */
 hwaddr dtb_start = QEMU_ALIGN_UP(info->initrd_start + initrd_size,
- 4096);
+ 65536);
 if (load_dtb(dtb_start, info, 0) < 0) {
 exit(1);
 }
-- 
2.1.0
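
For context, QEMU's QEMU_ALIGN_UP simply rounds an address up to the next multiple of the alignment, so the change moves the DTB to the next 64 KiB boundary after the initrd rather than the next 4 KiB one. A self-contained sketch of the arithmetic (the macro body is reproduced here for illustration and may differ cosmetically from QEMU's header; the helper function is mine):

```c
#include <stdint.h>

typedef uint64_t hwaddr;

/* Round n up to the next multiple of m (illustrative reproduction of
 * QEMU's QEMU_ALIGN_UP). */
#define QEMU_ALIGN_UP(n, m) ((((n) + (m) - 1) / (m)) * (m))

/* With 65536 instead of 4096, a DTB placed after the initrd can no
 * longer share a 64K page with the initrd's tail, which a 64K-page
 * kernel may trash. */
static hwaddr dtb_start_after_initrd(hwaddr initrd_start,
                                     uint64_t initrd_size, uint64_t align)
{
    return QEMU_ALIGN_UP(initrd_start + initrd_size, align);
}
```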




Re: [Qemu-devel] Help: Convert HDD to QCOW2 img

2014-12-01 Thread Halsey
Hi Stefan, 

Thanks for all the info you provided.

Okay, I would keep qemu-devel mailing list in the loop, no problem.

Currently, I have finished coding the wrapper and am now testing
bdrv_read/write on the qcow2 img. I will look into these libraries and
incorporate them based on the interface design.

Thanks.

Halsey

Sent from my iPhone

> On 2014年12月1日, at 17:52, Stefan Hajnoczi  wrote:
> 
> On Mon, Dec 1, 2014 at 9:40 AM, Halsey Pian  wrote:
> 
> Please keep qemu-devel@nongnu.org CCed so the discussion stays on the
> mailing list.  I have added it back.
> 
>> Hi Stefan, not know if there is similar module, currently I have not seen 
>> it. If yes, please forgive me.  And for the program if it
>> is unique,  there should be some policies for involving QEMU team, right? 
>> Thanks.
> 
> QEMU does not have something directly equivalent to VMware's SDK for storage.
> 
> But there is a very powerful API called libguestfs.  Maybe it does
> what you want:
> http://libguestfs.org/
> 
> libvirt has APIs for snapshotting and managing storage:
> http://libvirt.org/html/libvirt-libvirt-storage.html
> 
> QEMU's qemu-img supports JSON output to make it easy to parse.
> qemu-nbd can be used for read-write access.
> 
> There was an attempt to create something called libqblock but the work
> was never completed.  I guess your approach is similar:
> https://lists.gnu.org/archive/html/qemu-devel/2013-02/msg02356.html
> 
> Stefan



Re: [Qemu-devel] Announcing QEMU Advent Calendar 2014

2014-12-01 Thread Stefan Hajnoczi
On Mon, Dec 1, 2014 at 1:57 PM, François Revol  wrote:
> On 24/11/2014 17:15, Stefan Hajnoczi wrote:
>> The QEMU Advent Calendar is launching on December 1st 2014:
>>
>> http://www.qemu-advent-calendar.org/
>>
>> Each day until Christmas (or until we run out) a new QEMU disk image
>> will be posted for you to enjoy.
>>
>> The disk images showcase interesting guest operating systems, demos,
>> and software that runs under QEMU.
>>
>> Want to contribute a disk image?  Send a description, QEMU
>> command-line, and the disk image (or download link) to
>> stefa...@gmail.com.  Disk images must be freely redistributable.
>
> I can probably propose a Haiku nightly :)

Fantastic, thanks!

> Btw, that would be a good opportunity to relaunch the Free Live OS Zoo:
> http://floz.sourceforge.net/

How about taking the images from the advent calendar into the Free
Live OS Zoo after 24th December?

Stefan



Re: [Qemu-devel] [RFC 4/6] migration: move dirty bitmap sync to ram_addr.h

2014-12-01 Thread Dr. David Alan Gilbert
* Stefan Hajnoczi (stefa...@redhat.com) wrote:
> On Thu, Nov 27, 2014 at 04:29:06PM +, Dr. David Alan Gilbert wrote:
> > * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > > The dirty memory bitmap is managed by ram_addr.h and copied to
> > > migration_bitmap[] periodically during live migration.
> > > 
> > > Move the code to sync the bitmap to ram_addr.h where related code lives.
> > 
> > Is this sync code going to need to gain a barrier (although I'm not quite
> > sure which) to ensure it's picked up all changes?
> 
> gcc makes these operations a full barrier:
> https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html

Ah yes; actually our docs/atomics.txt is a good reference - all
the operations you're using are in the sequentially consistent
half of that doc.

Dave

> 
> Stefan


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [Bug 1383857] Re: aarch64: virtio disks don't show up in guest (neither blk nor scsi)

2014-12-01 Thread Richard Jones
Finally found the problem, patch posted:
https://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00034.html

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1383857




Re: [Qemu-devel] [PATCH] arm: dtb: Align dtb to 64K because some kernels use 64K page size.

2014-12-01 Thread Peter Maydell
On 1 December 2014 at 14:13, Richard W.M. Jones  wrote:
> Resolves: https://bugs.launchpad.net/qemu/+bug/1383857
> Signed-off-by: Richard W.M. Jones 

This is really a kernel bug.
https://www.kernel.org/doc/Documentation/arm64/booting.txt
just says "The device tree blob (dtb) must be placed on an
8-byte boundary within the first 512 megabytes from the start
of the kernel image and must not cross a 2-megabyte boundary."

and doesn't make any requirements about it not overlapping
a page with the initrd.

> ---
>  hw/arm/boot.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index 0014c34..a859922 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -632,11 +632,11 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
> *info)
>   */
>  if (have_dtb(info)) {
>  /* Place the DTB after the initrd in memory. Note that some
> - * kernels will trash anything in the 4K page the initrd
> + * kernels will trash anything in the page the initrd
>   * ends in, so make sure the DTB isn't caught up in that.
>   */
>  hwaddr dtb_start = QEMU_ALIGN_UP(info->initrd_start + 
> initrd_size,
> - 4096);
> + 65536);

I'd rather we didn't do this unconditionally, because for some of
our board models 64K is a significant proportion of their total RAM.

-- PMM



Re: [Qemu-devel] [Xen-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

2014-12-01 Thread Stefano Stabellini
On Mon, 1 Dec 2014, Don Slutz wrote:
> On 11/27/14 05:48, Stefano Stabellini wrote:
> > On Wed, 26 Nov 2014, Don Slutz wrote:
> > > On 11/26/14 13:17, Stefano Stabellini wrote:
> > > > On Tue, 25 Nov 2014, Andrew Cooper wrote:
> > > > > On 25/11/14 17:45, Stefano Stabellini wrote:
> > > > > > Increase maxmem before calling xc_domain_populate_physmap_exact to
> > > > > > avoid
> > > > > > the risk of running out of guest memory. This way we can also avoid
> > > > > > complex memory calculations in libxl at domain construction time.
> > > > > > 
> > > > > > This patch fixes an abort() when assigning more than 4 NICs to a VM.
> > > > > > 
> > > > > > Signed-off-by: Stefano Stabellini 
> > > > > > 
> > > > > > diff --git a/xen-hvm.c b/xen-hvm.c
> > > > > > index 5c69a8d..38e08c3 100644
> > > > > > --- a/xen-hvm.c
> > > > > > +++ b/xen-hvm.c
> > > > > > @@ -218,6 +218,7 @@ void xen_ram_alloc(ram_addr_t ram_addr,
> > > > > > ram_addr_t
> > > > > > size, MemoryRegion *mr)
> > > > > >unsigned long nr_pfn;
> > > > > >xen_pfn_t *pfn_list;
> > > > > >int i;
> > > > > > +xc_dominfo_t info;
> > > > > >  if (runstate_check(RUN_STATE_INMIGRATE)) {
> > > > > >/* RAM already populated in Xen */
> > > > > > @@ -240,6 +241,13 @@ void xen_ram_alloc(ram_addr_t ram_addr,
> > > > > > ram_addr_t
> > > > > > size, MemoryRegion *mr)
> > > > > >pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;
> > > > > >}
> > > > > >+if (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) < 0) {
> > > > > xc_domain_getinfo()'s interface is mad, and provides no guarantee that
> > > > > it returns the information for the domain you requested.  It also
> > > > > won't
> > > > > return -1 on error.  The correct error handing is:
> > > > > 
> > > > > (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) != 1) || (info.domid
> > > > > !=
> > > > > xen_domid)
> > > > It might be wiser to switch to xc_domain_getinfolist
> > > Either needs the same tests, since both return a vector of info.
> > Right
> > 
> > 
> > > > > ~Andrew
> > > > > 
> > > > > > +hw_error("xc_domain_getinfo failed");
> > > > > > +}
> > > > > > +if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> > > > > > +(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
> > > There are two big issues and 1 minor one with this.
> > > 1) You will allocate the videoram again.
> > > 2) You will never use the 1 MB already allocated for option ROMs.
> > > 
> > > And the minor one is that you can increase maxmem more than is needed.
> > I don't understand: are you aware that setmaxmem doesn't allocate any
> > memory, just raises the maximum amount of memory allowed for the domain
> > to have?
> 
> Yes.
> 
> > But you are right that we would raise the limit more than it could be,
> > specifically the videoram would get accounted for twice and we wouldn't
> > need LIBXL_MAXMEM_CONSTANT. I guess we would have to write a patch for
> > that.
> > 
> > 
> > 
> > > Here is a better if:
> > > 
> > > -if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> > > -(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
> > > +max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
> > > +free_pages = max_pages - info.nr_pages;
> > > +need_pages = nr_pfn - free_pages;
> > > +if ((free_pages < nr_pfn) &&
> > > +   (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> > > +(need_pages * XC_PAGE_SIZE / 1024)) < 0)) {
> > That's an interesting idea, but I am not sure if it is safe in all
> > configurations.
> > 
> > It could make QEMU work better with older libxl and avoid increasing
> > maxmem more than necessary.
> > On the other hand I guess it could break things when PoD is used, or in
> > general when the user purposely sets maxmem on the vm config file.
> > 
> 
> Works fine in both claim modes and with PoD used (maxmem > memory).  Do
> not know how to test with tmem.  I do not see how it would be worse than
> the current code that does not auto increase.  I.e. even without a xen
> change, I think something like this could be done.

OK, good to know. I am OK with increasing maxmem only if it is strictly
necessary.


> 
> > > My testing shows 32 free pages that I am not sure where they come from.
> > > But the code above is passing my 8 NICs of e1000.
> > I think that raising maxmem a bit higher than necessary is not too bad.
> > If we really care about it, we could lower the limit after QEMU's
> > initialization is completed.
> 
> Ok.  I did find the 32: it is VGA_HOLE_SIZE.  So here is what I have,
> which includes a lot of extra printf.

In QEMU I would prefer not to assume that libxl increased maxmem for the
vga hole. I would rather call xc_domain_setmaxmem twice for the vga hole
than tie QEMU to a particular maxmem allocation scheme in libxl.

In libxl I would like to avoid increasing maxmem for anything QEMU will
allocate later, that

Re: [Qemu-devel] [PATCH v2 01/13] block: Make essential BlockDriver objects public

2014-12-01 Thread Eric Blake
On 11/27/2014 07:48 AM, Max Reitz wrote:
> There are some block drivers which are essential to QEMU and may not be
> removed: These are raw, file and qcow2 (as the default non-raw format).
> Make their BlockDriver objects public so they can be directly referenced
> throughout the block layer without needing to call bdrv_find_format()
> and having to deal with an error at runtime, while the real problem
> occured during linking (where raw, file or qcow2 were not linked into

s/occured/occurred/

> qemu).
> 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Max Reitz 
> ---
>  block/qcow2.c | 4 ++--
>  block/raw-posix.c | 4 ++--
>  block/raw-win32.c | 4 ++--
>  block/raw_bsd.c   | 4 ++--
>  include/block/block_int.h | 8 
>  5 files changed, 16 insertions(+), 8 deletions(-)

Reviewed-by: Eric Blake 

> +++ b/block/qcow2.c
> @@ -2847,7 +2847,7 @@ static QemuOptsList qcow2_create_opts = {
>  }
>  };
>  
> -static BlockDriver bdrv_qcow2 = {
> +BlockDriver *bdrv_qcow2 = &(BlockDriver){

Do we want any use of 'const', to avoid accidental manipulation of the
pointer and/or pointed-to contents?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [PATCH v2 02/13] block: Omit bdrv_find_format for essential drivers

2014-12-01 Thread Eric Blake
On 11/27/2014 07:48 AM, Max Reitz wrote:
> We can always assume raw, file and qcow2 being available; so do not use
> bdrv_find_format() to locate their BlockDriver objects but statically
> reference the respective objects.
> 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Max Reitz 
> ---
>  block.c | 11 ++-
>  1 file changed, 2 insertions(+), 9 deletions(-)

Reviewed-by: Eric Blake 

> @@ -1293,7 +1288,6 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int 
> flags, Error **errp)
>  /* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */
>  char *tmp_filename = g_malloc0(PATH_MAX + 1);
>  int64_t total_size;
> -BlockDriver *bdrv_qcow2;

Hmm - how hard would it be to get qemu to be clean under -Wshadow?  This
is a case where you would have had to change this hunk during patch 1 if
-Wshadow were in effect.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [RFC PATCH 0/2] Support to change VNC keyboard layout dynamically

2014-12-01 Thread Eric Blake
On 11/29/2014 03:39 AM, arei.gong...@huawei.com wrote:
> From: Gonglei 
> 
> A bonus of this feature is that supporting different
> people (in different countries) using defferent keyboard

s/defferent/different/

> to connect the same guest but not need to configure
> command line or libivrt xml file then restart guest.
> 
> Using the existing qmp command:
> 
> -> { "execute": "change",
>  "arguments": { "device": "vnc", "target": "keymap",
> "arg": "de" } }
> <- { "return": {} }

Please add a new QMP command.  The existing 'change' command is not
type-safe and therefore difficult to introspect; we should not be adding
more band-aids to it.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [RFC PATCH 2/2] vnc: add change keyboard layout interface

2014-12-01 Thread Eric Blake
On 11/29/2014 03:39 AM, arei.gong...@huawei.com wrote:
> From: Gonglei 
> 
> Example QMP command of Change VNC keyboard layout:
> 
> -> { "execute": "change",
>  "arguments": { "device": "vnc", "target": "keymap",
> "arg": "de" } }
> <- { "return": {} }

As I said in the cover letter, we should NOT be adding stuff to the
broken 'change' command, but should instead add a new command.

> 
> Signed-off-by: Gonglei 
> ---
>  qapi-schema.json |  8 +---
>  qmp.c| 17 +
>  2 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 9ffdcf8..8c02a9f 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -1552,13 +1552,15 @@
>  #
>  # @target: If @device is a block device, then this is the new filename.
>  #  If @device is 'vnc', then if the value 'password' selects the vnc
> -#  change password command.   Otherwise, this specifies a new server 
> URI
> +#  change password command, if the value 'keymap'selects the vnc 
> change

s/'keymap'selects/'keymap' selects/

> +#  keyboard layout command. Otherwise, this specifies a new server 
> URI
>  #  address to listen to for VNC connections.
>  #
> # @arg:If @device is a block device, then this is an optional format to open
> #  the device with.
> -#  If @device is 'vnc' and @target is 'password', this is the new VNC
> -#  password to set.  If this argument is an empty string, then no future
> +#  If @device is 'vnc' and if @target is 'password', this is the new VNC
> +#  password to set; if @target is 'keymap', this is the new VNC keyboard
> +#  layout to set. If this argument is an empty string, then no future
>  #  logins will be allowed.

Not discoverable.  As proposed, libvirt has no way of knowing if qemu is
new enough to support this horrible hack.  A new command has multiple
benefits: it would be discoverable ('query-commands') and type-safe
(none of this horrid overloading of special text values).

-- 
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH RFC] e1000: defer packets until BM enabled

2014-12-01 Thread Michael S. Tsirkin
Some guests seem to set BM (bus mastering) for e1000 only after
enabling RX.
If packets arrive in that window, the device is wedged.
This probably works by luck on real hardware; work around
it by making can_receive depend on BM.

Signed-off-by: Michael S. Tsirkin 
---
 hw/net/e1000.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index e33a4da..34625ac 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -923,7 +923,9 @@ e1000_can_receive(NetClientState *nc)
 E1000State *s = qemu_get_nic_opaque(nc);
 
 return (s->mac_reg[STATUS] & E1000_STATUS_LU) &&
-(s->mac_reg[RCTL] & E1000_RCTL_EN) && e1000_has_rxbufs(s, 1);
+(s->mac_reg[RCTL] & E1000_RCTL_EN) &&
+(s->parent_obj.config[PCI_COMMAND] & PCI_COMMAND_MASTER) &&
+e1000_has_rxbufs(s, 1);
 }
 
 static uint64_t rx_desc_base(E1000State *s)
@@ -1529,6 +1531,20 @@ static NetClientInfo net_e1000_info = {
 .link_status_changed = e1000_set_link_status,
 };
 
+static void e1000_write_config(PCIDevice *pci_dev, uint32_t address,
+uint32_t val, int len)
+{
+E1000State *d = E1000(dev);
+
+pci_default_write_config(pci_dev, address, val, len);
+
+if (range_covers_byte(address, len, PCI_COMMAND) &&
+(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
+qemu_flush_queued_packets(qemu_get_queue(s->nic));
+}
+}
+
+
 static int pci_e1000_init(PCIDevice *pci_dev)
 {
 DeviceState *dev = DEVICE(pci_dev);
@@ -1539,6 +1555,8 @@ static int pci_e1000_init(PCIDevice *pci_dev)
 int i;
 uint8_t *macaddr;
 
+pci_dev->config_write = e1000_write_config;
+
 pci_conf = pci_dev->config;
 
 /* TODO: RST# value should be 0, PCI spec 6.2.4 */
-- 
MST



[Qemu-devel] QEMU Advent Calendar has begun - Day 1

2014-12-01 Thread Stefan Hajnoczi
Today is the first day of QEMU Advent Calendar 2014, where an
interesting and fun QEMU disk image is published every day.

http://www.qemu-advent-calendar.org/

I won't spam the mailing list every day but I wanted to let you know
that from now until Christmas we will publish a daily image for your
enjoyment.

The first image was contributed by Gerd Hoffmann and showcases an
amazing Slackware Linux 1.0 distribution with a pre-1.0 Linux kernel
from 1993!

Want to contribute a cool disk image?  There are still days remaining
for which we need disk images.  Email me if you would like to
contribute!

Enjoy,
Stefan



Re: [Qemu-devel] [PATCH RFC] e1000: defer packets until BM enabled

2014-12-01 Thread Michael S. Tsirkin
On Mon, Dec 01, 2014 at 06:01:18PM +, Gabriel Somlo wrote:
> Hi Michael,
> 
> I had to make some small changes to get this patch to build successfully,
> see inline below:

Ouch, looks like I sent out a stale version:
git commit
build+edit
git format-patch (without git commit)

Happens to me now and again :(

Sorry.

> On Monday, December 01, 2014 11:50am, Michael S. Tsirkin [m...@redhat.com] 
> wrote:
> > 
> > Some guests seem to set BM for e1000 after
> > enabling RX.
> > If packets arrive in the window, device is wedged.
> > Probably works by luck on real hardware, work around
> > this by making can_receive depend on BM.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  hw/net/e1000.c | 20 +++-
> >  1 file changed, 19 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> > index e33a4da..34625ac 100644
> > --- a/hw/net/e1000.c
> > +++ b/hw/net/e1000.c
> > @@ -923,7 +923,9 @@ e1000_can_receive(NetClientState *nc)
> >  E1000State *s = qemu_get_nic_opaque(nc);
> > 
> >  return (s->mac_reg[STATUS] & E1000_STATUS_LU) &&
> > -(s->mac_reg[RCTL] & E1000_RCTL_EN) && e1000_has_rxbufs(s, 1);
> > +(s->mac_reg[RCTL] & E1000_RCTL_EN) &&
> > +(s->parent_obj.config[PCI_COMMAND] & PCI_COMMAND_MASTER) &&
> > +e1000_has_rxbufs(s, 1);
> >  }
> > 
> >  static uint64_t rx_desc_base(E1000State *s)
> > @@ -1529,6 +1531,20 @@ static NetClientInfo net_e1000_info = {
> >  .link_status_changed = e1000_set_link_status,
> >  };
> > 
> > +static void e1000_write_config(PCIDevice *pci_dev, uint32_t address,
> > +uint32_t val, int len)
> > +{
> > +E1000State *d = E1000(dev);
> 
> s/dev/pci_dev/
> 
> > +
> > +pci_default_write_config(pci_dev, address, val, len);
> > +
> > +if (range_covers_byte(address, len, PCI_COMMAND) &&
> 
> requires #include "qemu/range.h" at the top of e1000.c
> 
> > +(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
> > +qemu_flush_queued_packets(qemu_get_queue(s->nic));
> 
> s/s->nic/d->nic/
> 
> > +}
> > +}
> > +
> > +
> >  static int pci_e1000_init(PCIDevice *pci_dev)
> >  {
> >  DeviceState *dev = DEVICE(pci_dev);
> > @@ -1539,6 +1555,8 @@ static int pci_e1000_init(PCIDevice *pci_dev)
> >  int i;
> >  uint8_t *macaddr;
> > 
> > +pci_dev->config_write = e1000_write_config;
> > +
> >  pci_conf = pci_dev->config;
> > 
> >  /* TODO: RST# value should be 0, PCI spec 6.2.4 */
> > --
> 
> With this, I can confirm everything still works fine on both Mavericks and
> F21-beta-live. So:
> 
> Tested-by: Gabriel Somlo 
> 
> Regards,
> --Gabriel

Thanks!




[Qemu-devel] [PATCH v2] e1000: defer packets until BM enabled

2014-12-01 Thread Michael S. Tsirkin
Some guests seem to set BM (bus mastering) for e1000 only after
enabling RX.
If packets arrive in that window, the device is wedged.
This probably works by luck on real hardware; work around
it by making can_receive depend on BM.

Tested-by: Gabriel Somlo 
Signed-off-by: Michael S. Tsirkin 
---

Amos, you were the one reporting the failures; could
you please confirm this patch fixes the issues for you?

 hw/net/e1000.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index e33a4da..89c5788 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -33,6 +33,7 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/dma.h"
 #include "qemu/iov.h"
+#include "qemu/range.h"
 
 #include "e1000_regs.h"
 
@@ -923,7 +924,9 @@ e1000_can_receive(NetClientState *nc)
 E1000State *s = qemu_get_nic_opaque(nc);
 
 return (s->mac_reg[STATUS] & E1000_STATUS_LU) &&
-(s->mac_reg[RCTL] & E1000_RCTL_EN) && e1000_has_rxbufs(s, 1);
+(s->mac_reg[RCTL] & E1000_RCTL_EN) &&
+(s->parent_obj.config[PCI_COMMAND] & PCI_COMMAND_MASTER) &&
+e1000_has_rxbufs(s, 1);
 }
 
 static uint64_t rx_desc_base(E1000State *s)
@@ -1529,6 +1532,20 @@ static NetClientInfo net_e1000_info = {
 .link_status_changed = e1000_set_link_status,
 };
 
+static void e1000_write_config(PCIDevice *pci_dev, uint32_t address,
+uint32_t val, int len)
+{
+E1000State *s = E1000(pci_dev);
+
+pci_default_write_config(pci_dev, address, val, len);
+
+if (range_covers_byte(address, len, PCI_COMMAND) &&
+(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
+qemu_flush_queued_packets(qemu_get_queue(s->nic));
+}
+}
+
+
 static int pci_e1000_init(PCIDevice *pci_dev)
 {
 DeviceState *dev = DEVICE(pci_dev);
@@ -1539,6 +1556,8 @@ static int pci_e1000_init(PCIDevice *pci_dev)
 int i;
 uint8_t *macaddr;
 
+pci_dev->config_write = e1000_write_config;
+
 pci_conf = pci_dev->config;
 
 /* TODO: RST# value should be 0, PCI spec 6.2.4 */
-- 
MST
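The fix above gates e1000_can_receive() on PCI bus mastering and, in e1000_write_config(), flushes the net layer's queue once the guest sets PCI_COMMAND_MASTER. The defer-then-flush behaviour can be sketched as a toy model (all names below are invented for illustration; the real logic lives in hw/net/e1000.c and QEMU's net queue):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the e1000 RX path: packets queue up while the device
 * cannot receive, and are flushed once bus mastering is enabled. */
typedef struct {
    bool link_up;      /* E1000_STATUS_LU */
    bool rx_enabled;   /* E1000_RCTL_EN */
    bool bus_master;   /* PCI_COMMAND_MASTER */
    int queued;        /* packets deferred by the net layer */
    int delivered;     /* packets handed to the guest */
} ToyE1000;

/* Mirrors the patched e1000_can_receive(): BM is now a precondition. */
static bool toy_can_receive(const ToyE1000 *s)
{
    return s->link_up && s->rx_enabled && s->bus_master;
}

static void toy_packet_arrives(ToyE1000 *s)
{
    if (toy_can_receive(s)) {
        s->delivered++;
    } else {
        s->queued++;   /* deferred instead of wedging the device */
    }
}

/* Mirrors e1000_write_config(): flush deferred packets when BM is set. */
static void toy_write_pci_command(ToyE1000 *s, bool master)
{
    s->bus_master = master;
    if (master) {
        while (s->queued > 0 && toy_can_receive(s)) {
            s->queued--;
            s->delivered++;
        }
    }
}
```

The counters stand in for QEMU's NetQueue; the patch's qemu_flush_queued_packets() call plays the role of the flush loop in toy_write_pci_command().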



Re: [Qemu-devel] MinGW build

2014-12-01 Thread Stefan Weil
On 01.12.2014 at 11:30, Liviu Ionescu wrote:
> 
> On 28 Nov 2014, at 09:03, Stefan Weil  wrote:
> 
>> This is my build script:
>> http://qemu.weilnetz.de/results/make-installers-all.
> 
> we finally have a functional windows version, installed with a setup, as you 
> recommended.
> 
> the build procedure was fully documented at:
> 
> http://gnuarmeclipse.livius.net/wiki/How_to_build_QEMU
> 
> (as Windows cross build on Debian)
> 
> 
> the build procedure itself is moderately complex, but fixing the prerequisite
> details was nightmarish; the official wiki page is schematic, to say the least.
> 
> not to mention the note that only older versions of glib are supported, 
> hidden somewhere. perhaps an update to newer glib would be useful.
> 
> other details worth noting were related to the lack of clear separation 
> between the host build and the cross build.
> 
> the procedure to detect the presence of packages with pkg-config is great,
> but it seems not to be used consistently; for example, detecting libz is not
> done with pkg-config but with the compiler, and this required some extra
> flags to configure.
> 
> to accommodate the details of my windows setup, I also had to add a new
> QEMU_NSI_FILE variable to the Makefile, so I can redefine it externally.
> 
>  
> regards,
> 
> Liviu
> 

Hi,

thanks for your work on this documentation.

Here are some more hints which might help people to get
cross-compilation for Windows running. http://qemu.weilnetz.de/debian/
includes some packages which I made from the GTK all-in-one bundles and
also packages for libfdt which I compiled from the sources.

Regards
Stefan




Re: [Qemu-devel] [PATCH v8 02/10] qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove

2014-12-01 Thread John Snow



On 11/27/2014 04:41 AM, Max Reitz wrote:

On 2014-11-26 at 18:41, John Snow wrote:

From: Fam Zheng 

The new command pair is added to manage user created dirty bitmap. The
dirty bitmap's name is mandatory and must be unique for the same device,
but different devices can have bitmaps with the same names.

The granularity is an optional field. If it is not specified, we will
choose a default granularity based on the cluster size if available,
clamped to between 4K and 64K (To mirror how the 'mirror' code was
already choosing granularity.) If we do not have cluster size info


Maybe swap the right parenthesis and the full stop?


This is an American thing, the difference between "aesthetic 
punctuation" and "logical punctuation." (<-- aesthetic.)


http://www.slate.com/articles/life/the_good_word/2011/05/the_rise_of_logical_punctuation.html

I can make a mental note not to use the American style in the future; I 
just thought it would be fun to explain it.



available, we choose 64K. This code has been factored out into helper


Naturally you're better at English than me, but shouldn't this be "into
a helper"?


This, on the other hand, is just a typo where my brain filled in the 
missing glue for me.



shared with block/mirror.

The types added to block-core.json will be re-used in future patches
in this series, see:
'qapi: Add transaction support to block-dirty-bitmap-{add, enable,
disable}'

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
  block.c   | 19 ++
  block/mirror.c| 10 +-
  blockdev.c| 54
++
  include/block/block.h |  1 +
  qapi/block-core.json  | 55
+++
  qmp-commands.hx   | 49
+
  6 files changed, 179 insertions(+), 9 deletions(-)


Anyway, with or without these minor changes:

Reviewed-by: Max Reitz 





[Qemu-devel] Update on TCG Multithreading

2014-12-01 Thread Mark Burton

All, first a huge thanks to those who have contributed, and to those who have 
expressed an interest in helping out.

One issue I’d like to see more opinions on is the question of a cache per core 
versus a shared cache.
I have heard anecdotal evidence that a shared cache gives a major performance 
benefit. Does anybody have anything more concrete?
(Of course we will get numbers in the end if we implement the hybrid scheme 
suggested in the wiki, but I’d still appreciate any feedback.)

Our next plan is to start putting an implementation plan together. It will 
probably be quite sketchy at this point, but we hope to start coding shortly.


Cheers

Mark.







[Qemu-devel] [PATCH v9 06/10] qmp: Add block-dirty-bitmap-enable and block-dirty-bitmap-disable

2014-12-01 Thread John Snow
From: Fam Zheng 

This allows putting the dirty bitmap into a disabled state where no more
writes will be tracked.

It will be used before backup or before writing to a persistent file.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block.c   | 15 +
 blockdev.c| 61 +++
 include/block/block.h |  2 ++
 qapi/block-core.json  | 28 +++
 qmp-commands.hx   | 10 +
 5 files changed, 116 insertions(+)

diff --git a/block.c b/block.c
index 2d08b9f..85215b3 100644
--- a/block.c
+++ b/block.c
@@ -56,6 +56,7 @@ struct BdrvDirtyBitmap {
 int64_t size;
 int64_t granularity;
 char *name;
+bool enabled;
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -5396,6 +5397,7 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 bitmap->granularity = granularity;
 bitmap->bitmap = hbitmap_alloc(bitmap->size, ffs(sector_granularity) - 1);
 bitmap->name = g_strdup(name);
+bitmap->enabled = true;
 QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
 return bitmap;
 }
@@ -5414,6 +5416,16 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 }
 }
 
+void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+bitmap->enabled = false;
+}
+
+void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+bitmap->enabled = true;
+}
+
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 {
 BdrvDirtyBitmap *bm;
@@ -5482,6 +5494,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 {
 BdrvDirtyBitmap *bitmap;
 QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
+if (!bitmap->enabled) {
+continue;
+}
 hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
 }
 }
diff --git a/blockdev.c b/blockdev.c
index 4d30b09..c6b0bf1 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1173,6 +1173,41 @@ out_aio_context:
 return NULL;
 }
 
+/**
+ * Return a dirty bitmap (if present), after validating
+ * the device and bitmap names. Returns NULL on error,
+ * including when the device and/or bitmap is not found.
+ */
+static BdrvDirtyBitmap *block_dirty_bitmap_lookup(const char *device,
+  const char *name,
+  Error **errp)
+{
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+
+if (!device) {
+error_setg(errp, "Device cannot be NULL");
+return NULL;
+}
+if (!name) {
+error_setg(errp, "Bitmap name cannot be NULL");
+return NULL;
+}
+
+bs = bdrv_lookup_bs(device, NULL, errp);
+if (!bs) {
+return NULL;
+}
+
+bitmap = bdrv_find_dirty_bitmap(bs, name);
+if (!bitmap) {
+error_setg(errp, "Dirty bitmap not found: %s", name);
+return NULL;
+}
+
+return bitmap;
+}
+
 /* New and old BlockDriverState structs for atomic group operations */
 
 typedef struct BlkTransactionState BlkTransactionState;
@@ -1948,6 +1983,32 @@ void qmp_block_dirty_bitmap_remove(const char *device, const char *name,
 bdrv_release_dirty_bitmap(bs, bitmap);
 }
 
+void qmp_block_dirty_bitmap_enable(const char *device, const char *name,
+   Error **errp)
+{
+BdrvDirtyBitmap *bitmap;
+
+bitmap = block_dirty_bitmap_lookup(device, name, errp);
+if (!bitmap) {
+return;
+}
+
+bdrv_enable_dirty_bitmap(bitmap);
+}
+
+void qmp_block_dirty_bitmap_disable(const char *device, const char *name,
+Error **errp)
+{
+BdrvDirtyBitmap *bitmap;
+
+bitmap = block_dirty_bitmap_lookup(device, name, errp);
+if (!bitmap) {
+return;
+}
+
+bdrv_disable_dirty_bitmap(bitmap);
+}
+
 int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
 const char *id = qdict_get_str(qdict, "id");
diff --git a/include/block/block.h b/include/block/block.h
index 5522eba..b583457 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -442,6 +442,8 @@ BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,
 const BdrvDirtyBitmap *bitmap,
 const char *name);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
+void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
+void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs);
 int64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index da86e91..64b5755 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -949,6 +949,34 @@
   'data': 'BlockDirtyBitmap' }
 
 ##
+# @block-dirty-bitmap-enable
+#
+# Enable a dirty bitmap on the de
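The behavioural core of this patch is the new early-continue in bdrv_set_dirty(): disabled bitmaps simply stop tracking writes. A minimal standalone sketch of that gating (a simplified flat byte array stands in for QEMU's HBitmap; names here are illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SECTORS 64

/* Simplified stand-in for BdrvDirtyBitmap: a byte-per-sector map. */
typedef struct ToyBitmap {
    bool enabled;
    unsigned char dirty[TOY_SECTORS];
    struct ToyBitmap *next;
} ToyBitmap;

/* Analogue of the patched bdrv_set_dirty(): walk every bitmap attached
 * to the BDS, but skip the ones that have been disabled. */
static void toy_set_dirty(ToyBitmap *head, int sector)
{
    for (ToyBitmap *bm = head; bm != NULL; bm = bm->next) {
        if (!bm->enabled) {
            continue;   /* the early-out added by this patch */
        }
        bm->dirty[sector] = 1;
    }
}
```

In QEMU the per-sector state lives in an HBitmap and the list is bs->dirty_bitmaps; only the enabled check is new behaviour.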

[Qemu-devel] [PATCH v9 03/10] block: Introduce bdrv_dirty_bitmap_granularity()

2014-12-01 Thread John Snow
From: Fam Zheng 

This returns the granularity (in bytes) of a dirty bitmap,
which matches the QMP interface and the existing query
interface.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block.c   | 9 +++--
 include/block/block.h | 2 ++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 3f27519..a0d1150 100644
--- a/block.c
+++ b/block.c
@@ -5399,8 +5399,7 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
 BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
 info->count = bdrv_get_dirty_count(bs, bm);
-info->granularity =
-((int64_t) BDRV_SECTOR_SIZE << hbitmap_granularity(bm->bitmap));
+info->granularity = bdrv_dirty_bitmap_granularity(bs, bm);
 info->has_name = !!bm->name;
 info->name = g_strdup(bm->name);
 entry->value = info;
@@ -5439,6 +5438,12 @@ uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs)
 return granularity;
 }
 
+int64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap)
+{
+return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
+}
+
 void bdrv_dirty_iter_init(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
 {
diff --git a/include/block/block.h b/include/block/block.h
index 066ded6..f180f93 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -440,6 +440,8 @@ void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs);
+int64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap);
 int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap, int64_t sector);
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
 void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
-- 
1.9.3




[Qemu-devel] [PATCH v9 01/10] qapi: Add optional field "name" to block dirty bitmap

2014-12-01 Thread John Snow
From: Fam Zheng 

This field will be set for user-created dirty bitmaps. Also pass in an
error pointer to bdrv_create_dirty_bitmap, so when a name is already
taken on this BDS, it can report an error message. This is not a global
check; two BDSes can have dirty bitmaps with a common name.

Implement bdrv_find_dirty_bitmap to find a dirty bitmap by name; it will
be used later when other QMP commands want to reference dirty bitmaps by
name.

Add bdrv_dirty_bitmap_make_anon. This unsets the name of dirty bitmap.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block-migration.c |  2 +-
 block.c   | 32 +++-
 block/mirror.c|  2 +-
 include/block/block.h |  7 ++-
 qapi/block-core.json  |  4 +++-
 5 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 08db01a..6f3aa18 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -319,7 +319,7 @@ static int set_dirty_tracking(void)
 
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
 bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE,
-  NULL);
+  NULL, NULL);
 if (!bmds->dirty_bitmap) {
 ret = -errno;
 goto fail;
diff --git a/block.c b/block.c
index 591fbe4..e5c6ccf 100644
--- a/block.c
+++ b/block.c
@@ -53,6 +53,7 @@
 
 struct BdrvDirtyBitmap {
 HBitmap *bitmap;
+char *name;
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -5326,7 +5327,28 @@ bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov)
 return true;
 }
 
-BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs, int granularity,
+BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
+{
+BdrvDirtyBitmap *bm;
+
+assert(name);
+QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
+if (bm->name && !strcmp(name, bm->name)) {
+return bm;
+}
+}
+return NULL;
+}
+
+void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
+{
+g_free(bitmap->name);
+bitmap->name = NULL;
+}
+
+BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
+  int granularity,
+  const char *name,
   Error **errp)
 {
 int64_t bitmap_size;
@@ -5334,6 +5356,10 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs, int granularity,
 
 assert((granularity & (granularity - 1)) == 0);
 
+if (name && bdrv_find_dirty_bitmap(bs, name)) {
+error_setg(errp, "Bitmap already exists: %s", name);
+return NULL;
+}
 granularity >>= BDRV_SECTOR_BITS;
 assert(granularity);
 bitmap_size = bdrv_nb_sectors(bs);
@@ -5344,6 +5370,7 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs, int granularity,
 }
 bitmap = g_new0(BdrvDirtyBitmap, 1);
 bitmap->bitmap = hbitmap_alloc(bitmap_size, ffs(granularity) - 1);
+bitmap->name = g_strdup(name);
 QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
 return bitmap;
 }
@@ -5355,6 +5382,7 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 if (bm == bitmap) {
 QLIST_REMOVE(bitmap, list);
 hbitmap_free(bitmap->bitmap);
+g_free(bitmap->name);
 g_free(bitmap);
 return;
 }
@@ -5373,6 +5401,8 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 info->count = bdrv_get_dirty_count(bs, bm);
 info->granularity =
 ((int64_t) BDRV_SECTOR_SIZE << hbitmap_granularity(bm->bitmap));
+info->has_name = !!bm->name;
+info->name = g_strdup(bm->name);
 entry->value = info;
 *plist = entry;
 plist = &entry->next;
diff --git a/block/mirror.c b/block/mirror.c
index 2c6dd2a..858e4ff 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -699,7 +699,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
 s->granularity = granularity;
 s->buf_size = MAX(buf_size, granularity);
 
-s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, errp);
+s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
 if (!s->dirty_bitmap) {
 return;
 }
diff --git a/include/block/block.h b/include/block/block.h
index 610be9f..52fb6b2 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -430,8 +430,13 @@ bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov);
 
 struct HBitmapIter;
 typedef struct BdrvDirtyBitmap BdrvDirtyBitmap;
-BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs, int granularity,
+BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
+  int gran

[Qemu-devel] [PATCH v9 00/10] block: Incremental backup series

2014-12-01 Thread John Snow
Note: This series is now based on top of stefanha/block-next and is
intended for QEMU 2.3.

This is the in memory part of the incremental backup feature.

With the added commands, we can create a bitmap on a block backend; from that
point in time, all writes are tracked by the bitmap, marking sectors as
dirty.  Later, we call drive-backup and pass the bitmap to it, to do an
incremental backup.

See the last patch which adds some tests for this use case.

Fam

==

This is the next iteration of Fam's incremental backup feature.
I have since taken it over from him, have (slowly) worked through
the feedback on the last version of his patch, and have made many
small edits.

For convenience: https://github.com/jnsnow/qemu/commits/dbm-backup

John.

v9:
 - Edited commit message, for English embetterment (02/10)
 - Rebased on top of stefanha/block-next (06,08/10)
 - Adjusted error message and line length (07/10)

v8:
 - Changed int64_t return for bdrv_dbm_calc_def_granularity to uint64_t (2/10)
 - Updated commit message (2/10)
 - Removed redundant check for null in device parameter (2/10)
 - Removed comment cruft. (2/10)
 - Removed redundant local_err propagation (several)
 - Updated commit message (3/10)
 - Fix HBitmap copy loop index (4/10)
 - Remove redundant ternary (5/10)
 - Shift up the block_dirty_bitmap_lookup function (6/10)
 - Error messages cleanup (7/10)
 - Add an assertion to explain the re-use of .prepare() for two transactions.
   (8/10)
 - Removed BDS argument from bitmap enable/disable helper; it was unused. (8/10)

v7: (highlights)
 - First version being authored by jsnow
 - Addressed most list feedback from V6, many small changes.
   All feedback was either addressed on-list (as a wontfix) or patched.
 - Replaced all error_set with error_setg
 - Replaced all bdrv_find with bdrv_lookup_bs()
 - Adjusted the max granularity to share a common function with
   backup/mirror that attempts to "guess" a reasonable default.
   It clamps between [4K,64K] currently.
 - The BdrvDirtyBitmap object now counts granularity exclusively in
   bytes to match its interface.
   It leaves the sector granularity concerns to HBitmap.
 - Reworked the backup loop to utilize the hbitmap iterator.
   There are some extra concerns to handle arrhythmic cases where the
   granularity of the bitmap does not match the backup cluster size.
   This iteration works best when it does match, but it's not a
   deal-breaker if it doesn't -- it just gets less efficient.
 - Reworked the transactional functions so that abort() wouldn't "undo"
   a redundant command. They now have been split into a prepare and a
   commit function (with state) and do not provide an abort command.
 - Added a block_dirty_bitmap_lookup(device, name, errp) function to
   shorten a few of the commands added in this series, particularly
   qmp_enable, qmp_disable, and the transaction preparations.

v6: Re-send of v5.

v5: Rebase to master.

v4: Last version tailored by Fam Zheng.

Fam Zheng (10):
  qapi: Add optional field "name" to block dirty bitmap
  qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove
  block: Introduce bdrv_dirty_bitmap_granularity()
  hbitmap: Add hbitmap_copy
  block: Add bdrv_copy_dirty_bitmap and bdrv_reset_dirty_bitmap
  qmp: Add block-dirty-bitmap-enable and block-dirty-bitmap-disable
  qmp: Add support of "dirty-bitmap" sync mode for drive-backup
  qapi: Add transaction support to block-dirty-bitmap-{add, enable,
disable}
  qmp: Add dirty bitmap 'enabled' field in query-block
  qemu-iotests: Add tests for drive-backup sync=dirty-bitmap

 block-migration.c |   2 +-
 block.c   | 114 --
 block/backup.c| 130 +
 block/mirror.c|  16 ++--
 blockdev.c| 217 +-
 hmp.c |   4 +-
 include/block/block.h |  17 +++-
 include/block/block_int.h |   6 ++
 include/qemu/hbitmap.h|   8 ++
 qapi-schema.json  |   5 +-
 qapi/block-core.json  | 120 ++-
 qmp-commands.hx   |  66 -
 tests/qemu-iotests/056|  33 ++-
 tests/qemu-iotests/056.out|   4 +-
 tests/qemu-iotests/iotests.py |   8 ++
 util/hbitmap.c|  16 
 16 files changed, 711 insertions(+), 55 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH v9 04/10] hbitmap: Add hbitmap_copy

2014-12-01 Thread John Snow
From: Fam Zheng 

This makes a deep copy of an HBitmap.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 include/qemu/hbitmap.h |  8 
 util/hbitmap.c | 16 
 2 files changed, 24 insertions(+)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 550d7ce..b645cfc 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -65,6 +65,14 @@ struct HBitmapIter {
 HBitmap *hbitmap_alloc(uint64_t size, int granularity);
 
 /**
+ * hbitmap_copy:
+ * @bitmap: The original bitmap to copy.
+ *
+ * Copy a HBitmap.
+ */
+HBitmap *hbitmap_copy(const HBitmap *bitmap);
+
+/**
  * hbitmap_empty:
  * @hb: HBitmap to operate on.
  *
diff --git a/util/hbitmap.c b/util/hbitmap.c
index b3060e6..8aa7406 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -395,3 +395,19 @@ HBitmap *hbitmap_alloc(uint64_t size, int granularity)
 hb->levels[0][0] |= 1UL << (BITS_PER_LONG - 1);
 return hb;
 }
+
+HBitmap *hbitmap_copy(const HBitmap *bitmap)
+{
+int i;
+uint64_t size;
+HBitmap *hb = g_memdup(bitmap, sizeof(HBitmap));
+
+size = bitmap->size;
+for (i = HBITMAP_LEVELS - 1; i >= 0; i--) {
+size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
+hb->levels[i] = g_memdup(bitmap->levels[i],
+ size * sizeof(unsigned long));
+}
+
+return hb;
+}
-- 
1.9.3
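The copy in hbitmap_copy() duplicates the top-level struct first, then re-duplicates each per-level array so the clone owns its own storage. The same idea in plain C, using malloc/memcpy in place of g_memdup (a simplified fixed-level structure, not QEMU's actual HBitmap):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_LEVELS 2

typedef struct {
    size_t len[TOY_LEVELS];            /* words per level */
    unsigned long *levels[TOY_LEVELS];
} ToyBitmap;

/* Plain-C stand-in for g_memdup(): allocate and copy n bytes. */
static void *memdup_n(const void *src, size_t n)
{
    void *dst = malloc(n);
    memcpy(dst, src, n);
    return dst;
}

/* Same shape as hbitmap_copy(): shallow-copy the struct, then replace
 * each level pointer with a freshly duplicated array. */
static ToyBitmap *toy_copy(const ToyBitmap *orig)
{
    ToyBitmap *copy = memdup_n(orig, sizeof(*orig));
    for (int i = 0; i < TOY_LEVELS; i++) {
        copy->levels[i] = memdup_n(orig->levels[i],
                                   orig->len[i] * sizeof(unsigned long));
    }
    return copy;
}
```

As in hbitmap_copy(), the order matters: the initial struct copy supplies the sizes before the per-level arrays are duplicated, and afterwards the copy shares no storage with the original.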




[Qemu-devel] [PATCH v9 02/10] qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove

2014-12-01 Thread John Snow
From: Fam Zheng 

The new command pair is added to manage user-created dirty bitmaps. The
dirty bitmap's name is mandatory and must be unique for the same device,
but different devices can have bitmaps with the same names.

The granularity is an optional field. If it is not specified, we will
choose a default granularity based on the cluster size if available,
clamped to between 4K and 64K to mirror how the 'mirror' code was
already choosing granularity. If we do not have cluster size info
available, we choose 64K. This code has been factored out into a helper
shared with block/mirror.

The types added to block-core.json will be re-used in future patches
in this series, see:
'qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}'

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block.c   | 19 ++
 block/mirror.c| 10 +-
 blockdev.c| 54 ++
 include/block/block.h |  1 +
 qapi/block-core.json  | 55 +++
 qmp-commands.hx   | 49 +
 6 files changed, 179 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index e5c6ccf..3f27519 100644
--- a/block.c
+++ b/block.c
@@ -5420,6 +5420,25 @@ int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap, int64_t sector
 }
 }
 
+#define BDB_MIN_DEF_GRANULARITY 4096
+#define BDB_MAX_DEF_GRANULARITY 65536
+#define BDB_DEFAULT_GRANULARITY BDB_MAX_DEF_GRANULARITY
+
+uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs)
+{
+BlockDriverInfo bdi;
+uint64_t granularity;
+
+if (bdrv_get_info(bs, &bdi) >= 0 && bdi.cluster_size != 0) {
+granularity = MAX(BDB_MIN_DEF_GRANULARITY, bdi.cluster_size);
+granularity = MIN(BDB_MAX_DEF_GRANULARITY, granularity);
+} else {
+granularity = BDB_DEFAULT_GRANULARITY;
+}
+
+return granularity;
+}
+
 void bdrv_dirty_iter_init(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
 {
diff --git a/block/mirror.c b/block/mirror.c
index 858e4ff..3633632 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -664,15 +664,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
 MirrorBlockJob *s;
 
 if (granularity == 0) {
-/* Choose the default granularity based on the target file's cluster
- * size, clamped between 4k and 64k.  */
-BlockDriverInfo bdi;
-if (bdrv_get_info(target, &bdi) >= 0 && bdi.cluster_size != 0) {
-granularity = MAX(4096, bdi.cluster_size);
-granularity = MIN(65536, granularity);
-} else {
-granularity = 65536;
-}
+granularity = bdrv_dbm_calc_def_granularity(target);
 }
 
 assert ((granularity & (granularity - 1)) == 0);
diff --git a/blockdev.c b/blockdev.c
index 5651a8e..4d30b09 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1894,6 +1894,60 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
 aio_context_release(aio_context);
 }
 
+void qmp_block_dirty_bitmap_add(const char *device, const char *name,
+bool has_granularity, int64_t granularity,
+Error **errp)
+{
+BlockDriverState *bs;
+
+bs = bdrv_lookup_bs(device, NULL, errp);
+if (!bs) {
+return;
+}
+
+if (!name || name[0] == '\0') {
+error_setg(errp, "Bitmap name cannot be empty");
+return;
+}
+if (has_granularity) {
+if (granularity < 512 || !is_power_of_2(granularity)) {
+error_setg(errp, "Granularity must be power of 2 "
+ "and at least 512");
+return;
+}
+} else {
+/* Default to cluster size, if available: */
+granularity = bdrv_dbm_calc_def_granularity(bs);
+}
+
+bdrv_create_dirty_bitmap(bs, granularity, name, errp);
+}
+
+void qmp_block_dirty_bitmap_remove(const char *device, const char *name,
+   Error **errp)
+{
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+
+bs = bdrv_lookup_bs(device, NULL, errp);
+if (!bs) {
+return;
+}
+
+if (!name || name[0] == '\0') {
+error_setg(errp, "Bitmap name cannot be empty");
+return;
+}
+bitmap = bdrv_find_dirty_bitmap(bs, name);
+if (!bitmap) {
+error_setg(errp, "Dirty bitmap not found: %s", name);
+return;
+}
+
+bdrv_dirty_bitmap_make_anon(bs, bitmap);
+bdrv_release_dirty_bitmap(bs, bitmap);
+}
+
 int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
 const char *id = qdict_get_str(qdict, "id");
diff --git a/include/block/block.h b/include/block/block.h
index 52fb6b2..066ded6 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -439,6 +439,7 @@ BdrvDirtyBit

[Qemu-devel] [PATCH v9 09/10] qmp: Add dirty bitmap 'enabled' field in query-block

2014-12-01 Thread John Snow
From: Fam Zheng 

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block.c  | 1 +
 qapi/block-core.json | 5 -
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 42244f6..677bc6f 100644
--- a/block.c
+++ b/block.c
@@ -5439,6 +5439,7 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 info->granularity = bdrv_dirty_bitmap_granularity(bs, bm);
 info->has_name = !!bm->name;
 info->name = g_strdup(bm->name);
+info->enabled = bm->enabled;
 entry->value = info;
 *plist = entry;
 plist = &entry->next;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 04f9824..4efc66e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -328,10 +328,13 @@
 #
 # @granularity: granularity of the dirty bitmap in bytes (since 1.4)
 #
+# @enabled: whether the dirty bitmap is enabled (Since 2.3)
+#
 # Since: 1.3
 ##
 { 'type': 'BlockDirtyInfo',
-  'data': {'*name': 'str', 'count': 'int', 'granularity': 'int'} }
+  'data': {'*name': 'str', 'count': 'int', 'granularity': 'int',
+   'enabled': 'bool'} }
 
 ##
 # @BlockInfo:
-- 
1.9.3




[Qemu-devel] [PATCH v9 08/10] qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}

2014-12-01 Thread John Snow
From: Fam Zheng 

This adds three QMP commands to transactions.

Users can stop a dirty bitmap, start a backup of it, and start another
dirty bitmap atomically, so that the dirty bitmap is tracked
incrementally and no write is missed.
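
As a sketch, such an atomic "pivot" might look like the following QMP
transaction (device, bitmap, and target names here are hypothetical):

```json
{ "execute": "transaction",
  "arguments": {
    "actions": [
      { "type": "block-dirty-bitmap-disable",
        "data": { "device": "drive0", "name": "bitmap0" } },
      { "type": "drive-backup",
        "data": { "device": "drive0", "sync": "dirty-bitmap",
                  "bitmap": "bitmap0", "target": "inc0.qcow2" } },
      { "type": "block-dirty-bitmap-add",
        "data": { "device": "drive0", "name": "bitmap1" } }
    ]
  } }
```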

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 blockdev.c   | 85 
 qapi-schema.json |  5 +++-
 2 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 3ab3404..da03025 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1596,6 +1596,76 @@ static void drive_backup_clean(BlkTransactionState *common)
 }
 }
 
+static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+   Error **errp)
+{
+BlockDirtyBitmapAdd *action;
+
+action = common->action->block_dirty_bitmap_add;
+qmp_block_dirty_bitmap_add(action->device, action->name,
+   action->has_granularity, action->granularity,
+   errp);
+}
+
+static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+{
+BlockDirtyBitmapAdd *action;
+BdrvDirtyBitmap *bm;
+BlockDriverState *bs;
+
+action = common->action->block_dirty_bitmap_add;
+bs = bdrv_lookup_bs(action->device, NULL, NULL);
+if (bs) {
+bm = bdrv_find_dirty_bitmap(bs, action->name);
+if (bm) {
+bdrv_release_dirty_bitmap(bs, bm);
+}
+}
+}
+
+typedef struct BlockDirtyBitmapState {
+BlkTransactionState common;
+BdrvDirtyBitmap *bitmap;
+} BlockDirtyBitmapState;
+
+/**
+ * Enable and Disable re-use the same preparation.
+ */
+static void block_dirty_bitmap_en_toggle_prepare(BlkTransactionState *common,
+ Error **errp)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+BlockDirtyBitmap *action;
+
+/* We may be used by either enable or disable;
+ * We use the "enable" member of the union here,
+ * but "disable" should be functionally equivalent: */
+action = common->action->block_dirty_bitmap_enable;
+assert(action == common->action->block_dirty_bitmap_disable);
+
+state->bitmap = block_dirty_bitmap_lookup(action->device,
+  action->name,
+  errp);
+if (!state->bitmap) {
+return;
+}
+}
+
+static void block_dirty_bitmap_enable_commit(BlkTransactionState *common)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+bdrv_enable_dirty_bitmap(state->bitmap);
+}
+
+static void block_dirty_bitmap_disable_commit(BlkTransactionState *common)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+bdrv_disable_dirty_bitmap(state->bitmap);
+}
+
 static void abort_prepare(BlkTransactionState *common, Error **errp)
 {
 error_setg(errp, "Transaction aborted using Abort action");
@@ -1630,6 +1700,21 @@ static const BdrvActionOps actions[] = {
 .abort = internal_snapshot_abort,
 .clean = internal_snapshot_clean,
 },
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ADD] = {
+.instance_size = sizeof(BlkTransactionState),
+.prepare = block_dirty_bitmap_add_prepare,
+.abort = block_dirty_bitmap_add_abort,
+},
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ENABLE] = {
+.instance_size = sizeof(BlockDirtyBitmapState),
+.prepare = block_dirty_bitmap_en_toggle_prepare,
+.commit = block_dirty_bitmap_enable_commit,
+},
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_DISABLE] = {
+.instance_size = sizeof(BlockDirtyBitmapState),
+.prepare = block_dirty_bitmap_en_toggle_prepare,
+.commit = block_dirty_bitmap_disable_commit,
+},
 };
 
 /*
diff --git a/qapi-schema.json b/qapi-schema.json
index 9ffdcf8..793031b 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1260,7 +1260,10 @@
'blockdev-snapshot-sync': 'BlockdevSnapshot',
'drive-backup': 'DriveBackup',
'abort': 'Abort',
-   'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal'
+   'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
+   'block-dirty-bitmap-add': 'BlockDirtyBitmapAdd',
+   'block-dirty-bitmap-enable': 'BlockDirtyBitmap',
+   'block-dirty-bitmap-disable': 'BlockDirtyBitmap'
} }
 
 ##
-- 
1.9.3




[Qemu-devel] [PATCH v9 07/10] qmp: Add support of "dirty-bitmap" sync mode for drive-backup

2014-12-01 Thread John Snow
From: Fam Zheng 

For "dirty-bitmap" sync mode, the block job will iterate through the
given dirty bitmap to decide if a sector needs backup (backup all the
dirty clusters and skip clean ones), just as allocation conditions of
"top" sync mode.

There are two bitmap use modes for sync=dirty-bitmap:

 - reset: backup job makes a copy of bitmap and resets the original
   one.
 - consume: backup job makes the original anonymous (invisible to user)
   and releases it after use.
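
Illustratively, starting a backup driven by an existing bitmap might
look like this over QMP (device, bitmap, and target names are
hypothetical):

```json
{ "execute": "drive-backup",
  "arguments": { "device": "drive0", "sync": "dirty-bitmap",
                 "bitmap": "bitmap0", "format": "qcow2",
                 "target": "inc0.qcow2" } }
```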

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block.c   |   5 ++
 block/backup.c| 130 ++
 block/mirror.c|   4 ++
 blockdev.c|  17 +-
 hmp.c |   4 +-
 include/block/block.h |   1 +
 include/block/block_int.h |   6 +++
 qapi/block-core.json  |  30 +--
 qmp-commands.hx   |   7 +--
 9 files changed, 174 insertions(+), 30 deletions(-)

diff --git a/block.c b/block.c
index 85215b3..42244f6 100644
--- a/block.c
+++ b/block.c
@@ -5489,6 +5489,11 @@ void bdrv_dirty_iter_init(BlockDriverState *bs,
 hbitmap_iter_init(hbi, bitmap->bitmap, 0);
 }
 
+void bdrv_dirty_iter_set(HBitmapIter *hbi, int64_t offset)
+{
+hbitmap_iter_init(hbi, hbi->hb, offset);
+}
+
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 int nr_sectors)
 {
diff --git a/block/backup.c b/block/backup.c
index 792e655..2aab68f 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -37,6 +37,10 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
 BlockJob common;
 BlockDriverState *target;
+/* bitmap for sync=dirty-bitmap */
+BdrvDirtyBitmap *sync_bitmap;
+/* dirty bitmap granularity */
+int64_t sync_bitmap_gran;
 MirrorSyncMode sync_mode;
 RateLimit limit;
 BlockdevOnError on_source_error;
@@ -242,6 +246,31 @@ static void backup_complete(BlockJob *job, void *opaque)
 g_free(data);
 }
 
+static bool yield_and_check(BackupBlockJob *job)
+{
+if (block_job_is_cancelled(&job->common)) {
+return true;
+}
+
+/* we need to yield so that qemu_aio_flush() returns.
+ * (without, VM does not reboot)
+ */
+if (job->common.speed) {
+uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
+  job->sectors_read);
+job->sectors_read = 0;
+block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
+} else {
+block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
+}
+
+if (block_job_is_cancelled(&job->common)) {
+return true;
+}
+
+return false;
+}
+
 static void coroutine_fn backup_run(void *opaque)
 {
 BackupBlockJob *job = opaque;
@@ -254,13 +283,13 @@ static void coroutine_fn backup_run(void *opaque)
 };
 int64_t start, end;
 int ret = 0;
+bool error_is_read;
 
 QLIST_INIT(&job->inflight_reqs);
 qemu_co_rwlock_init(&job->flush_rwlock);
 
 start = 0;
-end = DIV_ROUND_UP(job->common.len / BDRV_SECTOR_SIZE,
-   BACKUP_SECTORS_PER_CLUSTER);
+end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
 
 job->bitmap = hbitmap_alloc(end, 0);
 
@@ -278,28 +307,44 @@ static void coroutine_fn backup_run(void *opaque)
 qemu_coroutine_yield();
 job->common.busy = true;
 }
+} else if (job->sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+/* Dirty Bitmap sync has a slightly different iteration method */
+HBitmapIter hbi;
+int64_t sector;
+int64_t cluster;
+bool polyrhythmic;
+
+bdrv_dirty_iter_init(bs, job->sync_bitmap, &hbi);
+/* Does the granularity happen to match our backup cluster size? */
+polyrhythmic = (job->sync_bitmap_gran != BACKUP_CLUSTER_SIZE);
+
+/* Find the next dirty /sector/ and copy that /cluster/ */
+while ((sector = hbitmap_iter_next(&hbi)) != -1) {
+if (yield_and_check(job)) {
+goto leave;
+}
+cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
+
+do {
+ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
+BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
+if ((ret < 0) &&
+backup_error_action(job, error_is_read, -ret) ==
+BLOCK_ERROR_ACTION_REPORT) {
+goto leave;
+}
+} while (ret < 0);
+
+/* Advance (or rewind) our iterator if we need to. */
+if (polyrhythmic) {
+bdrv_dirty_iter_set(&hbi,
+(cluster + 1) * BACKUP_SECTORS_PER_CLUSTER);
+}
+}
 } else {
 /* Both FULL and TOP SYNC_MODE's require copying.. */
 for (; start < end; start++) {
-bool error_is_read;

[Qemu-devel] [PATCH v9 05/10] block: Add bdrv_copy_dirty_bitmap and bdrv_reset_dirty_bitmap

2014-12-01 Thread John Snow
From: Fam Zheng 

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block.c   | 35 +++
 include/block/block.h |  4 
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index a0d1150..2d08b9f 100644
--- a/block.c
+++ b/block.c
@@ -53,6 +53,8 @@
 
 struct BdrvDirtyBitmap {
 HBitmap *bitmap;
+int64_t size;
+int64_t granularity;
 char *name;
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
@@ -5346,6 +5348,26 @@ void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 bitmap->name = NULL;
 }
 
+BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,
+const BdrvDirtyBitmap *bitmap,
+const char *name)
+{
+BdrvDirtyBitmap *new_bitmap;
+
+new_bitmap = g_malloc0(sizeof(BdrvDirtyBitmap));
+new_bitmap->bitmap = hbitmap_copy(bitmap->bitmap);
+new_bitmap->size = bitmap->size;
+new_bitmap->granularity = bitmap->granularity;
+new_bitmap->name = g_strdup(name);
+QLIST_INSERT_HEAD(&bs->dirty_bitmaps, new_bitmap, list);
+return new_bitmap;
+}
+
+void bdrv_reset_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
+{
+hbitmap_reset(bitmap->bitmap, 0, bitmap->size);
+}
+
 BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
   int granularity,
   const char *name,
@@ -5353,6 +5375,7 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 {
 int64_t bitmap_size;
 BdrvDirtyBitmap *bitmap;
+int sector_granularity;
 
 assert((granularity & (granularity - 1)) == 0);
 
@@ -5360,8 +5383,8 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 error_setg(errp, "Bitmap already exists: %s", name);
 return NULL;
 }
-granularity >>= BDRV_SECTOR_BITS;
-assert(granularity);
+sector_granularity = granularity >> BDRV_SECTOR_BITS;
+assert(sector_granularity);
 bitmap_size = bdrv_nb_sectors(bs);
 if (bitmap_size < 0) {
 error_setg_errno(errp, -bitmap_size, "could not get length of device");
@@ -5369,7 +5392,9 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 return NULL;
 }
 bitmap = g_new0(BdrvDirtyBitmap, 1);
-bitmap->bitmap = hbitmap_alloc(bitmap_size, ffs(granularity) - 1);
+bitmap->size = bitmap_size;
+bitmap->granularity = granularity;
+bitmap->bitmap = hbitmap_alloc(bitmap->size, ffs(sector_granularity) - 1);
 bitmap->name = g_strdup(name);
 QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
 return bitmap;
@@ -5441,7 +5466,9 @@ uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs)
 int64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap)
 {
-return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
+g_assert(BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap) ==
+ bitmap->granularity);
+return bitmap->granularity;
 }
 
 void bdrv_dirty_iter_init(BlockDriverState *bs,
diff --git a/include/block/block.h b/include/block/block.h
index f180f93..5522eba 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -437,6 +437,10 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
 const char *name);
void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
+void bdrv_reset_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
+BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,
+const BdrvDirtyBitmap *bitmap,
+const char *name);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs);
-- 
1.9.3




[Qemu-devel] [PATCH v9 10/10] qemu-iotests: Add tests for drive-backup sync=dirty-bitmap

2014-12-01 Thread John Snow
From: Fam Zheng 

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/056| 33 ++---
 tests/qemu-iotests/056.out|  4 ++--
 tests/qemu-iotests/iotests.py |  8 
 3 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056
index 54e4bd0..fc9114e 100755
--- a/tests/qemu-iotests/056
+++ b/tests/qemu-iotests/056
@@ -23,17 +23,17 @@
 import time
 import os
 import iotests
-from iotests import qemu_img, qemu_io, create_image
+from iotests import qemu_img, qemu_img_map_assert, qemu_io, create_image
 
 backing_img = os.path.join(iotests.test_dir, 'backing.img')
 test_img = os.path.join(iotests.test_dir, 'test.img')
 target_img = os.path.join(iotests.test_dir, 'target.img')
 
-class TestSyncModesNoneAndTop(iotests.QMPTestCase):
+class TestSyncModes(iotests.QMPTestCase):
 image_len = 64 * 1024 * 1024 # MB
 
 def setUp(self):
-create_image(backing_img, TestSyncModesNoneAndTop.image_len)
+create_image(backing_img, TestSyncModes.image_len)
qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % backing_img, test_img)
 qemu_io('-c', 'write -P0x41 0 512', test_img)
 qemu_io('-c', 'write -P0xd5 1M 32k', test_img)
@@ -64,6 +64,33 @@ class TestSyncModesNoneAndTop(iotests.QMPTestCase):
 self.assertTrue(iotests.compare_images(test_img, target_img),
 'target image does not match source after backup')
 
+def test_sync_dirty_bitmap_missing(self):
+self.assert_no_active_block_jobs()
+result = self.vm.qmp('drive-backup', device='drive0', sync='dirty-bitmap',
+ format=iotests.imgfmt, target=target_img)
+self.assert_qmp(result, 'error/class', 'GenericError')
+
+def test_sync_dirty_bitmap_not_found(self):
+self.assert_no_active_block_jobs()
result = self.vm.qmp('drive-backup', device='drive0', sync='dirty-bitmap',
+ bitmap='unknown',
+ format=iotests.imgfmt, target=target_img)
+self.assert_qmp(result, 'error/class', 'GenericError')
+
+def test_sync_dirty_bitmap(self):
+self.assert_no_active_block_jobs()
result = self.vm.qmp('block-dirty-bitmap-add', device='drive0', name='bitmap0')
+self.assert_qmp(result, 'return', {})
+self.vm.hmp_qemu_io('drive0', 'write -P0x5a 0 512')
+self.vm.hmp_qemu_io('drive0', 'write -P0x5a 48M 512')
result = self.vm.qmp('drive-backup', device='drive0', sync='dirty-bitmap',
+ bitmap='bitmap0',
+ format=iotests.imgfmt, target=target_img)
+self.assert_qmp(result, 'return', {})
+self.wait_until_completed(check_offset=False)
+self.assert_no_active_block_jobs()
+qemu_img_map_assert(target_img, [0, 0x300])
+
 def test_cancel_sync_none(self):
 self.assert_no_active_block_jobs()
 
diff --git a/tests/qemu-iotests/056.out b/tests/qemu-iotests/056.out
index fbc63e6..914e373 100644
--- a/tests/qemu-iotests/056.out
+++ b/tests/qemu-iotests/056.out
@@ -1,5 +1,5 @@
-..
+.
 --
-Ran 2 tests
+Ran 5 tests
 
 OK
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index f57f154..95bb959 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -55,6 +55,14 @@ def qemu_img_pipe(*args):
 '''Run qemu-img and return its output'''
return subprocess.Popen(qemu_img_args + list(args), stdout=subprocess.PIPE).communicate()[0]
 
+def qemu_img_map_assert(img, offsets):
+'''Run qemu-img map on img and check the mapped ranges'''
+offs = []
+for line in qemu_img_pipe('map', img).splitlines()[1:]:
+offset, length, mapped, fname = line.split()
+offs.append(int(offset, 16))
assert set(offs) == set(offsets), "mapped offsets in image '%s' not equal to '%s'" % (str(offs), str(offsets))
+
 def qemu_io(*args):
 '''Run qemu-io and return the stdout data'''
 args = qemu_io_args + list(args)
-- 
1.9.3




[Qemu-devel] [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

2014-12-01 Thread Chris J Arges
Also I've been able to reproduce this with the latest master in qemu,
and even with the latest daily 3.18-rcX kernel on the host.

** Also affects: qemu
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1292234

Title:
  qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-
  bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses
  libvirt snapshots quite a bit. I noticed after upgrading to trusty
  some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa)
  has had stability problems such that the disk/partition table seems to
  be corrupted after removing a libvirt snapshot and then creating
  another with the same name. I don't have a very simple reproducer, but
  had enough that hallyn suggested I file a bug. First off:

  qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2

  $ cat /proc/version_signature
  Ubuntu 3.13.0-16.36-generic 3.13.5

  $ qemu-img info ./forhallyn-trusty-amd64.img
  image: ./forhallyn-trusty-amd64.img
  file format: qcow2
  virtual size: 8.0G (8589934592 bytes)
  disk size: 4.0G
  cluster_size: 65536
  Format specific information:
  compat: 0.10

  Steps to reproduce:
  1. create a virtual machine. For a simplified reproducer, I used virt-manager with:
    OS type: Linux
    Version: Ubuntu 14.04
    Memory: 768
    CPUs: 1

    Select managed or existing (Browse, new volume)
  Create a new storage volume:
    qcow2
    Max capacity: 8192
    Allocation: 0

    Advanced:
  NAT
  kvm
  x86_64
  firmware: default

  2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it
  seems like I can hit the bug more reliably if I have lots of updates
  in a dist-upgrade. I have seen this with lucid-trusty guests that are
  i386 and amd64. After the install, reboot and then cleanly shutdown

  3. Backup the image file somewhere since steps 1 and 2 take a while :)

  4. Execute the following commands which are based on what our uvt tool
  does:

  $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"
  $ virsh snapshot-current --name forhallyn-trusty-amd64
  pristine
  $ virsh start forhallyn-trusty-amd64
  $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5

  in guest:
  sudo apt-get update
  sudo apt-get dist-upgrade
  780 upgraded...
  shutdown -h now

  $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children
  $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"

  $ virsh start forhallyn-trusty-amd64 # this command works, but there
  is often disk corruption

  The idea behind the above is to create a new VM with a pristine
  snapshot that we could revert later if we wanted. Instead, we boot the
  VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old
  'pristine' snapshot and create a new 'pristine' snapshot. The
  intention is to update the VM and the pristine snapshot so that when
  we boot the next time, we boot from the updated VM and can revert back
  to the updated VM.

  After running 'virsh start' after doing snapshot-delete/snapshot-
  create-as, the disk may be corrupted. This can be seen with grub
  failing to find .mod files, the kernel not booting, init failing, etc.

  This does not seem to be related to the machine type used. Ie, pc-
  i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0,
  pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5
  works fine with qemu 1.5.

  Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-
  3ubuntu5.4 from Ubuntu 13.10.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1292234/+subscriptions



Re: [Qemu-devel] [PATCH v9 01/10] qapi: Add optional field "name" to block dirty bitmap

2014-12-01 Thread Eric Blake
On 12/01/2014 01:30 PM, John Snow wrote:
> From: Fam Zheng 
> 
> This field will be set for user created dirty bitmap. Also pass in an
> error pointer to bdrv_create_dirty_bitmap, so when a name is already
> taken on this BDS, it can report an error message. This is not global
> check, two BDSes can have dirty bitmap with a common name.
> 
> Implemented bdrv_find_dirty_bitmap to find a dirty bitmap by name, will
> be used later when other QMP commands want to reference dirty bitmap by
> name.
> 
> Add bdrv_dirty_bitmap_make_anon. This unsets the name of dirty bitmap.
> 
> Signed-off-by: Fam Zheng 
> Signed-off-by: John Snow 
> Reviewed-by: Max Reitz 
> ---
>  block-migration.c |  2 +-
>  block.c   | 32 +++-
>  block/mirror.c|  2 +-
>  include/block/block.h |  7 ++-
>  qapi/block-core.json  |  4 +++-
>  5 files changed, 42 insertions(+), 5 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] Update on TCG Multithreading

2014-12-01 Thread Lluís Vilanova
Mark Burton writes:

> All - first a huge thanks for those who have contributed, and those who have
> expressed an interest in helping out.

> One issue I’d like to see more opinions on is the question of a cache per 
> core,
> or a shared cache.
> I have heard anecdotal evidence that a shared cache gives a major performance
> benefit….
> Does anybody have anything more concrete?
> (of course we will get numbers in the end if we implement the hybrid scheme as
> suggested in the wiki - but I’d still appreciate any feedback).

I think it makes sense to have a per-core pointer to a qom TCGCacheClass. That
can then have its own methods for working with updates, making it much simpler
to work with different implementations, like completely avoiding locks (per-cpu
cache) or a hybrid approach like the one described in the wiki.


> Our next plan is to start putting an implementation plan together. Probably
> quite sketchy at this point, and we hope to start coding shortly.

BTW, I've added some links to the COREMU project, which was discussed long ago
in this list.


Best,
  Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth



Re: [Qemu-devel] [PATCHv3] block: add event when disk usage exceeds threshold

2014-12-01 Thread Eric Blake
On 11/28/2014 05:31 AM, Francesco Romani wrote:
> Managing applications, like oVirt (http://www.ovirt.org), make extensive
> use of thin-provisioned disk images.
> To let the guest run smoothly and be not unnecessarily paused, oVirt sets
> a disk usage threshold (so called 'high water mark') based on the occupation
> of the device,  and automatically extends the image once the threshold
> is reached or exceeded.
> 
> In order to detect the crossing of the threshold, oVirt has no choice but
> aggressively polling the QEMU monitor using the query-blockstats command.
> This lead to unnecessary system load, and is made even worse under scale:
> deployments with hundreds of VMs are no longer rare.
> 
> To fix this, this patch adds:
> * A new monitor command to set a mark for a given block device.
> * A new event to report if a block device usage exceeds the threshold.
> 
> This will allow the managing application to drop the polling
> altogether and just wait for a watermark crossing event.

I like the idea!

Question - what happens if management misses the event (for example, if
libvirtd is restarted)?  Does the existing 'query-blockstats' and/or
'query-named-block-nodes' still work to query the current threshold and
whether it has been exceeded, as a poll-once command executed when
reconnecting to the monitor?

> 
> Signed-off-by: Francesco Romani 
> ---

No need for a 0/1 cover letter on a 1-patch series; you have the option
of just putting the side-band information here and sending it as a
single mail.  But the cover letter approach doesn't hurt either, and I
can see how it can be easier for some workflows to always send a cover
letter than to special-case a 1-patch series.

> +static int coroutine_fn before_write_notify(NotifierWithReturn *notifier,
> +void *opaque)
> +{
> +BdrvTrackedRequest *req = opaque;
> +BlockDriverState *bs = req->bs;
> +uint64_t amount = 0;
> +
> +amount = bdrv_usage_threshold_exceeded(bs, req);
> +if (amount > 0) {
> +qapi_event_send_block_usage_threshold(
> +bs->node_name,
> +amount,
> +bs->write_threshold_offset,
> +&error_abort);
> +
> +/* autodisable to avoid to flood the monitor */

s/to flood/flooding/

> +/*
> + * bdrv_usage_threshold_is_set
> + *
> + * Tell if an usage threshold is set for a given BDS.

s/an usage/a usage/

(in English, the difference between "a" and "an" is whether the leading
sound of the next word is pronounced or not; in this case, "usage" is
pronounced with a hard "yoo-sage".  It may help to remember "an umbrella
for a unicorn")

> +++ b/qapi/block-core.json
> @@ -239,6 +239,9 @@
>  #
>  # @iops_size: #optional an I/O size in bytes (Since 1.7)
>  #
> +# @write-threshold: configured write threshold for the device.
> +#   0 if disabled. (Since 2.3)
> +#
>  # Since: 0.14.0
>  #
>  ##
> @@ -253,7 +256,7 @@
>  '*bps_max': 'int', '*bps_rd_max': 'int',
>  '*bps_wr_max': 'int', '*iops_max': 'int',
>  '*iops_rd_max': 'int', '*iops_wr_max': 'int',
> -'*iops_size': 'int' } }
> +'*iops_size': 'int', 'write-threshold': 'uint64' } }

In QMP specs, 'uint64' and 'int' are practically synonyms.  I can live
with either spelling, although 'int' is more common.

Bikeshed on naming: Although we prefer '-' over '_' in new interfaces,
we also favor consistency, and BlockDeviceInfo is one of those dinosaur
commands that uses _ everywhere until your addition.  So naming this
field 'write_threshold' would be more consistent.

> +##
> +# @BLOCK_USAGE_THRESHOLD
> +#
> +# Emitted when writes on block device reaches or exceeds the
> +# configured threshold. For thin-provisioned devices, this
> +# means the device should be extended to avoid pausing for
> +# disk exaustion.

s/exaustion/exhaustion/

> +#
> +# @node-name: graph node name on which the threshold was exceeded.
> +#
> +# @amount-exceeded: amount of data which exceeded the threshold, in bytes.
> +#
> +# @offset-threshold: last configured threshold, in bytes.
> +#

Might want to mention that this event is one-shot; after it triggers, a
user must re-register a threshold to get the event again.

> +# Since: 2.3
> +##
> +{ 'event': 'BLOCK_USAGE_THRESHOLD',
> +  'data': { 'node-name': 'str',
> + 'amount-exceeded': 'uint64',

TAB damage.  Please use spaces.  ./scripts/checkpatch.pl will catch some
offenders (although I didn't test if it will catch this one).

However, here you are correct in using '-' for naming :)

> + 'threshold': 'uint64' } }
> +
> +##
> +# @block-set-threshold
> +#
> +# Change usage threshold for a block drive. An event will be delivered
> +# if a write to this block drive crosses the configured threshold.
> +# This is useful to transparently resize thin-provisioned drives without
> +# the guest OS noticing.
> +#
> +# @node-name: graph node name on which the threshold must be set.
> +#
> +#

Re: [Qemu-devel] [PATCH v9 02/10] qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove

2014-12-01 Thread Eric Blake
On 12/01/2014 01:30 PM, John Snow wrote:
> From: Fam Zheng 
> 
> The new command pair is added to manage user created dirty bitmap. The
> dirty bitmap's name is mandatory and must be unique for the same device,
> but different devices can have bitmaps with the same names.
> 
> The granularity is an optional field. If it is not specified, we will
> choose a default granularity based on the cluster size if available,
> clamped to between 4K and 64K to mirror how the 'mirror' code was
> already choosing granularity. If we do not have cluster size info
> available, we choose 64K. This code has been factored out into a helper
> shared with block/mirror.
> 
> The types added to block-core.json will be re-used in future patches
> in this series, see:
> 'qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}'
> 
> Signed-off-by: Fam Zheng 
> Signed-off-by: John Snow 
> Reviewed-by: Max Reitz 
> ---

> +block-dirty-bitmap-add
> +--
> +
> +Create a dirty bitmap with a name on the device, and start tracking the 
> writes.
> +
> +Arguments:
> +
> +- "device": device name to create dirty bitmap (json-string)
> +- "name": name of the new dirty bitmap (json-string)
> +- "granularity": granularity to track writes with (int)

Worth mentioning that this is optional?
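
For reference, a hypothetical QMP invocation that omits the optional
granularity (device and bitmap names are illustrative) would be:

```json
{ "execute": "block-dirty-bitmap-add",
  "arguments": { "device": "drive0", "name": "bitmap0" } }
```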

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH RFC] e1000: defer packets until BM enabled

2014-12-01 Thread Gabriel Somlo
Hi Michael,

I had to make some small changes to get this patch to build successfully,
see inline below:

On Monday, December 01, 2014 11:50am, Michael S. Tsirkin [m...@redhat.com] 
wrote:
> 
> Some guests seem to set BM for e1000 after
> enabling RX.
> If packets arrive in the window, device is wedged.
> Probably works by luck on real hardware, work around
> this by making can_receive depend on BM.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/net/e1000.c | 20 +++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index e33a4da..34625ac 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -923,7 +923,9 @@ e1000_can_receive(NetClientState *nc)
>  E1000State *s = qemu_get_nic_opaque(nc);
> 
>  return (s->mac_reg[STATUS] & E1000_STATUS_LU) &&
> -(s->mac_reg[RCTL] & E1000_RCTL_EN) && e1000_has_rxbufs(s, 1);
> +(s->mac_reg[RCTL] & E1000_RCTL_EN) &&
> +(s->parent_obj.config[PCI_COMMAND] & PCI_COMMAND_MASTER) &&
> +e1000_has_rxbufs(s, 1);
>  }
> 
>  static uint64_t rx_desc_base(E1000State *s)
> @@ -1529,6 +1531,20 @@ static NetClientInfo net_e1000_info = {
>  .link_status_changed = e1000_set_link_status,
>  };
> 
> +static void e1000_write_config(PCIDevice *pci_dev, uint32_t address,
> +uint32_t val, int len)
> +{
> +E1000State *d = E1000(dev);

s/dev/pci_dev/

> +
> +pci_default_write_config(pci_dev, address, val, len);
> +
> +if (range_covers_byte(address, len, PCI_COMMAND) &&

requires #include "qemu/range.h" at the top of e1000.c

> +(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
> +qemu_flush_queued_packets(qemu_get_queue(s->nic));

s/s->nic/d->nic/

> +}
> +}
> +
> +
>  static int pci_e1000_init(PCIDevice *pci_dev)
>  {
>  DeviceState *dev = DEVICE(pci_dev);
> @@ -1539,6 +1555,8 @@ static int pci_e1000_init(PCIDevice *pci_dev)
>  int i;
>  uint8_t *macaddr;
> 
> +pci_dev->config_write = e1000_write_config;
> +
>  pci_conf = pci_dev->config;
> 
>  /* TODO: RST# value should be 0, PCI spec 6.2.4 */
> --

With this, I can confirm everything still works fine on both Mavericks and
F21-beta-live. So:

Tested-by: Gabriel Somlo 

Regards,
--Gabriel



Re: [Qemu-devel] [2.2 PATCH V2 for-4.5] virtio-net: fix unmap leak

2014-12-01 Thread Konrad Rzeszutek Wilk
On Thu, Nov 27, 2014 at 05:46:28PM +, Stefano Stabellini wrote:
> On Thu, 27 Nov 2014, Konrad Rzeszutek Wilk wrote:
> > On Nov 27, 2014 10:26 AM, Stefano Stabellini 
> >  wrote:
> > >
> > > On Thu, 27 Nov 2014, Konrad Rzeszutek Wilk wrote: 
> > > > On Nov 27, 2014 9:58 AM, Stefano Stabellini 
> > > >  wrote: 
> > > > > 
> > > > > On Thu, 27 Nov 2014, Konrad Rzeszutek Wilk wrote: 
> > > > > > On Nov 27, 2014 7:46 AM, Stefano Stabellini 
> > > > > >  wrote: 
> > > > > > > 
> > > > > > > Konrad, I think we should have this fix in 4.5: without it 
> > > > > > > vif=[ 'model=virtio-net' ] crashes QEMU. 
> > > > > > > 
> > > > > > 
> > > > > > Is it an regression? 
> > > > > 
> > > > > Good question: I was trying to investigate that. 
> > > > > 
> > > > > virtio-net is currently *not* documented in the xl interface: 
> > > > > 
> > > > > 
> > > > > ### model 
> > > > > 
> > > > > This keyword is valid for HVM guest devices with `type=ioemu` only. 
> > > > > 
> > > > > Specifies the type device to emulated for this guest. Valid values 
> > > > > are: 
> > > > > 
> > > > >   * `rtl8139` (default) -- Realtek RTL8139 
> > > > >   * `e1000` -- Intel E1000 
> > > > >   * in principle any device supported by your device model 
> > > > > 
> > > > > 
> > > > > The last working version of virtio-net on Xen is QEMU v1.4.0. That 
> > > > > means 
> > > > > that the bug affects Xen 4.4 too (but it should work in Xen 4.3). 
> > > > 
> > > > Not a regression compared to 4.4 but it has been for two releases. 
> > >
> > > That is true. On the plus side, virtio-net has never been properly 
> > > documented as working in the first place. 
> > >
> > >
> > > > So if nobody noticed it for two releases will they notice it if it not 
> > > > fixed in this release either? And can it be fixed in the next one? 
> > >
> > > We can fix the crash even in this release by backporting this rather 
> > > simple patch. However the patch would just avoid the crash: virtio-net 
> > > would still be not working once the guest is booted. I haven't figured 
> > > out the cause of that problem yet. 
> > >
> > 
> > Perhaps then the hack^H^Hfix is to return an error if user is using 
> > virtio-net then?
> > 
> > And then in later releases make it work right.
> 
> Sorry, I take it back: the fix is enough to get virtio-net working, I
> had a configuration error that was confusing my test results.
> 
> Given that the fix is very simple, I think we should backport it to Xen 4.5
> and Xen 4.4.

OK.
From 4.5 perspective: Release-Acked-by: Konrad Rzeszutek Wilk 
>
> 
> 
> > > > > > > On Thu, 27 Nov 2014, Peter Maydell wrote: 
> > > > > > > > On 27 November 2014 at 12:33, Michael S. Tsirkin 
> > > > > > > >  wrote: 
> > > > > > > > > On Thu, Nov 27, 2014 at 06:04:03PM +0800, Jason Wang wrote: 
> > > > > > > > >> virtio_net_handle_ctrl() and other functions that process 
> > > > > > > > >> control vq 
> > > > > > > > >> request call iov_discard_front() which will shorten the iov. 
> > > > > > > > >> This will 
> > > > > > > > >> lead unmapping in virtqueue_push() leaks mapping. 
> > > > > > > > >> 
> > > > > > > > >> Fixes this by keeping the original iov untouched and using a 
> > > > > > > > >> temp variable 
> > > > > > > > >> in those functions. 
> > > > > > > > >> 
> > > > > > > > >> Cc: Wen Congyang  
> > > > > > > > >> Cc: Stefano Stabellini  
> > > > > > > > >> Cc: qemu-sta...@nongnu.org 
> > > > > > > > >> Signed-off-by: Jason Wang  
> > > > > > > > > 
> > > > > > > > > Reviewed-by: Michael S. Tsirkin  
> > > > > > > > > 
> > > > > > > > > Peter, can you pick this up or do you want a pull request? 
> > > > > > > > 
> > > > > > > > I can pick it up. I was waiting a bit to check that everybody 
> > > > > > > > was happy that this is the correct way to fix the bug and the 
> > > > > > > > patch is ok... 
> > > > > > 
> > > >




Re: [Qemu-devel] [PATCH] qmp: extend QMP to provide read/write access to physical memory

2014-12-01 Thread Eric Blake
On 11/26/2014 01:27 PM, Bryan D. Payne wrote:
> This patch adds a new QMP command that sets up a domain socket. This
> socket can then be used for fast read/write access to the guest's
> physical memory. The key benefit to this system over existing solutions
> is speed. Using this patch, guest memory can be copied out at a rate of
> ~200MB/sec, depending on the hardware. Existing solutions only achieve
> a small fraction of this speed.
> 
> Signed-off-by: Bryan D. Payne 

> +
> +##
> +# @pmemaccess
> +#
> +# This command enables access to guest physical memory using
> +# a simple protocol over a UNIX domain socket.
> +#
> +# @path Location to use for the UNIX domain socket
> +#
> +# Since: 2.3
> +##
> +{ 'command': 'pmemaccess', 'data': { 'path': 'str' } }

In addition to Fam's review, I have a question - does this code properly
use qemu_open() so that I can use 'add-fd' to pass in a pre-opened
socket fd into fdset 1, then call pmemaccess with '/dev/fdset/1'?  If
not, can you please fix it to allow this usage?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [PATCH v2] qmp: extend QMP to provide read/write access to physical memory

2014-12-01 Thread Eric Blake
On 11/26/2014 01:27 PM, Bryan D. Payne wrote:
> Thanks for the feedback Eric, I've updated the patch.
> 
> v2 changes:
> - added QMP command contract to qapi-schema.json
> - corrected some comments
> - rewired QMP command to use schema code

When sending a v2, it's best to send it as a new top-level thread,
instead of buried in-reply-to an existing thread.  Also, for a single
patch, a cover letter is not strictly necessary (the information you
gave here can instead be given after the --- separator of the
one-and-only patch email).  Cover letters are mandatory only for
multi-patch series.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [Xen-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

2014-12-01 Thread Don Slutz

On 12/01/14 10:37, Stefano Stabellini wrote:

On Mon, 1 Dec 2014, Don Slutz wrote:

On 11/27/14 05:48, Stefano Stabellini wrote:


[...]



Works fine in both claim modes and with PoD used (maxmem > memory).  I do not
know how to test with tmem.  I do not see how it would be worse than the
current code that does not auto-increase; i.e. even without a Xen change, I
think something like this could be done.

OK, good to know. I am OK with increasing maxmem only if it is strictly
necessary.



My testing shows 32 free pages that I am not sure where they come from, but
the code above is passing my 8 e1000 NICs.

I think that raising maxmem a bit higher than necessary is not too bad.
If we really care about it, we could lower the limit after QEMU's
initialization is completed.

Ok.  I did find the 32: it is VGA_HOLE_SIZE.  So here is what I have, which
includes a lot of extra printfs.

In QEMU I would prefer not to assume that libxl increased maxmem for the
vga hole. I would rather call xc_domain_setmaxmem twice for the vga hole
than tie QEMU to a particular maxmem allocation scheme in libxl.


Ok.  The area we are talking about is 0x000a to 0x000c.
It is in libxc (xc_hvm_build_x86), not libxl.  I have no issue with a name
change to something like QEMU_EXTRA_FREE_PAGES.

My testing has shown that some of these 32 pages are used outside of QEMU.
I am seeing just 23 free pages using a standalone program to display
the same info after a CentOS 6.3 guest is done booting.


In libxl I would like to avoid increasing maxmem for anything QEMU will
allocate later, that includes rom and vga vram. I am not sure how to
make that work with older QEMU versions that don't call
xc_domain_setmaxmem by themselves yet though. Maybe we could check the
specific QEMU version in libxl and decide based on that. Or we could
export a feature flag in QEMU.


Yes, it would be nice to adjust libxl to not increase maxmem.  However, since
videoram is included in memory (and maxmem), making the change related to
vram is a bigger issue.

The rom change is much simpler.

Currently I do not know of a way to do different things based on the QEMU
version and/or features (this includes getting the QEMU version in libxl).

I have been going with:
1) change QEMU 1st.
2) Wait for an upstream version of QEMU with this.
3) change xen to optionally use a feature in the latest QEMU.





--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -67,6 +67,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t
*shared_page, int vcpu)
  #endif

  #define BUFFER_IO_MAX_DELAY  100
+#define VGA_HOLE_SIZE (0x20)

  typedef struct XenPhysmap {
  hwaddr start_addr;
@@ -219,6 +220,11 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
MemoryRegion *mr)
  xen_pfn_t *pfn_list;
  int i;
  xc_dominfo_t info;
+unsigned long max_pages, free_pages, real_free;
+long need_pages;
+uint64_t tot_pages, pod_cache_pages, pod_entries;
+
+trace_xen_ram_alloc(ram_addr, size, mr->name);

  if (runstate_check(RUN_STATE_INMIGRATE)) {
  /* RAM already populated in Xen */
@@ -232,13 +238,6 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
MemoryRegion *mr)
  return;
  }

-fprintf(stderr, "%s: alloc "RAM_ADDR_FMT
-" bytes (%ld Kib) of ram at "RAM_ADDR_FMT
-" mr.name=%s\n",
-__func__, size, (long)(size>>10), ram_addr, mr->name);
-
-trace_xen_ram_alloc(ram_addr, size);
-
  nr_pfn = size >> TARGET_PAGE_BITS;
  pfn_list = g_malloc(sizeof (*pfn_list) * nr_pfn);

@@ -246,11 +245,38 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
MemoryRegion *mr)
  pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;
  }

-if (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) < 0) {
+if ((xc_domain_getinfo(xen_xc, xen_domid, 1, &info) != 1) ||
+   (info.domid != xen_domid)) {
  hw_error("xc_domain_getinfo failed");
  }
-if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
-(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
+max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
+free_pages = max_pages - info.nr_pages;
+real_free = free_pages;
+if (free_pages > VGA_HOLE_SIZE) {
+free_pages -= VGA_HOLE_SIZE;
+} else {
+free_pages = 0;
+}
+need_pages = nr_pfn - free_pages;
+fprintf(stderr, "%s: alloc "RAM_ADDR_FMT
+" bytes (%ld KiB) of ram at "RAM_ADDR_FMT
+" mp=%ld fp=%ld nr=%ld need=%ld mr.name=%s\n",
+__func__, size, (long)(size>>10), ram_addr,
+   max_pages, free_pages, nr_pfn, need_pages,
+mr->name);
+if (xc_domain_get_pod_target(xen_xc, xen_domid, &tot_pages,
+ &pod_cache_pages, &pod_entries) >= 0) {
+unsigned long populated = tot_pages - pod_cache_pages;
+long delta_tot = tot_pages - info.nr_pages;
+
+fprintf(stderr, "%s: PoD pop=%ld tot=%ld(%ld) cnt=%ld ent=%ld nop=%l

[Qemu-devel] [ANNOUNCE] QEMU 2.2.0-rc4 is now available

2014-12-01 Thread Michael Roth
On behalf of the QEMU Team, I'd like to announce the availability of the
fifth release candidate for the QEMU 2.2 release.  This release is meant
for testing purposes and should not be used in a production environment.

  http://wiki.qemu.org/download/qemu-2.2.0-rc4.tar.bz2

This is the last planned RC before the release of QEMU 2.2 on Friday,
December 5th.

You can help improve the quality of the QEMU 2.2 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan for the 2.2 release is available at:

  http://wiki.qemu.org/Planning/2.2

Please add entries to the ChangeLog for the 2.2 release below:

  http://wiki.qemu.org/ChangeLog/2.2

Changes since 2.2.0-rc3:

0d7954c: Update version for v2.2.0-rc4 release (Peter Maydell)
b19ca18: vhost: Fix vhostfd leak in error branch (Gonglei)
db12451: Fix for crash after migration in virtio-rng on bi-endian targets 
(David Gibson)
771b6ed: virtio-net: fix unmap leak (Jason Wang)
4cae4d5: hmp: fix regression of HMP device_del auto-completion (Marcel 
Apfelbaum)
490309f: qemu-timer: Avoid overflows when converting timeout to struct timespec 
(Peter Maydell)
dc622de: s390x/kvm: Fix compile error (Christian Borntraeger)
f3b3766: fw_cfg: fix boot order bug when dynamically modified via QOM (Gonglei)
d1048be: -machine vmport=auto: Fix handling of VMWare ioport emulation for xen 
(Don Slutz)




Re: [Qemu-devel] Update on TCG Multithreading

2014-12-01 Thread Alexander Graf


On 01.12.14 22:00, Lluís Vilanova wrote:
> Mark Burton writes:
> 
>> All - first a huge thanks for those who have contributed, and those who have
>> expressed an interest in helping out.
> 
>> One issue I’d like to see more opinions on is the question of a cache per 
>> core,
>> or a shared cache.
>> I have heard anecdotal evidence that a shared cache gives a major performance
>> benefit….
>> Does anybody have anything more concrete?
>> (of course we will get numbers in the end if we implement the hybrid scheme 
>> as
>> suggested in the wiki - but I’d still appreciate any feedback).
> 
> I think it makes sense to have a per-core pointer to a qom TCGCacheClass. That
> can then have its own methods for working with updates, making it much simpler
> to work with different implementations, like completely avoiding locks 
> (per-cpu
> cache) or a hybrid approach like the one described in the wiki.

I don't think you want to have indirect function calls in the fast path ;).


Alex



Re: [Qemu-devel] [PATCH] target-mips: add CPU definition for MIPS-II

2014-12-01 Thread Petar Jovanovic
Adding (another) generic model for an old ISA revision is rather
discouraged in QEMU trunk. Can you add a particular real CPU model?

Regards,
Petar


From: Vasileios Kalintiris
Sent: 25 November 2014 11:04
To: address@hidden
Cc: Leon Alrae; address@hidden
Subject: [PATCH] target-mips: add CPU definition for MIPS-II

Add mips2-generic among CPU definitions for MIPS.

Signed-off-by: Vasileios Kalintiris 
---
target-mips/translate_init.c | 23 +++
1 file changed, 23 insertions(+)

diff --git a/target-mips/translate_init.c b/target-mips/translate_init.c
index 148b394..d4b1cd8 100644
--- a/target-mips/translate_init.c
+++ b/target-mips/translate_init.c
@@ -108,6 +108,29 @@ struct mips_def_t {
static const mips_def_t mips_defs[] =
{
 {
+/* A generic CPU providing MIPS-II features.
+   FIXME: Eventually this should be replaced by a real CPU model. */
+.name = "mips2-generic",
+.CP0_PRid = 0x00018000,
+.CP0_Config0 = MIPS_CONFIG0 | (MMU_TYPE_R4000 << CP0C0_MT),
+.CP0_Config1 = MIPS_CONFIG1 | (1 << CP0C1_FP) | (15 << CP0C1_MMU) |
+  (0 << CP0C1_IS) | (3 << CP0C1_IL) | (1 << CP0C1_IA) |
+  (0 << CP0C1_DS) | (3 << CP0C1_DL) | (1 << CP0C1_DA) |
+  (0 << CP0C1_CA),
+.CP0_Config2 = MIPS_CONFIG2,
+.CP0_Config3 = MIPS_CONFIG3,
+.CP0_LLAddr_rw_bitmask = 0,
+.CP0_LLAddr_shift = 4,
+.SYNCI_Step = 32,
+.CCRes = 2,
+.CP0_Status_rw_bitmask = 0x3011,
+.CP1_fcr0 = (1 << FCR0_W) | (1 << FCR0_D) | (1 << FCR0_S),
+.SEGBITS = 32,
+.PABITS = 32,
+.insn_flags = CPU_MIPS2,
+.mmu_type = MMU_TYPE_R4000,
+},
+{
 .name = "4Kc",
 .CP0_PRid = 0x00018000,
 .CP0_Config0 = MIPS_CONFIG0 | (MMU_TYPE_R4000 << CP0C0_MT),
--

ping

 



[Qemu-devel] how to allocate more video memory to qemu

2014-12-01 Thread jenia.ivlev
Hello.

How do I allocate a specific amount of video memory to a qemu machine?

My OS is GNU/Linux Arch. I'm running qemu on an i7 and asus hero7. I want to
run the game Civilization 5 (Windows), but when I run it, it crashes silently.
I went to see how much video memory Windows can see: it turns out to be 4M. I
want to allocate maybe 256M or something like that to it. How do I do that?

Thanks for your time and help in advance.




[Qemu-devel] about tracetool

2014-12-01 Thread Ady Wahyudi Paundu
Hi all,
I know that simpletrace records go to a memory buffer before being flushed
to file by a writer thread.  My question is how to call this thread
(preferably using Python), so I can make the flush process periodic?

Regards,
Ady



Re: [Qemu-devel] [PATCH 3/7] test-coroutine: avoid overflow on 32-bit systems

2014-12-01 Thread Ming Lei
On Mon, Dec 1, 2014 at 8:41 PM, Paolo Bonzini  wrote:
>
>
> On 01/12/2014 02:28, Ming Lei wrote:
>>> > -   (unsigned long)(10 * duration) / maxcycles);
>>> > +   (unsigned long)(10.0 * duration / maxcycles));
>> One more single bracket.
>
> I don't understand?

Sorry, it is my fault, :-(

Thanks



Re: [Qemu-devel] Update on TCG Multithreading

2014-12-01 Thread Lluís Vilanova
Alexander Graf writes:

> On 01.12.14 22:00, Lluís Vilanova wrote:
>> Mark Burton writes:
>> 
>>> All - first a huge thanks for those who have contributed, and those who have
>>> expressed an interest in helping out.
>> 
>>> One issue I’d like to see more opinions on is the question of a cache per 
>>> core,
>>> or a shared cache.
>>> I have heard anecdotal evidence that a shared cache gives a major 
>>> performance
>>> benefit….
>>> Does anybody have anything more concrete?
>>> (of course we will get numbers in the end if we implement the hybrid scheme 
>>> as
>>> suggested in the wiki - but I’d still appreciate any feedback).
>> 
>> I think it makes sense to have a per-core pointer to a qom TCGCacheClass. 
>> That
>> can then have its own methods for working with updates, making it much 
>> simpler
>> to work with different implementations, like completely avoiding locks 
>> (per-cpu
>> cache) or a hybrid approach like the one described in the wiki.

> I don't think you want to have indirect function calls in the fast path ;).

Ooops, true; at least probably, since you're never sure how much the HW
prefetcher is going to outsmart you :)

Well, I guess that a define will have to do then. But I think it still makes
sense to refactor tb_* functions and such to have a TCGCache as first argument.


Best,
  Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth



Re: [Qemu-devel] [RFC PATCH 2/2] vnc: add change keyboard layout interface

2014-12-01 Thread Gonglei
On 2014/12/2 0:40, Eric Blake wrote:

> On 11/29/2014 03:39 AM, arei.gong...@huawei.com wrote:
>> From: Gonglei 
>>
>> Example QMP command of Change VNC keyboard layout:
>>
>> -> { "execute": "change",
>>  "arguments": { "device": "vnc", "target": "keymap",
>> "arg": "de" } }
>> <- { "return": {} }
> 
> As I said in the cover letter, we should NOT be adding stuff to the
> broken 'change' command, but should instead add a new command.
> 

OK.

>>
>> Signed-off-by: Gonglei 
>> ---
>>  qapi-schema.json |  8 +---
>>  qmp.c| 17 +
>>  2 files changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 9ffdcf8..8c02a9f 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -1552,13 +1552,15 @@
>>  #
>>  # @target: If @device is a block device, then this is the new filename.
>>  #  If @device is 'vnc', then if the value 'password' selects the vnc
>> -#  change password command.   Otherwise, this specifies a new 
>> server URI
>> +#  change password command, if the value 'keymap'selects the vnc 
>> change
> 
> s/'keymap'selects/'keymap' selects/
> 
>> +#  keyboard layout command. Otherwise, this specifies a new server 
>> URI
>>  #  address to listen to for VNC connections.
>>  #
>>  # @arg:If @device is a block device, then this is an optional format to 
>> open
>>  #  the device with.
>> -#  If @device is 'vnc' and @target is 'password', this is the new 
>> VNC
>> -#  password to set.  If this argument is an empty string, then no 
>> future
>> +#  If @device is 'vnc' and if @target is 'password', this is the 
>> new VNC
>> +#  password to set; if @target is 'keymap', this is the new VNC 
>> keyboard
>> +#  layout to set. If this argument is an empty string, then no 
>> future
>>  #  logins will be allowed.
> 
> Not discoverable.  As proposed, libvirt has no way of knowing if qemu is
> new enough to support this horrible hack.  A new command has multiple
> benefits: it would be discoverable ('query-commands') and type-safe
> (none of this horrid overloading of special text values).
> 

Great! Thank you so much for your comments, Eric.
I will add a new QMP command for this.

Regards,
-Gonglei




Re: [Qemu-devel] [BUG] Redhat-6.4_64bit-guest kernel panic with cpu-passthrough and guest numa

2014-12-01 Thread Gonglei
On 2014/12/1 17:48, Paolo Bonzini wrote:

> 
> 
> On 28/11/2014 03:38, Gonglei wrote:
 Can you find what line of kernel/sched.c it is?
>> Yes, of course. See below please:
>> "sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) / group->cpu_power; "
>> in update_sg_lb_stats(), file sched.c, line 4094
>> And I can share the cause of we found. After commit 787aaf57(target-i386:
>> forward CPUID cache leaves when -cpu host is used), guest will get cpu cache
>> from host when -cpu host is used. But if we configure guest numa:
>>   node 0 cpus 0~7
>>   node 1 cpus 8~15
>> then the numa nodes lie in the same host cpu cache (cpus 0~16).
>> When the guest os boot, calculate group->cpu_power, but the guest find thoes
>> two different nodes own the same cache, then node1's group->cpu_power
>> will not be valued, just is the initial value '0'. And when vcpu is 
>> scheduled,
>> division by 0 causes kernel panic.
> 
> Thanks.  Please open a Red Hat bugzilla with the information, and Cc
> Larry Woodman  who fixed a few instances of this in
> the past.
> 

Hi, Paolo

A bug has been reported:
https://bugzilla.redhat.com/process_bug.cgi

Regards,
-Gonglei




Re: [Qemu-devel] [BUG] Redhat-6.4_64bit-guest kernel panic with cpu-passthrough and guest numa

2014-12-01 Thread Gonglei
On 2014/12/2 11:41, Gonglei wrote:

> Hi, Paolo
> 
> A bug has been reported:

https://bugzilla.redhat.com/show_bug.cgi?id=1169577

Regards,
-Gonglei




Re: [Qemu-devel] [PATCH v2] qmp: extend QMP to provide read/write access to physical memory

2014-12-01 Thread Bryan D. Payne
Ok thanks for the advice, I'll adjust for v3.  This is (clearly!) my first
contribution to Qemu so I'm still learning how you guys operate.

Cheers,
-bryan

On Mon, Dec 1, 2014 at 2:12 PM, Eric Blake  wrote:

> On 11/26/2014 01:27 PM, Bryan D. Payne wrote:
> > Thanks for the feedback Eric, I've updated the patch.
> >
> > v2 changes:
> > - added QMP command contract to qapi-schema.json
> > - corrected some comments
> > - rewired QMP command to use schema code
>
> When sending a v2, it's best to send it as a new top-level thread,
> instead of buried in-reply-to an existing thread.  Also, for a single
> patch, a cover letter is not strictly necessary (the information you
> gave here can instead be given after the --- separator of the
> one-and-only patch email).  Cover letters are mandatory only for
> multi-patch series.
>
> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
>


Re: [Qemu-devel] [PATCH v2] qmp: extend QMP to provide read/write access to physical memory

2014-12-01 Thread Fam Zheng
On Mon, 12/01 20:36, Bryan D. Payne wrote:
> Ok thanks for the advice, I'll adjust for v3.  This is (clearly!) my first
> contribution to Qemu so I'm still learning how you guys operate.

Great. Looking forward to the next version. BTW please try to use inline
replying.

Fam

> 
> Cheers,
> -bryan
> 
> On Mon, Dec 1, 2014 at 2:12 PM, Eric Blake  wrote:
> 
> > On 11/26/2014 01:27 PM, Bryan D. Payne wrote:
> > > Thanks for the feedback Eric, I've updated the patch.
> > >
> > > v2 changes:
> > > - added QMP command contract to qapi-schema.json
> > > - corrected some comments
> > > - rewired QMP command to use schema code
> >
> > When sending a v2, it's best to send it as a new top-level thread,
> > instead of buried in-reply-to an existing thread.  Also, for a single
> > patch, a cover letter is not strictly necessary (the information you
> > gave here can instead be given after the --- separator of the
> > one-and-only patch email).  Cover letters are mandatory only for
> > multi-patch series.
> >
> > --
> > Eric Blake   eblake redhat com+1-919-301-3266
> > Libvirt virtualization library http://libvirt.org
> >
> >



[Qemu-devel] [PATCH for-2.3 0/6] vmdk: A few small fixes

2014-12-01 Thread Fam Zheng
Here are some improvements on miscellaneous things such as CID generation,
comments, input validation.

Fam Zheng (6):
  vmdk: Use g_random_int to generate CID
  vmdk: Fix comment to match code of extent lines
  vmdk: Clean up descriptor file reading
  vmdk: Check descriptor file length when reading it
  vmdk: Remove unnecessary initialization
  vmdk: Set errp on failures in vmdk_open_vmdk4

 block/vmdk.c | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH for-2.3 5/6] vmdk: Remove unnecessary initialization

2014-12-01 Thread Fam Zheng
It will be assigned to the return value of vmdk_read_desc.

Suggested-by: Markus Armbruster 
Signed-off-by: Fam Zheng 
---
 block/vmdk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f7c7979..22f85c4 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -910,7 +910,7 @@ exit:
 static int vmdk_open(BlockDriverState *bs, QDict *options, int flags,
  Error **errp)
 {
-char *buf = NULL;
+char *buf;
 int ret;
 BDRVVmdkState *s = bs->opaque;
 uint32_t magic;
-- 
1.9.3




[Qemu-devel] [PATCH for-2.3 3/6] vmdk: Clean up descriptor file reading

2014-12-01 Thread Fam Zheng
Zeroing a buffer that will be filled right after is not necessary, and
allocating a power of two + 1 is naughty.

Suggested-by: Markus Armbruster 
Signed-off-by: Fam Zheng 
---
 block/vmdk.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 28d22db..0c5769c 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -558,14 +558,15 @@ static char *vmdk_read_desc(BlockDriverState *file, 
uint64_t desc_offset,
 }
 
 size = MIN(size, 1 << 20);  /* avoid unbounded allocation */
-buf = g_malloc0(size + 1);
+buf = g_malloc(size);
 
-ret = bdrv_pread(file, desc_offset, buf, size);
+ret = bdrv_pread(file, desc_offset, buf, size - 1);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not read from file");
 g_free(buf);
 return NULL;
 }
+buf[ret - 1] = 0;
 
 return buf;
 }
-- 
1.9.3




[Qemu-devel] [PATCH for-2.3 1/6] vmdk: Use g_random_int to generate CID

2014-12-01 Thread Fam Zheng
This replaces two "time(NULL)" invocations with "g_random_int()".
According to VMDK spec, CID "is a random 32‐bit value updated the first
time the content of the virtual disk is modified after the virtual disk
is opened". Using "seconds since epoch" is just a "lame way" to generate
it, and not completely safe because of the low precision.

Suggested-by: Markus Armbruster 
Signed-off-by: Fam Zheng 
---
 block/vmdk.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 2cbfd3e..ebb4b70 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -28,6 +28,7 @@
 #include "qemu/module.h"
 #include "migration/migration.h"
 #include 
+#include 
 
 #define VMDK3_MAGIC (('C' << 24) | ('O' << 16) | ('W' << 8) | 'D')
 #define VMDK4_MAGIC (('K' << 24) | ('D' << 16) | ('M' << 8) | 'V')
@@ -1538,7 +1539,7 @@ static int vmdk_write(BlockDriverState *bs, int64_t 
sector_num,
 /* update CID on the first write every time the virtual disk is
  * opened */
 if (!s->cid_updated) {
-ret = vmdk_write_cid(bs, time(NULL));
+ret = vmdk_write_cid(bs, g_random_int());
 if (ret < 0) {
 return ret;
 }
@@ -1922,7 +1923,7 @@ static int vmdk_create(const char *filename, QemuOpts 
*opts, Error **errp)
 }
 /* generate descriptor file */
 desc = g_strdup_printf(desc_template,
-   (uint32_t)time(NULL),
+   g_random_int(),
parent_cid,
fmt,
parent_desc_line,
-- 
1.9.3




[Qemu-devel] [PATCH for-2.3 2/6] vmdk: Fix comment to match code of extent lines

2014-12-01 Thread Fam Zheng
commit 04d542c8b (vmdk: support vmfs files) added support of VMFS extent
type but the comment above the changed code is left out. Update the
comment so they are consistent.

Signed-off-by: Fam Zheng 
---
 block/vmdk.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index ebb4b70..28d22db 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -785,10 +785,11 @@ static int vmdk_parse_extents(const char *desc, 
BlockDriverState *bs,
 VmdkExtent *extent;
 
 while (*p) {
-/* parse extent line:
+/* parse extent line in one of below formats:
+ *
  * RW [size in sectors] FLAT "file-name.vmdk" OFFSET
- * or
  * RW [size in sectors] SPARSE "file-name.vmdk"
+ * RW [size in sectors] VMFS "file-name.vmdk"
  */
 flat_offset = -1;
 ret = sscanf(p, "%10s %" SCNd64 " %10s \"%511[^\n\r\"]\" %" SCNd64,
-- 
1.9.3




[Qemu-devel] [PATCH for-2.3 4/6] vmdk: Check descriptor file length when reading it

2014-12-01 Thread Fam Zheng
Since a too small file cannot be a valid VMDK image, and also since the
buffer's first 4 bytes will be unconditionally examined by
vmdk_open_sparse, let's error out the small file case to be clear.

Signed-off-by: Fam Zheng 
---
 block/vmdk.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/block/vmdk.c b/block/vmdk.c
index 0c5769c..f7c7979 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -557,6 +557,11 @@ static char *vmdk_read_desc(BlockDriverState *file, 
uint64_t desc_offset,
 return NULL;
 }
 
+if (size < 4) {
+error_setg_errno(errp, -size, "File is too small, not a valid image");
+return NULL;
+}
+
 size = MIN(size, 1 << 20);  /* avoid unbounded allocation */
 buf = g_malloc(size);
 
-- 
1.9.3




[Qemu-devel] [PATCH for-2.3 6/6] vmdk: Set errp on failures in vmdk_open_vmdk4

2014-12-01 Thread Fam Zheng
Reported-by: Markus Armbruster 
Signed-off-by: Fam Zheng 
---
 block/vmdk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/vmdk.c b/block/vmdk.c
index 22f85c4..dd97e25 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -642,6 +642,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
                          bs->file->total_sectors * 512 - 1536,
                          &footer, sizeof(footer));
         if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to read footer");
             return ret;
         }
 
@@ -653,6 +654,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
             le32_to_cpu(footer.eos_marker.size) != 0  ||
             le32_to_cpu(footer.eos_marker.type) != MARKER_END_OF_STREAM)
         {
+            error_setg(errp, "Invalid footer");
             return -EINVAL;
         }
 
@@ -683,6 +685,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
     l1_entry_sectors = le32_to_cpu(header.num_gtes_per_gt)
                        * le64_to_cpu(header.granularity);
     if (l1_entry_sectors == 0) {
+        error_setg(errp, "L1 entry size is invalid");
         return -EINVAL;
     }
     l1_size = (le64_to_cpu(header.capacity) + l1_entry_sectors - 1)
-- 
1.9.3
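The footer sanity check in the second hunk compares the end-of-stream marker fields against fixed values. Below is a hedged Python sketch of that validation, assuming a little-endian marker layout (uint64 val, uint32 size, uint32 type) and an end-of-stream marker type of 0; the exact offsets are illustrative, not QEMU's struct layout:

```python
import struct

MARKER_END_OF_STREAM = 0  # assumed value for illustration

def eos_marker_valid(marker: bytes) -> bool:
    # Unpack little-endian fields: uint64 val, uint32 size, uint32 type,
    # mirroring the le64_to_cpu/le32_to_cpu conversions in the hunk above.
    val, size, mtype = struct.unpack("<QII", marker[:16])
    return val == 0 and size == 0 and mtype == MARKER_END_OF_STREAM

good = struct.pack("<QII", 0, 0, MARKER_END_OF_STREAM)
bad = struct.pack("<QII", 0, 1, MARKER_END_OF_STREAM)
print(eos_marker_valid(good), eos_marker_valid(bad))  # True False
```

With the patch, a marker failing this check now reports "Invalid footer" through errp instead of returning a bare -EINVAL.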




Re: [Qemu-devel] [PATCHv3] block: add event when disk usage exceeds threshold

2014-12-01 Thread Francesco Romani
Thanks for the quick review!

- Original Message -
> From: "Eric Blake" 
> To: "Francesco Romani" , qemu-devel@nongnu.org
> Cc: kw...@redhat.com, mdr...@linux.vnet.ibm.com, stefa...@redhat.com, 
> lcapitul...@redhat.com
> Sent: Monday, December 1, 2014 10:07:38 PM
> Subject: Re: [Qemu-devel] [PATCHv3] block: add event when disk usage exceeds  
> threshold
> 
> On 11/28/2014 05:31 AM, Francesco Romani wrote:
> > Management applications, like oVirt (http://www.ovirt.org), make extensive
> > use of thin-provisioned disk images.
> > To let the guest run smoothly and not be unnecessarily paused, oVirt sets
> > a disk usage threshold (a so-called 'high water mark') based on the
> > occupation of the device, and automatically extends the image once the
> > threshold is reached or exceeded.
> > 
> > In order to detect the crossing of the threshold, oVirt has no choice but
> > to aggressively poll the QEMU monitor using the query-blockstats command.
> > This leads to unnecessary system load, and is made even worse at scale:
> > deployments with hundreds of VMs are no longer rare.
> > 
> > To fix this, this patch adds:
> > * A new monitor command to set a mark for a given block device.
> > * A new event to report if a block device usage exceeds the threshold.
> > 
> > This will allow the managing application to drop the polling
> > altogether and just wait for a watermark crossing event.
> 
> I like the idea!
> 
> Question - what happens if management misses the event (for example, if
> libvirtd is restarted)?  Does the existing 'query-blockstats' and/or
> 'query-named-block-nodes' still work to query the current threshold and
> whether it has been exceeded, as a poll-once command executed when
> reconnecting to the monitor?

Indeed oVirt wants to keep the existing polling and use it as a fallback,
to make sure no events are missed. oVirt will not rely on the new notification
*alone*.
The plan is to "just" poll *much less* frequently. Today's default poll
rate is once every 2 seconds, so there is a lot of room for improvement.

> 
> > 
> > Signed-off-by: Francesco Romani 
> > ---
> 
> No need for a 0/1 cover letter on a 1-patch series; you have the option
> of just putting the side-band information here and sending it as a
> single mail.  But the cover letter approach doesn't hurt either, and I
> can see how it can be easier for some workflows to always send a cover
> letter than to special-case a 1-patch series.

Also, I found that a separate cover letter helped by providing context, an
introduction, and room for comments, especially on the first two revisions of
the patch, which were tagged as RFC. I have no problem dropping the cover
letter now that consensus is forming and the patch is taking shape; just let
me know.

[... snip spelling: thanks! will fix]
> > +++ b/qapi/block-core.json
> > @@ -239,6 +239,9 @@
> >  #
> >  # @iops_size: #optional an I/O size in bytes (Since 1.7)
> >  #
> > +# @write-threshold: configured write threshold for the device.
> > +#   0 if disabled. (Since 2.3)
> > +#
> >  # Since: 0.14.0
> >  #
> >  ##
> > @@ -253,7 +256,7 @@
> >  '*bps_max': 'int', '*bps_rd_max': 'int',
> >  '*bps_wr_max': 'int', '*iops_max': 'int',
> >  '*iops_rd_max': 'int', '*iops_wr_max': 'int',
> > -'*iops_size': 'int' } }
> > +'*iops_size': 'int', 'write-threshold': 'uint64' } }
> 
> In QMP specs, 'uint64' and 'int' are practically synonyms.  I can live
> with either spelling, although 'int' is more common.
> 
> Bikeshed on naming: Although we prefer '-' over '_' in new interfaces,
> we also favor consistency, and BlockDeviceInfo is one of those dinosaur
> commands that uses _ everywhere until your addition.  So naming this
> field 'write_threshold' would be more consistent.

Agreed. Will fix.

> > +#
> > +# @node-name: graph node name on which the threshold was exceeded.
> > +#
> > +# @amount-exceeded: amount of data which exceeded the threshold, in bytes.
> > +#
> > +# @offset-threshold: last configured threshold, in bytes.
> > +#
> 
> Might want to mention that this event is one-shot; after it triggers, a
> user must re-register a threshold to get the event again.

Good point. Will fix.
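The one-shot behaviour under discussion can be sketched as follows. This is an illustrative model, not QEMU's implementation; the event and field names follow the patch as posted, and the class and method names are invented for the example:

```python
# One-shot write-threshold semantics: once a write crosses the configured
# threshold, the event fires and the threshold is disarmed (reset to 0,
# meaning disabled) until the client re-registers it.
class BlockNode:
    def __init__(self, name):
        self.name = name
        self.write_threshold = 0  # 0 = disabled

    def set_write_threshold(self, offset):
        self.write_threshold = offset

    def write(self, offset, length, emit):
        end = offset + length
        if self.write_threshold and end > self.write_threshold:
            emit({"event": "BLOCK_USAGE_THRESHOLD",
                  "node-name": self.name,
                  "amount-exceeded": end - self.write_threshold,
                  "threshold": self.write_threshold})
            self.write_threshold = 0  # one-shot: disarm after firing

events = []
node = BlockNode("drive0")
node.set_write_threshold(1024)
node.write(0, 512, events.append)     # below threshold: no event
node.write(512, 1024, events.append)  # crosses 1024: one event
node.write(2048, 512, events.append)  # disarmed: no further event
print(len(events), events[0]["amount-exceeded"])  # 1 512
```

A client that misses the event (e.g. across a libvirtd restart) would fall back to querying the current threshold via query-blockstats, as discussed above.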

> 
> > +# Since: 2.3
> > +##
> > +{ 'event': 'BLOCK_USAGE_THRESHOLD',
> > +  'data': { 'node-name': 'str',
> > +   'amount-exceeded': 'uint64',
> 
> TAB damage.  Please use spaces.  ./scripts/checkpatch.pl will catch some
> offenders (although I didn't test if it will catch this one).
> 
> However, here you are correct in using '-' for naming :)

Oops, a rebase error (it was clean before!). Will fix.

> 
> > +   'threshold': 'uint64' } }
> > +
> > +##
> > +# @block-set-threshold
> > +#
> > +# Change usage threshold for a block drive. An event will be delivered
> > +# if a write to this block drive crosses the configured threshold.
> > +# This is useful to transparently resize thin-provisioned drives without
> > +# the guest OS noticing.
> > +#
> > +# @node-name: graph