date:20160927

[Qemu-block] [Qemu-devel][PATCH] block: modify top-id's comments

2016-09-27 Thread Wang WeiWei

Signed-off-by: Wen Congyang 
Signed-off-by: Changlong Xie 
Signed-off-by: Wang WeiWei 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
Reviewed-by: Eric Blake 
---
 qapi/block-core.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index ada3202..0935b81 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2184,7 +2184,7 @@
 # @mode: the replication mode
 #
 # @top-id: #optional In secondary mode, node name or device ID of the root
-#  node who owns the replication node chain. Ignored in primary mode.
+#  node who owns the replication node chain. Must not be given in 
primary mode.
 #
 # Since: 2.8
 ##
-- 
1.9.3

Re: [Qemu-block] [Qemu-devel] [PATCH 3/3] linux-aio: fix re-entrant completion processing

2016-09-27 Thread Fam Zheng

On Tue, 09/27 19:55, Roman Penyaev wrote:
> > The bug is 100% deterministic.  Just boot up a guest with -drive
> > format=qcow2,aio=native.
> 
> It turns out to be that everything is broken.  I started all my
> tests with format=raw,aio=native and immediately got coroutine
> recursive.  That is completely weird.
> 
> So, what I did is the following:
> 
> 1. Took latest master (nothing works)
> 2. Did interactive rebase to 12c8720
> 12c8720 2016-06-28 | Merge remote-tracking branch
> 'remotes/stefanha/tags/block-pull-request' into staging [Peter
> Maydell]
> 
> this merge request includes all your patches related to
> virtio-blk and MQ support.
> 
> 3. Applied 0ed93d84edab. Everything works fine.

Have you tried qcow2 at this point? raw crashes with 1a62d0accdf85 doesn't mean
qcow2 is fine without it.

Fam

> 
> 4. Rebased up till 0647d47:
> 0647d47 2016-09-13 | qcow2: avoid memcpy(dst, NULL, len) [Stefan Hajnoczi]
> 
> this is the point, after which 0ed93d84edab was applied
> on master.
> 
> Got recursive coroutine, so nothing works.
> 
> 5. Did a besect, which shows this commit:
> 
> --
> commit 1a62d0accdf85fbeac149018ee8d1728e754de73
> Author: Eric Blake 
> Date:   Fri Jul 15 12:31:59 2016 -0600
> 
> block: Fragment reads to max transfer length
> --
> 
> So after this commit my commit 0ed93d84edab stops working.
> And now for me is completely not clear what is happening there.
> 
> --
> Roman
>

Re: [Qemu-block] [PATCH v2 3/3] linux-aio: fix re-entrant completion processing

2016-09-27 Thread Fam Zheng

On Tue, 09/27 16:18, Stefan Hajnoczi wrote:
> Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process
> completions from ioq_submit()") added an optimization that processes
> completions each time ioq_submit() returns with requests in flight.
> This commit introduces a "Co-routine re-entered recursively" error which
> can be triggered with -drive format=qcow2,aio=native.
> 
> Fam Zheng , Kevin Wolf , and I
> debugged the following backtrace:
> 
>   (gdb) bt
>   #0  0x70a046f5 in raise () at /lib64/libc.so.6
>   #1  0x70a062fa in abort () at /lib64/libc.so.6
>   #2  0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at 
> util/qemu-coroutine.c:113
>   #3  0x55a4b663 in qemu_laio_process_completions 
> (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218
>   #4  0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at 
> block/linux-aio.c:331
>   #5  0x55a4ba12 in laio_do_submit (fd=fd@entry=13, 
> laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, 
> type=type@entry=1) at block/linux-aio.c:383
>   #6  0x55a4bbd3 in laio_co_submit (bs=, 
> s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at 
> block/linux-aio.c:402
>   #7  0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, 
> offset=offset@entry=2932727808, bytes=bytes@entry=8192, 
> qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804
>   #8  0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, 
> req=req@entry=0x59d38d20, offset=offset@entry=2932727808, 
> bytes=bytes@entry=8192, align=align@entry=512, 
> qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:1041
>   #9  0x55a52db8 in bdrv_co_preadv (child=, 
> offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, 
> flags=flags@entry=0) at block/io.c:1133
>   #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, 
> offset=6178725888, bytes=8192, qiov=0x57527840, flags=) at 
> block/qcow2.c:1509
>   #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, 
> offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
> qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804
>   #12 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x56635890, 
> req=req@entry=0x59d39000, offset=offset@entry=6178725888, 
> bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x57527840, 
> flags=0) at block/io.c:1041
>   #13 0x55a52db8 in bdrv_co_preadv (child=, 
> offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
> qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133
>   #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, 
> offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at 
> block/block-backend.c:783
>   #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at 
> block/block-backend.c:991
>   #16 0x55ac0cfa in coroutine_trampoline (i0=, 
> i1=) at util/coroutine-ucontext.c:78
> 
> It turned out that re-entrant ioq_submit() and completion processing
> between three requests caused this error.  The following check is not
> sufficient to prevent recursively entering coroutines:
> 
>   if (laiocb->co != qemu_coroutine_self()) {
>   qemu_coroutine_enter(laiocb->co);
>   }
> 
> As the following coroutine backtrace shows, not just the current
> coroutine (self) can be entered.  There might also be other coroutines
> that are currently entered and transferred control due to the qcow2 lock
> (CoMutex):
> 
>   (gdb) qemu coroutine 0x583464d0
>   #0  0x55ac0c90 in qemu_coroutine_switch 
> (from_=from_@entry=0x583464d0, to_=to_@entry=0x572f9890, 
> action=action@entry=COROUTINE_ENTER) at util/coroutine-ucontext.c:175
>   #1  0x55abfe54 in qemu_coroutine_enter (co=0x572f9890) at 
> util/qemu-coroutine.c:117
>   #2  0x55ac031c in qemu_co_queue_run_restart 
> (co=co@entry=0x583462c0) at util/qemu-coroutine-lock.c:60
>   #3  0x55abfe5e in qemu_coroutine_enter (co=0x583462c0) at 
> util/qemu-coroutine.c:119
>   #4  0x55a4b663 in qemu_laio_process_completions 
> (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218
>   #5  0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at 
> block/linux-aio.c:331
>   #6  0x55a4ba12 in laio_do_submit (fd=fd@entry=13, 
> laiocb=laiocb@entry=0x5a338b40, offset=offset@entry=2911477760, 
> type=type@entry=1) at block/linux-aio.c:383
>   #7  0x55a4bbd3 in laio_co_submit (bs=, 
> s=0x57e2f7f0, fd=13, offset=2911477760, qiov=0x5a338e80, type=1) at 
> block/linux-aio.c:402
>   #8  0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, 
> offset=offset@entry=2911477760, bytes=bytes@entry=8192, 
> qiov=qiov@entry=0x5a338e80, flags=0) at block/io.c:804
>   #9  0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, 
>

Re: [Qemu-block] [PATCH v2 2/3] test-coroutine: test qemu_coroutine_entered()

2016-09-27 Thread Fam Zheng

On Tue, 09/27 16:18, Stefan Hajnoczi wrote:
> Signed-off-by: Stefan Hajnoczi 
> ---
>  tests/test-coroutine.c | 42 ++
>  1 file changed, 42 insertions(+)
> 
> diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c
> index 6431dd6..abd97c2 100644
> --- a/tests/test-coroutine.c
> +++ b/tests/test-coroutine.c
> @@ -53,6 +53,47 @@ static void test_self(void)
>  }
>  
>  /*
> + * Check that qemu_coroutine_entered() works
> + */

Not related to this patch:

It's a bit weird that in this file function header comments are followed by a
blank line, and in one case it even looks like as odd as this:


static void test_order(void)
{
int i;
const struct coroutine_position expected_pos[] = {
{1, 1,}, {2, 1}, {1, 2}, {2, 2}, {1, 3}
};
do_order_test();
g_assert_cmpint(record_pos, ==, 5);
for (i = 0; i < record_pos; i++) {
g_assert_cmpint(records[i].func , ==, expected_pos[i].func );
g_assert_cmpint(records[i].state, ==, expected_pos[i].state);
}
}
/*
 * Lifecycle benchmark
 */

static void coroutine_fn empty_coroutine(void *opaque)
{
/* Do nothing */
}

> +
> +static void coroutine_fn verify_entered_step_2(void *opaque)
> +{
> +Coroutine *caller = (Coroutine *)opaque;
> +
> +g_assert(qemu_coroutine_entered(caller));
> +g_assert(qemu_coroutine_entered(qemu_coroutine_self()));
> +qemu_coroutine_yield();
> +
> +/* Once more to check it still works after yielding */
> +g_assert(qemu_coroutine_entered(caller));
> +g_assert(qemu_coroutine_entered(qemu_coroutine_self()));
> +qemu_coroutine_yield();
> +}
> +
> +static void coroutine_fn verify_entered_step_1(void *opaque)
> +{
> +Coroutine *self = qemu_coroutine_self();
> +Coroutine *coroutine;
> +
> +g_assert(qemu_coroutine_entered(self));
> +
> +coroutine = qemu_coroutine_create(verify_entered_step_2, self);
> +g_assert(!qemu_coroutine_entered(coroutine));
> +qemu_coroutine_enter(coroutine);
> +g_assert(!qemu_coroutine_entered(coroutine));
> +qemu_coroutine_enter(coroutine);
> +}
> +
> +static void test_entered(void)
> +{
> +Coroutine *coroutine;
> +
> +coroutine = qemu_coroutine_create(verify_entered_step_1, NULL);
> +g_assert(!qemu_coroutine_entered(coroutine));
> +qemu_coroutine_enter(coroutine);
> +}
> +
> +/*
>   * Check that coroutines may nest multiple levels
>   */
>  
> @@ -389,6 +430,7 @@ int main(int argc, char **argv)
>  g_test_add_func("/basic/yield", test_yield);
>  g_test_add_func("/basic/nesting", test_nesting);
>  g_test_add_func("/basic/self", test_self);
> +g_test_add_func("/basic/entered", test_entered);
>  g_test_add_func("/basic/in_coroutine", test_in_coroutine);
>  g_test_add_func("/basic/order", test_order);
>  if (g_test_perf()) {
> -- 
> 2.7.4
> 

Reviewed-by: Fam Zheng

Re: [Qemu-block] [PATCH v2 1/3] coroutine: add qemu_coroutine_entered() function

2016-09-27 Thread Fam Zheng

On Tue, 09/27 16:18, Stefan Hajnoczi wrote:
> See the doc comments for a description of this new coroutine API.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  include/qemu/coroutine.h | 13 +
>  util/qemu-coroutine.c|  5 +
>  2 files changed, 18 insertions(+)
> 
> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
> index 29a2078..e6a60d5 100644
> --- a/include/qemu/coroutine.h
> +++ b/include/qemu/coroutine.h
> @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void);
>   */
>  bool qemu_in_coroutine(void);
>  
> +/**
> + * Return true if the coroutine is currently entered
> + *
> + * A coroutine is "entered" if it has not yielded from the current
> + * qemu_coroutine_enter() call used to run it.  This does not mean that the
> + * coroutine is currently executing code since it may have transferred 
> control
> + * to another coroutine using qemu_coroutine_enter().
> + *
> + * When several coroutines enter each other there may be no way to know which
> + * ones have already been entered.  In such situations this function can be
> + * used to avoid recursively entering coroutines.
> + */
> +bool qemu_coroutine_entered(Coroutine *co);
>  
>  
>  /**
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index 3cbf225..737bffa 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -146,3 +146,8 @@ void coroutine_fn qemu_coroutine_yield(void)
>  self->caller = NULL;
>  qemu_coroutine_switch(self, to, COROUTINE_YIELD);
>  }
> +
> +bool qemu_coroutine_entered(Coroutine *co)
> +{
> +return co->caller;
> +}
> -- 
> 2.7.4
> 

Reviewed-by: Fam Zheng

Re: [Qemu-block] [PATCH v24 11/12] support replication driver in blockdev-add

2016-09-27 Thread wangweiwei


在 2016年09月12日 22:01, Stefan Hajnoczi 写道:

On Mon, Aug 15, 2016 at 05:32:19PM +0800, Changlong Xie wrote:

On 08/15/2016 04:37 PM, Kevin Wolf wrote:

Am 15.08.2016 um 03:49 hat Changlong Xie geschrieben:

On 08/09/2016 05:08 PM, Kevin Wolf wrote:

Am 27.07.2016 um 09:01 hat Changlong Xie geschrieben:

From: Wen Congyang 

Signed-off-by: Wen Congyang 
Signed-off-by: Changlong Xie 
Signed-off-by: Wang WeiWei 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
Reviewed-by: Eric Blake 



@@ -2078,6 +2079,23 @@
   { 'enum' : 'ReplicationMode', 'data' : [ 'primary', 'secondary' ] }

   ##
+# @BlockdevOptionsReplication
+#
+# Driver specific block device options for replication
+#
+# @mode: the replication mode
+#
+# @top-id: #optional In secondary mode, node name or device ID of the root
+#  node who owns the replication node chain. Ignored in primary mode.


Can we change this to "Must not be given in primary mode"? Not sure what
the code currently does, but I think it should error out if top-id is


Replication driver will ignore "top-id" parameter in Primary mode.


This is not good behaviour, which is why I requested a change.



Hi stefan

Would you like me send another [PATCH v25] based your block-next? Or a
separate patch until your tree is merged.


Sorry for the slow response.  Please send a new patch on top of my
block-next tree.

Stefan


ok, I will send a new patch on the top of block-next tree.

Re: [Qemu-block] [PATCH v24 00/12] Block replication for continuous checkpoints

2016-09-27 Thread wangweiwei


在 2016年09月12日 22:11, Stefan Hajnoczi 写道:

On Mon, Aug 08, 2016 at 03:50:27PM +0100, Stefan Hajnoczi wrote:

On Wed, Jul 27, 2016 at 03:01:41PM +0800, Changlong Xie wrote:

Block replication is a very important feature which is used for
continuous checkpoints(for example: COLO).

You can get the detailed information about block replication from here:
http://wiki.qemu.org/Features/BlockReplication

Usage:
Please refer to docs/block-replication.txt

You can get the patch here:
https://github.com//Pating/qemu/tree/block-replication-v24

You can get the patch with framework here:
https://github.com//Pating/qemu/tree/colo_framework_v23

TODO:
1. Continuous block replication. It will be started after basic functions
are accepted.

Change Log:

V24:
1. Address comments from Max
p9: pass NULL to bdrv_lookup_bs(), and introduce bdrv_is_root_node() to check 
top_bs
p11: perfect @top-id description, and make it #optional
p12: "replication" => "Replication", add docs/block-replication.txt
Note: we need bdrv_is_root_node() in p9, so this patchset is based on 
kevin/qmp-node-name,
V23:
1. Address comments from Stefan and Max, this series introduce p7/p12
p2. add Copyright for block_backup.h
p7. support configure --disable-replication
p8. update 2.7 to 2.8
p11. update 2.7 to 2.8, add missing "top-id"
p12. update MAINTAINERS
V22:
1. Rebase to the lastest code
2. modify code adapt to the modification of backup_start & commit_active_start
3. rewrite io_read & io_write for interface changes
V21:
1. Rebase to the lastest code
2. use bdrv_pwrite_zeroes() and BDRV_SECTOR_BITS for p9
V20 Resend:
1. Resend to avoid bothering qemu-trivial maintainers
2. Address comments from Eric, fix header file issue and add a brief commit 
message for p7
V20:
1. Rebase to the lastest code
2. Address comments from stefan
p8:
1. error_setg() with an error message when check_top_bs() fails.
2. remove bdrv_ref(s->hidden_disk->bs) since commit 5c438bc6
3. use bloc_job_cancel_sync() before active commit
p9:
1. fix uninitialized 'pattern_buf'
2. introduce mkstemp(3) to fix unique filenames
3. use qemu_vfree() for qemu_blockalign() memory
4. add missing replication_start_all()
5. remove useless pattern for io_write()
V19:
1. Rebase to v2.6.0
2. Address comments from stefan
p3: a new patch that export interfaces for extra serialization
p8:
1. call replication_stop() before freeing s->top_id
2. check top_bs
3. reopen file readonly in error return paths
4. enable extra serialization between read and COW
p9: try to hanlde SIGABRT
V18:
p6: add local_err in all replication callbacks to prevent "errp == NULL"
p7: add missing qemu_iovec_destroy(xxx)
V17:
1. Rebase to the lastest codes
p2: refactor backup_do_checkpoint addressed comments from Jeff Cody
p4: fix bugs in "drive_add buddy xxx" hmp commands
p6: add "since: 2.7"
p7: fix bug in replication_close(), add missing "qapi/error.h", add 
test-replication
p8: add "since: 2.7"
V16:
1. Rebase to the newest codes
2. Address comments from Stefan & hailiang
p3: we don't need this patch now
p4: add "top-id" parameters for secondary
p6: fix NULL pointer in replication callbacks, remove unnecessary typedefs,
add doc comments that explain the semantics of Replication
p7: Refactor AioContext for thread-safe, remove unnecessary get_top_bs()
*Note*: I'm working on replication testcase now, will send out in V17
V15:
1. Rebase to the newest codes
2. Fix typos and coding style addresed Eric's comments
3. Address Stefan's comments
1) Make backup_do_checkpoint public, drop the changes on BlockJobDriver
2) Update the message and description for [PATCH 4/9]
3) Make replication_(start/stop/do_checkpoint)_all as global interfaces
4) Introduce AioContext lock to protect start/stop/do_checkpoint callbacks
5) Use BdrvChild instead of holding on to BlockDriverState * pointers
4. Clear BDRV_O_INACTIVE for hidden disk's open_flags since commit 09e0c771
5. Introduce replication_get_error_all to check replication status
6. Remove useless discard interface
V14:
1. Implement auto complete active commit
2. Implement active commit block job for replication.c
3. Address the comments from Stefan, add replication-specific API and data
structure, also remove old block layer APIs
V13:
1. Rebase to the newest codes
2. Remove redundant marcos and semicolon in replication.c
3. Fix typos in block-replication.txt
V12:
1. Rebase to the newest codes
2. Use backing reference to replcace 'allow-write-backing-file'
V11:
1. Reopen the backing file when starting blcok replication if it is not
opened in R/W mode
2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
when opening backing file
3. Block the top BDS so there is only one block job for the top BDS and
its backing chain.
V10:
1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
reference.
2. Address the comments from Eric Blake
V9:
1. Update the error messages
2. Rebase to the newest qemu
3. Split child add/delete

Re: [Qemu-block] [Qemu-devel] [PATCH v14 04/19] qapi: add trace events for visitor

2016-09-27 Thread Eric Blake

On 09/27/2016 08:13 AM, Daniel P. Berrange wrote:
> Allow tracing of the operation of visitors

Ooooh, shiny!

> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  Makefile.objs  |  1 +
>  qapi/qapi-visit-core.c | 27 +++
>  qapi/trace-events  | 33 +
>  3 files changed, 61 insertions(+)
>  create mode 100644 qapi/trace-events
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PATCH v14 07/19] qapi: don't pass two copies of TestInputVisitorData to tests

2016-09-27 Thread Eric Blake

On 09/27/2016 05:10 PM, Eric Blake wrote:
> On 09/27/2016 08:13 AM, Daniel P. Berrange wrote:
>> The input_visitor_test_add() method was accepting an instance
>> of 'TestInputVisitorData' and passing it as the 'user_data'
>> parameter to test functions. The main 'TestInputVisitorData'
>> instance that was actually used, was meanwhile being allocated
>> automatically by the test framework fixture setup.
>>
>> Signed-off-by: Daniel P. Berrange 
>> ---
>>  tests/test-qobject-input-visitor.c | 76 
>> --
>>  1 file changed, 32 insertions(+), 44 deletions(-)
>>
> 
> Reviewed-by: Eric Blake 
> 

Having said that, I note that ALL callers now pass NULL for user_data.
If you plan on using it later in the series for something other than
NULL for some of the (new?) tests added at that point, it would be wise
to say so in the commit message; if not, I would suggest eliminating the
parameter altogether.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PATCH v14 07/19] qapi: don't pass two copies of TestInputVisitorData to tests

2016-09-27 Thread Eric Blake

On 09/27/2016 08:13 AM, Daniel P. Berrange wrote:
> The input_visitor_test_add() method was accepting an instance
> of 'TestInputVisitorData' and passing it as the 'user_data'
> parameter to test functions. The main 'TestInputVisitorData'
> instance that was actually used, was meanwhile being allocated
> automatically by the test framework fixture setup.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  tests/test-qobject-input-visitor.c | 76 
> --
>  1 file changed, 32 insertions(+), 44 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PATCH v14 03/19] option: allow qemu_opts_to_qdict to merge repeated options

2016-09-27 Thread Eric Blake

On 09/27/2016 08:13 AM, Daniel P. Berrange wrote:
> If given an option string such as
> 
>   size=1024,nodes=10,nodes=4-5,nodes=1-2,policy=bind
> 
> the qemu_opts_to_qdict() method will currently overwrite
> the values for repeated option keys, so only the last
> value is in the returned dict:
> 
> size=1024
> nodes=1-2
> policy=bind
> 
> This adds the ability for the caller to ask that the
> repeated keys be turned into list indexes:
> 
> size=1024
> nodes.0=10
> nodes.1=4-5
> nodes.2=1-2
> policy=bind
> 
> Note that the conversion has no way of knowing whether
> any given key is expected to be a list upfront - it can
> only figure that out when seeing the first duplicated
> key. Thus the caller has to be prepared to deal with the
> fact that if a key 'foo' is a list, then the returned
> qdict may contain either 'foo' (if only a single instance
> of the key was seen) or 'foo.NN' (if multiple instances
> of the key were seen).
> 
> Signed-off-by: Daniel P. Berrange 
> ---

If I'm not mistaken, this policy adds a new policy, but all existing
clients use the old policy, and the new policy is exercised only by the
testsuite additions.  Might be useful to mention that in the commit
message, rather than making me read the whole commit before guessing that.

> +++ b/blockdev.c
> @@ -911,7 +911,8 @@ DriveInfo *drive_new(QemuOpts *all_opts, 
> BlockInterfaceType block_default_type)
>  
>  /* Get a QDict for processing the options */
>  bs_opts = qdict_new();
> -qemu_opts_to_qdict(all_opts, bs_opts);
> +qemu_opts_to_qdict(all_opts, bs_opts,
> +   QEMU_OPTS_REPEAT_POLICY_LAST);

git send-email/format-patch -O/path/to/file (or the corresponding config
option) allows you to sort the diff to put the interesting stuff first
(in this case, the new enum).

> +++ b/include/qemu/option.h
> @@ -125,7 +125,13 @@ void qemu_opts_set_defaults(QemuOptsList *list, const 
> char *params,
>  int permit_abbrev);
>  QemuOpts *qemu_opts_from_qdict(QemuOptsList *list, const QDict *qdict,
> Error **errp);
> -QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict);
> +typedef enum {
> +QEMU_OPTS_REPEAT_POLICY_LAST,
> +QEMU_OPTS_REPEAT_POLICY_LIST,

Hmm. I suspect this subtle difference (one vowel) to be the source of
typo bugs.  Can we come up with more obvious policy names, such as
LAST_ONLY vs. INTO_LIST?  Except that doing that makes it harder to fit
80 columns.  So up to you if you want to ignore me here.

On the other hand, a documentation comment here would go a long ways to
helping future readers:

LAST: last occurrence of a duplicate option silently overwrites all earlier
LIST: each occurrence of a duplicate option converts it into a list

maybe you also want to add:

ERROR: an occurrence of a duplicate option is considered an error

Also, while you turn 'foo=a,foo=b' into 'foo.0=a,foo.1=b', does your
code correctly handle the cases of 'foo.0=a,foo=b' and 'foo=a,foo.1=b'?
(And what IS the correct handling of those cases logically supposed to be?)

> +++ b/tests/test-qemu-opts.c
> @@ -421,6 +421,45 @@ static void test_qemu_opts_set(void)
>  g_assert(opts == NULL);
>  }
>  
> +
> +static void test_qemu_opts_to_qdict(void)
> +{

Here would be a good place to test the two mixed-use optstrings I
mentioned above (inconsistent use of plain vs. list syntax).

> +}
> +
>  int main(int argc, char *argv[])
>  {

> +++ b/util/qemu-option.c
> @@ -1058,10 +1058,12 @@ void qemu_opts_absorb_qdict(QemuOpts *opts, QDict 
> *qdict, Error **errp)
>   * TODO We'll want to use types appropriate for opt->desc->type, but
>   * this is enough for now.
>   */
> -QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict)
> +QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict,
> +  QemuOptsRepeatPolicy repeatPolicy)
>  {
>  QemuOpt *opt;
> -QObject *val;
> +QObject *val, *prevval;
> +QDict *lists = qdict_new();
>  
>  if (!qdict) {
>  qdict = qdict_new();
> @@ -1070,9 +1072,42 @@ QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict)
>  qdict_put(qdict, "id", qstring_from_str(opts->id));
>  }
>  QTAILQ_FOREACH(opt, >head, next) {
> +gchar *key = NULL;
>  val = QOBJECT(qstring_from_str(opt->str));
> -qdict_put_obj(qdict, opt->name, val);
> +switch (repeatPolicy) {
> +case QEMU_OPTS_REPEAT_POLICY_LIST:
> +if (qdict_haskey(lists, opt->name)) {
> +/* Current val goes into 'foo.N' */
> +int64_t max = qdict_get_int(lists, opt->name);
> +max++;
> +key = g_strdup_printf("%s.%" PRId64, opt->name, max);
> +qdict_put_obj(lists, opt->name, QOBJECT(qint_from_int(max)));
> +qdict_put_obj(qdict, key, val);
> +} else if (qdict_haskey(qdict, opt->name)) {
> +/* Move previous

Re: [Qemu-block] [Qemu-devel] [PATCH v14 01/19] qdict: implement a qdict_crumple method for un-flattening a dict

2016-09-27 Thread Eric Blake

On 09/27/2016 08:13 AM, Daniel P. Berrange wrote:
> The qdict_flatten() method will take a dict whose elements are
> further nested dicts/lists and flatten them by concatenating
> keys.
> 
> The qdict_crumple() method aims to do the reverse, taking a flat
> qdict, and turning it into a set of nested dicts/lists. It will
> apply nesting based on the key name, with a '.' indicating a
> new level in the hierarchy. If the keys in the nested structure
> are all numeric, it will create a list, otherwise it will create
> a dict.
> 
> If the keys are a mixture of numeric and non-numeric, or the
> numeric keys are not in strictly ascending order, an error will
> be reported.
> 

> 
> The intent of this function is that it allows a set of QemuOpts
> to be turned into a nested data structure that mirrors the nesting
> used when the same object is defined over QMP.
> 
> Reviewed-by: Kevin Wolf 
> Reviewed-by: Marc-André Lureau 
> Signed-off-by: Daniel P. Berrange 
> ---

> +
> +/**
> + * qdict_split_flat_key:
> + * @key: the key string to split
> + * @prefix: non-NULL pointer to hold extracted prefix
> + * @suffix: non-NULL pointer to remaining suffix
> + *
> + * Given a flattened key such as 'foo.0.bar', split it into two parts
> + * at the first '.' separator. Allows double dot ('..') to escape the
> + * normal separator.
> + *
> + * eg

s/eg/e.g./ or just spell it as 'for example'


> +static int qdict_is_list(QDict *maybe_list, Error **errp)

> +
> +/* NB this isn't a perfect check - eg it won't catch

Another such use.

> + * a list containing '1', '+1', '01', '3', but that
> + * does not matter - we've still proved that the
> + * input is a list. It is up the caller to do a
> + * stricter check if desired */
> +if (len != (max + 1)) {
> +error_setg(errp, "List indexes are not contiguous, "

s/indexes/indices/ ? (my spellchecker likes both, but indexes is a sign
that modern English speakers are getting lazy and drifting away from Latin)

> +/**
> + * qdict_crumple:
> + * @src: the original flat dictionary (only scalar values) to crumple
> + * @recursive: true to recursively crumple nested dictionaries

> + * For example, an input of:
> + *
> + * { 'foo.0.bar': 'one', 'foo.0.wizz': '1',
> + *   'foo.1.bar': 'two', 'foo.1.wizz': '2' }
> + *
> + * will result in any output of:

s/any/an/

> + *
> + * {
> + *   'foo': [
> + *  { 'bar': 'one', 'wizz': '1' },
> + *  { 'bar': 'two', 'wizz': '2' }
> + *   ],
> + * }
> + *
> + * The following scenarios in the input dict will result in an
> + * error being returned:
> + *
> + *  - Any values in @src are non-scalar types
> + *  - If keys in @src imply that a particular level is both a
> + *list and a dict. eg, "foo.0.bar" and "foo.eek.bar".
> + *  - If keys in @src imply that a particular level is a list,
> + *but the indexes are non-contigous. eg "foo.0.bar" and

s/contigous/contiguous/

and another pesky 'eg'

Modulo typo fixes and potential grammar changes,
Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH v2 3/7] block/qapi: Move 'aio' option to file driver

2016-09-27 Thread Max Reitz

On 27.09.2016 10:20, Kevin Wolf wrote:
> Am 26.09.2016 um 18:49 hat Max Reitz geschrieben:
>> On 23.09.2016 16:32, Kevin Wolf wrote:
>>> The option whether or not to use a native AIO interface really isn't a
>>> generic option for all drivers, but only applies to the native file
>>> protocols. This patch moves the option in blockdev-add to the
>>> appropriate places (raw-posix and raw-win32).
>>>
>>> We still have to keep the flag BDRV_O_NATIVE_AIO for compatibility
>>> because so far the AIO option was usually specified on the wrong layer
>>> (the top-level format driver, which didn't even look at it) and then
>>> inherited by the protocol driver (where it was actually used). We can't
>>> forbid this use except in new interfaces.
>>>
>>> Signed-off-by: Kevin Wolf 
>>> ---
>>>  block/raw-posix.c  | 44 ---
>>>  block/raw-win32.c  | 56 
>>> +-
>>>  qapi/block-core.json   |  6 +++---
>>>  tests/qemu-iotests/087 |  4 ++--
>>>  4 files changed, 83 insertions(+), 27 deletions(-)
>>
>> [...]
>>
>>> diff --git a/block/raw-win32.c b/block/raw-win32.c
>>> index 56f45fe..734bb10 100644
>>> --- a/block/raw-win32.c
>>> +++ b/block/raw-win32.c
>>
>> [...]
>>
>>> +static bool get_aio_option(QemuOpts *opts, int flags, Error **errp)
>>> +{
>>> +BlockdevAioOptions aio, aio_default;
>>> +
>>> +aio_default = (flags & BDRV_O_NATIVE_AIO) ? BLOCKDEV_AIO_OPTIONS_NATIVE
>>> +  : 
>>> BLOCKDEV_AIO_OPTIONS_THREADS;
>>> +aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, 
>>> "aio"),
>>> +  BLOCKDEV_AIO_OPTIONS__MAX, aio_default, errp);
>>> +
>>> +switch (aio) {
>>> +case BLOCKDEV_AIO_OPTIONS_NATIVE:
>>> +return true;
>>> +case BLOCKDEV_AIO_OPTIONS_THREADS:
>>> +return false;
>>> +default:
>>> +error_setg(errp, "Invalid AIO option");
>>
>> Any reason for catching this case here but not in raw-posix?
>>
>> (Not that it really matters, though.)
> 
> Nobody will forget raw-posix when adding a new AIO mode to win32, so I
> didn't feel like the additional code was worth it there. But if we add a
> new AIO mode to raw-posix, I'm pretty sure we will forget win32.

:-)

Good point.

Max




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PULL 00/18] Block layer patches

2016-09-27 Thread Peter Maydell

On 27 September 2016 at 06:53, Kevin Wolf  wrote:
> The following changes since commit 7cfdc02dae0d2ff58c897496cfdbbafc0eda0f3f:
>
>   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
> (2016-09-26 19:47:00 +0100)
>
> are available in the git repository at:
>
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to 3b856cebe5e93547852c156ca2119d075e62aed7:
>
>   coroutine: reduce stack size to 60kB (2016-09-27 14:05:21 +0200)
>
> 
> Block layer patches
>
> 

I see 'make check' failures on x86-64 host, clang Linux:

  /i386/ahci/migrate/ncq/simple:   OK
  /i386/ahci/migrate/ncq/halted:   OK
  /i386/ahci/cdrom/dma/single: OK
  /i386/ahci/cdrom/dma/multi:  OK
  /i386/ahci/cdrom/pio/single:
Broken pipe
FAIL
GTester: last random seed: R02Sa8f729848b07c3b3e5ee67368f9d0350
(pid=10590)
  /i386/ahci/cdrom/pio/multi:
Broken pipe
FAIL
GTester: last random seed: R02Se85704e04bbd382223983c878723b811
(pid=10598)
FAIL: tests/ahci-test
TEST: tests/hd-geo-test... (pid=10601)
  /i386/hd-geo/ide/none:   OK

thanks
-- PMM

Re: [Qemu-block] [PATCH v3 1/3] qemu-nbd: Add --fork option

2016-09-27 Thread Max Reitz

On 27.09.2016 11:04, Kevin Wolf wrote:
> Am 25.08.2016 um 18:30 hat Max Reitz geschrieben:
>> Using the --fork option, one can make qemu-nbd fork the worker process.
>> The original process will exit on error of the worker or once the worker
>> enters the main loop.
>>
>> Suggested-by: Sascha Silbe 
>> Signed-off-by: Max Reitz 
>> ---
>>  qemu-nbd.c | 15 ++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/qemu-nbd.c b/qemu-nbd.c
>> index e3571c2..8c2d582 100644
>> --- a/qemu-nbd.c
>> +++ b/qemu-nbd.c
>> @@ -48,6 +48,7 @@
>>  #define QEMU_NBD_OPT_OBJECT260
>>  #define QEMU_NBD_OPT_TLSCREDS  261
>>  #define QEMU_NBD_OPT_IMAGE_OPTS262
>> +#define QEMU_NBD_OPT_FORK  263
>>  
>>  #define MBR_SIZE 512
>>  
>> @@ -503,6 +504,7 @@ int main(int argc, char **argv)
>>  { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS },
>>  { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS },
>>  { "trace", required_argument, NULL, 'T' },
>> +{ "fork", no_argument, NULL, QEMU_NBD_OPT_FORK },
>>  { NULL, 0, NULL, 0 }
>>  };
>>  int ch;
>> @@ -524,6 +526,8 @@ int main(int argc, char **argv)
>>  bool imageOpts = false;
>>  bool writethrough = true;
>>  char *trace_file = NULL;
>> +bool fork_process = false;
>> +int old_stderr = -1;
>>  
>>  /* The client thread uses SIGTERM to interrupt the server.  A signal
>>   * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
>> @@ -714,6 +718,9 @@ int main(int argc, char **argv)
>>  g_free(trace_file);
>>  trace_file = trace_opt_parse(optarg);
>>  break;
>> +case QEMU_NBD_OPT_FORK:
>> +fork_process = true;
>> +break;
>>  }
>>  }
>>  
>> @@ -773,7 +780,7 @@ int main(int argc, char **argv)
>>  return 0;
>>  }
>>  
>> -if (device && !verbose) {
>> +if ((device && !verbose) || fork_process) {
>>  int stderr_fd[2];
>>  pid_t pid;
>>  int ret;
>> @@ -796,6 +803,7 @@ int main(int argc, char **argv)
>>  ret = qemu_daemon(1, 0);
>>  
>>  /* Temporarily redirect stderr to the parent's pipe...  */
>> +old_stderr = dup(STDERR_FILENO);
>>  dup2(stderr_fd[1], STDERR_FILENO);
>>  if (ret < 0) {
>>  error_report("Failed to daemonize: %s", strerror(errno));
> 
> I don't have a kernel with NBD support, so I can't test this easily, but
> doesn't this break the daemon mode for --connect? I mean, it will still
> fork, but won't the parent be alive until the child exits?

Well, for me the parent closes as it should.

> 
>> @@ -951,6 +959,11 @@ int main(int argc, char **argv)
>>  exit(EXIT_FAILURE);
>>  }
>>  
>> +if (fork_process) {
>> +dup2(old_stderr, STDERR_FILENO);
>> +close(old_stderr);
>> +}
> 
> Because this code doesn't run for --connect (unless --fork is given,
> too).

Hm, so? It never ran before either, because I'm only just now
introducing it. And the fact that I'm keeping the original stderr FD
open has nothing to do with when the parent process will quit because
the parent process is not connected to that *original* stderr.

Also, when using --connect, the FD is closed in nbd_client_thread().

> 
>>  state = RUNNING;
>>  do {
>>  main_loop_wait(false);
> 
> The documentation needs an update, too.

Right. I wonder why I forgot this. I guess the answer is "Because I
wrote this in some spare time at KVM Forum to see if it would work at
all"...

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [PATCH] proto: add 'shift' extension.

2016-09-27 Thread Denis V. Lunev

On 09/27/2016 08:04 PM, Paolo Bonzini wrote:
>
> On 27/09/2016 15:28, Denis V. Lunev wrote:
>> On 09/27/2016 03:07 PM, Paolo Bonzini wrote:
>>> - Original Message -
 From: "Denis V. Lunev" 
 To: "Paolo Bonzini" 
 Cc: "Vladimir Sementsov-Ogievskiy" , 
 qemu-de...@nongnu.org, qemu-block@nongnu.org,
 nbd-gene...@lists.sourceforge.net, a...@alex.org.uk, ebl...@redhat.com, 
 kw...@redhat.com, stefa...@redhat.com,
 w...@uter.be
 Sent: Tuesday, September 27, 2016 12:25:54 PM
 Subject: Re: [PATCH] proto: add 'shift' extension.

 On 09/27/2016 01:15 PM, Paolo Bonzini wrote:
>> We could go in a different direction and export flag
>> 'has_zero_init' which will report that the storage is
>> initialized with all zeroes at the moment. In this
>> case mirroring code will not fall into this
>> branch.
> Why don't you add the zero_init flag to QEMU's NBD driver instead?
 for all cases without knowing real backend on the server side?
 I think this would be wrong.
>>> Add it to the command line, and leave it to libvirt or the user to
>>> pass "-drive file.driver=nbd,...,file.zero-init=on".
>> I have started with something very similar for 'drive-mirror' command.
>> We have added additional flag for this to improve migration speed
>> and this was rejected.
> You can add it through the filename path too, through a URI option
> "nbd://...?zero-init=on".
>
> Paolo
ha, cool idea! Thanks!

Den

Re: [Qemu-block] [PATCH 3/3] linux-aio: fix re-entrant completion processing

2016-09-27 Thread Roman Penyaev

On Tue, Sep 27, 2016 at 5:25 PM, Stefan Hajnoczi  wrote:
> On Tue, Sep 27, 2016 at 04:29:55PM +0200, Roman Penyaev wrote:
>> On Tue, Sep 27, 2016 at 4:06 PM, Stefan Hajnoczi  wrote:
>> > Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process
>> > completions from ioq_submit()") added an optimization that processes
>> > completions each time ioq_submit() returns with requests in flight.
>> > This commit introduces a "Co-routine re-entered recursively" error which
>> > can be triggered with -drive format=qcow2,aio=native.
>> >
>> > Fam Zheng , Kevin Wolf , and I
>> > debugged the following backtrace:
>> >
>> >   (gdb) bt
>> >   #0  0x70a046f5 in raise () at /lib64/libc.so.6
>> >   #1  0x70a062fa in abort () at /lib64/libc.so.6
>> >   #2  0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at 
>> > util/qemu-coroutine.c:113
>> >   #3  0x55a4b663 in qemu_laio_process_completions 
>> > (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218
>> >   #4  0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at 
>> > block/linux-aio.c:331
>> >   #5  0x55a4ba12 in laio_do_submit (fd=fd@entry=13, 
>> > laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, 
>> > type=type@entry=1) at block/linux-aio.c:383
>> >   #6  0x55a4bbd3 in laio_co_submit (bs=, 
>> > s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) 
>> > at block/linux-aio.c:402
>> >   #7  0x55a4fd23 in bdrv_driver_preadv 
>> > (bs=bs@entry=0x5663bcb0, offset=offset@entry=2932727808, 
>> > bytes=bytes@entry=8192, qiov=qiov@entry=0x59d38e20, flags=0) at 
>> > block/io.c:804
>> >   #8  0x55a52b34 in bdrv_aligned_preadv 
>> > (bs=bs@entry=0x5663bcb0, req=req@entry=0x59d38d20, 
>> > offset=offset@entry=2932727808, bytes=bytes@entry=8192, 
>> > align=align@entry=512, qiov=qiov@entry=0x59d38e20, flags=0) at 
>> > block/io.c:1041
>> >   #9  0x55a52db8 in bdrv_co_preadv (child=, 
>> > offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, 
>> > flags=flags@entry=0) at block/io.c:1133
>> >   #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, 
>> > offset=6178725888, bytes=8192, qiov=0x57527840, flags=) 
>> > at block/qcow2.c:1509
>> >   #11 0x55a4fd23 in bdrv_driver_preadv 
>> > (bs=bs@entry=0x56635890, offset=offset@entry=6178725888, 
>> > bytes=bytes@entry=8192, qiov=qiov@entry=0x57527840, flags=0) at 
>> > block/io.c:804
>> >   #12 0x55a52b34 in bdrv_aligned_preadv 
>> > (bs=bs@entry=0x56635890, req=req@entry=0x59d39000, 
>> > offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
>> > align=align@entry=1, qiov=qiov@entry=0x57527840, flags=0) at 
>> > block/io.c:1041
>> >   #13 0x55a52db8 in bdrv_co_preadv (child=, 
>> > offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
>> > qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133
>> >   #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, 
>> > offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at 
>> > block/block-backend.c:783
>> >   #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at 
>> > block/block-backend.c:991
>> >   #16 0x55ac0cfa in coroutine_trampoline (i0=, 
>> > i1=) at util/coroutine-ucontext.c:78
>> >
>> > It turned out that re-entrant ioq_submit() and completion processing
>> > between three requests caused this error.  The following check is not
>> > sufficient to prevent recursively entering coroutines:
>> >
>> >   if (laiocb->co != qemu_coroutine_self()) {
>> >   qemu_coroutine_enter(laiocb->co);
>> >   }
>> >
>> > As the following coroutine backtrace shows, not just the current
>> > coroutine (self) can be entered.  There might also be other coroutines
>> > that are currently entered and transferred control due to the qcow2 lock
>> > (CoMutex):
>>
>> I doubt that that was introduced by the commit you've specified:
>> 0ed93d84edab.
>>
>> Before my patch coroutine was unconditionally entered.  The following
>> is what was changed by 0ed93d84edab:
>>
>>  if (laiocb->co) {
>> -qemu_coroutine_enter(laiocb->co);
>> +/* Jump and continue completion for foreign requests, don't do
>> + * anything for current request, it will be completed shortly. */
>> +if (laiocb->co != qemu_coroutine_self()) {
>> +qemu_coroutine_enter(laiocb->co);
>> +}
>
> Unconditionally entering was safe prior to 0ed93d84edab since all
> coroutines yielded and qemu_coroutine_entered() would be false all the
> time.  Therefore it wasn't necessary to protect against re-entering a
> coroutine.
>
>> If you have a strong reproduction, could you please verify that.
>
> The bug is 100% deterministic.  Just boot up a guest with -drive
> format=qcow2,aio=native.

It turns out to be that everything is broken.  I started all my

Re: [Qemu-block] [PATCH] proto: add 'shift' extension.

2016-09-27 Thread Denis V. Lunev

On 09/27/2016 03:07 PM, Paolo Bonzini wrote:
>
> - Original Message -
>> From: "Denis V. Lunev" 
>> To: "Paolo Bonzini" 
>> Cc: "Vladimir Sementsov-Ogievskiy" , 
>> qemu-de...@nongnu.org, qemu-block@nongnu.org,
>> nbd-gene...@lists.sourceforge.net, a...@alex.org.uk, ebl...@redhat.com, 
>> kw...@redhat.com, stefa...@redhat.com,
>> w...@uter.be
>> Sent: Tuesday, September 27, 2016 12:25:54 PM
>> Subject: Re: [PATCH] proto: add 'shift' extension.
>>
>> On 09/27/2016 01:15 PM, Paolo Bonzini wrote:
 We could go in a different direction and export flag
 'has_zero_init' which will report that the storage is
 initialized with all zeroes at the moment. In this
 case mirroring code will not fall into this
 branch.
>>> Why don't you add the zero_init flag to QEMU's NBD driver instead?
>> for all cases without knowing real backend on the server side?
>> I think this would be wrong.
> Add it to the command line, and leave it to libvirt or the user to
> pass "-drive file.driver=nbd,...,file.zero-init=on".
>
> Paolo
this specifically means that all QMP commands like 'drive-backup' and
'drive-mirror' will have to be modified to pass this attribute. If this is
OK, we can proceed with that.

Den

Re: [Qemu-block] [Qemu-devel] [PATCH 1/3] coroutine: add qemu_coroutine_entered() function

2016-09-27 Thread Paolo Bonzini



On 27/09/2016 18:29, Stefan Hajnoczi wrote:
> On Tue, Sep 27, 2016 at 5:20 PM, Paolo Bonzini  wrote:
>>
>>
>> On 27/09/2016 16:06, Stefan Hajnoczi wrote:
>>> See the doc comments for a description of this new coroutine API.
>>>
>>> Signed-off-by: Stefan Hajnoczi 
>>> ---
>>>  include/qemu/coroutine.h | 13 +
>>>  util/qemu-coroutine.c|  5 +
>>>  2 files changed, 18 insertions(+)
>>>
>>> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
>>> index 29a2078..e6a60d5 100644
>>> --- a/include/qemu/coroutine.h
>>> +++ b/include/qemu/coroutine.h
>>> @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void);
>>>   */
>>>  bool qemu_in_coroutine(void);
>>>
>>> +/**
>>> + * Return true if the coroutine is currently entered
>>> + *
>>> + * A coroutine is "entered" if it has not yielded from the current
>>> + * qemu_coroutine_enter() call used to run it.  This does not mean that the
>>> + * coroutine is currently executing code since it may have transferred 
>>> control
>>> + * to another coroutine using qemu_coroutine_enter().
>>> + *
>>> + * When several coroutines enter each other there may be no way to know 
>>> which
>>> + * ones have already been entered.  In such situations this function can be
>>> + * used to avoid recursively entering coroutines.
>>> + */
>>> +bool qemu_coroutine_entered(Coroutine *co);
>>
>> Perhaps qemu_coroutine_running is a better name?
> 
> I find "running" confusing since the coroutine may not actually be
> currently executing (as mentioned in the doc comment).

Ok, makes sense.  Another possibility is qemu_coroutine_on_stack, but
I'm not sure it's better...

Paolo

Re: [Qemu-block] [Qemu-devel] [PATCH 1/3] coroutine: add qemu_coroutine_entered() function

2016-09-27 Thread Stefan Hajnoczi

On Tue, Sep 27, 2016 at 5:20 PM, Paolo Bonzini  wrote:
>
>
> On 27/09/2016 16:06, Stefan Hajnoczi wrote:
>> See the doc comments for a description of this new coroutine API.
>>
>> Signed-off-by: Stefan Hajnoczi 
>> ---
>>  include/qemu/coroutine.h | 13 +
>>  util/qemu-coroutine.c|  5 +
>>  2 files changed, 18 insertions(+)
>>
>> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
>> index 29a2078..e6a60d5 100644
>> --- a/include/qemu/coroutine.h
>> +++ b/include/qemu/coroutine.h
>> @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void);
>>   */
>>  bool qemu_in_coroutine(void);
>>
>> +/**
>> + * Return true if the coroutine is currently entered
>> + *
>> + * A coroutine is "entered" if it has not yielded from the current
>> + * qemu_coroutine_enter() call used to run it.  This does not mean that the
>> + * coroutine is currently executing code since it may have transferred 
>> control
>> + * to another coroutine using qemu_coroutine_enter().
>> + *
>> + * When several coroutines enter each other there may be no way to know 
>> which
>> + * ones have already been entered.  In such situations this function can be
>> + * used to avoid recursively entering coroutines.
>> + */
>> +bool qemu_coroutine_entered(Coroutine *co);
>
> Perhaps qemu_coroutine_running is a better name?

I find "running" confusing since the coroutine may not actually be
currently executing (as mentioned in the doc comment).

Stefan

Re: [Qemu-block] [PATCH 1/3] coroutine: add qemu_coroutine_entered() function

2016-09-27 Thread Paolo Bonzini



On 27/09/2016 16:06, Stefan Hajnoczi wrote:
> See the doc comments for a description of this new coroutine API.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  include/qemu/coroutine.h | 13 +
>  util/qemu-coroutine.c|  5 +
>  2 files changed, 18 insertions(+)
> 
> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
> index 29a2078..e6a60d5 100644
> --- a/include/qemu/coroutine.h
> +++ b/include/qemu/coroutine.h
> @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void);
>   */
>  bool qemu_in_coroutine(void);
>  
> +/**
> + * Return true if the coroutine is currently entered
> + *
> + * A coroutine is "entered" if it has not yielded from the current
> + * qemu_coroutine_enter() call used to run it.  This does not mean that the
> + * coroutine is currently executing code since it may have transferred 
> control
> + * to another coroutine using qemu_coroutine_enter().
> + *
> + * When several coroutines enter each other there may be no way to know which
> + * ones have already been entered.  In such situations this function can be
> + * used to avoid recursively entering coroutines.
> + */
> +bool qemu_coroutine_entered(Coroutine *co);

Perhaps qemu_coroutine_running is a better name?

Paolo

>  
>  
>  /**
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index 3cbf225..737bffa 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -146,3 +146,8 @@ void coroutine_fn qemu_coroutine_yield(void)
>  self->caller = NULL;
>  qemu_coroutine_switch(self, to, COROUTINE_YIELD);
>  }
> +
> +bool qemu_coroutine_entered(Coroutine *co)
> +{
> +return co->caller;
> +}
>

Re: [Qemu-block] [PATCH 3/3] linux-aio: fix re-entrant completion processing

2016-09-27 Thread Roman Penyaev

Hey Stefan,

On Tue, Sep 27, 2016 at 4:06 PM, Stefan Hajnoczi  wrote:
> Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process
> completions from ioq_submit()") added an optimization that processes
> completions each time ioq_submit() returns with requests in flight.
> This commit introduces a "Co-routine re-entered recursively" error which
> can be triggered with -drive format=qcow2,aio=native.
>
> Fam Zheng , Kevin Wolf , and I
> debugged the following backtrace:
>
>   (gdb) bt
>   #0  0x70a046f5 in raise () at /lib64/libc.so.6
>   #1  0x70a062fa in abort () at /lib64/libc.so.6
>   #2  0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at 
> util/qemu-coroutine.c:113
>   #3  0x55a4b663 in qemu_laio_process_completions 
> (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218
>   #4  0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at 
> block/linux-aio.c:331
>   #5  0x55a4ba12 in laio_do_submit (fd=fd@entry=13, 
> laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, 
> type=type@entry=1) at block/linux-aio.c:383
>   #6  0x55a4bbd3 in laio_co_submit (bs=, 
> s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at 
> block/linux-aio.c:402
>   #7  0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, 
> offset=offset@entry=2932727808, bytes=bytes@entry=8192, 
> qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804
>   #8  0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, 
> req=req@entry=0x59d38d20, offset=offset@entry=2932727808, 
> bytes=bytes@entry=8192, align=align@entry=512, 
> qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:1041
>   #9  0x55a52db8 in bdrv_co_preadv (child=, 
> offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, 
> flags=flags@entry=0) at block/io.c:1133
>   #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, 
> offset=6178725888, bytes=8192, qiov=0x57527840, flags=) at 
> block/qcow2.c:1509
>   #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, 
> offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
> qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804
>   #12 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x56635890, 
> req=req@entry=0x59d39000, offset=offset@entry=6178725888, 
> bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x57527840, 
> flags=0) at block/io.c:1041
>   #13 0x55a52db8 in bdrv_co_preadv (child=, 
> offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
> qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133
>   #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, 
> offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at 
> block/block-backend.c:783
>   #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at 
> block/block-backend.c:991
>   #16 0x55ac0cfa in coroutine_trampoline (i0=, 
> i1=) at util/coroutine-ucontext.c:78
>
> It turned out that re-entrant ioq_submit() and completion processing
> between three requests caused this error.  The following check is not
> sufficient to prevent recursively entering coroutines:
>
>   if (laiocb->co != qemu_coroutine_self()) {
>   qemu_coroutine_enter(laiocb->co);
>   }
>
> As the following coroutine backtrace shows, not just the current
> coroutine (self) can be entered.  There might also be other coroutines
> that are currently entered and transferred control due to the qcow2 lock
> (CoMutex):

I doubt that that was introduced by the commit you've specified:
0ed93d84edab.

Before my patch coroutine was unconditionally entered.  The following
is what was changed by 0ed93d84edab:

 if (laiocb->co) {
-qemu_coroutine_enter(laiocb->co);
+/* Jump and continue completion for foreign requests, don't do
+ * anything for current request, it will be completed shortly. */
+if (laiocb->co != qemu_coroutine_self()) {
+qemu_coroutine_enter(laiocb->co);
+}

If you have a strong reproduction, could you please verify that.

What worries me is the following:

 1. Issue was introduced before and was unnoticed (ok).
 2. Issue - is something else and/or was really introduced by commit
0ed93d84edab (not ok).

Of course the 2. is not nice.
Thanks.

--
Roman

Re: [Qemu-block] [PATCH 3/3] linux-aio: fix re-entrant completion processing

2016-09-27 Thread Stefan Hajnoczi

On Tue, Sep 27, 2016 at 04:29:55PM +0200, Roman Penyaev wrote:
> On Tue, Sep 27, 2016 at 4:06 PM, Stefan Hajnoczi  wrote:
> > Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process
> > completions from ioq_submit()") added an optimization that processes
> > completions each time ioq_submit() returns with requests in flight.
> > This commit introduces a "Co-routine re-entered recursively" error which
> > can be triggered with -drive format=qcow2,aio=native.
> >
> > Fam Zheng , Kevin Wolf , and I
> > debugged the following backtrace:
> >
> >   (gdb) bt
> >   #0  0x70a046f5 in raise () at /lib64/libc.so.6
> >   #1  0x70a062fa in abort () at /lib64/libc.so.6
> >   #2  0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at 
> > util/qemu-coroutine.c:113
> >   #3  0x55a4b663 in qemu_laio_process_completions 
> > (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218
> >   #4  0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at 
> > block/linux-aio.c:331
> >   #5  0x55a4ba12 in laio_do_submit (fd=fd@entry=13, 
> > laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, 
> > type=type@entry=1) at block/linux-aio.c:383
> >   #6  0x55a4bbd3 in laio_co_submit (bs=, 
> > s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at 
> > block/linux-aio.c:402
> >   #7  0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, 
> > offset=offset@entry=2932727808, bytes=bytes@entry=8192, 
> > qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804
> >   #8  0x55a52b34 in bdrv_aligned_preadv 
> > (bs=bs@entry=0x5663bcb0, req=req@entry=0x59d38d20, 
> > offset=offset@entry=2932727808, bytes=bytes@entry=8192, 
> > align=align@entry=512, qiov=qiov@entry=0x59d38e20, flags=0) at 
> > block/io.c:1041
> >   #9  0x55a52db8 in bdrv_co_preadv (child=, 
> > offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, 
> > flags=flags@entry=0) at block/io.c:1133
> >   #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, 
> > offset=6178725888, bytes=8192, qiov=0x57527840, flags=) 
> > at block/qcow2.c:1509
> >   #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, 
> > offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
> > qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804
> >   #12 0x55a52b34 in bdrv_aligned_preadv 
> > (bs=bs@entry=0x56635890, req=req@entry=0x59d39000, 
> > offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
> > align=align@entry=1, qiov=qiov@entry=0x57527840, flags=0) at 
> > block/io.c:1041
> >   #13 0x55a52db8 in bdrv_co_preadv (child=, 
> > offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
> > qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133
> >   #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, 
> > offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at 
> > block/block-backend.c:783
> >   #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at 
> > block/block-backend.c:991
> >   #16 0x55ac0cfa in coroutine_trampoline (i0=, 
> > i1=) at util/coroutine-ucontext.c:78
> >
> > It turned out that re-entrant ioq_submit() and completion processing
> > between three requests caused this error.  The following check is not
> > sufficient to prevent recursively entering coroutines:
> >
> >   if (laiocb->co != qemu_coroutine_self()) {
> >   qemu_coroutine_enter(laiocb->co);
> >   }
> >
> > As the following coroutine backtrace shows, not just the current
> > coroutine (self) can be entered.  There might also be other coroutines
> > that are currently entered and transferred control due to the qcow2 lock
> > (CoMutex):
> 
> I doubt that that was introduced by the commit you've specified:
> 0ed93d84edab.
> 
> Before my patch coroutine was unconditionally entered.  The following
> is what was changed by 0ed93d84edab:
> 
>  if (laiocb->co) {
> -qemu_coroutine_enter(laiocb->co);
> +/* Jump and continue completion for foreign requests, don't do
> + * anything for current request, it will be completed shortly. */
> +if (laiocb->co != qemu_coroutine_self()) {
> +qemu_coroutine_enter(laiocb->co);
> +}

Unconditionally entering was safe prior to 0ed93d84edab since all
coroutines yielded and qemu_coroutine_entered() would be false all the
time.  Therefore it wasn't necessary to protect against re-entering a
coroutine.

> If you have a strong reproduction, could you please verify that.

The bug is 100% deterministic.  Just boot up a guest with -drive
format=qcow2,aio=native.

I noticed that I forgot to include a second backtrace in the commit
description.  I am resending the patch series with the missing
backtrace.  Hopefully that will make the bug clearer.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-block] [PATCH v3 3/3] doc: Document driver-specific -blockdev options

2016-09-27 Thread Eric Blake

On 09/26/2016 10:27 AM, Kevin Wolf wrote:
> This documents the driver-specific options for the raw, qcow2 and file
> block drivers for the man page. For everything else, we refer to the
> QAPI documentation.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  qemu-options.hx | 115 
> +++-
>  1 file changed, 114 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-block] [PATCH v2 3/3] linux-aio: fix re-entrant completion processing

2016-09-27 Thread Stefan Hajnoczi

Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process
completions from ioq_submit()") added an optimization that processes
completions each time ioq_submit() returns with requests in flight.
This commit introduces a "Co-routine re-entered recursively" error which
can be triggered with -drive format=qcow2,aio=native.

Fam Zheng , Kevin Wolf , and I
debugged the following backtrace:

  (gdb) bt
  #0  0x70a046f5 in raise () at /lib64/libc.so.6
  #1  0x70a062fa in abort () at /lib64/libc.so.6
  #2  0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at 
util/qemu-coroutine.c:113
  #3  0x55a4b663 in qemu_laio_process_completions 
(s=s@entry=0x57e2f7f0) at block/linux-aio.c:218
  #4  0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at 
block/linux-aio.c:331
  #5  0x55a4ba12 in laio_do_submit (fd=fd@entry=13, 
laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, 
type=type@entry=1) at block/linux-aio.c:383
  #6  0x55a4bbd3 in laio_co_submit (bs=, 
s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at 
block/linux-aio.c:402
  #7  0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, 
offset=offset@entry=2932727808, bytes=bytes@entry=8192, 
qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804
  #8  0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, 
req=req@entry=0x59d38d20, offset=offset@entry=2932727808, 
bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x59d38e20, 
flags=0) at block/io.c:1041
  #9  0x55a52db8 in bdrv_co_preadv (child=, 
offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, 
flags=flags@entry=0) at block/io.c:1133
  #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, 
offset=6178725888, bytes=8192, qiov=0x57527840, flags=) at 
block/qcow2.c:1509
  #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, 
offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804
  #12 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x56635890, 
req=req@entry=0x59d39000, offset=offset@entry=6178725888, 
bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x57527840, 
flags=0) at block/io.c:1041
  #13 0x55a52db8 in bdrv_co_preadv (child=, 
offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133
  #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, 
offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at 
block/block-backend.c:783
  #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at 
block/block-backend.c:991
  #16 0x55ac0cfa in coroutine_trampoline (i0=, 
i1=) at util/coroutine-ucontext.c:78

It turned out that re-entrant ioq_submit() and completion processing
between three requests caused this error.  The following check is not
sufficient to prevent recursively entering coroutines:

  if (laiocb->co != qemu_coroutine_self()) {
  qemu_coroutine_enter(laiocb->co);
  }

As the following coroutine backtrace shows, not just the current
coroutine (self) can be entered.  There might also be other coroutines
that are currently entered and transferred control due to the qcow2 lock
(CoMutex):

  (gdb) qemu coroutine 0x583464d0
  #0  0x55ac0c90 in qemu_coroutine_switch 
(from_=from_@entry=0x583464d0, to_=to_@entry=0x572f9890, 
action=action@entry=COROUTINE_ENTER) at util/coroutine-ucontext.c:175
  #1  0x55abfe54 in qemu_coroutine_enter (co=0x572f9890) at 
util/qemu-coroutine.c:117
  #2  0x55ac031c in qemu_co_queue_run_restart 
(co=co@entry=0x583462c0) at util/qemu-coroutine-lock.c:60
  #3  0x55abfe5e in qemu_coroutine_enter (co=0x583462c0) at 
util/qemu-coroutine.c:119
  #4  0x55a4b663 in qemu_laio_process_completions 
(s=s@entry=0x57e2f7f0) at block/linux-aio.c:218
  #5  0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at 
block/linux-aio.c:331
  #6  0x55a4ba12 in laio_do_submit (fd=fd@entry=13, 
laiocb=laiocb@entry=0x5a338b40, offset=offset@entry=2911477760, 
type=type@entry=1) at block/linux-aio.c:383
  #7  0x55a4bbd3 in laio_co_submit (bs=, 
s=0x57e2f7f0, fd=13, offset=2911477760, qiov=0x5a338e80, type=1) at 
block/linux-aio.c:402
  #8  0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, 
offset=offset@entry=2911477760, bytes=bytes@entry=8192, 
qiov=qiov@entry=0x5a338e80, flags=0) at block/io.c:804
  #9  0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, 
req=req@entry=0x5a338d80, offset=offset@entry=2911477760, 
bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x5a338e80, 
flags=0) at block/io.c:1041
  #10 0x55a52db8 in bdrv_co_preadv (child=, 
offset=2911477760,

[Qemu-block] [PATCH v2 2/3] test-coroutine: test qemu_coroutine_entered()

2016-09-27 Thread Stefan Hajnoczi

Signed-off-by: Stefan Hajnoczi 
---
 tests/test-coroutine.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c
index 6431dd6..abd97c2 100644
--- a/tests/test-coroutine.c
+++ b/tests/test-coroutine.c
@@ -53,6 +53,47 @@ static void test_self(void)
 }
 
 /*
+ * Check that qemu_coroutine_entered() works
+ */
+
+static void coroutine_fn verify_entered_step_2(void *opaque)
+{
+Coroutine *caller = (Coroutine *)opaque;
+
+g_assert(qemu_coroutine_entered(caller));
+g_assert(qemu_coroutine_entered(qemu_coroutine_self()));
+qemu_coroutine_yield();
+
+/* Once more to check it still works after yielding */
+g_assert(qemu_coroutine_entered(caller));
+g_assert(qemu_coroutine_entered(qemu_coroutine_self()));
+qemu_coroutine_yield();
+}
+
+static void coroutine_fn verify_entered_step_1(void *opaque)
+{
+Coroutine *self = qemu_coroutine_self();
+Coroutine *coroutine;
+
+g_assert(qemu_coroutine_entered(self));
+
+coroutine = qemu_coroutine_create(verify_entered_step_2, self);
+g_assert(!qemu_coroutine_entered(coroutine));
+qemu_coroutine_enter(coroutine);
+g_assert(!qemu_coroutine_entered(coroutine));
+qemu_coroutine_enter(coroutine);
+}
+
+static void test_entered(void)
+{
+Coroutine *coroutine;
+
+coroutine = qemu_coroutine_create(verify_entered_step_1, NULL);
+g_assert(!qemu_coroutine_entered(coroutine));
+qemu_coroutine_enter(coroutine);
+}
+
+/*
  * Check that coroutines may nest multiple levels
  */
 
@@ -389,6 +430,7 @@ int main(int argc, char **argv)
 g_test_add_func("/basic/yield", test_yield);
 g_test_add_func("/basic/nesting", test_nesting);
 g_test_add_func("/basic/self", test_self);
+g_test_add_func("/basic/entered", test_entered);
 g_test_add_func("/basic/in_coroutine", test_in_coroutine);
 g_test_add_func("/basic/order", test_order);
 if (g_test_perf()) {
-- 
2.7.4

[Qemu-block] [PATCH v2 1/3] coroutine: add qemu_coroutine_entered() function

2016-09-27 Thread Stefan Hajnoczi

See the doc comments for a description of this new coroutine API.

Signed-off-by: Stefan Hajnoczi 
---
 include/qemu/coroutine.h | 13 +
 util/qemu-coroutine.c|  5 +
 2 files changed, 18 insertions(+)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 29a2078..e6a60d5 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void);
  */
 bool qemu_in_coroutine(void);
 
+/**
+ * Return true if the coroutine is currently entered
+ *
+ * A coroutine is "entered" if it has not yielded from the current
+ * qemu_coroutine_enter() call used to run it.  This does not mean that the
+ * coroutine is currently executing code since it may have transferred control
+ * to another coroutine using qemu_coroutine_enter().
+ *
+ * When several coroutines enter each other there may be no way to know which
+ * ones have already been entered.  In such situations this function can be
+ * used to avoid recursively entering coroutines.
+ */
+bool qemu_coroutine_entered(Coroutine *co);
 
 
 /**
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index 3cbf225..737bffa 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -146,3 +146,8 @@ void coroutine_fn qemu_coroutine_yield(void)
 self->caller = NULL;
 qemu_coroutine_switch(self, to, COROUTINE_YIELD);
 }
+
+bool qemu_coroutine_entered(Coroutine *co)
+{
+return co->caller;
+}
-- 
2.7.4

[Qemu-block] [PATCH v2 0/3] linux-aio: fix "Co-routine re-entered recursively" error

2016-09-27 Thread Stefan Hajnoczi

v2:
 * Add missing backtrace in Patch 3

It's possible to hit the "Co-routine re-entered recursively" error with -drive
format=qcow2,aio=native.  This is a regression introduced by a linux-aio.c
optimization.  See Patch 3 for details.

Patches 1 & 2 add a new coroutine API called qemu_coroutine_entered() for
checking whether a coroutine is currently "entered".  This makes it possible to
avoid re-entering coroutines recursively.

Stefan Hajnoczi (3):
  coroutine: add qemu_coroutine_entered() function
  test-coroutine: test qemu_coroutine_entered()
  linux-aio: fix re-entrant completion processing

 block/linux-aio.c|  9 ++---
 include/qemu/coroutine.h | 13 +
 tests/test-coroutine.c   | 42 ++
 util/qemu-coroutine.c|  5 +
 4 files changed, 66 insertions(+), 3 deletions(-)

-- 
2.7.4

[Qemu-block] [PATCH 3/3] linux-aio: fix re-entrant completion processing

2016-09-27 Thread Stefan Hajnoczi

Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process
completions from ioq_submit()") added an optimization that processes
completions each time ioq_submit() returns with requests in flight.
This commit introduces a "Co-routine re-entered recursively" error which
can be triggered with -drive format=qcow2,aio=native.

Fam Zheng , Kevin Wolf , and I
debugged the following backtrace:

  (gdb) bt
  #0  0x70a046f5 in raise () at /lib64/libc.so.6
  #1  0x70a062fa in abort () at /lib64/libc.so.6
  #2  0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at 
util/qemu-coroutine.c:113
  #3  0x55a4b663 in qemu_laio_process_completions 
(s=s@entry=0x57e2f7f0) at block/linux-aio.c:218
  #4  0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at 
block/linux-aio.c:331
  #5  0x55a4ba12 in laio_do_submit (fd=fd@entry=13, 
laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, 
type=type@entry=1) at block/linux-aio.c:383
  #6  0x55a4bbd3 in laio_co_submit (bs=, 
s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at 
block/linux-aio.c:402
  #7  0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, 
offset=offset@entry=2932727808, bytes=bytes@entry=8192, 
qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804
  #8  0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, 
req=req@entry=0x59d38d20, offset=offset@entry=2932727808, 
bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x59d38e20, 
flags=0) at block/io.c:1041
  #9  0x55a52db8 in bdrv_co_preadv (child=, 
offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, 
flags=flags@entry=0) at block/io.c:1133
  #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, 
offset=6178725888, bytes=8192, qiov=0x57527840, flags=) at 
block/qcow2.c:1509
  #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, 
offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804
  #12 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x56635890, 
req=req@entry=0x59d39000, offset=offset@entry=6178725888, 
bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x57527840, 
flags=0) at block/io.c:1041
  #13 0x55a52db8 in bdrv_co_preadv (child=, 
offset=offset@entry=6178725888, bytes=bytes@entry=8192, 
qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133
  #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, 
offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at 
block/block-backend.c:783
  #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at 
block/block-backend.c:991
  #16 0x55ac0cfa in coroutine_trampoline (i0=, 
i1=) at util/coroutine-ucontext.c:78

It turned out that re-entrant ioq_submit() and completion processing
between three requests caused this error.  The following check is not
sufficient to prevent recursively entering coroutines:

  if (laiocb->co != qemu_coroutine_self()) {
  qemu_coroutine_enter(laiocb->co);
  }

As the following coroutine backtrace shows, not just the current
coroutine (self) can be entered.  There might also be other coroutines
that are currently entered and transferred control due to the qcow2 lock
(CoMutex):

(gdb) qemu coroutine 0x583464d0

Use the new qemu_coroutine_entered() function instead of comparing
against qemu_coroutine_self().  This is correct because:

1. If a coroutine is not entered then it must have yielded to wait for
   I/O completion.  It is therefore safe to enter.

2. If a coroutine is entered then it must be in
   ioq_submit()/qemu_laio_process_completions() because otherwise it
   would be yielded while waiting for I/O completion.  Therefore it will
   check laio->ret and return from ioq_submit() instead of yielding,
   i.e. it's guaranteed not to hang.

Reported-by: Fam Zheng 
Tested-by: Fam Zheng 
Signed-off-by: Stefan Hajnoczi 
---
 block/linux-aio.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index d4e19d4..1685ec2 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -94,9 +94,12 @@ static void qemu_laio_process_completion(struct qemu_laiocb 
*laiocb)
 
 laiocb->ret = ret;
 if (laiocb->co) {
-/* Jump and continue completion for foreign requests, don't do
- * anything for current request, it will be completed shortly. */
-if (laiocb->co != qemu_coroutine_self()) {
+/* If the coroutine is already entered it must be in ioq_submit() and
+ * will notice laio->ret has been filled in when it eventually runs
+ * later.  Coroutines cannot be entered recursively so avoid doing
+ * that!
+ */
+if (!qemu_coroutine_entered(laiocb->co)) {

Re: [Qemu-block] [PATCH] proto: add 'shift' extension.

2016-09-27 Thread Denis V. Lunev

On 09/27/2016 03:07 PM, Paolo Bonzini wrote:
>
> - Original Message -
>> From: "Denis V. Lunev" 
>> To: "Paolo Bonzini" 
>> Cc: "Vladimir Sementsov-Ogievskiy" , 
>> qemu-de...@nongnu.org, qemu-block@nongnu.org,
>> nbd-gene...@lists.sourceforge.net, a...@alex.org.uk, ebl...@redhat.com, 
>> kw...@redhat.com, stefa...@redhat.com,
>> w...@uter.be
>> Sent: Tuesday, September 27, 2016 12:25:54 PM
>> Subject: Re: [PATCH] proto: add 'shift' extension.
>>
>> On 09/27/2016 01:15 PM, Paolo Bonzini wrote:
 We could go in a different direction and export flag
 'has_zero_init' which will report that the storage is
 initialized with all zeroes at the moment. In this
 case mirroring code will not fall into this
 branch.
>>> Why don't you add the zero_init flag to QEMU's NBD driver instead?
>> for all cases without knowing real backend on the server side?
>> I think this would be wrong.
> Add it to the command line, and leave it to libvirt or the user to
> pass "-drive file.driver=nbd,...,file.zero-init=on".
>
> Paolo
I have started with something very similar for 'drive-mirror' command.
We have added additional flag for this to improve migration speed
and this was rejected.

Den

[Qemu-block] [PULL 01/18] block: reintroduce bdrv_flush_all

2016-09-27 Thread Kevin Wolf

From: John Snow 

Commit fe1a9cbc moved the flush_all routine from the bdrv layer to the
block-backend layer. In doing so, however, the semantics of the routine
changed slightly such that flush_all now used blk_flush instead of
bdrv_flush.

blk_flush can fail if the attached device model reports that it is not
"available," (i.e. the tray is open.) This changed the semantics of
flush_all such that it can now fail for e.g. open CDROM drives.

Reintroduce bdrv_flush_all to regain the old semantics without having to
alter the behavior of blk_flush or blk_flush_all, which are already
'doing the right thing.'

Signed-off-by: John Snow 
Reviewed-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Acked-by: Fam Zheng 
Signed-off-by: Kevin Wolf 
---
 block/io.c| 25 +
 include/block/block.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/block/io.c b/block/io.c
index fdf7080..57a2eeb 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1619,6 +1619,31 @@ int coroutine_fn bdrv_co_pwrite_zeroes(BdrvChild *child, 
int64_t offset,
BDRV_REQ_ZERO_WRITE | flags);
 }
 
+/*
+ * Flush ALL BDSes regardless of if they are reachable via a BlkBackend or not.
+ */
+int bdrv_flush_all(void)
+{
+BdrvNextIterator it;
+BlockDriverState *bs = NULL;
+int result = 0;
+
+for (bs = bdrv_first(); bs; bs = bdrv_next()) {
+AioContext *aio_context = bdrv_get_aio_context(bs);
+int ret;
+
+aio_context_acquire(aio_context);
+ret = bdrv_flush(bs);
+if (ret < 0 && !result) {
+result = ret;
+}
+aio_context_release(aio_context);
+}
+
+return result;
+}
+
+
 typedef struct BdrvCoGetBlockStatusData {
 BlockDriverState *bs;
 BlockDriverState *base;
diff --git a/include/block/block.h b/include/block/block.h
index e18233a..811b060 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -333,6 +333,7 @@ int bdrv_inactivate_all(void);
 /* Ensure contents are flushed to disk.  */
 int bdrv_flush(BlockDriverState *bs);
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
+int bdrv_flush_all(void);
 void bdrv_close_all(void);
 void bdrv_drain(BlockDriverState *bs);
 void coroutine_fn bdrv_co_drain(BlockDriverState *bs);
-- 
1.8.3.1

[Qemu-block] [PULL 04/18] block: Fix error path in qmp_blockdev_change_medium()

2016-09-27 Thread Kevin Wolf

Commit 00949bab incorrectly changed one instance of  into errp while
touching the line. Change it back.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 blockdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 29c6561..62d0dd0 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2614,7 +2614,7 @@ void qmp_blockdev_change_medium(bool has_device, const 
char *device,
 error_free(err);
 err = NULL;
 
-qmp_x_blockdev_remove_medium(has_device, device, has_id, id, errp);
+qmp_x_blockdev_remove_medium(has_device, device, has_id, id, );
 if (err) {
 error_propagate(errp, err);
 goto fail;
-- 
1.8.3.1

[Qemu-block] [PATCH 1/3] coroutine: add qemu_coroutine_entered() function

2016-09-27 Thread Stefan Hajnoczi

See the doc comments for a description of this new coroutine API.

Signed-off-by: Stefan Hajnoczi 
---
 include/qemu/coroutine.h | 13 +
 util/qemu-coroutine.c|  5 +
 2 files changed, 18 insertions(+)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 29a2078..e6a60d5 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void);
  */
 bool qemu_in_coroutine(void);
 
+/**
+ * Return true if the coroutine is currently entered
+ *
+ * A coroutine is "entered" if it has not yielded from the current
+ * qemu_coroutine_enter() call used to run it.  This does not mean that the
+ * coroutine is currently executing code since it may have transferred control
+ * to another coroutine using qemu_coroutine_enter().
+ *
+ * When several coroutines enter each other there may be no way to know which
+ * ones have already been entered.  In such situations this function can be
+ * used to avoid recursively entering coroutines.
+ */
+bool qemu_coroutine_entered(Coroutine *co);
 
 
 /**
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index 3cbf225..737bffa 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -146,3 +146,8 @@ void coroutine_fn qemu_coroutine_yield(void)
 self->caller = NULL;
 qemu_coroutine_switch(self, to, COROUTINE_YIELD);
 }
+
+bool qemu_coroutine_entered(Coroutine *co)
+{
+return co->caller;
+}
-- 
2.7.4

[Qemu-block] [PULL 10/18] block: Move 'discard' option to bdrv_open_common()

2016-09-27 Thread Kevin Wolf

This enables its use for nested child nodes. The compatibility
between the 'discard' and 'detect-zeroes' setting is checked in
bdrv_open_common() now as the former setting isn't available before
calling bdrv_open() any more.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Max Reitz 
---
 block.c   | 17 -
 blockdev.c| 25 -
 include/block/block.h |  1 +
 3 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/block.c b/block.c
index 1f10457..bb1f1ec 100644
--- a/block.c
+++ b/block.c
@@ -765,7 +765,7 @@ static void bdrv_inherited_options(int *child_flags, QDict 
*child_options,
 /* Our block drivers take care to send flushes and respect unmap policy,
  * so we can default to enable both on lower layers regardless of the
  * corresponding parent options. */
-flags |= BDRV_O_UNMAP;
+qdict_set_default_str(child_options, BDRV_OPT_DISCARD, "unmap");
 
 /* Clear flags that only apply to the top layer */
 flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_COPY_ON_READ |
@@ -960,6 +960,11 @@ static QemuOptsList bdrv_runtime_opts = {
 .type = QEMU_OPT_STRING,
 .help = "try to optimize zero writes (off, on, unmap)",
 },
+{
+.name = "discard",
+.type = QEMU_OPT_STRING,
+.help = "discard operation (ignore/off, unmap/on)",
+},
 { /* end of list */ }
 },
 };
@@ -976,6 +981,7 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild 
*file,
 const char *filename;
 const char *driver_name = NULL;
 const char *node_name = NULL;
+const char *discard;
 const char *detect_zeroes;
 QemuOpts *opts;
 BlockDriver *drv;
@@ -1045,6 +1051,15 @@ static int bdrv_open_common(BlockDriverState *bs, 
BdrvChild *file,
 }
 }
 
+discard = qemu_opt_get(opts, "discard");
+if (discard != NULL) {
+if (bdrv_parse_discard_flags(discard, >open_flags) != 0) {
+error_setg(errp, "Invalid discard option");
+ret = -EINVAL;
+goto fail_opts;
+}
+}
+
 detect_zeroes = qemu_opt_get(opts, "detect-zeroes");
 if (detect_zeroes) {
 BlockdevDetectZeroesOptions value =
diff --git a/blockdev.c b/blockdev.c
index 7b87bd8..e2ace04 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -356,7 +356,6 @@ static void extract_common_blockdev_options(QemuOpts *opts, 
int *bdrv_flags,
 const char **throttling_group, ThrottleConfig *throttle_cfg,
 BlockdevDetectZeroesOptions *detect_zeroes, Error **errp)
 {
-const char *discard;
 Error *local_error = NULL;
 const char *aio;
 
@@ -365,13 +364,6 @@ static void extract_common_blockdev_options(QemuOpts 
*opts, int *bdrv_flags,
 *bdrv_flags |= BDRV_O_COPY_ON_READ;
 }
 
-if ((discard = qemu_opt_get(opts, "discard")) != NULL) {
-if (bdrv_parse_discard_flags(discard, bdrv_flags) != 0) {
-error_setg(errp, "Invalid discard option");
-return;
-}
-}
-
 if ((aio = qemu_opt_get(opts, "aio")) != NULL) {
 if (!strcmp(aio, "native")) {
 *bdrv_flags |= BDRV_O_NATIVE_AIO;
@@ -449,15 +441,6 @@ static void extract_common_blockdev_options(QemuOpts 
*opts, int *bdrv_flags,
 error_propagate(errp, local_error);
 return;
 }
-
-if (bdrv_flags &&
-*detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP &&
-!(*bdrv_flags & BDRV_O_UNMAP))
-{
-error_setg(errp, "setting detect-zeroes to unmap is not allowed "
- "without setting discard operation to unmap");
-return;
-}
 }
 }
 
@@ -3990,10 +3973,6 @@ QemuOptsList qemu_common_drive_opts = {
 .type = QEMU_OPT_BOOL,
 .help = "enable/disable snapshot mode",
 },{
-.name = "discard",
-.type = QEMU_OPT_STRING,
-.help = "discard operation (ignore/off, unmap/on)",
-},{
 .name = "aio",
 .type = QEMU_OPT_STRING,
 .help = "host AIO implementation (threads, native)",
@@ -4125,10 +4104,6 @@ static QemuOptsList qemu_root_bds_opts = {
 .head = QTAILQ_HEAD_INITIALIZER(qemu_root_bds_opts.head),
 .desc = {
 {
-.name = "discard",
-.type = QEMU_OPT_STRING,
-.help = "discard operation (ignore/off, unmap/on)",
-},{
 .name = "aio",
 .type = QEMU_OPT_STRING,
 .help = "host AIO implementation (threads, native)",
diff --git a/include/block/block.h b/include/block/block.h
index 811b060..107c603 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -108,6 +108,7 @@ typedef struct HDGeometry {
 #define BDRV_OPT_CACHE_DIRECT   "cache.direct"
 #define

Re: [Qemu-block] [PATCH v5 7/9] block: don't make snapshots for filters

2016-09-27 Thread Pavel Dovgalyuk

> From: Kevin Wolf [mailto:kw...@redhat.com]
> Am 26.09.2016 um 11:51 hat Pavel Dovgalyuk geschrieben:
> > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > Am 26.09.2016 um 10:08 hat Pavel Dovgalyuk geschrieben:
> > > > This patch disables snapshotting for block driver filters.
> > > > It is needed, because snapshots should be created
> > > > in underlying disk images, not in filters itself.
> > > >
> > > > Signed-off-by: Pavel Dovgalyuk 
> > >
> > > But that's exactly what the existing code implements? If a driver
> > > doesn't provide .bdrv_snapshot_goto, the request is redirected to
> > > bs->file.
> > >
> > > >  block/snapshot.c |3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/block/snapshot.c b/block/snapshot.c
> > > > index bf5c2ca..8998b8b 100644
> > > > --- a/block/snapshot.c
> > > > +++ b/block/snapshot.c
> > > > @@ -184,6 +184,9 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
> > > >  if (!drv) {
> > > >  return -ENOMEDIUM;
> > > >  }
> > > > +if (drv->is_filter) {
> > > > +return 0;
> > > > +}
> > >
> > > This, on the other hand, doesn't redirect the request, but silently
> > > ignores it. That is, loading the snapshot will apparently succeed, but
> > > it wouldn't actually load anything and the disk would stay in its
> > > current state.
> >
> > In my use case bdrv_all_goto_snapshot iterates all block drivers, including
> > filters and disk images. Therefore skipping goto for images is ok.
> 
> Hm, this can happy today indeed.
> 
> Originally, we only called bdrv_goto_snapshot() for all _top level_
> BDSes, and this is still what you normally get. However, if you
> explicitly create a BDS (e.g. with its own -drive option), it is
> considered a top level BDS without actually being top level for the
> guest, and therefore the snapshotting function is called for it.
> 
> Of course, this is highly inefficient because the goto_snapshot request
> is passed by the filter driver and then called another time for the
> lower node, effectively loading the snapshot a second time.
> 
> On the other hand if you use a single -drive option to create both the
> qcow2 BDS and the blkreplay filter, we do need to pass down the
> goto_snapshot request because it won't be called for the qcow2 layer
> otherwise.

How this can be specified in command line?
I believed that separate -drive option is required.

> 
> I'm not completely sure yet what the right behaviour would be here.

Pavel Dovgalyuk

[Qemu-block] [PATCH 0/3] linux-aio: fix "Co-routine re-entered recursively" error

2016-09-27 Thread Stefan Hajnoczi

It's possible to hit the "Co-routine re-entered recursively" error with -drive
format=qcow2,aio=native.  This is a regression introduced by a linux-aio.c
optimization.  See Patch 3 for details.

Patches 1 & 2 add a new coroutine API called qemu_coroutine_entered() for
checking whether a coroutine is currently "entered".  This makes it possible to
avoid re-entering coroutines recursively.

Stefan Hajnoczi (3):
  coroutine: add qemu_coroutine_entered() function
  test-coroutine: test qemu_coroutine_entered()
  linux-aio: fix re-entrant completion processing

 block/linux-aio.c|  9 ++---
 include/qemu/coroutine.h | 13 +
 tests/test-coroutine.c   | 42 ++
 util/qemu-coroutine.c|  5 +
 4 files changed, 66 insertions(+), 3 deletions(-)

-- 
2.7.4

[Qemu-block] [PULL 03/18] block-backend: remove blk_flush_all

2016-09-27 Thread Kevin Wolf

From: John Snow 

We can teach Xen to drain and flush each device as it needs to, instead
of trying to flush ALL devices. This removes the last user of
blk_flush_all.

The function is therefore removed under the premise that any new uses
of blk_flush_all would be the wrong paradigm: either flush the single
device that requires flushing, or use an appropriate flush_all mechanism
from outside of the BlkBackend layer.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
Acked-by: Fam Zheng 
Signed-off-by: Kevin Wolf 
---
 block/block-backend.c  | 22 --
 hw/i386/xen/xen_platform.c |  2 --
 hw/ide/piix.c  |  4 
 include/sysemu/block-backend.h |  1 -
 4 files changed, 4 insertions(+), 25 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 0bd19ab..f34bad5 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1640,28 +1640,6 @@ int blk_commit_all(void)
 return 0;
 }
 
-int blk_flush_all(void)
-{
-BlockBackend *blk = NULL;
-int result = 0;
-
-while ((blk = blk_all_next(blk)) != NULL) {
-AioContext *aio_context = blk_get_aio_context(blk);
-int ret;
-
-aio_context_acquire(aio_context);
-if (blk_is_inserted(blk)) {
-ret = blk_flush(blk);
-if (ret < 0 && !result) {
-result = ret;
-}
-}
-aio_context_release(aio_context);
-}
-
-return result;
-}
-
 
 /* throttling disk I/O limits */
 void blk_set_io_limits(BlockBackend *blk, ThrottleConfig *cfg)
diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c
index aa78393..f85635c 100644
--- a/hw/i386/xen/xen_platform.c
+++ b/hw/i386/xen/xen_platform.c
@@ -134,8 +134,6 @@ static void platform_fixed_ioport_writew(void *opaque, 
uint32_t addr, uint32_t v
devices, and bit 2 the non-primary-master IDE devices. */
 if (val & UNPLUG_ALL_IDE_DISKS) {
 DPRINTF("unplug disks\n");
-blk_drain_all();
-blk_flush_all();
 pci_unplug_disks(pci_dev->bus);
 }
 if (val & UNPLUG_ALL_NICS) {
diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index c190fca..d5777fd 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -179,6 +179,10 @@ int pci_piix3_xen_ide_unplug(DeviceState *dev)
 if (di != NULL && !di->media_cd) {
 BlockBackend *blk = blk_by_legacy_dinfo(di);
 DeviceState *ds = blk_get_attached_dev(blk);
+
+blk_drain(blk);
+blk_flush(blk);
+
 if (ds) {
 blk_detach_dev(blk, ds);
 }
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 3b29317..24d1d85 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -152,7 +152,6 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long 
int req, void *buf,
 int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int count);
 int blk_co_flush(BlockBackend *blk);
 int blk_flush(BlockBackend *blk);
-int blk_flush_all(void);
 int blk_commit_all(void);
 void blk_drain(BlockBackend *blk);
 void blk_drain_all(void);
-- 
1.8.3.1

[Qemu-block] [PULL 15/18] coroutine-ucontext: use helper for allocating stack memory

2016-09-27 Thread Kevin Wolf

From: Peter Lieven 

Signed-off-by: Peter Lieven 
Signed-off-by: Kevin Wolf 
---
 util/coroutine-ucontext.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c
index 31254ab..6621f3f 100644
--- a/util/coroutine-ucontext.c
+++ b/util/coroutine-ucontext.c
@@ -34,6 +34,7 @@
 typedef struct {
 Coroutine base;
 void *stack;
+size_t stack_size;
 sigjmp_buf env;
 
 #ifdef CONFIG_VALGRIND_H
@@ -82,7 +83,6 @@ static void coroutine_trampoline(int i0, int i1)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineUContext *co;
 ucontext_t old_uc, uc;
 sigjmp_buf old_env;
@@ -101,17 +101,18 @@ Coroutine *qemu_coroutine_new(void)
 }
 
 co = g_malloc0(sizeof(*co));
-co->stack = g_malloc(stack_size);
+co->stack_size = COROUTINE_STACK_SIZE;
+co->stack = qemu_alloc_stack(>stack_size);
 co->base.entry_arg = _env; /* stash away our jmp_buf */
 
 uc.uc_link = _uc;
 uc.uc_stack.ss_sp = co->stack;
-uc.uc_stack.ss_size = stack_size;
+uc.uc_stack.ss_size = co->stack_size;
 uc.uc_stack.ss_flags = 0;
 
 #ifdef CONFIG_VALGRIND_H
 co->valgrind_stack_id =
-VALGRIND_STACK_REGISTER(co->stack, co->stack + stack_size);
+VALGRIND_STACK_REGISTER(co->stack, co->stack + co->stack_size);
 #endif
 
 arg.p = co;
@@ -149,7 +150,7 @@ void qemu_coroutine_delete(Coroutine *co_)
 valgrind_stack_deregister(co);
 #endif
 
-g_free(co->stack);
+qemu_free_stack(co->stack, co->stack_size);
 g_free(co);
 }
 
-- 
1.8.3.1

[Qemu-block] [PATCH 2/3] test-coroutine: test qemu_coroutine_entered()

2016-09-27 Thread Stefan Hajnoczi

Signed-off-by: Stefan Hajnoczi 
---
 tests/test-coroutine.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c
index 6431dd6..abd97c2 100644
--- a/tests/test-coroutine.c
+++ b/tests/test-coroutine.c
@@ -53,6 +53,47 @@ static void test_self(void)
 }
 
 /*
+ * Check that qemu_coroutine_entered() works
+ */
+
+static void coroutine_fn verify_entered_step_2(void *opaque)
+{
+Coroutine *caller = (Coroutine *)opaque;
+
+g_assert(qemu_coroutine_entered(caller));
+g_assert(qemu_coroutine_entered(qemu_coroutine_self()));
+qemu_coroutine_yield();
+
+/* Once more to check it still works after yielding */
+g_assert(qemu_coroutine_entered(caller));
+g_assert(qemu_coroutine_entered(qemu_coroutine_self()));
+qemu_coroutine_yield();
+}
+
+static void coroutine_fn verify_entered_step_1(void *opaque)
+{
+Coroutine *self = qemu_coroutine_self();
+Coroutine *coroutine;
+
+g_assert(qemu_coroutine_entered(self));
+
+coroutine = qemu_coroutine_create(verify_entered_step_2, self);
+g_assert(!qemu_coroutine_entered(coroutine));
+qemu_coroutine_enter(coroutine);
+g_assert(!qemu_coroutine_entered(coroutine));
+qemu_coroutine_enter(coroutine);
+}
+
+static void test_entered(void)
+{
+Coroutine *coroutine;
+
+coroutine = qemu_coroutine_create(verify_entered_step_1, NULL);
+g_assert(!qemu_coroutine_entered(coroutine));
+qemu_coroutine_enter(coroutine);
+}
+
+/*
  * Check that coroutines may nest multiple levels
  */
 
@@ -389,6 +430,7 @@ int main(int argc, char **argv)
 g_test_add_func("/basic/yield", test_yield);
 g_test_add_func("/basic/nesting", test_nesting);
 g_test_add_func("/basic/self", test_self);
+g_test_add_func("/basic/entered", test_entered);
 g_test_add_func("/basic/in_coroutine", test_in_coroutine);
 g_test_add_func("/basic/order", test_order);
 if (g_test_perf()) {
-- 
2.7.4

[Qemu-block] [PULL 16/18] coroutine-sigaltstack: use helper for allocating stack memory

2016-09-27 Thread Kevin Wolf

From: Peter Lieven 

Signed-off-by: Peter Lieven 
Signed-off-by: Kevin Wolf 
---
 util/coroutine-sigaltstack.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index a5bcb7e..f6fc49a 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -33,6 +33,7 @@
 typedef struct {
 Coroutine base;
 void *stack;
+size_t stack_size;
 sigjmp_buf env;
 } CoroutineSigAltStack;
 
@@ -143,7 +144,6 @@ static void coroutine_trampoline(int signal)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineSigAltStack *co;
 CoroutineThreadState *coTS;
 struct sigaction sa;
@@ -164,7 +164,8 @@ Coroutine *qemu_coroutine_new(void)
  */
 
 co = g_malloc0(sizeof(*co));
-co->stack = g_malloc(stack_size);
+co->stack_size = COROUTINE_STACK_SIZE;
+co->stack = qemu_alloc_stack(>stack_size);
 co->base.entry_arg = _env; /* stash away our jmp_buf */
 
 coTS = coroutine_get_thread_state();
@@ -189,7 +190,7 @@ Coroutine *qemu_coroutine_new(void)
  * Set the new stack.
  */
 ss.ss_sp = co->stack;
-ss.ss_size = stack_size;
+ss.ss_size = co->stack_size;
 ss.ss_flags = 0;
 if (sigaltstack(, ) < 0) {
 abort();
@@ -253,7 +254,7 @@ void qemu_coroutine_delete(Coroutine *co_)
 {
 CoroutineSigAltStack *co = DO_UPCAST(CoroutineSigAltStack, base, co_);
 
-g_free(co->stack);
+qemu_free_stack(co->stack, co->stack_size);
 g_free(co);
 }
 
-- 
1.8.3.1

[Qemu-block] [PULL 17/18] oslib-posix: add a configure switch to debug stack usage

2016-09-27 Thread Kevin Wolf

From: Peter Lieven 

this adds a knob to track the maximum stack usage of stacks
created by qemu_alloc_stack.

Signed-off-by: Peter Lieven 
Signed-off-by: Kevin Wolf 
---
 configure  | 19 +++
 util/oslib-posix.c | 35 +++
 2 files changed, 54 insertions(+)

diff --git a/configure b/configure
index 8fa62ad..93ef00a 100755
--- a/configure
+++ b/configure
@@ -296,6 +296,7 @@ libiscsi=""
 libnfs=""
 coroutine=""
 coroutine_pool=""
+debug_stack_usage="no"
 seccomp=""
 glusterfs=""
 glusterfs_xlator_opt="no"
@@ -1004,6 +1005,8 @@ for opt do
   ;;
   --enable-coroutine-pool) coroutine_pool="yes"
   ;;
+  --enable-debug-stack-usage) debug_stack_usage="yes"
+  ;;
   --disable-docs) docs="no"
   ;;
   --enable-docs) docs="yes"
@@ -4276,6 +4279,17 @@ if test "$coroutine" = "gthread" -a "$coroutine_pool" = 
"yes"; then
   error_exit "'gthread' coroutine backend does not support pool (use 
--disable-coroutine-pool)"
 fi
 
+if test "$debug_stack_usage" = "yes"; then
+  if test "$cpu" = "ia64" -o "$cpu" = "hppa"; then
+error_exit "stack usage debugging is not supported for $cpu"
+  fi
+  if test "$coroutine_pool" = "yes"; then
+echo "WARN: disabling coroutine pool for stack usage debugging"
+coroutine_pool=no
+  fi
+fi
+
+
 ##
 # check if we have open_by_handle_at
 
@@ -4861,6 +4875,7 @@ echo "QGA MSI support   $guest_agent_msi"
 echo "seccomp support   $seccomp"
 echo "coroutine backend $coroutine"
 echo "coroutine pool$coroutine_pool"
+echo "debug stack usage $debug_stack_usage"
 echo "GlusterFS support $glusterfs"
 echo "Archipelago support $archipelago"
 echo "gcov  $gcov_tool"
@@ -5330,6 +5345,10 @@ else
   echo "CONFIG_COROUTINE_POOL=0" >> $config_host_mak
 fi
 
+if test "$debug_stack_usage" = "yes" ; then
+  echo "CONFIG_DEBUG_STACK_USAGE=y" >> $config_host_mak
+fi
+
 if test "$open_by_handle_at" = "yes" ; then
   echo "CONFIG_OPEN_BY_HANDLE=y" >> $config_host_mak
 fi
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index d950c34..aaec189 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -50,6 +50,10 @@
 
 #include "qemu/mmap-alloc.h"
 
+#ifdef CONFIG_DEBUG_STACK_USAGE
+#include "qemu/error-report.h"
+#endif
+
 int qemu_get_thread_id(void)
 {
 #if defined(__linux__)
@@ -503,6 +507,9 @@ pid_t qemu_fork(Error **errp)
 void *qemu_alloc_stack(size_t *sz)
 {
 void *ptr, *guardpage;
+#ifdef CONFIG_DEBUG_STACK_USAGE
+void *ptr2;
+#endif
 size_t pagesz = getpagesize();
 #ifdef _SC_THREAD_STACK_MIN
 /* avoid stacks smaller than _SC_THREAD_STACK_MIN */
@@ -534,10 +541,38 @@ void *qemu_alloc_stack(size_t *sz)
 abort();
 }
 
+#ifdef CONFIG_DEBUG_STACK_USAGE
+for (ptr2 = ptr + pagesz; ptr2 < ptr + *sz; ptr2 += sizeof(uint32_t)) {
+*(uint32_t *)ptr2 = 0xdeadbeaf;
+}
+#endif
+
 return ptr;
 }
 
+#ifdef CONFIG_DEBUG_STACK_USAGE
+static __thread unsigned int max_stack_usage;
+#endif
+
 void qemu_free_stack(void *stack, size_t sz)
 {
+#ifdef CONFIG_DEBUG_STACK_USAGE
+unsigned int usage;
+void *ptr;
+
+for (ptr = stack + getpagesize(); ptr < stack + sz;
+ ptr += sizeof(uint32_t)) {
+if (*(uint32_t *)ptr != 0xdeadbeaf) {
+break;
+}
+}
+usage = sz - (uintptr_t) (ptr - stack);
+if (usage > max_stack_usage) {
+error_report("thread %d max stack usage increased from %u to %u",
+ qemu_get_thread_id(), max_stack_usage, usage);
+max_stack_usage = usage;
+}
+#endif
+
 munmap(stack, sz);
 }
-- 
1.8.3.1

[Qemu-block] [PULL 06/18] block/qapi: Use separate options type for curl driver

2016-09-27 Thread Kevin Wolf

We're going to add an option to the file drivers which doesn't apply to
the curl drivers, so give them a separate option type.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 92193ab..b5fdd42 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1721,8 +1721,7 @@
 ##
 # @BlockdevOptionsFile
 #
-# Driver specific block device options for the file backend and similar
-# protocols.
+# Driver specific block device options for the file backend.
 #
 # @filename:path to the image file
 #
@@ -2211,6 +2210,18 @@
 '*top-id': 'str' } }
 
 ##
+# @BlockdevOptionsCurl
+#
+# Driver specific block device options for the curl backend.
+#
+# @filename:path to the image file
+#
+# Since: 1.7
+##
+{ 'struct': 'BlockdevOptionsCurl',
+  'data': { 'filename': 'str' } }
+
+##
 # @BlockdevOptions
 #
 # Options for creating a block device.  Many options are available for all
@@ -2248,13 +2259,13 @@
   'cloop':  'BlockdevOptionsGenericFormat',
   'dmg':'BlockdevOptionsGenericFormat',
   'file':   'BlockdevOptionsFile',
-  'ftp':'BlockdevOptionsFile',
-  'ftps':   'BlockdevOptionsFile',
+  'ftp':'BlockdevOptionsCurl',
+  'ftps':   'BlockdevOptionsCurl',
   'gluster':'BlockdevOptionsGluster',
   'host_cdrom': 'BlockdevOptionsFile',
   'host_device':'BlockdevOptionsFile',
-  'http':   'BlockdevOptionsFile',
-  'https':  'BlockdevOptionsFile',
+  'http':   'BlockdevOptionsCurl',
+  'https':  'BlockdevOptionsCurl',
 # TODO iscsi: Wait for structured options
   'luks':   'BlockdevOptionsLUKS',
 # TODO nbd: Should take InetSocketAddress for 'host'?
@@ -2271,7 +2282,7 @@
   'replication':'BlockdevOptionsReplication',
 # TODO sheepdog: Wait for structured options
 # TODO ssh: Should take InetSocketAddress for 'host'?
-  'tftp':   'BlockdevOptionsFile',
+  'tftp':   'BlockdevOptionsCurl',
   'vdi':'BlockdevOptionsGenericFormat',
   'vhdx':   'BlockdevOptionsGenericFormat',
   'vmdk':   'BlockdevOptionsGenericCOWFormat',
-- 
1.8.3.1

[Qemu-block] [PULL 00/18] Block layer patches

2016-09-27 Thread Kevin Wolf

The following changes since commit 7cfdc02dae0d2ff58c897496cfdbbafc0eda0f3f:

  Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
(2016-09-26 19:47:00 +0100)

are available in the git repository at:


  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 3b856cebe5e93547852c156ca2119d075e62aed7:

  coroutine: reduce stack size to 60kB (2016-09-27 14:05:21 +0200)


Block layer patches


John Snow (3):
  block: reintroduce bdrv_flush_all
  qemu: use bdrv_flush_all for vm_stop et al
  block-backend: remove blk_flush_all

Kevin Wolf (8):
  block: Fix error path in qmp_blockdev_change_medium()
  block: Drop aio/cache consistency check from qmp_blockdev_add()
  block/qapi: Use separate options type for curl driver
  block/qapi: Move 'aio' option to file driver
  block: Parse 'detect-zeroes' in bdrv_open_common()
  block: Use 'detect-zeroes' option for 'blockdev-change-medium'
  block: Move 'discard' option to bdrv_open_common()
  block: Remove qemu_root_bds_opts

Peter Lieven (7):
  oslib-posix: add helpers for stack alloc and free
  coroutine-sigaltstack: rename coroutine struct appropriately
  coroutine: add a macro for the coroutine stack size
  coroutine-ucontext: use helper for allocating stack memory
  coroutine-sigaltstack: use helper for allocating stack memory
  oslib-posix: add a configure switch to debug stack usage
  coroutine: reduce stack size to 60kB

 block.c|  50 +-
 block/block-backend.c  |  31 ++--
 block/io.c |  25 +
 block/raw-posix.c  |  44 +---
 block/raw-win32.c  |  56 +++--
 blockdev.c | 112 +++--
 configure  |  19 +++
 cpus.c |   4 +-
 hw/i386/xen/xen_platform.c |   2 -
 hw/ide/piix.c  |   4 ++
 include/block/block.h  |   2 +
 include/qemu/coroutine_int.h   |   2 +
 include/sysemu/block-backend.h |   3 +-
 include/sysemu/os-posix.h  |  27 ++
 qapi/block-core.json   |  31 
 tests/qemu-iotests/087 |   4 +-
 tests/qemu-iotests/087.out |   2 +-
 util/coroutine-sigaltstack.c   |  25 -
 util/coroutine-ucontext.c  |  11 ++--
 util/coroutine-win32.c |   2 +-
 util/oslib-posix.c |  77 
 21 files changed, 342 insertions(+), 191 deletions(-)

[Qemu-block] [PULL 05/18] block: Drop aio/cache consistency check from qmp_blockdev_add()

2016-09-27 Thread Kevin Wolf

The TODO comment has been addressed a while ago and this is now checked
in raw-posix, so we don't have to special case this in blockdev-add any
more.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Max Reitz 
---
 blockdev.c | 15 ---
 tests/qemu-iotests/087.out |  2 +-
 2 files changed, 1 insertion(+), 16 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 62d0dd0..7820f42 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3832,21 +3832,6 @@ void qmp_blockdev_add(BlockdevOptions *options, Error 
**errp)
 QDict *qdict;
 Error *local_err = NULL;
 
-/* TODO Sort it out in raw-posix and drive_new(): Reject aio=native with
- * cache.direct=false instead of silently switching to aio=threads, except
- * when called from drive_new().
- *
- * For now, simply forbidding the combination for all drivers will do. */
-if (options->has_aio && options->aio == BLOCKDEV_AIO_OPTIONS_NATIVE) {
-bool direct = options->has_cache &&
-  options->cache->has_direct &&
-  options->cache->direct;
-if (!direct) {
-error_setg(errp, "aio=native requires cache.direct=true");
-goto fail;
-}
-}
-
 visit_type_BlockdevOptions(v, NULL, , _err);
 if (local_err) {
 error_propagate(errp, local_err);
diff --git a/tests/qemu-iotests/087.out b/tests/qemu-iotests/087.out
index bef6862..cd02eae 100644
--- a/tests/qemu-iotests/087.out
+++ b/tests/qemu-iotests/087.out
@@ -27,7 +27,7 @@ QMP_VERSION
 Testing:
 QMP_VERSION
 {"return": {}}
-{"error": {"class": "GenericError", "desc": "aio=native requires 
cache.direct=true"}}
+{"error": {"class": "GenericError", "desc": "aio=native was specified, but it 
requires cache.direct=on, which was not specified."}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN"}
 
-- 
1.8.3.1

[Qemu-block] [PULL 13/18] coroutine-sigaltstack: rename coroutine struct appropriately

2016-09-27 Thread Kevin Wolf

From: Peter Lieven 

The name of the sigaltstack coroutine struct was misleading.

Signed-off-by: Peter Lieven 
Signed-off-by: Kevin Wolf 
---
 util/coroutine-sigaltstack.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index a7c3366..171cd44 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -34,7 +34,7 @@ typedef struct {
 Coroutine base;
 void *stack;
 sigjmp_buf env;
-} CoroutineUContext;
+} CoroutineSigAltStack;
 
 /**
  * Per-thread coroutine bookkeeping
@@ -44,7 +44,7 @@ typedef struct {
 Coroutine *current;
 
 /** The default coroutine */
-CoroutineUContext leader;
+CoroutineSigAltStack leader;
 
 /** Information for the signal handler (trampoline) */
 sigjmp_buf tr_reenter;
@@ -89,7 +89,7 @@ static void __attribute__((constructor)) coroutine_init(void)
  * (from the signal handler when it is not signal handling, read ahead
  * for more information).
  */
-static void coroutine_bootstrap(CoroutineUContext *self, Coroutine *co)
+static void coroutine_bootstrap(CoroutineSigAltStack *self, Coroutine *co)
 {
 /* Initialize longjmp environment and switch back the caller */
 if (!sigsetjmp(self->env, 0)) {
@@ -109,7 +109,7 @@ static void coroutine_bootstrap(CoroutineUContext *self, 
Coroutine *co)
  */
 static void coroutine_trampoline(int signal)
 {
-CoroutineUContext *self;
+CoroutineSigAltStack *self;
 Coroutine *co;
 CoroutineThreadState *coTS;
 
@@ -144,7 +144,7 @@ static void coroutine_trampoline(int signal)
 Coroutine *qemu_coroutine_new(void)
 {
 const size_t stack_size = 1 << 20;
-CoroutineUContext *co;
+CoroutineSigAltStack *co;
 CoroutineThreadState *coTS;
 struct sigaction sa;
 struct sigaction osa;
@@ -251,7 +251,7 @@ Coroutine *qemu_coroutine_new(void)
 
 void qemu_coroutine_delete(Coroutine *co_)
 {
-CoroutineUContext *co = DO_UPCAST(CoroutineUContext, base, co_);
+CoroutineSigAltStack *co = DO_UPCAST(CoroutineSigAltStack, base, co_);
 
 g_free(co->stack);
 g_free(co);
@@ -260,8 +260,8 @@ void qemu_coroutine_delete(Coroutine *co_)
 CoroutineAction qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
   CoroutineAction action)
 {
-CoroutineUContext *from = DO_UPCAST(CoroutineUContext, base, from_);
-CoroutineUContext *to = DO_UPCAST(CoroutineUContext, base, to_);
+CoroutineSigAltStack *from = DO_UPCAST(CoroutineSigAltStack, base, from_);
+CoroutineSigAltStack *to = DO_UPCAST(CoroutineSigAltStack, base, to_);
 CoroutineThreadState *s = coroutine_get_thread_state();
 int ret;
 
-- 
1.8.3.1

[Qemu-block] [PULL 18/18] coroutine: reduce stack size to 60kB

2016-09-27 Thread Kevin Wolf

From: Peter Lieven 

evaluation with the recently introduced maximum stack usage monitoring revealed
that the actual used stack size was never above 4kB so allocating 1MB stack
for each coroutine is a lot of wasted memory. So reduce the stack size to
60kB which should still give enough head room. The guard page added
in qemu_alloc_stack will catch a potential stack overflow introduced
by this commit. The 60kB + guard page will result in an allocation of
64kB per coroutine on systems where a page is 4kB.

Signed-off-by: Peter Lieven 
Signed-off-by: Kevin Wolf 
---
 include/qemu/coroutine_int.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
index 14d4f1d..be14260 100644
--- a/include/qemu/coroutine_int.h
+++ b/include/qemu/coroutine_int.h
@@ -28,7 +28,7 @@
 #include "qemu/queue.h"
 #include "qemu/coroutine.h"
 
-#define COROUTINE_STACK_SIZE (1 << 20)
+#define COROUTINE_STACK_SIZE 61440
 
 typedef enum {
 COROUTINE_YIELD = 1,
-- 
1.8.3.1

Re: [Qemu-block] [Qemu-devel] [PATCH v3] block: Turn on "unmap" in active commit

2016-09-27 Thread Eric Blake

On 09/27/2016 06:14 AM, Fam Zheng wrote:
> We already specified BDRV_O_UNMAP when opening images in 'qemu-img
> commit', but didn't turn on the "unmap" in the active commit job. This
> patch fixes that so that zeroed clusters in top image can be discarded
> which is desired in the virt-sparsify use case, where a temporary
> overlay is created and fstrim'ed before commiting back, to free space in

s/commiting/committing/

> the original image.
> 
> This also enables it for block-commit.
> 
> Signed-off-by: Fam Zheng 
> ---
> v3: Change the right parameter.
> v2: Add "unmap" to block-commit as well. [Kevin]
> ---
>  block/mirror.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-block] [Qemu-devel] [PATCH v3] block: Turn on "unmap" in active commit

2016-09-27 Thread Eric Blake

On 09/27/2016 06:14 AM, Fam Zheng wrote:
> We already specified BDRV_O_UNMAP when opening images in 'qemu-img
> commit', but didn't turn on the "unmap" in the active commit job. This
> patch fixes that so that zeroed clusters in top image can be discarded
> which is desired in the virt-sparsify use case, where a temporary
> overlay is created and fstrim'ed before commiting back, to free space in
> the original image.
> 
> This also enables it for block-commit.
> 
> Signed-off-by: Fam Zheng 
> ---
> v3: Change the right parameter.

Doesn't affect this patch, but it may be worth using the 'boxed':true
notation in the .json file to make it more compact to pass job
information around via a struct rather than a large mess of parameters
where you are likely to get the wrong one.

> v2: Add "unmap" to block-commit as well. [Kevin]
> ---
>  block/mirror.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index f9d1fec..8847ec5 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -1042,7 +1042,7 @@ void commit_active_start(const char *job_id, 
> BlockDriverState *bs,
>  
>  mirror_start_job(job_id, bs, base, NULL, speed, 0, 0,
>   MIRROR_LEAVE_BACKING_CHAIN,
> - on_error, on_error, false, cb, opaque, _err,
> + on_error, on_error, true, cb, opaque, _err,
>   _active_job_driver, false, base, auto_complete);

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-block] [PULL 11/18] block: Remove qemu_root_bds_opts

2016-09-27 Thread Kevin Wolf

The remaining options in qemu_root_bds_opts (aio and copy-on-read)
aren't used any more, the QAPI schema doesn't contain them. Therefore
all the code processing qemu_root_bds_opts options is dead and can be
removed.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Max Reitz 
---
 blockdev.c | 54 +-
 1 file changed, 1 insertion(+), 53 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index e2ace04..07ec733 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -633,34 +633,11 @@ err_no_opts:
 return NULL;
 }
 
-static QemuOptsList qemu_root_bds_opts;
-
 /* Takes the ownership of bs_opts */
 static BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp)
 {
-BlockDriverState *bs;
-QemuOpts *opts;
-Error *local_error = NULL;
 int bdrv_flags = 0;
 
-opts = qemu_opts_create(_root_bds_opts, NULL, 1, errp);
-if (!opts) {
-goto fail;
-}
-
-qemu_opts_absorb_qdict(opts, bs_opts, _error);
-if (local_error) {
-error_propagate(errp, local_error);
-goto fail;
-}
-
-extract_common_blockdev_options(opts, _flags, NULL, NULL,
-NULL, _error);
-if (local_error) {
-error_propagate(errp, local_error);
-goto fail;
-}
-
 /* bdrv_open() defaults to the values in bdrv_flags (for compatibility
  * with other callers) rather than what we want as the real defaults.
  * Apply the defaults here instead. */
@@ -672,19 +649,7 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, 
Error **errp)
 bdrv_flags |= BDRV_O_INACTIVE;
 }
 
-bs = bdrv_open(NULL, NULL, bs_opts, bdrv_flags, errp);
-if (!bs) {
-goto fail_no_bs_opts;
-}
-
-fail_no_bs_opts:
-qemu_opts_del(opts);
-return bs;
-
-fail:
-qemu_opts_del(opts);
-QDECREF(bs_opts);
-return NULL;
+return bdrv_open(NULL, NULL, bs_opts, bdrv_flags, errp);
 }
 
 void blockdev_close_all_bdrv_states(void)
@@ -4099,23 +4064,6 @@ QemuOptsList qemu_common_drive_opts = {
 },
 };
 
-static QemuOptsList qemu_root_bds_opts = {
-.name = "root-bds",
-.head = QTAILQ_HEAD_INITIALIZER(qemu_root_bds_opts.head),
-.desc = {
-{
-.name = "aio",
-.type = QEMU_OPT_STRING,
-.help = "host AIO implementation (threads, native)",
-},{
-.name = "copy-on-read",
-.type = QEMU_OPT_BOOL,
-.help = "copy read data from backing file into image file",
-},
-{ /* end of list */ }
-},
-};
-
 QemuOptsList qemu_drive_opts = {
 .name = "drive",
 .head = QTAILQ_HEAD_INITIALIZER(qemu_drive_opts.head),
-- 
1.8.3.1

[Qemu-block] [PULL 08/18] block: Parse 'detect-zeroes' in bdrv_open_common()

2016-09-27 Thread Kevin Wolf

Amongst others, this means that you can now use the 'detect-zeroes'
option for non-top-level nodes in blockdev-add, like the QAPI schema
promises.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Max Reitz 
---
 block.c| 33 +
 blockdev.c |  9 +
 2 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/block.c b/block.c
index 493ecf3..1f10457 100644
--- a/block.c
+++ b/block.c
@@ -42,6 +42,7 @@
 #include "qapi-event.h"
 #include "qemu/cutils.h"
 #include "qemu/id.h"
+#include "qapi/util.h"
 
 #ifdef CONFIG_BSD
 #include 
@@ -954,6 +955,11 @@ static QemuOptsList bdrv_runtime_opts = {
 .type = QEMU_OPT_BOOL,
 .help = "Node is opened in read-only mode",
 },
+{
+.name = "detect-zeroes",
+.type = QEMU_OPT_STRING,
+.help = "try to optimize zero writes (off, on, unmap)",
+},
 { /* end of list */ }
 },
 };
@@ -970,6 +976,7 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild 
*file,
 const char *filename;
 const char *driver_name = NULL;
 const char *node_name = NULL;
+const char *detect_zeroes;
 QemuOpts *opts;
 BlockDriver *drv;
 Error *local_err = NULL;
@@ -1038,6 +1045,32 @@ static int bdrv_open_common(BlockDriverState *bs, 
BdrvChild *file,
 }
 }
 
+detect_zeroes = qemu_opt_get(opts, "detect-zeroes");
+if (detect_zeroes) {
+BlockdevDetectZeroesOptions value =
+qapi_enum_parse(BlockdevDetectZeroesOptions_lookup,
+detect_zeroes,
+BLOCKDEV_DETECT_ZEROES_OPTIONS__MAX,
+BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF,
+_err);
+if (local_err) {
+error_propagate(errp, local_err);
+ret = -EINVAL;
+goto fail_opts;
+}
+
+if (value == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP &&
+!(bs->open_flags & BDRV_O_UNMAP))
+{
+error_setg(errp, "setting detect-zeroes to unmap is not allowed "
+ "without setting discard operation to unmap");
+ret = -EINVAL;
+goto fail_opts;
+}
+
+bs->detect_zeroes = value;
+}
+
 if (filename != NULL) {
 pstrcpy(bs->filename, sizeof(bs->filename), filename);
 } else {
diff --git a/blockdev.c b/blockdev.c
index 7820f42..511260c 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -658,7 +658,6 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, 
Error **errp)
 BlockDriverState *bs;
 QemuOpts *opts;
 Error *local_error = NULL;
-BlockdevDetectZeroesOptions detect_zeroes;
 int bdrv_flags = 0;
 
 opts = qemu_opts_create(_root_bds_opts, NULL, 1, errp);
@@ -673,7 +672,7 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, 
Error **errp)
 }
 
 extract_common_blockdev_options(opts, _flags, NULL, NULL,
-_zeroes, _error);
+NULL, _error);
 if (local_error) {
 error_propagate(errp, local_error);
 goto fail;
@@ -695,8 +694,6 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, 
Error **errp)
 goto fail_no_bs_opts;
 }
 
-bs->detect_zeroes = detect_zeroes;
-
 fail_no_bs_opts:
 qemu_opts_del(opts);
 return bs;
@@ -4136,10 +4133,6 @@ static QemuOptsList qemu_root_bds_opts = {
 .name = "copy-on-read",
 .type = QEMU_OPT_BOOL,
 .help = "copy read data from backing file into image file",
-},{
-.name = "detect-zeroes",
-.type = QEMU_OPT_STRING,
-.help = "try to optimize zero writes (off, on, unmap)",
 },
 { /* end of list */ }
 },
-- 
1.8.3.1

[Qemu-block] [PULL 14/18] coroutine: add a macro for the coroutine stack size

2016-09-27 Thread Kevin Wolf

From: Peter Lieven 

Signed-off-by: Peter Lieven 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Richard Henderson 
Signed-off-by: Kevin Wolf 
---
 include/qemu/coroutine_int.h | 2 ++
 util/coroutine-sigaltstack.c | 2 +-
 util/coroutine-ucontext.c| 2 +-
 util/coroutine-win32.c   | 2 +-
 4 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
index 6df9d33..14d4f1d 100644
--- a/include/qemu/coroutine_int.h
+++ b/include/qemu/coroutine_int.h
@@ -28,6 +28,8 @@
 #include "qemu/queue.h"
 #include "qemu/coroutine.h"
 
+#define COROUTINE_STACK_SIZE (1 << 20)
+
 typedef enum {
 COROUTINE_YIELD = 1,
 COROUTINE_TERMINATE = 2,
diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index 171cd44..a5bcb7e 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -143,7 +143,7 @@ static void coroutine_trampoline(int signal)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = 1 << 20;
+const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineSigAltStack *co;
 CoroutineThreadState *coTS;
 struct sigaction sa;
diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c
index 2bb7e10..31254ab 100644
--- a/util/coroutine-ucontext.c
+++ b/util/coroutine-ucontext.c
@@ -82,7 +82,7 @@ static void coroutine_trampoline(int i0, int i1)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = 1 << 20;
+const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineUContext *co;
 ucontext_t old_uc, uc;
 sigjmp_buf old_env;
diff --git a/util/coroutine-win32.c b/util/coroutine-win32.c
index 02e28e8..de6bd4f 100644
--- a/util/coroutine-win32.c
+++ b/util/coroutine-win32.c
@@ -71,7 +71,7 @@ static void CALLBACK coroutine_trampoline(void *co_)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = 1 << 20;
+const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineWin32 *co;
 
 co = g_malloc0(sizeof(*co));
-- 
1.8.3.1

[Qemu-block] [PULL 02/18] qemu: use bdrv_flush_all for vm_stop et al

2016-09-27 Thread Kevin Wolf

From: John Snow 

Reimplement bdrv_flush_all for vm_stop. In contrast to blk_flush_all,
bdrv_flush_all does not have device model restrictions. This allows
us to flush and halt unconditionally without error.

This allows us to do things like migrate when we have a device with
an open tray, but has a node that may need to be flushed, or nodes
that aren't currently attached to any device and need to be flushed.

Specifically, this allows us to migrate when we have a CDROM with
an open tray.

Signed-off-by: John Snow 
Reviewed-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Acked-by: Fam Zheng 
Signed-off-by: Kevin Wolf 
---
 cpus.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index e39ccb7..96d9352 100644
--- a/cpus.c
+++ b/cpus.c
@@ -751,7 +751,7 @@ static int do_vm_stop(RunState state)
 }
 
 bdrv_drain_all();
-ret = blk_flush_all();
+ret = bdrv_flush_all();
 
 return ret;
 }
@@ -1494,7 +1494,7 @@ int vm_stop_force_state(RunState state)
 bdrv_drain_all();
 /* Make sure to return an error if the flush in a previous vm_stop()
  * failed. */
-return blk_flush_all();
+return bdrv_flush_all();
 }
 }
 
-- 
1.8.3.1

[Qemu-block] [PULL 12/18] oslib-posix: add helpers for stack alloc and free

2016-09-27 Thread Kevin Wolf

From: Peter Lieven 

the allocated stack will be adjusted to the minimum supported stack size
by the OS and rounded up to be a multiple of the system pagesize.
Additionally an architecture dependent guard page is added to the stack
to catch stack overflows.

Signed-off-by: Peter Lieven 
Signed-off-by: Kevin Wolf 
---
 include/sysemu/os-posix.h | 27 +++
 util/oslib-posix.c| 42 ++
 2 files changed, 69 insertions(+)

diff --git a/include/sysemu/os-posix.h b/include/sysemu/os-posix.h
index 9c7dfdf..3cfedbc 100644
--- a/include/sysemu/os-posix.h
+++ b/include/sysemu/os-posix.h
@@ -60,4 +60,31 @@ int qemu_utimens(const char *path, const qemu_timespec 
*times);
 
 bool is_daemonized(void);
 
+/**
+ * qemu_alloc_stack:
+ * @sz: pointer to a size_t holding the requested usable stack size
+ *
+ * Allocate memory that can be used as a stack, for instance for
+ * coroutines. If the memory cannot be allocated, this function
+ * will abort (like g_malloc()). This function also inserts an
+ * additional guard page to catch a potential stack overflow.
+ * Note that the memory required for the guard page and alignment
+ * and minimal stack size restrictions will increase the value of sz.
+ *
+ * The allocated stack must be freed with qemu_free_stack().
+ *
+ * Returns: pointer to (the lowest address of) the stack memory.
+ */
+void *qemu_alloc_stack(size_t *sz);
+
+/**
+ * qemu_free_stack:
+ * @stack: stack to free
+ * @sz: size of stack in bytes
+ *
+ * Free a stack allocated via qemu_alloc_stack(). Note that sz must
+ * be exactly the adjusted stack size returned by qemu_alloc_stack.
+ */
+void qemu_free_stack(void *stack, size_t sz);
+
 #endif
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index f2d4e9e..d950c34 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -499,3 +499,45 @@ pid_t qemu_fork(Error **errp)
 }
 return pid;
 }
+
+void *qemu_alloc_stack(size_t *sz)
+{
+void *ptr, *guardpage;
+size_t pagesz = getpagesize();
+#ifdef _SC_THREAD_STACK_MIN
+/* avoid stacks smaller than _SC_THREAD_STACK_MIN */
+long min_stack_sz = sysconf(_SC_THREAD_STACK_MIN);
+*sz = MAX(MAX(min_stack_sz, 0), *sz);
+#endif
+/* adjust stack size to a multiple of the page size */
+*sz = ROUND_UP(*sz, pagesz);
+/* allocate one extra page for the guard page */
+*sz += pagesz;
+
+ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
+   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+if (ptr == MAP_FAILED) {
+abort();
+}
+
+#if defined(HOST_IA64)
+/* separate register stack */
+guardpage = ptr + (((*sz - pagesz) / 2) & ~pagesz);
+#elif defined(HOST_HPPA)
+/* stack grows up */
+guardpage = ptr + *sz - pagesz;
+#else
+/* stack grows down */
+guardpage = ptr;
+#endif
+if (mprotect(guardpage, pagesz, PROT_NONE) != 0) {
+abort();
+}
+
+return ptr;
+}
+
+void qemu_free_stack(void *stack, size_t sz)
+{
+munmap(stack, sz);
+}
-- 
1.8.3.1

[Qemu-block] [PATCH v14 19/19] qapi: delete unused OptsVisitor code

2016-09-27 Thread Daniel P. Berrange

Now that all code has been converted to QObjectInputVisitor's
QemuOpts compatibility mode, there is no longer any reason
to keep OptsVisitor alive.

Signed-off-by: Daniel P. Berrange 
---
 include/qapi/opts-visitor.h |  40 
 qapi/Makefile.objs  |   2 +-
 qapi/opts-visitor.c | 544 
 tests/Makefile.include  |   5 +-
 tests/test-opts-visitor.c   | 268 --
 vl.c|   1 -
 6 files changed, 2 insertions(+), 858 deletions(-)
 delete mode 100644 include/qapi/opts-visitor.h
 delete mode 100644 qapi/opts-visitor.c
 delete mode 100644 tests/test-opts-visitor.c

diff --git a/include/qapi/opts-visitor.h b/include/qapi/opts-visitor.h
deleted file mode 100644
index 6462c96..000
--- a/include/qapi/opts-visitor.h
+++ /dev/null
@@ -1,40 +0,0 @@
-/*
- * Options Visitor
- *
- * Copyright Red Hat, Inc. 2012
- *
- * Author: Laszlo Ersek 
- *
- * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
- * See the COPYING.LIB file in the top-level directory.
- *
- */
-
-#ifndef OPTS_VISITOR_H
-#define OPTS_VISITOR_H
-
-#include "qapi/visitor.h"
-#include "qemu/option.h"
-
-/* Inclusive upper bound on the size of any flattened range. This is a safety
- * (= anti-annoyance) measure; wrong ranges should not cause long startup
- * delays nor exhaust virtual memory.
- */
-#define OPTS_VISITOR_RANGE_MAX 65536
-
-typedef struct OptsVisitor OptsVisitor;
-
-/* Contrarily to qemu-option.c::parse_option_number(), OptsVisitor's "int"
- * parser relies on strtoll() instead of strtoull(). Consequences:
- * - string representations of negative numbers yield negative values,
- * - values below INT64_MIN or LLONG_MIN are rejected,
- * - values above INT64_MAX or LLONG_MAX are rejected.
- *
- * The Opts input visitor does not implement support for visiting QAPI
- * alternates, numbers (other than integers), null, or arbitrary
- * QTypes.  It also requires a non-null list argument to
- * visit_start_list().
- */
-Visitor *opts_visitor_new(const QemuOpts *opts);
-
-#endif
diff --git a/qapi/Makefile.objs b/qapi/Makefile.objs
index 33906ff..4b820a7 100644
--- a/qapi/Makefile.objs
+++ b/qapi/Makefile.objs
@@ -1,6 +1,6 @@
 util-obj-y = qapi-visit-core.o qapi-dealloc-visitor.o qobject-input-visitor.o
 util-obj-y += qobject-output-visitor.o qmp-registry.o qmp-dispatch.o
 util-obj-y += string-input-visitor.o string-output-visitor.o
-util-obj-y += opts-visitor.o qapi-clone-visitor.o
+util-obj-y += qapi-clone-visitor.o
 util-obj-y += qmp-event.o
 util-obj-y += qapi-util.o
diff --git a/qapi/opts-visitor.c b/qapi/opts-visitor.c
deleted file mode 100644
index 084f7cc..000
--- a/qapi/opts-visitor.c
+++ /dev/null
@@ -1,544 +0,0 @@
-/*
- * Options Visitor
- *
- * Copyright Red Hat, Inc. 2012-2016
- *
- * Author: Laszlo Ersek 
- *
- * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
- * See the COPYING.LIB file in the top-level directory.
- *
- */
-
-#include "qemu/osdep.h"
-#include "qapi/error.h"
-#include "qemu/cutils.h"
-#include "qapi/qmp/qerror.h"
-#include "qapi/opts-visitor.h"
-#include "qemu/queue.h"
-#include "qemu/option_int.h"
-#include "qapi/visitor-impl.h"
-
-
-enum ListMode
-{
-LM_NONE, /* not traversing a list of repeated options */
-
-LM_IN_PROGRESS,  /* opts_next_list() ready to be called.
-  *
-  * Generating the next list link will consume the most
-  * recently parsed QemuOpt instance of the repeated
-  * option.
-  *
-  * Parsing a value into the list link will examine the
-  * next QemuOpt instance of the repeated option, and
-  * possibly enter LM_SIGNED_INTERVAL or
-  * LM_UNSIGNED_INTERVAL.
-  */
-
-LM_SIGNED_INTERVAL,  /* opts_next_list() has been called.
-  *
-  * Generating the next list link will consume the most
-  * recently stored element from the signed interval,
-  * parsed from the most recent QemuOpt instance of the
-  * repeated option. This may consume QemuOpt itself
-  * and return to LM_IN_PROGRESS.
-  *
-  * Parsing a value into the list link will store the
-  * next element of the signed interval.
-  */
-
-LM_UNSIGNED_INTERVAL /* Same as above, only for an unsigned interval. */
-};
-
-typedef enum ListMode ListMode;
-
-struct OptsVisitor
-{
-Visitor visitor;
-
-/* Ownership remains with opts_visitor_new()'s caller. */
-const QemuOpts *opts_root;
-
-unsigned depth;
-
-/* Non-null

[Qemu-block] [PATCH v14 13/19] qom: support non-scalar properties with -object

2016-09-27 Thread Daniel P. Berrange

The current -object command line syntax only allows for
creation of objects with scalar properties, or a list
with a fixed scalar element type. Objects which have
properties that are represented as structs in the QAPI
schema cannot be created using -object.

This is a design limitation of the way the OptsVisitor
is written. It simply iterates over the QemuOpts values
as a flat list. The support for lists is enabled by
allowing the same key to be repeated in the opts string.

The QObjectInputVisitor now has functionality that allows
it to replace OptsVisitor, maintaining full backwards
compatibility for previous CLI syntax, while also allowing
use of new syntax for structs.

Thus -object can now support non-scalar properties,
for example the QMP object:

  {
"execute": "object-add",
"arguments": {
  "qom-type": "demo",
  "id": "demo0",
  "parameters": {
"foo": [
  { "bar": "one", "wizz": "1" },
  { "bar": "two", "wizz": "2" }
]
  }
}
  }

Would be creatable via the CLI now using

$QEMU \
  -object demo,id=demo0,\
  foo.0.bar=one,foo.0.wizz=1,\
  foo.1.bar=two,foo.1.wizz=2

Notice that this syntax is intentionally compatible
with that currently used by block drivers. NB the
indentation seen here after line continuations should
not be used in reality, it is just for clarity in this
commit message.

Signed-off-by: Daniel P. Berrange 
---
 qapi/qobject-input-visitor.c |   2 +-
 qom/object_interfaces.c  |  38 -
 tests/check-qom-proplist.c   | 367 ++-
 3 files changed, 397 insertions(+), 10 deletions(-)

diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c
index 0aef20e..a353d67 100644
--- a/qapi/qobject-input-visitor.c
+++ b/qapi/qobject-input-visitor.c
@@ -204,7 +204,7 @@ static void qobject_input_start_struct(Visitor *v, const 
char *name, void **obj,
 *obj = NULL;
 }
 
-if (!qobj && (qiv->struct_level < qiv->autocreate_struct_levels)) {
+if (!qobj && (qiv->struct_level <= qiv->autocreate_struct_levels)) {
 /* Create a new dict that contains all the currently
  * unvisited items */
 if (tos) {
diff --git a/qom/object_interfaces.c b/qom/object_interfaces.c
index f7afe49..faea1b1 100644
--- a/qom/object_interfaces.c
+++ b/qom/object_interfaces.c
@@ -4,7 +4,8 @@
 #include "qemu/module.h"
 #include "qapi-visit.h"
 #include "qapi/qobject-output-visitor.h"
-#include "qapi/opts-visitor.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qemu/option.h"
 
 void user_creatable_complete(Object *obj, Error **errp)
 {
@@ -63,12 +64,16 @@ Object *user_creatable_add(const QDict *qdict,
 if (local_err) {
 goto out_visit;
 }
-visit_check_struct(v, _err);
+
+obj = user_creatable_add_type(type, id, pdict, v, _err);
 if (local_err) {
 goto out_visit;
 }
 
-obj = user_creatable_add_type(type, id, pdict, v, _err);
+visit_check_struct(v, _err);
+if (local_err) {
+goto out_visit;
+}
 
 out_visit:
 visit_end_struct(v, NULL);
@@ -114,7 +119,7 @@ Object *user_creatable_add_type(const char *type, const 
char *id,
 
 assert(qdict);
 obj = object_new(type);
-visit_start_struct(v, NULL, NULL, 0, _err);
+visit_start_struct(v, "props", NULL, 0, _err);
 if (local_err) {
 goto out;
 }
@@ -158,14 +163,31 @@ Object *user_creatable_add_opts(QemuOpts *opts, Error 
**errp)
 {
 Visitor *v;
 QDict *pdict;
+QObject *pobj;
 Object *obj = NULL;
 
-v = opts_visitor_new(opts);
 pdict = qemu_opts_to_qdict(opts, NULL,
-   QEMU_OPTS_REPEAT_POLICY_LAST);
-
-obj = user_creatable_add(pdict, v, errp);
+   QEMU_OPTS_REPEAT_POLICY_LIST);
+
+pobj = qdict_crumple(pdict, true, errp);
+if (!pobj) {
+goto cleanup;
+}
+/*
+ * We need autocreate_list=true + permit_int_ranges=true
+ * in order to maintain compat with OptsVisitor creation
+ * of the 'numa' object which had an int16List property.
+ *
+ * We need autocreate_structs=1, because at the CLI level
+ * we have the object type + properties in the same flat
+ * struct, even though at the QMP level they are nested.
+ */
+v = qobject_input_visitor_new_autocast(pobj, true, 1, true);
+
+obj = user_creatable_add(qobject_to_qdict(pobj), v, errp);
 visit_free(v);
+qobject_decref(pobj);
+ cleanup:
 QDECREF(pdict);
 return obj;
 }
diff --git a/tests/check-qom-proplist.c b/tests/check-qom-proplist.c
index a16cefc..20eb1d1 100644
--- a/tests/check-qom-proplist.c
+++ b/tests/check-qom-proplist.c
@@ -22,7 +22,16 @@
 
 #include "qapi/error.h"
 #include "qom/object.h"
+#include "qom/object_interfaces.h"
 #include "qemu/module.h"
+#include "qapi/visitor.h"
+#include "qom/object_interfaces.h"
+#include "qemu/option.h"
+#include "qemu/config-file.h"

[Qemu-block] [PATCH v14 15/19] numa: convert to use QObjectInputVisitor for -numa

2016-09-27 Thread Daniel P. Berrange

Switch away from using OptsVisitor to parse the -numa
argument processing. This enables use of the modern
list syntax for specifying CPUs. e.g. the old syntax

  -numa node,nodeid=0,cpus=0-3,cpus=8-11,mem=107

is equivalent to

  -numa node,nodeid=0,cpus.0=0,cpus.1=1,cpus.2=2,cpus.3=3,\
cpus.4=8,cpus.5=9,cpus.6=10,cpus.7=11,mem=107

Furthermore, the cli arg can now follow the QAPI schema
nesting, so the above is equivalent to the canonical
syntax:

  -numa type=node,data.nodeid=0,data.cpus.0=0,data.cpus.1=1,\
data.cpus.2=2,data.cpus.3=3,data.cpus.4=8,data.cpus.5=9,\
data.cpus.6=10,data.cpus.7=11,data.mem=107

A test case is added to cover argument parsing to validate
that both the old and new syntax is correctly handled.

Signed-off-by: Daniel P. Berrange 
---
 include/sysemu/numa_int.h |  11 +
 numa.c|  36 +-
 stubs/Makefile.objs   |   5 ++
 stubs/exec.c  |   6 +++
 stubs/hostmem.c   |  14 ++
 stubs/memory.c|  41 
 stubs/qdev.c  |   8 
 stubs/vl.c|   8 
 stubs/vmstate.c   |   4 ++
 tests/Makefile.include|   2 +
 tests/test-numa.c | 116 ++
 11 files changed, 240 insertions(+), 11 deletions(-)
 create mode 100644 include/sysemu/numa_int.h
 create mode 100644 stubs/exec.c
 create mode 100644 stubs/hostmem.c
 create mode 100644 stubs/memory.c
 create mode 100644 stubs/qdev.c
 create mode 100644 stubs/vl.c
 create mode 100644 tests/test-numa.c

diff --git a/include/sysemu/numa_int.h b/include/sysemu/numa_int.h
new file mode 100644
index 000..93160da
--- /dev/null
+++ b/include/sysemu/numa_int.h
@@ -0,0 +1,11 @@
+#ifndef SYSEMU_NUMA_INT_H
+#define SYSEMU_NUMA_INT_H
+
+#include "sysemu/numa.h"
+
+extern int have_memdevs;
+extern int max_numa_nodeid;
+
+int parse_numa(void *opaque, QemuOpts *opts, Error **errp);
+
+#endif
diff --git a/numa.c b/numa.c
index 6289f46..5daa3d1 100644
--- a/numa.c
+++ b/numa.c
@@ -23,14 +23,14 @@
  */
 
 #include "qemu/osdep.h"
-#include "sysemu/numa.h"
+#include "sysemu/numa_int.h"
 #include "exec/cpu-common.h"
 #include "qemu/bitmap.h"
 #include "qom/cpu.h"
 #include "qemu/error-report.h"
 #include "include/exec/cpu-common.h" /* for RAM_ADDR_FMT */
 #include "qapi-visit.h"
-#include "qapi/opts-visitor.h"
+#include "qapi/qobject-input-visitor.h"
 #include "hw/boards.h"
 #include "sysemu/hostmem.h"
 #include "qmp-commands.h"
@@ -45,10 +45,10 @@ QemuOptsList qemu_numa_opts = {
 .desc = { { 0 } } /* validated with OptsVisitor */
 };
 
-static int have_memdevs = -1;
-static int max_numa_nodeid; /* Highest specified NUMA node ID, plus one.
- * For all nodes, nodeid < max_numa_nodeid
- */
+int have_memdevs = -1;
+int max_numa_nodeid; /* Highest specified NUMA node ID, plus one.
+  * For all nodes, nodeid < max_numa_nodeid
+  */
 int nb_numa_nodes;
 NodeInfo numa_info[MAX_NODES];
 
@@ -189,6 +189,9 @@ static void numa_node_parse(NumaNodeOptions *node, QemuOpts 
*opts, Error **errp)
 if (node->has_mem) {
 uint64_t mem_size = node->mem;
 const char *mem_str = qemu_opt_get(opts, "mem");
+if (!mem_str) {
+mem_str = qemu_opt_get(opts, "data.mem");
+}
 /* Fix up legacy suffix-less format */
 if (g_ascii_isdigit(mem_str[strlen(mem_str) - 1])) {
 mem_size <<= 20;
@@ -211,16 +214,27 @@ static void numa_node_parse(NumaNodeOptions *node, 
QemuOpts *opts, Error **errp)
 max_numa_nodeid = MAX(max_numa_nodeid, nodenr + 1);
 }
 
-static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
+int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
 {
 NumaOptions *object = NULL;
 Error *err = NULL;
+Visitor *v;
 
-{
-Visitor *v = opts_visitor_new(opts);
-visit_type_NumaOptions(v, NULL, , );
-visit_free(v);
+/*
+ * Needs autocreate_list=true and permit_int_ranges=true
+ * in order to support existing syntax for 'cpus' parameter
+ * which is an int list.
+ *
+ * Needs autocreate_struct_levels=1, because existing syntax
+ * used a nested struct in the QMP schema with a flat namespace
+ * in the CLI args.
+ */
+v = qobject_input_visitor_new_opts(opts, true, 1, true, );
+if (err) {
+goto end;
 }
+visit_type_NumaOptions(v, NULL, , );
+visit_free(v);
 
 if (err) {
 goto end;
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index c5850e8..661b48a 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -48,3 +48,8 @@ stub-obj-y += iohandler.o
 stub-obj-y += smbios_type_38.o
 stub-obj-y += ipmi.o
 stub-obj-y += pc_madt_cpu_entry.o
+stub-obj-y += vl.o
+stub-obj-y += exec.o
+stub-obj-y += memory.o
+stub-obj-y += hostmem.o
+stub-obj-y += qdev.o
diff --git a/stubs/exec.c

[Qemu-block] [PATCH v14 17/19] acpi: convert to QObjectInputVisitor for -acpi parsing

2016-09-27 Thread Daniel P. Berrange

The -acpi command line option parsing uses the OptsVisitor
currently. This is easily replaced by the QObjectInputVisitor
instead. There is no need to enable any of the compatibility
options, since the AcpiTableOptions QAPI struct only contains
scalar properties.

Signed-off-by: Daniel P. Berrange 
---
 hw/acpi/core.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index e890a5d..480d3dd 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -25,7 +25,7 @@
 #include "hw/acpi/acpi.h"
 #include "hw/nvram/fw_cfg.h"
 #include "qemu/config-file.h"
-#include "qapi/opts-visitor.h"
+#include "qapi/qobject-input-visitor.h"
 #include "qapi-visit.h"
 #include "qapi-event.h"
 
@@ -237,14 +237,14 @@ void acpi_table_add(const QemuOpts *opts, Error **errp)
 char **cur;
 size_t bloblen = 0;
 char unsigned *blob = NULL;
+Visitor *v;
 
-{
-Visitor *v;
-
-v = opts_visitor_new(opts);
-visit_type_AcpiTableOptions(v, NULL, , );
-visit_free(v);
+v = qobject_input_visitor_new_opts(opts, false, 0, false, );
+if (err) {
+goto out;
 }
+visit_type_AcpiTableOptions(v, NULL, , );
+visit_free(v);
 
 if (err) {
 goto out;
-- 
2.7.4

[Qemu-block] [PATCH v14 14/19] hmp: support non-scalar properties with object_add

2016-09-27 Thread Daniel P. Berrange

The current object_add HMP syntax only allows for
creation of objects with scalar properties, or a list
with a fixed scalar element type. Objects which have
properties that are represented as structs in the QAPI
schema cannot be created using -object.

This is a design limitation of the way the OptsVisitor
is written. It simply iterates over the QemuOpts values
as a flat list. The support for lists is enabled by
allowing the same key to be repeated in the opts string.

The QObjectInputVisitor now has functionality that allows
it to replace OptsVisitor, maintaining full backwards
compatibility for previous CLI syntax, while also allowing
use of new syntax for structs.

Thus -object can now support non-scalar properties,
for example the QMP object:

  {
"execute": "object-add",
"arguments": {
  "qom-type": "demo",
  "id": "demo0",
  "parameters": {
"foo": [
  { "bar": "one", "wizz": "1" },
  { "bar": "two", "wizz": "2" }
]
  }
}
  }

Would be creatable via the HMP now using

   object_add demo,id=demo0,\
  foo.0.bar=one,foo.0.wizz=1,\
  foo.1.bar=two,foo.1.wizz=2

Notice that this syntax is intentionally compatible
with that currently used by block drivers. NB the
indentation seen here after line continuations should
not be used in reality, it is just for clarity in this
commit message.

Signed-off-by: Daniel P. Berrange 
---
 hmp.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/hmp.c b/hmp.c
index ad33b44..da77d4c 100644
--- a/hmp.c
+++ b/hmp.c
@@ -25,7 +25,7 @@
 #include "qemu/sockets.h"
 #include "monitor/monitor.h"
 #include "monitor/qdev.h"
-#include "qapi/opts-visitor.h"
+#include "qapi/qobject-input-visitor.h"
 #include "qapi/qmp/qerror.h"
 #include "qapi/string-output-visitor.h"
 #include "qapi/util.h"
@@ -1695,21 +1695,30 @@ void hmp_netdev_del(Monitor *mon, const QDict *qdict)
 void hmp_object_add(Monitor *mon, const QDict *qdict)
 {
 Error *err = NULL;
-QemuOpts *opts;
 Visitor *v;
 Object *obj = NULL;
+QObject *pobj;
 
-opts = qemu_opts_from_qdict(qemu_find_opts("object"), qdict, );
+pobj = qdict_crumple(qdict, true, );
 if (err) {
-hmp_handle_error(mon, );
-return;
+goto cleanup;
 }
 
-v = opts_visitor_new(opts);
-obj = user_creatable_add(qdict, v, );
+/*
+ * We need autocreate_list=true + permit_int_ranges=true
+ * in order to maintain compat with OptsVisitor creation
+ * of the 'numa' object which had an int16List property.
+ *
+ * We need autocreate_structs=1, because at the CLI level
+ * we have the object type + properties in the same flat
+ * struct, even though at the QMP level they are nested.
+ */
+v = qobject_input_visitor_new_autocast(pobj, true, 1, true);
+obj = user_creatable_add(qobject_to_qdict(pobj), v, );
 visit_free(v);
-qemu_opts_del(opts);
 
+ cleanup:
+qobject_decref(pobj);
 if (err) {
 hmp_handle_error(mon, );
 }
-- 
2.7.4

[Qemu-block] [PATCH v14 12/19] qapi: allow QObjectInputVisitor to be created with QemuOpts

2016-09-27 Thread Daniel P. Berrange

Instead of requiring all callers to go through the mutli-step
process of turning QemuOpts into a suitable QObject for visiting,
add a new constructor that encapsulates this logic. This will
allow QObjectInputVisitor to be a drop-in replacement for the
existing OptsVisitor with minimal code changes for callers.

Signed-off-by: Daniel P. Berrange 
---
 include/qapi/qobject-input-visitor.h | 19 +++
 include/qemu/option.h|  2 +-
 qapi/qobject-input-visitor.c | 29 +
 util/qemu-option.c   |  2 +-
 4 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/include/qapi/qobject-input-visitor.h 
b/include/qapi/qobject-input-visitor.h
index 63f3782..242b767 100644
--- a/include/qapi/qobject-input-visitor.h
+++ b/include/qapi/qobject-input-visitor.h
@@ -102,4 +102,23 @@ Visitor *qobject_input_visitor_new_autocast(QObject *obj,
 size_t autocreate_struct_levels,
 bool permit_int_ranges);
 
+
+/**
+ * Create a new input visitor that converts @opts to a QAPI object.
+ *
+ * The QemuOpts will be converted into a QObject using the
+ * qdict_crumple() method to automatically create structs
+ * and lists. The resulting QDict will then be passed to the
+ * qobject_input_visitor_new_autocast() method. See the docs
+ * of that method for further details on processing behaviour.
+ *
+ * The returned input visitor should be released by calling
+ * visit_free() when no longer required.
+ */
+Visitor *qobject_input_visitor_new_opts(const QemuOpts *opts,
+bool autocreate_list,
+size_t autocreate_struct_levels,
+bool permit_int_ranges,
+Error **errp);
+
 #endif
diff --git a/include/qemu/option.h b/include/qemu/option.h
index 328c468..bf1f078 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -130,7 +130,7 @@ typedef enum {
 QEMU_OPTS_REPEAT_POLICY_LIST,
 } QemuOptsRepeatPolicy;
 
-QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict,
+QDict *qemu_opts_to_qdict(const QemuOpts *opts, QDict *qdict,
   QemuOptsRepeatPolicy repeatPolicy);
 void qemu_opts_absorb_qdict(QemuOpts *opts, QDict *qdict, Error **errp);
 
diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c
index a38e779..0aef20e 100644
--- a/qapi/qobject-input-visitor.c
+++ b/qapi/qobject-input-visitor.c
@@ -747,3 +747,32 @@ Visitor *qobject_input_visitor_new_autocast(QObject *obj,
 
 return >visitor;
 }
+
+
+Visitor *qobject_input_visitor_new_opts(const QemuOpts *opts,
+bool autocreate_list,
+size_t autocreate_struct_levels,
+bool permit_int_ranges,
+Error **errp)
+{
+QDict *pdict;
+QObject *pobj;
+Visitor *v = NULL;
+
+pdict = qemu_opts_to_qdict(opts, NULL,
+   QEMU_OPTS_REPEAT_POLICY_LIST);
+
+pobj = qdict_crumple(pdict, true, errp);
+if (!pobj) {
+goto cleanup;
+}
+
+v = qobject_input_visitor_new_autocast(pobj,
+   autocreate_list,
+   autocreate_struct_levels,
+   permit_int_ranges);
+ cleanup:
+qobject_decref(pobj);
+QDECREF(pdict);
+return v;
+}
diff --git a/util/qemu-option.c b/util/qemu-option.c
index ad28d4e..db0fef2 100644
--- a/util/qemu-option.c
+++ b/util/qemu-option.c
@@ -1058,7 +1058,7 @@ void qemu_opts_absorb_qdict(QemuOpts *opts, QDict *qdict, 
Error **errp)
  * TODO We'll want to use types appropriate for opt->desc->type, but
  * this is enough for now.
  */
-QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict,
+QDict *qemu_opts_to_qdict(const QemuOpts *opts, QDict *qdict,
   QemuOptsRepeatPolicy repeatPolicy)
 {
 QemuOpt *opt;
-- 
2.7.4

[Qemu-block] [PATCH v14 10/19] qapi: permit auto-creating nested structs

2016-09-27 Thread Daniel P. Berrange

Some of the historical command line opts that had their
keys in in a completely flat namespace are now represented
by QAPI schemas that use a nested structs. When converting
the QemuOpts to QObject, there is no information about
compound types available, so the QObject will be completely
flat, even after the qdict_crumple() call. So when starting
a struct, we may not have a QDict available in the input
data, so we auto-create a QDict containing all the currently
unvisited input data keys. Not all historical command line
opts require this, so the behaviour is opt-in, by specifying
how many levels of structs are permitted to be auto-created.

Note that this only works if the child struct is the last
field to the visited in the parent struct. This is always
the case for currently existing legacy command line options.

The example is the NetLegacy type which has 3 levels of
structs. The modern way to represent this in QemuOpts would
be the dot-separated component approach

  -net vlan=1,id=foo,name=bar,opts.type=tap,\
   opts.data.fd=3,opts.data.script=ifup

The legacy syntax will just be presenting

  -net vlan=1,id=foo,name=bar,type=tap,fd=3,script=ifup

So we need to auto-create 3 levels of struct when visiting.

The implementation here will enable visiting in both the
modern and legacy syntax, compared to OptsVisitor which
only allows the legacy syntax.

Signed-off-by: Daniel P. Berrange 
---
 include/qapi/qobject-input-visitor.h |  21 +-
 qapi/qobject-input-visitor.c |  55 +--
 tests/test-qobject-input-visitor.c   | 130 ---
 3 files changed, 190 insertions(+), 16 deletions(-)

diff --git a/include/qapi/qobject-input-visitor.h 
b/include/qapi/qobject-input-visitor.h
index fb52457..90989f4 100644
--- a/include/qapi/qobject-input-visitor.h
+++ b/include/qapi/qobject-input-visitor.h
@@ -45,7 +45,7 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool strict);
  * If @autocreate_list is true, then as an alternative to a normal QList,
  * list values can be stored as a QString or QDict instead, which will
  * be interpreted as representing single element lists. This should only
- * by used if compatibility is required with the OptsVisitor which allowed
+ * be used if compatibility is required with the OptsVisitor which allowed
  * repeated keys, without list indexes, to represent lists. e.g. set this
  * to true if you have compatibility requirements for
  *
@@ -55,6 +55,22 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool 
strict);
  *
  *   -arg foo.0=hello,foo.1=world
  *
+ * If @autocreate_struct_levels is non-zero, then when visiting structs,
+ * if the corresponding QDict is not found, it will automatically create
+ * a QDict containing all remaining unvisited options. This should only
+ * be used if compatibility is required with the OptsVisitor which flatten
+ * structs so that all keys were at the same level. e.g. set this to a
+ * non-zero number if you compatibility requirements for
+ *
+ *   -arg type=person,surname=blogs,forename=fred
+ *
+ * to be treated as equivalent to the perferred syntax
+ *
+ *   -arg type=person,data.surname=blogs,data.forename=fred
+ *
+ * The value given determines how many levels of structs are allowed to
+ * be flattened in this way.
+ *
  * The visitor always operates in strict mode, requiring all dict keys
  * to be consumed during visitation. An error will be reported if this
  * does not happen.
@@ -63,6 +79,7 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool strict);
  * visit_free() when no longer required.
  */
 Visitor *qobject_input_visitor_new_autocast(QObject *obj,
-bool autocreate_list);
+bool autocreate_list,
+size_t autocreate_struct_levels);
 
 #endif
diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c
index e8afd1e..6cacd4b 100644
--- a/qapi/qobject-input-visitor.c
+++ b/qapi/qobject-input-visitor.c
@@ -52,6 +52,14 @@ struct QObjectInputVisitor
 /* Whether we can auto-create single element lists when
  * encountering a non-QList type */
 bool autocreate_list;
+
+/* Current depth of recursion into structs */
+size_t struct_level;
+
+/* Numbers of levels at which we will
+ * consider auto-creating a struct containing
+ * remaining unvisited items */
+size_t autocreate_struct_levels;
 };
 
 static QObjectInputVisitor *to_qiv(Visitor *v)
@@ -79,7 +87,12 @@ static QObject *qobject_input_get_object(QObjectInputVisitor 
*qiv,
 
 if (qobject_type(qobj) == QTYPE_QDICT) {
 assert(name);
-ret = qdict_get(qobject_to_qdict(qobj), name);
+if (qiv->autocreate_struct_levels &&
+!g_hash_table_contains(tos->h, name)) {
+ret = NULL;
+} else {
+ret = qdict_get(qobject_to_qdict(qobj), name);
+}
 if (tos->h

[Qemu-block] [PATCH v14 08/19] qapi: permit scalar type conversions in QObjectInputVisitor

2016-09-27 Thread Daniel P. Berrange

Currently the QObjectInputVisitor assumes that all scalar
values are directly represented as the final types declared
by the thing being visited. ie it assumes an 'int' is using
QInt, and a 'bool' is using QBool, etc.  This is good when
QObjectInputVisitor is fed a QObject that came from a JSON
document on the QMP monitor, as it will strictly validate
correctness.

To allow QObjectInputVisitor to be reused for visiting
a QObject originating from QemuOpts, an alternative mode
is needed where all the scalars types are represented as
QString and converted on the fly to the final desired
type.

Reviewed-by: Kevin Wolf 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Daniel P. Berrange 
---
 include/qapi/qobject-input-visitor.h |  32 +-
 qapi/qobject-input-visitor.c | 132 
 tests/test-qobject-input-visitor.c   | 194 ++-
 3 files changed, 350 insertions(+), 8 deletions(-)

diff --git a/include/qapi/qobject-input-visitor.h 
b/include/qapi/qobject-input-visitor.h
index cde328d..5022297 100644
--- a/include/qapi/qobject-input-visitor.h
+++ b/include/qapi/qobject-input-visitor.h
@@ -19,12 +19,36 @@
 
 typedef struct QObjectInputVisitor QObjectInputVisitor;
 
-/*
- * Return a new input visitor that converts a QObject to a QAPI object.
+/**
+ * Create a new input visitor that converts @obj to a QAPI object.
+ *
+ * Any scalar values in the @obj input data structure should be in the
+ * required type already. i.e. if visiting a bool, the value should
+ * already be a QBool instance.
  *
- * Set @strict to reject a parse that doesn't consume all keys of a
- * dictionary; otherwise excess input is ignored.
+ * If @strict is set to true, then an error will be reported if any
+ * dict keys are not consumed during visitation. If @strict is false
+ * then extra dict keys are silently ignored.
+ *
+ * The returned input visitor should be released by calling
+ * visit_free() when no longer required.
  */
 Visitor *qobject_input_visitor_new(QObject *obj, bool strict);
 
+/**
+ * Create a new input visitor that converts @obj to a QAPI object.
+ *
+ * Any scalar values in the @obj input data structure should always be
+ * represented as strings. i.e. if visiting a boolean, the value should
+ * be a QString whose contents represent a valid boolean.
+ *
+ * The visitor always operates in strict mode, requiring all dict keys
+ * to be consumed during visitation. An error will be reported if this
+ * does not happen.
+ *
+ * The returned input visitor should be released by calling
+ * visit_free() when no longer required.
+ */
+Visitor *qobject_input_visitor_new_autocast(QObject *obj);
+
 #endif
diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c
index 5ff3db3..cf41df6 100644
--- a/qapi/qobject-input-visitor.c
+++ b/qapi/qobject-input-visitor.c
@@ -20,6 +20,7 @@
 #include "qemu-common.h"
 #include "qapi/qmp/types.h"
 #include "qapi/qmp/qerror.h"
+#include "qemu/cutils.h"
 
 #define QIV_STACK_SIZE 1024
 
@@ -263,6 +264,28 @@ static void qobject_input_type_int64(Visitor *v, const 
char *name, int64_t *obj,
 *obj = qint_get_int(qint);
 }
 
+
+static void qobject_input_type_int64_autocast(Visitor *v, const char *name,
+  int64_t *obj, Error **errp)
+{
+QObjectInputVisitor *qiv = to_qiv(v);
+QString *qstr = qobject_to_qstring(qobject_input_get_object(qiv, name,
+true));
+int64_t ret;
+
+if (!qstr || !qstr->string) {
+error_setg(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
+   "string");
+return;
+}
+
+if (qemu_strtoll(qstr->string, NULL, 0, ) < 0) {
+error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name, "a number");
+return;
+}
+*obj = ret;
+}
+
 static void qobject_input_type_uint64(Visitor *v, const char *name,
   uint64_t *obj, Error **errp)
 {
@@ -279,6 +302,27 @@ static void qobject_input_type_uint64(Visitor *v, const 
char *name,
 *obj = qint_get_int(qint);
 }
 
+static void qobject_input_type_uint64_autocast(Visitor *v, const char *name,
+   uint64_t *obj, Error **errp)
+{
+QObjectInputVisitor *qiv = to_qiv(v);
+QString *qstr = qobject_to_qstring(qobject_input_get_object(qiv, name,
+true));
+unsigned long long ret;
+
+if (!qstr || !qstr->string) {
+error_setg(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
+   "string");
+return;
+}
+
+if (parse_uint_full(qstr->string, , 0) < 0) {
+error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name, "a number");
+return;
+}
+*obj = ret;
+}
+
 static void qobject_input_type_bool(Visitor *v, const char *name, bool

[Qemu-block] [PATCH v14 07/19] qapi: don't pass two copies of TestInputVisitorData to tests

2016-09-27 Thread Daniel P. Berrange

The input_visitor_test_add() method was accepting an instance
of 'TestInputVisitorData' and passing it as the 'user_data'
parameter to test functions. The main 'TestInputVisitorData'
instance that was actually used, was meanwhile being allocated
automatically by the test framework fixture setup.

Signed-off-by: Daniel P. Berrange 
---
 tests/test-qobject-input-visitor.c | 76 --
 1 file changed, 32 insertions(+), 44 deletions(-)

diff --git a/tests/test-qobject-input-visitor.c 
b/tests/test-qobject-input-visitor.c
index 0e65e63..26c5012 100644
--- a/tests/test-qobject-input-visitor.c
+++ b/tests/test-qobject-input-visitor.c
@@ -747,10 +747,11 @@ static void 
test_visitor_in_native_list_number(TestInputVisitorData *data,
 }
 
 static void input_visitor_test_add(const char *testpath,
-   TestInputVisitorData *data,
-   void (*test_func)(TestInputVisitorData 
*data, const void *user_data))
+   const void *user_data,
+   void (*test_func)(TestInputVisitorData 
*data,
+ const void *user_data))
 {
-g_test_add(testpath, TestInputVisitorData, data, NULL, test_func,
+g_test_add(testpath, TestInputVisitorData, user_data, NULL, test_func,
visitor_input_teardown);
 }
 
@@ -833,77 +834,64 @@ static void 
test_visitor_in_wrong_type(TestInputVisitorData *data,
 
 int main(int argc, char **argv)
 {
-TestInputVisitorData in_visitor_data;
-
 g_test_init(, , NULL);
 
 input_visitor_test_add("/visitor/input/int",
-   _visitor_data, test_visitor_in_int);
+   NULL, test_visitor_in_int);
 input_visitor_test_add("/visitor/input/int_overflow",
-   _visitor_data, test_visitor_in_int_overflow);
+   NULL, test_visitor_in_int_overflow);
 input_visitor_test_add("/visitor/input/bool",
-   _visitor_data, test_visitor_in_bool);
+   NULL, test_visitor_in_bool);
 input_visitor_test_add("/visitor/input/number",
-   _visitor_data, test_visitor_in_number);
+   NULL, test_visitor_in_number);
 input_visitor_test_add("/visitor/input/string",
-   _visitor_data, test_visitor_in_string);
+   NULL, test_visitor_in_string);
 input_visitor_test_add("/visitor/input/enum",
-   _visitor_data, test_visitor_in_enum);
+   NULL, test_visitor_in_enum);
 input_visitor_test_add("/visitor/input/struct",
-   _visitor_data, test_visitor_in_struct);
+   NULL, test_visitor_in_struct);
 input_visitor_test_add("/visitor/input/struct-nested",
-   _visitor_data, test_visitor_in_struct_nested);
+   NULL, test_visitor_in_struct_nested);
 input_visitor_test_add("/visitor/input/list",
-   _visitor_data, test_visitor_in_list);
+   NULL, test_visitor_in_list);
 input_visitor_test_add("/visitor/input/any",
-   _visitor_data, test_visitor_in_any);
+   NULL, test_visitor_in_any);
 input_visitor_test_add("/visitor/input/null",
-   _visitor_data, test_visitor_in_null);
+   NULL, test_visitor_in_null);
 input_visitor_test_add("/visitor/input/union-flat",
-   _visitor_data, test_visitor_in_union_flat);
+   NULL, test_visitor_in_union_flat);
 input_visitor_test_add("/visitor/input/alternate",
-   _visitor_data, test_visitor_in_alternate);
+   NULL, test_visitor_in_alternate);
 input_visitor_test_add("/visitor/input/errors",
-   _visitor_data, test_visitor_in_errors);
+   NULL, test_visitor_in_errors);
 input_visitor_test_add("/visitor/input/wrong-type",
-   _visitor_data, test_visitor_in_wrong_type);
+   NULL, test_visitor_in_wrong_type);
 input_visitor_test_add("/visitor/input/alternate-number",
-   _visitor_data, test_visitor_in_alternate_number);
+   NULL, test_visitor_in_alternate_number);
 input_visitor_test_add("/visitor/input/native_list/int",
-   _visitor_data,
-   test_visitor_in_native_list_int);
+   NULL, test_visitor_in_native_list_int);
 input_visitor_test_add("/visitor/input/native_list/int8",
-   _visitor_data,
-   test_visitor_in_native_list_int8);
+

[Qemu-block] [PATCH v14 04/19] qapi: add trace events for visitor

2016-09-27 Thread Daniel P. Berrange

Allow tracing of the operation of visitors

Signed-off-by: Daniel P. Berrange 
---
 Makefile.objs  |  1 +
 qapi/qapi-visit-core.c | 27 +++
 qapi/trace-events  | 33 +
 3 files changed, 61 insertions(+)
 create mode 100644 qapi/trace-events

diff --git a/Makefile.objs b/Makefile.objs
index 7301544..54da068 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -160,3 +160,4 @@ trace-events-y += target-s390x/trace-events
 trace-events-y += target-ppc/trace-events
 trace-events-y += qom/trace-events
 trace-events-y += linux-user/trace-events
+trace-events-y += qapi/trace-events
diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 55f5876..bfcb276 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -19,10 +19,12 @@
 #include "qapi/qmp/qerror.h"
 #include "qapi/visitor.h"
 #include "qapi/visitor-impl.h"
+#include "trace.h"
 
 void visit_complete(Visitor *v, void *opaque)
 {
 assert(v->type != VISITOR_OUTPUT || v->complete);
+trace_visit_complete(v, opaque);
 if (v->complete) {
 v->complete(v, opaque);
 }
@@ -30,6 +32,7 @@ void visit_complete(Visitor *v, void *opaque)
 
 void visit_free(Visitor *v)
 {
+trace_visit_free(v);
 if (v) {
 v->free(v);
 }
@@ -40,6 +43,7 @@ void visit_start_struct(Visitor *v, const char *name, void 
**obj,
 {
 Error *err = NULL;
 
+trace_visit_start_struct(v, name, obj, size);
 if (obj) {
 assert(size);
 assert(!(v->type & VISITOR_OUTPUT) || *obj);
@@ -53,6 +57,7 @@ void visit_start_struct(Visitor *v, const char *name, void 
**obj,
 
 void visit_check_struct(Visitor *v, Error **errp)
 {
+trace_visit_check_struct(v);
 if (v->check_struct) {
 v->check_struct(v, errp);
 }
@@ -60,6 +65,7 @@ void visit_check_struct(Visitor *v, Error **errp)
 
 void visit_end_struct(Visitor *v, void **obj)
 {
+trace_visit_end_struct(v, obj);
 v->end_struct(v, obj);
 }
 
@@ -69,6 +75,7 @@ void visit_start_list(Visitor *v, const char *name, 
GenericList **list,
 Error *err = NULL;
 
 assert(!list || size >= sizeof(GenericList));
+trace_visit_start_list(v, name, list, size);
 v->start_list(v, name, list, size, );
 if (list && (v->type & VISITOR_INPUT)) {
 assert(!(err && *list));
@@ -79,11 +86,13 @@ void visit_start_list(Visitor *v, const char *name, 
GenericList **list,
 GenericList *visit_next_list(Visitor *v, GenericList *tail, size_t size)
 {
 assert(tail && size >= sizeof(GenericList));
+trace_visit_next_list(v, tail, size);
 return v->next_list(v, tail, size);
 }
 
 void visit_end_list(Visitor *v, void **obj)
 {
+trace_visit_end_list(v, obj);
 v->end_list(v, obj);
 }
 
@@ -95,6 +104,7 @@ void visit_start_alternate(Visitor *v, const char *name,
 
 assert(obj && size >= sizeof(GenericAlternate));
 assert(!(v->type & VISITOR_OUTPUT) || *obj);
+trace_visit_start_alternate(v, name, obj, size, promote_int);
 if (v->start_alternate) {
 v->start_alternate(v, name, obj, size, promote_int, );
 }
@@ -106,6 +116,7 @@ void visit_start_alternate(Visitor *v, const char *name,
 
 void visit_end_alternate(Visitor *v, void **obj)
 {
+trace_visit_end_alternate(v, obj);
 if (v->end_alternate) {
 v->end_alternate(v, obj);
 }
@@ -113,6 +124,7 @@ void visit_end_alternate(Visitor *v, void **obj)
 
 bool visit_optional(Visitor *v, const char *name, bool *present)
 {
+trace_visit_optional(v, name, present);
 if (v->optional) {
 v->optional(v, name, present);
 }
@@ -127,6 +139,7 @@ bool visit_is_input(Visitor *v)
 void visit_type_int(Visitor *v, const char *name, int64_t *obj, Error **errp)
 {
 assert(obj);
+trace_visit_type_int(v, name, obj);
 v->type_int64(v, name, obj, errp);
 }
 
@@ -151,6 +164,7 @@ void visit_type_uint8(Visitor *v, const char *name, uint8_t 
*obj,
   Error **errp)
 {
 uint64_t value = *obj;
+trace_visit_type_uint8(v, name, obj);
 visit_type_uintN(v, , name, UINT8_MAX, "uint8_t", errp);
 *obj = value;
 }
@@ -159,6 +173,7 @@ void visit_type_uint16(Visitor *v, const char *name, 
uint16_t *obj,
Error **errp)
 {
 uint64_t value = *obj;
+trace_visit_type_uint16(v, name, obj);
 visit_type_uintN(v, , name, UINT16_MAX, "uint16_t", errp);
 *obj = value;
 }
@@ -167,6 +182,7 @@ void visit_type_uint32(Visitor *v, const char *name, 
uint32_t *obj,
Error **errp)
 {
 uint64_t value = *obj;
+trace_visit_type_uint32(v, name, obj);
 visit_type_uintN(v, , name, UINT32_MAX, "uint32_t", errp);
 *obj = value;
 }
@@ -175,6 +191,7 @@ void visit_type_uint64(Visitor *v, const char *name, 
uint64_t *obj,
Error **errp)
 {
 assert(obj);
+trace_visit_type_uint64(v, name, obj);
 v->type_uint64(v, name, obj, errp);
 }
 
@@ -199,6 +216,7 @@ static void

[Qemu-block] [PATCH v14 11/19] qapi: add integer range support for QObjectInputVisitor

2016-09-27 Thread Daniel P. Berrange

The traditional CLI arg syntax allows two ways to specify
integer lists, either one value per key, or a range of
values per key. eg the following are identical:

  -arg foo=5,foo=6,foo=7
  -arg foo=5-7

This extends the QObjectInputVisitor so that it is able
to parse ranges and turn them into distinct list entries.

This means that

  -arg foo=5-7

is treated as equivalent to

  -arg foo.0=5,foo.1=6,foo.2=7

Edge case tests are copied from test-opts-visitor to
ensure identical behaviour when parsing.

Signed-off-by: Daniel P. Berrange 
---
 include/qapi/qobject-input-visitor.h |  22 +++-
 qapi/qobject-input-visitor.c | 154 +--
 tests/test-qobject-input-visitor.c   | 195 +--
 3 files changed, 356 insertions(+), 15 deletions(-)

diff --git a/include/qapi/qobject-input-visitor.h 
b/include/qapi/qobject-input-visitor.h
index 90989f4..63f3782 100644
--- a/include/qapi/qobject-input-visitor.h
+++ b/include/qapi/qobject-input-visitor.h
@@ -19,6 +19,12 @@
 
 typedef struct QObjectInputVisitor QObjectInputVisitor;
 
+/* Inclusive upper bound on the size of any flattened range. This is a safety
+ * (= anti-annoyance) measure; wrong ranges should not cause long startup
+ * delays nor exhaust virtual memory.
+ */
+#define QIV_RANGE_MAX 65536
+
 /**
  * Create a new input visitor that converts @obj to a QAPI object.
  *
@@ -71,6 +77,19 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool 
strict);
  * The value given determines how many levels of structs are allowed to
  * be flattened in this way.
  *
+ * If @permit_int_ranges is true, then when visiting a list of integers,
+ * the integer value strings may encode ranges eg a single element
+ * containing "5-7" is treated as if there were three elements "5", "6",
+ * "7". This should only be used if compatibility is required with the
+ * OptsVisitor which would allow integer ranges. e.g. set this to true
+ * if you have compatibility requirements for
+ *
+ *   -arg val=5-8
+ *
+ * to be treated as equivalent to the preferred syntax:
+ *
+ *   -arg val.0=5,val.1=6,val.2=7,val.3=8
+ *
  * The visitor always operates in strict mode, requiring all dict keys
  * to be consumed during visitation. An error will be reported if this
  * does not happen.
@@ -80,6 +99,7 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool strict);
  */
 Visitor *qobject_input_visitor_new_autocast(QObject *obj,
 bool autocreate_list,
-size_t autocreate_struct_levels);
+size_t autocreate_struct_levels,
+bool permit_int_ranges);
 
 #endif
diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c
index 6cacd4b..a38e779 100644
--- a/qapi/qobject-input-visitor.c
+++ b/qapi/qobject-input-visitor.c
@@ -31,6 +31,8 @@ typedef struct StackObject
 
 GHashTable *h;   /* If obj is dict: unvisited keys */
 const QListEntry *entry; /* If obj is list: unvisited tail */
+uint64_t range_val;
+uint64_t range_limit;
 
 QSLIST_ENTRY(StackObject) node;
 } StackObject;
@@ -60,6 +62,10 @@ struct QObjectInputVisitor
  * consider auto-creating a struct containing
  * remaining unvisited items */
 size_t autocreate_struct_levels;
+
+/* Whether int lists can have single values representing
+ * ranges of values */
+bool permit_int_ranges;
 };
 
 static QObjectInputVisitor *to_qiv(Visitor *v)
@@ -282,7 +288,7 @@ static GenericList *qobject_input_next_list(Visitor *v, 
GenericList *tail,
 QObjectInputVisitor *qiv = to_qiv(v);
 StackObject *so = QSLIST_FIRST(>stack);
 
-if (!so->entry) {
+if ((so->range_val == so->range_limit) && !so->entry) {
 return NULL;
 }
 tail->next = g_malloc0(size);
@@ -329,21 +335,87 @@ static void qobject_input_type_int64_autocast(Visitor *v, 
const char *name,
   int64_t *obj, Error **errp)
 {
 QObjectInputVisitor *qiv = to_qiv(v);
-QString *qstr = qobject_to_qstring(qobject_input_get_object(qiv, name,
-true));
+QString *qstr;
 int64_t ret;
+const char *end = NULL;
+StackObject *tos;
+bool inlist = false;
+
+/* Preferentially generate values from a range, before
+ * trying to consume another QList element */
+tos = QSLIST_FIRST(>stack);
+if (tos) {
+if ((int64_t)tos->range_val < (int64_t)tos->range_limit) {
+*obj = tos->range_val + 1;
+tos->range_val++;
+return;
+} else {
+inlist = tos->entry != NULL;
+}
+}
 
+qstr = qobject_to_qstring(qobject_input_get_object(qiv, name,
+   true));
 if (!qstr || !qstr->string) {
 error_setg(errp,

[Qemu-block] [PATCH v14 06/19] qapi: rename QmpOutputVisitor to QObjectOutputVisitor

2016-09-27 Thread Daniel P. Berrange

The QmpOutputVisitor has no direct dependency on QMP. It is
valid to use it anywhere that one wants a QObject. Rename it
to better reflect its functionality as a generic QAPI
to QObject converter.

Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrange 
---
 block/qapi.c   |   4 +-
 blockdev.c |   4 +-
 docs/qapi-code-gen.txt |   2 +-
 ...p-output-visitor.h => qobject-output-visitor.h} |  10 +-
 qapi/Makefile.objs |   2 +-
 qapi/qapi-clone-visitor.c  |   2 +-
 qapi/qmp-output-visitor.c  | 256 -
 qapi/qobject-output-visitor.c  | 254 
 qemu-img.c |   8 +-
 qom/object_interfaces.c|   2 +-
 qom/qom-qobject.c  |   4 +-
 scripts/qapi-commands.py   |   4 +-
 scripts/qapi-event.py  |   4 +-
 tests/.gitignore   |   2 +-
 tests/Makefile.include |   8 +-
 tests/check-qnull.c|   4 +-
 ...put-visitor.c => test-qobject-output-visitor.c} |   6 +-
 tests/test-string-output-visitor.c |   2 +-
 tests/test-visitor-serialization.c |   4 +-
 util/qemu-sockets.c|   2 +-
 20 files changed, 291 insertions(+), 293 deletions(-)
 rename include/qapi/{qmp-output-visitor.h => qobject-output-visitor.h} (66%)
 delete mode 100644 qapi/qmp-output-visitor.c
 create mode 100644 qapi/qobject-output-visitor.c
 rename tests/{test-qmp-output-visitor.c => test-qobject-output-visitor.c} (99%)

diff --git a/block/qapi.c b/block/qapi.c
index 6f947e3..f35c89f 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -29,7 +29,7 @@
 #include "block/write-threshold.h"
 #include "qmp-commands.h"
 #include "qapi-visit.h"
-#include "qapi/qmp-output-visitor.h"
+#include "qapi/qobject-output-visitor.h"
 #include "qapi/qmp/types.h"
 #include "sysemu/block-backend.h"
 #include "qemu/cutils.h"
@@ -691,7 +691,7 @@ void bdrv_image_info_specific_dump(fprintf_function 
func_fprintf, void *f,
ImageInfoSpecific *info_spec)
 {
 QObject *obj, *data;
-Visitor *v = qmp_output_visitor_new();
+Visitor *v = qobject_output_visitor_new();
 
 visit_type_ImageInfoSpecific(v, NULL, _spec, _abort);
 visit_complete(v, );
diff --git a/blockdev.c b/blockdev.c
index 5ef3193..3f5d528 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -43,7 +43,7 @@
 #include "qapi/qmp/types.h"
 #include "qapi-visit.h"
 #include "qapi/qmp/qerror.h"
-#include "qapi/qmp-output-visitor.h"
+#include "qapi/qobject-output-visitor.h"
 #include "qapi/util.h"
 #include "sysemu/sysemu.h"
 #include "block/block_int.h"
@@ -3784,7 +3784,7 @@ void qmp_blockdev_add(BlockdevOptions *options, Error 
**errp)
 BlockDriverState *bs;
 BlockBackend *blk = NULL;
 QObject *obj;
-Visitor *v = qmp_output_visitor_new();
+Visitor *v = qobject_output_visitor_new();
 QDict *qdict;
 Error *local_err = NULL;
 
diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
index d2604b6..2841c51 100644
--- a/docs/qapi-code-gen.txt
+++ b/docs/qapi-code-gen.txt
@@ -1005,7 +1005,7 @@ Example:
 Error *err = NULL;
 Visitor *v;
 
-v = qmp_output_visitor_new(ret_out);
+v = qobject_output_visitor_new(ret_out);
 visit_type_UserDefOne(v, "unused", _in, );
 if (!err) {
 visit_complete(v, ret_out);
diff --git a/include/qapi/qmp-output-visitor.h 
b/include/qapi/qobject-output-visitor.h
similarity index 66%
rename from include/qapi/qmp-output-visitor.h
rename to include/qapi/qobject-output-visitor.h
index 040fdda..358c959 100644
--- a/include/qapi/qmp-output-visitor.h
+++ b/include/qapi/qobject-output-visitor.h
@@ -11,20 +11,20 @@
  *
  */
 
-#ifndef QMP_OUTPUT_VISITOR_H
-#define QMP_OUTPUT_VISITOR_H
+#ifndef QOBJECT_OUTPUT_VISITOR_H
+#define QOBJECT_OUTPUT_VISITOR_H
 
 #include "qapi/visitor.h"
 #include "qapi/qmp/qobject.h"
 
-typedef struct QmpOutputVisitor QmpOutputVisitor;
+typedef struct QObjectOutputVisitor QObjectOutputVisitor;
 
 /*
- * Create a new QMP output visitor.
+ * Create a new QOBJECT output visitor.
  *
  * If everything else succeeds, pass @result to visit_complete() to
  * collect the result of the visit.
  */
-Visitor *qmp_output_visitor_new(QObject **result);
+Visitor *qobject_output_visitor_new(QObject **result);
 
 #endif
diff --git a/qapi/Makefile.objs b/qapi/Makefile.objs
index 6ec7bdc..33906ff 100644
--- a/qapi/Makefile.objs
+++ b/qapi/Makefile.objs
@@ -1,5 +1,5 @@
 util-obj-y = qapi-visit-core.o qapi-dealloc-visitor.o qobject-input-visitor.o
-util-obj-y +=

[Qemu-block] [PATCH v14 05/19] qapi: rename QmpInputVisitor to QObjectInputVisitor

2016-09-27 Thread Daniel P. Berrange

The QmpInputVisitor has no direct dependency on QMP. It is
valid to use it anywhere that one has a QObject. Rename it
to better reflect its functionality as a generic QObject
to QAPI converter.

Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrange 
---
 docs/qapi-code-gen.txt |   2 +-
 ...qmp-input-visitor.h => qobject-input-visitor.h} |  10 +-
 include/qapi/visitor.h |   6 +-
 monitor.c  |   2 +-
 qapi/Makefile.objs |   2 +-
 ...qmp-input-visitor.c => qobject-input-visitor.c} | 171 +++--
 qmp.c  |   4 +-
 qom/qom-qobject.c  |   4 +-
 scripts/qapi-commands.py   |   4 +-
 target-s390x/cpu_models.c  |   4 +-
 tests/.gitignore   |   4 +-
 tests/Makefile.include |  12 +-
 tests/check-qnull.c|   4 +-
 tests/test-qmp-commands.c  |   4 +-
 ...-input-strict.c => test-qobject-input-strict.c} |   6 +-
 ...nput-visitor.c => test-qobject-input-visitor.c} |   6 +-
 tests/test-string-input-visitor.c  |   2 +-
 tests/test-visitor-serialization.c |   4 +-
 util/qemu-sockets.c|   2 +-
 19 files changed, 128 insertions(+), 125 deletions(-)
 rename include/qapi/{qmp-input-visitor.h => qobject-input-visitor.h} (63%)
 rename qapi/{qmp-input-visitor.c => qobject-input-visitor.c} (56%)
 rename tests/{test-qmp-input-strict.c => test-qobject-input-strict.c} (98%)
 rename tests/{test-qmp-input-visitor.c => test-qobject-input-visitor.c} (99%)

diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
index 5d4c2cd..d2604b6 100644
--- a/docs/qapi-code-gen.txt
+++ b/docs/qapi-code-gen.txt
@@ -1024,7 +1024,7 @@ Example:
 Visitor *v;
 UserDefOneList *arg1 = NULL;
 
-v = qmp_input_visitor_new(QOBJECT(args), true);
+v = qobject_input_visitor_new(QOBJECT(args), true);
 visit_start_struct(v, NULL, NULL, 0, );
 if (err) {
 goto out;
diff --git a/include/qapi/qmp-input-visitor.h 
b/include/qapi/qobject-input-visitor.h
similarity index 63%
rename from include/qapi/qmp-input-visitor.h
rename to include/qapi/qobject-input-visitor.h
index f3ff5f3..cde328d 100644
--- a/include/qapi/qmp-input-visitor.h
+++ b/include/qapi/qobject-input-visitor.h
@@ -11,20 +11,20 @@
  *
  */
 
-#ifndef QMP_INPUT_VISITOR_H
-#define QMP_INPUT_VISITOR_H
+#ifndef QOBJECT_INPUT_VISITOR_H
+#define QOBJECT_INPUT_VISITOR_H
 
 #include "qapi/visitor.h"
 #include "qapi/qmp/qobject.h"
 
-typedef struct QmpInputVisitor QmpInputVisitor;
+typedef struct QObjectInputVisitor QObjectInputVisitor;
 
 /*
- * Return a new input visitor that converts QMP to QAPI.
+ * Return a new input visitor that converts a QObject to a QAPI object.
  *
  * Set @strict to reject a parse that doesn't consume all keys of a
  * dictionary; otherwise excess input is ignored.
  */
-Visitor *qmp_input_visitor_new(QObject *obj, bool strict);
+Visitor *qobject_input_visitor_new(QObject *obj, bool strict);
 
 #endif
diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 6c77a91..9bb6cba 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -25,14 +25,14 @@
  * for doing work at each node of a QAPI graph; it can also be used
  * for a virtual walk, where there is no actual QAPI C struct.
  *
- * There are four kinds of visitor classes: input visitors (QMP,
+ * There are four kinds of visitor classes: input visitors (QObject,
  * string, and QemuOpts) parse an external representation and build
- * the corresponding QAPI graph, output visitors (QMP and string) take
+ * the corresponding QAPI graph, output visitors (QObject and string) take
  * a completed QAPI graph and generate an external representation, the
  * dealloc visitor can take a QAPI graph (possibly partially
  * constructed) and recursively free its resources, and the clone
  * visitor performs a deep clone of one QAPI object to another.  While
- * the dealloc and QMP input/output visitors are general, the string,
+ * the dealloc and QObject input/output visitors are general, the string,
  * QemuOpts, and clone visitors have some implementation limitations;
  * see the documentation for each visitor for more details on what it
  * supports.  Also, see visitor-impl.h for the callback contracts
diff --git a/monitor.c b/monitor.c
index 7dcd66b..792b995 100644
--- a/monitor.c
+++ b/monitor.c
@@ -957,7 +957,7 @@ EventInfoList *qmp_query_events(Error **errp)
  * directly into QObject instead of first parsing it with
  * visit_type_SchemaInfoList() into a SchemaInfoList, then marshal it
  * to QObject with generated output marshallers, every

[Qemu-block] [PATCH v14 01/19] qdict: implement a qdict_crumple method for un-flattening a dict

2016-09-27 Thread Daniel P. Berrange

The qdict_flatten() method will take a dict whose elements are
further nested dicts/lists and flatten them by concatenating
keys.

The qdict_crumple() method aims to do the reverse, taking a flat
qdict, and turning it into a set of nested dicts/lists. It will
apply nesting based on the key name, with a '.' indicating a
new level in the hierarchy. If the keys in the nested structure
are all numeric, it will create a list, otherwise it will create
a dict.

If the keys are a mixture of numeric and non-numeric, or the
numeric keys are not in strictly ascending order, an error will
be reported.

As an example, a flat dict containing

 {
   'foo.0.bar': 'one',
   'foo.0.wizz': '1',
   'foo.1.bar': 'two',
   'foo.1.wizz': '2'
 }

will get turned into a dict with one element 'foo' whose
value is a list. The list elements will each in turn be
dicts.

 {
   'foo': [
 { 'bar': 'one', 'wizz': '1' },
 { 'bar': 'two', 'wizz': '2' }
   ],
 }

If the key is intended to contain a literal '.', then it must
be escaped as '..'. ie a flat dict

  {
 'foo..bar': 'wizz',
 'bar.foo..bar': 'eek',
 'bar.hello': 'world'
  }

Will end up as

  {
 'foo.bar': 'wizz',
 'bar': {
'foo.bar': 'eek',
'hello': 'world'
 }
  }

The intent of this function is that it allows a set of QemuOpts
to be turned into a nested data structure that mirrors the nesting
used when the same object is defined over QMP.

Reviewed-by: Kevin Wolf 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Daniel P. Berrange 
---
 include/qapi/qmp/qdict.h |   1 +
 qobject/qdict.c  | 291 +++
 tests/check-qdict.c  | 260 ++
 3 files changed, 552 insertions(+)

diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
index 71b8eb0..e0d24e1 100644
--- a/include/qapi/qmp/qdict.h
+++ b/include/qapi/qmp/qdict.h
@@ -73,6 +73,7 @@ void qdict_flatten(QDict *qdict);
 void qdict_extract_subqdict(QDict *src, QDict **dst, const char *start);
 void qdict_array_split(QDict *src, QList **dst);
 int qdict_array_entries(QDict *src, const char *subqdict);
+QObject *qdict_crumple(const QDict *src, bool recursive, Error **errp);
 
 void qdict_join(QDict *dest, QDict *src, bool overwrite);
 
diff --git a/qobject/qdict.c b/qobject/qdict.c
index 60f158c..beef273 100644
--- a/qobject/qdict.c
+++ b/qobject/qdict.c
@@ -17,6 +17,7 @@
 #include "qapi/qmp/qbool.h"
 #include "qapi/qmp/qstring.h"
 #include "qapi/qmp/qobject.h"
+#include "qapi/error.h"
 #include "qemu/queue.h"
 #include "qemu-common.h"
 #include "qemu/cutils.h"
@@ -683,6 +684,296 @@ void qdict_array_split(QDict *src, QList **dst)
 }
 }
 
+
+/**
+ * qdict_split_flat_key:
+ * @key: the key string to split
+ * @prefix: non-NULL pointer to hold extracted prefix
+ * @suffix: non-NULL pointer to remaining suffix
+ *
+ * Given a flattened key such as 'foo.0.bar', split it into two parts
+ * at the first '.' separator. Allows double dot ('..') to escape the
+ * normal separator.
+ *
+ * eg
+ *'foo.0.bar' -> prefix='foo' and suffix='0.bar'
+ *'foo..0.bar' -> prefix='foo.0' and suffix='bar'
+ *
+ * The '..' sequence will be unescaped in the returned 'prefix'
+ * string. The 'suffix' string will be left in escaped format, so it
+ * can be fed back into the qdict_split_flat_key() key as the input
+ * later.
+ *
+ * The caller is responsible for freeing the string returned in @prefix
+ * using g_free().
+ */
+static void qdict_split_flat_key(const char *key, char **prefix,
+ const char **suffix)
+{
+const char *separator;
+size_t i, j;
+
+/* Find first '.' separator, but if there is a pair '..'
+ * that acts as an escape, so skip over '..' */
+separator = NULL;
+do {
+if (separator) {
+separator += 2;
+} else {
+separator = key;
+}
+separator = strchr(separator, '.');
+} while (separator && separator[1] == '.');
+
+if (separator) {
+*prefix = g_strndup(key, separator - key);
+*suffix = separator + 1;
+} else {
+*prefix = g_strdup(key);
+*suffix = NULL;
+}
+
+/* Unescape the '..' sequence into '.' */
+for (i = 0, j = 0; (*prefix)[i] != '\0'; i++, j++) {
+if ((*prefix)[i] == '.') {
+assert((*prefix)[i + 1] == '.');
+i++;
+}
+(*prefix)[j] = (*prefix)[i];
+}
+(*prefix)[j] = '\0';
+}
+
+
+/**
+ * qdict_is_list:
+ * @maybe_list: dict to check if keys represent list elements.
+ *
+ * Determine whether all keys in @maybe_list are valid list elements.
+ * If @maybe_list is non-zero in length and all the keys look like
+ * valid list indexes, this will return 1. If @maybe_list is zero
+ * length or all keys are non-numeric then it will return 0 to indicate
+ * it is a normal qdict. If there is a mix of numeric and

[Qemu-block] [PATCH v14 02/19] option: make parse_option_bool/number non-static

2016-09-27 Thread Daniel P. Berrange

The opts-visitor.c opts_type_bool() method has code for
parsing a string to set a bool value, as does the
qemu-option.c parse_option_bool() method, except it
handles fewer cases.

To enable consistency across the codebase, extend
parse_option_bool() to handle "yes", "no", "y" and
"n", and make it non-static. Convert the opts
visitor to call this method directly.

Also make parse_option_number() non-static to allow
for similar reuse later.

Reviewed-by: Kevin Wolf 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Eric Blake 
Reviewed-by: Markus Armbruster 
Signed-off-by: Daniel P. Berrange 
---
 include/qemu/option.h |  4 
 qapi/opts-visitor.c   | 19 +--
 tests/qemu-iotests/051.out|  6 +++---
 tests/qemu-iotests/051.pc.out |  6 +++---
 tests/qemu-iotests/137.out|  4 ++--
 util/qemu-option.c| 27 ---
 6 files changed, 29 insertions(+), 37 deletions(-)

diff --git a/include/qemu/option.h b/include/qemu/option.h
index 1f9e3f9..2a5266f 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -37,6 +37,10 @@ int get_param_value(char *buf, int buf_size,
 const char *tag, const char *str);
 
 
+void parse_option_bool(const char *name, const char *value, bool *ret,
+   Error **errp);
+void parse_option_number(const char *name, const char *value,
+ uint64_t *ret, Error **errp);
 void parse_option_size(const char *name, const char *value,
uint64_t *ret, Error **errp);
 bool has_help_option(const char *param);
diff --git a/qapi/opts-visitor.c b/qapi/opts-visitor.c
index 1048bbc..084f7cc 100644
--- a/qapi/opts-visitor.c
+++ b/qapi/opts-visitor.c
@@ -334,7 +334,6 @@ opts_type_str(Visitor *v, const char *name, char **obj, 
Error **errp)
 }
 
 
-/* mimics qemu-option.c::parse_option_bool() */
 static void
 opts_type_bool(Visitor *v, const char *name, bool *obj, Error **errp)
 {
@@ -346,23 +345,7 @@ opts_type_bool(Visitor *v, const char *name, bool *obj, 
Error **errp)
 return;
 }
 
-if (opt->str) {
-if (strcmp(opt->str, "on") == 0 ||
-strcmp(opt->str, "yes") == 0 ||
-strcmp(opt->str, "y") == 0) {
-*obj = true;
-} else if (strcmp(opt->str, "off") == 0 ||
-strcmp(opt->str, "no") == 0 ||
-strcmp(opt->str, "n") == 0) {
-*obj = false;
-} else {
-error_setg(errp, QERR_INVALID_PARAMETER_VALUE, opt->name,
-   "on|yes|y|off|no|n");
-return;
-}
-} else {
-*obj = true;
-}
+parse_option_bool(opt->name, opt->str, obj, errp);
 
 processed(ov, name);
 }
diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index 408d613..ffdc0b9 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -86,13 +86,13 @@ QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=: 
Parameter 'lazy-refcounts' expects 'on' or 'off'
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=: 
Parameter 'lazy-refcounts' expects 'on', 'yes', 'y', 'off', 'no' or 'n'
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42: 
Parameter 'lazy-refcounts' expects 'on' or 'off'
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42: 
Parameter 'lazy-refcounts' expects 'on', 'yes', 'y', 'off', 'no' or 'n'
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=foo
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=foo: 
Parameter 'lazy-refcounts' expects 'on' or 'off'
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=foo: 
Parameter 'lazy-refcounts' expects 'on', 'yes', 'y', 'off', 'no' or 'n'
 
 
 === With version 2 images enabling lazy refcounts must fail ===
diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
index ec6d222..6abf1ab 100644
--- a/tests/qemu-iotests/051.pc.out
+++ b/tests/qemu-iotests/051.pc.out
@@ -86,13 +86,13 @@ QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=: 
Parameter 'lazy-refcounts' expects 'on' or 'off'
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=: 
Parameter 'lazy-refcounts' expects 'on', 'yes', 'y', 'off', 'no' or 'n'
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42: 
Parameter

[Qemu-block] [PATCH v14 03/19] option: allow qemu_opts_to_qdict to merge repeated options

2016-09-27 Thread Daniel P. Berrange

If given an option string such as

  size=1024,nodes=10,nodes=4-5,nodes=1-2,policy=bind

the qemu_opts_to_qdict() method will currently overwrite
the values for repeated option keys, so only the last
value is in the returned dict:

size=1024
nodes=1-2
policy=bind

This adds the ability for the caller to ask that the
repeated keys be turned into list indexes:

size=1024
nodes.0=10
nodes.1=4-5
nodes.2=1-2
policy=bind

Note that the conversion has no way of knowing whether
any given key is expected to be a list upfront - it can
only figure that out when seeing the first duplicated
key. Thus the caller has to be prepared to deal with the
fact that if a key 'foo' is a list, then the returned
qdict may contain either 'foo' (if only a single instance
of the key was seen) or 'foo.NN' (if multiple instances
of the key were seen).

Signed-off-by: Daniel P. Berrange 
---
 blockdev.c   |  7 ---
 include/qemu/option.h|  8 +++-
 monitor.c|  3 ++-
 qemu-img.c   |  4 +++-
 qemu-io-cmds.c   |  3 ++-
 qemu-io.c|  6 --
 qemu-nbd.c   |  3 ++-
 qom/object_interfaces.c  |  3 ++-
 tests/test-qemu-opts.c   | 40 
 tests/test-replication.c |  9 ++---
 util/qemu-option.c   | 41 ++---
 11 files changed, 110 insertions(+), 17 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 3010393..5ef3193 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -911,7 +911,8 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType 
block_default_type)
 
 /* Get a QDict for processing the options */
 bs_opts = qdict_new();
-qemu_opts_to_qdict(all_opts, bs_opts);
+qemu_opts_to_qdict(all_opts, bs_opts,
+   QEMU_OPTS_REPEAT_POLICY_LAST);
 
 legacy_opts = qemu_opts_create(_legacy_drive_opts, NULL, 0,
_abort);
@@ -3758,8 +3759,8 @@ void hmp_drive_add_node(Monitor *mon, const char *optstr)
 return;
 }
 
-qdict = qemu_opts_to_qdict(opts, NULL);
-
+qdict = qemu_opts_to_qdict(opts, NULL,
+   QEMU_OPTS_REPEAT_POLICY_LAST);
 if (!qdict_get_try_str(qdict, "node-name")) {
 QDECREF(qdict);
 error_report("'node-name' needs to be specified");
diff --git a/include/qemu/option.h b/include/qemu/option.h
index 2a5266f..328c468 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -125,7 +125,13 @@ void qemu_opts_set_defaults(QemuOptsList *list, const char 
*params,
 int permit_abbrev);
 QemuOpts *qemu_opts_from_qdict(QemuOptsList *list, const QDict *qdict,
Error **errp);
-QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict);
+typedef enum {
+QEMU_OPTS_REPEAT_POLICY_LAST,
+QEMU_OPTS_REPEAT_POLICY_LIST,
+} QemuOptsRepeatPolicy;
+
+QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict,
+  QemuOptsRepeatPolicy repeatPolicy);
 void qemu_opts_absorb_qdict(QemuOpts *opts, QDict *qdict, Error **errp);
 
 typedef int (*qemu_opts_loopfunc)(void *opaque, QemuOpts *opts, Error **errp);
diff --git a/monitor.c b/monitor.c
index 83c4edf..7dcd66b 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2642,7 +2642,8 @@ static QDict *monitor_parse_arguments(Monitor *mon,
 if (!opts) {
 goto fail;
 }
-qemu_opts_to_qdict(opts, qdict);
+qemu_opts_to_qdict(opts, qdict,
+   QEMU_OPTS_REPEAT_POLICY_LAST);
 qemu_opts_del(opts);
 }
 break;
diff --git a/qemu-img.c b/qemu-img.c
index ceffefe..b399ae5 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -273,7 +273,9 @@ static BlockBackend *img_open_opts(const char *optstr,
 QDict *options;
 Error *local_err = NULL;
 BlockBackend *blk;
-options = qemu_opts_to_qdict(opts, NULL);
+options = qemu_opts_to_qdict(opts, NULL,
+ QEMU_OPTS_REPEAT_POLICY_LAST);
+
 blk = blk_new_open(NULL, NULL, options, flags, _err);
 if (!blk) {
 error_reportf_err(local_err, "Could not open '%s': ", optstr);
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 3a3838a..e14beed 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -1952,7 +1952,8 @@ static int reopen_f(BlockBackend *blk, int argc, char 
**argv)
 }
 
 qopts = qemu_opts_find(_opts, NULL);
-opts = qopts ? qemu_opts_to_qdict(qopts, NULL) : NULL;
+opts = qopts ? qemu_opts_to_qdict(qopts, NULL,
+  QEMU_OPTS_REPEAT_POLICY_LAST) : NULL;
 qemu_opts_reset(_opts);
 
 brq = bdrv_reopen_queue(NULL, bs, opts, flags);
diff --git a/qemu-io.c b/qemu-io.c
index db129ea..b295afa 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -207,7 +207,8 @@ static int open_f(BlockBackend *blk, int argc, char **argv)

Re: [Qemu-block] [PATCH] proto: add 'shift' extension.

2016-09-27 Thread Paolo Bonzini



- Original Message -
> From: "Denis V. Lunev" 
> To: "Paolo Bonzini" 
> Cc: "Vladimir Sementsov-Ogievskiy" , 
> qemu-de...@nongnu.org, qemu-block@nongnu.org,
> nbd-gene...@lists.sourceforge.net, a...@alex.org.uk, ebl...@redhat.com, 
> kw...@redhat.com, stefa...@redhat.com,
> w...@uter.be
> Sent: Tuesday, September 27, 2016 12:25:54 PM
> Subject: Re: [PATCH] proto: add 'shift' extension.
> 
> On 09/27/2016 01:15 PM, Paolo Bonzini wrote:
> >> We could go in a different direction and export flag
> >> 'has_zero_init' which will report that the storage is
> >> initialized with all zeroes at the moment. In this
> >> case mirroring code will not fall into this
> >> branch.
> > Why don't you add the zero_init flag to QEMU's NBD driver instead?
>
> for all cases without knowing real backend on the server side?
> I think this would be wrong.

Add it to the command line, and leave it to libvirt or the user to
pass "-drive file.driver=nbd,...,file.zero-init=on".

Paolo

Re: [Qemu-block] [PATCH] proto: add 'shift' extension.

2016-09-27 Thread Denis V. Lunev

On 09/26/2016 03:46 PM, Vladimir Sementsov-Ogievskiy wrote:
> This extension allows big requests for TRIM and WRITE_ZEROES through
> special 'shift' parameter, which means that offset and length should be
> shifted left by several bits.
>
> This is needed for efficient clearing large regions of the disk (up to
> the whole disk).
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>
> Here mentioned WRITE_ZEROES command which is only an experemental
> extension for now.
>
> To dicuss:
> NBD_OPT_SHIFT Data. It can be reduced to 8 bits actually... Are some
>reserved bits needed here?
>
>  doc/proto.md | 19 ++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/doc/proto.md b/doc/proto.md
> index 2de3a6a..6fd1b16 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -682,6 +682,8 @@ The field has the following format:
>experimental `WRITE_ZEROES` 
> [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md).
>  - bit 7, `NBD_FLAG_SEND_DF`: defined by the experimental `STRUCTURED_REPLY`
>
> [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md).
> +- bit 8, `NBD_FLAG_SEND_SHIFT` : exposes support for `NBD_CMD_FLAG_SHIFT` and
> +  `NBD_OPT_SHIFT`
>  
>  Clients SHOULD ignore unknown flags.
>  
> @@ -765,6 +767,15 @@ of the newstyle negotiation.
>  
>  Defined by the experimental `INFO` 
> [extension](https://github.com/yoe/nbd/blob/extension-info/doc/proto.md).
>  
> +- `NBD_OPT_SHIFT` (10)
> +
> +Defines shift used to calculate request offset and length if
> +`NBD_CMD_FLAG_SHIFT` is set.
> +
> +Data:
> +
> +- 32 bits, shift (unsigned); Must not be larger than 32.
> +
>   Option reply types
>  
>  These values are used in the "reply type" field, sent by the server
> @@ -872,7 +883,13 @@ valid may depend on negotiation during the handshake 
> phase.
>
> [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md).
>  - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY`
>
> [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md).
> -
> +- bit 3, `NBD_CMD_FLAG_SHIFT`; This flag is valid for `NBD_CMD_TRIM` and
> +  `NBD_CMD_WRITE_ZEROES`. If this flag is set the server shifts request
> +  *length* and *offset* left by N bits, where N is defined by `NBD_OPT_SHIFT`
> +  option or is assumed to be 16 bits by default if `NBD_OPT_SHIFT` option is
> +  not specified. If after shift `(offset + length)` exceeds disk size, length
> +  should be reduced to `( - offset)`. However, `(offset + length)`
> +  must not exceed disk size by more than `(1 << N) - 1`.
>  
>   Request types
>  
We could go in a different direction and export flag
'has_zero_init' which will report that the storage is
initialized with all zeroes at the moment. In this
case mirroring code will not fall into this
branch.

Den

Re: [Qemu-block] [Qemu-devel] [PATCH v2] block: Turn on "unmap" in active commit

2016-09-27 Thread Fam Zheng

On Tue, 09/27 13:15, Kevin Wolf wrote:
> Am 27.09.2016 um 13:04 hat Fam Zheng geschrieben:
> > On Tue, 09/27 17:33, Fam Zheng wrote:
> > > We already specified BDRV_O_UNMAP when opening images in 'qemu-img
> > > commit', but didn't turn on the "unmap" in the active commit job. This
> > > patch fixes that so that zeroed clusters in top image can be discarded
> > > which is desired in the virt-sparsify use case, where a temporary
> > > overlay is created and fstrim'ed before commiting back, to free space in
> > > the original image.
> > > 
> > > This also enables it for block-commit.
> > > 
> > > Signed-off-by: Fam Zheng 
> > > ---
> > > v2: Add "unmap" to block-commit as well. [Kevin]
> > > ---
> > >  block/mirror.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/block/mirror.c b/block/mirror.c
> > > index f9d1fec..8f6f506 100644
> > > --- a/block/mirror.c
> > > +++ b/block/mirror.c
> > > @@ -1043,7 +1043,7 @@ void commit_active_start(const char *job_id, 
> > > BlockDriverState *bs,
> > >  mirror_start_job(job_id, bs, base, NULL, speed, 0, 0,
> > >   MIRROR_LEAVE_BACKING_CHAIN,
> > >   on_error, on_error, false, cb, opaque, _err,
> > > - _active_job_driver, false, base, 
> > > auto_complete);
> > > + _active_job_driver, true, base, 
> > > auto_complete);
> > 
> > I'm an idiot! I changed the wrong parameter (is_none_mode).
> 
> Actually, the idiotic part is having a function with eighteen
> parameters. Not too surprising that someone confuses them sooner or
> later...
> 
> Perhaps we could enable QAPI boxing for blockdev-mirror and then just
> pass down a struct. Then we could have designated initialisers here in
> commit_active_start() and it would be obvious if the wrong one is
> changed.

Good, one more thing in my TODO list. Thanks.

Fam

> 
> Not something to do in this series, of course, just a general idea for
> improvement.
> 
> Kevin

Re: [Qemu-block] [Qemu-devel] [PATCH v2] block: Turn on "unmap" in active commit

2016-09-27 Thread Kevin Wolf

Am 27.09.2016 um 13:04 hat Fam Zheng geschrieben:
> On Tue, 09/27 17:33, Fam Zheng wrote:
> > We already specified BDRV_O_UNMAP when opening images in 'qemu-img
> > commit', but didn't turn on the "unmap" in the active commit job. This
> > patch fixes that so that zeroed clusters in top image can be discarded
> > which is desired in the virt-sparsify use case, where a temporary
> > overlay is created and fstrim'ed before commiting back, to free space in
> > the original image.
> > 
> > This also enables it for block-commit.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> > v2: Add "unmap" to block-commit as well. [Kevin]
> > ---
> >  block/mirror.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/block/mirror.c b/block/mirror.c
> > index f9d1fec..8f6f506 100644
> > --- a/block/mirror.c
> > +++ b/block/mirror.c
> > @@ -1043,7 +1043,7 @@ void commit_active_start(const char *job_id, 
> > BlockDriverState *bs,
> >  mirror_start_job(job_id, bs, base, NULL, speed, 0, 0,
> >   MIRROR_LEAVE_BACKING_CHAIN,
> >   on_error, on_error, false, cb, opaque, _err,
> > - _active_job_driver, false, base, 
> > auto_complete);
> > + _active_job_driver, true, base, auto_complete);
> 
> I'm an idiot! I changed the wrong parameter (is_none_mode).

Actually, the idiotic part is having a function with eighteen
parameters. Not too surprising that someone confuses them sooner or
later...

Perhaps we could enable QAPI boxing for blockdev-mirror and then just
pass down a struct. Then we could have designated initialisers here in
commit_active_start() and it would be obvious if the wrong one is
changed.

Not something to do in this series, of course, just a general idea for
improvement.

Kevin

[Qemu-block] [PATCH v3] block: Turn on "unmap" in active commit

2016-09-27 Thread Fam Zheng

We already specified BDRV_O_UNMAP when opening images in 'qemu-img
commit', but didn't turn on the "unmap" in the active commit job. This
patch fixes that so that zeroed clusters in top image can be discarded
which is desired in the virt-sparsify use case, where a temporary
overlay is created and fstrim'ed before commiting back, to free space in
the original image.

This also enables it for block-commit.

Signed-off-by: Fam Zheng 
---
v3: Change the right parameter.
v2: Add "unmap" to block-commit as well. [Kevin]
---
 block/mirror.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/mirror.c b/block/mirror.c
index f9d1fec..8847ec5 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1042,7 +1042,7 @@ void commit_active_start(const char *job_id, 
BlockDriverState *bs,
 
 mirror_start_job(job_id, bs, base, NULL, speed, 0, 0,
  MIRROR_LEAVE_BACKING_CHAIN,
- on_error, on_error, false, cb, opaque, _err,
+ on_error, on_error, true, cb, opaque, _err,
  _active_job_driver, false, base, auto_complete);
 if (local_err) {
 error_propagate(errp, local_err);
-- 
2.7.4

Re: [Qemu-block] [Qemu-devel] [PATCH v2] block: Turn on "unmap" in active commit

2016-09-27 Thread Fam Zheng

On Tue, 09/27 17:33, Fam Zheng wrote:
> We already specified BDRV_O_UNMAP when opening images in 'qemu-img
> commit', but didn't turn on the "unmap" in the active commit job. This
> patch fixes that so that zeroed clusters in top image can be discarded
> which is desired in the virt-sparsify use case, where a temporary
> overlay is created and fstrim'ed before commiting back, to free space in
> the original image.
> 
> This also enables it for block-commit.
> 
> Signed-off-by: Fam Zheng 
> ---
> v2: Add "unmap" to block-commit as well. [Kevin]
> ---
>  block/mirror.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index f9d1fec..8f6f506 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -1043,7 +1043,7 @@ void commit_active_start(const char *job_id, 
> BlockDriverState *bs,
>  mirror_start_job(job_id, bs, base, NULL, speed, 0, 0,
>   MIRROR_LEAVE_BACKING_CHAIN,
>   on_error, on_error, false, cb, opaque, _err,
> - _active_job_driver, false, base, auto_complete);
> + _active_job_driver, true, base, auto_complete);

I'm an idiot! I changed the wrong parameter (is_none_mode).

Will send v3.

Fam

>  if (local_err) {
>  error_propagate(errp, local_err);
>  goto error_restore_flags;
> -- 
> 2.7.4
> 
>

Re: [Qemu-block] [PATCH] proto: add 'shift' extension.

2016-09-27 Thread Denis V. Lunev

On 09/27/2016 01:15 PM, Paolo Bonzini wrote:
>> We could go in a different direction and export flag
>> 'has_zero_init' which will report that the storage is
>> initialized with all zeroes at the moment. In this
>> case mirroring code will not fall into this
>> branch.
> Why don't you add the zero_init flag to QEMU's NBD driver instead?
>
> Thanks,
>
> Paolo
for all cases without knowing real backend on the server side?
I think this would be wrong.

Den

Re: [Qemu-block] [PATCH] proto: add 'shift' extension.

2016-09-27 Thread Paolo Bonzini

> We could go in a different direction and export flag
> 'has_zero_init' which will report that the storage is
> initialized with all zeroes at the moment. In this
> case mirroring code will not fall into this
> branch.

Why don't you add the zero_init flag to QEMU's NBD driver instead?

Thanks,

Paolo

Re: [Qemu-block] [Qemu-devel] [PATCH v2] block: Turn on "unmap" in active commit

2016-09-27 Thread Fam Zheng

On Tue, 09/27 17:33, Fam Zheng wrote:
> We already specified BDRV_O_UNMAP when opening images in 'qemu-img
> commit', but didn't turn on the "unmap" in the active commit job. This
> patch fixes that so that zeroed clusters in top image can be discarded
> which is desired in the virt-sparsify use case, where a temporary
> overlay is created and fstrim'ed before commiting back, to free space in
> the original image.
> 
> This also enables it for block-commit.
> 
> Signed-off-by: Fam Zheng 

Sent too soon. I see a number of iotests failures (020 040 097 129),
still investigating.

Fam

Re: [Qemu-block] VMDK file unable to open in Qemu but ESXi

2016-09-27 Thread Peter Lieven


Am 27.09.2016 um 10:16 schrieb Kevin Wolf:

Am 27.09.2016 um 04:02 hat Fam Zheng geschrieben:

On Mon, 09/26 16:30, Peter Lieven wrote:

So it looks like this file is not compliant to the specfication. Is it
recognized and imported by VMware?

Yes, VMware ESXi imports it flawlessly.

The only possibility is that the "footer" is not in the end of the image. There
must be some other sector begins with bytes '4b 44 4d 56' besides the first.
Theoretically we can make QEMU accept this case by "scanning" the whole image
to look for the footer, but I'm not sure it's worth it.

I would be surprised if VMware were scanning the whole image, so there
must be some other way. If such images exist in real life, we should
probably try to support them.


There is no second magic in the whole file. Is it possible that VMware restores
the footer from the info in the descriptor?

$ hexdump -C PI-VA-3.1.0.0.132-flex-disk1.vmdk | grep "4b 44 4d 56"
  4b 44 4d 56 03 00 00 00  01 00 03 00 00 00 9f 24 |KDMV...$|
$

Peter

[Qemu-block] [PATCH v2] block: Turn on "unmap" in active commit

2016-09-27 Thread Fam Zheng

We already specified BDRV_O_UNMAP when opening images in 'qemu-img
commit', but didn't turn on the "unmap" in the active commit job. This
patch fixes that so that zeroed clusters in top image can be discarded
which is desired in the virt-sparsify use case, where a temporary
overlay is created and fstrim'ed before commiting back, to free space in
the original image.

This also enables it for block-commit.

Signed-off-by: Fam Zheng 
---
v2: Add "unmap" to block-commit as well. [Kevin]
---
 block/mirror.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/mirror.c b/block/mirror.c
index f9d1fec..8f6f506 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1043,7 +1043,7 @@ void commit_active_start(const char *job_id, 
BlockDriverState *bs,
 mirror_start_job(job_id, bs, base, NULL, speed, 0, 0,
  MIRROR_LEAVE_BACKING_CHAIN,
  on_error, on_error, false, cb, opaque, _err,
- _active_job_driver, false, base, auto_complete);
+ _active_job_driver, true, base, auto_complete);
 if (local_err) {
 error_propagate(errp, local_err);
 goto error_restore_flags;
-- 
2.7.4

Re: [Qemu-block] [PATCH v3 1/3] qemu-nbd: Add --fork option

2016-09-27 Thread Kevin Wolf

Am 25.08.2016 um 18:30 hat Max Reitz geschrieben:
> Using the --fork option, one can make qemu-nbd fork the worker process.
> The original process will exit on error of the worker or once the worker
> enters the main loop.
> 
> Suggested-by: Sascha Silbe 
> Signed-off-by: Max Reitz 
> ---
>  qemu-nbd.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index e3571c2..8c2d582 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -48,6 +48,7 @@
>  #define QEMU_NBD_OPT_OBJECT260
>  #define QEMU_NBD_OPT_TLSCREDS  261
>  #define QEMU_NBD_OPT_IMAGE_OPTS262
> +#define QEMU_NBD_OPT_FORK  263
>  
>  #define MBR_SIZE 512
>  
> @@ -503,6 +504,7 @@ int main(int argc, char **argv)
>  { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS },
>  { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS },
>  { "trace", required_argument, NULL, 'T' },
> +{ "fork", no_argument, NULL, QEMU_NBD_OPT_FORK },
>  { NULL, 0, NULL, 0 }
>  };
>  int ch;
> @@ -524,6 +526,8 @@ int main(int argc, char **argv)
>  bool imageOpts = false;
>  bool writethrough = true;
>  char *trace_file = NULL;
> +bool fork_process = false;
> +int old_stderr = -1;
>  
>  /* The client thread uses SIGTERM to interrupt the server.  A signal
>   * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
> @@ -714,6 +718,9 @@ int main(int argc, char **argv)
>  g_free(trace_file);
>  trace_file = trace_opt_parse(optarg);
>  break;
> +case QEMU_NBD_OPT_FORK:
> +fork_process = true;
> +break;
>  }
>  }
>  
> @@ -773,7 +780,7 @@ int main(int argc, char **argv)
>  return 0;
>  }
>  
> -if (device && !verbose) {
> +if ((device && !verbose) || fork_process) {
>  int stderr_fd[2];
>  pid_t pid;
>  int ret;
> @@ -796,6 +803,7 @@ int main(int argc, char **argv)
>  ret = qemu_daemon(1, 0);
>  
>  /* Temporarily redirect stderr to the parent's pipe...  */
> +old_stderr = dup(STDERR_FILENO);
>  dup2(stderr_fd[1], STDERR_FILENO);
>  if (ret < 0) {
>  error_report("Failed to daemonize: %s", strerror(errno));

I don't have a kernel with NBD support, so I can't test this easily, but
doesn't this break the daemon mode for --connect? I mean, it will still
fork, but won't the parent be alive until the child exits?

> @@ -951,6 +959,11 @@ int main(int argc, char **argv)
>  exit(EXIT_FAILURE);
>  }
>  
> +if (fork_process) {
> +dup2(old_stderr, STDERR_FILENO);
> +close(old_stderr);
> +}

Because this code doesn't run for --connect (unless --fork is given,
too).

>  state = RUNNING;
>  do {
>  main_loop_wait(false);

The documentation needs an update, too.

Kevin

Re: [Qemu-block] [PATCH] qemu-img: Allow unmap backing image for zeroed clusters

2016-09-27 Thread Fam Zheng

On Tue, 09/27 10:41, Kevin Wolf wrote:
> Am 27.09.2016 um 04:20 hat Fam Zheng geschrieben:
> > We already specified BDRV_O_UNMAP when opening images in 'qemu-img
> > commit', but didn't turn on the "unmap" in the active commit job. This
> > patch fixes that so that zeroed clusters in top image can be discarded
> > which is desired in the virt-sparsify use case, where a temporary
> > overlay is created and fstrim'ed before commiting back, to free space in
> > the original image.
> > 
> > The block-commit keeps the existing behavior.
> > 
> > Signed-off-by: Fam Zheng 
> 
> Is there a good reason for not using discard in an active commit if the
> image was opened with BDRV_O_UNMAP? That is, wouldn't it make sense to
> just set the unmap option for the mirror unconditionally and also for
> block-commit?
> 
> If there is a reason, we should probably add an option to block-commit,
> but I still feels that enabling it should be the default then.

I didn't touch block-commit because I wasn't sure, but you made a good point,
we can turn this on there too.

Fam

Re: [Qemu-block] [PATCH] qemu-img: Allow unmap backing image for zeroed clusters

2016-09-27 Thread Kevin Wolf

Am 27.09.2016 um 04:20 hat Fam Zheng geschrieben:
> We already specified BDRV_O_UNMAP when opening images in 'qemu-img
> commit', but didn't turn on the "unmap" in the active commit job. This
> patch fixes that so that zeroed clusters in top image can be discarded
> which is desired in the virt-sparsify use case, where a temporary
> overlay is created and fstrim'ed before commiting back, to free space in
> the original image.
> 
> The block-commit keeps the existing behavior.
> 
> Signed-off-by: Fam Zheng 

Is there a good reason for not using discard in an active commit if the
image was opened with BDRV_O_UNMAP? That is, wouldn't it make sense to
just set the unmap option for the mirror unconditionally and also for
block-commit?

If there is a reason, we should probably add an option to block-commit,
but I still feels that enabling it should be the default then.

Kevin

Re: [Qemu-block] [PATCH v2 0/7] block: Make more blockdev-add options work

2016-09-27 Thread Kevin Wolf

Am 23.09.2016 um 16:32 hat Kevin Wolf geschrieben:
> This series moves a few more options from flags to the appropriate place. This
> doesn't only result in cleaner code, but also allows using these options in
> nested node definitions.
> 
> After this series, bds_tree_init() is only a thin wrapper around bdrv_open()
> which sets the right defaults for cache modes and the BDRV_O_INACTIVE flag if
> necessary.
> 
> Depends on my block branch, specifically Berto's 'read-only' patches.

Applied to the block branch.

Kevin

Re: [Qemu-block] [PATCH v2 0/7] block: Make more blockdev-add options work

2016-09-27 Thread Kevin Wolf

Am 26.09.2016 um 19:10 hat Max Reitz geschrieben:
> It's a bit funny now how blockdev_init() and
> extract_common_blockdev_options() are actually used by neither
> blockdev-add nor -blockdev, but only by drive_new() (which happily
> advertises blockdev_init() to be shared with blockev-add, though).

Yes. Essentially drive_new() is now the clean wrappers that map legacy
syntax to the modern options, whereas blockdev_init() is left as the
place for the messy stuff. Maybe we can get rid of it sooner or later.

Kevin


pgpoL4iN9pEYc.pgp
Description: PGP signature

Re: [Qemu-block] [PATCH v2 3/7] block/qapi: Move 'aio' option to file driver

2016-09-27 Thread Kevin Wolf

Am 26.09.2016 um 18:49 hat Max Reitz geschrieben:
> On 23.09.2016 16:32, Kevin Wolf wrote:
> > The option whether or not to use a native AIO interface really isn't a
> > generic option for all drivers, but only applies to the native file
> > protocols. This patch moves the option in blockdev-add to the
> > appropriate places (raw-posix and raw-win32).
> > 
> > We still have to keep the flag BDRV_O_NATIVE_AIO for compatibility
> > because so far the AIO option was usually specified on the wrong layer
> > (the top-level format driver, which didn't even look at it) and then
> > inherited by the protocol driver (where it was actually used). We can't
> > forbid this use except in new interfaces.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  block/raw-posix.c  | 44 ---
> >  block/raw-win32.c  | 56 
> > +-
> >  qapi/block-core.json   |  6 +++---
> >  tests/qemu-iotests/087 |  4 ++--
> >  4 files changed, 83 insertions(+), 27 deletions(-)
> 
> [...]
> 
> > diff --git a/block/raw-win32.c b/block/raw-win32.c
> > index 56f45fe..734bb10 100644
> > --- a/block/raw-win32.c
> > +++ b/block/raw-win32.c
> 
> [...]
> 
> > +static bool get_aio_option(QemuOpts *opts, int flags, Error **errp)
> > +{
> > +BlockdevAioOptions aio, aio_default;
> > +
> > +aio_default = (flags & BDRV_O_NATIVE_AIO) ? BLOCKDEV_AIO_OPTIONS_NATIVE
> > +  : 
> > BLOCKDEV_AIO_OPTIONS_THREADS;
> > +aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, 
> > "aio"),
> > +  BLOCKDEV_AIO_OPTIONS__MAX, aio_default, errp);
> > +
> > +switch (aio) {
> > +case BLOCKDEV_AIO_OPTIONS_NATIVE:
> > +return true;
> > +case BLOCKDEV_AIO_OPTIONS_THREADS:
> > +return false;
> > +default:
> > +error_setg(errp, "Invalid AIO option");
> 
> Any reason for catching this case here but not in raw-posix?
> 
> (Not that it really matters, though.)

Nobody will forget raw-posix when adding a new AIO mode to win32, so I
didn't feel like the additional code was worth it there. But if we add a
new AIO mode to raw-posix, I'm pretty sure we will forget win32.

Kevin


pgpzdSa6z_E2f.pgp
Description: PGP signature

Re: [Qemu-block] VMDK file unable to open in Qemu but ESXi

2016-09-27 Thread Kevin Wolf

Am 27.09.2016 um 04:02 hat Fam Zheng geschrieben:
> On Mon, 09/26 16:30, Peter Lieven wrote:
> > > So it looks like this file is not compliant to the specfication. Is it
> > > recognized and imported by VMware?
> > 
> > Yes, VMware ESXi imports it flawlessly.
> 
> The only possibility is that the "footer" is not in the end of the image. 
> There
> must be some other sector begins with bytes '4b 44 4d 56' besides the first.
> Theoretically we can make QEMU accept this case by "scanning" the whole image
> to look for the footer, but I'm not sure it's worth it.

I would be surprised if VMware were scanning the whole image, so there
must be some other way. If such images exist in real life, we should
probably try to support them.

Kevin

Re: [Qemu-block] [Nbd] [PATCH] proto: add 'shift' extension.

2016-09-27 Thread Alex Bligh


> On 27 Sep 2016, at 00:41, Wouter Verhelst  wrote:
> 
> Thoughts?

Honestly? This whole thing seems like complication for little gain. One command 
per 2GB wiped doesn't seem like a bad thing.

-- 
Alex Bligh

[Qemu-block] [PATCH v2 5/5] block: keep AioContext pointer in BlockBackend

2016-09-27 Thread Fam Zheng

From: Stefan Hajnoczi 

blk_get/set_aio_context() delegate to BlockDriverState without storing
the AioContext pointer in BlockBackend.

There are two flaws:

1. BlockBackend falls back to the QEMU main loop AioContext when there
   is no root BlockDriverState.  This means the drive loses its
   AioContext during media change and would break dataplane.

2. BlockBackend state used from multiple threads has no lock.  Race
   conditions will creep in as functionality is moved from
   BlockDriverState to BlockBackend due to the absense of a lock.  The
   monitor cannot access BlockBackend state safely while an IOThread is
   also accessing the state.

Issue #1 can be triggered by "change" on virtio-scsi dataplane, causing
a assertion failure (virtio-blk is fine because medium change is not
possible). #2 may be possible with block accounting statistics in
BlockBackend but I'm not aware of a crash that can be triggered.

This patch stores the AioContext pointer in BlockBackend and puts newly
inserted BlockDriverStates into the AioContext.

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/block-backend.c | 24 +---
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index b71babe..cda67cc 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -31,6 +31,7 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb);
 struct BlockBackend {
 char *name;
 int refcnt;
+AioContext *aio_context;
 BdrvChild *root;
 DriveInfo *legacy_dinfo;/* null unless created by drive_new() */
 QTAILQ_ENTRY(BlockBackend) link; /* for block_backends */
@@ -121,6 +122,7 @@ static BlockBackend *blk_new_with_ctx(AioContext *ctx)
 
 blk = g_new0(BlockBackend, 1);
 blk->refcnt = 1;
+blk->aio_context = ctx;
 blk_set_enable_write_cache(blk, true);
 
 qemu_co_queue_init(>public.throttled_reqs[0]);
@@ -510,6 +512,8 @@ void blk_remove_bs(BlockBackend *blk)
 void blk_insert_bs(BlockBackend *blk, BlockDriverState *bs)
 {
 bdrv_ref(bs);
+
+assert(blk->aio_context == bdrv_get_aio_context(bs));
 blk->root = bdrv_root_attach_child(bs, "root", _root, blk);
 
 notifier_list_notify(>insert_bs_notifiers, blk);
@@ -1413,13 +1417,7 @@ void blk_op_unblock_all(BlockBackend *blk, Error *reason)
 
 AioContext *blk_get_aio_context(BlockBackend *blk)
 {
-BlockDriverState *bs = blk_bs(blk);
-
-if (bs) {
-return bdrv_get_aio_context(bs);
-} else {
-return qemu_get_aio_context();
-}
+return blk->aio_context;
 }
 
 static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb)
@@ -1432,7 +1430,19 @@ void blk_set_aio_context(BlockBackend *blk, AioContext 
*new_context)
 {
 BlockDriverState *bs = blk_bs(blk);
 
+blk->aio_context = new_context;
+
 if (bs) {
+AioContext *ctx = bdrv_get_aio_context(bs);
+
+if (ctx == new_context) {
+return;
+}
+/* Moving context around happens when a block device is
+ * enabling/disabling data plane, in which case we own the root BDS and
+ * it cannot be associated with another AioContext. */
+assert(ctx == qemu_get_aio_context() ||
+   new_context == qemu_get_aio_context());
 if (blk->public.throttle_state) {
 throttle_timers_detach_aio_context(>public.throttle_timers);
 }
-- 
2.7.4

[Qemu-block] [PATCH v2 3/5] block: Introduce and make use of blk_new_with_root

2016-09-27 Thread Fam Zheng

The idiom of "blk_new() + blk_insert_bs()" is common. Use a new
BB interface that does both things to make the coming transition that
adds AioContext to BB easier.

The added blk_new_with_ctx doesn't handle the passed ctx and is purely
here to avoid a small code churn.

Signed-off-by: Fam Zheng 
---
 block/backup.c   |  3 +--
 block/block-backend.c| 24 ++--
 block/commit.c   | 12 
 block/mirror.c   |  3 +--
 blockjob.c   |  3 +--
 hmp.c|  3 +--
 hw/core/qdev-properties-system.c |  3 +--
 include/sysemu/block-backend.h   |  1 +
 nbd/server.c |  3 +--
 tests/test-blockjob.c|  3 +--
 10 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 582bd0f..3eed9a1 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -601,8 +601,7 @@ void backup_start(const char *job_id, BlockDriverState *bs,
 goto error;
 }
 
-job->target = blk_new();
-blk_insert_bs(job->target, target);
+job->target = blk_new_with_root(target);
 
 job->on_source_error = on_source_error;
 job->on_target_error = on_target_error;
diff --git a/block/block-backend.c b/block/block-backend.c
index 0bd19ab..b71babe 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -115,12 +115,7 @@ static const BdrvChildRole child_root = {
 .drained_end= blk_root_drained_end,
 };
 
-/*
- * Create a new BlockBackend with a reference count of one.
- * Store an error through @errp on failure, unless it's null.
- * Return the new BlockBackend on success, null on failure.
- */
-BlockBackend *blk_new(void)
+static BlockBackend *blk_new_with_ctx(AioContext *ctx)
 {
 BlockBackend *blk;
 
@@ -139,6 +134,23 @@ BlockBackend *blk_new(void)
 }
 
 /*
+ * Create a new BlockBackend with a reference count of one.
+ * Store an error through @errp on failure, unless it's null.
+ * Return the new BlockBackend on success, null on failure.
+ */
+BlockBackend *blk_new(void)
+{
+return blk_new_with_ctx(qemu_get_aio_context());
+}
+
+BlockBackend *blk_new_with_root(BlockDriverState *root)
+{
+BlockBackend *blk = blk_new_with_ctx(bdrv_get_aio_context(root));
+blk_insert_bs(blk, root);
+return blk;
+}
+
+/*
  * Creates a new BlockBackend, opens a new BlockDriverState, and connects both.
  *
  * Just as with bdrv_open(), after having called this function the reference to
diff --git a/block/commit.c b/block/commit.c
index 9f67a8b..45d4f96 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -260,11 +260,9 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 }
 
 
-s->base = blk_new();
-blk_insert_bs(s->base, base);
+s->base = blk_new_with_root(base);
 
-s->top = blk_new();
-blk_insert_bs(s->top, top);
+s->top = blk_new_with_root(top);
 
 s->active = bs;
 
@@ -314,11 +312,9 @@ int bdrv_commit(BlockDriverState *bs)
 }
 }
 
-src = blk_new();
-blk_insert_bs(src, bs);
+src = blk_new_with_root(bs);
 
-backing = blk_new();
-blk_insert_bs(backing, bs->backing->bs);
+backing = blk_new_with_root(bs->backing->bs);
 
 length = blk_getlength(src);
 if (length < 0) {
diff --git a/block/mirror.c b/block/mirror.c
index f9d1fec..8910d53 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -941,8 +941,7 @@ static void mirror_start_job(const char *job_id, 
BlockDriverState *bs,
 return;
 }
 
-s->target = blk_new();
-blk_insert_bs(s->target, target);
+s->target = blk_new_with_root(target);
 
 s->replaces = g_strdup(replaces);
 s->on_source_error = on_source_error;
diff --git a/blockjob.c b/blockjob.c
index a167f96..8fe6d5d 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -148,8 +148,7 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
 return NULL;
 }
 
-blk = blk_new();
-blk_insert_bs(blk, bs);
+blk = blk_new_with_root(bs);
 
 job = g_malloc0(driver->instance_size);
 error_setg(>blocker, "block device is in use by block job: %s",
diff --git a/hmp.c b/hmp.c
index 336e7bf..1ee83d2 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1934,8 +1934,7 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 if (!blk) {
 BlockDriverState *bs = bdrv_lookup_bs(NULL, device, );
 if (bs) {
-blk = local_blk = blk_new();
-blk_insert_bs(blk, bs);
+blk = local_blk = blk_new_with_root(bs);
 } else {
 goto fail;
 }
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index e55afe6..2b07beb 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -78,8 +78,7 @@ static void parse_drive(DeviceState *dev, const char *str, 
void **ptr,
 if (!blk) {
 BlockDriverState *bs = bdrv_lookup_bs(NULL, str, NULL);
 if (bs) {
-

[Qemu-block] [PATCH v2 4/5] migration: Set correct AioContext to BlockBackend

2016-09-27 Thread Fam Zheng

The BB is newly created and it should follow the BDS's current context.

Signed-off-by: Fam Zheng 
---
 migration/block.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/block.c b/migration/block.c
index ebc10e6..b7365ee 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -445,6 +445,7 @@ static void init_blk_migration(QEMUFile *f)
 BlockDriverState *bs = bmds_bs[i].bs;
 
 if (bmds) {
+blk_set_aio_context(bmds->blk, bdrv_get_aio_context(bs));
 blk_insert_bs(bmds->blk, bs);
 
 alloc_aio_bitmap(bmds);
-- 
2.7.4

[Qemu-block] [PATCH v2 2/5] blockdev: Move BDS AioContext before inserting to BB

2016-09-27 Thread Fam Zheng

The callers made sure the BDS has no BB attached before, so blk is
becoming the sole owner. It is necessary to move the BDS to the right
AioContext before inserting it, to keep them in sync.

Signed-off-by: Fam Zheng 
---
 blockdev.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index a4960b9..9700dee 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2478,6 +2478,7 @@ out:
 static void qmp_blockdev_insert_anon_medium(BlockBackend *blk,
 BlockDriverState *bs, Error **errp)
 {
+AioContext *ctx;
 bool has_device;
 
 /* For BBs without a device, we can exchange the BDS tree at will */
@@ -2498,6 +2499,12 @@ static void qmp_blockdev_insert_anon_medium(BlockBackend 
*blk,
 return;
 }
 
+ctx = blk_get_aio_context(blk);
+if (ctx != bdrv_get_aio_context(bs)) {
+aio_context_acquire(ctx);
+bdrv_set_aio_context(bs, ctx);
+aio_context_release(ctx);
+}
 blk_insert_bs(blk, bs);
 
 if (!blk_dev_has_tray(blk)) {
-- 
2.7.4

[Qemu-block] [PATCH v2 0/5] block: keep AioContext pointer in BlockBackend

2016-09-27 Thread Fam Zheng

The first patches clean up usage of BlockBackend and changing of its (root's)
aio contexts; the last patch is an update of Stefan's previous version rebasing
on top of current master. The biggest change from the RFC is that blk_insert_bs
callers are responsible to put the BB and BDS on the same context before
calling it.

This fixes the crash triggered by "change" a scsi-cd on a virtio-scsi dataplane
device.

The new assertions in block-backend.c ensures we won't have a conflict pair of
BlockBackend users from different contextes.

Fam Zheng (4):
  blockdev-mirror: Sanity check before moving target_bs AioContext
  blockdev: Move BDS AioContext before inserting to BB
  block: Introduce and make use of blk_new_with_root
  migration: Set correct AioContext to BlockBackend

Stefan Hajnoczi (1):
  block: keep AioContext pointer in BlockBackend

 block/backup.c   |  3 +--
 block/block-backend.c| 48 +---
 block/commit.c   | 12 --
 block/mirror.c   |  3 +--
 blockdev.c   | 42 ++-
 blockjob.c   |  3 +--
 hmp.c|  3 +--
 hw/core/qdev-properties-system.c |  3 +--
 include/sysemu/block-backend.h   |  1 +
 migration/block.c|  1 +
 nbd/server.c |  3 +--
 tests/test-blockjob.c|  3 +--
 12 files changed, 79 insertions(+), 46 deletions(-)

-- 
2.7.4

[Qemu-block] [PATCH v2 1/5] blockdev-mirror: Sanity check before moving target_bs AioContext

2016-09-27 Thread Fam Zheng

Similar to blockdev-backup, if the target was already moved to a
different AioContext, bad things can happen. This happens when the
target belongs to a data plane device. It's a very unlikely case, but
let's check it anyway.

Signed-off-by: Fam Zheng 
---
 blockdev.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 29c6561..a4960b9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3281,6 +3281,21 @@ BlockDeviceInfoList *qmp_query_named_block_nodes(Error 
**errp)
 return bdrv_named_nodes_list(errp);
 }
 
+static void blockdev_set_aio_context(BlockDriverState *bs, AioContext *ctx,
+ Error **errp)
+{
+if (bdrv_get_aio_context(bs) != ctx) {
+if (!bdrv_has_blk(bs)) {
+/* The target BDS is not attached, we can safely move it to another
+ * AioContext. */
+bdrv_set_aio_context(bs, ctx);
+} else {
+error_setg(errp, "Target is attached to a different thread from "
+ "source.");
+}
+}
+}
+
 void do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn, Error **errp)
 {
 BlockDriverState *bs;
@@ -3317,16 +3332,10 @@ void do_blockdev_backup(BlockdevBackup *backup, 
BlockJobTxn *txn, Error **errp)
 goto out;
 }
 
-if (bdrv_get_aio_context(target_bs) != aio_context) {
-if (!bdrv_has_blk(target_bs)) {
-/* The target BDS is not attached, we can safely move it to another
- * AioContext. */
-bdrv_set_aio_context(target_bs, aio_context);
-} else {
-error_setg(errp, "Target is attached to a different thread from "
- "source.");
-goto out;
-}
+blockdev_set_aio_context(target_bs, aio_context, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+goto out;
 }
 backup_start(backup->job_id, bs, target_bs, backup->speed, backup->sync,
  NULL, backup->compress, backup->on_source_error,
@@ -3538,7 +3547,11 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
 goto out;
 }
 
-bdrv_set_aio_context(target_bs, aio_context);
+blockdev_set_aio_context(target_bs, aio_context, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+goto out;
+}
 
 blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
arg->has_replaces, arg->replaces, arg->sync,
-- 
2.7.4

94 matches

Mail list logo